Spotlight on stylometric text analysis

From MandrakeWiki


Different authors have different writing styles. They differ in the length of words and sentences, the frequencies of words and word forms, the richness of vocabulary, the use of punctuation, and so on. An author can also have preferences for certain spelling variants or for certain expressions.

Stylometry is the study of measurable features of style.
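As a rough illustration of what "measurable features" can mean, the following Python sketch computes a few of the features listed above for a single text. The function name style_features, the simple regular-expression tokenization, and the choice of features are assumptions made for this example; they are not the tooling used for the analyses on this page.

```python
import re
from collections import Counter


def style_features(text):
    """Return a few simple, measurable style features for a text.

    Illustrative only: the tokenization rules below are assumptions,
    not those of any particular stylometry package.
    """
    # Words: runs of letters and apostrophes, lower-cased.
    words = re.findall(r"[a-z']+", text.lower())
    # Sentences: split on runs of '.', '!' or '?' and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words:
        raise ValueError("text contains no words")

    return {
        # Average number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
        # The most frequent words with their counts.
        "top_words": Counter(words).most_common(3),
        # How often each punctuation mark is used.
        "punctuation": dict(Counter(ch for ch in text if ch in ".,;:!?")),
    }


sample = '"Hello," said the Phantom. The Phantom rode on!'
for name, value in style_features(sample).items():
    print(name, value)
```

Features like these only become meaningful when compared across many texts of reasonable length; a single short sample says very little about an author.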


The Avon Novels

The Story of the Phantom is a series of 15 novels based on Lee Falk's Phantom stories, published by Avon Publications in the U.S. from 1972 to 1975. On release, the adaptor of issues 2 and 10 was not credited, and issue 15 was credited to Carson Bingham. Lee Falk corrected this with an "Author's note" in the books.

Adapted by                                  | Issues             | Note
Lee Falk                                    | 1, 6, 9, 12, 15    | #15 is wrongly credited to Carson Bingham
Basil Copper                                | 2, 3               | #2: the adaptor is not credited
Frank S. Shawn (pen name of Ron Goulart)    | 4, 5, 7, 8, 10, 11 | #10: the adaptor is not credited
Warren Shanahan                             | 13                 |
Carson Bingham (pen name of Bruce Cassiday) | 14                 |

The analysis

Fratelli Spada - Prose stories

The Fratelli Spada - Prose stories

King Comics stories

The King Comics stories


Alfred Bester said, without giving further details, that he ghosted Falk’s strips during the World War II years. Lee Falk worked for the OWI (Office of War Information) from early 1942 to August 1943. He then wrote the "Passionate Congressman" and several scripts for his newspaper characters before he enlisted as a private in the army in March 1944 (serving until about mid-1945).

The Phantom Sundays

The text in the comic strips is a bit different from that of the novels. In the novels, the dialogue looks like this: "Hello," said the Phantom. In the Sunday stories, the speech bubbles read more like this: Hello!

PS     | Start      | Finish     | Weeks | Story title                                         | Note
ps-002 | 22.10.1939 | 10.03.1940 | 21    | "The Precious Cargo of Colonel Winn"                |
ps-003 | 17.03.1940 | 21.07.1940 | 19    | "The Fire Goddess"                                  |
ps-004 | 28.07.1940 | 29.12.1940 | 23    | "The Beachcomber"                                   |
ps-005 | 05.01.1941 | 23.02.1941 | 8     | "The Saboteurs"                                     |
ps-006 | 02.03.1941 | 22.02.1942 | 52    | "The Return of the Sky Band"                        |
ps-007 | 01.03.1942 | 11.10.1942 | 33    | "The Impostor"                                      | at OWI
ps-008 | 18.10.1942 | 18.04.1943 | 27    | The Marshall Sisters Pt.1: "Castle in the Clouds"   | at OWI
ps-009 | 25.04.1943 | 04.07.1943 | 11    | The Marshall Sisters Pt.2: "The Ismani Cannibals"   | at OWI
ps-010 | 11.07.1943 | 25.06.1944 | 51    | The Marshall Sisters Pt.3: "Hamid the Terrible"     | at OWI
ps-011 | 02.07.1944 | 07.01.1945 | 28    | "The Childhood of the Phantom"                      | in the army
ps-012 | 14.01.1945 | 24.06.1945 | 24    | "The Golden Princess"                               | in the army
ps-013 | 01.07.1945 | 02.12.1945 | 23    | "The Strange Fisherman"                             | in the army ?
ps-014 | 09.12.1945 | 17.03.1946 | 15    | "Queen Pera the Perfect"                            |

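
The Weeks column follows from the Start and Finish dates: a weekly Sunday strip that runs from one Sunday to another has one instalment per week, counting both the first and the last Sunday. A minimal Python sketch of that arithmetic, where the helper name weeks_inclusive is an assumption made for this example:

```python
from datetime import datetime


def weeks_inclusive(start, finish, fmt="%d.%m.%Y"):
    """Count weekly instalments from start to finish, both dates included.

    Dates use the DD.MM.YYYY format of the table above.
    """
    first = datetime.strptime(start, fmt)
    last = datetime.strptime(finish, fmt)
    days = (last - first).days
    # Both dates are expected to be Sundays, so the gap must be whole weeks.
    if days < 0 or days % 7 != 0:
        raise ValueError("dates must be in order and a whole number of weeks apart")
    return days // 7 + 1


# Three rows taken from the table above.
print(weeks_inclusive("22.10.1939", "10.03.1940"))  # ps-002
print(weeks_inclusive("02.03.1941", "22.02.1942"))  # ps-006
print(weeks_inclusive("11.07.1943", "25.06.1944"))  # ps-010
```

The same check can be run on the remaining rows to confirm that the dates and week counts agree.
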
Using R

The texts from the Sundays (2-14) were put into one corpus folder. Two analyses were done: the first on 0-902 MFW (most frequent words) 2-grams and the second on 0-902 MFC (most frequent characters) 3-grams, both using the Bootstrap Consensus Tree.
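To make the two feature types concrete, here is a small Python sketch of how word 2-grams and character 3-grams can be extracted and ranked by relative frequency. It is only an illustration: the function names and the tokenization are assumptions made for this example, the R tooling used for the analysis may tokenize differently, and the consensus-tree step is not shown.

```python
import re
from collections import Counter


def word_ngrams(text, n=2):
    """Return overlapping word n-grams (lower-cased, punctuation dropped)."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def char_ngrams(text, n=3):
    """Return overlapping character n-grams (lower-cased, whitespace collapsed)."""
    s = " ".join(text.lower().split())
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def relative_freqs(items, top=None):
    """Map the `top` most frequent items to their share of all items."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: count / total for item, count in counts.most_common(top)}


text = "The Phantom rides. The Phantom waits."
print(relative_freqs(word_ngrams(text, 2), top=3))
print(relative_freqs(char_ngrams(text, 3), top=3))
```

Each story is turned into such a frequency profile, and stories whose profiles are similar end up close together in the resulting tree. Character n-grams keep punctuation and word fragments, which is one reason they can react more strongly to differences in dialogue.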

The result for the MFW 2-grams groups all stories close together in writing style. The MFC 3-grams show slightly larger variation, but this is most likely due to the amount of dialogue from different characters in the stories.

The analysis shows no clear indication that anyone other than Lee Falk was the author of these stories.