On Saturday, I finished Team of Rivals and, while looking at my calendar, noticed that Lincoln’s birthday also fell this week. What better way to celebrate his birthday than to analyze his speeches and letters? I downloaded the seven-volume set of his speeches, letters, and essays from Project Gutenberg and spent a few hours on Sunday cleaning the text and writing a parsing script. On Monday, I started analyzing the text to see if I could make sense of it.
I ended up with 1,458 documents containing almost 16,500 sentences and a little over 547,000 words. I tried to extract the date each letter was written or speech was given, but could only do so for 60% of the documents. That was still enough to get some insights.
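As a rough illustration of why date extraction can miss a large share of documents, here is a minimal sketch of a pattern-based approach. It is not the parsing script used for this post: the `DATE_RE` pattern, the `extract_year` and `date_coverage` names, and the assumption that dates look like "March 4, 1861" are all hypothetical. Any document that writes its date differently, or not at all, comes back as undated.

```python
import re

# Assumed date format: full month name, day, optional comma, year in the 1800s,
# e.g. "March 4, 1861". Other formats are simply not recognized.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_RE = re.compile(rf"\b({MONTHS})\s+(\d{{1,2}}),?\s+(18\d{{2}})\b")


def extract_year(text):
    """Return the year of the first recognizable date in text, or None."""
    match = DATE_RE.search(text)
    return int(match.group(3)) if match else None


def date_coverage(documents):
    """Return the fraction of documents in which a date was found."""
    if not documents:
        return 0.0
    dated = sum(1 for doc in documents if extract_year(doc) is not None)
    return dated / len(documents)
```

Calling `date_coverage` on the list of document texts gives the share that can be placed on a timeline; the rest have to be left out of any by-year analysis.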
Number of speeches/letters by year
I suspect many of his early writings and speeches were lost because they weren't preserved as well as his later speeches and letters.
Trend of phrases
I wanted to examine the phrases he used most often over time to see whether there were any noticeable changes and whether they meant something. It turns out there were some interesting patterns, which I highlighted in green.
- Slavery - There are references to slavery across the entire date range with the Dred Scott decision and the Missouri Compromise appearing as common phrases in the 1850s.
- Civil War Generals - You can trace the careers of the generals during the Civil War based on their mentions. General Hooker was mentioned in 1862 and 1863; General Meade in 1863 and 1864; and General Grant in 1864 and 1865. This echoes history: General Meade replaced General Hooker as commander of the Army of the Potomac in June 1863, and General Grant took command of the Union's western armies in October 1863 and of all Union armies in March 1864.
- The Presidency - After Lincoln was elected president in 1860, he started closing his letters with the phrase "Lincoln, President of." During his presidency we also see mentions of his cabinet members Stanton and Seward.
*The table below was generated by taking the top 20 three-word phrases used in each year range and consolidating them into a top 100 list across the entire dataset. An X indicates that the phrase was among the top 20 three-word phrases for that year range. I highlighted the interesting rows in green.
phrase | 1832-1845 | 1846-1853 | 1854-1859 | 1860 | 1861 | 1862 | 1863 | 1864 | 1865 |
---|---|---|---|---|---|---|---|---|---|
the united states | X | X | X | X | X | X | X | X | X |
of the united | X | X | X | X | X | X | X | X | X |
i do not | X | X | X | X | X | X | X | X | X |
the secretary of | X | X | X | X | X | X | X | ||
secretary of war | X | X | X | X | X | X | |||
in regard to | X | X | X | X | X | X | X | X | |
the people of | X | X | X | X | X | X | X | X | X |
of the people | X | X | X | X | X | X | X | X | |
president of the | X | X | X | X | X | X | X | X | X |
in favor of | X | X | X | X | X | X | X | X | |
my dear sir | X | X | X | X | X | X | X | X | |
as well as | X | X | X | X | X | X | X | X | |
so far as | X | X | X | X | X | X | X | X | X |
dred scott decision | X | ||||||||
there is no | X | X | X | X | X | X | X | X | X |
by the president | X | X | X | X | X | X | X | X | |
the supreme court | X | X | X | X | X | ||||
united states and | X | X | X | X | X | X | X | X | |
of the union | X | X | X | X | X | X | X | X | X |
that it is | X | X | X | X | X | X | X | X | X |
it is a | X | X | X | X | X | X | X | X | |
that judge douglas | X | ||||||||
the dred scott | X | ||||||||
that there is | X | X | X | X | X | X | X | X | X |
institution of slavery | X | X | X | X | |||||
secretary of state | X | X | X | X | X | X | X | X | |
the missouri compromise | X | X | |||||||
to say that | X | X | X | X | X | X | X | X | |
of the state | X | X | X | X | X | X | X | X | X |
the state of | X | X | X | X | X | X | X | X | X |
of the government | X | X | X | X | X | X | X | X | X |
major general mcclellan | X | X | X | ||||||
of the country | X | X | X | X | X | X | X | X | |
secretary of the | X | X | X | X | X | X | X | ||
of the army | X | X | X | X | X | X | |||
it is not | X | X | X | X | X | X | X | X | X |
of the potomac | X | X | X | X | X | ||||
part of the | X | X | X | X | X | X | X | X | |
one of the | X | X | X | X | X | X | X | X | |
united states to | X | X | X | X | X | X | X | ||
washington d c | X | X | X | X | X | ||||
house of representatives | X | X | X | X | X | X | X | X | X |
as to the | X | X | X | X | X | X | X | X | X |
harper s ferry | X | X | X | X | X | ||||
the public safety | X | X | X | X | |||||
major general hooker | X | X | |||||||
the gentleman from | X | X | X | ||||||
lieutenant general grant | X | X | |||||||
major general halleck | X | X | X | ||||||
major general meade | X | X | |||||||
of the enemy | X | X | X | X | X | X | |||
the union and | X | X | X | X | X | X | X | ||
the day of | X | X | X | X | X | X | X | X | X |
the president of | X | X | X | X | X | X | X | X | |
the rio grande | X | ||||||||
the senate and | X | X | X | X | X | X | X | ||
to the senate | X | X | X | X | X | X | |||
army of the | X | X | X | X | X | X | |||
city point va | X | X | |||||||
and house of | X | X | X | X | X | ||||
executive mansion washington | X | X | X | X | X | ||||
of the treasury | X | X | X | X | X | X | |||
of the secretary | X | X | X | X | X | ||||
of the bank | X | X | |||||||
of the public | X | X | X | X | X | X | |||
of the war | X | X | X | X | X | X | X | ||
yours very truly | X | X | X | X | X | X | |||
as may be | X | X | X | X | X | X | X | X | |
he did not | X | X | X | X | |||||
lincoln president of | X | X | X | X | X | X | |||
m stanton secretary | X | X | X | X | |||||
stanton secretary of | X | X | X | X | |||||
the war department | X | X | X | X | X | X | |||
i shall be | X | X | X | X | X | X | X | X | X |
william h seward | X | X | X | X | X | X | |||
edwin m stanton | X | X | X | X | |||||
for the purpose | X | X | X | X | X | X | X | X | X |
general grant city | X | X | |||||||
i have been | X | X | X | X | X | X | X | X | |
is to be | X | X | X | X | X | X | X | ||
it will be | X | X | X | X | X | X | X | X | |
it would be | X | X | X | X | X | X | X | X | |
of all the | X | X | X | X | X | X | X | X | X |
of the department | X | X | X | X | X | ||||
the post office | X | X | X | X | X | X | |||
the public lands | X | X | X | X | X | ||||
yours of the | X | X | X | X | X | X | X | ||
at p m | X | X | X | X | |||||
grant city point | X | X | |||||||
h seward secretary | X | X | X | X | X | X | |||
i have no | X | X | X | X | X | X | X | X | X |
in relation to | X | X | X | X | X | X | X | X | |
seward secretary of | X | X | X | X | X | X | |||
that i have | X | X | X | X | X | X | X | X | X |
as follows to | X | X | X | X | X | ||||
dear sir yours | X | X | X | X | X | X | |||
sir yours of | X | X | X | X | X | X | X | ||
dear sir i | X | X | X | X | X | X | X | ||
ought to be | X | X | X | X | X | X | X | X | X |
of the is | X | X | X | X | X | X | X | X |
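The procedure behind the table can be sketched roughly as follows. This is one possible reading of the description above, not the code used for the post. The function names, the `docs_by_range` input (a mapping from year range to a list of document texts), and the tokenizer are assumptions. The tokenizer lowercases the text and treats every non-letter as a separator, which is a guess based on phrases such as "harper s ferry" and "washington d c" in the table. The final ranking and cut to 100 phrases is left out because the post doesn't say how it was done.

```python
import re
from collections import Counter


def trigrams(text):
    """Yield lowercase three-word phrases, treating any non-letter as a separator."""
    words = re.findall(r"[a-z]+", text.lower())
    for i in range(len(words) - 2):
        yield " ".join(words[i:i + 3])


def top_phrases_by_range(docs_by_range, top_n=20):
    """Map each year range to the set of its top_n most frequent trigrams."""
    top = {}
    for year_range, docs in docs_by_range.items():
        counts = Counter()
        for doc in docs:
            counts.update(trigrams(doc))
        top[year_range] = {phrase for phrase, _ in counts.most_common(top_n)}
    return top


def presence_table(top):
    """Return (phrase, [present in each range]) for every phrase in any top list."""
    ranges = list(top)
    phrases = sorted(set().union(*top.values()))
    return [(p, [p in top[r] for r in ranges]) for p in phrases]
```

Each `True` in a row returned by `presence_table` corresponds to an X in the table, with the columns in the same order as the keys of `docs_by_range`.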
Phrase word clouds
I tried visualizing the table above as word clouds, but in hindsight I don't think it was the best way to display the data. It did give me an excuse to play around with the D3 library, though.
As usual, the code’s up on GitHub.