This week in Digital History we looked at textual analysis tools. I decided to take a deeper dive into Voyant Tools, a web-based suite for reading and analyzing text. Voyant Tools lets you upload texts such as webpages and documents for analysis. The available tools include word frequency clouds, word counts, and correlation graphs. As a beginner with these tools, I found the default display of analysis plenty to keep me busy. Those more comfortable with the application can use its more sophisticated tools and analysis.
To get a better handle on Voyant Tools, I took a look at one of its pre-curated collections of texts: Jane Austen’s novels. As one of the world’s biggest Pride and Prejudice fans, I was excited to see what I could learn about Austen from textual analysis of her work. The easiest information to understand was the word cloud of Austen’s most frequently used words. Readers will not be surprised to learn that these include “Mr.” and “Miss.” Just think of any dialogue between characters: it most likely includes those polite honorifics. The statistic I found most interesting was the list of the most distinctive words in each of Austen’s works. Surprisingly, Austen misspelled “friendship” as “freindship” 40 times in Love and Freindship, making it one of the most distinctive words in that work!
I would need a lot more practice with Voyant Tools to use it to its full potential. Some of the tools and analysis are simply unfamiliar to me, so it’s difficult for me to grasp how they might be useful in my own work. For example, I’m not sure what the “relative frequencies” statistic measures, and the numeric values associated with it mean little to me. To use Voyant Tools effectively, I’d have to become more familiar with the tools overall and with what their measurements actually mean.
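My working guess, which I have not checked against Voyant's documentation, is that a relative frequency is a word's count divided by the total number of words in the text, scaled to a rate per some fixed number of words so that texts of different lengths can be compared. Here is a minimal Python sketch of that idea. The function name, the simple tokenizing rule, and the per-million scale are my own assumptions, not Voyant's actual method.

```python
import re


def relative_frequency(text, term, per=1_000_000):
    """Return how often `term` occurs per `per` words of `text`.

    Assumption: words are runs of letters and apostrophes, compared
    case-insensitively. Voyant may tokenize and scale differently.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words) * per


# Made-up sample sentence, for illustration only.
sample = "Mr. Darcy bowed. Miss Bennet smiled at Mr. Bingley."
print(relative_frequency(sample, "mr"))
```

The point of the scaling is that a raw count of 2 means something very different in a 9-word sentence than in a 100,000-word novel, while a rate can be compared across both.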
As far as the field of digital history goes, I can see Voyant Tools having its uses. One is analyzing large amounts of text for word frequency. For example, loading a large number of newspaper articles and seeing how often the words “fire” or “conflagration” appear could give a historian an idea of how common fires were in a certain time period, or at least how often they were reported. As a method of research, it’s a beginning, not an end. These tools can give you a better handle on your data, but further analysis is definitely needed to make historical claims.
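To make the newspaper idea concrete, here is a rough Python sketch of the same kind of count done outside Voyant. Everything in it is hypothetical: the three "articles" are invented sample text, and the function name and the (year, text) layout are just one way I might organize the data.

```python
import re
from collections import Counter

# Hypothetical sample data: (year, article text) pairs invented for illustration.
articles = [
    (1871, "A great fire swept the city. The conflagration burned for two days."),
    (1871, "Council meets to discuss the new bridge."),
    (1872, "Fire destroys a warehouse on Front Street."),
]

TERMS = {"fire", "conflagration"}


def term_counts_by_year(articles, terms):
    """Count exact, case-insensitive matches of each term, grouped by year."""
    counts = {}
    for year, text in articles:
        words = re.findall(r"[a-z]+", text.lower())
        year_counts = counts.setdefault(year, Counter())
        for word in words:
            if word in terms:
                year_counts[word] += 1
    return counts


counts = term_counts_by_year(articles, TERMS)
for year in sorted(counts):
    print(year, dict(counts[year]))
```

This only matches the exact words, so “fires” would be missed unless it were added to the list of terms, and a count like this still measures reporting rather than the fires themselves.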