Text Analysis… So What?

Tim Hitchcock’s “Big Data for Dead People” elucidates a problem I encountered when I was first introduced to text analysis and continued to grapple with while working on my Clio 1 project. Hitchcock notes that “distant reading seems to tell us what we already know.” For my Clio 1 project, I used Voyant to analyze twentieth-century Supreme Court cases and the newspaper articles that reported on them. When I first examined the results from Voyant, I couldn’t help but think, “This is neat! But so what?”

Hitchcock demonstrates how text analysis can be used for historical research, taking the case of Sarah Durrant as an example. He began with the record of her trial, drawing on the trial transcript, records from her imprisonment, and the newspaper report of her case. He used an ngram viewer to compare Sarah’s words to those of women of the same age and social class, to see if her linguistic patterns matched. Since the records included Sarah’s address, Hitchcock was able to use GIS to examine her neighborhood and its inhabitants, including neighbors who lived in her building and along the same road. Sarah’s trial was one of the first in which a detective gave evidence, which set her case apart. Hitchcock also compared Sarah’s experience with that of other defendants using the Old Bailey Online, and found that her plea coincided with the rise of plea bargaining. Using both close and distant reading, Hitchcock was able to contextualize Sarah’s experience and demonstrate the importance of the information she left behind.

While all this analysis feeds into Hitchcock’s larger point about using the new technologies of digital humanities to examine source bases that we do not already know (as opposed to sources we do know, such as books and print records), his article stood out for me because he did more than just textual analysis. Yes, it’s great that text mining and topic modeling software can spit out word frequencies and a series of topics. But so what? Hitchcock’s piece more than adequately answered that question. Using digital methods, he was able to examine the data Sarah left behind and craft a larger story of the justice system in nineteenth-century England.

This “So what?” question goes back to the first blog post I wrote for this minor field, where I championed findings over methodology. I’m not going to reiterate that argument, as I’ve learned over the past several weeks how important it is for DH to be methodologically focused. But with text analysis, it is important for me, not DH as a field, to remember that I need to use these tools to prove a point and to shape an argument. Hitchcock demonstrated how I can use textual analysis to contextualize data, and how that contextualization can lead to larger findings. Contextualization is key, because the method on its own will generally tell users what they already know. Text analysis is incredibly useful for many reasons, but I have to remember to look past the words and numbers, take that next step, and look at the “bigger picture.”

