Word Clouds and Visualizing the Archive

The digitization of the Lady’s Magazine (1770-1832) has opened up exciting new research methodologies that we use on our project to help identify trends and changes in the periodical over the course of its 62-year print run. One of these research tools is the word cloud, a means of representing data visually by feeding a large quantity of text into a program that analyzes word frequency. The resulting word cloud depicts differences in word or phrase frequency through differences in size, so one can readily see how terms are weighted relative to one another. This is useful when working with a database the size of the Lady’s Magazine because it enables us to see changes in, for example, the terminology used in titles over the magazine’s entire print run.

[Image: 1770 prose top 75]

For example, in 1770 the 75 most frequent words in the prose titles are terms descriptive of genres or types of writing: history, anecdote, treatise, account, biography, tale, letters, French, translation, etc. Also appearing frequently are the words ‘lady’, ‘lady’s’, and ‘female’.
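The counting that sits behind a word cloud can be sketched in a few lines of Python. This is only an illustration of the general technique, not the tool used on the project: the stopword list, the function names `word_frequencies` and `font_sizes`, the linear size scaling, and the sample titles are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real tool would use a much longer one.
STOPWORDS = {"a", "an", "and", "the", "of", "on", "to", "in", "for", "from"}

def word_frequencies(titles, top_n=75):
    """Count word frequency across titles, ignoring case and stopwords."""
    counts = Counter()
    for title in titles:
        # Keep an internal apostrophe so "lady's" stays a single token.
        words = re.findall(r"[a-z]+(?:['’][a-z]+)?", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

def font_sizes(frequencies, min_size=10, max_size=60):
    """Scale each word's count linearly to a font size (an assumed scheme)."""
    if not frequencies:
        return {}
    highest = max(count for _, count in frequencies)
    lowest = min(count for _, count in frequencies)
    span = highest - lowest or 1  # avoid dividing by zero when all counts match
    return {
        word: min_size + (count - lowest) * (max_size - min_size) / span
        for word, count in frequencies
    }

# Invented sample titles, for illustration only (not taken from the index).
sample_titles = [
    "The History of a Lady",
    "Anecdote of a French Lady",
    "An Account of a Tale",
]
freqs = word_frequencies(sample_titles)
sizes = font_sizes(freqs)
```

Here `top_n=75` mirrors the 75-word clouds discussed in this post; the most frequent word receives the largest size and the least frequent the smallest.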

In comparison, the most frequent terms in prose titles from 1815 reveal a shift in the magazine’s composition. With the exception of the ubiquitous anecdote, fewer genres appear, while individual names and titles (with an understandable emphasis on the French) are increasingly featured.

[Image: 1815 top 75]

Prominently featured are the terms death, Bonaparte, France, Paris, Duke, Nelson, king, general, Lord, Hamilton, Chesterfield, Cromwell, Sir, Wellington and theatre.

The content shifts that lie beneath the articles’ titles require, of course, careful analysis of the underlying contexts that such visualizations merely nod towards. So while the term ‘men’ appears with around the same frequency between 1770 and 1818, in 1778 the titles containing ‘men’ include ‘Verses on the Folly of Men’, while in 1817 readers were presented with ‘Maxims of Eminent Englishmen’. The same approximate frequency, but very different content indeed!

Because our index includes a series of keywords for each item in the magazine, we can compare word clouds of the keywords in one year to word clouds of the titles and discover substantial differences. For example, the keywords for 1770 look quite different from the article titles.
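Going from a frequency back to the titles behind it can be sketched as a simple lookup. This is a hedged illustration, not the project's method: the function name `titles_containing` and the `(year, title)` record layout are assumptions, and the only data used are the two titles quoted in this post.

```python
from collections import defaultdict

def titles_containing(term, records):
    """Group titles containing `term` by year (case-insensitive substring match)."""
    matches = defaultdict(list)
    needle = term.lower()
    for year, title in records:
        if needle in title.lower():
            matches[year].append(title)
    return dict(matches)

# The two titles quoted above; the (year, title) layout is hypothetical.
records = [
    (1778, "Verses on the Folly of Men"),
    (1817, "Maxims of Eminent Englishmen"),
]
by_year = titles_containing("men", records)
for year in sorted(by_year):
    print(year, by_year[year])
```

Substring matching is a deliberate choice here so that ‘Englishmen’ counts as containing ‘men’; it would equally match ‘women’, which is exactly the kind of context a frequency figure alone conceals.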

When working with material as sizeable in scope, quantity, and chronology as the Lady’s Magazine archive, similarly diverse research methodologies are required. The word cloud is one of the methods that digitization has made possible, and it raises new and important questions about the magazine’s content and how that content was presented to readers.

Dr Jenny DiPlacidi, University of Kent

4 thoughts on “Word Clouds and Visualizing the Archive”

  1. Jenny DIPlacidi

    And thanks as well Karen; I would be very interested in the Stationers’ Hall music data – I am curious about the links between song sheets in the magazine and the Robinsons’ connection with the Thompson family.

  2. Jenny DIPlacidi

    Thanks Bill! I’m actually doing some statistical analysis of the actual genre breakdowns for the first 20 years right now and some more word clouds for a paper I’m writing – I think the visual results are really helpful with an archive this size.

  3. Karen McAulay

    Very interesting. The same could usefully be done with Stationers’ Hall music data of the C18th -19th, albeit to slightly different ends. Would anyone else be interested in this?

  4. Bill Hughes

    This is fascinating! I’m just reading Franco Moretti’s Distant Reading, where he talks about similar kinds of analysis of the titles of novels.
