Here are some ideas I had for my project this semester.
- One idea would be to try to develop a way of visualizing the establishment of clichés over time, especially ones that originate in quotations from literary texts. It would be possible to track the histories of relatively short clichés using the Google Ngrams data set, although working with a data set of that size would require Big Data-level computing. I could also do this on a smaller scale (and with a lot more flexibility) using the just-released EEBO-TCP corpus, which includes manually transcribed versions of over 25,000 early modern English books.
- I might try to do something with computerized outlining tools. The work that I’ve done so far is way on the complicated side, so in the spirit of this class it might be useful to try to come up with a minimum viable product. In an ordinary outline, one line might be indented beneath another for any number of reasons: it might expand on an idea, provide an example, give a possible counterargument, etc. If the outline includes symbols that make these relationships explicit, a computer can manipulate its structure. That could be used, for instance, to play around interactively with different possible structures for a paper.
- I’ve been toying with the idea of developing a programming environment specifically designed for working with texts. There was an attempt to create a programming language for humanists way back in 1970, but nothing this century as far as I know. We have mostly picked up general-purpose languages like Python. But some of the basic operations that we have to do in manipulating texts (stripping tags, parsing document structures, tokenizing) can be awkward in these languages, and it can be difficult for the user to tell whether these operations are working correctly on a particular body of text. It would be much easier to work in an environment with immediate feedback. Imagine having your code on one side of the screen and a visualization of a text on the other, with annotations that indicate how the text is being chopped up, and that change immediately when you change the code. This project would consist of a desktop application along with either an interpreter for a new programming language or a library for an existing one that includes functions for the interactive manipulation of texts.
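
To make the last idea a bit more concrete, here is a minimal sketch in Python of the kind of annotation data such an environment might compute behind the scenes. It is only an illustration under my own assumptions: the names `Span`, `TOKEN_RE`, `annotate`, and `render` are hypothetical, the regular expression is a deliberately crude stand-in for real tag stripping and tokenizing, and the bracketed plain-text output stands in for an actual visualization. The point is that each token keeps its character offsets, so a display could highlight exactly how the text is being chopped up and recompute the highlights whenever the pattern changes.

```python
import re
from typing import List, NamedTuple


class Span(NamedTuple):
    """One annotated region of the source text (hypothetical structure)."""
    start: int   # offset of the first character
    end: int     # offset just past the last character
    kind: str    # "tag" or "word"
    text: str    # the matched substring


# Assumed, simplistic pattern: either an angle-bracket tag, or a run of
# word characters that may contain internal apostrophes (e.g. "don't").
TOKEN_RE = re.compile(r"(?P<tag><[^>]+>)|(?P<word>\w+(?:['’]\w+)*)")


def annotate(text: str) -> List[Span]:
    """Return every tag and word in `text` together with its offsets."""
    spans = []
    for match in TOKEN_RE.finditer(text):
        # lastgroup is the name of the alternative that matched.
        spans.append(Span(match.start(), match.end(), match.lastgroup, match.group()))
    return spans


def render(spans: List[Span]) -> str:
    """Plain-text stand-in for the visualization: bracket words, drop tags."""
    return " ".join(f"[{s.text}]" for s in spans if s.kind == "word")


if __name__ == "__main__":
    sample = "<l>To be, or not to be</l>"
    print(render(annotate(sample)))  # [To] [be] [or] [not] [to] [be]
```

In an interactive environment, editing `TOKEN_RE` would immediately change the spans, and the offsets in each `Span` are what would let the display mark up the original text in place rather than showing a separate list of tokens.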

