

This post contains a walkthrough for creating a simple critical edition in the Classical Text Editor (CTE).  For my example, I’ve chosen a text that I’m reading at the moment, Demosthenes’ Philippic I.  I’ve reproduced much of the information for the first three sections of Dilts’ edition [1].  His edition includes the Greek text, an apparatus criticus, and an apparatus of later sources.  All of the steps of this tutorial should be doable with the demo version of the CTE, except that the demo version produces only watermarked output.

You may go through the tutorial with your own text, or you may use the same portion of Demosthenes that I did.  If you choose to use Demosthenes, you may make use of the following:
– The main text (an RTF file). 
– The final PDF (from which you may take the variant readings). 

I’ve also uploaded my final CTE file.  I wouldn’t open it unless you get stuck.  It’s easy to get confused in the CTE if you have multiple files open.

If you discover any of the links have broken, please let me know at once!

I’m not a CTE expert by any means, and I can’t say that I’ll be able to answer your questions, particularly if they don’t relate to the steps below.  That said, feel free to leave comments, questions, corrections, or suggestions below.  I’ll do my best to answer.

If you’d be interested in seeing tutorials for other sorts of editions (like a text with parallel translation), do let me know.

Creating the file and preparing the main text

Upon launching the CTE, one is greeted with the following blank page:

[Screenshot]

The first step is to create a new file (File –> New).

At this point, one should decide the number of apparatus and notes sections needed.  By default, the CTE creates one critical apparatus and one “notes” section.  This is satisfactory for this particular text, but it’s conceivable one would want more (for instance, if you needed a critical apparatus, a source apparatus, and explanatory footnotes).  Be careful to note that when the CTE says “apparatus” it generally means “critical apparatus” (i.e. an apparatus of variant readings).  Though we might speak more loosely of an “apparatus of sources,” in CTE parlance this is a “Notes” section.  The dialog for specifying the number of Apparatus and Notes windows is found at Format –> Number of Notes/Apparatus…

[Screenshot]

[Screenshot]

At this point, we should also clearly tell the CTE that our apparatus is an apparatus of variant readings:
– Bring up the apparatus window (Ctrl+A or Windows –> Apparatus)
– Click on Format –> Apparatus Settings
 
[Screenshot]
[Screenshot]

At this point, we may enter the main text into our “Text Window.”  For this example, I copied in the text from Perseus and then changed it to match Dilts’ text.  Next, we may introduce the section divisions into the text.  The CTE provides a number of ways of doing this; by far the easiest is simple paragraph numbers.  To enable features like automatic referencing by section and line number, we need to tell the CTE where our sections begin.  To do this, one highlights the number (note: only highlight the number, not the following punctuation) and then clicks Insert –> Chapter Identifier:
[Screenshot]
[Screenshot]
After this, the number should now be highlighted yellow, which indicates that the CTE recognizes it as the start of a new section. Go ahead and repeat this step for each section.  

Creating the Sigla
At this point, you should have input the main body of the text and marked the sections.  We may now input the sigla into the CTE.  Note that for simple output this is not strictly necessary: you may simply type out your sigla manually when inputting textual variants.  Using the CTE’s facilities for sigla, however, offers several advantages.  For instance, it makes it very easy to change a siglum if you need to do so later.  It also makes it easy to modify the formatting automatically, for instance if one always wants bolded or italicized sigla in the apparatus.  Finally, the CTE also allows creating ms groups, a feature I won’t explore here.

Sigla are managed from the sigla window, which one accesses at Format –> Sigla…
 
[Screenshot]
[Screenshot]
The “New” button allows one to create a new siglum.  The CTE keeps track of mss by number.  We don’t have any ms groups, so for our purposes each siglum gets a unique numerical identifier (in the picture above, the “1”) and a visible identifier that will appear in the apparatus (in the picture above, S).  After inputting the two identifiers, make sure to click Apply.  Note that italics, bold, etc. are all permitted for sigla.  I start at 1 and simply increment the ms number by 1 as I add mss, but what matters most is that each siglum has a unique number.  After adding all of the sigla, they should be visible at the left side of the dialog.

[Screenshot]

Inputting variant readings
At this point, we have everything we need to begin inputting variant readings into the critical apparatus.  To do this, one highlights the affected text in the Text window and then selects References –> Apparatus Reference (or presses F5).  Note that if the variant affects only a single word, you can simply place the cursor after that word.  For one of our readings, for instance, one ms reads καὶ αὐτὸς instead of simply αὐτὸς.  To input this, we place our cursor after αὐτὸς:
[Screenshot]
Clicking References –> Apparatus Reference (or pressing F5) brings up the apparatus window:
[Screenshot]
In the scheme I’ve chosen, we then input the reading of the other ms(s).
[Screenshot]
We may then use the Sigla dialog to add the sigla to our apparatus entry, specifying that this reading belongs to A.  To do this, click Insert –> Siglum…

[Screenshot]

A box then comes up with a list of sigla:
[Screenshot]
Double-clicking the proper siglum then adds it to the entry (it should appear in yellow).
[Screenshot]

If one has multiple variants at a single location, the default convention is to separate the entries with a colon (see, for instance, the variant for τοι εἰ above).  If a reading affects multiple words, make sure to highlight the entire phrase.  For instance, A transposes the phrase ὑμεῖς ἐπράξατε τῆς πόλεως into τῆς πόλεως ὑμεῖς ἐπράξατε.  To input this variant, we highlight the phrase and then type out the entire phrase in our entry as it appears in A:
[Screenshot]
[Screenshot]

Adding Notes
Adding notes works much the same way as adding variant readings.  However, you first need to decide what sort of referencing system you wish to use for the notes.  The default choice uses line numbers (the same system used for variant readings).  Another popular system is footnotes.  As far as I’m aware, you can’t switch between the two systems on the fly.  If you change your mind later, you’ll need to go back through the text and re-add the references.

To choose your setting, open the Notes window (Windows –> Notes, or press Ctrl+N).  Then select, Format –> Notes Settings.
 
[Screenshot]

[Screenshot]
“Text Reference” is the default, which uses line numbers (either section line numbers or page line numbers).  “Footnote numbers” will instead introduce running footnotes into the text (like this file).

Once you’ve decided which system to use, you may input notes almost exactly like critical apparatus entries.  Select the relevant location in the main text (either a range or a single point), and then click References –> Notes Entry, or use Shift+F5.  This will bring up the Notes window and allow you to input the relevant information.  Repeat for as many notes as you have.
[Screenshot]

Varia
At this point, you should have your notes and apparatus entries added to the text.  You’re now able to run print-preview (File –> Print Preview, or Ctrl+J) and see your text (perhaps heavily watermarked if you’re using the demo version).  The main work is now done, and all that remains are formatting tweaks.  

Line Numbers
By default, the CTE prints line numbers for both the sections and the page.  I find two sets of line numbers confusing and displeasing to the eye.  To change this, go to Format –> Document and then to the Margins tab.
 
[Screenshot]
[Screenshot]
As you can see, the default puts “Page line numbers” in the inner margin and “Chapter line numbers” in the outer margin.  I’d recommend choosing one and setting the other to “None.”  Note that you can also use this dialog to change the frequency of line numbers.

Changing the separator in Chapter+Line references
By default, the CTE uses a comma as its separator in references.  Thus, if a variant shows up on the 3rd line of the 2nd section, this appears in the apparatus as 2, 3 (as I understand it, this is the typical European practice).  Americans much prefer a period here, so that it’s printed as 2.3.  To change this, go to Format –> Document and then navigate to the Notes tab.

[Screenshot]
Changing the character in the field “Between Chapter and Line” from a comma to a period will produce 2.3 instead of 2, 3.  

 

That’s all for now. As mentioned above, if you have any suggestions, comments, or questions, please let me know in the comments!

ἐν αὐτῷ,
ΜΑΘΠ 

[1] Dilts, M.R., Demosthenis Orationes, vol. 1 (Oxford 2002)


As a followup to my recent Varia post, I’d like to explain two programs that I used recently in my Textual Criticism class: Juxta and the CTE.  To do so, I’ll run through how the final product came together from start to finish.  Our goals were traditional: we wanted to use Lachmannian methods to create a stemma and establish the archetypal text, to the degree possible.

The first part of preparing an edition, of course, is to choose a text, and then to acquire images of as many of the manuscripts as possible.  This requires reading through any prior literature about the text, but also includes combing through manuscript catalogs to determine which, if any, mss contain your text.  Digital catalogs are thankfully making this process much easier (see, e.g., the marvelously helpful website Pinakes: http://pinakes.irht.cnrs.fr/).  This task is still a chore, though.  Thankfully my professor, Dr. Mantello, had already done this work for us.  He had both selected the text (a sermon of Bishop Robert Grosseteste on clerical orders) and obtained PDF copies of all of the relevant mss.  The mss came to 13 in total.  One ms’s text was partial, and another two were partially illegible, whether from poor imaging or from fire damage.  There were six students in the class, so Prof. Mantello split the sermon into 3 sections.  Each pair was responsible for a third of the text (my section came out to about 1400 words).

The next order of business was to prepare collations: that is, to determine where the mss varied from one another.  This is where I found Juxta helpful.  Juxta allows one to compare two or more transcriptions of a given text very easily.  Unfortunately, perhaps, this requires full-text transcriptions of each ms.  That can take a lot of time, especially with 13 mss.  Some texts, of course, have dozens or even hundreds of manuscripts, and most texts will be much longer than the small 1400-word section of our sermon.  That said, preparing accurate transcriptions of 13 mss took me only 2-3 months, and I was also working on plenty of other things in the meantime.  For those with longer texts, transcribing a smaller chunk (say about 1,500 words) from one part of the text will generally allow one to identify the most important mss without having to transcribe every single ms in toto.

Now, regarding transcriptions: in an ideal world, one would have at least two people making transcriptions of the same ms.  This allows one to compare the two transcriptions at the end to highlight trouble spots and to eliminate typos and other errors.  As my teammate chose to do a manual collation, this option wasn’t available, so I made do in other ways (her manual collations were invaluable later in the process, however).  Once I had transcriptions of two different mss, I normalized the orthography [1] and then compared these two transcriptions to one another.  At each difference, I checked the mss to ensure that my transcriptions were correct.  At the end of this process, I had two fairly accurate transcriptions, which I then used to correct the rest of my transcriptions as I finished them.  This is by far the most tedious part.  Even after I had ferreted out most of the problems in my initial pass, I still found myself returning to the mss to check particular readings (and often found that the transcriptions still contained errors).  Unfortunately, I also took the longer approach of typing each new transcription from scratch.  It occurred to me later, through reading a paper of Tara Andrews, that it’s much faster to modify an existing transcription to fit a new ms than to start from scratch.  In any case, accurate transcriptions are a necessity for any further work.  This stage, though often tedious and monotonous, is extremely important.  Juxta (or another comparison tool) is quite useful even at this stage, since highlighting the differences between transcriptions will often expose errors in your transcriptions.
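For the curious, the normalize-then-compare step can be sketched in a few lines of Python.  This is only an illustration of the idea, not the tooling I actually used; the sample clauses and the u/v, i/j levelling table are hypothetical:

```python
import difflib

def normalize(text):
    """A minimal orthographic normalization for medieval Latin:
    level u/v and j/i, fold case, and strip punctuation."""
    trans = str.maketrans("vj", "ui")
    words = text.lower().translate(trans).split()
    return [w.strip(".,;:!?") for w in words]

def differences(ms_a, ms_b):
    """Return (position, reading_a, reading_b) wherever the normalized
    transcriptions disagree -- exactly the spots to recheck in the mss."""
    a, b = normalize(ms_a), normalize(ms_b)
    diffs = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            diffs.append((i1, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs

# Two hypothetical transcriptions of the same clause:
ms_k = "In ecclesia Dei ordines clericorum collocantur."
ms_n = "In ecclesiam Dei ordines collocantur."
for pos, ra, rb in differences(ms_k, ms_n):
    print(pos, repr(ra), "vs", repr(rb))
```

Running this on two real transcriptions prints the places worth rechecking against the images; Juxta does the same job with a much nicer interface.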

After transcribing, one can then proceed to examining the differences between mss.  Juxta is helpful here.  Here’s a screenshot:

[Screenshot]

Right now, I’m using ms K as my “base text.”  Areas with darker highlighting indicate that a larger number of mss have a variant reading at that point.  In this case, there’s an important omission shared by 8 mss at the beginning of our section (running from collocantur existentes … ecclesiastice hierarchie).  Clicking on the dark text will show what the variant mss read:

[Screenshot]

Unfortunately, Juxta is not smart enough to group similar readings together.  In this case, N O R Rf all have exactly the same omission.  R6 has the omission too, but inserts an et to try to make the resulting text make more sense.  Ideally, Juxta would group all of these readings together (perhaps it will in the future, or perhaps I’ll create my own version that does: it’s free and open-source, after all!).  It still, however, provides a useful overview of the tradition at any given point.  Here’s a less complicated example:
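The grouping that Juxta lacks is not hard to sketch.  Given each witness’s reading at a single variation point, one can bucket the witnesses that agree exactly; the buckets are essentially a first draft of an apparatus entry.  (The readings below are a simplified, hypothetical version of the omission discussed above.)

```python
from collections import defaultdict

# Hypothetical readings at one variation point, keyed by siglum.
readings = {
    "K":  "collocantur existentes ecclesiastice hierarchie",
    "N":  "",        # the shared omission
    "O":  "",
    "R":  "",
    "Rf": "",
    "R6": "et",      # the omission, patched with "et"
}

def group_by_reading(readings):
    """Bucket the witnesses that share the exact same reading."""
    groups = defaultdict(list)
    for siglum, reading in readings.items():
        groups[reading].append(siglum)
    return dict(groups)

# Print a first-draft apparatus entry for this variation point.
for reading, sigla in group_by_reading(readings).items():
    print(reading if reading else "om.", " ".join(sigla))
```

Of course, a real tool would also need some fuzzy matching to recognize that R6’s reading is the same omission plus a patch; that is precisely the editorial judgment Juxta currently leaves to the reader.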

[Screenshot]

This shows that 4 mss have the text in ecclesia or in ecclesiam.  As these four mss have a number of other shared readings that are unique to them, it’s clear that they belong to a family.  After further analysis, it becomes clear that this is an addition that doesn’t belong in the archetypal text.  If you’d like a file to test with, I’ve uploaded a test file with a selection of manuscripts.

Using Juxta, I was able to work out a provisional stemma of the 13 mss.  Traditional Lachmannian methods worked pretty well.  There were a number of omissions and other agreements in error that allowed us to group the mss into families and then into a stemma.  Furthermore, our examination of the internal evidence (the text itself) corresponded fairly well with the relationships that Thomson [2] had posited on external criteria (like dates, and the number and order of the sermons contained in the mss).  My initial stemma required some reworking, both because of errors in my transcriptions (which my partner thankfully discovered) and because the place of one ms wasn’t clear when looking only at our section.  Incorporating data from the other sections allowed us to place that ms with more confidence.

The final step was to incorporate all of this information into a critical edition, replete with critical apparatus and source apparatus.  The information for the apparatus of sources was more straightforward.  Prof. Mantello had helped us track down the important sources.  Creating the critical apparatus naturally required us to decide what the original text was.  The stemma made this straightforward in most cases.  In a few cases, the better attested reading was less satisfactory on internal grounds.  In a few places, I chose a poorly attested reading, or even ventured a few emendations (though for most of them, I failed to convince Prof. Mantello).  When examining trouble spots, the electronic Grosseteste was immensely helpful.  It allowed me to check a particular construction across a wide swathe of Grosseteste’s corpus.  

I used the Classical Text Editor (CTE) to assemble my final product.  The CTE is quite a powerful tool.  It can create a wide variety of critical editions.  Ours was a fairly simple text + notes + apparatus, but one can also add further apparatuses, or even parallel texts and translations.  There are a few downsides.  First, the program is quite expensive (to the tune of several hundred USD, though there is a free trial that is fully functional except for the ability to generate non-watermarked output).  Second, the program is difficult to use if you don’t have someone to show you the basics.  I have a computer science degree, and found myself frequently frustrated at first.  That said, the basics aren’t difficult once you’ve been shown how the program works.  I gave a presentation for my classmates, and everyone decided to use it for their text.  Only one other student in the class had a technical background, but everyone was able to use the program to assemble their text.

And I must say, the output is pretty sharp.  The only other means I know to create something comparable is LaTeX, and that requires quite a bit more technical knowledge than the CTE does.  (It was LaTeX, for instance, that I used to create my text and translation of Origen’s 3rd homily on Ps. 76.)  As an example of CTE output, here’s the first page of our final text: InLibroNumerorum_mapoulos_excerpt.pdf.  If anyone knows of CTE tutorials (besides the help files), I’d love to know about them.  Sometime soon I’ll post some basic walkthroughs that I created for my classmates.

I should say that there are a number of useful tools that I’ve not mentioned here.  Our final goal for this project remained a printed text.  Things look different if web publication is in view (the CTE does support TEI output, but I’ve not tested it to see how it works).  Also, there’s much work being done in the field of digital stemmatology.  Tools like stemmaweb allow one to use a number of different algorithms to create a stemma digitally.  Variant graphs, for instance, look like a useful way to examine the tradition.  I don’t read Armenian, but I’m very impressed by the technical aspects of Tara Andrews’s digital edition of Matthew of Edessa.  Her academia.edu page is well worth a look if you’re interested in digital editions.

Do apprise me in the comments of anything important I’ve omitted, particularly if you have advice on better ways to approach the task.

ἐν αὐτῷ,
ΜΑΘΠ 

[1] Normalizing the orthography is an important step as orthographic variants usually aren’t important for distinguishing the relationships between mss.  I kept my original transcriptions, which followed the orthography of the mss, but did most of my analysis on the basis of the normalized files.  
[2] Thomson, H., The Writings of Robert Grosseteste (Cambridge 1940)

I just made use for the first time of an excellent modern lexicon that is coming out of Spain.  The Diccionario Griego-Español is, according to those wiser than me, one of the best lexical tools available for Ancient Greek.  It’s incomplete (and thus still in progress: α–ἔξαυος is online), but from the bit of time I’ve spent with it, it does look to be an excellent tool.  The lexicon is Greek to Spanish, which does decrease its usefulness somewhat for native English speakers like myself.  However, many Greek students have had Latin at some point, and Spanish is probably the closest of the Romance languages to Latin.  Or you may have had both Latin and Spanish, like me.  Even though my Spanish is rusty, I’ve still been able to make use of it!

Today I was curious about the adjective βελτίων.  It’s a comparative of ἀγαθός and means “better.”  However, Greek seems to use comparative adjectives more freely than English, and in a wider sense.  In this passage from Gregory, on which I’ve been working for quite some time, it seems to mean “the good,” in an almost Platonic sense.  So I wanted to see if we had any record of βελτίων being used as a noun in a generic sense like that.  The LSJ was not of much help, but the DGE was.

The entry is divided into three parts: de pers. (for people), de cosas y abstr. (for things and abstractions), and adverbs.  Under the second heading, we have (inter alia) “subst. lo que es mejor” (used substantivally, “that which is better”), followed by a citation of Plato’s Alcibiades: τί καλεῖς τὸ ἐν τῷ κιθαρίζειν βέλτιον; Pl.Alc.1.108b.

Thus, while not ensuring that my interpretation of Gregory is the only right one, with the help of the DGE I have confirmed that βέλτιον can be used as an abstract noun.  Hearing Plato in the background was not off-base either: the example comes from the great Philosopher himself!

My thanks to the team at the DGE, not only for creating such a useful tool, but also for placing it online, freely available to all!

ἐν αὐτῷ
ΜΑΘΠ 

As part of a project I hope to publish (regarding stylometrics and Origen), I’ve been transcribing more of his homilies on the Psalms. I don’t have the time to translate them, or really even to edit the Greek text properly at the moment, but I figure that even my transcriptions may be useful to someone. I’ve created an Origen page here, where you may find my transcriptions of (currently) two homilies on Psalm 36, in addition to the Greek text and translation of his third homily on Psalm 76.

Transcribing a text is a laborious task, and one bound to introduce errors into one’s copy. I’ve read over most of the transcribed material, but even still I’m sure more errors are present. If you find any, leave a comment or send me an e-mail.

ἐν αὐτῷ,

ΜΑΘΠ

I haven’t often mentioned my interest in things digital on this blog, but earlier this year, I was fortunate to attend a workshop in Belgium entitled “Means and Methods for the Digital Analysis of Ancient and Medieval Texts and Manuscripts.”  I got to hear a variety of interesting papers and debates, all while enjoying terrific hospitality.  One of the happy consequences of this visit was that I met several people working on “digital humanities”[1] type projects.  One of my great interests as a budding text-critic is digital stemmatology.  The question essentially boils down to: how can we use digital and statistical methods to help us map the history of a text’s transmission?  Ideally, the end result is a stemma, or family tree, detailing the copying history of the extant manuscripts.  This is helpful either for traditional philology (establishing the archetypal text) or for those interested in reception history.  Tara Andrews, whom I was fortunate to meet in Leuven, recently wrote a blog post which captures the history and status quaestionis quite well, here.  All of this makes me wish I were in Hamburg this week at the Digital Humanities 2012 conference.  There are a number of interesting abstracts listed here.

As a Computer Science undergraduate turned (soon-to-be) Greek and Latin graduate student, I’m naturally very interested in how computers can help us study ancient texts.  Two areas, in particular, hold my interest right now: digital stemmatology and digital stylometry.

Stemmatology I mentioned earlier: I’m attempting to apply these sorts of methods to the Palaea Historica, a 10th-century Byzantine Greek retelling of the Old Testament.  One of my professors at NC State is working on a critical edition, and so I hope to put these stemmatological methods to good use.  Time will tell if I’m successful, but I’ll be presenting a paper in November, so I’ll definitely have something to say then!

Digital stylometry is a more recent interest of mine.  The most common application is authorship attribution: can we somehow quantify style and then use that measure to compare different texts?  If the methods develop well enough, this might, for instance, help us sort out anonymous catenae fragments, or anonymous homilies like the ones in the recently discovered CMB 314 (which, at least currently, we’re pretty sure belong to Origen).
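To make “quantify style” a bit more concrete, here is a toy sketch of the classic function-word approach: represent each text as a vector of relative frequencies of common, content-neutral words, and compare the vectors with cosine similarity.  Everything here is illustrative (the word list is transliterated and the “texts” invented); real stylometry uses far larger feature sets and proper statistics.

```python
import math
from collections import Counter

# Content-neutral "function words" (transliterated, purely for illustration).
FUNCTION_WORDS = ["kai", "de", "gar", "men", "oun"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = text.lower().split()
    counts = Counter(w for w in words if w in FUNCTION_WORDS)
    total = len(words)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(p, q):
    """Cosine similarity between two frequency profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

# Two invented token streams standing in for text samples.
text_a = "kai de gar men oun kai kai de logos ergon"
text_b = "de gar de gar men theos logos en arche kai"
print(round(cosine(profile(text_a), profile(text_b)), 3))
```

Shared authorship would be suggested (weakly!) by profiles with similarity near 1; in practice one compares many samples against known corpora rather than trusting a single number.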

[1]  I still find this phrase frustratingly vague (I’m interested in a narrower type of research), but I employ it nevertheless.

ἐν αὐτῷ,

ΜΑΘΠ