Consider the Source
                        Textual criticism and digital techniques




Wednesday, March 18, 2009
How do we know what happened?
People aren’t machines



• the scriptorium is cold
• the food is bad
• the tea ladies are unfriendly
• they just don’t want to be there anymore


• Modern historical approaches are a recent thing
• Not unheard of before, but not standard
• This profoundly affects the way in which histories have reached us




600 years later

• a bunch of error-ridden copies
• or only a few error-ridden copies
• or only a name check in a book about something else




Textual criticism
Apparatus example
                             St. Stephenʼs Church in Nijmegen
                             Nobilis itaque comes Otto imperio et dominio Novimagensi sibi, ut praefer-
                             tur, impignoratis et commissis proinde praeesse cupiens, anno liiii superius 1254
                             descripto, mense Iunio, una cum iudice, scabinis ceterisque civibus civitatis
                             Novimagensis, pro ipsius et inhabitantium in ea necessitate, commodo et utili-
                             5 tate, ut ecclesia eius parochialis extra civitatem sita destrueretur et infra muros
                             transferretur ac de novo construeretur, a reverendo patre domino Conrado de
                             Hofsteden, archiepiscopo Coloniensi, licentiam, et a venerabilibus dominis de-
                             cano et capitulo sanctorum Apostolorum Coloniensi, ipsius ecclesiae ab antiquo
                             veris et pacificis patronis, consensum, citra tamen praeiudicium, damnum aut
                             10 gravamen iurium et bonorum eorundem, impetravit.
                             Et exinde liberum locum eiusdem civitatis qui dicitur Hundisbrug, de prae-
                             libati Wilhelmi Romanorum regis, ipsius fundi domini, consensu, ad aedifican-
                             dum et consecrandum ecclesiam et coemeterium, eisdem decano et capitulo de
                             expresso eiusdem civitatis assensu libera contradiderunt voluntate, obligantes
                             15 se ipsi comes et civitas dictis decano et capitulo, quod in recompensationem
                             illius areae infra castrum et portam, quae fuit dos ecclesiae, in qua plebanus
                             habitare solebat—quae tunc per novum fossatum civitatis est destructa—aliam
                             aream competentem et ecclesiae novae, ut praefertur, aedificandae satis conti-
                             guam, ipsi plebano darent et assignarent. Et desuper apud dictam ecclesiam
                             20 sanctorum Apostolorum est littera sigillis ipsorum Ottonis comitis et civitatis
                             Novimagensis sigillata.
                             3 p. 227 R 4 p. 97 N 6 p. 129 D 12 f. 72v M 13 p. 228 R 20 p. 130 D
                             2 proinde ] primum D 5 ecclesia eius ] ecclesia D: eius eius H extra civitatem om. H
                             infra ] intra D 6 transferretur ] transferreretur NH 7 Hofsteden ] Hostede D: Hosteden
                             H Coloniensi ] Colononiensi H dominis ] viris H 8 Coloniensi ] Coloniae H 10 iurium ]
                             virium D 11 liberum ] librum H qui ] quae D Hundisbrug ] Hundisburch D: Hunsdisbrug
                             R 12 regis ] imperatoris D 13 et consecrandum om. H eisdem ] eiusdem D 15 comes ]
                             comites D dictis om. H 17 tunc ] nunc H 18 ut. . . aedificandae om. H 18–19 contiguam ]
                             contiguum M 19 apud om. H 20 est ] et H littera ] litteram H 21 Novimagensis ]
                             Novimagii D sigillata ] sigillis communita H




Who needs this and why?

• Historians look for one thing
• Linguists look for other things
• Others will be interested too


The Chronicle of Matthew of Edessa
How to make an edition
Surviving manuscripts

• Oldest full manuscript is Venice 887
• Next oldest is Vienna 574
• 24 of 42 (4 of 6 fragments) copied before 1700




Extant manuscripts of the Chronicle
[Bar chart by century copied (pre 16th, 16th, 17th, 18th, 19th); series: Manuscripts, Fragments; vertical axis 0 to 28]




Two manuscript groups

• Group 1: like Venice 887
  • Text generally complete (to 1162)
  • Transmitted with the Life of St. Nerses (Mesrop the Priest)
• Group 2: like Vienna 574
  • Text truncated near the year 1096/7
  • Transmitted with a specific long sequence of texts




Matenadaran 1896

• Copied in 1689
• Uniquely preserves two passages of text
• These lacunae known to other copyists
• But lots of manuscripts are older. Hm.

Making the edition

• Transcription
• Collation & text analysis
• Editing the text
• Publication

Digital techniques

Transcription

• The most time-consuming part
• Ideal solution would be optical character recognition (OCR)
• No OCR for manuscripts, yet


Start with a manuscript
Into plain text
       դ. Թուխտ սիրոյ և միաբանութեան, շարագրեցեալ կղէմէս աստուածաբան վարդապետէ։
       Առաջաբանութիւն։

       Նա զի արդ գրիչս իմ անյարմարս կարօղ լինիցի երբէկ պատմագրիլ, ըստ պատշաճի
       զմեծամեծ յիշելիսն։ Որք ի վաղ ժամանակի անտի պատահեցան յեկեղեցին հայոց. և զի
       արդ մեք անհընագէտս յանձնառնցուք ճառել զանցեալ ծածկագոյն խորհուրդս այլասեռ
       ազգի, մինչև ցայժմ ո՜չ կարաց Ֆրանկ պատմիչ ոք զբուռն հարել ի սոյնպիսի օտար
       պատմագրութիւնս։ Բայց սակայն յուսացեալ ի յօգնութիւն սրբոյ աստուածածնին
       յօժարապէտս ախորժեսցուք համարձակիլ և ի յայս անհոռն ծովս մտանել...




XML solution: TEI
<!-- ... -->
<div n="4">
  <head><hi rend="red">Թուխտ սիրոյ և միաբանութե<ex>ան</ex>, շարագրեց<lb/>
    եալ կղէմէս ա<ex>ստուա</ex>ծաբան վարդապետէ։<lb/>
    Առաջաբանութիւն։ </hi>
  </head>
  <p><hi rend="ornament">Ն</hi><hi rend="red">ա զի արդ գրիչս իմ անյարմարս</hi> <lb/>
    կարօղ լինիցի երբէկ պատմագրիլ, <expan>ըստ</expan> <lb/>
    պատշաճի զմեծամեծ յիշելիսն։ Որք <lb/>
    ի վաղ ժամանակի անտի պատահեցան <lb/>
    յեկեղեցին հայոց. և զի արդ մեք անհընագէտս <lb/>
    յանձնառնցուք ճառել զանցեալ ծածկագոյն <lb/>
    խորհուրդս այլասեռ ազգի, մինչև ցայժմ ո՜չ <lb/>
    կարաց Ֆրանկ պատմիչ ոք զբուռն հարել <lb/>
    ի սոյնպիսի օտար պատմագրութի<ex>ւն</ex>ս։ Բայց սա<lb/>
    կայն յուսացեալ ի յօգնութի<ex>ւն</ex> սրբոյ ա<ex>ստուա</ex>ծածնին <lb/>
    յօժարապէտս ախորժեսցուք համարձակիլ և ի <lb/>
    յայս անհոռն ծովս մտանել։ ... <lb/>
  </p>
</div>
<!-- ... -->




Better TEI XML text
<!-- ... -->
<div n="4">
  <head rend="red"><w>Թուխտ</w> <w>սիրոյ</w> <w>և</w> <w>միաբանու<ex>թեան</ex>,</w> <w>շարագրեց<lb/>եալ</w> <w>կղէմէս</w>
    <w><ex>աստուած</ex>աբան</w> <w>վարդապետէ։</w><lb/>
    <w>Առաջաբանութիւն։</w>
  </head>

  <p><w><hi rend="ornament">Ն</hi><hi rend="red">ա</hi></w> <hi rend="red"><w>զի</w> <w>արդ</w> <w>գրիչս</w> <w>իմ</w>
    <w>անյարմարս</w></hi> <lb/>
    <w>կարօղ</w> <w>լինիցի</w> <w>երբէկ</w> <w>պատմագրիլ<c type="punct">,</c></w> <w><expan>ըստ</expan></w> <lb/>
    <w>պատշաճի</w> <w>զմեծամեծ</w> <w>յիշելիսն<c type="punct">։</c></w> <w>Որք</w> <lb/>
    <w>ի</w> <w>վաղ</w> <w>ժամանակի</w> <w>անտի</w> <w>պատահեցան</w> <lb/>
    <w>յեկեղեցին</w> <w>հայոց<c type="punct">.</c></w> <w>և</w> <w>զի</w> <w>արդ</w> <w>մեք</w> <w>անհընագէտս</w> <lb/>
    <w>յանձնառնցուք</w> <w>ճառել</w> <w>զանցեալ</w> <w>ծածկագոյն</w> <lb/>
    <w>խորհուրդս</w> <w>այլասեռ</w> <w>ազգի<c type="punct">,</c></w> <w>մինչև</w> <w>ցայժմ</w> <w>ո<c type="punct">՜</c>չ</w> <lb/>
    <w>կարաց</w> <w>Ֆրանկ</w> <w>պատմիչ</w> <w>ոք</w> <w>զբուռն</w> <w>հարել</w> <lb/>
    <w>ի</w> <w>սոյնպիսի</w> <w>օտար</w> <w>պատմագրու<ex>թիւնս</ex><c type="punct">։</c></w> <w>Բայց</w> <w>սա<lb/>կայն</w>
    <w>յուսացեալ</w> <w>ի</w> <w>յօգնու<ex>թիւն</ex></w> <w>սրբոյ</w> <w><ex>աստուած</ex>ածնին</w> <lb/>
    <w>յօժարապէտս</w> <w>ախորժեսցուք</w> <w>համարձակիլ</w> <w>և</w> <w>ի</w> <lb/>
    <w>յայս</w> <w>անհոռն</w> <w>ծովս</w> <w>մտանել<c type="punct">։</c></w> ... <lb/>
  </p>
</div>
<!-- ... -->


Perl to the rescue #1

• XML is a terrible thing to edit
• I want a transcription markup that I can convert to TEI XML later
• Not a solution you’ll like, but I’ll show it to you anyway



<seg type="word">հ<subst><del>այ</del><add place="overwrite"><ex resp="#tla">ո</ex>ռ<ex resp="#tla">ո</ex>մ</add></subst>ոց.</seg>

հ±-այ-+(overwrite)ոռոմ+ոց




TEI markup
 [172]
 զօ՛րացն և զօրավարացն և ազգն հոռոմոց իւրոց քաջութեան զան
 դարձ փաղչելն արարին պարծանք նմանեացն վատ՛ հովուաց,
 ո՛ր յորժամ զգայլն տեսանէ փաղչի, սակայն հոռոմք յան
 ջանս ջանացին, ո՛ր լուր զպարիսպ ամրութեան տանս հայոց
 քակեա՛լ կործանեցին, և զպարսիկք ի վերայ արձակեցին սրով, և
 զամենայն յաղթու՛թիւնն իւրոց համարեցան, և ինքեանք անպատկառելի
 երեսօք, կուրտ՛ զօրավարք, և ներքինի զօրօք զհայ՛ք պահել
 ջանա+յ+ին, մինչև պարսիկք յան±-(blot)տ-+տ+±էր տեսին զ^ամենայն^ արևելք.
 և յայնժամ մեծաւ՛ զօրութեամբ զօրացնն այ՛լազգիքն, որ ի
 մէկ տարո՛յ հասան մինչև ի դու՛ռն կոստանդնուպօլիս, և
 առին զամենայն աշխարհն ±-հայոց-+(overwrite)հոռոմոց+±, զքաղաքս
      ծովեզերաց և զկղզիս նոցա,
 և արա՛րին զազգն յունաց որպէս զբա՛նդարգեալս ի ներս
 ի կոստանդնուպօլիս. և յորժամ առա՛ւ հայք ի յունաց, ար
 գելաւ՛ ամենայն չարութիւնն հոռոմո՛ց, յազգէն հայոց, և զկնի այսօր
 իկ հնարեցան այ՛լ կերպիւ պատերազմ յարուցանեալ ^ընդ^ ազգն
 հա՛յոց, նստան ի քննութիւն հաւ՛ատոյ, և այսու ատեա՛լ
 անարգեցին զհանդէս պատերազմի և զօրմարտի, և զկռիւս և
 զաղմու՛կս յեկեղեցի աստուծոյ կարգեալ հաստատեցին. ի պարսից
 պատերազմէն յօժարութեամբ փախչին, և զամենայն ճշմարիտ
      հաւատացեալքս
 քրիստոսի ի հաւատոյն ջանան խափանել և խաղխտել, վասն զի յորժամ
 այր քաջ զօրաւ՛որ գտանէին, զաչսն խաւարեցուցանէին, և կամ
 ի ծով ձգեա՛լ խեղդամահ սատակէին. և այ՛ն էր փո՛յթ յօժարութեան




TEI markup

հ±-այ-+(overwrite)ոռոմ+ոց,

<seg type="word">հ<subst><del>այ</del><add place="overwrite"><ex resp="#tla">ո</ex>ռ<ex resp="#tla">ո</ex>մ</add></subst>ոց,</seg>
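A minimal Python sketch of the conversion idea, offered as an illustration and not as the speaker's Perl converter. It handles only the one pattern visible on these slides, `±-deleted-+(place)added+` with an optional closing `±`. The grammar is an assumption inferred from the two examples shown, the name `shorthand_to_tei` is hypothetical, and the sketch adds neither the `<ex>` expansions nor the `<seg>` wrapper.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Assumed grammar (inferred from the slides, not a documented spec):
#   ±-DELETED-+(PLACE)ADDED+   with an optional closing ±
SUBSTITUTION = re.compile(
    r"±-(?P<deleted>[^-+±]*)-\+(?:\((?P<place>[^)]*)\))?(?P<added>[^+±]*)\+±?"
)


def shorthand_to_tei(text: str) -> str:
    """Turn shorthand substitutions into TEI <subst><del/><add/></subst>."""
    out = []
    pos = 0
    for m in SUBSTITUTION.finditer(text):
        # Copy the untouched text before this substitution, XML-escaped.
        out.append(escape(text[pos:m.start()]))
        place = m.group("place")
        attr = f" place={quoteattr(place)}" if place else ""
        out.append(
            f"<subst><del>{escape(m.group('deleted'))}</del>"
            f"<add{attr}>{escape(m.group('added'))}</add></subst>"
        )
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


example = shorthand_to_tei("հ±-այ-+(overwrite)ոռոմ+ոց")
```

Keeping the shorthand in the transcription file and generating the XML afterwards means the verbose markup never has to be typed by hand. Annotations such as `(blot)` inside a deletion, or the `^...^` marks seen in the longer transcription, are left as plain text by this sketch.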




TEI markup
հոռոմոց իւրոց քաջութեան

<seg type="word">հ<ex resp="#tla">ո</ex>ռ<ex resp="#tla">ո</ex>մոց</seg>
<seg type="word">իւր<ex resp="#tla">ո</ex>ց</seg>
<seg type="word">ք<ex resp="#tla">ա</ex>ջ<ex resp="#tla">ո</ex>ւ<ex resp="#tla">թ</ex>ե<ex resp="#tla">ան</ex></seg>

TEI text: what now?
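One possible next step, sketched in Python: pull the word tokens out of the word-tagged TEI so they can be fed to a collation tool. This is an illustration under assumptions, not the workflow used for the edition. The fragment below is a shortened, namespace-free excerpt in the style of the slides, and the helper names `text_of` and `words` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened fragment in the style of the slides (no TEI namespace, for brevity).
SAMPLE = """<p><w>պատմագրիլ<c type="punct">,</c></w> <w><expan>ըստ</expan></w> <lb/>
<w>Բայց</w> <w>սա<lb/>կայն</w> <w>յօգնու<ex>թիւն</ex></w></p>"""


def text_of(element, keep_punct=False):
    """Concatenate the text inside one element, optionally dropping <c type="punct">."""
    if element.tag == "c" and element.get("type") == "punct" and not keep_punct:
        return ""
    parts = [element.text or ""]
    for child in element:
        parts.append(text_of(child, keep_punct))
        # Text after a child (for example after <lb/>) still belongs to this element.
        parts.append(child.tail or "")
    return "".join(parts)


def words(fragment, keep_punct=False):
    """Return the reading text of every <w> element, in document order."""
    root = ET.fromstring(fragment)
    return [text_of(w, keep_punct) for w in root.iter("w")]


tokens = words(SAMPLE)
```

Because line breaks and expansions are elements inside `<w>`, a word split across two manuscript lines comes back as a single token, which is what a collation step needs.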



Collation

"The collation of manuscripts requires the infuriating accuracy of a pedant and the obsessive stamina of an idiot. It is therefore an ideal task for a computer."
Peter Robinson, “Collation and Textual Criticism”, LLC vol. 4 no. 2, 1989

• need to align words with each other
• ...across many manuscripts
• ...even when the words aren’t exactly the same (e.g. “յաշխարհին” vs. “աշխարհն”)
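The alignment problem can be sketched for two witnesses in a few lines of Python. This is only an illustration of the idea and not the collation tool used for the edition: words count as "the same" when their character similarity reaches a threshold (0.6 here, an arbitrary assumption), and a longest-common-subsequence pass lines them up, leaving `None` where one witness has no counterpart. The names `similar` and `align` are hypothetical.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """Fuzzy word equality: True when the two spellings are close enough."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def align(base, other):
    """Align two word lists; returns (base_word, other_word) pairs, None marks a gap."""
    n, m = len(base), len(other)
    # score[i][j] = most matched pairs obtainable from base[i:] and other[j:]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if similar(base[i], other[j]):
                score[i][j] = score[i + 1][j + 1] + 1
            else:
                score[i][j] = max(score[i + 1][j], score[i][j + 1])
    rows, i, j = [], 0, 0
    while i < n and j < m:
        if similar(base[i], other[j]):
            rows.append((base[i], other[j]))
            i, j = i + 1, j + 1
        elif score[i + 1][j] >= score[i][j + 1]:
            rows.append((base[i], None))   # word only in the base witness
            i += 1
        else:
            rows.append((None, other[j]))  # word only in the other witness
            j += 1
    rows.extend((word, None) for word in base[i:])
    rows.extend((None, word) for word in other[j:])
    return rows


witness_a = "յայսմ ամենայն եղելոցն նստուցանեն զաթոռ հայրապետութեան ի թաւբլուր".split()
witness_b = "այս ամենայն եղելոց նստուցանեն զաթոռ հայրապետութեան".split()
table = align(witness_a, witness_b)
```

Aligning many manuscripts at once is harder than this pairwise case; a common simplification is to align each witness against one base text and merge the results.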



A                B                C                D                E
յայսմ            այս              յայսմ            այս              այս
ամենայն          ամենայն          ամի              ամենայն          ամենայն
եղելոցն,         եղելոց           եղելոց           եղելոց           եղելոցս
նստուցանեն       նստուցանեն       նստուցանեն       նստուցանեն       նստուցանեն
զաթոռ            զաթոռ            յաթոռ            զաթոռ            զաթոռ
հայրապետութեան   հայրապետութեան   հայրապետութեան   հայրապետութեանն  հայրապետութեան
ի                -                ի                ի                -
թաւբլուր         -                թաւաբլուրն։      թաւբլուր         -
եւ               -                եւ               եւ               -
կացեալ           -                կացեալ           կացեալ           -
անդ              -                անդ              անդ              -
զամս             -                զամս             զամս             -
գ                -                գ,               գ                -
եւ               -                եւ               եւ               -
ընդ              -                ընդ              ընդ              -
ամենայն          -                ամենայն          ամենայն          -
զ                -                զ                վեց              -
ամ               -                ամ,              ամ               -
կալեալ           -                կալեալ           կալեալ           -
զաթոռ            -                զաթոռ            զաթոռ            -
հայրապետութեանն  -                հայրապետութեան   հայրապետութեանն  -
տէր              տէր              տէր              տէր              զտէր
խաչիկ։           խաչիկ։           խաչիկն։          խաչիկ։           խաչիկ։




<!-- ... -->
<p>
  <w><app>
    <rdg wit="#A #C">յայսմ</rdg>
    <rdg wit="#B #D #E">այս</rdg>
  </app></w>
  <w><app>
    <rdg wit="#A #B #D #E">ամենայն</rdg>
    <rdg wit="#C">ամի</rdg>
  </app></w>
  <w><app>
    <rdg wit="#A">եղելոցն</rdg>
    <rdg wit="#B #C #D">եղելոց</rdg>
    <rdg wit="#E">եղելոցս</rdg>
  </app></w>
  <w>նստուցանեն</w>
  <w type="prefix"><app>
    <rdg wit="#A #B #D #E">զ</rdg>
    <rdg wit="#C">յ</rdg>
  </app></w>
  <w>աթոռ</w>
  <w>հայրապետութեան</w>
  <w><app>
    <rdg wit="#A #C #D">ի</rdg>
  </app></w>
  <!-- ... -->
</p>
<!-- ... -->




<!-- ... -->
<p>
  <w><app>
    <rdg wit="#A #C">յայսմ</rdg>
    <lem wit="#B #D #E">այս</lem>
  </app></w>
  <w><app>
    <lem wit="#A #B #D #E">ամենայն</lem>
    <rdg wit="#C">ամի</rdg>
  </app></w>
  <w><app>
    <lem wit="#A">եղելոցն</lem>
    <rdg wit="#B #C #D">եղելոց</rdg>
    <rdg wit="#E">եղելոցս</rdg>
  </app></w>
  <w>նստուցանեն</w>
  <w type="prefix"><app>
    <lem wit="#A #B #D #E">զ</lem>
    <rdg wit="#C">յ</rdg>
  </app></w>
  <w>աթոռ</w>
  <w>հայրապետութեան</w>
  <w><app>
    <lem wit="#A #C #D">ի</lem>
  </app></w>
  <!-- ... -->
</p>
<!-- ... -->




Our text & apparatus

1  այս ամենայն եղելոցն նստուցանեն զաթոռ
   ի թաւբլուր,

1 այս] յայսմ AC  1 ամենայն] ամի C  1 եղելոցն] եղելոց BCD եղելոցս E
1 զաթոռ] յաթոռ C  2 ի թաւբլուր] om. BE
...
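How a text and its apparatus fall out of the `<app>` markup can be sketched in Python. This is a simplified illustration, not the publication pipeline for the edition: the sample is a namespace-free excerpt in the style of the slides, every `<app>` is assumed to contain exactly one `<lem>`, and the names `sigla` and `render` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace-free excerpt; each <app> is assumed to hold exactly one <lem>.
SAMPLE = """<p>
  <w><app><rdg wit="#A #C">յայսմ</rdg><lem wit="#B #D #E">այս</lem></app></w>
  <w><app><lem wit="#A #B #D #E">ամենայն</lem><rdg wit="#C">ամի</rdg></app></w>
  <w>նստուցանեն</w>
</p>"""


def sigla(wit):
    """'#A #C' -> 'AC'"""
    return "".join(pointer.lstrip("#") for pointer in wit.split())


def render(fragment):
    """Return (edited text, apparatus entries) for one TEI fragment."""
    text_words, notes = [], []
    for w in ET.fromstring(fragment).iter("w"):
        app = w.find("app")
        if app is None:
            # No variation recorded here: the word goes straight into the text.
            text_words.append(w.text or "")
            continue
        lemma = app.find("lem")
        text_words.append(lemma.text)
        variants = " ".join(
            f"{rdg.text} {sigla(rdg.get('wit'))}" for rdg in app.findall("rdg")
        )
        notes.append(f"{lemma.text}] {variants}")
    return " ".join(text_words), notes


text, apparatus = render(SAMPLE)
```

Moving the `<lem>` tag to a different reading changes both the printed text and the apparatus entry on the next run, so an editorial decision is a one-tag change in the source.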




<!-- ... -->
<p>
  <w><app>
    <lem wit="#A #C">յայսմ</lem>
    <rdg wit="#B #D #E">այս</rdg>
  </app></w>
  <w><app>
    <rdg wit="#A #B #D #E">ամենայն</rdg>
    <lem wit="#C">ամի</lem>
  </app></w>
  <w><app>
    <lem wit="#A">եղելոցն</lem>
    <rdg wit="#B #C #D">եղելոց</rdg>
    <rdg wit="#E">եղելոցս</rdg>
  </app></w>
  <w>նստուցանեն</w>
  <w type="prefix"><app>
    <lem wit="#A #B #D #E">զ</lem>
    <rdg wit="#C">յ</rdg>
  </app></w>
  <w>աթոռ</w>
  <w>հայրապետութեան</w>
  <w><app>
    <lem wit="#A #C #D">ի</lem>
  </app></w>
  <!-- ... -->
</p>
<!-- ... -->




New text & apparatus

1  յայսմ ամի եղելոցն նստուցանեն զաթոռ
   ի թաւբլուր,

1 յայսմ] այս BDE  1 ամի] ամենայն ABDE  1 եղելոցն] եղելոց BCD եղելոցս E
1 զաթոռ] յաթոռ C  2 ի թաւբլուր] om. BE
...




Manuscript stemmas: the family tree


Stemma construction

• Better stemma through analysis of collation results
• Borrows statistical models from evolutionary biology
• “Maximum parsimony” based upon DNA of specimens
• Manuscripts are specimens
• Biologists have DNA sequences; we have words.
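To make "maximum parsimony" concrete, here is a small Python sketch that scores a candidate family tree by the fewest reading changes it requires, using Fitch's counting method. It is an illustration only: real phylogenetic software also searches over many possible trees, which this does not attempt. Each witness gets one letter per collated position, where the same letter means the same reading; the six positions are the first six rows of the variant matrix shown in this deck, while the two candidate trees and the function names are hypothetical.

```python
def fitch_changes(tree, states):
    """Fewest reading changes one collated position needs on `tree` (Fitch's method)."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):            # leaf: a witness siglum
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:
            return shared                    # both branches can agree: no change
        changes += 1                         # branches disagree: one change needed
        return left | right

    visit(tree)
    return changes


def parsimony_score(tree, matrix):
    """Sum the per-position change counts; lower means a more parsimonious tree."""
    positions = len(next(iter(matrix.values())))
    return sum(
        fitch_changes(tree, {wit: row[i] for wit, row in matrix.items()})
        for i in range(positions)
    )


# One string per witness, one letter per collated position.
MATRIX = {"A": "AAAAAA", "B": "BABAAA", "C": "ABBABA", "D": "BABAAB", "E": "BACAAA"}

# Two hypothetical binary trees over the five witnesses.
TREE_1 = ((("B", "D"), "E"), ("A", "C"))
TREE_2 = ((("A", "B"), "C"), ("D", "E"))

scores = {"tree 1": parsimony_score(TREE_1, MATRIX),
          "tree 2": parsimony_score(TREE_2, MATRIX)}
```

On this toy data the first tree needs 6 changes and the second needs 7, so parsimony prefers the grouping that puts B, D and E together.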




Wednesday, March 18, 2009
յայսմ                  այս              յայսմ            այս               այս
     ամենայն                ամենայն          ամի              ամենայն           ամենայն
     եղելոցն,               եղելոց           եղելոց           եղելոց            եղելոցս
     նստուցանեն             նստուցանեն       նստուցանեն       նստուցանեն        նստուցանեն
     զաթոռ                  զաթոռ            յաթոռ            զաթոռ             զաթոռ
     հայրապետութեան         հայրապետութեան   հայրապետութեան   հայրապետութեանն   հայրապետութեան
     ի                                       ի                ի
     թաւբլուր                                թաւաբլուրն։      թաւբլուր
     եւ                                      եւ               եւ
     կացեալ                                  կացեալ           կացեալ
     անդ                                     անդ              անդ
     զամս                                    զամս             զամս
     գ                                       գ,               գ
     եւ                                      եւ               եւ
     ընդ                                     ընդ              ընդ
     ամենայն                                 ամենայն          ամենայն
     զ                                       զ                վեց
     ամ                                      ամ,              ամ
     կալեալ                                  կալեալ           կալեալ
     զաթոռ                                   զաթոռ            զաթոռ
     հայրապետութեանն                         հայրապետութեան   հայրապետութեանն
     տէր                    տէր              տէր              տէր               զտէր
     խաչիկ։                 խաչիկ։           խաչիկն։          խաչիկ։            խաչիկ։
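
The step from aligned readings like those above to letter codes can be sketched as follows. This is a minimal Python sketch, not the author's own code: the function names, the punctuation list, and the rule of handing out letters in order of first appearance (with O reserved for an omitted reading) are assumptions chosen to mirror the slides.

```python
import string

# Letters available as reading codes; "O" is reserved for omissions (assumption).
LETTERS = string.ascii_uppercase.replace("O", "")

# Punctuation ignored when comparing readings (assumption): ASCII comma, period,
# colon, the Armenian full stop (U+0589) and the Armenian comma (U+055D).
PUNCTUATION = ",.:\u0589\u055d"


def normalize(reading):
    """Strip surrounding whitespace and punctuation from one reading."""
    return reading.strip().strip(PUNCTUATION)


def encode_row(readings):
    """Turn one collation row (one reading per witness) into a code string.

    Identical readings share a letter; letters are assigned in order of
    first appearance; a missing reading (None or "") becomes "O".
    """
    codes = {}
    out = []
    for reading in readings:
        if not reading:
            out.append("O")
            continue
        key = normalize(reading)
        if key not in codes:
            codes[key] = LETTERS[len(codes)]
        out.append(codes[key])
    return "".join(out)


if __name__ == "__main__":
    # Three rows taken from the collation above, one reading per witness.
    rows = [
        ["յայսմ", "այս", "յայսմ", "այս", "այս"],
        ["եղելոցն,", "եղելոց", "եղելոց", "եղելոց", "եղելոցս"],
        ["զ", None, "զ", "վեց", None],
    ]
    for row in rows:
        print(encode_row(row))
```

Treating an omission as its own state rather than as missing data is a design choice; a different choice changes how gaps weigh in the later analysis.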




     A                      B   A   B   B
     A                      A   B   A   A
     A                      B   B   B   C
     A                      A   A   A   A
     A                      A   B   A   A
     A                      A   A   B   A
     A                      O   A   A   O
     A                      O   B   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   B   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   A   A   O
     A                      O   B   A   O
     A                      A   A   A   B
     A                      A   B   A   A
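
A matrix like the one above is the kind of input a maximum-parsimony method works from. The sketch below uses Fitch's small-parsimony algorithm to count the minimum number of reading changes a given tree needs. It is an illustration in Python, not the software used for this edition: the witness labels W1 to W5, both candidate trees, and the decision to treat O as an ordinary state are assumptions. Only the six state columns are taken from the first six rows of the matrix.

```python
def fitch_site(tree, states):
    """Fitch's algorithm for one variant location.

    tree   -- a witness label (leaf) or a pair (left, right) of subtrees
    states -- dict mapping witness label to its code at this location
    Returns (set of candidate ancestral states, number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a reading: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: one change must happen on this branch pair.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, matrix):
    """Sum the Fitch change counts over every variant location."""
    witnesses = list(matrix)
    length = len(matrix[witnesses[0]])
    total = 0
    for i in range(length):
        column = {w: matrix[w][i] for w in witnesses}
        total += fitch_site(tree, column)[1]
    return total


# First six rows of the matrix above, read column by column.
# The labels W1..W5 are hypothetical; the slides do not name the witnesses.
MATRIX = {
    "W1": "AAAAAA",
    "W2": "BABAAA",
    "W3": "ABBABA",
    "W4": "BABAAB",
    "W5": "BACAAA",
}

# Two hypothetical rooted trees to compare.
TREE_1 = (("W1", "W3"), (("W2", "W4"), "W5"))
TREE_2 = (("W1", "W2"), (("W3", "W4"), "W5"))

if __name__ == "__main__":
    for name, tree in [("tree 1", TREE_1), ("tree 2", TREE_2)]:
        print(name, parsimony_score(tree, MATRIX))
```

The tree with the lower score explains the variants with fewer copying changes. A real analysis would search over many trees rather than compare two by hand.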




[Figure: stemma diagram of the manuscripts. The branching is not recoverable from the extracted text.]

Annotations on the branches: "gaps appear", "chapter divisions appear", "text truncated"

Witnesses shown (listed by date): V (1590-1600), W (1601), F (1617), J (1617), B (1623), D (1647), Matenadaran 3071 (1651-61), L (1660), I (1664), X (1669), A (1689), K (1699), O (ca. 1702), H (17th c.), Y (17th c.), Z (17th c.), Matenadaran 3520 (17th c.), Bzommar 644 (1775-1805), Venice 986 (1830-35), Matenadaran 2644 (1844), Jerusalem 1869 edition*

Non-fragmentary manuscripts omitted: Paris 191, 200; Jerusalem 3651; Matenadaran 2855, 2899, 3380, 6605, 8159, 8232, 8894; Rome 25; Vienna 243, 246

*Based on Jerusalem mss. 1051, 1107

Publication



Online publication
                            •   XML can also be turned into HTML for online
                                publication

                            •   This gives:

                                •   searchable text

                                •   easy updates

                                •   configurable set of variants

                                •   links to manuscript images where available
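
The XML-to-HTML step with a configurable set of variants could look roughly like this. It is a minimal sketch in Python using only the standard library, not the author's actual pipeline. The app/lem/rdg elements and the wit attribute follow the TEI critical apparatus module; the sample fragment reuses two variants from the Nijmegen apparatus example, and the function name, the CSS class, and the use of a title attribute are assumptions.

```python
import html
import xml.etree.ElementTree as ET

# A tiny TEI-like fragment: two variant locations from the apparatus example.
SAMPLE = (
    "<p>destrueretur et "
    '<app><lem>infra</lem><rdg wit="#D">intra</rdg></app>'
    " muros "
    '<app><lem>transferretur</lem><rdg wit="#N #H">transferreretur</rdg></app>'
    "</p>"
)


def render(xml_text, selected):
    """Render one paragraph as HTML, showing variants only for the
    witnesses whose sigla are in the set `selected`."""
    root = ET.fromstring(xml_text)
    parts = [html.escape(root.text or "")]
    for child in root:
        if child.tag == "app":
            lemma = child.findtext("lem", default="")
            notes = []
            for rdg in child.findall("rdg"):
                sigla = [s.lstrip("#") for s in rdg.get("wit", "").split()]
                hits = [s for s in sigla if s in selected]
                if hits:
                    notes.append(f"{rdg.text or ''} {' '.join(hits)}")
            if notes:
                title = html.escape("; ".join(notes), quote=True)
                parts.append(
                    f'<span class="variant" title="{title}">'
                    f"{html.escape(lemma)}</span>"
                )
            else:
                # No selected witness differs here: print the lemma plainly.
                parts.append(html.escape(lemma))
        parts.append(html.escape(child.tail or ""))
    return "<p>" + "".join(parts) + "</p>"


if __name__ == "__main__":
    print(render(SAMPLE, {"D"}))
    print(render(SAMPLE, {"N", "H"}))
```

Because the lemma text stays as plain text inside the paragraph, the page remains searchable whichever witnesses the reader selects.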


Questions?



Consider the Source

  • 1. Consider the Source Textual criticism and digital techniques Wednesday, March 18, 2009
  • 2. How do we know what happened? Wednesday, March 18, 2009
  • 6. • the scriptorium is cold Wednesday, March 18, 2009
  • 7. • the scriptorium is cold • the food is bad Wednesday, March 18, 2009
  • 8. • the scriptorium is cold • the food is bad • the tea ladies are unfriendly Wednesday, March 18, 2009
  • 9. • the scriptorium is cold • the food is bad • the tea ladies are unfriendly • they just don’t want to be there anymore Wednesday, March 18, 2009
  • 10. • Modern historical approaches are a recent thing Wednesday, March 18, 2009
  • 11. • Modern historical approaches are a recent thing • Not unheard of before, but not standard Wednesday, March 18, 2009
  • 12. • Modern historical approaches are a recent thing • Not unheard of before, but not standard • This profoundly affects the way in which histories have reached us Wednesday, March 18, 2009
  • 13. 600 years later Wednesday, March 18, 2009
  • 14. 600 years later • a bunch of error-ridden copies Wednesday, March 18, 2009
  • 15. 600 years later • a bunch of error-ridden copies • or only a few error-ridden copies Wednesday, March 18, 2009
  • 16. 600 years later • a bunch of error-ridden copies • or only a few error-ridden copies • or only a name check in a book about something else Wednesday, March 18, 2009
  • 18. Apparatus example St. Stephenʼs Church in Nijmegen Nobilis itaque comes Otto imperio et dominio Novimagensi sibi, ut praefer- tur, impignoratis et commissis proinde praeesse cupiens, anno liiii superius 1254 descripto, mense Iunio, una cum iudice, scabinis ceterisque civibus civitatis Novimagensis, pro ipsius et inhabitantium in ea necessitate, commodo et utili- 5 tate, ut ecclesia eius parochialis extra civitatem sita destrueretur et infra muros transferretur ac de novo construeretur, a reverendo patre domino Conrado de Hofsteden, archiepiscopo Coloniensi, licentiam, et a venerabilibus dominis de- cano et capitulo sanctorum Apostolorum Coloniensi, ipsius ecclesiae ab antiquo veris et pacificis patronis, consensum, citra tamen praeiudicium, damnum aut 10 gravamen iurium et bonorum eorundem, impetravit. Et exinde liberum locum eiusdem civitatis qui dicitur Hundisbrug, de prae- libati Wilhelmi Romanorum regis, ipsius fundi domini, consensu, ad aedifican- dum et consecrandum ecclesiam et coemeterium, eisdem decano et capitulo de expresso eiusdem civitatis assensu libera contradiderunt voluntate, obligantes 15 se ipsi comes et civitas dictis decano et capitulo, quod in recompensationem illius areae infra castrum et portam, quae fuit dos ecclesiae, in qua plebanus habitare solebat—quae tunc per novum fossatum civitatis est destructa—aliam aream competentem et ecclesiae novae, ut praefertur, aedificandae satis conti- guam, ipsi plebano darent et assignarent. Et desuper apud dictam ecclesiam 20 sanctorum Apostolorum est littera sigillis ipsorum Ottonis comitis et civitatis Novimagensis sigillata. 3 p. 227 R 4 p. 97 N 6 p. 129 D 12 f. 72v M 13 p. 228 R 20 p. 130 D 2 proinde ] primum D 5 ecclesia eius ] ecclesia D: eius eius H extra civitatem om. 
H infra ] intra D 6 transferretur ] transferreretur NH 7 Hofsteden ] Hostede D: Hosteden H Coloniensi ] Colononiensi H dominis ] viris H 8 Coloniensi ] Coloniae H 10 iurium ] virium D 11 liberum ] librum H qui ] quae D Hundisbrug ] Hundisburch D: Hunsdisbrug R 12 regis ] imperatoris D 13 et consecrandum om. H eisdem ] eiusdem D 15 comes ] comites D dictis om. H 17 tunc ] nunc H 18 ut. . . aedificandae om. H 18–19 contiguam ] contiguum M 19 apud om. H 20 est ] et H littera ] litteram H 21 Novimagensis ] Novimagii D sigillata ] sigillis communita H Wednesday, March 18, 2009
  • 19. Apparatus example St. Stephenʼs Church in Nijmegen Nobilis itaque comes Otto imperio et dominio Novimagensi sibi, ut praefer- tur, impignoratis et commissis proinde praeesse cupiens, anno liiii superius 1254 descripto, mense Iunio, una cum iudice, scabinis ceterisque civibus civitatis Novimagensis, pro ipsius et inhabitantium in ea necessitate, commodo et utili- 5 tate, ut ecclesia eius parochialis extra civitatem sita destrueretur et infra muros transferretur ac de novo construeretur, a reverendo patre domino Conrado de Hofsteden, archiepiscopo Coloniensi, licentiam, et a venerabilibus dominis de- cano et capitulo sanctorum Apostolorum Coloniensi, ipsius ecclesiae ab antiquo veris et pacificis patronis, consensum, citra tamen praeiudicium, damnum aut 10 gravamen iurium et bonorum eorundem, impetravit. Et exinde liberum locum eiusdem civitatis qui dicitur Hundisbrug, de prae- libati Wilhelmi Romanorum regis, ipsius fundi domini, consensu, ad aedifican- dum et consecrandum ecclesiam et coemeterium, eisdem decano et capitulo de expresso eiusdem civitatis assensu libera contradiderunt voluntate, obligantes 15 se ipsi comes et civitas dictis decano et capitulo, quod in recompensationem illius areae infra castrum et portam, quae fuit dos ecclesiae, in qua plebanus habitare solebat—quae tunc per novum fossatum civitatis est destructa—aliam aream competentem et ecclesiae novae, ut praefertur, aedificandae satis conti- guam, ipsi plebano darent et assignarent. Et desuper apud dictam ecclesiam 20 sanctorum Apostolorum est littera sigillis ipsorum Ottonis comitis et civitatis Novimagensis sigillata. 3 p. 227 R 4 p. 97 N 6 p. 129 D 12 f. 72v M 13 p. 228 R 20 p. 130 D 2 proinde ] primum D 5 ecclesia eius ] ecclesia D: eius eius H extra civitatem om. 
H infra ] intra D 6 transferretur ] transferreretur NH 7 Hofsteden ] Hostede D: Hosteden H Coloniensi ] Colononiensi H dominis ] viris H 8 Coloniensi ] Coloniae H 10 iurium ] virium D 11 liberum ] librum H qui ] quae D Hundisbrug ] Hundisburch D: Hunsdisbrug R 12 regis ] imperatoris D 13 et consecrandum om. H eisdem ] eiusdem D 15 comes ] comites D dictis om. H 17 tunc ] nunc H 18 ut. . . aedificandae om. H 18–19 contiguam ] contiguum M 19 apud om. H 20 est ] et H littera ] litteram H 21 Novimagensis ] Novimagii D sigillata ] sigillis communita H Wednesday, March 18, 2009
  • 21. Who needs this and why? Wednesday, March 18, 2009
  • 22. Who needs this and why? • Historians look for one thing Wednesday, March 18, 2009
  • 23. Who needs this and why? • Historians look for one thing • Linguists look for other things Wednesday, March 18, 2009
  • 24. Who needs this and why? • Historians look for one thing • Linguists look for other things • Others will be interested too Wednesday, March 18, 2009
  • 25. The Chronicle of Matthew of Edessa Wednesday, March 18, 2009
  • 32. How to make an edition Wednesday, March 18, 2009
  • 34. Surviving manuscripts • Oldest full manuscript is Venice 887 Wednesday, March 18, 2009
  • 35. Surviving manuscripts • Oldest full manuscript is Venice 887 • Next oldest is Vienna 574 Wednesday, March 18, 2009
  • 36. Surviving manuscripts • Oldest full manuscript is Venice 887 • Next oldest is Vienna 574 • 24 of 42 (4 of 6 fragments) copied before 1700 Wednesday, March 18, 2009
  • 37. Extant manuscripts of the Chronicle Manuscripts Fragments 28 21 14 7 0 pre 16th 16th 17th 18th 19th Wednesday, March 18, 2009
  • 38. Two manuscript groups • • Group 1: like Venice 887 Group 2: like Vienna 574 • • Text generally Text truncated near complete (to 1162) the year 1096/7 • • Transmitted with the Transmitted with Life of St. Nerses specific long sequence (Mesrop the Priest) of texts Wednesday, March 18, 2009
  • 40. Matenadaran 1896 • Copied in 1689 Wednesday, March 18, 2009
  • 41. Matenadaran 1896 • Copied in 1689 • Uniquely preserves two passages of text Wednesday, March 18, 2009
  • 42. Matenadaran 1896 • Copied in 1689 • Uniquely preserves two passages of text • These lacunae known to other copyists Wednesday, March 18, 2009
  • 43. Matenadaran 1896 • Copied in 1689 • Uniquely preserves two passages of text • These lacunae known to other copyists • But lots of manuscripts are older. Hm. Wednesday, March 18, 2009
  • 45. Making the edition • Transcription Wednesday, March 18, 2009
  • 46. Making the edition • Transcription • Collation text analysis Wednesday, March 18, 2009
  • 47. Making the edition • Transcription • Collation text analysis • Editing the text Wednesday, March 18, 2009
  • 48. Making the edition • Transcription • Collation text analysis • Editing the text • Publication Wednesday, March 18, 2009
  • 51. Transcription • The most time-consuming part Wednesday, March 18, 2009
  • 52. Transcription • The most time-consuming part • Ideal solution would be optical character recognition (OCR) Wednesday, March 18, 2009
  • 53. Transcription • The most time-consuming part • Ideal solution would be optical character recognition (OCR) • No OCR for manuscripts, yet Wednesday, March 18, 2009
  • 54. Start with a manuscript Wednesday, March 18, 2009
  • 55. Into plain text դ. Թուխտ սիրոյ և միաբանութեան, շարագրեցեալ կղէմէս աստուածաբան վարդապետէ։ Առաջաբանութիւն։ Նա զի արդ գրիչս իմ անյարմարս կարօղ լինիցի երբէկ պատմագրիլ, ըստ պատշաճի զմեծամեծ յիշելիսն։ Որք ի վաղ ժամանակի անտի պատահեցան յեկեղեցին հայոց. և զի արդ մեք անհընագէտս յանձնառնցուք ճառել զանցեալ ծածկագոյն խորհուրդս այլասեռ ազգի, մինչև ցայժմ ո՜չ կարաց Ֆրանկ պատմիչ ոք զբուռն հարել ի սոյնպիսի օտար պատմագրութիւնս։ Բայց սակայն յուսացեալ ի յօգնութիւն սրբոյ աստուածածնին յօժարապէտս ախորժեսցուք համարձակիլ և ի յայս անհոռն ծովս մտանել... Wednesday, March 18, 2009
  • 56. XML solution: TEI <!-- ... --> <div n="4"> <head><hi rend="red">Թուխտ սիրոյ և միաբանութե<ex>ան</ex>, շարագրեց<lb/> եալ կղէմէս ա<ex>ստուա</ex>ծաբան վարդապետէ։<lb/> Առաջաբանութիւն։ </hi> </head> <p><hi rend="ornament">Ն</hi><hi rend="red">ա զի արդ գրիչս իմ անյարմարս</hi> <lb/> կարօղ լինիցի երբէկ պատմագրիլ, <expan>ըստ</expan> <lb/> պատշաճի զմեծամեծ յիշելիսն։ Որք <lb/> ի վաղ ժամանակի անտի պատահեցան <lb/> յեկեղեցին հայոց. և զի արդ մեք անհընագէտս <lb/> յանձնառնցուք ճառել զանցեալ ծածկագոյն <lb/> խորհուրդս այլասեռ ազգի, մինչև ցայժմ ո՜չ <lb/> կարաց Ֆրանկ պատմիչ ոք զբուռն հարել <lb/> ի սոյնպիսի օտար պատմագրութի<ex>ւն</ex>ս։ Բայց սա<lb/> կայն յուսացեալ ի յօգնութի<ex>ւն</ex> սրբոյ ա<ex>ստուա</ex>ծածնին <lb/> յօժարապէտս ախորժեսցուք համարձակիլ և ի <lb/> յայս անհոռն ծովս մտանել։ ... <lb/> </p> </div> <!-- ... -->
  • 58. Better TEI XML text <!-- ... --> <div n="4"> <head rend="red"><w>Թուխտ</w> <w>սիրոյ</w> <w>և</w> <w>միաբանու<ex>թեան</ex>,</w> <w>շարագրեց<lb/>եալ</w> <w>կղէմէս</w> <w><ex>աստուած</ex>աբան</w> <w>վարդապետէ։</w><lb/> <w>Առաջաբանութիւն։</w> </head> <p><w><hi rend="ornament">Ն</hi><hi rend="red">ա</hi></w> <hi rend="red"><w>զի</w> <w>արդ</w> <w>գրիչս</w> <w>իմ</w> <w>անյարմարս</w></hi> <lb/> <w>կարօղ</w> <w>լինիցի</w> <w>երբէկ</w> <w>պատմագրիլ<c type="punct">,</c></w> <w><expan>ըստ</expan></w> <lb/> <w>պատշաճի</w> <w>զմեծամեծ</w> <w>յիշելիսն<c type="punct">։</c></w> <w>Որք</w> <lb/> <w>ի</w> <w>վաղ</w> <w>ժամանակի</w> <w>անտի</w> <w>պատահեցան</w> <lb/> <w>յեկեղեցին</w> <w>հայոց<c type="punct">.</c></w> <w>և</w> <w>զի</w> <w>արդ</w> <w>մեք</w> <w>անհընագէտս</w> <lb/> <w>յանձնառնցուք</w> <w>ճառել</w> <w>զանցեալ</w> <w>ծածկագոյն</w> <lb/> <w>խորհուրդս</w> <w>այլասեռ</w> <w>ազգի<c type="punct">,</c></w> <w>մինչև</w> <w>ցայժմ ո<c type="punct">՜</c>չ</w> <lb/> <w>կարաց</w> <w>Ֆրանկ</w> <w>պատմիչ</w> <w>ոք</w> <w>զբուռն</w> <w>հարել</w> <lb/> <w>ի</w> <w>սոյնպիսի</w> <w>օտար</w> <w>պատմագրու<ex>թիւնս</ex><c type="punct">։</c></w> <w>Բայց</w> <w>սա<lb/>կայն</w> <w>յուսացեալ</w> <w>ի յօգնու<ex>թիւն</ex></w> <w>սրբոյ</w> <w><ex>աստուած</ex>ածնին</w> <lb/> <w>յօժարապէտս</w> <w>ախորժեսցուք</w> <w>համարձակիլ</w> <w>և</w> <w>ի</w> <lb/> <w>յայս</w> <w>անհոռն</w> <w>ծովս</w> <w>մտանել<c type="punct">։</c></w> ... <lb/> </p> </div> <!-- ... -->
  • 61. Perl to the rescue #1 • XML is a terrible thing to edit • I want a transcription markup that I can convert to TEI XML later • Not a solution you’ll like, but I’ll show it to you anyway
  • 62. <seg type="word">հ<subst><del>այ</del><add place="overwrite"><ex resp="#tla">ո</ex>ռ<ex resp="#tla">ո</ex>մ</add>ոց.</subst></seg>
  • 64. TEI markup [172] զօ՛րացն և զօրավարացն և ազգն հոռոմոց իւրոց քաջութեան զան դարձ փաղչելն արարին պարծանք նմանեացն վատ՛ հովուաց, ո՛ր յորժամ զգայլն տեսանէ փաղչի, սակայն հոռոմք յան ջանս ջանացին, ո՛ր լուր զպարիսպ ամրութեան տանս հայոց քակեա՛լ կործանեցին, և զպարսիկք ի վերայ արձակեցին սրով, և զամենայն յաղթու՛թիւնն իւրոց համարեցան, և ինքեանք անպատկառելի երեսօք, կուրտ՛ զօրավարք, և ներքինի զօրօք զհայ՛ք պահել ջանա+յ+ին, մինչև պարսիկք յան±-(blot)տ-+տ+±էր տեսին զ^ամենայն^ արևելք. և յայնժամ մեծաւ՛ զօրութեամբ զօրացնն այ՛լազգիքն, որ ի մէկ տարո՛յ հասան մինչև ի դու՛ռն կոստանդնուպօլիս, և առին զամենայն աշխարհն ±-հայոց-+(overwrite)հոռոմոց+±, զքաղաքս ծովեզերաց և զկղզիս նոցա, և արա՛րին զազգն յունաց որպէս զբա՛նդարգեալս ի ներս ի կոստանդնուպօլիս. և յորժամ առա՛ւ հայք ի յունաց, ար գելաւ՛ ամենայն չարութիւնն հոռոմո՛ց, յազգէն հայոց, և զկնի այսօր իկ հնարեցան այ՛լ կերպիւ պատերազմ յարուցանեալ ^ընդ^ ազգն հա՛յոց, նստան ի քննութիւն հաւ՛ատոյ, և այսու ատեա՛լ անարգեցին զհանդէս պատերազմի և զօրմարտի, և զկռիւս և զաղմու՛կս յեկեղեցի աստուծոյ կարգեալ հաստատեցին. ի պարսից պատերազմէն յօժարութեամբ փախչին, և զամենայն ճշմարիտ հաւատացեալքս քրիստոսի ի հաւատոյն ջանան խափանել և խաղխտել, վասն զի յորժամ այր քաջ զօրաւ՛որ գտանէին, զաչսն խաւարեցուցանէին, և կամ ի ծով ձգեա՛լ խեղդամահ սատակէին. և այ՛ն էր փո՛յթ յօժարութեան Wednesday, March 18, 2009
  • 65. TEI markup հ±-այ-+(overwrite)ոռոմ+ոց, <seg type="word"><subst>հ<del>այ</del><add place="overwrite"><ex resp="#tla">ո</ex>ռ<ex resp="#tla">ո</ex>մ</add></subst>ոց,</seg>
  • 66. TEI markup հոռոմոց իւրոց քաջութեան <seg type="word">հ<ex resp="#tla">ո</ex>ռ<ex resp="#tla">ո</ex>մոց</seg> <seg type="word">իւր<ex resp="#tla">ո</ex>ց</seg> <seg type="word">ք<ex resp="#tla">ա</ex>ջ<ex resp="#tla">ո</ex>ւ<ex resp="#tla">թ</ex>ե<ex resp="#tla">ան</ex></seg>
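A minimal sketch of the idea behind these slides: type a compact shorthand while transcribing, then let a script expand it into TEI. The talk's own tool is Perl; this sketch is Python. The shorthand grammar assumed here (`±-deleted-+(place)added+±` for a scribal correction, the form seen in the long transcription sample) is inferred from the slides only, and the names `SUBST`, `subst_to_tei` and `word_to_tei` are hypothetical. It does not attempt the abbreviation expansion (`<ex>`) that the slides also show.

```python
import re
from xml.sax.saxutils import escape

# Assumed shorthand for a correction: ±-deleted-+(place)added+±
# The "(place)" part is optional.
SUBST = re.compile(
    r"±-(?P<deleted>[^-]*)-\+(?:\((?P<place>[^)]*)\))?(?P<added>[^+]*)\+±"
)


def subst_to_tei(match):
    """Turn one shorthand correction into a TEI <subst> element."""
    place = match.group("place")
    attr = f' place="{place}"' if place else ""
    return (
        f"<subst><del>{match.group('deleted')}</del>"
        f"<add{attr}>{match.group('added')}</add></subst>"
    )


def word_to_tei(word):
    """Wrap one transcribed word in <seg>, expanding any corrections."""
    body = SUBST.sub(subst_to_tei, escape(word))
    return f'<seg type="word">{body}</seg>'


print(word_to_tei("հ±-այ-+(overwrite)ոռոմ+±ոց,"))
```

The point of the design is that the transcriber only ever types the short form; the verbose XML is generated, so it can be regenerated if the encoding decisions change.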
  • 67. TEI text: what now? [the word-by-word TEI XML from slide 58, shown again]
  • 68. Collation “The collation of manuscripts requires the infuriating accuracy of a pedant and the obsessive stamina of an idiot. It is therefore an ideal task for a computer.” (Peter Robinson, “Collation and Textual Criticism”, LLC vol. 4 no. 2, 1989)
  • 72. Collation • need to align words with each other • ...across many manuscripts • ...even when the words aren’t exactly the same (e.g. “յաշխարհին” vs. “աշխարհն”)
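Aligning words that "aren't exactly the same" needs some notion of a near match. What follows is a rough sketch, not the talk's method: it assumes a plain character-similarity threshold (Python's standard `difflib`) and a longest-common-subsequence style alignment of just two witnesses. Real collation handles many witnesses at once. The threshold of 0.6 and the names `similar` and `align` are illustrative choices, and the three-word sample is made up around the slide's example pair.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """True when two word forms are close enough to count as 'the same word'."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def align(base, other, threshold=0.6):
    """Align two word lists; returns (base_word, other_word) pairs, None = gap."""
    n, m = len(base), len(other)
    # score[i][j] = best number of matched words using base[:i] and other[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            hit = 1 if similar(base[i - 1], other[j - 1], threshold) else 0
            score[i][j] = max(score[i - 1][j - 1] + hit,
                              score[i - 1][j], score[i][j - 1])
    # Walk back from the end to recover which words were paired.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if (similar(base[i - 1], other[j - 1], threshold)
                and score[i][j] == score[i - 1][j - 1] + 1):
            pairs.append((base[i - 1], other[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j]:
            pairs.append((base[i - 1], None))
            i -= 1
        else:
            pairs.append((None, other[j - 1]))
            j -= 1
    while i > 0:
        pairs.append((base[i - 1], None))
        i -= 1
    while j > 0:
        pairs.append((None, other[j - 1]))
        j -= 1
    return pairs[::-1]


for left, right in align(["ի", "յաշխարհին", "հայոց"], ["աշխարհն", "հայոց"]):
    print(left, "|", right)
```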
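  • 73. Collation table (one row per word; columns are witnesses A, B, C, D, E; "-" = word absent)
      յայսմ | այս | յայսմ | այս | այս
      ամենայն | ամենայն | ամի | ամենայն | ամենայն
      եղելոցն, | եղելոց | եղելոց | եղելոց | եղելոցս
      նստուցանեն | նստուցանեն | նստուցանեն | նստուցանեն | նստուցանեն
      զաթոռ | զաթոռ | յաթոռ | զաթոռ | զաթոռ
      հայրապետութեան | հայրապետութեան | հայրապետութեան | հայրապետութեանն | հայրապետութեան
      ի | - | ի | ի | -
      թաւբլուր | - | թաւաբլուրն։ | թաւբլուր | -
      եւ | - | եւ | եւ | -
      կացեալ | - | կացեալ | կացեալ | -
      անդ | - | անդ | անդ | -
      զամս | - | զամս | զամս | -
      գ | - | գ, | գ | -
      եւ | - | եւ | եւ | -
      ընդ | - | ընդ | ընդ | -
      ամենայն | - | ամենայն | ամենայն | -
      զ | - | զ | վեց | -
      ամ | - | ամ, | ամ | -
      կալեալ | - | կալեալ | կալեալ | -
      զաթոռ | - | զաթոռ | զաթոռ | -
      հայրապետութեանն | - | հայրապետութեան | հայրապետութեանն | -
      տէր | տէր | տէր | տէր | զտէր
      խաչիկ։ | խաչիկ։ | խաչիկն։ | խաչիկ։ | խաչիկ։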
  • 74. <!-- ... --> <p> <w><app> <rdg wit="#A #C">յայսմ</rdg> <rdg wit="#B #D #E">այս</rdg> </app></w> <w><app> <rdg wit="#A #B #D #E">ամենայն</rdg> <rdg wit="#C">ամի</rdg> </app></w> <w><app> <rdg wit="#A">եղելոցն</rdg> <rdg wit="#B #C #D">եղելոց</rdg> <rdg wit="#E">եղելոցս</rdg> </app></w> <w>նստուցանեն</w> <w type="prefix"><app> <rdg wit="#A #B #D #E">զ</rdg> <rdg wit="#C">յ</rdg> </app></w> <w>աթոռ</w> <w>հայրապետութեան</w> <w><app> <rdg wit="#A #C #D">ի</rdg> </app></w> <!-- ... --> </p> <!-- ... -->
  • 75. <!-- ... --> <p> <w><app> <rdg wit="#A #C">յայսմ</rdg> <lem wit="#B #D #E">այս</lem> </app></w> <w><app> <lem wit="#A #B #D #E">ամենայն</lem> <rdg wit="#C">ամի</rdg> </app></w> <w><app> <lem wit="#A">եղելոցն</lem> <rdg wit="#B #C #D">եղելոց</rdg> <rdg wit="#E">եղելոցս</rdg> </app></w> <w>նստուցանեն</w> <w type="prefix"><app> <lem wit="#A #B #D #E">զ</lem> <rdg wit="#C">յ</rdg> </app></w> <w>աթոռ</w> <w>հայրապետութեան</w> <w><app> <lem wit="#A #C #D">ի</lem> </app></w> <!-- ... --> </p> <!-- ... -->
  • 76. Our text and apparatus այս ամենայն եղելոցն նստուցանեն զաթոռ 1 ի թաւբլուր, 1 այս] յայսմ AC 1 ամենայն] ամի C 1 եղելոցն] եղելոց BCD եղելոցս E 1 զաթոռ] յաթոռ C 2 ի թաւբլուր] om. BE ...
  • 78. <!-- ... --> <p> <w><app> <lem wit="#A #C">յայսմ</lem> <rdg wit="#B #D #E">այս</rdg> </app></w> <w><app> <rdg wit="#A #B #D #E">ամենայն</rdg> <lem wit="#C">ամի</lem> </app></w> <w><app> <lem wit="#A">եղելոցն</lem> <rdg wit="#B #C #D">եղելոց</rdg> <rdg wit="#E">եղելոցս</rdg> </app></w> <w>նստուցանեն</w> <w type="prefix"><app> <lem wit="#A #B #D #E">զ</lem> <rdg wit="#C">յ</rdg> </app></w> <w>աթոռ</w> <w>հայրապետութեան</w> <w><app> <lem wit="#A #C #D">ի</lem> </app></w> <!-- ... --> </p> <!-- ... -->
  • 79. New text and apparatus յայսմ ամի եղելոցն նստուցանեն զաթոռ 1 ի թաւբլուր, 1 յայսմ] այս BDE 1 ամի] ամենայն ABDE 1 եղելոցն] եղելոց BCD եղելոցս E 1 զաթոռ] յաթոռ C 2 ի թաւբլուր] om. BE ...
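The move from the first apparatus to the new one is mechanical: the readings stay where they are, only the choice of lemma changes, and the apparatus can be regenerated. A small sketch of that step, assuming each variant location is held as a mapping from reading to witness sigla (the data layout and the name `apparatus_entry` are illustrative, not taken from the talk):

```python
def apparatus_entry(readings, lemma):
    """Format one apparatus entry: 'lemma] variant SIGLA variant SIGLA ...'.

    readings maps each reading to the list of witnesses that carry it;
    a reading of None stands for an omission.
    """
    variants = []
    for text, sigla in readings.items():
        if text == lemma:
            continue  # the lemma itself goes before the bracket
        shown = "om." if text is None else text
        variants.append(f"{shown} {''.join(sigla)}")
    return f"{lemma}] " + " ".join(variants)


first_word = {"յայսմ": ["A", "C"], "այս": ["B", "D", "E"]}

# Same data, two editorial decisions:
print(apparatus_entry(first_word, lemma="այս"))
print(apparatus_entry(first_word, lemma="յայսմ"))
```

Because the decision is stored separately from the evidence, a reader who disagrees with the editor can, in principle, pick a different lemma and get a consistent apparatus back.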
  • 81. Manuscript stemmas: the family tree
  • 86. Stemma construction • Better stemma through analysis of collation results • Borrows statistical models from evolutionary biology • “Maximum parsimony” based upon DNA of specimens • Manuscripts are specimens • Biologists have DNA sequences; we have words.
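To hand words to software built for DNA, each row of the collation can be recoded as a string of character states: the first form seen becomes A, the next distinct form B, and so on, with O for a witness that omits the word. A minimal sketch under those assumptions; the letter scheme mirrors the matrix on the slide that follows, while the function name `code_row` and the decision to ignore trailing punctuation are my own guesses at how that matrix was produced.

```python
STATE_LETTERS = "ABCDEFGHIJKLMN"  # "O" is left out: it is reserved for omissions


def code_row(readings, punctuation=",.։"):
    """Recode one collation row (one reading per witness) as state letters.

    readings holds the word form for each witness in a fixed order,
    or None where the witness lacks the word.
    """
    codes = {}
    row = []
    for reading in readings:
        if reading is None:
            row.append("O")
            continue
        form = reading.strip(punctuation)  # treat "գ" and "գ," as the same form
        if form not in codes:
            codes[form] = STATE_LETTERS[len(codes)]
        row.append(codes[form])
    return " ".join(row)


print(code_row(["յայսմ", "այս", "յայսմ", "այս", "այս"]))
print(code_row(["ի", None, "ի", "ի", None]))
```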
  • 88. A B A B B | A A B A A | A B B B C | A A A A A | A A B A A | A A A B A | A O A A O | A O B A O | A O A A O | A O A A O | A O A A O | A O A A O | A O A A O | A O A A O | A O A A O | A O A A O | A O A B O | A O A A O | A O A A O | A O A A O | A O B A O | A A A A B | A A B A A
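With a matrix like this, "maximum parsimony" means preferring the tree that explains the readings with the fewest changes. The sketch below counts changes for a given tree with Fitch's algorithm, a standard method from phylogenetics; it is not necessarily what the talk's software does. The two candidate trees are invented for illustration and are not the stemma from the talk. Real programs search over very many trees, and manuscripts, unlike biological specimens, can be direct ancestors of other surviving copies. The matrix holds the first six variant locations from the slide, read column-wise per witness.

```python
def fitch_changes(tree, states):
    """Minimum number of changes one character needs on a rooted binary tree.

    tree is a nested 2-tuple of witness names; states maps name -> state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        changes += 1  # the two subtrees disagree: one change is required
        return left | right

    visit(tree)
    return changes


def tree_length(tree, matrix):
    """Total changes over all variant locations (lower = more parsimonious)."""
    width = len(next(iter(matrix.values())))
    return sum(
        fitch_changes(tree, {wit: row[k] for wit, row in matrix.items()})
        for k in range(width)
    )


# One string per witness: its state at each of the first six locations.
MATRIX = {"A": "AAAAAA", "B": "BABAAA", "C": "ABBABA", "D": "BABAAB", "E": "BACAAA"}

tree_1 = (("A", "C"), (("B", "D"), "E"))  # hypothetical grouping
tree_2 = (("A", "B"), (("C", "D"), "E"))  # another hypothetical grouping

print(tree_length(tree_1, MATRIX), tree_length(tree_2, MATRIX))
```

On these six locations the first grouping needs fewer changes, so parsimony would prefer it; with only six characters that is a toy result, not a finding about the manuscripts.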
  • 90. [Stemma diagram] Non-fragmentary manuscripts omitted: Paris 191, 200; Jerusalem 3651; Matenadaran 2855, 2899, 3380, 6605, 8159, 8232, 8894; Rome 25; Vienna 243, 246. Annotations on the tree: gaps appear; chapter divisions appear; text truncated. Witnesses shown: F (1617), B (1623), X (1669), A (1689), Matenadaran 3520 (17th c.), O (ca. 1702), Matenadaran 2644 (1844), W (1601), V (1590-1600), J (1617), D (1647), Jerusalem 1869 edition*, H (17th c.), Z (17th c.), Y (17th c.), K (1699), L (1660), I (1664), Matenadaran 3071 (1651-61), Bzommar 644 (1775-1805), Venice 986 (1830-35). *Based on Jerusalem mss. 1051, 1107
  • 98. Online publication • XML can also be turned into HTML for online publication • This gives: • searchable text • easy updates • configurable set of variants • links to manuscript images where available
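A minimal sketch of the XML-to-HTML step, assuming the apparatus markup shown earlier (`<w>`, `<app>`, `<lem>`, `<rdg wit="...">`) and Python's standard `xml.etree.ElementTree`. The HTML shape (a `span` whose `title` carries the rejected readings) and the function names are illustrative choices only. Real TEI files also declare the TEI namespace, which this sketch leaves out for brevity.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_word(w):
    """Render one <w>: plain text, or the lemma with its variants as a tooltip."""
    app = w.find("app")
    if app is None:
        return escape("".join(w.itertext()))
    lem = app.find("lem")
    variants = "; ".join(
        f"{rdg.text} {rdg.get('wit', '').replace('#', '')}"
        for rdg in app.findall("rdg")
    )
    return f'<span class="lem" title="{escape(variants)}">{escape(lem.text)}</span>'


def paragraph_to_html(xml_text):
    """Turn one collated <p> into an HTML paragraph."""
    root = ET.fromstring(xml_text)
    return "<p>" + " ".join(render_word(w) for w in root.iter("w")) + "</p>"


sample = (
    '<p><w><app><rdg wit="#A #C">յայսմ</rdg>'
    '<lem wit="#B #D #E">այս</lem></app></w>'
    "<w>նստուցանեն</w></p>"
)
print(paragraph_to_html(sample))
```

Filtering which `rdg` elements go into the tooltip (by witness, or by kind of variant) is where a "configurable set of variants" would hook in.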

Editor's Notes

  1. I’m going to start off by asking a very simple question.
  2. How do we know what happened in the past? [ASK: We read histories. We read literature. We look at art. We look at archaeology. We listen to people.] So about these histories we read. Where do they come from? Hint: "the library" is not the answer. Or, well, it is an answer. The history in Lucius' library may have had a few differences from the one in Brutus'. Why was that? No printing press. No mechanical form of copying.
  3. [woodcut slide] This was your printer.
  4. Trouble is, humans don’t make very good machines. People are kind of bad at copying: - they read as they go, - they skip lines, - they change dialects in the next village, - they take shortcuts...
  5. ... because [TRANS] the monastery is cold and [TRANS] the food is terrible and [TRANS] the tea ladies are unfriendly and [TRANS] they really just don't want to be there anymore. - (Think I'm kidding? You should read some of the notes that the copyists left in their books.)
  9. Very important to remember - the things we care about in a historical source, including preservation of the original, are *not* the things they cared about several hundred years ago. [TRANS] You can find exceptions, especially in the study of classical texts, but it wasn’t the rule. This would seem obvious, but you would be amazed how often people still lose sight of this in professional scholarship. [TRANS] So what does this mean? For one thing, people weren't so concerned about preserving the original. - Need a copy of another book? Out of parchment? - Pick a book you don't want anymore and scrape off the ink. - Was that the original copy of some history that is going to be massively important in 600 years? How would you know? Anyway Arnulf down the road has another copy; they can read his.
  12. So now it's 600 years later and the study of history has become more rigorous. We want to know what our author actually wrote, but we don't have the original anymore. [TRANS] If we're lucky, we just have a bunch of error-ridden copies. [TRANS] If we're unlucky, we only have one error-ridden copy. [TRANS] If we're really unlucky, we just have a reference to the history in someone else's book.
  15. Textual criticism is the field of taking all these sources and constructing the basic foundation of history. (It means something similar but slightly different in the field of literature, but hey we’re all historians here.) In some sense it's the preserve of detail-obsessed nerds like me, but every historian needs to know what the field is all about, and what the challenges are, so you can make your own informed decision about the value of a source. The product of textual criticism on a particular text is a “critical edition”. This will have a version of the text, chosen from the available alternatives, according to whatever criteria the editor thinks best. Sometimes you may wish to politely disagree with the choices that the editor has made...
  16. This is why a critical edition also contains [TRANS] a specially formatted block of footnotes called an "apparatus criticus" that gives all the readings from the manuscripts that were rejected from the base text. This is also fiddly and irritating to format. People have written entire word processors just to do this. But it is the whole point of a critical edition really. [TRANS CLOSEUP]
  17. e.g. - in line 2, where it says “proinde”, manuscript D has “primum” - in line 6, where it says “transferretur”, mss N and H have “transferrentur” - in line 13, where it says “et consecrandum”, that’s missing in ms H
  18. - who cares about what? - Historians want to know what happened. They need to know whether the text said “the man bit the dog” or “the dog bit the man”. - Linguists and philologists want to know how it was said. They need to know whether the author spelled “potato” with an extra E. - Other people will want other things. Someone making a historical atlas might care very much about every variation in spelling of place names, and not care about extra Es in “potato”. - Trouble is, to date, what sort of critical edition you get depends entirely on who’s doing it. The editor has had to decide for him/herself what is “interesting”, and anything that isn’t “interesting” isn’t included. So how does this play out in practice?
  21. I shall illustrate with the text I know best, which is a 12th century Armenian chronicle, and is a beautiful example by virtue of being equally irrelevant to all of you.
  22. [SLIDE] First a little historical background. (This is Anatolia, in case it isn’t very clear.) Matthew of Edessa was an Armenian priest who lived and [TRANS] wrote in Edessa. His chronicle begins in 952; [TRANS] covers the good times up to 1045; [TRANS] covers the migration of the Armenian nobility to Cappadocia as the Seljuks rampaged around; then starts to talk about his own time, and in particular the First Crusade, but also the... [TRANS] rise of new Armenian lords in Kesoun, Raban, and... [TRANS] Cilicia, the last of which would eventually become the Cilician Kingdom of Armenia, which would be pretty important after Matthew was dead. The Chronicle was written in the 1130s. Those were exciting times if you lived in Edessa. This is why Matthew wrote the history he did. It turns out that we are really glad he did, because he was there to see the Crusader princes come through and he had a rather different viewpoint than anyone named Baldwin or anyone named Ioannes.
  27. So he wrote this history, and we need an edition. How do you do that? First you have to find out what there is to work with. Traditionally, you look at all the manuscripts, and then you make a choice (often arbitrary, or based only on age of manuscript) about which one you'll base your edition on.
  28. Then you return to the roots of scholarship. You copy it out (by longhand, typewriter, word processor, or spreadsheet), maybe one word per line, and you note everything you observe about that word. Is it abbreviated? Is it misspelled? Is it at a line boundary? A page boundary? Is there a margin note pointing to the word? Is it scratched out? Maybe these things won't be important, but you never know when they will be. That missing pair of words at the end of the line might be the proof you need that this manuscript was copied from another manuscript, for example. Then you do the same thing with all the other manuscripts. By the end, you are cold and hungry and the tea ladies are unfriendly and you really just don't want to be there anymore. On the other hand, you know the text *really* well by now. So then you have to go through your transcription one last time, sick of it as you are, and make decisions about which words will go into your edited text. In my case, the Chronicle was copied a lot, and it was thrown away a lot. The earliest copy we have is from more than 400 years later. Today there are 42 manuscripts. Of those, 6 are short extracts from the history, which leaves 36 manuscripts that need to be at least looked at closely.
  29. The oldest manuscript I know about is Venice 887. It is held by the Mekhitarist monastery in Venice, and was copied sometime between 1590 and 1600. [TRANS] The next oldest is Vienna 574, held by the Mekhitarist monastery there, and dates from 1601. [TRANS] More than half the non-fragmentary manuscripts date to the seventeenth century, which suggests that multiple older copies, now lost, served as their exemplars.
  32. This chart shows the distribution of extant manuscripts of the chronicle. You can see the sudden proliferation of copies of this text in the seventeenth century; [digression about plant kingdom biology chart] the characteristics of the surviving copies strongly suggest that there were many copies made before 1600 that have now been lost. Once I’ve made my list of manuscripts, the next step is to see if I can find patterns. Were some of them obviously copied from others? Do some of them have really obvious features in common?
  33. This is what I was able to find. The two oldest manuscripts represent two distinct groups, which are pretty easy to spot. - First (V887) group has complete text, shares parchment with (10th c.) Nerses - Second (W574) group has truncated text, shares parchment with long specific set of texts that I won’t bore you with - The colophons in many of these manuscripts show that the copyists were aware of the truncation. Aha, that’s a Clue.
  34. There is one manuscript from the first group that is particularly odd. Matenadaran manuscript 1896 [TRANS] copied in 1689—many years after our two group leaders. [TRANS] Preserves 2 longish passages of text that appear nowhere else. [TRANS] Many of the other manuscripts contain marginal notes that show awareness of gaps. In fact, this one left some room for the gaps, *and then went back and filled them in*. There’s a Clue if I ever saw one. [TRANS] And yet it’s the only one of the 43 that has these bits, and it’s nowhere near the oldest. - complicated and twisted manuscript tradition. Normally I’m supposed to look at all this information, think about it for a while, and then draw a stemma. Stemma is my best guess at a manuscript family tree—shows copy relationships between mss. - The more mss I can find that were copied from others I have, the fewer I have to transcribe for the edition. Win. - This is so snarled up that I can’t even begin to draw a stemma though. Lose. - But sometimes, rarely, scribes are helpful. When that happens, win.
  38. I’ll just have to plunge in and make the edition without having the stemma yet. There are four steps to making a critical edition: [CLICK] transcription, [CLICK] collation / text analysis, [CLICK] editing, and [CLICK] publication. The first two of these, and to a lesser extent the third, are so horrendously tedious that I spent a while looking desperately for shortcuts. This brings us to the other thing I’m supposed to talk about today...
  42. ...how to coax the computer into doing my work for me. What’s about to follow is a rather detailed case study in how computers can take a particular academic task and make it a lot easier and a lot less tedious. Many of you will never find yourselves making a critical edition, but the point of computer techniques in humanities is to look for anything at all, no matter how small, that is mindless and repetitive and let the computer deal with it so you don’t have to.
  43. So finally I had some manuscripts. I plunged into the transcription. [TRANS] It turns out that this takes a huge amount of time. [TRANS] What I really want is for the computer to read it for me. This actually does exist, and is called “optical character recognition” (OCR for short). It is what Google uses to scan in all those books and make them searchable. [TRANS] Unfortunately there's not yet any such thing as OCR for manuscripts. Not only were the monks in the scriptoria not machines; their handwriting wasn’t as regular as printed type either. This means that the transcription itself is still horrible, mind-numbing, and time-consuming. But the transcription is the only horrible part I have to do, and there are a few things I learned along the way.
  46. So here we have a manuscript page. Just as the history itself has nothing to do with your own fields, I’ve even managed to pick a script that none of you can read. So you’ll just have to take my word for it. I initially just typed out my texts into plain text files, in something like Notepad or TextEdit. No red font, no formatting, just the words. I figured this would make it easier for the computer to compare words later across all the different manuscripts.
  47. So I can transcribe the text, but I already have a dilemma: - Standardize spelling? - Unbroken lines? - How to record deletions & additions? - How to record page breaks? Section divisions? Etc. Either I lose information that I need, or I invent some way to represent all these fiddly variations. On the other hand, I don’t want to cause trouble when I get the computer, which doesn't understand Armenian, to collate the results.
  48. Naturally, someone threw XML at the problem. XML is the eXtensible Markup Language, and is a really useful way of representing and transferring anything that you want the computer to process, and especially anything you might want the computer to display differently in different application windows, or to different people. This is a fragment of XML using the guidelines of the Text Encoding Initiative, or TEI. - Easy comparison of section / paragraph divisions - Can keep, and use, all sorts of metadata later in the program - Can output the collation result back to TEI XML. More on that later. [cmp with next slide]
  49. [cmp with prev slide]
  50. But something even better. What if I want to look at texts in languages other than Armenian? [Shocking.] Languages whose words aren’t divided by whitespace? - TEI tells me what’s a word, so that I don’t have to assume whitespace split - Now the programs I write can be language-independent So it's a nice solution, but at the same time it's a new problem.
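A rough sketch of what "TEI tells me what’s a word" could look like in practice. This is Python rather than the Perl mentioned elsewhere in the talk, and the fragment in `SAMPLE` is made up for illustration: it assumes each word is wrapped in a TEI `<w>` element, which is one way (not necessarily the way used here) to mark words. The function name `extract_words` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace, in the {uri}tag form that ElementTree expects.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment: words are marked with <w>, so whitespace is irrelevant.
# Note that the last two words are separated only by a line break element.
SAMPLE = (
    '<ab xmlns="http://www.tei-c.org/ns/1.0">'
    "<w>յայսմ</w> <w>ամի</w><lb/><w>աշխարհն</w>"
    "</ab>"
)


def extract_words(xml_text):
    """Return the text of every <w> element, in document order."""
    root = ET.fromstring(xml_text)
    return ["".join(w.itertext()).strip() for w in root.iter(TEI_NS + "w")]


words = extract_words(SAMPLE)
print(words)
```

Because the word boundaries come from the markup, the same function would work on a language that does not put spaces between words, provided the transcriber marked the words.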
  51. Who wants to write and edit XML by hand? [TRANS] One of the first parts of this I wrote was to save myself the trouble of typing transcriptions like this, while recording all of the information that I want to keep in my TEI files. [TRANS] The thing I came up with is a very good example of when you don’t want computer scientists like me getting involved in history. So instead of
  54. The trouble is that I find myself needing a long irritating snippet of XML like this for a single little scribal correction. (I know you can’t read it, but [explain the pieces]) Now since I think like a computer programmer, and since I have become reasonably familiar with the bits of TEI I need, I just transformed it all into code whose main advantage is that it’s quick to type. If you’re me.
  55. Yes I do know how to type that weird little plus-minus character. It also helps my transcription in that the file I end up with
  56. is a better visual match to the manuscript I'm transcribing than either the plain-text or the TEI XML versions. [GO TO CLOSEUP]
  57. This is the example I showed before—a relatively complicated transcription records the fact that the copyist originally wrote “Hayoc’” (which means Armenians), and corrected it to “Hrromoc’” (which means Romans) by writing over the original word.
  58. This shows a relatively simple way to indicate that the copyist abbreviated words, and to supply an expansion. You want to record both which letters are actually there and what you interpret the word to be. When I have finished transcribing, I pass the file to my computer program, and it writes out the XML for me. It’s a horrible solution actually. But there is nothing better, and this is one of the main problems with XML in the humanities. I was at a conference all about TEI in November, and usability is a big issue. I want us to think about this later. Also, even with all the computational aids we can throw at it, transcription is tedious and time-consuming. That means that, until we live far enough in the future for OCR to work like a dream, if you ever find yourself having to transcribe a text, you should do your bit to do a good enough job that no one else will ever have to repeat the work.
  59. So now I have some pretty well-formatted XML files, and I want to start the collation process. This means I need to extract the words. When I gave this talk at YAPC, I launched into a rant here about how much I hate XML, and admitted that my “processing” was really nothing more than “yanking the words back out into plaintext and parsing it that way.” But it turns out that, as irritating as XML is to parse, and as badly documented as XML::LibXML is, and as over-engineered a solution as it is, and as much as I just generally dislike it, the XML format is too useful to ignore, and I had to fix up the core of what I wrote over the summer to be able to handle the new XML information.
  60. [READ QUOTE] This observation was made by a man called Peter Robinson, who walked this path long before I did. He wrote a program too. It works pretty well, I'm told. Unfortunately it only works on Mac OS 9, and it doesn’t support non-Western languages very well. Time to reinvent the wheel. Only this time, I have Unicode, and I have Perl. [people still do this without unicode. seriously guys.]
  61. So this is what my collator needs to do. It should [B] align words with each other, [B] across many manuscripts, [B] even in the case of words such as աշխարհն (the land) and յաշխարհին (in the land) which are similar but not quite the same.
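To make the "similar but not quite the same" requirement concrete, here is a minimal sketch in Python (not the Perl collator from the talk, and not its algorithm). It aligns just two witnesses with a longest-common-subsequence style pass, where two words count as a match if the standard library's `difflib.SequenceMatcher` rates them at least 0.6 similar. The threshold, the function names, and the sample word lists are all assumptions for illustration; aligning many manuscripts at once is a harder problem that this does not attempt.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """Fuzzy match test: True if the two words are close enough."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def align(base, other, threshold=0.6):
    """Align two word lists; None marks a gap in one witness."""
    n, m = len(base), len(other)
    # score[i][j] = most matched pairs possible between base[i:] and other[j:]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if similar(base[i], other[j], threshold):
                score[i][j] = score[i + 1][j + 1] + 1
            else:
                score[i][j] = max(score[i + 1][j], score[i][j + 1])
    # Walk the table from the start, emitting pairs and gaps.
    rows, i, j = [], 0, 0
    while i < n and j < m:
        if similar(base[i], other[j], threshold):
            rows.append((base[i], other[j]))
            i, j = i + 1, j + 1
        elif score[i + 1][j] >= score[i][j + 1]:
            rows.append((base[i], None))
            i += 1
        else:
            rows.append((None, other[j]))
            j += 1
    rows.extend((w, None) for w in base[i:])
    rows.extend((None, w) for w in other[j:])
    return rows


# Illustrative word lists only, not a real passage from the chronicle.
witness_a = ["յայսմ", "ամի", "աշխարհն"]
witness_b = ["յայսմ", "յաշխարհին"]
for pair in align(witness_a, witness_b):
    print(pair)
```

Each output row is one column of the "spreadsheet": աշխարհն and յաշխարհին end up side by side even though they are not identical, and the word missing from the second witness shows up as a gap.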
  64. So it has to do something like this. Remember back at the beginning I said people often do their collation in a spreadsheet? Here I am getting the computer to do my collation for me in what you can then pretend is a spreadsheet. But wait! Now I have the word alignment, I can spit it all back out into XML...[NEXT]
  65. ...back out into TEI format. This uses a TEI module specifically for “text criticism”. They really did think of everything. [describe the word, the reading, etc.] And then when I (the editor) do my job, I will mark one of these as a lemma:
  66. which represents my decision about which word should go into the text. - trivial from this to generate an apparatus like we saw before:
  67. And if I decide I made a dumb mistake, and that the text should really start with յայսմ ամի (in this year) rather than այս ամենայն... (all this)
  68. I can fire up my editing program [NEXT]
  69. and change the lemma: [NEXT]
  70. and the new generated text would do the right thing.
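The lemma idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the editing program from the talk: a plain dictionary stands in for one TEI apparatus entry, the sigla A, B and C are invented, and `render` is a hypothetical name. The point is only that the edited text and the apparatus are both generated from the same record, so changing the lemma changes both.

```python
def render(collation):
    """Build the edited text and a simple apparatus from variant locations."""
    text, apparatus = [], []
    for location in collation:
        lemma = location["lemma"]
        text.append(lemma)
        # Every reading other than the lemma goes into the apparatus.
        variants = [
            f"{reading} {' '.join(sigla)}"
            for reading, sigla in location["readings"].items()
            if reading != lemma
        ]
        if variants:
            apparatus.append(f"{lemma}] " + "; ".join(variants))
    return " ".join(text), apparatus


# One variant location; the witness sigla are made up for this example.
collation = [
    {
        "lemma": "այս ամենայն",
        "readings": {"այս ամենայն": ["A", "B"], "յայսմ ամի": ["C"]},
    }
]

before_text, before_apparatus = render(collation)

# The editor changes their mind: only the lemma field is touched.
collation[0]["lemma"] = "յայսմ ամի"
after_text, after_apparatus = render(collation)

print(before_text, before_apparatus)
print(after_text, after_apparatus)
```

Nothing about the readings themselves is edited; the rejected reading simply moves into the apparatus on the next render.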
  71. Of course, someday this will not look so much like it was written for a computer scientist. Bear with me. But the idea is something like this. When the words have been aligned, the program will join together as long a chain as possible of different sets of words, and ask me which one is best. - Will provide a way to mark orthographic variants & misspellings - Computer will remember the things I’ve marked, and not ask me again - Eventually it will only need to ask me about variants that require a judgment call And that is the essence of digital techniques. Get the computer to handle anything that isn’t a judgment call.
  72. So now we can come back to the problem of a stemma. Part of the process of making a critical edition is to figure out, as best you can, the manuscript stemma—the family tree. It turns out that I can use the collation results I’ve just produced to help me with this.
  73. Manuscripts aren’t living organisms, BUT - have ancestors - the manuscripts from which they were copied; - have descendants - the manuscripts that were copied from them. - Sometimes have more than one ancestor. That’s called contamination. Let’s not think about that for now. - [TRANS] comparison to living things turns out to be helpful - while we medievalists have been slaving away with parchment and inkpots in libraries, biologists have come up with [TRANS] a statistical method called “maximum parsimony” - MP takes a bunch of genetic data (DNA), encoded as characters (those familiar letters) and produces the evolutionary tree that requires the fewest changes to get back to a common ancestor - [TRANS] Have “organisms” (mss); need family tree - Mss don’t have DNA, but [TRANS] they have words. - I have collation program that matches like words together. Pretend each occurrence of a similar word is a DNA base; feed to biologists’ statistical package. Voilà.
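For the curious, here is a small Python sketch of the scoring step behind maximum parsimony. It uses Fitch's counting method on a fixed rooted binary tree, which is one standard way to count the minimum number of changes a tree requires; it is not the statistical package used for this work, and a real package also searches over many candidate trees. The sigla, the character strings, and both candidate trees are invented for illustration.

```python
def fitch_changes(tree, states):
    """Minimum changes for one character on a rooted binary tree.

    tree is a nested tuple of leaf names; states maps leaf name -> character.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):  # a leaf: its state set is its own character
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        common = left & right
        if common:  # children can agree: no change needed here
            return common
        changes += 1  # children disagree: one change on some branch
        return left | right

    visit(tree)
    return changes


def tree_length(tree, matrix):
    """Total changes over all character positions in the matrix."""
    width = len(next(iter(matrix.values())))
    return sum(
        fitch_changes(tree, {name: row[k] for name, row in matrix.items()})
        for k in range(width)
    )


# Invented data: four "manuscripts", three variant locations each.
matrix = {"A": "AAB", "B": "AAB", "C": "BBA", "D": "BBA"}
tree_one = (("A", "B"), ("C", "D"))
tree_two = (("A", "C"), ("B", "D"))

print(tree_length(tree_one, matrix), tree_length(tree_two, matrix))
```

The tree that groups A with B and C with D needs fewer changes than the alternative, so under parsimony it is the preferred family tree for this toy data.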
  78. So here I have the words lined up with each other and can tell which ones are similar and which are different. All I have to do is pretend it's DNA, and assign a letter to each variant
  79. and suddenly I have a dataset that I can feed to a statistical analysis program.
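The "pretend it's DNA" step might look roughly like this in Python. It is a sketch under assumptions: the collation table, the sigla, and the function name `encode` are hypothetical, each distinct reading at a location is treated as its own variant (so any spelling normalization is assumed to have happened already), and the alphabet limits it to 26 variants per location. The exact file format a given phylogenetics package wants is not shown.

```python
from string import ascii_uppercase


def encode(table, gap="-"):
    """Turn aligned readings into one character string per manuscript.

    table maps siglum -> list of readings, with None where a witness
    has nothing at that location.
    """
    sigla = list(table)
    width = len(next(iter(table.values())))
    encoded = {siglum: [] for siglum in sigla}
    for column in range(width):
        codes = {}  # reading -> letter, restarted at every location
        for siglum in sigla:
            reading = table[siglum][column]
            if reading is None:
                encoded[siglum].append(gap)
                continue
            if reading not in codes:
                codes[reading] = ascii_uppercase[len(codes)]
            encoded[siglum].append(codes[reading])
    return {siglum: "".join(chars) for siglum, chars in encoded.items()}


# Invented collation table with three witnesses and three locations.
table = {
    "A": ["յայսմ", "ամի", "աշխարհն"],
    "B": ["յայսմ", None, "յաշխարհին"],
    "C": ["այս", "ամի", "աշխարհն"],
}

for siglum, characters in encode(table).items():
    print(siglum, characters)
```

The letters mean nothing across columns; "A" is just "the first reading seen at this location". That is enough for a program that only cares which witnesses agree with which.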
  80. - Result from the mss I’ve so far transcribed looks like this. The blue bit is the “Vienna” group of truncated mss; the rest can be thought of as the “Venice” group, though it turns out that a few members of that group have a lot more in common with each other than with any of the others. - Still requires editorial interpretation, - no accounting for relative dates of mss - no accounting for possibility of “living ancestor” - But if I use the knowledge I have about the mss to orient the tree and collapse some nodes...
  81. I end up with [SLIDE: new stemma] this. Part of it, [TRANS] here, is our Vienna set of truncated mss - The remainder of the tree not a very coherent group. The Venice group is pretty much “everything else”. - One manuscript in middle has 2 arrows into it; it was copied from more than one ms. That is contamination. I had to run 2 comparisons on 2 different chunks of text to find that. - Plausible picture of transmission history - Confirms the impression of lots of lost copies -- almost none of these were copied from each other [discourse here on Lachmann if there is time] - came up with original manual method of stemma analysis - his methods are superseded by this genetic analysis - but he’d be really excited by this because he always believed it was possible to rigorously derive the best edition - his dream won’t ever really work, but we are pushing it as close as we can to machine-generated editions
  84. So this is it! I have a critical edition, and I have a pretty new stemma of all of the manuscripts. I’m ready to rush to publication. This is another thing that XML makes ridiculously easy. The whole point of XML is that it can easily be transformed into Web pages, or into Word documents, or into book publishing format, or whatever else you need. I don’t ever actually have to fight with a spell checker.
  85. Let’s come back to that whole “online” thing. XML can also be turned into HTML - Gives features like [TRANS] searchable text. - All well & good about mss digitisation and online access, but isn’t it better if you can find what you’re looking for? [TRANS] frequent updates / corrections, [TRANS] configurable display of variants (what is “significant”? who’s asking?), [TRANS] links to original MS images for people who *really* care. This is a big thing that’s getting a lot of attention in the digital humanities right now - really exciting world - lots of people doing lots of interesting things so I don’t have to [COPYRIGHT??]