IMPLEMENTING TEI STANDOFF
ANNOTATION IN THE BROWSER
Hugh Cayless, Duke University
@hcayless
DEFINITIONS
Source: a run of plain text or text + markup.
Standoff: markup or annotations which occur away from the
source they deal with and which are not referenced directly by
that source.
Annotation: markup that adds ancillary information to a source
or part of a source.
STANDOFF
The TEI Guidelines mostly use the term “stand-off” in one sense:
to refer to markup that takes source text in one form and
reconstructs it in a different form.
[Diagram: a source (text structured in pages) and standoff markup (text structured in chapters and paragraphs), connected by restructuring markup]
STANDOFF
But people often also use the term “standoff” to refer to
annotations on the source text that associate new information
and analysis with it.
[Diagram: a source (text) and standoff markup (notes, analysis, additional information), connected by associative markup]
WHERE IS IT?
Many kinds of associative annotation can occur in multiple
contexts. Notes, for example, can appear inline, at the point
where the note is anchored, or can use the @target attribute to
point at the thing they are annotating.
The source itself can point outward to additional information, for
example a <persName> with a @ref pointing to a <person>
elsewhere: “This string is a name for the person defined over
there.”
So annotation can be inline, referenced, or standoff.
SOMETIMES THERE’S A CHOICE
Some annotations can work in all three ways, e.g. <note>:

<p>Some text<note>with an inline note</note>.</p>

<p xml:id="id">Some text.</p>
...
<note target="#id">with a standoff note</note>

<p>Some text.<ptr target="#id"/></p>
...
<note xml:id="id">with a referenced note</note>

“Here’s some more information about this paragraph.”
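
The standoff and referenced forms both come down to following a pointer to an xml:id. A minimal Python sketch of that lookup, assuming bare "#id" pointers only; the ids p1 and n1 and the function name resolve_targets are made up for illustration, so that both forms can sit in one document:

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Illustrative document: one standoff note and one referenced note.
DOC = """<body>
  <p xml:id="p1">Some text.</p>
  <p>Other text.<ptr target="#n1"/></p>
  <note target="#p1">with a standoff note</note>
  <note xml:id="n1">with a referenced note</note>
</body>"""


def resolve_targets(root):
    """Pair each element carrying @target with the element it points at."""
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    resolved = []
    for el in root.iter():
        # @target holds a space-separated list of pointers.
        for pointer in el.get("target", "").split():
            if pointer.startswith("#") and pointer[1:] in by_id:
                resolved.append((el.tag, by_id[pointer[1:]].tag))
    return resolved


pairs = resolve_targets(ET.fromstring(DOC))
for pointing, pointed_at in pairs:
    print(pointing, "->", pointed_at)
```

The direction of the pointer is the only difference between the two forms: the referenced note is pointed at from the source, while the standoff note points into it.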
BUT
Other types of annotation really only work one way:
<p><seg>When the Alexandrian war flared up,
<persName ref="#JC">Caesar</persName> summoned
every fleet from Rhodes and Syria and Cilicia;
from Crete he raised archers, and cavalry from
Malchus, king of the Nabataeans, and ordered
artillery to be procured, corn despatched, and
auxiliary troops mustered from every quarter.</seg>...</p>
“The enclosed string is a personal name, which refers to
the person defined in the element with the id ‘JC’”.
A DIGRESSION ON WORKFLOWS
Why would you do standoff markup?
1. To have it both ways: e.g. mark the source up by pages, but have a version
with the same text and chapters/paragraphs.
2. As a step in the construction of an edition, e.g. having collaborators
identify persons and places without changing the source yet.
3. Adding information to a source you don’t own or can’t modify (but which
ideally is stable).
4. Adding a new category of information to an already complex, highly
structured source.
THREE TYPES OF STANDOFF
Restructuring standoff: virtually rewrites the structure of the
source being annotated; operates on big chunks of text, not really
fragments.
Associative standoff: juxtaposes some part of the source with
a note or other piece of markup; fine-grained, but limited to
attaching one bit of information to another.
Assertive standoff: would make an assertion about some part
of the source, e.g. “This string is a place name.” BUT: how to do
it?
HOW; SOME IDEAS
Use restructuring standoff: rewrite the source with the personal names
identified.
Adopt a convention, e.g.:

<p><seg>When the Alexandrian war flared up, Caesar
summoned every fleet from Rhodes and Syria and
Cilicia;...</seg>...</p>
...
<person xml:id="Caesar">Julius Caesar</person>
...
<link target="#match(//p[1]/seg[1],'Caesar') #Caesar"/>

Note: Not the same thing as <persName ref="#Caesar">Caesar</persName>.
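
To make the convention concrete, here is a rough Python sketch of how a processor might resolve such a pointer to a span of characters. It assumes the second argument is a literal string rather than a pattern, takes only the first occurrence, and relies on the limited XPath subset in Python's ElementTree. The names POINTER and resolve_match are illustrative, not part of any TEI tool.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pointer shape, modelled on the slide: #match(PATH,'LITERAL')
POINTER = re.compile(r"#match\((?P<path>[^,]+),\s*'(?P<text>[^']*)'\)")


def resolve_match(root, pointer):
    """Return (element, start, end) for the first occurrence of the literal."""
    m = POINTER.fullmatch(pointer)
    if m is None:
        raise ValueError(f"not a match() pointer: {pointer!r}")
    # ElementTree wants a relative path, so '//p[1]/seg[1]' becomes './/p[1]/seg[1]'.
    element = root.find("." + m.group("path"))
    if element is None:
        raise LookupError(f"nothing found at {m.group('path')!r}")
    text = "".join(element.itertext())
    start = text.find(m.group("text"))
    if start < 0:
        raise LookupError(f"{m.group('text')!r} does not occur in the target")
    return element, start, start + len(m.group("text"))


doc = ET.fromstring(
    "<body><p><seg>When the Alexandrian war flared up, Caesar "
    "summoned every fleet from Rhodes and Syria and Cilicia;</seg></p></body>"
)
seg, start, end = resolve_match(doc, "#match(//p[1]/seg[1],'Caesar')")
print(start, end, "".join(seg.itertext())[start:end])
```

Resolving the pointer only locates the string. What the <link> means by pairing it with #Caesar still has to be agreed outside the markup.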
HOW; RESTRUCTURING
<p><seg>When the Alexandrian war flared up,
Caesar summoned every fleet from Rhodes and Syria
and Cilicia; ...</seg>...</p>

<p><seg><join target="#string-range(//p[1]/seg[1],0,36)"/>
<persName ref="#Caesar"><join target="#string-range(//p[1]/seg[1],37,42)"/></persName>
<join target="#string-range(//p[1]/seg[1],43,98)"/> ...</seg>...</p>
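
The numbers in pointers like these can be derived mechanically from the text. A minimal Python sketch, assuming each pair is read as (offset, length) counted from zero; if your processor reads the pair as (start, end), adjust accordingly. The function name string_ranges is made up for illustration.

```python
def string_ranges(text, name):
    """Split text around the first occurrence of name.

    Returns three (offset, length) pairs: the text before the name,
    the name itself, and the text after it.
    """
    start = text.index(name)  # raises ValueError if the name is absent
    end = start + len(name)
    return [(0, start), (start, len(name)), (end, len(text) - end)]


segment = "When the Alexandrian war flared up, Caesar summoned every fleet"
ranges = string_ranges(segment, "Caesar")
for offset, length in ranges:
    print(offset, length, repr(segment[offset:offset + length]))
```

The three pieces always add up to the whole passage, which is the drawback in miniature: identifying one name means restating everything around it.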
HOW; ASSOCIATION
<link> with @target, which contains a space-separated list of
pointers understood to be associated.
<span> with @from and @to, specifying a start and end of the
thing being annotated, or with @target (somewhat confusingly).
<note> with @target.
All of these require some additional knowledge outside the
markup, because all they do is connect things up (they're
associative).
HOW; ASSERTION
Our example using restructuring is assertive. It clearly says “here is a
reading of this passage with personal names identified”, but it has some
drawbacks:
It requires that the whole passage be remade—it can’t target just the
names.
Annotations can have overlap problems, so restructuring runs into the
usual difficulties.
They may have interdependencies (if name x refers to person A, then y
is probably her brother, person B; if not, then y is probably person Z).
DOES TEI HAVE ASSERTIVE
ANNOTATIONS?
A critical apparatus, or apparatus criticus if you’re being snooty, is
a set of annotations that record textual variants an editor wants
the reader to know about.
<p n="1" xml:id="p1">
  <seg n="1" xml:id="seg-1.1">Bello Alexandrino
    conflato Caesar <app>
      <lem>Rhodo</lem>
      <rdg wit="#S" ana="#orthographical">Ordo</rdg>
    </app> atque ex Syria Ciliciaque omnem classem
    arcessit; ...</seg>
  ...
</p>
<lem>: the lemma (what’s in the editor’s text).
<rdg>: a reading, from S (Florence, BML Ashburnham 33).
WHAT? WHY WOULD YOU DO
SUCH A THING?
Takes the form of inline or standoff notes on the text.
Expressly for making assertive annotations of the form “version x
reads ‘B’ rather than ‘A’ here”.
Can accommodate differences in markup as well as text.
Can cope reasonably well with overlap.
Can handle dependencies / conflicts between annotations.
CRITICAL APPARATUS
Can also report prior editors’ emendations of the text or
speculative emendations by the current editor.
Can even record alternate ways of punctuating the text.
So it’s not too far-fetched to consider using it for emendations to
the markup.
Given:

<p><seg>When the Alexandrian war flared up, Caesar
summoned every fleet from Rhodes and Syria and
Cilicia;...</seg>...</p>
...
<person xml:id="Caesar">Julius Caesar</person>

Instead of:

<link target="#match(//p[1]/seg[1],'Caesar') #Caesar"/>

why not:

<app from="#match(//p[1]/seg[1],'Caesar')">
  <rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg>
</app>

Says, explicitly: “Damon says this is a personal name
referring to Julius Caesar.”
OK, FINE, BUT YOU SAID SOMETHING
ABOUT IMPLEMENTING IT...
We need:
1. A way to identify persons, places, etc.
2. A way to turn that into a usable data source.
3. A way to actually do things with it.
RECOGITO (#1)
https://recogito.pelagios.org/
Developed mainly by Rainer Simon of the Austrian Institute of
Technology for the Pelagios Network (https://pelagios.org/)
Designed for, and has most support for, place annotations, but
does people and organizations too.
Exports to CSV, JSON-LD, RDF, GeoJSON, ...and TEI
Pretty much covers #1. #2 needs a bit of work.
#3 TURNS OUT TO BE EASY(ISH)
Given a TEI document and annotations like:
<listApp>
  <app from="#match(seg-1.1,'Caesar')"><rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg></app>
  <app from="#match(seg-1.1,'Rhodo')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/590031">Rhodo</placeName></rdg></app>
  <app from="#match(seg-1.1,'Syria')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/1306">Syria</placeName></rdg></app>
  <app from="#match(seg-1.1,'Cilicia')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/628957">Cilicia</placeName></rdg></app>
  <app from="#match(seg-1.1,'Creta')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/991373">Creta</placeName></rdg></app>
  ...
we can (e.g.) turn the standoff annotations into links
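
As a rough illustration of that last step, here is a Python sketch (not the browser code from the talk) that reads <app> annotations of this shape and wraps each matched string in an HTML link. It assumes literal first-occurrence matching, ignores overlapping matches, and uses a simplified source segment with no inline markup. The function name to_links is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
# Assumed pointer shape: #match(ID,'LITERAL')
MATCH = re.compile(r"#match\(([^,]+),\s*'([^']*)'\)")

# Simplified source: the segment from the talk, without its inline <app>.
SOURCE = (
    '<p xml:id="p1"><seg xml:id="seg-1.1">Bello Alexandrino conflato Caesar '
    "Rhodo atque ex Syria Ciliciaque omnem classem arcessit;</seg></p>"
)
ANNOTATIONS = """<listApp>
  <app from="#match(seg-1.1,'Caesar')"><rdg source="#Damon"><persName ref="#Caesar">Caesar</persName></rdg></app>
  <app from="#match(seg-1.1,'Rhodo')"><rdg source="#Damon"><placeName ref="http://pleiades.stoa.org/places/590031">Rhodo</placeName></rdg></app>
</listApp>"""


def to_links(source_xml, annotations_xml):
    """Return {segment id: HTML string} with annotated strings turned into links."""
    source = ET.fromstring(source_xml)
    by_id = {el.get(XML_ID): el for el in source.iter() if el.get(XML_ID)}
    spans = {}  # segment id -> list of (start, end, ref)
    for app in ET.fromstring(annotations_xml).iter("app"):
        m = MATCH.fullmatch(app.get("from", ""))
        name = app.find("rdg/*")  # the <persName> or <placeName> inside the reading
        if m is None or name is None or m.group(1) not in by_id:
            continue
        seg_id, literal = m.groups()
        text = "".join(by_id[seg_id].itertext())
        start = text.find(literal)
        if start >= 0:
            spans.setdefault(seg_id, []).append(
                (start, start + len(literal), name.get("ref", ""))
            )
    result = {}
    for seg_id, found in spans.items():
        text = "".join(by_id[seg_id].itertext())
        pieces, pos = [], 0
        for start, end, ref in sorted(found):
            if start < pos:  # overlapping match: skipped in this sketch
                continue
            pieces.append(escape(text[pos:start]))
            pieces.append(f'<a href="{escape(ref)}">{escape(text[start:end])}</a>')
            pos = end
        pieces.append(escape(text[pos:]))
        result[seg_id] = "".join(pieces)
    return result


linked = to_links(SOURCE, ANNOTATIONS)
print(linked["seg-1.1"])
```

The @source on each <rdg> is dropped here. A fuller version would carry it through, since who made the assertion is the point of using <app> rather than <link>.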
THE HARD PART
#2, the boring, standards-making part of deciding what TEI standoff
annotations actually look like, is hard. Export is easy (Recogito will
basically already do it), but what does the export look like?
There is a proposal underway for a new TEI <standoff> element
that could contain (e.g.) the output of an annotation session.
Maybe later this year we'll be done yelling at each other and be
able to actually define it. I hope there's a place in it for assertive
annotations, even if they don't look precisely like critical apparatus.
QUESTIONS?
