1. 18 June 2012
WP1 – Content
DM2E
Ewelina Suchorzebska
Austrian National Library
Department for Research and Development
2. 18 June 2012
WP1 key data
• Project member data providers: ONB (30), SBB
(16), EAJC (7), UBER (5), UIB (5), MPIWG (2)
• Non-project member data providers: NLI,
BBAW, CNRS – DFGA, CJH
• Two new providers:
– UBFFM – Universitätsbibliothek Johann
Christian Senckenberg Frankfurt am Main
– JDC – Joint Distribution Committee
• Expected progress in year one: 5000 pages
beyond DOW (according to Performance
Monitoring Table)
3. 18 June 2012
Content deviations
from the DOW
• EAJC: Content aggregator
– EAJC does not provide content itself –
the content listed for it comes from CJH
• NLI: Selection of content
– NLI will prepare relevant content for
DM2E – the numbers per content type
may differ
4. 18 June 2012
Work progress
• Content providers: Upload of sample
data → available on FTP server
• Content and Metadata questionnaires
→ available on Redmine
• EAJC aggregation of new content
provider JDC
• Requirements gathering for
Requirements report (D.1.1)
5. 18 June 2012
Objectives and Tasks
• Objective: Analyse source structures and map
them to the EDM
• Objective: Communicate requirements of
content providers to WP2
– Findings stated in the requirements report
• Task 1.1 Collect metadata formats and
relational backend structures
– Questionnaires, carried out by WP2 and MPIWG
• Task 1.2 Collect requirements
– Questionnaires and requirements report; support
by WP2
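As a rough illustration of the first objective, the sketch below maps a few fields of a flat source record onto EDM/Dublin Core property names. The crosswalk table, the `map_to_edm` helper and the sample record are hypothetical and do not represent the actual DM2E mapping; only the MARC21 tags (245$a title, 100$a main author, 260$c publication date) and the property names (dc:title, dc:creator, dc:date, edm:dataProvider) are taken from the respective standards.

```python
# Hypothetical crosswalk: (MARC tag, subfield) -> EDM/DC property name.
CROSSWALK = {
    ("245", "a"): "dc:title",
    ("100", "a"): "dc:creator",
    ("260", "c"): "dc:date",
}


def map_to_edm(record, data_provider):
    """Return a (mapped, unmapped) pair of dicts for one flat source record.

    `record` maps (tag, subfield) pairs to string values. Fields without
    a crosswalk entry are collected in `unmapped` for later review.
    """
    mapped = {"edm:dataProvider": data_provider}
    unmapped = {}
    for key, value in record.items():
        prop = CROSSWALK.get(key)
        if prop is None:
            unmapped[key] = value
        else:
            mapped[prop] = value
    return mapped, unmapped


# Made-up sample record, for illustration only.
sample = {
    ("245", "a"): "Example title",
    ("100", "a"): "Example, Author",
    ("500", "a"): "General note",
}
mapped, unmapped = map_to_edm(sample, "ONB")
```

Collecting unmapped fields separately, rather than dropping them, is one way to surface the provider-specific requirements that would need to be communicated to WP2.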
6. 18 June 2012
Content specifics
• ONB: 50.000 digitised volumes (books, newspapers etc.) from the
Google Books cooperation, 170 manuscripts (codices) – MAB2
• UBER: 205.000 pages Polytechnisches Journal – TEI
• SBB: 140.000 pages Nachlass of Hauptmann, von Chamisso –
MAB2/EAD
• BBAW: ~ 1.300 books (historical printings from 1650-1900) – TEI
• EAJC/NLI/CJH: 400.000 files on Jewish history, 10.000 digitised Hebrew
books printed in Europe, 5.000 newspaper pages printed in Europe,
5.000 European archival documents, 300 complete Hebrew European
manuscripts – NLI (MARC21); CJH (MARCXML)
• MPIWG: 900 digitised rare books, early modern texts on mechanics –
Index.meta / own format
• UIB: 5.000 autograph-pages of Wittgenstein – TEI P5
• CNRS: 9.330 pages of Nietzsche's manuscripts (DFGA), additional
10.000 pages in 2013
• JDC: 20.000 pages, records of Jewish life between 1914 and 1918 – EAD
XML
• UBFFM: 549 Hebrew and Medieval manuscripts – currently 154.070
pages, METS/MODS – will be partly ingested via DM2E (~70.000)
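The format labels listed above can be turned into a small inventory showing which providers share a source format and could therefore share mapping work. This is a minimal sketch: the `FORMATS` table copies the labels as they appear on the slide (no normalisation, so "TEI" and "TEI P5" stay separate), the Nietzsche manuscripts entry is omitted because no format is stated for it, and the helper name `providers_by_format` is hypothetical.

```python
from collections import defaultdict

# Provider -> metadata format labels, copied as stated on the slide.
FORMATS = {
    "ONB": ["MAB2"],
    "UBER": ["TEI"],
    "SBB": ["MAB2", "EAD"],
    "BBAW": ["TEI"],
    "NLI": ["MARC21"],
    "CJH": ["MARCXML"],
    "MPIWG": ["Index.meta"],
    "UIB": ["TEI P5"],
    "JDC": ["EAD XML"],
    "UBFFM": ["METS/MODS"],
}


def providers_by_format(formats):
    """Invert the provider -> formats table into format -> sorted providers."""
    grouped = defaultdict(list)
    for provider, labels in formats.items():
        for label in labels:
            grouped[label].append(provider)
    return {label: sorted(names) for label, names in grouped.items()}


by_format = providers_by_format(FORMATS)
for label in sorted(by_format):
    print(f"{label}: {', '.join(by_format[label])}")
```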
7. 18 June 2012
Challenges
• Relevant EDM knowledge /
necessary amendments for
specific content metadata → EDM
trainings by UBER
• Resources of non-project data
providers for EDM mapping
• Metadata preparation – depending
on WP3 platform
8. 18 June 2012
WP1 Time plan and
Outlook
• Final requirements gathering for
D.1.1 – Requirements Report (due
in June 2012)
• Ingestion Workshop (July 2012)
• EDM Trainings
• Communication within WPs
9. 18 June 2012
Thank you for your attention!
Contact details:
Ewelina Suchorzebska
Austrian National Library
Tel.: (+43 1) 534 10 - 355
e-mail: ewelina.suchorzebska@onb.ac.at
www.onb.ac.at