Software Developer at GWDG - Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Understanding, extracting and enhancing catalogue data (CE Book history workshop, 2023)
Understanding, extracting and enhancing catalogue data
Péter Király
(GWDG, Göttingen, Germany)
Central European Book History workshop: data & tools
11 January 2023
Österreichische Nationalbibliothek
https://bit.ly/book-history-onb-2023
place name normalization
place-synonyms.csv (8085 surface forms of 628 locations)
Milano=Milan|Milano, Italy|Milan, Italy|Milani|Cinisello Balsamo (Milano)|...
coords.csv (1800+ locations)
"Milano",3173435,"Milan","Italy","45.46427","9.18951"
Milan
Milano, Italy
Milan, Italy
Milani
Cinisello Balsamo (Milano)
…
Geonames ID   normal form   country   latitude   longitude
3173435       Milano        Italy     45.46427   9.18951
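The lookup described on this slide can be sketched in Python as follows. This is a minimal illustration, not the project's actual code. The function names are hypothetical, and the sample synonym line contains only the surface forms visible on the slide (the elided ones are left out). The column order of coords.csv is also an assumption, in particular the reading of the third column as an English name.

```python
import csv
import io

# Sample rows in the formats shown on the slide; the real files are much larger.
# The synonym line is shortened to the surface forms visible on the slide.
SYNONYMS_SAMPLE = "Milano=Milan|Milano, Italy|Milan, Italy|Milani|Cinisello Balsamo (Milano)\n"
COORDS_SAMPLE = '"Milano",3173435,"Milan","Italy","45.46427","9.18951"\n'


def load_synonyms(text):
    """Map every surface form (and the normal form itself) to its normal form."""
    lookup = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first "=": left is the normal form, right the variants.
        normal, _, forms = line.partition("=")
        lookup[normal] = normal
        for form in forms.split("|"):
            if form:
                lookup[form] = normal
    return lookup


def load_coords(text):
    """Map a normal form to (Geonames ID, country, latitude, longitude)."""
    coords = {}
    for row in csv.reader(io.StringIO(text)):
        # Assumed column order: normal form, Geonames ID, English name,
        # country, latitude, longitude.
        normal, geoid, _english, country, lat, lon = row
        coords[normal] = (int(geoid), country, float(lat), float(lon))
    return coords


def normalize(place, synonyms, coords):
    """Return (normal form, coordinate record), or None if unrecognized."""
    normal = synonyms.get(place.strip())
    if normal is None:
        return None
    return normal, coords.get(normal)


synonyms = load_synonyms(SYNONYMS_SAMPLE)
coords = load_coords(COORDS_SAMPLE)
result = normalize("Milani", synonyms, coords)
```

An exact-match dictionary keeps the sketch simple. Place names that are not listed as surface forms return None and would count as unrecognized.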
18th century books in three catalogues
country   catalogue   books with recognized locations   place name recognition (%)   normalized Geonames (751, 752)
Austria   ÖNB         123 431                           95+                          7%
Hungary   OSzK        32 974                            95+                          0%
Poland    BPNL        26 843                            90+                          1%
001 990029097480603338
751 $a Milano
$e publication place
$0 3173435
$1 https://www.geonames.org/3173435
$2 Geonames
$4 pup
roundtripping
data sharing options
record enhancement: ID and publication place information
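As an illustration of the record enhancement step, the 751 field shown on this slide could be assembled from a normalized place name and its Geonames ID roughly as below. This is a sketch under assumptions: build_751 and render are hypothetical helper names, the subfield values are copied from the slide, and no MARC library is used.

```python
def build_751(normal_form, geonames_id):
    """Return the subfields of a 751 field as (code, value) pairs."""
    return [
        ("a", normal_form),
        ("e", "publication place"),
        ("0", str(geonames_id)),
        ("1", f"https://www.geonames.org/{geonames_id}"),
        ("2", "Geonames"),
        ("4", "pup"),
    ]


def render(tag, subfields):
    """Render a field in the line-per-subfield layout used on the slide."""
    lines = []
    for i, (code, value) in enumerate(subfields):
        # Only the first line carries the field tag.
        prefix = f"{tag} " if i == 0 else ""
        lines.append(f"{prefix}${code} {value}")
    return "\n".join(lines)


field_text = render("751", build_751("Milano", 3173435))
print(field_text)
```

Pairing the rendered field with the record identifier from field 001 is what would let the enhancement be matched back to the source catalogue record.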