9. Archivematica is open source
Accessible Data
Proprietary Software By Karin Apricot via www.flickr.com/people/karenapricot/
By Konrad Summers [CC-BY-SA-2.0
(www.creativecommons.org/licenses/by-sa/2.0)], via Wikimedia
Commons
City Data
By Paul Rudman via
http://www.flickr.com/photos/thecanonrattman/
Open Source Accessible Data
By Paul Rudman via http://www.flickr.com/photos/thecanonrattman/
Digital Preservation
By Trish Steel [CC-BY-SA-2.0 (www.creativecommons.org/licenses/by-sa/2.0)], via geograph.org.uk
27. URLs for Resources
• Vancouver Open Data
http://data.vancouver.ca/datacatalogue/index.htm
• Europeana Data http://pro.europeana.eu/web/guest/linked-open-data
• Europeana portal http://www.europeana.eu/portal/
• Map Warper http://mapwarper.net/
• Map Warper at NYPL http://maps.nypl.org/warper/
• Akoma Ntoso main site http://www.akomantoso.org/
• Akoma Ntoso examples http://examples.akomantoso.org/
• Waisda? http://woordentikkertje.manbijthond.nl/
Editor's Notes
Open data is a good fit for archives. Opening records for free public use is the core of what we do, and what we’ve done for decades. Archivists are trained to administer privacy legislation, and we routinely consider privacy concerns when we make information available. In the past we made analogue records available, so we weren’t a resource for digital data researchers; now we have the digital infrastructure, and are acquiring the knowledge, to be able to offer digital data. At our archives, we are looking at open data in three different contexts:
1. We are the official repository and custodians of the older open data that the City has released
2. We have our own metadata that we’d like to release
3. We want to turn some of our analogue archival records into digital data
Let’s look at the City’s Open Data sets first. The City of Vancouver maintains a web site offering nearly 140 different data sets, most of them downloadable in multiple formats. Different sets are updated at different frequencies: daily, weekly, or monthly.
Each set has its own descriptive metadata.
Privacy concerns have already been dealt with. The Archives is NOT acquiring every incremental update of every set, just regular snapshots. We are still planning exactly HOW we will do this, but we intend to preserve all these data sets and make them freely downloadable.
We will preserve the data using Archivematica, a preservation system that we played a large role in developing. This is a screenshot of the first beta release, which came out just a few days ago.
How does it embody best practices? It’s open source. We cannot preserve anything by putting it into a proprietary black box and hoping for the best: with this system we know every action taken, because the code is open and transparent. In other words, Jeff Goldblum is not going to unexpectedly turn into The Fly. He will remain the essence of Jeff Goldblum, and we’ll be able to show how that happened.
Archivematica is based on standards. It was designed from the very beginning to conform to this ISO standard OAIS, which is a framework for digital preservation and access. It also incorporates several metadata standards, such as METS and PREMIS.
We need to develop preservation plans for some of the filetypes before we put them into the system, as Archivematica does not have existing plans for everything yet.
We need to take a look at the licence under which the City releases its data, and how long that licence will apply. This could become tricky in the future if the City changes the licence, and it’s likely to evolve. It would be awkward for the end user if the same data sets carried different licences in different years.
On to the second context: we have rich metadata about our holdings which we’d like to make available for use, and I’m sure most of you have the same: your catalogue metadata. Cultural institutions such as galleries, museums, archives, and libraries are making catalogue metadata easier for others to use for analysis, not just for searching. There isn’t a single best practice for sharing these data sets, although most applications use fairly common formats and schemas. Best practice is to provide what the community can use, and that can mean more than one type of access or format. Data can be shared as XML, JSON, or even CSV; it can be made available for download, via API, or both.
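The point about serving multiple formats from one set of records can be sketched with a few lines of code. This is a minimal illustration, not our actual system; the record fields and identifiers here are made up, and a real catalogue would follow a defined schema such as Dublin Core.

```python
import csv
import io
import json

# A few hypothetical catalogue records; field names are illustrative only.
records = [
    {"identifier": "AM1234", "title": "Council minutes, 1898", "date": "1898"},
    {"identifier": "AM5678", "title": "Harbour survey map", "date": "1923"},
]

def as_json(items):
    """Serialize the records as a JSON array, suitable for download or an API response."""
    return json.dumps(items, indent=2)

def as_csv(items):
    """Flatten the same records into CSV for spreadsheet users."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["identifier", "title", "date"])
    writer.writeheader()
    writer.writerows(items)
    return buf.getvalue()
```

The same in-memory records feed both serializers, which is the practical meaning of "more than one type of access or format": one canonical data set, several delivery shapes.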
Europeana, Europe’s collaborative program for access to digitized cultural heritage, released linked open metadata about its 20 million digital cultural objects a few days ago. The participating institutions signed a Data Exchange Agreement so they all agree to release the metadata under a Creative Commons Zero licence and to use a standard metadata schema. The public can download the entire data set, or subsets.
Catalogue metadata can be uploaded or harvested by an application such as Viewshare, an open source platform that allows people to look at metadata and digital objects in various views, with no hacking or coding required.
It automatically creates timeline views, pie chart views, and map views. All these views are available with faceting.
More sophisticated hacking has already been done with cultural metadata. Last year, a competition was held in the UK to build services using open data, including some catalogue data, library user activity data, and even OpenURL router data, which logs patron requests for digital academic papers. The winners included a service that links information about musical composers and one that tells you which English outdoor heritage features are near you so you can visit them.
Finally: Turning Analogue Records into Data. We have been digitizing archival records for 15 years, taking analogue records and, until recently, turning them into still or moving images. We want to go further and turn them into data, and there are different approaches we can use depending on the medium. Crowd-sourcing will be necessary for some of this, and we’d love it if libraries could encourage people to do some of this work.
This software is called the Map Warper. We are in discussion with the developers, hoping to roll this out next year. An open source application for georectifying images of old maps, the Map Warper was further developed by the New York Public Library and made easier for public use. We intend that the application will reside on City servers, and we will upload high-resolution scans of our maps to it. Each scan would exist merely as an image until someone wanted to use it. Then they would match known points on the old map with known points on OpenStreetMap, and, if there were enough control points, they would rectify the old map: that is, the map would know where it belongs geographically. Once the map is rectified, the user can save it to the system in common formats, and then others can download the rectified versions.
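Under the hood, rectification means fitting a transform that maps pixel coordinates on the scan to geographic coordinates, using the matched control points. Here is a minimal sketch of that idea with three invented control points and an exact affine fit; real tools like the Map Warper accept more points and use least-squares or polynomial warps, so this is only the simplest case.

```python
# Fit an affine transform  lon = a*x + b*y + c,  lat = d*x + e*y + f
# from three control points mapping (pixel_x, pixel_y) -> (lon, lat).
# Three points determine an affine transform exactly.

def solve3(m, v):
    """Solve a 3x3 linear system m * s = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
              - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
              + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    solution = []
    for i in range(3):
        mi = [row[:] for row in m]
        for r in range(3):
            mi[r][i] = v[r]  # replace column i with the right-hand side
        solution.append(det(mi) / d)
    return solution

def fit_affine(control_points):
    """control_points: three ((px, py), (lon, lat)) pairs."""
    m = [[px, py, 1.0] for (px, py), _ in control_points]
    lons = [lon for _, (lon, lat) in control_points]
    lats = [lat for _, (lon, lat) in control_points]
    return solve3(m, lons), solve3(m, lats)

def warp(point, params):
    """Map a pixel coordinate through the fitted transform."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Hypothetical control points on a scanned map of Vancouver.
cps = [((0, 0),    (-123.20, 49.30)),
       ((1000, 0), (-123.10, 49.30)),
       ((0, 1000), (-123.20, 49.25))]
params = fit_affine(cps)
```

Once the transform is fitted, every pixel on the scan can be assigned a longitude and latitude, which is what lets the rectified map be exported in standard geographic formats.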
We’d like to make the City Council minutes available as structured digital information. Presently, they’re mostly available as handwritten or typed pages, although later years do exist in various digital formats. This is a huge project and it’s just starting: a project to transcribe the handwritten pages has just begun, and we’re also planning to scan and OCR the typewritten ones. But even when the OCR and transcription are done, it’s still not data; there’s no machine-readable structure, just words.
We’re looking at applying the Akoma Ntoso XML schema. Developed for African parliaments, it is becoming widely used for legislative and parliamentary documents. BC’s Queen’s Printer has developed a tool for marking up documents in this schema, and legislative XML documents were featured in a Victoria hackathon this year. This slide shows a report, viewed in XML and HTML.
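To show what "machine-readable structure" adds over plain transcribed words, here is a sketch that builds an Akoma Ntoso-style fragment for one minutes item. The element names and the sample motion are illustrative only; the real schema defines strict document types and required metadata blocks, so treat this as a flavour of the markup, not a valid instance.

```python
import xml.etree.ElementTree as ET

# Build a simplified, Akoma Ntoso-style fragment for one council minutes item.
# Element and attribute names here are illustrative; the actual schema is stricter.
root = ET.Element("akomaNtoso")
report = ET.SubElement(root, "debateReport")
body = ET.SubElement(report, "debateBody")
section = ET.SubElement(body, "debateSection", {"name": "motions"})

heading = ET.SubElement(section, "heading")
heading.text = "Motion 4: Street lighting"

# A hypothetical speaker reference and motion text.
speech = ET.SubElement(section, "speech", {"by": "#alderman-smith"})
p = ET.SubElement(speech, "p")
p.text = "Moved that the lighting bylaw be amended."

xml_text = ET.tostring(root, encoding="unicode")
```

The gain is that a query can now ask for every motion, or every speech by a given speaker, instead of searching a flat wall of OCR text.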
This is a project of the Dutch Institute for Sound and Vision. They had hundreds of hours of digitized television broadcast footage that they wanted to have tagged. They created an open source, web-based application to allow crowd-sourced tagging.
To encourage both participation and accuracy, they made it a game – here you can see the top scorers.
To play the game, people watch the footage and type what they see, and the program associates each tag with a time code in the video. Then the Institute uses software to analyze the tags and fix errors and inconsistencies; for example, they use Freebase to figure out what some of the tags mean. Maybe this stretches the idea of data being structured information, but I think taking a visual medium and turning it into a structure of time codes and tags counts, and it could be very useful to digital humanities researchers.

To conclude, I think the cultural sector should be careful to use existing standards, even de facto ones, or make sure they can transform their data and metadata into those standards easily, and also to make the licencing as open as possible, because it’s going to be increasingly important that these data sets be interoperable.