This document summarizes the challenges of archiving born-digital newspapers and uses Canada as a case study. It describes how copyright issues, technical hurdles, and a lack of consensus have hindered digital newspaper archiving. In Canada, responsibility for archiving is unclear, the news industry is struggling financially, and the legal framework around copyright has produced uncertainty. As a result, the archiving of recent digital news content has been limited.
Lynch fontaine
1. Preserving the Unpreservable:
Form, Content, Copyright and the
Archiving of Born-Digital
Newspapers
Lisa Lynch
Concordia University
Paul Fontaine
McGill University
April 4, 2014
2. 1. We describe the prevailing practices in born-
digital newspaper archiving
2. Turn to Canada as a case study to illustrate
the current impasse
Born-digital archiving
3. The British Library has preserved
over 65 million news articles from
archives dating back 300 years
The Library of Congress has
collected newspapers from each
state for its Chronicling America
project
The Swedish Royal Library’s
Kulturarw3 program has been
collecting news websites since
1996
5. Public vs. private interests
• Issues around copyright have been made more
urgent by the fact that newspapers see their
own archives as a marketable good
• The copyright interest of newspapers has
meant that, overwhelmingly, the archiving of
historical newspapers ends sometime in the
first part of the twentieth century
6. Technological hurdles
• PDF archiving: the collecting and
processing of PDFs instead of digitizing or
microforming printed newspapers
Challenge: requires a relationship
between archivists and news outlets
• Web harvesting: software captures and
archives a site’s web pages
Challenge: news sites behind paywalls
generally are not harvested
7. Whose task is it?
• The Center for Research Libraries argues
that libraries should work directly with for-profit
archiving companies and newspapers
• As both government and foundation money has
become more scarce, archiving projects have
relied on public-private partnerships in order to
secure funding
8. Why Can’t We
Be Friends?
• Library archivists may be loath to give control
over to those with commercial interests
• The financial situation of many newspapers
might discourage them from involvement in a
project that has potential costs
9. The Canadian Context
• Library and Archives Canada has long
considered the archiving of newspapers to
be a provincial rather than federal
responsibility
• In 2013, a deal was announced between
LAC and Canadiana to digitize 40 million
texts and images from LAC’s archives
• This deal emphasized that LAC’s own role
in future digitization projects would be
minimal
10. • The Globe and Mail is still in
the process of prioritizing
what material will be
transferred. A portion of
their content remains in a
dark archive accessible
only to employees
• Sun Media, like The
Globe, either lost or
decided to leave behind
content during their most
recent migration
What to archive?
11. Limitations
• Weak national library system
• Economically ailing news industry that is
framed as a commercial enterprise rather
than a public good
• Legal framework that has produced an
unclear climate around copyright
• Archival community that has largely been
unwilling to challenge copyright laws
12. Conclusions
• The failure to archive born-digital news
represents an abdication of responsibility
towards an important part of the world’s
cultural heritage
• However, as Canada is an active partner
in the Center for Research Libraries, it
may be that momentum on the U.S. end
can convince Canadian news publishers
to partner with libraries and aggregators
Editor's Notes
-Beginning in the 1990s, digitization emerged as a new form of newspaper preservation that promised to revolutionize access to archives. Putting newspaper content online would make it available to an exponentially larger number of users.
-Over the past decade, there have been great successes in digital newspaper archiving.
(Slide) All Points
-The breadth of such efforts might convey the impression that digital technologies have solved the problem of capturing, storing and distributing the output of the world’s newspapers. But this is far from the case, as archivists interested in preserving today’s newspapers face obstacles far more challenging than any previously encountered by newspaper archivists.
(Next Slide)
-Those challenges include copyright concerns, technical hurdles, and a lack of consensus over what gets archived and who should be doing the archiving.
(Next Slide)
(Slide) Point 1
-This development is a relatively recent one in the history of newspaper archiving: it dates back to the 1980s, when for-profit content aggregators first persuaded the news industry that old content might be digitally processed and indexed for sale in text-only electronic databases.
-Since then, there have been many instances when news outlets have pushed back against projects that provide public access to historical newspapers
(Slide) Point 2
-Increasingly, newspaper companies do not primarily produce physical artifacts to be preserved; instead, they produce digital objects that circulate in an increasingly ephemeral media space, and that have no permanent trace except for that created deliberately by archivists. Archivists have employed a number of approaches to capture and preserve born-digital content.
(Slide) Point 1
-Harmony between newspaper and archivist has been difficult to achieve in practice, with PDFs arriving sporadically and with missing metadata.
(Slide) Point 2
-Unlike PDF-based archiving projects, web harvesting projects can potentially capture the full range of a media organization’s output, including text stories, video, audio, and interactive media.
-Rather than facing a single, overarching copyright issue, the web harvester enters a minefield, where potential copyright claims make material too legally volatile to display.
-Some archivists have launched programs without legislation that would support their efforts.
-Advocates of web harvesting did not imagine that paywalls, long (and still) a contested model for news revenue, would close down access to many of the world’s national and regional newspapers.
(Next Slide)
-Over the past several years, the various issues provoked by web harvesting and PDF archiving have left library archivists feeling increasingly unsettled about the future of born-digital news archiving. As a result, the Center for Research Libraries, a North American consortium of university and public libraries, now recommends that libraries pursue neither PDF archiving nor web harvesting as the primary means to archive online content.
(Slide) Points 1 and 2
(Next Slide)
-While the CRL’s approach represents the best option at present for jump-starting stalled efforts at born-digital archiving, such negotiations pose clear challenges for libraries.
(Slide) Points 1 and 2
-The Library of Congress noted that born-digital preservation should be a particular mandate of the library itself given the dire state of the news industry. As well, the current disarray of in-house archiving practices at most North American papers limits archiving efforts.
-I will now discuss the specific problems faced in Canada, where born-digital archiving has seemed to pose an even greater challenge than it has in Europe or the US.
(Next Slide)
(Slide) Point 1
-In 2012, following budget cuts, LAC’s then head librarian and archivist, Daniel Caron, gave a speech titled “The End of Archives is Nigh,” where he described the library’s plan to shift from preserving the online world to curating it.
-Even after Caron’s 2013 resignation, LAC wasn’t optimistic that the institution would change.
-With the library taking a back seat to digitization projects, a single nonprofit remains as the probable contender to handle not only Canada’s heritage digitization projects, but (perhaps) its born-digital archiving as well. That nonprofit is Canadiana.
(Slide) Points 2 and 3
(Next Slide)
-Interviews with Canadian news providers revealed that the question of how to archive material at the corporate level has been superseded by the question of what to archive.
-Canadian news industry digital archiving practices have been marked in the last couple of years by a shift away from in-house archiving to the use of third-party technology for archiving purposes. The news outlets we interviewed exemplified this shift.
-In 2009, The Globe and Mail decided to replace their in-house system with a content management system. In 2012, they upgraded.
-As The Globe migrated from the ‘pub tool’ to Escenic 4 and finally to Escenic 5, some content was taken offline. While the first switch was a massive operation in which almost all of the content (text and images) was migrated to the new CMS, the second migration was more straightforward but also selective.
(Slide) Point 1
-The Globe’s strategy thus far has been to keep content management in house, even though they buy CMS programs from outside vendors.
-Our second example, Sun Media, currently uses MediaScan JazzBox, a system it has used since 2007.
-Previously, each news portal of the QMI/Sun Media family had its own locally hosted archiving system; separate systems and databases thus existed at their major newspapers in Alberta, Ontario, and Manitoba. In recent years they have started an initiative to consolidate to one database, a move that has coincided with the organization’s emphasis on content sharing.
(Slide) Point 2
-Like many aspects of the Canadian newspaper industry, CMS-related archiving issues are hardly unique, though these problems have perhaps drawn less attention in Canada than elsewhere due to a lack of communication between the industry and the professional archival community.
(Next Slide)
-Just as the LAC has found itself weakened due to recent circumstances, the Canadian news industry has gone through shifts that have left it ill-equipped to focus resources and attention on digitization.
(Slide) Points 1 and 2
(Next Slide)
-As the above discussion has emphasized, though the problems attendant on born-digital archiving are universal, the factors behind them vary greatly from country to country. It is thus likely that different countries will continue to float different solutions for born-digital archiving for some time, until an international consensus is reached.
-As we have suggested, archivists in Canada, like those in many countries, have reached an impasse when it comes to born-digital news archiving. In terms of the variables that might affect whether the impasse might be overcome, the news is not good.
(Slide) Point 1
-There remains some hope, however.
(Slide) Point 2
Thank you