Using texts to explore historical texts:
           Examples from Lake District literature and the
                   Registrar General’s Reports


                                            Ian Gregory
                                      Lancaster University

Acknowledgements:
    Alistair Baron, Patricia Murrieta-Flores, Andrew Hardie, and Paul Rayson (Lancaster)
    Claire Grover (Edinburgh) – providing access to the geo-referenced Histpop data
    Richard Deswarte – help with the HistPop data
What is GIS?
Change in Infant Mortality in England & Wales, 1851-2001
[Figure: line graph of IMR (y-axis, 0 to 180) by census year (x-axis, 1851 to 2001)]
Traditional HGIS:
Infant mortality decline in England & Wales, 1851-1911
[Figure: line graph of % national rate (y-axis, -30 to 30) by decade (x-axis, 1850s to 1900s), with series numbered 1 to 8]
Source: Gregory (2008), Annals of the Assoc. of American Geographers
Distant Reading




Graphs (p. 16)         Maps (p. 55)            Trees (p. 73)



                             Moretti (2005) Graphs, Maps, Trees
Literary Mapping of the Lakes
• British Academy funded pilot project
  with David Cooper and Sally Bushell
• Two tours of the Lake District
    – Thomas Gray, 1769 (9,000 words)
         • Proto-Picturesque
    – ST Coleridge, 1802 (10,000 words)
         • Romantic
• Aims:
    – Can we create a GIS of text?
    – What can it offer to literary research?
• Method:
    –   Texts typed up by hand
    –   Places tagged manually
    –   Conversion
    –   Analysis
Place names coded in XML

<p in_text="Y">On Sunday Augt. 1st - half after 12 I had a Shirt, cravat, 2 pair of
Stockings, a little paper &amp; half a dozen Pens, a German Book (Voss's Poems)
&amp; a little Tea &amp; Sugar, with my Night Cap, packed up in my natty green oil-
skin, neatly squared, and put into my <format format_type="I">net</format>
Knapsack / and the Knap-sack on my back &amp; the Besom stick in my hand, which
for want of a better, and in spite of <person>Mrs C.</person> &amp;
<person>Mary</person>, who both raised their voices against it, especially as I left
the Besom scattered on the Kitchen Floor, off I sallied - over the
Bridge<my_comment><pl_name visited="Y">Greta Bridge,
Keswick</pl_name></my_comment>, thro' the Hop-Field, thro' the <pl_name
visited="Y">Prospect Bridge</pl_name> at <pl_name
visited="Y">Portinscale</pl_name>, so on by the tall Birch that grows out of the
center of the huge Oak, along into <pl_name visited="Y">Newlands</pl_name>--
<pl_name visited="Y">Newlands</pl_name>is indeed a lovely Place-the houses…
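
A minimal sketch of how the tagged place-names might be pulled out of markup like this, assuming the text is well-formed XML. The `pl_name` element and `visited` attribute come from the slide; the sample string is abridged and partly invented (including the `visited="N"` value, which is assumed here to mean "mentioned but not visited"), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged, partly invented sample in the style of the slide's markup.
# visited="N" is an assumed encoding for "mentioned but not visited".
SAMPLE = (
    '<p in_text="Y">off I sallied, thro\' the '
    '<pl_name visited="Y">Prospect Bridge</pl_name> at '
    '<pl_name visited="Y">Portinscale</pl_name>, along into '
    '<pl_name visited="Y">Newlands</pl_name>, with '
    '<pl_name visited="N">Skiddaw</pl_name> behind.</p>'
)


def extract_place_names(xml_text):
    """Return (name, visited) pairs for every pl_name element, in document order."""
    root = ET.fromstring(xml_text)
    places = []
    for element in root.iter("pl_name"):
        # itertext() collects the text even if the element has nested markup;
        # split/join normalises line breaks inside a name.
        name = " ".join("".join(element.itertext()).split())
        visited = element.get("visited") == "Y"
        places.append((name, visited))
    return places


places = extract_place_names(SAMPLE)
for name, visited in places:
    print(name, "visited" if visited else "mentioned only")
```

The resulting list of names is what would then be passed to a gazetteer for conversion to coordinates.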
Convert to a GIS




OS 1:50,000 gazetteer – all places on 1:50,000 maps
• Accuracy
• Spelling problems
• Disambiguation
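
The spelling and disambiguation problems can be illustrated with a small sketch. This is not the project's actual workflow: the mini-gazetteer, its coordinates, the similarity cutoff, and the "nearest to the previous place" rule are all assumptions made for illustration. Fuzzy matching uses the standard library's `difflib`.

```python
import difflib

# Hypothetical mini-gazetteer: name -> candidate (easting, northing) pairs.
# Coordinates are illustrative only, not taken from the OS gazetteer.
GAZETTEER = {
    "Keswick": [(326500, 523500)],
    "Portinscale": [(325000, 523600)],
    "Newlands": [(323000, 520000), (409000, 555000)],
}


def match_place(name, gazetteer, cutoff=0.8):
    """Return (gazetteer_name, candidates), or (None, []) if nothing is close enough."""
    if name in gazetteer:
        return name, gazetteer[name]
    # Spelling problems: fall back to the closest gazetteer spelling.
    close = difflib.get_close_matches(name, list(gazetteer), n=1, cutoff=cutoff)
    if close:
        return close[0], gazetteer[close[0]]
    return None, []


def disambiguate(candidates, reference):
    """Pick the candidate nearest to a reference point, e.g. the previous place on the tour."""
    return min(
        candidates,
        key=lambda c: (c[0] - reference[0]) ** 2 + (c[1] - reference[1]) ** 2,
    )


matched, candidates = match_place("Keswic", GAZETTEER)
print(matched, candidates)
print(disambiguate(GAZETTEER["Newlands"], reference=(325000, 523600)))
```

Automatic matches of this kind would still need checking by hand, which is where the accuracy concern comes in.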
Coleridge & Gray in a GIS
Smoothed surface of Gray’s places
[Maps: All mentions; Visits]
Smoothed surface of Coleridge’s places
[Maps: All mentions; Visits]
Class intervals are 10 equal intervals of the all-mentions surface. Bandwidth = 10 km
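
A rough sketch of the smoothing step, assuming a Gaussian kernel (the slides give only the 10 km bandwidth and the ten equal class intervals, not the kernel shape). Function names are hypothetical and coordinates are taken to be in metres.

```python
import math


def kernel_density(points, x, y, bandwidth=10_000.0):
    """Gaussian kernel density at (x, y), in points per square metre."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    return total / (2 * math.pi * bandwidth ** 2)


def density_grid(points, xmin, ymin, cell, ncols, nrows, bandwidth=10_000.0):
    """Evaluate the density at the centre of every cell of a regular grid."""
    return [
        [
            kernel_density(points, xmin + (c + 0.5) * cell, ymin + (r + 0.5) * cell, bandwidth)
            for c in range(ncols)
        ]
        for r in range(nrows)
    ]


def equal_interval_class(value, vmax, classes=10):
    """Class index 0..classes-1, using equal intervals between 0 and vmax."""
    if vmax <= 0:
        return 0
    return min(int(value / vmax * classes), classes - 1)


# Invented points: three mentions close together and one 40 km away.
mentions = [(0.0, 0.0), (2_000.0, 0.0), (0.0, 2_000.0), (40_000.0, 0.0)]
grid = density_grid(mentions, xmin=-5_000, ymin=-5_000, cell=10_000, ncols=5, nrows=1)
vmax = max(max(row) for row in grid)
print([equal_interval_class(v, vmax) for v in grid[0]])
```

Taking `vmax` from the all-mentions surface and reusing it for the visits surface keeps the two maps on the same class intervals.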
Comparing Coleridge and Gray
[Maps: All mentions; Visits]
Green: Only in Gray
Yellow: Evenly in both
Red: Only in Coleridge
Mapping Emotional Response




[Maps: Gray; Coleridge]
Physical Characteristics of Tours
[Charts: Altitude of mentions, one chart each for Gray and Coleridge: % of mentions (y-axis, 0 to 70) by Height band (0 to 99, 100 to 199, 200 to 299, 300 to 399, 400 to 499, 500 to 599, 600 to 699, 700 to 799, 800+), split into Visited and Didn't visit/Unclear]
[Charts: Population density, on a Normal scale (0 to 700) and a Logged scale (1 to 1000): Pop. Density for STC Not visited, STC Visited, Gray Not visited, Gray Visited]
Close Reading with Internet Mapping




                     http://www.lancs.ac.uk/mappingthelakes
                     http://www.lancs.ac.uk/mappingthelakes/v2
The Histpop Collection
• Covers the printed reports published in the Census
  and the Registrar General’s Annual Reports, 1801-1937
• Nearly 13,000,000 words
• Georeferenced by C. Grover (University of Edinburgh)
• This talk is concerned only with the Registrar General’s Reports, 1851-1911
    – Total: 3,750,000 words
    – England & Wales: 2,000,000 words
• http://www.histpop.org
Dot maps of place-name instances
Place-name instances, 1850s




[Maps: Density Smoothing; Cluster identification: Standard deviations of density]
www.histpop.org
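
One way cluster identification by standard deviations of density could work is sketched below. The threshold of two standard deviations and the function names are assumptions; the slide does not say which cut-off was used.

```python
import statistics


def density_z_scores(densities):
    """Express each cell's density as standard deviations from the mean density."""
    mean = statistics.fmean(densities)
    sd = statistics.pstdev(densities)
    if sd == 0:
        return [0.0] * len(densities)
    return [(d - mean) / sd for d in densities]


def cluster_cells(densities, threshold=2.0):
    """Indices of cells whose density is at least `threshold` SDs above the mean."""
    return [i for i, z in enumerate(density_z_scores(densities)) if z >= threshold]


# Invented densities: nine quiet cells and one hot spot.
cell_densities = [1.0] * 9 + [10.0]
print(cluster_cells(cell_densities))
```

Adjacent flagged cells would then be grouped into the numbered clusters used later in the analysis.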
Extract place-names
Ranked by word frequency:
Place            Cnt
North Shields    300
London           294
Durham           207
Nottingham       193
Liverpool        171
Hawarden         145
Grantham         131
Cardington       125
Linslade         121
Wakefield        121

Ranked by kernel density:
Place                                 Density   Cnt
Bermondsey                            .5849       6
Newington                             .5842       4
Spitalfields                          .5835       1
Whitechapel                           .5835       1
Stepney                               .5823       2
Rotherhithe                           .5809       5
London                                .5803     294
Shoreditch                            .5794       1
Bethnal Green                         .5788       4
Camberwell                            .5787      12
58th: Southwick (nr Sunderland)       .3498       1
Collocation
• “In Southwick and Monkwearmouth offensive nuisances
  abound.”
• “At Royton, in Oldham, where the drainage was imperfect,
  typhoid fever was prevalent”
• “The deaths in the Liverpool workhouse, in the Mount
  Pleasant sub-district of Liverpool, were above 100 more than
  in the same period of the two previous years, owing chiefly to
  an epidemic of measles among children of German emigrants
  temporarily located in this institution; there were also 101
  deaths from typhus, nearly all of which occurred in the
  workhouse.”
KWIC of “West Bromwich”
Most common words in clusters
•   Uses Mutual Information scores – top 10 for each cluster, excluding place-names, numbers,
    and punctuation
•   1 (North-East): Fog, took [changes in rainfall or temperature took place], largest [changes in
    weather], least [as largest], dense [weather related], greatest [weather], observatory, Asiatic
    [cholera], Halos [lunar or solar], thunder. WEATHER
•   2 (Wakefield): Falls, rain, seen [meteorological phenomena or “swallows”], reading, fell [snow
    or rain], number [met. readings], June, March. WEATHER
•   3 (South Lancs): declining [marriages, births or mortality], incorporated [boundary changes],
    noted [health or weather], cubic [cubic feet – earth movement for sanitation], workhouse,
    sail [Irish emigrants sailing from Liverpool], observatory, aurora, salutary [salutary effects
    that led to death], took [weather]. MIXED
•   4 (Oxon to Beds): cuckoo [was first heard], infirmary, Regius Professor, intermittent
    [intermittent fevers], sleet, solar, halos, least [rainfall or temperature], heard [thunder],
    thunder. WEATHER
•   5 (London): changed [changed water supply], anemometer, exclusively [supplied by one
    water company], hospital, command [front matter], Junction [Grand Junction Water
    Company], Company [almost always water company], pipes, Bills [Bills of Mortality], asylum,
    sewage. WATER SUPPLY
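
The Mutual Information ranking can be sketched as follows, assuming the common corpus-linguistics form MI = log2(observed / expected), where the expected count is what the word's overall frequency would predict for a sub-corpus of the cluster's size. The exact formula, frequency thresholds, and filtering used for the slide are not stated, so the details here are assumptions and the function names are hypothetical.

```python
import math
from collections import Counter


def mi_scores(cluster_tokens, corpus_tokens, min_freq=1):
    """MI score per word: log2(observed in cluster / expected in cluster).

    Assumes the cluster text is a subset of the whole corpus, so every
    cluster word also occurs in corpus_tokens.
    """
    cluster_counts = Counter(cluster_tokens)
    corpus_counts = Counter(corpus_tokens)
    n_cluster = len(cluster_tokens)
    n_corpus = len(corpus_tokens)
    scores = {}
    for word, observed in cluster_counts.items():
        if observed < min_freq:
            continue
        expected = corpus_counts[word] * n_cluster / n_corpus
        scores[word] = math.log2(observed / expected)
    return scores


def top_words(scores, n=10):
    """The n highest-scoring words, ties broken alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]


# Invented toy corpus: "fog" occurs only near the cluster, "the" everywhere.
corpus = ["fog"] * 4 + ["rain"] * 4 + ["the"] * 8
cluster = ["fog"] * 4 + ["the"] * 4
scores = mi_scores(cluster, corpus)
print(top_words(scores))
```

MI favours rare words that are concentrated in one cluster, which is why a minimum frequency (`min_freq`) is usually set well above 1 on real data.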
“Company” in Cluster 5
Mentions of diseases collocating with place-names
[Bar chart: Mentions of diseases from 1850 to 1910; Frequency (y-axis, 0 to 1600) by disease]

Disease           Mentions, 1850-1911
Diarrhoea         1555
Diphtheria        1261
Dysentery          332
Measles           1513
Scarlet Fever      964
Smallpox           333
Whooping cough      23

[Line chart: Diseases related to placenames; Mentions (y-axis, 0 to 700) by decade (x-axis, 1850 to 1910), one series each for Whooping cough, Smallpox, Dysentery, Scarlet Fever, Diphtheria, Measles, Diarrhoea]
Places that collocate with “measles”




                                 www.histpop.org
Comparing texts with statistics
[Chart: % (y-axis, 0 to 40) by Urban Level (x-axis, 1 to 8), with series for Mentions of measles, Districts, and Population]

Urban level   % national pop (1911)   Sample areas
1              9.4   Stow on the Wold (Glou), Whitchurch (Hants.), Hexham (N’humb), Oakham (Rutland), Northallerton (N.Rid.), Holbeach (Lincs)
2             13.0   Cockermouth (Cumb), Chippenham (Wilts), Bridport (Dorset), Bangor (Carn), Alton (Hants), Pembroke (Pembs)
3             17.8   Guildford (Surrey), Redruth (Corn), York (E.Rid), Bucklow (Chesh), Chorley (Lancs), Maidstone (Kent)
4             18.7   Swansea, Canterbury, Hastings, Rochdale, Bolton, Wolverhampton
5             18.0   Sheffield, Leeds, Oxford, Southampton, Coventry, Edmonton (Mdlsex)
6             11.9   Exeter, Hull, Nottingham, Portsmouth, Leicester, Salford (Lancs)
7              9.0   Most of London, also Manchester, Liverpool and Birmingham
8              2.1   Only London, mainly East End
Do mentions of “Diarrhoea, dysentery and cholera”
   correlate with deaths from these diseases?


Correlation between IMRchdidy and Mchdiady (N = 626)

Measure            Correlation Coefficient   Sig. (1-tailed)
Kendall's tau_b    .225**                    .000
Spearman's rho     .290**                    .000

**. Correlation is significant at the 0.01 level (1-tailed).
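
The two rank correlations in the table can be computed as in the sketch below, written in plain Python so that the handling of ties is visible. The data are invented; in the talk the two variables would be one value per district for deaths and one for mentions (626 districts). Significance testing is not included.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b: pairwise agreement, corrected for ties in x and y."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt((untied + ties_x) * (untied + ties_y))


# Invented example: death rates and mention counts for four districts.
deaths = [1, 2, 3, 4]
mentions = [1, 3, 2, 4]
print(round(kendall_tau_b(deaths, mentions), 3), round(spearman_rho(deaths, mentions), 3))
```

The pairwise loop is quadratic in the number of districts, which is fine for a few hundred districts but slow for very large datasets.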
Geographical Text Analysis
• Combination of Corpus Linguistics and GIS allows us to:
   – 1. Geographical approach:
       • Ask where is this corpus talking about?
       • Identify place-names in areas that the corpus concentrates on.
       • Find out what it is saying about these places
   – 2. Theme of interest approach:
       •   Find out which places are associated with our theme
       •   Find out what it is saying in relation to this theme
       •   Find out what other themes are associated with these places
       •   Compare geography of place-name mentions with statistical evidence to
           explore biases in sources
Further work
• HistPop
• BL’s 19th Century Newspapers
• Other sources

Mais conteúdo relacionado

Destaque

Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...
Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...
Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...Digital History
 
Adam Crymble - Digital History seminar 15 October 2013
Adam Crymble - Digital History seminar 15 October 2013Adam Crymble - Digital History seminar 15 October 2013
Adam Crymble - Digital History seminar 15 October 2013Digital History
 
Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)
Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)
Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)Digital History
 
Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...
Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...
Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...Digital History
 
Holford mapping the medieval countryside 2014-06-17
Holford   mapping the medieval countryside 2014-06-17Holford   mapping the medieval countryside 2014-06-17
Holford mapping the medieval countryside 2014-06-17Digital History
 
Digital History - 8 May 2012
Digital History - 8 May 2012Digital History - 8 May 2012
Digital History - 8 May 2012Digital History
 

Destaque (7)

Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...
Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...
Legal-Miller - mistreated and molested: jailhouse violence and the civil righ...
 
Adam Crymble - Digital History seminar 15 October 2013
Adam Crymble - Digital History seminar 15 October 2013Adam Crymble - Digital History seminar 15 October 2013
Adam Crymble - Digital History seminar 15 October 2013
 
Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)
Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)
Robertson mapping everyday life digital harlem 1915 30 (8 jan 2013)
 
Petrie ihr presentation
Petrie ihr presentationPetrie ihr presentation
Petrie ihr presentation
 
Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...
Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...
Magnus Huber - The Old Bailey Corpus: Spoken English in the 18th and 19th Cen...
 
Holford mapping the medieval countryside 2014-06-17
Holford   mapping the medieval countryside 2014-06-17Holford   mapping the medieval countryside 2014-06-17
Holford mapping the medieval countryside 2014-06-17
 
Digital History - 8 May 2012
Digital History - 8 May 2012Digital History - 8 May 2012
Digital History - 8 May 2012
 

Mais de Digital History

Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020Digital History
 
Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020Digital History
 
Commemorating the Great War on Twitter
Commemorating the Great War on TwitterCommemorating the Great War on Twitter
Commemorating the Great War on TwitterDigital History
 
Community Archives and Ethics
Community Archives and EthicsCommunity Archives and Ethics
Community Archives and EthicsDigital History
 
Contemporary web archives ihr
Contemporary web archives ihrContemporary web archives ihr
Contemporary web archives ihrDigital History
 
The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...
The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...
The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...Digital History
 
The Language of Migration in the Victorian Press: A Corpus Linguistic Approach
The Language of Migration in the Victorian Press: A Corpus Linguistic ApproachThe Language of Migration in the Victorian Press: A Corpus Linguistic Approach
The Language of Migration in the Victorian Press: A Corpus Linguistic ApproachDigital History
 
Identifying responses to revolution
Identifying responses to revolutionIdentifying responses to revolution
Identifying responses to revolutionDigital History
 
Chance encounters with the past
Chance encounters with the pastChance encounters with the past
Chance encounters with the pastDigital History
 
The lives and criminal careers of juvenile offenders
The lives and criminal careers of juvenile offendersThe lives and criminal careers of juvenile offenders
The lives and criminal careers of juvenile offendersDigital History
 
Tudor Intelligence Networks - Ruth Ahnert
Tudor Intelligence Networks - Ruth AhnertTudor Intelligence Networks - Ruth Ahnert
Tudor Intelligence Networks - Ruth AhnertDigital History
 
The Pictorial publisher - Agents technologies and the illustrrated book in Br...
The Pictorial publisher - Agents technologies and the illustrrated book in Br...The Pictorial publisher - Agents technologies and the illustrrated book in Br...
The Pictorial publisher - Agents technologies and the illustrrated book in Br...Digital History
 
Cordell scientific american
Cordell scientific americanCordell scientific american
Cordell scientific americanDigital History
 
Political Meetings Mapper with British Library Labs: mapping the origins of B...
Political Meetings Mapper with British Library Labs: mapping the origins of B...Political Meetings Mapper with British Library Labs: mapping the origins of B...
Political Meetings Mapper with British Library Labs: mapping the origins of B...Digital History
 
European or Imperial Metropolis? Depictions of London in British Newspapers, ...
European or Imperial Metropolis? Depictions of London in British Newspapers, ...European or Imperial Metropolis? Depictions of London in British Newspapers, ...
European or Imperial Metropolis? Depictions of London in British Newspapers, ...Digital History
 
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...Digital History
 
Emma Bayne: ‘Traces Through Time overview and next steps’
Emma Bayne: ‘Traces Through Time overview and next steps’ Emma Bayne: ‘Traces Through Time overview and next steps’
Emma Bayne: ‘Traces Through Time overview and next steps’ Digital History
 

Mais de Digital History (20)

Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020
 
Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020Ihr dig hist_teachingpanel_feb2020
Ihr dig hist_teachingpanel_feb2020
 
Commemorating the Great War on Twitter
Commemorating the Great War on TwitterCommemorating the Great War on Twitter
Commemorating the Great War on Twitter
 
Community Archives and Ethics
Community Archives and EthicsCommunity Archives and Ethics
Community Archives and Ethics
 
Contemporary web archives ihr
Contemporary web archives ihrContemporary web archives ihr
Contemporary web archives ihr
 
The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...
The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...
The ‘Digital Thematic Deconstruction’ of early modern urban maps and bird’s-e...
 
The Language of Migration in the Victorian Press: A Corpus Linguistic Approach
The Language of Migration in the Victorian Press: A Corpus Linguistic ApproachThe Language of Migration in the Victorian Press: A Corpus Linguistic Approach
The Language of Migration in the Victorian Press: A Corpus Linguistic Approach
 
Identifying responses to revolution
Identifying responses to revolutionIdentifying responses to revolution
Identifying responses to revolution
 
Chance encounters with the past
Chance encounters with the pastChance encounters with the past
Chance encounters with the past
 
The lives and criminal careers of juvenile offenders
The lives and criminal careers of juvenile offendersThe lives and criminal careers of juvenile offenders
The lives and criminal careers of juvenile offenders
 
History of teaching ihr
History of teaching ihrHistory of teaching ihr
History of teaching ihr
 
Tudor Intelligence Networks - Ruth Ahnert
Tudor Intelligence Networks - Ruth AhnertTudor Intelligence Networks - Ruth Ahnert
Tudor Intelligence Networks - Ruth Ahnert
 
The Pictorial publisher - Agents technologies and the illustrrated book in Br...
The Pictorial publisher - Agents technologies and the illustrrated book in Br...The Pictorial publisher - Agents technologies and the illustrrated book in Br...
The Pictorial publisher - Agents technologies and the illustrrated book in Br...
 
Cordell scientific american
Cordell scientific americanCordell scientific american
Cordell scientific american
 
Mapping paris
Mapping parisMapping paris
Mapping paris
 
Political Meetings Mapper with British Library Labs: mapping the origins of B...
Political Meetings Mapper with British Library Labs: mapping the origins of B...Political Meetings Mapper with British Library Labs: mapping the origins of B...
Political Meetings Mapper with British Library Labs: mapping the origins of B...
 
European or Imperial Metropolis? Depictions of London in British Newspapers, ...
European or Imperial Metropolis? Depictions of London in British Newspapers, ...European or Imperial Metropolis? Depictions of London in British Newspapers, ...
European or Imperial Metropolis? Depictions of London in British Newspapers, ...
 
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
 
Emma Bayne: ‘Traces Through Time overview and next steps’
Emma Bayne: ‘Traces Through Time overview and next steps’ Emma Bayne: ‘Traces Through Time overview and next steps’
Emma Bayne: ‘Traces Through Time overview and next steps’
 
Ihr june15-evans
Ihr june15-evansIhr june15-evans
Ihr june15-evans
 

Ig ihr 2012

  • 1. Using texts to explore historical texts: Examples from Lake District literature and the Registrar General’s Reports Ian Gregory Lancaster University Acknowledgements: Alistair Baron, Patricia Murrieta-Flores, Andrew Hardie , and Paul Rayson (Lancaster) Claire Grover (Edinburgh) – providing access to the geo-reference Histpop data Richard Deswarte – help with the HistPop data
  • 3. Change in Infant Mortality in England & Wales, 1851-2001 180 160 140 120 100 IMR 80 60 40 20 0 1851 1861 1871 1881 1891 1901 1911 1921 1931 1941 1951 1961 1971 1981 1991 2001
  • 4. Traditional HGIS: Infant mortality decline in England & Wales, 1851-1911 30 20 1 . 10 2 3 % national rate 4 0 5 1850s 1860s 1870s 1880s 1890s 1900s 6 -10 7 8 -20 -30 Source: Gregory (2008) Annals of the Assoc. of American Geographers
  • 5. Distant Reading Graphs (p. 16) Maps (p. 55) Trees (p. 73) Moretti (2005) Graphs, Maps, Trees
  • 6. Literary Mapping of the Lakes • British Academy funded pilot project with David Cooper and Sally Bushell • Two tours of the Lake District – Thomas Gray, 1769 (9,000 words) • Proto-Picturesque – ST Coleridge, 1802 (10,000 words) • Romantic • Aims: – Can we create a GIS of text? – What can it offer to literary research? • Method: – Texts typed up by hand – Places tagged manually – Conversion – Analysis
  • 7. Place names coded in XML <p in_text="Y">On Sunday Augt. 1st - half after 12 I had a Shirt, cravat, 2 pair of Stockings, a little paper &amp; half a dozen Pens, a German Book (Voss's Poems) &amp; a little Tea &amp; Sugar, with my Night Cap, packed up in my natty green oil- skin, neatly squared, and put into my <format format_type="I">net</format> Knapsack / and the Knap-sack on my back &amp; the Besom stick in my hand, which for want of a better, and in spite of <person>Mrs C.</person> &amp; <person>Mary</person>, who both raised their voices against it, especially as I left the Besom scattered on the Kitchen Floor, off I sallied - over the Bridge<my_comment><pl_name visited="Y">Greta Bridge, Keswick</pl_name></my_comment>, thro' the Hop-Field, thro' the <pl_name visited="Y">Prospect Bridge</pl_name> at <pl_name visited="Y">Portinscale</pl_name>, so on by the tall Birch that grows out of the center of the huge Oak, along into <pl_name visited="Y">Newlands</pl_name>-- <pl_name visited="Y">Newlands</pl_name>is indeed a lovely Place-the houses…
  • 8. Convert to a GIS OS 1:50,000 gazetteer – all places on 1:50,000 maps • Accuracy • Spelling problems • Disambiguation
  • 9. Coleridge & Gray in a GIS
  • 10. Smoothed surface of Gray’s places All mentions Visits
  • 11. Smoothed surface of Coleridges’s places All mentions Visits Class intervals are 10 equal intervals of the all mentions. Bandwidth=10km
  • 12. Comparing Coleridge and Gray All mentions Visits Green: Only in Gray Yellow: Evenly in both Red: Only in Coleridge
  • 13. Mapping Emotional Response Gray Coleridge
  • 14. Physical Characteristics of Tours 70 700 60 600 50 % of mentions 500 Pop Density 40 400 30 300 Gray 20 200 10 100 0 0 0 to 99 100 to 200 to 300 to 400 to 500 to 600 to 700 to 800+ STC Not visited STC Visited Grey Not visited Grey Visited 199 299 399 499 599 699 799 70 Height 60 Normal Visited Didn't visit/Unclear 50 % of mentions 1000 40 Coleridge 30 Pop. Density 100 20 10 0 10 0 to 99 100 to 200 to 300 to 400 to 500 to 600 to 700 to 800+ 199 299 399 499 599 699 799 Height 1 STC Not visited STC Visited Grey Not visited Grey Visited Visited Didn't visit/Unclear Logged Altitude of mentions Population density
  • 15. Close Reading with Internet Mapping http://www.lancs.ac.uk/mappingthelakes http://www.lancs.ac.uk/mappingthelakes/v2
  • 16. The Histpop Collection • Covers the printed reports published in the Census and the Registrar General’s Annual Reports, 1801- 1937 • Nearly 13,000,000 words • Georeferenced by C. Grover (University of Edinburgh) • Just concerned with the Registrar General’s Reports, 1851-1911 • Total: 3,750,000 words • England & Wales: 2,000,000 words • http://www.histpop.org
  • 17. Dot maps of place-name instances
  • 18. Place-name instances, 1850s • Density smoothing • Cluster identification: standard deviations of density • www.histpop.org
  • 19. Extract place-names • Top 10 by word frequency (place, count): North Shields 300; London 294; Durham 207; Nottingham 193; Liverpool 171; Hawarden 145; Grantham 131; Cardington 125; Linslade 121; Wakefield 121 • Top 10 by kernel density (place, density, count): Bermondsey .5849, 6; Newington .5842, 4; Spitalfields .5835, 1; Whitechapel .5835, 1; Stepney .5823, 2; Rotherhithe .5809, 5; London .5803, 294; Shoreditch .5794, 1; Bethnal Green .5788, 4; Camberwell .5787, 12 • 58th by kernel density: Southwick (nr Sunderland) .3498, 1
  • 20. Collocation • “In Southwick and Monkwearmouth offensive nuisances abound.” • “At Royton, in Oldham, where the drainage was imperfect, typhoid fever was prevalent” • “The deaths in the Liverpool workhouse, in the Mount Pleasant sub-district of Liverpool, were above 100 more than in the same period of the two previous years, owing chiefly to an epidemic of measles among children of German emigrants temporarily located in this institution; there were also 101 deaths from typhus, nearly all of which occurred in the workhouse.”
  • 21. KWIC of “West Bromwich”
  • 22. Most common words in clusters • Uses Mutual Information scores – top 10 for each cluster, excluding place-names, numbers, and punctuation • 1 (North-East): Fog, took [changes in rainfall or temperature took place], largest [changes in weather], least [as largest], dense [weather related], greatest [weather], observatory, Asiatic [cholera], Halos [lunar or solar], thunder. WEATHER • 2 (Wakefield): Falls, rain, seen [meteorological phenomena or “swallows”], reading, fell [snow or rain], number [met. readings], June, March. WEATHER • 3 (South Lancs): declining [marriages, births or mortality], incorporated [boundary changes], noted [health or weather], cubic [cubic feet – earth movement for sanitation], workhouse, sail [Irish emigrants sailing from Liverpool], observatory, aurora, salutary [salutary effects that led to death], took [weather]. MIXED • 4 (Oxon to Beds): cuckoo [was first heard], infirmary, Regius Professor, intermittent [intermittent fevers], sleet, solar, halos, least [rainfall or temperature], heard [thunder], thunder. WEATHER • 5 (London): changed [changed water supply], anemometer, exclusively [supplied by one water company], hospital, command [front matter], Junction [Grand Junction Water Company], Company [almost always water company], pipes, Bills [Bills of Mortality], asylum, sewage. WATER SUPPLY
  • 24. Mentions of diseases collocating to place-names • [Chart: Mentions of diseases from 1850 to 1910, frequency by disease] • Diseases, in the order listed on the slide: Diarrhoea, Diphtheria, Dysentery, Measles, Smallpox, Scarlet-Fever, Whooping-cough • Mentions_1850-1911: 1555, 1261, 332, 1513, 964, 333, 23 • [Chart: Diseases related to placenames, mentions per decade, 1850 to 1910, one series each for Whooping cough, Smallpox, Dysentery, Scarlet Fever, Diphtheria, Measles and Diarrhoea]
  • 25. Places that collocate with “measles” www.histpop.org
  • 26. Comparing texts with statistics • [Chart: % of mentions of measles, of districts and of population, by urban level 1 to 8] • Urban level, % of national population (1911), sample areas: • 1, 9.4%: Stow on the Wold (Glou), Whitchurch (Hants.), Hexham (N’humb), Oakham (Rutland), Northallerton (N.Rid.), Holbeach (Lincs) • 2, 13.0%: Cockermouth (Cumb), Chippenham (Wilts), Bridport (Dorset), Bangor (Carn), Alton (Hants), Pembroke (Pembs) • 3, 17.8%: Guildford (Surrey), Redruth (Corn), York (E.Rid), Bucklow (Chesh), Chorley (Lancs), Maidstone (Kent) • 4, 18.7%: Swansea, Canterbury, Hastings, Rochdale, Bolton, Wolverhampton • 5, 18.0%: Sheffield, Leeds, Oxford, Southampton, Coventry, Edmonton (Mdlsex) • 6, 11.9%: Exeter, Hull, Nottingham, Portsmouth, Leicester, Salford (Lancs) • 7, 9.0%: Most of London, also Manchester, Liverpool and Birmingham • 8, 2.1%: Only London, mainly East End
  • 27. Do mentions of “Diarrhoea, dysentery and cholera” correlate with deaths from these diseases? • Variables: IMRchdidy and Mchdiady • Kendall's tau_b: correlation coefficient .225**, Sig. (1-tailed) .000, N = 626 • Spearman's rho: correlation coefficient .290**, Sig. (1-tailed) .000, N = 626 • **. Correlation is significant at the 0.01 level (1-tailed).
  • 28. Geographical Text Analysis • Combining Corpus Linguistics and GIS allows us to take two approaches: – 1. Geographical approach: • Ask where the corpus is talking about • Identify place-names in the areas that the corpus concentrates on • Find out what it is saying about these places – 2. Theme of interest approach: • Find out which places are associated with our theme • Find out what the corpus says in relation to this theme • Find out what other themes are associated with these places • Compare the geography of place-name mentions with statistical evidence to explore biases in sources
  • 29. Further work • HistPop • BL’s 19th Century Newspapers • Other sources
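
As an illustration of the place-name markup on slide 7, the `pl_name` elements and their `visited` attribute can be pulled out of a tagged passage with a standard XML parser, giving the "all mentions" and "visits" counts that the later maps distinguish. This is only a minimal sketch, assuming Python and its standard-library `xml.etree.ElementTree`. The sample passage is abbreviated from the slide and closed with `</p>` so that it parses. The function name `extract_place_names` is hypothetical and is not the project's own tooling, and treating any `visited` value other than "Y" as "not visited" is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Abbreviated sample modelled on the markup shown on slide 7.
# The closing </p> is added here so the fragment is well-formed.
SAMPLE = (
    '<p in_text="Y">off I sallied - over the Bridge'
    '<my_comment><pl_name visited="Y">Greta Bridge, Keswick</pl_name></my_comment>, '
    "thro' the <pl_name visited=\"Y\">Prospect Bridge</pl_name> at "
    '<pl_name visited="Y">Portinscale</pl_name>, along into '
    '<pl_name visited="Y">Newlands</pl_name>-- '
    '<pl_name visited="Y">Newlands</pl_name> is indeed a lovely Place</p>'
)


def extract_place_names(xml_text):
    """Return (name, visited) pairs for every pl_name element, in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for element in root.iter("pl_name"):
        name = "".join(element.itertext()).strip()
        # Assumption: only visited="Y" counts as a visit.
        visited = element.get("visited") == "Y"
        pairs.append((name, visited))
    return pairs


places = extract_place_names(SAMPLE)

# "All mentions" versus "visits", the two layers mapped on the slides.
mention_counts = Counter(name for name, _ in places)
visit_counts = Counter(name for name, visited in places if visited)

for name, count in mention_counts.most_common():
    print(name, count, visit_counts[name])
```

Each extracted name would still need to be matched against a gazetteer to obtain coordinates, which is where the accuracy, spelling and disambiguation problems listed on slide 8 arise.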