Alive and kicking! Keeping data re-usable in the European Values Study:
- Data and information flow in the EVS project
- Principles and workflows for managing data and documentation in survey projects
1. Alive and kicking!
Keeping data re-usable in the
European Values Study
IASSIST Cologne, May 2013
Astrid.Recker@gesis.org, Evelyn.Brislinger@gesis.org
GESIS, Data Archive for the Social Sciences
2. Overview
Data and information flow in the EVS project
Principles and workflows for managing data and
documentation in survey projects
3. GESIS Data Archive
Basis
Interplay between Principal Investigators (PI) and Data Archive
Agreement on submission of data and information packages
Goals
Ease access to data for a broad user community
Provide metadata for discovery, understanding, and good use of data
Preserve data and metadata for re-use and replications
Holdings
Studies, study series, and complex survey programs such as ISSP, Eurobarometer,
ALLBUS, European Values Study (EVS), or election studies
4. Data and information created in a survey project
Total stock of data and
documentation created
Data and documentation
submitted to an archive
Further information necessary
for the project(?)
Selection processes
Management solutions for structuring data and information
5. Example: European Values Study (EVS)
4 waves at 9-year intervals
49 countries, 125 national surveys
Cross-national, longitudinal
research program
National surveys
Waves
1981/1990/1999/2008
Longitudinal data File
1981-2008 (LdF)
Integrated Values Surveys
EVS/WVS (IVS)
Harmonization and integration process
Number of files
Size of files
Atlas of European Values
www.europeanvaluesstudy.eu/evs/evsatlas.html
6. Collaboration of actors involved (EVS 2008)
Principal Investigators
National team: data created, processed, documented
Central team: data standardized, harmonized, integrated
Data Archive: data checked, documented, preserved, released
Secondary users: data re-used, analyses replicated, results reported
7. Users: analyze and evaluate outcomes
Questions
Check trend questions and original
questions
ZACAT-Online Study Catalogue
Data
Analyze data, report errors, monitor
error reporting
GESIS Data Catalogue
Publications
Replicate analysis of other projects
EVS Repository
... and detect peculiarities in questions or problems in data
8. Peculiarities in question text spotted?
Project Design
Questionnaire Design
Questionnaire Translation
Data Collection
Data Documentation
Data Processing
Check question and translation
Master/field questionnaire, methodological
questionnaire, report ‘Translation History’
Check source of question
Trend question from EVS and WVS,
questions borrowed from other surveys
Identify consequences for
Countries sharing/adopting affected
language, languages belonging to a family,
further languages used in a country
EVS 2008 Data lifecycle
9. Data error detected?
Standardization and harmonization process: check comparability of surveys,
questions, and variables; cumulate data and document each step
Diagram levels: Integrated Values Surveys EVS/WVS; Longitudinal Data File 1981-2008; waves (2008, 1999, ...); national data; original data file
Retrace data processing steps across surveys: check data, syntax
files, and documentation; update data and highlight problems for next wave
Error detected
10. Data and information created
Designated communities
Principal Investigator/Project
Secondary user
Experiences from EVS project
Data and information packages
Project package
Archive package
Selection processes
Within project
Between project and archive
Project
Archive
Total stock
11. Communicating with the future: Activity on two levels
Macro level
Defining workflows, file and information paths on which
necessary information is passed on
Micro level
Organizing information so that it is
re-usable (RDM, metadata,
systematic file structures)
12. Begin by identifying principles for structuring and documenting files in
the project (Research Data Management)
Select
which information
is relevant
to whom?
A tidy house, a tidy mind!
Reference, don’t
duplicate files
whenever possible
Identify and
capture “kinship
relations”
Capture process
knowledge
classes
itineraries
Make changes traceable
versioning
document revisions &
annotations
minutes
protocols
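The principle "Reference, don't duplicate files whenever possible" can be supported by a simple check for byte-identical copies. The following is a minimal sketch, not part of the EVS workflow: the function names `sha256_of` and `find_duplicates` are hypothetical, and it assumes that project files live under one root folder and that identical content means an unintended copy.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` that have byte-identical content."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a candidate for keeping one copy and referencing it from the other locations. Which copy to keep remains a project decision.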
13. The magic wand (and not the one used by the sorcerer’s apprentice)
Follow principles of good research data management (RDM)
Use metadata to document process and content information
Use standards wherever possible (e.g. DDI, Dublin Core, ISO codes,
file naming conventions, etc.)
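To illustrate how file naming conventions and versioning can work together, here is a minimal sketch. The naming pattern (`<study>_<wave>_<country>_v<major>-<minor>-<patch>.<ext>`) and the names `parse_file_name` and `latest_version` are assumptions made for illustration only, not the convention used in the EVS project. The country part is only checked for two uppercase letters, not against the actual ISO 3166-1 code list.

```python
import re
from typing import NamedTuple

# Hypothetical convention, e.g. "EVS_2008_DE_v1-0-0.sav":
# study, wave year, two-letter country code, three-part version, extension.
FILE_NAME = re.compile(
    r"^(?P<study>[A-Z]+)_(?P<wave>\d{4})_(?P<country>[A-Z]{2})"
    r"_v(?P<major>\d+)-(?P<minor>\d+)-(?P<patch>\d+)\.(?P<ext>[a-z0-9]+)$"
)


class DataFileName(NamedTuple):
    study: str
    wave: int
    country: str
    version: tuple[int, int, int]
    ext: str


def parse_file_name(name: str) -> DataFileName:
    """Split a conforming file name into its parts, or raise ValueError."""
    match = FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a conforming file name: {name!r}")
    return DataFileName(
        study=match["study"],
        wave=int(match["wave"]),
        country=match["country"],
        version=(int(match["major"]), int(match["minor"]), int(match["patch"])),
        ext=match["ext"],
    )


def latest_version(names: list[str]) -> str:
    """Return the conforming name with the highest version number."""
    return max(names, key=lambda name: parse_file_name(name).version)
```

Comparing versions as integer tuples rather than as text avoids ranking a version such as 1-10-0 below 1-2-0.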
15. Managing information flows in a collaborative, long-term project
Which paths does information (data, documentation, other
contextual material) take from producers to users?
Two models helped us clarify processes and paths, as well as
identify helpful terminology and concepts
– Project life cycle
– Open Archival Information System (OAIS) reference model
(CCSDS 2012)
CCSDS (2012). Reference Model for an Open Archival Information System (OAIS). Recommended Practice.
http://public.ccsds.org/publications/archive/650x0m2.pdf
16. Project Repository
Ingest
Data processing
and enhancement
Data
Management
Temporary
Storage
Access
(project-internal
use, PIs)
Project Design
Data
Dissemination
Questionnaire
Design
Questionnaire
Translation
Data Collection
Data
Documentation
Data
Processing
Project life cycle: Data flow during creation of a survey
Guidelines
17. Data Archive
(preservation service provider)
Data
Management
Access
Archival Storage
(long-term)
Preservation Planning
Administration
Ingest
Secondary
Users
(future)
Principal
Investigators
SIP AIP AIP DIP
Project Repository
(content provider)
Ingest
Data processing
and enhancement
Data
Management
Temporary
Storage
Access
(project-internal
use, PIs)
Project and Data Archive as distributed system
PIP
PIP = Project Information Package, SIP = Submission Information Package,
AIP = Archival Information Package, DIP = Dissemination Information Package
Project Design
Data
Dissemination
Questionnaire
Design
Questionnaire
Translation
Data Collection
Data
Processing
Data
Documentation
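The selection of files from the project repository for submission to the archive can be sketched as a manifest with checksums. This is an illustrative sketch under stated assumptions: the OAIS reference model defines the package concepts (SIP, AIP, DIP) but not this manifest layout, and the function names `build_manifest` and `verify_manifest` as well as the dictionary keys are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path, selected: list[str], package_type: str = "SIP") -> dict:
    """Describe the selected files (paths relative to `root`) with size and checksum."""
    entries = []
    for relative in sorted(selected):
        data = (root / relative).read_bytes()
        entries.append({
            "path": relative,
            "bytes": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return {"package_type": package_type, "files": entries}


def verify_manifest(root: Path, manifest: dict) -> list[str]:
    """Return the paths that are missing or whose content no longer matches."""
    changed = []
    for entry in manifest["files"]:
        target = root / entry["path"]
        if not target.is_file():
            changed.append(entry["path"])
        elif hashlib.sha256(target.read_bytes()).hexdigest() != entry["sha256"]:
            changed.append(entry["path"])
    return changed
```

A manifest of this kind would let project and archive confirm that both sides refer to the same files after transfer. Files left out of `selected` stay in the project package only.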
18. Staying Alive! Where we are going from here
Developing a guideline for projects
– structuring and annotating information on the micro level
– issues to discuss with an Archive (preservation service provider)
Testing our model
– implementing our ideas in smaller projects with the aim of
making the results available to other projects
19. Thank you for your attention!
Evelyn Brislinger | Astrid Recker
GESIS – Leibniz Institute for the Social Sciences, Data Archive
evelyn.brislinger@gesis.org | astrid.recker@gesis.org
www.gesis.org