This document discusses the concept of "data literacy" and proposes moving towards a concept of "literacy in the age of data". It argues that data literacy as currently conceived focuses too narrowly on individual skills and competencies, and should instead promote social inclusion and empowerment. The document outlines pillars of literacy in the age of data, including understanding data's influence and impact, critical thinking about data and algorithms, and using data for civic purposes. It calls for making data more accessible and meaningful for communities, and politicizing data issues to achieve greater data inclusion.
Learning from past infrastructure to embrace friction and create the Research... | Research Data Alliance
RDA provides a neutral space for researchers to develop standards and share data across disciplines through working groups and interest groups. It focuses on developing interoperability through deliverables like registries and identifiers. While it doesn't define architecture, it aims to foster connections and provide unity. RDA also takes a "glocal" approach, implementing standards locally while addressing global issues. Friction in collaboration is inevitable but necessary for progress, and RDA provides a place for discussions to work through differences.
IASC Operational Guidance on Responsibilities of Sector Cluster Leads and OCH... | Brendan McDonald
The document outlines responsibilities for information management between cluster/sector leads and OCHA during humanitarian emergencies. It states that cluster/sector leads are responsible for information management within their clusters to coordinate response, while OCHA is responsible for information management between clusters to ensure effective inter-cluster coordination. It provides details on establishing information sharing standards and networks, appointing information focal points, generating and disseminating relevant data and situation reports, and building on existing national information systems. The goal is to facilitate situational awareness and informed decision-making during the emergency response.
DataCenter supports grassroots organizing for social justice and sustainability through strategic research, training, and collaboration. They work to move the knowledge and solutions of marginalized communities to the center of decision-making. DataCenter coordinates research conducted by and for social movements to build political legitimacy and power for communities.
This document discusses leveraging NGO resources through knowledge management. It covers how knowledge structures relate to social, business, and technology structures. It defines knowledge management and knowledge work, and outlines a knowledge infrastructure including people, content, tools, processes, and governance. The document then discusses how knowledge management relates to knowledge assets, sharing, collaboration, resources, and stakeholders. It provides examples of understanding, managing, and storing content, as well as retrieving and sharing explicit and collaborative content. The document concludes with the main messages that managing knowledge assets leverages an NGO's capacity, social interaction includes sharing, collaboration, negotiation and competition, and knowledge work involves both technical and social aspects.
Telling stories about (re)search: research practices reconfigured by digital ... | Berber Hagedoorn
Paper "Telling stories about (re)search: research practices reconfigured by digital search technologies", Sabrina Sauer & Berber Hagedoorn, EASST conference 2018: Meetings – Making Science, Technology and Society together, 27 July 2018, Lancaster University, Lancaster
This document discusses mining social data from online sources to gain insights. It defines social data and information, and notes that unstructured data found online provides a rich source of knowledge. It recommends developing skills in statistics, data processing, and data visualization to extract value from social data. Finally, it outlines best practices for social media analytics, including defining goals, selecting metrics, targeting data sources, using analytics tools, and delivering insights through dashboards, reports, and infographics.
Information consolidation is defined as the process of evaluating and compressing relevant documents to provide users with reliable and concise information. It involves defining responsibility for analyzing documents and packaging information appropriately for users' needs, levels, and time constraints. The benefits of information consolidation include increasing the effectiveness and use of information for various activities, as well as expanding the circle of potential users by providing evaluated and synthesized information. The basic processes involve studying user needs, selecting relevant sources, evaluating and analyzing information, restructuring it into a new whole, and packaging and disseminating it to encourage use.
Library as a knowledge management centre | Prasanna Iyer
1) The document discusses how a library can serve as a knowledge management center by facilitating the sharing of information from various internal and external resources on topics like diabetes treatment.
2) It proposes ways for the library to leverage relationships and social capital, such as by facilitating networking, validating ideas through cross-pollination, and eliciting information through groups and events.
3) The library is well-suited to serve as a knowledge management center because it already collects, indexes, and provides access to documents; knows many experts and organizations; and can customize services to meet user needs.
This document provides an introduction to advanced data analytics. It discusses [1] how organizations lose millions annually due to inefficient use of data, [2] the sources and types of big data being generated, and [3] the multi-disciplinary nature of data analytics, drawing on fields like database technology, statistics, machine learning, and visualization. The key steps of analytics projects are outlined, including understanding the domain, preprocessing data, reducing and transforming it, selecting analytical approaches, communicating results, and deploying and evaluating new systems.
Transforming The Academic Library Services For Generation Y Using Knowledge M... | tulipbiru64
Paper presented by Sharifah Fahimah Saiyed Yeop at the 4th PERPUN International Conference 2015: Information Revolution, 11-12th August 2015 at Avillion Legacy Hotel, Melaka.
This document discusses the role of libraries in knowledge management. It begins by defining information, knowledge, knowledge management, and the differences between information management and knowledge management. It then examines how the rise of knowledge management has increased questions for librarians about their role. The document proposes that librarians and libraries should take a leadership role in knowledge management by developing knowledge resources, facilitating knowledge sharing and networking, leveraging information technology, and improving user services to support knowledge creation and access.
The document identifies several conceptual inhibitors to effective information sharing between departments and agencies. These include having an unclear scope that does not account for budget, participants, end products, and other constraints. Poor information architecture that lacks common organizing principles and taxonomy can also undermine a solution. Insufficient attention to design elements like templates, pages, and navigation can negatively impact how information is displayed and used. Failure to define required functionality up front based on participant needs also poses a conceptual risk.
Presented at the IAALD-AFITA-WCCA Conference held in Atsugi (Japan) in August 2008.
The WebRing concept has evolved in the meantime and the resulting service is the CIARD RING, available at: http://ring.ciard.net
A more up-to-date presentation is available here:
http://www.slideshare.net/valeriap/the-ciard-ring-an-infrastructure-for-interoperability-of-agricultural-research-information-services
Handout for Planning and Implementing a Digital Library Project | Jenn Riley
The document provides guidelines for a grant program that will fund projects by Indiana libraries to digitize historical materials. Libraries can apply for subgrants to digitize materials from their collections to contribute to the Indiana Digital Library. Projects must follow standards for digitization and metadata and make materials accessible online. The deadline to apply is March 31, 2006. Successful applicants will be notified in early May 2006.
The document discusses 9 images taken by the author for a school magazine project. For each image, the author explains the purpose of the shot and how it could be used in the magazine layout, such as matching colors in the image to the magazine masthead or cover lines. The images include full-body shots, close-ups, and portraits intended to showcase outfits and poses for the magazine's theme of featuring pop artists.
The document discusses 5 potential objects for 3D animation: 1) a set of keys on a ring representing a human figure, 2) a snake-like necklace that could interact with the keys, 3) a cologne bottle with an intricate logo, 4) a PS Vita gaming console with buttons and lines, and 5) a desk that the other objects would be placed on. Additional items like a phone, tile, furniture and laptop are also considered. Research was done on variations of desks and cologne bottles.
This document summarizes a presentation on best practices for polling and survey data. It cautions against simply aggregating polls, noting that doing so risks losing nuance and precision. It emphasizes the importance of representative sampling, transparency, and minimizing errors. Key points include carefully evaluating coverage and potential biases in samples, especially for international data, and considering how factors like question wording, response options, and population studied can affect results. The overall message is that high-quality methodology, transparency, and understanding sources of error are needed to ensure survey accuracy.
This document contains three paragraphs summarizing movie poster images. The first paragraph describes an image of a blurred person reaching through a wall, suggesting they are trying to escape and need help. The open mouth links to horror themes of shouting for help. The second paragraph analyzes a screaming white mask, linking to the film's title and foreshadowing screams for help. The simple white mask stands out and is easy to remember in connection with the film. The third paragraph praises an image's faint effect: a face sticking its tongue out incorporates film stills, briefly foreshadowing events while sparking the imagination. The white face stands out against the background and its expression instantly connects to the horror genre.
The Challenges and Pitfalls of Aggregating Social Media Data | DataCards
This document discusses the challenges and pitfalls of aggregating social media data. It notes that social media is often seen as a "panacea" but questions remain about what questions the data can answer. A case study of analyzing social media data from Mexico revealed issues like inadequate data processing capabilities and questions about the collection profile and presence of U.S. person data. Next steps include building a baseline of traditional Mexican media's social media presence. Analyzing social media data faces hurdles like lack of agreement among experts and lack of databases. Initial findings note gaps between what can be remotely collected and what media people actually use, with Facebook dominating but Twitter disliked by most. The wrap up emphasizes being skeptical and clarifying
Alignment and Analytics of Large Scale, Disparate Data from IARPA's Knowledge... | DataCards
The Actionable Intelligence Retrieval System (AIRS) is an integrated prototype that aligns different data models, applies advanced analytics algorithms, and allows analysts to search for information across many data sources. AIRS includes three essential components: an ontology to align data, advanced analytics algorithms, and an integrated prototype. The document outlines AIRS' research areas and tasks to develop its capabilities for retrieving and analyzing information from multiple data sources.
INDIAN STATISTICAL INSTITUTE
Documentation Research & Training Centre
8th Mile, Mysore Road, RVCE Post
Bangalore-560 059
DRTC Seminar- 5
2014
Data Literacy
ABSTRACT
In our increasingly data-driven society, data literacy is an important civic skill that we should be developing. Data is slowly but steadily forcing its way into society. Data literacy may seem less technical than computer science or other fields, yet it still calls for a wide variety of tools for accessing, converting and manipulating data. Using these tools requires an understanding of relational databases (like MS Access), data manipulation techniques, statistical software tools (like Minitab, SPSS, STATA and MS Excel) and data representation software tools (like MS PowerPoint and MS Excel). This seminar includes an introduction to data literacy and its inter-relationship with information literacy and statistical literacy. It also covers the steps for working with data, followed by a short demonstration of data analysis techniques using the software STATA11.
Speaker: Jayanta Kr. Nayek
Date: 29.10.2014. Time: 2 p.m.
Venue: DRTC, ISI Bangalore.
All are cordially invited.
Seminar Coordinator
Biswanath Dutta
Module 3 - Improving Current Business with External Data - Online | caniceconsulting
The document discusses how to use external data to improve business. It defines external data as data generated outside an organization that can come from a variety of sources and serve nearly every industry. The document outlines different types of external data like primary data, secondary data, and open data. It provides examples of sources for primary data, which is original and reliable, and secondary data, which already exists. The benefits of using external data to supplement internal data and gain a more comprehensive view are also discussed.
A call to librarians to use their library powers in the community beyond the walls of their institutions as the open data folks need their knowledge!
Title:
Open Sesame: Open Data, Data Liberation and New Opportunities for Libraries
Abstract:
Cities and data producers are quickly embracing Open Data, albeit unevenly. The Data Liberation Initiative (DLI) has been a pioneer in broadening access to data for nearly two decades. This session will examine the relevance of Data Liberation in terms of Open Data and explore how librarians can step up to the plate to make Open Data/Open Government as successful as DLI.
Speakers:
- Wendy Watkins, Data Librarian, Carleton University
- Ernie Boyko, Adjunct Data Librarian, Carleton University
- Tracey P. Lauriault, Post Doctoral Fellow, Carleton University (tlauriau@gmail.com)
- Margaret Haines, University Librarian, Carleton University
This document outlines an agenda and activities for a workshop on practical data management planning. The workshop will discuss challenges with data management, including data loss and how poor management affects all. Activities will guide participants in inventorying their data and developing storage and backup plans. The goal is to help researchers effectively manage their data over the long-term and address funder and legal requirements.
The document provides an overview of data science, big data, data mining, and data mining techniques. It defines data science as a multi-disciplinary field that uses scientific methods to extract knowledge from structured and unstructured data. Big data is described as large, diverse datasets that are too large for traditional databases to handle. Common data mining tasks like prediction, classification, clustering and association rule mining are summarized. Finally, specific techniques like decision trees, k-means clustering, and association rule mining are overviewed.
This document provides an overview of data science including its importance, what data scientists do, how the field has emerged, and how to become a data scientist. It discusses how data science can help answer important business questions using LinkedIn in 2006 as a case study. It also outlines the typical data science process of framing questions, collecting and cleaning data, exploring patterns, and communicating results. Finally, it introduces some common data science tools like SQL, analytics software, and machine learning algorithms and discusses options for continuing education in data science.
This document provides an overview of data science including its importance, what data scientists do, how the field has emerged, and how to become a data scientist. It notes that by 2018 the US could face shortages of people with data analytics skills. It then discusses how LinkedIn's early growth in 2006 exemplifies the data science process of framing questions, collecting and processing data, exploring patterns, and communicating results. Finally, it outlines the tools used in data science like SQL, analytics software, and machine learning and discusses getting started in the field through education, curiosity, and ongoing learning with mentorship support.
Big Data for International Development | Alex Rascanu
Alex Rascanu delivered the "Big Data for International Development" presentation at the International Development Conference that took place on February 7, 2015 at University of Toronto Scarborough.
Presentation on the Cambridgeshire Open-Data Partners: Open Technology for an Open Partnership project by Michael Soper of Cambridgeshire County Council
Kimberly Silk presented on data management and discovery at the Martin Prosperity Institute. The MPI collects large social science datasets from various common and authoritative sources to support research. To better organize their growing collection, the MPI implemented an open data discovery platform called Dataverse to catalog and provide access to their datasets. Open data initiatives aim to make certain government data freely available to the public, but also present challenges around data preparation, support, and responsiveness. Big data refers to extremely large datasets beyond the capabilities of typical database tools, and data visualization is an important way to communicate insights from data.
This document provides guidance on research data management and developing data management plans. It discusses why managing research data is important, including making research easier to conduct, avoiding accusations of fraud or bad science, and getting credit for data produced. The document outlines what is involved in research data management and considerations for sharing and preserving data, such as file formats, documentation, and standards. It emphasizes the importance of data management planning and provides tips on developing plans to meet funder requirements.
1. 3rd Socio-Cultural Data Summit
National Defense University
Center for Technology and National Security Policy
2. Admin
• Unclassified conference
• Chatham House rules
• Lunch in the new fiscal reality (the cafeteria)
• We have breaks and time built into our schedule to continue
discussions or to sidebar
3. Data Summit(s) Objective
• “Good” data are required for reliable analysis.
− Socio-cultural data of any sort are hard to find.
− When we do find them, they are messy, fragmented,
disorganized, poorly measured, etc.
• These Data Summits are committed to fostering a community that is
interested in finding, evaluating, collecting, cleaning up, smartly
integrating, and then using socio-cultural data against applied
problems with scientific rigor.
− Focus on a broad community with as few restrictions as possible.
− Focus on rigor and science without sacrificing the ability to
conduct real world applications.
4. Logical Progression of these Data Efforts
1. DataCards: quick and dirty effort to find, tag, and index data of all
sorts for as many audiences as possible to reduce search costs for
socio-cultural data.
2. First Data Summit: Take a first cut at data evaluation criteria and
beat the heck out of it in working groups so that we can start to
evaluate socio-cultural data that we’ve found.
3. Second Data Summit: Expand the aperture on what constitutes
data and relate working group insights back to prior evaluation
criteria and lessons learned for continuing to find and define data.
4. Third Data Summit: Start to tackle the complex issue of “how we
put the data together” once we have found it.
... more working groups focused on areas where we perceive we can
make concrete progress on data integration, cleaning, and fusion.
5. DataCards Overview
• DataCards is a structured wiki-like platform that uses “cards” (like card
catalog cards or baseball cards) to index and describe key details re:
socio-cultural (and related) data sources.
• Objectives of DataCards include:
– Make sources of data discoverable.
– Reduce search costs for data.
– Provide a conduit to discover and share data sources between and among
non-traditional, academic, NGO, defense, law enforcement, and
intelligence communities.
• Accessing DataCards:
− Commercial Internet: http://www.datacards.org/
− Development Site: http://beta.datacards.org/
− SIPRNet: by request, hosted by OSD CAPE
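The "card" idea described on this slide can be sketched as a small tagged record plus a tag index. This is only an illustration of the concept, not how DataCards is actually built: the class names, field names, and sample entries below are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataCard:
    """One hypothetical 'card' describing a data source (field names are assumptions)."""
    title: str
    description: str
    tags: set[str] = field(default_factory=set)
    communities: set[str] = field(default_factory=set)


class CardCatalog:
    """A tiny in-memory index that maps each lowercased tag to the cards carrying it."""

    def __init__(self) -> None:
        self._by_tag: dict[str, list[DataCard]] = {}

    def add(self, card: DataCard) -> None:
        # Normalize tags so lookups are case-insensitive.
        for tag in {t.lower() for t in card.tags}:
            self._by_tag.setdefault(tag, []).append(card)

    def find(self, *tags: str) -> list[DataCard]:
        """Return cards that carry every requested tag (case-insensitive)."""
        if not tags:
            return []
        wanted = [t.lower() for t in tags]
        candidates = self._by_tag.get(wanted[0], [])
        return [
            card for card in candidates
            if all(w in {t.lower() for t in card.tags} for w in wanted)
        ]


# Illustrative entries only; these are not real DataCards records.
catalog = CardCatalog()
catalog.add(DataCard("Example household survey", "Illustrative entry only.",
                     tags={"survey", "polling"}, communities={"academic"}))
catalog.add(DataCard("Example gazetteer", "Illustrative entry only.",
                     tags={"geospatial"}, communities={"NGO", "defense"}))

print([card.title for card in catalog.find("Survey")])
```

Looking cards up by tag rather than scanning every description is one simple way to reduce search costs, which is the objective stated above.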
6. DataCards Content/Usage Update
• Total cards: 1,682
(2,416 pending additional cards)
• Total datacards.org users: 537
• Since .org launch: 5,703 visits; 54,229 pageviews; 00:10:40 average
time/visit; multiple visits from 28 countries
8. Summary of 1st Data Summit
• The data used for applied socio-cultural work for the DoD and other
agencies are generally of poor quality.
• Often general and hard to apply to real world situations
• Rarely evaluated, and even more rarely evaluated objectively
• Worked on data evaluation criteria so that a “smart person” isn’t needed to
evaluate data sources.
• Smart people were used to create the criteria, and “smart people in
training” will apply the ratings.
• The ratings shouldn’t rely on the experience of the rater, but on the
quality of the criteria.
• The effort acknowledged that one size does not fit all requirements, and
criteria should be flexible enough to accommodate a variety of conceptions of
what constitutes “data.”
• DataCards helps consumers of socio-cultural data rapidly find the data they
need. The evaluation criteria help assess suitability and quality of possible data
sources for their desired application.
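One way to picture ratings that depend on the criteria rather than on the rater is a rubric with explicit level descriptions. The sketch below is a minimal illustration under that assumption: the criterion names, levels, and weights are hypothetical and are not the criteria the summit produced.

```python
from __future__ import annotations

# Hypothetical rubric: criterion names, level descriptions, and weights are
# illustrative assumptions, not the criteria produced by the Data Summit.
RUBRIC = {
    "documentation": {"weight": 2, "levels": ["none", "partial", "complete"]},
    "coverage": {"weight": 1, "levels": ["unknown", "patchy", "comprehensive"]},
    "collection_method": {"weight": 2, "levels": ["undisclosed", "described", "replicable"]},
}


def score_source(ratings: dict[str, str], rubric: dict = RUBRIC) -> float:
    """Return a weighted score in [0, 1] from per-criterion level choices.

    The rater only picks the level description that matches the source, so
    the result is driven by the rubric rather than by the rater's experience.
    """
    total = 0.0
    weight_sum = 0
    for name, spec in rubric.items():
        if name not in ratings:
            raise KeyError(f"missing rating for criterion: {name}")
        levels = spec["levels"]
        level_index = levels.index(ratings[name])  # ValueError on an unknown level
        # Scale each criterion to [0, 1] before applying its weight.
        total += spec["weight"] * level_index / (len(levels) - 1)
        weight_sum += spec["weight"]
    return total / weight_sum


example = {
    "documentation": "complete",
    "coverage": "patchy",
    "collection_method": "described",
}
print(round(score_source(example), 2))
```

Because the rubric is passed in as an argument, a different set of criteria can be swapped in for a different conception of “data,” in keeping with the point that one size does not fit all requirements.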
9. Summary of 2nd Data Summit
• “Data” is a user-defined term; it is not specific to one particular type of data.
DataCards is a platform with a wide user base with varied data needs.
DataCards should seek to assist with the discovery and evaluation of data
sources.
• Big data is a growing field of interest within analytical and knowledge
communities. Big data, defined by the complexity, structure, and
size of the data, is not just social media but is generally transactional in
nature, including financial transactions, SMS, and search engine results.
• Many data sources are qualitative in nature and cannot be analyzed and
machine processed the way quantitative or geospatial data are processed and
analyzed.
• The most important considerations for users of geospatial data are robust
searching capabilities, a minimal path to finding data, and complete data.
• There is no single way that individuals find data. Discovery is often
project-specific, and individuals tend to establish and follow predictable
patterns of behavior when finding data because certain sources have proven
relevant and trustworthy.
10. What is this Summit About?
• This summit is about getting the mess of socio-cultural “stuff” we
often call data into a usable analytic format.
• The first panel focuses on two unique and innovative approaches
toward putting data together for intelligence and analytic purposes;
and a Phase 3 IARPA program that is rapidly fusing data in support of
the intelligence community’s requirements for integrated and
disparate data.
• The second panel focuses on two of the major types of data that are
often trumpeted as the silver bullet to understanding all things
socio-cultural: social media and polling/surveys. However, these are
great case studies in the potential pitfalls of data aggregation
without careful thought about what it is you are putting together.
11. What is this Summit About? (continued)
• The third panel provides three approaches to dealing with socio-
cultural data, with moderate technical detail. This includes a look at
the application of statistics to missing data, the dirty work of getting
socio-cultural data ready for a DARPA program, and dealing with
situations where socio-cultural data are sparse.
• Tomorrow, the fourth panel will focus on scientific and technical
approaches to information extraction and data fusion challenges.
• The fifth panel will offer up thoughts on three compelling and
promising areas for socio-cultural data integration: geospatial data
of multiple resolutions, qualitative/subject matter expert-derived
data, and human geography data.
• We’ll end after lunch with a discussion about how we as a
community want to proceed on this conquest.
12. What Do I Want to Get Out Of this Summit?
• Community-building and the invigoration of new ideas to support
better work with socio-cultural data.
• Feedback on what methods we are missing and what has merit.
• Feedback on what the forward operator needs from a group like
this—this includes the warfighter, but also law enforcement
officers, NGOs, partner nations, foreign service officers, economic
development professionals: anyone working in the field to make a
difference.