SlideShare uma empresa Scribd logo
1 de 23
Baixar para ler offline
Towards a Big Data Taxonomy
Bill Mandrick, PhD
Data Tactics
Version 26_August_2013
Scientific Taxonomies Represent
• Types of Processes
• Types of Objects
– Physical Objects
– Information Artifacts
• Types of Characteristics
– Qualities
– Roles
• Relationships
– Between Processes
– Between Objects
– Between Characteristics
2
Big Data Taxonomy
• Big Data Related Processes
• Big Data Characteristics
• Big Data Information Artifacts
• Big Data Information Bearers
• Relationships between Big Data Elements
• Mapping Instances to the Taxonomy
• Creating Situational Awareness
3
Relations Between Processes
• Processes A <relation> Processes B
– Complex Process <has part> Sub-Process
– Sub-Process <part of> Complex Process
– Process A <precedes> Process B
– Process A <follows> Process B
Examples:
Data Curation Process <has part> Data Selection Process
Data Curation Process <has part> Data Collection Process
Data Curation Process <has part> Data Archiving Process
4
Information Artifact Lifecycle Processes
• Collecting
• Curating
• Representing
• Storing
– Cluster Storing
• Managing
– Processing
• Distributed Processing
– Map Reduce
• Analyzing
– Data Mining
– Causal Analysis
– Probabilistic Analysis
– Correlation Analysis
• Data Collection Process
• Data Curation Process
• Data Representation Process
• Data Storing Process
– Cluster Storing Process
• Data Management Process
– Processing
• Distributed Data Process
– Map Reduce Process
• Data Analytics Process
– Data Mining Process
– Causal Analysis Process
– Probabilistic Analysis Process
– Correlation Analysis Process
Common Labels Taxonomy Labels
5
Big Data Processes
6
Big Data Processes can be
decomposed and related to
other (sub)processes
…as well as to their outputs
(Information Artifacts).
Relating Processes to Products
7
Big Data Information Artifacts
8
9
10
Information Content Entities
11
Use Case
Data Characteristics
12
Information Bearers
13
Partial Taxonomy
14
Human Genome Data
15
Terms from Human Genome Data Use Case
Use Case Term:
Genomic Measurements
Reference Materials
Reference Data
Reference Methods
Assess Performance
Genome Sequencing
Integrate Data
Sequencing Technologies
Sequencing Methods
Characterization
Whole Human Genomes
Assess Performance
Genome Sequencing Run
Computer System
Storage
Networking
Processing
Software
Open Source Sequencing Bioinformatics Software
Data Source
Sequencer
Volume
Variety
Variability
Veracity
Visualization
Data Quality
Data Types
Data Analytics
Taxonomical Term:
Genomic Measurement Result (Measurement Result)
Reference Material Role
Reference Data Role
Reference Method
Performance Assessment Process
Genome Sequencing Process
Data Integration Process
Data Sequencing Technology (Tool)
Sequencing Method (Process)
Characterization (Data Characterization, IA or ICE)
Whole Human Genome Characterization (IA or ICE?)
Performance Assessment Process
Genome Sequencing Run
Computer System
Data Storage Process
Computer Networking Process
Data Processing Process
Software (IAO placement?)
Bioinformatics Sequencing Software
Data Source Role
Sequencer
Data Volume (Characteristic)
Data Variety (Characteristic)
Data Variability (Characteristic)
Data Veracity (Characteristic)
Data Visualization Process
Data Quality (Characteristic)
Data Type
Data Analytics Process
16
Information Artifacts:
Human Genome Data Measurement Result
Characterization (Data Characterization, IA or ICE)
Whole Human Genome Characterization (IA or ICE?)
Performance Assessment
Genome Sequence
Software (IAO placement?)
Data Visualization
Processes:
Human Genome Data Measurement Process
Reference Method
Performance Assessment Process
Genome Sequencing Process
Data Integration Process
Sequencing Method (Process)
Data Characterization Process
Performance Assessment Process
Genome Sequencing Run
Data Storage Process
Computer Networking Process
Data Processing Process
Data Visualization Process
Data Analytics Process
Roles and Characteristics:
Reference Material Role
Reference Data Role
Data Source Role
Data Volume (Characteristic)
Data Variety (Characteristic)
Data Variability (Characteristic)
Data Veracity (Characteristic)
Data Visualization Process
Data Quality (Characteristic)
Artifacts/Tools:
Data Sequencing Technology (Tool)
Computer System
Computer Network
Software (IAO placement?)
Bioinformatics Sequencing Software
Sequencer
17
Terms from Human Genome Data Use Case
Genomic Research Organizations
18
Instances
DNA Data Sets
19
Instances
DNA Organizational Roles
20Instances
Agent Roles
21
DNA Visualization
22
Instances
Conclusion
• This method can be done for any part of the
Big Data Taxonomy
• Need SME input for various areas/domains
• Need to add definitions in owl
• Need to expand set of standardized relations
• Link instances to the taxonomy (e.g. actual
data sets, organizations, etc.)
23

Mais conteúdo relacionado

Mais procurados

Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...Kathmandu Living Labs
 
Data mining techniques unit 2
Data mining techniques unit 2Data mining techniques unit 2
Data mining techniques unit 2malathieswaran29
 
Reading Group: From Database to Dataspaces
Reading Group: From Database to DataspacesReading Group: From Database to Dataspaces
Reading Group: From Database to DataspacesJürgen Umbrich
 
Role of Biometric in Reducing the Size of Big Data
Role of Biometric in Reducing the Size of Big DataRole of Biometric in Reducing the Size of Big Data
Role of Biometric in Reducing the Size of Big DataManish Mathuria
 
Data mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAPData mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAPjaya lakshmi
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data MiningDHIVYADEVAKI
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...Big Data Value Association
 
Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutionscsandit
 
An Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAn Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAlasdair Gray
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingsuganmca14
 
Data pre processing
Data pre processingData pre processing
Data pre processingpommurajopt
 
Resume xiaodan(vinci)
Resume xiaodan(vinci)Resume xiaodan(vinci)
Resume xiaodan(vinci)vinci105
 
Data Analyst Roles & Responsibilities | Edureka
Data Analyst Roles & Responsibilities | EdurekaData Analyst Roles & Responsibilities | Edureka
Data Analyst Roles & Responsibilities | EdurekaEdureka!
 
Introduction to Crossref: History, Mission, Members
Introduction to Crossref: History, Mission, MembersIntroduction to Crossref: History, Mission, Members
Introduction to Crossref: History, Mission, MembersCrossref
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysisDataminingTools Inc
 

Mais procurados (19)

Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
Prof. Melinda Laituri, Colorado State University | Map Data Integrity | SotM ...
 
Data mining techniques unit 2
Data mining techniques unit 2Data mining techniques unit 2
Data mining techniques unit 2
 
Reading Group: From Database to Dataspaces
Reading Group: From Database to DataspacesReading Group: From Database to Dataspaces
Reading Group: From Database to Dataspaces
 
Role of Biometric in Reducing the Size of Big Data
Role of Biometric in Reducing the Size of Big DataRole of Biometric in Reducing the Size of Big Data
Role of Biometric in Reducing the Size of Big Data
 
Data mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAPData mining course learning outcomes,Data Mining CMAP
Data mining course learning outcomes,Data Mining CMAP
 
Discovery informaticsstanton
Discovery informaticsstantonDiscovery informaticsstanton
Discovery informaticsstanton
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
 
Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutions
 
Data mining
Data miningData mining
Data mining
 
TAIR ICAR 2010 Presentation
TAIR ICAR 2010 PresentationTAIR ICAR 2010 Presentation
TAIR ICAR 2010 Presentation
 
An Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAn Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland Project
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data pre processing
Data pre processingData pre processing
Data pre processing
 
Resume xiaodan(vinci)
Resume xiaodan(vinci)Resume xiaodan(vinci)
Resume xiaodan(vinci)
 
Data Analyst Roles & Responsibilities | Edureka
Data Analyst Roles & Responsibilities | EdurekaData Analyst Roles & Responsibilities | Edureka
Data Analyst Roles & Responsibilities | Edureka
 
Introduction to Crossref: History, Mission, Members
Introduction to Crossref: History, Mission, MembersIntroduction to Crossref: History, Mission, Members
Introduction to Crossref: History, Mission, Members
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 

Destaque

Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...William LaPorte
 
Data Grid Taxonomies
Data Grid TaxonomiesData Grid Taxonomies
Data Grid Taxonomiesawesomesos
 
Topic 10: Taxonomy of Data and Storage
Topic 10: Taxonomy of Data and StorageTopic 10: Taxonomy of Data and Storage
Topic 10: Taxonomy of Data and StorageZubair Nabi
 
Global taxonomy initiative ppt
Global taxonomy initiative  pptGlobal taxonomy initiative  ppt
Global taxonomy initiative pptKrishnapriya Priya
 
A comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesA comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesJoão Gabriel Lima
 
Successful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata DesignSuccessful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata Designsarakirsten
 
Enterprise Knowledge - Taxonomy Design Best Practices and Methodology
Enterprise Knowledge - Taxonomy Design Best Practices and MethodologyEnterprise Knowledge - Taxonomy Design Best Practices and Methodology
Enterprise Knowledge - Taxonomy Design Best Practices and MethodologyEnterprise Knowledge
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureAccess Innovations, Inc.
 
Database mapping of XBRL instance documents from the WIP taxonomy
Database mapping of XBRL instance documents from the WIP taxonomyDatabase mapping of XBRL instance documents from the WIP taxonomy
Database mapping of XBRL instance documents from the WIP taxonomyAlexander Falk
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata managementOpen Data Support
 

Destaque (12)

Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
Taxonomy Management, Automatic Metadata Tagging & Auto Classification in Shar...
 
Data Grid Taxonomies
Data Grid TaxonomiesData Grid Taxonomies
Data Grid Taxonomies
 
Topic 10: Taxonomy of Data and Storage
Topic 10: Taxonomy of Data and StorageTopic 10: Taxonomy of Data and Storage
Topic 10: Taxonomy of Data and Storage
 
Global taxonomy initiative ppt
Global taxonomy initiative  pptGlobal taxonomy initiative  ppt
Global taxonomy initiative ppt
 
A comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notesA comparison between several no sql databases with comments and notes
A comparison between several no sql databases with comments and notes
 
Taxonomy 101
Taxonomy 101Taxonomy 101
Taxonomy 101
 
Successful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata DesignSuccessful Content Management Through Taxonomy And Metadata Design
Successful Content Management Through Taxonomy And Metadata Design
 
Taxonomy And Metadata
Taxonomy And MetadataTaxonomy And Metadata
Taxonomy And Metadata
 
Enterprise Knowledge - Taxonomy Design Best Practices and Methodology
Enterprise Knowledge - Taxonomy Design Best Practices and MethodologyEnterprise Knowledge - Taxonomy Design Best Practices and Methodology
Enterprise Knowledge - Taxonomy Design Best Practices and Methodology
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information Architecture
 
Database mapping of XBRL instance documents from the WIP taxonomy
Database mapping of XBRL instance documents from the WIP taxonomyDatabase mapping of XBRL instance documents from the WIP taxonomy
Database mapping of XBRL instance documents from the WIP taxonomy
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
 

Semelhante a Big Data Taxonomy 8/26/2013

Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 abhagathk
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 Scott Edmunds
 
Data Mining Application and Trends
Data Mining Application and TrendsData Mining Application and Trends
Data Mining Application and TrendsVijayasankariS
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Miningdataminers.ir
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining Phi Jack
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendSalah Amean
 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceLizLyon
 
20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.pptSamPrem3
 
Chaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberChaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberHouw Liong The
 
20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.pptPalaniKumarR2
 
Lect 1 introduction
Lect 1 introductionLect 1 introduction
Lect 1 introductionhktripathy
 
Provinance in scientific workflows in e science
Provinance in scientific workflows in e scienceProvinance in scientific workflows in e science
Provinance in scientific workflows in e sciencebdemchak
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data miningDevakumar Jain
 

Semelhante a Big Data Taxonomy 8/26/2013 (20)

Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9
 
data mining
data miningdata mining
data mining
 
Data Mining Application and Trends
Data Mining Application and TrendsData Mining Application and Trends
Data Mining Application and Trends
 
Data mininng trends
Data mininng trendsData mininng trends
Data mininng trends
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Mining
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining
 
Introduction to data warehouse
Introduction to data warehouseIntroduction to data warehouse
Introduction to data warehouse
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trend
 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalface
 
Unit 3 part i Data mining
Unit 3 part i Data miningUnit 3 part i Data mining
Unit 3 part i Data mining
 
20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt
 
Chaper 13 trend, Han & Kamber
Chaper 13 trend, Han & KamberChaper 13 trend, Han & Kamber
Chaper 13 trend, Han & Kamber
 
Dma unit 1
Dma unit   1Dma unit   1
Dma unit 1
 
13 trend
13 trend13 trend
13 trend
 
20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt20IT501_DWDM_PPT_Unit_II.ppt
20IT501_DWDM_PPT_Unit_II.ppt
 
Lect 1 introduction
Lect 1 introductionLect 1 introduction
Lect 1 introduction
 
Data analytics, a (short) tour
Data analytics, a (short) tourData analytics, a (short) tour
Data analytics, a (short) tour
 
Provinance in scientific workflows in e science
Provinance in scientific workflows in e scienceProvinance in scientific workflows in e science
Provinance in scientific workflows in e science
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 

Mais de DataTactics

NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATANETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATADataTactics
 
C Star Analytic Presentation
C Star Analytic PresentationC Star Analytic Presentation
C Star Analytic PresentationDataTactics
 
Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka DataTactics
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown BagDataTactics
 
Data Tactics Analytics Practice
Data Tactics Analytics PracticeData Tactics Analytics Practice
Data Tactics Analytics PracticeDataTactics
 
Big Data Conference
Big Data ConferenceBig Data Conference
Big Data ConferenceDataTactics
 
Discontinuities Demo
Discontinuities DemoDiscontinuities Demo
Discontinuities DemoDataTactics
 
Analytics Brownbag
Analytics Brownbag Analytics Brownbag
Analytics Brownbag DataTactics
 
Ontology and Reports
Ontology and ReportsOntology and Reports
Ontology and ReportsDataTactics
 
Data Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionData Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionDataTactics
 
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013DataTactics
 
Horizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataHorizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataDataTactics
 
Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2DataTactics
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013DataTactics
 
Capabilities Brief Analytics
Capabilities Brief AnalyticsCapabilities Brief Analytics
Capabilities Brief AnalyticsDataTactics
 
Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcDataTactics
 
Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1DataTactics
 
Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3DataTactics
 
Data Tactics Open Source Brief
Data Tactics Open Source BriefData Tactics Open Source Brief
Data Tactics Open Source BriefDataTactics
 

Mais de DataTactics (20)

NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATANETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
NETWORK CENTRALITY IN SUB-NATIONAL AREAS OF INTEREST USING GDELT DATA
 
C Star Analytic Presentation
C Star Analytic PresentationC Star Analytic Presentation
C Star Analytic Presentation
 
Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka Text Analysis Using Twitter: A Case Study in Dhaka
Text Analysis Using Twitter: A Case Study in Dhaka
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
Data Tactics Analytics Practice
Data Tactics Analytics PracticeData Tactics Analytics Practice
Data Tactics Analytics Practice
 
Big Data Conference
Big Data ConferenceBig Data Conference
Big Data Conference
 
Discontinuities Demo
Discontinuities DemoDiscontinuities Demo
Discontinuities Demo
 
DLISA
DLISADLISA
DLISA
 
Analytics Brownbag
Analytics Brownbag Analytics Brownbag
Analytics Brownbag
 
Ontology and Reports
Ontology and ReportsOntology and Reports
Ontology and Reports
 
Data Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and DescriptionData Tactics Unified Dataspace Architecture and Description
Data Tactics Unified Dataspace Architecture and Description
 
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013Data Tactics Semantic and Interoperability Summit Feb 12, 2013
Data Tactics Semantic and Interoperability Summit Feb 12, 2013
 
Horizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence DataHorizontal Integration of Big Intelligence Data
Horizontal Integration of Big Intelligence Data
 
Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2Bill Ontology Summit (08 feb 1400hrs) v2
Bill Ontology Summit (08 feb 1400hrs) v2
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013
 
Capabilities Brief Analytics
Capabilities Brief AnalyticsCapabilities Brief Analytics
Capabilities Brief Analytics
 
Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtc
 
Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1Multi Discipline Intelligence Production Teams 1
Multi Discipline Intelligence Production Teams 1
 
Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3Data Tactics and Nervve Integrated Big Data v3
Data Tactics and Nervve Integrated Big Data v3
 
Data Tactics Open Source Brief
Data Tactics Open Source BriefData Tactics Open Source Brief
Data Tactics Open Source Brief
 

Último

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 

Último (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 

Big Data Taxonomy 8/26/2013

  • 1. Towards a Big Data Taxonomy Bill Mandrick, PhD Data Tactics Version 26_August_2013
  • 2. Scientific Taxonomies Represent • Types of Processes • Types of Objects – Physical Objects – Information Artifacts • Types of Characteristics – Qualities – Roles • Relationships – Between Processes – Between Objects – Between Characteristics 2
  • 3. Big Data Taxonomy • Big Data Related Processes • Big Data Characteristics • Big Data Information Artifacts • Big Data Information Bearers • Relationships between Big Data Elements • Mapping Instances to the Taxonomy • Creating Situational Awareness 3
  • 4. Relations Between Processes • Processes A <relation> Processes B – Complex Process <has part> Sub-Process – Sub-Process <part of> Complex Process – Process A <precedes> Process B – Process A <follows> Process B Examples: Data Curation Process <has part> Data Selection Process Data Curation Process <has part> Data Collection Process Data Curation Process <has part> Data Archiving Process 4
  • 5. Information Artifact Lifecycle Processes • Collecting • Curating • Representing • Storing – Cluster Storing • Managing – Processing • Distributed Processing – Map Reduce • Analyzing – Data Mining – Causal Analysis – Probabilistic Analysis – Correlation Analysis • Data Collection Process • Data Curation Process • Data Representation Process • Data Storing Process – Cluster Storing Process • Data Management Process – Processing • Distributed Data Process – Map Reduce Process • Data Analytics Process – Data Mining Process – Causal Analysis Process – Probabilistic Analysis Process – Correlation Analysis Process Common Labels Taxonomy Labels 5
  • 6. Big Data Processes 6 Big Data Processes can be decomposed and related to other (sub)processes …as well as to their outputs (Information Artifacts).
  • 8. Big Data Information Artifacts 8
  • 9. 9
  • 10. 10
  • 16. Terms from Human Genome Data Use Case Use Case Term: Genomic Measurements Reference Materials Reference Data Reference Methods Assess Performance Genome Sequencing Integrate Data Sequencing Technologies Sequencing Methods Characterization Whole Human Genomes Assess Performance Genome Sequencing Run Computer System Storage Networking Processing Software Open Source Sequencing Bioinformatics Software Data Source Sequencer Volume Variety Variability Veracity Visualization Data Quality Data Types Data Analytics Taxonomical Term: Genomic Measurement Result (Measurement Result) Reference Material Role Reference Data Role Reference Method Performance Assessment Process Genome Sequencing Process Data Integration Process Data Sequencing Technology (Tool) Sequencing Method (Process) Characterization (Data Characterization, IA or ICE) Whole Human Genome Characterization (IA or ICE?) Performance Assessment Process Genome Sequencing Run Computer System Data Storage Process Computer Networking Process Data Processing Process Software (IAO placement?) Bioinformatics Sequencing Software Data Source Role Sequencer Data Volume (Characteristic) Data Variety (Characteristic) Data Variability (Characteristic) Data Veracity (Characteristic) Data Visualization Process Data Quality (Characteristic) Data Type Data Analytics Process 16
  • 17. Information Artifacts: Human Genome Data Measurement Result Characterization (Data Characterization, IA or ICE) Whole Human Genome Characterization (IA or ICE?) Performance Assessment Genome Sequence Software (IAO placement?) Data Visualization Processes: Human Genome Data Measurement Process Reference Method Performance Assessment Process Genome Sequencing Process Data Integration Process Sequencing Method (Process) Data Characterization Process Performance Assessment Process Genome Sequencing Run Data Storage Process Computer Networking Process Data Processing Process Data Visualization Process Data Analytics Process Roles and Characteristics: Reference Material Role Reference Data Role Data Source Role Data Volume (Characteristic) Data Variety (Characteristic) Data Variability (Characteristic) Data Veracity (Characteristic) Data Visualization Process Data Quality (Characteristic) Artifacts/Tools: Data Sequencing Technology (Tool) Computer System Computer Network Software (IAO placement?) Bioinformatics Sequencing Software Sequencer 17 Terms from Human Genome Data Use Case
  • 23. Conclusion • This method can be done for any part of the Big Data Taxonomy • Need SME input for various areas/domains • Need to add definitions in owl • Need to expand set of standardized relations • Link instances to the taxonomy (e.g. actual data sets, organizations, etc.) 23