Structuring big data
 Mark Wilson
 January 2012




#CloudCamp              UNCLASSIFIED   © Copyright 2012 Fujitsu Services Limited
The problem with big data: and a solution
The problem:
        “New reference architectures will include both big data and enterprise
         data warehouses”
                                                              [IDC, 19 January 2012]
        Two worlds: structured and unstructured data (plus external data
         sources, documents stored in structured databases, etc.)
        Siloes create issues with management, integration, etc.
The solution:
        Linked data – a single reference point for all data in the enterprise




Some history



               Fixed structure
                   Difficult to change schema
               Simple reporting capabilities
                   Complex to create new reports




Some history


                   Completed
                    transactions
                    transferred to separate
                    database for analysis
                       “Data warehouse”
                   Better reporting, data
                    mining, etc.
                       Still highly structured
                   Data is historical
                       May be aggregated




The smart guys



Real-time update of completed
 transactions
        Transactions moved to data warehouse
         upon completion
        Smaller transactional database
Allows alerts to be generated, and action
 to be taken, when specific conditions
 are met
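
A minimal sketch of the pattern on this slide, assuming a single in-memory process: a transaction leaves the small transactional store the moment it completes, lands in the warehouse, and is checked against an alert condition. The names Transaction, Pipeline and alert_rule are hypothetical and used only for illustration; nothing here reflects a specific product.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Transaction:
    tx_id: int
    amount: float
    completed: bool = False


@dataclass
class Pipeline:
    # alert_rule is a hypothetical predicate supplied by the caller.
    alert_rule: Callable[[Transaction], bool]
    live: dict = field(default_factory=dict)       # small transactional store
    warehouse: list = field(default_factory=list)  # historical store for analysis
    alerts: list = field(default_factory=list)

    def open(self, tx: Transaction) -> None:
        """Register an in-flight transaction in the transactional store."""
        self.live[tx.tx_id] = tx

    def complete(self, tx_id: int) -> None:
        """Move a transaction to the warehouse as soon as it completes."""
        tx = self.live.pop(tx_id)
        tx.completed = True
        self.warehouse.append(tx)
        # Evaluate the alert condition at the moment of transfer.
        if self.alert_rule(tx):
            self.alerts.append(f"ALERT: transaction {tx.tx_id} amount {tx.amount}")


pipeline = Pipeline(alert_rule=lambda tx: tx.amount > 1000)
pipeline.open(Transaction(1, 50.0))
pipeline.open(Transaction(2, 5000.0))
pipeline.complete(1)
pipeline.complete(2)
print(pipeline.alerts)
```

Because completed transactions are removed from the live store straight away, the transactional side stays small while the warehouse accumulates history.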




A third “data silo”



                      Masses of unstructured/semi-
                       structured data being processed in
                       NoSQL databases
                      May or may not be transferred
                       to/from structured databases
                          Time-consuming and inefficient
                      Three types of data, each with their
                       own limitations and own
                       management considerations




Data everywhere!




Linked Data
Tie records together – even from separate data sets
We can express these links as triples with a specific grammar (subject, predicate, object):




Build up a graph to show machine-readable data in human
 form
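
As a rough illustration of the idea above, the following sketch expresses records from two separate data sets as subject, predicate, object triples and merges them into one graph. All identifiers (customer:42, placedOrder and so on) are made up for the example and do not come from the presentation; a real deployment would more likely use RDF URIs and a triple store.

```python
from collections import defaultdict

# Hypothetical records from two separate data sets.
crm_triples = [
    ("customer:42", "hasName", "Alice Example"),
    ("customer:42", "placedOrder", "order:1001"),
]
warehouse_triples = [
    ("order:1001", "containsProduct", "product:7"),
    ("product:7", "hasLabel", "Widget"),
]


def build_graph(*triple_sets):
    """Merge triple sets into one map: subject -> [(predicate, object), ...]."""
    graph = defaultdict(list)
    for triples in triple_sets:
        for subject, predicate, obj in triples:
            graph[subject].append((predicate, obj))
    return graph


def describe(graph, subject):
    """Render the edges leaving a subject as readable statements."""
    return [f"{subject} {p} {o}" for p, o in graph.get(subject, [])]


graph = build_graph(crm_triples, warehouse_triples)
for line in describe(graph, "customer:42"):
    print(line)
```

The shared identifier order:1001 is what ties the two data sets together: it appears as an object in one set and as a subject in the other.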




Then add lots more data…




Source: http://lod-cloud.net/
        Each node is itself another graph (zoom in)
Aren’t we missing a trick?
Use linked data as the
 optimal reference source
        Broker of all data sources
Single view on structured and
 unstructured data
        Bring in external sources too
Mapping, interconnecting,
 indexing and feeding
        In real time
Query linked data to derive
 new value from old
        Infer relationships
        Gain new insights


About the author
Mark Wilson, Strategy Manager, Fujitsu
Mark is an analyst working within Fujitsu’s UK and
Ireland Office of the CTO, providing thought
leadership both internally and to customers,
shaping business and technology strategy. He has
17 years' experience of working in the IT industry,
12 of which have been with Fujitsu. Mark has a
background in leading large IT infrastructure
projects with customers in the UK, mainland
Europe and Australia. He has a degree in
Computer Studies from the University of
Glamorgan. Mark is also active in social media and
won the Individual IT Professional (Male) award in
the 2010 Computer Weekly IT Blog Awards. Mark
may be found on Twitter @markwilsonit.

If you would like to comment on the topics in this
presentation, Mark would welcome your feedback,
by email to mark.a.wilson@uk.fujitsu.com.


Editor's notes

  1. Everyone’s talking about big data, but the bulk of the conversation seems to focus on a new level of business intelligence and an ever-increasing volume of data organised into OLTP, OLAP and NoSQL siloes. In this talk, Mark Wilson puts forward a view that the real value comes not from the big data itself but from how we can employ linked data concepts to integrate structured, unstructured and semi-structured data sets, and then use this unified data source to derive new value.