SlideShare uma empresa Scribd logo
1 de 24
Previously known as
Think Big. Move Fast.
Template designed by
brought to you by
SolidQ
• Born in 2002 in USA and Spain
• Established in 2007 in Italy
• More than 1000 customers and more than 200 consultants worldwide
• Dedicated to Data Management on the Microsoft Platform
• Books Authors, Conference Speakers, SQL Server MVPs and Regional Directors
• www.solidq.com
Davide Mauri
• 18 Years of experience on the SQL Server Platform
• Specialized in Data Solution Architecture, Database Design, Performance
Tuning, Business Intelligence
• Microsoft SQL Server MVP
• President of UGISS (Italian SQL Server UG)
• Mentor @ SolidQ
• Video, Book & Article Author
• Regular Speaker @ SQL Server events
• Projects, Consulting, Mentoring & Training
Data Science
Reinassance 2.0
“Companies are collecting
mountains of information about
you, to predict how
likely you are to buy a product,
and using that knowledge to
craft a marketing message
precisely calibrated to get you to
do so”
Data Science
• Extraction of knowledge from data
• So, what’s new?
• Nothing. Except that it’s now economic and fast.
• It’s now applicable to everything. And we have a lot of data produced everyday
that can be used to extract knowledge
Data Science
DecisionsKnowledgeInformationData
Data Science
• A Sum Of
• Statistics
• Mathematics
• Machine Learning
• Data Mining
• Computer Programming
• Data Engineering
• Visualization
• Data Warehousing
• High Performance Computing
• To support (Informed) Decision Making
• Data-Driven Decisions
Data Scientist
• IBM
• A data scientist represents an evolution from the business or data analyst role.
• The formal training is similar, with a solid foundation typically in computer science and
applications, modeling, statistics, analytics and math.
• What sets the data scientist apart is strong business acumen, coupled with the ability to
communicate findings to both business and IT leaders in a way that can influence how
an organization approaches a business challenge.
• It's almost like a Renaissance individual who really wants to learn and bring change to
an organization.
Algorithms
• Algorithms are the new gatekeepers
• http://www.slideshare.net/socialisten/algorithms-are-the-new-gatekeepers
• There is simply too much data for a human to analyze!
• They decide
• What we find
• What we see
• What we buy
• Data is the foundation upon which algorithm works
• Better Data lease Better Results
• Data-Driven Decisions will be a MUST in the next years!
• Data Scientists will help companies to leverage their most valuable asset: Data
Modern Data Environment
Master
Data
EDW
Data Mart
Big Data
Unstructured
Data
BI Environment
Analytics Environment
Structured
Data
Big Data
The 3 V
No, the 4 V!!!
No, no, the 5 V!!!!!
http://www.ibmbigdatahub.com/infographic/four-vs-big-data
Big Data
• Volume, Velocity, Variety, Veracity….V<your-v-here>
• Data sets with sizes beyond the ability of commonly used software tools
to capture, curate, manage, and process the data within a tolerable elapsed
time
• Grid Computing, Parallel Computing needed
• keep processing time reasonable
• provide scalability
Big Data Data
• Paradigm: “Store Now, Figure Out Later”
• Data is the new resource. Never throw it away!
• Unstructured Data
• Text Files
• Images
• Sounds
• Structured/Semi Structured Data
• Sensors
• Transactions
• Logs
Data Storage
• RDBMS
• SQL Server
• Hadoop
• HDInsight
• Hortonworks Data Platform
• Distributed File (Eco)System
• CSV
• JSON
• *.*
Data Storage
• Hadoop Ecosystem
http://hortonworks.com/hadoop-modern-data-architecture/
Data Science & Big Data
• Data Science != Big Data
• Data Science Not Only on Big Data
• Data Science can be applied to Big Data
• Data Science starts from Small Data
• 1) find the algorithm that extract knowledge
• 2) measure algorithm results and in terms of probability
Machine Learning
• Machine learning, a branch of artificial intelligence, concerns the construction
and study of systems that can learn from data. (Wikipedia)
• For example, a machine learning system could be trained on email messages to learn to
distinguish between spam and non-spam messages. After learning, it can then be used
to classify new email messages into spam and non-spam folders.
• Flavors
• Supervised
• Unsupervised
Data Analysis
• Common Data Scientists Tools
• R
• Weka
• Octave
• Scikit-Learn
• Common Data Scientists Languages
• Python
• Scala
• F#
Resources
• https://www.coursera.org/
• Data Scientist Specialization
• https://www.khanacademy.org/
• Math
• http://www.osservatori.net/business_intelligence
• Italian Big Data Market Analysis Resources
• http://www.solidq.com/consulting/
• Data Science Services
• Big Data / Business Intelligence / Data Warehousing
Previously known as
Think Big. Move Fast.

Mais conteúdo relacionado

Mais procurados

7 Big Data Challenges and How to Overcome Them
7 Big Data Challenges and How to Overcome Them7 Big Data Challenges and How to Overcome Them
7 Big Data Challenges and How to Overcome ThemQubole
 
Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigManish Chopra
 
Lunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataLunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataMelissa Hornbostel
 
Making Big Data Easy for Everyone
Making Big Data Easy for EveryoneMaking Big Data Easy for Everyone
Making Big Data Easy for EveryoneCaserta
 
Data Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsData Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsMohd Izhar Firdaus Ismail
 
The Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data SolutionThe Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data SolutionDATAVERSITY
 
One Database Countless Possibilities for Mission-critical Applications
One Database Countless Possibilities for Mission-critical ApplicationsOne Database Countless Possibilities for Mission-critical Applications
One Database Countless Possibilities for Mission-critical ApplicationsFairCom
 
Intro to Data Science on Hadoop
Intro to Data Science on HadoopIntro to Data Science on Hadoop
Intro to Data Science on HadoopCaserta
 
The Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThe Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThomas Kelly, PMP
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLJen Stirrup
 
Big Data Content Organization, Discovery, and Management
Big Data Content Organization, Discovery, and ManagementBig Data Content Organization, Discovery, and Management
Big Data Content Organization, Discovery, and ManagementAccess Innovations, Inc.
 
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...Usama Fayyad
 
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?Caserta
 
Data catalog
Data catalogData catalog
Data catalogiamtodor
 

Mais procurados (20)

The Big Data Dream Team
The Big Data Dream TeamThe Big Data Dream Team
The Big Data Dream Team
 
7 Big Data Challenges and How to Overcome Them
7 Big Data Challenges and How to Overcome Them7 Big Data Challenges and How to Overcome Them
7 Big Data Challenges and How to Overcome Them
 
Big data
Big dataBig data
Big data
 
Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-Koenig
 
Lunch & Learn Intro to Big Data
Lunch & Learn Intro to Big DataLunch & Learn Intro to Big Data
Lunch & Learn Intro to Big Data
 
Making Big Data Easy for Everyone
Making Big Data Easy for EveryoneMaking Big Data Easy for Everyone
Making Big Data Easy for Everyone
 
Data Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsData Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact Solutions
 
Paving The Way To Data Driven
Paving The Way To Data DrivenPaving The Way To Data Driven
Paving The Way To Data Driven
 
The Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data SolutionThe Top 5 Factors to Consider When Choosing a Big Data Solution
The Top 5 Factors to Consider When Choosing a Big Data Solution
 
One Database Countless Possibilities for Mission-critical Applications
One Database Countless Possibilities for Mission-critical ApplicationsOne Database Countless Possibilities for Mission-critical Applications
One Database Countless Possibilities for Mission-critical Applications
 
Intro to Data Science on Hadoop
Intro to Data Science on HadoopIntro to Data Science on Hadoop
Intro to Data Science on Hadoop
 
The Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThe Emerging Data Lake IT Strategy
The Emerging Data Lake IT Strategy
 
Big Data Boom
Big Data BoomBig Data Boom
Big Data Boom
 
What is big data
What is big dataWhat is big data
What is big data
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureML
 
What is Data Science
What is Data ScienceWhat is Data Science
What is Data Science
 
Big Data Content Organization, Discovery, and Management
Big Data Content Organization, Discovery, and ManagementBig Data Content Organization, Discovery, and Management
Big Data Content Organization, Discovery, and Management
 
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...Usama Fayyad talk at IIT Madras on March 27, 2015:  BigData, AllData, Old Dat...
Usama Fayyad talk at IIT Madras on March 27, 2015: BigData, AllData, Old Dat...
 
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?
 
Data catalog
Data catalogData catalog
Data catalog
 

Semelhante a Data Science Overview

Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeCaserta
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsCaserta
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedDunn Solutions Group
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - IntroductionTomy Rhymond
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data sciencebhavesh lande
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureCaserta
 
Big Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data ManagementBig Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data ManagementTony Bain
 
Big Data Analytics with Microsoft
Big Data Analytics with MicrosoftBig Data Analytics with Microsoft
Big Data Analytics with MicrosoftCaserta
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big DataIndu Khemchandani
 
How to crack Big Data and Data Science roles
How to crack Big Data and Data Science rolesHow to crack Big Data and Data Science roles
How to crack Big Data and Data Science rolesUpXAcademy
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataRoi Blanco
 
Presentation on Big Data Analytics
Presentation on Big Data AnalyticsPresentation on Big Data Analytics
Presentation on Big Data AnalyticsS P Sajjan
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2RojaT4
 

Semelhante a Data Science Overview (20)

Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data Lake
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - Introduction
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
Intro big data analytics
Intro big data analyticsIntro big data analytics
Intro big data analytics
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Data analytics & its Trends
Data analytics & its TrendsData analytics & its Trends
Data analytics & its Trends
 
Big Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data ManagementBig Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data Management
 
Big Data Analytics with Microsoft
Big Data Analytics with MicrosoftBig Data Analytics with Microsoft
Big Data Analytics with Microsoft
 
big-data-notes1.ppt
big-data-notes1.pptbig-data-notes1.ppt
big-data-notes1.ppt
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big Data
 
SKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSISSKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSIS
 
How to crack Big Data and Data Science roles
How to crack Big Data and Data Science rolesHow to crack Big Data and Data Science roles
How to crack Big Data and Data Science roles
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Presentation on Big Data Analytics
Presentation on Big Data AnalyticsPresentation on Big Data Analytics
Presentation on Big Data Analytics
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 

Mais de Davide Mauri

Azure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstartAzure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstartDavide Mauri
 
Agile Data Warehousing
Agile Data WarehousingAgile Data Warehousing
Agile Data WarehousingDavide Mauri
 
Dapper: the microORM that will change your life
Dapper: the microORM that will change your lifeDapper: the microORM that will change your life
Dapper: the microORM that will change your lifeDavide Mauri
 
When indexes are not enough
When indexes are not enoughWhen indexes are not enough
When indexes are not enoughDavide Mauri
 
Building a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with AzureBuilding a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with AzureDavide Mauri
 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveDavide Mauri
 
Azure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSONAzure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSONDavide Mauri
 
SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2Davide Mauri
 
SQL Server 2016 Temporal Tables
SQL Server 2016 Temporal TablesSQL Server 2016 Temporal Tables
SQL Server 2016 Temporal TablesDavide Mauri
 
SQL Server 2016 What's New For Developers
SQL Server 2016  What's New For DevelopersSQL Server 2016  What's New For Developers
SQL Server 2016 What's New For DevelopersDavide Mauri
 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream AnalyticsDavide Mauri
 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine LearningDavide Mauri
 
Dashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDavide Mauri
 
Azure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applicationsAzure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applicationsDavide Mauri
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsDavide Mauri
 
SQL Server 2016 JSON
SQL Server 2016 JSONSQL Server 2016 JSON
SQL Server 2016 JSONDavide Mauri
 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveDavide Mauri
 
Real Time Power BI
Real Time Power BIReal Time Power BI
Real Time Power BIDavide Mauri
 
AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)Davide Mauri
 
Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)Davide Mauri
 

Mais de Davide Mauri (20)

Azure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstartAzure serverless Full-Stack kickstart
Azure serverless Full-Stack kickstart
 
Agile Data Warehousing
Agile Data WarehousingAgile Data Warehousing
Agile Data Warehousing
 
Dapper: the microORM that will change your life
Dapper: the microORM that will change your lifeDapper: the microORM that will change your life
Dapper: the microORM that will change your life
 
When indexes are not enough
When indexes are not enoughWhen indexes are not enough
When indexes are not enough
 
Building a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with AzureBuilding a Real-Time IoT monitoring application with Azure
Building a Real-Time IoT monitoring application with Azure
 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep Dive
 
Azure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSONAzure SQL & SQL Server 2016 JSON
Azure SQL & SQL Server 2016 JSON
 
SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2SQL Server & SQL Azure Temporal Tables - V2
SQL Server & SQL Azure Temporal Tables - V2
 
SQL Server 2016 Temporal Tables
SQL Server 2016 Temporal TablesSQL Server 2016 Temporal Tables
SQL Server 2016 Temporal Tables
 
SQL Server 2016 What's New For Developers
SQL Server 2016  What's New For DevelopersSQL Server 2016  What's New For Developers
SQL Server 2016 What's New For Developers
 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream Analytics
 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine Learning
 
Dashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BI
 
Azure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applicationsAzure ML: from basic to integration with custom applications
Azure ML: from basic to integration with custom applications
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream Analytics
 
SQL Server 2016 JSON
SQL Server 2016 JSONSQL Server 2016 JSON
SQL Server 2016 JSON
 
SSIS Monitoring Deep Dive
SSIS Monitoring Deep DiveSSIS Monitoring Deep Dive
SSIS Monitoring Deep Dive
 
Real Time Power BI
Real Time Power BIReal Time Power BI
Real Time Power BI
 
AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)AzureML - Creating and Using Machine Learning Solutions (Italian)
AzureML - Creating and Using Machine Learning Solutions (Italian)
 
Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)Datarace: IoT e Big Data (Italian)
Datarace: IoT e Big Data (Italian)
 

Último

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 

Último (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 

Data Science Overview

  • 1. Previously known as Think Big. Move Fast.
  • 3. SolidQ • Born in 2002 in USA and Spain • Established in 2007 in Italy • More than 1000 customers and more than 200 consultants worldwide • Dedicated to Data Management on the Microsoft Platform • Books Authors, Conference Speakers, SQL Server MVPs and Regional Directors • www.solidq.com
  • 4. Davide Mauri • 18 Years of experience on the SQL Server Platform • Specialized in Data Solution Architecture, Database Design, Performance Tuning, Business Intelligence • Microsoft SQL Server MVP • President of UGISS (Italian SQL Server UG) • Mentor @ SolidQ • Video, Book & Article Author • Regular Speaker @ SQL Server events • Projects, Consulting, Mentoring & Training
  • 6. “Companies are collecting mountains of information about you, to predict how likely you are to buy a product, and using that knowledge to craft a marketing message precisely calibrated to get you to do so”
  • 7. Data Science • Extraction of knowledge from data • So, what’s new? • Nothing. Except that it’s now economic and fast. • It’s now applicable to everything. And we have a lot of data produced everyday that can be used to extract knowledge
  • 9. Data Science • A Sum Of • Statistics • Mathematics • Machine Learning • Data Mining • Computer Programming • Data Engineering • Visualization • Data Warehousing • High Performance Computing • To support (Informed) Decision Making • Data-Driven Decisions
  • 10. Data Scientist • IBM • A data scientist represents an evolution from the business or data analyst role. • The formal training is similar, with a solid foundation typically in computer science and applications, modeling, statistics, analytics and math. • What sets the data scientist apart is strong business acumen, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge. • It's almost like a Renaissance individual who really wants to learn and bring change to an organization.
  • 11. Algorithms • Algorithms are the new gatekeepers • http://www.slideshare.net/socialisten/algorithms-are-the-new-gatekeepers • There is simply too much data for a human to analyze! • They decide • What we find • What we see • What we buy • Data is the foundation upon which algorithm works • Better Data lease Better Results • Data-Driven Decisions will be a MUST in the next years! • Data Scientists will help companies to leverage their most valuable asset: Data
  • 12. Modern Data Environment Master Data EDW Data Mart Big Data Unstructured Data BI Environment Analytics Environment Structured Data
  • 13. Big Data The 3 V No, the 4 V!!! No, no, the 5 V!!!!!
  • 15. Big Data • Volume, Velocity, Variety, Veracity….V<your-v-here> • Data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time • Grid Computing, Parallel Computing needed • keep processing time reasonable • provide scalability
  • 16. Big Data Data • Paradigm: “Store Now, Figure Out Later” • Data is the new resource. Never throw it away! • Unstructured Data • Text Files • Images • Sounds • Structured/Semi Structured Data • Sensors • Transactions • Logs
  • 17. Data Storage • RDBMS • SQL Server • Hadoop • HDInsight • Hortonworks Data Platform • Distributed File (Eco)System • CSV • JSON • *.*
  • 18. Data Storage • Hadoop Ecosystem http://hortonworks.com/hadoop-modern-data-architecture/
  • 19. Data Science & Big Data • Data Science != Big Data • Data Science Not Only on Big Data • Data Science can be applied to Big Data • Data Science starts from Small Data • 1) find the algorithm that extract knowledge • 2) measure algorithm results and in terms of probability
  • 20. Machine Learning • Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. (Wikipedia) • For example, a machine learning system could be trained on email messages to learn to distinguish between spam and non-spam messages. After learning, it can then be used to classify new email messages into spam and non-spam folders. • Flavors • Supervised • Unsupervised
  • 21. Data Analysis • Common Data Scientists Tools • R • Weka • Octave • Scikit-Learn • Common Data Scientists Languages • Python • Scala • F#
  • 22.
  • 23. Resources • https://www.coursera.org/ • Data Scientist Specialization • https://www.khanacademy.org/ • Math • http://www.osservatori.net/business_intelligence • Italian Big Data Market Analysis Resources • http://www.solidq.com/consulting/ • Data Science Services • Big Data / Business Intelligence / Data Warehousing
  • 24. Previously known as Think Big. Move Fast.

Notas do Editor

  1. Last Changes: 2014-04-25 – DM – v1
  2. http://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/
  3. http://www-01.ibm.com/software/data/infosphere/data-scientist/
  4. http://www.ibmbigdatahub.com/infographic/four-vs-big-data
  5. http://nirvacana.com/thoughts/becoming-a-data-scientist/
  6. Last Changes: 2012-07-30 DM