Real life for Big Data:
what is data science?
Irina Muhina,
PhD in AI with 25 years of practical experience,
Big Data and STEM Expert, Founder of iECARUS,
President of ERUDITE school
iECARUS is your concierge for educational intelligence.
www.iecarus.com
September, 2016,
Russia
The future belongs to the companies and people that turn data into products.
Agenda
• History of Data Mining and Big Data …
• What is Big Data?
• What are the real-life dimensions of Big Data?
- return on investment (ROI)
- amount of real-time data
- demand for data scientist jobs and average compensation packages
- expectations for the data scientist
- salaries for data scientists
• How to use Big Data for STEM and Infonomics?
• Case studies and tools using Big Data examples from industries:
– Trading strategy analysis
– Parametric and distribution analysis
– Two-regimes risk model
– Correlation analysis with different cut-off
– Optimization models with re-sampling
• What is the future of Data Science?
History of data mining
https://rayli.net/blog/data/history-of-data-mining/
http://insideanalysis.com/2012/04/data-mining-and-beyond/
Statistics and Analytics in 20th Century
Recent History
Google Trends for Data Mining and Analytics
“Analytics” versus “Google Analytics”
News References to Term “Data Mining”
Evolution of Terminology
Increased Use of Term “Big Data”
Big Data initiatives
Big Data is #1:
- on the 2012 list of most ambiguous terms (Global Language Monitor)
- most searched term among clients on Gartner.com
Traditional DW & BI versus Big Data & Advanced Analytics:
• Requirements-based versus opportunity-oriented
• Top-down design versus bottom-up experimentation
• Integration and reuse versus immediate use
• Competence centers versus hackathons
• Better decisions versus business innovation
• Enterprise versus functional
Who is a Data Scientist?
• Works more closely with multiple teams than statisticians do
• Is always expected to work with big data types such as operational technology, text and streaming
• Combines mathematics, statistics, machine learning and algorithmic processing
• Needs communication skills much more frequently than BI or statistics roles
• Has to be able to code, write and present well
Current roles:
• Solution architect
• Business analyst
• Requirements analyst
• Data modeler
• Data integration lead
• Data integration developer
• Report writer
• BI platform lead
• Database administrator
• User trainer
• Data steward
Success of Data Science Solutions:
skills, roles, responsibilities
What is Big Data? Gartner IT model
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
It has many similarities with existing distributed file systems.
However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and
is designed to be deployed on low-cost hardware.
HDFS provides high throughput access to application data and is suitable for applications that have large data sets.
HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as
infrastructure for the Apache Nutch web search engine project. HDFS is now an Apache Hadoop subproject.
The project URL is http://hadoop.apache.org/hdfs/.
Master Management System
Database Management System
Hybrid information architectures
Information Capability Framework
Anticipate, govern and hedge information-borne risks.
Data is the new currency and new asset.
Likelihood of optimistic, pessimistic and realistic scenarios.
My role is a translator: from business to analytics to IT
and back to business.
Big Data: “You torture the data until it confesses”
For the O’Reilly Data Science Salary Survey, input was analyzed from 983 respondents
working in the data space, across a variety of industries,
representing 45 countries and 45 US states; 3/5 of the respondents are from the US.
There is a difference of $10K between the median salaries of men and women.
Keeping all other variables constant (same roles, same skills), women make less than men.
• How to use Big Data for STEM?
Emerging Role of the Data Scientist
The Art of Data Science for IT, business
The Birth of Infonomics, the New Economics of Information
Real projects using Big Data
case studies and tools from industries
• Trading strategy case study
• Parametric and distribution case study
• Two-regimes risk model case study
• Correlation analysis with different cut-off
• Optimization models with re-sampling
Analytical Tools
Excel, SAS, SPSS, R, SQL, Tableau,
MATLAB, Watson, Hadoop
Which is the biggest opportunity for Big Data?
Trading on the daily price of ACWI crossing its 50-day exponential
moving average (EMA) seems to be a good strategy:
• Price crosses the EMA from below: go overweight
• Price crosses the EMA from above: go underweight
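The crossover rule above can be sketched in a few lines of Python. This is only an illustration, not the model used in the case study: the function names are hypothetical, the EMA is seeded with the first price, and the smoothing factor 2 / (span + 1) is the common convention, assumed here because the slides do not state one.

```python
def ema(prices, span=50):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    Assumption: the average is seeded with the first price.
    """
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def crossover_positions(prices, span=50):
    """Return the position held on each day: 'overweight', 'underweight',
    or None before the first crossing."""
    avg = ema(prices, span)
    positions = []
    current = None
    for i in range(len(prices)):
        if i > 0:
            prev_diff = prices[i - 1] - avg[i - 1]
            diff = prices[i] - avg[i]
            if prev_diff <= 0 < diff:
                # Price crossed the EMA from below.
                current = "overweight"
            elif prev_diff >= 0 > diff:
                # Price crossed the EMA from above.
                current = "underweight"
        positions.append(current)
    return positions


if __name__ == "__main__":
    # Toy prices and a short span, for illustration only.
    print(crossover_positions([10, 10, 10, 12, 14, 9, 8], span=3))
```

With real daily closes, `span=50` corresponds to the 50-day EMA mentioned on the slide.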
Different trading strategies analysis
Trade benefit vs. trade length
Bad trades tend to be very short, i.e. they occur when the model is
switching rapidly between overweight and underweight
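One way to check the relationship between trade benefit and trade length is to cut the position history into trades and compare short trades with long ones. The sketch below is an assumption-laden illustration: the helper names, the definition of benefit (signed price return between opening and closing day) and the 5-day cut-off for a "short" trade are all hypothetical and are not taken from the slides.

```python
from statistics import mean


def split_into_trades(prices, positions):
    """Group consecutive days with the same position into trades.

    Each trade is a tuple (position, length_in_days, benefit), where benefit
    is the price return from the opening day to the closing day, with the
    sign flipped for underweight trades. Assumes None appears only before
    the first position is taken.
    """
    sign = {"overweight": 1.0, "underweight": -1.0}

    def make_trade(start, end):
        ret = prices[end] / prices[start] - 1.0
        return (positions[start], end - start, sign[positions[start]] * ret)

    trades = []
    start = None
    for i, pos in enumerate(positions):
        if pos is None:
            continue
        if start is None:
            start = i
        elif pos != positions[start]:
            # Position flipped: close the running trade and open a new one.
            trades.append(make_trade(start, i))
            start = i
    last = len(prices) - 1
    if start is not None and start < last:
        # Close the still-open trade on the last available day.
        trades.append(make_trade(start, last))
    return trades


def mean_benefit_by_length(trades, short_cutoff=5):
    """Average benefit of short trades (length <= short_cutoff) and long ones."""
    short = [benefit for _, length, benefit in trades if length <= short_cutoff]
    long_ = [benefit for _, length, benefit in trades if length > short_cutoff]
    return {
        "short": mean(short) if short else None,
        "long": mean(long_) if long_ else None,
    }
```

If the observation on the slide holds for a given price series, the "short" bucket would show a lower average benefit than the "long" bucket.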
3 Scenarios for the Future of Data Science
• Big Data Ventures
Data Science will be practiced exclusively by companies
specializing in big data analytics.
• Big Data Accountants
Data Science will become a specialized, in-house function,
similar to today’s Accounting, Legal, and IT departments.
• Everybody’s a Big Data Expert
The vision of “data democracy” will come true and everybody in
the organization will create and consume big data. Data
science fundamentals will be thoroughly integrated in all levels
of management education.
https://whatsthebigdata.com/2012/03/12/3-scenarios-for-the-future-of-data-science/
How the Internet of Things
Changes Big Data Analytics
Expand your analytic capabilities
Data Mining resources (just a few)
http://cs.nyu.edu/~dsontag/courses/ml12/slides/lecture13.pdf
http://en.wikipedia.org/wiki/AdaBoost
http://en.wikipedia.org/wiki/Boosting_(machine_learning)
http://en.wikipedia.org/wiki/Decision_tree_learning
http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
http://en.wikipedia.org/wiki/Naive_Bayes_classifier
http://en.wikipedia.org/wiki/PageRank
http://infolab.stanford.edu/~backrub/google.html
http://nikhilvithlani.blogspot.com/2012/03/apriori-algorithm-for-data-mining-made.html
http://stackoverflow.com/questions/10059594/a-simple-explanation-of-naive-bayes-classification
http://stackoverflow.com/questions/10617401/advantages-of-svm-over-decion-trees-and-adaboost-algorithm/10626287#10626287
http://stackoverflow.com/questions/11808074/what-is-an-intuitive-explanation-of-expectation-maximization-technique
http://stackoverflow.com/questions/12097155/weak-classifier/12097371#12097371
http://stackoverflow.com/questions/1922985/explaining-the-adaboost-algorithms-to-non-technical-people/2295419#2295419
http://stackoverflow.com/questions/9979461/different-decision-tree-algorithms-with-comparison-of-complexity-or-performance
http://stats.stackexchange.com/questions/23391/how-does-a-support-vector-machine-svm-work
http://stats.stackexchange.com/questions/2641/what-is-the-difference-between-likelihood-and-probability
http://stats.stackexchange.com/questions/82049/what-is-meant-by-weak-learner
http://www.bmnh.org/web_users/pf/idiots.pdf
http://www.bruceclay.com/blog/what-is-pagerank/
http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm
http://www.mathworks.com/help/stats/classification-trees-and-regression-trees.html
http://www.quora.com/What-are-the-advantages-of-different-classification-algorithms
http://www.quora.com/What-does-support-vector-machine-SVM-mean-in-laymans-terms
http://www.reddit.com/r/statistics/comments/19ubvi/could_someone_please_explain_max_likelihood_and/
http://www.simafore.com/blog/bid/62482/2-main-differences-between-classification-and-regression-trees
http://www.slideshare.net/maimustafa566/page-rank-algorithm-33212250
http://www.statsoft.com/Textbook/Classification-and-Regression-Trees
https://chrisjmccormick.wordpress.com/2013/12/13/adaboost-tutorial/
https://class.coursera.org/pgm-003/lecture (Week 9)
https://www.cs.duke.edu/courses/fall07/cps271/EM.pdf
https://www.ee.washington.edu/techsite/papers/documents/UWEETR-2010-0002.pdf
No one knows for certain what the future can bring, but
without vision, how can we achieve our dreams?
www.gartner.com
www.theoryandpractice.ru
www.ted.com
www.zonein.ca/virtual-child
www.digcompass.ca
www.ictc-ctic.ca
www.computingcareers.acm.org
www.tfsa.ca/centre-of-excellence
http://thinkbigdata.in/
http://data-informed.com/
If you have questions about this presentation, you can
write to us at iecarus.ca@gmail.com

Data science fin_tech_2016

  • 1. Real life for Big Data: what is data science ? Irina Muhina, PhD in AI with 25 years practical experience, Big Data and STEM Expert, Founder of iECARUS, President of ERUDITE school iECARUS is your concierge for educational intelligence. www.iecarus.com September, 2016, Russia The future belongs to the companies аnd people that turn data into products.
  • 2. Agenda • History of Data Mining and Big Data … • What is the Big Data ? • What are the real life dimensions for Big Data ? - return on investment (ROI) - amount of real-time data - demand for data scientists job and average compensation packages - expectations for the data scientist - salaries for data scientists How to use Big Data for STEM and INFONOMICS? • Case studies and tools using Big Data examples from industries: – Trading strategy analysis – Parametric and distribution analysis – Two-regimes risk model – Correlation analysis with different cut-off – Optimization models with re-sampling • What is the future of Data Science ?
  • 3. History of data mining https://rayli.net/blog/data/history-of-data-mining/
  • 5. Google Trends for Data Mining and Analytics “Analytics” versus “Google Analytics”
  • 6. News References to Term “Data Mining Evolution of Terminology
  • 7. Increased Use of Term “Big Data”
  • 8. on the 2012 list of most ambiguous terms - Global Language Monitor most searched term among clients – on Gartner.com Big Data initiatives Traditional DW & BI Big Data & Advanced Analytics Big Data is #1 Requirements-based Top-down design Integration and reuse Competence centers Better decisions Enterprise Opportunity-oriented Bottom-up experimentation Immediate use Hackathons Business innovation Functional
  • 9.
  • 10.
  • 11. Who is a Data Scientist ? • Works more closely with multiple teams when compared to statisticians • always expected to work with types of big data — operational technology, text, streaming • Combinations of mathematics, statistics, machine learning and algorithmic processing • Demand for communication skills much more frequently than BI or statistics roles • Have to be able to code, write and present well Current roles: • Solution architect • Business analyst • Requirements analyst • Data modeler •Data integration lead •Data integration developer •Report writer •BI platform lead •Database administrator •User trainer •Data steward
  • 12. Success of Data Science Solutions: skills, roles, responsibilities
  • 13.
  • 14.
  • 15.
  • 16. What is Big Data ? Gartner IT model
  • 17. The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is now an Apache Hadoop subproject The project URL is http://hadoop.apache.org/hdfs/. Master Management System‎ Database Management System Hybrid information architectures
  • 19. Anticipate, govern and hedge information-borne risks. Data is the new currency and new asset. Likelihood of optimistic, pessimistic and realistic scenarios .
  • 20. My role is a translator: from business to analytics to IT and back to business.
  • 21.
  • 23.
  • 24.
  • 25. O’Reilly Data Science Salary Survey, we’ve analyzed input from 983 respondents working in the data space, across a variety of industries— representing 45 countries and 45 US states and 3/5 from US representing 45 countries and 45 US states.
  • 26.
  • 27. There is a difference of $10K between the median salaries of men and women. Keeping all other variables constant—same roles, same skills—women make less than men.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32. • How to use Big Data for STEM ? Emerging Role of the Data Scientist the Art of Data Science for IT, business The Birth of Infonomics, the New Economics of Information
  • 33. Real projects using Big Data case studies and tools from industries • Trading strategy case study • Parametric and distribution case study • Two-regimes risk model case study • Correlation analysis with different cut-off • Optimization models with re-sampling Analytical Tools Excel, SAS, SPSS, R , SQL, Tableau, MatLab, Watson , Hadoop
  • 35. Which is the biggest opportunity for Big Data?
  • 40. Different trading strategies analysis. The daily price crossing the 50-day EMA of ACWI seems to be a good strategy: when the price crosses the EMA from below, go overweight; when the price crosses the EMA from above, go underweight. Trade benefit vs. trade length: bad trades tend to be very short, i.e. they occur when the model is switching rapidly between overweight and underweight.
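The crossover rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model used in the case study: the function names (`ema`, `crossover_signals`, `trade_lengths`) are hypothetical, the EMA is seeded with the first price and uses the common smoothing factor 2 / (span + 1), and a price exactly equal to the EMA is treated as "not above". The sample prices are made up purely to exercise the code.

```python
def ema(prices, span=50):
    """Exponential moving average; assumes smoothing factor 2 / (span + 1)
    and seeds the average with the first price (both are assumptions)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def crossover_signals(prices, span=50):
    """Return (day_index, action) pairs for each day the price crosses its EMA.

    Crossing from below -> "overweight"; crossing from above -> "underweight".
    """
    avg = ema(prices, span)
    signals = []
    for i in range(1, len(prices)):
        was_above = prices[i - 1] > avg[i - 1]
        is_above = prices[i] > avg[i]
        if is_above and not was_above:
            signals.append((i, "overweight"))
        elif was_above and not is_above:
            signals.append((i, "underweight"))
    return signals


def trade_lengths(signals):
    """Days each position is held before the next switch."""
    return [later[0] - earlier[0] for earlier, later in zip(signals, signals[1:])]


if __name__ == "__main__":
    # Illustrative prices only; a short span keeps the example readable.
    sample = [10, 10, 10, 12, 12, 8, 8, 12]
    sigs = crossover_signals(sample, span=3)
    print(sigs)
    print(trade_lengths(sigs))
```

Very short entries in `trade_lengths` correspond to the rapid switching between overweight and underweight that the slide associates with bad trades, so a minimum holding period would be one possible filter to test.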
  • 45. 3 Scenarios for the Future of Data Science. (1) Big Data Ventures: Data Science will be practiced exclusively by companies specializing in big data analytics. (2) Big Data Accountants: Data Science will become a specialized, in-house function, similar to today’s Accounting, Legal, and IT departments. (3) Everybody’s a Big Data Expert: the vision of “data democracy” will come true and everybody in the organization will create and consume big data. Data science fundamentals will be thoroughly integrated in all levels of management education. https://whatsthebigdata.com/2012/03/12/3-scenarios-for-the-future-of-data-science/
  • 46. How the Internet of Things Changes Big Data Analytics
  • 47. Expand your analytic capabilities
  • 49. Data Mining resources (just a few):
    - http://cs.nyu.edu/~dsontag/courses/ml12/slides/lecture13.pdf
    - http://en.wikipedia.org/wiki/AdaBoost
    - http://en.wikipedia.org/wiki/Boosting_(machine_learning)
    - http://en.wikipedia.org/wiki/Decision_tree_learning
    - http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
    - http://en.wikipedia.org/wiki/Naive_Bayes_classifier
    - http://en.wikipedia.org/wiki/PageRank
    - http://infolab.stanford.edu/~backrub/google.html
    - http://nikhilvithlani.blogspot.com/2012/03/apriori-algorithm-for-data-mining-made.html
    - http://stackoverflow.com/questions/10059594/a-simple-explanation-of-naive-bayes-classification
    - http://stackoverflow.com/questions/10617401/advantages-of-svm-over-decion-trees-and-adaboost-algorithm/10626287#10626287
    - http://stackoverflow.com/questions/11808074/what-is-an-intuitive-explanation-of-expectation-maximization-technique
    - http://stackoverflow.com/questions/12097155/weak-classifier/12097371#12097371
    - http://stackoverflow.com/questions/1922985/explaining-the-adaboost-algorithms-to-non-technical-people/2295419#2295419
    - http://stackoverflow.com/questions/9979461/different-decision-tree-algorithms-with-comparison-of-complexity-or-performance
    - http://stats.stackexchange.com/questions/23391/how-does-a-support-vector-machine-svm-work
    - http://stats.stackexchange.com/questions/2641/what-is-the-difference-between-likelihood-and-probability
    - http://stats.stackexchange.com/questions/82049/what-is-meant-by-weak-learner
    - http://www.bmnh.org/web_users/pf/idiots.pdf
    - http://www.bruceclay.com/blog/what-is-pagerank/
    - http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm
    - http://www.mathworks.com/help/stats/classification-trees-and-regression-trees.html
    - http://www.quora.com/What-are-the-advantages-of-different-classification-algorithms
    - http://www.quora.com/What-does-support-vector-machine-SVM-mean-in-laymans-terms
    - http://www.reddit.com/r/statistics/comments/19ubvi/could_someone_please_explain_max_likelihood_and/
    - http://www.simafore.com/blog/bid/62482/2-main-differences-between-classification-and-regression-trees
    - http://www.slideshare.net/maimustafa566/page-rank-algorithm-33212250
    - http://www.statsoft.com/Textbook/Classification-and-Regression-Trees
    - https://chrisjmccormick.wordpress.com/2013/12/13/adaboost-tutorial/
    - https://class.coursera.org/pgm-003/lecture (Week 9)
    - https://www.cs.duke.edu/courses/fall07/cps271/EM.pdf
    - https://www.ee.washington.edu/techsite/papers/documents/UWEETR-2010-0002.pdf
  • 50. No one knows for certain what the future can bring, but without vision, how can we achieve our dreams? Resources: www.gartner.com, www.theoryandpractice.ru, www.ted.com, www.zonein.ca/virtual-child, www.digcompass.ca, www.ictc-ctic.ca, www.computingcareers.acm.org, www.tfsa.ca/centre-of-excellence, http://thinkbigdata.in/, http://data-informed.com/. If you have questions about this presentation, you can write to us at iecarus.ca@gmail.com.