NOWCASTING
                                   BUSINESS PERFORMANCE




Wednesday, 28 November 2012
GROWTH INTELLIGENCE

                                          What we do
                                    Classification of companies
                                       Revenue estimation

                                          What we use
                                        Machine Learning
                                      Time series methods

SOME OF OUR CLIENTS




NOWCASTING

                          Estimating the value of a time series
                             not readily available at present




                                                      present


NOWCASTING




                                                present

NOWCASTING

                                      Previously called
                          short-term forecasting, or simply forecasting

                              More an approach and a goal than
                                 a separate theory or field



NOWCASTING USE CASES


                                    Weather nowcasting
                                    Search-based nowcasting
                                    GDP nowcasting




WEATHER NOWCASTING

                      Simplified model that is applied quickly

                                       Uses weather models

                   Forecast at location x given weather at y

                                   → Not applicable to other fields


SEARCH-BASED NOWCASTING

                                   Popularized by Google

                                     Recent successes
                                          Flu predictions
                                      Consumer behaviour
                                   travel, movies and products

                 Based on Google’s search data and simple autoregressive (AR) models
                        Only used to study what people are searching for
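
A minimal sketch of the kind of simple AR model mentioned above, assuming one autoregressive lag plus one search-volume signal observed in the current period: y[t] = a * y[t-1] + b * x[t]. The function names, the `sales` and `searches` series, and every number are hypothetical illustrations, not Google's data or method.

```python
def fit_ar1_with_signal(y, x):
    """Least-squares fit of y[t] = a * y[t-1] + b * x[t] (no intercept)."""
    lagged = y[:-1]   # y[t-1]
    target = y[1:]    # y[t]
    signal = x[1:]    # x[t], observed in the same period as the target
    # Normal equations for two regressors.
    s_ll = sum(l * l for l in lagged)
    s_ss = sum(s * s for s in signal)
    s_ls = sum(l * s for l, s in zip(lagged, signal))
    s_lt = sum(l * t for l, t in zip(lagged, target))
    s_st = sum(s * t for s, t in zip(signal, target))
    det = s_ll * s_ss - s_ls * s_ls
    if det == 0:
        raise ValueError("regressors are collinear")
    a = (s_lt * s_ss - s_st * s_ls) / det
    b = (s_st * s_ll - s_lt * s_ls) / det
    return a, b


def nowcast(a, b, last_y, current_x):
    """Estimate the current, not yet published value of y."""
    return a * last_y + b * current_x


# Hypothetical data generated from a = 0.5, b = 2.0 so the fit can be checked.
searches = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
sales = [10.0]
for s in searches[1:]:
    sales.append(0.5 * sales[-1] + 2.0 * s)

a, b = fit_ar1_with_signal(sales, searches)
# The search volume for the current period (7.0) is already known,
# while the sales figure for that period is not.
estimate = nowcast(a, b, sales[-1], 7.0)
```

The point of the search signal is timing: it is available for the current period before the target series is published.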



GDP NOWCASTING

                                   Field with the most generic research

                                      Major research since the 1990s

                      GDP is released quarterly, with later revisions

                                   Thousands of signals for GDP nowcasting
                                     Industrial production, unemployment,
                                       confidence surveys, retail sales, ...



GDP NOWCASTING

                              Vector auto-regression and the “jagged edge”




                                                                         Present




                                   Different frequencies, different lags, missing data
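
The "jagged edge" can be made concrete with a small sketch: series published at different frequencies and with different delays end at different points, so the most recent rows of the aligned panel are partly empty. The series names, periods, and values below are hypothetical.

```python
def align_panel(series, periods):
    """One row per period; None where a series has no published value."""
    return [[values.get(p) for values in series.values()] for p in periods]


def trailing_gaps(series, periods):
    """For each series, count the most recent periods with no published value."""
    gaps = {}
    for name, values in series.items():
        missing = 0
        for p in reversed(periods):
            if p in values:
                break
            missing += 1
        gaps[name] = missing
    return gaps


periods = ["2012-06", "2012-07", "2012-08", "2012-09", "2012-10", "2012-11"]
series = {
    # Quarterly: only the Q2 figure is out, Q3 is not yet released.
    "gdp_quarterly": {"2012-06": 0.4},
    # Monthly, published with a two-month delay.
    "industrial_production": {
        "2012-06": 98.0, "2012-07": 98.1, "2012-08": 98.4, "2012-09": 97.9,
    },
    # Monthly, published with a one-month delay.
    "retail_sales": {
        "2012-06": 101.1, "2012-07": 101.2, "2012-08": 101.0,
        "2012-09": 101.5, "2012-10": 101.9,
    },
    # Monthly, available almost immediately.
    "confidence_survey": {
        "2012-06": 95.0, "2012-07": 95.5, "2012-08": 96.0,
        "2012-09": 96.2, "2012-10": 96.8, "2012-11": 97.0,
    },
}

panel = align_panel(series, periods)
gaps = trailing_gaps(series, periods)
```

Here the last row of `panel` holds a value only for the confidence survey, which is the situation a nowcasting model has to handle.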



                                    Patents
                                    Search results
                                    Advertisement spending
                                    LinkedIn info
                                    Web traffic
                                    Assets
                                    Liabilities
                                    Tweets
                                    Press
                                    Website updates



TIE WITH “BIG DATA”

                                   Need to gather signals in large quantities

                      Machine learning as a pre-processing step
                          and to integrate discrete events
               Example: companies in a sector that receive investment




TIE WITH ESTIMATION THEORY


                                            Beneath all this:
                               Getting to a variable not directly observable
                                     with the help of measured signals

            Replacing probability distributions from physical models with
                            machine-learned knowledge
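
To illustrate estimating a variable that is not directly observable from measured signals, here is a minimal sketch of a scalar Kalman filter, a standard estimation-theory tool. It assumes a random-walk state and hand-picked noise variances, which is the part a learned model would replace. The slides do not say which estimator is actually used; the function name, parameters, and readings are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk hidden state."""
    x, p = x0, p0          # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed to drift, so uncertainty grows.
        p = p + process_var
        # Update: blend the prediction with the new measurement.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical readings of a hidden quantity whose true value is 5.0.
# They are kept constant so the behaviour is easy to follow.
readings = [5.0] * 50
estimates = kalman_1d(readings, process_var=0.01, measurement_var=1.0)
```

Starting from a deliberately wrong initial guess of 0.0, the estimate moves toward the measured level as evidence accumulates.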




METHODOLOGIES

                                        Vector auto-regression
                      Challenge with a large number of signals (predictors):

                                   Curse of dimensionality when applying VAR

                                      Machine Learning approach
                                             Own solution: ziggurat
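
The curse of dimensionality noted above follows from simple counting: a VAR with k signals and p lags has k * k * p lag coefficients plus k intercepts. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def var_parameter_count(n_signals, n_lags, intercept=True):
    """Number of free coefficients in a VAR(n_lags) over n_signals series."""
    lag_coefficients = n_signals * n_signals * n_lags
    intercepts = n_signals if intercept else 0
    return lag_coefficients + intercepts


small = var_parameter_count(3, 2)      # 3 signals, 2 lags
large = var_parameter_count(1000, 2)   # 1000 signals, 2 lags
```

With 3 signals and 2 lags there are 21 coefficients; with 1000 signals and 2 lags there are 2,001,000, far more than a quarterly history can support.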



TIME SERIES + MACHINE LEARNING




                        avg, std dev, model params → Δrevenue
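
A minimal sketch of turning one signal's time series into the kind of features listed above: the average, the standard deviation, and one model parameter. The model parameter here is assumed to be a lag-1 autoregressive coefficient; the slides do not say which time-series model is fitted, and the learner that maps these features to Δrevenue is not shown.

```python
from statistics import mean, pstdev


def series_features(values):
    """Summarise a time series as a small feature dictionary."""
    avg = mean(values)
    std = pstdev(values)  # population standard deviation
    centered = [v - avg for v in values]
    # Lag-1 coefficient: regress the centered series on its own previous value.
    num = sum(cur * prev for cur, prev in zip(centered[1:], centered[:-1]))
    den = sum(prev * prev for prev in centered[:-1])
    ar1 = num / den if den else 0.0
    return {"avg": avg, "std_dev": std, "ar1_coef": ar1}


# Hypothetical signal, e.g. weekly web traffic in arbitrary units.
features = series_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

Each company and signal then yields one fixed-length feature row, which is what a standard machine-learning model expects as input.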

OUR PIPELINE FOR NOWCASTING


                                   Clustering companies into sets (ML)

                                           Signal gathering
                                        Time series processing
                                     ML with a model for each cluster

                             → Revenue for each company and each cluster
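
The pipeline above can be sketched end to end under strong simplifications. Clustering is replaced by a given sector label, each company has a single already-processed signal, and the per-cluster model is a one-variable least-squares line. None of this is the authors' implementation; every name and number is hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Needs at least two distinct x values.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def nowcast_revenue(companies, cluster_of):
    # Step 1: group companies (a stand-in for ML clustering).
    clusters = defaultdict(list)
    for company in companies:
        clusters[cluster_of(company)].append(company)

    per_company, per_cluster = {}, {}
    for label, members in clusters.items():
        # Steps 2-4: fit one model per cluster on companies with known revenue.
        known = [c for c in members if c["revenue"] is not None]
        slope, intercept = fit_line(
            [c["signal"] for c in known], [c["revenue"] for c in known]
        )
        total = 0.0
        for c in members:
            if c["revenue"] is not None:
                estimate = c["revenue"]
            else:
                estimate = slope * c["signal"] + intercept
            per_company[c["name"]] = estimate
            total += estimate
        per_cluster[label] = total
    return per_company, per_cluster


companies = [
    {"name": "A", "sector": "retail", "signal": 1.0, "revenue": 10.0},
    {"name": "B", "sector": "retail", "signal": 2.0, "revenue": 20.0},
    {"name": "C", "sector": "retail", "signal": 3.0, "revenue": None},
    {"name": "D", "sector": "software", "signal": 1.0, "revenue": 5.0},
    {"name": "E", "sector": "software", "signal": 3.0, "revenue": 9.0},
    {"name": "F", "sector": "software", "signal": 2.0, "revenue": None},
]
per_company, per_cluster = nowcast_revenue(companies, lambda c: c["sector"])
```

Fitting a separate model per cluster lets the same signal carry a different weight in different kinds of companies, which matches the output of the pipeline: revenue per company and per cluster.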




TECHNOLOGIES




DATA SCIENCE AT
                                   GROWTH INTELLIGENCE




                                           :D




