Small Data Sets to Scale:
  Planning for the Evolution of Data

      Poornima Vijayashanker
     CEO & Founder BizeeBee
     poornima@bizeebee.com
           @poornima
       www.femgineer.com
AGENDA
I. Stealth Mode - “pre-data” phase
II. Launch
III. Compute Growth Rate
IV. Optimizations
V. Data Storage
Pre-Data
Stealth Mode - “pre-data” phase
Small initial data set
  Easy storage
  Storage solutions like Heroku, RackSpace
  Design features around it

Simplicity of Storage v. Complexity of Design
  e.g. Mint - 3 months of financial data, FB - social graph is limited to universities
0 to 100k to 1M
0 - 100k easiest schema design
  Single DB - with user & static data
  Single instance of app accessing the db

100k - 1M+ time to re-design db and app
  Break up databases - user & static
  Multiple instances of the app
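
A minimal sketch of the "break up databases" step, assuming two in-memory SQLite databases stand in for the separate user and static stores. The table names, columns, and the plan_label_for helper are hypothetical and not from the talk; the point is only that a join across the two stores moves into application code.

```python
import sqlite3

# Assumption: one database holds per-user rows, the other holds static reference data.
user_db = sqlite3.connect(":memory:")
static_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, plan_id INTEGER)")
static_db.execute("CREATE TABLE plans (id INTEGER PRIMARY KEY, label TEXT)")

static_db.execute("INSERT INTO plans (id, label) VALUES (1, 'free')")
user_db.execute("INSERT INTO users (id, name, plan_id) VALUES (1, 'ada', 1)")


def plan_label_for(user_id):
    """Look up a user's plan label by querying each database in turn."""
    row = user_db.execute(
        "SELECT plan_id FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None  # unknown user
    plan = static_db.execute(
        "SELECT label FROM plans WHERE id = ?", (row[0],)
    ).fetchone()
    return plan[0] if plan else None
```

Each app instance would open its own connections to both stores; the static store changes rarely, so it is the easier of the two to replicate or cache.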
Growth Rate
What is your user growth rate?
  Basic unit e.g. Mint - transaction
  User generated content
  Size of unit e.g. FB - photo

Storage capacity v. Seek v. Size
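
One way to turn these questions into a number is a rough projection: users times units per user times size of unit, summed month by month. The sketch below assumes compound monthly user growth and a constant unit size; the function name and parameters are illustrative, not from the talk.

```python
def project_storage_bytes(users, monthly_growth, units_per_user_per_month,
                          bytes_per_unit, months):
    """Estimate total bytes stored after `months`.

    Assumes every user creates the same number of units each month and
    that the user base grows by `monthly_growth` (0.10 means 10%) per month.
    """
    total = 0
    for _ in range(months):
        # Data added this month by the current user base.
        total += users * units_per_user_per_month * bytes_per_unit
        # Grow the user base for the next month.
        users = users * (1 + monthly_growth)
    return total
```

For example, project_storage_bytes(1000, 1.0, 10, 100, 2) models a user base that doubles after the first month: 1,000,000 bytes in month one plus 2,000,000 in month two.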
Optimizations
Capacity - throw hardware at it
Seek - throw software at it
  Cache data

Size - design around it
  Limit usage size e.g. 4MB picture
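
A small sketch of the two software-side ideas on this slide: caching repeated reads and capping the size of a unit. It assumes an in-process cache via functools.lru_cache and reads "4MB" as 4 * 1024 * 1024 bytes; load_profile and accept_upload are hypothetical names, and the dictionary counter only simulates trips to a slow store.

```python
from functools import lru_cache

# Assumption: "4MB" is taken to mean 4 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 4 * 1024 * 1024

# Counts how often the simulated slow store is actually hit.
calls = {"count": 0}


@lru_cache(maxsize=1024)
def load_profile(user_id):
    """Pretend to fetch a profile from the database; repeat calls hit the cache."""
    calls["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def accept_upload(payload: bytes) -> bool:
    """Reject uploads larger than the configured limit."""
    return len(payload) <= MAX_UPLOAD_BYTES
```

The cache hands back the same dictionary object on every hit, so callers should treat it as read-only. A real deployment with multiple app instances would more likely use a shared cache than an in-process one.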
Optimizations Cont’d
Code Level
  Processes - Computation v. Retrieval
  DB Techniques - Index, De-Normalize
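
As an illustration of "Index, De-Normalize" and of trading computation for retrieval, the sketch below uses an in-memory SQLite database. The transactions and user_totals tables are hypothetical: the index speeds up lookups by user, and the de-normalized running total is updated on write so reads do not need to sum the raw rows.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
# De-normalized copy: one precomputed total per user.
db.execute("CREATE TABLE user_totals (user_id INTEGER PRIMARY KEY, total REAL)")
# Index so lookups by user_id do not have to scan the whole table.
db.execute("CREATE INDEX idx_transactions_user ON transactions (user_id)")


def add_transaction(user_id, amount):
    """Insert the raw row and update the stored total in one transaction."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO transactions (user_id, amount) VALUES (?, ?)",
            (user_id, amount),
        )
        cur = db.execute(
            "UPDATE user_totals SET total = total + ? WHERE user_id = ?",
            (amount, user_id),
        )
        if cur.rowcount == 0:  # first transaction for this user
            db.execute(
                "INSERT INTO user_totals (user_id, total) VALUES (?, ?)",
                (user_id, amount),
            )


def total_for(user_id):
    """Retrieve the precomputed total instead of summing transactions."""
    row = db.execute(
        "SELECT total FROM user_totals WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else 0.0
```

The cost of de-normalizing is that every write now touches two tables, which is usually acceptable when reads far outnumber writes.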

Data Level
  Partitioning: Siloed v. Interconnected
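
A sketch of partitioning for the siloed case, under the assumption that each user's rows are only ever read alongside that same user's other rows, so a stable hash of the user id can pick the partition. The shard_for function is hypothetical. Interconnected data such as a social graph does not split this cleanly, because one query can span many users.

```python
import hashlib


def shard_for(user_id: int, shard_count: int) -> int:
    """Map a user id to a partition number in [0, shard_count)."""
    # hashlib gives the same digest on every run and machine, unlike the
    # built-in hash(), which is salted per process for strings.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Changing shard_count remaps most users under this scheme, so the number of partitions is worth choosing with the projected growth rate in mind.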
Data Storage
Single User’s Data v. Aggregated Data
  Single user’s data v. data aggregated across users
  e.g. Mint - Spending Trends
  Scheme to compute, store, and retrieve aggregated data
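
A minimal sketch of a compute, store, and retrieve scheme for aggregated data, assuming a periodic job recomputes the aggregate and readers only ever touch the stored result. The sample transactions, the aggregate_store dictionary (standing in for a separate table or warehouse), and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-user rows: (user_id, category, amount).
transactions = [
    (1, "groceries", 40.0),
    (1, "rent", 900.0),
    (2, "groceries", 60.0),
]

# Stands in for a separate aggregates table or data warehouse.
aggregate_store = {}


def compute_spending_trends(rows):
    """Compute: average spend per category across the users who spent in it."""
    totals = defaultdict(float)
    users = defaultdict(set)
    for user_id, category, amount in rows:
        totals[category] += amount
        users[category].add(user_id)
    return {category: totals[category] / len(users[category]) for category in totals}


def refresh_aggregates():
    """Store: run on a schedule, away from the per-user request path."""
    aggregate_store["avg_spend_by_category"] = compute_spending_trends(transactions)


def average_spend(category):
    """Retrieve: read the stored aggregate without rescanning user data."""
    return aggregate_store.get("avg_spend_by_category", {}).get(category)
```

Keeping the aggregate apart from single-user data means a trends page never scans every user's rows at request time, at the price of the numbers being as stale as the last refresh.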
Conclusion
  Start small - provide enough value to user
  Monitor & project growth rate of data
  Break data apart
  Simple optimizations - indexing, de-normalizing, caching
  Large data sets - warehousing, partitioning db
  Hiring designer & engineer for BizeeBee :)

P. Vijayashanker, From Small Datasets to Scale: Planning for the Evolution of Data, Social Developer Summit
