Bigtop Working Group
Elance 6/27/2013
DC, Absolute SW: Intro to BWG, intro to team
Roman, Cloudera: Bigtop creator
Marshall/Ryan, Palomino Labs: BenchPress
Thank You Sponsors for the Donations!
● Elance: meeting space/food. Post your Hadoop jobs here!
● DocuSign/SF: meeting space/food
● Cloudera
● DataPipe: $500 credit, free time for people doing POCs (Gary?)
● Safari Books Online: 30-day donation
● Amazon AWS: $100 in credits
Poll
● How many are managing POCs?
● How many are looking to make a career change into Hadoop*?
Intro
● Technical Architect @ABSW; 2 POCs, HBase & Storm (Mongo doesn't count)
● POC example (overly simplistic):
– Write performance: incoming data, save to disk
– Read performance: read time; table scans are awful for browser interaction (reporting)
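The read/write split in the POC example can be made concrete with a toy measurement. The sketch below is a minimal illustration in plain Python, not the HBase or Storm setup from the POCs: it writes records to a local file, then compares a full scan with a lookup through an in-memory offset index. The file layout and the function names (`write_records`, `scan_lookup`, `build_index`, `indexed_lookup`) are assumptions made for illustration only.

```python
import os
import tempfile
import time


def write_records(path, n):
    """Write n 'key,value' records to disk; return elapsed seconds."""
    start = time.perf_counter()
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for i in range(n):
            f.write(f"{i},value-{i}\n")
    return time.perf_counter() - start


def scan_lookup(path, key):
    """Full scan: read records from the start until the key is found."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            k, v = line.rstrip("\r\n").split(",", 1)
            if k == key:
                return v
    return None


def build_index(path):
    """One pass over the file, recording key -> byte offset of its record."""
    index = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            index[line.split(b",", 1)[0].decode("utf-8")] = offset
    return index


def indexed_lookup(path, index, key):
    """Seek straight to the record instead of scanning the file."""
    offset = index.get(key)
    if offset is None:
        return None
    with open(path, "rb") as f:
        f.seek(offset)
        line = f.readline().decode("utf-8").rstrip("\r\n")
    return line.split(",", 1)[1]


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "records.csv")
        n = 200_000
        print(f"write {n} records: {write_records(path, n):.3f}s")
        index = build_index(path)
        last_key = str(n - 1)  # worst case for the scan
        _, scan_s = timed(scan_lookup, path, last_key)
        _, index_s = timed(indexed_lookup, path, index, last_key)
        print(f"full scan lookup: {scan_s:.4f}s")
        print(f"indexed lookup:   {index_s:.6f}s")
```

Timings depend on the machine, so no numbers are quoted here. The point is only the shape of the comparison: the scan cost grows with the amount of data, while the keyed lookup does not.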
Slideshare:
POCs
● Proof of Concept, to verify scope, architecture
and cost
● A BigData Stack Implementation consists of:
1) DevOps
2) Application (e.g. Astyanax)
3) Internals: Cloudera/MapR/HW. We don't cover internals. Take CS346, please! GitHub/redbase
– We cover 1) and some of 2). For a POC?
Small vs. Large POCs
● GM >>$1M; at $5-$10M, hire Cloudera: world experts who cover 1), 2) & 3)
● At $500k-$1M you get 1 year, and most fail
– A high-level person (HLP), ~$200k/year, who doesn't code
– You as a newly hired tech lead or architect
– 1-2+ programmers who know nothing about
Hadoop* but know the business processes
● What happens after this?
Scope creep; HLP adds components; defines effort
● Hadoop alone is not a fit; a VC, >1 year, fails or becomes a zombie project. Effort is extrapolated from the HLP downloading and running wc; the HLP learns from web posts and salespeople
[Diagram: HDFS/Hadoop, HBase, Storm]
HLP gets info from BigData vendors
● The Cassandra vs. Hadoop argument centers on SPOF, but building an application is the difficult part!
– Cassandra vs. HBase; nobody talks about Astyanax. Lethal underspecification of 2).
● You see this in job postings too. J2EE != scalable distributed programming
● Go to Palomino Labs for 2). You have to understand ZooKeeper programming first! PL can do 1) and 3)
● Java concurrency -> ZooKeeper -> scalable distributed apps
HLP && Machine Learning
● BigData == Machine Learning: find someone who knows R/Mahout. Same job listings that ask for J2EE
● R & Mahout aren't used in production.
● For this to work you have to be a GOOD server
programmer first. Not someone who downloads Tomcat
and figures out how to stub out REST calls.
● Separate track TBD w/Charles Nainen. Need a sample POC w/a sponsoring vendor!
What to do?
● Contribute to Bigtop. Why?
– Teaches you the internals of Bigtop/HBase/Hadoop/Flume and gets you 1) and API practice for 2)
– Add new components to Bigtop
● Hands-on experience w/new components
● Contribute to BenchPress to get to 2) as a first step. Gets you ZK. Still a long way to go
● We don't cover 3). Not on the roadmap
Logistics
● Max 20 ppl. Maple Tree Inn
● Charge $100-$200/month for the room rental.
● Meet every 2 weeks to do demos.
● Will cover Bigtop & BenchPress; Storm in a future session
● First session is 3 meetings only. We reserve the right to stop these if we run out of time
Not a class
● Cloudera/HW/MapR have classes, $800-$1k/day for 3 days to 1 week
● They have to charge this to pay someone's
salary to create material for you.
● We don't replace this. I took these classes. Not
going to steal their material. You will have to
read and write test code.
● Same information a new Cloudera employee gets
● This works if the consultants get new business
and we get open source code contributions.
Group POC?
● Please talk to Roman/Bruno/Ryan/Charles if you have funding
which gets them new business
● POC mentors
– Ryan/Marshall: Application on Hadoop*. 9 people. Any POC
– Ron: Chef/Bigtop group project; MongoDB in production
– Roman/Bruno: Hadoop* POC
– DC: Hadoop/Storm
– Charles Nainen: ML POC. Need data/problem description/$
● We can do a group POC. Talk to a POC mentor
Group Sign Up Sheets
● Group POC
● Form groups of 2+; $500k-$1M POCs are good business, but you have to do them as a team.
● Skill shortage; not a budget issue
● Working Group Session Signup for Safari
subscription and AWS codes.
– June 30, July 14th, July 28th.
