Overview
SCALE14x 2016
Agenda/Schedule
-Apache Bigtop Overview
-Apache Spark Overview/Getting Started
-Lunch Break
-Apache Ignite
-Workshop, tutorial, open time
http://workshops.bigtop.rocks
(click on Agenda button)
What is Bigtop?
Setting the standard for testing, packaging and
integration of leading big/fast data components
and many other…
Components as Building Blocks
Dependency Hell!!
hdfs
zookeeper
hbase
kafka
spark
...
mapred
oozie
hive
etc
Build all the Things!!!
The BOM
Bill of Materials (BOM)
* List of >=1 components
* Gradle for build/actions
* Produce sets of debs/rpms
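The BOM idea above can be sketched in a few lines of Python: a list of components, each with its dependencies, resolved into a build order and then into one package name per component. This is only an illustration, not Bigtop's Gradle tooling; the dependency edges, the `BOM` dictionary, and the `build_order` and `package_names` helpers are all assumptions made up for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical BOM: component -> components it depends on.
# These edges are illustrative assumptions, not Bigtop's actual BOM.
BOM = {
    "zookeeper": [],
    "hdfs": [],
    "hbase": ["hdfs", "zookeeper"],
    "kafka": ["zookeeper"],
    "spark": ["hdfs"],
    "hive": ["hdfs"],
}


def build_order(bom):
    """Return components ordered so each one comes after its dependencies."""
    return list(TopologicalSorter(bom).static_order())


def package_names(bom, fmt):
    """Name one package per component, in build order (fmt: 'deb' or 'rpm')."""
    return [f"{name}.{fmt}" for name in build_order(bom)]


if __name__ == "__main__":
    print(build_order(BOM))
    print(package_names(BOM, "deb"))
```

A circular dependency in the BOM makes `static_order()` raise `graphlib.CycleError`, so a broken component list fails before anything is built.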
Bigtop Origins
Yahoo!, 2010
Created, fostered early Hadoop community
Working on Hadoop 0.20 stack
2011
Yahoo! engineers move to Cloudera, solving early problems of packaging and
maintaining the first commercially supported Hadoop distro
Early value add
Provide a common foundation for proper integration of
growing number of Hadoop family components
Foundation provides solid base for validating applications
running on top of the stack(s)
Provide neutral packaging and deployment/config
Early Mission Accomplished
Foundation for commercial Hadoop distros/services
Leveraged by app providers…
What now?
We are done, right?!
Industry/Ecosystem Evolution
&
New Community Needs/Ideas
Where should we spend our time?
Which users should benefit?
Moving beyond out-of-the-box MapReduce…
Lambda/Stream Architectures
HDFS + Zookeeper +
Get out from the Apache dome
New focus and target end users
Data engineers vs distro builders
Enhance Operations/Deployment
Reference implementations & tutorials
Laying new foundation with 1.0+
Self-starter, non-kitchen sink building
-Making gradle tooling smarter
-Jenkins job autogen
-leveraging containers for parallelization
Data data data…
Smarter/Realistic test data
-bigpetstore
-bigtop-bazaar
-weather data gen
Tutorial/Learning Data sets
-githubarchive.org
-more tbd…
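As a rough illustration of what a test data generator does, the sketch below produces seeded, repeatable synthetic transactions. It is not the bigpetstore code; the product catalog, store locations, record fields, and the `generate_transactions` function are hypothetical names chosen for the example.

```python
import random

# Hypothetical catalog and store locations, made up for illustration.
PRODUCTS = ["dog food", "cat litter", "fish tank", "bird seed"]
STATES = ["CA", "NY", "TX", "WA"]


def generate_transactions(count, seed=42):
    """Return `count` synthetic purchase records; the same seed gives the same data."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    rows = []
    for txn_id in range(count):
        rows.append({
            "id": txn_id,
            "state": rng.choice(STATES),
            "product": rng.choice(PRODUCTS),
            "price": round(rng.uniform(1.0, 100.0), 2),
        })
    return rows


if __name__ == "__main__":
    for row in generate_transactions(3):
        print(row)
```

Fixing the seed keeps test runs reproducible, which matters when the same data set is used to validate a stack across builds.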
Deployment/Mgmt
Updated puppet modules
-newest best practices
-next level enhanced security options
Wider range of starter deployment topologies
Include some handling of test/tutorial data
More components…
Sounds interesting, how can I help?
*Join mailing list, ask questions, suggest features, etc
*Contribute (components, tutorials, docs)
*Report bugs
Thank You, Q&A
Nate D’Amico
kaiyzen@apache.org
@kaiyzen

scale14x-bigtop-overview-roadmap