Apache Kylin Open Source Journey for QCon2015 Beijing

Luke Han, Co-Founder & CEO at Kyligence Inc.
Apache Kylin
Open Source Journey
韩卿 | Luke Han
Co-Creator & PMC Member
lukehan@apache.org
2015-04-25
Agenda
• About Apache Kylin
• Kylin Open Source Journey
• Apache Incubating
• Build Community and Ecosystem
• The Good, The Bad and The Ugly
• Q&A
About Apache Kylin (麒麟)
Extreme OLAP Engine for Big Data
http://kylin.io
Kylin is an open source distributed analytics engine that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets.
• First Apache project open sourced by eBay Inc.
• First Apache project fully contributed by eBay CCOE
• Open sourced on Oct 1st, 2014
• Accepted as an Apache Incubator project on Nov 25th, 2014
• Apache Kylin is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator.
Technical Challenges
• Huge data volumes
– Table scan
• Big table joins
– Data shuffling
• Analysis at different granularities
– Runtime aggregation is expensive
• MapReduce jobs
– Batch processing
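The challenges above are the usual argument for pre-computing aggregates instead of aggregating at query time. The following is a minimal Python sketch of that idea, not Kylin's implementation: the fact rows, the column names, and the `build_cube` and `query` helpers are hypothetical, and an in-memory dict stands in for a key-value store.

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical fact rows: (year, country, category, sales amount).
ROWS = [
    (2013, "US", "Toys", 10.0),
    (2013, "CN", "Toys", 5.0),
    (2014, "US", "Books", 7.5),
    (2014, "US", "Toys", 2.5),
]
DIMENSIONS = ("year", "country", "category")


def build_cube(rows, dimensions):
    """Pre-aggregate SUM(amount) for every subset of dimensions (every cuboid)."""
    cube = defaultdict(float)
    indexes = range(len(dimensions))
    for size in range(len(dimensions) + 1):
        for dims in combinations(indexes, size):
            names = tuple(dimensions[i] for i in dims)
            for row in rows:
                values = tuple(row[i] for i in dims)
                cube[(names, values)] += row[-1]
    return dict(cube)


def query(cube, **filters):
    """Answer SUM(amount) for equality filters with one key lookup, no table scan."""
    names = tuple(d for d in DIMENSIONS if d in filters)
    values = tuple(filters[d] for d in names)
    return cube.get((names, values), 0.0)


CUBE = build_cube(ROWS, DIMENSIONS)
print(query(CUBE, year=2014))                      # 10.0
print(query(CUBE, country="US", category="Toys"))  # 12.5
```

The trade-off is storage and build time for query speed: n dimensions yield up to 2^n cuboids, which is why the build runs offline as a batch job while queries become lookups.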
Apache Kylin Architecture
• Clients: 3rd Party App (Web App, Mobile…) via REST API; SQL-Based Tool (BI Tools: Tableau…) via JDBC/ODBC
• Kylin: REST Server, Query Engine, Routing, Metadata, Cube Build Engine (MapReduce, Streaming…)
• Data: Hadoop / Hive (Star Schema Data); OLAP Cube in HBase (Key Value Data)
• SQL paths after routing: Low Latency (seconds) and Mid Latency (minutes)
➢ Online Analysis Data Flow
➢ Offline Data Flow
➢ Clients/Users interact with Kylin via SQL
➢ OLAP Cube is transparent to users
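The routing step in the architecture can be illustrated with a short Python sketch. It rests on an assumption that goes beyond the slide text: that a query covered by a pre-built cube takes the low-latency path and anything else falls back to Hive. The `CubeDesc` class, the `route` function, and the example cube are hypothetical names, not Kylin APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeDesc:
    """Hypothetical description of what a pre-built cube can answer."""
    name: str
    dimensions: frozenset
    measures: frozenset


def route(query_dims, query_measures, cubes):
    """Pick the cube path when a cube covers the query, else fall back to Hive."""
    for cube in cubes:
        if set(query_dims) <= cube.dimensions and set(query_measures) <= cube.measures:
            return ("cube", cube.name)   # low latency: seconds
    return ("hive", None)                # mid latency: minutes


CUBES = [
    CubeDesc("sales_cube", frozenset({"year", "country"}), frozenset({"sum_amount"})),
]

print(route(["year"], ["sum_amount"], CUBES))       # ('cube', 'sales_cube')
print(route(["seller_id"], ["sum_amount"], CUBES))  # ('hive', None)
```

Because the client only submits SQL, the choice of path stays invisible to it, which is what "OLAP Cube is transparent to users" refers to.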
Features
• Extremely Fast OLAP Engine at scale
• ANSI SQL Interface on Hadoop
• Seamless Integration with BI Tools, like Tableau
• Interactive Query Capability
• MOLAP Cube
• Compression and Encoding Support
• Incremental Build of Cubes
• Approximate Query Capability for Distinct Count (HyperLogLog)
• Leverages HBase coprocessors to reduce query latency
• Job Management and Monitoring
• User-friendly web GUI to manage, build, monitor and query cubes
• Security capability to set ACL at Cube/Project Level
• Support LDAP Integration
• Streaming Support Coming soon!
90%tile queries < 5s
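The approximate distinct count feature names HyperLogLog. As a rough illustration of how such a sketch works, here is a minimal Python version. It is an assumption-laden teaching sketch, not Kylin's code: the `HyperLogLog` class, the SHA-1 based hash, and the default precision `p=12` are choices made only for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash: the first p bits pick a register, the rest feed the rank.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        index = x >> (64 - self.p)
        rest = x & ((1 << (64 - self.p)) - 1)
        # Rank = position of the leftmost 1-bit in the remaining 64 - p bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other):
        """Union of two sketches: register-wise maximum."""
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # constant valid for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(50000):
    hll.add(f"user-{i % 20000}")   # 20000 distinct values, with repeats
print(round(hll.count()))          # close to 20000, not exact
```

The sketch has a fixed size regardless of input and two sketches merge by taking register-wise maxima, so per-segment results can be combined without rescanning raw data. That property fits naturally with incremental cube builds.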
Kylin Open Source Journey
• Sep 2013: Initiative
• Jan 2014: POC Completed
• Jun 2014: US Patent Filed
• Jul 2014: V1.0 Beta Released
• Oct 2014: V1.0 GA Released, Open Sourced
• Nov 2014: Apache Incubator Project
• Goal: Apache Top Project
Ready for Open Source
• Open Source from Day One
• Internal vs External
• Intellectual Property
• Legal
• Domain
• License
– Apache/MIT/BSD/GPL…
• Team
Patent
• Why?
• How?
• Patent vs Open Source
Phase I: Open Source on Github
• Code pushed to github.com on Oct 1st, 2014
Phase II: Apache Incubator
• Accepted as an Apache Incubator project on Nov 25th, 2014
Why & How Apache?
• Hadoop Ecosystem Home
• Branding
• Community
• The Apache Way
Incubation Progress
• IPMC & PPMC
• Mentors and Champion
• Committers
Incubator Project Proposal
Infrastructure Setup
• Mailing List
– Private@
– Dev@
• Source Code Repo
– git & svn
– Migration
• Website
• JIRA
• Wiki
IP Clearance & Release
• Kylin for brand name?
• Apache License
• GPL Dependency?
• Apache Release
• README, LICENSE, NOTICE, DISCLAIMER
• Source Headers
• Licensing of dependencies
• Binaries
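The "Source Headers" item lends itself to automation. Below is a minimal Python sketch of such a check, assuming that source files carry the standard ASF header near the top. The `files_missing_header` helper, the `.java` default, and the 20-line scan window are hypothetical choices for this example; projects commonly rely on a dedicated audit tool such as Apache Rat instead.

```python
import tempfile
from pathlib import Path

# Opening words of the standard ASF source header.
HEADER_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def files_missing_header(root, suffixes=(".java",), scan_lines=20):
    """Return source files under root whose first lines lack the header marker."""
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            head = "".join(line for _, line in zip(range(scan_lines), handle))
        if HEADER_MARKER not in head:
            missing.append(path)
    return missing


# Demo on a throwaway directory with one compliant and one non-compliant file.
with tempfile.TemporaryDirectory() as tmp:
    good = "/* " + HEADER_MARKER + " */\nclass Good {}\n"
    Path(tmp, "Good.java").write_text(good, encoding="utf-8")
    Path(tmp, "Bad.java").write_text("class Bad {}\n", encoding="utf-8")
    MISSING = [p.name for p in files_missing_header(tmp)]

print(MISSING)  # ['Bad.java']
```

Running a check like this before each release candidate keeps header problems from surfacing during the release vote.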
Team onboard Apache Way
• Community then Code
• Mailing list discussions
• Vote
• Code Quality and Style
• JIRA for each issue, feature
• Merge Pull Request
• Recruiting contributor/committer
How to contribute?
• Join mailing list:
– dev@kylin.incubator.apache.org
• Create JIRA or Leave Comments
• Pull Request/Patch to Apache Github Mirror
Graduate to Top Project
• Diversity
• Complete (and sign off) tasks documented in the status file
• Ensure suitability for project name and product name
• Demonstrate ability to create Apache releases
• Demonstrate community readiness
• Ensure that mentors and the IPMC have no remaining issues
Ready to Apache?
Build Community and Ecosystem
• What’s community?
• How to grow community?
• Community over Code!
Marketing - Website
• http://kylin.io
– Hosted on github.io (Github Pages)
– Hosted on Apache Infra Server
– http://kylin.incubator.apache.org
Marketing - Blog
• Publish via eBay Tech Blog to gain focus from industry
• http://www.ebaytechblog.com/2014/10/20/announcing-kylin-extreme-olap-engine-for-big-data
“Like arch-rival Amazon.com, the soon-to-split eBay Inc. is something of an oddity in that it hasn’t historically been a big contributor to the open-source community. But the e-commerce pioneer hopes to change that with the release of the source-code for a homegrown online analytics processing (OLAP) engine that promises to speed up Hadoop while also making it more accessible to everyday enterprise users.”
-- siliconangle.com
Marketing – Social Media
• Github
– KylinOLAP
• Twitter
– @ApacheKylin
• Hacker News
• Facebook
– Page: kylin.io
• LinkedIn
– Group: Kylin
• WeChat (微信)
– ApacheKylin
• …
Marketing - Media
• InfoQ
• CSDN
• OSChina
• …
Build Community – Mailing List
Build Community – Meetup
• Hive Meetup Bay Area, Dec 2014
• Apache Kylin Meetup Bay Area, Dec 2014
• Apache Kylin Tech Talk @AWS Seattle, Dec 2014
• Apache Kylin Meetup Beijing, Dec 2014
• Spark Meetup Bay Area, March 2015
• Kylin Meetup in China, coming soon
• …
Build Community – Conference
• Big Data Summit Shanghai, Oct 2014
• Big Data Technology Conference Beijing, Dec 2014
• Database Technology Conference Beijing, April 2015
• Hadoop Summit Europe, April 2015
• QCon Beijing, April 2015
• Strata+Hadoop World London, May 2015
• HBaseCon San Francisco, May 2015
• Hadoop Summit San Jose, June 2015
• …
Know your community
• Google Analytics
• Github Statistics
• Mailing List
• WeChat
• …
Apache Kylin Ecosystem
• Kylin Core
– Fundamental framework of the Kylin OLAP engine
• Extension
– Plugins to support additional functions and features
– Security, Redis Storage, Spark Engine, Docker
• Integration
– Lifecycle management support to integrate with other applications like BI tools
– ODBC Driver, ETL, Drill, SparkSQL
• Interface
– Allows third-party users to build more features via user interface atop Kylin core
– Web Console, Customized BI, Ambari/Hue Plugin
Apache Kylin Evolution Roadmap
• Initial: Prototype for MOLAP
– Basic end-to-end POC
• MOLAP
– Incremental Refresh
– ANSI SQL
– ODBC Driver
– Web GUI
– ACL
– Open Source
• HOLAP
– Streaming OLAP
– JDBC Driver
– New GUI
– Excel Support
– SparkSQL
– … more
• Next Gen
– Lambda Arch
– Automation
– Capacity Management
– In-Memory Analysis (TBD)
– Spark (TBD)
– Mobile (TBD)
– … more
• Future…: TBD
Timeline markers (2013 to 2015): Sep 2013, Jan 2014, Sep 2014, H1 2015
Excellence of Engineering
Recruit best people
Done is better than perfect
Do academic research
Explain design in simple words
Everyone does dirty work
You write first version, I write second one
Debate, Decision & Delivery
Team Philosophy
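The Good
• Visibility and reputation
• Personal growth
• Team culture
• Project quality
• Sense of accomplishment
• Top engineers as your neighbors
The whole world is watching you and your code!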
The Bad
• Lower development efficiency
• Internal project schedule vs external support and issues
• Spare time
• Roadmap and feature requests from external users
The Ugly
• Open source does not mean free
• Please respect open source authors
• Ask questions the right way
If you want to go fast, go alone.
If you want to go far, go together.
-- African Proverb
• Kylin Site:
– http://kylin.incubator.apache.org
– http://kylin.io
• Twitter:
– @ApacheKylin
• WeChat (微信)
– ApacheKylin
Apache Kylin
@InfoQ infoqchina
Apache Kylin Open Source Journey for QCon2015 Beijing

  • 1. Apache Kylin Open Source Journey 韩卿 | Luke Han Co-Creator & PMC Member lukehan@apache.org 2015-04-25
  • 2. Agenda • About Apache Kylin • Kylin Open Source Journey • Apache Incubating • Build Community and Ecosystem • The Good, The Bad and The Ugly • Q&A
  • 3. About Apache Kylin (麒麟) Extreme OLAP Engine for Big Data http://kylin.io Kylin is an open source Distributed Analytics Engine that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets • First Apache project open sourced by eBay Inc. • First Apache project fully contributed by eBay CCOE • Open sourced on Oct 1st, 2014 • Accepted as an Apache Incubator project on Nov 25th, 2014 • Apache Kylin is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Incubator.
  • 4. Technical Challenges • Huge data volume – Table scan • Big table joins – Data shuffling • Analysis at different granularities – Runtime aggregation is expensive • MapReduce jobs – Batch processing
  • 5. Apache Kylin Architecture (diagram labels) • Clients: 3rd Party App (Web App, Mobile…), SQL-Based Tool (BI Tools: Tableau…) • Interfaces: REST API, JDBC/ODBC, SQL • Kylin components: REST Server, Query Engine, Routing, Metadata, Cube Build Engine (MapReduce, Streaming…) • Data: Hadoop, Hive, Star Schema Data, Data Cube, OLAP Cube (HBase), Key Value Data • Latency paths: Low Latency - Seconds, Mid Latency - Minutes ➢ Online Analysis Data Flow ➢ Offline Data Flow ➢ Clients/users interact with Kylin via SQL ➢ OLAP Cube is transparent to users
  • 6. Features • Extremely Fast OLAP Engine at scale • ANSI SQL Interface on Hadoop • Seamless Integration with BI Tools, like Tableau • Interactive Query Capability • MOLAP Cube • Compression and Encoding Support • Incremental Build of Cubes • Approximate Query Capability for Distinct Count (HyperLogLog) • Leverages HBase Coprocessor for query latency • Job Management and Monitoring • User-friendly Web GUI to manage, build, monitor and query cubes • Security capability to set ACL at Cube/Project Level • LDAP Integration Support • Streaming Support coming soon! • 90th percentile of queries < 5s
  • 7. Agenda • About Apache Kylin • Kylin Open Source Journey • Apache Incubating • Build Community and Ecosystem • The Good, The Bad and The Ugly • Q&A
  • 8. Kylin Open Source Journey • Sep 2013: Initiative • Jan 2014: POC Completed • Jun 2014: US Patent Filed • Jul 2014: V1.0 Beta Released • Oct 2014: V1.0 GA Released, Open Sourced • Nov 2014: Apache Incubator Project • Apache Top Project (undated on the slide)
  • 9. Ready for Open Source • Open Source from Day One • Internal vs External • Intellectual Property • Legal • Domain • License – Apache/MIT/BSD/GPL… • Team
  • 10. Patent • Why? • How? • Patent vs Open Source
  • 11. Phase I: Open Source on GitHub • Code pushed to github.com on Oct 1st, 2014
  • 12. Phase II: Apache Incubator • Accepted as an Apache Incubator project on Nov 25th, 2014
  • 13. Why & How Apache? • Hadoop Ecosystem Home • Branding • Community • The Apache Way
  • 15. Incubator Project Proposal • IPMC & PPMC • Mentors and Champion • Committers
  • 16. Agenda • About Apache Kylin • Kylin Open Source Journey • Apache Incubating • Build Community and Ecosystem • The Good, The Bad and The Ugly • Q&A
  • 17. Infrastructure Setup • Mailing List – private@ – dev@ • Source Code Repo – git & svn – Migration • Website • JIRA • Wiki
  • 18. IP Clearance & Release • Kylin for brand name? • Apache License • GPL Dependency? • Apache Release • README, LICENSE, NOTICE, DISCLAIMER • Source Headers • Licensing of dependencies • Binaries
  • 19. Team Onboarding to the Apache Way • Community over Code • Mailing list discussions • Vote • Code Quality and Style • JIRA for each issue and feature • Merge Pull Requests • Recruiting contributors/committers
  • 20. How to contribute? • Join the mailing list: dev@kylin.incubator.apache.org • Create a JIRA issue or leave comments • Send a pull request/patch to the Apache GitHub mirror
  • 21. Graduate to Top Project • Diversity • Complete (and sign off) tasks documented in the status file • Ensure suitability for project name and product name • Demonstrate ability to create Apache releases • Demonstrate community readiness • Ensure that mentors and the IPMC have no remaining issues
  • 23. Agenda • About Apache Kylin • Kylin Open Source Journey • Apache Incubating • Build Community and Ecosystem • The Good, The Bad and The Ugly • Q&A
  • 24. Build Community and Ecosystem • What is a community? • How to grow the community? • Community over Code!
  • 25. Marketing - Website • http://kylin.io – Hosted on github.io (GitHub Pages) – Hosted on Apache Infra Server – http://kylin.incubator.apache.org
  • 26. Marketing - Blog • Published via the eBay Tech Blog to gain attention from the industry • http://www.ebaytechblog.com/2014/10/20/announcing-kylin-extreme-olap-engine-for-big-data “Like arch-rival Amazon.com, the soon-to-split eBay Inc. is something of an oddity in that it hasn’t historically been a big contributor to the open-source community. But the e-commerce pioneer hopes to change that with the release of the source-code for a homegrown online analytics processing (OLAP) engine that promises to speed up Hadoop while also making it more accessible to everyday enterprise users.” -- siliconangle.com
  • 27. Marketing – Social Media • GitHub – KylinOLAP • Twitter – @ApacheKylin • Hacker News • Facebook – Page: kylin.io • LinkedIn – Group: Kylin • WeChat (微信) – ApacheKylin • …
  • 28. Marketing - Media • InfoQ • CSDN • OSChina • …
  • 29. Build Community – Mailing List
  • 30. Build Community – Meetup • Hive Meetup Bay Area, Dec 2014 • Apache Kylin Meetup Bay Area, Dec 2014 • Apache Kylin Tech Talk @AWS Seattle, Dec 2014 • Apache Kylin Meetup Beijing, Dec 2014 • Spark Meetup Bay Area, March 2015 • Kylin Meetup in China, coming soon • …
  • 31. Build Community – Conference • Big Data Summit Shanghai, Oct 2014 • Big Data Technology Conference Beijing, Dec 2014 • Database Technology Conference Beijing, April 2015 • Hadoop Summit Europe, April 2015 • QCon Beijing, April 2015 • Strata+Hadoop World London, May 2015 • HBaseCon San Francisco, May 2015 • Hadoop Summit San Jose, June 2015 • …
  • 32. Know your community • Google Analytics • GitHub Statistics • Mailing List • WeChat • …
  • 33. Apache Kylin Ecosystem • Kylin OLAP Core – Fundamental framework of the Kylin OLAP engine • Extension – Plugins to support additional functions and features: Security, Redis Storage, Spark Engine, Docker • Integration – Lifecycle management support to integrate with other applications like BI tools: ODBC Driver, ETL, Drill, SparkSQL • Interface – Allows third-party users to build more features via user interfaces atop Kylin core: Web Console, Customized BI, Ambari/Hue Plugin
  • 34. Apache Kylin Evolution Roadmap • 2013: Initial Prototype for MOLAP – Basic end-to-end POC • 2014: MOLAP – Incremental Refresh – ANSI SQL – ODBC Driver – Web GUI – ACL – Open Source • 2015: HOLAP – Streaming OLAP – JDBC Driver – New GUI – Excel Support – SparkSQL – …more • Future (TBD): Next Gen – Lambda Arch – Automation – Capacity Management – In-Memory Analysis (TBD) – Spark (TBD) – Mobile (TBD) – …more • Timeline markers: Sep 2013, Jan 2014, Sep 2014, H1 2015, TBD
  • 35. Team Philosophy: Excellence of Engineering • Recruit the best people • Done is better than perfect • Do academic research • Explain design in simple words • Everyone does dirty work • You write the first version, I write the second one • Debate, Decision & Delivery
  • 36. Agenda • About Apache Kylin • Kylin Open Source Journey • Apache Incubating • Build Community and Ecosystem • The Good, The Bad and The Ugly • Q&A
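  • 37. The Good • Visibility • Personal growth • Team culture • Project quality • Sense of achievement • Having top engineers as neighbors • The whole world is watching you and your code!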
  • 38. The Bad • Lower development efficiency • Internal project schedule vs external support and issues • Spare time • Roadmap and feature requests from external users
  • 39. The Ugly • Open source does not mean free of charge • Please respect open source authors • Ask questions the right way
  • 40. If you want to go fast, go alone. If you want to go far, go together. (African Proverb)
  • 41. Apache Kylin • Kylin Site: – http://kylin.incubator.apache.org – http://kylin.io • Twitter: – @ApacheKylin • WeChat (微信): – ApacheKylin
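
Slides 4 to 6 name runtime aggregation as a bottleneck and list a MOLAP cube as Kylin's answer. The sketch below illustrates that general idea only and is not Kylin's implementation: it pre-aggregates a sum for every combination of dimensions (each combination is a "cuboid"), so a GROUP BY becomes a dictionary lookup instead of a table scan. The function names, the sample rows, and the `sales` measure are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimension columns of a small star-schema fact table.
DIMENSIONS = ("year", "country", "category")


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of dimensions (every cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[measure]
            cube[dims] = dict(cuboid)
    return cube


def query(cube, dimensions, group_by):
    """Answer a GROUP BY from the matching precomputed cuboid, without scanning rows."""
    # Cuboid keys follow the declared dimension order, so normalise the request.
    dims = tuple(d for d in dimensions if d in group_by)
    return cube[dims]


# Illustrative data only.
rows = [
    {"year": 2014, "country": "US", "category": "toys", "sales": 10.0},
    {"year": 2014, "country": "CN", "category": "toys", "sales": 5.0},
    {"year": 2015, "country": "US", "category": "books", "sales": 7.0},
    {"year": 2015, "country": "US", "category": "toys", "sales": 3.0},
]

cube = build_cube(rows, DIMENSIONS, "sales")
by_year = query(cube, DIMENSIONS, {"year"})
by_year_country = query(cube, DIMENSIONS, {"country", "year"})
total = query(cube, DIMENSIONS, set())
```

With three dimensions the cube holds 2^3 = 8 cuboids. The trade-off is the usual one for precomputation: build time and storage grow with the number of dimension combinations in exchange for cheap lookups at query time.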