SlideShare uma empresa Scribd logo
1 de 16
Baixar para ler offline
Hadoop and Hive at Square
Nicolas Thiébaud
!
nicothieb@
nicolas@squareup.com
Data Engineering at Square
July 2014
Square: Make
commerce easy
Remove crappy POSes from the counter
Building the best register for small businesses.
Started with card processing and bringing more
value to merchants using the point of sale.
!
Merchant and Buyer facing products
Square Register, Square Cash, Pickup,
Feedback
!
Data products
Merchant Analytics, Capital
Data at Square
Internal Data
!
Produced on app servers (~200+ services),
mysql or psql
!
Logging and tracing from apps and web
to public endpoint
!
Example: payment data, user data, ledger
entries
External Data
!
Payment processing partners ship flat files
to us
Offline Data usage at Square
!
BI/Analysis/Reporting: ~200 mysql users,
~100 hadoop users
!
ML: Risk detection, recommendation
!
Apps: A/B testing, Commercial support,
Capital
Data Architecture at
Square: Kafka
Historical, most of our users still use this
App DB -> Analytical DB stripping out PII,
cursoring, looking at binlog replication
!
Hadoop: Kafka as a backbone
App DB -> Kafka using cursoring and PII
stripping
App Server -> Kafka (eg: tracing) in proto
format
Feed consumption -> Kafka
!
Kafka written to hdfs using offsets, dupes
are written when the consumer restarts
!
Raw data is deduped and extracted from
protos to rcfiles in daily batches. Everything
is exposed in Hive
Most datasets don’t fit in mysql. Most queries
cannot run anymore
Analysts broke down their jobs to run on single
day windows. The query sniper keeps hitting
them.
!
Mysql no longer supported as source of truth
for offline data. Tables are windowed
We keep revisiting the amount of data stored in
MySQL
!
Everyone must migrate to hive (users and
apps)
Mysql Analytical DBs will now be an export
location for data reduced in Hadoop
!
All datasets must be present in Hadoop
Even small ones :)
Transitioning to Hive
Transitioning to Hive
Stability
!
Hive 10 + Hue 2.5 as starting point + many
patches -> 2 restarts a day with small load
!
Decided to go to hive 12 and patch the
bugs affecting us in an internal build
!
Two major tasks: 10 -> 12 and building
hive internally
Reliability
!
Sentinel, data validation daemon
!
Conduit, hive etls
!
Customer defined SLA’s
Education
!
Office hours, trainings, mailing list
Project Babar: Building
a stable Hive 12
Project Babar: Building a stable Hive 12
Patch open source hive to address
Square specific issues
!
Setup integration tests in kochiku, no
performance test
!
Hiveserver only, no cli. Staging and
production envs
!
Push and pull changes to apache jira
Build and deploy hive artifacts
!
Makefile
!
metastore, hiveserver (staging and prod),
cli tools (beeline), hivesandbox
!
package configuration
Misc
!
hue 3.5
!
hive-udfs
Internal Hive Build
cdh5-0.12.0_5.0.1 branch + 9 commits
3 test fixes, 2 square specific changes (pom
+ ci)
!
DATAPLAT-436 Beeline should return non-
zero on invalid statements
!
HIVE-5799: session/operation timeout for
hiveserver2
HIVE-5707: Validate values for ConfVar
!
HIVE-7040: Allow TCP keep alive on Hive
Server 2
!
(merged in cdh5-0.12.0_5.0.1) HIVE-6893:
out of sequence error in HiveMetastore
Story of HIVE-7040 + HIVE-5799
HIVE-7040: Allow TCP keep alive on Hive
Server 2
F5 stateful firewall kills open connections
HIVE-5799: session/operation timeout for
hiveserver2
Beeline interrupt does not close sessions
Hive Ops trick:
./wait_for_hive_jobs && sudo sv restart /var/service/hiveserver
Next Steps
Figure out the best way to contribute back
patches
!
HIVE-668{3,4}: Beeline comments suck
HIVE-7200: Beeline output displays column
heading even if --showHeader=false is set
HIVE-4924: Support JDBC query timeouts
HIVE-5232: Use async interface for jdbc
!
Hive HA
Shark
Tez?
Hive et Hadoop Usage chez Square

Mais conteúdo relacionado

Mais procurados

Scaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksScaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksDatabricks
 
Azure IaaS-PaaS Migrations - Lessons Learned
Azure IaaS-PaaS Migrations - Lessons LearnedAzure IaaS-PaaS Migrations - Lessons Learned
Azure IaaS-PaaS Migrations - Lessons LearnedJohn Calvert
 
Create a Chatbot with AWS Lex, Lambda, and HERE
Create a Chatbot with AWS Lex, Lambda, and HERECreate a Chatbot with AWS Lex, Lambda, and HERE
Create a Chatbot with AWS Lex, Lambda, and HERENic Raboy
 
Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...
Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...
Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...Databricks
 
Optimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemOptimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemDataWorks Summit
 
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...Spark Summit
 
DOES SFO 2016 - Avan Mathur - Planning for Huge Scale
DOES SFO 2016 - Avan Mathur - Planning for Huge ScaleDOES SFO 2016 - Avan Mathur - Planning for Huge Scale
DOES SFO 2016 - Avan Mathur - Planning for Huge ScaleGene Kim
 
Scylla Summit 2018: Scylla and KairosDB in Smart Vehicle Diagnostics
Scylla Summit 2018: Scylla and KairosDB in Smart Vehicle DiagnosticsScylla Summit 2018: Scylla and KairosDB in Smart Vehicle Diagnostics
Scylla Summit 2018: Scylla and KairosDB in Smart Vehicle DiagnosticsScyllaDB
 
Bridging the Gap Between Datasets and DataFrames
Bridging the Gap Between Datasets and DataFramesBridging the Gap Between Datasets and DataFrames
Bridging the Gap Between Datasets and DataFramesDatabricks
 
Building a Self-Service Big Data Pipeline
Building a Self-Service Big Data PipelineBuilding a Self-Service Big Data Pipeline
Building a Self-Service Big Data PipelineDataWorks Summit
 
Palringo : a startup's journey from a data center to the cloud
Palringo : a startup's journey from a data center to the cloudPalringo : a startup's journey from a data center to the cloud
Palringo : a startup's journey from a data center to the cloudPhilipBasford
 
Apache Kylin and Use Cases - 2018 Big Data Spain
Apache Kylin and Use Cases - 2018 Big Data SpainApache Kylin and Use Cases - 2018 Big Data Spain
Apache Kylin and Use Cases - 2018 Big Data SpainLuke Han
 
Codemotion 2014 4ward time series with MongoDB
Codemotion 2014 4ward time series with MongoDBCodemotion 2014 4ward time series with MongoDB
Codemotion 2014 4ward time series with MongoDBIvan Fioravanti
 
The New Tech Stack for Device Data
The New Tech Stack for Device DataThe New Tech Stack for Device Data
The New Tech Stack for Device DataRyan Tabora
 
RedisConf17 - Redis Powers Next-gen Ambient Intelligence Platform
RedisConf17 - Redis Powers Next-gen Ambient Intelligence PlatformRedisConf17 - Redis Powers Next-gen Ambient Intelligence Platform
RedisConf17 - Redis Powers Next-gen Ambient Intelligence PlatformRedis Labs
 
Rounds analytics pipeline
Rounds analytics pipelineRounds analytics pipeline
Rounds analytics pipelineAviv Laufer
 
Disrupting Big Data with Apache Spark in the Cloud
Disrupting Big Data with Apache Spark in the CloudDisrupting Big Data with Apache Spark in the Cloud
Disrupting Big Data with Apache Spark in the CloudJen Aman
 
Real-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in ActionReal-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in ActionDataWorks Summit
 
Data Driven Decisions at Scale
Data Driven Decisions at ScaleData Driven Decisions at Scale
Data Driven Decisions at ScaleDatabricks
 
Spark - Migration Story
Spark - Migration Story Spark - Migration Story
Spark - Migration Story Roman Chukh
 

Mais procurados (20)

Scaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksScaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber Attacks
 
Azure IaaS-PaaS Migrations - Lessons Learned
Azure IaaS-PaaS Migrations - Lessons LearnedAzure IaaS-PaaS Migrations - Lessons Learned
Azure IaaS-PaaS Migrations - Lessons Learned
 
Create a Chatbot with AWS Lex, Lambda, and HERE
Create a Chatbot with AWS Lex, Lambda, and HERECreate a Chatbot with AWS Lex, Lambda, and HERE
Create a Chatbot with AWS Lex, Lambda, and HERE
 
Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...
Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...
Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to ...
 
Optimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemOptimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystem
 
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
 
DOES SFO 2016 - Avan Mathur - Planning for Huge Scale
DOES SFO 2016 - Avan Mathur - Planning for Huge ScaleDOES SFO 2016 - Avan Mathur - Planning for Huge Scale
DOES SFO 2016 - Avan Mathur - Planning for Huge Scale
 
Scylla Summit 2018: Scylla and KairosDB in Smart Vehicle Diagnostics
Scylla Summit 2018: Scylla and KairosDB in Smart Vehicle DiagnosticsScylla Summit 2018: Scylla and KairosDB in Smart Vehicle Diagnostics
Scylla Summit 2018: Scylla and KairosDB in Smart Vehicle Diagnostics
 
Bridging the Gap Between Datasets and DataFrames
Bridging the Gap Between Datasets and DataFramesBridging the Gap Between Datasets and DataFrames
Bridging the Gap Between Datasets and DataFrames
 
Building a Self-Service Big Data Pipeline
Building a Self-Service Big Data PipelineBuilding a Self-Service Big Data Pipeline
Building a Self-Service Big Data Pipeline
 
Palringo : a startup's journey from a data center to the cloud
Palringo : a startup's journey from a data center to the cloudPalringo : a startup's journey from a data center to the cloud
Palringo : a startup's journey from a data center to the cloud
 
Apache Kylin and Use Cases - 2018 Big Data Spain
Apache Kylin and Use Cases - 2018 Big Data SpainApache Kylin and Use Cases - 2018 Big Data Spain
Apache Kylin and Use Cases - 2018 Big Data Spain
 
Codemotion 2014 4ward time series with MongoDB
Codemotion 2014 4ward time series with MongoDBCodemotion 2014 4ward time series with MongoDB
Codemotion 2014 4ward time series with MongoDB
 
The New Tech Stack for Device Data
The New Tech Stack for Device DataThe New Tech Stack for Device Data
The New Tech Stack for Device Data
 
RedisConf17 - Redis Powers Next-gen Ambient Intelligence Platform
RedisConf17 - Redis Powers Next-gen Ambient Intelligence PlatformRedisConf17 - Redis Powers Next-gen Ambient Intelligence Platform
RedisConf17 - Redis Powers Next-gen Ambient Intelligence Platform
 
Rounds analytics pipeline
Rounds analytics pipelineRounds analytics pipeline
Rounds analytics pipeline
 
Disrupting Big Data with Apache Spark in the Cloud
Disrupting Big Data with Apache Spark in the CloudDisrupting Big Data with Apache Spark in the Cloud
Disrupting Big Data with Apache Spark in the Cloud
 
Real-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in ActionReal-Time Robot Predictive Maintenance in Action
Real-Time Robot Predictive Maintenance in Action
 
Data Driven Decisions at Scale
Data Driven Decisions at ScaleData Driven Decisions at Scale
Data Driven Decisions at Scale
 
Spark - Migration Story
Spark - Migration Story Spark - Migration Story
Spark - Migration Story
 

Destaque

Understanding Hadoop through examples
Understanding Hadoop through examplesUnderstanding Hadoop through examples
Understanding Hadoop through examplesYoshitomo Matsubara
 
HUG France - 20160114 industrialisation_process_big_data CanalPlus
HUG France -  20160114 industrialisation_process_big_data CanalPlusHUG France -  20160114 industrialisation_process_big_data CanalPlus
HUG France - 20160114 industrialisation_process_big_data CanalPlusModern Data Stack France
 
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Modern Data Stack France
 
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...Modern Data Stack France
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with ExamplesJoe McTee
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with sparkModern Data Stack France
 

Destaque (7)

Hug janvier 2016 -EDF
Hug   janvier 2016 -EDFHug   janvier 2016 -EDF
Hug janvier 2016 -EDF
 
Understanding Hadoop through examples
Understanding Hadoop through examplesUnderstanding Hadoop through examples
Understanding Hadoop through examples
 
HUG France - 20160114 industrialisation_process_big_data CanalPlus
HUG France -  20160114 industrialisation_process_big_data CanalPlusHUG France -  20160114 industrialisation_process_big_data CanalPlus
HUG France - 20160114 industrialisation_process_big_data CanalPlus
 
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
 
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
HUG France Feb 2016 - Migration de données structurées entre Hadoop et RDBMS ...
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with Examples
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
 

Semelhante a Hive et Hadoop Usage chez Square

Srikanth hadoop 3.6yrs_hyd
Srikanth hadoop 3.6yrs_hydSrikanth hadoop 3.6yrs_hyd
Srikanth hadoop 3.6yrs_hydsrikanth K
 
Nagarjuna_Damarla
Nagarjuna_DamarlaNagarjuna_Damarla
Nagarjuna_DamarlaNag Arjun
 
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformInMobi Technology
 
Sunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scala
Sunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scalaSunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scala
Sunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scalaMopuru Babu
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceHortonworks
 
Nagarjuna_Damarla_Resume
Nagarjuna_Damarla_ResumeNagarjuna_Damarla_Resume
Nagarjuna_Damarla_ResumeNag Arjun
 
How Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.comHow Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.comSalesforce Engineering
 
Sunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_Spark
Sunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_SparkSunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_Spark
Sunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_SparkMopuru Babu
 
Sunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_Spark
Sunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_SparkSunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_Spark
Sunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_SparkMopuru Babu
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHortonworks
 
Feb 2024 Apache Hudi Community Sync with Daniel Ford
Feb 2024 Apache Hudi Community Sync with Daniel FordFeb 2024 Apache Hudi Community Sync with Daniel Ford
Feb 2024 Apache Hudi Community Sync with Daniel Fordnadine39280
 
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudBring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudDataWorks Summit
 
Summer Shorts: Big Data Integration
Summer Shorts: Big Data IntegrationSummer Shorts: Big Data Integration
Summer Shorts: Big Data Integrationibi
 
HIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha OehlHIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha OehlSascha Oehl
 

Semelhante a Hive et Hadoop Usage chez Square (20)

Srikanth hadoop 3.6yrs_hyd
Srikanth hadoop 3.6yrs_hydSrikanth hadoop 3.6yrs_hyd
Srikanth hadoop 3.6yrs_hyd
 
Nagarjuna_Damarla
Nagarjuna_DamarlaNagarjuna_Damarla
Nagarjuna_Damarla
 
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale Platform
 
Sourav banerjee resume
Sourav banerjee   resumeSourav banerjee   resume
Sourav banerjee resume
 
Mansi Khare
Mansi KhareMansi Khare
Mansi Khare
 
Sunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scala
Sunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scalaSunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scala
Sunshine consulting mopuru babu cv_java_j2ee_spring_bigdata_scala
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers Conference
 
Nagarjuna_Damarla_Resume
Nagarjuna_Damarla_ResumeNagarjuna_Damarla_Resume
Nagarjuna_Damarla_Resume
 
Prasanna Resume
Prasanna ResumePrasanna Resume
Prasanna Resume
 
How Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.comHow Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.com
 
Yasar resume 2
Yasar resume 2Yasar resume 2
Yasar resume 2
 
Sunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_Spark
Sunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_SparkSunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_Spark
Sunshine consulting mopuru babu cv_java_j2_ee_spring_bigdata_scala_Spark
 
Sunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_Spark
Sunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_SparkSunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_Spark
Sunshine consulting Mopuru Babu CV_Java_J2ee_Spring_Bigdata_Scala_Spark
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 
OOP 2014
OOP 2014OOP 2014
OOP 2014
 
Feb 2024 Apache Hudi Community Sync with Daniel Ford
Feb 2024 Apache Hudi Community Sync with Daniel FordFeb 2024 Apache Hudi Community Sync with Daniel Ford
Feb 2024 Apache Hudi Community Sync with Daniel Ford
 
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the CloudBring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud
 
Summer Shorts: Big Data Integration
Summer Shorts: Big Data IntegrationSummer Shorts: Big Data Integration
Summer Shorts: Big Data Integration
 
HIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha OehlHIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha Oehl
 
Robin_Hadoop
Robin_HadoopRobin_Hadoop
Robin_Hadoop
 

Mais de Modern Data Stack France

Talend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark MeetupTalend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark MeetupModern Data Stack France
 
Paris Spark Meetup - Trifacta - 03_04_2017
Paris Spark Meetup - Trifacta - 03_04_2017Paris Spark Meetup - Trifacta - 03_04_2017
Paris Spark Meetup - Trifacta - 03_04_2017Modern Data Stack France
 
HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)
HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)
HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)Modern Data Stack France
 
Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015
Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015
Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015Modern Data Stack France
 
Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...
Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...
Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...Modern Data Stack France
 
Record linkage, a real use case with spark ml - Paris Spark meetup Dec 2015
Record linkage, a real use case with spark ml  - Paris Spark meetup Dec 2015Record linkage, a real use case with spark ml  - Paris Spark meetup Dec 2015
Record linkage, a real use case with spark ml - Paris Spark meetup Dec 2015Modern Data Stack France
 
June Spark meetup : search as recommandation
June Spark meetup : search as recommandationJune Spark meetup : search as recommandation
June Spark meetup : search as recommandationModern Data Stack France
 
Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)Modern Data Stack France
 
Paris Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamiel
Paris Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamielParis Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamiel
Paris Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamielModern Data Stack France
 
Hadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REX
Hadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REXHadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REX
Hadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REXModern Data Stack France
 
The Cascading (big) data application framework
The Cascading (big) data application frameworkThe Cascading (big) data application framework
The Cascading (big) data application frameworkModern Data Stack France
 
Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014
Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014
Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014Modern Data Stack France
 
Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...
Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...
Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...Modern Data Stack France
 
HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...
HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...
HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...Modern Data Stack France
 

Mais de Modern Data Stack France (20)

Stash - Data FinOPS
Stash - Data FinOPSStash - Data FinOPS
Stash - Data FinOPS
 
Vue d'ensemble Dremio
Vue d'ensemble DremioVue d'ensemble Dremio
Vue d'ensemble Dremio
 
From Data Warehouse to Lakehouse
From Data Warehouse to LakehouseFrom Data Warehouse to Lakehouse
From Data Warehouse to Lakehouse
 
Talend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark MeetupTalend spark meetup 03042017 - Paris Spark Meetup
Talend spark meetup 03042017 - Paris Spark Meetup
 
Paris Spark Meetup - Trifacta - 03_04_2017
Paris Spark Meetup - Trifacta - 03_04_2017Paris Spark Meetup - Trifacta - 03_04_2017
Paris Spark Meetup - Trifacta - 03_04_2017
 
Hugfr SPARK & RIAK -20160114_hug_france
Hugfr  SPARK & RIAK -20160114_hug_franceHugfr  SPARK & RIAK -20160114_hug_france
Hugfr SPARK & RIAK -20160114_hug_france
 
HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)
HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)
HUG France : HBase in Financial Industry par Pierre Bittner (Scaled Risk CTO)
 
Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015
Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015
Apache Flink par Bilal Baltagi Paris Spark Meetup Dec 2015
 
Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...
Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...
Datalab 101 (Hadoop, Spark, ElasticSearch) par Jonathan Winandy - Paris Spark...
 
Record linkage, a real use case with spark ml - Paris Spark meetup Dec 2015
Record linkage, a real use case with spark ml  - Paris Spark meetup Dec 2015Record linkage, a real use case with spark ml  - Paris Spark meetup Dec 2015
Record linkage, a real use case with spark ml - Paris Spark meetup Dec 2015
 
Spark dataframe
Spark dataframeSpark dataframe
Spark dataframe
 
June Spark meetup : search as recommandation
June Spark meetup : search as recommandationJune Spark meetup : search as recommandation
June Spark meetup : search as recommandation
 
Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)Spark ML par Xebia (Spark Meetup du 11/06/2015)
Spark ML par Xebia (Spark Meetup du 11/06/2015)
 
Spark meetup at viadeo
Spark meetup at viadeoSpark meetup at viadeo
Spark meetup at viadeo
 
Paris Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamiel
Paris Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamielParis Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamiel
Paris Spark meetup : Extension de Spark (Tachyon / Spark JobServer) par jlamiel
 
Hadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REX
Hadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REXHadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REX
Hadoop User Group 29Jan2015 Apache Flink / Haven / CapGemnini REX
 
The Cascading (big) data application framework
The Cascading (big) data application frameworkThe Cascading (big) data application framework
The Cascading (big) data application framework
 
Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014
Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014
Introduction sur Tez par Olivier RENAULT de HortonWorks Meetup du 25/11/2014
 
Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...
Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...
Hug france - Administration Hadoop et retour d’expérience BI avec Impala, lim...
 
HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...
HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...
HUGFR : Une infrastructure Kafka & Storm pour lutter contre les attaques DDoS...
 

Último

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 

Último (20)

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 

Hive et Hadoop Usage chez Square

  • 1. Hadoop and Hive at Square Nicolas Thiébaud ! nicothieb@ nicolas@squareup.com Data Engineering at Square July 2014
  • 2. Square: Make commerce easy Remove crappy POSes from the counter Building the best register for small businesses. Started with card processing and bringing more value to merchants using the point of sale. ! Merchant and Buyer facing products Square Register, Square Cash, Pickup, Feedback ! Data products Merchant Analytics, Capital
  • 3.
  • 4.
  • 5.
  • 6. Data at Square Internal Data ! Produced on app servers (~200+ services), mysql or psql ! Logging and tracing from apps and web to public endpoint ! Example: payment data, user data, ledger entries External Data ! Payment processing partners ship flat files to us Offline Data usage at Square ! BI/Analysis/Reporting: ~200 mysql users, ~100 hadoop users ! ML: Risk detection, recommendation ! Apps: A/B testing, Commercial support, Capital
  • 7. Data Architecture at Square: Kafka Historical, most of our users still use this App DB -> Analytical DB stripping out PII, cursoring, looking at binlog replication ! Hadoop: Kafka as a backbone App DB -> Kafka using cursoring and PII stripping App Server -> Kafka (eg: tracing) in proto format Feed consumption -> Kafka ! Kafka written to hdfs using offsets, dupes are written when the consumer restarts ! Raw data is deduped and extracted from protos to rcfiles in daily batches. Everything is exposed in Hive
  • 8. Most datasets don’t fit in mysql. Most queries cannot run anymore Analysts broke down their jobs to run on single day windows. The query sniper keeps hitting them. ! Mysql no longer supported as source of truth for offline data. Tables are windowed We keep revisiting the amount of data stored in MySQL ! Everyone must migrate to hive (users and apps) Mysql Analytical DBs will now be an export location for data reduced in Hadoop ! All datasets must be present in Hadoop Even small ones :) Transitioning to Hive
  • 9. Transitioning to Hive Stability ! Hive 10 + Hue 2.5 as starting point + many patches -> 2 restarts a day with small load ! Decided to go to hive 12 and patch the bugs affecting us in an internal build ! Two major tasks: 10 -> 12 and building hive internally Reliability ! Sentinel, data validation daemon ! Conduit, hive etls ! Customer defined SLA’s Education ! Office hours, trainings, mailing list
  • 10. Project Babar: Building a stable Hive 12
  • 11. Project Babar: Building a stable Hive 12 Patch open source hive to address Square specific issues ! Setup integration tests in kochiku, no performance test ! Hiveserver only, no cli. Staging and production envs ! Push and pull changes to apache jira Build and deploy hive artifacts ! Makefile ! metastore, hiveserver (staging and prod), cli tools (beeline), hivesandbox ! package configuration Misc ! hue 3.5 ! hive-udfs
  • 12. Internal Hive Build cdh5-0.12.0_5.0.1 branch + 9 commits 3 test fixes, 2 square specific changes (pom + ci) ! DATAPLAT-436 Beeline should return non- zero on invalid statements ! HIVE-5799: session/operation timeout for hiveserver2 HIVE-5707: Validate values for ConfVar ! HIVE-7040: Allow TCP keep alive on Hive Server 2 ! (merged in cdh5-0.12.0_5.0.1) HIVE-6893: out of sequence error in HiveMetastore
  • 13. Story of HIVE-7040 + HIVE-5799 HIVE-7040: Allow TCP keep alive on Hive Server 2 F5 stateful firewall kills open connections HIVE-5799: session/operation timeout for hiveserver2 Beeline interrupt does not close sessions
  • 14. Hive Ops trick: ./wait_for_hive_jobs && sudo sv restart /var/service/hiveserver
  • 15. Next Steps Figure out the best way to contribute back patches ! HIVE-668{3,4}: Beeline comments suck HIVE-7200: Beeline output displays column heading even if --showHeader=false is set HIVE-4924: Support JDBC query timeouts HIVE-5232: Use async interface for jdbc ! Hive HA Shark Tez?