BIG DATA WEB APPS
FOR INTERACTIVE
HADOOP
Enrico Berti

Big Data Spain, Nov 17, 2014
GOAL OF HUE
WEB INTERFACE FOR ANALYZING DATA WITH APACHE HADOOP
SIMPLIFY AND INTEGRATE
FREE AND OPEN SOURCE
-> OPEN UP BIG DATA
VIEW FROM 30K FEET
Hadoop Web Server
You, your colleagues and even that friend who uses IE9 ;)
OPEN SOURCE
~4000 COMMITS
56 CONTRIBUTORS
911 STARS
337 FORKS
github.com/cloudera/hue
AROUND THE WORLD
TALKS
Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore, Budapest, DC, Madrid…
RETREATS
Nov 13 Koh Chang, Thailand
May 14 Curaçao, Netherlands Antilles
Aug 14 Big Island, Hawaii
Nov 14 Tenerife, Spain
Nov 14 Nicaragua and Belize
Jan 15 Philippines
TREND: GROWTH
gethue.com
HISTORY
HUE 1
Desktop-like in a browser, did its job but pretty slow, memory leaks and not very IE friendly but definitely advanced for its time (2009-2010).
HISTORY
HUE 2
The first flat structure port, with Twitter Bootstrap all over the place.
HUE 2.5
New apps, improved the UX adding new nice functionalities like autocomplete and drag & drop.
HISTORY
HUE 3 ALPHA
Proposed design, didn’t make it.
HISTORY
HUE 3.6+
Where we are now, a brand new way to search and explore your data.
WHICH DISTRIBUTION?
GITHUB: very latest (HACKER)
TARBALL: advanced preview (ADVANCED USER)
CDH / CM: the most stable and cross component checked (NORMAL USER)
WHERE TO PUT HUE? IN ONE MACHINE
WHERE TO PUT HUE? OUTSIDE THE CLUSTER
WHERE TO PUT HUE? INSIDE THE CLUSTER
WHAT DO YOU NEED?
SERVER
Python 2.4 2.6
That’s it if using a packaged version. If building from the source, here are the extra packages
CLIENT
Web Browser
IE 9+, FF 10+, Chrome, Safari
WHAT DOES THE HUE SERVICE LOOK LIKE?
Hi there, I’m “just” a web server.
1 SERVER
Process serving pages and also static content
1 DB
For cookies, saved queries, workflows, …
HOW TO CONFIGURE HUE
HUE.INI
Similar to core-site.xml but with .INI syntax
Where?
/etc/hue/conf/hue.ini
or
$HUE_HOME/desktop/conf/pseudo-distributed.ini
[desktop]
[[database]]
# Database engine is typically one of:
# postgresql_psycopg2, mysql, or sqlite3
engine=sqlite3
## host=
## port=
## user=
## password=
name=desktop/desktop.db
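A minimal sketch of how the nested-section syntax above can be read, assuming that the number of brackets gives the nesting depth and that lines starting with `#` are comments. The function name `parse_nested_ini` is hypothetical and this is not how Hue itself loads hue.ini; it only illustrates the file layout.

```python
def parse_nested_ini(text):
    """Parse INI text where [a] is depth 1 and [[b]] is depth 2 under [a]."""
    root = {}
    stack = [root]  # stack[d] holds the dict at nesting depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment (including "## key=" placeholders)
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))
            name = line.strip("[]").strip()
            del stack[depth:]  # close any deeper sections still open
            section = stack[-1].setdefault(name, {})
            stack.append(section)
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip()
    return root


SAMPLE = """\
[desktop]
[[database]]
# Database engine is typically one of:
# postgresql_psycopg2, mysql, or sqlite3
engine=sqlite3
## host=
name=desktop/desktop.db
"""

config = parse_nested_ini(SAMPLE)
print(config["desktop"]["database"]["engine"])
```

Commented-out keys such as `## host=` are skipped, so only the active settings end up in the resulting dictionary.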
AUTHENTICATION
SIMPLE
Login/Password in a Database (SQLite, MySQL, …)
ENTERPRISE
LDAP (most used), OAuth, OpenID, SAML
DB BACKEND
LDAP BACKEND
Integrate your employees: LDAP How to guide
USERS
ADMIN
Can give and revoke permissions to single users or groups of users
USER
Regular user + permissions
CONFIGURE APPS AND PERMISSIONS
LIST OF GROUPS AND PERMISSIONS
A permission can:
- allow access to one app (e.g. Hive Editor)
- modify data from the app (e.g. drop Hive Tables or edit cells in HBase Browser)
A list of permissions
CONFIGURE APPS AND PERMISSIONS
PERMISSIONS IN ACTION
User ‘test’ belonging to the group ‘hiveonly’ that has just the ‘hive’ permissions
HOW HUE INTERACTS WITH HADOOP
YARN
JobTracker
Oozie
Hue Plugins
LDAP
SAML
Pig
HDFS
HiveServer2
Hive Metastore
Cloudera Impala
Solr
HBase
Sqoop2
ZooKeeper
RPC CALLS TO ALL THE HADOOP COMPONENTS
HDFS EXAMPLE
WebHDFS REST
NN
DN DN DN … DN
http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
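As an illustration of the WebHDFS call above, here is a small sketch that builds the LISTSTATUS URL and reads the JSON reply. The helper names and the sample reply body are assumptions made for this example; the `FileStatuses` / `FileStatus` envelope is the documented WebHDFS response shape, and no request is actually sent here.

```python
import json
from urllib.parse import quote


def liststatus_url(path, host="localhost", port=50070):
    """Build the WebHDFS URL that lists the directory at `path`."""
    return "http://%s:%d/webhdfs/v1%s?op=LISTSTATUS" % (host, port, quote(path))


def parse_liststatus(body):
    """Return (name, type) pairs from a LISTSTATUS JSON reply."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(status["pathSuffix"], status["type"]) for status in statuses]


# Made-up reply, trimmed to the two fields used above.
SAMPLE_REPLY = json.dumps({
    "FileStatuses": {
        "FileStatus": [
            {"pathSuffix": "logs", "type": "DIRECTORY"},
            {"pathSuffix": "data.csv", "type": "FILE"},
        ]
    }
})

print(liststatus_url("/user/hue"))
print(parse_liststatus(SAMPLE_REPLY))
```

Against a running NameNode, the URL could be fetched with `urllib.request.urlopen` and the body passed to `parse_liststatus`.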
RPC CALLS TO ALL THE HADOOP COMPONENTS
HOW
List all the host/port of Hadoop APIs in the hue.ini
For example here HBase and Hive.
Full list
[hbase]
# Comma-separated list of HBase Thrift servers for
# clusters in the format of '(name|host:port)'.
hbase_clusters=(Cluster|localhost:9090)
[beeswax]
hive_server_host=host-abc
hive_server_port=10000
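The `hbase_clusters` value follows the '(name|host:port)' format described in the comment. A rough sketch of splitting such a value into its parts is shown below; the function name `parse_hbase_clusters` is hypothetical and the regular expression is only one possible reading of that format, not Hue's own parser.

```python
import re

# One "(name|host:port)" entry; entries are separated by commas.
CLUSTER_PATTERN = re.compile(r"\(([^|()]+)\|([^:()]+):(\d+)\)")


def parse_hbase_clusters(value):
    """Map each cluster name to its (host, port) Thrift server address."""
    clusters = {}
    for name, host, port in CLUSTER_PATTERN.findall(value):
        clusters[name.strip()] = (host.strip(), int(port))
    return clusters


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
```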
SECURITY FEATURES
HTTPS
SSL DB
SSL WITH HIVESERVER2
KERBEROS
SENTRY
READ MORE …
HIGH AVAILABILITY
HOW
2 Hue instances
HA proxy
Multi DB
Performances: like a website, mostly RPC calls
FULL SUITE OF APPS
HBASE BROWSER
WHAT
Simple custom query language
Supports HBase filter language
Supports selection & Copy + Paste, gracefully degrades in IE
Autocomplete Help Menu
Searchbar Syntax Breakdown
Row Key
Scan Length
Prefix Scan
Column/Family Filters
Thrift Filterstring
SQL
WHAT
Impala, Hive integration, Spark
Interactive SQL editor
Integration with MapReduce, Metastore, HDFS
SENTRY APP

SEARCH
WHAT
Solr & Cloud integration
Custom interactive dashboards
Drag & drop widgets (charts, timeline…)
JUST A VIEW ON TOP OF SOLR API
REST
HISTORY

V1 USER
HISTORY

V1 ADMIN
HISTORY

V2 USER
HISTORY

V2 ADMIN
ARCHITECTURE
REST: /select, /admin/collections, /get, /luke...
AJAX: /add_widget, /zoom_in, /select_facet, /select_range...
Templates + JS Model
www….
ARCHITECTURE UI FOR FACETS
LAYOUT: All the 2D positioning (cell ids), visual, drag&drop
COLLECTION: Dashboard, fields, template, widgets (ids)
QUERY: Search terms, selected facets (q, fqs)
ADDING A WIDGET LIFECYCLE
Load the initial page
REST: /solr/zookeeper/clusterstate.json, /solr/admin/luke…
AJAX: /get_collection
Edit mode and Drag&Drop
ADDING A WIDGET LIFECYCLE
Select the field
REST: /solr/select?stats=true
AJAX: /new_facet
Guess ranges (number or dates)
Rounding (number or dates)
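The "guess ranges" and "rounding" steps can be pictured with a short sketch for numeric fields. It assumes the minimum and maximum come from the Solr stats call, targets about ten buckets, and rounds the gap up to one significant digit. The function name `guess_range` and the rounding rule are illustrative assumptions, not Hue's actual heuristic, and dates are not covered.

```python
import math


def guess_range(stats_min, stats_max, buckets=10):
    """Return (start, end, gap) with a rounded gap for a numeric range facet."""
    span = stats_max - stats_min
    if span <= 0:
        return stats_min, stats_max, 1  # degenerate field: a single value
    raw_gap = span / buckets
    # Round the gap up to one significant digit, e.g. 894998.8 -> 900000.
    magnitude = 10 ** math.floor(math.log10(raw_gap))
    gap = math.ceil(raw_gap / magnitude) * magnitude
    # Snap both bounds outwards to multiples of the gap.
    start = math.floor(stats_min / gap) * gap
    end = math.ceil(stats_max / gap) * gap
    return start, end, gap


print(guess_range(12, 8950000))
```

With a minimum of 12 and a maximum of 8950000 this yields a start of 0, an end of 9000000 and a gap of 900000, the same shape as the facet.range parameters used later in the deck.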
ADDING A WIDGET LIFECYCLE
Query part 1
facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10
Query part 2
q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000]
Augment Solr response
{
  'facet_counts':{
    'facet_ranges':{
      'bytes':{
        'start':10000,
        'counts':[
          '900000',
          3423,
          '1800000',
          339,
          ...
        ]
      }
    }
  }
}
{
  ...,
  'normalized_facets':[
    {
      'extraSeries':[],
      'label':'bytes',
      'field':'bytes',
      'counts':[
        {
          'from':'900000',
          'to':'1800000',
          'selected':True,
          'value':3423,
          'field':'bytes',
          'exclude':False
        }
      ], ...
    }
  ]
}
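The "augment" step turns Solr's flat counts list (range start, count, range start, count, ...) into one record per bucket. A minimal sketch of that conversion follows. The function name `normalize_range_counts` and its parameters are assumptions for illustration; the output keys mirror the normalized_facets example, but this is not Hue's own code.

```python
def normalize_range_counts(field, flat_counts, gap, selected_from=None):
    """Pair up Solr's flat [start, count, start, count, ...] list into records."""
    starts = flat_counts[0::2]   # every even position is a bucket start
    values = flat_counts[1::2]   # every odd position is that bucket's count
    normalized = []
    for start, value in zip(starts, values):
        normalized.append({
            "from": start,
            "to": str(int(start) + gap),        # bucket end = start + gap
            "selected": start == selected_from,  # bucket currently filtered on
            "value": value,
            "field": field,
            "exclude": False,
        })
    return normalized


buckets = normalize_range_counts(
    "bytes", ["900000", 3423, "1800000", 339], gap=900000, selected_from="900000"
)
print(buckets[0])
```

The first record matches the example on the slide: from '900000' to '1800000', value 3423, selected.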
JSON TO WIDGET
{
  "field":"rate_code",
  "counts":[
    {
      "count":97797,
      "exclude":true,
      "selected":false,
      "value":"1",
      "cat":"rate_code"
    } ...
{
  "field":"medallion",
  "counts":[
    {
      "count":159,
      "exclude":true,
      "selected":false,
      "value":"6CA28FC49A4C49A9A96",
      "cat":"medallion"
    } ...
{
  "extraSeries":[],
  "label":"trip_time_in_secs",
  "field":"trip_time_in_secs",
  "counts":[
    {
      "from":"0",
      "to":"10",
      "selected":false,
      "value":527,
      "field":"trip_time_in_secs",
      "exclude":true
    } ...
{
  "field":"passenger_count",
  "counts":[
    {
      "count":74766,
      "exclude":true,
      "selected":false,
      "value":"1",
      "cat":"passenger_count"
    } ...
REPEAT UNTIL…
ENTERPRISE FEATURES
- Access to Search App configurable, LDAP/SAML auths
- Share by link
- Solr Cloud (or non Cloud)
- Proxy user
  /solr/jobs_demo/select?user.name=hue&doAs=romain&q=
- Security
  Kerberos
- Sentry
  Collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
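The proxy user call in the list above can be assembled as in the following sketch: the service account goes in `user.name` and the end user in `doAs`. The helper name `proxied_select_path` and its defaults are hypothetical; only the parameter names and the path come from the slide.

```python
from urllib.parse import urlencode


def proxied_select_path(collection, end_user, query="", service_user="hue"):
    """Build a Solr select path that runs as `end_user` on behalf of Hue."""
    params = urlencode({"user.name": service_user, "doAs": end_user, "q": query})
    return "/solr/%s/select?%s" % (collection, params)


print(proxied_select_path("jobs_demo", "romain"))
```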
SPARK IGNITER
HISTORY
OCT 2013
Submit through Oozie
Shell like for Java, Scala, Python
HISTORY
JAN 2014
V2 Spark Igniter
Spark 0.8
Java, Scala with Spark Job Server
APR 2014
Spark 0.9
JUN 2014
Ironing + How to deploy
“JUST A VIEW” ON TOP OF SPARK
Hue
Saved script metadata, e.g. name, args, classname, jar name…
Job Server
submit
list apps
list jobs
list contexts
HOW TO TALK TO SPARK?
Hue
Spark Job Server
Spark
APP LIFE CYCLE
Hue
Spark Job Server
Spark
… extend SparkJob
.scala
sbt _/package
JAR
Upload
Context
create context: auto or manual
SPARK JOB SERVER
WHAT
REST job server for Spark
WHERE
https://github.com/ooyala/spark-jobserver
WHEN
Spark Summit talk Monday 5:45pm: Spark Job Server: Easy Spark Job Management by Ooyala
curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
{
  "status": "STARTED",
  "result": {
    "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4",
    "context": "b7ea0eb5-spark.jobserver.WordCountExample"
  }
}
FOCUS ON UX
curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
{
  "status": "STARTED",
  "result": {
    "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4",
    "context": "b7ea0eb5-spark.jobserver.WordCountExample"
  }
}
VS
TRAIT SPARKJOB
import com.typesafe.config.Config
import org.apache.spark.SparkContext

/**
 * This trait is the main API for Spark jobs submitted to the Job Server.
 */
trait SparkJob {
  /**
   * This is the entry point for a Spark Job Server to execute Spark jobs.
   */
  def runJob(sc: SparkContext, jobConfig: Config): Any

  /**
   * This method is called by the job server to allow jobs to validate their input and reject
   * invalid job requests.
   */
  def validate(sc: SparkContext, config: Config): SparkJobValidation
}
DEMO
TIME

SUM-UP
INSTALL: Install Hue on one machine
ENABLE: Enable Hadoop Service APIs for Hue as a proxy user
CONFIGURE: Configure hue.ini to point to each Service API
LDAP: Use an LDAP backend
HELP: Get help on @gethue or hue-user
ROADMAP NEXT 6 MONTHS
WHAT
Oozie v2
Spark v2
SQL v2
More dashboards!
Inter component integrations (HBase <-> Search, create index wizards, document permissions)
Hadoop Web apps SDK
Your idea here.
CONFIGURATIONS ARE HARD…
…GIVE CLOUDERA MANAGER A TRY!
vimeo.com/91805055
MISSED SOMETHING?
learn.gethue.com
TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
GRACIAS!


Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 

Último (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 

Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014

  • 1. BIG DATA WEB APPS FOR INTERACTIVE HADOOP Enrico Berti Big Data Spain, Nov 17, 2014
  • 2. GOAL OF HUE: Web interface for analyzing data with Apache Hadoop. Simplify and integrate. Free and open source -> open up big data.
  • 3. VIEW FROM 30K FEET: Hadoop, a web server, and you, your colleagues and even that friend who uses IE9 ;)
  • 4. OPEN SOURCE: ~4000 commits, 56 contributors, 911 stars, 337 forks. github.com/cloudera/hue
  • 5. AROUND THE WORLD
 TALKS: Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore, Budapest, DC, Madrid…
 RETREATS: Nov 13 Koh Chang, Thailand; May 14 Curaçao, Netherlands Antilles; Aug 14 Big Island, Hawaii; Nov 14 Tenerife, Spain; Nov 14 Nicaragua and Belize; Jan 15 Philippines
  • 7. HISTORY: HUE 1. Desktop-like in a browser. It did its job but was pretty slow, had memory leaks and was not very IE friendly, though it was definitely advanced for its time (2009-2010).
  • 8. HISTORY: HUE 2. The first flat-structure port, with Twitter Bootstrap all over the place. HUE 2.5: new apps and an improved UX, adding nice new functionality like autocomplete and drag & drop.
  • 9. HISTORY: HUE 3 ALPHA. Proposed design, didn’t make it.
  • 10. HISTORY: HUE 3.6+. Where we are now: a brand new way to search and explore your data.
  • 11. WHICH DISTRIBUTION?
 GITHUB: very latest (hacker)
 TARBALL: advanced preview (advanced user)
 CDH / CM: the most stable and cross-component checked (normal user)
  • 12. WHERE TO PUT HUE? IN ONE MACHINE
  • 13. WHERE TO PUT HUE? OUTSIDE THE CLUSTER
  • 14. WHERE TO PUT HUE? INSIDE THE CLUSTER
  • 15. WHAT DO YOU NEED? Hi there, I’m “just” a web server.
 SERVER: Python 2.4 2.6. That’s it if using a packaged version. If building from the source, here are the extra packages.
 CLIENT: Web browser: IE 9+, FF 10+, Chrome, Safari
  • 16. WHAT DOES THE HUE SERVICE LOOK LIKE? Hi there, I’m “just” a web server.
 1 SERVER: a process serving pages and also static content
 1 DB: for cookies, saved queries, workflows, …
  • 17. HOW TO CONFIGURE HUE: HUE.INI. Similar to core-site.xml but with .INI syntax. Where? /etc/hue/conf/hue.ini or $HUE_HOME/desktop/conf/pseudo-distributed.ini
 [desktop]
 [[database]]
 # Database engine is typically one of:
 # postgresql_psycopg2, mysql, or sqlite3
 engine=sqlite3
 ## host=
 ## port=
 ## user=
 ## password=
 name=desktop/desktop.db
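To make the nested-section syntax of the hue.ini excerpt concrete, here is a minimal Python sketch that reads such a file into nested dictionaries. It is an illustration only, not how Hue itself loads its configuration; the function name `parse_nested_ini` and the sample text are assumptions made for this example.

```python
def parse_nested_ini(text):
    """Parse INI text where the number of brackets gives the nesting depth.

    "[desktop]" opens a top-level section and "[[database]]" opens a
    subsection of the most recent "[...]" section. Lines starting with "#"
    (including the "## key=" placeholders) are treated as comments.
    """
    root = {}
    stack = [root]  # stack[d] is the dict currently open at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))
            name = line.strip("[]").strip()
            if depth > len(stack):
                raise ValueError("section nested too deep: " + line)
            del stack[depth:]          # close deeper or sibling sections
            section = {}
            stack[-1][name] = section  # attach to the parent section
            stack.append(section)
        else:
            key, _, value = line.partition("=")
            stack[-1][key.strip()] = value.strip()
    return root


# Hypothetical sample mirroring the slide's [desktop] / [[database]] excerpt.
SAMPLE = """
[desktop]
  [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, or sqlite3
    engine=sqlite3
    ## host=
    name=desktop/desktop.db
"""

config = parse_nested_ini(SAMPLE)
print(config["desktop"]["database"]["engine"])
```

Commented-out keys such as `## host=` are skipped, so only the values that are actually set show up in the result.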
  • 18. AUTHENTICATION
 SIMPLE: login/password in a database (SQLite, MySQL, …)
 ENTERPRISE: LDAP (most used), OAuth, OpenID, SAML
  • 20. LDAP BACKEND. Integrate your employees: LDAP how-to guide
  • 21. USERS
 ADMIN: can give and revoke permissions to single users or groups of users
 USER: regular user + permissions
  • 22. CONFIGURE APPS AND PERMISSIONS: LIST OF GROUPS AND PERMISSIONS. A list of permissions. A permission can:
 - allow access to one app (e.g. Hive Editor)
 - modify data from the app (e.g. drop Hive tables or edit cells in HBase Browser)
  • 23. CONFIGURE APPS AND PERMISSIONS: PERMISSIONS IN ACTION. User ‘test’ belongs to the group ‘hiveonly’, which has just the ‘hive’ permissions.
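The group-based model in the last three slides can be sketched in a few lines of Python. This is a hypothetical toy model, not Hue's implementation: the dictionaries, the `analysts` group and the `can_access` helper are assumptions introduced for illustration; only the `test` user, the `hiveonly` group and the `hive` permission come from the slide.

```python
# Group name -> set of app permissions granted to that group (illustrative).
GROUP_PERMISSIONS = {
    "hiveonly": {"hive"},
    "analysts": {"hive", "hbase", "search"},  # hypothetical second group
}

# User name -> set of groups the user belongs to (illustrative).
USER_GROUPS = {
    "test": {"hiveonly"},
}


def can_access(user, app):
    """Return True if any of the user's groups grants access to the app."""
    groups = USER_GROUPS.get(user, set())
    return any(app in GROUP_PERMISSIONS.get(group, set()) for group in groups)


print(can_access("test", "hive"))
print(can_access("test", "hbase"))
```

A user's effective permissions are the union of the permissions of their groups, so revoking an app from `hiveonly` would remove it for every member at once.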
  • 24. HOW HUE INTERACTS WITH HADOOP: YARN, JobTracker, Oozie, Hue Plugins, LDAP, SAML, Pig, HDFS, HiveServer2, Hive Metastore, Cloudera Impala, Solr, HBase, Sqoop2, ZooKeeper
  • 25. RPC CALLS TO ALL THE HADOOP COMPONENTS. HDFS EXAMPLE: WebHDFS REST calls to the NameNode (NN) and the DataNodes (DN, DN, DN, … DN): http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
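As a rough illustration of the WebHDFS call on this slide, the Python sketch below builds the LISTSTATUS URL and reads the entry names out of a response body. The helper names and the sample body are assumptions for this example; the sample is hand-written to match the documented `FileStatuses`/`FileStatus` response shape and is not real cluster output.

```python
import json
from urllib.parse import quote


def liststatus_url(path, host="localhost", port=50070):
    """Build the WebHDFS LISTSTATUS URL for an absolute HDFS path."""
    return "http://%s:%d/webhdfs/v1%s?op=LISTSTATUS" % (host, port, quote(path))


def entry_names(body):
    """Return (name, type) pairs from a LISTSTATUS JSON response body."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(status["pathSuffix"], status["type"]) for status in statuses]


# Hand-written sample shaped like a LISTSTATUS response (illustrative only).
SAMPLE_BODY = (
    '{"FileStatuses": {"FileStatus": ['
    '{"pathSuffix": "logs", "type": "DIRECTORY"}, '
    '{"pathSuffix": "a.csv", "type": "FILE"}]}}'
)

print(liststatus_url("/user/hue"))
print(entry_names(SAMPLE_BODY))
```

Against a live cluster, the URL would be fetched with an HTTP GET (for example `urllib.request.urlopen`) and the body passed to `entry_names`.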
  • 26. RPC CALLS TO ALL THE HADOOP COMPONENTS: HOW. List the host/port of every Hadoop API in the hue.ini. For example, here are HBase and Hive. Full list
 [hbase]
 # Comma-separated list of HBase Thrift servers for
 # clusters in the format of '(name|host:port)'.
 hbase_clusters=(Cluster|localhost:9090)
 [beeswax]
 hive_server_host=host-abc
 hive_server_port=10000
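The `'(name|host:port)'` format of `hbase_clusters` is easy to misread, so here is a small Python sketch that splits such a value into its parts. The regular expression and the function name `parse_hbase_clusters` are assumptions for illustration, not code taken from Hue.

```python
import re

# One "(name|host:port)" entry; entries are separated by commas.
CLUSTER_RE = re.compile(r"\((?P<name>[^|()]+)\|(?P<host>[^:()]+):(?P<port>\d+)\)")


def parse_hbase_clusters(value):
    """Return a list of (name, host, port) tuples from an hbase_clusters value."""
    return [
        (match.group("name"), match.group("host"), int(match.group("port")))
        for match in CLUSTER_RE.finditer(value)
    ]


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
```

A value listing two clusters, such as the hypothetical `(A|h1:9090),(B|h2:9091)`, yields one tuple per cluster.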
  • 27. SECURITY FEATURES: HTTPS, SSL DB, SSL with HiveServer2, Kerberos, Sentry. Read more …
  • 28. HIGH AVAILABILITY: HOW. 2 Hue instances, HA proxy, multi DB. Performance: like a website, mostly RPC calls
  • 30. HBASE BROWSER: WHAT. Simple custom query language. Supports HBase filter language. Supports selection & copy + paste, gracefully degrades in IE. Autocomplete, help menu.
 Searchbar syntax breakdown: Row Key, Scan Length, Prefix Scan, Column/Family Filters, Thrift Filterstring
  • 31. SQL: WHAT. Impala, Hive integration, Spark. Interactive SQL editor. Integration with MapReduce, Metastore, HDFS
  • 33. SEARCH: WHAT. Solr & Cloud integration. Custom interactive dashboards. Drag & drop widgets (charts, timeline…)
  • 34. JUST A VIEW ON TOP OF SOLR API (REST)
  • 40. ARCHITECTURE: UI FOR FACETS
 LAYOUT: all the 2D positioning (cell ids), visual, drag & drop
 COLLECTION: dashboard, fields, template, widgets (ids)
 QUERY: search terms, selected facets (q, fqs)
  • 41. ADDING A WIDGET: LIFECYCLE. Load the initial page, edit mode and drag & drop.
 REST: /solr/zookeeper/clusterstate.json, /solr/admin/luke…
 AJAX: /get_collection
  • 42. ADDING A WIDGET: LIFECYCLE. Select the field, guess ranges (number or dates), rounding (number or dates).
 REST: /solr/select?stats=true
 AJAX: /new_facet
  • 43. ADDING A WIDGET: LIFECYCLE
 Query part 1: facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10
 Query part 2: q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000]
 Augment Solr response, from: { 'facet_counts':{ 'facet_ranges':{ 'bytes':{ 'start':10000, 'counts':[ '900000', 3423, '1800000', 339, ... ] } }
 to: { ..., 'normalized_facets':[ { 'extraSeries':[ ], 'label':'bytes', 'field':'bytes', 'counts':[ { 'from':'900000', 'to':'1800000', 'selected':True, 'value':3423, 'field':'bytes', 'exclude':False } ], ... } } }
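The augmentation step above can be sketched in Python: Solr returns range counts as a flat list of alternating bucket starts and counts, and the dashboard wants one dictionary per bucket with explicit `from` and `to` bounds. This is a simplified illustration under that assumption, not Hue's actual code; `normalize_range_facet` and its `selected_from` parameter are hypothetical names, and the `exclude` and `extraSeries` keys from the slide are left out.

```python
def normalize_range_facet(field, counts, gap, selected_from=None):
    """Turn Solr's flat [start, count, start, count, ...] list into bucket dicts.

    gap is the facet.range.gap used in the query; selected_from marks the
    bucket whose start matches the currently applied filter, if any.
    """
    buckets = []
    for start, count in zip(counts[0::2], counts[1::2]):
        buckets.append({
            "from": start,
            "to": str(int(start) + gap),  # upper bound = start + gap
            "value": count,
            "field": field,
            "selected": start == selected_from,
        })
    return {"label": field, "field": field, "counts": buckets}


facet = normalize_range_facet(
    "bytes", ["900000", 3423, "1800000", 339], gap=900000, selected_from="900000"
)
print(facet["counts"][0])
```

With the slide's numbers, the first bucket runs from 900000 to 1800000 with a value of 3423 and is flagged as selected, which matches the `fq` filter in query part 2.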
  • 44. JSON TO WIDGET
 { "field":"rate_code", "counts":[ { "count":97797, "exclude":true, "selected":false, "value":"1", "cat":"rate_code" } ...
 { "field":"medallion", "counts":[ { "count":159, "exclude":true, "selected":false, "value":"6CA28FC49A4C49A9A96", "cat":"medallion" } ...
 { "extraSeries":[ ], "label":"trip_time_in_secs", "field":"trip_time_in_secs", "counts":[ { "from":"0", "to":"10", "selected":false, "value":527, "field":"trip_time_in_secs", "exclude":true } ...
 { "field":"passenger_count", "counts":[ { "count":74766, "exclude":true, "selected":false, "value":"1", "cat":"passenger_count" } ...
  • 46. ENTERPRISE FEATURES
 - Access to Search App configurable, LDAP/SAML auths
 - Share by link
 - Solr Cloud (or non Cloud)
 - Proxy user: /solr/jobs_demo/select?user.name=hue&doAs=romain&q=
 - Security: Kerberos
 - Sentry: collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
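The proxy-user URL in the list above carries two identities: the service account making the call (`user.name`) and the end user it acts for (`doAs`). A minimal Python sketch of how such a URL could be assembled follows; the function `solr_select_url` and its parameter names are assumptions for illustration.

```python
from urllib.parse import urlencode


def solr_select_url(collection, query, service_user="hue", end_user=None):
    """Build a Solr select path that impersonates end_user via doAs."""
    params = [("user.name", service_user)]
    if end_user:
        params.append(("doAs", end_user))  # the user the query runs as
    params.append(("q", query))
    return "/solr/%s/select?%s" % (collection, urlencode(params))


print(solr_select_url("jobs_demo", "", end_user="romain"))
```

Leaving `end_user` unset drops the `doAs` parameter, so the request runs as the service account itself.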
  • 48. HISTORY. OCT 2013: Submit through Oozie. Shell-like for Java, Scala, Python
  • 49. HISTORY. JAN 2014: V2 Spark Igniter, Spark 0.8, Java and Scala with Spark Job Server. APR 2014: Spark 0.9. JUN 2014: Ironing + how to deploy
  • 50. “JUST A VIEW” ON TOP OF SPARK
 Hue: saved script metadata, e.g. name, args, classname, jar name…
 Job Server: submit, list apps, list jobs, list contexts
  • 51. HOW TO TALK TO SPARK? Hue -> Spark Job Server -> Spark
  • 52. APP LIFE CYCLE: Hue -> Spark Job Server -> Spark
  • 53. APP LIFE CYCLE: … extend SparkJob (.scala) -> sbt _/package -> JAR -> Upload
  • 54. APP LIFE CYCLE: … extend SparkJob (.scala) -> sbt _/package -> JAR -> Upload -> Context (create context: auto or manual)
  • 55. SPARK JOB SERVER
 WHAT: REST job server for Spark
 WHERE: https://github.com/ooyala/spark-jobserver
 WHEN: Spark Summit talk Monday 5:45pm: Spark Job Server: Easy Spark Job Management by Ooyala
 curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
 { "status": "STARTED", "result": { "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", "context": "b7ea0eb5-spark.jobserver.WordCountExample" } }
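For readers who prefer a script to curl, the same job submission can be sketched in Python with the standard library. The helper names `build_job_request` and `job_id` are assumptions for this example; the endpoint, parameters and sample response are the ones shown on the slide, and the request is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_job_request(app_name, class_path, job_input, base="http://localhost:8090"):
    """Build the POST request that the slide's curl command would send."""
    query = urlencode({"appName": app_name, "classPath": class_path})
    return Request(
        "%s/jobs?%s" % (base, query),
        data=job_input.encode("utf-8"),  # same payload as curl -d
        method="POST",
    )


def job_id(response_body):
    """Extract the job id from the job server's JSON reply."""
    return json.loads(response_body)["result"]["jobId"]


request = build_job_request(
    "test", "spark.jobserver.WordCountExample", "input.string = a b c a b see"
)

# Response copied from the slide, used here instead of a live server.
SAMPLE_RESPONSE = (
    '{"status": "STARTED", "result": {'
    '"jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", '
    '"context": "b7ea0eb5-spark.jobserver.WordCountExample"}}'
)

print(request.full_url)
print(job_id(SAMPLE_RESPONSE))
```

With a job server running, `urllib.request.urlopen(request)` would submit the job and its body could be passed to `job_id`.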
  • 56. FOCUS ON UX
 curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
 { "status": "STARTED", "result": { "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", "context": "b7ea0eb5-spark.jobserver.WordCountExample" } }
 VS
  • 57. TRAIT SPARKJOB
 /**
  * This trait is the main API for Spark jobs submitted to the Job Server.
  */
 trait SparkJob {
   /**
    * This is the entry point for a Spark Job Server to execute Spark jobs.
    */
   def runJob(sc: SparkContext, jobConfig: Config): Any

   /**
    * This method is called by the job server to allow jobs to validate their input and reject
    * invalid job requests.
    */
   def validate(sc: SparkContext, config: Config): SparkJobValidation
 }
  • 59. SUM-UP
 INSTALL: Install Hue on one machine
 ENABLE: Enable Hadoop Service APIs for Hue as a proxy user
 CONFIGURE: Configure hue.ini to point to each Service API
 LDAP: Use an LDAP backend
 HELP: Get help on @gethue or hue-user
  • 60. ROADMAP, NEXT 6 MONTHS: WHAT. Oozie v2, Spark v2, SQL v2. More dashboards! Inter-component integrations (HBase <-> Search, create index wizards, document permissions), Hadoop Web apps SDK. Your idea here.
  • 61. CONFIGURATIONS ARE HARD… …GIVE CLOUDERA MANAGER A TRY! vimeo.com/91805055