Hue
A closer Look at Hue
What's on the Menu
● Hue Architecture
  ○ Many interfaces to implement
  ○ How do I list HDFS files, how do I submit a job...?
  ○ SDK
● Hue UI: Dynamic Workflow Editor
  ○   Why improve the user experience?
  ○   How can we improve the user experience?
  ○   Design Considerations
  ○   Design and Code Deep Dive
View from 30 000 feet
Ecosystem
Integrate with the Web
● HTTP, stateless (async queries)
● Frontend / Backend (e.g. different servers,
  pagination)
● Resources (e.g. img, js, callbacks, css, json)
● Browsers, multi techs
● DB (SQLite, MySQL, PostgreSQL...)
● i18n
● ...

More on UI later
Integrate Users
● Auth
  ○   Standard
  ○   LDAP
  ○   PAM
  ○   SPNEGO
  ○   Custom
      (OAuth, Cookie...)
Integrate HDFS
● Interfaces
     ○ Thrift (old)
       ■ NN
     ○ REST
       ■ WebHDFS
       ■ HttpFS (HA, new bugs)


● Uploads to HDFS
class HDFStemporaryUploadedFile(object):
class HDFSfileUploadHandler(FileUploadHandler):
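
The REST interfaces above are one answer to "how do I list HDFS files". A minimal sketch, assuming a NameNode that serves WebHDFS at a hypothetical host on port 50070: `webhdfsUrl` and `listNames` are illustrative helpers (not Hue code), `user.name` and `doas` are the WebHDFS query parameters for the calling user and the impersonated user, and the sample response is hand-written in the documented LISTSTATUS shape.

```javascript
// Build a WebHDFS REST URL. `base` is the NameNode (or HttpFS) HTTP address.
function webhdfsUrl(base, path, op, params = {}) {
  const url = new URL(base.replace(/\/+$/, "") + "/webhdfs/v1" + encodeURI(path));
  url.searchParams.set("op", op);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Extract entry names from a LISTSTATUS response body; directories get a trailing slash.
function listNames(response) {
  return response.FileStatuses.FileStatus.map(
    (status) => status.pathSuffix + (status.type === "DIRECTORY" ? "/" : "")
  );
}

// Hypothetical host and users, for illustration only.
const listUrl = webhdfsUrl("http://namenode.example.com:50070", "/user/demo", "LISTSTATUS", {
  "user.name": "hue",
  doas: "demo",
});

// Hand-written sample of what a LISTSTATUS call returns.
const sampleResponse = {
  FileStatuses: {
    FileStatus: [
      { pathSuffix: "input", type: "DIRECTORY", length: 0 },
      { pathSuffix: "words.txt", type: "FILE", length: 1024 },
    ],
  },
};

console.log(listUrl);
console.log(listNames(sampleResponse));
```

Pointing `base` at an HttpFS server instead of the NameNode should leave the rest unchanged, since HttpFS exposes the same `/webhdfs/v1` API.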
Integrate Hive
   ● Beeswax: embedded Hive CLI
   ● Concurrent executions

   ● Beeswax / Hive Server 2 Thrift interfaces
   ● Hue models, HQL, Impala, DDL
service BeeswaxService {
  QueryHandle query(1:Query query) throws(1:BeeswaxException error),
  QueryHandle executeAndWait(1:Query query, 2:LogContextId clientCtx)
                 throws(1:BeeswaxException error),
....

service TCLIService {
  TExecuteStatementResp ExecuteStatement(1:TExecuteStatementReq req);
  TGetOperationStatusResp GetOperationStatus(1:TGetOperationStatusReq req);
....
Integrate Hive
Moving to Pluggable interfaces
● DBMS / SQL API
  ○ Beeswax
  ○ HS2
● Table
  ○ BTable
  ○ HS2Table
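
One way to read "pluggable interfaces": the UI codes against a single SQL API, and a backend is picked by name. A minimal sketch under that assumption. Every class and function name here (`SqlApi`, `BeeswaxApi`, `HiveServer2Api`, `getSqlApi`) is hypothetical and not Hue's actual code; only the Thrift call names `query` and `ExecuteStatement` come from the service definitions on the previous slide.

```javascript
// Common interface the UI depends on; each backend overrides these methods.
class SqlApi {
  execute(statement) {
    throw new Error("execute() not implemented");
  }
  getTable(name) {
    throw new Error("getTable() not implemented");
  }
}

// Backend that would talk to the Beeswax Thrift service.
class BeeswaxApi extends SqlApi {
  execute(statement) {
    return { server: "beeswax", call: "query", statement };
  }
  getTable(name) {
    return { kind: "BTable", name };
  }
}

// Backend that would talk to the Hive Server 2 Thrift service.
class HiveServer2Api extends SqlApi {
  execute(statement) {
    return { server: "hs2", call: "ExecuteStatement", statement };
  }
  getTable(name) {
    return { kind: "HS2Table", name };
  }
}

const backends = { beeswax: BeeswaxApi, hs2: HiveServer2Api };

// Pick a backend from configuration; callers never name a concrete class.
function getSqlApi(interfaceName) {
  const Backend = backends[interfaceName];
  if (!Backend) {
    throw new Error("Unknown SQL interface: " + interfaceName);
  }
  return new Backend();
}

const api = getSqlApi("hs2");
console.log(api.execute("SHOW TABLES"));
console.log(api.getTable("sample_07"));
```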
Integrate Impala

● New app
● Same Beeswax/Hive Server 2 interfaces
● One more moving target..
Integrate Jobs
    ● List, access, kill
    ● aka JobBrowser

    ● JobTracker Thrift Plugin
mapred-site.xml
<property>
 <name>jobtracker.thrift.address</name>
 <value>0.0.0.0:9290</value>
</property>
<property>
 <name>mapred.jobtracker.plugins</name>
 <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
</property>

More Thrift
service Jobtracker extends common.HadoopServiceBase {
 ThriftJobInProgress getJob(10: common.RequestContext ctx, 1: ThriftJobID jobID)
     throws(1: JobNotFoundException err),
 ThriftJobList getRunningJobs(10: common.RequestContext ctx),
Integrate Jobs
● Submit jobs (MR, Hive, Java, Pig...)
● Manage workflows
● Schedule workflows


● REST (GET, PUT, POST)
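
The three HTTP verbs map onto the Oozie web services API roughly as sketched below. This is an illustration, not Hue's liboozie: `oozieRequests` is a hypothetical helper, the host and the job ID are made up, and port 11000 is only the stock Oozie default. The paths and query parameters (`/v1/jobs`, `/v1/job/<id>`, `show=info`, `action=kill`, `jobtype`, `len`) are taken from the Oozie REST API.

```javascript
// Describe the REST calls behind "submit, manage, kill" for one workflow job.
function oozieRequests(base, jobId) {
  const root = base.replace(/\/+$/, "") + "/v1";
  return {
    // GET: list workflow jobs, first 50.
    list: { method: "GET", url: root + "/jobs?jobtype=wf&len=50" },
    // GET: status and actions of one job.
    info: { method: "GET", url: root + "/job/" + jobId + "?show=info" },
    // POST: submit a job; the request body is the XML job configuration.
    submit: { method: "POST", url: root + "/jobs", contentType: "application/xml" },
    // PUT: change the state of a job, here kill it.
    kill: { method: "PUT", url: root + "/job/" + jobId + "?action=kill" },
  };
}

// Hypothetical server and job ID.
const requests = oozieRequests(
  "http://oozie.example.com:11000/oozie/",
  "0000001-130101000000000-oozie-oozi-W"
);

for (const [name, request] of Object.entries(requests)) {
  console.log(name, request.method, request.url);
}
```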
Integrate Shell
● Pig
● HBase
● Sqoop 2

●   Spawning Server
●   Greenlets
●   popen/pty/tty
●   IO (HTTP, DB...)
●   setuid
●   css/js/POST
Integrate YARN
● JobBrowser MR2, Oozie

● No JT, 4 more REST APIs
● MR to History Server, missing logs...
● MR1/2 API not 100% compatible
  (like Beeswax/HiveServer2, Beeswax
  UI/Impala switches)
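
Without a JobTracker, where to ask about an MR2 job depends on its state. A rough sketch of that routing, assuming the stock YARN REST paths (ResourceManager `/ws/v1/cluster/apps`, the MapReduce ApplicationMaster reached through the ResourceManager proxy, and the History Server). The hosts are hypothetical, 8088 and 19888 are only the default ports, and `appsUrl`, `toJobId`, and `mrJobUrl` are illustrative names, not JobBrowser code.

```javascript
// States in which the application has ended and the ApplicationMaster is gone.
const FINAL_STATES = ["FINISHED", "FAILED", "KILLED"];

// ResourceManager endpoint that lists applications.
function appsUrl(cluster) {
  return cluster.resourceManager + "/ws/v1/cluster/apps";
}

// application_<timestamp>_<seq> and job_<timestamp>_<seq> share the same suffix.
function toJobId(applicationId) {
  return applicationId.replace(/^application_/, "job_");
}

// Pick the REST endpoint that can describe the MapReduce job right now.
function mrJobUrl(cluster, app) {
  const jobId = toJobId(app.id);
  if (app.state === "RUNNING") {
    // Running jobs are served by their ApplicationMaster, via the RM proxy.
    return cluster.resourceManager + "/proxy/" + app.id + "/ws/v1/mapreduce/jobs/" + jobId;
  }
  if (FINAL_STATES.includes(app.state)) {
    // Ended jobs are served by the MapReduce History Server.
    return cluster.historyServer + "/ws/v1/history/mapreduce/jobs/" + jobId;
  }
  // Not started yet: only the ResourceManager knows about the application.
  return null;
}

const cluster = {
  resourceManager: "http://rm.example.com:8088",
  historyServer: "http://history.example.com:19888",
};

console.log(appsUrl(cluster));
console.log(mrJobUrl(cluster, { id: "application_1326821518301_0005", state: "RUNNING" }));
console.log(mrJobUrl(cluster, { id: "application_1326821518301_0005", state: "FINISHED" }));
```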
Integrate security
● 'hue' superuser
  JT, Shell setuid root:hue
● One 'hue' Kerberos ticket
● Hive Server 2 ?
● 'hue' Proxy User / doAs
  HDFS
  Oozie
      <property>
        <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
        <value>*</value>
      </property>
SDK: Integrate Developers
● Set of raw libs
libs
    /hadoop
            /jobtracker
            /webhdfs
            /yarn
    /liboozie
    /rest
    /thrift

● Hue models
apps/
    /jobbrowser
    /oozie
    /...
SDK: Integrate Developers
$ ./build/env/bin/hue create_desktop_app clouddemo


● Custom: views/model/templates
● Reuse Hue libs

http://cloudera.github.com/hue/docs-2.1.0/sdk/sdk.html#fast-guide-to-creating-a-new-hue-application
CloudDemo example
Single click:

●   HTTP
●   HDFS
●   Oozie
●   JT
After the Interfaces...

... now the dynamic UI
    (Oozie App use case)
Why Improve User Experience
● Users like things that are easy to use
● Intuition and ease of use
How to Improve User Experience
● How can we do this for Oozie?
  ○ Hue users are not engineers
  ○ Most users are not familiar with shortcuts and
    command lines
  ○ Windowing systems have taught us drag and drop is
    good




Drag and drop everything in a Workflow!
Old Hue Windowing System
Fundamentals of Front End Design
● Behavior
  ○ JavaScript
  ○ Knockout JS
  ○ jQuery
● Presentation
  ○ CSS
  ○ Bootstrap
● Content
  ○ HTML (Templates)
● MV*
  ○ MVC
  ○ MVP
  ○ MVVM
Design Constraints
● Existing backend from Hue 2.1
  ○ Need to be able to easily migrate from Hue 2.1 to
    Hue 2.2
● Knockout JS and jQuery already chosen
  ○   Rudimentary templating
  ○   Subscription based bindings
  ○   Observables for arrays and JavaScript literals only
  ○   Event delegation
● Existing UI from Hue 2.1
  ○ Provides basic node movement through form
    submission (reloads the page)
  ○ Not dynamic
Other Design Considerations
● Serializing should be trivial
● Basic API
   ○ Save a workflow
   ○ Validate a node
   ○ Read a workflow
● Difference in representation between Hue
  2.1 backend and the KnockoutJS way of
  doing things
● New nodes need an ID
Design - High Level Components




● Left out
  ○ Many event bindings and custom events
  ○ Views left out
Purpose of the Node Model
● Provides defaults for data:
var NodeModel = ModelModule($);
$.extend(NodeModel.prototype, {
  id: 0,
  name: '',
  description: '',
  node_type: '',
  workflow: 0,
  child_links: []
});

● Sent over the wire
● Mimics Django models
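
One JavaScript caveat with prototype defaults: scalar defaults are safe, but an array literal on the prototype is a single object shared by every instance until an instance assigns its own. The sketch below uses plain `Object.assign` in place of `$.extend` and a hand-written constructor in place of `ModelModule` (whose internals the slides do not show, so it may already handle this). It gives each node its own `child_links`.

```javascript
// Plain constructor standing in for the model produced by ModelModule($).
function NodeModel(attrs) {
  // Arrays are created per instance so two nodes never share one list.
  this.child_links = [];
  Object.assign(this, attrs);
}

// Scalar defaults can live on the prototype; instances shadow them on write.
Object.assign(NodeModel.prototype, {
  id: 0,
  name: "",
  description: "",
  node_type: "",
  workflow: 0,
});

const fork = new NodeModel({ id: 7, node_type: "fork" });
const join = new NodeModel({ id: 8, node_type: "join" });

// Linking fork -> join must not leak into join's own list.
fork.child_links.push({ parent: fork.id, child: join.id });

console.log(fork.name === "");        // default read from the prototype
console.log(fork.child_links.length); // links added to this node only
console.log(join.child_links.length); // untouched
```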
Model - ModelView Separation
● ModelViews should be the "shield" and
  Models the source of truth.
● Models are more serializable if they do not
  carry extraneous data.
● Subscribed update through KnockoutJS:
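// Mirror every observable on the ModelView (self) back into the plain model.
$.each(mapping, function(key) {
    if (ko.isObservable(self[key])) {
         // Each callback closes over its own key, so no extra copy is needed.
         self[key].subscribe(function(newValue) {
             model[key] = ko.mapping.toJS(newValue);
         });
    }
});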
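
The same separation can be sketched without KnockoutJS. Here `observable` is a tiny hand-rolled stand-in for `ko.observable` (an assumption for illustration; it notifies on every write and has none of Knockout's dependency tracking), and `makeModelView` is a hypothetical helper. The point is that view-only state stays on the ModelView while the model stays small and serializable.

```javascript
// Minimal stand-in for a Knockout observable: call with no arguments to read,
// with one argument to write and notify subscribers.
function observable(initial) {
  let current = initial;
  const subscribers = [];
  function accessor(...args) {
    if (args.length === 0) {
      return current;
    }
    current = args[0];
    subscribers.forEach((callback) => callback(current));
  }
  accessor.subscribe = (callback) => {
    subscribers.push(callback);
  };
  return accessor;
}

// Wrap each model field in an observable and mirror writes back to the model.
function makeModelView(model) {
  const view = {};
  Object.keys(model).forEach((key) => {
    view[key] = observable(model[key]);
    view[key].subscribe((value) => {
      model[key] = value;
    });
  });
  // View-only state lives here and never reaches the serialized model.
  view.selected = observable(false);
  return view;
}

const model = { id: 1, name: "" };
const view = makeModelView(model);

view.name("wordcount"); // UI edit, pushed into the model by the subscription
view.selected(true);    // UI state, stays on the ModelView

console.log(JSON.stringify(model));
```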
Purpose of the Registry
●   Construction optimization
●   Constant time node lookup
●   Looking towards the future and storage
●   Simple start:
    var self = this;
    self.nodes = {};
    module.prototype.initialize.apply(self, arguments);
    return self;
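
A minimal sketch of how such a registry could grow from the `self.nodes = {}` start, keyed by node ID so lookups do not scan the workflow. The method names (`add`, `get`, `remove`) are assumptions for illustration, not the Hue implementation.

```javascript
// Registry of nodes keyed by ID; a plain object gives constant-time lookup.
function Registry() {
  this.nodes = {};
}

// Store a node under its ID and return it for chaining.
Registry.prototype.add = function (node) {
  this.nodes[node.id] = node;
  return node;
};

// Return the node for an ID, or null when it is not registered.
Registry.prototype.get = function (id) {
  return Object.prototype.hasOwnProperty.call(this.nodes, id) ? this.nodes[id] : null;
};

// Forget a node, for example after it is deleted from the workflow.
Registry.prototype.remove = function (id) {
  delete this.nodes[id];
};

const registry = new Registry();
registry.add({ id: "mapreduce:1", name: "wordcount" });
registry.add({ id: "fork:1", name: "split" });

console.log(registry.get("mapreduce:1").name);
registry.remove("fork:1");
console.log(registry.get("fork:1"));
```

The `hasOwnProperty` check keeps IDs such as `toString` from resolving to inherited object members.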
Purpose of ID Generation
● Unique identifier for new nodes (e.g. mapreduce:1).
● Assists in creating parent-child relationships through links.
var IdGeneratorModule = function($) {
   return function(options) {
      var self = this;
      $.extend(self, options);
      self.counter = 1;
      self.nextId = function() {
         return ((self.prefix) ? self.prefix + ':' : '') + self.counter++;
      };
   };
};
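
A usage sketch of the same idea in plain JavaScript, with `Object.assign` standing in for `$.extend` and the jQuery module wrapper dropped. The `link` object is a hypothetical shape showing how generated IDs can tie a parent to a child before either node has a database ID.

```javascript
// Same logic as the generator above, without the jQuery dependency.
function IdGenerator(options) {
  Object.assign(this, options);
  this.counter = 1;
}

// Return "<prefix>:<n>" when a prefix is set, otherwise just the counter as a string.
IdGenerator.prototype.nextId = function () {
  return (this.prefix ? this.prefix + ":" : "") + this.counter++;
};

const mapreduceIds = new IdGenerator({ prefix: "mapreduce" });
const first = mapreduceIds.nextId();
const second = mapreduceIds.nextId();

// Temporary IDs are enough to describe a parent-child link on the client.
const link = { parent: first, child: second };

// Without a prefix the generator falls back to a bare counter.
const plain = new IdGenerator().nextId();

console.log(first, second, plain);
console.log(link);
```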
Transpose to Show
● KnockoutJS supports 3 kinds of observables
    ○ Observables for literals
    ○ Observable arrays
    ○ Computed Observables
●   DAG received is represented as a tree




● DAG represented as a list of lists when we display...
    MVVM restriction
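
The slides do not show how the tree becomes a list of lists, so the following is only one plausible approach: layer the DAG so that each row holds the nodes whose parents have all been placed in earlier rows. `toRows` and the `children` map are hypothetical. The sketch assumes an acyclic workflow with a single start node, and nodes caught in a cycle would simply never be emitted.

```javascript
// Turn a DAG, given as { nodeId: [childIds] }, into rows for display.
function toRows(children, startId) {
  // Count incoming links for every node.
  const pending = {};
  for (const id of Object.keys(children)) {
    pending[id] = pending[id] || 0;
    for (const child of children[id]) {
      pending[child] = (pending[child] || 0) + 1;
    }
  }

  const rows = [];
  let frontier = [startId];
  while (frontier.length > 0) {
    rows.push(frontier);
    const next = [];
    for (const id of frontier) {
      for (const child of children[id] || []) {
        // A child joins the next row once its last parent has been placed.
        pending[child] -= 1;
        if (pending[child] === 0) {
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  return rows;
}

// Fork/join workflow: the two branches end up side by side in one row.
const workflow = {
  start: ["fork"],
  fork: ["pig", "hive"],
  pig: ["join"],
  hive: ["join"],
  join: ["end"],
  end: [],
};

console.log(JSON.stringify(toRows(workflow, "start")));
```

Each inner list could then back one observable array, which fits the observable kinds listed above.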
Other Difficulties
● Decision node representation
● JSON.stringify does not include parent class
  members
● Memory consumption
● Cycles, cycles, cycles
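
Two of these difficulties are easy to reproduce in plain JavaScript. `JSON.stringify` serializes only an object's own enumerable properties, so defaults inherited from a prototype disappear, and it throws on circular references such as a child pointing back to its parent. The `flatten` helper is a hypothetical workaround, and whether "cycles" on the slide means reference cycles or cycles in the workflow graph is not stated.

```javascript
function Base() {}
Base.prototype.node_type = "mapreduce"; // inherited default

const node = new Base();
node.name = "wordcount"; // own property

// Only own properties survive: node_type is silently dropped.
const naive = JSON.stringify(node);

// for...in also visits enumerable inherited properties, so copy them first.
function flatten(obj) {
  const out = {};
  for (const key in obj) {
    if (typeof obj[key] !== "function") {
      out[key] = obj[key];
    }
  }
  return out;
}
const full = JSON.stringify(flatten(node));

// A parent/child back-reference forms a cycle that JSON.stringify rejects.
const parent = { id: 1, children: [] };
const child = { id: 2, parent: parent };
parent.children.push(child);

let cycleError = null;
try {
  JSON.stringify(parent);
} catch (error) {
  cycleError = error;
}

console.log(naive);
console.log(full);
console.log(cycleError instanceof TypeError);
```

Storing links as IDs rather than object references is one way to sidestep the second problem.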
Next steps
● Integrate
  ○   Pig, Hive Server 2
  ○   Oozie Bundles, SLA
  ○   Document model, "Editors", git
  ○   SDK revamp, language agnostic, proxy app
● UX
  ○ Impala real time UI
  ○ Redesign overall layout
● Sqoop 2, HBase? Mahout?...


              Face of Hadoop/CDH

A closer look at hue: how to interface with Hadoop
