2. What's on the Menu
● Hue Architecture
○ Many interfaces to implement
○ How do I list HDFS files, how do I submit a job...?
○ SDK
● Hue UI: Dynamic Workflow Editor
○ Why improve the user experience?
○ How can we improve the user experience?
○ Design Considerations
○ Design and Code Deep Dive
5. Integrate with the Web
● HTTP, stateless (async queries)
● Frontend / Backend (e.g. different servers, pagination)
● Resources (e.g. img, js, callbacks, css, json)
● Browsers, multi techs
● DB (SQLite, MySQL, PostgreSQL...)
● i18n
● ...
More on UI later
14. Integrate YARN
● JobBrowser MR2, Oozie
● No JT, 4 more REST APIs
● MR to History Server, missing logs...
● MR1/2 API not 100% compatible (like Beeswax/HiveServer2, Beeswax UI/Impala switches)
15. Integrate security
● 'hue' superuser
○ JT, Shell setuid root:hue
● One 'hue' Kerberos ticket
● Hive Server 2 ?
● 'hue' Proxy User / doAs
○ HDFS
○ Oozie
<property>
<name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
<value>*</value>
</property>
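With proxy-user settings like the ones above, the 'hue' service user can act on behalf of the logged-in user. For WebHDFS this is expressed with the `doas` query parameter. The following is a minimal sketch only: the helper name `webhdfsUrl`, the host, the port, and the path are hypothetical, and `user.name` applies to simple (non-Kerberos) authentication.

```javascript
// Hypothetical helper: builds a WebHDFS URL in which the 'hue' service
// user impersonates the end user through the `doas` query parameter.
function webhdfsUrl(base, path, op, endUser) {
  var params = [
    'op=' + encodeURIComponent(op),
    'user.name=hue',                        // the proxy (service) user, simple auth only
    'doas=' + encodeURIComponent(endUser)   // the user being impersonated
  ];
  return base + '/webhdfs/v1' + path + '?' + params.join('&');
}

// Host, port, and path below are placeholders for illustration.
var listUrl = webhdfsUrl('http://namenode.example.com:50070', '/user/alice', 'LISTSTATUS', 'alice');
console.log(listUrl);
```

The request is still sent by 'hue'; HDFS checks the proxy-user hosts and groups settings before applying alice's permissions.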
16. SDK: Integrate Developers
● Set of raw libs
libs
  /hadoop
  /jobtracker
  /webhdfs
  /yarn
  /liboozie
  /rest
  /thrift
● Hue models
apps/
  /jobbrowser
  /oozie
  /...
20. Why Improve User Experience
● Users like things that are easy to use
● Intuition and ease of use
21. How to Improve User Experience
● How can we do this for Oozie?
○ Hue users are not engineers
○ Most users are not familiar with shortcuts and command lines
○ Windowing systems have taught us drag and drop is good
Drag and drop everything in a Workflow!
23. Fundamentals of Front End Design
● Behavior
○ JavaScript
○ Knockout JS
○ jQuery
● Presentation
○ CSS
○ Bootstrap
● Content
○ HTML (Templates)
● MV*
○ MVC
○ MVP
○ MVVM
24. Design Constraints
● Existing backend from Hue 2.1
○ Need to be able to easily migrate from Hue 2.1 to Hue 2.2
● Knockout JS and jQuery already chosen
○ Rudimentary templating
○ Subscription based bindings
○ Observables for arrays and JavaScript literals only
○ Event delegation
● Existing UI from Hue 2.1
○ Provides basic node movement through form submission (reloads the page)
○ Not dynamic
25. Other Design Considerations
● Serializing should be trivial
● Basic API
○ Save a workflow
○ Validate a node
○ Read a workflow
● Difference in representation between the Hue 2.1 backend and the KnockoutJS way of doing things
● New nodes need an ID
26. Design - High Level Components
● Left out
○ Many event bindings and custom events
○ Views left out
27. Purpose of the Node Model
● Provides defaults for data:
  var NodeModel = ModelModule($);
  // Defaults live on the prototype; instances override them as needed.
  $.extend(NodeModel.prototype, {
    id: 0,
    name: '',
    description: '',
    node_type: '',
    workflow: 0,
    child_links: []
  });
● Sent over the wire
● Mimics Django models
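A minimal, self-contained sketch of the idea, not the Hue implementation: `ModelModule` here is a stand-in written for illustration, `Object.assign` replaces `$.extend`, and `toWire` is a hypothetical helper. Because an array default on a prototype is shared by every instance, the sketch copies arrays when it builds the payload.

```javascript
// Stand-in for the slide's ModelModule($): returns a constructor whose
// instances copy any supplied attributes over the prototype defaults.
function ModelModule() {
  return function Model(attrs) {
    Object.assign(this, attrs);
  };
}

var NodeModel = ModelModule();
Object.assign(NodeModel.prototype, {
  id: 0,
  name: '',
  description: '',
  node_type: '',
  workflow: 0,
  child_links: []
});

// Hypothetical helper: flattens defaults plus instance values into a plain
// object, copying arrays so the result never aliases the prototype's array.
function toWire(model) {
  var out = {};
  for (var key in model) {               // for...in also visits prototype defaults
    var value = model[key];
    if (typeof value === 'function') { continue; }
    out[key] = Array.isArray(value) ? value.slice() : value;
  }
  return out;
}

var node = new NodeModel({ id: 'mapreduce:1', name: 'wordcount', node_type: 'mapreduce' });
var payload = JSON.stringify(toWire(node));
console.log(payload);
```

Fields that the caller did not set (`description`, `workflow`, `child_links`) still appear in the payload with their default values.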
28. Model - ModelView Separation
● ModelViews should be the "shield" and
Models the source of truth.
● Models are more serializable if they do not
carry extraneous data.
● Subscribed update through KnockoutJS:
  $.each(mapping, function(key, value) {
    if (ko.isObservable(self[key])) {
      // Mirror every change of the observable back onto the plain model.
      self[key].subscribe(function(newValue) {
        model[key] = ko.mapping.toJS(newValue);
      });
    }
  });
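The same separation can be sketched without any library. The `observable` function below is a small stand-in that imitates the read/write/subscribe style of a Knockout observable; it is not the Knockout API, and `NodeModelView` is a hypothetical name.

```javascript
// Minimal stand-in for a Knockout-style observable (NOT Knockout itself):
// call with no argument to read, with one argument to write.
function observable(initial) {
  var current = initial;
  var subscribers = [];
  function accessor(next) {
    if (arguments.length === 0) { return current; }
    current = next;
    subscribers.forEach(function(callback) { callback(current); });
  }
  accessor.subscribe = function(callback) { subscribers.push(callback); };
  return accessor;
}

// Hypothetical ModelView: wraps each model field in an observable and
// writes every change back to the plain model.
function NodeModelView(model) {
  var self = this;
  Object.keys(model).forEach(function(key) {
    self[key] = observable(model[key]);
    self[key].subscribe(function(newValue) {
      model[key] = newValue;
    });
  });
}

var plainModel = { id: 'mapreduce:1', name: '' };
var modelView = new NodeModelView(plainModel);
modelView.name('wordcount');     // the UI writes through the ModelView...
console.log(plainModel.name);    // ...and the plain model follows
```

The model stays a plain object with no observables attached, so it can be passed to JSON.stringify directly.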
29. Purpose of the Registry
● Construction optimization
● Constant time node lookup
● Looking towards the future and storage
● Simple start:
  var self = this;
  self.nodes = {};
  module.prototype.initialize.apply(self, arguments);
  return self;
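A sketch of how such a registry could be used, assuming nodes are keyed by their ID in a plain object. The `Registry` constructor and its `add`, `get`, and `remove` methods are hypothetical names, not taken from Hue.

```javascript
// Hypothetical registry: nodes are stored in a plain object keyed by ID,
// so adding, fetching, and removing a node are each one property access.
function Registry() {
  var self = this;
  self.nodes = {};
  self.add = function(id, node) { self.nodes[id] = node; };
  self.get = function(id) {
    return Object.prototype.hasOwnProperty.call(self.nodes, id) ? self.nodes[id] : null;
  };
  self.remove = function(id) { delete self.nodes[id]; };
}

var registry = new Registry();
registry.add('mapreduce:1', { name: 'wordcount' });
console.log(registry.get('mapreduce:1').name);
console.log(registry.get('pig:7'));   // unknown IDs yield null
```

Lookup by key avoids walking the workflow graph to find a node, which is where the constant-time claim comes from (on average, for a hash-backed object).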
30. Purpose of ID Generation
● Unique identifier for new nodes (e.g. mapreduce:1).
● Assists in creating parent-child relationships through links.
  var IdGeneratorModule = function($) {
    return function(options) {
      var self = this;
      $.extend(self, options);
      self.counter = 1;
      self.nextId = function() {
        return ((self.prefix) ? self.prefix + ':' : '') + self.counter++;
      };
    };
  };
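A usage sketch of the generator, with `Object.assign` standing in for `$.extend` so that it runs without jQuery. The shape of the `link` record is an assumption made for illustration.

```javascript
// Same logic as the generator on the slide, without the jQuery dependency.
function IdGenerator(options) {
  var self = this;
  Object.assign(self, options);
  self.counter = 1;
  self.nextId = function() {
    return ((self.prefix) ? self.prefix + ':' : '') + self.counter++;
  };
}

var mapreduceIds = new IdGenerator({ prefix: 'mapreduce' });
var firstId = mapreduceIds.nextId();
var secondId = mapreduceIds.nextId();

// Hypothetical link record: the field names are illustrative only.
var link = { parent: firstId, child: secondId };
console.log(firstId, secondId);
```

Because unsaved nodes have no database ID yet, the temporary ID is what lets a link refer to them before the workflow is saved.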
31. Transpose to Show
● KnockoutJS supports 3 kinds of observables
○ Observables for literals
○ Observable arrays
○ Computed Observables
● DAG received is represented as a tree
● DAG represented as a list of lists when we display... MVVM restriction
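One way to picture the transposition, as a sketch under stated assumptions rather than the Hue algorithm: each node is assumed to carry an `id` and a `children` array, a join node is the same object under each of its branches, and the graph is acyclic. Each node is placed in the row matching its deepest distance from the root.

```javascript
// Hypothetical transposition: turns a tree-shaped DAG into rows of node IDs.
// Assumes the graph is acyclic; a cycle would recurse without end.
function toRows(root) {
  var depthById = {};
  (function visit(node, depth) {
    // Keep the deepest depth so a join lands below all of its parents.
    if (!(node.id in depthById) || depth > depthById[node.id]) {
      depthById[node.id] = depth;
      node.children.forEach(function(child) { visit(child, depth + 1); });
    }
  })(root, 0);

  var rows = [];
  Object.keys(depthById).forEach(function(id) {
    var depth = depthById[id];
    (rows[depth] = rows[depth] || []).push(id);
  });
  return rows;
}

var endNode = { id: 'end', children: [] };
var joinNode = { id: 'join', children: [endNode] };
var branchA = { id: 'a', children: [joinNode] };
var branchB = { id: 'b', children: [joinNode] };
var forkNode = { id: 'fork', children: [branchA, branchB] };
var startNode = { id: 'start', children: [forkNode] };

var rows = toRows(startNode);
console.log(JSON.stringify(rows));
```

Each inner list could then back one observable array, so a row of the editor re-renders when nodes are dropped into it.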
32. Other Difficulties
● Decision node representation
● JSON.stringify does not include parent class (prototype) members
● Memory consumption
● Cycles, cycles, cycles
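Two of these difficulties can be reproduced in a few lines. The names `Base`, `flatten`, `forkNode`, and `actionNode` are made up for this sketch, and `flatten` is only one possible workaround.

```javascript
function Base() {}
Base.prototype.node_type = 'mapreduce';   // inherited default

var node = new Base();
node.name = 'wordcount';                  // own property

// Only own enumerable properties are serialized; node_type is dropped.
var ownOnly = JSON.stringify(node);
console.log(ownOnly);

// Hypothetical workaround: copy inherited enumerable values onto a plain object first.
function flatten(obj) {
  var out = {};
  for (var key in obj) {
    if (typeof obj[key] !== 'function') { out[key] = obj[key]; }
  }
  return out;
}
var withInherited = JSON.stringify(flatten(node));
console.log(withInherited);

// A back-reference from child to parent makes the structure circular,
// which JSON.stringify rejects with a TypeError.
var forkNode = { id: 'fork:1', children: [] };
var actionNode = { id: 'mapreduce:2', parent: forkNode };
forkNode.children.push(actionNode);

var cycleError = null;
try {
  JSON.stringify(forkNode);
} catch (e) {
  cycleError = e;
}
console.log(cycleError instanceof TypeError);
```

Storing IDs instead of object references in links sidesteps the circular-structure error, at the cost of a lookup when the link is followed.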
33. Next steps
● Integrate
○ Pig, Hive Server 2
○ Oozie Bundles, SLA
○ Document model, "Editors", git
○ SDK revamp, language agnostic, proxy app
● UX
○ Impala real time UI
○ Redesign overall layout
● Sqoop 2, HBase? Mahout?...
Face of Hadoop/CDH