Apache	Hadoop	Crash	Course
Rafael	Coss
Data	Evangelist	
@racoss
#FutureOfData
2 ©	Hortonworks	Inc.	2011	–2016.	All	Rights	Reserved
Agenda
Future	of	Data
Traditional	Data	Architectures
What’s	Apache	Hadoop?
Data	Access	with	Hadoop
Lab	Intro
Customers	are	building	Modern	Data	Applications	to	transform	their	industries	–
renovating	their	IT	architectures	and	innovating	with	their	Data	in	Motion	
or	Data	at	Rest	to	power	actionable	intelligence.
Social Mapping
Payment Tracking
Factory Yields
Defect Detection
Call Analysis
Machine Data
Product Design
M & A Due Diligence
Next Product Recs
Cyber Security
Risk Modeling
Ad Placement
Proactive Repair
Disaster Mitigation
Investment Planning
Inventory Predictions
Customer Support
Sentiment Analysis
Supply Chain
Basket Analysis
Segments
Cross-Sell
Customer Retention
Vendor Scorecards
Optimize Inventories
OPEX Reduction
Mainframe Offloads
Historical Records
Data as a Service
Public Data Capture
Fraud Prevention
Device Data Ingest
Rapid Reporting
Digital Protection
Future	of	Data
INTERNET OF ANYTHING
The Future of Data is about actionable intelligence derived from a constantly connected society with easy, secure access to rich data sets coming from the Internet of Anything
Data Powers Highway Safety
Tire Pressure, Server Log, Mobile, Sensor, Location, Precipitation, Social, Click-stream
New	Data	Paradigm	Opens	Up	New	Opportunity
2.8	zettabytes
in	2012
44	zettabytes
in	2020
NEW
1 zettabyte (ZB) = 1 million petabytes (PB); Sources: IDC, IDG Enterprise, and AMR Research
Clickstream
ERP,	CRM,	SCM
Web	&	social
Geolocation
Internet	of	Things
Server	logs
Files,	 emails
Transform	every	industry	via	
full	fidelity	of	data	and	analytics
Opportunity
TRADITIONAL
LAGGARDS
LEADERS
Ability	to	
Consume	Data
Enterprise	
Blind	Spot
What	disrupted	the	data	center?
?
Data?
Modern	Data	Applications
Polyglot Persistence
SQL
NoSQL
NewSQL
Search
Graph
At-Rest In-Motion
Analytics
Data	Variety
Integration
Data	Lake Federation
Optimization
Storage,	Compute
Distributed	Computing
Commodity	Hardware
Cloud
Hybrid	Distributed	Computing
The	Future	of	Data
Actionable	Intelligence
DATA IN MOTION
DATA AT REST
INTERNET OF ANYTHING
Connected Data Platforms are powering Actionable Intelligence
Any and all data from sensors, machines, geolocation, clicks, files, social.
Secure point-to-point and bi-directional data flows
Collect and curate all data.
Traditional	Data	Architectures
Systems of Intelligence
Systems of Engagements
Systems of Interactions
Data Systems
Systems of Record
Systems of Insight
Events in Gray
Analytics in Green
Operators
Developers
RDBMS
Sales
NoSQL
Unstructured
Visualization & Dashboards
Business Analytics
Data Marts
Archive
Statistics
OLAP
EDW
File Server
Clickstream Logs
Web & Social Logs
Audio
Video
Logs
Geolocation
JSON
ETL
POS, CRM, ERP
ECM
Filter
App Server
Message Bus
Documents
Traditional Data Architecture Challenges with Big Data
→ Too expensive and slow as data growth keeps accelerating
→ Too slow to get the data prepared for analytics
→ Analytics is only leveraging a limited data set
→ Cold data becomes archived and is no longer usable for analytics
→ Data ingest is rigid and slow for new IoAT data types
→ Limited real time insights
Next Generation Analytics
Iterative & Exploratory
Data is the structure
IT Team
Delivers Data
On Flexible
Platform
Business
Users
Explore and
Ask Any Question
Analyze ALL Available Information
Whole population analytics connects the dots
Traditional Analytics
Structured & Repeatable
Structure built to store data
Business
Users
Determine
Questions
IT Team
Builds System
To Answer
Known Questions
Available Information
Analyzed
Information
Capacity constrained down sampling of available
information
Carefully cleanse all information before any analysis
Analyzed
Information
Analyze information as is & cleanse as needed
Analyzed
Information
Modern	Data	Applications
Next Generation Analytics
Iterative & Exploratory
Data is the structure
Traditional Analytics
Structured & Repeatable
Structure built to store data
?
Analyzed
Information
Question
DataAnswer
Hypothesis
Start	with	hypothesis
Test	against	selected	data
Data leads the way
Explore all data, identify correlations
Data
Correlation
All Information
Exploration
Actionable Insight
Analyze	after	landing… Analyze	in	motion…
Modern	Data	Applications	Has	Two	Themes
What’s	Apache	Hadoop?
Hadoop	Architecture
Data	Access	Engines
Distributed	Reliable	Storage
Distributed	Compute	Framework
Resource Management, Data Locality
Data Operating System
Batch Interactive Real-time
Governance
&
Integration
Security
Applications
Deploy	Anywhere
Hadoop	Data	Platform	Architecture
Store	and	process	all	of	your	Corporate	Data	Assets
YARN:	Data	Operating	System
DATA MANAGEMENT
SECURITY: Provide a layered approach to security through Authentication, Authorization, Accounting, and Data Protection
DATA ACCESS: Access your data simultaneously in multiple ways (batch, interactive, real-time)
GOVERNANCE & INTEGRATION: Load data and manage according to policy
ENTERPRISE MGMT & SECURITY: Empower existing operations and security tools to manage Hadoop
PRESENTATION & APPLICATION: Enable both existing and new applications to provide value to the organization
DEPLOYMENT OPTIONS: Provide deployment choice across on-premise, appliance, virtualized, cloud
OPERATIONS: Deploy and effectively manage the platform
runs	on
ETL
RDBMS	Import/Export
Distributed	Storage	&	Processing	Framework
Secure	NoSQL DB
SQL	on	HBase
NoSQL DB
Workflow	Management
SQL
Streaming	Data	Ingestion
Cluster	System	Operations
Secure	Gateway
Distributed	Registry
ETL
Search	&	Indexing
Even	Faster	Data	Processing
Data	Management
Machine	Learning
Hadoop	Ecosystem
Open	Enterprise	Hadoop	Capabilities
YARN: Data Operating System
DATA MANAGEMENT
HDFS: Hadoop Distributed File System
DATA ACCESS
Batch: MapReduce
Script: Pig
Search: Solr
SQL: Hive
NoSQL: HBase, Accumulo, Phoenix
Stream: Storm
In-memory: Spark
Others: ISV Engines
Tez, Slider
GOVERNANCE & INTEGRATION
Data Lifecycle & Governance: Falcon, Atlas
Data Workflow: Sqoop, Flume, Kafka, NFS, WebHDFS
SECURITY
Administration, Authentication, Authorization, Auditing, Data Protection: Ranger, Knox, Atlas, HDFS Encryption
OPERATIONS
Provisioning, Managing, & Monitoring: Ambari, Cloudbreak, Zookeeper
Scheduling: Oozie
Hortonworks Data Platform
Deployment Choice: Linux, Windows, On-Premise, Cloud
HORTONWORKS	 DATA	PLATFORM
DATA	MGMT
HDP	2.2
Dec	2014
HDP	2.1
April	2014
HDP	2.0
Oct	2013
2.2.0
2.4.0
2.6.0
Ongoing	Innovation	in	Apache
HDFS
YARN
MapReduce
Hadoop	Core
What	is	Apache	Hadoop?
Yahoo!
2006
Hortonworks	
Oct	2011
Yahoo! starts to focus on multiple Hadoop apps & clusters
Contributes	Hadoop	to	Apache
2008
HDP	1.0
Oct	2012
Apache	Hadoop	v2	YARN
Google	publishes	GFS	&	MapReduce papers
2004-2005
HDP 2.4
March	2016
2.7.1
HDP	2.2
Dec	2014
HDP	2.3
July	2015
2.7.1
/directory/structure/in/memory.txt
Resource management + scheduling
Disk, CPU, Memory
Core
NameNode
HDFS
ResourceManager
YARN
Hadoop daemon
User application
NN
RM
DataNode
HDFS
NodeManager
YARN
Worker Node
HDFS:	Scalable,	Reliable	and	Secure	Storage	Platform
The	Storage	Platform	for	Hadoop	2.0
Scalable	
Horizontally	grow	as	data	volumes	grow,	adding	
one	or	multiple	nodes	at	a	time
Reliable
Highly	available	(HA)	and	fault	tolerant	to
protect	against	data	loss	and	corruption
Cost	Effective
Leverage	Commodity	Hardware
Cross	workload	access
Secure
Strong	access	controls,	integrated	with	
authentication	mechanisms
Granular	data	access	controls	to	datasets	across	
users	and	groups
Protects	data	over	the	wire	and	at	rest
HDFS
YARN: Data Operating System
Standards Based
Data Interfaces
NFS
Source /
Destination
REST
RPC
Source /
Destination
Source /
Destination
Ingest	and	store	any	data	in	any	format
Flexible	read	access	 enables	a	variety	of	work	loads
Heterogeneous Storage
Before
• DataNode is a single storage
• Storage is uniform: the only storage type is Disk
• Storage types hidden from the file system
New Architecture
• DataNode is a collection of storages
• Support different types of storages
– Disk, SSDs, Memory
All	disks	as	a	single	storage
S3
Swift
SAN
Filers
Collection	of	tiered	storages
Hadoop Distributed File System (HDFS)
Fault Tolerant Distributed Storage
• Divide files into big blocks and distribute 3 copies randomly across the cluster
• Processing	Data	Locality
• Not	Just	storage	but	computation
[Figure: a logical file is divided into blocks 1 to 4, and each block is stored three times on different nodes of the cluster.]
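The block-and-replica idea on this slide can be sketched in a few lines of Python. This is a toy model, not HDFS code: the function names, the 256-byte block size, and the five node names are all made up for illustration, and placement here is purely random (as the slide describes), whereas a real cluster also takes rack layout into account.

```python
import random


def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication=3, seed=0):
    """Pick `replication` distinct nodes for every block, uniformly at random."""
    rng = random.Random(seed)
    return {block_id: rng.sample(nodes, replication) for block_id in range(num_blocks)}


data = bytes(1000)                                # stand-in for a file's contents
blocks = split_into_blocks(data, block_size=256)  # hypothetical tiny block size
nodes = [f"datanode{i}" for i in range(1, 6)]     # hypothetical five-node cluster
placement = place_replicas(len(blocks), nodes)

for block_id, holders in placement.items():
    print(f"block {block_id} -> {holders}")
```

Because every block lives on three different nodes, losing any single node still leaves two copies of each block it held.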
Batch Processing in Hadoop
MapReduce
Batch Access to Data
Original data access mechanism for Hadoop
• Framework
Made	for	developing	distributed	applications	to	
process	vast	amounts	of	data	in-parallel	on	large	
clusters
• Proven
Reliable interface to Hadoop which works from GB to PB. But it is batch oriented: speed is not its strong point.
• Ecosystem
Ported	to	Hadoop	2	to	run	on	YARN.		Supports	original	
investments	in	Hadoop	by	customers	and	partner	
ecosystem.		
DataNode1
Mapper
Data	is	shuffled
across	the	network
&	sorted
Map	Phase Shuffle/Sort Reduce	Phase
MapReduce Job	Lifecycle
Saying that MapReduce is dead is preposterous
- Would limit us to only new workloads
- ALL Hadoop clusters use MapReduce
- Proven at Enterprise Scale
DataNode2
Mapper
DataNode3
Mapper
DataNode1
Reducer
DataNode2
Reducer
DataNode3
Reducer
YARN:	Data	Operating	System
Batch Interactive Real-Time
What is MapReduce?
Break a large problem into sub-solutions
Map
• Iterate over a large # of records
• Extract something of interest from
each record
Shuffle
• Sort Intermediate results
Reduce
• Aggregate, summarize, filter or
transform intermediate results
• Generate final output
[Figure: input data is read by parallel Map processes (Read & ETL), passed through Shuffle & Sort, and aggregated by Reduce processes into output data.]
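The three phases above can be imitated in plain Python with a word count, the customary first MapReduce example. This is a single-process sketch of the programming model only; it does not use the Hadoop API, and the function names and sample records are invented for illustration.

```python
from collections import defaultdict


def map_phase(records):
    """Map: iterate over records and emit a (word, 1) pair for every word."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group the intermediate values by key and sort the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: aggregate each key's values into the final output."""
    return {key: sum(values) for key, values in groups.items()}


records = ["big data on Hadoop", "Hadoop stores big files"]
word_counts = reduce_phase(shuffle_phase(map_phase(records)))
print(word_counts)
```

On a cluster, the map and reduce functions run on many nodes at once and the shuffle moves data across the network; the structure of the computation is the same.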
1st Gen	Hadoop:	Cost	Effective	Batch	at	Scale
HADOOP	1.0
Built	for	Web-Scale	Batch	Apps
Single	App
BATCH
HDFS
Single	App
INTERACTIVE
Single	App
BATCH
HDFS
Silos created for distinct use cases
Single App
BATCH
HDFS
Single	App
ONLINE
Hadoop	emerged	as	foundation	of	new	data	architecture
Apache	Hadoop	is	an	open	source	data	platform	for	managing	large	
volumes	of	high	velocity	and	variety	of	data
• Built	by	Yahoo!	to	be	the	heartbeat	of	its	ad	&	search	business
• Donated	to	Apache	Software	Foundation	in	2005	with	rapid	adoption	by	large	
web	properties	&	early	adopter	enterprises
• Incredibly	disruptive	to	current	platform	economics
Traditional	Hadoop	Advantages
✓ Manages new data paradigm
✓ Handles data at scale
✓ Cost effective
✓ Open source
Traditional	Hadoop	Had	Limitations
Batch-only	architecture	
Single	purpose	clusters,	specific	data	sets
Difficult	to	integrate	with	existing	investments
Not	enterprise-grade
Application
Storage
HDFS
Batch Processing
MapReduce
YARN	extends	Hadoop	into	data	center	leaders
YARN
The Architectural
Center of Hadoop
• Common data platform, many applications
• Support multi-tenant access & processing
• Batch, interactive & real-time use cases
• Supports 3rd-party ISV tools
(e.g., SAS, Syncsort, Actian, etc.)
YARN Ready Applications
Facilitates ongoing innovation and enterprise adoption via
ecosystem of new and existing “YARN Ready” solutions
YARN : Data Operating System
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
HDFS Hadoop Distributed File System
DATA MANAGEMENT
Batch
MapReduce
Script
Pig
Search
Solr
SQL
Hive
NoSQL
HBase
Accumulo
Phoenix
Stream
Storm
In-memory
Spark
Others
ISV Engines
Tez Tez Slider Slider
What do iOS 6 and Windows 3.1 have in common?
Hadoop	Beyond	Batch	with	YARN
Single Use System
Batch	Apps
Multi	Use	Data	Platform
Batch,	Interactive,	Online,	Streaming,	…
A	shift	from	the	old	to	the	new…
HADOOP 1
MapReduce
(cluster resource management
& data processing)
Data Flow
Pig
SQL
Hive
Others
API,
Engine,
and
System
YARN
(Data Operating System: resource management, etc.)
Data Flow
Pig
SQL
Hive
Other
ISV
Apache YARN as a Base
System
Engine
APIs
HDFS
(redundant, reliable storage)
HDFS
(redundant, reliable storage)
Batch
MapReduce
Tez Tez
MapReduce as the Base
HADOOP 2
Architecture	Enabled	by	YARN
A	single	set	of	data	across	the	entire	cluster	with	multiple	access	methods	
using	“zones”	for	processing	
SQL
Hive
Interactive	SQL	Query	
for	Analytics
Pig
Script-based	ETL
Algorithm	executed	in	batch	to	rework	
data	used	by	Hive	and	HBase	consumers
• Maximize compute
resources to lower TCO
• No standalone,
siloed clusters
• Simple management
& operations
…all enabled by YARN
Stream	Processing
Storm
Identify	&	act	on	
real-time	events
NoSQL
HBase
Accumulo
Low-latency	access	serving	up	
a	web	front	end
Hadoop	Workload	Evolution
Single	Use	System
Batch	Apps
Multi	Use	Data	Platform
Batch,	Interactive,	Online,	Streaming,	…
A	shift	from	the	old	to	the	new… Multi	Use	Platform
Data	&	Beyond
HADOOP 1
YARN
HADOOP 2
HDFS
(redundant, reliable storage)
HDFS
MapReduce
HADOOP.Next
YARN
HDFS
(redundant, reliable storage)
DATA ACCESS APPS
Docker
MySQL
MR2 Others
(ISV Engines)
Multiple
(Script, SQL, NoSQL, …)
MR2 Others
(ISV Engines)
Multiple
(Script, SQL, NoSQL, …)
Docker
Tomcat
Docker
Other
Gartner:	What	is	Hadoop?
→ Common Apache Projects
– ALL = 7 (6)
– Except for 1 = 3 (5)
– Except for 2 = 4 (4)
• About 14 Common Projects
→ Uncommon Projects
– Only 1 = 9 (1)
– Only 2 = 7 (2)
– Only 3 = 6 (3)
• About 22 Uncommon Projects
http://blogs.gartner.com/merv-adrian/2015/07/02/now-what-is-hadoop/
ODPi
HORTONWORKS DATA PLATFORM
Hadoop
&YARN
Flume
Oozie
HDP 2.3 is Apache Hadoop; not “based on” Hadoop
Pig
Hive
Tez
Sqoop
Cloudbreak
Ambari
Slider
Kafka
Knox
Solr
Zookeeper
Spark
Falcon
Ranger
HBase
Atlas
Accumulo
Storm
Phoenix
[Table: component versions by HDP release, grouped under DATA MGMT, DATA ACCESS, GOVERNANCE & INTEGRATION, OPERATIONS, and SECURITY. Releases shown: HDP 2.0 (Oct 2013), HDP 2.1 (April 2014), HDP 2.2 (Dec 2014), HDP 2.3 (Oct 2015), HDP 2.4 (Mar 2016), and HDP 2.5* (2H2016), with Zeppelin added as a component. Hadoop core versions shown: 2.2.0, 2.4.0, 2.6.0, 2.7.1.]
Ongoing Innovation in Apache
* HDP 2.5 – Shows current Apache branches being used. Final component version subject to change based on Apache release process.
Next	Generation	Data	Vendors	Investment	for	the	Enterprise
Vertical
Integration with
YARN and HDFS
Ensure engines can run
reliably and respectfully
in a YARN based
cluster
Provision,
Manage &
Monitor
Ambari
Zookeeper
Scheduling
Oozie
Load	data	and	
manage	
according	
to	policy
Provide	layered	
approach	to
security	through	
Authentication,	
Authorization,	
Accounting,	and	
Data	Protection
SECURITY
GOVERNANCE
Deploy	and	
effectively	
manage	 the	
platform
Script
Pig
SQL
Hive
Java
Scala
Cascading
Stream
Storm
Search
Solr
NoSQL
HBase
Accumulo
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
In-Memory
Spark
Others
ISV
Engines
YARN: Data Operating System
(Cluster	Resource	Management)
HDFS
(Hadoop Distributed File System)
Tez Slider SliderTez Tez
OPERATIONS
Horizontal Integration for Enterprise Services
Ensure consistent enterprise services are applied across the Hadoop stack
What	do	distributions	do?
→ Define a stack of components
• Rich and latest set of Apache Projects (open source & open community) without lock-in
→ Vertical and Horizontal integration of components
• Vertical: Best Speed and Scale
• Horizontal: Open Enterprise Ready
→ Provision and Upgrade stack
• Robust, Easy and Anywhere
→ Accelerate time to value (ease of use)
• New Face of Hadoop with UIs from Ambari, Ambari Views, Ranger, Falcon, Atlas
→ Partner Ecosystem
• Rich and Deep
→ Support
• Industry’s best, SmartSense, and community influence
Hadoop
Operations	&	Tools
How Do You Operate a Hadoop Cluster?
Apache™ Ambari is	a	platform	
to	provision,	manage	and	
monitor	Hadoop	clusters
Ambari Core Features and Extensibility
Install	&	Configure
Operate,	Manage	&	
Administer
Develop
Optimize	&	Tune
Developer
Data	Architect
Ambari provides core services for operations and development, and extension points for both
Extensibility	Features
Stacks,	Blueprints	&	REST	APIs
Core	Features
Install	Wizard	&	Web
Web,	Operator	Views,	
Metrics	&	Alerts
User	Views
User	Views
Views	Framework	&	REST	APIs
Views	Framework
Views	Framework
How?
Cluster	Admin
New	user	interface	enables	fast	&	
easy	SQL	definition	and	execution.
New User Views for DevOps
Capacity	Scheduler	View
Browse	and	manage	YARN	queues
Tez View
View	information	related	to	Tez jobs	that	
are	executing	on	the	cluster
New	User	Views	for	Development
Pig	View
Author	and	execute	Pig	Scripts.
Hive	View
Author,	execute	and	debug	Hive	
queries.
Files	View
Browse	HDFS	file	system.
Apache	Zeppelin
• Web-based	notebook	for	data	engineers,	data	
analysts	and	data	scientists
• Brings	interactive	data	ingestion,	data	
exploration,	visualization,	sharing	and	
collaboration	features	to	Hadoop	and	
Spark
• Modern	data	science	studio
• Scala	with	Spark
• Python	with	Spark
• SparkSQL
• Apache	Hive,	and	more.
Hadoop
Data	Access
Access patterns enabled by YARN
YARN: Data Operating System
HDFS
Hadoop Distributed File System
Applications
Batch: needs to happen, but with no timeframe limitations
Interactive: needs to happen at human time
Real-Time: needs to happen at machine execution time
Apache Hive: SQL in Hadoop
• Created by a team at Facebook
• Provides a standard SQL interface to data stored in Hadoop
• Quickly find value in raw data files
• Proven at petabyte scale
• Compatible with ALL major BI tools such as Tableau, Excel, MicroStrategy,
Business Objects, etc…
Sensor, Mobile, Weblog, Operational / MPP
SQL Queries
Hive and the Stinger Initiative
Base	Optimizations
Generate	simplified	DAGs
In-memory	Hash	Joins
Vector	Query	Engine
Optimized	for	modern	processor	
architectures
Tez
Express	tasks	more	simply
Eliminate	 disk	writes
Pre-warmed	Containers
ORCFile
Column	Store
High	Compression
Predicate	/	Filter	Pushdowns
YARN
Next-gen	Hadoop	data	processing	
framework
Query	Planner
Intelligent	Cost-Based	Optimizer
Performance	Optimizations
100x+	faster	time	to	insight
Deeper	analytical	capabilities
Stinger.next and	Sub-Second	SQL
Emergence of LLAP brings sub-second SQL response times within reach with Hive.
STINGER (HDP 2.1, Hive 0.13): DELIVERED
SPEED: Batch & interactive
SQL: SQL:2003+
UPDATES: Read-only SQL
ENGINES: MR, Tez
STINGER PROGRESS (HDP 2.3, Hive 1.2.1): DELIVERED
SPEED: Batch & interactive
SQL: SQL:2011 subset
UPDATES: INSERT/UPDATE/DELETE
ENGINES: MR, Tez
STINGER.NEXT (future, final): IN DEVELOPMENT
SPEED: Batch, interactive & sub-second
SQL: Comprehensive SQL:2011-based analytics
UPDATES: Complete ACID support including MERGE
ENGINES: MR, Tez, LLAP
Tiered	Data	Storage
Stinger.next Phase	3
YARN:	Containerized	
Applications
Apache Hive: Journey to SQL:2011 Analytics
Data Types
Numeric: FLOAT/DOUBLE, DECIMAL, INT/TINYINT/SMALLINT/BIGINT, BOOLEAN
String: CHAR / VARCHAR, STRING, BINARY
Date, Time: DATE, TIMESTAMP, Interval Types
Complex Types: ARRAY, MAP, STRUCT, UNION
SQL Features
Core SQL Features: Date, Time and Arithmetical Functions; INNER, OUTER, CROSS and SEMI Joins; Derived Table Subqueries; Correlated + Uncorrelated Subqueries; UNION ALL; UDFs, UDAFs, UDTFs; Common Table Expressions; UNION DISTINCT
Advanced Analytics: OLAP and Windowing Functions; CUBE and Grouping Sets
Nested Data Analytics: Nested Data Traversal; Lateral Views
ACID Transactions: INSERT / UPDATE / DELETE
File Formats
Columnar: ORCFile, Parquet
Text: CSV, Logfile
Nested / Complex: Avro, JSON, XML
Custom Formats
Other Features: XPath Analytics
Latest Additions…
Scalable Cross Product; Primary Key / Foreign Key; Non-Equijoin
Tech Preview: Proc. Extensions (PL/SQL)
Future: ACID MERGE; Multi Subquery; Comparison to sub-select; INTERSECT and EXCEPT
Legend: Existing, Future, New with Hive 2.0
Storage
Columnar Storage
ORCFile Parquet
Unstructured Data
JSON CSV
Text Avro
Custom
Weblog
Engine
SQL Engines
Row	Engine Vector	Engine
SQL
SQL Support
SQL:2011 Optimizer HCatalog HiveServer2
Cache
Block Cache
Linux	Cache
Distributed
Execution
Hadoop 1
MapReduce
Hadoop 2
Tez Spark
Vector Cache
LLAP
Persistent Server
Historical
Current
In-development
Legend
Apache Hive: Modern Architecture
Apache	Tez	is	a	critical	innovation	of	the	Stinger	Initiative.
• Along with YARN, Tez not only improves Hive, but improves all things batch and interactive for Hadoop: Pig, Cascading…
• More Efficient Processing than MapReduce
• Reduce	operations	and	complexity	of	back	end	processing
• Allows for Map-Reduce-Reduce pipelines, which save hard disk operations
• Implements	a	“service”	which	is	always	on,	decreasing	start	times	of	jobs
• Allows	Caching	of	Data	in	Memory
YARN
Dev
Cascading/
Scalding
Why	is	Tez Important?
HDFS
(Hadoop Distributed File
System)
Scripting
Pig
SQL
Hive
Tez Tez
Applications
Tez
YARN:	Data	Operating	System
Batch Interactive Real-Time
Apache	Tez
Hive	– MapReduce Hive	– Tez
SELECT a.state, COUNT(*), AVG(c.price)
FROM a
JOIN b ON (a.id = b.id)
JOIN c ON (a.itemId = c.itemId)
GROUP BY a.state
[Figure: with Hive on MapReduce, the joins, GROUP BY a.state, COUNT(*), and AVG(c.price) run as a chain of separate map/reduce jobs that write intermediate results to HDFS between stages; with Hive on Tez, the same steps run as one graph of map and reduce tasks.]
Tez avoids unneeded writes to HDFS
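To see what the query on this slide computes, independent of whether MapReduce or Tez executes it, it can be run against a few rows in SQLite from Python. The table contents below are invented for illustration, SQLite stands in for Hive only to show the result, and an ORDER BY has been added so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, itemId INTEGER, state TEXT);
CREATE TABLE b (id INTEGER);
CREATE TABLE c (itemId INTEGER, price REAL);
INSERT INTO a VALUES (1, 10, 'CA'), (2, 11, 'CA'), (3, 10, 'NY');
INSERT INTO b VALUES (1), (2), (3);
INSERT INTO c VALUES (10, 4.0), (11, 6.0);
""")

# Same joins and aggregation as on the slide, plus ORDER BY for stable output.
rows = conn.execute("""
SELECT a.state, COUNT(*), AVG(c.price)
FROM a
JOIN b ON (a.id = b.id)
JOIN c ON (a.itemId = c.itemId)
GROUP BY a.state
ORDER BY a.state
""").fetchall()

for state, row_count, avg_price in rows:
    print(state, row_count, avg_price)
conn.close()
```

Each join and the final aggregation is a separate stage; the slide's point is that MapReduce writes the output of every stage to HDFS, while Tez passes it directly to the next stage.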
Scripting Data Pipeline & ETL
Apache Pig
• Data	flow	engine	and	scripting	language	(Pig	Latin)
• Allows you to transform data and datasets
Advantages over MapReduce
• Reduces	time	to	write	jobs
• Community	support
• Piggybank has a significant number of UDFs to help adoption
• There are a large number of existing shops using Pig
YARN: Data Operating System
Batch Interactive Real-Time
Pig	Latin
• Pig executes in a unique fashion:
o During execution, each statement is processed by the Pig interpreter
o If a statement is valid, it gets added to a logical plan built by the interpreter
o The steps in the logical plan do not actually execute until a DUMP or STORE command is used
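The deferred execution described above can be mimicked with a small Python class. This is an analogy, not Pig itself: the class name LazyPlan and its methods are hypothetical, chosen to mirror the way statements are added to a plan that only runs when output is requested.

```python
class LazyPlan:
    """Records transformations and runs them only when dump() is called."""

    def __init__(self, source):
        self._source = list(source)
        self._steps = []     # the "logical plan": (kind, function) pairs
        self.steps_run = 0   # how many plan steps have actually executed

    def filter(self, predicate):
        self._steps.append(("filter", predicate))
        return self

    def foreach(self, fn):
        self._steps.append(("foreach", fn))
        return self

    def dump(self):
        """Execute every recorded step in order and return the result."""
        rows = self._source
        for kind, fn in self._steps:
            if kind == "filter":
                rows = [row for row in rows if fn(row)]
            else:
                rows = [fn(row) for row in rows]
            self.steps_run += 1
        return rows


plan = LazyPlan([1, 2, 3, 4, 5, 6]).filter(lambda n: n % 2 == 0).foreach(lambda n: n * 10)
steps_before_dump = plan.steps_run   # nothing has executed yet
result = plan.dump()                 # the plan runs here
print(steps_before_dump, result)     # prints: 0 [20, 40, 60]
```

Deferring execution gives the engine the whole plan at once, so it can decide how to run the steps together instead of one at a time.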
Why	use	Pig?
• Maybe	we	want	to	join	two	datasets,	from	different	sources,	on	a	
common	value,	and	want	to	filter,	and	sort,	and	get	top	5	sites
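The task on this slide (join, filter, sort, top 5) looks like this when written by hand in Python. The two datasets, the field names, and the 18 to 25 age filter are assumptions made up for the sketch; the filter is applied before the join here, which gives the same result with less data to match.

```python
from collections import Counter


def top_sites(users, visits, min_age=18, max_age=25, n=5):
    """Join visits to users on the user name, keep users in the age range,
    then return the n most visited sites (ties broken alphabetically)."""
    eligible = {name for name, age in users if min_age <= age <= max_age}
    joined = [site for name, site in visits if name in eligible]
    counts = Counter(joined)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical data from two different sources.
users = [("amy", 25), ("bob", 17), ("cat", 31), ("dan", 22)]
visits = [
    ("amy", "a.com"), ("amy", "b.com"), ("bob", "a.com"),
    ("dan", "a.com"), ("dan", "c.com"), ("cat", "b.com"),
]

ranking = top_sites(users, visits)
print(ranking)   # prints: [('a.com', 2), ('b.com', 1), ('c.com', 1)]
```

In Pig Latin each of these steps is a single statement, and the engine takes care of running them in parallel across the cluster.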
Resource Management
Storage
Elegant Developer APIs
DataFrames, Machine Learning, and SQL
Made for Data Science
All apps need to get predictive at scale and fine granularity
Democratize Machine Learning
Spark is doing to ML on Hadoop what Hive did for SQL on
Hadoop
Community
Broad developer, customer and partner interest
Realize Value of Data Operating System
A key tool in the Hadoop toolbox
Apache	Spark	enthusiasm
Applications
Spark	Core	Engine
Scala
Java
Python
libraries
MLlib	
(Machine	
learning)
Spark	
SQL*
Spark	
Streaming*
Spark	Core	Engine
Apache Spark & Apache Hadoop Perfect Together
General Purpose Data Access Engine
for	fast,	large-scale	data	processing
Designed for Iterative, In-Memory
computations	and	interactive	data	mining
Expressive Multi-Language APIs
for	Java,	Scala,	Python	and	R
Built-in Libraries
Enable	data	workers	to	rapidly	iterate	over	data	for:		
ETL,	Machine	Learning,	SQL	and	Stream	processing
YARN
Scala
Java
Python
R
APIs
Spark Core Engine
Spark
SQL
Spark
Streaming
MLlib GraphX
HDFS
Apache Projects Enable Access Patterns
Various open source projects have
incubated in order to meet these access
pattern needs
Today, they can all run on a single cluster
on a single set of data because of YARN
All powered by a broad open community
YARN: Data Operating System
HDFS
Hadoop Distributed File System
Interactive
Solr
Spark
Hive
Pig
Real-Time
HBase
Accumulo
Storm
Batch
MapReduce
Applications
Kafka
Connected	Data	Platforms
Connected	Data	Platforms	Enable	Architectural	Transformations
Data	in	
Motion
(Cloud)
Data	in	
Motion
(on-premises)
Data	at	
Rest
(on-premises)
Edge	
Data
Data	in	
Motion
Edge	
Analytics
Data	at	
Rest
(Cloud)
Edge	
Data
Data	at	
Rest
(on-premises)
Closed	
Loop	
Analytics
Machine
Learning
Deep	
Historical
Analysis
Must-have	Considerations	for	Technology
Continuous	Data	
Life	Cycle
Real-time	
insights	from	
origin	to	rest	
Enterprise	
Ready	
Management	
Security	
Governance
Deployment	
Flexibility	
On	Premise
Cloud	
Hybrid
Open	
Innovation
Architecture		
Community	
Ecosystem
Hands	on	Lab	Overview
HDP	2.4	Sandbox
→ Provides free preconfigured HDP
– Runs in a Virtual Machine or Azure
Hortonworks.com/sandbox
→ Easy to Use
– Operations
• Ambari
– Dev and DevOps
• Ambari User Views
– Web Notebook
• Zeppelin
→ Works with 60+ free tutorials
Hortonworks.com/tutorials
Data	Discovery	Lab
• Elefante Wine	Company	has	a	fleet	of	over	100	trucks.
• The	geolocation	data	collected	from	the	trucks	contains	events	generated	while	the	truck	drivers	are	
driving.
• The	company’s	goal	with	Hadoop	is	to	Mitigate	Risk:
o Understand	correlations	between	miles	driven	and	events
o Compute	the	risk	factor	for	each	driver	based	on	mileage	&	events
o Lab	Env
o Sandbox	2.4
o Lab	Doc
o URL:	http://goo.gl/14OAat
o Load	Data
o Query	Data
o Process	Data
Elefante Wine Current Challenges
The Company
Elefante Wine is a boutique wine fulfillment company with a large fleet of trucks. It delivers wine
in a highly-regulated industry with stringent transportation requirements.
The Situation
Recently a number of driver violations led to fines and increased insurance rates
The Challenges
• Rising Operational Costs
• Driver Safety
• Risk Management
• Logistics Optimization
© Hortonworks Inc. 2012
Professional Services
Elefante Wine Company has a large fleet of trucks in the USA
A truck generates millions of events for a given route; an event could be:
• 'Normal' events: starting / stopping of the vehicle
• 'Violation' events: speeding, excessive acceleration and braking, unsafe tail distance
Company	uses	an	application	that	monitors	
truck	locations	and	violations	from	the	
truck/driver	in	real-time	to	calculate	risk
Route?
Truck?
Driver?
Analysts	query	a	broad	
history	to	understand	if	
today’s	violations	are	
part	of	a	larger	problem	
with	specific	routes,	
trucks,	or	drivers
Elefante Wine Risk and Driver Safety Challenges
Trucks	outfitted	with	new	sensors	generating	large	
volumes	of	new	data:
• Location
• Speed
• Driver	Violations
Need to integrate real-time & historical data
Increase safety and reduce liabilities
Anticipate driver violations BEFORE they
happen and take precautionary actions
Find	predictive	correlations	in	driver	behavior	over	
large	volumes	of	real-time	data
Difficult to deliver timely insights to the right
people and systems to take action
Data Discovery
Uncover new
findings
Predictive Analytics
Identify your next best
action
Better Understanding
of the Past
Better Prediction
of the Future
What’s	our	goal?
à Solution:
– Collect	additional	data	via	sensors	in	trucks	to	better	understand	Risk	Factors
à How:
– Quickly	store	new	sensor	data	in	a	common	repository
– Prepare	the	data	for	analysis
– Explore	the	data
– Calculate	Risk
– Generate	a	report
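The last two steps above (calculate risk, generate a report) can be sketched as follows. The metric used here, violations per 1,000 miles driven, is an assumption chosen for illustration and is not taken from the slides or the lab; the driver IDs and figures are made up.

```python
def risk_factor(violations, miles):
    """Violations per 1,000 miles. An assumed metric, not the lab's definition."""
    if miles <= 0:
        raise ValueError("miles must be positive")
    return 1000.0 * violations / miles


# Hypothetical per-driver totals: (violation count, miles driven).
DRIVER_STATS = {
    "D1": (12, 48000),
    "D2": (3, 60000),
    "D3": (9, 15000),
}


def risk_report(stats):
    """Return (driver, risk) rows sorted from highest to lowest risk."""
    rows = [(driver, risk_factor(v, m)) for driver, (v, m) in stats.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


report = risk_report(DRIVER_STATS)
for driver, score in report:
    print(f"{driver}\t{score:.2f}")
```

Normalizing by mileage keeps a driver with many miles and few violations from ranking above a driver with few miles and frequent violations.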
Trucking Risk Analysis – Hadoop ELT
[Diagram: Move Data Into Hadoop. Geolocation.csv and trucks.csv (CSV) → move / LOAD → Geolocation_stage and Trucks_stage (CSV) → SQL → Geolocation and Trucks (ORC). Risk Calculation: Geolocation and Trucks (ORC) → SQL, Pig or Spark → Truck_mileage, Avg_mileage, DriverMileage, Events, RiskFactor (ORC).]
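The ELT pattern in this diagram (load raw CSV rows into a staging table, then transform them with SQL into typed tables) can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in `sqlite3` stands in for Hive, the ORC storage format is not modeled, and the CSV columns and values are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; the column names are assumptions for illustration.
TRUCKS_CSV = """truckid,driverid,model,miles
T1,D1,ModelA,1200
T2,D2,ModelB,850
T3,D1,ModelA,430
"""

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse

# Extract + Load: copy the raw rows into a staging table of plain text columns.
conn.execute(
    "CREATE TABLE trucks_stage (truckid TEXT, driverid TEXT, model TEXT, miles TEXT)"
)
rows = list(csv.DictReader(io.StringIO(TRUCKS_CSV)))
conn.executemany(
    "INSERT INTO trucks_stage VALUES (:truckid, :driverid, :model, :miles)", rows
)

# Transform: build the typed table from the staging table with SQL.
conn.execute(
    "CREATE TABLE trucks AS "
    "SELECT truckid, driverid, model, CAST(miles AS INTEGER) AS miles "
    "FROM trucks_stage"
)

# Derive a downstream table, analogous to the mileage tables in the diagram.
conn.execute(
    "CREATE TABLE driver_mileage AS "
    "SELECT driverid, SUM(miles) AS totmiles FROM trucks GROUP BY driverid"
)

driver_mileage = dict(conn.execute("SELECT driverid, totmiles FROM driver_mileage"))
print(driver_mileage)
```

Keeping the staging table untyped makes the load step cheap and tolerant of messy input, while type conversion and aggregation happen afterwards inside the SQL engine.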
Calculate	Risk
Getting	Started	Resources
developer.hortonworks.com
Hortonworks	Nourishes	the	Community
Hortonworks Community Connection
Hortonworks Partnerworks
https://community.hortonworks.com
Thank	you!
rafael@hortonworks.com
@racoss

Powerpoint exploring the locations used in television show Time Clash
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Hadoop Crash Course Hadoop Summit SJ