© Rocana, Inc. All Rights Reserved. | 1
Embeddable data transformation for real-time streams
Joey Echeverria, Platform Technical Lead
Strata+Hadoop World, March 31st 2016
San Jose, CA
© Rocana, Inc. All Rights Reserved. | 2 http://j.mp/hw-questions
Slides
http://j.mp/rocana-transform-slides
© Rocana, Inc. All Rights Reserved. | 3 http://j.mp/hw-questions
Questions
http://j.mp/hw-questions
© Rocana, Inc. All Rights Reserved. | 4 http://j.mp/hw-questions
Context
© Rocana, Inc. All Rights Reserved. | 5 http://j.mp/hw-questions
Joey
‱ Where I work: Rocana – Platform Technical Lead
‱ Where I used to work: Cloudera (’11-’15), NSA
‱ Distributed systems, security, data processing, big data
© Rocana, Inc. All Rights Reserved. | 6
Signing today at 1pm at the Cloudera booth
© Rocana, Inc. All Rights Reserved. | 7 http://j.mp/hw-questions
History
© Rocana, Inc. All Rights Reserved. | 8 http://j.mp/hw-questions
“Legacy” data architecture
[Diagram: Data Producers, Flume/Sqoop, HDFS (Avro/Parquet Files), MapReduce, Spark, Impala, Visualization/Query]
© Rocana, Inc. All Rights Reserved. | 9 http://j.mp/hw-questions
Stream data architecture
[Diagram: Data Producers, Kafka (Avro Serialized Records), Kafka Consumers, Storm, Flink, Spark Streaming, Real-time Visualization, HDFS (Avro/Parquet Files)]
© Rocana, Inc. All Rights Reserved. | 11 http://j.mp/hw-questions
Stream processing
A primer
© Rocana, Inc. All Rights Reserved. | 12 http://j.mp/hw-questions
Stream processing
‱ Filter
‱ Extract
‱ Project
‱ Aggregate
‱ Join
‱ Model
© Rocana, Inc. All Rights Reserved. | 14 http://j.mp/hw-questions
Stream processing
‱ Filter
‱ Extract
‱ Project
‱ Aggregate
‱ Join
‱ Model
‱ Data transformation
© Rocana, Inc. All Rights Reserved. | 15 http://j.mp/hw-questions
Apache Storm
‱ "Distributed real-time computation system"
‱ Applications packaged into topologies (think MapReduce job)
‱ Topologies operate over streams of tuples
‱ Spout: source of a stream
‱ Bolt: an operation on a stream, such as filtering, aggregating, joining, or executing arbitrary functions
© Rocana, Inc. All Rights Reserved. | 16 http://j.mp/hw-questions
Apache Spark
‱ Supports batch and stream processing
‱ Continuous stream of records discretized into a DStream
‱ DStream: a sequence of RDDs (batches of records)
‱ Micro-batch
© Rocana, Inc. All Rights Reserved. | 17 http://j.mp/hw-questions
Apache Flink
‱ Supports batch and stream processing
‱ DataStream: unbounded collection of records
‱ Operations can apply to individual records or windows of records
‱ Supports record-at-a-time processing (like Storm)
© Rocana, Inc. All Rights Reserved. | 18 http://j.mp/hw-questions
Apache Kafka
‱ Pub-sub messaging system implemented as a distributed commit log
‱ Popular as a source and sink for data streams
‱ Scalability, durability, and easy-to-understand delivery guarantees
‱ Can do stream processing directly in Kafka consumers
© Rocana, Inc. All Rights Reserved. | 19 http://j.mp/hw-questions
Data transformation
© Rocana, Inc. All Rights Reserved. | 20 http://j.mp/hw-questions
Filter
filter
© Rocana, Inc. All Rights Reserved. | 21 http://j.mp/hw-questions
Extract
Raw log line:
127.0.0.1 Mozilla/5.0 laura [31/Mar/2016] "GET /index.html HTTP/1.0" 200 2326
Event before extract:
ts: 1436576671000
body: <binary blob>
event_type_id: 100
...
Event after extract:
ts: 1436576671000
body: <binary blob>
event_type_id: 100
attributes: {
  ip: "127.0.0.1"
  user_agent: "Mozilla/5.0"
  user_id: "laura"
  date: "[31/Mar/2016]"
  request: "GET /index.html HTTP/1.0"
  status_code: "200"
  size: "2326"
}
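The extract step above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name ExtractSketch, the regular expression, and the method extract() are hypothetical and are not part of Rocana Transform; the pattern is written for the single sample line on the slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractSketch {
    // Hypothetical pattern for the sample line on the slide; real log formats vary.
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) (\\[[^\\]]+\\]) \"([^\"]*)\" (\\d{3}) (\\d+)$");

    /** Returns the extracted attributes, or an empty map when the line does not match. */
    static Map<String, String> extract(String line) {
        Map<String, String> attributes = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return attributes;
        }
        // Every value stays a string at this stage, as in the slide's attributes map.
        attributes.put("ip", m.group(1));
        attributes.put("user_agent", m.group(2));
        attributes.put("user_id", m.group(3));
        attributes.put("date", m.group(4));
        attributes.put("request", m.group(5));
        attributes.put("status_code", m.group(6));
        attributes.put("size", m.group(7));
        return attributes;
    }

    public static void main(String[] args) {
        String line =
            "127.0.0.1 Mozilla/5.0 laura [31/Mar/2016] \"GET /index.html HTTP/1.0\" 200 2326";
        extract(line).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Keeping the values as strings mirrors the slide, where type conversion only happens in the later project step.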
© Rocana, Inc. All Rights Reserved. | 22 http://j.mp/hw-questions
Project
Event before project:
ts: 1436576671000
body: <binary blob>
event_type_id: 100
attributes: {
  ip: "127.0.0.1"
  user_agent: "Mozilla/5.0"
  user_id: "laura"
  date: "[31/Mar/2016]"
  request: "GET /index.html HTTP/1.0"
  status_code: "200"
  size: "2326"
}
Record after project:
ts: 1459444413000
ip: "127.0.0.1"
user_agent: "Mozilla/5.0"
user_id: "laura"
request: "GET /index.html HTTP/1.0"
status_code: 200
size: 2326
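A minimal Java sketch of the project step, assuming the attribute names from the slide. The class ProjectSketch and its project() method are hypothetical. The slide does not show how the new ts value is derived, so the sketch simply takes it as a parameter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ProjectSketch {
    /**
     * Flattens selected attributes into a top-level record and converts
     * status_code and size from strings to integers. The date attribute
     * and the raw body are dropped, as on the slide.
     */
    static Map<String, Object> project(long ts, Map<String, String> attributes) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("ts", ts);
        for (String key : new String[] {"ip", "user_agent", "user_id", "request"}) {
            out.put(key, attributes.get(key));
        }
        // Throws NumberFormatException if the extracted values are not numeric.
        out.put("status_code", Integer.parseInt(attributes.get("status_code")));
        out.put("size", Integer.parseInt(attributes.get("size")));
        return out;
    }
}
```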
© Rocana, Inc. All Rights Reserved. | 23 http://j.mp/hw-questions
Problem
© Rocana, Inc. All Rights Reserved. | 24 http://j.mp/hw-questions
Who
‱ Developers
‱ Data engineers
‱ Sysadmins
‱ Analysts
© Rocana, Inc. All Rights Reserved. | 25 http://j.mp/hw-questions
Tools
© Rocana, Inc. All Rights Reserved. | 26 http://j.mp/hw-questions
The dark art of data science
‱ Feature engineering
‱ “Getting a mess of raw data that can be used as input to a machine
learning algorithm” - @josh_wills
‱ Video from Midwest.io 2014
© Rocana, Inc. All Rights Reserved. | 27 http://j.mp/hw-questions
Data transformation for all
© Rocana, Inc. All Rights Reserved. | 28 http://j.mp/hw-questions
Rocana Transform
‱ Library
‱ Java
‱ Rocana configuration
‱ JSON + comments + specific numeric types - excess quoting
© Rocana, Inc. All Rights Reserved. | 29 http://j.mp/hw-questions
Data model
‱ Event schema
‱ id: A globally unique identifier for this event
‱ ts: Epoch timestamp in milliseconds
‱ event_type_id: ID indicating the type of the event
‱ location: Location from which the event was generated
‱ host: Hostname, IP, or other device identifier from which the event was generated
‱ service: Service or process from which the event was generated
‱ body: Raw event content in bytes
‱ attributes: Event type-specific key/value pairs
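As a rough illustration, the event schema above could be mirrored by a plain Java value class. The class name EventSketch, the Java field types, and the UTF-8 decoding helper are assumptions made for this sketch; the actual schema in the talk is an Avro schema that is not shown on the slide.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class EventSketch {
    final String id;            // globally unique identifier
    final long ts;              // epoch timestamp in milliseconds
    final int eventTypeId;      // type of the event
    final String location;      // where the event was generated
    final String host;          // hostname, IP, or other device identifier
    final String service;       // service or process that generated the event
    final byte[] body;          // raw event content in bytes
    final Map<String, String> attributes; // event type-specific key/value pairs

    EventSketch(String id, long ts, int eventTypeId, String location, String host,
                String service, byte[] body, Map<String, String> attributes) {
        this.id = id;
        this.ts = ts;
        this.eventTypeId = eventTypeId;
        this.location = location;
        this.host = host;
        this.service = service;
        this.body = body.clone();
        this.attributes = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
    }

    /** Assumes the body holds UTF-8 text; the schema itself only promises bytes. */
    String bodyAsText() {
        return new String(body, StandardCharsets.UTF_8);
    }
}
```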
© Rocana, Inc. All Rights Reserved. | 30 http://j.mp/hw-questions
Example event
{
  "id": "JRHAIDMLCKLEAPMIQDHFLO3MXYXV7NVBEJNDKZGS2XVSEINGGBHA====",
  "event_type_id": 100,
  "ts": 1436576671000,
  "location": "aws/us-west-2a",
  "host": "example01.rocana.com",
  "service": "dhclient",
  "body": "<36>Jul 10 18:04:31 gs09.example.com dhclient[865] DHCPACK from ",
  "attributes": {
    "syslog_timestamp": "1436576671000",
    "syslog_process": "dhclient",
    "syslog_pid": "865",
    "syslog_facility": "3",
    "syslog_severity": "6",
    "syslog_hostname": "example01",
    "syslog_message": "DHCPACK from 10.10.1.1 (xid=0x5c64bdb0)"
  }
}
© Rocana, Inc. All Rights Reserved. | 31 http://j.mp/hw-questions
Filter, extract, and flatten
© Rocana, Inc. All Rights Reserved. | 32 http://j.mp/hw-questions
Filter, extract, and flatten
‱ Filter out events without type id 100
‱ Filter out events without hostname prefix "ex"
‱ Extract a numeric prefix from the syslog message
‱ Flatten syslog attributes to top-level fields in a different avro schema
© Rocana, Inc. All Rights Reserved. | 33 http://j.mp/hw-questions
Filter, extract, and flatten
{
  load-event: {},
  // Filter by event_type_id
  filter: { expression: "${event_type_id == 100}" },
  // Extract hostname prefix
  regex: { ... },
  filter: { expression: "${host_prefix.match.group.1 == 'ex'}" },
  // Extract a numeric prefix from the syslog message
  regex: { ... },
  // Build flattened record
  build-avro-record: { ... },
  // Accumulate output record
  accumulate-output: {
    value: "${output_record}"
  }
}
© Rocana, Inc. All Rights Reserved. | 34 http://j.mp/hw-questions
Extract hostname prefix
{
  load-event: {},
  filter: { expression: "${event_type_id == 100}" },
  regex: {
    pattern: "^(.{2}).*$",
    value: "${attr.syslog_hostname}",
    destination: "host_prefix"
  },
  filter: { expression: "${host_prefix.match.group.1 == 'ex'}" },
  ...
}
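For readers more familiar with Java than with the configuration syntax, the two filters and the prefix regex can be approximated as follows. The class HostPrefixFilterSketch and its keep() method are hypothetical; the sketch assumes a filter keeps an event when its expression is true, which matches the slide's description of filtering out events without type id 100.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HostPrefixFilterSketch {
    // Same pattern as in the configuration: capture the first two characters.
    private static final Pattern PREFIX = Pattern.compile("^(.{2}).*$");

    /** Returns true when the event should continue through the pipeline. */
    static boolean keep(int eventTypeId, String syslogHostname) {
        if (eventTypeId != 100 || syslogHostname == null) {
            return false;
        }
        Matcher m = PREFIX.matcher(syslogHostname);
        // Hostnames shorter than two characters do not match and are dropped.
        return m.matches() && "ex".equals(m.group(1));
    }
}
```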
© Rocana, Inc. All Rights Reserved. | 35 http://j.mp/hw-questions
Extract numeric prefix
...
filter: { expression: "${host_prefix.match.group.1 == 'ex'}" },
regex: {
  pattern: "^([0-9]*)",
  value: "${attributes['syslog_message']}",
  destination: "msg",
  match-actions: {
    set-values: { extracted_field: "${msg.match.group.1}" }
  },
  no-match-actions: {
    set-values: { extracted_field: "" }
  }
},
...
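The numeric-prefix extraction can be sketched with java.util.regex. The class NumericPrefixSketch is hypothetical, and the sketch assumes the regex action searches from the start of the value the way Matcher.find() does; the slides do not say how the action applies the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumericPrefixSketch {
    // Same pattern as in the configuration: zero or more leading digits.
    private static final Pattern NUMERIC_PREFIX = Pattern.compile("^([0-9]*)");

    static String extractedField(String syslogMessage) {
        Matcher m = NUMERIC_PREFIX.matcher(syslogMessage);
        if (m.find()) {
            return m.group(1); // corresponds to match-actions
        }
        return "";             // corresponds to no-match-actions
    }
}
```

In java.util.regex, a pattern that uses `*` here also matches the empty prefix, so a message without leading digits still takes the match branch and yields an empty string. Both branches therefore produce the same value for such messages.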
© Rocana, Inc. All Rights Reserved. | 36 http://j.mp/hw-questions
Build flattened record
...
build-avro-record: {
  schema-uri: "resource:avro-schemas/flattened-syslog.avsc",
  destination: "output_record",
  field-mapping: {
    ts: "${ts}",
    event_type_id: "${event_type_id}",
    source: "${source}",
    syslog_facility: "${convert:toInt(attributes['syslog_facility'])}",
    syslog_severity: "${convert:toInt(attributes['syslog_severity'])}",
    ...
    syslog_message: "${attributes['syslog_message']}",
    syslog_pid: "${convert:toInt(attributes['syslog_pid'])}",
    extracted_field: "${extracted_field}"
  },
},
...
© Rocana, Inc. All Rights Reserved. | 37 http://j.mp/hw-questions
Extract metrics from log data
© Rocana, Inc. All Rights Reserved. | 38 http://j.mp/hw-questions
Extract metrics
‱ Input: HTTP status logs
‱ Extract request latency
‱ Extract counts by HTTP status code
‱ Metric types
‱ Gauge: A value that varies over time (think latency, CPU %, etc.)
‱ Counter: A value that accumulates over time (think event volume, status codes, etc.)
© Rocana, Inc. All Rights Reserved. | 39 http://j.mp/hw-questions
Example metric event
{
  "id": "JRHAIDMLCKLEAPMIQDHFLO3MXBBQ7NVBEJNDKZGS2XVSEINGGBHA====",
  "event_type_id": 107,
  "ts": 1436576671000,
  "location": "aws/us-west-2a",
  "host": "web01.rocana.com",
  "service": "httpd",
  "attributes": {
    "m.http.request.latency": "4.2000000000E1|g",
    "m.http.status.401.count": "1.0000000000E0|c"
  }
}
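The metric attribute values in the example appear to combine a number in scientific notation with a one-letter type suffix. The Java sketch below reads "|g" as gauge and "|c" as counter, which is an inference from the slide rather than a documented format. The class MetricValueSketch and its methods are hypothetical, and the DecimalFormat pattern is an assumption chosen to resemble the values shown.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class MetricValueSketch {
    final double value;
    final char type; // 'g' for gauge, 'c' for counter (assumed meaning)

    MetricValueSketch(double value, char type) {
        this.value = value;
        this.type = type;
    }

    /** Encodes a value as "<mantissa>E<exponent>|<type>". */
    static String encode(double value, char type) {
        // Locale.ROOT keeps '.' as the decimal separator regardless of the default locale.
        DecimalFormat format = new DecimalFormat(
            "0.0000000000E0", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(value) + "|" + type;
    }

    /** Parses a string such as "4.2000000000E1|g". */
    static MetricValueSketch parse(String encoded) {
        int sep = encoded.lastIndexOf('|');
        if (sep < 0 || sep != encoded.length() - 2) {
            throw new IllegalArgumentException("Expected <number>|<type>: " + encoded);
        }
        double value = Double.parseDouble(encoded.substring(0, sep));
        return new MetricValueSketch(value, encoded.charAt(sep + 1));
    }
}
```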
© Rocana, Inc. All Rights Reserved. | 40 http://j.mp/hw-questions
Extract metrics
{
  load-event: {},
  build-metric: {
    gauge-mapping: {
      http.request.latency: "${convert:toDouble(attributes['latency'])}"
    },
    destination: "latency_metric"
  },
  accumulate-output: { value: "${latency_metric}" },
  build-metric: {
    dynamic-counter-mapping: [
      "${string:format('http.status.%s.count', attributes['sc_status'])}", 1D
    ],
    destination: "status_metric"
  },
  accumulate-output: { value: "${status_metric}" }
}
© Rocana, Inc. All Rights Reserved. | 41 http://j.mp/hw-questions
Architecture
© Rocana, Inc. All Rights Reserved. | 42 http://j.mp/hw-questions
Architecture
[Diagram: Driver, Configuration file, Java action objects, Context (Variables)]
1. Parse config
2. Initialize context
3. Execute actions
4. Read/write variables
5. Copy output
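The numbered steps in the diagram can be sketched as a small driver loop. Everything here is hypothetical: the names DriverSketch, Action, and Context, and all method signatures, are invented for illustration and are not the Rocana Transform API. Step 1 (parsing the configuration into action objects) is assumed to have happened already, and filter behaviour such as stopping early is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DriverSketch {
    /** Hypothetical action contract: read and write variables in the context. */
    interface Action {
        void execute(Context context);
    }

    /** Holds the variables shared between actions and the accumulated output. */
    static final class Context {
        private final Map<String, Object> variables = new HashMap<>();
        private final List<Object> output = new ArrayList<>();

        Object get(String name) { return variables.get(name); }
        void set(String name, Object value) { variables.put(name, value); }
        void accumulateOutput(Object value) { output.add(value); }
        List<Object> output() { return output; }
    }

    /** Runs already-parsed actions against one input event (steps 2 to 5). */
    static List<Object> run(List<Action> actions, Object event) {
        Context context = new Context();          // 2. initialize context
        context.set("event", event);
        for (Action action : actions) {
            action.execute(context);              // 3. execute, 4. read/write variables
        }
        return new ArrayList<>(context.output()); // 5. copy output
    }
}
```

Because the driver only depends on the action list and the context, the same loop could in principle be embedded in a Kafka consumer, a Storm bolt, or a batch job.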
© Rocana, Inc. All Rights Reserved. | 43 http://j.mp/hw-questions
Custom actions
‱ Actions loaded at runtime using Java services framework
‱ Add your jar to the classpath
‱ Custom actions appear as top-level keywords just like regular actions
‱ Implement the execute() method of the Action interface
‱ Implement the build() method of the ActionBuilder interface
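A hedged sketch of what a custom action might look like. The slides name an Action interface with execute() and an ActionBuilder interface with build() but do not show their signatures, so the parameter types, the keyword() method, and the class names below are assumptions. Only java.util.ServiceLoader is a real JDK API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

final class CustomActionSketch {
    /** Hypothetical signatures; the real interfaces are not shown in the slides. */
    public interface Action {
        void execute(Map<String, Object> variables);
    }

    public interface ActionBuilder {
        String keyword();                        // top-level keyword in the config
        Action build(Map<String, String> config);
    }

    /** Example custom action: upper-cases one variable into another. */
    public static final class UppercaseActionBuilder implements ActionBuilder {
        public String keyword() { return "uppercase"; }

        public Action build(Map<String, String> config) {
            String source = config.get("value");
            String destination = config.get("destination");
            return variables -> variables.put(
                destination, String.valueOf(variables.get(source)).toUpperCase());
        }
    }

    /** Discovers builders on the classpath through the Java services framework. */
    static Map<String, ActionBuilder> discover() {
        Map<String, ActionBuilder> byKeyword = new HashMap<>();
        for (ActionBuilder builder : ServiceLoader.load(ActionBuilder.class)) {
            byKeyword.put(builder.keyword(), builder);
        }
        return byKeyword;
    }
}
```

For ServiceLoader to find a builder, the jar needs a provider-configuration file under META-INF/services named after the fully qualified builder interface and listing the implementing class. Without that file, discover() returns an empty map.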
© Rocana, Inc. All Rights Reserved. | 44 http://j.mp/hw-questions
Custom actions
‱ Parse custom log formats
‱ Cisco ACS
‱ Citrix
‱ Juniper
‱ Customer-specific formats
‱ Lookup IP addresses in the MaxMind GeoIP2 database
‱ Reference dataset lookups
‱ Device id to device name
© Rocana, Inc. All Rights Reserved. | 45 http://j.mp/hw-questions
Putting it all together
‱ Stream processing is causing us to re-think how we analyze data
‱ Limiting who can work on data transformation increases costs and decreases velocity
‱ Reduce your reliance on developers to code custom pipelines
‱ Re-use transformation configuration in any stream processing framework or batch job
© Rocana, Inc. All Rights Reserved. | 46 http://j.mp/hw-questions
Coming soon
‱ Rocana Transform will be released under the Apache License 2.0 (ASL 2.0)
‱ The base configuration library is available today:
‱ https://github.com/scalingdata/rocana-configuration
© Rocana, Inc. All Rights Reserved. | 47 http://j.mp/hw-questions
Questions?
‱ Signing "Hadoop Security" today at 1pm at the Cloudera booth

Mais conteĂșdo relacionado

Mais procurados

Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
DataWorks Summit
 

Mais procurados (20)

Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
 
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on HiveFaster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
Faster, Faster, Faster: The True Story of a Mobile Analytics Data Mart on Hive
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
Cost-based Query Optimization
Cost-based Query Optimization Cost-based Query Optimization
Cost-based Query Optimization
 
Active Learning for Fraud Prevention
Active Learning for Fraud PreventionActive Learning for Fraud Prevention
Active Learning for Fraud Prevention
 
Introduction to Apache NiFi And Storm
Introduction to Apache NiFi And StormIntroduction to Apache NiFi And Storm
Introduction to Apache NiFi And Storm
 
LEGO: Data Driven Growth Hacking Powered by Big Data
LEGO: Data Driven Growth Hacking Powered by Big Data LEGO: Data Driven Growth Hacking Powered by Big Data
LEGO: Data Driven Growth Hacking Powered by Big Data
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoop
 
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache ZeppelinIntro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Standalone metastore-dws-sjc-june-2018
Standalone metastore-dws-sjc-june-2018Standalone metastore-dws-sjc-june-2018
Standalone metastore-dws-sjc-june-2018
 
Building and managing complex dependencies pipeline using Apache Oozie
Building and managing complex dependencies pipeline using Apache OozieBuilding and managing complex dependencies pipeline using Apache Oozie
Building and managing complex dependencies pipeline using Apache Oozie
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the Beast
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 

Destaque

ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래
ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래
ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래
Wooseung Kim
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
vivekjv
 

Destaque (14)

Hybrid & Logical Data Warehouse
Hybrid & Logical Data WarehouseHybrid & Logical Data Warehouse
Hybrid & Logical Data Warehouse
 
Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Data Virtualization Reference Architectures: Correctly Architecting your Solu...Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Data Virtualization Reference Architectures: Correctly Architecting your Solu...
 
Introduction to sentry
Introduction to sentryIntroduction to sentry
Introduction to sentry
 
Supporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data VirtualizationSupporting Data Services Marketplace using Data Virtualization
Supporting Data Services Marketplace using Data Virtualization
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop security
 
What's new in SQL on Hadoop and Beyond
What's new in SQL on Hadoop and BeyondWhat's new in SQL on Hadoop and Beyond
What's new in SQL on Hadoop and Beyond
 
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
Designing an Agile Fast Data Architecture for Big Data Ecosystem using Logica...
 
ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래
ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래
ëč…데읎터 플랫폌 ìƒˆëĄœìšŽ 믾래
 
Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes
 
Big Data Industry Insights 2015
Big Data Industry Insights 2015 Big Data Industry Insights 2015
Big Data Industry Insights 2015
 
Big Data Security and Governance
Big Data Security and GovernanceBig Data Security and Governance
Big Data Security and Governance
 
Real-time Analytics in Financial: Use Case, Architecture and Challenges
Real-time Analytics in Financial: Use Case, Architecture and ChallengesReal-time Analytics in Financial: Use Case, Architecture and Challenges
Real-time Analytics in Financial: Use Case, Architecture and Challenges
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 

Semelhante a Embeddable data transformation for real time streams

Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer
confluent
 
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
confluent
 

Semelhante a Embeddable data transformation for real time streams (20)

Streaming ETL for All
Streaming ETL for AllStreaming ETL for All
Streaming ETL for All
 
Building a system for machine and event-oriented data - SF HUG Nov 2015
Building a system for machine and event-oriented data - SF HUG Nov 2015Building a system for machine and event-oriented data - SF HUG Nov 2015
Building a system for machine and event-oriented data - SF HUG Nov 2015
 
Building production spark streaming applications
Building production spark streaming applicationsBuilding production spark streaming applications
Building production spark streaming applications
 
Spark+flume seattle
Spark+flume seattleSpark+flume seattle
Spark+flume seattle
 
Building a system for machine and event-oriented data - Data Day Seattle 2015
Building a system for machine and event-oriented data - Data Day Seattle 2015Building a system for machine and event-oriented data - Data Day Seattle 2015
Building a system for machine and event-oriented data - Data Day Seattle 2015
 
Building a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaBuilding a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with Rocana
 
Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
 
Building a system for machine and event-oriented data - Velocity, Santa Clara...
Building a system for machine and event-oriented data - Velocity, Santa Clara...Building a system for machine and event-oriented data - Velocity, Santa Clara...
Building a system for machine and event-oriented data - Velocity, Santa Clara...
 
Rocana Deep Dive OC Big Data Meetup #19 Sept 21st 2016
Rocana Deep Dive OC Big Data Meetup #19 Sept 21st 2016Rocana Deep Dive OC Big Data Meetup #19 Sept 21st 2016
Rocana Deep Dive OC Big Data Meetup #19 Sept 21st 2016
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
 
What is Apache Kafka and What is an Event Streaming Platform?
What is Apache Kafka and What is an Event Streaming Platform?What is Apache Kafka and What is an Event Streaming Platform?
What is Apache Kafka and What is an Event Streaming Platform?
 
Wikipedia’s Event Data Platform, Or: JSON Is Okay Too With Andrew Otto | Curr...
Wikipedia’s Event Data Platform, Or: JSON Is Okay Too With Andrew Otto | Curr...Wikipedia’s Event Data Platform, Or: JSON Is Okay Too With Andrew Otto | Curr...
Wikipedia’s Event Data Platform, Or: JSON Is Okay Too With Andrew Otto | Curr...
 
nuclio Overview October 2017
nuclio Overview October 2017nuclio Overview October 2017
nuclio Overview October 2017
 
REST easy with API Platform
REST easy with API PlatformREST easy with API Platform
REST easy with API Platform
 
Hadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an exampleHadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an example
 
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
 
iguazio - nuclio overview to CNCF (Sep 25th 2017)
iguazio - nuclio overview to CNCF (Sep 25th 2017)iguazio - nuclio overview to CNCF (Sep 25th 2017)
iguazio - nuclio overview to CNCF (Sep 25th 2017)
 
Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020
 
[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL
 

Mais de Joey Echeverria

Debugging Apache Spark
Debugging Apache SparkDebugging Apache Spark
Debugging Apache Spark
Joey Echeverria
 
Apache Accumulo and Cloudera
Apache Accumulo and ClouderaApache Accumulo and Cloudera
Apache Accumulo and Cloudera
Joey Echeverria
 
Analyzing twitter data with hadoop
Analyzing twitter data with hadoopAnalyzing twitter data with hadoop
Analyzing twitter data with hadoop
Joey Echeverria
 
Hadoop in three use cases
Hadoop in three use casesHadoop in three use cases
Hadoop in three use cases
Joey Echeverria
 
Scratching your own itch
Scratching your own itchScratching your own itch
Scratching your own itch
Joey Echeverria
 
The power of hadoop in cloud computing
The power of hadoop in cloud computingThe power of hadoop in cloud computing
The power of hadoop in cloud computing
Joey Echeverria
 
Hadoop and h base in the real world
Hadoop and h base in the real worldHadoop and h base in the real world
Hadoop and h base in the real world
Joey Echeverria
 

Mais de Joey Echeverria (10)

Debugging Apache Spark
Debugging Apache SparkDebugging Apache Spark
Debugging Apache Spark
 
The Future of Apache Hadoop Security
The Future of Apache Hadoop SecurityThe Future of Apache Hadoop Security
The Future of Apache Hadoop Security
 
Building data pipelines with kite
Building data pipelines with kiteBuilding data pipelines with kite
Building data pipelines with kite
 
Apache Accumulo and Cloudera
Apache Accumulo and ClouderaApache Accumulo and Cloudera
Apache Accumulo and Cloudera
 
Analyzing twitter data with hadoop
Analyzing twitter data with hadoopAnalyzing twitter data with hadoop
Analyzing twitter data with hadoop
 
Big data security
Big data securityBig data security
Big data security
 
Hadoop in three use cases
Hadoop in three use casesHadoop in three use cases
Hadoop in three use cases
 
Scratching your own itch
Scratching your own itchScratching your own itch
Scratching your own itch
 
The power of hadoop in cloud computing
The power of hadoop in cloud computingThe power of hadoop in cloud computing
The power of hadoop in cloud computing
 
Hadoop and h base in the real world
Hadoop and h base in the real worldHadoop and h base in the real world
Hadoop and h base in the real world
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...

Embeddable data transformation for real time streams

  • 1. © Rocana, Inc. All Rights Reserved. | 1 Joey Echeverria, Platform Technical Lead Strata+Hadoop World, March 31st 2016 San Jose, CA Embeddable data transformation for real-time streams
  • 2. Slides: http://j.mp/rocana-transform-slides
  • 3. Questions: http://j.mp/hw-questions
  • 4. Context
  • 5. Joey • Where I work: Rocana – Platform Technical Lead • Where I used to work: Cloudera (’11-’15), NSA • Distributed systems, security, data processing, big data
  • 6. Signing today at 1pm at the Cloudera booth
  • 7. History
  • 8. “Legacy” data architecture (diagram components): Data Producers; Flume/Sqoop; HDFS (Avro/Parquet files); MapReduce, Spark, Impala; Visualization/Query
  • 9. Stream data architecture (diagram components): Data Producers; Kafka (Avro serialized records); Storm, Flink, Spark Streaming; Real-time Visualization; Kafka Consumers; HDFS (Avro/Parquet files)
  • 10. Stream data architecture (diagram components): Data Producers; Kafka (Avro serialized records); Storm, Flink, Spark Streaming; Real-time Visualization; Kafka Consumers; HDFS (Avro/Parquet files)
  • 11. Stream processing: A primer
  • 12. Stream processing • Filter • Extract • Project • Aggregate • Join • Model
  • 13. Stream processing • Filter • Extract • Project • Aggregate • Join • Model
  • 14. Stream processing • Filter • Extract • Project • Aggregate • Join • Model • Data transformation
  • 15. Apache Storm • "Distributed real-time computation system" • Applications packaged into topologies (think MapReduce job) • Topologies operate over streams of tuples • Spout: source of a stream • Bolt: arbitrary operation such as filtering, aggregating, joining, or executing arbitrary functions
  • 16. Apache Spark • Supports batch and stream processing • Continuous stream of records discretized into a DStream • DStream: a sequence of RDDs (batches of records) • Micro-batch
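The micro-batch idea on slide 16 can be illustrated without Spark. The following is a minimal plain-Java sketch, not the Spark API: `MicroBatcher`, `Record`, and `discretize` are hypothetical names, and the sketch groups records by their own timestamp, whereas a real streaming engine may batch by arrival time instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of micro-batching; this is NOT the Spark API.
final class MicroBatcher {
    /** Hypothetical record type: an epoch timestamp in milliseconds plus a payload. */
    static final class Record {
        final long ts;
        final String payload;

        Record(long ts, String payload) {
            this.ts = ts;
            this.payload = payload;
        }
    }

    /**
     * Groups records into fixed-width batches keyed by the start of their interval.
     * Assumes non-negative timestamps and a positive interval.
     */
    static Map<Long, List<Record>> discretize(List<Record> stream, long intervalMs) {
        Map<Long, List<Record>> batches = new TreeMap<>();
        for (Record r : stream) {
            long batchStart = (r.ts / intervalMs) * intervalMs;
            batches.computeIfAbsent(batchStart, k -> new ArrayList<>()).add(r);
        }
        return batches;
    }
}
```

Each map entry plays the role of one batch in the sequence; downstream operations would then run once per batch rather than once per record.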
  • 17. Apache Flink • Supports batch and stream processing • DataStream: unbounded collection of records • Operations can apply to individual records or windows of records • Supports record-at-a-time processing (like Storm)
  • 18. Apache Kafka • Pub-sub messaging system implemented as a distributed commit log • Popular as a source and sink for data streams • Scalability, durability, and easy-to-understand delivery guarantees • Can do stream processing directly in Kafka consumers
  • 19. Data transformation
  • 20. Filter (diagram: a "filter" step)
  • 21. Extract
        Raw log line:
          127.0.0.1 Mozilla/5.0 laura [31/Mar/2016] "GET /index.html HTTP/1.0" 200 2326
        Input event:
          ts: 1436576671000
          body: <binary blob>
          event_type_id: 100
          ...
        After extract:
          ts: 1436576671000
          body: <binary blob>
          event_type_id: 100
          attributes: {
            ip: "127.0.0.1"
            user_agent: "Mozilla/5.0"
            user_id: "laura"
            date: "[31/March/2016]"
            request: "GET /index.html HTTP/1.0"
            status_code: "200"
            size: "2326"
          }
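As a rough illustration of the extract step on slide 21, the sketch below pulls the same attribute names out of the raw log line with `java.util.regex`. The class name `AccessLogExtractor` and the regular expression are assumptions written for this one sample line; they are not taken from Rocana Transform.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical extractor for the sample line on the slide; not Rocana code.
final class AccessLogExtractor {
    // Java named groups cannot contain underscores, so they are mapped to
    // the attribute keys used on the slide when the map is built.
    private static final Pattern LINE = Pattern.compile(
        "^(?<ip>\\S+) (?<userAgent>\\S+) (?<userId>\\S+) (?<date>\\[[^\\]]+\\]) "
        + "\"(?<request>[^\"]*)\" (?<statusCode>\\d{3}) (?<size>\\d+)$");

    /** Returns the extracted attributes, or an empty map if the line does not match. */
    static Map<String, String> extract(String line) {
        Map<String, String> attributes = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return attributes;
        }
        attributes.put("ip", m.group("ip"));
        attributes.put("user_agent", m.group("userAgent"));
        attributes.put("user_id", m.group("userId"));
        attributes.put("date", m.group("date"));
        attributes.put("request", m.group("request"));
        attributes.put("status_code", m.group("statusCode"));
        attributes.put("size", m.group("size"));
        return attributes;
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 Mozilla/5.0 laura [31/Mar/2016] "
            + "\"GET /index.html HTTP/1.0\" 200 2326";
        System.out.println(extract(line));
    }
}
```

All values stay strings at this stage, as on the slide; converting them to numeric types is left to the project step.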
  • 22. Project
        Input event:
          ts: 1436576671000
          body: <binary blob>
          event_type_id: 100
          attributes: {
            ip: "127.0.0.1"
            user_agent: "Mozilla/5.0"
            user_id: "laura"
            date: "[31/March/2016]"
            request: "GET /index.html HTTP/1.0"
            status_code: "200"
            size: "2326"
          }
        After project:
          ts: 1459444413000
          ip: "127.0.0.1"
          user_agent: "Mozilla/5.0"
          user_id: "laura"
          request: "GET /index.html HTTP/1.0"
          status_code: 200
          size: 2326
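The project step on slide 22 keeps a subset of fields and gives them concrete types. A minimal sketch, assuming the attribute keys shown on the slide: `AccessLogProjector` and `AccessRecord` are hypothetical names, and the timestamp is passed in by the caller because the slides do not show how the new `ts` value is derived.

```java
import java.util.Map;

// Hypothetical projection of string attributes into a flat, typed record.
final class AccessLogProjector {
    /** Flat output record; body, event_type_id, and date are dropped. */
    static final class AccessRecord {
        final long ts;
        final String ip;
        final String userAgent;
        final String userId;
        final String request;
        final int statusCode;
        final long size;

        AccessRecord(long ts, String ip, String userAgent, String userId,
                     String request, int statusCode, long size) {
            this.ts = ts;
            this.ip = ip;
            this.userAgent = userAgent;
            this.userId = userId;
            this.request = request;
            this.statusCode = statusCode;
            this.size = size;
        }
    }

    /** Throws NumberFormatException if status_code or size is missing or not numeric. */
    static AccessRecord project(long ts, Map<String, String> attributes) {
        return new AccessRecord(
            ts,
            attributes.get("ip"),
            attributes.get("user_agent"),
            attributes.get("user_id"),
            attributes.get("request"),
            Integer.parseInt(attributes.get("status_code")),
            Long.parseLong(attributes.get("size")));
    }
}
```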
  • 23. Problem
  • 24. Who • Developers • Data engineers • Sysadmins • Analysts
  • 25. Tools
  • 26. The dark art of data science • Feature engineering • “Getting a mess of raw data that can be used as input to a machine learning algorithm” - @josh_wills • Video from Midwest.io 2014
  • 27. Data transformation for all
  • 28. Rocana Transform • Library • Java • Rocana configuration • JSON + comments + specific numeric types - excess quoting
  • 29. Data model • Event schema: • id: A globally unique identifier for this event • ts: Epoch timestamp in milliseconds • event_type_id: ID indicating the type of the event • location: Location from which the event was generated • host: Hostname, IP, or other device identifier from which the event was generated • service: Service or process from which the event was generated • body: Raw event content in bytes • attributes: Event type-specific key/value pairs
  • 30. Example event
        {
          "id": "JRHAIDMLCKLEAPMIQDHFLO3MXYXV7NVBEJNDKZGS2XVSEINGGBHA====",
          "event_type_id": 100,
          "ts": 1436576671000,
          "location": "aws/us-west-2a",
          "host": "example01.rocana.com",
          "service": "dhclient",
          "body": "<36>Jul 10 18:04:31 gs09.example.com dhclient[865] DHCPACK from ",
          "attributes": {
            "syslog_timestamp": "1436576671000",
            "syslog_process": "dhclient",
            "syslog_pid": "865",
            "syslog_facility": "3",
            "syslog_severity": "6",
            "syslog_hostname": "example01",
            "syslog_message": "DHCPACK from 10.10.1.1 (xid=0x5c64bdb0)"
          }
        }
  • 31. Filter, extract, and flatten
  • 32. Filter, extract, and flatten • Filter out events without type id 100 • Filter out events without hostname prefix "ex" • Extract a numeric prefix from the syslog message • Flatten syslog attributes to top-level fields in a different Avro schema
  • 33. Filter, extract, and flatten
        {
          load-event: {},
          // Filter by event_type_id
          filter: { expression: "${event_type_id == 100}" },
          // Extract hostname prefix
          regex: { ... },
          filter: { expression: "${host_prefix.match.group.1 == 'ex'}" },
          // Extract a numeric prefix from the syslog message
          regex: { ... },
          // Build flattened record
          build-avro-record: { ... },
          // Accumulate output record
          accumulate-output: { value: "${output_record}" }
        }
  • 34. Extract hostname prefix
        {
          load-event: {},
          filter: { expression: "${event_type_id == 100}" },
          regex: {
            pattern: "^(.{2}).*$",
            value: "${attr.syslog_hostname}",
            destination: "host_prefix"
          },
          filter: { expression: "${host_prefix.match.group.1 == 'ex'}" },
          ...
        }
  • 35. Extract numeric prefix
        ...
        filter: { expression: "${host_prefix.match.group.1 == 'ex'}" },
        regex: {
          pattern: "^([0-9]*)",
          value: "${attributes['syslog_message']}",
          destination: "msg",
          match-actions: {
            set-values: { extracted_field: "${msg.match.group.1}" }
          },
          no-match-actions: {
            set-values: { extracted_field: "" }
          }
        },
        ...
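To see what the two patterns on slides 34 and 35 capture, here is a small sketch using `java.util.regex`. It assumes the regex action follows Java regex semantics, which the slides do not confirm; `PrefixExtraction` and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that applies the two slide patterns with java.util.regex.
final class PrefixExtraction {
    private static final Pattern HOST_PREFIX = Pattern.compile("^(.{2}).*$");
    private static final Pattern NUMERIC_PREFIX = Pattern.compile("^([0-9]*)");

    /** First two characters of the hostname, or null if it has fewer than two. */
    static String hostPrefix(String hostname) {
        Matcher m = HOST_PREFIX.matcher(hostname);
        return m.matches() ? m.group(1) : null;
    }

    /** Leading digits of the message, or "" when it does not start with a digit. */
    static String numericPrefix(String message) {
        Matcher m = NUMERIC_PREFIX.matcher(message);
        // "[0-9]*" also matches zero digits, so find() succeeds on any input
        // and group 1 is simply empty when there is no numeric prefix.
        return m.find() ? m.group(1) : "";
    }
}
```

Under these semantics the pattern `^([0-9]*)` matches every input, so a no-match branch would only fire if the pattern required at least one digit (for example `^([0-9]+)`). Whether Rocana Transform treats an empty match as a match is not stated in the slides.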
  • 36. Build flattened record
        ...
        build-avro-record: {
          schema-uri: "resource:avro-schemas/flattened-syslog.avsc",
          destination: "output_record",
          field-mapping: {
            ts: "${ts}",
            event_type_id: "${event_type_id}",
            source: "${source}",
            syslog_facility: "${convert:toInt(attributes['syslog_facility'])}",
            syslog_severity: "${convert:toInt(attributes['syslog_severity'])}",
            ...
            syslog_message: "${attributes['syslog_message']}",
            syslog_pid: "${convert:toInt(attributes['syslog_pid'])}",
            extracted_field: "${extracted_field}"
          },
        },
        ...
  • 37. Extract metrics from log data
  • 38. Extract metrics • Input: HTTP status logs • Extract request latency • Extract counts by HTTP status code • Metric types: • Gauge: A value that varies over time (think latency, CPU %, etc.) • Counter: A value that accumulates over time (think event volume, status codes, etc.)
  • 39. Example metric event
        {
          "id": "JRHAIDMLCKLEAPMIQDHFLO3MXBBQ7NVBEJNDKZGS2XVSEINGGBHA====",
          "event_type_id": 107,
          "ts": 1436576671000,
          "location": "aws/us-west-2a",
          "host": "web01.rocana.com",
          "service": "httpd",
          "attributes": {
            "m.http.request.latency": "4.2000000000E1|g",
            "m.http.status.401.count": "1.0000000000E0|c"
          }
        }
  • 40. Extract metrics
        {
          load-event: {},
          build-metric: {
            gauge-mapping: {
              http.request.latency: "${convert:toDouble(attributes['latency'])}"
            },
            destination: "latency_metric"
          },
          accumulate-output: { value: "${latency_metric}" },
          build-metric: {
            dynamic-counter-mapping: [
              "${string:format('http.status.%s.count', attributes['sc_status'])}", 1D
            ],
            destination: "status_metric"
          },
          accumulate-output: { value: "${status_metric}" }
        }
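The attribute values in the example metric event on slide 39 look like a number in scientific notation followed by `|g` (gauge) or `|c` (counter). The sketch below reproduces that shape in plain Java. The encoding rules, the `m.` key prefix, and the names `MetricAttributes` and `fromAccessLog` are all inferred from that single example and should be read as assumptions, not as the library's documented format.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical encoder that mimics the metric attributes on the example slide.
final class MetricAttributes {
    /** Ten fraction digits, unpadded exponent, then "|" and the metric type. */
    private static String encode(double value, char type) {
        DecimalFormat format = new DecimalFormat(
            "0.0000000000E0", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(value) + "|" + type;
    }

    /** Builds a latency gauge and a per-status counter from log attributes. */
    static Map<String, String> fromAccessLog(Map<String, String> attributes) {
        Map<String, String> metrics = new LinkedHashMap<>();
        double latency = Double.parseDouble(attributes.get("latency"));
        metrics.put("m.http.request.latency", encode(latency, 'g'));
        String counterName =
            String.format("m.http.status.%s.count", attributes.get("sc_status"));
        metrics.put(counterName, encode(1.0, 'c'));
        return metrics;
    }
}
```

The counter name is built per event from the status code, which mirrors the `dynamic-counter-mapping` entry: the set of counters is not fixed in advance but follows whatever status codes appear in the logs.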
  • 41. Architecture
  • 42. Architecture (diagram) • Components: Configuration file, Driver, Java action objects, Context, Variables • Steps: 1. Parse config 2. Initialize context 3. Execute actions 4. Read/write variables 5. Copy output
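The driver loop in the architecture diagram can be sketched in a few lines of Java. Everything here is hypothetical: `TransformDriverSketch`, `Context`, and this `Action` signature are illustrative stand-ins, and step 1 (parsing the configuration into action objects) is skipped by passing already-built actions in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver loop mirroring steps 2-5 of the architecture slide.
final class TransformDriverSketch {
    /** Hypothetical context: named variables plus accumulated output. */
    static final class Context {
        final Map<String, Object> variables = new HashMap<>();
        final List<Object> output = new ArrayList<>();
    }

    /** Hypothetical action contract; returning false stops processing the event. */
    interface Action {
        boolean execute(Context context);
    }

    /** Runs the actions in order against one event and returns its output. */
    static List<Object> run(List<Action> actions, Map<String, Object> event) {
        Context context = new Context();            // 2. Initialize context
        context.variables.putAll(event);
        for (Action action : actions) {             // 3. Execute actions
            if (!action.execute(context)) {         // 4. Actions read/write variables
                break;
            }
        }
        return new ArrayList<>(context.output);     // 5. Copy output
    }
}
```

Because the driver only needs a list of actions and an event, the same loop could be called from a Storm bolt, a Spark job, or a plain Kafka consumer, which is the embeddability the talk argues for.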
  • 43. Custom actions • Actions loaded at runtime using Java services framework • Add your jar to the classpath • Custom actions appear as top-level keywords just like regular actions • Implement the execute() method of the Action interface • Implement the build() method of the ActionBuilder interface
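The slide names an `Action` interface with `execute()` and an `ActionBuilder` interface with `build()`, but does not show their signatures. The sketch below therefore defines its own hypothetical versions of both, plus a hypothetical `keyword()` method, to show how `java.util.ServiceLoader` (the standard Java services mechanism) could discover builders at runtime. It is not the Rocana Transform API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical plugin contract and discovery; signatures are assumptions.
final class CustomActionSketch {
    /** Assumed shape of an action: reads and writes context variables. */
    interface Action {
        void execute(Map<String, Object> variables);
    }

    /** Assumed shape of a builder: turns a config block into an action. */
    interface ActionBuilder {
        String keyword();
        Action build(Map<String, Object> config);
    }

    /** Example custom action that upper-cases one variable into another. */
    static final class UppercaseBuilder implements ActionBuilder {
        public String keyword() {
            return "uppercase";
        }

        public Action build(Map<String, Object> config) {
            String source = (String) config.get("value");
            String destination = (String) config.get("destination");
            return variables -> variables.put(
                destination,
                String.valueOf(variables.get(source)).toUpperCase(Locale.ROOT));
        }
    }

    /**
     * Collects builders listed in META-INF/services files on the classpath.
     * A real provider would normally be a public top-level class named in
     * a file whose name is the fully qualified name of the builder interface.
     */
    static Map<String, ActionBuilder> discover() {
        Map<String, ActionBuilder> byKeyword = new HashMap<>();
        for (ActionBuilder builder : ServiceLoader.load(ActionBuilder.class)) {
            byKeyword.put(builder.keyword(), builder);
        }
        return byKeyword;
    }
}
```

Keying the builders by keyword is one way a custom action could show up as a top-level configuration keyword next to the built-in ones.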
  • 44. Custom actions • Parse custom log formats (Cisco ACS, Citrix, Juniper, customer-specific formats) • Look up IP addresses in the MaxMind GeoIP2 database • Reference dataset lookups (device id to device name)
  • 45. Putting it all together • Stream processing is causing us to re-think how we analyze data • Limiting the accessibility of data transformation increases costs and decreases velocity • Reduce your reliance on developers to code custom pipelines • Re-use transformation configuration in any stream processing framework or batch job
  • 46. Coming soon • Rocana Transform will be released under the ASL 2.0 • The base configuration library is available today: https://github.com/scalingdata/rocana-configuration
  • 47. Questions? • Signing "Hadoop Security" today at 1pm at the Cloudera booth