Unified Data Platform
Pauline Yeung
Data Engineer, Cisco Umbrella
yeungp@cisco.com
ClickHouse Meetup
Dec 3, 2019
© 2019 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
Agenda
1 Problems
2 Use Case: Authlog
3 Use Case: Whois Records
4 Use Case: Network Tunnels
5 Next
$ whois Pauline
• Data Engineer at Cisco Umbrella, Investigate team
• M.S. Computer Engineering, Santa Clara U
• B.S. Electrical Engineering, U of Calgary
Problems
Investigate: the most powerful way to uncover threats
Console
API
SIEM, TIP
Key points
Intelligence about domains, IPs, and malware across the internet
Live graph of DNS requests and other contextual data
Correlated against statistical models
Discover and predict malicious domains and IPs
Enrich security data with global intelligence: domains, IPs, ASNs, file hashes
Investigate Backend
[Diagram labels: Umbrella Investigate, Whois, ASN, IntelDB, passive DNS]
We want
• An easy, fast, and flexible platform for ad hoc analysis of authlog, which is stored in passive DNS.
• Higher throughput and lower cost for the Whois database.
• Fast access to ASN data and enrichment of security data.
• One datastore for multiple use cases, shared with other product teams.
DNS Authoritative Log (authlog)
Passive DNS
Domain to IP Relationships
[Diagram: three domains linked to 12.4.0.4/32 on successive dates]
10 JAN 2019  domain1.com
11 JAN 2019  domain2.com
12 JAN 2019  domain3.com
AuthLog Examples
#  owner               name                             datacenter  name_server
1  tabiturient.ru.     tabiturient.ru.                  lon         ns4.nic.ru.
2  thefacebook.com.    certs.thefacebook.com.           sea         b.ns.facebook.com.
3  333az.net.          nbb4yd.333az.net.                yyz         ns2-09.azure-dns.net.
4  dotnetwork2.co.za.  d1000253-146.dotnetwork2.co.za.  jnb         ns3.dotnetworkdns.co.za.

#  name_server_ip                   rr                                    ttl   type  timestamp
1  194.226.96.8                     195.24.68.22                          3600  A     2019-12-02 10:56:47
2  2a03:2880:ffff:c:face:b00c:0:35  2620:10d:c0a1:10:0:0:0:35             600   AAAA  2019-12-02 12:40:46
3  2620:1ec:8ec::9                  ns1-07.azure-dns.com.                 20    NS    2019-12-02 10:34:15
4  41.223.172.166                   mail.d1000253-146.dotnetwork2.co.za.  3600  MX    2019-11-30 03:05:17
AuthLog Data Pipeline
[Diagram. Components: resolvers (32 data centers), authlog producer, authlog clickhouse ingester, S3 archiver, authlog parquet, authlog HBase ingester, passive DNS, API Server, Investigate UI.
Volume labels: 120b requests/day, 4b authlog/day, 3 days.
Cluster labels: 6 nodes, 1 replica, r4.4xlarge (16 vCPU, 122 GB, 2 TB disk); 32 nodes, i3 2xlarge (8 vCPU, 61 GB).]
One Set of Questions
• What's the increase in disk usage for passive DNS per day?
• What type of traffic contributes the most to the increase in disk usage?
AuthLog for Past 3 Days
CREATE TABLE IF NOT EXISTS authlog.alog_local (
owner String,
name String,
datacenter String,
name_server String,
name_server_ip String,
rr String,
ttl Int32,
type String,
timestamp DateTime)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (name, timestamp)
TTL timestamp + toIntervalDay(3)
SETTINGS index_granularity = 8192
48 Go workers write to 6 shards
ingest 1.2m rows per second
ClickHouse Kafka engine does not support Avro
AuthLog for Past 3 Days
CREATE TABLE IF NOT EXISTS authlog.alog(
owner String,
name String,
datacenter String,
name_server String,
name_server_ip String,
rr String,
ttl Int32,
type String,
timestamp DateTime)
ENGINE = Distributed(log_cluster, authlog, alog_local, cityHash64(name))
access all shards
4b rows per day
200 GB for 3 days
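The sizing figures on these two slides can be sanity-checked with a little arithmetic, and the sharding expression can be illustrated with a small routing function. The following is a minimal sketch, not the team's code. It assumes "200 GB" means decimal gigabytes across the whole 3-day window, and it uses hashlib.blake2b as a stand-in hash because cityHash64 is not in the Python standard library, so the shard numbers it produces will not match ClickHouse's. The function names are hypothetical.

```python
import hashlib

ROWS_PER_DAY = 4_000_000_000   # "4b rows per day" (slide)
RETENTION_DAYS = 3             # TTL timestamp + toIntervalDay(3) (slide)
STORAGE_BYTES = 200 * 10**9    # "200 GB for 3 days", assumed decimal GB
NUM_SHARDS = 6                 # "6 shards" (slide)


def bytes_per_row() -> float:
    """Average stored bytes per authlog row over the retention window."""
    return STORAGE_BYTES / (ROWS_PER_DAY * RETENTION_DAYS)


def average_rows_per_second() -> float:
    """Average arrival rate implied by the daily row count."""
    return ROWS_PER_DAY / 86_400


def shard_for(name: str, num_shards: int = NUM_SHARDS) -> int:
    """Route a row to a shard by hashing its sharding key.

    Stand-in for cityHash64(name) % num_shards; blake2b is used only
    so that the sketch runs without third-party packages.
    """
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_shards
```

Under these assumptions the table averages roughly 17 bytes per row, and the average arrival rate is roughly 46 thousand rows per second, well below the 1.2m rows per second ingest figure quoted earlier. Hashing on name keeps all rows for one domain on the same shard.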
Payload for Domains in 2 Consecutive Days
CREATE TABLE IF NOT EXISTS authlog.a
ENGINE = MergeTree()
ORDER BY name
AS SELECT name, type, sum(length(name) + length(rr)) AS payload
FROM authlog.alog
WHERE timestamp >= toDateTime('2019-11-28 16:00:00') and timestamp <= toDateTime('2019-11-28 19:59:59')
GROUP BY name, type
CREATE TABLE IF NOT EXISTS authlog.b
ENGINE = MergeTree()
ORDER BY name
AS SELECT name, type, sum(length(name) + length(rr)) AS payload
FROM authlog.alog
WHERE timestamp >= toDateTime('2019-11-29 16:00:00') and timestamp <= toDateTime('2019-11-29 19:59:59')
GROUP BY name, type
took 2 minutes, 148m rows, 3.0 GB
took 2 minutes, 166m rows, 3.4 GB
4 hours = 1/6 of daily authlog
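The payload metric in the two queries above is the sum of length(name) + length(rr) per (name, type). A minimal Python sketch of the same aggregation follows. The first three sample rows are taken from the AuthLog Examples slide; the fourth is a hypothetical repeat added to show the summing. Python's len() counts characters while ClickHouse's length() counts bytes, which is the same thing for ASCII DNS names like these.

```python
from collections import defaultdict

# (name, type, rr) tuples; the last row is a hypothetical repeat.
SAMPLE_ROWS = [
    ("tabiturient.ru.", "A", "195.24.68.22"),
    ("certs.thefacebook.com.", "AAAA", "2620:10d:c0a1:10:0:0:0:35"),
    ("nbb4yd.333az.net.", "NS", "ns1-07.azure-dns.com."),
    ("tabiturient.ru.", "A", "195.24.68.22"),
]


def payload_by_name_and_type(rows):
    """Mirror of: SELECT name, type, sum(length(name) + length(rr)) ... GROUP BY name, type."""
    totals = defaultdict(int)
    for name, rtype, rr in rows:
        totals[(name, rtype)] += len(name) + len(rr)
    return dict(totals)
```

For the sample rows, the two tabiturient.ru. A records collapse into a single group whose payload is twice the per-row value.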
Payload for Domains Only in Day 2
CREATE TABLE IF NOT EXISTS authlog.b_only
ENGINE = MergeTree()
ORDER BY (name)
AS SELECT
b.name as name,
b.type as type,
sum(b.payload) as payload
FROM a
RIGHT JOIN b ON a.name = b.name
WHERE a.name like ''
GROUP BY
b.name,
b.type
[Venn diagram: a = 4 hours on Nov 28, b = 4 hours on Nov 29; b_only is the part of b outside a]
took 5 minutes, 108m rows, 2.5 GB
users.xml
max_memory_usage = 96GB
max_bytes_before_external_group_by = 48GB
max_bytes_before_external_sort = 48GB
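The RIGHT JOIN with WHERE a.name like '' acts as an anti-join: with ClickHouse's default join_use_nulls = 0, columns from the unmatched side are filled with default values, so a.name is an empty string exactly for the rows of b that have no partner in a. A minimal Python sketch of the same set logic follows. It assumes both inputs are dicts keyed by (name, type), as the a and b tables are after their GROUP BY; the function name is hypothetical.

```python
def rows_only_in_b(a, b):
    """Keep the (name, type) -> payload entries of b whose name never appears in a.

    The SQL joins on name alone, so a name seen in a with any record
    type excludes every row of b for that name.
    """
    names_in_a = {name for (name, _rtype) in a}
    return {key: payload for key, payload in b.items() if key[0] not in names_in_a}


# Tiny hypothetical inputs to show the behavior.
day1 = {("x.com.", "A"): 10}
day2 = {("x.com.", "AAAA"): 5, ("y.com.", "A"): 7}
new_on_day2 = rows_only_in_b(day1, day2)
```

In the example, x.com. is dropped even though its AAAA record is new, because the join key is the name only.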
Second Level Domains with Highest Payload
SELECT
arrayStringConcat([splitByString('.', name)[-3], '.', splitByString('.', name)[-2], '.']) AS pname,
sum(payload) / 1024 / 1024 AS payload_MB
FROM b_only
GROUP BY pname
ORDER BY payload_MB DESC
LIMIT 100
pname payload_MB
cloudfront.net. 878.2994289398193
office.com. 719.9946641921997
clienttons.com. 608.300389289856
cnr.io. 473.1693649291992
akamaihd.net. 395.0745334625244
cedexis-radar.net. 364.29007720947266
footprintdns.com. 265.04366874694824
gstatic.com. 250.41933727264404
squarespace.com. 151.24806880950928
forter.com. 151.08679962158203
Example: wacodenver-com.mail.protection.outlook.com. maps to outlook.com.
took 6 seconds
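The pname expression takes the last two labels of a fully qualified name. Because names end with a trailing dot, splitting on '.' leaves an empty final element, which is why the query indexes [-3] and [-2]. A minimal Python sketch of the same extraction follows; the function name is hypothetical, and returning an empty label for short names is a choice made for this sketch.

```python
def second_level_domain(name: str) -> str:
    """Return '<label>.<tld>.' for a fully qualified name ending in a dot."""
    labels = name.split(".")

    def label(i: int) -> str:
        # Out-of-range negative indexes yield an empty label in this sketch.
        return labels[i] if -len(labels) <= i else ""

    return label(-3) + "." + label(-2) + "."
```

Note that this purely positional rule does not know about multi-label public suffixes: d1000253-146.dotnetwork2.co.za. reduces to co.za. rather than dotnetwork2.co.za.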
Resource Type with Highest Payload
SELECT
type,
sum(payload) / 1024 / 1024 / 1024 AS payload_GB
FROM b_only
GROUP BY type
ORDER BY payload_GB desc
type payload_GB
A 4.34150860644877
CNAME 3.289442714303732
RRSIG 1.5495505537837744
DNSKEY 0.6393735473975539
TXT 0.5898861000314355
SELECT sum(payload) / 1024 / 1024 / 1024 as payload_GB FROM b_only
payload_GB
11.225993978790939
took 483 msec
took 131 msec
DataBricks Spark
• 1 day of authlog, ~4b rows
• Process 4x authlog
• Took 9 minutes
pname payload_GB
office.com. 3.812719924375415
cloudfront.net. 3.3973056096583605
cnr.io. 2.608651074580848
clienttons.com. 2.2882667966187
cedexis-radar.net. 1.9318219376727939
type payload_GB
A 16.849326515570283
CNAME 14.351725150831044
RRSIG 3.1499833753332496
TXT 3.047112719155848
NS 1.328178352676332
payload_GB
41.024286944419146
took 5 seconds
took 1.3 seconds
took 490 msec
Whois Record Data
WHOIS Record Data
• Who registered the domain
• Contact information used
• When/where registered
• Expiration date
• Historical data
• Correlations with other malicious domains
See relationships between attackers' infrastructure
$ whois facebook.com
:
Domain Name: FACEBOOK.COM
Registry Domain ID: 2320948_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.registrarsafe.com
Registrar URL: https://www.registrarsafe.com
Updated Date: 2019-10-17T18:52:06Z
Creation Date: 1997-03-29T05:00:00Z
Registrar Registration Expiration Date: 2028-03-30T04:00:00Z
Registrar: RegistrarSafe, LLC
Registrar IANA ID: 3237
Registrar Abuse Contact Email: abusecomplaints@registrarsafe.com
Registrar Abuse Contact Phone: +1.6503087004
Domain Status: clientDeleteProhibited https://www.icann.org/epp#clientDeleteProhibited
Domain Status: clientTransferProhibited https://www.icann.org/epp#clientTransferProhibited
Domain Status: serverDeleteProhibited https://www.icann.org/epp#serverDeleteProhibited
Domain Status: serverTransferProhibited https://www.icann.org/epp#serverTransferProhibited
Domain Status: clientUpdateProhibited https://www.icann.org/epp#clientUpdateProhibited
Domain Status: serverUpdateProhibited https://www.icann.org/epp#serverUpdateProhibited
API request: domainName
API response: WhoisRecord_rawText
Registry Registrant ID:
Registrant Name: Domain Admin
Registrant Organization: Facebook, Inc.
Registrant Street: 1601 Willow Rd
Registrant City: Menlo Park
Registrant State/Province: CA
Registrant Postal Code: 94025
Registrant Country: US
Registrant Phone: +1.6505434800
Registrant Phone Ext:
Registrant Fax: +1.6505434800
Registrant Fax Ext:
Registrant Email: domain@fb.com
Registry Admin ID:
Admin Name: Domain Admin
Admin Organization: Facebook, Inc.
Admin Street: 1601 Willow Rd
Admin City: Menlo Park
Admin State/Province: CA
Admin Postal Code: 94025
Admin Country: US
Admin Phone: +1.6505434800
Admin Phone Ext:
Admin Fax: +1.6505434800
Admin Fax Ext:
Admin Email: domain@fb.com
Tech Name: Domain Admin
Tech Organization: Facebook, Inc.
Tech Street: 1601 Willow Rd
Tech City: Menlo Park
Tech State/Province: CA
Tech Postal Code: 94025
Tech Country: US
Tech Phone: +1.6505434800
Tech Phone Ext:
Tech Fax: +1.6505434800
Tech Fax Ext:
Tech Email: domain@fb.com
Name Server: A.NS.FACEBOOK.COM
Name Server: B.NS.FACEBOOK.COM
DNSSEC: unsigned
:
API request: contactEmail
API response: list of domainName
API request: list of nameServer
API response: list of domainName
Questions
• Continue to maintain a 29-node cluster running Ubuntu Trusty and ElasticSearch 1.6?
• Migrate to AWS ElasticSearch 7.1?
• Migrate to AWS Aurora Postgres?
• Migrate to ClickHouse?
don't need full text search
does not support sharding for scaling
update is slow
insert is efficient for bulk insert only
no secondary index
no
Whois Data Pipeline
[Diagram. Components: download, whois ingester, whois indexer, ElasticSearch, ClickHouse, API Server, Investigate UI.
ElasticSearch labels: 29 nodes, 2 replicas, 2 indices, 12 TB; 1 index.
ClickHouse labels: 6 nodes, 2 replicas, 3 tables, 1 materialized view, 2 TB; 2 tables, 1 MV.]
Domain Table
CREATE TABLE IF NOT EXISTS whois.domain_local(
domainName String,
contactEmail String,
RegistryData_rawText String,
WhoisRecord_rawText String)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/whois.domain_local', '{replica}')
PRIMARY KEY (domainName)
ORDER BY (domainName)
SETTINGS index_granularity = 512
CREATE TABLE IF NOT EXISTS whois.domain(
domainName String,
contactEmail String,
RegistryData_rawText String,
WhoisRecord_rawText String)
ENGINE = Distributed(whois_cluster, whois, domain_local, cityHash64(domainName))
48 golang writers write to 6 shards
cityhash.Hash64([]byte(domainName)) % numHosts
For merging, ClickHouse keeps the last inserted row or, if a version column exists, the row with the max value in the version column.
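The merge rule described above can be illustrated with a small in-memory model. This is a minimal sketch of the deduplication semantics only, not of how ClickHouse merges parts on disk; the function name, the row layout, and the "v" version field are hypothetical.

```python
def replacing_merge(rows, key, version=None):
    """Collapse rows that share the same sorting key.

    Without a version column the last inserted row wins. With a version
    column the row with the highest version wins, and ties go to the
    row inserted last.
    """
    merged = {}
    for row in rows:  # rows are visited in insertion order
        current = merged.get(row[key])
        if current is None or version is None or row[version] >= current[version]:
            merged[row[key]] = row
    return list(merged.values())


# Hypothetical rows: the newer record for a.com happens to be inserted first.
inserted = [
    {"domainName": "a.com", "v": 2, "text": "new"},
    {"domainName": "a.com", "v": 1, "text": "old"},
    {"domainName": "b.com", "v": 1, "text": "only"},
]
by_insert_order = replacing_merge(inserted, "domainName")
by_version = replacing_merge(inserted, "domainName", version="v")
```

Because merges run in the background at an unspecified time, duplicates can still be visible until a merge happens, which is why the Whois queries later in the deck use FINAL.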
Email Table
CREATE MATERIALIZED VIEW IF NOT EXISTS whois.email_mv_local(
contactEmail String,
domainName String)
ENGINE = AggregatingMergeTree
ORDER BY (contactEmail, domainName)
POPULATE
AS SELECT contactEmail, domainName
FROM whois.domain_local
WHERE contactEmail != ''
GROUP BY contactEmail, domainName
CREATE TABLE IF NOT EXISTS whois.email_mv(
contactEmail String,
domainName String)
ENGINE = Distributed(whois_cluster, whois, email_mv_local, cityHash64(contactEmail))
domain table is 150 GB
email table is 3 GB
NS Table
CREATE TABLE IF NOT EXISTS whois.ns_local(
nameServer String,
domainName String)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/whois.ns_local', '{replica}')
PRIMARY KEY (nameServer, domainName)
ORDER BY (nameServer, domainName)
SETTINGS index_granularity = 512
CREATE TABLE IF NOT EXISTS whois.ns(
nameServer String,
domainName String)
ENGINE = Distributed(whois_cluster, whois, ns_local, cityHash64(nameServer))
name server table is 4 GB
Whois Queries
SELECT WhoisRecord_rawText FROM domain_local FINAL WHERE domainName = 'facebook.com'
SELECT WhoisRecord_rawText FROM domain FINAL WHERE domainName = 'facebook.com'
SELECT domainName FROM email_mv WHERE contactEmail = 'domain@fb.com'
domainName
buyfbfansnow.com
facebook-hardware.com
instagram-engineering.net
pokerface-book.com
what3app.com
SELECT * FROM ns WHERE nameServer LIKE '%.facebook.com'
nameServer domainName
ns1.facebook.com djgabeholm.com
ns2.facebook.com shellpriv.com
ns3.facebook.com arabfashioncompany.com
a.ns.facebook.com zuckerberg.com
b.ns.facebook.com zuckerberg.net
took 9 msec
postgres 7 msec
took 21 msec
data selected fully merged, slower
took 16 msec, 2549 rows
took 36 msec, 7075 rows
Will add TLD column, e.g. facebook.com
ORDER BY TLD, nameserver, domainName
expect query < 10 msec
Network Tunnels
Network Tunnels
• Network tunnels deliver branch office traffic to Cisco's cloud edge, where Umbrella runs a number of security functions.
• Firewall, web security, DNS security.
Provisioned per organization
Network Tunnels
[Diagram. Components: network tunnels, firewall, web security, DNS security, Tunnel Visibility Sensors, states, events, S3, download, script, CSV, ClickHouse (3 nodes, 1 replica), API Server, Network Tunnels UI]
Event Table
CREATE TABLE IF NOT EXISTS tunnel_viz.event_local(
OrgID UInt32,
TunnelID UInt32,
EventTime DateTime,
EventID String,
EventType String,
PeerID String,
PeerIP IPv4,
PeerPort UInt16,
Code LowCardinality(String),
Reason String)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(EventTime)
PRIMARY KEY (OrgID, TunnelID, EventTime)
ORDER BY (OrgID, TunnelID, EventTime)
TTL EventTime + toIntervalDay(120)
SETTINGS index_granularity = 8196
dictionary encoding, 10 unique codes
event table, 3.3m rows, 260 MB
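The "dictionary encoding" note refers to the LowCardinality(String) type on the Code column: with only 10 distinct codes, each value can be stored as a small integer that points into a list of the distinct strings. A simplified Python illustration of the idea follows. It is not ClickHouse's on-disk format, and the function names are hypothetical.

```python
def dictionary_encode(values):
    """Replace each string with its index in a list of distinct values."""
    dictionary = []   # distinct values, in order of first appearance
    index = {}        # value -> position in dictionary
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Event codes taken from the deck's Event Queries slide; the order is made up.
column = ["PEER_AUTH_FAILED", "RETRANSMIT_SEND", "PEER_AUTH_FAILED", "TS_MISMATCH"]
dictionary, codes = dictionary_encode(column)
```

The long repeated strings are stored once, and the column itself shrinks to a sequence of small integers.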
Event Table
CREATE TABLE IF NOT EXISTS tunnel_viz.event(
OrgID UInt32,
TunnelID UInt32,
EventTime DateTime,
EventID String,
EventType String,
PeerID String,
PeerIP IPv4,
PeerPort UInt16,
Code LowCardinality(String),
Reason String)
ENGINE = Distributed(cdfw_cluster, tunnel_viz, event_local, murmurHash3_32(OrgID))
fairly even distribution for integer
Event Queries
SELECT Code, count() as c FROM event GROUP BY Code ORDER BY c DESC
┌─Code────────────────────┬───────c─┐
│ PROPOSAL_MISMATCH_CHILD │ 2164575 │
│ PEER_AUTH_FAILED        │  871937 │
│ RETRANSMIT_SEND         │  243559 │
│ RETRANSMIT_SEND_TIMEOUT │   42829 │
│ UNIQUE_REPLACE          │   26540 │
│ PARSE_ERROR_BODY        │    2717 │
│ CERT_REVOKED            │    1004 │
│ LOCAL_AUTH_FAILED       │     186 │
│ TS_MISMATCH             │       9 │
│ VIP_FAILURE             │       1 │
└─────────────────────────┴─────────┘
took 23 msec, 10 unique codes
Next
What we learned
• Cheap
• Fast
• Flexible
• Good compression
• Cluster isolation for multiple stores
• Ad hoc analysis for authlog using 200 GB storage
• Lower cost and acceptable performance for the whois database
• Share hardware for different types of datastores
ClickHouse Wish List
• Support Avro in the Kafka engine.
• Rebalance the cluster and copy data after a node fails or is added.
Questions?
Backup - Other Use Cases
Threat Library
Threat Library Data Pipeline
[Diagram. Components: DNS query log, threat/attack feed 1..n, blocked domains, S3, Airflow, run job, Kubernetes Cluster, ClickHouse, API Server, Web UI. Label: 1 pod, 256 GB disk]
Blocked Domains
CREATE TABLE IF NOT EXISTS attribution.blocked(
datetime DateTime,
domain String,
threat LowCardinality(String),
attack LowCardinality(String),
count UInt32)
ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMMDD(datetime)
PRIMARY KEY (datetime, domain, threat, attack)
ORDER BY (datetime, domain, threat, attack)
TTL datetime + toIntervalDay(30)
SETTINGS index_granularity = 8196
dictionary encoding, 37 threats, 118 attacks
ASN
Autonomous Systems
• IP to ASN
ASN Attribution
Domain → IP → ASN relationships
[Diagram: domain1.com; IPs 1.168.6.17, 100.2.65.157, 104.162.93.136; ASNs AS 701, AS 3462, AS 12271]
ASN Data Pipeline
[Diagram. Components: download, CSV, data importer, multiple tables, Aurora Postgres, ClickHouse, API Server, Web UI. Label: 1 write, 1 read]
Postgres
WITH b as (
SELECT asn, cidr FROM delta_bgp_routes
WHERE (cidr >>= CAST('104.244.42.193' AS ip4r))
AND (period && DATERANGE(CURRENT_DATE - integer '2', CURRENT_DATE,'[]')))
SELECT a.asn, b.cidr, a.description, a.creation_date AS creationDate, a.ir
FROM delta_autonomous_systems AS a, b
WHERE (a.asn = b.asn)
AND (period && DATERANGE(CURRENT_DATE - integer '2', CURRENT_DATE,'[]'));
asn | cidr | description | creationdate | ir
-------+-----------------+----------------------------------+--------------+----
13414 | 104.244.42.0/24 | TWITTER - Twitter Inc., US 86400 | 2010-07-09 | 3
Took 52 ms
/etc/clickhouse-server/asn_dictionary.xml
<yandex>
<dictionary>
<name>asn_dict</name>
<layout>
<ip_trie />
</layout>
<structure>
<key>
<attribute>
<name>prefix</name>
<type>String</type>
</attribute>
</key>
<attribute>
<name>asn</name>
<type>UInt32</type>
<null_value />
</attribute>
<attribute>
<name>country</name>
<type>String</type>
<null_value>??</null_value>
</attribute>
<attribute>
<name>created_at</name>
<type>DateTime</type>
<null_value />
</attribute>
<attribute>
<name>registry</name>
<type>UInt32</type>
<null_value />
</attribute>
<attribute>
<name>description</name>
<type>String</type>
<null_value />
</attribute>
<attribute>
<name>datastr</name>
<type>String</type>
<null_value />
</attribute>
</structure>
<source>
<file>
<path>/opt/dictionaries/asnprefixes.csv</path>
<format>CSVWithNames</format>
</file>
</source>
<lifetime>300</lifetime>
</dictionary>
</yandex>
ASN Dictionary
• A Spark job takes 1.5 minutes to download the data and generate the CSV
SELECT dictGetString('asn_dict', 'datastr', tuple(IPv4StringToNum('143.202.186.23')))
143.202.186.0/24 264076 BR 1445817600 4 BREM TECHNOLOGY LTDA - ME, BR
SELECT dictGetString('asn_dict', 'datastr', tuple(IPv6StringToNum('2800:5f0:800::1')))
40.0.0.0/19 4249 US 0789782400 3 LILLY-AS - Eli Lilly and Company, US
took 2 ms
took 2 ms
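An ip_trie dictionary answers "which prefix contains this address?" by returning the most specific matching prefix. The sketch below shows that longest-prefix-match rule with Python's ipaddress module. It uses a plain linear scan rather than a trie, so it illustrates the lookup result, not the data structure or its speed. The first two prefixes and ASNs come from outputs shown in the deck; the /16 entry with private-use ASN 64512 is hypothetical and exists only to show that the longer prefix wins.

```python
import ipaddress

PREFIX_TO_ASN = {
    "143.202.186.0/24": 264076,  # from the asn_dict output in the deck
    "104.244.42.0/24": 13414,    # from the Postgres output in the deck
    "104.244.0.0/16": 64512,     # hypothetical broader prefix, private-use ASN
}


def lookup_asn(ip, table=PREFIX_TO_ASN):
    """Return (prefix, asn) for the longest prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr.version != net.version or addr not in net:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, asn)
    return None if best is None else (str(best[0]), best[1])
```

A lookup for 104.244.42.193 matches both the /16 and the /24 and returns the /24, which is the same containment question the Postgres query expresses with cidr >>= on an ip4r value.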
Unified Data Platform, by Pauline Yeung of Cisco Systems

#NSD15 - Attaques DDoS Internet et comment les arrĂȘterNetSecure Day
 

Unified Data Platform, by Pauline Yeung of Cisco Systems

  • 1. Unified Data Platform. Pauline Yeung, Data Engineer, Cisco Umbrella (yeungp@cisco.com). ClickHouse Meetup, Dec 3, 2019
  • 2. Agenda
      1. Problems
      2. Use Case: Authlog
      3. Use Case: Whois Records
      4. Use Case: Network Tunnels
      5. Next
  • 3. $ whois Pauline
      • Data Engineer at Cisco Umbrella, Investigate team
      • M.S. Computer Engineering, Santa Clara U
      • B.S. Electrical Engineering, U of Calgary
  • 4. Problems
  • 5. Investigate: the most powerful way to uncover threats
      Console, API, SIEM, TIP
      Key points:
      • Intelligence about domains, IPs, and malware across the internet
      • Live graph of DNS requests and other contextual data
      • Correlated against statistical models
      • Discover and predict malicious domains and IPs
      • Enrich security data with global intelligence
      domains, IPs, ASNs, file hashes
  • 6. Investigate Backend
      Diagram labels: Whois, ASN, IntelDB, Umbrella Investigate, passive DNS
      We want:
      • An easy, fast, and flexible platform for ad hoc analysis of authlog, which is stored in passive DNS.
      • Increased throughput and reduced costs for the Whois database.
      • Fast access to ASN data and enriched security data.
      • One datastore for multiple use cases, shared with other product teams.
  • 7. DNS Authoritative Log (authlog)
  • 8. Passive DNS
  • 9. Domain to IP Relationships
      Diagram: IP 12.4.0.4/32 linked to
      • 10 JAN 2019: domain1.com
      • 11 JAN 2019: domain2.com
      • 12 JAN 2019: domain3.com
  • 10. AuthLog Examples
      #  owner               name                             datacenter  name_server
      1  tabiturient.ru.     tabiturient.ru.                  lon         ns4.nic.ru.
      2  thefacebook.com.    certs.thefacebook.com.           sea         b.ns.facebook.com.
      3  333az.net.          nbb4yd.333az.net.                yyz         ns2-09.azure-dns.net.
      4  dotnetwork2.co.za.  d1000253-146.dotnetwork2.co.za.  jnb         ns3.dotnetworkdns.co.za.

      #  name_server_ip                   rr                                    ttl   type  timestamp
      1  194.226.96.8                     195.24.68.22                          3600  A     2019-12-02 10:56:47
      2  2a03:2880:ffff:c:face:b00c:0:35  2620:10d:c0a1:10:0:0:0:35             600   AAAA  2019-12-02 12:40:46
      3  2620:1ec:8ec::9                  ns1-07.azure-dns.com.                 20    NS    2019-12-02 10:34:15
      4  41.223.172.166                   mail.d1000253-146.dotnetwork2.co.za.  3600  MX    2019-11-30 03:05:17
  • 11. AuthLog Data Pipeline. [Diagram] Components: resolvers (32 data centers), authlog producer, authlog, ClickHouse ingester, S3 archiver, HBase ingester, API Server, Investigate UI. Stores: 3 days authlog; authlog parquet; passive DNS. Hardware labels: 6 nodes, 1 replica, r4.4xlarge (16 vCPU, 122 GB, 2 TB disk); 32 nodes, i3.2xlarge (8 vCPU, 61 GB). Volume: 120b requests/day, 4b authlog/day.
  • 12. One Set of Questions: What's the increase in disk usage for passive DNS per day? What type of traffic contributes the most to the increase in disk usage?
  • 13. AuthLog for Past 3 Days
      CREATE TABLE IF NOT EXISTS authlog.alog_local (
          owner String, name String, datacenter String, name_server String,
          name_server_ip String, rr String, ttl Int32, type String, timestamp DateTime)
      ENGINE = MergeTree()
      PARTITION BY toYYYYMMDD(timestamp)
      ORDER BY (name, timestamp)
      TTL timestamp + toIntervalDay(3)
      SETTINGS index_granularity = 8192
      Notes: 48 golang workers write to 6 shards; ingest 1.2m rows per second; ClickHouse Kafka engine does not support Avro.
  • 14. AuthLog for Past 3 Days
      CREATE TABLE IF NOT EXISTS authlog.alog (
          owner String, name String, datacenter String, name_server String,
          name_server_ip String, rr String, ttl Int32, type String, timestamp DateTime)
      ENGINE = Distributed(log_cluster, authlog, alog_local, cityHash64(name))
      Notes: access all shards; 4b rows per day; 200 GB for 3 days.
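A back-of-the-envelope check of the figures quoted on slides 13 and 14, as a minimal Python sketch. It assumes that "4b" means 4 × 10^9 rows, that "GB" means 10^9 bytes, and that the 200 GB covers exactly three full days; none of these conventions is stated on the slides.

```python
# Figures quoted on the slides; the unit conventions are assumptions.
ROWS_PER_DAY = 4_000_000_000           # "4b rows per day" (assumed 4e9)
RETENTION_DAYS = 3                     # TTL timestamp + toIntervalDay(3)
STORAGE_BYTES = 200 * 10**9            # "200 GB for 3 days" (assumed decimal GB)
STATED_INGEST_ROWS_PER_SEC = 1_200_000 # "ingest 1.2m rows per second"

# Average arrival rate if rows were spread evenly over a day.
avg_rows_per_sec = ROWS_PER_DAY / 86_400

# Average on-disk footprint per stored row.
bytes_per_row = STORAGE_BYTES / (ROWS_PER_DAY * RETENTION_DAYS)

# Ratio of the stated ingest rate to the average arrival rate.
headroom = STATED_INGEST_ROWS_PER_SEC / avg_rows_per_sec

print(f"average rows/sec: {avg_rows_per_sec:,.0f}")
print(f"bytes per row:    {bytes_per_row:.1f}")
print(f"ingest headroom:  {headroom:.1f}x")
```

Under these assumptions the average arrival rate is roughly 46 thousand rows per second and each stored row costs about 17 bytes on disk, which is consistent with the "good compression" point made later in the talk.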
  • 15. Payload for Domains in 2 Consecutive Days
      CREATE TABLE IF NOT EXISTS authlog.a ENGINE = MergeTree() ORDER BY name AS
      SELECT name, type, sum(length(name) + length(rr)) AS payload
      FROM authlog.alog
      WHERE timestamp >= toDateTime('2019-11-28 16:00:00') AND timestamp <= toDateTime('2019-11-28 19:59:59')
      GROUP BY name, type
      took 2 minutes, 148m rows, 3.0 GB
      CREATE TABLE IF NOT EXISTS authlog.b ENGINE = MergeTree() ORDER BY name AS
      SELECT name, type, sum(length(name) + length(rr)) AS payload
      FROM authlog.alog
      WHERE timestamp >= toDateTime('2019-11-29 16:00:00') AND timestamp <= toDateTime('2019-11-29 19:59:59')
      GROUP BY name, type
      took 2 minutes, 166m rows, 3.4 GB
      Note: 4 hours, ¼ daily authlog.
  • 16. Payload for Domains Only in Day 2
      CREATE TABLE IF NOT EXISTS authlog.b_only ENGINE = MergeTree() ORDER BY (name) AS
      SELECT b.name AS name, b.type AS type, sum(b.payload) AS payload
      FROM a RIGHT JOIN b ON a.name = b.name
      WHERE a.name LIKE ''
      GROUP BY b.name, b.type
      took 5 minutes, 108m rows, 2.5 GB
      [Venn diagram: a = 4 hours Nov 28, b = 4 hours Nov 29]
      users.xml: max_memory_usage = 96GB; max_bytes_before_external_group_by = 48GB; max_bytes_before_external_sort = 48GB
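The two steps on slides 15 and 16 (aggregate payload per name and type, then keep only names absent from the first window) can be illustrated in miniature. This is a sketch of the logic only, not the ClickHouse execution; the function names and the sample rows are hypothetical.

```python
from collections import defaultdict


def payload_by_name_type(rows):
    """Mirror sum(length(name) + length(rr)) ... GROUP BY name, type.

    rows is an iterable of (name, type, rr) tuples. ClickHouse length()
    counts bytes, so the strings are encoded before measuring.
    """
    totals = defaultdict(int)
    for name, rtype, rr in rows:
        totals[(name, rtype)] += len(name.encode("utf-8")) + len(rr.encode("utf-8"))
    return dict(totals)


def only_in_second(a, b):
    """Keep entries of b whose name never appears in a (the anti-join)."""
    names_in_a = {name for name, _ in a}
    return {key: payload for key, payload in b.items() if key[0] not in names_in_a}


# Hypothetical sample data using documentation-reserved names and addresses.
day1 = payload_by_name_type([("a.example.", "A", "192.0.2.1")])
day2 = payload_by_name_type([
    ("a.example.", "A", "192.0.2.1"),
    ("b.example.", "A", "192.0.2.2"),
    ("b.example.", "A", "192.0.2.3"),
])
b_only = only_in_second(day1, day2)
print(b_only)
```

In the SQL version the same filter is expressed as a RIGHT JOIN whose unmatched rows carry an empty `a.name`, which the `WHERE a.name LIKE ''` clause then selects.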
  • 17. Second Level Domains with Highest Payload
      SELECT arrayStringConcat([splitByString('.', name)[-3], '.', splitByString('.', name)[-2], '.']) AS pname,
             sum(payload) / 1024 / 1024 AS payload_MB
      FROM b_only
      GROUP BY pname
      ORDER BY payload_MB DESC
      LIMIT 100
      pname | payload_MB
      cloudfront.net. | 878.2994289398193
      office.com. | 719.9946641921997
      clienttons.com. | 608.300389289856
      cnr.io. | 473.1693649291992
      akamaihd.net. | 395.0745334625244
      cedexis-radar.net. | 364.29007720947266
      footprintdns.com. | 265.04366874694824
      gstatic.com. | 250.41933727264404
      squarespace.com. | 151.24806880950928
      forter.com. | 151.08679962158203
      Example: wacodenver-com.mail.protection.outlook.com. to outlook.com.
      took 6 seconds
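The negative indices in the query on slide 17 work because authlog names are fully qualified and end in a dot, so splitting on '.' leaves an empty last element and the two labels of interest sit at positions -3 and -2. A minimal Python sketch of the same string logic follows; the helper names are hypothetical, and returning an empty string for a missing label is an assumption meant to mirror ClickHouse's default value for out-of-range array indices.

```python
def element(parts, index):
    """Return parts[index], or '' when the index is out of range (assumed
    to mirror ClickHouse's default value for out-of-bounds array access)."""
    try:
        return parts[index]
    except IndexError:
        return ""


def second_level_domain(name):
    """Reduce a fully qualified name (with trailing dot) to its last two labels."""
    parts = name.split(".")  # trailing dot means parts[-1] == ""
    return element(parts, -3) + "." + element(parts, -2) + "."


print(second_level_domain("wacodenver-com.mail.protection.outlook.com."))
print(second_level_domain("cloudfront.net."))
```

Like the SQL, this is purely positional: it does not consult a public suffix list, so a name under a two-label suffix such as co.za. is reduced to that suffix rather than to the registered domain.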
  • 18. Resource Type with Highest Payload
      SELECT type, sum(payload) / 1024 / 1024 / 1024 AS payload_GB FROM b_only GROUP BY type ORDER BY payload_GB DESC
      type | payload_GB
      A | 4.34150860644877
      CNAME | 3.289442714303732
      RRSIG | 1.5495505537837744
      DNSKEY | 0.6393735473975539
      TXT | 0.5898861000314355
      took 483 msec
      SELECT sum(payload) / 1024 / 1024 / 1024 AS payload_GB FROM b_only
      payload_GB
      11.225993978790939
      took 131 msec
  • 19. DataBricks Spark: 1 day authlog, ~4b rows; process 4x authlog; took 9 minutes.
      pname | payload_GB
      office.com. | 3.812719924375415
      cloudfront.net. | 3.3973056096583605
      cnr.io. | 2.608651074580848
      clienttons.com. | 2.2882667966187
      cedexis-radar.net. | 1.9318219376727939
      type | payload_GB
      A | 16.849326515570283
      CNAME | 14.351725150831044
      RRSIG | 3.1499833753332496
      TXT | 3.047112719155848
      NS | 1.328178352676332
      payload_GB
      41.024286944419146
      Query times as listed on the slide: took 5 seconds; took 1.3 seconds; took 490 msec.
  • 20. Whois Record Data
  • 21. WHOIS Record Data: who registered the domain; contact information used; when/where registered; expiration date; historical data; correlations with other malicious domains. See relationships between attackers' infrastructure.
  • 22. $ whois facebook.com
      :
      Domain Name: FACEBOOK.COM
      Registry Domain ID: 2320948_DOMAIN_COM-VRSN
      Registrar WHOIS Server: whois.registrarsafe.com
      Registrar URL: https://www.registrarsafe.com
      Updated Date: 2019-10-17T18:52:06Z
      Creation Date: 1997-03-29T05:00:00Z
      Registrar Registration Expiration Date: 2028-03-30T04:00:00Z
      Registrar: RegistrarSafe, LLC
      Registrar IANA ID: 3237
      Registrar Abuse Contact Email: abusecomplaints@registrarsafe.com
      Registrar Abuse Contact Phone: +1.6503087004
      Domain Status: clientDeleteProhibited https://www.icann.org/epp#clientDeleteProhibited
      Domain Status: clientTransferProhibited https://www.icann.org/epp#clientTransferProhibited
      Domain Status: serverDeleteProhibited https://www.icann.org/epp#serverDeleteProhibited
      Domain Status: serverTransferProhibited https://www.icann.org/epp#serverTransferProhibited
      Domain Status: clientUpdateProhibited https://www.icann.org/epp#clientUpdateProhibited
      Domain Status: serverUpdateProhibited https://www.icann.org/epp#serverUpdateProhibited
      API request: domainName. API response: WhoisRecord_rawText.
  • 23. $ whois facebook.com (continued)
      Registry Registrant ID:
      Registrant Name: Domain Admin; Registrant Organization: Facebook, Inc.; Registrant Street: 1601 Willow Rd; Registrant City: Menlo Park; Registrant State/Province: CA; Registrant Postal Code: 94025; Registrant Country: US; Registrant Phone: +1.6505434800; Registrant Phone Ext: ; Registrant Fax: +1.6505434800; Registrant Fax Ext: ; Registrant Email: domain@fb.com
      Registry Admin ID:
      Admin Name: Domain Admin; Admin Organization: Facebook, Inc.; Admin Street: 1601 Willow Rd; Admin City: Menlo Park; Admin State/Province: CA; Admin Postal Code: 94025; Admin Country: US; Admin Phone: +1.6505434800; Admin Phone Ext: ; Admin Fax: +1.6505434800; Admin Fax Ext: ; Admin Email: domain@fb.com
      Tech Name: Domain Admin; Tech Organization: Facebook, Inc.; Tech Street: 1601 Willow Rd; Tech City: Menlo Park; Tech State/Province: CA; Tech Postal Code: 94025; Tech Country: US; Tech Phone: +1.6505434800; Tech Phone Ext: ; Tech Fax: +1.6505434800; Tech Fax Ext: ; Tech Email: domain@fb.com
      Name Server: A.NS.FACEBOOK.COM
      Name Server: B.NS.FACEBOOK.COM
      DNSSEC: unsigned
      :
      API request: contactEmail. API response: list of domainName.
      API request: list of nameServer. API response: list of domainName.
  • 24. Questions: Continue to maintain the 29-node cluster running Ubuntu Trusty and ElasticSearch 1.6? Migrate to AWS ElasticSearch 7.1? Migrate to AWS Aurora Postgres? Migrate to ClickHouse? Annotations on the slide: don't need full text search; does not support shard for scaling; update is slow; insert efficient for bulk insert only; no secondary index; no
  • 25. Whois Data Pipeline. [Diagram] Components: whois ingester, whois indexer, ClickHouse, ElasticSearch, API Server, Investigate UI. ElasticSearch: 29 nodes, 2 replicas, 2 indices, 12 TB. ClickHouse: 6 nodes, 2 replicas, 3 tables, 1 materialized view, 2 TB. Other labels: download; download; 1 index; 2 tables, 1 MV.
  • 26. Domain Table
      CREATE TABLE IF NOT EXISTS whois.domain_local (
          domainName String, contactEmail String, RegistryData_rawText String, WhoisRecord_rawText String)
      ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/whois.domain_local', '{replica}')
      PRIMARY KEY (domainName)
      ORDER BY (domainName)
      SETTINGS index_granularity = 512
      CREATE TABLE IF NOT EXISTS whois.domain (
          domainName String, contactEmail String, RegistryData_rawText String, WhoisRecord_rawText String)
      ENGINE = Distributed(whois_cluster, whois, domain_local, cityHash64(domainName))
      Notes: 48 golang writers write to 6 shards, choosing the shard with cityhash.Hash64([]byte(domainName)) % numHosts. For merging, ClickHouse selects the last inserted row, or, if a version column exists, the row with the max value in the version column.
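The two notes on slide 26 (hash-based shard selection by the writers, and ReplacingMergeTree's choice of surviving row) can be sketched in a few lines of Python. This illustrates the semantics only. CityHash64 is not in the Python standard library, so `hashlib.blake2b` is used as a stand-in hash and will not reproduce the shard numbers the Go writers compute; the function names and the `ver` column are hypothetical.

```python
import hashlib


def shard_for(domain_name, num_shards):
    """Pick a shard as hash(domainName) % numHosts.

    blake2b is a stand-in for CityHash64 (assumption): the routing idea is
    the same, but the resulting shard numbers differ from ClickHouse's.
    """
    digest = hashlib.blake2b(domain_name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_shards


def replacing_merge(rows, key, version=None):
    """Collapse rows that share a sorting key, as a ReplacingMergeTree merge does.

    rows must be in insertion order. Without a version column the last
    inserted row wins; with one, the row with the highest version wins and
    ties go to the later insert.
    """
    kept = {}
    for row in rows:
        current = kept.get(row[key])
        if current is None or version is None or row[version] >= current[version]:
            kept[row[key]] = row
    return sorted(kept.values(), key=lambda r: r[key])


rows = [
    {"domainName": "facebook.com", "WhoisRecord_rawText": "old record"},
    {"domainName": "facebook.com", "WhoisRecord_rawText": "new record"},
]
print(shard_for("facebook.com", 6))
print(replacing_merge(rows, "domainName"))
```

Because merges run in the background at an unspecified time, a plain SELECT can still see both rows; that is why the whois queries later in the deck use FINAL.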
  • 27. Email Table
      CREATE MATERIALIZED VIEW IF NOT EXISTS whois.email_mv_local (
          contactEmail String, domainName String)
      ENGINE = AggregatingMergeTree
      ORDER BY (contactEmail, domainName)
      POPULATE AS
      SELECT contactEmail, domainName FROM db.domain_local WHERE contactEmail != '' GROUP BY contactEmail, domainName
      CREATE TABLE IF NOT EXISTS whois.email_mv (
          contactEmail String, domainName String)
      ENGINE = Distributed(whois_cluster, db, email_mv_local, cityHash64(contactEmail))
      Notes: domain table is 150 GB; email table is 3 GB.
  • 28. NS Table
      CREATE TABLE IF NOT EXISTS whois.ns_local (
          nameServer String, domainName String)
      ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/whois.ns_local', '{replica}')
      PRIMARY KEY (nameServer, domainName)
      ORDER BY (nameServer, domainName)
      SETTINGS index_granularity = 512
      CREATE TABLE IF NOT EXISTS whois.ns (
          nameServer String, domainName String)
      ENGINE = Distributed(whois_cluster, whois, ns_local, cityHash64(nameServer))
      Note: name server table is 4 GB.
  • 29. Whois Queries
      SELECT WhoisRecord_rawText FROM domain_local FINAL WHERE domainName = 'facebook.com'
        took 9 msec (postgres 7 msec)
      SELECT WhoisRecord_rawText FROM domain FINAL WHERE domainName = 'facebook.com'
        took 21 msec; data selected fully merged, slower
      SELECT domainName FROM email_mv WHERE contactEmail = 'domain@fb.com'
        took 16 msec, 2549 rows
        domainName
        buyfbfansnow.com
        facebook-hardware.com
        instagram-engineering.net
        pokerface-book.com
        what3app.com
      SELECT * FROM ns WHERE nameServer LIKE '%.facebook.com'
        took 36 msec, 7075 rows
        nameServer | domainName
        ns1.facebook.com | djgabeholm.com
        ns2.facebook.com | shellpriv.com
        ns3.facebook.com | arabfashioncompany.com
        a.ns.facebook.com | zuckerberg.com
        b.ns.facebook.com | zuckerberg.net
      Planned: will add a TLD column, e.g. facebook.com; ORDER BY TLD, nameserver, domainName; expect query < 10 msec.
  • 30. Network Tunnels
  • 31. Network Tunnels: Network tunnels deliver branch office traffic to Cisco's cloud edge, where Umbrella runs a number of security functions: firewall, web security, DNS security. Provisioned per organization.
  • 32. Network Tunnels. [Diagram labels: network tunnels; firewall; web security; DNS security; Tunnel Visibility Sensors; states; events; script; CSV; S3; download; ClickHouse (3 nodes, 1 replica); API Server; Network Tunnels UI]
  • 33. Event Table
      CREATE TABLE IF NOT EXISTS tunnel_viz.event_local (
          OrgID UInt32, TunnelID UInt32, EventTime DateTime, EventID String, EventType String,
          PeerID String, PeerIP IPv4, PeerPort UInt16, Code LowCardinality(String), Reason String)
      ENGINE = MergeTree()
      PARTITION BY toYYYYMM(EventTime)
      PRIMARY KEY (OrgID, TunnelID, EventTime)
      ORDER BY (OrgID, TunnelID, EventTime)
      TTL EventTime + toIntervalDay(120)
      SETTINGS index_granularity = 8196
      Notes: LowCardinality gives dictionary encoding, 10 unique codes; event table, 3.3m rows, 260 MB.
  • 34. Event Table
      CREATE TABLE IF NOT EXISTS tunnel_viz.event (
          OrgID UInt32, TunnelID UInt32, EventTime DateTime, EventID String, EventType String,
          PeerID String, PeerIP IPv4, PeerPort UInt16, Code LowCardinality(String), Reason String)
      ENGINE = Distributed(cdfw_cluster, tunnel_viz, event_local, murmurHash3_32(OrgID))
      Note: fairly even distribution for integer keys.
  • 35. Event Queries
      SELECT Code, count() AS c FROM event GROUP BY Code ORDER BY c DESC
      Code | c
      PROPOSAL_MISMATCH_CHILD | 2164575
      PEER_AUTH_FAILED | 871937
      RETRANSMIT_SEND | 243559
      RETRANSMIT_SEND_TIMEOUT | 42829
      UNIQUE_REPLACE | 26540
      PARSE_ERROR_BODY | 2717
      CERT_REVOKED | 1004
      LOCAL_AUTH_FAILED | 186
      TS_MISMATCH | 9
      VIP_FAILURE | 1
      took 23 msec, 10 unique codes
  • 36. Next
  • 37. What we learned: cheap; fast; flexible; good compression; cluster isolation for multiple stores; ad hoc analysis for authlog using 200 GB storage; lower cost and acceptable performance for the whois database; shared hardware for different types of datastores.
  • 38. ClickHouse Wish List: support Avro in the Kafka engine; balance cluster and copy data after a failed or added node.
  • 39. Questions?
  • 40. Backup – Other Use Cases
  • 41. Threat Library
  • 44. Threat Library Data Pipeline. [Diagram labels: Web UI; API Server; S3; DNS query log; threat/attack; blocked domains; run job; Kubernetes Cluster; threat/attack feed 1..n; ClickHouse; Airflow; 1 pod, 256 GB disk]
  • 45. Blocked Domains
      CREATE TABLE IF NOT EXISTS attribution.blocked (
          datetime DateTime, domain String, threat LowCardinality(String), attack LowCardinality(String), count UInt32)
      ENGINE = ReplacingMergeTree()
      PARTITION BY toYYYYMMDD(datetime)
      PRIMARY KEY (datetime, domain, threat, attack)
      ORDER BY (datetime, domain, threat, attack)
      TTL datetime + toIntervalDay(30)
      SETTINGS index_granularity = 8196
      Note: LowCardinality gives dictionary encoding; 37 threats, 118 attacks.
  • 46. ASN
  • 47. Autonomous Systems: IP to ASN; ASN attribution.
  • 48. Domain → IP → ASN relationships. [Figure labels: domain1.com; 1.168.6.17, 100.2.65.157, 104.162.93.136; AS 701, AS 3462, AS 12271]
  • 49. ASN Data Pipeline. [Diagram labels: data importer; ClickHouse; Aurora Postgres (1 write, 1 read); API Server; Web UI; download; download; CSV; multiple tables]
  • 50. Postgres
      WITH b AS (
          SELECT asn, cidr FROM delta_bgp_routes
          WHERE (cidr >>= CAST('104.244.42.193' AS ip4r))
            AND (period && DATERANGE(CURRENT_DATE - integer '2', CURRENT_DATE, '[]')))
      SELECT a.asn, b.cidr, a.description, a.creation_date AS creationDate, a.ir
      FROM delta_autonomous_systems AS a, b
      WHERE (a.asn = b.asn)
        AND (period && DATERANGE(CURRENT_DATE - integer '2', CURRENT_DATE, '[]'));
      asn | cidr | description | creationdate | ir
      13414 | 104.244.42.0/24 | TWITTER - Twitter Inc., US 86400 | 2010-07-09 | 3
      Took 52 ms
  • 51. /etc/clickhouse-server/asn_dictionary.xml
      <yandex>
        <dictionary>
          <name>asn_dict</name>
          <layout> <ip_trie /> </layout>
          <structure>
            <key>
              <attribute> <name>prefix</name> <type>String</type> </attribute>
            </key>
            <attribute> <name>asn</name> <type>UInt32</type> <null_value /> </attribute>
            <attribute> <name>country</name> <type>String</type> <null_value>??</null_value> </attribute>
            <attribute> <name>created_at</name> <type>DateTime</type> <null_value /> </attribute>
            <attribute> <name>registry</name> <type>UInt32</type> <null_value /> </attribute>
            <attribute> <name>description</name> <type>String</type> <null_value /> </attribute>
            <attribute> <name>datastr</name> <type>String</type> <null_value /> </attribute>
          </structure>
          <source>
            <file>
              <path>/opt/dictionaries/asnprefixes.csv</path>
              <format>CSVWithNames</format>
            </file>
          </source>
          <lifetime>300</lifetime>
        </dictionary>
      </yandex>
  • 52. ASN Dictionary: 1.5 minutes in Spark job to download and generate CSV.
      SELECT dictGetString('asn_dict', 'datastr', tuple(IPv4StringToNum('143.202.186.23')))
        143.202.186.0/24 264076 BR 1445817600 4 BREM TECHNOLOGY LTDA - ME, BR
        took 2 ms
      SELECT dictGetString('asn_dict', 'datastr', tuple(IPv6StringToNum('2800:5f0:800::1')))
        40.0.0.0/19 4249 US 0789782400 3 LILLY-AS - Eli Lilly and Company, US
        took 2 ms
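The ip_trie layout on slide 51 resolves an address to the most specific prefix that contains it. A minimal Python sketch of that longest-prefix-match idea follows, using a linear scan from the standard `ipaddress` module rather than a real trie. The function names are hypothetical, and the covering /16 row with ASN 64512 (a private-use ASN) is invented for illustration; only the /24 row comes from the slide.

```python
import ipaddress


def build_table(rows):
    """Parse (prefix, asn) pairs into (network, asn) pairs."""
    return [(ipaddress.ip_network(prefix), asn) for prefix, asn in rows]


def lookup(table, address):
    """Return the ASN of the longest prefix containing address, or None.

    A linear scan stands in for the trie: the result is the same, only
    the lookup cost differs.
    """
    ip = ipaddress.ip_address(address)
    best = None
    for network, asn in table:
        if network.version == ip.version and ip in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, asn)
    return best[1] if best else None


table = build_table([
    ("143.202.0.0/16", 64512),     # hypothetical covering prefix
    ("143.202.186.0/24", 264076),  # prefix and ASN shown on slide 52
])
print(lookup(table, "143.202.186.23"))
print(lookup(table, "143.202.1.1"))
print(lookup(table, "198.51.100.7"))
```

Keeping the lookup inside ClickHouse as a dictionary means ASN enrichment can be done in the same query that reads the events, instead of a separate round trip to Postgres.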