5. Homemade NetFlow collector
flow record FLOW-RECORD
match ipv4 protocol
match ipv4 source address
match ipv4 destination address
match transport source-port
match transport destination-port
match application name
collect counter bytes
collect counter packets
collect timestamp absolute first
collect timestamp absolute last
Data FlowSets are parsed dynamically against the Template FlowSets,
so each record field gets an appropriate field name
https://github.com/tetsusat/fnfc
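The template-driven parsing described above can be sketched roughly as follows. This is not the fnfc code itself; it is a minimal illustration that assumes the NetFlow v9 FlowSet layout from RFC 3954 and operates on FlowSet bodies (the bytes after the 4-byte FlowSet header). The function names and the small `FIELD_NAMES` table are hypothetical and cover only a few field types.

```python
import socket
import struct

# Small subset of NetFlow v9 field types (RFC 3954); illustrative only.
FIELD_NAMES = {
    1: "in_bytes",
    2: "in_pkts",
    4: "protocol",
    7: "l4_src_port",
    8: "ipv4_src_addr",
    11: "l4_dst_port",
    12: "ipv4_dst_addr",
}


def parse_template_flowset(body, templates):
    """Store each template found in `body` as {template_id: [(type, length), ...]}."""
    offset = 0
    while offset + 4 <= len(body):
        template_id, field_count = struct.unpack_from("!HH", body, offset)
        if template_id < 256:
            break  # data templates use IDs above 255; treat the rest as padding
        offset += 4
        fields = []
        for _ in range(field_count):
            fields.append(struct.unpack_from("!HH", body, offset))
            offset += 4
        templates[template_id] = fields


def decode_field(field_type, raw):
    """Render IPv4 address fields as dotted quads, everything else as an integer."""
    if field_type in (8, 12):
        return socket.inet_ntoa(raw)
    return int.from_bytes(raw, "big")


def parse_data_flowset(template_id, body, templates):
    """Decode the records of a Data FlowSet using a previously seen template."""
    fields = templates[template_id]
    record_len = sum(length for _, length in fields)
    records = []
    offset = 0
    # Trailing bytes shorter than one record are padding and are ignored.
    while record_len > 0 and offset + record_len <= len(body):
        record = {}
        for field_type, length in fields:
            name = FIELD_NAMES.get(field_type, "field_%d" % field_type)
            record[name] = decode_field(field_type, body[offset:offset + length])
            offset += length
        records.append(record)
    return records


# Hand-built example: template 256 with source address, destination address, byte count.
templates = {}
template_body = struct.pack("!HH", 256, 3) + struct.pack("!HHHHHH", 8, 4, 12, 4, 1, 4)
parse_template_flowset(template_body, templates)

data_body = socket.inet_aton("10.0.0.1") + socket.inet_aton("10.0.0.2") + struct.pack("!I", 1500)
records = parse_data_flowset(256, data_body, templates)
print(records)
```

Because the record layout comes from the template rather than from the code, changing the `flow record` definition on the router should not require changing the parser, only extending the name table.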
Router configuration
10. • Bytes per application
Apache Spark + Apache Zeppelin 3/6
%sql
SELECT record.application_name, sum(record.client_bytes) bytes FROM records GROUP BY record.application_name
11. • Bytes per application (with a parameterized WHERE clause)
Apache Spark + Apache Zeppelin 4/6
%sql
SELECT record.application_name, sum(record.client_bytes) bytes FROM records
WHERE record.ipv4_src_addr="${src}" AND record.ipv4_dst_addr="${dst}"
GROUP BY record.application_name
14. • Aggregate bytes in 30-minute intervals over a specific day
Apache Spark + Apache Zeppelin 5/6
%sql
SELECT from_unixtime(m.timeslot*(30*60)) dtime, sum(m.bytes) bytes
FROM (
SELECT record.client_bytes bytes, floor(unix_timestamp(record.absolute_first)/(30*60)) timeslot
FROM records
WHERE record.absolute_first >= "2016-03-24" AND record.absolute_first < "2016-03-25"
) AS m
GROUP BY m.timeslot ORDER BY m.timeslot
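The 30-minute bucketing in this query is plain integer arithmetic on epoch seconds. The sketch below reproduces it in Python for a single timestamp. It assumes UTC throughout (Spark applies its own session time zone), and the helper names `timeslot` and `slot_start` are hypothetical.

```python
from datetime import datetime, timezone

SLOT_SECONDS = 30 * 60  # same 30*60 constant as in the query


def timeslot(ts):
    """Mirror floor(unix_timestamp(ts) / (30*60)): index of the 30-minute slot."""
    return int(ts.timestamp()) // SLOT_SECONDS


def slot_start(slot):
    """Mirror from_unixtime(timeslot * (30*60)): start time of the slot (UTC here)."""
    return datetime.fromtimestamp(slot * SLOT_SECONDS, tz=timezone.utc)


# A flow first seen at 10:47:12 falls into the slot that starts at 10:30:00.
first_seen = datetime(2016, 3, 24, 10, 47, 12, tzinfo=timezone.utc)
slot = timeslot(first_seen)
start = slot_start(slot)
print(start.isoformat())
```

Dividing, flooring, and multiplying back by the same constant snaps every timestamp to the start of its interval, which is what makes `GROUP BY m.timeslot` produce one row per half hour.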
16. • Aggregate bytes in 30-minute intervals over a specific day (grouped per application)
Apache Spark + Apache Zeppelin 6/6
%sql
SELECT from_unixtime(m.timeslot*(30*60)) dtime, m.app, sum(m.bytes) bytes
FROM (
SELECT record.client_bytes bytes, record.application_name app, floor(unix_timestamp(record.absolute_first)/(30*60)) timeslot
FROM records
WHERE record.absolute_first >= "2016-03-24" AND record.absolute_first < "2016-03-25"
) AS m
GROUP BY m.timeslot, m.app ORDER BY m.timeslot
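For readers less used to SQL, the two-key grouping can be sketched in plain Python. The rows below are made-up sample values, not data from the talk, and `bytes_per_slot_and_app` is a hypothetical helper. Unlike the query, which orders by time slot only, this sketch also sorts by application name within a slot so the output is deterministic.

```python
from collections import defaultdict

SLOT_SECONDS = 30 * 60


def bytes_per_slot_and_app(rows):
    """Sum bytes per (slot start in epoch seconds, application name)."""
    totals = defaultdict(int)
    for epoch_seconds, app, nbytes in rows:
        slot_start = (epoch_seconds // SLOT_SECONDS) * SLOT_SECONDS
        totals[(slot_start, app)] += nbytes
    return sorted(totals.items())


base = 810432 * SLOT_SECONDS  # 2016-03-24 00:00:00 UTC as epoch seconds
rows = [
    (base + 10, "http", 100),   # made-up sample flows
    (base + 1700, "http", 50),
    (base + 1800, "dns", 7),
    (base + 1900, "http", 1),
]
result = bytes_per_slot_and_app(rows)
for (slot_start, app), total in result:
    print(slot_start, app, total)
```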