natfilterd identifies unique hosts behind a NAT. It monitors TCP packets
by hooking into netfilter through the netfilter_queue extension and parses
TCP headers for the TCP Timestamps option. The OS generates TCP timestamps
from "ticks" since boot time. Collecting per-connection
(timestamp, wall clock) tuples and fitting a line through them allows
identifying unique hosts that share the same IP in real time.
natfilterd can then drop the packets of specific hosts sharing the
same source IP. A web interface is also provided.
1. Identifying hosts with natfilterd
A TCP timestamp analysis based solution
Georg Wicherski
UMIC LuFG IT-Security,
RWTH Aachen University
2011-02-14
Wicherski (RWTH Aachen University) natfilterd 2011-02-14 1/9
2. Motivation
SNAT [1] makes individual host identification hard
All hosts behind the same SNAT gateway appear to have the same IP
address
In Capture-the-Flag contests [2], SNAT is used to mix other teams’
hosts and the gameserver to prevent trivial traffic filtering of
opponent teams’ attacks
We need to identify individual attacking hosts and drop their traffic
Some people use botnet sinkholing [3] to estimate the size of a threat
Hosts behind SNAT are counted as a single infection if no application
layer ID is available
[1] Source Network Address Translation
[2] http://www.cipher-ctf.org/CaptureTheFlag.php
[3] e.g. http://www.cs.ucsb.edu/~kemm/courses/cs177/torpig.pdf
3. TCP Timestamps
The TCP protocol allows for options in the header (RFC 793)
TCP timestamps are such an extension option to optimize
performance (RFC 1323)
Support is indicated by including a timestamp option in the SYN, where the
echo field (TSecr) is zero
If both hosts support it, timestamps are exchanged
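To make the option layout concrete, the following is a minimal sketch of pulling the timestamp pair out of a raw TCP header. It assumes the option format from RFC 1323 (kind 8, length 10, followed by TSval and TSecr as 32-bit big-endian integers). The function name parse_tcp_timestamp is hypothetical and is not taken from natfilterd itself.

```python
import struct

TCPOPT_EOL = 0        # end of option list
TCPOPT_NOP = 1        # one-byte padding
TCPOPT_TIMESTAMP = 8  # RFC 1323 timestamps option, total length 10


def parse_tcp_timestamp(tcp_header: bytes):
    """Return (TSval, TSecr) from a raw TCP header, or None if absent."""
    if len(tcp_header) < 20:
        return None
    data_offset = (tcp_header[12] >> 4) * 4  # header length in bytes
    if data_offset < 20 or data_offset > len(tcp_header):
        return None
    options = tcp_header[20:data_offset]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCPOPT_EOL:
            break
        if kind == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed option
        if kind == TCPOPT_TIMESTAMP and length == 10:
            return struct.unpack("!II", options[i + 2:i + 10])
        i += length
    return None


# Hypothetical SYN header: 20 fixed bytes, then NOP, NOP, timestamp option.
syn = struct.pack("!HHIIBBHHH", 40000, 80, 1, 0, 8 << 4, 0x02, 65535, 0, 0)
syn += bytes([TCPOPT_NOP, TCPOPT_NOP, TCPOPT_TIMESTAMP, 10])
syn += struct.pack("!II", 123456, 0)
timestamps = parse_tcp_timestamp(syn)
```

In a netfilter_queue based tool, the bytes passed in would be the TCP portion of each queued packet's payload.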
5. Timestamp Generation
RFC 1323 – 3.3 The RTTM Mechanism
The timestamp value to be sent in TSval is to be obtained from
a (virtual) clock that we call the "timestamp clock". Its values
must be at least approximately proportional to real time, in order
to measure actual RTT.
$$\mathrm{TSval} = (\mathrm{wallclock} - \underbrace{\mathrm{boottime}}_{\text{host specific}}) \cdot \underbrace{\mathrm{tickscale}}_{\text{kernel specific}}$$
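As a small worked example of the relation above, the sketch below computes TSval for two hypothetical hosts that share a tickscale of 1000 ticks per second but booted at different times. The function name tsval and all numbers are illustrative assumptions. The only added fact is that the TSval field is 32 bits wide, so the value wraps.

```python
def tsval(wallclock: float, boottime: float, tickscale: float) -> int:
    """TSval = (wallclock - boottime) * tickscale, truncated to 32 bits."""
    ticks = int((wallclock - boottime) * tickscale)
    return ticks & 0xFFFFFFFF  # the TCP option field is 32 bits wide


# Two hypothetical hosts behind the same NAT, observed at the same instant.
now = 1000.0
host_a = tsval(now, boottime=400.0, tickscale=1000)  # up for 600 s
host_b = tsval(now, boottime=900.0, tickscale=1000)  # up for 100 s
```

Both hosts appear under the same source IP, yet their TSval values differ by the difference in boot time multiplied by the tickscale, and this offset is what the fingerprinting relies on.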
7. Fingerprinting Hosts
First documented in Phrack #63 0x03-2: “TCP Timestamp To count
Hosts behind NAT”
Track TCP connections: each packet belongs to the same host
Approximate linear regression equation y = c0 + x ∗ c1 from set of
points (wallclock, TSval)
If distance to next host equation below threshold, update old equation
Otherwise add new host to database
Once a host is in the database, try to match new packets against it in
realtime
Without optimizations:
O(n^2) for adding n hosts, with a significant constant factor c for the distance calculation
O(n) for matching one packet against n hosts, with a significant constant factor c
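The unoptimized procedure can be sketched as follows: fit y = c0 + x * c1 to the (wallclock, TSval) points of one connection by ordinary least squares, then scan all known hosts linearly. This is only a sketch under assumptions. The names fit_line and assign_host, the comparison of x-intercepts (the estimated boot times) as the distance, and both threshold values are hypothetical and are not taken from natfilterd.

```python
def fit_line(points):
    """Least-squares fit of y = c0 + x * c1; returns (c0, c1)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all points share the same wallclock value")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1


def assign_host(hosts, line, x_threshold=1.0, rate_tolerance=0.01):
    """Naive O(n) scan: update a matching host or append a new one."""
    c0, c1 = line
    x0 = -c0 / c1  # x-intercept, i.e. the estimated boot time
    for i, (h0, h1) in enumerate(hosts):
        same_rate = abs(h1 - c1) <= rate_tolerance * h1
        if same_rate and abs(-h0 / h1 - x0) < x_threshold:
            hosts[i] = line  # below threshold: update the old equation
            return i
    hosts.append(line)  # otherwise: add a new host to the database
    return len(hosts) - 1


def samples(boottime, tickscale, times):
    """Synthetic (wallclock, TSval) points for one connection."""
    return [(t, (t - boottime) * tickscale) for t in times]


hosts = []
a = assign_host(hosts, fit_line(samples(400.0, 1000, (1000.0, 1001.0, 1002.5))))
b = assign_host(hosts, fit_line(samples(900.0, 1000, (1000.2, 1003.0))))
c = assign_host(hosts, fit_line(samples(400.0, 1000, (1010.0, 1012.0))))
```

The first and third connections come from the same synthetic host and receive the same index, while the second is registered as a new host.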
9. Introducing “Rate Classes”
$$\mathrm{TSval} = (\mathrm{wallclock} - \underbrace{\mathrm{boottime}}_{\text{host specific}}) \cdot \underbrace{\mathrm{tickscale}}_{\text{kernel specific}}$$
Windows: Apparently uses kernel equivalent of GetTickCount()
Linux: Uses jiffies, incremented every 1/HZ seconds
Common values for HZ are 100, 250, 1000
OpenBSD, FreeBSD, NetBSD: Did not test
Finite and sufficiently small set of r different values for tickscale
Round value to 0.01ms granularity / resolution
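One possible reading of the rounding step is sketched below: an estimated tickscale (ticks per second) is converted to a tick period in milliseconds and rounded to 0.01 ms, so that noisy estimates collapse onto a small set of keys. The function name rate_class and the interpretation that the rounded value is the tick period are assumptions.

```python
def rate_class(tickscale_hz: float) -> float:
    """Tick period in milliseconds, rounded to 0.01 ms granularity."""
    if tickscale_hz <= 0:
        raise ValueError("tickscale must be positive")
    return round(1000.0 / tickscale_hz, 2)


# Noisy slope estimates for the common Linux HZ values 100, 250 and 1000.
classes = [rate_class(hz) for hz in (100.04, 250.2, 999.7)]
```

With this scheme the three noisy estimates fall into the 10 ms, 4 ms and 1 ms classes, which can then serve as hash keys.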
10. Optimizations by “Rate Classes”
$$\mathrm{dist} = \frac{x_2 - x_1}{\sin(\tan^{-1}(\mathrm{tickscale}))}$$
Algorithm for finding host for connection, O(log_2 n)
min ← ∞
rateclass ← rateclasses.hashlookup(round(tickscale))
neighbours ← rateclass.btree(x_normalized) {O(log_2 n)}
for all neighbour ∈ neighbours do
    if dist(neighbour.x, x_normalized, rateclass.rate) < min then
        min ← neighbour, dist
    end if
end for
return min
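A rough Python rendering of this lookup is given below. Several things are assumptions rather than natfilterd's actual design: a dict stands in for the hash lookup, a sorted list with bisect stands in for the B-tree, x_normalized is taken to be the x-intercept of the fitted line (the estimated boot time), and the names RateClass and find_host are hypothetical. dist follows the formula from the slide, using the absolute difference.

```python
import bisect
import math


def dist(x1, x2, tickscale):
    """Distance between two host lines of one rate class (slide formula)."""
    return abs(x2 - x1) / math.sin(math.atan(tickscale))


class RateClass:
    def __init__(self, rate):
        self.rate = rate
        self.intercepts = []  # sorted x-intercepts, one per known host

    def add(self, x):
        # Note: list insertion is O(n); a real B-tree would keep this O(log n).
        bisect.insort(self.intercepts, x)

    def neighbours(self, x):
        """At most two stored intercepts adjacent to x, found in O(log n)."""
        i = bisect.bisect_left(self.intercepts, x)
        return self.intercepts[max(i - 1, 0):i + 1]


def find_host(rateclasses, tickscale, x_normalized):
    """Return (distance, intercept) of the closest host, or (inf, None)."""
    best = (math.inf, None)
    rateclass = rateclasses.get(round(tickscale))
    if rateclass is None:
        return best
    for neighbour in rateclass.neighbours(x_normalized):
        d = dist(neighbour, x_normalized, rateclass.rate)
        if d < best[0]:
            best = (d, neighbour)
    return best


rc = RateClass(1000)
rc.add(400.0)
rc.add(900.0)
rateclasses = {1000: rc}
hit = find_host(rateclasses, 999.8, 401.0)
miss = find_host(rateclasses, 250.0, 401.0)
```

Only the two intercepts adjacent to the query are compared, which is what keeps the lookup logarithmic in the number of hosts per rate class.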
12. Optimizations by “Rate Classes” (contd.)
Algorithm for matching single packets, O(log_2 n)
min ← ∞
for all rateclass ∈ rateclasses do
    neighbours ← rateclass.btree(x_normalized) {O(log_2 n)}
    for all neighbour ∈ neighbours do
        if dist(neighbour.x, x_normalized, rateclass.rate) < min then
            min ← neighbour, dist
        end if
    end for
end for
return min
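For a single packet no slope is known yet, so every rate class has to be tried. The sketch below assumes that x_normalized is recomputed per rate class as wallclock - TSval / rate, which is the boot time the packet would imply under that rate. That normalization, the function name match_packet, and the use of sorted lists instead of B-trees are assumptions, not details confirmed by the slides.

```python
import bisect
import math


def match_packet(rateclasses, wallclock, tsval):
    """rateclasses maps rate (ticks/s) to a sorted list of x-intercepts.

    Returns (distance, rate, intercept) of the best match, or
    (inf, None, None) if no host is known.
    """
    best = (math.inf, None, None)
    for rate, intercepts in rateclasses.items():
        x = wallclock - tsval / rate  # implied boot time under this rate
        i = bisect.bisect_left(intercepts, x)
        for neighbour in intercepts[max(i - 1, 0):i + 1]:
            d = abs(neighbour - x) / math.sin(math.atan(rate))
            if d < best[0]:
                best = (d, rate, neighbour)
    return best


# Hypothetical database: two hosts at 1000 ticks/s, one at 250 ticks/s.
known = {1000: [400.0, 900.0], 250: [100.0]}
# Packet from the host booted at t=400 with 1000 ticks/s, seen at t=1000.
result = match_packet(known, wallclock=1000.0, tsval=600000)
```

Since the number of rate classes r is small and fixed, the outer loop adds only a constant factor on top of the logarithmic neighbour search. A caller would still compare the returned distance against a threshold before dropping traffic.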