4. The variables define what data is fetched from each document ('row') in the database; here the variable 'v1' will contain request.time. The column definitions are Python code which is evaluated. They have access to the variables and to a library of 'transformations': date(millis), for example, takes a UTC timestamp and converts it to a human-readable format. The second column will be titled Date and contain the result of date(v1).
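As a rough illustration of how a variable and a column definition could fit together, here is a minimal sketch. It is not Datafiddler's actual implementation: the sample document, the output format of date() and the use of eval() are assumptions made for the example.

```python
from datetime import datetime, timezone

def date(millis):
    """Convert a UTC timestamp in milliseconds to a readable string (format is an assumption)."""
    moment = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")

# Hypothetical stored document; only request.time is named in the notes.
doc = {"request": {"time": 1300000000000}}

# Variable definition: v1 is fetched from request.time.
variables = {"v1": doc["request"]["time"]}

# Column definition: a title plus a Python expression evaluated against the variables.
title, expression = "Date", "date(v1)"
cell = eval(expression, {"date": date}, variables)
print(title, "=", cell)
```

Here the expression only sees the transformation library and the variables fetched for the row, which mirrors the description above.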
5. The v0 parameter is the object id. This column uses 'Coloring', which means that the value is not displayed; instead, a color is calculated from the hash of the value. This is particularly useful when values are long but not interesting in themselves. Cookie values, for example, take up a lot of screen real estate, but often it is only interesting to see when they change, which is shown by the color.
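The coloring idea can be sketched in a few lines. The function name color() and the choice of SHA-256 are assumptions for illustration; the notes only say that a color is calculated from the hash of the value.

```python
import hashlib

def color(value):
    """Map any value to a stable '#rrggbb' color derived from its hash."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    # The first six hex digits are used as the red, green and blue components.
    return "#" + digest[:6]

# Hypothetical cookie values: identical values always get the same color,
# so a change in the cookie shows up as a change in color.
for cookie in ["JSESSIONID=abc123", "JSESSIONID=abc123", "JSESSIONID=zzz999"]:
    print(color(cookie))
```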
6. There are a lot of predefined 'transformers' which can be used when defining the columns. For example, the function below makes it possible to display both URL parameters and POST parameters in the same column.
showparams(url,form) Sorts parameters by keys. You can send in two dicts and get the combined result.
Example:
variable v2: request.url
variable v3: request.data
column: sortparams(v2, v3)
//Another version
variable v1: request
column: sortparams(form=v1.data,url=v1.url)
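A transformer of this kind might look roughly like the sketch below. It follows the sortparams(url, form) call shape from the example, but the body is an assumption: the real function's return type and its handling of duplicate keys are not described in the notes.

```python
def sortparams(url=None, form=None):
    """Combine URL and form parameters and return (key, value) pairs sorted by key."""
    combined = []
    for source in (url, form):
        # Either argument may be left out; a missing dict contributes nothing.
        combined.extend((source or {}).items())
    # Sorting is by key only; a key present in both dicts appears twice.
    return sorted(combined, key=lambda pair: pair[0])

# Hypothetical parameter dicts for one request.
url_params = {"q": "hat", "page": "2"}
form_params = {"user": "bob", "action": "login"}

print(sortparams(url_params, form_params))
print(sortparams(form=form_params, url=url_params))
```

Both calls give the same combined list, which is what makes a single column sufficient for URL and form data.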
7. It is simple to write the kind of view you need for the particular purpose at hand. Some example scenarios:
- Analysing user interaction using several accounts with different browsers
  * Color cookies
  * Color user-agent
  * Parameters
  * Response content type (?)
- Analysing server infrastructure
  * Color server headers
  * Server header value for X-Powered-By, Server etc.
  * File extension
  * Cookie names
- Searching for reflected content (e.g. for XSS)
  * Parameter values
  * True/False if parameter value is found in response body (simple Python hack)
- Analysing brute-force attempts
  * Request parameter username
  * Request parameter password
  * Response delay
  * Response body size
  * Response code
  * Response body hash
After you have written some good column definitions for a particular purpose, save them for next time.
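The True/False column for reflected content could be written along these lines. This is only a sketch: the function name, the argument shapes (a dict of string parameters and the body as a string) and the minimum-length cutoff are assumptions, not part of the tool.

```python
def reflected(params, body, min_len=4):
    """Return True if any parameter value of at least min_len characters appears verbatim in the body."""
    # Very short values (such as "1") match almost any page, so they are skipped.
    return any(len(value) >= min_len and value in body for value in params.values())

# Hypothetical request parameters and response bodies.
params = {"q": "<script>alert(1)</script>", "page": "1"}
echoing_body = "<p>You searched for <script>alert(1)</script></p>"
clean_body = "<p>No results on page 1</p>"

print(reflected(params, echoing_body))
print(reflected(params, clean_body))
```

A True in this column only marks a candidate for XSS testing; it does not show whether the reflected value is encoded safely elsewhere.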
8. This is an example of how an object (request-response) is stored in the database. Each individual field can be used in database queries; more advanced functionality can be achieved using JavaScript which is executed inside the database. Since MongoDB does not impose a schema, these structures were generated dynamically, on the fly, by the writer (Hatkit proxy). Dynamic properties such as headers and parameters can be used for selection just like any 'static' property, such as response.rtt, which will always be there. This enables semantics like "Select request.url.parameters.z from x where request.url.parameters.z exists". … (but just to be clear: all keys/values are dynamic)
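The "select ... where ... exists" semantics correspond to a MongoDB filter with the $exists operator on a dotted field path. The sketch below writes that filter as Python dicts and, so that it runs without a database, applies the same idea to in-memory sample documents. The sample documents and helper names are assumptions; only the paths request.url.parameters.z and response.rtt come from the notes.

```python
# MongoDB-style filter and projection as plain dicts. With pymongo these
# would be passed as collection.find(query, projection).
path = "request.url.parameters.z"
query = {path: {"$exists": True}}
projection = {path: 1}

def get_path(doc, dotted):
    """Follow a dotted path through nested dicts; return (found, value)."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return False, None
        current = current[key]
    return True, current

def select_existing(documents, dotted):
    """In-memory stand-in for the $exists query above."""
    results = []
    for doc in documents:
        found, value = get_path(doc, dotted)
        if found:
            results.append(value)
    return results

# Hypothetical stored request-response objects.
docs = [
    {"request": {"url": {"parameters": {"z": "1", "q": "hat"}}}, "response": {"rtt": 120}},
    {"request": {"url": {"parameters": {"q": "proxy"}}}, "response": {"rtt": 95}},
]

print(select_existing(docs, path))
```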
10. Aggregation (grouping) is a feature of MongoDB. It is like a specialized Map/Reduce which can only be performed on <10 000 documents. You provide the framework with a couple of directives, and the database returns the results, which are different kinds of sums. This enables some pretty nice queries, which can be displayed in tree form.
Example: A sitemap can easily be generated
Example: Show all HTTP response codes, sorted by host/path
Example: Show all unique HTTP header keys, sorted by extension
Example: Show all request parameter names, grouped by host
Example: Show all unique request parameter values, grouped by host
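To make the tree-shaped result concrete, the sketch below counts response codes per host and path. It emulates the grouping in plain Python on in-memory documents rather than calling MongoDB, and the field names request.host, request.path and response.status are assumptions about the stored structure.

```python
from collections import defaultdict

def group_codes(documents):
    """Count response codes per host and path: tree[host][path][status] -> count."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for doc in documents:
        request, response = doc["request"], doc["response"]
        tree[request["host"]][request["path"]][response["status"]] += 1
    return tree

# Hypothetical captured traffic.
docs = [
    {"request": {"host": "example.com", "path": "/login"}, "response": {"status": 200}},
    {"request": {"host": "example.com", "path": "/login"}, "response": {"status": 302}},
    {"request": {"host": "example.com", "path": "/index"}, "response": {"status": 200}},
    {"request": {"host": "test.local", "path": "/admin"}, "response": {"status": 403}},
]

# Print the result as an indented tree: host, then path, then status counts.
for host, paths in sorted(group_codes(docs).items()):
    print(host)
    for path, codes in sorted(paths.items()):
        print("  " + path, dict(codes))
```

The host and path levels of the same tree also give a simple sitemap.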
14. Provides capabilities to use existing frameworks, libraries and applications for analysing captured data.
15. 3rd party analysis: The idea is to use plugins that take the stored traffic and 'replay' it through other frameworks. Status: API defined, no UI exists. Runnable through console.
- W3af plugin: Plugin which uses the 'greppers' in w3af to analyse each request/response pair. Requires w3af to be installed; calls relevant parts of the w3af code directly. Status: Code works, but not feature complete.
- Ratproxy plugin: Plugin which starts ratproxy (by lcamtuf) and opens a port (X) for listening. It sets ratproxy to use port X as forward proxy, then replays all traffic through ratproxy while capturing the output from the process. Status: PoC performed, but not nearly finished.
- Httprint plugin: Plugin which uses httprint to fingerprint remote servers. Status: Idea stage, unsure if httprint is still alive.
17. For 'breakers': Datafiddler is very useful for analyzing remote servers and applications, from a low-level infrastructure point of view to high-level application flow. For 'defenders': Hatkit proxy can be set up as a reverse proxy, logging all incoming traffic. Datafiddler can then be used as a tool to analyze user interaction, e.g. to detect malicious activity and perform post-mortem analysis. The proxy is very lightweight on resources (using Rogan Dawes' OWASP Proxy), and the backend (MongoDB) has great potential to scale and can handle massive amounts of data.