2. Problems with Security in the SDLC
• Waterfall versus Agile (Technical debt)
• Security backlogs, Hardening sprints
• Ratcheting is pen-testing for the SDLC
3. Testing in Prod
• [Almost] Never test in production
• Configure temporary DNS/IP for test box
• Run only the test cases that require being on the Internet
4. AppSec Programs & App Assessments
• Don’t blindly hire external pen-testers
• Don’t blindly follow the maturity models
• If you lead with a tool, lead with instrumentation
• Not app|code scanners or manual pen-tests
5. Start with Instrumentation
• DBI (Pintool, DynamoRIO, IDA+PaiMei/PyDbg)
• Compiler-based (does only gcc support this?)
• Instrumentation is actually perfect for web applications
• Fortify PTA, Aspect Security, Morcilla PHP
6. TAOSSA Code-Audit Strategies
• Instrumentation (CC5) takes care of inputs
(filters/validation) and outputs (escaping)
• Candidate-points mostly taken care of
• CC1-4: Don’t worry about object-oriented
• DGs: Use OOA&D with Patterns, EAI/Web2.0
7. Which Apps to Test?
• Don’t enumerate or discover web apps
• Locate databases and understand data
• Find where the data flows to
• Threat-model and refactor to security
patterns. Then do posture assessments
8. How to Test Risky Apps
• Do the manual penetration-testing
• Reverse testing
• Tiered testing
• Make somebody else do it for you
9. Dev-Test and SQE (Quality)
• Leverage any existing test-harness
• Outsource to large usability-testing firms
• Company-wide bug hunt days
10. Leverage the Test Harness
• Webapp: HtmlUnit, Selenium RC, JsTestDriver
• Fatapp: Test|Fake client, Corpus distillation
• RESTful apps: SoapUI, Unit testing frameworks
• Continuous-prevention development
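The "continuous-prevention development" bullet above means every security bug fix ships with a regression test that lives in the existing harness forever. A minimal sketch of the idea; `render_comment` and the payload are hypothetical stand-ins for a real fixed view helper and the payload from its original bug report:

```python
# Continuous-prevention sketch: the payload from the original bug report
# becomes a permanent unit test, so the same bug cannot silently return.
# render_comment is a hypothetical view helper whose fix added escaping.
import html
import unittest

def render_comment(text):
    # The fix under test: output-encode user data before rendering.
    return '<p class="comment">%s</p>' % html.escape(text, quote=True)

class XssRegression(unittest.TestCase):
    def test_original_report_payload_is_escaped(self):
        out = render_comment('<script>alert(1)</script>')
        self.assertNotIn('<script>', out)     # raw payload must not survive
        self.assertIn('&lt;script&gt;', out)  # it should be entity-encoded
```

Run it with the rest of the suite (e.g. `python -m unittest`); the test doubles as documentation of the class of bug that was stamped out.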
11. Usability Outsourcing
• E.g. Nielsen Norman Group
• Testing Intranets
• If you can’t do this, then do bug-hunts
• Invite everybody
13. Epic-Fail Guy (EFG) Revisited
• Required static analysis doesn’t stop EFG
• OWASP ESAPI doesn’t stop EFG
• Appsec training doesn’t stop EFG
• They are legion
14. Static Analysis Tools Suck
• Too expensive in both money and time
• 3k/2wk/app, 30k/yr, 60k/yr
• Security coverage costs 25k/yr
• SATE 2009, ManVsAutoVulnAssessment
15. Fuzzers and Scanners Suck
• Software Security Testing & Quality Assurance
• “… the fuzzers found, on average, over 50% more bugs
than just running the most effective fuzzer by itself”
• “every 1% of code coverage = finding 1% more bugs”
• Wivet and SQLiBENCH results are still poor
16. Code Reviews Don’t Scale
• Walkthroughs rarely happen and are rarely useful
• Specs and requirements rarely happen and are rarely useful
• They are awesome though
17. Pen-Tests Don’t Scale
• All pen-tests should include free, automated
regressions that can be run in e.g. cron and
provided to the business with free support
• The Appsec SaaS companies do this already
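A sketch of the "free, automated regression" idea above: a small script that replays the original finding's request on a schedule and signals failure if the payload reflects again. The URL, parameter name, and payload are hypothetical placeholders for a real finding; a real handover would also include the cron entry and support contact.

```python
# Cron-able regression sketch for a single pen-test finding: replay the
# reported request and flag the app owner if the payload reflects again.
# TARGET, PARAM, and PAYLOAD are hypothetical placeholders.
import urllib.parse
import urllib.request

TARGET = "http://staging.example.internal/search"  # placeholder target
PARAM = "q"                                        # parameter from the report
PAYLOAD = '"><svg onload=alert(1)>'                # payload from the report

def finding_regressed(body, payload):
    """The finding has regressed if the payload comes back unencoded."""
    return payload in body

def check(url=TARGET, param=PARAM, payload=PAYLOAD):
    qs = urllib.parse.urlencode({param: payload})
    with urllib.request.urlopen("%s?%s" % (url, qs), timeout=10) as resp:
        return finding_regressed(resp.read().decode("utf-8", "replace"), payload)

# Example cron entry (assumption; wire the exit code to whatever alerts you use):
# 0 6 * * * /usr/bin/python3 /opt/regressions/finding_123.py
```

The point is not the ten lines of Python but the handover: the business gets something it can run in cron without paying for a re-test.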
18. Types of Pen-Testing
• Peripheral (mostly point-and-shoot + reports)
• Adversarial (threat-modeling required)
• Still doesn’t scale, but pretty cool guy
19. State of the Art AppSec Risk Management
• Combine methods (SAST+DAST, VA+WAF, etc)
• Threadfix, HoneyApps, O2, Aspect Security
• Pen-test specific: The Dradis Framework
• Vendor specific: 360, AMP, Hybrid 2.0
20. The DevTest Security Analyst
• aka Security Bugfixer aka “Security Buddy”
• Uses test harness, HP Test Data Management
• Reads InfoQ, Hacker News, SpotTheVuln
• Stamps out classes of security bugs
"Too early for Maturity Models"
-- The same X activities are all trying to accomplish the same goal, giving you more controls (throwing away the principles) but only so that there's more stuff to confuse, more stuff to pay for, more stuff to train on, more money to spend on training, more hamster wheel of pain!!!
-- We need to focus on one activity that accomplishes the goals, gives us data, gives us a central location to work, and that doesn't require XX solutions
Hire two internal appsec experts. The reason you need two is that the app owners never give you more than one account to access the system. Pair testing is the ultimate way of testing apps.
Enterprise licenses are often cheaper than consulting licenses.
That one activity to start with is Instrumentation. If you are going to buy something, buy this. If you are going to invest your time in something, invest in this. If you are going to tie it to risk management, tie it with this. If you are going to mine data on your appsec program, mine this.
http://www.gdssecurity.com/l/b/2007/11/17/early-look-at-tracer-20-beta/
http://chorizo-scanner.com
Not reading TAOSSA is like being 1500 days behind everyone who did and still thinking that you are contributing something to appsec
If you really do want to enumerate all of your web servers (HTTP and TLS) running on any TCP port, try http://dogtown604.googlecode.com
However, this doesn't force risk management; it plays into compliance practices. Understand the data, the data flow, and the execution flow that interacts with that data flow; only then can you start to understand risk management.
Beat the hackers by getting there first.
"Reverse Testing"
The concept: train an app scanner/crawler against an instrumented code-base and then work backwards.
E.g. walk all parts of the app in Firefox, submitting every form, etc. through Burp Suite Pro against an app instrumented with Fortify PTA. Instead of jumping to Intruder, Repeater, or Scanner, first build a list of all of the hits from walking the app (URLs, HTTP headers/cookies, query|form parameters, etc) that Fortify PTA claims are vulnerable. Finally, run attacks specifically against the places that you know are vulnerable, using the information from either the source/pass-thru/sink type or the source code directly (if you have source code, it also maps URLs and/or params to code). Stop testing when you get reliable security code coverage across sources and sinks.
Pros: You only test what is exposed by the app (saving time and providing focus). You know when you are done testing, i.e. you did a thorough test
Cons: You might miss stuff due to lost sinks. Only works well on Java Enterprise, perhaps .NET, and barely PHP. Doesn't hit client-side-only code
"Tiered Testing"
Do a few crawl-only scans, both with credentials and without (I suggest WebInspect, Acunetix, and/or Netsparker for this). Do a few more crawl-only scans with all of the query|form parameters given valid, desired data as input, again both credentialed and not (if customizable).
Using the information from the crawls, authenticate at the highest level of access and train Burp Suite Pro. Look for areas that the scanners missed. Run the target analyzer and focus attacks on the dynamic URLs, starting with the ones with the most parameters. Then tiered-test the static URLs by selecting aspx|php|jsp|etc pages (especially ones with more params if they have any), followed by controller actions (assuming they can be identified), followed by any other static content (e.g. js files and crap). Use the Scanner, Repeater, and/or Intruder. Inject faults/fuzz-strings into insertion points, document results, save the Burp state, log out of the app, and restart Burp. Repeat with lower credentials (and lastly, without credentials), starting again with the dynamic URLs that have the most parameters.
Pros: Risky attack surface is tested first (saving time and providing focus). Loop back to reverse testing by going straight to code when you find a valid issue
Cons: Unless you combine this method with reverse testing you are only doing a black-box test, i.e. you don't know if you have tested everything
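The reverse-testing bookkeeping boils down to an intersection: attack only the (URL, parameter) pairs that both showed up while walking the app and that the instrumentation flagged. The data below is an illustrative stand-in for real exports from a proxy's site map and from Fortify PTA:

```python
# Reverse-testing sketch: intersect the insertion points actually exposed
# by walking the app (proxy site map) with the points the runtime
# instrumentation flagged as vulnerable. Attack only the overlap.
# Both lists are illustrative, not real tool output.

def attack_targets(crawled, flagged):
    """(url, param) pairs both reachable in the app and claimed vulnerable."""
    return sorted(set(crawled) & set(flagged))

crawled = [("/account/edit", "email"), ("/search", "q"), ("/login", "user")]
flagged = [("/search", "q"), ("/admin/export", "table")]  # export page never reached

print(attack_targets(crawled, flagged))  # -> [('/search', 'q')]
```

Note how the flagged-but-unreached point drops out (a "lost sink", per the cons above), which is exactly the blind spot this method trades away for focus.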
Note that you can also run the SQE’s QA test harness through Burp Suite Pro (I like to save the state, but others like to log req/resp’s) or another passive tool
Such as: Casaba Watcher (in Fiddler2), Google Ratproxy (don’t forget the Metasploit WMAP patches)
-- Note that Casaba has another tool called x5s that can kick off some XSS/HTMLi active tests when used in Fiddler2 as a proxy
-- w3af, Netsparker, and many other average-to-good black-box tools also support a proxy mode
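Routing an existing harness through any of these passive tools mostly means setting a proxy on the HTTP client. A minimal stdlib sketch, assuming the intercepting proxy (e.g. Burp's default listener) is on 127.0.0.1:8080:

```python
# Route a Python-based test harness's HTTP traffic through a local
# intercepting proxy so every request/response lands in its history.
# 127.0.0.1:8080 is an assumption (Burp Suite's default listener).
import urllib.request

def make_proxied_opener(proxy="127.0.0.1:8080"):
    """Build an opener that sends all HTTP(S) requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)

# opener = make_proxied_opener()
# opener.open("http://testbox.internal/app/login")  # appears in the proxy history
```

Harnesses built on other clients (Selenium, HtmlUnit, etc) have equivalent proxy settings; the point is that the harness doesn't change, only its transport does.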
UsaProxy or a similar method can log the interactions that the usability testers perform. This includes client-side state such as mouse and keyboard events. The Blacksheep browser tool logs mouse and keyboard DOM events in standard XPath format.
FiddlerCap, Firebug Net Panel History Overlay, and other tools can capture user events and history. Burp Suite (proxy) and Slogger (Firefox add-on) log basic HTTP/TLS requests and responses, but don't usually capture browser events. All of the passive tools suggested for use with test harnesses can also be used by usability testers.
This information can actually be valuable during user misuse/abuse cases/stories when iterating/refactoring designs and to patterns. Better, it could be very useful for understanding the execution flow and business logic of the app, which could allow for easier threat-modeling.
Number one reason to test out of prod: the ability to break the app in ways otherwise impossible in prod. Number two reason: instrumenting and logging the app in ways not possible in prod due to performance or other restrictions.
Set session-management timers way up for long scans. Set them way down when looking for bugs. Restart app pools to also test for state bugs. Check logging and tracing information for interesting events. Especially look for state involving data that is not supposed to be exposed (e.g. encrypted or decrypted usernames and passwords, PII, source code, error files, etc). Another reason not to test in prod is all of the crazy artifacts that testing leaves behind!
O2 has a module which maps scanner/pen-tester attacks to exception handler output
Immunity Debugger once had a plugin called SQLHooker that sat between an IIS server instance and MS-SQL
SANS literally said (in their appsec training): "We've been using Paros mainly because it's simpler (i.e. has less options)"
According to recent web app scanner reports (e.g. “Why Johnny Can’t Pen-Test”), Paros performed worse than any other scanner besides its own commercial version (MILEScan), NTOSpider, and Grendel-Scan.
“Our analysis must not be used as a direct source for rating or choosing tools or even in making a decision whether or not to use tools” -SATE 2009
19k warnings reported: 8 teams reviewed a 778-warning subset over 3 months, resulting in 114 findings
FP error rate of 99.4% before analysis; FP error rate of over 85% after sampling just a subset
Manual analysis only found 13 issues, with an overlap of 53.8%
(However, the ManVsAutoVulnAssessment found 15 issues manually when SCA only found 6 of the vulns, or 40%)
WHO IS RIGHT?!@?#!?@!#
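The false-positive percentages follow directly from the SATE numbers; a quick re-derivation of the arithmetic (assuming, as the slide does, that unconfirmed warnings count as false positives):

```python
# Re-derive the FP rates from the SATE 2009 numbers quoted above.
warnings_total = 19_000   # tool warnings reported
subset_reviewed = 778     # warnings the 8 teams triaged over ~3 months
confirmed = 114           # warnings confirmed as real findings

fp_before = 1 - confirmed / warnings_total   # treating all unreviewed warnings as false
fp_subset = 1 - confirmed / subset_reviewed  # within the triaged sample

print(f"FP rate before analysis: {fp_before:.1%}")  # 99.4%
print(f"FP rate in the subset:   {fp_subset:.1%}")  # 85.3%
```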
App scanners usually try to find out what type of web app (or web server) they are testing in order to focus the scan. However, many apps are not built from one language or one framework but are a huge combination of components, URL rewritings to other web servers, load balancer-based and app-based routing techniques, and public interface exposures of third-party components, contributed components, modified third-party components, etc.
IOW: One parameter you are testing on one page could belong to an ASP.NET app, while another could be an ASP.NET MVC app, while the top-level URI could be JEE (all on a single Windows Server). And another part of the web site could have a load balancer URL-routing/re-writing to a LAMP server. Each could have a different database as well.
"BUT my developers want to use Fortify SCA | Ounce Labs | O2 | Klocwork | Coverity | Checkmarx | Compuware | Armorize | Grammatech | SciTools | SofCheck | LDRA Testbed | Veracode | etc" -- Fine, let them. Help them. Encourage them. Praise be SCA! (Just don't touch the stuff yourselves; it has very low risk management value)
False positives with recurring idioms are annoying. Vendors, please stop this or at least give us a way to deal with it better. Burp Scanner isn't horrible at it, especially when you can use grep-extract in Intruder with the same session data as the Scanner (or use the Comparer) while staying in the same tool
Re-testing false positives over and over again is the worst waste of bandwidth since 4chan
In the QA world, this is known as “equivalence classing”
"BUT my operations|network security team wants to use | should be using | is trained to perform app scans using commercial web application security scanners!"
-- No, no no no no. Only experts should use these tools, although you can have LEVELS of penetration-testers performing adversarial and peripheral activities.
-- Do not let anyone or any organization "just run an app scanner" against your app. That's like just running a fuzzer against your fat app and saying "Hey, look, I made it crash" when the fault isn't even a first- or second-chance exception but due to something incidental. It's like "Who cares? Where's the risk management?"
-- Nobody should be using app scanners and we should really stop now. They are good for teaching appsec up to a point (although I can think of better ways) and good for awareness. That time is now over
DO NOT build a “scan factory”. If you must include operators, build an ASOC
"BUT Burp is a penetration-testing tool, not really an app scanner!"
-- You are correct. Burp is magical fairy dust for appsec
-- Tools like Nmap, Nikto, and Netsparker should be leveraged because of their connectivity to The Dradis Framework as well as their extensibility through plugins
"BUT Burp doesn't work because my fat app doesn't use HTTP/TLS or AMF or some other web app technology that could even be supported in a plugin"
-- Oh no. Now it's time to move on to Echo Mirage, WPE Pro, JavaSnoop, ProxyFuzz, Peach Framework (peachshark), EFS/PaiMei, IDA Pro, Sulley, OllyDbg/ImmDbg (Uhooker), Mallory, et al
"BUT Flash and SWF files are driving me crazy"
-- Me too, buddy: get in line. Until then, start reading the code with SWFScan unless it's obfuscated, and then try that SWF Disassembler plugin for IDA Pro