This document discusses digital security and the evolution of software development tools. It summarizes the story of IDA, a disassembler and decompiler, from its origins as a hobby project to its current use. Improving digital security requires better tools for software analysis and testing, which have evolved greatly over the past decades but still have room for improvement.
2. (c) 2014 Ilfak Guilfanov
Presentation Outline
About me
Digital security is more important than ever
Evolution of development tools
Story of IDA
What we could do to improve the situation
Your feedback
3.
About myself
Started my programming career more than 27 years ago
The author of IDA Pro and Decompiler
Founder and CEO of Hex-Rays
I am passionate about programming and beautiful code
– I'm not a reverse engineer but a software developer
I believe that efficient and robust software makes our lives better
4.
Digital security
We store more and more personal data digitally:
– Medical records
– Bank account info
– Communications (emails, SMS, etc.)
– Photos and videos
– etc.
More and more (all?) decisions are taken based on digital data
Virtually all our devices are connected to each other, which makes them powerful and vulnerable at the same time
5.
Security: current state
Far from perfect
Most of our software has vulnerabilities due to:
– Design flaws (security was not built in from the start)
– Buggy implementations
– Poor or no testing
– Changed environments
We will hear more stories about thousands of stolen credit cards, disclosed sensitive information, etc.
6.
Is everything lost? Nope
I think that the situation is improving over time
– The software improves
– Most new systems get designed with security in mind
– Society handles digital security issues better
7.
Development methods are constantly improving
Compilers have become better (the very first compilers would not issue any warnings!)
We have more compilers (clang, for example)
Programming languages evolve (initially C++ had no templates)
Better programming paradigms and design patterns
Better software revision systems
Agile and other development approaches
8.
Software analysis and testing
There are many source code analyzers on the net (25 years ago there was just “lint”)
Many testing frameworks have appeared too
Debuggers, fuzzers, valgrind, etc.
9.
How difficult was it to develop software in the past?
Overall, much harder than today. One indication of this difficulty is software size (measured in source lines of code, not as the size of the binary image)
Example: MS Windows (data from Wikipedia)
Year   Operating System      SLOC (millions)
1993   Windows NT 3.1        4-5
1994   Windows NT 3.5        7-8
1996   Windows NT 4.0        11-12
2000   Windows 2000          More than 29
2001   Windows XP            45
2003   Windows Server 2003   50
10.
Principal reasons for the difficulty
Missing or inefficient development tools: editors, compilers, debuggers, code analyzers, testers, etc.
Slower computers
No memory protection or program isolation provided by the operating system
Immature programming paradigms and methodologies
Lack of program verification and testing tools
Let us take a tour
11.
Editor 'ed'
A text editor with virtually no visual feedback
Yet considered a powerful tool:
– It has regular expressions
– Programmable
Even modern Linux distributions still include it
12.
Compilers
Few or no warnings
Could produce buggy code
Poor code optimization
However, as a software developer I still consider compiler writers to be demigods :)
Turbo C compilers from Borland were a breath of fresh air
13.
Debuggers
My favorite was Turbo Debugger. Compared to other debuggers it was a feast for the eyes
Powerful, robust, could display the source code
Supported hardware and conditional breakpoints
14.
SoftICE: system level debugger
Can debug the entire operating system
Very powerful
Its popularity was a disadvantage in some cases: some software employed anti-SoftICE tricks
(from Wikipedia)
15.
Disassemblers
Debuggers were not enough in some cases
– Programs got bigger
– Algorithms became more complex
– Anti-debugger tricks were used more often
– More detailed analysis was required, especially for viruses
– Compiler bugs made it necessary to check non-executable (object) files
Disassemblers would:
– Analyze the program in depth
– Show cross references
– Assign meaningful names to functions and data
– etc...
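As a rough illustration of what "show cross references" means, the sketch below builds a cross-reference table from a made-up instruction listing. The listing format and the build_xrefs helper are hypothetical and say nothing about how Sourcer or IDA work internally.

```python
from collections import defaultdict

# Hypothetical toy listing: (address, mnemonic, target address or None).
LISTING = [
    (0x100, "call", 0x200),
    (0x103, "jmp", 0x120),
    (0x120, "call", 0x200),
    (0x123, "ret", None),
    (0x200, "ret", None),
]


def build_xrefs(listing):
    """Map each target address to the sorted addresses that reference it."""
    xrefs = defaultdict(list)
    for address, _mnemonic, target in listing:
        if target is not None:
            xrefs[target].append(address)
    return {target: sorted(sources) for target, sources in xrefs.items()}


XREFS = build_xrefs(LISTING)
```

With such a table, a reader looking at address 0x200 can immediately see every place that calls it, which is the kind of question a debugger alone does not answer well.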
16.
The most popular disassembler: Sourcer
Sourcer from V Communications was a great tool
In fact, to a newcomer it was like magic
– It would tell code apart from data
– It would assign meaningful names and comments
17.
Sourcer: batch mode disassembler
The biggest shortcoming: it was a batch mode program
Could handle programs of limited size
Would occasionally misidentify code or data
Slow for big programs
18.
Interactivity is the answer
There was a disassembler called Dis*Doc from RJ Swantek
I have not used it myself, so I cannot tell you much
But I liked the idea very much:
– No need to wait for the results
– The user can browse the listing and annotate it
– The user can guide the disassembler by marking locations as code or data
– WYSIWYG (what you see is what you get) was in fashion at that time (remember 'ed'?)
This was the reason why I decided to create IDA
19.
Initial design and implementation
I tried a few approaches and rewrote the code at least 4 times before I got it right
The result was either too heavy and slow or too lightweight and limited
Remember the 640 KB of memory and the slow processors!
I needed a robust and fast database
20.
Database choice
Requirements:
– Fast
– Capable of storing variable sized objects
– Robust
I tried the available databases, like Paradox from Borland, but quickly abandoned the idea: they were way too slow
Fortunately my friend Pavel Rousnak implemented a B-tree engine
We are still using his database in IDA; it has been upgraded and improved many times but it is still the same code
21.
IDA 0.1, the first public version
It took me 6 months to implement version 0.1
The basic functionality was present but the user interface was ugly
It supported only x86 instructions
Yet it was interactive and working!
22.
IDA v2.09: nice text interface
IDA v2.09 was released in 1994
The Turbo Vision library from Borland fixed the user interface
It was robust, supported 3 processor families (x86, i51, and z80) and 8 input file types, and had a built-in C-like language
Since it was already over 500KB, it was a heavily overlaid program. I was saving every byte
23.
IDA v3.5b: with symbol files
It was released in 1996
It had symbol files (IDS), could run on OS/2 and under an MS-DOS extender, had loadable modules, etc.
There were many other releases that I will not mention so as not to bore you
24.
IDA Roadmap
My initial plans for IDA were really ambitious. They included:
– AI (artificial intelligence) with a LISP-like language
– Building a binary program optimizer on top of IDA
– Using IDA for binary translation
– Building some kind of knowledge database about common program snippets and their meanings
– IDA would point out suspicious or problematic parts of the code (vulnerability scanner?)
– Etc., etc., etc.
However, with the ever-growing number of IDA users I was simply overwhelmed by user requests and bug fixes
It is still like this today
25.
Datarescue and IDA
I was lucky that Mr. Pierre Vandevenne got interested in IDA. His contribution cannot be overstated
Datarescue converted my hobby project into a commercial program in 1996
The first GUI version of IDA was built there, in 1999
We travelled a long and very interesting road together
BTW, Pierre found the lady we use for the IDA logo
26.
PC Magazine: Technical Excellence Award
In 2001 IDA Pro was named a finalist in the Annual Awards for Technical Excellence
We went to Las Vegas to participate in the award ceremony
We lost the competition... to Microsoft's Visual Studio .NET
It was still fun :)
27.
IDA and pirates
Unfortunately we were plagued by piracy
There were more pirates than legitimate users
Pirates were eating our time and resources
A typical conversation would start with a compliment from a stranger; he would ask for a “little help” in the second message
It was so predictable that it was boring
I do not understand how clever people can pirate software and then shamelessly ask for help. They are probably not that clever after all
28.
IDA piracy map 2006
Just a map of places where a pirated version of IDA was used (circa 2006)
(from www.datarescue.com)
29.
Decompiler: a plugin on top of IDA
It was greatly inspired by Cristina Cifuentes' PhD thesis on decompilation
After reading the thesis it was clear how to build a decompiler
But the devil is in the details... many subproblems were still not solved. For each of them:
– Come up with an idea for how to solve it
– Implement it
– Test it
– Throw it away and start over if it did not work
“Wash, rinse, repeat” – for years... (I liked it!)
The first attempts were made in 1998 or even earlier
The first public version appeared in 2007
30.
Decompiler details
Decompilation is a complex problem, unsolvable in general
Very time consuming to develop
Seemingly minor design mistakes haunt and hinder development
One has to cut corners in order to come up with a working decompiler
Question: which corners to cut?
31.
Hex-Rays
Unfortunately Pierre decided to quit in 2007
I had to continue with the decompiler alone
Hex-Rays quickly became a strong and passionate team
– We do care about our code
– We want to publish software that is as bug-free as we can make it
– We care about our users
32.
Why IDA, after all
I created IDA because there was no interactive and robust disassembler at that time; on the other hand there was a strong need for such a tool
I kept maintaining IDA all these years because
– IDA helps to solve some problems we face, like viruses
– IDA improves our digital security
– IDA users are very nice people in general (legit ones)
Like any tool, IDA can be misused. Examples:
– Cracking software
– Stealing code and algorithms
33.
IDA as a seeing aid
I usually compare it to a microscope
Basically useless to the general public but indispensable to professionals
Requires skill to use efficiently
34.
Who uses IDA?
This is a frequent question
I can only mention some user categories:
– Anti-virus companies
– Security oriented organizations
– Governments and military
– Hobbyists
– Shady persons of all kinds
– Pirates (the dogs bark but the caravan goes on)
Overall it is a motley crew
35.
How IDA improves digital security
Our legit users are white hats (or at least they pretend to be so :)
IDA itself is not in the spotlight and stays in the shadows, but many of our users are famous security researchers
We are glad that we can help them with their tasks
We want IDA to be safe for them (and for all our users)
36.
How we improve security of IDA
We run tests
We compile our code with various compilers on different platforms
We use code reviews
We use lint, valgrind, and other verification tools
User reports are handled by the developers (there is no first- or second-line support desk). This ensures that developers really suffer from their bugs :)
37.
Testing IDA
We continuously work on improving our coding style
We keep adding more test cases. Ideally, every newly reported bug results in a new test case
We keep adding more testing methods
– We have an extensive test suite for our analysis engine
– We recently added tests for the user interface
– We have a constantly growing set of decompiler tests
– Our decompilation test suite is about 500GB (output files only)
Virtually every day we add a couple of new test cases
There are dedicated computers for running tests
We have a bug bounty program for critical bugs
We know that we are still not testing IDA enough
38.
Bug bounty
The idea is simple: if you find a critical bug in IDA, we will pay you a bounty
Many other companies do that; we think that it is really a good idea
It is difficult to come up with a reliable exploit for IDA:
– The IDA kernel is personalized for each user
– We use ASLR
– IDA randomizes the heap at the start
– We use stack canaries
– Our stack is not executable
Nevertheless we offer bounties for memory corruption bugs even if there is no PoC code
39.
Things to improve in the near future
We have to add a fuzzer to our test methods
A good static code analyzer is in the plans (in the past we used a quite famous one but were disappointed)
More tests for the debuggers (since we support remote debugging for all platforms, there are 24 possible combinations of local and remote computers)
More tests for IDAPython
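To make the fuzzer idea concrete, here is a minimal mutation fuzzer sketch. It is only an illustration: parse_record is a hypothetical target with a deliberately planted bug, and nothing here reflects how IDA is or will be fuzzed.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical parser: the first byte is a length, followed by the payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    # Planted bug for the demonstration: indexing fails when length is zero.
    _first = payload[0]
    return payload


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Return a copy of seed with one randomly chosen byte replaced."""
    data = bytearray(seed)
    position = rng.randrange(len(data))
    data[position] = rng.randrange(256)
    return bytes(data)


def fuzz(seed: bytes, iterations: int, rng: random.Random) -> list:
    """Collect inputs that make parse_record fail in an unexpected way."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            parse_record(candidate)
        except ValueError:
            pass  # Expected rejection of malformed input.
        except Exception:
            crashes.append(candidate)  # Anything else is a bug worth reporting.
    return crashes
```

A call such as fuzz(b"\x03abc", 20000, random.Random(1234)) mutates a valid record many times; any input it returns starts with a zero length byte, which is exactly the case the planted bug mishandles. Each such input is a natural candidate for a new regression test.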
40.
Why is security hard? Because we are blind
Making a watertight container is hard for a blind person. He has to feel all of it to make sure there are no holes
If he misses even the tiniest hole, the liquid will leak out
It is the same for us with security: we are essentially blind to security holes and can see only the most obvious ones
This means that humans are hopelessly bad when it comes to security, at least today
We need help, we need tools so we can see the light
41.
Computers to the rescue
Since we humans cannot cope with the task, computers are our only hope
Computers can
– Test our software
– Monitor its use
– Prove its correctness
– Serve as a seeing aid
– Eventually, computers will develop software
42.
Testing
Unfortunately it is impossible to test all cases
Not an excuse to abandon testing altogether
Testing must be continuous
Test as many different aspects as possible
Think like an attacker, try to foresee the possible scenarios
Keep adding tests for all newly discovered bugs
Write a test case before fixing the bug
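The last two points can be sketched with Python's unittest module. Everything in this example is hypothetical: parse_size stands for some function under test, and the assumed bug report is that a lower-case suffix such as "4k" was rejected. The regression test is written first, seen to fail, and then kept forever once the fix is in.

```python
import unittest

UNITS = {"K": 1024, "M": 1024 ** 2}


def parse_size(text: str) -> int:
    """Hypothetical helper: parse sizes such as "512", "4K", or "2M"."""
    # The fix: normalise case, so that "4k" is handled like "4K".
    cleaned = text.strip().upper()
    suffix = cleaned[-1:]
    if suffix in UNITS:
        return int(cleaned[:-1]) * UNITS[suffix]
    return int(cleaned)


class ParseSizeRegressionTest(unittest.TestCase):
    def test_lowercase_suffix(self):
        # Written from the bug report, before the fix was made.
        self.assertEqual(parse_size("4k"), 4096)

    def test_existing_behaviour(self):
        # Guards against the fix breaking what already worked.
        self.assertEqual(parse_size("4K"), 4096)
        self.assertEqual(parse_size("2M"), 2 * 1024 * 1024)
        self.assertEqual(parse_size("512"), 512)
```

Saved in a file, the tests can be run with python -m unittest. Writing the test before the fix proves that the test really detects the bug.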
43.
Monitoring systems
The digital world changes over time
New threats and attack vectors are discovered
We must monitor our systems
Many solutions exist: Tripwire, Nessus, OSSEC, …
Simple custom scripts have the advantage of being unknown to the attackers
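One example of a simple custom script is a file integrity check: record a hash of every file once, then compare later. The sketch below is a minimal illustration using only the Python standard library; the snapshot and compare helpers are made-up names, and it is not a substitute for the tools listed above.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = str(path.relative_to(root))
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare(baseline: dict, current: dict) -> dict:
    """Report files that were added, removed, or modified since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            name for name in set(baseline) & set(current)
            if baseline[name] != current[name]
        ),
    }
```

A baseline taken with snapshot() on a known-good system should be stored somewhere an attacker cannot modify it; running compare() against a fresh snapshot then lists everything that changed.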
44.
Proving software correctness
Software verification tools
Need support from the programming languages
– C++ is not the best language for verification
– Without such support, at least good coding practices must be adopted
– Unfortunately this comes at a price (MISRA and other guidelines)
Code generation tools
Eventually these tools will become mainstream
45.
Thank you!
Thank you for your attention!
Questions?