An overview of techniques for defending against SQL Injection using Python tools. This slide deck was presented at the DC Python Meetup on October 4th, 2011 by Edgar Roman, Sr Director of Application Development at PBS.
2. What is SQL Injection?
Unauthorized database access by an external
source using specially crafted code to piggyback
on standard user input to bypass normal
protections.
Why?
• Gain access to restricted website areas
• Query unauthorized data
• Delete or corrupt data
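3.
import MySQLdb
from django.http import Http404
from django.shortcuts import render_to_response
from django.template import RequestContext

def book_search_view(request):
    if 'bookname' not in request.GET:
        raise Http404
    conn = MySQLdb.connect(host="localhost", user="testuser",
                           passwd="testpass", db="test")
    cursor = conn.cursor()
    name = request.GET['bookname']
    # Vulnerable: user input is formatted directly into the SQL string
    cursor.execute("SELECT * FROM table_books WHERE book_name = '%s'" % name)
    row = cursor.fetchone()
    cursor.close()
    conn.close()
    # The template context must be a dict; the 'row' key is an assumed name
    return render_to_response('booklist.html', {'row': row},
                              context_instance=RequestContext(request))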
4.
• Normal SQL
– name="Moby Dick"
SELECT * FROM table_books WHERE book_name = 'Moby Dick'
• SQL Injection – bad day
– name="1'; SELECT * from Users; --"
SELECT * FROM table_books WHERE book_name = '1';
SELECT * from Users;
--'
• SQL Injection 2 – really bad day
– name="1'; DROP TABLE Users; --"
SELECT * FROM table_books WHERE book_name = '1';
DROP TABLE Users;
--'
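The injected statements above fall straight out of Python string formatting. A minimal sketch, assuming a hypothetical helper `build_query` (not part of the original slides) that builds the SQL text the same way the vulnerable view does, without touching a database:

```python
# Hypothetical helper: mirrors the "%s" % name formatting used in the
# vulnerable view, but only returns the SQL text instead of executing it.
def build_query(name):
    # Vulnerable on purpose: user input is pasted directly into the SQL string.
    return "SELECT * FROM table_books WHERE book_name = '%s'" % name

normal = build_query("Moby Dick")
injected = build_query("1'; DROP TABLE Users; --")

print(normal)
print(injected)
```

The single quote in the input closes the string literal early, so everything after it is read as SQL rather than as data.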
6. Multiple Layers
• Assume the worst and plan for it
• Coding protection is only one layer
– Which we will focus on for this presentation
• Database lockdown
– User partitioning
– Password protection
• But there are other attacks too: Open Web Application Security Project (OWASP)
– https://www.owasp.org/
7. General approaches to SQL Injection Defense
• Escape User Input
• White Lists
• Stored Procs
• Parameterized Queries
8. Escape User Input
• Hard to do right
• You'll probably screw it up if you don't cover all the cases
– So don't write your own regex
• MySQLdb.escape_string
– Pro: Handles almost all encoding evasions
– Con: Error prone because it depends on humans to always use it
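9.
import MySQLdb
from django.http import Http404
from django.shortcuts import render_to_response
from django.template import RequestContext

def book_search_view(request):
    if 'bookname' not in request.GET:
        raise Http404
    conn = MySQLdb.connect(host="localhost", user="testuser",
                           passwd="testpass", db="test")
    cursor = conn.cursor()
    # Escape the user input before formatting it into the SQL string
    name = MySQLdb.escape_string(request.GET['bookname'])
    cursor.execute("SELECT * FROM table_books WHERE book_name = '%s'" % name)
    row = cursor.fetchone()
    cursor.close()
    conn.close()
    # The template context must be a dict; the 'row' key is an assumed name
    return render_to_response('booklist.html', {'row': row},
                              context_instance=RequestContext(request))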
10. What does the escaped version look like?
• SQL Injection – bad day
– name="1'; SELECT * from Users; --"
SELECT * FROM table_books WHERE book_name = '1\'; SELECT * from Users; --'
• SQL Injection 2 – really bad day
– name="1'; DROP TABLE Users; --"
SELECT * FROM table_books WHERE book_name = '1\'; DROP TABLE Users; --'
• The quote is escaped as \', so the whole input stays inside the string literal
12. Even more Evasion Techniques
• Multibyte attacks
– http://shiflett.org/blog/2006/jan/addslashes-versus-mysql-real-escape-string
– http://ilia.ws/archives/103-mysql_real_escape_string-versus-Prepared-Statements.html
• Even the experts don't get it right
– MySQL patches bugs in their escaping routines
13. White List
• Scrub data to a known set of inputs
• Pros
– Works well for variables with limited range
– Fast
• Cons
– Can only be used in customized locations
– Error prone
• You might forget
• Or the intern might not understand
• Example: user id must contain exactly 6 digits
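The user id example can be sketched as a small validator. This is only an illustration, assuming the rule is "exactly 6 ASCII digits"; the names `USER_ID_PATTERN` and `is_valid_user_id` are hypothetical and not from the original slides:

```python
import re

# Assumed rule from the slide: a user id is exactly 6 ASCII digits.
USER_ID_PATTERN = re.compile(r"[0-9]{6}")

def is_valid_user_id(value):
    # fullmatch requires the entire string to match the pattern,
    # so any trailing SQL or whitespace causes rejection.
    return USER_ID_PATTERN.fullmatch(value) is not None

print(is_valid_user_id("123456"))
print(is_valid_user_id("123456'; DROP TABLE Users; --"))
```

Rejecting anything outside the known format means the value never needs escaping, but the check only helps where someone remembers to call it.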
14. Stored Procedures
• Use the database's built-in stored procedure capabilities
• Pros
– Forces parameterization of all user input
• Cons
– Can still be bypassed if the SQL string is generated in code and passed to the stored procedure
– Not portable between databases
15. Parameterized Queries
• Use the DB API (MySQLdb cursor.execute) properly
• Use Django ORM
• Use SQLAlchemy (Pylons, Flask)
– Really have to work hard to expose yourself
• Pros
– Generally easier to model data
• Cons
– ORMs sometimes limit advanced SQL
• Bottom line: use a framework!
16. MySQLdb.execute
Bad:
cursor.execute("SELECT * FROM table_books WHERE book_name = '%s'" % name)
Good:
cursor.execute("SELECT * FROM table_books WHERE book_name = %s", (name,))
Seriously?
Yes
• The driver quotes and escapes the parameter itself, so the placeholder takes no quotes
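The difference between the two calls can be tried without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module as a stand-in, which is an assumption for illustration only: sqlite3 uses `?` as its placeholder where MySQLdb uses `%s`, and its `execute` runs only one statement per call, so the payload here is an `OR '1'='1'` condition rather than the stacked statements shown earlier.

```python
import sqlite3

# In-memory database standing in for the MySQL table from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_books (book_name TEXT)")
conn.executemany("INSERT INTO table_books VALUES (?)",
                 [("Moby Dick",), ("Dracula",)])

# Injection payload: closes the string literal and adds an always-true condition.
payload = "x' OR '1'='1"

# Bad: string formatting lets the payload rewrite the WHERE clause.
bad_rows = conn.execute(
    "SELECT * FROM table_books WHERE book_name = '%s'" % payload).fetchall()

# Good: the payload is passed as a parameter and compared as plain data.
good_rows = conn.execute(
    "SELECT * FROM table_books WHERE book_name = ?", (payload,)).fetchall()

print(len(bad_rows))   # every row matches
print(len(good_rows))  # no book has that literal name
conn.close()
```

The only change between the two queries is passing the value as a separate argument instead of formatting it into the string.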
17. Django ORM
• Automatically escapes all input parameters
• Be careful with the extra() method – it takes raw SQL!
• More info
– http://www.djangobook.com/en/2.0/chapter20/
18. Conclusions
• Use a db framework
• If possible, white list your inputs
• Be careful if writing raw SQL
http://xkcd.com/327/