deDacota: Toward Preventing Server-Side XSS via Automatic Code and Data Separation

Adam Doupé, Weidong Cui†, Mariusz H. Jakubowski†, Marcus Peinado†,
Christopher Kruegel, and Giovanni Vigna
University of California, Santa Barbara
†Microsoft Research
CCS 2013 – 11/7/13

XSS Vulnerabilities Still Exist Today

Doupé - 11/7/13

Courtesy of Ashar Javed

Test.aspx
<html>
<body>
<p>Hello <%= this.Name %></p>
</body>
</html>

http://example.com/Test.aspx?name=adam

Ask Test.dll for output

<html>
<body>
<p>Hello adam</p>
</body>
</html>

Test.aspx
http://example.com/Test.aspx?name=<script>alert("xss");</script>

<html>
<body>
<p>Hello <%= this.Name %></p>
</body>
</html>

<html>
<body>
<p>Hello <script>alert("xss");</script></p>
</body>
</html>
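
The reflection shown above can be sketched in a few lines of JavaScript. This is only an illustration: `renderTestPage` is a hypothetical stand-in for what Test.dll does with the `name` parameter, not ASP.NET code.

```javascript
// Hypothetical stand-in for Test.dll: splices the "name" query parameter
// into the page template without any encoding.
function renderTestPage(name) {
  return "<html>\n<body>\n<p>Hello " + name + "</p>\n</body>\n</html>";
}

const benign = renderTestPage("adam");
const attack = renderTestPage('<script>alert("xss");</script>');

console.log(benign); // contains <p>Hello adam</p>
console.log(attack); // the injected <script> element is now part of the page
```

Both calls go through the same code path; the server has no way to know that the second value will be parsed as code by the browser.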

XSS – Impact
• Steal cookies

• Perform actions as user
• Exploit user’s browser
• Fake login form

Fixing XSS – Sanitization
<html>
<body>
<p>Hello
<%= HtmlEncode(this.Name) %>
</p>
</body>
</html>
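
A rough sketch of what sanitization does to the payload, assuming a hypothetical `htmlEncode` helper that mimics the general effect of `HtmlEncode` (it is not the .NET implementation):

```javascript
// Hypothetical helper that mimics the general effect of HtmlEncode for the
// characters that matter in HTML text and attribute contexts.
function htmlEncode(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first to avoid double encoding
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Same page as before, but the untrusted value is encoded on output.
function renderSanitizedPage(name) {
  return "<html>\n<body>\n<p>Hello\n" + htmlEncode(name) + "\n</p>\n</body>\n</html>";
}

const encoded = htmlEncode('<script>alert("xss");</script>');
console.log(encoded);
console.log(renderSanitizedPage('<script>alert("xss");</script>'));
```

The encoding only protects the one output location where it is applied, and only for the HTML text context.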

Fixing XSS – Sanitization
<html>
<script>alert("xss");</script>
<body>
<p>Hello
<%= HtmlEncode(this.Name) %>
</p>
</body>
<script>alert("xss");</script>
</html>

XSS as Input Validation

Problem                     Research
Find All Paths              WWW 2004, USENIX 2005, Oakland 2006
Many Different Contexts     CCS 2011, CCS 2011
Is Sanitization Correct?    Oakland 2008, USENIX 2011
Parsing Quirks              Oakland 2009, CCS 2013

We want to fundamentally solve XSS vulnerabilities

Another Example
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
</body>
</html>

Another Example

Developer intended for this code to be executed on the browser
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
<p>Hello <script>alert("xss");</script>
</p>
</body>
</html>

The Fundamental Problem

Developer intended for this code to be executed on the browser
<html>
<body>
<script>
alert("welcome to example.com!");
</script>
<p>Hello <script>alert("xss");</script>
</p>
</body>
</html>
Developer did not intend for this code to be executed on the browser

The browser can’t tell the difference!

The Fundamental Solution
Data
<html>
<body>
<script>
</script>
</p>
</body>
</html>

Code
alert("welcome to example.com!");

To fundamentally solve XSS vulnerabilities, we must apply the
basic security principle of Code and Data separation!

Content Security Policy (CSP)
• Mechanism for the website to communicate a policy to the browser
about what JavaScript to execute
• The browser then enforces this policy
• Supported by many modern browsers (68% of users use one of
these browsers)
– Firefox
– Chrome
– IE (10)
– Safari
– Opera
– iOS
– Android

Content Security Policy
Content-Security-Policy: script-src http://example.com/0cc111eb135.js

Data
<html>
<body>
<script src="0cc111eb135.js">
</script>
</p>
</body>
</html>

Code
alert("welcome to example.com!");
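
The policy idea can be sketched as follows. This is a deliberately simplified model, not a browser implementation: `buildCspHeader` and `isScriptAllowed` are hypothetical names, and the matching is by exact URL only, whereas real CSP source matching is richer.

```javascript
// Build the header line shown on the slide from a list of allowed script URLs.
function buildCspHeader(allowedScriptUrls) {
  return "Content-Security-Policy: script-src " + allowedScriptUrls.join(" ");
}

// Very simplified model of the browser's decision: exact URL matches only.
// Real CSP source matching also handles hosts, schemes, wildcards, and more.
function isScriptAllowed(allowedScriptUrls, script) {
  if (script.inline) {
    // Inline scripts run only if the policy explicitly opts in.
    return allowedScriptUrls.includes("'unsafe-inline'");
  }
  return allowedScriptUrls.includes(script.src);
}

const allowed = ["http://example.com/0cc111eb135.js"];
const header = buildCspHeader(allowed);
const externalOk = isScriptAllowed(allowed, { src: "http://example.com/0cc111eb135.js" });
const injectedOk = isScriptAllowed(allowed, { inline: true, body: 'alert("xss");' });
console.log(header, externalOk, injectedOk);
```

Under this policy the developer's external file runs, while an injected inline script is refused because nothing in the policy permits inline code.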

Code and Data Separation
• Code and Data separation from start
– No legacy applications

• Manually rewrite application
– Difficult and error-prone (HotSec 2011)

deDacota: Automatically separate code and
data of a web application

Threat Model
• Benign web application
– The developer has not obfuscated the web application

• Server-side XSS
– Our approach will only address traditional XSS, in other words,
XSS where the resulting bug is in the server-side code

• Inline JavaScript
– For the deDacota prototype, we focused only on inline
JavaScript
– We ignore JavaScript in HTML attributes and CSS

deDacota Process
1. Approximate HTML Output
2. Extract Inline JavaScript
3. Rewrite Web Application

The goal is to rewrite the web application so that it is
semantically equivalent yet separates the code and data.

Approximate HTML Output
<%@ Page Language="C#"
CodeBehind="CodeBehind.cs" Inherits="Test" %>
<html>
<body>
<%= Scripts() %>
</body>
</html>

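using System.Web.UI;

// Simplified view of the class that the page is compiled into.
class test_aspx : System.Web.UI.Page {
    protected string Name;
    protected string Year;

    public test_aspx() {
        // Untrusted input taken directly from the query string.
        this.Name = Request.QueryString["name"];
        this.Year = "2013";
    }

    protected override void Render(HtmlTextWriter writer) {
        writer.Write("<html><body><p>");
        writer.Write(this.Name);   // statically unknown output
        writer.Write(Scripts());   // developer-intended inline JavaScript
        writer.Write("</p></body></html>");
    }

    protected string Scripts() {
        return "<script>alert('" + this.Year + "');</script>";
    }
}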

The goal here is to create a graph that approximates the HTML
content of the web page. We use static analysis techniques to
construct the graph.

"<html><body><p>"
this.Name

Here we need to analyze the control flow of the application,
which means following the control flow into the Scripts() method.

Here we encounter string concatenation, which our analysis
is able to handle.

"<script>alert('"
this.Year
"');</script>"

Now that we have constructed the approximation graph, we
must determine what is being output by each node in the graph.
Here we use data-flow analysis and points-to analysis.

In this case, Request.QueryString["name"] is statically
undecidable because it comes from user input. In the
approximation graph we represent this as a *, which means
the output at this node could be anything.

Node                   Output
"<html><body><p>"      <html><body><p>
this.Name              *
"<script>alert('"      <script>alert('
this.Year              2013
"');</script>"         ');</script>

<html><body><p>

*

<script>alert('

2013

');</script>

</p></body></html>

This approximation graph contains a static approximation of
the HTML content of the web page. Any path through this graph
is one possible output of the page.
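
One way to picture "any path through the graph is one possible output" is the sketch below. The node layout (`out`, `next`) and the function `possibleOutputs` are assumptions made for illustration; they are not deDacota's internal representation.

```javascript
// Each node holds the statically determined output ("*" = could be anything)
// and the indexes of its successor nodes. This layout is an assumption made
// for illustration, not deDacota's internal representation.
const graph = [
  { out: "<html><body><p>", next: [1] },
  { out: "*", next: [2] },               // this.Name, from Request.QueryString
  { out: "<script>alert('", next: [3] },
  { out: "2013", next: [4] },            // this.Year
  { out: "');</script>", next: [5] },
  { out: "</p></body></html>", next: [] },
];

// Depth-first enumeration of every path from `start` to a node with no
// successors; each path is returned as the concatenation of its outputs.
function possibleOutputs(nodes, start = 0) {
  const node = nodes[start];
  if (node.next.length === 0) return [node.out];
  const results = [];
  for (const successor of node.next) {
    for (const rest of possibleOutputs(nodes, successor)) {
      results.push(node.out + rest);
    }
  }
  return results;
}

const outputs = possibleOutputs(graph);
console.log(outputs);
```

The example graph is linear, so it yields a single approximated page; a node with two successors would yield one approximated page per branch.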

In this example approximation graph from a real-world
application, the branch in the graph comes from a
conditional branch in the control-flow of the application.


Statically undecidable content, represented here as a *,
can come from two different areas:
1. Statically undecidable according to the static analysis.
2. To make our analysis conservative, we treat all loops as
outputting a *, because we cannot statically determine
how many times a loop will execute.
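
The conservative treatment of loops and branches can be sketched on a toy program representation. The statement kinds (`write`, `unknown`, `loop`, `branch`) and the function `approximate` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical mini-language: a statement is a constant write, an unknown
// write, a two-way branch, or a loop whose body writes output.
function approximate(statements) {
  let outputs = [""];
  for (const stmt of statements) {
    let pieces;
    if (stmt.kind === "write") {
      pieces = [stmt.text];
    } else if (stmt.kind === "unknown") {
      pieces = ["*"]; // statically undecidable value
    } else if (stmt.kind === "loop") {
      pieces = ["*"]; // iteration count unknown, so the body is not expanded
    } else if (stmt.kind === "branch") {
      pieces = [...approximate(stmt.then), ...approximate(stmt.else)];
    } else {
      throw new Error("unknown statement kind: " + stmt.kind);
    }
    // Every output so far can continue with every possible next piece.
    outputs = outputs.flatMap((prefix) => pieces.map((piece) => prefix + piece));
  }
  return outputs;
}

const page = [
  { kind: "write", text: "<ul>" },
  { kind: "loop", body: [{ kind: "write", text: "<li>item</li>" }] },
  {
    kind: "branch",
    then: [{ kind: "write", text: "</ul><p>admin</p>" }],
    else: [{ kind: "write", text: "</ul>" }],
  },
];
const approximations = approximate(page);
console.log(approximations);
```

Because the loop collapses to a single `*`, any inline JavaScript written inside a loop body is invisible to the later extraction step.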


Extract Inline JavaScript

In the second step, we simply extract the inline JavaScript
(that is, the developer-intended code) from the approximation
graph.
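
As a rough illustration of the extraction step, the sketch below pulls inline script bodies out of one approximated page output. The function `extractInlineScripts` is hypothetical, and the regular expression is a shortcut chosen for brevity, not a claim about how deDacota parses HTML.

```javascript
// Return the bodies of inline <script> elements found in one approximated
// page output. A regular expression is a simplification used here for
// brevity; it is not a full HTML parser.
function extractInlineScripts(html) {
  const pattern = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi;
  const scripts = [];
  for (const match of html.matchAll(pattern)) {
    const attributes = match[1];
    const body = match[2];
    if (/\bsrc\s*=/i.test(attributes)) continue; // external script, skip
    scripts.push(body);
  }
  return scripts;
}

const approximated = "<html><body><p>*<script>alert('2013');</script></p></body></html>";
const inlineScripts = extractInlineScripts(approximated);
console.log(inlineScripts);
```

Only scripts that are visible in the static approximation are collected; whatever an attacker might place at the `*` node never appears in it.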

Rewrite Web Application
Data
<html>
<body>
<script>
</script>
</p>
</body>
</html>

Code
alert("welcome to example.com!");

At this point, if the inline JavaScript code is static, we have
protected the application. No attacker data in the Data
segment will ever be interpreted as Code.

Unfortunately, developers sometimes dynamically generate
the Code of an application. If this happens with untrusted Data,
there can still be an XSS vulnerability.
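
The effect of the rewrite on the page output can be sketched on a static HTML string. This is an illustration only: deDacota rewrites the server-side application that produces the HTML, while the hypothetical `separateCodeAndData` below works on the HTML itself, and the generated file names and `baseUrl` are assumptions.

```javascript
// Move every inline script body into a generated external file and return
// the rewritten HTML, the generated files, and a matching CSP header.
// File names (script0.js, ...) and baseUrl are illustrative assumptions.
function separateCodeAndData(html, baseUrl) {
  const files = {};
  let counter = 0;
  const rewritten = html.replace(
    /<script>([\s\S]*?)<\/script>/g,
    (whole, body) => {
      const name = "script" + counter++ + ".js";
      files[name] = body;
      return '<script src="' + name + '"></script>';
    }
  );
  const sources = Object.keys(files).map((name) => baseUrl + name);
  // With no scripts at all, allow nothing rather than emit an empty list.
  const sourceList = sources.length > 0 ? sources.join(" ") : "'none'";
  const csp = "Content-Security-Policy: script-src " + sourceList;
  return { rewritten, files, csp };
}

const result = separateCodeAndData(
  '<html><body><script>alert("welcome to example.com!");</script></body></html>',
  "http://example.com/"
);
console.log(result);
```

After this step the page carries only a reference to the code, and the policy names exactly the files that hold it.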

Dynamic Inline JavaScript
<html>
<script>
var username = "<%= Username %>";
</script>
</html>

Data
<html>
<script>
</script>
</html>

Code
var username = "*";

Here, the developer has chosen to dynamically generate
the Code from untrusted data.

We developed a technique to safely transform cases of dynamic
inline JavaScript. If the statically undecidable content is used
in a known JavaScript context (JavaScript string or comment),
we can safely rewrite the application. We call these cases
“safe dynamic inline JavaScript.”
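
Deciding whether the unknown content sits in a string or comment can be sketched with a small scanner. Everything here is an assumption for illustration: `contextAt`, `isSafeDynamic`, and the `__UNKNOWN__` marker (standing in for the slide's `*`) are hypothetical, and the scanner ignores template literals and regular expression literals.

```javascript
// Classify the lexical context of position `index` in JavaScript `source`
// as "string", "comment", or "code". Simplified scanner: it ignores
// template literals and regular expression literals.
function contextAt(source, index) {
  let state = "code";
  let quote = "";
  for (let i = 0; i < index; i++) {
    const ch = source[i];
    const next = source[i + 1];
    if (state === "code") {
      if (ch === '"' || ch === "'") { state = "string"; quote = ch; }
      else if (ch === "/" && next === "/") { state = "line-comment"; i++; }
      else if (ch === "/" && next === "*") { state = "block-comment"; i++; }
    } else if (state === "string") {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === quote) state = "code";
    } else if (state === "line-comment") {
      if (ch === "\n") state = "code";
    } else if (state === "block-comment") {
      if (ch === "*" && next === "/") { state = "code"; i++; }
    }
  }
  return state.endsWith("comment") ? "comment" : state;
}

const MARKER = "__UNKNOWN__"; // hypothetical stand-in for the slide's "*"

// Checks only the first marker in the template.
function isSafeDynamic(template) {
  const index = template.indexOf(MARKER);
  if (index === -1) return true; // fully static script
  const context = contextAt(template, index);
  return context === "string" || context === "comment";
}

const safe = isSafeDynamic('var username = "__UNKNOWN__";');
const unsafe = isSafeDynamic("var id = __UNKNOWN__;");
console.log(safe, unsafe);
```

The sketch covers only the classification; how the dynamic value is then delivered to the separated script is outside its scope.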

Applications

Application        Lines of Code    Known Vulnerability
BugTracker.NET     35,674           CVE-2010-3266
BlogEngine.NET     29,512           CVE-2008-6476
BlogSA.NET         6,994            CVE-2009-0814
ScrewTurn Wiki     12,155           CVE-2008-3483
WebGoat.NET        11,993           2 Intentional
ChronoZoom         21,261           N/A

Evaluation
• Security
– Crafted exploits for applications with known
vulnerabilities
– Transformed applications, along with CSP, blocked
the exploits

• Functional correctness
– ChronoZoom had 160 JavaScript tests and all passed
after the transformation
– Manually browsed the application and source code
looking for missing inline JavaScript

[Figure: stacked bar chart showing, for each application (BugTracker.NET, BlogEngine.NET, BlogSA.NET, ScrewTurn Wiki, WebGoat.NET, ChronoZoom), the percentage (0% to 100%) of inline JavaScript that is Static, Safe Dynamic, or Unsafe Dynamic]

Here we are going to look at what percentage of the inline
JavaScript in each application is either static, safe dynamic, or
unsafe dynamic.

In these safe dynamic situations, we are able to safely
transform the dynamic inline JavaScript code.

In cases of unsafe dynamic inline JavaScript, we alert the
developer that the transformation could potentially contain
an XSS vulnerability. After the developer confirms the
absence of an XSS vulnerability in the unsafe dynamic
inline JavaScript, the application is guaranteed free of
XSS vulnerabilities.

Limitations
• Might miss inline JavaScript
– Loops
– Dynamic code execution

• Does not handle HTML attributes and CSS


Summary
• Code and Data separation necessary to
prevent XSS
• deDacota can automatically separate
Code and Data of web application
• deDacota works in practice

Adam Doupé
Email: adoupe@cs.ucsb.edu
Twitter: @adamdoupe
