Tuesday, August 10, 2010

Constricting the Web: The GDS Burp API

Last week, I presented "Constricting the Web: Offensive Python for Web Hackers" at Black Hat USA 2010 and DEF CON 18 with Nathan Hamiel. In the talk, we presented techniques for using Python in custom tools to help testers be more effective at finding flaws in web applications. One of the tools I demoed during our talk was the GDS Burp API.

Introducing the GDS Burp API

At GDS, among the many web application security testing tools available, we often use PortSwigger's Burp Suite. We have found that Burp fits well with our testing methodology, partly because its target audience is penetration testers like us. One of the features I like most is the ability to record the requests and responses intercepted by its proxy. After a thorough crawl of the application (visiting every page and submitting every form), this proxy log contains a treasure trove of data. Because I had no desire to write extensions in Java, and because of the current limitations of the Burp Extender API, I wrote a toolkit in Python that I call the GDS Burp API.

What is it?

The GDS Burp API exposes a Python object interface to the requests/responses recorded by Burp (whether by the Proxy, Spider, Repeater, etc.). The API parses Burp logs into a list of “Burp objects” that contain the request/response data and related metadata. To help visualize this, imagine a request and its associated response from your Burp history now available to you as a single Python object outside of Burp, free of Extender limitations.

One of the main reasons for doing this is that web application scanners usually do not provide enough insight into the actual requests they send and the responses they receive. I like to have a full understanding of the underlying scanner implementation, so that I know in which scenarios a scanner will be effective and where I will have to resort to manual or semi-automated methods.

How do I use it?

Using the GDS Burp API is relatively simple and requires only a beginner-level understanding of Python. After you've performed a thorough crawl, logging each request/response, use gds.pub.burp.parse() to parse the proxy log into a list of Burp objects.

>>> import gds.pub.burp
>>> proxylog = gds.pub.burp.parse("my_application_proxy.log")
>>> proxylog
[<Burp 1>, <Burp 2>, <Burp 3>, <Burp 4>, <Burp 5>]


Within each Burp object there are several properties, as seen in an example object below:

>>> pprint(proxylog[2].__dict__)
{'host': 'http://demo.blah:80',
 'index': 3,
 'ip_address': '65.61.137.117',
 'parameters': {'query': {'txtSearch': 'hello'}},
 'request': {'body': '',
             'headers': {'Cookie': 'ASP=a0iq3c550j1xtyqbjajmymi4;',
                         'User-Agent': 'Mozilla/5.0 (X11; U; Linux x86_64; ..snip..'},
             'method': 'GET',
             'path': '/search.aspx?txtSearch=hello',
             'version': 'HTTP/1.1'},
 'response': {'body': '\r\n\r\n<!DOCTYPE html PUBLIC ..snip..',
              'headers': {'Content-Length': '7275',
                          'Content-Type': 'text/html; charset=utf-8',
                          'Server': 'Microsoft-IIS/6.0',
                          'X-Aspnet-Version': '2.0.50727'},
              'reason': 'OK',
              'status': 200,
              'version': 'HTTP/1.1'},
 'time': '1:56:38 PM',
 'url': ParseResult(scheme='http', netloc='demo.blah:80', path='/search.aspx', params='', query='txtSearch=hello', fragment='')}

You can access these object members directly or via object API methods such as get_request_header("User-Agent"), get_request_path(), get_response_body(), get_response_status(), etc. The full generated API documentation for the GDS Burp API can be accessed online or within the docs folder included in the zip file.
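
For example, continuing the interactive session above (the accessor names come from the API as described; the return values shown here are illustrative):

>>> burp = proxylog[2]
>>> burp.get_request_path()
'/search.aspx?txtSearch=hello'
>>> burp.get_request_header('User-Agent')
'Mozilla/5.0 (X11; U; Linux x86_64; ..snip..'
>>> burp.get_response_status()
200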

What can I do with it?

The purpose of this API is simply to provide a Python object interface to the request/response data recorded by Burp. I deliberately did not implement any fuzzing or analysis capabilities in the API, because I felt that would discourage people from writing their own. I do not want to limit anyone by including a scanner, since someone else's requirements may, and very likely will, differ from mine.

A few ideas to get you started, all of which I have implemented myself (see the sketches after this list):

  • Replaying individual requests as-is
  • Replaying sequences of requests (such as a login or a checkout operation)
  • Comparing two proxy logs and returning a diff of the URLs and parameters submitted
  • Fuzzing requests (creating new Burp objects) and appending them to the original object's replayed list
  • Comparing the response bodies and headers of fuzzed requests against their originals
  • Using the difflib module from the Python standard library to produce a diff-formatted HTML view of two response bodies
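
As a minimal sketch of the first idea, here is one way an individual request could be replayed as-is using httplib from the standard library. It relies only on the object members shown in the pprint output above; the replay() helper itself is hypothetical, not part of the API:

import httplib

def replay(burp):
    # Re-issue a recorded request against its original host, using
    # the member layout shown in the pprint output above.
    conn = httplib.HTTPConnection(burp.url.netloc)
    conn.request(burp.request['method'],
                 burp.request['path'],
                 burp.request['body'],
                 burp.request['headers'])
    return conn.getresponse()

response = replay(proxylog[2])
print response.status, response.reason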
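
Along the same lines, the last idea can be sketched with difflib.HtmlDiff, assuming the get_response_body() accessor mentioned above and a hypothetical second log (fuzzed) parsed the same way:

import difflib

def diff_responses(original, fuzzed):
    # Build a side-by-side HTML diff of two recorded response bodies.
    a = original.get_response_body().splitlines()
    b = fuzzed.get_response_body().splitlines()
    return difflib.HtmlDiff().make_file(a, b)

fuzzed = gds.pub.burp.parse("my_application_fuzzed.log")  # hypothetical second log
open("diff.html", "w").write(diff_responses(proxylog[2], fuzzed[2]))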

Download Constricting the Web: Offensive Python for Web Hackers.
Visit GDS Burp API on GitHub.

Reader Comments (1)

Hey,

This code is great, and I'm looking forward to seeing the next releases.

Just a little comment about the regexp used in parsers.py:
HEADER = re.compile('(\d{1,2}:\d{2}:\d{2} (AM|PM) )[ \t]+(\S+)([ \t]+\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|unknown host)\])?')

I use Burp Pro 1.3.09 and my logs do not display (AM|PM), so why not change it to (AM|PM)*, giving:
HEADER = re.compile('(\d{1,2}:\d{2}:\d{2} (AM|PM)* )[ \t]+(\S+)([ \t]+\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|unknown host)\])?')

Loic,

January 11, 2011 | Unregistered CommenterLoic Castel
