Cатсн²² (in)sесuяitу / ChrisJohnRiley

Because we're damned if we do, and we're damned if we don't!

[SecTorCA] Reverse Engineering a Web Application – for fun, behavior & WAF Detection

Reverse Engineering a Web Application

For fun, behavior & WAF Detection

by Rodrigo “Sp0oKeR” Montoro (Sucuri Security)


Screening HTTP traffic can be really tricky, and attacks on applications are becoming more complex by the day. By analyzing thousands upon thousands of infections, we noticed that regular blacklisting is increasingly failing, so we started research on a new approach to mitigate the problem. We started by reverse engineering the most popular CMS applications, such as Joomla, vBulletin and WordPress, which led us to create a way to detect attackers based on whitelist protection in combination with behavior analysis. Integrating traffic analysis with log correlation has resulted in more than 2500 websites now being protected, generating 2 to 3 million alerts daily with a low false positive rate. In this presentation we will share some of our research, our results and how we have maintained a WAF (Web Application Firewall) with very low CPU usage and high detection rates.

The presentation is based on WordPress / NGINX, but the concepts can be applied to any web application / CMS technology. The goal of this talk is to better protect CMSs with better performance (fewer rules is better), while also protecting against new vulnerabilities as they are released/discovered.


By reverse engineering common CMSs (in this case WordPress) it is possible to better understand how they work.

WAF Detection (breakdown):

  • Traffic Analysis
  • Application Structural Analysis
  • Behavior

Detection steps

Reverse Engineering Traffic

As we’re talking about web applications, we’re mostly talking about HTTP here. By breaking the traffic down into specific categories it’s possible to understand it better. We include things such as the source IP in this section.

Crawling the application

There are various ways to crawl the application from a blackbox perspective (Burp Suite, for example). From a whitebox perspective there are various other options.

Looking at requests

By looking at the parameters used by the application it’s possible to identify places where it only ever sends numbers, or only letters, as the value of a parameter. For example, a name field should not contain numbers. However, this could be problematic if you don’t consider edge cases, like names with special characters.
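As a rough illustration of that idea, here is a minimal Python sketch of a per-parameter whitelist. The parameter names and patterns are assumptions made up for the example, not rules from the talk. The name pattern accepts accented letters, apostrophes and hyphens to cover the edge case mentioned above.

```python
import re

# Hypothetical whitelist: parameter name -> pattern the whole value must match.
PARAM_RULES = {
    # Post ID: ASCII digits only.
    "p": re.compile(r"[0-9]{1,10}"),
    # Name: Unicode letters (no digits), optionally joined by space, ' or -.
    "author_name": re.compile(r"[^\W\d_]+(?:[ '\-][^\W\d_]+)*"),
}

def violations(params):
    """Return the names of parameters that are unknown or fail their pattern."""
    bad = []
    for name, value in params.items():
        rule = PARAM_RULES.get(name)
        if rule is None or rule.fullmatch(value) is None:
            bad.append(name)
    return bad
```

Anything not on the list is treated as a violation, which is the whitelist mindset: describe what is allowed rather than chasing every bad input.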

Looking at the common headers, it’s easy to identify header values that must fall within specific whitelists, e.g. HTTP/1.0 or HTTP/1.1. Anything else is either corrupted data or somebody fiddling with the data being sent.

With wordpress.com, the response contains an x-hacker header saying “if you’re reading this, you should apply for a job…”

Brute-force attacks are on the rise, so if you can, compare users’ passwords against a list of the top X passwords and warn the user when theirs is weak.
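A check like this can be very small. The sketch below assumes a hand-picked sample set. In practice you would load the top X entries from whichever published list you trust.

```python
# Hypothetical sample; replace with the top X entries of a real list.
TOP_PASSWORDS = {"123456", "password", "12345678", "qwerty", "admin"}

def is_weak(password, common=TOP_PASSWORDS):
    """True if the password, ignoring case and surrounding spaces, is on the list."""
    return password.strip().lower() in common
```

Calling `is_weak("Password")` flags the password, so the sign-up or password-change form can tell the user to pick something else.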

Malicious user-agent strings tend to be shorter than legitimate ones. Legitimate browsers also tend to send more complete request headers (often over 8 headers), and you don’t see normal browsers sending HTTP/1.0 requests anymore, so requests that fail these simple checks can be dropped. Checking that all expected parameters are sent is also important: a lot of attackers only send the parameters they need and ignore the others. This can be checked easily enough.
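These checks combine naturally into one function. The sketch below is an assumption-heavy illustration: the request dict layout is invented for the example, the 19-byte user-agent limit is the figure quoted later in these notes, and the header count of 8 is read here as what normal browsers exceed.

```python
MIN_UA_LENGTH = 19         # assumption: the 19-byte figure from later in the notes
TYPICAL_HEADER_FLOOR = 8   # assumption: normal browsers send more headers than this

def suspicion_reasons(request, expected_params=()):
    """Return the reasons a request looks automated (empty list if none).

    `request` is a hypothetical dict:
    {"version": str, "headers": dict, "params": dict}.
    """
    reasons = []
    headers = request["headers"]
    if len(headers.get("User-Agent", "")) < MIN_UA_LENGTH:
        reasons.append("short user-agent")
    if len(headers) <= TYPICAL_HEADER_FLOOR:
        reasons.append("few headers")
    if request["version"] == "HTTP/1.0":
        reasons.append("HTTP/1.0")
    # Attackers often send only the parameters they need.
    missing = sorted(set(expected_params) - set(request["params"]))
    if missing:
        reasons.append("missing parameters: " + ", ".join(missing))
    return reasons
```

Returning reasons rather than a plain yes/no makes it easier to see which check is responsible when you start counting false positives.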

A regular user is also not going to request a whole load of pages that result in a 404. If a lot of requests end in a 404, this looks more like an attack than a normal user’s traffic.
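Counting 404s per source takes only a few lines. This is a minimal sketch, assuming the log has already been parsed into `(ip, status)` pairs. The limit of 20 is an arbitrary placeholder, not a number from the talk.

```python
from collections import Counter

def noisy_sources(entries, max_404=20):
    """entries: iterable of (ip, status). Return IPs with more than max_404 404s."""
    counts = Counter(ip for ip, status in entries if status == 404)
    return {ip for ip, n in counts.items() if n > max_404}
```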


Using a PCAP of real traffic and simple regex matching, it’s possible to test your logic to list what requests would normally be dropped BEFORE implementing something as a rule. You can then tweak the matching logic before going live.
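The dry run could look something like the sketch below. It assumes the request lines have already been extracted from the capture into plain strings (parsing the PCAP itself is left out), and the three rules are hypothetical examples, not Sucuri's rule set.

```python
import re

# Hypothetical rules: name -> regex searched in the raw request line.
RULES = {
    "setup file": re.compile(r"^\S+ /wp-admin/(?:install|setup-config)\.php"),
    "xmlrpc": re.compile(r"^\S+ /xmlrpc\.php"),
    "script in uploads": re.compile(r"^\S+ /wp-content/uploads/\S*\.php"),
}

def dry_run(request_lines):
    """Return (line, [matched rule names]) for every line that would be dropped."""
    report = []
    for line in request_lines:
        hits = [name for name, rule in RULES.items() if rule.search(line)]
        if hits:
            report.append((line, hits))
    return report
```

Running this over a day of real traffic gives a list to eyeball for false positives before any rule goes live.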

NGINX is meant to be quick, so it doesn’t allow IF ELSE, only IF statements.


if ($request_method != <something>) {
     return <status_code>;
}

WordPress has a lot of files (check the tarball for a full list), so we can slim that down a bit by removing things like initial config and setup files. The administration console (/wp-admin) can also either be disabled or restricted to specific source addresses (think 2FA). Core files (wp-includes) are not meant to be externally accessible, and the same goes for uploaded content (wp-content). WordPress also has an XML-RPC interface that allows somebody to perform specific actions (e.g. ping-backs, comments, user-auth, …). Redirecting these requests to a honeypot might be an option for you.

<IfModule mod_alias.c>
     Redirect 301 /xmlrpc.php <honeypot_url>
</IfModule>

Lots of brute-force attempts using xmlrpc.php have been seen since June 2014. A similar rise in traffic was seen in March 2014, when xmlrpc.php was used as a DDoS tool. By looking at the logs it’s easy to see spikes where there may be new attacks or new methods being tried out.
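One simple way to make such spikes stand out is to compare each day against the median. This is a sketch of that idea only. The factor of 5 is an arbitrary assumption, and the talk does not say how Sucuri detects spikes.

```python
from statistics import median

def spikes(daily_hits, factor=5):
    """daily_hits: {day: hit count}. Return days with more than factor x the median."""
    if not daily_hits:
        return []
    baseline = median(daily_hits.values())
    return sorted(day for day, n in daily_hits.items() if n > factor * baseline)
```

The median is used instead of the mean so that the spike itself does not drag the baseline upwards.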

To secure things further, deny specific filetypes in directories where you may have user content or data (e.g. uploads, logs, …).

Mitigating the attack surface

Turn off the machine and remove the network cable –> Not really an option

OSSEC for real-time monitoring.

Monitor specific locations for alteration or addition of files to ensure you get visibility on the web application.
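OSSEC handles this for you with its integrity checking, so the following is only a Python illustration of the underlying idea: take a hash snapshot of a directory, then compare a later snapshot against it. The function names are made up for the example.

```python
import hashlib
from pathlib import Path

def snapshot(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def changes(baseline, current):
    """Compare two snapshots and list added, removed and modified files."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(k for k in shared if baseline[k] != current[k]),
    }
```

A new PHP file appearing under an uploads directory, or a core file showing up as modified, is exactly the kind of event you want an alert for.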

Threshold ideas

Too many 404s –> somebody searching the web app

GET/POST per time for same IP source –> automated user hitting the site (not a normal user)
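A per-source request threshold can be sketched with a sliding window. The class below is a hypothetical helper, and the default of 60 requests per 60 seconds is a placeholder to tune against your own traffic.

```python
from collections import defaultdict, deque

class RateTracker:
    """Track request timestamps per source IP over a sliding time window."""

    def __init__(self, max_requests=60, window_seconds=60):
        self.max_requests = max_requests
        self.window = window_seconds
        self._seen = defaultdict(deque)

    def hit(self, ip, now):
        """Record a request at time `now` (seconds); True if ip is over the limit."""
        q = self._seen[ip]
        q.append(now)
        # Forget requests that have fallen out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q) > self.max_requests
```

Passing the timestamp in, rather than reading the clock inside the class, means the same code can replay old logs as well as watch live traffic.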

File specific: Set files on Linux as immutable (chattr +i; lsattr shows the flag)

Statistical data

Useful for counter intelligence and to find behaviors, new trends and alerts.

Instead of blocking “user-agent: ABCD”, think about blocking connections from user-agents with < 19 bytes (maybe a few false positives, but less specific).

GEO-IP Blocking –> based on top countries, you could block specific countries if you don’t have business reasons to allow traffic from them. This may change week by week however.

Methods –> If your application only allows GET/POST, then drop everything else

HTTP Version –> If you only accept HTTP/1.1, then drop 1.0 and all malformed versions (stats from Sucuri show 1 million hits a week dropped by this rule alone)
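The method and version rules are cheap to express as a whitelist on the request line. In the sketch below the allowed sets are assumptions for a site that only needs GET/POST over HTTP/1.1, so adjust them to what your application really uses.

```python
ALLOWED_METHODS = {"GET", "POST"}   # assumption: the app needs nothing else
ALLOWED_VERSIONS = {"HTTP/1.1"}     # assumption: HTTP/1.0 clients are not needed

def accept(request_line):
    """True only for a well-formed 'METHOD target VERSION' line on the whitelist."""
    parts = request_line.split(" ")
    if len(parts) != 3:
        return False  # malformed request line
    method, _target, version = parts
    return method in ALLOWED_METHODS and version in ALLOWED_VERSIONS
```

Because the check only asks "is this on the list?", malformed versions are dropped without needing a rule for each variation.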


This is a constant process, not set it and forget it.


  • Developers
  • plug-ins
  • Bad Code
  • languages

Next steps:

  • Integration with SCAP
  • open source PCAP parser tool
  • build rule-set for CMSs under OWASP banner



2 responses to “[SecTorCA] Reverse Engineering a Web Application – for fun, behavior & WAF Detection”

  1. Clerkendweller October 22, 2014 at 13:31

    There are some more application and external attacker detection ideas in the OWASP AppSensor project http://www.appsensor.org/
