> ModSecurity Handbook: Getting Started: Chapter 1. Introduction



1 Introduction

ModSecurity is a tool that will help you secure your web applications. No, scratch that. Actually, ModSecurity is a tool that will help you sleep better at night, and I will explain how. I usually call ModSecurity a web application firewall (WAF), because that’s the generally accepted term to refer to the class of products that are specifically designed to secure web applications. Other times I will call it an HTTP intrusion detection tool, because I think that name better describes what ModSecurity does. Neither name is entirely adequate, yet we don’t have a better one. Besides, it doesn’t really matter what we call it. The point is that web applications—yours, mine, everyone’s—are terribly insecure on average. We struggle to keep up with the security issues and need any help we can get to secure them.

The idea to write ModSecurity came to me during one of my sleepless nights—I couldn’t sleep because I was responsible for the security of several web-based products. I could see how most web applications were slapped together with little time spent on design and little time spent on understanding the security issues. Furthermore, not only were web applications insecure, but we had no idea how insecure they were or if they were being attacked. Our only eyes were the web server access and error logs, and they didn’t say much.

ModSecurity will help you sleep better at night because, above all, it solves the visibility problem: it lets you see your web traffic. That visibility is key to security: once you are able to see HTTP traffic, you are able to analyze it in real time, record it as necessary, and react to the events. The best part of this concept is that you get to do all of that without actually touching web applications. Even better, the concept can be applied to any application—even if you can’t access the source code.

Like many other open source projects, ModSecurity started out as a hobby. Software development had been my primary concern back in 2002, when I realized that producing secure web applications is virtually impossible. As a result, I started to fantasize about a tool that would sit in front of web applications and control the flow of data in and out. The first version was released in November 2002, but a few more months were needed before the tool became useful. Other people started to learn about it, and the popularity of ModSecurity started to rise.

Initially, most of my effort was spent wrestling with Apache to make request body inspection possible. Apache 1.3.x did not have any interception or filtering APIs, but I was able to trick it into submission. Apache 2.x improved things by providing APIs that do allow content interception, but there was no documentation to speak of. Nick Kew released the excellent The Apache Modules Book (Prentice Hall) in 2007, which unfortunately was too late to help me with the development of ModSecurity.

By 2004, I was a changed man. Once primarily a software developer, I became obsessed with web application security and wanted to spend more time working on it. I quit my job and started treating ModSecurity as a business. My big reward came in the summer of 2006, when ModSecurity went head to head with other web application firewalls, in an evaluation conducted by Forrester Research, and came out very favorably. Later that year, my company was acquired by Breach Security. A team of one eventually became a team of many: Brian Rectanus came to work on ModSecurity, Ofer Shezaf took on the rules, and Ryan C. Barnett the community management and education. ModSecurity 2.0, a complete rewrite, was released in late 2006. At the same time we released ModSecurity Community Console, which combined the functionality of a remote logging sensor and a monitoring and reporting GUI.

I stopped being in charge of ModSecurity in January 2009, when I left Breach Security. Brian Rectanus subsequently took the lead. In the meantime, Ryan C. Barnett took charge of the ModSecurity rules and produced a significant improvement with CRS v2. In 2010, Trustwave acquired Breach Security and promised to revitalize ModSecurity. The project is currently run by Ryan C. Barnett and Breno Silva, and there are indeed some signs that the project is getting healthier. I remain involved primarily through my work on this book.

Something spectacular happened in March 2011: Trustwave announced that they would be changing the license of ModSecurity from GPLv2 to Apache Software License (ASLv2). This is a great step toward a wider use of ModSecurity because ASL falls into the category of permissive licenses. Later, the same change was announced for the Core Rule Set project (which is hosted with OWASP).

ModSecurity is a toolkit for real-time web application monitoring, logging, and access control. I like to think about it as an enabler: there are no hard rules telling you what to do; instead, it is up to you to choose your own path through the available features. That’s why the title of this section asks what ModSecurity can do, not what it does.

The freedom to choose what to do is an essential part of ModSecurity’s identity and goes very well with its open source nature. With full access to the source code, your freedom to choose extends to the ability to customize and extend the tool itself to make it fit your needs. It’s not a matter of ideology, but of practicality. I simply don’t want my tools to restrict what I can do.

Back on the topic of what ModSecurity can do, the following is a list of the most important usage scenarios:

Real-time application security monitoring and access control

At its core, ModSecurity gives you access to the HTTP traffic stream, in real-time, along with the ability to inspect it. This is enough for real-time security monitoring. There’s an added dimension of what’s possible through ModSecurity’s persistent storage mechanism, which enables you to track system elements over time and perform event correlation. You are able to reliably block, if you so wish, because ModSecurity uses full request and response buffering.

Virtual patching

Virtual patching is a concept of vulnerability mitigation in a separate layer, where you get to fix problems in applications without having to touch the applications themselves. Virtual patching is applicable to applications that use any communication protocol, but it is particularly useful with HTTP, because the traffic can generally be well understood by an intermediary device. ModSecurity excels at virtual patching because of its reliable blocking capabilities and the flexible rule language that can be adapted to any need. It is, by far, the activity that requires the least investment, is the easiest activity to perform, and the one that most organizations can benefit from straight away.

Full HTTP traffic logging

Web servers traditionally do very little when it comes to logging for security purposes. They log very little by default, and even with a lot of tweaking you are not able to get everything that you need. I have yet to encounter a web server that is able to log full transaction data. ModSecurity gives you that ability to log anything you need, including raw transaction data, which is essential for forensics. In addition, you get to choose which transactions are logged, which parts of a transaction are logged, and which parts are sanitized.

Continuous passive security assessment

Security assessment is largely seen as an active scheduled event, in which an independent team is sourced to try to perform a simulated attack. Continuous passive security assessment is a variation of real-time monitoring, where, instead of focusing on the behavior of the external parties, you focus on the behavior of the system itself. It’s an early warning system of sorts that can detect traces of many abnormalities and security weaknesses before they are exploited.

Web application hardening

One of my favorite uses for ModSecurity is attack surface reduction, in which you selectively narrow down the HTTP features you are willing to accept (e.g., request methods, request headers, content types, etc.). ModSecurity can assist you in enforcing many similar restrictions, either directly, or through collaboration with other Apache modules. They all fall under web application hardening. For example, it is possible to fix many session management issues, as well as cross-site request forgery vulnerabilities.

Something small, yet very important to you

Real life often throws unusual demands at us, and that is when the flexibility of ModSecurity comes in handy. It may be a security need, but it may also be something completely different. For example, some people use ModSecurity as an XML web service router, combining its ability to parse XML and apply XPath expressions with its ability to proxy requests. Who knew?
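
To make a few of these scenarios concrete, here is a minimal sketch in the ModSecurity rule language. It is not taken from a real deployment: the rule IDs, the /app/view.php path, and the id and password parameter names are all hypothetical, and the exact actions you need may differ between ModSecurity versions.

```apache
# Hardening: accept only the request methods the application needs.
SecRule REQUEST_METHOD "!^(?:GET|HEAD|POST)$" \
    "id:9001,phase:1,t:none,log,deny,status:405,msg:'Request method not allowed'"

# Virtual patching: /app/view.php and its "id" parameter are hypothetical.
# Reject the request if an id parameter is present and is not all digits.
<Location /app/view.php>
    SecRule ARGS:id "!^\d+$" \
        "id:9002,phase:2,t:none,log,deny,status:403,msg:'Virtual patch: non-numeric id'"
</Location>

# Logging: mask the (hypothetical) password parameter in the audit log.
SecAction "id:9003,phase:2,nolog,pass,sanitiseArg:password"
```

A cautious way to try rules like these is to replace deny with pass at first and watch the logs before blocking anything.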

There are four guiding principles on which ModSecurity is based, as follows:

There are bits in ModSecurity that fall outside the scope of these four principles. For example, ModSecurity can change the way Apache identifies itself to the outside world, confine the Apache process within a jail, and even implement an elaborate scheme to deal with a once-infamous universal XSS vulnerability in Adobe Reader. Although it was I who added those features, I now think that they detract from the main purpose of ModSecurity, which is a reliable and predictable tool that allows for HTTP traffic inspection.

ModSecurity supports two deployment options: embedded and reverse proxy deployment. There is no one correct way to use them; choose an option based on what best suits your circumstances. There are advantages and disadvantages to both options:

Embedded

Because ModSecurity is an Apache module, you can add it to any compatible version of Apache. At the moment that means a reasonably recent Apache version from the 2.0.x branch, although a newer 2.2.x version is recommended. The embedded option is a great choice for those who already have their architecture laid out and don’t want to change it. Embedded deployment is also the only option if you need to protect hundreds of web servers. In such situations, it is impractical to build a separate proxy-based security layer. Embedded ModSecurity not only does not introduce new points of failure, but it scales seamlessly as the underlying web infrastructure scales. The main challenge with embedded deployment is that server resources are shared between the web server and ModSecurity.

Reverse proxy

Reverse proxies are effectively HTTP routers, designed to stand between web servers and their clients. When you install a dedicated Apache reverse proxy and add ModSecurity to it, you get a “proper” network web application firewall, which you can use to protect any number of web servers on the same network. Many security practitioners prefer having a separate security layer. With it you get complete isolation from the systems you are protecting. On the performance front, a standalone ModSecurity will have resources dedicated to it, which means that you will be able to do more (i.e., have more complex rules). The main disadvantage of this approach is the new point of failure, which will need to be addressed with a high-availability setup of two or more reverse proxies.
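
As an illustration of the reverse proxy option, the following sketch shows roughly what a dedicated Apache proxy with ModSecurity might look like. The host name and backend address are placeholders, and the example assumes that mod_proxy and mod_proxy_http are loaded alongside ModSecurity.

```apache
# Hypothetical reverse proxy in front of a single backend server.
# Requires mod_proxy and mod_proxy_http in addition to ModSecurity.
<VirtualHost *:80>
    ServerName www.example.com

    # Act as a reverse proxy only, never as a forward proxy.
    ProxyRequests Off
    ProxyPass        / http://192.168.0.10/
    ProxyPassReverse / http://192.168.0.10/

    # Inspect the traffic passing through this virtual host.
    SecRuleEngine On
</VirtualHost>
```

In embedded mode the ModSecurity directives would instead go directly into the configuration of the web server that hosts the application, with no proxy directives involved.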

In this first practical section in the book, I will give you a whirlwind tour of the ModSecurity internals, which should help you get started.

ModSecurity is a hybrid web application firewall engine that relies on the host web server for some of the work. The only supported web server at the moment is Apache 2.x, but it is possible, in principle, to integrate ModSecurity with any other web server that provides sufficient integration APIs.

Apache does for ModSecurity what it does for all other modules—it handles the infrastructure tasks:

There are a few additional tasks Apache performs in a reverse proxy scenario:

The advantage of a hybrid implementation is that it is very efficient—the duplication of work is minimal when it comes to HTTP parsing. A couple of disadvantages of this approach are that you don’t always get access to the raw data stream and that web servers sometimes don’t process data in the way a security-conscious tool would. In the case of Apache, the hybrid approach works reasonably well, with a few minor issues:

The functionality offered by ModSecurity falls roughly into four areas:

Everything in ModSecurity revolves around two things: configuration and rules. The configuration tells ModSecurity how to process the data it sees; the rules decide what to do with the processed data. Although it is too early to go into how the rules work, I will show you a quick example here just to give you an idea what they look like.

For example:

SecRule ARGS "<script>" log,deny,status:404

Even without further assistance, you can probably recognize the part in the rule that specifies what we wish to look for in input data (<script>). Similarly, you will easily figure out what will happen if we do find the desired pattern (log,deny,status:404). Things will become clearer if I tell you about the general rule syntax, which is as follows:

SecRule VARIABLES OPERATOR ACTIONS

The three parts have the following meanings:

I hope you are not disappointed with the simplicity of this first rule. I promise you that by combining the various facilities offered by ModSecurity, you will be able to write very useful rules that implement complex logic where necessary.
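
For comparison, the same rule can be spelled out more explicitly. This is only a sketch: when no operator is named, ModSecurity 2.x uses @rx (a regular expression match), and unless SecDefaultAction says otherwise a rule runs in phase 2. The id and msg values are made up for illustration; newer 2.x releases require every rule to carry an id.

```apache
# VARIABLES: ARGS          (all request parameters)
# OPERATOR:  @rx <script>  (regular expression match; @rx is the default)
# ACTIONS:   log the match, block the transaction, respond with status 404
SecRule ARGS "@rx <script>" \
    "id:9010,phase:2,log,deny,status:404,msg:'Script tag in request parameter'"
```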

In ModSecurity, every transaction goes through five steps, or phases. In each of the phases, ModSecurity will do some work at the beginning (e.g., parse data that has become available), invoke the rules specified to work in that phase, and perhaps do a thing or two after the phase rules have finished. At first glance, it may seem that five phases are too many, but there’s a reason why each of the phases exists. There is always one thing, sometimes several, that can only be done at a particular moment in the transaction lifecycle.
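
A rule is bound to a phase with the phase action, using the numbers 1 through 5. The sketch below places one illustrative rule in each phase. The rule IDs, patterns, and messages are assumptions, chosen only to show what kind of data is available at each point.

```apache
# Phase 1, REQUEST_HEADERS: request line and headers only.
SecRule REQUEST_HEADERS:User-Agent "@contains nikto" \
    "id:9021,phase:1,t:none,t:lowercase,log,pass,msg:'Scanner user agent'"

# Phase 2, REQUEST_BODY: body parameters are now available.
SecRule ARGS_POST "@contains <script" \
    "id:9022,phase:2,t:none,t:lowercase,log,pass,msg:'Script tag in body parameter'"

# Phase 3, RESPONSE_HEADERS: response status and headers.
SecRule RESPONSE_STATUS "@streq 500" \
    "id:9023,phase:3,t:none,log,pass,msg:'Application returned 500'"

# Phase 4, RESPONSE_BODY: needs SecResponseBodyAccess On to see the body.
SecRule RESPONSE_BODY "@contains ODBC Error" \
    "id:9024,phase:4,t:none,log,pass,msg:'Database error leaked in response'"

# Phase 5, LOGGING: can influence logging but can no longer block.
SecRule RESPONSE_STATUS "^5" \
    "id:9025,phase:5,t:none,log,pass,msg:'Server error observed'"
```

All five rules use pass, so they only record what they see. A blocking action such as deny would make sense in phases 1 through 4.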

To give you a better idea what happens on every transaction, we’ll examine a detailed debug log of one POST transaction. I’ve deliberately chosen a transaction type that uses the request body as its principal method to transmit data, because following such a transaction will exercise most parts of ModSecurity. To keep things relatively simple, I used a configuration without any rules, removed some of the debug log lines for clarity, and removed the timestamps and some additional metadata from each line.
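
A debug log of this kind can be produced with a configuration along the following lines. This is a sketch rather than the exact configuration used for the trace, and the log file path is an assumption.

```apache
# Enable the engine and buffer both request and response bodies.
SecRuleEngine On
SecRequestBodyAccess On
SecResponseBodyAccess On

# Write the most detailed debug output (level 9) to a file.
# The path is an assumption; any location writable by Apache will do.
SecDebugLog /opt/modsecurity/var/log/debug.log
SecDebugLogLevel 9
```

The bracketed number at the start of each debug log line in the traces that follow is the level of that message. Level 9 is very verbose, so it is best reserved for troubleshooting.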

The transaction I am using as an example in this section is very straightforward. I made a point of placing request data in two different places, parameter a in the query string and parameter b in the request body, but there is little else of interest in the request:

POST /?a=test HTTP/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 6

b=test

The response is entirely unremarkable:

HTTP/1.1 200 OK
Date: Sun, 17 Jan 2010 00:13:44 GMT
Server: Apache
Content-Length: 12
Connection: close
Content-Type: text/html

Hello World!

ModSecurity is first invoked by Apache after request headers become available, but before a request body (if any) is read. First comes the initialization message, which contains the unique transaction ID generated by mod_unique_id. Using this information, you should be able to pair the information in the debug log with the information in your access and audit logs. At this point, ModSecurity will parse the information on the request line and in the request headers. In this example, the query string part contains a single parameter (a), so you will see a message documenting its discovery. ModSecurity will then create a transaction context and invoke the REQUEST_HEADERS phase:

[4] Initialising transaction (txid SopXW38EAAE9YbLQ).
[5] Adding request argument (QUERY_STRING): name "a", value "test"
[4] Transaction context created (dcfg 8121800).
[4] Starting phase REQUEST_HEADERS.

Assuming that a rule didn’t block the transaction, ModSecurity will now return control to Apache, allowing other modules to process the request before control is given back to it.

In the second phase, ModSecurity will first read and process the request body, if it is present. In the following example, you can see three messages from the input filter, which tell you what was read. The fourth message tells you that one parameter was extracted from the request body. The content type used in this request (application/x-www-form-urlencoded) is one of the types ModSecurity recognizes and parses automatically. Once the request body is processed, the REQUEST_BODY rules are processed.

[4] Second phase starting (dcfg 8121800).
[4] Input filter: Reading request body.
[9] Input filter: Bucket type HEAP contains 6 bytes.
[9] Input filter: Bucket type EOS contains 0 bytes.
[5] Adding request argument (BODY): name "b", value "test"
[4] Input filter: Completed receiving request body (length 6).
[4] Starting phase REQUEST_BODY.

The filters that keep being mentioned in the logs are parts of ModSecurity that handle request and response bodies:

[4] Hook insert_filter: Adding input forwarding filter (r 81d0588).
[4] Hook insert_filter: Adding output filter (r 81d0588).

There will be a message in the debug log every time ModSecurity sends a chunk of data to the request handler, and one final message to say that there isn’t any more data in the buffers.

[4] Input filter: Forwarding input: mode=0, block=0, nbytes=8192 ↩
(f 81d2228, r 81d0588).
[4] Input filter: Forwarded 6 bytes.
[4] Input filter: Sent EOS.
[4] Input filter: Input forwarding complete.

Shortly thereafter, the output filter will start receiving data, at which point the RESPONSE_HEADERS rules will be invoked:

[9] Output filter: Receiving output (f 81d2258, r 81d0588).
[4] Starting phase RESPONSE_HEADERS.

Once all the rules have run, ModSecurity will continue to store the response body in its buffers, after which it will run the RESPONSE_BODY rules:

[9] Output filter: Bucket type MMAP contains 12 bytes.
[9] Output filter: Bucket type EOS contains 0 bytes.
[4] Output filter: Completed receiving response body (buffered full - 12 bytes).
[4] Starting phase RESPONSE_BODY.

Again, assuming that none of the rules blocked, the accumulated response body will be forwarded to the client:

[4] Output filter: Output forwarding complete.

Finally, the logging phase will commence. The LOGGING rules will be run first to allow them to influence logging, after which the audit logging subsystem will be invoked to log the transaction if necessary. A message from the audit logging subsystem will be the last transaction message in the logs. In this example, ModSecurity tells us that it didn’t find anything of interest in the transaction and that it sees no reason to log it:

[4] Initialising logging.
[4] Starting phase LOGGING.
[4] Audit log: Ignoring a non-relevant request.
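
Whether a transaction counts as relevant depends on the audit log configuration. The following sketch shows one plausible setup; the log path and the selection of parts are assumptions, not the configuration behind the trace above.

```apache
# Log only transactions that are considered relevant.
SecAuditEngine RelevantOnly
# Treat 5xx responses, and 4xx responses other than 404, as relevant.
SecAuditLogRelevantStatus "^(?:5|4(?!04))"

# Parts: A and Z delimit the entry, B request headers, I request body,
# F response headers, H audit trailer. The selection is illustrative.
SecAuditLogParts ABIFHZ
SecAuditLogType Serial
SecAuditLog /opt/modsecurity/var/log/audit.log
```

With a setup like this, an unremarkable 200 response that triggers no rules is not written to the audit log.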

Requests that contain files are processed slightly differently. The changes can be best understood by again following the activity in the debug log:

[4] Input filter: Reading request body.
[9] Multipart: Boundary: ---------------------------2411583925858
[9] Input filter: Bucket type HEAP contains 256 bytes.
[9] Multipart: Added part header "Content-Disposition" "form-data; name=\"f\"; ↩
filename=\"eicar.com.txt\""
[9] Multipart: Added part header "Content-Type" "text/plain"
[9] Multipart: Content-Disposition name: f
[9] Multipart: Content-Disposition filename: eicar.com.txt
[4] Multipart: Created temporary file: ↩
/opt/modsecurity/var/tmp/20090819-175503-SowuZ38AAQEAACV-Agk-file-gmWmrF
[9] Multipart: Changing file mode to 0600: ↩
/opt/modsecurity/var/tmp/20090819-175503-SowuZ38AAQEAACV-Agk-file-gmWmrF
[9] Multipart: Added file part 9c870b8 to the list: name "f" file name ↩
"eicar.com.txt" (offset 140, length 68)
[9] Input filter: Bucket type EOS contains 0 bytes.
[4] Reqest body no files length: 96
[4] Input filter: Completed receiving request body (length 256).

In addition to seeing the multipart parser in action, you see ModSecurity creating a temporary file (into which it will extract the upload) and adjusting its privileges to match the desired configuration.
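
The file locations and permissions seen in this trace are controlled by configuration. A minimal sketch follows, with directory paths mirroring those in the log lines; the values are otherwise assumptions.

```apache
# Where temporary files are created while a request is processed.
SecTmpDir /opt/modsecurity/var/tmp/
# Where intercepted uploads are stored if they are kept.
SecUploadDir /opt/modsecurity/var/upload/
# Permissions applied to extracted upload files (the 0600 seen in the log).
SecUploadFileMode 0600
# Do not keep uploads after the transaction ends.
# On or RelevantOnly would retain them in the upload directory.
SecUploadKeepFiles Off
```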

Then, at the end of the transaction, you will see the cleanup and the temporary file deleted:

[4] Multipart: Cleanup started (remove files 1).
[4] Multipart: Deleted file (part) ↩
"/opt/modsecurity/var/tmp/20090819-175503-SowuZ38AAQEAACV-Agk-file-gmWmrF"

The temporary file will not be deleted if ModSecurity decides to keep an uploaded file. Instead, it will be moved to the storage area:

[4] Multipart: Cleanup started (remove files 0).
[4] Input filter: Moved file from ↩
"/opt/modsecurity/var/tmp/20090819-175503-SowuZ38AAQEAACV-Agk-file-gmWmrF" to ↩
"/opt/modsecurity/var/upload/20090819-175503-SowuZ38AAQEAACV-Agk-file-gmWmrF".

In the example traces, you’ve observed an upload of a small file that was stored in RAM. When large uploads take place, ModSecurity will attempt to use RAM at first, switching to on-disk storage once it becomes obvious that the request body is too large to keep in memory:

[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 1536 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 576 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[9] Input filter: Bucket type HEAP contains 8000 bytes.
[4] Input filter: Request too large to store in memory, switching to disk.

A new file will be created to store the entire raw request body:

[4] Input filter: Created temporary file to store request body: ↩
/opt/modsecurity/var/tmp//20090819-180105-Sowv0X8AAQEAACWAArs-request_body-4nZjqf
[4] Input filter: Wrote 129559 bytes from memory to disk.

This file is always deleted in the cleanup phase:

[4] Input filter: Removed temporary file: ↩
/opt/modsecurity/var/tmp//20090819-180105-Sowv0X8AAQEAACWAArs-request_body-4nZjqf
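
The point at which a request body moves from memory to disk, and the largest body accepted at all, are set with the request body limit directives. The numbers below are illustrative values in bytes and should be treated as assumptions to adapt, not as recommendations.

```apache
# Largest request body held in RAM; anything bigger is spooled to disk.
SecRequestBodyInMemoryLimit 131072
# Largest request body accepted at all, uploaded files included.
SecRequestBodyLimit 13107200
# Largest request body accepted once uploaded file contents are excluded.
SecRequestBodyNoFilesLimit 131072
```

Lowering the in-memory limit trades RAM for disk I/O on large requests.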

The addition of ModSecurity will change how your web server operates. As with all Apache modules, you pay for the additional flexibility and security ModSecurity gives you with increased CPU and RAM consumption on your server. The exact amount will depend on your configuration of ModSecurity and the usage of your server. Following is a detailed list of the various activities that increase resource consumption:

In practice, this list is important because it keeps you informed; what matters is that you have enough resources to support your ModSecurity needs. If you do, then it doesn’t matter how expensive ModSecurity is. Also, what’s expensive to someone may not be to someone else. If you don’t have enough resources to do everything you want with ModSecurity, you will need to monitor the operation of your system and remove some of the functionality to reduce the resource consumption. Virtually everything that ModSecurity does is configurable, so you should have no problems doing that.

It is generally easier to run ModSecurity in reverse proxy mode, because then you usually have an entire server (with its own CPU and RAM) to play with. In embedded mode, ModSecurity will add to the processing already done by the web server, so this method is more challenging on a busy server.

For what it’s worth, ModSecurity generally uses the minimal necessary resources to perform the desired functions, so this is really a case of exchanging functionality for speed: if you want to do more, you have to pay more.

The purpose of this section is to map your future ModSecurity activities and help you determine where to go from here. Where you will go depends on what you want to achieve and how much time you have to spend. A complete ModSecurity experience, so to speak, consists of the following elements:

Installation and configuration

This is the basic step that all users must learn how to perform. The next three chapters will teach you how to make ModSecurity operational, performing installation, general configuration, and logging configuration. Once you are done with that, you need to decide what you want to do with it. That’s what the remainder of the book is for.

Rule writing

Rule writing is an essential skill. You may currently view rules as a tool to use to detect application security attacks. They are that, but they are also much more. In ModSecurity, you write rules to find out more about HTTP clients (e.g., geolocation and IP address reputation), perform long-term activity tracking (of IP addresses, sessions and users, for example), implement policy decisions (use the available information to make the decisions to warn or block), write virtual patches, and even to check on the status of ModSecurity itself.

It is true that the attack detection rules are in a class of their own, but that’s mostly because, in order to write them successfully, you need to know so much about application security. For that reason, many ModSecurity users focus on using third-party rule sets for attack detection. It’s a legitimate choice. Not everyone has the time and inclination to become an application security expert. Even if you end up not using any inspection rules whatsoever, the ability to write virtual patches is reason enough to use ModSecurity.

Rule sets

The use of existing rule sets is the easiest way to get to the proverbial low hanging fruit: invest small effort and reap big benefits. Traditionally, the main source of ModSecurity rules has been the Core Rule Set project, now hosted with OWASP. On the other hand, if you are keen to get your hands dirty, I can tell you that I draw great pleasure from writing my own rules. It’s a great way to learn about application security. The only drawback is that it requires a large time investment.

Remote logging and alert management GUI

ModSecurity is perfectly usable without a remote logging solution and without a GUI (the two usually go together). Significant error messages are copied to Apache’s error log. Complete transactions are usually logged to the audit log. With a notification system in place, you will know when something happens, and you can visit the audit logs to investigate. For example, many installations will divert Apache’s error log to a central logging system (via syslog).

The process does become more difficult with more than one sensor to manage. Furthermore, GUIs make the whole experience of monitoring much more pleasant. For that reason you will probably seek to install one of the available remote centralization tools and use their GUIs. The available options are listed in the Resources section, which follows.
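
As a small illustration of the syslog approach mentioned above, Apache's error log can be sent to the local syslog daemon with standard Apache directives. The facility name is an arbitrary choice. Forwarding from the local daemon to a central server is configured in the syslog daemon itself and is not shown here.

```apache
# Send Apache's error log, which carries significant ModSecurity messages,
# to the local syslog daemon using the local1 facility.
ErrorLog syslog:local1
LogLevel warn
```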

This section contains a list of assorted ModSecurity resources that can assist you in your work.

The following resources are the bare essentials:

ModSecurity web site

ModSecurity’s web site is probably going to be your main source of information. You should visit the web site from time to time, as well as subscribe to receive the updates from the blog.

Official documentation

The official ModSecurity documentation is maintained in a wiki, but copies of it are made for inclusion with every release.

Issue tracker

The ModSecurity issue tracker is the place you will want to visit for one of two reasons: to report a problem with ModSecurity itself (e.g., when you find a bug) or to check out the progress on the next (major or minor) version. Before reporting any problems, go through the Support Checklist, which will help you assemble the information required to help resolve your problem. Providing as much information as you can will help the developers understand and replicate the problem, and provide a fix (or a workaround) quickly.

Users’ mailing list

The users’ mailing list (mod-security-users@lists.sourceforge.net) is a general-purpose mailing list where you can discuss ModSecurity. Feel free to ask questions, propose improvements, and discuss ideas. That is the place where you’ll hear first about new ModSecurity versions.

ModSecurity@Freshmeat

If you subscribe to the users’ mailing list, you will generally find out about new versions of ModSecurity as soon as they are released. If you care only about version releases, however, you may consider subscribing to the new version notifications at the ModSecurity page at Freshmeat.

Core Rules mailing list

Starting with version 2, the Core Rules project is part of OWASP, and has a separate mailing list (owasp-modsecurity-core-rule-set@lists.owasp.org).

If you are interested in development work, you will need these:

Developers’ mailing list

The developers’ mailing list is generally a lonely place, but if you do decide to start playing with the ModSecurity source code, this list is the place to go to discuss your work.

Source code access

The source code of ModSecurity is hosted at a Subversion repository at SourceForge, which allows you to access it directly or through a web-based user interface.

FishEye interface

If you are not looking to start developing immediately but still want to have a look at the source code of ModSecurity, I recommend that you use the ModSecurity FishEye interface, which is much better than the stock interface available at SourceForge.

This chapter was your ModSecurity orientation. I introduced ModSecurity at a high level, discussed what it is and what it isn’t, and what it can do and what it cannot. I also gave you a taste of what ModSecurity is like and described common usage scenarios, as well as covered some of the interesting parts of its operation.

The foundation you now have should be enough to help you set off on a journey of ModSecurity exploration. The next chapter discusses installation.