5 Denial of Service Attacks

A denial of service (DoS) attack is an attempt to prevent legitimate users from using a service. This is usually done by consuming all of a resource used to provide the service. The resource targeted is typically one of the following:

CPU

Operating memory (RAM)

Bandwidth

Disk space

Sometimes, a less obvious resource is targeted. Many applications have fixed-length internal structures, and if an attacker can find a way to populate all of them quickly, the application can become unresponsive. A good example is the maximum number of Apache processes that can exist at any one time. Once the maximum is reached, new clients will be queued and not served.

DoS attacks are not unique to the digital world. They existed many years before anything digital was created. For example, someone sticking a piece of chewing gum into the coin slot of a vending machine prevents thirsty people from using the machine to fetch a refreshing drink.

In the digital world, DoS attacks can be acts of vandalism, too. They are performed for fun, pleasure, or even financial gain. In general, DoS attacks are a tough problem to solve because the Internet was designed on a principle that everyone plays by the rules.

You can become a victim of a DoS attack for various reasons:

Bad luck

In the worst case, you may be at the wrong place at the wrong time. Someone may think your web site is a good choice for an attack, or it may simply be the first web site that comes to mind. He may decide he does not like you personally and choose to make your life more difficult. (This is what happened to Steve Gibson, of http://www.grc.com fame, when a 13-year-old felt offended by the “script kiddies” term he used.)

Controversial content

Some may choose to attack you because they do not agree with the content you are providing. Many people believe disrupting your operation is acceptable in a fight for their cause. Controversial subjects such as the right to choose, globalization, and politics are likely to attract their attention and likely to cause them to act.

Unfair competition

In a fiercely competitive market, you may end up against competitors who will do anything to win. They may constantly do small things that slow you down or go as far as to pay someone to attack your resources.

Controversy over a site you host

If your job is to host other sites, the chances of being the target of a DoS attack increase significantly. With many web sites hosted on your servers, chances are good that someone will find the content of at least one of them offensive.

Extortion

Many extortion attempts have been reported. Companies whose revenue depends on their web presence are especially vulnerable. Only the wealthiest of companies can afford to pay for infrastructure that would resist well-organized DoS attacks. Only the cases where companies refused to pay are publicly known; we do not know how many companies have accepted blackmail terms.

DoS attacks can be broadly divided into five categories:

Network attacks

Self-inflicted attacks

Traffic spikes

Attacks on Apache (or other services in use)

Local attacks

These types of attacks are described in the rest of this chapter.

Network attacks are the most popular type of attack because they are easy to execute (automated tools are available) and difficult to defend against. Since these attacks are not specific to Apache, they fall outside the scope of this book and thus they are not covered in detail in the following sections. As a rule of thumb, only your upstream provider can defend you from attacks performed on the network level. At the very least you will want your provider to cut off the attacks at their routers so you do not have to pay for the bandwidth incurred by the attacks.

If you are sitting on a high-speed Internet link, it may be difficult for the attacker to successfully use brute-force attacks. You may be able to filter the offending packets on your router and continue with operations almost as normal (still paying for the incurred bandwidth, unfortunately).

SYN flood attacks also rely on sending a large number of packets, but their purpose is not to saturate the connection. Instead, they exploit weaknesses in the TCP/IP protocol to render the target’s network connection unusable. A TCP/IP connection can be thought of as a pipe connecting two endpoints. Three packets are needed to establish a connection: SYN, SYN+ACK, and ACK. This process is known as a three-way handshake, and it is illustrated in Figure 5-2.

In the normal handshaking process, a host wanting to initiate a connection sends a packet with the SYN flag set. Upon receiving the packet, and assuming the target port is open for connections, the target host sends back a packet with both the SYN and ACK flags set. Finally, the client host sends a third packet, with the ACK flag set. The connection is now established until one of the hosts sends a packet with the RST or FIN flag set.

The situation exploited in a SYN flood attack is that many operating systems have fixed-length queues to keep track of connections that are being opened. These queues are large but not unlimited. The attacker will exploit this by sending large numbers of SYN packets to the target without sending the final, third packet. The target will eventually remove the connection from the queue but not before the timeout for receiving the third packet expires. The only thing an attacker needs to do is send new SYN packets at a faster rate than the target removes them from the queue. Since the timeout is usually measured in minutes and the attacker can send thousands of packets in a second, this turns out to be very easy.

In a flood of bogus SYN packets, legitimate connection requests have very little chance of success.
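
A quick way to check whether you are being flooded (on Linux, and assuming the netstat utility is installed) is to count the connections sitting in the half-open SYN_RECV state; a consistently large number suggests an attack in progress:

# netstat -nt | grep -c SYN_RECV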

Linux comes with an effective defense against SYN flood attacks called SYN cookies. Instead of allocating space in the connection queue after receiving the first packet, the Linux kernel sends a cookie in the SYN+ACK packet and allocates space for the connection only after receiving the ACK packet. D. J. Bernstein, who created the SYN cookies idea, maintains a page where their history is documented: http://cr.yp.to/syncookies.html.

To enable this defense at runtime, type the following:

# echo 1 > /proc/sys/net/ipv4/tcp_syncookies

For permanent changes, put the same command in one of the startup scripts located in /etc/init.d (or /etc/rc.local on Red Hat systems).
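
On systems that use sysctl, an arguably cleaner way to make the change permanent is to add the setting to /etc/sysctl.conf:

# enable SYN cookies at every boot
net.ipv4.tcp_syncookies = 1

The setting is applied at boot time; running sysctl -p applies it immediately.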

The above attacks are annoying and sometimes difficult to handle, but they are in general easy to defend against because the source address of the attack is known. Unfortunately, nothing prevents attackers from faking the source address of the traffic they create. When such traffic reaches the attack target, the target will have no idea of the actual source and no reason to suspect the source address is a fake.

To make things worse, attackers will typically use a different (random) source address for each individual packet. At the receiving end there will be an overwhelmingly large amount of seemingly legitimate traffic. Not being able to isolate the real source, a target can do little. In theory, it is possible to trace the traffic back to the source. In practice, since the tracing is mostly a manual operation, it is very difficult to find technicians with the incentive and the time to do it.

Source address spoofing can largely be prevented by putting outbound traffic filtering in place. This type of filtering is known as egress filtering. In other words, organizations must make sure they are sending only legitimate traffic to the Internet. Each organization will most likely know the address space it covers, and it can tell whether the source address of an outgoing packet makes sense. If it makes no sense, the packet is most likely a part of a DoS attack. Having egress filtering in place helps the Internet community, but it also enables organizations to detect compromised hosts within their networks.
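
As an illustration, egress filtering can amount to a single packet-filtering rule at the network edge. The following iptables sketch assumes the internal address space is 192.0.2.0/24 and the external interface is eth1 (both are placeholders):

# iptables -A FORWARD -o eth1 ! -s 192.0.2.0/24 -j DROP

Any packet about to leave through the external interface with a source address outside the internal range is dropped instead of forwarded.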

Core providers may have trouble doing this since they need to be able to forward foreign traffic as part of their normal operation. Many other operators (cable and DSL providers) are in a better position to do this, and it is their customers that contribute most to DoS attacks.

Address spoofing and egress filtering are described in more detail in the SANS Institute paper “Egress filtering v0.2” at http://www.sans.org/y2k/egress.htm.

With most content-serving servers sitting on high-bandwidth links these days, attackers have trouble finding single systems they can compromise with connections fast enough to be used for attacks. That is, the target’s connection is usually fast enough that no single attacking system can do it much harm. This has led to the creation of a new breed of attacks. Distributed denial of service (DDoS) attacks are performed by a large number of systems, each contributing its share to form a massive attack network. The combined power is too great even for the largest web sites.

Distributed attacks are rarely performed manually. Instead, automated scripts are used to break into vulnerable systems and bring them under the control of a master system. Compromised systems are often referred to as zombies. Such a network of zombies can be used to attack targets at will. The other use for zombies is to send spam. An example zombie network is illustrated in Figure 5-3.

These DDoS scripts are often publicly available, and even people with very little skill can use them. Some well-known DDoS attack tools are:

To find more information on DDoS attacks and tools, follow these links:

Viruses and worms are often used for DoS attacks. The target address is sometimes hardcoded into the virus, so it is not necessary for a virus to communicate back to the master host to perform its attacks. These types of attacks are practically impossible to trace.

Administrators often have only themselves to blame for service failure. Leaving a service configured with default installation parameters is asking for trouble. Such systems are very susceptible to DoS attacks, and even a simple traffic spike can destabilize them.

One thing to watch for with Apache is memory usage. Assuming Apache is running in prefork mode, each request is handled by a separate process. To serve one hundred requests at one time, a hundred processes are needed. The maximum number of processes Apache can create is controlled with the MaxClients directive, which is set to 256 by default. This default value is often used in production, and that can cause problems if the server cannot cope with that many processes.

Figuring out the maximum number of Apache processes a server can accommodate is surprisingly difficult. On a Unix system, you cannot obtain precise figures on memory utilization. The best we can do is use the information we have, make assumptions, and then simulate traffic to see how memory is utilized in practice.

Looking at the output of the ps command, we can see how much memory a single process takes (look at the RSZ column as it shows the amount of physical memory in use by a process):

# ps -A -o pid,vsz,rsz,command
  PID   VSZ  RSZ COMMAND
 3587  9580 3184 /usr/local/apache/bin/httpd
 3588  9580 3188 /usr/local/apache/bin/httpd
 3589  9580 3188 /usr/local/apache/bin/httpd
 3590  9580 3188 /usr/local/apache/bin/httpd
 3591  9580 3188 /usr/local/apache/bin/httpd
 3592  9580 3188 /usr/local/apache/bin/httpd

In this example, each Apache instance takes 3.2 MB. Assuming the default Apache configuration is in place, this server requires roughly 820 MB of RAM (256 × 3.2 MB) to reach the peak capacity of serving 256 requests in parallel, and this is only assuming additional memory for CGI scripts and dynamic pages will not be required.
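
To estimate the memory used by Apache on a running server, you can sum the resident sizes of all its processes. Here is a rough sketch (it assumes the binary is named httpd, and it overestimates slightly because memory shared between processes is counted once per process):

# ps -C httpd -o rss= | awk '{ total += $1 } END { print total/1024 " MB" }'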

Do not be surprised if you see systems with very large Apache processes. Apache installations with a large number of virtual servers and complex configurations require large amounts of memory just to store the configuration data. Apache process sizes in excess of 30 MB are common.

So, suppose you are running a busy, shared hosting server with hundreds of virtual hosts, the size of each Apache process is 30 MB, and some of the sites have over 200 requests at the same time. How much memory do you need? Not as much as you may think.

Most modern operating systems (Linux included) have a feature called copy-on-write, and it is especially useful in cases like this one. When a process forks to create a new process (such as an Apache child), the kernel allocates the required amount of memory to accommodate the size of the process. However, this will be virtual memory (of which there is plenty), not physical memory (of which there is little). Memory locations of both processes will point to the same physical memory location. Only when one of the processes attempts to make changes to data will the kernel separate the two memory locations and give each process its own physical memory segment. Hence, the name copy-on-write.

As I mentioned, this works well for us. For the most part, Apache configuration data does not change during the lifetime of the server, and this allows the kernel to use one memory segment for all Apache processes.

Having an application that communicates with a database on every page request, when it is not necessary to do so, can be a big problem, but it often happens with poorly written web applications. There is nothing wrong with this approach when the number of visitors is low, but it scales poorly.

The first bottleneck may be the maximum number of connections the database allows. Each request requires one database connection. Therefore, the database server must be configured to support as many connections as there can be web server processes. Connecting to a database can take time, which can be much better spent processing the request. Many web applications support a feature called persistent database connections. When this feature is enabled, a connection is kept opened at the end of script execution and reused when the next request comes along. The drawback is that keeping database connections open like this puts additional load on the database. Even an Apache process that does nothing but wait for the next client keeps the database connection open.

Talking to a database consumes a large amount of processor time. A large number of concurrent page requests will force the server to give all its processor time to the database. However, for most sites this work is unnecessary, since the software and the database spend most of their time generating identical copies of the same web page. A better approach is to save the web page to disk after it is generated for the first time and avoid talking to the database on subsequent requests.

The most flexible approach is to perform page caching at the application level since that would allow the cached version to be deleted at the same time the page is updated (to avoid serving stale content). Doing it on any other level (using mod_cache in Apache 2, for example) would mean having to put shorter expiration times in place and would require the cache to be refreshed more often. However, mod_cache can serve as a good short-term solution since it can be applied quickly to any application.
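
A minimal mod_cache sketch for Apache 2 follows; the module paths and values are illustrative, so check the documentation of your Apache version before relying on them:

# load the caching framework and its disk storage backend
LoadModule cache_module modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so

# cache everything below the site root on disk
CacheEnable disk /
CacheRoot /var/cache/apache
# treat responses without explicit freshness information
# as fresh for five minutes
CacheDefaultExpire 300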

You should never underestimate the potential mistakes made by beginning programmers. More than once I have seen web applications store images into a database and then fetch several images from the database on every page request. Such usage of the database brings a server to a crawl even for a modest amount of site traffic.

The concept of cacheability is important if you are preparing for a period of increased traffic, but it also can and should be used as a general technique to lower bandwidth consumption. It is said that content is cacheable when it is accompanied by HTTP response headers that provide information about when the content was created and how long it will remain fresh. Making content cacheable results in browsers and proxies sending fewer requests because they do not bother checking for updates of the content they know is not stale, and this results in lower bandwidth usage.

By default, Apache will do a reasonable job of making static documents cacheable. After having received a static page or an image from the web server once, a browser makes subsequent requests for the same resource conditional. It essentially says, “Send me the resource identified by the URL if it has not changed since I last requested it.” Instead of returning the status 200 (OK) with the resource attached, Apache returns 304 (Not Modified) with no body.
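
You can go a step further and attach explicit freshness information to static resources, which allows clients to skip even the conditional requests. A sketch using mod_expires (the one-week period is illustrative; choose one that matches how often your content changes):

LoadModule expires_module modules/mod_expires.so

ExpiresActive On
# consider images fresh for a week after retrieval
ExpiresByType image/gif "access plus 1 week"
ExpiresByType image/jpeg "access plus 1 week"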

Problems can arise when content is served through applications that are not designed with cacheability in mind. Most application servers completely disable caching under the (valid) assumption that it is better for applications not to have responses cached. This does not work well for content-serving web sites.

A good approach is to test the cacheability of an application with a cacheability engine and then work with the programmers to enhance the application by adding support for HTTP caching.

Detailed information about caching and cacheability is available at:

Assume you have chosen to serve a maximum of one hundred requests at any given time. After performing a few tests from the local network, you may have seen that Apache serves the requests quickly, so you think you will never reach the maximum. There are some things to watch for in real life:

Slow clients

Measuring the speed of request serving from the local network can be deceptive. Real clients will connect at a range of speeds, many of them using slow modems. Apache will be ready to serve the request quickly, but the clients will not be ready to receive it. A 20-KB page, assuming the client uses a modem running at maximum speed without any other bottlenecks (a brave assumption), can take over six seconds to serve. During this period, the Apache process serving it will not be able to do anything else.

Large files

Large files take longer to download than small files. If you make a set of large files available for download, be aware that Apache will use one process for each file being downloaded. Worse, users may use special download tools (known as download accelerators) that open multiple simultaneous requests for the same file. Since the bottleneck for most users is their own network connection, these additional requests do nothing to improve the download speed; they only tie up additional Apache processes.

Keep-Alive functionality

Keep-Alive is an HTTP protocol feature that allows clients to remain connected to the server between requests. The idea is to avoid having to re-establish TCP/IP connections with every request. Most web site users are slow with their requests, so the time Apache waits before realizing the next request is not coming is time wasted. The timeout is set to 15 seconds by default, which means up to 15 seconds of one process doing nothing. You can keep this feature enabled until you reach the maximum capacity of the server; if that happens, you can turn it off or reduce the timeout Apache uses to wait for the next request (see the configuration sketch below). Newer Apache versions are likely to be improved to allow an Apache process to serve some other client while it is waiting for a Keep-Alive client to come up with another request.
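
For example, the following sketch keeps the feature enabled but shortens the wait (the five-second value is illustrative; choose one based on your own traffic patterns):

KeepAlive On
# wait at most five seconds for the next request on a connection
KeepAliveTimeout 5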

Unless you perform tests beforehand, you will never know how well the server will operate under a heavy load. Many free load-testing tools exist. I recommend you download one of the tools listed at:

“Web Site Test Tools and Site Management Tools,” maintained by Rick Hower (http://www.softwareqatest.com)

A sudden spike in the web server traffic can have the same effect as a DoS attack. A well-configured server will cope with the demand, possibly slowing down a little or refusing some clients. If the server is not configured properly, it may crash.

Traffic spikes occur for many reasons, and some of them may be normal. A significant event will cause people to log on and search for more information on the subject. If a site often takes a beating in spite of being properly configured, perhaps it is time to upgrade the server or the Internet connection.

The following sections describe the causes and potential solutions for traffic spikes.

Bandwidth stealing (also known as hotlinking) is a common problem on the Internet. It refers to the practice of rogue sites linking directly to files (often images) residing on other sites (victims). To users, it looks like the files are being provided by the rogue site, while the owner of the victim site is paying for the bandwidth.

One way to deal with this is to use mod_rewrite to reject all requests for images that do not originate from our site. We can do this because browsers send the address of the originating page in the Referer header field of every request. Valid requests contain the address of our site in this field, and this allows us to reject everything else.

# enable mod_rewrite processing (required once per configuration context)
RewriteEngine On

# allow empty referrers, for when a user types the URL directly
RewriteCond %{HTTP_REFERER} !^$

# allow users coming from apachesecurity.net
RewriteCond %{HTTP_REFERER} !^http://www\.apachesecurity\.net [nocase]

# only prevent images from being hotlinked - otherwise
# no one would be able to link to the site at all!
RewriteRule (\.gif|\.jpg|\.png|\.swf)$ - [forbidden]

Some people have also reported attacks by competitors with busier sites, performed by embedding many invisible, tiny (typically 1x1-pixel) frames pointing to the victim’s site. Innocent visitors to the competitor’s web site would open an innocent-looking web page, which would then open dozens of connections to the target web site, usually requesting large images, all without the users realizing what is happening. Luckily, these attacks can be detected and prevented with the mod_rewrite trick described above.

With other types of attacks being easy, almost trivial, to perform, hardly anyone bothers attacking Apache directly. Under some circumstances, Apache-level attacks can be easier to perform because they do not require as much bandwidth as other types of attacks. Some Apache-level attacks can be performed with as few as a dozen bytes.

Less-skilled attackers will often choose this type of attack because it is so obvious.

Programming errors come in different shapes. Many have security implications. A programming error that can be exploited to abuse system resources should be classified as a vulnerability. For example, in 1998, a programming error was discovered in Apache: specially crafted small-sized requests caused Apache to allocate large amounts of memory. For more information, see:

“YA Apache DoS Attack,” discovered by Dag-Erling Smørgrav (http://marc.theaimsgroup.com/?l=bugtraq&m=90252779826784&w=2)

More serious vulnerabilities, such as nonexploitable buffer overflows, can cause the server to crash when attacked. (Exploitable buffer overflows are not likely to be used as DoS attacks since they can and will be used instead to compromise the host.)

When Apache is running in prefork mode, as it usually is, there are many instances of the server running in parallel. If a child crashes, the parent process creates a new one in its place. To disrupt the operation, the attacker has to send a large number of requests constantly.

In a multithreaded (not prefork) mode of operation, there is only one server process. A crash while processing a request will take the whole server down and make it unavailable. This will be easy to detect, assuming you have server monitoring in place (if you do not, you will find out when the angry calls from your customers start).

Vulnerabilities are easy to resolve in most cases: you need to patch the server or upgrade to a version that fixes the problem. Things can be unpleasant if you are running a vendor-supplied version of Apache, and the vendor is slow in releasing the upgrade.

Any of the widely available web server load-testing tools can be used to attack a web server. It would be a crude, visible, but effective attack nevertheless. One such tool, ab (short for Apache Benchmark), is distributed with Apache. To perform a simple attack against your own server, execute the following, replacing the URL with the URL for your server.

$ /usr/local/apache/bin/ab -n 1000 -c 100 http://www.yourserver.com/

Choose the concurrency level (the -c switch) to be the same as or larger than the maximum number of Apache processes allowed (MaxClients). The slower the connection to the server, the more effect the attack will have. You will probably find it difficult to perform the attack from the local network.

To defend against this type of attack, first identify the IP address the attacker is coming from and then deny it access to the server on the network firewall. You can do this manually, or you can set up an automated script. If you choose the latter approach, make sure your detection scripts will not make mistakes that would cause legitimate users to be denied service. There is no single method of detection that can be used to detect all attack types. Here are some possible detection approaches:

I designed three tools that can be helpful with brute-force DoS attacks. All three are available for download from http://www.apachesecurity.net.
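
Whichever detection approach you choose, the response can be automated. The following shell sketch illustrates the idea only; the log location, sample size, and threshold are assumptions, and the raw request count is a deliberately crude metric. It blocks any IP address responsible for more than 500 of the last 10,000 requests:

#!/bin/sh
# block clients that dominate the recent access log
LIMIT=500
tail -n 10000 /usr/local/apache/logs/access_log | awk '{ print $1 }' \
    | sort | uniq -c | sort -rn \
    | while read count ip; do
        if [ "$count" -gt "$LIMIT" ]; then
            # deny the offender at the packet-filter level
            iptables -A INPUT -s "$ip" -j DROP
        fi
    done

Run carelessly, a script like this becomes a self-inflicted DoS attack, so have it log everything it blocks and review the entries regularly.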

The brute-force attacks we have discussed are easy to perform but may require a lot of bandwidth, and they are easy to spot. With some programming skills, the attack can be improved to leave no trace in the logs and to require little bandwidth.

The trick is to open a connection to the server but not send a single byte. Opening the connection and waiting requires almost no resources from the attacker, but it permanently ties up one Apache process, which waits patiently for a request. Apache will wait until the timeout expires and then close the connection. As of Apache 1.3.31, request-line timeouts are logged to the access log (with status code 408). Request-line timeout messages appear in the error log with the level info. Apache 2 does not log such messages to the error log, but efforts are underway to add the same functionality as is present in the 1.x branch.

Opening just one connection will not disrupt anything, but opening hundreds of connections at the same time will make all available Apache processes busy. When the maximum number of processes is reached, Apache will log the event into the error log (“server reached MaxClients setting, consider raising the MaxClients setting”) and start holding new connections in a queue. This type of attack is similar to the SYN flood network attack we discussed earlier. If we continue to open new connections at a high rate, legitimate requests will hardly be served.

If we start opening our connections at an even higher rate, the waiting queue itself will become full (511 connections are queued by default; a different value can be configured using the ListenBacklog directive), and new connections will be rejected.
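
If you expect bursts of legitimate traffic rather than sustained attacks, enlarging the queue may help; a sketch (the value is illustrative, and the kernel imposes its own ceiling, such as net.core.somaxconn on Linux):

# queue up to 1024 connections waiting to be accepted
ListenBacklog 1024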

Defending against this type of attack is difficult. The only solution is to monitor server performance closely (in real-time) and deny access from the attacker’s IP address when attacked.

Not all attacks come from the outside. Consider the following points:

Having a malicious user on the system can have various consequences, but in this chapter, we are concerned only with DoS attacks. What can such a user do? As it turns out, most systems are not prepared to handle DoS attacks, and it is easy to bring the server down from the inside via the following possibilities:

To keep the system under control, you need to:

Process limits, process accounting, and kernel auditing are described in the following sections.
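
As a preview, on Linux systems that use the PAM limits module, per-user limits can be set in /etc/security/limits.conf. A minimal sketch, with a placeholder username and illustrative values:

# format: <domain> <type> <item> <value>
# allow user1 at most 50 processes
user1  hard  nproc  50
# forbid user1 files larger than 10240 KB (10 MB)
user1  hard  fsize  10240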

Traffic shaping is a technique that establishes control over web server traffic. Many Apache modules perform traffic shaping, and their goal is usually to slow down a (client) IP address or to control the bandwidth consumption on the per-virtual host level. As a side effect, these modules can be effective against certain types of DoS attacks. The following are some of the more popular traffic-shaping modules:

One module is designed specifically as a remedy for Apache DoS attacks:

The mod_dosevasive module allows you to specify a maximum number of requests executed by the same IP address against one Apache child. If the threshold is reached, the IP address is blacklisted for a time period you specify. You can send an email message or execute a system command (to talk to a firewall, for example) when that happens.
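
A configuration sketch follows. The directive names are those documented with the module, but the values are illustrative and should be tuned to your traffic:

# blacklist a client that requests the same page more than twice
# per second, or makes more than 50 requests per second in total
DOSPageCount 2
DOSPageInterval 1
DOSSiteCount 50
DOSSiteInterval 1
# keep the offending address blacklisted for 60 seconds
DOSBlockingPeriod 60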

The mod_dosevasive module is not as good as it could be because it does not use shared memory to keep information about previous requests persistent. Instead, the information is kept with each child. Other children know nothing about abuse against one of them. When a child serves the maximum number of requests and dies, the information goes with it.

Blacklisting IP addresses can be dangerous. An attempt to prevent DoS attacks can become a self-inflicted DoS attack because users in general do not have unique IP addresses. Many users browse through proxies or are hidden behind a network address translation (NAT) system. Blacklisting a proxy will cause all users behind it to be blacklisted. If you really must use one of the traffic-shaping techniques that uses the IP address of the client for that purpose, do the following:

With some exceptions (such as vulnerabilities that can be easily fixed), DoS attacks are very difficult to defend against. The main problem remains distinguishing legitimate requests from those belonging to an attack.

The chapter concludes with a strategy for handling DoS attacks: