A denial of service (DoS) attack is an attempt to prevent legitimate users from using a service. This is usually done by consuming all of a resource used to provide the service. The resource targeted is typically one of the following:
CPU
Operating memory (RAM)
Bandwidth
Disk space
Sometimes, a less obvious resource is targeted. Many applications have fixed-length internal structures, and if an attacker can find a way to populate all of them quickly, the application can become unresponsive. A good example is the maximum number of Apache processes that can exist at any one time. Once the maximum is reached, new clients will be queued and not served.
DoS attacks are not unique to the digital world. They existed many years before anything digital was created. For example, someone sticking a piece of chewing gum into the coin slot of a vending machine prevents thirsty people from using the machine to fetch a refreshing drink.
In the digital world, DoS attacks can be acts of vandalism, too. They are performed for fun, pleasure, or even financial gain. In general, DoS attacks are a tough problem to solve because the Internet was designed on a principle that everyone plays by the rules.
You can become a victim of a DoS attack for various reasons:
In the worst case, you may be at the wrong place at the wrong time. Someone may think your web site is a good choice for an attack, or it may simply be the first web site that comes to mind. He may decide he does not like you personally and choose to make your life more troubled. (This is what happened to Steve Gibson, of http://www.grc.com fame, when a 13-year-old felt offended by the “script kiddies” term he used.)
Some may choose to attack you because they do not agree with the content you are providing. Many people believe disrupting your operation is acceptable in a fight for their cause. Controversial subjects such as the right to choose, globalization, and politics are likely to attract their attention and likely to cause them to act.
In a fiercely competitive market, you may end up against competitors who will do anything to win. They may constantly do small things that slow you down or go as far as to pay someone to attack your resources.
If your job is to host other sites, the chances of being attacked via a DoS attack increase significantly. With many web sites hosted on your servers, chances are good that someone will find one of the sites offensive.
Many extortion attempts have been reported in the past. Companies whose revenue depends on their web presence are especially vulnerable. Only the wealthiest of companies can afford to pay for infrastructure that would resist well-organized DoS attacks. Only the cases where companies refused to pay are publicly known; we do not know how many companies have accepted blackmail terms.
DoS attacks can be broadly divided into five categories:
Network attacks
Self-inflicted attacks
Traffic spikes
Attacks on Apache (or other services in general—e.g., FTP)
Local attacks
These types of attacks are described in the rest of this chapter.
Network attacks are the most popular type of attack because they are easy to execute (automated tools are available) and difficult to defend against. Since these attacks are not specific to Apache, they fall outside the scope of this book and thus they are not covered in detail in the following sections. As a rule of thumb, only your upstream provider can defend you from attacks performed on the network level. At the very least you will want your provider to cut off the attacks at their routers so you do not have to pay for the bandwidth incurred by the attacks.
The simplest network attacks target weaknesses in implementations of the TCP/IP protocol. Some implementations are not good at handling error conditions and cause systems to crash or freeze. Some examples of this type of attack are:
Sending very large Internet Control Message Protocol (ICMP) packets. This type of attack, known as the Ping of death, caused crashes on some older Windows systems.
Setting invalid flags on TCP/IP packets.
Setting the destination and the source IP addresses of a TCP packet to the address of the attack target (Land attack).
These types of attacks have only historical significance, since most TCP/IP implementations are no longer vulnerable.
In the simplest form, an effective network attack can be performed from a single host with a fast Internet connection against a host with a slower Internet connection. Using brute force, the attacker sends large numbers of packets to flood the target and disrupt its operations. The concept is illustrated in Figure 5-1.
At the same time, this type of attack is the easiest to defend against. All you need to do is examine the incoming traffic (e.g., using a packet sniffer like tcpdump), discover the IP address the traffic is coming from, and instruct your upstream provider to block the address at their router.
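For example, a quick look at incoming web traffic with tcpdump might use a command like the following (the interface name eth0 is an assumption; adjust it for your system):

# tcpdump -n -i eth0 -c 100 'tcp and dst port 80'

The -n switch disables name resolution, and -c stops the capture after 100 packets, which is usually enough to reveal a dominant source address.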
At first glance, you may want to block the attacker’s IP address on your own firewall but that will not help. The purpose of this type of attack is to saturate the Internet connection. By the time a packet reaches your router (or server), it has done its job.
Be prepared and have contact details of your upstream provider (or server hosting company) handy. Larger companies have many levels of support, and quickly reaching someone knowledgeable may be difficult. Research telephone numbers in advance. If you can, get to know your administrators before you need their help.
Steve Gibson wrote a fascinating story about his first fight against a DoS attack:
The Gibson Research Corporation’s “Denial Of Service Investigation & Exploration Pages” (http://www.grc.com/dos/)
If you are sitting on a high-speed Internet link, it may be difficult for the attacker to successfully use brute-force attacks. You may be able to filter the offending packets on your router and continue with operations almost as normal (still paying for the incurred bandwidth, unfortunately).
SYN flood attacks also rely on sending a large number of packets, but their purpose is not to saturate the connection. Instead, they exploit weaknesses in the TCP/IP protocol to render the target’s network connection unusable. A TCP/IP connection can be thought of as a pipe connecting two endpoints. Three packets are needed to establish a connection: SYN, SYN+ACK, and ACK. This process is known as a three-way handshake, and it is illustrated in Figure 5-2.
In the normal handshaking process, a host wanting to initiate a connection sends a packet with a SYN flag set. Upon receiving the packet and assuming the server is open for connections on the target port, the target host sends back a packet with both SYN and ACK flags set. Finally, the client host sends a third packet with the ACK flag set. The connection is now established until one of the hosts sends a packet with the RST or FIN flag set.
The situation exploited in a SYN flood attack is that many operating systems have fixed-length queues to keep track of connections that are being opened. These queues are large but not unlimited. The attacker exploits this by sending large numbers of SYN packets to the target without sending the final, third packet. The target will eventually remove the connection from the queue, but not before the timeout for receiving the third packet expires. The only thing an attacker needs to do is send new SYN packets at a faster rate than the target removes them from the queue. Since the timeout is usually measured in minutes and the attacker can send thousands of packets in a second, this turns out to be very easy.
In a flood of bogus SYN packets, legitimate connection requests have very little chance of success.
Linux comes with an effective defense against SYN flood attacks called SYN cookies. Instead of allocating space in the connection queue after receiving the first packet, the Linux kernel just sends a cookie in the SYN+ACK packet and allocates space for the connection only after receiving the ACK packet. D. J. Bernstein created the SYN cookies idea and maintains a page where their history is documented: http://cr.yp.to/syncookies.html.
To enable this defense at runtime, type the following:
# echo 1 > /proc/sys/net/ipv4/tcp_syncookies
For permanent changes, put the same command in one of the startup scripts located in /etc/init.d (or /etc/rc.local on Red Hat systems).
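On many distributions, the same setting can also be made permanent through /etc/sysctl.conf:

# /etc/sysctl.conf
# enable SYN cookies at boot
net.ipv4.tcp_syncookies = 1

Running sysctl -p afterward applies the settings from the file without a reboot.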
The above attacks are annoying and sometimes difficult to handle but in general easy to defend against because the source address of the attack is known. Unfortunately, nothing prevents attackers from faking the source address of the traffic they create. When such traffic reaches the attack target, the target will have no idea of the actual source and no reason to suspect the source address is a fake.
To make things worse, attackers will typically use a different (random) source address for each individual packet. At the receiving end there will be an overwhelmingly large amount of seemingly legitimate traffic. Not being able to isolate the real source, a target can do little. In theory, it is possible to trace the traffic back to the source. In practice, since the tracing is mostly a manual operation, it is very difficult to find technicians with the incentive and the time to do it.
Source address spoofing can largely be prevented by putting outbound traffic filtering in place. This type of filtering is known as egress filtering. In other words, organizations must make sure they are sending only legitimate traffic to the Internet. Each organization will most likely know the address space it covers, and it can tell whether the source address of an outgoing packet makes sense. If it makes no sense, the packet is most likely a part of a DoS attack. Having egress filtering in place helps the Internet community, but it also enables organizations to detect compromised hosts within their networks.
Core providers may have trouble doing this since they need to be able to forward foreign traffic as part of their normal operation. Many other operators (cable and DSL providers) are in a better position to do this, and it is their customers that contribute most to DoS attacks.
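As an illustration, on a Linux router egress filtering can be as simple as the following iptables rule; the address range (192.0.2.0/24) and interface name (eth0) are examples only:

# iptables -A FORWARD -o eth0 ! -s 192.0.2.0/24 -j DROP

The rule drops any forwarded packet leaving through the external interface whose source address does not belong to our own address space.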
Address spoofing and egress filtering are described in more detail in the SANS Institute paper “Egress filtering v0.2” at http://www.sans.org/y2k/egress.htm.
With most content-serving servers sitting on high bandwidth links these days, attackers are having trouble finding single systems they can compromise that have connections fast enough to be used for attacks. That is, most systems’ network connections are fast enough that one single system cannot do much harm to another system. This has led to the creation of a new breed of attacks. Distributed denial of service (DDoS) attacks are performed by a large number of systems, each contributing its share to form a massive attack network. The combined power is too big even for the largest web sites.
When Yahoo! was attacked in February 2000, the combined bandwidth targeted at them was around 1 Gbps at its peak, with hundreds of attacking stations participating in the attack.
Distributed attacks are rarely performed manually. Instead, automated scripts are used to break into vulnerable systems and bring them under the control of a master system. Compromised systems are often referred to as zombies. Such a network of zombies can be used to attack targets at will. The other use for zombies is to send spam. An example zombie network is illustrated in Figure 5-3.
These DDoS scripts are often publicly available and even people with very little skill can use them. Some well-known DDoS attack tools are:
Trinoo
Tribe Flood Network (TFN)
Tribe Flood Network 2000 (TFN2K)
Stacheldraht (German for “barbed wire”)
To find more information on DDoS attacks and tools, follow these links:
The Packet Storm web site at http://www.packetstormsecurity.org/distributed/
The “DDoS Attacks/Tools” web page maintained by David Dittrich (http://staff.washington.edu/dittrich/misc/ddos/)
Viruses and worms are often used for DoS attacks. The target address is sometimes hardcoded into the virus, so it is not necessary for a virus to communicate back to the master host to perform its attacks. These types of attacks are practically impossible to trace.
Address spoofing is easy to perform, and most DoS attacks use it. Because target systems trust the source address received in a TCP packet, address spoofing allows attackers to attack a target through other, genuine Internet systems:
The attacker sends a packet to a well-connected system and forges the source address to make the packet look like it is coming from the target of his attack. The packet may request a connection to be established (SYN).
That system receives the packet and replies (to the target, not to the actual source) with a SYN+ACK response.
The target is now being attacked by an innocent system.
The flow of data from the attacker to the systems being used for reflection is usually low in volume, low enough not to motivate their owners to investigate the origin. The combined power of traffic against the target can be devastating. These types of attacks are usually distributed and are known as distributed reflection denial of service (DRDoS) attacks (the concept of such attacks is illustrated in Figure 5-4). Steve Gibson wrote a follow-up to his story on DoS attacks, including coverage of DRDoS attacks:
The Gibson Research Corporation’s “Distributed Reflection Denial of Service” page (http://www.grc.com/dos/drdos.htm)
Administrators often have only themselves to blame for service failure. Leaving a service configured with default installation parameters is asking for trouble. Such systems are very susceptible to DoS attacks, and even a simple traffic spike can destabilize them.
One thing to watch for with Apache is memory usage. Assuming Apache is running in prefork mode, each request is handled by a separate process. To serve one hundred requests at one time, a hundred processes are needed. The maximum number of processes Apache can create is controlled with the MaxClients directive, which is set to 256 by default. This default value is often used in production, and that can cause problems if the server cannot cope with that many processes.
Figuring out the maximum number of Apache processes a server can accommodate is surprisingly difficult. On a Unix system, you cannot obtain precise figures on memory utilization. The best thing we can do is to use the information we have, make assumptions, and then simulate traffic to correct memory utilization issues.
Looking at the output of the ps command, we can see how much memory a single process takes (look at the RSZ column, as it shows the amount of physical memory in use by a process):
# ps -A -o pid,vsz,rsz,command
PID VSZ RSZ COMMAND
3587 9580 3184 /usr/local/apache/bin/httpd
3588 9580 3188 /usr/local/apache/bin/httpd
3589 9580 3188 /usr/local/apache/bin/httpd
3590 9580 3188 /usr/local/apache/bin/httpd
3591 9580 3188 /usr/local/apache/bin/httpd
3592 9580 3188 /usr/local/apache/bin/httpd
In this example, each Apache instance takes 3.2 MB. Assuming the default Apache configuration is in place, this server requires 1 GB of RAM to reach the peak capacity of serving 256 requests in parallel, and that assumes no additional memory will be needed for CGI scripts and dynamic pages.
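A rough total for all Apache processes can be obtained with a one-liner such as the following (assuming the processes are named httpd, as in the listing above):

# ps -o rsz= -C httpd | awk '{ total += $1 } END { print total/1024 " MB" }'

Keep in mind that RSZ counts shared pages in every process, so the sum overstates actual physical memory usage; see the discussion of copy-on-write later in this section.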
Most web servers do not operate at the edge of their capacity. Your initial goal is to limit the number of processes to prevent server crashes. If you set the maximum number of processes to a value that does not make full use of the available memory, you can always change it later when the need for more processes appears.
Do not be surprised if you see systems with very large Apache processes. Apache installations with a large number of virtual servers and complex configurations require large amounts of memory just to store the configuration data. Apache process sizes in excess of 30 MB are common.
So, suppose you are running a busy, shared hosting server with hundreds of virtual hosts, the size of each Apache process is 30 MB, and some of the sites have over 200 requests at the same time. How much memory do you need? Not as much as you may think.
Most modern operating systems (Linux included) have a feature called copy-on-write, and it is especially useful in cases like this one. When a process forks to create a new process (such as an Apache child), the kernel allocates the required amount of memory to accommodate the size of the process. However, this will be virtual memory (of which there is plenty), not physical memory (of which there is little). Memory locations of both processes will point to the same physical memory location. Only when one of the processes attempts to make changes to data will the kernel separate the two memory locations and give each process its own physical memory segment. Hence, the name copy-on-write.
As I mentioned, this works well for us. For the most part, Apache configuration data does not change during the lifetime of the server, and this allows the kernel to use one memory segment for all Apache processes.
Having an application that communicates to a database on every page request, when it is not necessary to do so, can be a big problem. But it often happens with poorly written web applications. There is nothing wrong with this concept when the number of visitors is low, but the concept scales poorly.
The first bottleneck may be the maximum number of connections the database allows. Each request requires one database connection. Therefore, the database server must be configured to support as many connections as there can be web server processes. Connecting to a database can take time, which can be much better spent processing the request. Many web applications support a feature called persistent database connections. When this feature is enabled, a connection is kept opened at the end of script execution and reused when the next request comes along. The drawback is that keeping database connections open like this puts additional load on the database. Even an Apache process that does nothing but wait for the next client keeps the database connection open.
Unlike with most database servers, establishing a connection with a MySQL server is quick. It may therefore be possible to turn persistent connections off in software (e.g., in the PHP engine) and create a connection on every page hit, which will reduce the maximum number of concurrent connections to the database.
Talking to a database consumes a large amount of processor time. A large number of concurrent page requests will force the server to give all processor time to the database. However, for most sites this is not needed since the software and the database spend time delivering identical versions of the same web page. A better approach would be to save the web page to the disk after it is generated for the first time and avoid talking to the database on subsequent requests.
The most flexible approach is to perform page caching at the application level since that would allow the cached version to be deleted at the same time the page is updated (to avoid serving stale content). Doing it on any other level (using mod_cache in Apache 2, for example) would mean having to put shorter expiration times in place and would require the cache to be refreshed more often. However, mod_cache can serve as a good short-term solution since it can be applied quickly to any application.
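A minimal mod_cache setup for Apache 2 might look like the following sketch; the module filenames, cache location, and expiration time are illustrative and depend on your build:

LoadModule cache_module modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so
# cache everything under the site root on disk
CacheEnable disk /
CacheRoot /var/cache/apache
# consider cached responses fresh for up to an hour
CacheDefaultExpire 3600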
You should never underestimate the potential mistakes made by beginning programmers. More than once I have seen web applications store images into a database and then fetch several images from the database on every page request. Such usage of the database brings a server to a crawl even for a modest amount of site traffic.
The concept of cacheability is important if you are preparing for a period of increased traffic, but it also can and should be used as a general technique to lower bandwidth consumption. It is said that content is cacheable when it is accompanied by HTTP response headers that provide information about when the content was created and how long it will remain fresh. Making content cacheable results in browsers and proxies sending fewer requests because they do not bother checking for updates of the content they know is not stale, and this results in lower bandwidth usage.
By default, Apache will do a reasonable job of making static documents cacheable. After having received a static page or an image from the web server once, a browser makes subsequent requests for the same resource conditional. It essentially says, “Send me the resource identified by the URL if it has not changed since I last requested it.” Instead of returning the status 200 (OK) with the resource attached, Apache returns 304 (Not Modified) with no body.
Problems can arise when content is served through applications that are not designed with cacheability in mind. Most application servers completely disable caching under the (valid) assumption that it is better for applications not to have responses cached. This does not work well for content-serving web sites.
A good thing to do would be to use a cacheability engine to test the cacheability of an application and then talk to programmers about enhancing the application by adding support for HTTP caching.
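For content that is known to be safe to cache, explicit freshness information can be added with mod_expires. A minimal sketch, with illustrative lifetimes:

ExpiresActive On
# images rarely change; let clients keep them for a week
ExpiresByType image/gif "access plus 1 week"
ExpiresByType image/jpeg "access plus 1 week"
# HTML changes more often, so keep its lifetime short
ExpiresByType text/html "access plus 1 hour"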
Detailed information about caching and cacheability is available at:
“Caching Tutorial for Web Authors and Webmasters” by Mark Nottingham (http://www.mnot.net/cache_docs/)
“Cacheability Engine” (http://www.mnot.net/cacheability/)
Assume you have chosen to serve a maximum of one hundred requests at any given time. After performing a few tests from the local network, you may have seen that Apache serves the requests quickly, so you think you will never reach the maximum. There are some things to watch for in real life:
Measuring the speed of request serving from the local network can be deceptive. Real clients will connect at various speeds, with many of them using slow modems. Apache will be ready to serve the request quickly, but clients will not be ready to receive it. A 20-KB page, assuming the client uses a modem running at maximum speed without any other bottlenecks (a brave assumption), can take over six seconds to serve. During this period, one Apache process will not be able to do anything else.
Large files take longer to download than small files. If you make a set of large files available for download, you need to be aware that Apache will use one process for each file being downloaded. Worse than that, users can have special download software packages (known as download accelerators), which open multiple download requests for the same file. However, for most users, the bottleneck is their network connection, so these additional download requests have no impact on the download speed. Their network connection is already used up.
Keep-Alive is an HTTP protocol feature that allows clients to remain connected to the server between requests. The idea is to avoid having to re-establish TCP/IP connections with every request. Most web site users are slow with their requests, so the time Apache waits, before realizing the next request is not coming, is time wasted. The timeout is set to 15 seconds by default, which means 15 seconds for one process to do nothing. You can keep this feature enabled until you reach the maximum capacity of the server. If that happens you can turn it off or reduce the timeout Apache uses to wait for the next request to come. Newer Apache versions are likely to be improved to allow an Apache process to serve some other client while it is waiting for a Keep-Alive client to come up with another request.
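The relevant directives are shown below; the five-second timeout is only an example of a value more conservative than the default:

KeepAlive On
MaxKeepAliveRequests 100
# lower the default 15-second wait to reduce idle processes
KeepAliveTimeout 5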
Unless you perform tests beforehand, you will never know how well the server will operate under a heavy load. Many free load-testing tools exist. I recommend you download one of the tools listed at:
“Web Site Test Tools and Site Management Tools,” maintained by Rick Hower (http://www.softwareqatest.com)
A sudden spike in the web server traffic can have the same effect as a DoS attack. A well-configured server will cope with the demand, possibly slowing down a little or refusing some clients. If the server is not configured properly, it may crash.
Traffic spikes occur for many reasons, and some of them may be normal. A significant event will cause people to log on and search for more information on the subject. If a site often takes a beating in spite of being properly configured, perhaps it is time to upgrade the server or the Internet connection.
The following sections describe the causes and potential solutions for traffic spikes.
If you have processing power to spare but not enough bandwidth, you might exchange one for the other, making it possible to better handle traffic spikes. Most modern browsers support content compression automatically: pages are compressed before they leave the server and decompressed after they arrive at the client. The server will know the client supports compression when it receives a request header such as this one:
Accept-Encoding: gzip,deflate
Content compression makes sense when you want to save the bandwidth, and when the clients have slow Internet connections. A 40-KB page may take eight seconds to download over a modem. If it takes the server a fraction of a second to compress the page to 15 KB (good compression ratios are common with HTML pages), the 25-KB length difference will result in a five-second acceleration. On the other hand, if your clients have fast connection speeds (e.g., on local networks), there will be no significant download time reduction.
For Apache 1, mod_gzip (http://www.schroepl.net/projekte/mod_gzip/) is used for content compression. For Apache 2, mod_deflate does the same and is distributed with the server. However, compression does not have to be implemented on the web server level. It can work just as well in the application server (e.g., PHP; see http://www.php.net/zlib) or in the application.
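For Apache 2, enabling compression for common text-based content types can be as simple as the following sketch (the module path reflects a typical build and may differ on your system):

LoadModule deflate_module modules/mod_deflate.so
# compress text-based responses; images and archives are usually already compressed
AddOutputFilterByType DEFLATE text/html text/plain text/css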
Bandwidth stealing (also known as hotlinking) is a common problem on the Internet. It refers to the practice of rogue sites linking directly to files (often images) residing on other sites (victims). To users, it looks like the files are being provided by the rogue site, while the owner of the victim site is paying for the bandwidth.
One way to deal with this is to use mod_rewrite to reject all requests for images that do not originate from our site. We can do this because browsers send the address of the originating page in the Referer header field of every request. Valid requests contain the address of our site in this field, and this allows us to reject everything else.
# allow empty referrers, for when a user types the URL directly
RewriteCond %{HTTP_REFERER} !^$
# allow users coming from apachesecurity.net
RewriteCond %{HTTP_REFERER} !^http://www\.apachesecurity\.net [nocase]
# only prevent images from being hotlinked - otherwise
# no one would be able to link to the site at all!
RewriteRule (\.gif|\.jpg|\.png|\.swf)$ $0 [forbidden]
Some people have also reported attacks by competitors with busier sites, performed by embedding many invisible tiny (typically 1x1 pixel) frames pointing to their sites. Innocent site visitors would visit the competitor’s web site and open an innocent-looking web page. That “innocent” web page would then open dozens of connections to the target web site, usually targeting large images for download. And all this without the users realizing what is happening. Luckily, these attacks can be detected and prevented with the mod_rewrite trick described above.
High-tech skills such as programming are not needed to perform DoS attacks. Cyber-activism is a new form of protest in which people perform virtual sit-ins that block web sites using only their browsers and a large number of activists. These attacks are also known as coordinated denial of service attacks.
Activists will typically advertise virtual sit-ins days in advance so if you are hosting a web site of a high-profile organization you may have time to organize a defense. To learn more about cyber-activism, read the following pages:
“Cyber Activists bring down Immigration web site,” Scoop Media, January 2004 (http://www.scoop.co.nz/mason/stories/WO0401/S00024.htm)
“Econ Forum Site Goes Down,” Wired News, January 2001 (http://www.wired.com/news/politics/0,1283,50159,00.html)
Activist web sites often publish numbers showing how many people participated in a virtual sit-in. These numbers will give you an excellent idea of how many hits to expect against the server, so use them to prepare in advance.
Slashdot (http://www.slashdot.org) is a popular technology news site. According to the last information published (late 2000; see http://slashdot.org/faq/tech.shtml), it uses 10 servers to serve content. The site publishes articles of its own, but it often comments on interesting articles available elsewhere.
When a link to an external article is published on the home page, large numbers of site visitors jump to read it. A massive surge in traffic to a web site is known as the Slashdot effect (http://en.wikipedia.org/wiki/Slashdot_effect). A site made unresponsive by this effect is said to be slashdotted.
Sites that have been slashdotted report traffic between several hundred and several thousand hits per minute. Although this kind of traffic is out of the ordinary for most sites, it isn’t enough to crash a well-configured Apache web server. Sites usually fail for the following reasons:
Not enough bandwidth is available (which often happens if there are screenshots of a product or other large files for download).
Software wants to talk to the database on every page hit, so the database or the CPU is overloaded.
The server is not configured properly, so it consumes too much memory and crashes.
The hardware is not powerful enough to support a large number of visitors, so the server works but too many clients wait in line to be served.
With other types of attacks being easy, almost trivial, to perform, hardly anyone bothers attacking Apache directly. Under some circumstances, Apache-level attacks can be easier to perform because they do not require as much bandwidth as other types of attacks. Some Apache-level attacks can be performed with as few as a dozen bytes.
Less-skilled attackers often choose this type of attack, even though it is obvious and easy to spot.
Programming errors come in different shapes. Many have security implications. A programming error that can be exploited to abuse system resources should be classified as a vulnerability. For example, in 1998, a programming error was discovered in Apache: specially crafted small-sized requests caused Apache to allocate large amounts of memory. For more information, see:
“YA Apache DoS Attack,” discovered by Dag-Erling Smørgrav (http://marc.theaimsgroup.com/?l=bugtraq&m=90252779826784&w=2)
More serious vulnerabilities, such as nonexploitable buffer overflows, can cause the server to crash when attacked. (Exploitable buffer overflows are not likely to be used as DoS attacks since they can and will be used instead to compromise the host.)
When Apache is running in a prefork mode as it usually is, there are many instances of the server running in parallel. If a child crashes, the parent process will create a new child. The attacker will have to send a large number of requests constantly to disrupt the operation.
A crash will prevent the server from logging the offending request since logging takes place in the last phase of request processing. The clue that something happened will be in the error log, as a message that a segmentation fault occurred. Not all segmentation faults are a sign of attack though. The server can crash under various circumstances (typically due to bugs), and some vendor-packaged servers crash quite often. Several ways to determine what is causing the crashes are described in Chapter 8.
In a multithreaded (not prefork) mode of operation, there is only one server process. A crash while processing a request will cause the whole server to go down and become unavailable. This will be easy to detect, either because you have server monitoring in place or because you start getting angry calls from your customers.
Vulnerabilities are easy to resolve in most cases: you need to patch the server or upgrade to a version that fixes the problem. Things can be unpleasant if you are running a vendor-supplied version of Apache, and the vendor is slow in releasing the upgrade.
Any of the widely available web server load-testing tools can be used to attack a web server. It would be a crude, visible, but effective attack nevertheless. One such tool, ab (short for Apache Benchmark), is distributed with Apache. To perform a simple attack against your own server, execute the following, replacing the URL with the URL for your server:
$ /usr/local/apache/bin/ab -n 1000 -c 100 http://www.yourserver.com/
Choose the concurrency level (the -c switch) to be the same as or larger than the maximum number of Apache processes allowed (MaxClients). The slower the connection to the server, the more effect the attack will have. You will probably find it difficult to perform the attack from the local network.
To defend against this type of attack, first identify the IP address the attacker is coming from and then deny it access to the server on the network firewall (a minimal blocking example follows the list below). You can do this manually, or you can set up an automated script. If you choose the latter approach, make sure your detection scripts will not make mistakes that would cause legitimate users to be denied service. There is no single method of detection that can be used to detect all attack types. Here are some possible detection approaches:
Watch the mod_status output to detect too many identical requests.
Watch the error log for suspicious messages (request line timeouts, messages about the maximum number of clients having been reached, or other errors). Log watching is covered in more detail in Chapter 8.
Examine the access log in regular time intervals and count the number of requests coming from each IP address. (This approach is usable only if you are running one web site or if all the traffic is recorded in the same file.)
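Once an offending address has been identified and verified, blocking it at a Linux firewall is a one-line operation (192.0.2.1 is a placeholder address):

# iptables -A INPUT -s 192.0.2.1 -j DROP

Deleting the same rule with -D instead of -A lifts the ban later.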
I designed three tools that can be helpful with brute-force DoS attacks. All three are available for download from http://www.apachesecurity.net.
blacklist
Makes the job of maintaining a dynamic host-based firewall easy. It accepts an IP address and a time period on the command line, blocks requests from the IP address, and lifts the ban automatically when the period expires.
apache-protect
Designed to monitor mod_status output and detect too many identical requests coming from the same IP address.
blacklist-webclient
A small, C-based program that allows non-root scripts to use the blacklist tool (e.g., if you want to use blacklist for attacks detected by mod_security).
The brute-force attacks we have discussed are easy to perform but may require a lot of bandwidth, and they are easy to spot. With some programming skills, the attack can be improved to leave no trace in the logs and to require little bandwidth.
The trick is to open a connection to the server but not send a single byte. Opening the connection and waiting requires almost no resources from the attacker, but it permanently ties up one Apache process to wait patiently for a request. Apache will wait until the timeout expires, and then close the connection. As of Apache 1.3.31, request-line timeouts are logged to the access log (with status code 408). Request line timeout messages appear in the error log with the level info. Apache 2 does not log such messages to the error log, but efforts are underway to add the same functionality as is present in the 1.x branch.
Opening just one connection will not disrupt anything, but opening hundreds of connections at the same time will make all available Apache processes busy. When the maximum number of processes is reached, Apache will log the event into the error log (“server reached MaxClients setting, consider raising the MaxClients setting”) and start holding new connections in a queue. This type of attack is similar to the SYN flood network attack we discussed earlier. If we continue to open new connections at a high rate, legitimate requests will hardly be served.
If we start opening our connections at an even higher rate, the waiting queue itself will become full (up to 511 connections are queued by default; another value can be configured using the ListenBacklog directive), resulting in new connections being rejected.
Defending against this type of attack is difficult. The only solution is to monitor server performance closely (in real-time) and deny access from the attacker’s IP address when attacked.
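One way to monitor for this attack in real time is to count open connections per client address, for example with a pipeline like the following (the field positions assume the Linux netstat output format):

# netstat -nt | awk '/ESTABLISHED/ { split($5, a, ":"); print a[1] }' | sort | uniq -c | sort -rn | head

A single address holding dozens of connections, none of which ever transfers data, is a strong sign of this attack.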
Not all attacks come from the outside. Consider the following points:
In the worst case scenario (from the security point of view), you will have users with shell access and access to a compiler. They can upload files and compile programs as they please.
Suppose you do not allow shell access but you do allow CGI scripts. Your users can execute scripts, or they can compile binaries and upload and execute them. Similarly, if users have access to a scripting engine such as PHP, they may be able to execute binaries on the system.
Most users are not malicious, but accidents do happen. A small programming mistake can lead to a server-wide problem. The wider the user base, the greater the chances of having a user that is just beginning to learn programming. These users will typically treat servers as their own workstations.
Attackers can break in through an account of a legitimate user, or they can find a weakness in the application layer and reach the server through that.
Having a malicious user on the system can have various consequences, but in this chapter, we are concerned only with DoS attacks. What can such a user do? As it turns out, most systems are not prepared to handle DoS attacks, and it is easy to bring the server down from the inside via the following possibilities:
A fork bomb is a program that creates copies of itself in an infinite loop. The number of processes grows exponentially and fills the process table (which is limited in size), preventing the system from creating new processes. Processes that were active prior to the fork bomb activation will still be active and working, but an administrator will have a difficult time logging in to kill the offending program. You can find more information about fork bombs at http://www.voltronkru.com/library/fork.html.
A malloc bomb is a program that allocates large amounts of memory. Trying to accommodate the program, the system will start swapping, use up all of its swap space, and finally crash.
Disk overflow attacks require a bit more effort and thought than the previous two approaches. One attack would create a large file (as easy as cat /dev/zero > /tmp/log). Creating a very large number of small files, and using up the inodes reserved for the partition, will have a similar effect on the system, i.e., prevent it from creating new files.
To keep the system under control, you need to:
Put user files on a separate partition to prevent them from affecting system partitions.
Use filesystem quotas. (A good tutorial can be found in the Red Hat 9 manual at http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/custom-guide/ch-disk-quotas.html.)
Use pluggable authentication modules (PAM) limits.
Keep track of what users are doing via process accounting or kernel auditing.
Process limits, process accounting, and kernel auditing are described in the following sections.
Process limits allow administrators to introduce system-wide, per-group, or per-user limits on the usage of system resources. By default, there are virtually no limits in place:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 2039
virtual memory (kbytes, -v) unlimited
To impose limits, edit /etc/security/limits.conf. (It may be somewhere else on your system, depending on the distribution.) Changes will take effect immediately, although active sessions will not be affected. Configuring limits is tricky because restrictions can have consequences that are not obvious at first. It is advisable to use trial and error, and ensure the limit configuration works the way you want it to.
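The entries below sketch what such limits might look like; the group name and values are illustrative and must be adapted to your system:

# /etc/security/limits.conf
# limit each member of group "users" to 150 processes (counters fork
# bombs), files of up to ~1 GB (counters disk overflow), and 512 MB
# of address space (counters malloc bombs)
@users    hard    nproc    150
@users    hard    fsize    1000000
@users    hard    as       512000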
With process accounting in place, every command execution is logged. This functionality is not installed by default on most systems. On Red Hat distributions, for example, you need to install the package psacct. Even when installed, it is not activated. To activate it, type:
# accton /var/account/pacct
Depending on your platform, you may also need to update your system scripts to ensure process accounting is enabled after each restart. Process accounting information will be stored in binary format, so you have to use the following tools to extract information:
lastcomm
Prints information on individual command executions.
ac
Prints information on users’ connect time.
sa
Prints system-wide or per-user (turn on per-user output with the -m switch) summaries of command execution.
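Assuming the GNU accounting tools are installed, typical usage might look like this (the username is a placeholder):

# show recent command executions by one user
# lastcomm --user jsmith | head
# show a per-user summary of command execution
# sa -m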
The grsecurity kernel patch (http://www.grsecurity.net) gives even more insight into what is happening on the system. For example, it provides:
Program execution logging
Resource usage logging (it records attempts to overstep resource limits)
Logging of the execution of programs in a chroot jail
chdir logging
(u)mount logging
IPC logging
Signal logging (it records segmentation faults)
Fork failure logging
Time change logging
Once you compile the patch into the kernel, you can selectively activate the features at runtime through sysctl support. Each program execution will be logged to the system log with a single entry:
May 3 17:08:59 ben kernel: grsec: exec of /usr/bin/tail (tail messages ) by /bin/bash[bash:1153] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:1087] uid/euid:0/0 gid/egid:0/0
You can restrict extensive logging to a single group and avoid logging of the whole system. Note that grsecurity kernel auditing provides more information than process accounting but the drawback is that there aren’t tools (at least not at the moment) to process and summarize collected information.
Traffic shaping is a technique that establishes control over web server traffic. Many Apache modules perform traffic shaping, and their goal is usually to slow down a (client) IP address or to control the bandwidth consumption on the per-virtual host level. As a side effect, these modules can be effective against certain types of DoS attacks.
One module is designed specifically as a remedy for Apache DoS attacks:
The mod_dosevasive module will allow you to specify a maximum number of requests executed by the same IP address against one Apache child. If the threshold is reached, the IP address is blacklisted for a time period you specify. You can send an email message or execute a system command (to talk to a firewall, for example) when that happens.
The mod_dosevasive module is not as good as it could be because it does not use shared memory to keep information about previous requests persistent. Instead, the information is kept with each child. Other children know nothing about abuse against one of them. When a child serves the maximum number of requests and dies, the information goes with it.
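The module’s documentation describes directives along the following lines; the values shown here are only illustrative:

# no more than 5 requests for the same page per second from one address
DOSPageCount 5
DOSPageInterval 1
# no more than 50 requests for the whole site per second from one address
DOSSiteCount 50
DOSSiteInterval 1
# blacklist offenders for 60 seconds and notify the administrator
DOSBlockingPeriod 60
DOSEmailNotify admin@example.com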
Blacklisting IP addresses can be dangerous. An attempt to prevent DoS attacks can become a self-inflicted DoS attack because users in general do not have unique IP addresses. Many users browse through proxies or are hidden behind a network address translation (NAT) system. Blacklisting a proxy will cause all users behind it to be blacklisted. If you really must use one of the traffic-shaping techniques that uses the IP address of the client for that purpose, do the following:
Know your users (before you start the blacklist operation).
See how many are coming to your web site through a proxy, and never blacklist its IP address.
In the blacklisting code, detect HTTP headers that indicate the request came through a proxy (HTTP_FORWARDED, HTTP_X_FORWARDED, HTTP_VIA) and do not blacklist those.
Monitor and verify each violation.
With some exceptions (such as vulnerabilities that can be easily fixed), DoS attacks are very difficult to defend against. The main problem remains distinguishing legitimate requests from those belonging to an attack.
The chapter concludes with a strategy for handling DoS attacks:
Treat DoS attacks as one of many possible risks. Your assessment about the risk will influence the way you prepare your defense.
Learn about the content hosted on the server. It may be possible to improve software characteristics (and make it less susceptible to DoS attacks) in advance.
Determine what you will do when various types of attacks occur. For example, have the contact details of your upstream provider ready.
Monitor server operation to detect attacks as soon as possible.
Act promptly when attacked.
If attacks increase, install automated tools for defense.