The purpose of a web system security assessment is to determine how tight security is. Many deployments get it wrong because responsibility for a web system's security is split between administrators and developers. I have seen this many times: neither party understands the whole system, yet each is expected to keep it secure.
The way I see it, web security is the responsibility of the system administrator. With the responsibility assigned to one party, the job becomes an order of magnitude easier. If you are a system administrator, think of it this way: it is your server, and that makes you responsible for its security.
To get the job done, you will have to approach the other side, web application development, and understand how it is done. The purpose of Chapter 10 was to give you a solid introduction to web application security issues. The good news is that web security is very interesting! Furthermore, you will not be expected to create secure code, only judge it.
The assessment methodology laid down in this chapter is what I like to call a "lightweight web security assessment methodology." The word "lightweight" is there because the methodology does not cover every detail, especially the programming parts. In an ideal world, web application security would be assessed only by web application security professionals, who need to concern themselves with programming details. I will assume you are not that person, that you have many other tasks to do, and that you do not do web security full time. Keep the 20/80 rule in mind: expend 20 percent of the effort to get 80 percent of the benefits.
Though web security professionals can benefit from this book, they will use it only as a starting point and make the additional 80 percent of effort that is expected of them. A complete web security assessment consists of three complementary parts, which should be executed in the following order:
Black-box testing: testing from the outside, with no knowledge of the system.
White-box testing: testing from the inside, with full knowledge of the system.
Gray-box testing: testing that combines the previous two types. Gray-box testing can reflect the situation that occurs when an attacker obtains the source code for an application (it could have been leaked or is publicly available). In such circumstances, the attacker is likely to set up a copy of the application on a development server and practice attacks there.
Before you continue, look at Appendix A, where you will find a list of web security tools. Knowing how something works under the covers is important, but testing everything manually takes away too much of your precious time.
In black-box testing, you pretend you are an outsider, and you try to break in. This useful technique simulates the real world. The less you know about the system you are about to investigate, the better. I assume you are doing black-box assessment because you fall into one of these categories:
You want to increase the security of your own system.
You are helping someone else secure their system.
You are performing web security assessment professionally.
Unless you belong to the first category, you must ensure you have permission to perform black-box testing. Unsanctioned black-box testing can be treated as hostile and is often illegal. If you are doing a favor for a friend, get written permission from someone who has the authority to provide it.
Ask yourself these questions: Who am I pretending to be? Or, what is the starting point of my assessment? The answer depends on the nature of the system you are testing. Here are some choices:
A member of the general public
A business partner of the target organization
A customer on the same shared server where the target application resides
A malicious employee
A fellow system administrator
Different starting points require different approaches. A system administrator may have access to the most important servers, but such servers are (hopefully) out of reach of a member of the public. The best way to conduct an assessment is to start with no special privileges and examine what the system looks like from that point of view. Then continue upward, assuming other roles. While doing all this, remember you are doing a web security assessment, which is a small fraction of the subject of information security. Do not cover too much territory, or you will never finish. In your initial assessment, you should focus on the issues mostly under your responsibility.
As you perform the assessment, record everything, and create an information trail. If you know something about the infrastructure beforehand, you must prove you did not use it as part of black-box testing. You can use that knowledge later, as part of white-box testing.
Black-box testing consists of the following steps:
Information gathering (passive and active)
Web server analysis
Web application analysis
Vulnerability probing
I did not include report writing, but you will have to do that, too. To make your job easier, mark your findings this way:
Things to watch out for
Problems that are not errors but are things that should be fixed
Problems that should be corrected as soon as possible
Gross oversights; problems that must be corrected immediately
Information gathering is the first step of every security assessment procedure and is important when performed as part of a black-box testing methodology. Working blindly, you will see the same information that is available to a potential attacker. Here we assume you are armed only with the name of a web site.
Information gathering can be broadly separated into two categories: passive and active. Passive techniques cannot be detected by the organization being investigated. They involve extracting knowledge about the organization from systems outside the organization. They may include techniques that involve communication with systems run by the organization but only if such techniques are part of their normal operation (e.g., the use of the organization’s DNS servers) and cannot be detected.
Most information gathering techniques are well known, having been used as part of traditional network penetration testing for years. Passive information gathering techniques were covered in the paper written by Gunter Ollmann:
"Passive Information Gathering: The Analysis of Leaked Network Security Information" by Gunter Ollmann (NGSS) (http://www.nextgenss.com/papers/NGSJan2004PassiveWP.pdf)
The name of the web site you have been provided will resolve to an IP address, giving you the vital information you need to start with. Depending on what you have been asked to do, you must decide whether you want to gather information about the whole of the organization. If your only target is the public web site, the IP address of the server is all you need. If the target of your research is an application used internally, you will need to expand your search to cover the organization’s internal systems.
The IP address of the public web site may help discover the whole network, but only if the site is hosted internally. For smaller web sites, internal hosting is overkill, so hosting is often outsourced. Your best bet is to exchange email with someone from the organization: their IP address, possibly an address from an internal network, will be embedded in the email headers.
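What you are looking for are the Received headers, which every mail server in the delivery path adds to the message. The following example is entirely made up (all names and addresses are hypothetical), but it shows the kind of detail that can leak:

Received: from mail.example.com (mail.example.com [192.0.2.25])
    by mx.isp.example.net with ESMTP; Thu, 03 Jun 2004 11:22:33 +0000
Received: from workstation44.internal.example.com ([10.0.3.44])
    by mail.example.com with ESMTP; Thu, 03 Jun 2004 11:22:31 +0000

The second header reveals an internal hostname and a private (10.x.x.x) address that would never appear on the public web site.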
Your first goal is to learn as much as possible about the organization, so going to its public web site is a natural place to start. You are looking for the following information:
Names and positions
Email addresses
Addresses and telephone numbers, which reveal physical locations
Posted documents, which often reveal previous revisions, or information on who created them
The web site should be sufficient for you to learn enough about the organization to map out its network of trust. In a worst-case scenario (from the point of view of attacking them), the organization will trust itself. If it relies on external entities, there may be many opportunities for exploitation. Here is some of the information you should determine:
The security posture of a smaller organization is often lax, and such organizations usually cannot afford to have information security professionals on staff. Bigger companies employ many skilled professionals and possibly have a dedicated information security team.
Organizations are rarely able to enforce their procedures when parts of the operations are outsourced to external entities. If parts of the organization are outsourced, you may have to expand your search to target other sites.
Do they rely on a network of partners or distributors to do business? Distributors are often smaller companies with lax security procedures. A distributor may be an easy point of entry.
Current domain name registration practices require significant private information to be made available to the public. This information can easily be accessed using the whois service, which is available in many tools, on web sites, and on the command line.
There are many whois servers (e.g., one for each registrar), and the important part of finding the information you are looking for is in knowing which server to ask. Normally, whois servers issue redirects when they cannot answer a query, and good tools will follow redirects automatically. When using web-based tools (e.g., http://www.internic.net/whois.html), you will have to perform redirection manually.
Here is the information we can find for O'Reilly (registrar disclaimers have been removed from the output to save space):
$ whois oreilly.com
...
O'Reilly & Associates
1005 Gravenstein Hwy., North
Sebastopol, CA, 95472
US
Domain Name: OREILLY.COM
Administrative Contact -
DNS Admin - nic-ac@OREILLY.COM
O'Reilly & Associates, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
US
Phone - 707-827-7000
Fax - 707-823-9746
Technical Contact -
technical DNS - nic-tc@OREILLY.COM
O'Reilly & Associates
1005 Gravenstein Highway North
Sebastopol, CA 95472
US
Phone - 707-827-7000
Fax - - 707-823-9746
Record update date - 2004-05-19 07:07:44
Record create date - 1997-05-27
Record will expire on - 2005-05-26
Database last updated on - 2004-06-02 10:33:07 EST
Domain servers in listed order:
NS.OREILLY.COM 209.204.146.21
NS1.SONIC.NET 208.201.224.11
A tool called dig can be used to convert names to IP addresses or do the reverse, convert IP addresses to names (known as reverse lookup). An older tool, nslookup, is still popular and widely deployed.
$ dig oreilly.com any
; <<>> DiG 9.2.1 <<>> oreilly.com any
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30773
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 3, ADDITIONAL: 4
;; QUESTION SECTION:
;oreilly.com. IN ANY
;; ANSWER SECTION:
oreilly.com. 20923 IN NS ns1.sonic.net.
oreilly.com. 20923 IN NS ns2.sonic.net.
oreilly.com. 20923 IN NS ns.oreilly.com.
oreilly.com. 20924 IN SOA ns.oreilly.com.
nic-tc.oreilly.com.
2004052001 10800 3600 604800 21600
oreilly.com. 20991 IN MX 20 smtp2.oreilly.com.
;; AUTHORITY SECTION:
oreilly.com. 20923 IN NS ns1.sonic.net.
oreilly.com. 20923 IN NS ns2.sonic.net.
oreilly.com. 20923 IN NS ns.oreilly.com.
;; ADDITIONAL SECTION:
ns1.sonic.net. 105840 IN A 208.201.224.11
ns2.sonic.net. 105840 IN A 208.201.224.33
ns.oreilly.com. 79648 IN A 209.204.146.21
smtp2.oreilly.com. 21011 IN A 209.58.173.10
;; Query time: 2 msec
;; SERVER: 217.160.182.251#53(217.160.182.251)
;; WHEN: Wed Jun 2 15:54:00 2004
;; MSG SIZE rcvd: 262
This type of query reveals basic information about a domain name, such as the name servers and the mail servers. We can gather more information by asking a specific question (e.g., “What is the address of the web site?”):
$ dig www.oreilly.com
;; QUESTION SECTION:
;www.oreilly.com. IN A
;; ANSWER SECTION:
www.oreilly.com. 20269 IN A 208.201.239.36
www.oreilly.com. 20269 IN A 208.201.239.37
The dig tool converts IP addresses into names when the -x option is used:
$ dig -x 208.201.239.36
;; QUESTION SECTION:
;36.239.201.208.in-addr.arpa. IN PTR
;; ANSWER SECTION:
36.239.201.208.in-addr.arpa. 86381 IN PTR www.oreillynet.com.
You can see that this reverse query of the IP address obtained by looking up the domain name oreilly.com gave us a whole new domain name.
A zone transfer is a service where all the information about a particular domain name is transferred from a domain name server. Such services are handy because of the wealth of information they provide. For the same reason, the access to a zone transfer service is often restricted. Zone transfers are generally not used for normal DNS operation, so requests for zone transfers are sometimes logged and treated as signs of preparation for intrusion.
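If you want to see whether a name server allows zone transfers, dig can request one directly. A sketch, using the O'Reilly name server discovered earlier (a properly configured server will normally refuse the request; a success dumps every record in the zone):

$ dig @ns.oreilly.com oreilly.com axfr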
You have probably discovered several IP addresses by now. IP addresses are not sold; they are assigned to organizations by bodies known as Regional Internet Registries (RIRs). The information kept by RIRs is publicly available. Four registries cover address allocation across the globe:
Asia-Pacific Network Information Center (http://www.apnic.net)
American Registry for Internet Numbers (http://www.arin.net)
Latin American and Caribbean Internet Address Registry (http://www.lacnic.net)
RIPE Network Coordination Centre (http://www.ripe.net)
Registries do not work with end users directly. Instead, they delegate large blocks of addresses to providers, who delegate smaller chunks further. In effect, an address can be associated with several parties. In theory, every IP address should be associated with the organization using it. In practice, Internet providers may not keep the IP address database up to date, so the best you can do is determine the organization's connectivity provider.
IP assignment data can be retrieved from any active whois server, and different servers can give different results. In the case below, I just guessed that whois.sonic.net exists. This is what we get for one of O'Reilly's IP addresses:
$ whois -h whois.sonic.net 209.204.146.21
[Querying whois.sonic.net]
[whois.sonic.net]
You asked for 209.204.146.21
network:Class-Name:network
network:Auth-Area:127.0.0.1/32
network:ID:NETBLK-SONIC-209-204-146-0.127.0.0.1/32
network:Handle:NETBLK-SONIC-209-204-146-0
network:Network-Name:SONIC-209-204-146-0
network:IP-Network:209.204.146.0/24
network:IP-Network-Block:209.204.146.0 - 209.204.146.255
network:Org-Name:John Irwin
network:Email:ora@sonic.net
network:Tech-Contact;Role:SACC-ORA-SONIC.127.0.0.1/32
network:Class-Name:network
network:Auth-Area:127.0.0.1/32
network:ID:NETBLK-SONIC-209-204-128-0.127.0.0.1/32
network:Handle:NETBLK-SONIC-209-204-128-0
network:Network-Name:SONIC-209-204-128-0
network:IP-Network:209.204.128.0/18
network:IP-Network-Block:209.204.128.0 - 209.204.191.255
network:Org-Name:Sonic Hostmaster
network:Email:ipowner@sonic.net
network:Tech-Contact;Role:SACC-IPOWNER-SONIC.127.0.0.1/32
Search engines have become a real resource when it comes to information gathering. This is especially true for Google, which has exposed its functionality through an easy-to-use programming interface. Search engines can help you find:
Publicly available information on a web site, or information that used to be available there.
Information that is not intended for public consumption but that is nevertheless available unprotected (and the search engine picked it up).
Posts from employees to newsgroups and mailing lists. Post headers reveal information about the infrastructure. Even message content can reveal bits about the infrastructure. If you find a member of the development team asking questions about a particular database engine, chances are that engine is used in-house.
Links to other organizations, possibly those that have done work for the organization being targeted.
Here are some example Google queries. If you want to find a list of PDF documents available on a site, type a Google search query such as the following:
site:www.modsecurity.org filetype:pdf
To see if a site contains Apache directory listings, type something like this:
site:www.modsecurity.org intitle:"Index of /" "Parent Directory"
To see if it contains any WS_FTP log files, type something like this:
site:www.modsecurity.org inurl:ws_ftp.log
Anyone can register with Google and receive a key that will support up to 1,000 automated searches per day. To learn more about Google APIs, see the following:
Google Web APIs (http://www.google.com/apis/)
Google Web API Reference (http://www.google.com/apis/reference.html)
Social engineering is arguably the oldest hacking technique, having been used hundreds of years before computers were invented. With social engineering, a small effort can go a long way. Kevin Mitnick (http://en.wikipedia.org/wiki/Kevin_Mitnick) is the best-known practitioner. Here are some social-engineering approaches:
Just visit the company and have a look around. Get some company documentation from their sales people.
Follow up on a visit with a thank-you email and a question. You will get an email back, whose headers you can then examine.
Open an account. Inquire about partnership and distributor opportunities. The sign-up procedure may give out interesting information about the security of the company’s extranet system. For example, you may be told that you must have a static IP address to connect, that a custom client is required, or that you can connect from wherever you want provided you use a privately issued client certificate.
Message boards are places where you can meet a company’s employees. Developers will often want to explain how they have designed the best system there is, revealing information they feel is harmless but which can be useful for the assessment.
Cases in which current employees disclose company secrets are rare, but you can find former (often disgruntled) employees who will not hesitate to disclose a secret or two. Even in an innocent conversation, people may give examples from where they used to work. Talking to people who have designed a system will help you get a feeling for what you are up against.
For more information on social engineering (and funny real-life stories), see:
"Social Engineering Fundamentals, Part I: Hacker Tactics" by Sarah Granger (http://www.securityfocus.com/printable/infocus/1527)
"Social Engineering Fundamentals, Part II: Combat Strategies" by Sarah Granger (http://www.securityfocus.com/printable/infocus/1533)
For each domain name or IP address you acquire, perform a connectivity check using traceroute. Again, I use O'Reilly as an example.
$ traceroute www.oreilly.com
traceroute: Warning: www.oreilly.com has multiple addresses; using 208.201.239.36
traceroute to www.oreilly.com (208.201.239.36), 30 hops max, 38 byte packets
1 gw-prtr-44-a.schlund.net (217.160.182.253) 0.238 ms
2 v999.gw-dist-a.bs.ka.schlund.net (212.227.125.253) 0.373 ms
3 ge-41.gw-backbone-b.bs.ka.schlund.net (212.227.116.232) 0.535 ms
4 pos-80.gw-backbone-b.ffm.schlund.net (212.227.112.127) 3.210 ms
5 cr02.frf02.pccwbtn.net (80.81.192.50) 4.363 ms
6 pos3-0.cr02.sjo01.pccwbtn.net (63.218.6.66) 195.201 ms
7 layer42.ge4-0.4.cr02.sjo01.pccwbtn.net (63.218.7.6) 187.701 ms
8 2.fast0-1.gw.equinix-sj.sonic.net (64.142.0.21) 185.405 ms
9 fast5-0-0.border.sr.sonic.net (64.142.0.13) 191.517 ms
10 eth1.dist1-1.sr.sonic.net (208.201.224.30) 192.652 ms
11 www.oreillynet.com (208.201.239.36) 190.662 ms
The traceroute output shows the route packets take from your location to the target's location. The last few lines matter; the last line is the server itself. On line 10, we see what is most likely a router connecting the target network to the Internet.
traceroute traditionally relies on UDP or ICMP packets to discover the path packets take from one point to another, but such packets are often filtered for security reasons. An alternative tool, tcptraceroute (http://michael.toren.net/code/tcptraceroute/), performs a similar function using TCP packets. Try tcptraceroute if traceroute does not produce results.
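Usage is similar; a sketch that sends the probes to port 80, the port we already know is open on the target (tcptraceroute typically requires root privileges):

# tcptraceroute www.oreilly.com 80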
Port scanning is an active information-gathering technique. It is viewed as impolite and legally dubious. You should only perform port scanning against your own network or where you have written permission from the target.
The purpose of port scanning is to discover active network devices on a given range of addresses and to analyze each device to discover public services. In the context of a web security assessment, you will want to know whether a publicly accessible FTP service or a database engine is running on the same server. If there is one, you may be able to use it as part of your assessment.
Services often run unprotected and with default passwords. I once discovered a MySQL server running on the same machine as the web server, with the default root password (an empty string). Anyone could have accessed the company's data without bothering with the web application at all.
The most popular port-scanning tool is Nmap (http://www.insecure.org/nmap/), which is free and useful. It is a command line tool, but a freeware frontend called NmapW is available from Syhunt (http://www.syhunt.com/section.php?id=nmapw). In the remainder of this section, I will demonstrate how Nmap can be used to learn more about running devices. In all examples, the real IP addresses are masked because they belong to real devices.
The process of discovering active hosts is called a ping sweep. An attempt is made to ping each IP address, and live addresses are reported. Here is a sample run, in which XXX.XXX.XXX.112/28 represents the range of IP addresses you would type:
# nmap -sP XXX.XXX.XXX.112/28
Starting nmap 3.48 ( http://www.insecure.org/nmap/ )
Host (XXX.XXX.XXX.112) seems to be a subnet broadcast address (returned 1
extra pings).
Host (XXX.XXX.XXX.114) appears to be up.
Host (XXX.XXX.XXX.117) appears to be up.
Host (XXX.XXX.XXX.120) appears to be up.
Host (XXX.XXX.XXX.122) appears to be up.
Host (XXX.XXX.XXX.125) appears to be up.
Host (XXX.XXX.XXX.126) appears to be up.
Host (XXX.XXX.XXX.127) seems to be a subnet broadcast address (returned 1
extra pings).
Nmap run completed -- 16 IP addresses (6 hosts up) scanned in 7 seconds
After that, you can proceed to get more information from individual hosts by examining their TCP ports for active services. The following is sample output from scanning a single host. I have used one of my own servers, since scanning one of O'Reilly's servers without permission would have been inappropriate.
# nmap -sS XXX.XXX.XXX.XXX
Starting nmap 3.48 ( http://www.insecure.org/nmap/ )
The SYN Stealth Scan took 144 seconds to scan 1657 ports.
Interesting ports on XXX.XXX.XXX.XXX:
(The 1644 ports scanned but not shown below are in state: closed)
PORT STATE SERVICE
21/tcp open ftp
22/tcp open ssh
23/tcp open telnet
25/tcp open smtp
53/tcp open domain
80/tcp open http
110/tcp open pop-3
143/tcp open imap
443/tcp open https
993/tcp open imaps
995/tcp open pop3s
3306/tcp open mysql
8080/tcp open http-proxy
Nmap run completed -- 1 IP address (1 host up) scanned in 157.022 seconds
You can go further if you use Nmap with the -sV switch, in which case it will connect to the ports you specify and attempt to identify the services running on them. In the following example, you can see the results of service analysis when I run Nmap against ports 21, 80, and 8080. Nmap uses the Server header field to identify web servers, which is the reason it incorrectly identified the Apache running on port 80 as Microsoft Internet Information Server. (I configured my server with a fake server name, as described in Chapter 2, where HTTP fingerprinting for discovering real web server identities is discussed.)
# nmap -sV XXX.XXX.XXX.XXX -P0 -p 21,80,8080
Starting nmap 3.48 ( http://www.insecure.org/nmap/ )
Interesting ports on XXX.XXX.XXX.XXX:
PORT     STATE SERVICE VERSION
21/tcp   open  ftp     ProFTPD 1.2.9
80/tcp   open  http    Microsoft IIS webserver 5.0
8080/tcp open  http    Apache httpd 2.0.49 ((Unix) DAV/2 PHP/4.3.4)
Nmap run completed -- 1 IP address (1 host up) scanned in 22.065 seconds
Another well-known tool for service identification is Amap (http://www.thc.org/releases.php). Try it if Nmap does not come back with satisfactory results.
Scanning results will usually fall into one of three categories:
Where there is no firewall in place, you will often find many unrestricted services running on the server. This indicates a server that is not taken care of properly. This is the case with many managed dedicated servers.
A moderate-strength firewall is in place, allowing access to public services (e.g., http) but protecting private services (e.g., ssh). This often means whoever maintains the server communicates with the server from a static IP address. This type of firewall uses an "allow by default, deny what is sensitive" approach.
In addition to protecting nonpublic services, a tight firewall configuration will restrict ICMP (ping) traffic, restrict outbound traffic, and only accept related incoming traffic. This type of firewall uses a “deny by default, allow what is acceptable” approach.
If scan results fall into the first or the second category, the server is probably not being closely monitored. The third option shows the presence of people who know what they are doing; additional security measures may be in place.
This is where the real fun begins. At a minimum, you need the following tools:
A browser to access the web server
A way to construct and send custom requests, possibly through SSL
A web security assessment proxy to monitor and change traffic
Optionally, you may choose to perform an assessment through one or more open proxies (by chaining). This makes the test more realistic, but it may disclose sensitive information to others (whoever controls the proxy), so be careful.
If you do choose to go with a proxy, note that special page objects such as Flash animations and Java applets often communicate directly with the server, thus revealing your real IP address.
We will take these steps:
Test SSL.
Identify the web server.
Identify the application server.
Examine default locations.
Probe for common configuration problems.
Examine responses to exceptions.
Probe for known vulnerabilities.
Enumerate applications.
I have put SSL tests first because, logically, SSL is the first layer of security you encounter. Also, in some rare cases you will encounter a target that requires use of a privately issued client certificate. In such cases, you are unlikely to progress further until you acquire a client certificate. However, you should still attempt to trick the server into giving you access without a valid client certificate.
Attempt to access the server using any kind of client certificate (even a certificate you created will do). If that fails, try to access the server using a proper certificate signed by a well-known CA. On a misconfigured SSL server, such a certificate will pass the authentication phase and allow access to the application. (The server is only supposed to accept privately issued certificates.) Sometimes using a valid certificate with a subject of admin or Administrator may get you inside (without a password).
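The OpenSSL client introduced in Chapter 4 can present a client certificate during the handshake. A minimal sketch, assuming the certificate and key are stored in the hypothetical files client-cert.pem and client-key.pem:

$ openssl s_client -connect www.example.com:443 \
    -cert client-cert.pem -key client-key.pem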
Whether or not a client certificate is required, perform the following tests:
Version 2 of the SSL protocol is known to suffer from a few security problems. Unless there is a good reason to support older SSLv2 clients, the web server should be configured to accept only SSLv3 or TLSv1 connections. To check this, use the OpenSSL client, as demonstrated in Chapter 4, adding the -no_ssl3 and -no_tls1 switches.
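For example, the following forces an SSLv2-only handshake by disabling the newer protocols (www.example.com stands in for the target); if the handshake completes and a certificate is printed, the server still accepts SSLv2:

$ openssl s_client -connect www.example.com:443 -no_ssl3 -no_tls1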
A default Apache SSL configuration will allow various ciphers to be used to secure the connection. Many ciphers are no longer considered secure and are there only for backward compatibility. The OpenSSL s_client tool can be used for this purpose, but an easier way exists. The Foundstone utility SSLDigger (described in Appendix A) will perform many tests, attempting to establish SSL connections using ciphers of different strengths. It comes with a well-written whitepaper that describes the tool's function.
Programmers sometimes redirect users to the SSL portion of the web site from the login page only and do not bother to check at other entry points. Consequently, you may be able to bypass SSL and use the site without it by directly typing the URL of a page.
After SSL testing (if any), attempt to identify the web server. Start by typing a Telnet command such as the following, substituting the appropriate web site name:
$ telnet www.modsecurity.org 80
Trying 217.160.182.153...
Connected to www.modsecurity.org.
Escape character is '^]'.
OPTIONS / HTTP/1.0
Host: www.modsecurity.org

HTTP/1.1 200 OK
Date: Tue, 08 Jun 2004 10:54:52 GMT
Server: Microsoft-IIS/5.0
Content-Length: 0
Allow: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, PATCH, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK, TRACE
We learn two things from this output:
The web server supports WebDAV. You can see this by the appearance of WebDAV-specific methods, such as PATCH and PROPFIND, in the Allow response header. This is an indication that we should perform more WebDAV research.
The Server signature tells us the site is running Microsoft Internet Information Server. Suppose you find this unlikely (having in mind the nature of the site and its pro-Unix orientation). You can use Netcraft's "What's this site running?" service (at http://uptime.netcraft.co.uk and described in Appendix A) and access the historical data if available. In this case, Netcraft will reveal the site is running on Linux and Apache, and that the server signature is "Apache/1.3.27 (Unix) (Red-Hat/Linux) PHP/4.2.2 mod_ssl/2.8.12 OpenSSL/0.9.6b" (as of August 2003).
We turn to httprint for confirmation of the signature:
$ httprint -P0 -h www.modsecurity.org -s signatures.txt
httprint v0.202 (beta) - web server fingerprinting tool
(c) 2003,2004 net-square solutions pvt. ltd. - see readme.txt
http://net-square.com/httprint/
httprint@net-square.com
--------------------------------------------------
Finger Printing on http://www.modsecurity.org:80/
Derived Signature:
Microsoft-IIS/5.0
9E431BC86ED3C295811C9DC5811C9DC5050C5D32505FCFE84276E4BB811C9DC5
0D7645B5811C9DC5811C9DC5CD37187C11DDC7D7811C9DC5811C9DC58A91CF57
FCCC535BE2CE6923FCCC535B811C9DC5E2CE69272576B769E2CE69269E431BC8
6ED3C295E2CE69262A200B4C6ED3C2956ED3C2956ED3C2956ED3C295E2CE6923
E2CE69236ED3C295811C9DC5E2CE6927E2CE6923
Banner Reported: Microsoft-IIS/5.0
Banner Deduced: Apache/1.3.27
Score: 140
Confidence: 84.34
This confirms the version of the web server that was reported by Netcraft. The confirmation shows the web server had not been upgraded since October 2003, so the chances of web server modules having been upgraded are slim. This is good information to have.
This complete signature gives us many things to work with. From here we can go and examine known vulnerabilities for Apache, PHP, mod_ssl, and OpenSSL. The OpenSSL version (reported by Netcraft as 0.9.6b) looks very old. According to the OpenSSL web site, Version 0.9.6b was released in July 2001. Many serious OpenSSL vulnerabilities have been made public since that time.
A natural way forward from here would be to explore those vulnerabilities further. In this case, however, that would likely be a waste of time because the version of OpenSSL running on the server is probably not vulnerable to current attacks. Vendors often create custom branches of the software applications they include in their operating systems. After the split, the included applications are maintained internally, and the version numbers rarely change. When a security problem is discovered, vendors perform what is called a backport: the patch is ported from the current software version (maintained by the original application developers) back to the older release. The backport only changes the package release number, which is typically visible only from the inside. Since there is no way of knowing this from the outside, the only thing to do is to go ahead and check for potential vulnerabilities.
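From the inside (this is not something you can do during black-box testing), the package changelog on an RPM-based system usually reveals whether security fixes have been backported. For example, on a Red Hat style system:

# rpm -q --changelog openssl | head -20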
We now know the site likely uses PHP because PHP used to appear in the web server signature. We can confirm our assumption by browsing and looking for a nonstatic part of the site. Pages with the extension .php are likely to be PHP scripts.
Some sites attempt to hide the technology by hiding extensions. For example, they may associate the extension .html with PHP, making all pages dynamic. Or, if the site is running on a Windows server, associating the extension .asp with PHP may make the application look as if it were implemented in ASP.
Attempts to increase security in this way are not likely to succeed. If you look closely, determining the technology behind a web site is easy. For system administrators it makes more sense to invest their time where it really matters.
Suppose you are not sure what technology is used at a web site. For example, suppose the extension for a file is .asp but you think that ASP is not used. The HTTP response may reveal the truth:
$ telnet www.modsecurity.org 80
Trying 217.160.182.153...
Connected to www.modsecurity.org.
Escape character is '^]'.
HEAD /index.asp HTTP/1.0
Host: www.modsecurity.org

HTTP/1.1 200 OK
Date: Tue, 24 Aug 2004 13:54:11 GMT
Server: Microsoft-IIS/5.0
X-Powered-By: PHP/4.3.3-dev
Set-Cookie: PHPSESSID=9d3e167d46dd3ebd81ca12641d82106d; path=/
Connection: close
Content-Type: text/html
There are two clues in the response that tell you this is a PHP-based site. First, the X-Powered-By header includes the PHP version. Second, the site sends a cookie (the Set-Cookie header) whose name is PHP-specific.
Don’t forget a site can utilize more than one technology. For example, CGI scripts are often used even when there is a better technology (such as PHP) available. Examine all parts of the site to discover the technologies used.
A search for default locations can yield significant rewards:
Finding files where you expect them to be will reinforce your judgment about the identity of the web server and the application server.
Default installations can contain vulnerable scripts or files that reveal information about the target.
Management interfaces are often left unprotected, or protected with a default username/password combination.
For Apache, here are the common pages to try to locate (a quick way to check them is sketched after the list):
/server-status
/server-info
/mod_gzip_status
/manual
/icons
/~root/
/~nobody/
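A quick way to check such locations is to request each one and record the response status code. The following sketch uses curl against a hypothetical host; a 200 or 401 response deserves a closer look, while 404 usually means the resource is not there:

$ for path in /server-status /server-info /mod_gzip_status /manual /icons/; do
>     printf "%-20s" $path
>     curl -s -o /dev/null -w "%{http_code}\n" http://www.example.com$path
> done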
Test to see if proxy operations are allowed in the web server. A running proxy service that allows anyone to use it without restriction (a so-called open proxy) represents a big configuration error. To test, connect to the target web server and request a page from a totally different web server. In proxy mode, you are allowed to enter a full hostname in the request (otherwise, hostnames go into the Host header):
$ telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
HEAD http://www.google.com:80/ HTTP/1.0

HTTP/1.1 302 Found
Date: Thu, 11 Nov 2004 14:10:14 GMT
Server: GWS/2.1
Location: http://www.google.de/
Content-Type: text/html; charset=ISO-8859-1
Via: 1.0 www.google.com
Connection: close

Connection closed by foreign host.
If the request succeeds (you get a response, like the response from Google in the example above), you have encountered an open proxy. If you get a 403 response, that could mean the proxy is active but configured not to accept requests from your IP address (which is good). Getting anything else as a response probably means the proxy code is not active. (Web servers sometimes simply respond with a status code 200 and return their default home page.)
The other way to use a proxy is through a CONNECT method, which is designed to handle any type of TCP/IP connection, not just HTTP. This is an example of a successful proxy connection using this method:
$ telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
CONNECT www.google.com:80 HTTP/1.0

HTTP/1.0 200 Connection Established
Proxy-agent: Apache/2.0.49 (Unix)

HEAD / HTTP/1.0
Host: www.google.com

HTTP/1.0 302 Found
Location: http://www.google.de/
Content-Type: text/html
Server: GWS/2.1
Content-Length: 214
Date: Thu, 11 Nov 2004 14:15:22 GMT
Connection: Keep-Alive

Connection closed by foreign host.
In the first part of the request, you send a CONNECT line telling the proxy server where you want to go. If the CONNECT method is allowed, you can continue typing. Everything you type from this point on goes directly to the target server. Having access to a proxy that is also part of an internal network opens up interesting possibilities. Internal networks usually use nonroutable private address space that cannot be reached from the outside. But the proxy, because it sits on both networks simultaneously, can be used as a gateway. Suppose you know that the IP address of a database server is 192.168.0.99. (For example, you may have found this information in an application library file through a file disclosure flaw.) There is no way to reach the database server directly, but if you ask the proxy nicely, it may respond:
$ telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
CONNECT 192.168.0.99:3306 HTTP/1.0

HTTP/1.0 200 Connection Established
Proxy-agent: Apache/2.0.49 (Unix)
If you think a proxy is there but it is configured not to respond to your IP address, make a note of it. Exploitation can be attempted later, for example after a successful entry to a machine that holds an IP address internal to the organization.
The presence of WebDAV may allow file enumeration. You can test this using the WebDAV protocol directly (see Chapter 10) or with a WebDAV client. Cadaver (http://www.webdav.org/cadaver/) is one such client. You should also attempt to upload a file using a PUT method. On a web server that supports it, you may be able to upload and execute a script.
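A PUT request is easy to issue by hand. The following transcript is entirely hypothetical (target, filename, and content are made up); a 201 or 204 response indicates the upload succeeded, while 403 or 405 means the method is refused:

$ telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
PUT /upload-test.txt HTTP/1.0
Host: www.example.com
Content-Length: 4

test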
Another frequent configuration problem is the unrestricted availability of web server access logs. The logs, when available, can reveal direct links to other interesting (possibly also unprotected) server resources. Here are some folder names you should try:
/logs
/stats
/weblogs
/webstats
For your review, you need to be able to differentiate between normal responses and exceptions coming from the web server you are investigating. To do this, make several obviously incorrect requests at the beginning of the review and watch for the following:
Is the server responding with HTTP status 404 when pages are not found, as expected?
Is an IDS present? Simulate a few attacks against arbitrary scripts and see what happens. See if there might be a device that monitors the traffic and interferes upon attack detection.
Some applications respond to errors with HTTP status 200 as they would for successful requests, rather than following the HTTP standard of returning suitable status codes (such as status 404 when a page is not found). They do this in error or in an attempt to confuse automated vulnerability scanners. Authors of vulnerability scanners know about this trick, but it is still used. Having HTTP status 200 returned in response to errors will slow down programmatic analysis of the web site, but not by much: instead of using the response status code to detect problems, you will have to detect problems from the text embedded in the response page.
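To see how a particular server labels missing pages, request an obviously nonexistent resource and look at the status code it returns (and, if the code is 200, at the text in the body). For example, with curl (www.example.com is a placeholder):

$ curl -s -o /dev/null -w "%{http_code}\n" http://www.example.com/no-such-page-1234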
Examine the error messages produced by the application (even though we have not reached application analysis yet). If the application gives out overly verbose error messages, note this problem. Then proceed to use this flaw for information discovery later in the test.
If there is sufficient information about the web server and the application server and there is reason to suspect the site is not running the latest version of either, an attacker will try to exploit the vulnerabilities. Vulnerabilities fall into one of the following three categories:
1. Vulnerabilities that are easy to exploit, often web-based
2. Vulnerabilities for which ready-made exploits are available
3. Vulnerabilities for which exploits are not yet released
Attackers are likely to attempt exploitation in cases 1 and 2. Exploitation through case 3 is possible in theory, but it requires much effort and determination by the attacker. Run up-to-date software to prevent the exploitation of valuable targets.
If you have reason to believe a system is vulnerable to a known vulnerability, you should attempt to compromise it. A successful exploitation of a vulnerability is what black-box assessment is all about. However, that can sometimes be dangerous and may lead to interrupted services, server crashing, or even data loss, so exercise good judgment to stop short of causing damage.
The last step in web server analysis is to enumerate installed applications. Frequently, there will be only one, but public web sites sometimes run several: one for the main content, another for forums, a third for a web log, and so on. Each application is a separate attack vector that must be analyzed. If you discover that a site uses a well-known application, look for its known vulnerabilities (for example, by visiting http://www.securityfocus.com/bid or http://www.secunia.com). If the application has not been patched recently, there may be vulnerabilities that can be exploited.
The web application analysis steps should be repeated for every identified application.
Depending on the assessment you are performing, you may be able to execute processes on the server from the beginning (if you are pretending to be a shared hosting customer, for example). Even if such a privilege is not given to you, a successful exploitation of an application weakness may still provide you with this ability. If you can do this, one of the mandatory assessment steps would be to assess the execution environment:
Use a tool such as env_audit (see Chapter 6) to search for process information leaks.
Search the filesystem to locate executable binaries, as well as files and directories you can read and write.
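For example, checks along these lines (run as the account you control) will reveal world-writable files and directories, which are of particular interest:

$ find / -type f -perm -2 2> /dev/null
$ find / -type d -perm -2 2> /dev/null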
If the source of the web application you are assessing is commonly available, then download it for review. (You can install it later if you determine there is a reason to practice attacking it.) Try to find the exact version used at the target site. Then proceed with the following:
Learn about the application architecture.
Discover how session management is implemented.
Examine the access control mechanisms.
Learn about the way the application interacts with other components.
Read through the source code (if available) for vulnerabilities.
Research whether there are any known vulnerabilities.
The remainder of this section continues with the review under the assumption the source code is unavailable. The principle is the same, except that with the source code you will have much more information to work with.
Map out the entire application structure. A good approach is to use a spider to crawl the site automatically and then review the results manually to fill in the blanks. Many spiders do not handle the HTML <base> tag properly; if the site uses it, you will likely have to do most of the work manually.
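If you do not have a dedicated spider at hand, wget can perform a basic crawl. A sketch that mirrors two levels of a hypothetical site and keeps a log you can review later:

$ wget -r -l 2 -np -o crawl.log http://www.example.com/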
As you are traversing the application, you should note response headers and cookies used by the application. Whenever you discover a page that is a part of a process (for example, a checkout process in an e-commerce application), write the information down. Those pages are candidates for tests against process state management weaknesses.
Look at the source code of every page (here I mean the HTML source code, not the source of the script that generated it), examining JavaScript code and HTML comments. Developers often create a single JavaScript library file and use it for all application modules, so you may end up with a lot of JavaScript code covering the use of an administrative interface.
Enumerate pages that accept parameters. Forms are especially interesting because most of the application functionality resides in them. Give special attention to hidden form fields because applications often do not expect the values of such fields to change.
For each page, write down the following information:
Target URL
Method (GET or POST)
Encoding (usually application/x-www-form-urlencoded; sometimes multipart/form-data)
Parameters (their types and default values)
Whether authentication is required
Whether SSL is required
Notes
You should note all scripts that perform security-sensitive operations, for the following reasons:
File downloads performed through scripts (instead of directly by the web server) may be vulnerable to file disclosure problems.
Scripts that appear to be using page parameters to include files from disk are also candidates for file disclosure attacks.
User registration, login, and pages to handle forgotten passwords are sensitive areas where brute-force attacks may work.
Attempt to access directories directly, hoping to get directory listings and discover new files. Use WebDAV directory listings if WebDAV is available.
If that fails, some of the well-known files may provide more information:
robots.txt (may contain links to hidden folders)
.bash_history
citydesk.xml (contains a list of all site files)
WS_FTP.LOG (contains a record of all FTP transfers)
WEB-INF/ (contains code that should never be accessed directly)
CVS/ (contains a list of files in the folder)
_mm/contribute.xml (Macromedia Contribute configuration)
_notes/<pagename>.mno (Macromedia Contribute file notes)
_baks (Macromedia Contribute backup files)
Mutate existing filenames, appending frequently used backup extensions and sometimes replacing the existing extension with one of the following (a loop to automate this appears after the list):
~
.bak
.BAK
.old
.OLD
.prev
.swp (but with a dot in front of the filename)
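A short loop can automate the mutation for any page you have discovered (index.php is a hypothetical filename; the .swp variant additionally needs the leading dot added to the filename):

$ for ext in .bak .BAK .old .OLD .prev '~'; do
>     curl -s -o /dev/null -w "index.php$ext: %{http_code}\n" \
>         http://www.example.com/index.php$ext
> done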
Finally, attempting to download predictably named files and folders in every existing folder of the site may yield results.
You have collected enough information about the application to analyze three potentially vulnerable areas in every web application:
Session management mechanisms, especially those that are homemade, may be vulnerable to one of the many attacks described in Chapter 10. Session tokens should be examined and tested for randomness.
The login page is possibly the most important page in an application, especially if the application is not open for public registration. One way to attack the authentication method is to look for script vulnerabilities as you would for any other page; perhaps the login page is vulnerable to an SQL injection attack and you can craft a special request to bypass authentication. An alternative is to attempt a brute-force attack. Since HTTP is a stateless protocol, many web applications were not designed to detect multiple authentication failures, which makes them vulnerable to brute-force attacks. Though such attacks leave clearly visible tracks in the error logs, they often go unnoticed because logs are not regularly reviewed. It is trivial to write a custom script (using Perl, for example) to automate brute-force attacks, and most people do just that. You may also be able to use a tool such as Hydra (http://thc.org/thc-hydra/) to do the same without any programming (a sketch of one such command appears below).
The authorization subsystem can be tested once you authenticate with the application. The goal of the tests should be to find ways to perform actions beyond your normal user privileges; this is known as privilege escalation. For example, a frequent authorization problem occurs when a user's unique identifier is used as a script parameter but the script does not check that the identifier belongs to the user who is executing it. When you hear in the news of users being able to see other users' banking details online, the cause was probably a problem of this type. This is known as horizontal privilege escalation. Vertical privilege escalation occurs when you are able to perform an action that can normally be performed only by a different class of user altogether. For example, some applications keep a flag indicating whether the user is privileged in a cookie; in such circumstances, any user can become a privileged user simply by forging the cookie.
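Returning to the brute-force attack mentioned earlier: recent versions of Hydra include an http-post-form module that can drive such an attack against a login form. Everything below other than the tool name is an assumption (the form path, field names, and failure string must be adapted to the target, and the module syntax may differ between Hydra versions), and, as always, such a test requires permission:

$ hydra -l admin -P passwords.txt www.example.com http-post-form \
    "/login.php:username=^USER^&password=^PASS^:Login failed"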
The final step of black-box vulnerability testing requires the application's public interface, its parameterized pages, to be examined to prove (or disprove) that they are susceptible to attack.
If you have already found some known vulnerabilities, you will need to confirm them, so do that first. The rest of the work is a process of going through the list of all pages, fiddling with the parameters, attempting to break the scripts. There is no single straight path to take. You need to understand web application security well, think on your feet, and combine pieces of information to build toward an exploit.
This process is not covered in detail here. Practice using the material available in this chapter and in Chapter 10, and follow the links provided throughout both chapters. You may also want to try the two web application security learning environments (WebMaven and WebGoat) described in Appendix A.
Here is a list of the vulnerabilities you may attempt to find in an application. All of these are described in Chapter 10, with the exception of DoS attacks, which are described in Chapter 5.
SQL injection attacks
XSS attacks
File disclosure flaws
Source code disclosure flaws
Misconfigured access control mechanisms
Application logic flaws
Command execution attacks
Code execution attacks
Session management attacks
Brute-force attacks
Technology-specific flaws
Buffer overflow attacks
White-box testing is the complete opposite of what we have been doing. The goal of black-box testing was to rely only on your own resources and remain anonymous and unnoticed; here we can access anything anywhere (or so the theory goes).
The key to a successful white-box review is having direct contact with, and cooperation from, the developers and the people in charge of system maintenance. Software documentation may be nonexistent, so you will need help from these people to understand the environment to the level required for the assessment.
To begin the review, you need the following:
Complete application documentation and the source code.
Direct access to application developers and system administrators. There is no need for them to be with you all the time; having their telephone numbers combined with a meeting or two will be sufficient.
Unrestricted access to the production server or to an exact system replica. You will need a working system to perform tests since looking at the code is not enough.
The process of white-box testing consists of the following steps:
Architecture review
Configuration review
Functional review
At the end of your white-box testing, you should have a review report that documents your methodology, contains review notes, lists notices, warnings, and errors, and offers recommendations for improvement.
The purpose of the architecture review is to pave the way for the actions ahead. A good understanding of the application is essential for a successful review. You should examine the following:
If you are lucky, the application review will begin with a well-defined security policy in hand. If such a thing does not exist (which is common), you will have difficulties defining what "security" means. Where possible, a subproject should be branched out to create the application security policy. Unless you know what needs to be protected, it will not be possible to determine whether the system is secure enough. If a subproject is not a possibility, you will have to sketch a security policy using common sense. This security policy will suffer from being focused too much on technology and from being based on your assumptions about the business (which may be incorrect). In any case, you will definitely need something to guide you through the rest of the review.
Code review will be the subject of later review steps. At this point, we are only interested in major application modules. A typical example would be an application that consists of a public part and the administrative interfaces.
Applications are built onto libraries that handle common tasks. It is these libraries that interact with the environment and should be the place to look for security problems.
What kind of data is the application storing? How is it stored and where? Is the storage methodology secure enough for that type of data? Authentication information (such as passwords) should be treated as data, too. Here are some common questions: Are passwords stored in plaintext? What about credit card information? Such information should not be stored in plaintext and should not be stored with a method that would allow an attacker to decrypt it on the server.
Which external systems does the application connect to? Most web applications connect to databases. Is the rule of least privilege used?
Further questions to ask yourself at this point are:
Is the application architecture prone to DoS attacks?
Is the application designed in such a way as to allow it to scale to support its users and processing demands?
In a configuration review, you pay attention to the environment the application resides in. You need to ask yourself the following questions:
What operating system is the server running? What kind of protection does it have? What other services does it offer?
Is the server exclusively used for this application? Are many applications sharing the same server? Is it a shared hosting server managed by a third party?
Who has access to the system and how? Shell access is the most dangerous because it gives great flexibility, but other types of access (FTP, CGI scripts) can become equally dangerous with effort and creativity.
To begin your configuration review, create a temporary folder somewhere to store the files you will create during the review, as well as the relevant files you will copy from the application. We will assume the /home/review folder is used for this purpose.
Always preserve the file path when making copies. For example, if you want to preserve /etc/passwd, copy it to the location /home/review/etc/passwd.
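For example, to keep the path intact:

# mkdir -p /home/review/etc
# cp -p /etc/passwd /home/review/etc/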
As you are making copies, ensure you do not copy sensitive data. For example, you do not want to make a copy of the server's private key. If configuration files contain passwords, you should replace them with a note.
There can always be exceptions. If you have a good reason to make a copy of a sensitive file, go ahead and do it. Review results are likely to be classified as sensitive data, too.
Armed with the knowledge of how the application works (or how it should work), we go to the filesystem to assess the configuration. This part of the review starts by creating a record of all files that are part of the application. I find it useful to have a folder tree at the beginning followed by the detailed listing of all files:
# find /home/application/ -type d | sort > /home/review/filelist.txt
# echo >> /home/review/filelist.txt
# ls -albR /home/application >> /home/review/filelist.txt
In the example above, I have assumed the application sits in the /home/application folder. Ideally, all application files will reside within a single folder. If they do not, the review should include all relevant folders. For now we assume we have everything listed in the file filelist.txt.
Continue to use the same file for your notes. It is convenient to have everything in one place. You will need at least two console windows and a browser window to test assumptions you make during the review. In your notes, include the following:
Name of the application and a short description of its purpose
Details about the environment (e.g., the name of the server and whether it is a production server, a development server, or a demo setup for the review)
Your name and email address
Possibly a phone number
Description of the activity (e.g., "Routine web security review")
Make a copy of the web server configuration files first. Then examine the relevant parts of the configuration, making notes as you go. Remember to include the .htaccess files in the review (if used). Record the following information:
Hostnames and web server ports
Web server document root folder(s) and aliases
Extension-based mappings, folders where CGI scripts are allowed to run, and script aliases
Parts of the site that are password-protected
Situations in which access control is based on file or folder names (e.g., ".htaccess files cannot be downloaded")
Situations in which access control is based on client IP address or hostname (e.g., "Access to the administrative interface is allowed only from UK offices")
In most cases, you can copy the server configuration and add your notes to it. Remember your audience will include people who do not know how to configure Apache, so your notes should translate the configuration for them.
Creating a comprehensive checklist of things to look for in web server configuration is difficult. The approach most likely to succeed is to compare the documented requirements (if they exist) with the actual configuration to find flaws. Ask yourself if the web server is configured to mitigate DoS attacks (see Chapter 5).
Applications typically have their own configuration files. You need to know where such files are stored and familiarize yourself with the options. Make copies of the files for record-keeping purposes.
Some applications keep their configuration, or parts of the configuration, in a database. If you find this is the case, you need to dump the configuration part of a database into a file and store the dump as a record.
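For example, if the application keeps its configuration in a MySQL database, a dump of the relevant table can serve as the record. The database and table names below are hypothetical:
# mysqldump -u root -p appdb app_config > /home/review/appdb-app-config.sql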
You will probably be interested in options related to logging and access control. Applications often need their own password to access other parts of the system (e.g., a database), and you should note how those passwords are stored. If the application supports a debugging mode, you need to examine if it is used and how.
Examine how a connection to the database is made. You do not want to see:
A connection based on trust (e.g., “accept all connections from localhost”). This would mean that any local user could gain access to the database.
A connection made with a root account. This account will typically have full access to the database system.
The web application should have minimal database privileges. It is acceptable for an application to use one account to access a database and have full privileges over it. It is not acceptable to be able to access more than one database (think about containment). The application privileges should be further restricted wherever possible (e.g., do not allow the account to drop tables, or give it read-only access to parts of the database).
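As an illustration, on MySQL a recommendation resulting from the review might look roughly like the following GRANT statement, with hypothetical account and database names; it allows the application to manipulate its own data but not to drop tables, create databases, or touch other applications’ data:
# mysql -u root -p -e "GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'appuser'@'localhost'"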
The same concept (“least privilege used”) applies to connections to other types of systems, for example LDAP.
When reviewing file permissions, we are interested in deviations from the default permissions, which are defined as follows:
Application files are owned by the application user (for example, appuser) and the application group (for example, appgrp). The account and the group are not used for other purposes, which also means that no other users should be members of the application group.
Write access is not allowed.
Other users and groups have no access to application files.
As an exception, the web server user is allowed read access for files and is allowed read and execute access for CGI scripts (see Chapter 6).
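Translated into commands, this baseline might be established roughly as follows. This is a sketch only: appuser, appgrp, and the web server user apache are assumptions, and the last line requires a filesystem with POSIX ACL support:
# chown -R appuser:appgrp /home/application
# chmod -R u=rwX,g=,o= /home/application
# setfacl -R -m u:apache:rX /home/application
The ACL grants the web server user read access (and execute access where the execute bit is already set, such as on folders and CGI scripts) without making it a member of the application group.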
We examine the potential for information leakage first, by understanding who is allowed read access to application files. If read access is discovered and it cannot be justified, the discovery is marked as an error. We automate the search using the find utility.
Examine if any setuid (SUID) or setgid (SGID) files are present. Such files allow binaries to run with the privileges of their owner or group (typically root) rather than with those of the user who is executing them. Their presence (though unlikely) may be very dangerous, so it is worth checking for them:
# find /home/application -type f -and \( -perm -4000 -or -perm -2000 \) | xargs ls -adl
The following finds world-readable files, where any system user can read the files and folders:
# find /home/application -perm -4 | xargs ls -adl
The following finds files owned by users other than the application user:
# find /home/application ! -user appuser | xargs ls -adl
The following finds group-readable files, where the group is not the application group:
# find /home/application -perm -40 ! -group appgrp | xargs ls -adl
Allowing users other than the application user write access opens a whole new attack vector and is, therefore, very dangerous. This is especially true for the web server user, because an attacker may be able to abuse the publicly available scripts to create files under the application tree, leading to a code execution compromise.
The following finds world-writable files:
# find /home/application -perm -2 | xargs ls -adl
The following finds files owned by users other than the application user. This includes files owned by the web server user.
# find /home/application ! -user appuser | xargs ls -adl
The following finds group-writable files, in which the group is not the application group (group-writable files are not necessary but there may be a good reason for their existence):
# find /home/application -perm -20 ! -group appgrp | xargs ls -adl
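Because a later step refers back to these results, it helps to collect the output of the permission checks into a single report file; the exact combination of tests is up to you:
# find /home/application -perm -2 -or -perm -20 -or ! -user appuser | xargs ls -adl > /home/review/permissions.txt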
We now go through the file listing, trying to understand the purpose of each file and make a judgment as to whether it is in the right place and whether the permissions are configured properly. Here is advice regarding the different types of files:
Datafiles should never be stored under the web server tree. No user other than the application user should have access to them.
Library files should never be kept under the web server tree either, but they are sometimes found there. This is relatively safe (but not ideal) provided the extension used is seen by the web server as that of a script. Otherwise, having such files under the web server tree is a configuration error. For example, some programmers use a .inc extension for PHP library files or a .class extension for individual PHP classes. These will probably not be recognized as PHP scripts.
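A quick way to spot such library files is to search the public part of the site for the giveaway extensions; the path and the extension list below are only examples:
# find /home/application/public_html -name "*.inc" -or -name "*.class" | xargs ls -adl
Any hits should either be moved out of the web server tree or mapped to the script handler in the web server configuration.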
Another class of problem files consists of temporary files placed under the web server tree for download and “special” folders that can be accessed by anyone who knows their names. Such files do not belong on a web site. Temporary files should be moved to the assessment storage area immediately. If there is a genuine need for functionality that does not yet exist (for example, secure download of certain files), a note should be made to implement that functionality securely.
If file upload is allowed, the folder where writing is allowed should be configured not to allow script or code execution. Anything other than that is a code execution compromise waiting to happen.
All sorts of files end up under the web server tree. Archives, backup files created by editors, and temporary files are dangerous as they can leak system information.
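A few additional find patterns go a long way toward locating such leftovers; the extension list is only a starting point:
# find /home/application -name "*.bak" -or -name "*~" -or -name "*.old" -or -name "*.tar.gz" -or -name "*.zip" | xargs ls -adl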
At the end of this step, we go back to the file permission report and note as errors any assigned permissions that are not essential for the application to function properly.
The next step is to examine parts of the source code. A full source code review is expensive and often not economical (plus it requires very good understanding of programming and the technology used, an understanding only developers can have). To meet our own goals, we perform a limited review of the code:
Basic review to understand how the application works
Review of critical application components
Review of hot spots, the parts of the code most vulnerable to attacks
In basic application review, you browse through the source code, locate the libraries, and examine the general information flow. The main purpose of the review is to identify the application building blocks, and review them one by one.
Web applications are typically built on top of infrastructure that is designed to handle common web-related tasks. This is the layer where many security issues are found. I say “typically” because the use of libraries is a best practice and not a mandatory activity. Badly designed applications will have the infrastructure tasks handled by the same code that provides the application functionality. It is a bad sign if you cannot identify the following basic building blocks:
Input data should never be accessed directly. Individual bits of data should first be validated for type (“Is it a number?”) and meaning (“Birth dates set in the future are not valid”). It is generally accepted that the correct strategy to deal with input is to accept what you know is valid (as opposed to trying to filter out what you know is not).
To prevent XSS attacks, output should be properly escaped. The correct way to perform escaping depends on the context. In the case of HTML files, the metacharacters < (less than), > (greater than), & (ampersand), ' (single quote), and " (double quote) should be replaced with their safe equivalents: &lt;, &gt;, &amp;, &#39;, and &quot;, respectively. (Remember that an HTML file can contain other types of content, such as JavaScript, and the escaping rules can be different for them.)
Examine how database queries are constructed. The ideal way is through use of prepared statements. Constructing queries through string concatenation is easy to get wrong even if special care is taken.
Examine the interaction with systems other than databases. For example, in the case of LDAP, you want to see the LDAP query properly constructed to avoid the possibility of LDAP injection.
Examine the session management mechanisms for weaknesses (as described in Chapter 10).
Examine the code that performs access control. Does it make sense? You are looking to spot dumb mistakes here, such as storing information in cookies or performing authentication only at the gate, which lets those who know the layout of the application straight through.
The application should have an error log and an audit log. It should actively work to log relevant application events (e.g., users logging in, users logging out, users accessing documents). If, as recommended, you did black-box testing, you should look in the log files for your own traces. Learning how to catch yourself will help catch others.
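For example, to find your own traces in the web server access log, search for the address you tested from; the log location and the IP address below are placeholders:
# grep 192.0.2.1 /var/log/httpd/access_log | less
If the application keeps its own audit log, search it in the same way.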
You should look for application hot spots by examining scripts that contain “dangerous” functions, which include those for:
File manipulation
Database interaction
Process execution
Access to input data
Some hot spots must be detected manually by using the application. For others, you can use the find and grep tools to search through the source code and tell you where the hot spots are.
First, create a grep pattern file, for example hotspots.txt, where each line contains a pattern that will match one function you would like to review. A list of patterns related to external process invocation under PHP looks like this:
exec
passthru
proc_open
shell_exec
system
`
popen
Next, tell grep to search through all PHP files. If other extensions are also used, be sure to include extensions other than the .php one shown.
# find . -name "*.php" | xargs grep -n -f hotspots.txt
If you find too many false positives, create a file notspots.txt and fill it with negative patterns (I needed to exclude the pg_exec pattern, for example). Then use another grep process to filter out the negative patterns:
# find . -name "*.php" | xargs grep -n -f hotspots.txt | grep -v -f notspots.txt
After you find a set of patterns that works well, store it for use in future reviews.
In the third and final phase of security assessment, the black-box testing procedures are executed again but this time using the knowledge acquired in the white-box testing phase. This is similar to the type of testing an attacker might do when he has access to the source code, but here you have a slight advantage because you know the layout of the files on disk, the configuration, and changes made to the original source code (if any). This time you are also allowed to have access to the target system while you are testing it from the outside. For example, you can look at the application logs to discover why some of your attacks are failing.
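For example, while repeating an attack from the outside, you can watch the error log on the server in real time; the log location below is an assumption, so use whatever path the configuration review uncovered:
# tail -f /var/log/httpd/error_log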
The gray-box testing phase is the time to confirm or deny the assumptions about vulnerabilities you made in the black-box phase. For example, maybe you thought Apache was vulnerable to a particular problem but you did not want to try to exploit it at that time. Looking at it from the inside, it is much easier and quicker to determine if your assumption was correct.