LinuxGuruz Next Previous Contents

6. Using iptables

iptables has a fairly detailed manual page (man iptables), and if you need more detail on particulars. Those of you familiar with ipchains may simply want to look at Differences Between iptables and ipchains; they are very similar.

There are several different things you can do with iptables. First the operations to manage whole chains. You start with three built-in chains input, output and forward which you can't delete.

  1. Create a new chain (-N).
  2. Delete an empty chain (-X).
  3. Change the policy for a built-in chain. (-P).
  4. List the rules in a chain (-L).
  5. Flush the rules out of a chain (-F).
  6. Zero the packet and byte counters on all rules in a chain (-Z).

There are several ways to manipulate rules inside a chain:

  1. Append a new rule to a chain (-A).
  2. Insert a new rule at some position in a chain (-I).
  3. Replace a rule at some position in a chain (-R).
  4. Delete a rule at some position in a chain (-D).
  5. Delete the first rule that matches in a chain (-D).

6.1 What You'll See When Your Computer Starts Up

At the moment (Linux 2.3.15), iptables is a module, called (`iptables.o'). You'll need to insert it into your kernel before you can use the iptables command. In future, it will be possible to build it in to the kernel.

Before any iptables commands have been run (be careful: some distributions will run iptables in their initialization scripts), there will be no rules in any of the built-in chains (`INPUT', `FORWARD' and `OUTPUT'), the INPUT and OUTPUT chains will have a policy of ACCEPT, and the FORWARD chain will have a policy of DROP (you can override this by providing the `forward=1' option to the iptables module).

6.2 Operations on a Single Rule

This is the bread-and-butter of packet filtering; manipulating rules. Most commonly, you will probably use the append (-A) and delete (-D) commands. The others (-I for insert and -R for replace) are simple extensions of these concepts.

Each rule specifies a set of conditions the packet must meet, and what to do if it meets them (a `target'). For example, you might want to drop all ICMP packets coming from the IP address 127.0.0.1. So in this case our conditions are that the protocol must be ICMP and that the source address must be 127.0.0.1. Our target is `DROP'.

127.0.0.1 is the `loopback' interface, which you will have even if you have no real network connection. You can use the `ping' program to generate such packets (it simply sends an ICMP type 8 (echo request) which all cooperative hosts should obligingly respond to with an ICMP type 0 (echo reply) packet). This makes it useful for testing.

# ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.2 ms

--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.2/0.2/0.2 ms
# iptables -A INPUT -s 127.0.0.1 -p icmp -j DROP
# ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes

--- 127.0.0.1 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
# 

You can see here that the first ping succeeds (the `-c 1' tells ping to only send a single packet).

Then we append (-A) to the `INPUT' chain, a rule specifying that for packets from 127.0.0.1 (`-s 127.0.0.1') with protocol ICMP (`-p icmp') we should jump to DROP (`-j DROP').

Then we test our rule, using the second ping. There will be a pause before the program gives up waiting for a response that will never come.

We can delete the rule in one of two ways. Firstly, since we know that it is the only rule in the input chain, we can use a numbered delete, as in:

        # iptables -D INPUT 1
        #
To delete rule number 1 in the INPUT chain.

The second way is to mirror the -A command, but replacing the -A with -D. This is useful when you have a complex chain of rules and you don't want to have to count them to figure out that it's rule 37 that you want to get rid of. In this case, we would use:

        # ipchains -D INPUT -s 127.0.0.1 -p icmp -j DROP
        #
The syntax of -D must have exactly the same options as the -A (or -I or -R) command. If there are multiple identical rules in the same chain, only the first will be deleted.

6.3 Filtering Specifications

We have seen the use of `-p' to specify protocol, and `-s' to specify source address, but there are other options we can use to specify packet characteristics. What follows is an exhaustive compendium.

Specifying Source and Destination IP Addresses

Source (`-s', `--source' or `--src') and destination (`-d', `--destination' or `--dst') IP addresses can be specified in four ways. The most common way is to use the full name, such as `localhost' or `www.linuxhq.com'. The second way is to specify the IP address such as `127.0.0.1'.

The third and fourth ways allow specification of a group of IP addresses, such as `199.95.207.0/24' or `199.95.207.0/255.255.255.0'. These both specify any IP address from 199.95.207.0 to 199.95.207.255 inclusive; the digits after the `/' tell which parts of the IP address are significant. `/32' or `/255.255.255.255' is the default (match all of the IP address). To specify any IP address at all `/0' can be used, like so:

        # ipchains -A input -s 0/0 -j DENY
        #

This is rarely used, as the effect above is the same as not specifying the `-s' option at all.

Specifying Inversion

Many flags, including the `-s' and `-d' flags can have their arguments preceded by `!' (pronounced `not') to match addresses NOT equal to the ones given. For example. `-s ! localhost' matches any packet not coming from localhost.

Specifying Protocol

The protocol can be specified with the `-p' flag. Protocol can be a number (if you know the numeric protocol values for IP) or a name for the special cases of `TCP', `UDP' or `ICMP'. Case doesn't matter, so `tcp' works as well as `TCP'.

The protocol name can be prefixed by a `!', to invert it, such as `-p ! TCP'.

Specifying an Interface

The `-i' (or `--in-interface') and `-o' (or `--out-interface') options specify the name of an interface to match. An interface is the physical device the packet came in on (`-i') or is going out on (`-o'). You can use the ifconfig command to list the interfaces which are `up' (ie. working at the moment).

Packets traversing the INPUT chain don't have an output interface, so any rule using `-o' in this chain will never match. Similarly, packets traversing the OUTPUT chain don't have an input interface, so any rule using `-i' in this chain will never match.

Only packets traversing the FORWARD chain have both an input and output interface.

It is perfectly legal to specify an interface that currently does not exist; the rule will not match anything until the interface comes up. This is extremely useful for dial-up PPP links (usually interface ppp0) and the like.

As a special case, an interface name ending with a `+' will match all interfaces (whether they currently exist or not) which begin with that string. For example, to specify a rule which matches all PPP interfaces, the -i ppp+ option would be used.

The interface name can be preceded by a `!' to match a packet which does NOT match the specified interface(s).

Specifying Fragments

Sometimes a packet is too large to fit down a wire all at once. When this happens, the packet is divided into fragments, and sent as multiple packets. The other end reassembles these fragments to reconstruct the whole packet.

The problem with fragments is that after the IP header is a part of the internal packet: looking inside the packet for protocol headers (such as is done by the TCP, UDP and ICMP extensions) is not possible, as these headers are only contained in the first fragment.

If you are doing connection tracking or NAT, then all fragments will get merged back together before they reach the packet filtering code, so you need never worry about fragments. Otherwise, you can insert the tiny `ip_defrag.o' module which performs the same task (note, this is only allowed if you are the only connection between the two networks).

Otherwise, it is important to understand how fragments get treated by the filtering rules. Any filtering rule that asks for information we don't have will not match. This means that the first fragment is treated like any other packet. Second and further fragments won't be. Thus a rule -p TCP --sport www (specifying a source port of `www') will never match a fragment (other than the first fragment). Neither will the opposite rule -p TCP --sport ! www.

However, you can specify a rule specifically for second and further fragments, using the `-f' (or `--fragment') flag. It is also legal to specify that a rule does not apply to second and further fragments, by preceding the `-f' with `!'.

Usually it is regarded as safe to let second and further fragments through, since filtering will effect the first fragment, and thus prevent reassembly on the target host, however, bugs have been known to allow crashing of machines simply by sending fragments. Your call.

Note for network-heads: malformed packets (TCP, UDP and ICMP packets too short for the firewalling code to read the ports or ICMP code and type) are dropped when such examinations are attempted. So are TCP fragments starting at position 8.

As an example, the following rule will drop any fragments going to 192.168.1.1:

 
# iptables -A OUTPUT -f -d 192.168.1.1 -j DROP
#

Extensions to iptables: New Tests

iptables is extensible, meaning that both the kernel and the iptables tool can be extended to provide new features.

Some of these extensions are standard, and other are more exotic. Extensions can be made by other people and distributed separately for niche users.

Kernel extensions normally live in the kernel module subdirectory, such as /lib/modules/2.3.15/net. They are currently (Linux 2.3.15) not demand loaded, so you will need to manually insert the ones you want. In future they may be loaded on-demand again.

Extensions to the iptables program are shared libraries which live usually live in /usr/local/lib/iptables/, although a distribution would put them in /lib/iptables or /usr/lib/iptables.

Extensions come in two types: new targets, and new tests; we'll talk about new targets below. Some protocols automatically offer new tests: currently these are TCP, UDP and ICMP as shown below.

For these you will be able to specify the new tests on the command line after the `-p' option, which will load the extension. For explicit new tests, the `-m' option to load the extension, after which the extended options will be available.

To get help on an extension, use the option to load it (`-p' or `-m') followed by `-h' or `--help'.

TCP Extensions

The TCP extensions are automatically loaded if `--protocol tcp' is specified, and no other match is specified. It provides the following options (none of which match fragments).

--tcp-flags

Followed by an optional `!', then two strings of flags, allows you to filter on specific TCP flags. The first string of flags is the mask: a list of flags you want to examine. The second string of flags tells which one(s) should be set. For example,

# iptables -A INPUT --protocol tcp --tcp-flags ALL SYN,ACK -j DENY

This indicates that all flags should be examined (`ALL' is synonymous with `SYN,ACK,FIN,RST,URG,PSH'), but only SYN and ACK should be set. There is also an argument `NONE' meaning no flags.

--syn

Optionally preceded by a `!', this is shorthand for `--tcp-flags SYN,RST,ACK SYN'.

--source-port

followed by an optional `!', then either a single TCP port, or a range of ports. Ports can be port names, as listed in /etc/services, or numeric. Ranges are either two port names separated by a `-', or (to specify greater than or equal to a given port) a port with a `-' appended, or (to specify less than or equal to a given port), a port preceded by a `-'.

--sport

is synonymous with `--source-port'.

--destination-port

and

--dport

are the same as above, only they specify the destination, rather than source, port to match.

--tcp-option

followed by an optional `!' and a number, matches a packet with a TCP option equaling that number. A packet which does not have a complete TCP header is dropped automatically if at attempt is made to examine its TCP options.

An Explanation of TCP Flags

It is sometimes useful to allow TCP connections in one direction, but not the other. For example, you might want to allow connections to an external WWW server, but not connections from that server.

The naive approach would be to block TCP packets coming from the server. Unfortunately, TCP connections require packets going in both directions to work at all.

The solution is to block only the packets used to request a connection. These packets are called SYN packets (ok, technically they're packets with the SYN flag set, and the FIN and ACK flags cleared, but we call them SYN packets for short). By disallowing only these packets, we can stop attempted connections in their tracks.

The `--syn' flag is used for this: it is only valid for rules which specify TCP as their protocol. For example, to specify TCP connection attempts from 192.168.1.1:

-p TCP -s 192.168.1.1 --syn

This flag can be inverted by preceding it with a `!', which means every packet other than the connection initiation.

UDP Extensions

These extensions are automatically loaded if `--protocol udp' is specified, and no other match is specified. It provides the options `--source-port', `--sport', `--destination-port' and `--dport' as detailed for TCP above.

ICMP Extensions

This extension is automatically loaded if `--protocol icmp' is specified, and no other match is specified. It provides only one new option:

--icmp-type

followed by an optional `!', then either an icmp type name (eg `host-unreachable'), or a numeric type (eg. `3'), or a numeric type and code separated by a `/' (eg. `3/3'). A list of available icmp type names is given using `-p icmp --help'.

Other Match Extensions

The other two extensions in the netfilter package are demonstration extensions, which (if installed) can be invoked with the `-m' option.

mac

This module must be explicitly specified with `-m mac' or `--match mac'. It is used for matching incoming packet's source Ethernet (MAC) address, and thus only useful for packets traversing the INPUT and FORWARD chains. It provides only one option:

--mac-source

followed by an optional `!', then an ethernet address in colon-separated hexbyte notation, eg `--mac-source 00:60:08:91:CC:B7'.

limit

This module must be explicitly specified with `-m limit' or `--match limit'. It is used to restrict the rate of matches, such as for suppressing log messages. It will only match a given number of times per second (by default 3 matches per hour, with a burst of 5). It takes two optional arguments:

--limit

followed by a number; specifies the maximum average number of matches to allow per second. The number can specify units explicitly, using `/second', `/minute', `/hour' or `/day', or parts of them (so `5/second' is the same as `5/s').

--limit-burst

followed by a number, indicating the maximum burst before the above limit kicks in.

This match can often be used with the LOG target to do rate-limited logging. To understand how it works, let's look at the following rule, which logs packets with the default limit parameters:

# iptables -A FORWARD -m limit -j LOG

The first time this rule is reached, the packet will be logged; in fact, since the default burst is 5, the first five packets will be logged. After this, it will be twenty minutes before a packet will be logged from this rule, regardless of how many packets reach it. Also, every twenty minutes which passes without matching a packet, one of the burst will be regained; if no packets hit the rule for 100 minutes, the burst will be fully recharged; back where we started.

You cannot currently create a rule with a recharge time greater than about 59 hours, so if you set an average rate of one per day, then your burst rate must be less than 3.

unclean

This module must be explicitly specified with `-m unclean or `--match unclean'. It does various random sanity checks on packets. This module has not been audited, and should not be used as a security device (it probably makes things worse, since it may well have bugs itself). It provides no options.

6.4 Target Specifications

Now we know what examinations we can do on a packet, we need a way of saying what to do to the packets which match our tests. This is called a rule's target.

There are two very simple built-in targets: DROP and ACCEPT. We've already met them. If a rule matches a packet and its target is one of these two, no further rules are consulted: the packet's fate has been decided.

There are two types of targets other than the built-in ones: extensions and user-defined chains.

User-defined chains

One powerful feature which iptables inherits from ipchains is the ability for the user to create new chains, in addition to the three built-in ones (INPUT, FORWARD and OUTPUT). By convention, user-defined chains are lower-case to distinguish them (we'll describe how to create new user-defined chains below in Operations on an Entire Chain below).

When a packet matches a rule whose target is a user-defined chain, the packet begins traversing the rules in that user-defined chain. If that chain doesn't decide the fate of the packet, then once traversal on that chain has finished, traversal resumes on the next rule in the current chain.

Time for more ASCII art. Consider two (silly) chains: INPUT (the built-in chain) and test (a user-defined chain).

         `INPUT'                         `test'
        ----------------------------    ----------------------------
        | Rule1: -p ICMP -j DROP   |    | Rule1: -s 192.168.1.1    |
        |--------------------------|    |--------------------------|
        | Rule2: -p TCP -j test    |    | Rule2: -d 192.168.1.1    |
        |--------------------------|    ----------------------------
        | Rule3: -p UDP -j DROP    |
        ----------------------------

Consider a TCP packet coming from 192.168.1.1, going to 1.2.3.4. It enters the INPUT chain, and gets tested against Rule1 - no match. Rule2 matches, and its target is test, so the next rule examined is the start of test. Rule1 in test matches, but doesn't specify a target, so the next rule is examined, Rule2. This doesn't match, so we have reached the end of the chain. We return to the INPUT chain, where we had just examined Rule2, so we now examine Rule3, which doesn't match either.

So the packet path is:

                                v    __________________________
         `INPUT'                |   /    `test'                v
        ------------------------|--/    -----------------------|----
        | Rule1                 | /|    | Rule1                |   |
        |-----------------------|/-|    |----------------------|---|
        | Rule2                 /  |    | Rule2                |   |
        |--------------------------|    -----------------------v----
        | Rule3                 /--+___________________________/
        ------------------------|---
                                v

User-defined chains can jump to other user-defined chains (but don't make loops: you're packets will be dropped if they're found to be in a loop).

Extensions to iptables: New Targets

The other type of target is an extension. A target extension consists of a kernel module, and an optional extension to iptables to provide new command line options. There are several extensions in the default netfilter distribution:

LOG

This module provides kernel logging of matching packets. It provides these additional options:

--log-level

Followed by a level number or name. Valid names are (case-insensitive) `debug', `info', `notice', `warning', `err', `crit', `alert' and `emerg', corresponding to numbers 7 through 0. See the man page for syslog.conf for an explanation of these levels.

--log-prefix

Followed by a string of up to 14 characters, this message is sent at the start of the log message, to allow it to be uniquely identified.

This module is most useful after a limit target, so you don't flood your logs.

REJECT

This module has the same effect as `DROP', except that the sender is sent an ICMP `port unreachable' error message. Note that the ICMP error message is not sent if (see RFC 1122):

Special Built-In Targets

There are two special built-in targets: RETURN and QUEUE.

RETURN has the same effect of falling off the end of a chain: for a rule in a built-in chain, the policy of the chain is executed. For a rule in a user-defined chain, the traversal continues at the previous chain, just after the rule which jumped to this chain.

QUEUE is a special target, which queues the packet for userspace processing. Unless there is something waiting for the packet (ie. a program which hasn't been written yet), the packet will be dropped.

6.5 Operations on an Entire Chain

A very useful feature of iptables is the ability to group related rules into chains. You can call the chains whatever you want, but I recommend using lower-case letters to avoid confusion with the built-in chains and targets. Chain names can be up to 16 letters long.

Creating a New Chain

Let's create a new chain. Because I am such an imaginative fellow, I'll call it test. We use the `-N' or `--new-chain' options:

# iptables -N test
#

It's that simple. Now you can put rules in it as detailed above.

Deleting a Chain

Deleting a chain is simple as well, using the `-X' or `--delete-chain' options. Why `-X'? Well, all the good letters were taken.

# iptables -X test
# 

There are a couple of restrictions to deleting chains: they must be empty (see Flushing a Chain below) and they must not be the target of any rule. You can't delete any of the three built-in chains.

If you don't specify a chain, then all user-defined chains will be deleted, if possible.

Flushing a Chain

There is a simple way of emptying all rules out of a chain, using the `-F' (or `--flush') commands.

        # ipchains -F forward
        # 

If you don't specify a chain, then all chains will be flushed.

Listing a Chain

You can list all the rules in a chain by using the `-L' command.

The `refcnt' listed for each user-defined chain is the number of rules which have that chain as their target. This must be zero (and the chain be empty) before this chain can be deleted.

If the chain name is omitted, all chains are listed, even empty ones.

There are three options which can accompany `-L'. The `-n' (numeric) option is very useful as it prevents iptables from trying to lookup the IP addresses, which (if you are using DNS like most people) will cause large delays if your DNS is not set up properly, or you have filtered out DNS requests. It also causes TCP and UDP ports to be printed out as numbers rather than names.

The `-v' options shows you all the details of the rules, such as the the packet and byte counters, the TOS comparisons, and the interfaces. Otherwise these values are omitted.

Note that the packet and byte counters are printed out using the suffixes `K', `M' or `G' for 1000, 1,000,000 and 1,000,000,000 respectively. Using the `-x' (expand numbers) flag as well prints the full numbers, no matter how large they are.

Resetting (Zeroing) Counters

It is useful to be able to reset the counters. This can be done with the `-Z' (or `--zero') option.

The problem with this approach is that sometimes you need to know the counter values immediately before they are reset. In the above example, some packets could pass through between the `-L' and `-Z' commands. For this reason, you can use the `-L' and `-Z' together, to reset the counters while reading them.

Setting Policy

We glossed over what happens when a packet hits the end of a built-in chain when we discussed how a packet walks through chains earlier. In this case, the policy of the chain determines the fate of the packet. Only built-in chains (INPUT, OUTPUT and FORWARD) have policies, because if a packet falls off the end of a user-defined chain, traversal resumes at the previous chain.

The policy can be either ACCEPT or DROP.

Using ipchains and ipfwadm

There are modules in the netfilter distribution called ipchains.o and ipfwadm.o. Insert one of these in your kernel (NOTE: they are incompatible with iptables.o, ip_conntrack.o and ip_nat.o!). Then you can use ipchains or ipfwadm just like the good old days.

This will be supported for some time yet. I think a reasonable formula is 2 * [notice of replacement - initial stable release], beyond the date that a stable release of the replacement is available.

This means that for ipfwadm, the end of support is (FIXME: get real dates):

2 * [October 1997 (2.1.102 release) - March 1995 (ipfwadm 1.0)] 
        + January 1999 (2.2.0 release)
    = November 2003.

This means that for ipchains, the end of support is (FIXME: get real dates):

2 * [August 1999 (2.3.15 release) - October 1997 (2.2.0 release)] 
        + January 2000 (2.3.0 release?)
    = September 2003.

So you don't have to worry until 2004.


LinuxGuruz Next Previous Contents