The IP and TCP Protocols

This section explains the main functions performed by the TCP and IP protocols.

Services provided by IP

IP has two major functions: addressing and fragmentation. With regard to addressing, IP provides an unreliable, connectionless datagram delivery service. If an error occurs during the transmission of an IP datagram, IP does not attempt to correct it. It discards the datagram and, in most cases, sends an ICMP error message back to the host from which the datagram originated. IP treats each datagram as an independent entity: datagrams sent to a particular host do not have to follow the same path to that host, and they may be delivered out of order.

The maximum size of an IP datagram is determined by the maximum transmission unit (MTU) for the physical link layer. The link layer can (and is likely to) change as the packet moves from source to destination. Therefore, the MTU can (and is likely to) change over the route. If an IP datagram is larger than the MTU of the link layer, the datagram is fragmented to fit within the bounds of the MTU. These fragments are not reassembled until they reach the destination host, and if any of the fragments fail to reach their destination, the entire datagram has to be retransmitted. IP is responsible for fragmenting and reassembling the datagram.
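
The fragmentation arithmetic can be illustrated with a short sketch. It assumes a fixed 20 byte header with no options, and the function name fragment is made up for this example. The fragment offset is counted in units of 8 bytes, so every fragment except the last must carry a multiple of 8 bytes of data.

```python
def fragment(payload_len, mtu, header_len=20):
    """Return (offset_in_8_byte_units, data_length, more_fragments) per fragment.

    Assumes every fragment carries a header of header_len bytes (no options).
    """
    # Largest multiple of 8 data bytes that fits in one link-layer frame.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len  # "more fragments" flag
        fragments.append((offset // 8, length, more))
        offset += length
    return fragments

# A 4000 byte payload crossing a link with an MTU of 1500 bytes.
for frag in fragment(payload_len=4000, mtu=1500):
    print(frag)
```

With these numbers the payload is split into two fragments of 1480 data bytes and a final fragment of 1040 bytes.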

IP merely provides a best effort service to get the datagrams to their destination. The datagrams may get there out of order, or may not get delivered at all. The task of ensuring that the datagrams get there in order and are all delivered is assigned to TCP.

Services provided by TCP

TCP provides a reliable byte stream service, i.e. it guarantees that all the data will be delivered to the other end, and that the data will be delivered to the application layer in the same order that it was sent, without any duplicate segments. TCP also provides a flow control mechanism through which each end of the connection can determine how much data the other end is currently prepared to receive.

TCP is a connection-oriented protocol. A TCP connection has three well-defined phases: connection establishment, data transfer and connection release. TCP is a full duplex service, i.e. the connection can carry data in both directions at the same time.

The maximum size of a TCP segment is determined by the Maximum Segment Size (MSS) option which is specified in the connection establishment handshake. At any point during the data transfer phase, the size of the TCP segment sent depends on the MSS and the other end's advertised window size (which specifies how much data the other end is willing to accept at that point in time) and the amount of data to be sent. If the size of the data stream from the application layer is bigger than the MSS or the window size, the data is broken up into segments to fit the lower of these two values. These segments are reassembled at their destination in the correct order. The resulting data stream that is returned to the application layer at the destination will match what was sent.
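
As a rough sketch of the segmentation and reassembly described above, the following splits a byte stream using the lower of the MSS and the advertised window, then rebuilds it from segments that arrive out of order and with a duplicate. The function names are hypothetical, and the window is assumed to stay constant, which a real connection does not guarantee.

```python
def segment(data, mss, window):
    """Split data into (sequence_offset, chunk) pairs."""
    size = min(mss, window)  # the lower of the two limits applies
    if size <= 0:
        raise ValueError("receiver is not accepting data")
    return [(seq, data[seq:seq + size]) for seq in range(0, len(data), size)]

def reassemble(segments):
    """Rebuild the byte stream, ignoring duplicates and arrival order."""
    stream = bytearray()
    for seq, chunk in sorted(set(segments)):
        if seq == len(stream):  # exactly the next expected byte
            stream += chunk
    return bytes(stream)

data = bytes(range(256)) * 10            # 2560 bytes of application data
segments = segment(data, mss=1460, window=1000)
arrived = segments[::-1] + segments[:1]  # reversed order plus one duplicate
print(len(segments), reassemble(arrived) == data)
```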

TCP and UDP ports

Most of the application layer protocols are layered on top of either TCP or UDP. Therefore, TCP and UDP have to concurrently handle the many disparate data streams sent to them from the application layer. Demultiplexing would be impossible without some means of identifying which data stream a particular segment of data belongs to. Both TCP and UDP provide a port identifier to uniquely identify each data stream. A set of standard port numbers is assigned by the IANA to the major application layer protocols. These 'well known' ports make it easier to find the corresponding server for a particular protocol on a host. For example, SMTP servers can usually be found on port 25 and HTTP servers are usually on port 80.

The combination of a TCP/UDP port and an IP address uniquely identifies a particular service on a particular host. The term 'socket' is usually used to refer to the combination of the TCP/UDP port and IP address.
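
One way to picture demultiplexing is as a lookup table keyed by protocol, IP address and port. This is only an illustration: the listeners table, the two functions and the service names are invented for the example, and a real protocol stack keeps considerably more state.

```python
# Hypothetical table of listening services, keyed by (protocol, ip, port).
listeners = {}

def bind(protocol, ip, port, service):
    """Register a service on a socket address; each address can be used once."""
    key = (protocol, ip, port)
    if key in listeners:
        raise OSError(f"{protocol} port {port} on {ip} is already in use")
    listeners[key] = service

def demultiplex(protocol, dst_ip, dst_port):
    """Return the service that should receive the data, or None."""
    return listeners.get((protocol, dst_ip, dst_port))

bind("tcp", "150.203.1.10", 25, "smtp-server")
bind("tcp", "150.203.1.10", 80, "http-server")

print(demultiplex("tcp", "150.203.1.10", 80))  # http-server
print(demultiplex("udp", "150.203.1.10", 80))  # None: TCP and UDP ports are separate
```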

IP (Internet Protocol)

The IP header

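An IPv4 header is usually 20 bytes long, but it can have up to an additional 40 bytes of options. The fixed portion carries the IP version and header length, the type of service, the total length of the datagram, the identification, flags and fragment offset used for fragmentation, the time to live (TTL), the upper layer protocol, a header checksum, and the source and destination IP addresses.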
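
The layout of the fixed 20 bytes can be sketched with Python's struct module. The function name and the dictionary keys are chosen for this example only, and the sample header is built by hand (with a zero checksum) rather than captured from a network.

```python
import socket
import struct

def parse_ipv4_header(raw):
    """Decode the fixed 20 byte portion of an IPv4 header."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # stored in 32 bit words
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "flags": flags_frag >> 13,            # top 3 bits
        "frag_offset": flags_frag & 0x1FFF,   # in units of 8 bytes
        "ttl": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample: version 4, 20 byte header, "don't fragment" set, TTL 64, TCP.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                     socket.inet_aton("150.203.1.10"),
                     socket.inet_aton("192.168.0.1"))
print(parse_ipv4_header(sample))
```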

IP addresses

An IP address is a 32 bit number which uniquely identifies every single interface on the Internet. IP addresses are usually represented in dotted decimal notation, i.e. four 8 bit numbers separated by dots, e.g. 150.203.1.10.

IP addresses can be subdivided into a network portion and a host portion. The number of bits that make up the network portion of the address is specified by the subnet mask. The subnet mask is a 32 bit number with ones in all the bits of the network portion of the address and zeroes in all the bits of the host portion. For example, if the host portion of the address is made up of 8 bits, the subnet mask is 255.255.255.0.
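
The relationship between the mask and the two portions of the address is plain bit arithmetic, sketched below with the standard ipaddress module. The helper names are made up for this example.

```python
import ipaddress

def mask_for_host_bits(host_bits):
    """Dotted decimal subnet mask for a given number of host bits."""
    mask = (0xFFFFFFFF << host_bits) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))

def split_address(address, mask):
    """Return (network portion as an address, host portion as an integer)."""
    addr = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    network = ipaddress.IPv4Address(addr & m)
    host = addr & ~m & 0xFFFFFFFF
    return str(network), host

print(mask_for_host_bits(8))                            # 255.255.255.0
print(split_address("150.203.1.10", "255.255.255.0"))   # ('150.203.1.0', 10)
```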

IP addresses are also divided into classes — each IP address class has a fixed number of bits allocated to the network portion of the address. Given its own IP address and subnet mask, and the IP address of another node, a node should be able to determine whether or not that node is on the same network, and whether or not that host is on the same subnet.
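
A minimal sketch of both checks follows. The class boundaries are the classic first-octet ranges, and the function names are hypothetical.

```python
import ipaddress

def address_class(address):
    """Classful category of an IPv4 address, decided by its first octet."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return "A"   # 8 network bits
    if first_octet < 192:
        return "B"   # 16 network bits
    if first_octet < 224:
        return "C"   # 24 network bits
    if first_octet < 240:
        return "D"   # multicast
    return "E"       # reserved

def same_subnet(own_ip, mask, other_ip):
    """True if other_ip has the same network portion as own_ip under mask."""
    network = ipaddress.IPv4Network(f"{own_ip}/{mask}", strict=False)
    return ipaddress.IPv4Address(other_ip) in network

print(address_class("150.203.1.10"))                                # B
print(same_subnet("192.168.0.5", "255.255.255.0", "192.168.0.77"))  # True
print(same_subnet("192.168.0.5", "255.255.255.0", "192.168.1.77"))  # False
```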

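An IP address with zeroes in all the bits of the host portion of the address is called a network address and identifies the network itself. An address with ones in all the bits of the host portion is the broadcast address, which is used to address all the hosts on that network. IP addresses with any other host portion address a specific host.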

IP routing

IP routing is done hop by hop. A host sending an IP datagram does not need to know the complete path to the destination host except where the destination host is on the same subnet (i.e. where only the host portion of the address is different). Under most other circumstances, it only has to know the address of the node on the next hop, to which the datagram is delivered. The node on the next hop will then forward the datagram on to the next node, assuming that each node gets the datagram closer to its intended destination.

The TTL value in the IP header is an upper bound on the lifetime of the datagram and exists to prevent a datagram from looping forever on the network (in case each hop does not actually take the datagram any closer to its destination). It is initialised by the originating host and decremented by 1 by each router that forwards the datagram.

If the TTL reaches 0 before the datagram gets to its destination, the datagram is discarded and an ICMP Time Exceeded message is sent back to the originating host.
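
The effect of the TTL can be shown with a toy model in which each router on the path subtracts 1 and discards the datagram once the value reaches 0. The function name and its return format are assumptions made for this sketch.

```python
def forward(ttl, routers):
    """Simulate a datagram crossing a number of routers.

    Returns (delivered, routers_reached).
    """
    for hop in range(1, routers + 1):
        ttl -= 1  # each router decrements the TTL by 1
        if ttl == 0:
            # Discarded here; this router would send ICMP Time Exceeded.
            return False, hop
    return True, routers

print(forward(ttl=64, routers=5))  # (True, 5)
print(forward(ttl=3, routers=5))   # (False, 3)
```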

Every single node on the Internet maintains a set of internal routing tables. The node makes decisions on where to send IP datagrams based on the rules in the routing table. Unless a host has been specifically configured to act as a router, it should not forward packets between its interfaces.

When a host receives an IP datagram, it checks the destination IP address in the header. If it was destined for the host, i.e. if the destination IP address matches its own or if the IP address was a broadcast address, the host demultiplexes the datagram and sends it on to the appropriate upper layer protocols. If the datagram was not intended for the host, and the host is not configured to act as a router, the datagram is thrown away. If the host is also acting as a router, it then checks its routing tables to determine where to forward the datagram next.
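
The decision described above reduces to three outcomes. The sketch below uses hypothetical names and plain sets of address strings; it leaves out everything else a real stack checks, such as the header checksum.

```python
def handle_datagram(dst_ip, own_addresses, broadcast_addresses, is_router):
    """Decide what a node does with a received datagram."""
    if dst_ip in own_addresses or dst_ip in broadcast_addresses:
        return "deliver"   # pass to the upper layer protocol
    if is_router:
        return "forward"   # consult the routing table for the next hop
    return "drop"          # not for us and we are not a router

own = {"192.168.0.5"}
broadcast = {"192.168.0.255", "255.255.255.255"}

print(handle_datagram("192.168.0.5", own, broadcast, is_router=False))   # deliver
print(handle_datagram("150.203.1.10", own, broadcast, is_router=False))  # drop
print(handle_datagram("150.203.1.10", own, broadcast, is_router=True))   # forward
```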

Routing tables on hosts are usually expressed in the following terms: if a datagram is addressed to x, it should be sent to y, where x can be a specific host address or a network address, and y is a specific host address or the network address of a directly connected network.

Most routing tables have several specific host/network address entries and a default gateway entry for any networks not covered by the specific entries.

Kernel routing table
Destination     Gateway         Genmask         Flags MSS    Window Use Iface
192.168.0.0     *               255.255.255.0   U     1500   0       16 eth0
127.0.0.0       *               255.0.0.0       U     3584   0       14 lo
203.8.111.211   *               255.255.255.255 UH    1500   0        1 ppp0
default         203.8.111.211   *               UG    1500   0        5 ppp0
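
A lookup against the table above can be sketched as follows. The entries are transcribed from the listing, with the default route written as 0.0.0.0/0. Choosing the most specific matching entry (longest prefix) is the usual rule, but it is an assumption here since the text does not spell it out, and the ROUTES and lookup names are invented for the example.

```python
import ipaddress

# (destination network, gateway or None if directly connected, interface)
ROUTES = [
    ("192.168.0.0/24", None, "eth0"),
    ("127.0.0.0/8", None, "lo"),
    ("203.8.111.211/32", None, "ppp0"),
    ("0.0.0.0/0", "203.8.111.211", "ppp0"),  # default route
]

def lookup(destination):
    """Return (next_hop, interface) for a destination address."""
    dest = ipaddress.IPv4Address(destination)
    matches = []
    for net, gateway, iface in ROUTES:
        network = ipaddress.IPv4Network(net)
        if dest in network:
            matches.append((network.prefixlen, gateway, iface))
    # The default route always matches, so matches is never empty.
    _, gateway, iface = max(matches, key=lambda m: m[0])
    # Directly connected destinations are their own next hop.
    return gateway or destination, iface

print(lookup("192.168.0.9"))   # ('192.168.0.9', 'eth0')
print(lookup("150.203.1.10"))  # ('203.8.111.211', 'ppp0')
```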

Most hosts have their routing tables set up statically at boot time. However, if a router has a choice of using multiple routes to a particular destination, dynamic routing protocols may also be used. These protocols are used to adjust the routing table automatically based on the availability and loads on destination nodes and other changes in the network.

TCP (Transmission Control Protocol)

The TCP header

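The TCP header contains a fixed portion which is 20 bytes long. It carries the source and destination port numbers, the sequence number, the acknowledgement number, the header length, the flags (URG, ACK, PSH, RST, SYN and FIN), the advertised window size, a checksum and the urgent pointer.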
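
The fixed 20 bytes can be decoded with Python's struct module, as in the sketch below. Only the six classic flag bits are decoded, the function name and dictionary keys are chosen for this example, and the sample segment is built by hand with a zero checksum.

```python
import struct

# Bit values of the six classic TCP flags.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(raw):
    """Decode the fixed 20 byte portion of a TCP header."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # stored in 32 bit words
        "flags": sorted(n for n, bit in FLAG_BITS.items() if offset_flags & bit),
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }

# Hand-built sample: a SYN from port 49152 to port 80 with no options.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```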

TCP connection establishment

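A TCP connection is established with a three-way handshake. The end that opens the connection sends a segment with the SYN flag set, the other end replies with a segment that has both the SYN and ACK flags set, and the first end completes the handshake by acknowledging that segment with an ACK.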
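
The sequence and acknowledgement numbers exchanged during the handshake can be sketched as below. The initial sequence numbers are arbitrary values picked for the example, and the dictionary layout is an assumption. A SYN consumes one sequence number, which is why each acknowledgement is the peer's initial sequence number plus 1.

```python
MOD = 2 ** 32  # sequence numbers are 32 bit and wrap around

def three_way_handshake(client_isn, server_isn):
    """Return the three segments of a connection establishment."""
    syn = {"from": "client", "flags": {"SYN"}, "seq": client_isn}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": server_isn, "ack": (client_isn + 1) % MOD}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (client_isn + 1) % MOD, "ack": (server_isn + 1) % MOD}
    return [syn, syn_ack, ack]

for seg in three_way_handshake(client_isn=1000, server_isn=5000):
    print(seg["from"], sorted(seg["flags"]), seg["seq"], seg.get("ack"))
```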

TCP connection termination

TCP is a full duplex protocol, i.e. data can be sent in both directions independently. Therefore, to fully close the connection, it has to be terminated in both directions. If one end of the connection sends a segment with the FIN flag set, it means that that end has no more data to send. However, this end can still receive data from the other end until the other end has explicitly closed its side of the connection. This is known as a half-close.

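The full termination of a TCP connection takes four segments. One end sends a FIN, the other end acknowledges it with an ACK, the other end later sends its own FIN once it has no more data to send, and the first end acknowledges that FIN with an ACK.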
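
The exchange can be traced with a small sketch that records each segment together with the state of both ends once the segment has been received. The state names follow the standard TCP state diagram, the function name and tuple layout are invented for the example, and the sketch assumes no data is sent during the half-closed period. Real FIN segments normally also carry the ACK flag, which is omitted here for clarity.

```python
def four_way_close(a_seq, b_seq):
    """Trace a close started by end A.

    Each entry is (direction, flag, seq, ack, state_of_A, state_of_B).
    """
    return [
        # A has no more data to send; B learns this on receipt.
        ("A->B", "FIN", a_seq, None, "FIN_WAIT_1", "CLOSE_WAIT"),
        # B acknowledges A's FIN; the connection is now half-closed.
        ("B->A", "ACK", b_seq, a_seq + 1, "FIN_WAIT_2", "CLOSE_WAIT"),
        # B finishes sending and closes its direction too.
        ("B->A", "FIN", b_seq, a_seq + 1, "TIME_WAIT", "LAST_ACK"),
        # A acknowledges B's FIN; B can release the connection.
        ("A->B", "ACK", a_seq + 1, b_seq + 1, "TIME_WAIT", "CLOSED"),
    ]

for step in four_way_close(a_seq=2000, b_seq=7000):
    print(step)
```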