As data is encapsulated, IP at the network layer specifies the target IP address to which the data is ultimately destined, as well as the interface through which the data is to be transmitted. Before transmission can occur, however, the source must know the target Ethernet (MAC) address to which the data should be sent. The Address Resolution Protocol (ARP) is a critical part of the TCP/IP protocol suite that enables discovery of MAC forwarding addresses to facilitate IP reachability. The Ethernet next hop must be discovered before data encapsulation can be completed.

The ARP packet is generated as part of the physical target address discovery process. The initial discovery packet contains only partial information, since the destination hardware (MAC) address is yet to be discovered. The hardware type refers to Ethernet and the protocol type refers to IP, which together define the technologies associated with the ARP discovery. The hardware length and protocol length identify the length, in bytes, of the Ethernet MAC address and the IP address respectively.

The operation code specifies one of two values. An ARP discovery is sent with the operation code set to REQUEST, which tells the destination receiving it that a response should be generated. The response carries the operation code REPLY, for which no further operation is required of the receiving host, and after which the ARP packet is discarded. The source hardware address refers to the MAC address of the sender on the physical segment on which the ARP packet is generated. The source protocol address refers to the IP address of the sender.

The destination hardware address specifies the physical (Ethernet) address to which data can be forwarded under the Ethernet protocol standards. This information is not present in an ARP request and is instead replaced by a value of 0. The destination protocol address identifies the intended IP destination for which reachability over Ethernet is to be established.
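The fields described above can be sketched as a packed 28-byte structure. This is a minimal illustration assuming Ethernet and IPv4; the function names and the example addresses are hypothetical and only serve to show the field order and the zeroed destination hardware address in a request.

```python
import socket
import struct

# ARP field order for Ethernet/IPv4: hardware type, protocol type,
# hardware length, protocol length, operation code, source hardware,
# source protocol, destination hardware, destination protocol.
ARP_FORMAT = "!HHBBH6s4s6s4s"

HTYPE_ETHERNET = 1    # hardware type: Ethernet
PTYPE_IPV4 = 0x0800   # protocol type: IP
OP_REQUEST = 1
OP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes (illustrative helper)."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(src_mac: str, src_ip: str, dst_ip: str) -> bytes:
    """Build an ARP request; the destination hardware address is all zeros."""
    return struct.pack(
        ARP_FORMAT,
        HTYPE_ETHERNET,
        PTYPE_IPV4,
        6,                        # hardware length: a MAC address is 6 bytes
        4,                        # protocol length: an IPv4 address is 4 bytes
        OP_REQUEST,
        mac_to_bytes(src_mac),    # source hardware address
        socket.inet_aton(src_ip), # source protocol address
        bytes(6),                 # destination hardware address: unknown, set to 0
        socket.inet_aton(dst_ip), # destination protocol address
    )


# Example addresses are made up for illustration.
packet = build_arp_request("00:11:22:33:44:55", "10.0.0.1", "10.0.0.2")
print(len(packet))  # 28
```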

The network layer represents a logical path between a source and a destination. Reaching an intended IP destination relies first on being able to establish a physical path to it, and to do that, an association must be made between the intended IP destination and the physical next hop interface to which traffic can be forwarded.

For a given destination, the host will determine the IP address to which data is to be forwarded. Before encapsulation of the data can commence, however, the host must determine whether a physical forwarding path is known. If the forwarding path is known, encapsulation to the destination can proceed. Quite often the destination is not known, and ARP must be used before data encapsulation can be performed.

The ARP cache (pronounced as [kash]) is a table that associates destination IP addresses with their physical (MAC) addresses. Any host that is engaged in communication with a local or remote destination will first need to learn the destination MAC address via which communication can be established.

Learned addresses populate the ARP cache table and remain active for a fixed period of time, during which the intended destination can be reached without the need for additional ARP discovery processes. Following this fixed period, the ARP cache table removes the entries to maintain the ARP cache table’s integrity, since any change in the physical location of a destination host may otherwise result in the sending host inadvertently addressing data to a location at which the destination host no longer resides.
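The aging behaviour described above can be sketched as a small table with a lifetime per entry. This is a minimal sketch, not any particular operating system's implementation: the class name, the `timeout_seconds` default, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class ArpCache:
    """Illustrative IP-to-MAC table whose entries expire after a fixed period."""

    def __init__(self, timeout_seconds=1200.0, clock=time.monotonic):
        # The default timeout is an arbitrary illustrative value.
        self._timeout = timeout_seconds
        self._clock = clock
        self._entries = {}  # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac):
        """Add or refresh an entry, restarting its lifetime."""
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip):
        """Return the MAC for ip, or None if unknown or aged out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self._clock() - learned_at >= self._timeout:
            # The entry is stale: remove it so that a new ARP request is triggered.
            del self._entries[ip]
            return None
        return mac
```

A `None` result from `lookup` corresponds to the case in which the host has to fall back to an ARP request.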

The ARP cache lookup is the first operation that an end system performs before determining whether it is necessary to generate an ARP request. For destinations beyond the boundaries of the host's own network, an ARP cache lookup is performed to discover the physical address of the gateway via which the intended destination network can be reached.
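The choice between resolving the destination itself and resolving the gateway can be sketched with Python's standard `ipaddress` module. The function name `arp_target` and the example addresses are hypothetical; the sketch assumes a host with a single interface and a single default gateway.

```python
import ipaddress


def arp_target(dst_ip: str, local_interface: str, gateway_ip: str) -> str:
    """Return the IP address whose MAC must be resolved to reach dst_ip.

    local_interface is the host's own address with prefix length,
    for example '192.168.1.10/24'.
    """
    local_network = ipaddress.ip_interface(local_interface).network
    if ipaddress.ip_address(dst_ip) in local_network:
        # Destination is on the host's own network: resolve it directly.
        return dst_ip
    # Destination is beyond the local network: resolve the gateway instead.
    return gateway_ip


print(arp_target("192.168.1.20", "192.168.1.10/24", "192.168.1.1"))
print(arp_target("172.16.5.9", "192.168.1.10/24", "192.168.1.1"))
```

The first call resolves the destination itself, while the second falls back to the gateway address.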

Where no ARP cache entry can be found, the ARP request process is performed. This process involves generating an ARP request packet and populating its fields with the source and destination protocol addresses, as well as the source hardware address. The destination hardware address is unknown, and is therefore populated with a value equivalent to 0. The ARP request is encapsulated in an Ethernet frame header and trailer as part of the forwarding process. The source MAC address of the frame header is set to the MAC address of the sending host.

The host is currently unaware of the location of the destination and therefore must send the ARP request as a broadcast to all destinations within the same local network boundary. This means that a broadcast address is used as the destination MAC address. Once the frame is populated, it is forwarded to the physical layer, where it is propagated along the physical medium to which the host is connected. The broadcast ARP packet is flooded throughout the network to all destinations, including any gateway that may be present. The gateway, however, will prevent this broadcast from being forwarded to any network beyond the current network.
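The broadcast encapsulation described above can be sketched as follows. The helper names are hypothetical, and the sketch builds only the 14-byte Ethernet header; it assumes the frame trailer (the frame check sequence) is appended by the network interface hardware, which is commonly the case.

```python
import struct

BROADCAST_MAC = b"\xff" * 6  # ff:ff:ff:ff:ff:ff reaches every station on the segment
ETHERTYPE_ARP = 0x0806       # type field value that identifies an ARP payload


def ethernet_header(dst_mac: bytes, src_mac: bytes, ethertype: int) -> bytes:
    """Pack destination MAC, source MAC and type field into a 14-byte header."""
    return struct.pack("!6s6sH", dst_mac, src_mac, ethertype)


def frame_arp_request(src_mac: bytes, arp_payload: bytes) -> bytes:
    """Encapsulate an ARP request in a broadcast Ethernet frame (no trailer)."""
    return ethernet_header(BROADCAST_MAC, src_mac, ETHERTYPE_ARP) + arp_payload


# A 28-byte placeholder stands in for a real ARP request here.
frame = frame_arp_request(bytes.fromhex("001122334455"), bytes(28))
print(frame[:6].hex(":"))
```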

If the intended network destination exists, the frame will arrive at the physical interface of the destination, at which point lower layer processing will ensue. Because the ARP request is broadcast, all destinations within the network boundary receive the flooded frame, but any destination whose IP address does not match the destination protocol address will cease to process the ARP request.

Where the destination IP address matches that of the receiving host, the ARP packet is processed. The receiving host first processes the frame header and then processes the ARP request. The destination host uses the information from the source hardware address field in the ARP header to populate its own ARP cache table, which allows a unicast frame to be generated for any frame that must be forwarded to the source from which the ARP request was received.

The destination will determine that the ARP packet received is an ARP request and will proceed to generate an ARP reply that is returned to the source, based on the information found in the ARP header. A separate ARP packet is generated for the reply, in which the source and destination protocol address fields are populated. The destination protocol address of the ARP request becomes the source protocol address of the ARP reply, and similarly the source protocol address of the ARP request becomes the destination protocol address of the ARP reply.

The destination hardware address field of the reply is populated with the MAC address of the source, discovered as a result of receiving the ARP request. The hardware address that the ARP request set out to discover is included as the source hardware address of the ARP reply. The operation code is set to REPLY to inform the host that receives the reply of the purpose of the ARP packet, following which that host is able to discard the ARP packet without any further communication. The ARP reply is encapsulated in the Ethernet frame header and trailer, with the destination MAC address of the Ethernet frame taken from the MAC entry in the ARP cache table, allowing the frame to be forwarded as a unicast frame back to the host that originated the ARP request.
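The receiving side of the exchange described above can be sketched as one function: ignore requests for other addresses, learn the sender's mapping, and build a reply with the address fields swapped. The function name and the plain dictionary used as a cache are assumptions for illustration; the 28-byte layout assumes Ethernet and IPv4.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"
OP_REQUEST, OP_REPLY = 1, 2


def handle_arp_request(packet: bytes, my_mac: bytes, my_ip: str, cache: dict):
    """Return the ARP reply for a request aimed at my_ip, otherwise None."""
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, packet
    )
    if oper != OP_REQUEST or tpa != socket.inet_aton(my_ip):
        # Not a request, or the destination protocol address is not ours.
        return None
    # Learn the requester's IP-to-MAC mapping so the reply can be unicast.
    cache[socket.inet_ntoa(spa)] = sha
    return struct.pack(
        ARP_FORMAT, htype, ptype, hlen, plen,
        OP_REPLY,
        my_mac,  # source hardware address: the address being discovered
        tpa,     # source protocol address: the request's destination
        sha,     # destination hardware address: the requester's MAC
        spa,     # destination protocol address: the request's source
    )
```

The reply would then be placed in an Ethernet frame whose destination MAC address is the requester's address just stored in the cache.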

Upon receiving the ARP reply, the originating host will validate that the intended destination is correct based on the frame header, identify that the packet header is ARP from the type field and discard the frame headers. The ARP reply will then be processed, with the source hardware address of the ARP reply being used to populate the ARP cache table of the originating host (Host A).

Following the processing of the ARP reply, the packet is discarded and the destination MAC information is used to facilitate the encapsulation process of the initial application or protocol that originally requested discovery of the destination at the data link layer.
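The handling of the ARP reply at the originating host can be sketched as follows: check the frame header, identify ARP from the type field, strip the header, and record the discovered address. The function name and the dictionary cache are hypothetical, and the sketch assumes an untagged Ethernet frame carrying an Ethernet/IPv4 ARP packet.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"
ETHERTYPE_ARP = 0x0806
OP_REPLY = 2


def process_arp_reply(frame: bytes, my_mac: bytes, cache: dict) -> bool:
    """Update cache from an ARP reply addressed to my_mac; return True if used."""
    if len(frame) < 42:  # 14-byte Ethernet header plus 28-byte ARP packet
        return False
    dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if dst_mac != my_mac or ethertype != ETHERTYPE_ARP:
        # Frame is not addressed to this host, or does not carry ARP.
        return False
    # Discard the frame header; slicing also ignores any frame padding.
    fields = struct.unpack(ARP_FORMAT, frame[14:42])
    oper, sender_mac, sender_ip = fields[4], fields[5], fields[6]
    if oper != OP_REPLY:
        return False
    # The reply's source hardware address is the MAC that was being sought.
    cache[socket.inet_ntoa(sender_ip)] = sender_mac
    return True
```

Once the function returns `True`, the cached MAC address is available for encapsulating the data that triggered the discovery.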

ARP is also applied in other cases, such as where transparent subnet gateways are implemented to facilitate communication across data-link networks whose hosts are considered to be part of the same subnetwork. This is referred to as Proxy ARP, since the gateway operates as a proxy between the two data-link networks. When an ARP request is generated for a destination that is considered to be part of the same subnet, the request will eventually be received by the gateway. The gateway is able to determine that the intended destination exists beyond the data-link network on which the ARP request was generated.

Since ARP requests cannot be forwarded beyond the boundaries of the broadcast domain, the gateway will proceed to generate its own ARP request to determine reachability to the intended destination, using its own protocol and hardware addresses as the source addresses for the generated ARP request. If the intended destination exists, the gateway will receive an ARP reply, and the source hardware address in that reply will be used to populate the ARP cache table of the gateway.

The gateway, upon confirming reachability to the intended destination, will then generate an ARP reply to the original source (Host A) using the hardware address of the interface on which the ARP reply is forwarded. As a result, the gateway operates as an agent between the two data-link networks to facilitate data link layer communication, with both hosts forwarding traffic intended for destinations in the other data-link network to the relevant physical address of the “Proxy” gateway.
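The gateway's decision to answer on behalf of a remote host can be sketched as a lookup across its interfaces. This is a simplified sketch: the function name, the interface names, and the addresses are hypothetical, and the gateway's own ARP request toward the real destination is omitted for brevity.

```python
import ipaddress


def proxy_arp_answer(target_ip: str, ingress: str, interfaces: dict):
    """Return the MAC the gateway should answer with, or None to stay silent.

    interfaces maps an interface name to a (network, mac) tuple.
    ingress is the interface on which the ARP request arrived.
    """
    target = ipaddress.ip_address(target_ip)
    for name, (network, _mac) in interfaces.items():
        if name != ingress and target in ipaddress.ip_network(network):
            # The target lives behind another interface: answer with the MAC
            # of the interface that faces the requesting host.
            return interfaces[ingress][1]
    # The target is on the requester's own segment or is unknown: no proxy reply.
    return None


# Hypothetical gateway with two data-link networks.
gateway_interfaces = {
    "eth0": ("10.1.1.0/24", "00:aa:00:00:00:01"),
    "eth1": ("10.1.2.0/24", "00:aa:00:00:00:02"),
}
print(proxy_arp_answer("10.1.2.5", "eth0", gateway_interfaces))
```

In this example a host on the first network asking for an address on the second network would be given the MAC address of the gateway's interface on the first network.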

In the event that new hardware is introduced to a network, it is imperative that the host determine whether or not the protocol address that it has been assigned is unique within the network, so as to prevent duplicate address conflicts. An ARP request is generated as a means of determining whether the protocol address is unique, by setting the destination protocol address in the ARP request to be equal to the host’s own IP address.

The ARP request is flooded throughout the network to all link layer destinations by setting the destination MAC address to broadcast, to ensure that all end stations and gateways receive the flooded frame. All destinations will process the frame, and should any end station or gateway discover that the destination IP address within the ARP request matches its own address, an ARP reply will be generated and returned to the host that generated the ARP request.

Through this method, the originating host is able to identify duplication of the IP address within the network and flag an IP address conflict, so as to request that a unique address be assigned. This means of generating a request based on the host's own IP address defines the basic principle of gratuitous ARP.
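The duplicate address check described above can be sketched as two small functions: one that builds a request whose destination protocol address is the host's own IP address, and one that recognises a reply indicating that another station already uses that address. The function names are hypothetical, and the layout assumes Ethernet and IPv4.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"
OP_REQUEST, OP_REPLY = 1, 2


def build_gratuitous_request(my_mac: bytes, my_ip: str) -> bytes:
    """Build an ARP request asking for the host's own IP address."""
    own_ip = socket.inet_aton(my_ip)
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, OP_REQUEST,
        my_mac,
        own_ip,    # source protocol address
        bytes(6),  # destination hardware address: 0, as in any request
        own_ip,    # destination protocol address equals the host's own IP
    )


def indicates_conflict(arp_packet: bytes, my_mac: bytes, my_ip: str) -> bool:
    """Return True if a reply shows another station already uses my_ip."""
    fields = struct.unpack(ARP_FORMAT, arp_packet)
    oper, sender_mac, sender_ip = fields[4], fields[5], fields[6]
    return (
        oper == OP_REPLY
        and sender_ip == socket.inet_aton(my_ip)
        and sender_mac != my_mac
    )
```

If no reply arrives, the address can be treated as unique; how long to wait for a reply is left out of this sketch.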

SUMMARY

The host is required to determine first whether it is already aware of a link layer forwarding address within its own ARP cache (its table of IP-to-MAC address mappings). If an entry is discovered, the end system is capable of creating the frame for forwarding without the assistance of the Address Resolution Protocol. If an entry cannot be found, however, the ARP process will initiate, and an ARP request will be broadcast on the local network.

Gratuitous ARP messages are commonly generated at the point at which an IP address is configured or changed for a device connected to the network, and at any time that a device is physically connected to the network. In both cases the gratuitous ARP process must ensure that the IP address that is used remains unique.

Ref : https://e.huawei.com/en/talent/#/resources