Summary
In this post, I am translating my notes from my Solutions Architect Professional Training IT Reviews Section. This first post covers some networking concepts, and I will follow up with at least one more post as I continue the section.
What is Networking in Computing?
Networking is an umbrella term that covers many parts of facilitating one computer’s communication with other computers. This communication can include local networking between two or more computers in the same location all the way to computing devices communicating across the globe and beyond.
The OSI 7-layer Model
The OSI 7-layer model is a conceptual model that helps us think about how networking technologies work together to deliver communications. It helps to explain the technology from Layer One, where we are talking about voltages on an Ethernet cable or radio waves carrying a Wi-Fi signal, up to the top layer, where we are concerned with delivering websites and encrypting them.
The lower three layers are media layers, concerned with how data moves across the physical medium and between networks. The upper four layers are host layers, concerned with packaging data and managing the conversation between applications so that it can be transmitted and delivered quickly and / or reliably.
Layer Type | Layer |
---|---|
Host | Layer Seven Application |
Host | Layer Six Presentation |
Host | Layer Five Session |
Host | Layer Four Transport |
Media | Layer Three Network |
Media | Layer Two Data Link |
Media | Layer One Physical |
Layer One - Physical
In this layer, we are concerned about connecting two network interfaces. These network interfaces must use the same technologies and protocols to talk to each other at this level. Typical examples of Layer One technologies are an Ethernet cable, a fiber optic cable, or a radio wave carrying a Wi-Fi signal.
For example, when we talk about ethernet cables, we are concerned with defining things like voltage levels on the wire, timings/data rates, and cable lengths.
Network hubs can be used at this level to expand a network beyond two nodes. Hubs are devices with multiple ports that allow multiple devices to be connected to the same network. Hubs, however, do not have any intelligence. Instead, every port on the hub merely repeats the signals (including collisions and errors) on the wire to all nodes on the network. This fact means that the collision domain with a hub is expanded.
At this level, there is also no access control to the wire. This fact means that devices can and do attempt to transfer data at the same time as other devices on the wire. Unfortunately, the simultaneous transmissions lead to collisions and cause data corruption. Additionally, there is no logic at this level to detect these collisions. These limitations mean Layer One networks don’t scale well without help from higher layers.
Layer Two - Data Link
Layer Two, like all of the layers of the stack above Layer One, runs on top of the lower layer and is somewhat independent of the lower layer. A prevalent example of a Layer Two protocol might be Ethernet.
Ethernet introduces the concept of a frame, which is a container for data. It contains the source and destination in the form of MAC addresses. In addition, it indicates which Layer Three protocol is being carried and includes a payload of between 46 and 1500 bytes that contains the Layer Three packet. Finally, it includes a frame check sequence that can be used to determine whether the frame arrived intact. A MAC address is a 48-bit number intended to be a globally unique ID. The first 24 bits are the OUI (the manufacturer ID for the network card), and the remaining 24 bits are the Network Interface Controller ID, which is unique to the card. These IDs are typically pre-programmed onto the card and are not modified or set by a user. Of course, there are some exceptions, but this is typical.
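To make the two halves of a MAC address concrete, here is a minimal Python sketch that splits an address into its OUI and NIC-specific parts. The function name `split_mac` and the sample address are made up for illustration.

```python
import string

def split_mac(mac: str) -> tuple[str, str]:
    """Split a 48-bit MAC address into (OUI, NIC-specific part)."""
    octets = mac.replace("-", ":").lower().split(":")
    # A MAC address is six octets, each written as two hex digits.
    valid = len(octets) == 6 and all(
        len(o) == 2 and all(c in string.hexdigits for c in o) for o in octets
    )
    if not valid:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # First three octets: manufacturer (OUI). Last three: the individual card.
    return ":".join(octets[:3]), ":".join(octets[3:])

# Hypothetical address, not taken from a real device.
oui, nic = split_mac("00:1A:2B:3C:4D:5E")
print(oui, nic)
```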
At Layer Two, we introduce a mechanism to avoid collisions as well. In the case of Ethernet, this method is called CSMA (Carrier Sense Multiple Access). Essentially, a device listens for a carrier on the wire before it attempts to transmit anything, avoiding a collision if it senses another device already communicating.
Unfortunately, CSMA alone doesn’t prevent all collisions. For example, if two devices both want to transmit, each checks for a carrier. Sensing no carrier, they both begin to transmit at the same time, and a collision occurs. Ethernet therefore adds collision detection as well (the CD in CSMA/CD). When a transmitting device detects a collision, it sends a JAM signal so that every device on the segment notices, and all devices stop transmitting. Then each device backs off, waits for a random amount of time, and starts the CSMA process again.
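The random wait after a collision can be sketched in a few lines of Python. The notes above only say the wait is random; the sketch assumes the truncated binary exponential backoff that classic Ethernet uses, where the range of possible waits doubles with each successive collision. The names `backoff_slots`, `BACKOFF_LIMIT`, and `ATTEMPT_LIMIT` are illustrative.

```python
import random

BACKOFF_LIMIT = 10   # the wait range stops growing after 10 collisions
ATTEMPT_LIMIT = 16   # after 16 failed attempts the frame is dropped

def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Return how many slot times to wait after the nth collision in a row."""
    if collisions > ATTEMPT_LIMIT:
        raise RuntimeError("too many collisions, giving up on this frame")
    exponent = min(collisions, BACKOFF_LIMIT)
    # Pick uniformly from 0 .. 2**exponent - 1 slot times.
    return rng.randrange(2 ** exponent)

rng = random.Random()
for n in (1, 2, 3):
    print(f"collision {n}: wait {backoff_slots(n, rng)} slot(s)")
```

Because each device draws its own random number, two devices that just collided are unlikely to pick the same wait twice in a row.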
In Layer One, we discussed the concept of network hubs. These devices physically connect the wires on a network together. At Layer Two, network hubs are replaced with network switches, which improve on the hub in two ways:
- A switch maintains a table of the MAC addresses connected to each port on the switch. A frame is transmitted only out of the port connected to the destination MAC address.
- Each port on the switch becomes its own collision domain instead of the entire network being one collision domain. This behavior reduces collisions and isolates them, thus improving performance.
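The MAC table idea can be illustrated with a small Python sketch. It assumes two standard switch behaviors that the notes above do not spell out: the switch learns which port a MAC address lives on from the source address of incoming frames, and it floods a frame to all other ports when the destination is not yet in the table. The class and method names are made up.

```python
class LearningSwitch:
    """Toy model of a Layer Two switch with a MAC address table."""

    def __init__(self, port_count: int):
        self.ports = list(range(port_count))
        self.mac_table = {}  # MAC address -> port number

    def handle_frame(self, in_port: int, src_mac: str, dst_mac: str) -> list:
        """Return the list of ports the frame is sent out of."""
        # Learn: the sender is reachable through the port the frame came in on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the same port as the sender: nothing to forward.
            return []
        return [out_port]

switch = LearningSwitch(port_count=4)
print(switch.handle_frame(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))  # flooded
print(switch.handle_frame(1, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))  # one port
```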
Layer Three - Network
At layer three, we talk about internetworking and expanding a network beyond a specific location. All Layer three networks travel on Layer two networks, but they aren’t confined to the same Layer two network or the same kind of Layer two network. They aren’t concerned with the medium of transmission much at all.
The most popular Layer three protocol is IP, and this is the protocol that enables the internet. IP transmits data in packets, and there are currently two popular versions of IP (IPv4 and IPv6). IPv6 was mainly developed to replace the IPv4 addressing to provide more addresses to serve all internet users adequately. But IPv4 is still the most popular and in-use version of IP, and I will go into more detail about its packets and addressing below. That said, IPv6, though somewhat different, works mainly on the same principles.
IPv4 Packets consist of a Source IP address, Destination IP address, Protocol Field (containing the Layer four protocol indicator), Time To Live (or Hop Limit for IPv6), and Data from the Layer Four segment. IPv4 addresses are four binary octets typically expressed in decimal form. An example of an IPv4 address is 122.55.7.2. The IP address consists of a host portion and a network portion. This separation determines how an IP network is subnetted and which packets flow to routers and which stay on the same layer two network.
To determine the IPv4 Network and Host, one needs to understand netmasks. A netmask indicates how many bits of the IP address make up the network portion; the remaining bits make up the host portion. The mask is typically represented in four octets, such as 255.255.0.0, or sometimes as a slash with a number after it, such as /16. Understanding netmasks is easier when we convert the addresses and masks all to binary.
This example does a logical AND with the inverted netmask to get the host number of the device
122 .55 .7 .2
01111010 00110111 00000111 00000010 (122.55.7.2 expressed as binary number)
and 00000000 00000000 11111111 11111111 (255.255.0.0 inverted, expressed as binary)
= 00000000 00000000 00000111 00000010 (Host Number 0.0.7.2)
This example does a logical AND with the netmask to get the Network of the device
122 .55 .7 .2
01111010 00110111 00000111 00000010 (122.55.7.2 expressed as binary number)
and 11111111 11111111 00000000 00000000 (255.255.0.0 expressed as binary)
= 01111010 00110111 00000000 00000000 (Network Number 122.55.0.0)
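The same arithmetic can be checked in Python. The network number is the address ANDed with the netmask, and the host number is the address ANDed with the inverted netmask. This sketch uses the standard library `ipaddress` module only to convert between dotted decimal and integers; the variable names are illustrative.

```python
import ipaddress

address = int(ipaddress.IPv4Address("122.55.7.2"))
netmask = int(ipaddress.IPv4Address("255.255.0.0"))

# Keep only the bits covered by the mask.
network_number = address & netmask
# Keep only the bits NOT covered by the mask (clamped to 32 bits).
host_number = address & ~netmask & 0xFFFFFFFF

print(f"{address:032b}")
print(ipaddress.IPv4Address(network_number))  # 122.55.0.0
print(ipaddress.IPv4Address(host_number))     # 0.0.7.2
```

In practice `ipaddress.ip_interface("122.55.7.2/16").network` gives the network directly; the bitwise version is shown only because it mirrors the binary example.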
Any packet destined for the same network number will be handled in Layer 2 using ARP and will not go to the network’s default router (or other routers based on the device’s route table). However, if the network number is different, it will be routed to another network via the route table on the device (and, in many cases, based on the default route).
Route tables specify which router the packet should be forwarded to if the destination isn’t part of the same network. Each entry in a route table pairs a destination network (a network number and prefix length) with the IP address of a router on the local network that handles traffic for that destination.
201.15.0.0/16 122.55.254.254
0.0.0.0/0 122.55.1.1
In the above example, 0.0.0.0/0 is special as it indicates a concept of the default route. The /0 in that route specifies the entire IPv4 address space. Essentially the device looks through the route table and finds the most specific route for a given address. For example, in the case of an address like 201.15.1.5, it would see that 201.15.0.0 matches that and is a /16 network (which is more specific than a /0 network), and thus it sends the packet to 122.55.254.254 which is a router that likely has a connection directly to that 201.15.0.0/16 network or provides a better route than the one on the default route. If, on the other hand, the address is 112.0.0.5, it will not match the 201.15.0.0/16 network, and thus it looks at the next most specific route. In this case that route is the default route, so it sends the packet to 122.55.1.1 - which might be connected to the internet service provider.
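The "most specific route wins" lookup described above can be sketched in Python using the two routes from the example table. The function name `next_hop` is made up, and a real route table carries more information (interfaces, metrics) than this sketch does.

```python
import ipaddress

# (destination network, next-hop router) pairs from the example route table.
ROUTES = [
    (ipaddress.ip_network("201.15.0.0/16"), "122.55.254.254"),
    (ipaddress.ip_network("0.0.0.0/0"), "122.55.1.1"),  # default route
]

def next_hop(destination: str) -> str:
    """Return the router for the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The longest prefix is the most specific match; /0 always matches.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop

print(next_hop("201.15.1.5"))  # 122.55.254.254
print(next_hop("112.0.0.5"))   # 122.55.1.1
```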
When the packet makes it to the routers mentioned above, those devices perform a similar function: they look for the best route outbound from the device and send the packet along. It is important to understand these routers connect different layer two networks, and some even connect different types of layer two networks (e.g., copper to fiber). So at the router, the packets are removed from the layer two frame and repackaged into a new layer two frame for the outbound network.
But we still need to address how the layer three packet gets sent to the specific layer two device. To do this, we use a technology called ARP (Address Resolution Protocol). ARP exists between layer two and layer three. So essentially, what happens is that ARP sends out a broadcast to the layer two network that says, “I need to send a packet to a specific IP address. Which MAC address do I need to send this to reach that IP address?” Since this is a broadcast, all devices on the network see the request and the one associated with that IP address, at layer three, responds with the MAC address. So at this point, layer two can forward the packet appropriately.
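Putting the netmask check and ARP together, a sender's decision could be sketched like this in Python. The broadcast is simulated with a plain dictionary, and every MAC address and the neighbor 122.55.9.9 are made-up values for illustration.

```python
import ipaddress

LOCAL = ipaddress.ip_interface("122.55.7.2/16")        # this device
DEFAULT_GATEWAY = ipaddress.ip_address("122.55.1.1")   # from the route table

# Hypothetical devices on the same Layer Two network (IP -> MAC).
SEGMENT = {
    "122.55.1.1": "02:00:00:00:00:01",
    "122.55.9.9": "02:00:00:00:00:99",
}

arp_cache = {}

def arp_resolve(ip) -> str:
    """Return the MAC address for ip, 'broadcasting' if it is not cached."""
    key = str(ip)
    if key not in arp_cache:
        # Simulated broadcast: only the owner of the IP address answers.
        arp_cache[key] = SEGMENT[key]
    return arp_cache[key]

def frame_destination(dst_ip: str) -> str:
    """Pick the MAC address the Layer Two frame should be sent to."""
    dst = ipaddress.ip_address(dst_ip)
    # Same network: ARP for the destination itself. Otherwise: ARP for the router.
    target = dst if dst in LOCAL.network else DEFAULT_GATEWAY
    return arp_resolve(target)

print(frame_destination("122.55.9.9"))  # neighbor's MAC
print(frame_destination("8.8.8.8"))     # default gateway's MAC
```

Note that for a remote destination the frame carries the router's MAC address while the packet inside still carries the remote IP address.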
Layer Four - Transport (and to some extent Layer 5)
At layer three, we were able to achieve internetworking. We can send data from our computers to computers on distant networks, so long as we have connectivity between them - even if that connectivity isn’t direct. But so far we have no way to tell apart traffic for different applications on the same computer, so we can effectively only do one thing at a time. Additionally, we need to find a way to determine if data is corrupted or lost on its way to the destination network.
Layer four solves these problems by providing protocols and ports. Ports allow for different channels of communication. For instance, if we need to communicate with a mail server, we might use port 25. But if we want to connect to a secure website, we might use port 443. These ports allow us to serve multiple application types across the internetwork.
In addition to ports, we have protocols:
- TCP
- UDP
- ICMP
I will only briefly mention UDP and ICMP, though they are essential protocols. UDP is a connectionless protocol used where speed matters more than reliability. Some applications where UDP is ideal are streaming voice and video. ICMP is for network testing and tracing. Strictly speaking, ICMP is carried directly inside IP packets and has no ports, so it is usually classed with Layer Three rather than Layer Four.
TCP is the most important protocol on the internet. TCP introduces the concept of a connection and segments. Segments are like the packets at layer three but don’t contain an IP address for the destination. Instead, they include a destination port. They also provide a source port.
Additionally, segments have sequence numbers, window sizes, checksums, data, and other fields. These fields help to ensure that when a message is split into segments, the receiver gets the data in the correct order and uncorrupted. If a segment is missing, damaged, or out of order, the receiver will be able to tell and take appropriate action.
When a TCP connection is established, it is done through what is known as a three-way handshake. The first step is that the client sends a SYN segment that contains a random sequence number. The receiver gets that segment and sends back its SYN-ACK with its own random sequence number, along with an acknowledgment number equal to the client’s sequence number plus one. Upon receiving the SYN-ACK, the client responds with an ACK whose sequence number is the client’s original sequence number plus one and whose acknowledgment number is the receiver’s sequence number plus one. This process establishes a connection and sets parameters for communication. Then, the client and the receiver use the sequence numbers, along with the other fields in the segments, to ensure all data is reliably transferred in both directions.
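The sequence and acknowledgment numbers exchanged in the handshake can be traced with a short Python sketch. It models only the three segments as dictionaries (no real sockets), and the function and field names are illustrative. Sequence numbers are 32-bit values, so the arithmetic wraps around at 2**32.

```python
import random

MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide

def three_way_handshake(rng: random.Random):
    """Return the (SYN, SYN-ACK, ACK) segments of a simulated handshake."""
    client_isn = rng.randrange(MOD)  # client's random initial sequence number
    server_isn = rng.randrange(MOD)  # receiver's random initial sequence number

    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {
        "flags": "SYN-ACK",
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % MOD,      # acknowledges the client's SYN
    }
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],              # client's sequence number plus one
        "ack": (syn_ack["seq"] + 1) % MOD,  # acknowledges the receiver's SYN
    }
    return syn, syn_ack, ack

for segment in three_way_handshake(random.Random()):
    print(segment)
```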
Network Address Translation
At this point, we detour from discussing the OSI 7-Layer model and discuss something important to IPv4 networking - given the scarcity of IPv4 addresses. That topic is network address translation or NAT.
Some IPv4 space is dedicated to private networks. These addresses are not routable on the internet and can only be used on local networks that don’t directly talk to the internet. They are described in RFC 1918, which outlines these IP ranges:
10.0.0.0 - 10.255.255.255 (10/8 prefix)
172.16.0.0 - 172.31.255.255 (172.16/12 prefix)
192.168.0.0 - 192.168.255.255 (192.168/16 prefix)
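A quick way to test whether an address falls in one of these three ranges is sketched below in Python. The helper name `is_rfc1918` is made up. The standard library also offers `ipaddress.ip_address(...).is_private`, but that property covers additional reserved ranges beyond these three, so the sketch checks the ranges explicitly.

```python
import ipaddress

# The three private ranges listed above, in prefix notation.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(address: str) -> bool:
    """Return True if address lies inside one of the RFC 1918 ranges."""
    addr = ipaddress.ip_address(address)
    return any(addr in network for network in RFC1918_NETWORKS)

print(is_rfc1918("172.31.255.255"))  # True
print(is_rfc1918("172.32.0.1"))      # False
```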
For addresses in these ranges to be able to communicate with the internet, they must be translated to public addresses before transmission out of a local network. This translation is where NAT comes into play.
There are several types of Network Address Translation:
- Static Network Address Translation
- Dynamic Network Address Translation
- Port Address Translation
Static Network Address Translation is a form of NAT where the router, between the local network and the network with internet routable addresses, maintains a table that translates specific private IP addresses to a particular public IP address. The AWS internet gateway uses this type of NAT to connect clients in a VPC.
In Static NAT, when the packets reach the router or gateway, they are re-written. On the way out, the source IP address is changed from the client’s private IP to the public IP assigned to that client. On the way back, packets addressed to that public IP are re-written so that the destination is the associated private IP address.
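The rewrite in both directions can be sketched in Python with packets modeled as dictionaries. The mapping table, the addresses (taken from ranges reserved for documentation), and the function names are all made up for illustration.

```python
# Hypothetical one-to-one mapping: private IP -> public IP.
STATIC_NAT = {
    "10.0.0.5": "203.0.113.5",
    "10.0.0.6": "203.0.113.6",
}
# Reverse lookup for return traffic: public IP -> private IP.
REVERSE_NAT = {public: private for private, public in STATIC_NAT.items()}

def outbound(packet: dict) -> dict:
    """Rewrite the source address of a packet leaving the private network."""
    return {**packet, "src": STATIC_NAT[packet["src"]]}

def inbound(packet: dict) -> dict:
    """Rewrite the destination address of a packet returning from outside."""
    return {**packet, "dst": REVERSE_NAT[packet["dst"]]}

request = {"src": "10.0.0.5", "dst": "198.51.100.7", "data": "hello"}
print(outbound(request))
reply = {"src": "198.51.100.7", "dst": "203.0.113.5", "data": "hi"}
print(inbound(reply))
```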
Dynamic NAT works like static NAT, except that there is no fixed assignment of a public IP address to each private IP. Instead, a pool of public IP addresses is shared by a larger set of private IPs. When a client wishes to reach the internet, the gateway assigns it a public address from the pool for a short time, like a lease, and the rest works just like Static NAT. One issue with this type of NAT is that if too many clients wish to communicate over the internet simultaneously, the pool of public IP addresses can run out. As a result, some clients will be denied access until public IPs are freed.
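The pool-and-lease behavior, including what happens when the pool is exhausted, can be sketched in Python. The class name, method names, and addresses are made up, and lease expiry is modeled as an explicit `release` call rather than a timer.

```python
class DynamicNat:
    """Toy model of dynamic NAT with a shared pool of public addresses."""

    def __init__(self, public_pool):
        self.free = list(public_pool)  # public IPs not currently leased
        self.leases = {}               # private IP -> leased public IP

    def acquire(self, private_ip: str):
        """Return a public IP for the client, or None if the pool is empty."""
        if private_ip in self.leases:
            return self.leases[private_ip]
        if not self.free:
            return None  # pool exhausted: this client is denied for now
        public_ip = self.free.pop(0)
        self.leases[private_ip] = public_ip
        return public_ip

    def release(self, private_ip: str) -> None:
        """End a lease and return its public IP to the pool."""
        public_ip = self.leases.pop(private_ip, None)
        if public_ip is not None:
            self.free.append(public_ip)

nat = DynamicNat(["203.0.113.10", "203.0.113.11"])
print(nat.acquire("10.0.0.1"))
print(nat.acquire("10.0.0.2"))
print(nat.acquire("10.0.0.3"))  # None: both public IPs are in use
nat.release("10.0.0.1")
print(nat.acquire("10.0.0.3"))  # succeeds once an address is freed
```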
Port address translation is a type of NAT used by the NAT gateway in AWS. It is also the type used by most home routers. In this type of NAT, one public IP address services all private IP addresses behind the gateway. When a client makes a request, the request is sent to the gateway. The gateway changes the source address to the public address assigned to the gateway. The gateway also sets a new source port number, which is usually different from the source port the client chose, and records in a table (the Port Address Table) which client address and port that new port belongs to. The request is forwarded onto the internet, and when there is return traffic, the gateway looks up the destination port in its table. Then, it rewrites the incoming packet so that the destination is the client’s private IP address and the port the client used to make the request in the first place. This port address translation enables multiple clients to share one public IP address while keeping their communications separate.
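The port table at the heart of port address translation can be sketched in Python. The class name, the starting port number, and the addresses are made up; real gateways also track the protocol and the remote endpoint and expire idle entries, which this sketch leaves out.

```python
class PortAddressTranslator:
    """Toy model of a gateway sharing one public IP among many clients."""

    def __init__(self, public_ip: str, first_port: int = 20000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.by_public_port = {}  # public port -> (private IP, private port)
        self.by_client = {}       # (private IP, private port) -> public port

    def outbound(self, src_ip: str, src_port: int):
        """Return the (public IP, public port) that replaces the client's source."""
        key = (src_ip, src_port)
        if key not in self.by_client:
            # Allocate a fresh public port and remember who it belongs to.
            self.by_client[key] = self.next_port
            self.by_public_port[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.by_client[key]

    def inbound(self, dst_port: int):
        """Return the (private IP, private port) for return traffic, or None."""
        return self.by_public_port.get(dst_port)

pat = PortAddressTranslator("203.0.113.1")
print(pat.outbound("192.168.1.10", 51000))
print(pat.outbound("192.168.1.11", 51000))  # same client port, different client
print(pat.inbound(20001))
```

Two clients that happen to pick the same source port still get distinct public ports, which is what keeps their return traffic separate.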
Conclusion
So far, this has covered the basics of networking. I am posting these roughly based on one evening’s study time. So my next post will likely cover various topics from encryption, subnetting, DNS, and a few other areas.