How the Internet Works
The internet is unquestionably one of the most influential innovations to shape modern life. This massive network of computers has had an extraordinary impact on our culture, economy, and daily routines. It offers us instantaneous communication with loved ones and strangers alike across continents, as well as endless hours of entertainment in all its digital forms. This network is able to transmit ludicrous amounts of data simultaneously between billions of devices across the entire planet.
So how did we manage to create such a reliable and automated system?
All data is a set of binary digits regardless of what type of information it represents. Images, audio, and all other file types are differentiated only by metadata, which is itself a set of binary digits describing the nature of the data that follows. Transmitting data therefore requires at least two signal states to represent the ones and zeroes. The most common physical data transmission technologies are voltage variations across a copper cable, modulated electromagnetic waves, and light pulses through optical fiber.
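As a small illustration of the point above, the following sketch shows text turned into binary digits and back, and a file type recognized from its leading bytes. The function names are made up for this example; the signatures are the well-known leading bytes of the PNG, JPEG, and GIF formats.

```python
# Leading bytes ("signatures") that identify a few common file formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"GIF8": "GIF image",
}

def to_bits(data: bytes) -> str:
    """Render each byte as eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in data)

def from_bits(bits: str) -> bytes:
    """Inverse of to_bits: rebuild the bytes from groups of binary digits."""
    return bytes(int(chunk, 2) for chunk in bits.split())

def guess_type(data: bytes) -> str:
    """Identify the data by the metadata at its start (hypothetical helper)."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"

print(to_bits("Hi".encode("ascii")))   # 01001000 01101001
print(guess_type(b"GIF89a"))           # GIF image
```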
Sending only one signal at a time across a wire leaves most of its capacity unused. Therefore, multiple signals are modulated onto different frequency bands that can be separated on the receiving end, a technique known as frequency-division multiplexing. The range of available frequencies defines the bandwidth and is mostly limited by hardware quality and the surrounding environment.
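The idea of sharing one wire between several frequency bands can be sketched in a few lines. This is a toy model, not how real modems work: it assumes simple on/off keying, an ideal noise-free line, and carrier frequencies (1000 Hz and 2000 Hz) chosen so that each completes a whole number of cycles per bit. All names and constants are assumptions for illustration.

```python
import math

SAMPLE_RATE = 8000     # samples per second (assumed)
SAMPLES_PER_BIT = 80   # 10 ms per bit (assumed)

def modulate(bits, freq):
    """Switch a sine carrier of the given frequency on for 1 and off for 0."""
    samples = []
    for i, bit in enumerate(bits):
        for n in range(SAMPLES_PER_BIT):
            t = (i * SAMPLES_PER_BIT + n) / SAMPLE_RATE
            samples.append(bit * math.sin(2 * math.pi * freq * t))
    return samples

def multiplex(streams):
    """Add the modulated carriers together, as if they shared one wire."""
    waves = [modulate(bits, freq) for freq, bits in streams.items()]
    return [sum(values) for values in zip(*waves)]

def demodulate(signal, freq):
    """Recover one stream by correlating each bit period with its carrier."""
    bits = []
    for start in range(0, len(signal), SAMPLES_PER_BIT):
        correlation = sum(
            signal[start + n]
            * math.sin(2 * math.pi * freq * (start + n) / SAMPLE_RATE)
            for n in range(SAMPLES_PER_BIT)
        )
        # A carrier that is "on" correlates to about SAMPLES_PER_BIT / 2.
        bits.append(1 if correlation > SAMPLES_PER_BIT / 4 else 0)
    return bits

line = multiplex({1000: [1, 0, 1, 1], 2000: [0, 0, 1, 0]})
print(demodulate(line, 1000))  # [1, 0, 1, 1]
print(demodulate(line, 2000))  # [0, 0, 1, 0]
```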
These physical technologies only enable the transmission of data between two devices directly connected to one another. So what happens when you have a network of billions of devices?
How does the data arrive at the correct destination in an acceptable time frame?
To ensure data reaches its proper destination in a complex network, it is split into packets of a limited size, and each packet is preceded by more metadata, called a header, according to specific protocols. The protocols that govern the internet fall into a hierarchy of layers commonly described using the OSI model. Each protocol adds its own header to the data to be sent, then the protocol below it in the OSI model adds another, and so on, in a process known as “encapsulation”.
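Splitting data into packets can be sketched as follows. The four-byte header used here (a sequence number and a packet count) is an invented toy format, not the header of any real protocol, and the function names are assumptions.

```python
import struct

# Toy header: sequence number and total packet count, two bytes each.
HEADER = struct.Struct("!HH")

def packetize(data: bytes, payload_size: int) -> list:
    """Split data into chunks and put a small header in front of each."""
    chunks = [data[i:i + payload_size]
              for i in range(0, len(data), payload_size)] or [b""]
    return [HEADER.pack(seq, len(chunks)) + chunk
            for seq, chunk in enumerate(chunks)]

def reassemble(packets) -> bytes:
    """Read each header, sort by sequence number, and join the payloads."""
    parsed = []
    for packet in packets:
        seq, _total = HEADER.unpack(packet[:HEADER.size])
        parsed.append((seq, packet[HEADER.size:]))
    return b"".join(payload for _seq, payload in sorted(parsed))

packets = packetize(b"hello, internet", 4)
print(len(packets))                   # 4
print(reassemble(reversed(packets)))  # b'hello, internet'
```

Because every packet carries its own sequence number, the receiver can rebuild the message even when packets arrive in a different order.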
First, when an application running on a device decides to send data across a network, it structures the data according to a specified protocol, such as HTTP for web browsers or RTMP for streaming audio and video. Compression, encryption, and other important operations are done after this step. The operating system then allocates one of its ports to that application. Next, either UDP or TCP is used in the transport layer to create a TCP segment or UDP datagram that specifies the port of the intended receiving application. TCP is responsible for retransmitting data that is lost on the network and for reordering packets that take different routes and arrive in the wrong order. UDP, on the other hand, is the simpler, faster approach, at the cost of offering no guarantee that every packet arrives.
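The difference between the two transport styles can be illustrated with a simulated lossy network. This is a deliberately simplified sketch: the "reliable" sender below waits for each packet to get through before sending the next, which is far cruder than what real TCP does, and the set of dropped packets is supplied by hand so the result is predictable. The function names are hypothetical.

```python
def unreliable_send(packets, dropped):
    """UDP-like: every packet is sent exactly once; dropped ones never arrive."""
    return [p for seq, p in enumerate(packets) if seq not in dropped]

def reliable_send(packets, dropped):
    """TCP-like: resend a packet until it gets through, and count attempts.

    `dropped` holds the sequence numbers whose first attempt is lost.
    """
    still_to_drop = set(dropped)
    delivered = {}
    attempts = 0
    for seq, packet in enumerate(packets):
        while True:
            attempts += 1
            if seq in still_to_drop:
                still_to_drop.discard(seq)  # lost: no acknowledgement, retry
                continue
            delivered[seq] = packet         # arrived and acknowledged
            break
    return [delivered[seq] for seq in sorted(delivered)], attempts

packets = [b"a", b"b", b"c", b"d"]
print(unreliable_send(packets, {1, 3}))  # [b'a', b'c']
print(reliable_send(packets, {1, 3}))    # ([b'a', b'b', b'c', b'd'], 6)
```

The reliable version delivers everything but needs six transmissions instead of four, which is the trade-off between the two approaches in miniature.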
More data is then added to the TCP segment or UDP datagram by the network layer protocol, which is almost always the Internet Protocol (IP). It specifies the IP address of the recipient, creating an IP packet. Lastly, the data link layer protocol adds even more data to the IP packet. The protocol on this layer is usually either Ethernet or Wi-Fi, both of which specify the MAC address of the next device on the network.
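The layering described above can be sketched by wrapping a payload in three nested headers. The header layouts here are heavily simplified assumptions (real TCP, IP, and Ethernet headers carry many more fields, such as lengths and checksums), and the addresses are reserved documentation or locally administered values.

```python
import ipaddress
import struct

def transport_wrap(payload, src_port, dst_port):
    """Transport layer: add source and destination ports (toy 4-byte header)."""
    return struct.pack("!HH", src_port, dst_port) + payload

def network_wrap(segment, src_ip, dst_ip, ttl=64):
    """Network layer: add source/destination IP addresses and a hop counter."""
    return (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed
            + bytes([ttl]) + segment)

def link_wrap(packet, src_mac, dst_mac):
    """Data link layer: add the MAC addresses of the next hop and the sender."""
    def mac_bytes(mac):
        return bytes.fromhex(mac.replace(":", ""))
    return mac_bytes(dst_mac) + mac_bytes(src_mac) + packet

def decapsulate(frame):
    """Receiver side: peel the headers off again, outermost first."""
    dst_mac, packet = frame[:6], frame[12:]
    dst_ip, segment = packet[4:8], packet[9:]
    _src_port, dst_port = struct.unpack("!HH", segment[:4])
    return {
        "dst_mac": dst_mac.hex(":"),
        "dst_ip": str(ipaddress.IPv4Address(dst_ip)),
        "dst_port": dst_port,
        "payload": segment[4:],
    }

request = b"GET / HTTP/1.1\r\n\r\n"
segment = transport_wrap(request, 49152, 80)
packet = network_wrap(segment, "192.0.2.10", "198.51.100.7")
frame = link_wrap(packet, "02:00:00:00:00:01", "02:00:00:00:00:02")
print(len(request), len(segment), len(packet), len(frame))  # 18 22 31 43
```

Each layer treats everything handed to it from above as opaque payload, which is why the sizes grow by exactly one header per step.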
After all those protocols are applied correctly, the data is transmitted across a physical medium to a gateway router. The router reads the metadata and checks its internal routing table, which maps ranges of destination addresses to neighboring devices, then sends the data onward to what it calculates to be the next step in the optimal route. The next router on the network does the same until the data packet reaches its destination or gets lost on the way. Luckily, a lost packet won’t keep hopping endlessly: each router decrements a hop counter in the packet’s header (the time to live, or TTL, in IPv4) and discards the packet once the counter reaches zero.
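A router's forwarding decision can be sketched with Python's standard ipaddress module. The routing table below is invented for illustration (the neighbor names and address ranges are assumptions); the rule applied, choosing the most specific matching range, is the usual longest-prefix match.

```python
import ipaddress

# Hypothetical routing table: address range -> neighboring device.
ROUTES = {
    "0.0.0.0/0": "isp-gateway",        # default route, matches everything
    "198.51.100.0/24": "router-b",
    "198.51.100.128/25": "router-c",   # more specific than the /24 above
}

def next_hop(destination, routes):
    """Pick the neighbor whose range matches with the longest prefix."""
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, neighbor in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network.prefixlen, neighbor))
    return max(matches)[1] if matches else None

def forward(destination, ttl, routes):
    """Decrement the hop counter; discard the packet when it reaches zero."""
    ttl -= 1
    if ttl <= 0:
        return None, ttl               # packet discarded
    return next_hop(destination, routes), ttl

print(next_hop("198.51.100.200", ROUTES))    # router-c
print(next_hop("203.0.113.9", ROUTES))       # isp-gateway
print(forward("198.51.100.5", 64, ROUTES))   # ('router-b', 63)
print(forward("198.51.100.5", 1, ROUTES))    # (None, 0)
```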
If a device does not know the IP address of the device it wishes to communicate with and only knows a website name, for example, it sends a request to a DNS server. DNS has no single central register: the server, called a resolver, either answers from its cache or queries a hierarchy of name servers (root, top-level domain, then the domain’s own authoritative server) until it obtains the required IP address.
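Name resolution can be sketched as a walk down a hierarchy of name servers with a cache in front of it. Everything here is a toy: the zone data is hard-coded, the server names are invented, the address comes from a range reserved for documentation, and the sketch only handles three-label names such as www.example.com. No real DNS queries are made.

```python
# Hypothetical zone data standing in for real name servers.
ROOT_ZONE = {"com": "tld-com"}
ZONES = {
    "tld-com": {"example.com": "ns-example"},
    "ns-example": {"www.example.com": "192.0.2.44"},
}

def resolve(name, cache):
    """Return (address, number of name servers queried)."""
    if name in cache:
        return cache[name], 0                      # answered from cache
    labels = name.split(".")
    tld_server = ROOT_ZONE[labels[-1]]             # query 1: root
    domain = ".".join(labels[-2:])
    authoritative = ZONES[tld_server][domain]      # query 2: top-level domain
    address = ZONES[authoritative][name]           # query 3: authoritative
    cache[name] = address
    return address, 3

cache = {}
print(resolve("www.example.com", cache))  # ('192.0.2.44', 3)
print(resolve("www.example.com", cache))  # ('192.0.2.44', 0)
```

The second lookup costs nothing because the answer is cached, which is a large part of why name resolution rarely causes a noticeable delay.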
This way routers can efficiently process millions of packets and send them properly to their destination. In addition to all these operations, devices along the path check for random transmission errors by verifying a CRC checksum, and when a secure protocol is in use, the final recipient checks that the data has not been tampered with by verifying a cryptographic hash value. These operations are done in mere milliseconds across billions of devices without anyone noticing. The internet truly is a fascinating web of complexity.
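The two integrity checks mentioned above can be tried out with Python's standard zlib and hashlib modules. This is only a sketch: the frame layout (payload followed by a four-byte CRC-32) and the function names are assumptions, and a plain hash only reveals tampering if the expected value reaches the recipient through a channel the attacker cannot alter.

```python
import hashlib
import hmac
import zlib

def add_crc(payload: bytes) -> bytes:
    """Append a CRC-32 checksum so the receiver can spot random bit errors."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the trailer."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer

def digest(data: bytes) -> str:
    """SHA-256 hash of the data, as a hex string."""
    return hashlib.sha256(data).hexdigest()

def untampered(data: bytes, expected_digest: str) -> bool:
    """Compare against a digest obtained through a trusted channel."""
    return hmac.compare_digest(digest(data), expected_digest)

frame = add_crc(b"hello")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit in transit
print(crc_ok(frame), crc_ok(corrupted))            # True False

trusted = digest(b"pay 10 to Alice")
print(untampered(b"pay 10 to Alice", trusted))     # True
print(untampered(b"pay 99 to Alice", trusted))     # False
```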