Computers seem to be virtually limitless in their abilities: they can send and receive digital information, serve content via many defined protocols, run algorithms much faster than a human can, and even provide countless hours of fun and entertainment. One very common use of computers is serving content to consumers, a concept known as the client / server model. A simple metaphor: a consumer (the client) goes to (makes a connection request to) a place of business (the server) and peruses the content therein. The bottom line is that a server hosts resources and content and provides them to the clients that request them.
Clients and servers communicate by way of protocols, which are predefined methods of communication between networked hosts. A prominent protocol nowadays is HTTP, used in our ever-beloved browsing of the Internet. Other protocols that make our lives much simpler, without diving too far into their technical details, include: TCP, or really the TCP/IP suite, which allows computer systems (identified by IP addresses) to communicate in a reliable manner (TCP); DHCP, which saves many network administrators the headache of manually assigning IP addresses to their userbase; DNS, which spares humans from remembering yet another numerical identifier (the IP addresses of their favorite websites); NTP, which keeps clocks synchronized; and ARP, which matches IP addresses to MAC addresses on a local network. If you’re interested in other common networking protocols and a description of the above protocols and more, refer to this article.
This article is written based on the client / server architecture model and discusses perspectives based on the model, though other architectures work in similar ways. That is, any network communication you have between multiple networked hosts will more or less follow the same basis as the client / server architecture.
How Servers Serve Services
Let’s assume that a business wanted to serve certain content to their userbase. They can create a website describing the business that is accessible using the HTTP protocol. In addition, they can offer file downloads (if applicable) that are accessible via the FTP protocol, or any of the other abundant protocols available. It is also entirely possible to serve content multiple ways from the same machine to meet the needs of varying clients – some may not want to use web access and prefer downloading content via FTP, for instance.
A server knows which service to provide from the destination port number in an incoming connection request. A connection request carries a source IP address, a destination IP address, a source port, a destination port, and the protocol being used. For the request to succeed, the server system (the physical hardware itself) must be running server software with a socket bound to the requested port. For instance, port 80 is the common port for HTTP, 443 is for HTTPS, and 21 is for FTP. Once a connection is established, that is, once the kernel has handed the request to the software listening on that port, the server may offload the connection internally so that it can keep accepting further connection requests on the same port. This StackOverflow post is rather technical but provides great insight on how this is done.
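To make the bind, listen, and offload steps a little more concrete, here is a minimal sketch using Python's standard socket and threading modules. The function names (start_echo_server, handle_client, ask) are my own for illustration, and the "service" simply echoes back what it receives. It binds to port 0, which asks the operating system for any free port, whereas a real web server would bind to 80 or 443.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """Serve one client on its own connected socket, then close it."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)


def start_echo_server(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a listening socket and accept connections in a background thread.

    port=0 lets the OS pick a free port (an assumption made for this demo).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()

    def accept_loop() -> None:
        while True:
            try:
                conn, _peer = listener.accept()
            except OSError:
                break  # the listening socket was closed
            # Offload the connection so the listener can keep accepting.
            threading.Thread(
                target=handle_client, args=(conn,), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener


def ask(port: int, message: bytes) -> bytes:
    """Connect to the local demo server, send a message, return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        return client.recv(1024)
```

Calling start_echo_server() and then ask(listener.getsockname()[1], b"hello") exercises the whole path: the listening socket stays on its port while each accepted connection is handled on a separate thread.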
Which server software is configured, a web server or an FTP server for example, determines how communications are handled from there. Each protocol is specified in one or more RFCs. Requests for Comments (RFCs) are the documents that detail how each protocol allows communication between clients and servers. Protocols define special keywords that tell the software how to behave. For instance, HTTP has a GET keyword that requests a resource from a specific host.
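As a rough illustration of what such a keyword looks like on the wire, the sketch below builds the text of a simple HTTP/1.1 GET request and pulls the keyword back out of it. The helper names and the example.com host are placeholders of mine, and this ignores most of what a real HTTP client or server has to handle.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # keyword, requested resource, version
        f"Host: {host}",         # which host the resource lives on
        "Connection: close",
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_request_line(raw: bytes) -> tuple:
    """Split the first line of a request into (method, target, version)."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, target, version = request_line.split(" ")
    return method, target, version
```

A server that receives these bytes reads the first line, sees GET, and knows the client is asking for the resource named next to it.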
Fun fact, there are 65,535 TCP ports and 65,535 UDP ports on a computer! In other words, a server can serve a lot of different services over TCP or UDP. Granted, every service shares the same pool of system resources (RAM, CPU, storage), so it may not be wise to host everything on a single computer system. Another consideration is how many clients access the server’s content simultaneously. If you want to set up a personal test server, I wouldn’t expect this to be an issue at all. In fact, I have a single system acting as a web server and FTP server at home, among other services.
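The number comes from the size of the port field: TCP and UDP headers carry ports as 16-bit unsigned integers, which gives the values 0 through 65,535, with port 0 reserved. A small sketch of that arithmetic, using Python's struct module to stand in for the 16-bit field (the helper name is mine):

```python
import struct

PORT_FIELD_BITS = 16
MAX_PORT = 2**PORT_FIELD_BITS - 1  # largest value a 16-bit field can hold


def fits_in_port_field(port: int) -> bool:
    """Return True if the number can be packed as an unsigned 16-bit value."""
    try:
        struct.pack("!H", port)  # "!H" = network byte order, unsigned short
    except struct.error:
        return False
    return True
```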
How Do Clients and Servers Communicate
Let’s expand a bit more on the incoming request mentioned above, which is described by five pieces of information (source IP, destination IP, source port, destination port, protocol).
When a client makes a request to a server for HTTPS communications, the client uses a system-generated source port number and sends the request out through the local router to the server on the desired port, in this case port 443. The client’s host picks a random source port, the router replaces it with yet another random source port that it keeps track of, and the request continues to the server. The server sends its response back to the IP address and source port that the request arrived from: the IP address is the public IP address of the client’s network, and the source port is the one the router generated. The router uses that port to identify which internal client made the request. This way, if multiple clients within the same network make connections to the same remote server, the source ports determine which client the router should send the return traffic to.
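You can watch the operating system pick a source port with a few lines of Python. This is only a sketch: it connects to a throwaway listener on the loopback address rather than to a real HTTPS server, so no router is involved, and the FiveTuple type and observe_connection function are names I made up for the demonstration.

```python
import socket
from typing import NamedTuple


class FiveTuple(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def observe_connection() -> tuple:
    """Open one local TCP connection and report both ends' view of it.

    Returns (five_tuple_seen_by_client, peer_address_seen_by_server).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # demo server on any free port
        listener.listen(1)
        dst_ip, dst_port = listener.getsockname()

        with socket.create_connection((dst_ip, dst_port)) as client:
            conn, peer = listener.accept()
            with conn:
                # The client never chose a source port; the OS assigned one.
                src_ip, src_port = client.getsockname()
                view = FiveTuple("tcp", src_ip, src_port, dst_ip, dst_port)
                return view, peer
```

The address the server sees as its peer is exactly the client's source IP and source port, which is where it sends the return traffic.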
Bear in mind that traffic going out onto the Internet needs a unique, publicly routable IP address so that the sender can be identified and replied to. Internal client IP addresses come from the private ranges reserved by RFC 1918, which are not routable on the Internet. This is where network address translation (NAT) comes into play: internal clients making requests to public servers have their IP address rewritten by their network’s router / gateway device before the request is sent out onto the Internet. Gateway devices keep track of multiple clients via connection tables that record internal client IP addresses and their corresponding source ports. Return traffic is addressed to the public IP of the network along with the source port generated by the gateway device. From there, the gateway device looks up in its connection table which client is associated with that source port, and the traffic gets sent back to the original client.
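The connection table is easier to picture as a pair of lookups. The toy model below is a sketch under heavy assumptions: the NatGateway class is hypothetical, it hands out ports sequentially starting at 50000 (real gateways choose ports differently), it never expires entries, and the 203.0.113.10 address used with it comes from a range set aside for documentation.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterator, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class NatGateway:
    """Toy connection table mapping gateway ports to internal clients."""

    public_ip: str
    _next_port: Iterator[int] = field(default_factory=lambda: count(50000))
    _by_public_port: Dict[int, Endpoint] = field(default_factory=dict)
    _by_client: Dict[Endpoint, int] = field(default_factory=dict)

    def outbound(self, client: Endpoint) -> Endpoint:
        """Rewrite an internal (IP, port) to the public IP and a gateway port."""
        if client not in self._by_client:
            public_port = next(self._next_port)
            self._by_client[client] = public_port
            self._by_public_port[public_port] = client
        return (self.public_ip, self._by_client[client])

    def inbound(self, public_port: int) -> Endpoint:
        """Find which internal client owns a gateway-generated port."""
        return self._by_public_port[public_port]
```

Two internal clients that happen to use the same source port still get different gateway ports, so return traffic can be told apart.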
The above may seem complicated, and the technical details surely are, but the concept is simple. Any computer system can be a server if the appropriate software is installed on it. This server software interacts with the hardware by means of system calls. In simple terms, system calls are the API (application programming interface) that an operating system offers to the software installed on it. The software uses that API, usually through libraries, to ask the kernel for access to the hardware. Check the references for more insight on this.
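For a small taste of what going through the kernel looks like from a program, Python's os module exposes low-level functions that correspond closely to the operating system's own pipe, write, read, and close calls. The sketch below is illustrative only; the function name is mine, and it sends a short message through a kernel-managed pipe instead of a network card.

```python
import os


def roundtrip_through_kernel(payload: bytes) -> bytes:
    """Write bytes into a kernel pipe and read them back out."""
    read_fd, write_fd = os.pipe()  # ask the kernel for a one-way channel
    try:
        written = os.write(write_fd, payload)  # hand bytes to the kernel
        return os.read(read_fd, written)       # ask the kernel for them back
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Network sockets work along the same lines: the program asks, and the kernel does the actual talking to the hardware.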
Here is one more example of an Internet-bound connection request. Suppose a local network is assigned 10.0.1.0/24. A client on that network (let’s say 10.0.1.50) that makes a request has a random source port (for example, 55087) assigned to the request, and the port is sent alongside the client’s IP address in a packet. The gateway device accepts the request, provided it matches any preset rules for Internet traffic, and rewrites the packet with its own source port (55087 is changed to 53798) and with the public IP address assigned to the physical site by the ISP; for example, let’s assume it was WatchGuard.com’s IP address of 126.96.36.199. Since the public IP address is routable, the return traffic knows where to go, and it gets analyzed by the gateway device once again.
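The internal addresses in this example can be checked with Python's standard ipaddress module. This is a small sketch with helper names of my own choosing; the three blocks listed are the private ranges defined in RFC 1918.

```python
import ipaddress

# The three private address blocks reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside an RFC 1918 private block."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


def on_network(address: str, cidr: str) -> bool:
    """Return True if the address belongs to the given network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)
```

Here is_rfc1918("10.0.1.50") and on_network("10.0.1.50", "10.0.1.0/24") both hold, which is why the gateway has to rewrite that address before the packet leaves for the Internet.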
I added some references that greatly expand on all of this and recommend reading them, if you’re interested in the technical details at least. The complexity of computers and how they truly work on the inside, compared to the ease of use offered to consumers, is just fascinating to me. My hope is to break down complex details into easy-to-digest chunks of information.
Chase, J. Sockets and Client/Server Communication. Retrieved from https://users.cs.duke.edu/~chase/cps196/slides/sockets.pdf
Examcollection.com contributors. How do Common Networking Protocols Function. Retrieved from https://www.examcollection.com/certification-training/network-plus-how-do-common-networking-protocols-function.html
Mozilla.org contributors. What is a web server? Retrieved from https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_is_a_web_server
Jean-Pierre Schwickerath says
“Fun fact, there are 65,535 TCP ports and 65,535 UDP ports on a computer!”
This is only true if you assume that a computer uses only one IP address. A port is rather useless if not combined with an IP address, making the pair a “socket”. This socket is what is actually used to any kind of communication.
Emil Hozan says
Correct, thank you for pointing that out!