Computer network programming, often referred to as socket programming, is about writing programs that exchange information between processes running on connected hosts such as computers and mobile devices. Hosts are generally interconnected over a private network (LAN) or the internet. The communicating processes can also run on the same host.
Most applications today, desktop or mobile, are increasingly online. They exchange information with other entities over the internet. For example, when you opened this article, your web browser (Chrome, IE or something else) exchanged information with a web server identified by ‘qnaplus.com’. Both the web browser and the web server are examples of network programs.
When we write network programs, we rely on a software abstraction called a socket. A socket allows a program to send data to, and receive data from, another program.
A network socket acts as an endpoint of a network communication and is generally identified by a transport protocol, an IP address and a port. For a programmer, a socket is just an integer; the socket library APIs hide all the details. On Unix-like systems, a socket is a file descriptor (fd) like any other. Besides the generic read and write, we have send- and receive-type functions to exchange information.
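To see that a socket really is just an integer tied to a protocol, an IP address and a port, here is a minimal sketch using Python's standard socket module. The function name describe_socket is made up for this illustration, and binding to port 0 (which asks the operating system to pick any free port) is simply a convenient assumption for a demo.

```python
import socket


def describe_socket() -> dict:
    """Create a TCP socket on the loopback address and report its identity."""
    # AF_INET selects IPv4, SOCK_STREAM selects TCP as the transport protocol.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 lets the operating system choose a free port for us.
        sock.bind(("127.0.0.1", 0))
        ip, port = sock.getsockname()
        return {
            "fd": sock.fileno(),  # the integer the program actually holds
            "ip": ip,
            "port": port,
        }


if __name__ == "__main__":
    print(describe_socket())
```

The exact fd and port values differ from run to run, because the operating system assigns both.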
Here are some important concepts associated with network programming.
Client Server Architecture
When processes communicate over a computer network, they play two distinct roles: server and client. The server passively waits for connections from clients. The clients, on the other hand, actively initiate a connection to the server. Once the connection is established, the client requests something, such as text, an image or a video, and the server serves it.
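The two roles can be sketched in a few lines. This is only an illustration under simplifying assumptions: both roles run in one process on the loopback address, the server handles a single client, and the function names (serve_once, request_from_server, demo) are invented for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server role: wait passively for one client, then answer its request."""
    conn, _addr = listener.accept()  # blocks until a client connects
    with conn:
        # One recv is enough for this tiny message; real code would loop.
        request = conn.recv(1024)
        conn.sendall(b"served: " + request)


def request_from_server(address, request: bytes) -> bytes:
    """Client role: actively connect, send a request, read the reply."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(request)
        return sock.recv(1024)


def demo() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: any free port
        listener.listen()
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        reply = request_from_server(listener.getsockname(), b"home page")
        server.join()
        return reply


if __name__ == "__main__":
    print(demo())
```

Notice that only the client calls a connect-type function, while the server only listens and accepts. That asymmetry is the whole difference between the two roles.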
In our web page access example, the website ‘qnaplus.com’ acts as a server running somewhere on the internet. The server here is identified by the domain name ‘qnaplus.com’, which eventually gets translated into an IP address. That address, together with a port, tells clients where to connect. The server always waits for new connection requests from clients.
The web browser that you are using, Chrome or IE, acts as a client. When you enter the website address ‘qnaplus.com’ in the address bar, the browser contacts a DNS (Domain Name System) server to translate the name into an IP address. For website access it uses port 80 by default for HTTP (port 443 for HTTPS). Then, using this IP address and port, the browser connects to the server. Once the connection is established, the client requests the home page of the website; the bare website name generally represents the home page. The server transfers the home page over the network, and the browser eventually displays it. Other pages, identified by a URL (Uniform Resource Locator), can be requested in a similar way.
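The browser's steps (resolve the name, connect, request a page, read the reply) can be approximated with plain sockets. The sketch below is not how a real browser is implemented: it assumes plain HTTP without encryption, speaks the very simple HTTP/1.0, restricts itself to IPv4, and the function name fetch_page is hypothetical.

```python
import socket


def fetch_page(host: str, port: int = 80, path: str = "/") -> bytes:
    """Resolve host, connect over TCP and send a bare HTTP/1.0 GET request."""
    # Step 1: name resolution. For a domain name this triggers a DNS lookup.
    family, socktype, proto, _canon, sockaddr = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )[0]

    # Step 2: connect to the resolved IP address and port.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(5)
        sock.connect(sockaddr)

        # Step 3: request the page; "/" stands for the home page.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))

        # Step 4: with HTTP/1.0 the server closes the connection when done,
        # so we read until recv returns an empty byte string.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling fetch_page with a real domain name needs internet access, and many sites today answer plain HTTP on port 80 only with a redirect to HTTPS, so the reply may not be the page itself.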
Networking Protocols
Protocols are the heart of network communication. A protocol is basically a set of rules agreed upon by the parties involved in data communication. The purpose of a protocol varies: transferring information, security, element discovery, management and so on.
Protocols primarily define the format of the data that is exchanged and how to process the received data.
Data format means which bit or byte of the transmitted data represents what. For example, the IP packet header is laid out as follows.
The first 4 bits represent the IP version. For IPv4, the value of this field is 4. IP packets generally carry transport layer packets such as TCP or UDP. We’ll learn about this later.
Similarly, the third and fourth bytes (byte offsets 2 and 3) hold the total length of the IP packet, so whoever processes the packet knows where it ends.
All other header fields are likewise defined by the IP protocol. All parties involved in IP communication have to agree on this layout.
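Here is a small sketch of what "agreeing on the format" means in code: a parser that picks the version, the total length and a few other fields out of the fixed 20-byte part of an IPv4 header. The function names and the sample addresses are made up for illustration, the checksum in the sample is left as 0 rather than computed, and header options are ignored.

```python
import socket
import struct

# Fixed part of the IPv4 header, in network (big-endian) byte order:
# version+IHL, TOS, total length, identification, flags+fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the first 20 bytes of an IPv4 packet."""
    if len(packet) < IPV4_HEADER.size:
        raise ValueError("need at least 20 bytes for an IPv4 header")
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = IPV4_HEADER.unpack(packet[:20])
    return {
        "version": version_ihl >> 4,                 # first 4 bits
        "header_length": (version_ihl & 0x0F) * 4,   # in bytes
        "total_length": total_length,                # byte offsets 2 and 3
        "ttl": ttl,
        "protocol": protocol,                        # 6 = TCP, 17 = UDP
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


def build_sample_header() -> bytes:
    """Hypothetical header: IPv4, 40 bytes total, carrying TCP."""
    return IPV4_HEADER.pack(
        (4 << 4) | 5, 0, 40, 0x1234, 0, 64, 6, 0,  # checksum left as 0
        socket.inet_aton("192.0.2.1"),
        socket.inet_aton("198.51.100.7"),
    )


if __name__ == "__main__":
    print(parse_ipv4_header(build_sample_header()))
```

Because both sides agree on this layout, the receiver can recover exactly the values the sender packed, without any extra description travelling with the packet.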
How a packet is processed depends on the protocol. For example, both routers and network hosts process IP packets. A router’s job is to help packets reach their final destinations: it reads the destination IP address from a predefined header field and figures out the next hop for the packet.
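How a router "figures out the next hop" is not spelled out above. A common approach is longest-prefix matching against a routing table, and the sketch below assumes that approach. The routing table entries and the function name next_hop are invented for this example.

```python
import ipaddress

# Hypothetical routing table: (destination network, next-hop address).
# The 0.0.0.0/0 entry is the default route and matches every address.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1"),
    (ipaddress.ip_network("10.0.0.0/8"), "10.255.0.1"),
    (ipaddress.ip_network("10.1.0.0/16"), "10.1.255.1"),
]


def next_hop(destination: str) -> str:
    """Pick the next hop of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The longest prefix is the most specific match.
    _net, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


if __name__ == "__main__":
    for dest in ("10.1.2.3", "10.9.9.9", "8.8.8.8"):
        print(dest, "->", next_hop(dest))
```

A linear scan like this is fine for three routes. Real routers hold far larger tables and use specialized data structures for the same lookup.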
Connectionless and Connection Oriented Transport
There is always a tradeoff between transport reliability and communication overhead. Depending on which side of that tradeoff they favor, transport protocols are categorized as connection-oriented or connectionless.
Connection-Oriented
Connection-oriented protocols, such as TCP, aim for reliable data transport.
Connection establishment: Before sending any data, the two parties (client and server) establish a connection. It is a kind of handshake where both parties inform each other that they are available and ready to exchange data.
Reliability: A connection-oriented protocol ensures that the sent data is actually received by the receiver. The receiver sends an acknowledgement to the sender that some portion of the data has actually been received.
No-drop delivery: It ensures that no portion of the transmitted data is dropped. If some data is lost in transmission, it is retransmitted. This is achieved with data sequence numbers and an acknowledgement mechanism.
In-order delivery: Connection-oriented protocols ensure that the data portions are delivered in the same order in which the sender sent them. Even if data packets arrive out of order from the lower layer protocol (say IP), they are rearranged based on their sequence numbers before being delivered to the receiver.
Apart from these, connection-oriented protocols provide flow control, congestion control and, in some protocols such as SCTP, multi-homing.
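To make the interplay of sequence numbers, acknowledgements and retransmission concrete, here is a toy simulation. It is not real TCP: sequence numbers count whole segments rather than bytes, there are no timers, and the names Receiver and transfer are made up for this sketch.

```python
class Receiver:
    """Reorders segments by sequence number and acknowledges cumulatively."""

    def __init__(self):
        self.expected = 0    # next sequence number to hand to the application
        self.buffer = {}     # out-of-order segments, keyed by sequence number
        self.delivered = []  # data handed to the application, in order

    def on_segment(self, seq: int, data: str) -> int:
        """Accept one segment and return the next expected sequence number."""
        if seq >= self.expected:  # older segments are duplicates, ignore them
            self.buffer[seq] = data
        # Deliver every segment that is now contiguous with what we have.
        while self.expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        return self.expected  # acts as the acknowledgement


def transfer(chunks, first_round):
    """Send chunks; first_round lists the segments that arrive, in order.

    Segments missing from first_round count as lost and are retransmitted
    once the acknowledgement shows that the receiver is still waiting.
    """
    receiver = Receiver()
    ack = 0
    for seq in first_round:
        ack = receiver.on_segment(seq, chunks[seq])
    # Retransmit from the first unacknowledged segment until all are acked.
    while ack < len(chunks):
        ack = receiver.on_segment(ack, chunks[ack])
    return receiver.delivered


if __name__ == "__main__":
    # Segment 1 is lost and the others arrive out of order.
    print(transfer(["a", "b", "c", "d"], first_round=[2, 0, 3]))
```

In the sample run, segment 2 arrives first and is held back, segment 0 is delivered at once, and the lost segment 1 is retransmitted, after which the buffered segments 2 and 3 are released in order.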
Connectionless
In contrast, connectionless protocols, such as UDP, transport data on a best-effort basis.
Connectionless protocols don’t establish a connection before transmitting the actual data. The sender won’t know whether the receiver actually exists or is ready to receive data.
The sender will not know whether the transmitted data is actually received by the receiver.
Some of the transmitted packets might never be delivered.
There is no guarantee that the transmitted packets will be received by the receiver in the same order they were transmitted.
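This transport mode avoids the overheads, such as bigger headers and extra control packet exchanges, that connection-oriented protocols incur to achieve reliability. For instance, a UDP header is 8 bytes, while a TCP header is at least 20 bytes.
It is also the natural mode for multicasting and broadcasting, because a connection is by definition between two specific endpoints.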
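A connectionless exchange looks like this with UDP sockets. The sketch assumes both ends run in one process on the loopback address, where loss is very unlikely; over a real network the datagram could simply vanish and the sender would never find out. The function name udp_demo is made up for the example.

```python
import socket


def udp_demo(message: bytes = b"hello") -> bytes:
    """Send one datagram over the loopback address and return what arrives."""
    # SOCK_DGRAM selects UDP, a connectionless transport.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: any free port
        receiver.settimeout(5)           # do not wait forever if it is lost

        # No listen, accept or connect: the sender just addresses a datagram.
        sender.sendto(message, receiver.getsockname())

        data, _sender_address = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(udp_demo())
```

Compared with a TCP exchange, there is no handshake before the data and no acknowledgement after it, which is exactly where the lower overhead comes from.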