Backend Engineering 2: TCP vs UDP
TCP and UDP are in layer 4 of the OSI model
TCP (Transmission Control Protocol) is a connection-oriented protocol that sits on Layer 4 (Transport) of the OSI model.
❓ What does it mean by “connection-oriented”?
👉 connection-oriented: a connection must be established successfully before any data is sent.
👉 In contrast, connectionless means data is sent right away, without checking whether a connection exists or whether the receiver is even there.
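To make the difference concrete, here is a minimal sketch on the loopback interface. It assumes a port that is reserved but never put into listening mode, so no peer will answer. The helper names (`try_tcp`, `try_udp`, `demo`) are made up for this example.

```python
import socket

LOOPBACK = "127.0.0.1"

def try_tcp(port: int) -> str:
    """TCP: connect() must complete a handshake first, so it fails with no listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2)
        try:
            s.connect((LOOPBACK, port))
            return "connected"
        except OSError as exc:
            return f"failed: {type(exc).__name__}"

def try_udp(port: int) -> int:
    """UDP: sendto() just hands the datagram to the network and returns its size."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return s.sendto(b"hello", (LOOPBACK, port))

def demo():
    # Reserve a TCP port but never call listen(), so nobody accepts connections on it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as holder:
        holder.bind((LOOPBACK, 0))
        port = holder.getsockname()[1]
        return try_tcp(port), try_udp(port)

if __name__ == "__main__":
    print(demo())
```

The TCP attempt reports a failure, while the UDP send "succeeds" even though nobody is there to receive the data.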
Pros:
- ✅ Acknowledgement: Say we have a TCP connection between A and B. When A sends a packet to B and B receives it, B sends a message back to A: Hey, I received the packet you sent, thanks.
- ✅ Guaranteed delivery: Back to the example above: what if B never receives the packet from A? In that case A gets no acknowledgement, so it retransmits the packet until B confirms it, which is why we call it guaranteed delivery. Note that there is also a timeout: if A keeps retransmitting past that timeout without success, the connection is dropped.
- ✅ Congestion control: When senders push too many packets through the network, it can cause congestive collapse resulting from oversubscription. Congestion control makes senders slow down so they don't overwhelm the network.
- ✅ Packet ordering: TCP adds a sequence number to each packet, so the receiver knows the exact order of the packets and can discard duplicates.
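The acknowledgement, retransmission, and ordering ideas above can be sketched with a toy model. This is not how a real TCP stack is implemented (real TCP uses sliding windows and adaptive timers). It is a simplified stop-and-wait simulation, and `send_reliably`, `reassemble`, `loss_rate`, and `max_retries` are names invented for the example.

```python
import random

def send_reliably(payload, loss_rate=0.3, max_retries=20, rng=None):
    """Retransmit one segment until an ACK comes back; return the number of attempts."""
    rng = rng or random.Random(7)
    for attempt in range(1, max_retries + 1):
        if rng.random() < loss_rate:
            continue          # segment lost: no ACK arrives, so the sender retries
        if rng.random() < loss_rate:
            continue          # ACK lost: sender retries, receiver will see a duplicate
        return attempt        # ACK received
    # Mirrors the timeout from the text: give up and drop the connection.
    raise TimeoutError(f"no ACK for {payload!r} after {max_retries} attempts")

def reassemble(arrivals):
    """Receiver side: drop duplicates and restore order using sequence numbers."""
    buffer = {}
    for seq, payload in arrivals:
        buffer.setdefault(seq, payload)   # a repeated seq is a duplicate; keep the first copy
    return [buffer[seq] for seq in sorted(buffer)]

if __name__ == "__main__":
    print(send_reliably("hello"))
    # Segments arrive out of order, and segment 2 arrives twice.
    print(reassemble([(2, "c"), (0, "a"), (2, "c"), (1, "b")]))
```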
Cons:
- ❌ Larger packets: The TCP header carries a sequence number, an acknowledgment number, a window size, and control flags, so it is at least 20 bytes, compared to 8 bytes for UDP.
- ❌ More bandwidth: Larger headers plus acknowledgment and retransmission traffic consume more bandwidth.
- ❌ Slower than UDP: A lot happens when transmitting packets over TCP: the handshake, waiting for acks, retransmitting lost packets, backing off for congestion control... each of these steps adds latency.
- ❌ Stateful:
- 👉 TCP is stateful because both source and destination keep information about the connection. They have to store this state because TCP's main concern is delivering the data to the destination, so it has to keep track of the status of the data (for example, how much data each side is ready to receive) as well as the order of the packets. If the source or destination goes down, the connection is lost and so is all the state both sides were carrying.
- ❌ Server memory: For each connection, the operating system must allocate some memory (buffers and connection state, held in RAM rather than on disk) and watch for newly transmitted packets on that connection. With too many TCP connections this leads to high memory usage, which can result in DoS (denial of service). (A connection pool can help with this, and we can switch to UDP if necessary.)
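To put a number on the "larger packets" point, here is a small sketch that lays out the fixed fields of both headers with Python's `struct` module. It covers only the minimum headers (no TCP options, no IP header), and `header_overhead` is a helper invented for this example.

```python
import struct

# Minimum TCP header: source port, destination port, sequence number,
# acknowledgment number, data offset + flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

# UDP header: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def header_overhead(payload_len: int, header: struct.Struct) -> float:
    """Fraction of the transport-layer segment that is header rather than payload."""
    return header.size / (header.size + payload_len)

if __name__ == "__main__":
    print(TCP_HEADER.size, UDP_HEADER.size)   # 20 and 8 bytes
    for name, header in (("TCP", TCP_HEADER), ("UDP", UDP_HEADER)):
        print(name, f"{header_overhead(32, header):.0%} overhead on a 32-byte payload")
```

The gap matters most for small payloads; for large payloads the header is a tiny share either way.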
A TCP connection is established using a three-way handshake (termination uses a separate exchange of FIN and ACK packets). You only need to understand it at a high level: to establish a connection between A and B, we have 3 steps:
- Firstly, A sends a packet (SYN) to B to say: Hey, I want to connect to you, is that ok?
- Secondly, if B is okay with that, it sends a packet (SYN-ACK) back to A to say it's ready to connect.
- Lastly, A sends another packet (ACK) to B to confirm, and the connection is open.
After the above 3 steps, a TCP connection is established and A and B can transmit packets to each other.
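The three steps can be modeled with a few lines of Python. This is only a toy model of the sequence and acknowledgment numbers exchanged. In practice the operating system performs the handshake for you when you call `connect()`. `Segment` and `three_way_handshake` are hypothetical names, and the initial sequence numbers are arbitrary inputs.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32-bit and wrap around

@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None

def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged between A (client) and B (server)."""
    # Step 1: A -> B, "I want to connect", carrying A's initial sequence number.
    syn = Segment("SYN", seq=client_isn)
    # Step 2: B -> A, "ok", acknowledging A's SYN and carrying B's own sequence number.
    syn_ack = Segment("SYN-ACK", seq=server_isn, ack=(syn.seq + 1) % SEQ_SPACE)
    # Step 3: A -> B, acknowledging B's SYN; the connection is now established.
    ack = Segment("ACK", seq=(syn.seq + 1) % SEQ_SPACE, ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]

if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both sides agree on where the data stream starts.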
TCP is recommended for applications that require high reliability, e.g. web servers, databases, FTP, SSH.
UDP (User Datagram Protocol) sits on Layer 4 like TCP, but it's connectionless (I explained what connectionless means above).
- ✅ Smaller packets
- ✅ Less bandwidth
- ✅ Faster than TCP
- ✅ Stateless
- ❌ No acknowledgment: Say A sends UDP traffic to B. If A sends a datagram (the UDP name for a packet) to B, it doesn't care whether B receives it or not.
- ❌ No guaranteed delivery: Because it doesn't know whether the datagram arrived, it won't retry if sending the datagram failed.
- ❌ No congestion control: UDP does nothing to slow a sender down when the network is congested.
- ❌ No ordered packets: Datagrams carry no sequence numbers, so they can arrive out of order or duplicated.
- ❌ Security: Since there's no congestion control and no handshake, it is easy to flood a target with UDP traffic, so DoS attacks can happen easily.
UDP is mostly used in systems that can tolerate some data loss but require real-time communication, such as video chat and streaming.
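Here is a minimal sketch of what "no handshake, no acknowledgment" looks like in code, using two UDP sockets on the loopback interface. `udp_round_trip` is a name made up for this example. Since UDP gives no delivery signal, the receiver has to use its own timeout to decide that a datagram is not coming.

```python
import socket

def udp_round_trip(payload: bytes, timeout: float = 1.0):
    """Send one datagram to a local receiver; return what arrived, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))       # let the OS pick a free port
        receiver.settimeout(timeout)
        # No connect(), no accept(): the sender just fires the datagram.
        sender.sendto(payload, receiver.getsockname())
        try:
            data, _sender_addr = receiver.recvfrom(2048)
            return data
        except socket.timeout:
            # UDP never tells us a datagram was lost; the app has to notice by itself.
            return None

if __name__ == "__main__":
    print(udp_round_trip(b"frame-1"))
```

For real-time media this trade-off is usually fine: a late video frame is useless anyway, so skipping it beats waiting for a retransmission.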
❓ What is the maximum number of TCP connections that can be opened for a single port?
👉 Each TCP connection is identified by 4 addressing fields: source_ip, source_port, destination_ip, destination_port (source_ip and source_port are our client info, while destination_ip and destination_port are our server info). If one client opens multiple connections to the same port on the server, only source_port differs among these connections. Ports are 16-bit numbers, so a single client IP can have at most 2^16 = 65,536 connections to that port. Of course this is just a theoretical number: in real life the maximum is smaller, because the system reserves some ports that you can't use and only hands out a limited range of ports for outgoing connections. Across many different clients, the limit on the server side comes from resources such as memory and file descriptors rather than from port numbers.
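You can see the 4 fields in action with a quick sketch on the loopback interface: two connections from the same client to the same server port only differ in their source port. `open_two_connections` is a name invented for this example, and the port numbers the OS picks will vary from run to run.

```python
import socket

MAX_PORTS = 2 ** 16   # 16-bit port field: 65,536 possible values

def open_two_connections():
    """Open two TCP connections to one local server port; return (local, peer) address pairs."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # let the OS pick a free port
        server.listen(5)
        server_addr = server.getsockname()
        clients = [socket.create_connection(server_addr, timeout=2) for _ in range(2)]
        try:
            # getsockname() is (source_ip, source_port); getpeername() is the server side.
            return [(c.getsockname(), c.getpeername()) for c in clients]
        finally:
            for c in clients:
                c.close()

if __name__ == "__main__":
    for local, peer in open_two_connections():
        print("source", local, "-> destination", peer)
    print("theoretical port count:", MAX_PORTS)
```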
I'll keep adding for more questions on this section...
So that’s it for today, I’ll come back later, Stay tuned. Bye-bye
Copyright:
Cover photo from: https://quantrimang.com/su-khac-nhau-giua-giao-thuc-tcp-va-udp-154559
References:
https://github.com/donnemartin/system-design-primer#communication
https://www.youtube.com/watch?v=qqRYkcta6IE&list=PLQnljOFTspQUNnO4p00ua_C5mKTfldiYT&index=6