All articles in this series
- Building Your Own Web Server: Part 1 — Theory and Foundations
- Building Your Own Web Server: Part 2 — Plan and Implementation of HTTP and Configuration parser
- Building Your Own Web Server — Part 3: Blocking Single and Multithreaded Server
🙌 Preface
Let me start with a small note. I’m very happy we didn’t jump into the implementation of the server right from the beginning. By first building out modules like the configuration parser and the HTTP request parser, we now have a solid foundation that we can rely on — and that allows us to focus this part purely on the server logic.
This article continues our journey of implementing a web server from scratch, and here we’ll cover the most important part: handling TCP connections and serving HTTP traffic.
We’ll go through three types of server models step-by-step:
- Blocking single-threaded server — the simplest possible form
- Blocking server with keep-alive support — to match HTTP/1.1 behaviour
- Multithreaded non-blocking server — real-world concurrency and performance
For each of them, we’ll walk through the algorithm, real Python implementation, and performance benchmarks with ApacheBench.
🧠 What You’ll Learn
In this part of the series, you’ll get a hands-on understanding of how real web servers operate at the socket and HTTP level.
By the end of the article, you’ll know how to:
- ✅ Implement a blocking TCP server that serves HTTP requests
- 🔁 Extend it to support HTTP keep-alive connections (used by all modern browsers)
- 🧵 Use threads to handle multiple clients in parallel
- 🧱 Structure your code with clean, testable building blocks (HTTPSession, HTTPProcessor, DataProvider)
- 🧭 Parse and route requests using a configuration file with server and location blocks (NGINX-style)
- 📈 Benchmark your server using ab (ApacheBench) and analyze the results
- ⚖️ Compare performance between single-threaded and multithreaded architectures
- 👨💻 Prepare your architecture for the next level: non-blocking I/O and event loops
🧱 1. Blocking Server (No Keep-Alive)
Let’s start from the most minimal working implementation.
❓ What is a Blocking Server?
This type of server does one thing at a time:
- It listens for incoming TCP connections
- When a client connects, it reads the request
- Sends a response
- Closes the connection and exits
This version is single-threaded and serves only one request. If the client is slow or does not send anything — the server just waits (blocks). It’s not usable in practice, but is a great place to start.
🧩 Algorithm Description (Beginner Friendly)
- Create a socket using Python’s socket module
- Bind it to a host and port so the OS knows where to route connections
- Listen for incoming client connections
- Accept the first connection — this blocks the server until a client connects
- Receive data from the client — this also blocks if nothing is sent
- If data is received:
- Compose a basic HTTP 200 OK response
- Send the response
- Close the connection and exit the server
That’s it! Here’s the code.
💻 Python Implementation (With Comments)
import socket

class Server:
    def __init__(self, host='', port=8080):
        self.HOST = host  # Empty string means "listen on all available interfaces"
        self.PORT = port  # Default port we’ll serve on

    def start_server(self):
        # Step 1: Create a TCP socket
        with socket.socket() as s:
            # Step 2: Bind the socket to the host and port
            s.bind((self.HOST, self.PORT))
            # Step 3: Start listening for connections
            s.listen()
            print(f"[Server] Listening on {self.HOST}:{self.PORT}")
            # Step 4: Accept a connection (blocking call)
            connection, client_address = s.accept()
            # Step 5: Use the connection
            with connection:
                print(f"[Server] Connected by address: {client_address}")
                # Step 6: Receive data (up to 1024 bytes)
                received_data = connection.recv(1024)
                if received_data:
                    # Step 7: Build a basic HTTP response
                    response = (
                        "HTTP/1.1 200 OK\r\n"
                        "Content-Length: 5\r\n"
                        "Content-Type: text/plain\r\n"
                        "\r\n"
                        "Hello"
                    )
                    # Step 8: Send response back to the client
                    connection.sendall(response.encode())
                # Step 9: Connection is closed automatically by the 'with' block
✅ Pros and ❌ Cons
✅ Pros | ❌ Cons |
---|---|
Very easy to understand and implement | Can only handle one connection, ever |
Good for debugging and experimentation | Blocks on every operation (accept, recv) |
 | No concurrency — zero scalability |
 | Not compatible with modern clients using keep-alive |
🔁 2. Blocking Server with Keep-Alive
In the previous section, we implemented the most basic version of a web server. It accepted a single connection, processed a single HTTP request, returned a dummy response, and then shut down. It worked, but it wasn’t realistic. Browsers and clients don’t behave that way — and servers shouldn’t either.
Let’s now move to the next step and support persistent TCP connections — i.e., keep-alive — which is the default in HTTP/1.1.
📚 What is Keep-Alive?
By default, modern HTTP clients (including all browsers) expect that one TCP connection will be reused for multiple requests and responses. This saves a lot of time by avoiding repeated handshakes and allows for faster page loads.
But in our first server version, we closed the connection after a single response. This breaks keep-alive and forces the client to re-open a new connection every time.
Our goal now is to make the server:
- keep the connection open
- accept multiple requests from the same socket
- return multiple responses
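Before we wire this into our own server, we can watch keep-alive at work with nothing but the standard library. In this stand-in demo (not our server), `http.server` plays a keep-alive-capable server and `http.client` sends three requests over a single TCP connection:

```python
import threading
import http.client
from http.server import HTTPServer, BaseHTTPRequestHandler

class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 => connections are persistent by default
    def do_GET(self):
        body = b"Hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # required for keep-alive
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
bodies = []
for _ in range(3):
    conn.request("GET", "/")  # all three requests reuse the same TCP connection
    bodies.append(conn.getresponse().read())
conn.close()
server.shutdown()
print(bodies)
```

No new TCP handshake happens between the requests — exactly the behaviour our server needs to support.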
⚠️ Reminder: recv() Blocks
In Python, calling socket.recv(1024) is a blocking operation. It stops your thread and waits until:
- some data is available in the socket buffer, or
- the client closes the connection
This is dangerous in a real-world setup because it means a slow or idle client can block your whole server. For now, we accept this limitation and handle it in the next section using threads.
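You can feel this blocking behaviour with a tiny standalone experiment: a `socketpair()` simulates a client that connects but never sends anything, and `settimeout()` is one way to bound the wait (we won’t use timeouts in our server yet — this just shows that `recv()` really does sit and wait):

```python
import socket

# A connected socket pair simulates a client that never sends anything
server_side, client_side = socket.socketpair()
server_side.settimeout(0.2)  # without this, recv() would block indefinitely

timed_out = False
try:
    server_side.recv(1024)  # blocks until data arrives... or the timeout fires
except socket.timeout:
    timed_out = True

client_side.close()
server_side.close()
print("recv() timed out:", timed_out)
```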
🧱 Introducing HTTPSession, DataProvider, and HTTPProcessor
To build a robust and flexible design, we need to decompose our logic into smaller, clear components. Instead of shoving everything into a giant while-loop, we separate responsibilities.
🧠 HTTPSession: Manages a Single Client Connection
This class takes over the entire handling of a single TCP connection:
- It reads incoming bytes using recv()
- Feeds the bytes into a DataProvider buffer
- Parses full HTTP requests using HTTPProcessor
- Generates responses (reads files from disk, builds HTTP headers, sends back data)
- Decides whether to keep the connection alive or close it based on the request
This gives us clean isolation: one class = one responsibility = easier maintenance.
📦 DataProvider: Input Buffer for Raw Data
recv() gives us unpredictable chunks of data — possibly partial, multiple, or garbage. DataProvider:
- stores all incoming data
- exposes a read-only .data property
- allows dropping the bytes we already processed
class DataProvider:
    def __init__(self):
        self._data = b""

    @property
    def data(self) -> bytes:
        return self._data

    @data.setter
    def data(self, new_data: bytes):
        self._data += new_data

    def reduce_data(self, size: int):
        self._data = self._data[size:]
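To see the buffer in action, here is a quick runnable demo of partial chunks accumulating and already-parsed bytes being dropped (the class definition is repeated so the snippet runs on its own):

```python
class DataProvider:
    def __init__(self):
        self._data = b""

    @property
    def data(self) -> bytes:
        return self._data

    @data.setter
    def data(self, new_data: bytes):
        self._data += new_data

    def reduce_data(self, size: int):
        self._data = self._data[size:]

provider = DataProvider()
provider.data = b"GET / HT"        # first chunk from recv()
provider.data = b"TP/1.1\r\n\r\n"  # second chunk: the setter appends, not replaces
print(provider.data)               # the full request line is now buffered

provider.reduce_data(16)           # drop the bytes we have already parsed
print(provider.data)
```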
🧠 HTTPProcessor: Extracts One Message from Buffer
Each time data is received, HTTPProcessor checks whether we can extract a full HTTP message from it.
from typing import Optional

class HTTPProcessor:
    def __init__(self, data_provider: DataProvider):
        self.data_provider = data_provider

    def get_one_http_message(self) -> Optional['HTTPMessage']:
        try:
            http_message, bytes_consumed = HTTPParser.parse_message(self.data_provider.data)
            if http_message:
                self.data_provider.reduce_data(bytes_consumed)
                return http_message
        except RuntimeError:
            return None
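HTTPParser here is the parser we built in Part 2 of this series. If you’re reading this part standalone, the following simplified stand-in is enough to run the examples — note this is my sketch, not the Part 2 code: it parses a start line plus headers, supports an optional Content-Length body, and omits validation and error handling:

```python
from typing import Dict, Optional, Tuple

class HTTPMessage:
    def __init__(self, start_line: str, headers: Dict[str, str], body: bytes):
        self.start_line = start_line
        self.headers = headers
        self.body = body

class HTTPParser:
    @staticmethod
    def parse_message(data: bytes) -> Tuple[Optional[HTTPMessage], int]:
        """Return (message, bytes_consumed), or (None, 0) if the data is incomplete."""
        head_end = data.find(b"\r\n\r\n")
        if head_end == -1:
            return None, 0  # headers not fully received yet
        head = data[:head_end].decode("iso-8859-1")
        start_line, *header_lines = head.split("\r\n")
        headers = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        body_length = int(headers.get("content-length", 0))
        total = head_end + 4 + body_length
        if len(data) < total:
            return None, 0  # body not fully received yet
        return HTTPMessage(start_line, headers, data[head_end + 4:total]), total
```

Returning the number of consumed bytes is what lets HTTPProcessor call reduce_data() and keep leftover bytes (the start of the next pipelined request) in the buffer.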
🧠 How Everything Works Together
Let’s walk through the entire server behaviour:
🔄 Step-by-step Overview
- The Server accepts a new socket connection.
- It passes the connection to a new HTTPSession object.
- The session enters a loop:
- It calls recv() and adds data to the buffer
- It tries to parse a full HTTP message
- If it succeeds, it sends the response
- If the connection is not marked as keep-alive, it exits
- On exit, the connection is closed.
🧪 Pseudo-Code Flow
Start server
While True:
    Accept new connection
    Create HTTPSession(connection)
    session.handle()

HTTPSession.handle():
    While connection is active:
        data = recv()  # blocks here!
        append data to buffer
        While True:
            message = parse one message from buffer
            if no message: break
            process request, generate response
        if no keep-alive:
            exit session
💻 Implementation
Now let’s look at the full code:
import socket
from typing import Tuple, Optional, Dict

from http_parser import HTTPParser  # The HTTP parser we built in Part 2


class HTTPMessage:
    def __init__(self, start_line: str, headers: Dict[str, str], body: bytes):
        self.start_line = start_line
        self.headers = headers
        self.body = body


class DataProvider:
    def __init__(self):
        self._data = b""

    @property
    def data(self) -> bytes:
        return self._data

    @data.setter
    def data(self, new_data: bytes):
        self._data += new_data

    def reduce_data(self, size: int):
        self._data = self._data[size:]


class HTTPProcessor:
    def __init__(self, data_provider: DataProvider):
        self.data_provider = data_provider

    def get_one_http_message(self) -> Optional[HTTPMessage]:
        try:
            msg, used = HTTPParser.parse_message(self.data_provider.data)
            if msg:
                self.data_provider.reduce_data(used)
                return msg
        except RuntimeError:
            return None


class HTTPSession:
    def __init__(self, conn: socket.socket, addr: Tuple[str, int]):
        self.connection = conn
        self.addr = addr
        self.data_provider = DataProvider()
        self.http_processor = HTTPProcessor(self.data_provider)
        self.active = True

    def handle(self):
        print(f"Connected by {self.addr}")
        while self.active:
            data = self.connection.recv(1024)
            if not data:
                break
            self.data_provider.data = data
            while request := self.http_processor.get_one_http_message():
                uri = request.start_line.split(" ")[1][1:] or "index.html"
                if uri == "favicon.ico":
                    continue
                try:
                    with open(uri) as f:
                        content = f.read().encode()
                    headers = (
                        "HTTP/1.1 200 OK\r\n"
                        f"Content-Length: {len(content)}\r\n"
                        "Content-Type: text/plain\r\n"
                    )
                    if 'keep-alive' in request.headers.get('connection', ''):
                        headers += "Connection: keep-alive\r\n"
                    else:
                        self.active = False
                    headers += "\r\n"
                    self.connection.sendall(headers.encode() + content)
                except Exception:
                    msg = b"404 Not Found"
                    headers = (
                        "HTTP/1.1 404 Not Found\r\n"
                        f"Content-Length: {len(msg)}\r\n"
                        "Content-Type: text/plain\r\n\r\n"
                    )
                    self.connection.sendall(headers.encode() + msg)
        self.connection.close()


class Server:
    def __init__(self, host: str = "", port: int = 8080):
        self.HOST = host
        self.PORT = port

    def start_server(self):
        with socket.socket() as s:
            s.bind((self.HOST, self.PORT))
            s.listen()
            print(f"Server is listening on port {self.PORT}")
            while True:
                conn, addr = s.accept()
                session = HTTPSession(conn, addr)
                session.handle()


if __name__ == '__main__':
    Server().start_server()
🔧 Adding Configuration Support with NGINX-style Routing
At this point, our server is fully functional: it handles TCP sockets, parses HTTP messages, supports keep-alive connections, and even serves files based on URI. But — there’s one big thing missing.
We want the server behavior to be configurable.
In the previous part of this series, we built a configuration parser inspired by how NGINX works. Now it’s time to plug that logic into our actual server.
To do that, we’ll:
- Use the config to define the listening port
- Use NGINX-like server and location blocks to define route-to-file mappings
- Build a simplified route matcher that selects the correct root for each request
Let’s first understand how route matching in NGINX actually works.
🧭 How Route Matching Works (NGINX-style)
In NGINX, route matching follows a block-based, hierarchical structure using server and location directives. These define which traffic goes where, and what content to serve.
🔎 Matching Steps
1. Match the server block:
   - The request is routed to the server block matching the listening port.
2. Match a location inside that server:
   - All location blocks are compared to the request URI.
   - Each location uses prefix matching — does the URI start with this prefix?
   - Out of all matching locations, we select only one: the longest matching prefix.
3. Determine the root directory for this location.
4. Build the full path by combining the root and the entire original URI.
🧠 Example: Longest Prefix Matching
Let’s say we have this config:
server {
    listen 8080;
    location / {
        root html;
    }
    location /docs/ {
        root docs;
    }
    location /docs/tutorials/ {
        root tutorials;
    }
}
Now the request comes in:
GET /docs/tutorials/python.html HTTP/1.1
All of these location blocks match the URI prefix:
- / → matches everything
- /docs/ → matches /docs/tutorials/python.html
- /docs/tutorials/ → matches /docs/tutorials/python.html
So what do we choose?
✅ Only /docs/tutorials/, because it is the longest matching prefix.
This is how NGINX works: multiple matches are fine, but only the most specific one wins.
Final path to serve:
tutorials/docs/tutorials/python.html
Yes, it includes the full URI even if it overlaps — NGINX does that unless alias is used (we’re following the default root behavior here).
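The selection rule itself fits in a few lines of Python. Here is a standalone sketch of the longest-prefix logic applied to the example above (the helper name `match_location` mirrors what our RouteMatcher class will do):

```python
from typing import Dict, Optional

def match_location(locations: Dict[str, str], uri: str) -> Optional[str]:
    """Return the root of the longest location prefix that matches the URI."""
    best_root, best_len = None, -1
    for prefix, root in locations.items():
        if uri.startswith(prefix) and len(prefix) > best_len:
            best_root, best_len = root, len(prefix)
    return best_root

locations = {"/": "html", "/docs/": "docs", "/docs/tutorials/": "tutorials"}
uri = "/docs/tutorials/python.html"
root = match_location(locations, uri)
print(root)        # -> tutorials
print(root + uri)  # -> tutorials/docs/tutorials/python.html
```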
🧠 Summary
- You can match multiple location blocks, but only the longest match matters.
- This gives us a clean priority system.
- We’ll replicate this logic using a RouteMatcher helper class.
Let’s now implement this routing logic in Python, connect it to our server, and make it driven by the config file we previously built.
💻 Full Server Implementation (With Config + NGINX-Style Routing)
import socket
from typing import Tuple, Optional, Dict

from config_parser import load_config, ServerConfig  # Your config parser from previous part
from http_parser import HTTPParser, HTTPMessage      # Your existing HTTP parser


# -------------------------
# Route Matching Logic
# -------------------------
class RouteMatcher:
    """
    Selects the server block based on port and matches location blocks
    using longest URI prefix.
    """

    @staticmethod
    def match_location(locations, uri: str):
        """Finds the location block with the longest prefix match."""
        matched_location = None
        longest_prefix = -1
        for path, root_dir in locations.items():
            if uri.startswith(path) and len(path) > longest_prefix:
                matched_location = root_dir
                longest_prefix = len(path)
        return matched_location


# -------------------------
# Data Buffer
# -------------------------
class DataProvider:
    """
    A simple buffer that accumulates received data and lets us consume it safely.
    """

    def __init__(self):
        self._data = b""

    @property
    def data(self) -> bytes:
        return self._data

    @data.setter
    def data(self, chunk: bytes):
        self._data += chunk

    def reduce_data(self, size: int):
        self._data = self._data[size:]


# -------------------------
# Message Processor
# -------------------------
class HTTPProcessor:
    """
    Handles parsing of buffered data into HTTPMessage objects.
    """

    def __init__(self, data_provider: DataProvider):
        self.data_provider = data_provider

    def get_one_http_message(self) -> Optional[HTTPMessage]:
        try:
            message, consumed = HTTPParser.parse_message(self.data_provider.data)
            if message:
                self.data_provider.reduce_data(consumed)
                return message
        except Exception:
            return None


# -------------------------
# Client Connection Session
# -------------------------
class HTTPSession:
    """
    Handles the lifecycle of a single HTTP connection.
    """

    def __init__(
        self,
        connection: socket.socket,
        client_address: Tuple[str, int],
        port: int,
        server_config: ServerConfig,
    ):
        self.connection = connection
        self.addr = client_address
        self.data_provider = DataProvider()
        self.http_processor = HTTPProcessor(self.data_provider)
        self.port = port
        self.server_config = server_config
        self.active = True

    def handle(self):
        print(f"[Session] Connected from {self.addr}")
        while self.active:
            data = self.connection.recv(1024)
            if not data:
                break
            self.data_provider.data = data
            while request := self.http_processor.get_one_http_message():
                url = request.url
                root = "html"  # Default root directory
                if url == "/":
                    url = "/index.html"
                else:
                    # Get root path
                    root = RouteMatcher.match_location(
                        self.server_config.routes[self.port], url
                    )
                file_path = f"{root}{url}"
                print(f"[Request] {url} => {file_path}")
                try:
                    with open(file_path, "rb") as f:
                        body = f.read()
                    headers = (
                        "HTTP/1.1 200 OK\r\n"
                        f"Content-Length: {len(body)}\r\n"
                        "Content-Type: text/plain\r\n"
                    )
                    if "keep-alive" in request.headers.get("connection", "").lower():
                        headers += "Connection: keep-alive\r\n"
                    else:
                        self.active = False
                    headers += "\r\n"
                    self.connection.sendall(headers.encode() + body)
                except Exception as e:
                    print(f"[Error] {e}")
                    self._send_404()
        self.connection.close()

    def _send_404(self):
        msg = b"404 Not Found"
        headers = (
            "HTTP/1.1 404 Not Found\r\n"
            f"Content-Length: {len(msg)}\r\n"
            "Content-Type: text/plain\r\n\r\n"
        )
        self.connection.sendall(headers.encode() + msg)


# -------------------------
# Server Entrypoint
# -------------------------
class Server:
    """
    Main server class. Reads config, binds to the correct port, and handles requests.
    """

    def __init__(self, config_path: str):
        self.config = load_config(config_path)

    def start(self):
        port = self.config.listen_ports[0]
        with socket.socket() as s:
            s.bind(("", port))
            s.listen()
            print(f"[Server] Listening on port {port}")
            while True:
                conn, addr = s.accept()
                session = HTTPSession(conn, addr, port, self.config)
                session.handle()


# -------------------------
# Start Server
# -------------------------
if __name__ == "__main__":
    server = Server("config.conf")
    server.start()
Full version is here: https://github.com/DmytroHuzz/build_own_webserver/blob/main/server_v1.py
✅ Summary
- 🧭 We now support NGINX-style routing with longest prefix matching
- ⚙️ Server behaviour is fully configurable via config.conf
- 🗂 Each request is matched to a location block and served from its root
- 🧠 All logic is split into clean, beginner-friendly classes
📊 Benchmark Results (All 4 Scenarios)
# | Scenario | Command | Requests/sec | Time (s) |
---|---|---|---|---|
1 | Single request, no keep-alive | ab -n 100000 | 10308.65 | 9.701 |
2 | Single request, keep-alive | ab -n 100000 -k | 19351.89 | 5.167 |
3 | Concurrent (50x), no keep-alive | ab -n 100000 -c 50 | 22733.18 | 4.399 |
4 | Concurrent (50x), with keep-alive | ab -n 100000 -c 50 -k | ❌ Crashed | — |
🔍 Benchmark Takeaways
- Adding keep-alive nearly doubles performance in single-client tests
- Concurrent clients improve speed — until the server crashes on concurrent keep-alive (recv() + blocking)
- We need threading to support parallel persistent clients — coming next!
🧵 Multithreaded Server — Make It Scalable
We already built a functional server — it listens, accepts connections, parses requests, and sends responses. But there’s still a serious limitation: it’s single-threaded.
If one client connects and keeps the connection open (which browsers often do with keep-alive), our server gets stuck. It can’t accept or serve any new requests until that one client is done.
And here comes the fix: threads.
🧠 Why Threads?
Let’s pause here and make something very clear. Threads are one of the simplest tools in programming that allow us to handle multiple tasks at the same time — or more precisely, to interleave work so that one task doesn’t block everything else.
In the context of our web server, here’s the core issue:
If we use only one thread, the server can talk to just one client at a time: while it waits for data from that client, every other connection is stuck in the queue.
This happens because of this function:
received_data = connection.recv(1024)
recv() is a blocking call — meaning it will stop everything and wait until data arrives from the client. If no data comes, we sit there doing nothing. That’s okay for a toy server, but completely unacceptable in real-world scenarios.
So how do we fix it?
🔄 Introducing Threads
A thread is a lightweight execution unit — like a mini-program inside your main program. With threads, you can let one thread wait on a blocking call like recv(), while the main thread (or other threads) continue doing useful work.
In our case:
- Main thread listens for new connections
- Every client connection is handled in its own thread
- So if one client stalls, it affects only its own thread, not the others
Simple, powerful.
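Here is a tiny self-contained experiment, with time.sleep() standing in for a blocking recv(), that shows five "blocked clients" being served concurrently instead of one after another:

```python
import threading
import time

def handle_client(client_id: int) -> None:
    # time.sleep() stands in for a blocking recv() on a slow client
    time.sleep(0.2)
    print(f"client {client_id} served")

start = time.monotonic()
threads = [threading.Thread(target=handle_client, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# Served one after another, this would take about 1.0s; the waits overlap instead
print(f"served 5 blocked clients in {elapsed:.2f}s")
```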
🐍 But Wait — What About Python’s GIL?
Yes, let’s talk about that.
Python (specifically CPython, the most common interpreter) has something called the Global Interpreter Lock (GIL). The GIL makes sure that only one thread executes Python bytecode at a time, even on multi-core CPUs.
Sounds bad, right?
🧠 Not always. Because when a thread waits on I/O, like reading from a socket or a file, Python releases the GIL temporarily. That means other threads can run during that time.
So even though Python threads aren’t great for CPU-heavy tasks (like math, AI, or video processing), they are very effective for I/O-bound programs — like our web server, where most time is spent waiting for users or reading files.
⚖️ Summary: Threads in Our Web Server
- Without threads: One slow client can freeze the entire server
- With threads: Each client is isolated in its own thread
- Blocking functions like recv() no longer harm the whole server
- Python’s GIL does not prevent this from working, because I/O releases it
- Threads are easy to use and scale okay up to a point (until you need async/event loops)
🧭 Pseudocode: Multithreaded Server
To visualize how this version of the server works, let’s outline the flow in a simplified pseudocode diagram:
+----------------------------+
|        Main Thread         |
|----------------------------|
| 1. Listen on a port        |
| 2. Accept connection       |
| 3. Start new thread        |
| 4. Pass connection to it   |
+----------------------------+
              |
              v
+----------------------------+
|     Per-Client Thread      |
|----------------------------|
| 1. Receive data            |
| 2. Parse request           |
| 3. Send response           |
| 4. Close (or keep alive)   |
+----------------------------+
💻 Python Implementation (Threaded Version)
Let’s now implement this idea in code. Below is the full server logic, adapted to support multithreading. Each client is handled in a separate thread. This avoids blocking the main thread and enables our server to handle multiple users simultaneously.
All classes are commented and kept minimal for clarity.
import socket
from typing import Tuple, Optional, Dict
from threading import Thread

from config_parser import load_config, ServerConfig  # Your config parser from previous part
from http_parser import HTTPParser, HTTPMessage      # Your existing HTTP parser


# -------------------------
# Route Matching Logic
# -------------------------
class RouteMatcher:
    """
    Selects the server block based on port and matches location blocks
    using longest URI prefix.
    """

    @staticmethod
    def match_location(locations, uri: str):
        """Finds the location block with the longest prefix match."""
        matched_location = None
        longest_prefix = -1
        for path, root_dir in locations.items():
            if uri.startswith(path) and len(path) > longest_prefix:
                matched_location = root_dir
                longest_prefix = len(path)
        return matched_location


# -------------------------
# Data Buffer
# -------------------------
class DataProvider:
    """
    A simple buffer that accumulates received data and lets us consume it safely.
    """

    def __init__(self):
        self._data = b""

    @property
    def data(self) -> bytes:
        return self._data

    @data.setter
    def data(self, chunk: bytes):
        self._data += chunk

    def reduce_data(self, size: int):
        self._data = self._data[size:]


# -------------------------
# Message Processor
# -------------------------
class HTTPProcessor:
    """
    Handles parsing of buffered data into HTTPMessage objects.
    """

    def __init__(self, data_provider: DataProvider):
        self.data_provider = data_provider

    def get_one_http_message(self) -> Optional[HTTPMessage]:
        try:
            message, consumed = HTTPParser.parse_message(self.data_provider.data)
            if message:
                self.data_provider.reduce_data(consumed)
                return message
        except Exception:
            return None


# -------------------------
# Client Connection Session
# -------------------------
class HTTPSession:
    """
    Handles the lifecycle of a single HTTP connection.
    """

    def __init__(
        self,
        connection: socket.socket,
        client_address: Tuple[str, int],
        port: int,
        server_config: ServerConfig,
    ):
        self.connection = connection
        self.addr = client_address
        self.data_provider = DataProvider()
        self.http_processor = HTTPProcessor(self.data_provider)
        self.port = port
        self.server_config = server_config
        self.active = True

    def handle(self):
        print(f"[Session] Connected from {self.addr}")
        while self.active:
            data = self.connection.recv(1024)
            if not data:
                break
            self.data_provider.data = data
            while request := self.http_processor.get_one_http_message():
                url = request.url
                root = "html"  # Default root directory
                if url == "/":
                    url = "/index.html"
                else:
                    # Get root path
                    root = RouteMatcher.match_location(
                        self.server_config.routes[self.port], url
                    )
                file_path = f"{root}{url}"
                print(f"[Request] {url} => {file_path}")
                try:
                    with open(file_path, "rb") as f:
                        body = f.read()
                    headers = (
                        "HTTP/1.1 200 OK\r\n"
                        f"Content-Length: {len(body)}\r\n"
                        "Content-Type: text/plain\r\n"
                    )
                    if "keep-alive" in request.headers.get("connection", "").lower():
                        headers += "Connection: keep-alive\r\n"
                    else:
                        self.active = False
                    headers += "\r\n"
                    self.connection.sendall(headers.encode() + body)
                except Exception as e:
                    print(f"[Error] {e}")
                    self._send_404()
        self.connection.close()

    def _send_404(self):
        msg = b"404 Not Found"
        headers = (
            "HTTP/1.1 404 Not Found\r\n"
            f"Content-Length: {len(msg)}\r\n"
            "Content-Type: text/plain\r\n\r\n"
        )
        self.connection.sendall(headers.encode() + msg)


# -------------------------
# Server Entrypoint
# -------------------------
class Server:
    """
    Main server class. Reads config, binds to the correct port, and handles requests.
    """

    def __init__(self, config_path: str):
        self.config = load_config(config_path)

    def start(self):
        port = self.config.listen_ports[0]
        with socket.socket() as s:
            s.bind(("", port))
            s.listen()
            print(f"[Server] Listening on port {port}")
            while True:
                conn, addr = s.accept()
                session = HTTPSession(conn, addr, port, self.config)
                # Each client gets its own thread, so a blocked recv()
                # only stalls that client's thread
                thread = Thread(target=session.handle, daemon=True)
                thread.start()


# -------------------------
# Start Server
# -------------------------
if __name__ == "__main__":
    server = Server("config.conf")
    server.start()
Full version: https://github.com/DmytroHuzz/build_own_webserver/blob/main/server_v2.py
📊 Benchmarks & Performance Analysis
Here are the benchmark results for the multithreaded server, measured with ab (ApacheBench):
# | Scenario | Command | Concurrency | Keep-Alive | Requests/sec | Time (s) |
---|---|---|---|---|---|---|
1 | Single client | ab -n 100000 | 1 | ❌ | 7871.07 | 12.705 |
2 | Single client + Keep-Alive | ab -n 100000 -k | 1 | ✅ | 19139.16 | 5.225 |
3 | 50 concurrent clients | ab -n 100000 -c 50 | 50 | ❌ | 7182.58 | 13.923 |
4 | 50 concurrent + Keep-Alive | ab -n 100000 -c 50 -k | 50 | ✅ | 13777.80 | 7.258 |
📊 Blocking vs Multithreaded Server — Performance Comparison
Now that we’ve implemented both a blocking (single-threaded) server and a multithreaded version, let’s compare how they behave under different real-world usage patterns using ApacheBench.
# | Scenario | Command | Concurrency | Keep-Alive | Blocking Server | Multithreaded Server |
---|---|---|---|---|---|---|
1 | Single client, no keep-alive | ab -n 100000 | 1 | ❌ | 10308.65 req/sec | 7871.07 req/sec |
2 | Single client, keep-alive | ab -n 100000 -k | 1 | ✅ | 19351.89 req/sec | 19139.16 req/sec |
3 | 50 clients, no keep-alive | ab -n 100000 -c 50 | 50 | ❌ | 22733.18 req/sec | 7182.58 req/sec |
4 | 50 clients, keep-alive | ab -n 100000 -c 50 -k | 50 | ✅ | ❌ Crashed | 13777.80 req/sec |
🧠 Analysis
- ✅ Multithreaded server handles concurrency correctly: it doesn’t crash or stall with multiple clients.
- 🧵 Thread overhead exists: single-thread performance slightly drops due to thread management.
- 🔥 Keep-Alive is a huge win for both versions — especially in single-client scenarios.
- ❌ The blocking server simply cannot scale with concurrency — it either stalls or crashes.
💬 Final Thoughts
- The blocking server is great for understanding the basics but unusable beyond a single request at a time.
- The multithreaded server is a practical step toward a real web server: it supports multiple clients, concurrent connections, and persistent TCP streams.
- But we’re still not done — with many threads, we’ll eventually hit memory and scheduling limits.
🧭 In the next part, we’ll move to non-blocking I/O and event-driven architecture — the kind of foundation used by NGINX, Node.js, and other modern high-performance servers.