An async server is mostly boundary discipline: read enough, write enough, drain buffers, close peers, and keep blocking work out of handlers.
Core answer
Use stream servers when a protocol can be handled as asynchronous reads and writes. Make backpressure and shutdown explicit in every handler.
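```python
# [CURRENT - 3.10-3.14] Works on Python 3.10+
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Reply:
    text: str


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Framing: one request is one newline-terminated line.
    line = await reader.readline()
    reply = Reply(f"ack:{line.decode().strip()}\n")
    writer.write(reply.text.encode())  # buffers the reply
    await writer.drain()               # backpressure boundary
    writer.close()                     # begin shutdown
    await writer.wait_closed()         # let the close complete


async def main() -> None:
    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    async with server:
        # This demo only prints the bound address and exits; a long-running
        # server would `await server.serve_forever()` here instead.
        print(server.sockets[0].getsockname())


asyncio.run(main())
```
Why this design exists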
In many workloads, server code spends far more time waiting on sockets than executing protocol logic. Async streams let one event loop multiplex those waits while each handler still reads like sequential control flow.
Mechanics and CPython internals
start_server creates listening sockets and, for each accepted connection, schedules the handler coroutine as a task. StreamWriter.write only appends data to the transport's write buffer; drain is the backpressure boundary, suspending the caller while that buffer is above its high-water mark. Closing a writer begins shutdown; wait_closed waits until the close has completed.
```python
# [CURRENT - 3.10-3.14] Works on Python 3.10+
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Request:
    line: str


async def echo_once(request: Request) -> str:
    async def handler(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
        writer.write((await reader.readline()).upper())
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(handler, "127.0.0.1", 0)
    async with server:
        host, port = server.sockets[0].getsockname()[:2]
        reader, writer = await asyncio.open_connection(host, port)
        writer.write((request.line + "\n").encode())
        await writer.drain()
        response = (await reader.readline()).decode().strip()
        writer.close()
        await writer.wait_closed()
        return response


print(asyncio.run(echo_once(Request("ping"))))
```
Complexity and tradeoffs
Per-connection tasks retain state while waiting. Async servers avoid one thread per wait, but they still pay memory for buffers, task state, and pending work. CPU-heavy parsing or blocking calls to synchronous clients inside handlers stall every other connection on the same loop.
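One way to keep blocking work out of a handler is to hand it to a worker thread with asyncio.to_thread, so the loop stays free to serve other connections. The sketch below is illustrative only: expensive_digest, ask, and demo are hypothetical names, and the repeated hashing merely stands in for whatever CPU-heavy or synchronous step a real handler would perform.

```python
import asyncio
import hashlib


def expensive_digest(payload: bytes, rounds: int = 20_000) -> str:
    """Hypothetical blocking step; repeated hashing stands in for real work."""
    digest = payload
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    try:
        line = await reader.readline()
        # Run the blocking function in a worker thread; the event loop
        # keeps servicing other connections while this handler waits.
        result = await asyncio.to_thread(expensive_digest, line.strip())
        writer.write(result.encode() + b"\n")
        await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def ask(host: str, port: int, text: str) -> str:
    """Hypothetical client helper: send one line, return the one-line reply."""
    reader, writer = await asyncio.open_connection(host, port)
    try:
        writer.write(text.encode() + b"\n")
        await writer.drain()
        return (await reader.readline()).decode().strip()
    finally:
        writer.close()
        await writer.wait_closed()


async def demo() -> list[str]:
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    async with server:
        host, port = server.sockets[0].getsockname()[:2]
        # Two clients are served concurrently by the same loop.
        return await asyncio.gather(ask(host, port, "a"), ask(host, port, "b"))


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A worker thread keeps the loop responsive, but pure-Python CPU work in that thread still competes for the interpreter lock on standard CPython builds. For sustained CPU-bound work, a process pool passed to loop.run_in_executor may be the better fit.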
Idiomatic patterns and refactoring
Refactor a write-only handler to include drain and close discipline before treating it as production-ready.
```python
# [CURRENT - 3.10-3.14] Works on Python 3.10+
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Message:
    payload: bytes


async def write_bad(writer: asyncio.StreamWriter, message: Message) -> None:
    # Before: the payload is only buffered; nothing waits for the
    # transport to drain, and the connection is never closed.
    writer.write(message.payload)


async def write_reply(writer: asyncio.StreamWriter, message: Message) -> None:
    # After: buffer, respect backpressure, then shut down explicitly.
    writer.write(message.payload)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


print(Message(b"ack\n"))
```
Common mistakes and edge cases
Do not omit framing: readline, fixed sizes, delimiters, or protocol parsers define where a message ends. Do not confuse write with "sent". Do not forget shutdown paths for peer disconnects and cancellation.
When to use / When NOT to use
Use async streams for waiting-heavy network protocols where coroutine handlers clarify resource boundaries.
Do not use them to hide CPU-heavy request work on the event loop or to avoid choosing a real protocol framing strategy.