Async context, backpressure, and offloading

You have await working. Now you need to manage resource lifecycles, bound concurrency, and handle results in completion order. These are the patterns that separate toy async code from production async code.

Think of an async context manager like an automated parking garage gate. Entering (__aenter__) takes time: the gate lifts and you take a ticket. Exiting (__aexit__) also takes time: you pay and the gate lifts again. Both operations may need to wait, and that waiting is part of the protocol. That is why network clients, database sessions, and similar resources are commonly managed with async with.

Core answer

Good async code needs more than await. It needs:

safe acquisition/release boundaries
bounded fan-out
deliberate completion-order handling
explicit offloading for blocking work

Without those, "concurrent" code can still overload upstream systems or freeze the loop.

# [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
import asyncio
async def fetch_one(name, sem):
    async with sem:
        await asyncio.sleep(0.1)
        return name


Asynchronous context managers

An asynchronous context manager is the async variant of the with protocol. It uses:

__aenter__
__aexit__

and is entered with async with.

# [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
class Demo:
    async def __aenter__(self):
        return self
    async def __aexit__(self, exc_type, exc, tb):
        return False

This matters in async resource code because entry or cleanup may itself need to await:

open or warm a network connection
flush buffered output
close gracefully

That is why async client libraries often use async with for sessions, streams, and connection lifetimes.
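To make the awaited entry and exit concrete, here is a minimal sketch. FakeConnection is a hypothetical stand-in for a real client, and the sleeps only simulate a handshake and a graceful close; the point is that cleanup still runs, and can still await, when the body raises.

```python
import asyncio

class FakeConnection:
    """Hypothetical resource whose open and close steps both need to await."""

    def __init__(self, log):
        self.log = log

    async def __aenter__(self):
        await asyncio.sleep(0.01)  # stands in for a network handshake
        self.log.append("opened")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0.01)  # stands in for a flush and graceful close
        self.log.append("closed")
        return False  # do not swallow exceptions raised in the body

async def use_connection():
    log = []
    try:
        async with FakeConnection(log):
            log.append("working")
            raise RuntimeError("boom")
    except RuntimeError:
        log.append("error propagated")
    return log

events = asyncio.run(use_connection())
print(events)  # ['opened', 'working', 'closed', 'error propagated']
```

Returning False from __aexit__ lets the exception continue to the caller after cleanup has finished, which is usually what resource code wants.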

Completion order vs input order

asyncio.as_completed solves a different problem than gather.

gather returns results aligned with input order
as_completed lets you react in completion order

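# [CURRENT - 3.10-3.14] Works on Python 3.10+
import asyncio
async def job(delay):
    await asyncio.sleep(delay)
    return delay
async def main():
    # Plain coroutine objects; as_completed schedules them itself.
    jobs = [job(0.3), job(0.1), job(0.2)]
    # Each yielded awaitable resolves to the next result to finish.
    for finished in asyncio.as_completed(jobs):
        print(await finished)  # prints 0.1, then 0.2, then 0.3
asyncio.run(main())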

Use completion order when:

early results are useful immediately
you want progress reporting
slow tail tasks should not delay consumption of fast ones
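The difference is easiest to see side by side. The sketch below runs the same three jobs through gather and through as_completed and adds a simple progress line; the job names and delays are illustrative only.

```python
import asyncio

async def job(name, delay):
    await asyncio.sleep(delay)
    return name

async def compare():
    specs = [("slow", 0.3), ("fast", 0.1), ("medium", 0.2)]

    # gather: results line up with the order the awaitables were passed in.
    in_input_order = await asyncio.gather(*(job(n, d) for n, d in specs))

    # as_completed: each result is available as soon as its job finishes.
    in_completion_order = []
    for finished in asyncio.as_completed([job(n, d) for n, d in specs]):
        name = await finished
        in_completion_order.append(name)
        print(f"progress: {len(in_completion_order)}/{len(specs)} ({name})")

    return in_input_order, in_completion_order

input_order, completion_order = asyncio.run(compare())
print(input_order)       # ['slow', 'fast', 'medium']
print(completion_order)  # ['fast', 'medium', 'slow']
```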

Throttling with semaphores

A semaphore is one of the clearest backpressure tools in asyncio. It caps how many coroutines can enter a protected section concurrently.

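# [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
import asyncio
async def fetch_all(names, limit=2):
    # Create the semaphore inside the running loop and share it across jobs.
    sem = asyncio.Semaphore(limit)
    async def one(name):
        # At most `limit` coroutines are inside this block at a time.
        async with sem:
            await asyncio.sleep(0.1)  # stands in for the real request
            return name
    # Results come back in input order even though execution is throttled.
    return await asyncio.gather(*(one(name) for name in names))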

The semaphore does not make individual requests faster. It prevents unbounded concurrency from turning into:

connection pool exhaustion
upstream rate-limit violations
file descriptor pressure
latency collapse from too many in-flight tasks

This connects directly to the production guidance in .
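One way to convince yourself that the cap holds is to count in-flight work directly. This is a small sketch, not a benchmark: measure_peak and its counters are hypothetical helpers, and the sleep stands in for a real request.

```python
import asyncio

async def measure_peak(n_jobs=10, limit=3):
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def one(i):
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.02)  # stands in for a real request
            in_flight -= 1
        return i

    results = await asyncio.gather(*(one(i) for i in range(n_jobs)))
    return results, peak

results, peak = asyncio.run(measure_peak())
print(peak)  # 3: never more than `limit` jobs inside the protected block
```

All ten coroutines are still created up front here; only the protected section is throttled.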

Making multiple requests per unit of work

One of the chapter's real lessons is architectural: as soon as one logical job requires several awaited operations, you need to decide:

do these requests run sequentially?
do they run concurrently?
do they belong to one failure boundary?

If they are one parent operation, the structured-concurrency answer in current Python is TaskGroup. That existing guide already covers the lifecycle/failure side, so this guide keeps the focus on throughput and flow shape instead of duplicating it. See .
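The throughput side of that decision can be sketched with a timing comparison. fetch_part and the three part names are hypothetical, and gather is used here only so the sketch also runs on interpreters older than 3.11; it is not a replacement for TaskGroup when the requests share one failure boundary.

```python
import asyncio
import time

async def fetch_part(name, delay=0.1):
    # Hypothetical sub-request; the sleep stands in for network latency.
    await asyncio.sleep(delay)
    return name

async def build_sequential():
    # Each await finishes before the next starts: total is about the sum of delays.
    user = await fetch_part("user")
    orders = await fetch_part("orders")
    prefs = await fetch_part("prefs")
    return [user, orders, prefs]

async def build_concurrent():
    # Independent sub-requests overlap: total is about the slowest delay.
    return await asyncio.gather(
        fetch_part("user"), fetch_part("orders"), fetch_part("prefs")
    )

async def timed(coro_fn):
    start = time.perf_counter()
    result = await coro_fn()
    return result, time.perf_counter() - start

async def main():
    seq, seq_time = await timed(build_sequential)
    con, con_time = await timed(build_concurrent)
    return seq, seq_time, con, con_time

seq, seq_time, con, con_time = asyncio.run(main())
print(f"sequential {seq_time:.2f}s, concurrent {con_time:.2f}s")
```

Run requests sequentially when a later one depends on an earlier result; overlap them only when they are genuinely independent.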

Delegating blocking work

If part of the workload is synchronous and blocking, you must move it off the event-loop thread.

# [OLDER / 3.9, CURRENT - 3.10-3.14] Works on Python 3.9+
import asyncio
from pathlib import Path
async def load_text(path):
    return await asyncio.to_thread(Path(path).read_text, encoding="utf-8")

asyncio.to_thread is a practical boundary tool for:

blocking file I/O
legacy synchronous clients
other operations that would otherwise freeze the loop

It does not make CPU-heavy Python bytecode parallel in the general case. It mainly protects loop responsiveness while the blocking function runs elsewhere.
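The responsiveness claim can be checked with a heartbeat task. In this sketch, blocking_call and heartbeat are hypothetical helpers and time.sleep stands in for a blocking driver call; the tick counts are timing-dependent, so treat them as indicative rather than exact.

```python
import asyncio
import time

def blocking_call():
    time.sleep(0.2)  # stands in for a blocking driver or file operation
    return "done"

async def heartbeat(ticks):
    # Appends a tick every 20 ms for as long as the loop stays responsive.
    while True:
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.02)

async def run_with(offload):
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    if offload:
        result = await asyncio.to_thread(blocking_call)
    else:
        result = blocking_call()  # runs on the loop thread and stalls it
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)  # wait for the cancel
    return result, len(ticks)

async def main():
    inline = await run_with(offload=False)
    offloaded = await run_with(offload=True)
    return inline, offloaded

(inline_result, inline_ticks), (off_result, off_ticks) = asyncio.run(main())
print(f"ticks while blocking inline: {inline_ticks}, while offloaded: {off_ticks}")
```

The inline run records almost no ticks because nothing else can run while the loop thread is blocked; the offloaded run keeps ticking throughout.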

Version context

Current project guidance targets Python 3.10-3.14. Python 3.9 and below are End-of-Life.

Version-sensitive points:

asyncio.as_completed: long-standing asyncio API
asyncio.Semaphore: long-standing asyncio API
asyncio.to_thread: Python 3.9+
TaskGroup: Python 3.11+

This guide intentionally cross-links TaskGroup instead of re-explaining its failure semantics in full.

Edge cases and gotchas

Semaphores limit concurrency inside the protected block only. They do not automatically limit task creation outside it.
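If task creation itself must be bounded, one common shape is a fixed pool of workers fed by a bounded queue. This is a sketch under assumptions: process_all, the worker count, and the doubling "work" are all illustrative.

```python
import asyncio

async def process_all(items, workers=3):
    queue = asyncio.Queue(maxsize=workers)  # bounded: put() waits when full
    results = []

    async def worker():
        while True:
            item = await queue.get()
            try:
                await asyncio.sleep(0.01)  # stands in for real work
                results.append(item * 2)
            finally:
                queue.task_done()

    # Only `workers` tasks ever exist, no matter how many items arrive.
    pool = [asyncio.create_task(worker()) for _ in range(workers)]
    for item in items:
        await queue.put(item)  # the producer is slowed to the consumers' pace
    await queue.join()  # wait until every queued item is processed

    for task in pool:
        task.cancel()
    await asyncio.gather(*pool, return_exceptions=True)
    return results

doubled = asyncio.run(process_all(range(20)))
print(len(doubled))
```

Compared with a semaphore around gather, memory use here stays flat as the input grows, because pending items wait in the producer rather than as suspended tasks.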

as_completed changes result consumption order. That is powerful, but it also means downstream code must tolerate out-of-order completion.
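A simple way to tolerate out-of-order completion is to have each result carry its own key. The sketch below assumes a hypothetical fetch coroutine and made-up keys and delays.

```python
import asyncio

async def fetch(key, delay):
    await asyncio.sleep(delay)
    # Return the key with the value so the result identifies itself.
    return key, f"payload-{key}"

async def collect(delays):
    by_key = {}
    for finished in asyncio.as_completed([fetch(k, d) for k, d in delays.items()]):
        key, value = await finished
        by_key[key] = value  # arrival order no longer matters
    # Rebuild input order only at the point where it is actually needed.
    return [by_key[k] for k in delays]

ordered = asyncio.run(collect({"a": 0.03, "b": 0.01, "c": 0.02}))
print(ordered)  # ['payload-a', 'payload-b', 'payload-c']
```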

to_thread can protect the loop from blocking I/O, but it is not a universal substitute for process-based CPU parallelism.

Async without backpressure is usually a production bug. If your design can create unbounded in-flight work, it will eventually do so under real traffic.

Production usage

Use:

async with for resource ownership with awaited setup/teardown
as_completed when earliest-finished work should be consumed first
semaphores to cap concurrency
to_thread or executors only at explicit blocking boundaries

For server-side consequences of the same event-loop rules, see .


Python in Depth
