Bytecode is a diagnostic view into one Python implementation. It explains execution shape; it does not replace measurement or turn CPython internals into language guarantees.
Core answer
Use dis when the question is about compiled control flow, name lookup, closure cells, calls, comprehensions, or interpreter specialization. Pair it with profiling before treating an opcode difference as a performance conclusion.
```python
# [CURRENT - 3.10-3.14] Works on Python 3.10+
from dataclasses import dataclass
from dis import dis


@dataclass(frozen=True, slots=True)
class LineItem:
    quantity: int
    unit_cents: int


def totals(items: list[LineItem]) -> list[int]:
    return [item.quantity * item.unit_cents for item in items]


sample = [LineItem(2, 1250), LineItem(1, 499)]
print(totals(sample))
dis(totals)
```

Why this design exists
Python source compiles into code objects before the evaluation loop runs it. Bytecode keeps that executable form compact and gives the interpreter a well-defined instruction stream to optimize within a given release; the instruction set itself changes between versions. CPython 3.11 and later also specialize execution at runtime; PEP 659 is the key design reference for the specializing adaptive interpreter.
The relevant teaching boundary is strict: compilation and code-object structure are CPython-facing tools for reasoning, while source-level semantics come from the language reference.
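As a minimal sketch of the compile step described above, the snippet below compiles a small piece of source text and inspects the resulting code object. The function `add_tax` and its source are hypothetical, chosen only for illustration; the `co_*` attributes are CPython-facing details and not language guarantees.

```python
# Hypothetical source text used only for this illustration.
source = """
def add_tax(cents, rate=0.2):
    tax = round(cents * rate)
    return cents + tax
"""

# compile() produces a module-level code object; exec() runs it and
# defines add_tax in the given namespace.
module_code = compile(source, "<example>", "exec")
namespace = {}
exec(module_code, namespace)

add_tax = namespace["add_tax"]
code = add_tax.__code__

print(type(code).__name__)   # the code object's type name
print(code.co_name)          # function name recorded at compile time
print(code.co_argcount)      # number of positional parameters
print(code.co_varnames)      # parameters first, then other locals
print(code.co_consts)        # constants; exact contents vary by version
print(add_tax(1000))
```

The attribute values describe the compiled artifact, while the result of calling `add_tax` is what the language reference actually specifies.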
Mechanics and CPython internals
dis reads a code object and renders its instructions in readable form. Code objects store constants, local names, free variables, flags, and instruction data. In CPython 3.11+, adaptive specialization and inline caches mean the default disassembly, which shows the generic instructions, may not show every runtime detail that matters after warm-up.
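One way to see the gap between the static view and the warmed-up view is the `adaptive` parameter that `dis.get_instructions` accepts on 3.11+. The sketch below is an assumption-laden illustration: the function `scale` and the warm-up count of 1000 are hypothetical, and which instructions get specialized (if any) depends on the CPython version and build.

```python
import dis
import sys


def scale(values, factor):
    # Hypothetical hot function used only for this illustration.
    total = 0
    for value in values:
        total += value * factor
    return total


# Call the function repeatedly so a specializing interpreter (3.11+)
# has a chance to replace generic instructions with specialized ones.
for _ in range(1000):
    scale([1, 2, 3], 2)

static_ops = [ins.opname for ins in dis.get_instructions(scale)]

if sys.version_info >= (3, 11):
    # adaptive=True asks dis to show the specialized instruction stream.
    adaptive_ops = [
        ins.opname for ins in dis.get_instructions(scale, adaptive=True)
    ]
else:
    # Before 3.11 there is no adaptive view to request.
    adaptive_ops = static_ops

# Pairs where the warmed-up view differs from the static view.
changed = [(a, b) for a, b in zip(static_ops, adaptive_ops) if a != b]
print(changed)
```

An empty list is a legitimate outcome on some versions; the point is that the two views can differ, not that they must.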
```python
# [CURRENT - 3.10-3.14] Works on Python 3.10+
from dataclasses import dataclass
from dis import Bytecode


@dataclass(frozen=True, slots=True)
class Refund:
    amount_cents: int
    fee_cents: int


def net_amount(refund: Refund) -> int:
    base = refund.amount_cents
    return base - refund.fee_cents


for instruction in Bytecode(net_amount):
    print(instruction.opname, instruction.argrepr)
print(net_amount.__code__.co_varnames)
print(net_amount(Refund(4000, 125)))
```

Complexity and tradeoffs
Disassembly cost is development-time inspection. Runtime opcode count can matter in Python-level hot loops, but algorithmic complexity, allocation, C-level work, cache effects, I/O, and adaptive optimization often dominate. A list comprehension can remove repeated Python-level append dispatch; it does not rescue an O(n^2) algorithm.
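To make the last point concrete, here is a rough sketch that times a comprehension-style O(n^2) duplicate check against an explicit loop that uses a set. Both function names and the input size of 500 are hypothetical choices for illustration, and absolute timings will vary by machine and Python version.

```python
import timeit


def has_duplicates_quadratic(values):
    # Compact generator expression, but it still compares every pair: O(n^2).
    n = len(values)
    return any(
        values[i] == values[j] for i in range(n) for j in range(i + 1, n)
    )


def has_duplicates_linear(values):
    # Plain explicit loop with a set: O(n) on average.
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False


# No duplicates, so both functions must scan their full search space.
data = list(range(500))

quadratic_seconds = timeit.timeit(lambda: has_duplicates_quadratic(data), number=5)
linear_seconds = timeit.timeit(lambda: has_duplicates_linear(data), number=5)

print(f"quadratic: {quadratic_seconds:.4f}s")
print(f"linear:    {linear_seconds:.4f}s")
```

The explicit loop executes more Python-level statements per element, yet the algorithmic difference is what determines which version scales.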
Idiomatic patterns and refactoring
Use disassembly to explain a refactor only after the source-level refactor is already defensible.
```python
# [CURRENT - 3.10-3.14] Works on Python 3.10+
from dataclasses import dataclass
from dis import dis


@dataclass(frozen=True, slots=True)
class Row:
    value: int


def collect_loop(rows: list[Row]) -> list[int]:
    output: list[int] = []
    for row in rows:
        output.append(row.value * 2)
    return output


def collect_comprehension(rows: list[Row]) -> list[int]:
    return [row.value * 2 for row in rows]


sample = [Row(1), Row(2)]
print(collect_loop(sample), collect_comprehension(sample))
dis(collect_comprehension)
```

Common mistakes and edge cases
Do not compare bytecode from different Python versions as if opcodes were a compatibility contract. Do not infer exact nanosecond costs from opcode count alone. Do not mistake dis output for proof that a global lookup, descriptor access, or call path is slow in a real workload.
When to use / When NOT to use
Use dis when it resolves a question about CPython compilation or interpreter work. Use benchmarks and profilers when the question is latency or throughput.
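For the latency side of that split, a profiler shows where time goes across whole functions, not individual opcodes. The sketch below uses the standard-library `cProfile` and `pstats` modules; the functions `parse_rows`, `summarize`, and `workload`, along with the sample data, are hypothetical stand-ins for a real workload.

```python
import cProfile
import io
import pstats


def parse_rows(lines):
    # Turn "quantity,unit_cents" strings into integer pairs.
    return [tuple(int(part) for part in line.split(",")) for line in lines]


def summarize(rows):
    # Total price in cents across all rows.
    return sum(quantity * unit_cents for quantity, unit_cents in rows)


def workload():
    lines = ["2,1250", "1,499"] * 1000
    return summarize(parse_rows(lines))


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Render the five entries with the highest cumulative time into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)

print(result)
print(buffer.getvalue())
```

If the report shows most of the time in parsing and not in the arithmetic, that is where a refactor is worth considering, whatever the bytecode of either function looks like.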
Do not teach bytecode as portable Python semantics, and do not contort clean code for an opcode-level micro-win before measuring the real workload.