Container choice is a memory-layout choice before it is a syntax choice. list, tuple, set, and array.array spend bytes on different promises.
Core answer
Start with semantics: list for mutable ordered references, tuple for fixed ordered references, set for hash-based membership, and array.array for homogeneous C-style numeric storage.
# [CURRENT - 3.10-3.14] Works on Python 3.10+from array import arrayfrom dataclasses import dataclass@dataclass(frozen=True, slots=True)class Samples: ordered: list[int] fixed: tuple[int, ...] unique: set[int] packed: arraydef load(values: list[int]) -> Samples: return Samples(values, tuple(values), set(values), array("I", values))print(load([7, 7, 9]))Why this design exists
Python containers optimize for different contracts. A general list must hold arbitrary Python objects and grow. A tuple can commit to a fixed length. A set reserves spare hash slots for fast membership. array.array drops heterogeneous object storage to pack primitive values.
Mechanics and CPython internals
Lists and tuples store references to PyObject values. The objects themselves may live elsewhere, carry object headers, and participate in reference counting. Lists over-allocate to keep append amortized; tuples do not grow. Sets store hash-table metadata and spare slots. array.array stores raw C values in its buffer and converts to Python objects when values cross the API boundary.
# [CURRENT - 3.10-3.14] Works on Python 3.10+from array import arrayfrom dataclasses import dataclassfrom sys import getsizeof@dataclass(frozen=True, slots=True)class Footprint: name: str bytes: intdef compare(count: int) -> list[Footprint]: values = list(range(count)) return [ Footprint("list", getsizeof(values)), Footprint("tuple", getsizeof(tuple(values))), Footprint("set", getsizeof(set(values))), Footprint("array", getsizeof(array("I", values))), ]print(compare(1000))Complexity and tradeoffs
List append is amortized O(1), indexed access for list and tuple is O(1), set membership is average O(1), and list/tuple membership is scan-based O(n). array.array improves density but does not automatically beat every Python-level numeric loop because conversion and algorithm shape still matter.
Idiomatic patterns and refactoring
Refactor when the dominant operation disagrees with the representation.
# [CURRENT - 3.10-3.14] Works on Python 3.10+from dataclasses import dataclass@dataclass(frozen=True, slots=True)class Blocklist: raw_ids: list[str] lookup_ids: set[str]def build(ids: list[str]) -> Blocklist: return Blocklist(ids, set(ids))def blocked_slow(ids: list[str], user_id: str) -> bool: return user_id in idsdef blocked(index: Blocklist, user_id: str) -> bool: return user_id in index.lookup_idsindex = build(["U-1", "U-2", "U-3"])print(blocked_slow(index.raw_ids, "U-3"))print(blocked(index, "U-3"))Common mistakes and edge cases
A tuple containing mutable objects is only shallowly immutable. A set loses duplicates. An array enforces its typecode and platform-sized primitive representation. Size readings from sys.getsizeof are local CPython measurements, not a portable whole-object-graph bill.
When to use / When NOT to use
Use the smallest contract that matches the workload and measure when memory or throughput matters.
Do not choose a dense or hashed container only because a microbenchmark ignored the domain operations your code actually performs.