You have written a class with __init__, __repr__, __eq__ boilerplate more times than you can count. @dataclass eliminates that busywork. The important technical questions are which methods it synthesizes, what storage model it implies, and what invariants it leaves entirely to you.
Think of a dataclass like a factory machine. You feed it a blueprint (annotations and field() declarations) and it stamps out a class with __init__, __repr__, __eq__, and optional ordering — all generated from the field list. The machine handles the repetitive welding. You still own the design.
Use field() when a plain default is not enough: mutable defaults, hidden constructor fields, metadata, comparison control, or factories.
    # [OLDER / 3.9, CURRENT - 3.10-3.14] Works on Python 3.9+ [PEP 585]
    from dataclasses import dataclass, field

    @dataclass
    class Batch:
        rows: list[str] = field(default_factory=list)

default_factory runs at instance creation time, which is why it solves the shared-mutable-default problem instead of reproducing it.
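The other uses of field() listed earlier (hidden constructor fields, metadata, comparison control) can be sketched in the same way. The Job class and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Job:
    name: str
    # Excluded from the generated __repr__, e.g. to keep secrets out of logs.
    token: str = field(default="", repr=False)
    # Ignored by the generated __eq__ (and by ordering, if enabled).
    attempts: int = field(default=0, compare=False)
    # Not a constructor parameter; each instance starts with a fresh list.
    log: list[str] = field(default_factory=list, init=False)
    # metadata is an opaque mapping that dataclasses itself never interprets.
    timeout: float = field(default=30.0, metadata={"unit": "seconds"})

a = Job("sync", token="s3cret", attempts=1)
b = Job("sync", token="s3cret", attempts=5)

print(a)               # token does not appear in the repr
print(a == b)          # attempts is excluded from comparison
print(a.log is b.log)  # each instance owns its own list
print({f.name: f.metadata.get("unit") for f in fields(Job)})
```

Note that repr=False only hides the value from the generated __repr__; the attribute is still readable on the instance.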
Dataclasses inspect class annotations and synthesize methods such as:
- __init__
- __repr__
- __eq__
- optionally ordering methods
- optionally hash behavior depending on eq, frozen, and unsafe_hash
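A quick way to see what was synthesized is to inspect a small class. This is a minimal sketch; Pixel is a hypothetical example class, not part of any library.

```python
import inspect
from dataclasses import dataclass

@dataclass
class Pixel:
    x: int
    y: int = 0

# The generated __init__ takes the fields in declaration order.
print(inspect.signature(Pixel))

# The generated __repr__ lists every field by name.
print(repr(Pixel(1, 2)))

# The generated __eq__ compares field values, and only within the same class.
print(Pixel(1, 2) == Pixel(1, 2))
print(Pixel(1, 2) == (1, 2))

# order defaults to False, so "<" is not defined unless requested.
try:
    Pixel(1, 2) < Pixel(3, 4)
except TypeError as exc:
    print("no ordering:", exc)
```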
    # [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
    from dataclasses import dataclass

    @dataclass(frozen=True, order=True)
    class Version:
        major: int
        minor: int
        patch: int = 0

    print(Version(3, 14) > Version(3, 10))

Ordering follows field order, not domain meaning. If field order does not encode business ordering, order=True can generate deceptively wrong semantics.
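To make the risk concrete, here is a sketch in which field order and business order disagree. Task and RankedTask are hypothetical classes; the second shows one possible fix, assuming priority alone should drive comparison.

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    name: str
    priority: int

# Instances compare as (name, priority) tuples, so the name decides first.
tasks = [Task("write docs", 1), Task("fix outage", 9)]
print(max(tasks))  # picks "write docs" because "w" > "f", not by priority

@dataclass(order=True)
class RankedTask:
    # Put the field that defines ordering first and exclude the rest.
    priority: int
    name: str = field(compare=False)

ranked = [RankedTask(1, "write docs"), RankedTask(9, "fix outage")]
print(max(ranked))
```

One trade-off of this fix: compare=False also removes name from the generated __eq__, so two RankedTask instances with the same priority compare equal regardless of name.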
frozen=True prevents ordinary attribute assignment through the generated API, but it is not deep immutability and it is not a security boundary.
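The following sketch shows both limits. Config is a hypothetical class used only for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, field

@dataclass(frozen=True)
class Config:
    name: str
    tags: list[str] = field(default_factory=list)

cfg = Config("prod")

# Ordinary assignment is blocked by the generated __setattr__.
try:
    cfg.name = "dev"
except FrozenInstanceError:
    print("assignment blocked")

# The list the field refers to is still mutable: frozen is shallow.
cfg.tags.append("eu-west")
print(cfg.tags)

# object.__setattr__ bypasses the generated check entirely.
object.__setattr__(cfg, "name", "dev")
print(cfg.name)
```

If deep immutability matters, one option is to use immutable field types such as tuple or frozenset instead of list.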
Without slots=True, a normal dataclass instance usually has an instance __dict__. With slots=True, the instance stores attributes in slot descriptors and does not expose a __dict__ unless you add one deliberately.
Measured locally on CPython 3.12.3, 64-bit Linux for a simple two-field dataclass:
- plain dataclass instance: 48 bytes plus about 280 bytes for __dict__
- slotted dataclass instance: 48 bytes and no instance __dict__
    # [CURRENT - 3.10-3.14] Requires Python 3.10+
    # Example byte counts below were measured on CPython 3.12.3, 64-bit Linux.
    import sys
    from dataclasses import dataclass

    @dataclass
    class B:
        x: int
        y: int

    @dataclass(slots=True)
    class A:
        x: int
        y: int

    print(hasattr(B(1, 2), "__dict__"))
    print(hasattr(A(1, 2), "__dict__"))
    print(sys.getsizeof(B(1, 2)))
    print(sys.getsizeof(B(1, 2).__dict__))
    print(sys.getsizeof(A(1, 2)))

This is a CPython memory-layout concern, not a language guarantee, but it is one of the main reasons slots=True matters for large object populations.
Dataclasses were added in Python 3.7 (PEP 557). The slots=True and match_args=True options were added in Python 3.10. The weakref_slot=True option was added in Python 3.11, and the docs require it to be paired with slots=True. Current project guidance targets Python 3.10-3.14. Python 3.9 and below are end-of-life.
    # [CURRENT - 3.10-3.14] Requires Python 3.10+
    from dataclasses import dataclass

    @dataclass(slots=True)
    class Point:
        x: int
        y: int

Dataclasses also generate __match_args__ by default in Python 3.10+, which makes them participate naturally in structural pattern matching. Keyword-only fields are excluded from __match_args__.
    # [CURRENT - 3.10-3.14] Requires Python 3.10+
    from dataclasses import dataclass

    @dataclass
    class Point:
        x: int
        y: int

    print(Point.__match_args__)

ClassVar marks class attributes that are not dataclass instance fields. InitVar creates an initialization-only parameter that is passed to __post_init__ but not stored as a field.
    # [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
    from dataclasses import InitVar, dataclass, field
    from typing import ClassVar

    @dataclass
    class User:
        table: ClassVar[str] = "users"
        name: str
        raw_email: InitVar[str]
        email: str = field(init=False)

        def __post_init__(self, raw_email):
            self.email = raw_email.strip().casefold()

Those distinctions matter because dataclasses are driven by field classification, not just by visible attributes in the class body.
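The classification can be observed with dataclasses.fields(). The sketch below repeats the User class so that it runs on its own; the sample name and email are made up.

```python
from dataclasses import InitVar, dataclass, field, fields
from typing import ClassVar

@dataclass
class User:
    table: ClassVar[str] = "users"
    name: str
    raw_email: InitVar[str]
    email: str = field(init=False)

    def __post_init__(self, raw_email):
        self.email = raw_email.strip().casefold()

user = User("Ada", "  Ada@Example.COM ")

# Only true instance fields are reported: ClassVar and InitVar are excluded.
print([f.name for f in fields(User)])

# The InitVar value was consumed by __post_init__ and never stored.
print(hasattr(user, "raw_email"))

# The ClassVar lives on the class and is shared by all instances.
print(User.table)
print(user)
```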
Hash behavior is where many production bugs start. Mutable dataclasses should usually not be hashable. unsafe_hash=True exists, but the name is honest: it is only safe when the fields involved in hashing are effectively immutable.
Do not use unsafe_hash=True as a convenience toggle. If hashed instances can mutate, dict and set behavior can become incorrect after insertion.
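A minimal sketch of the failure mode, using the hypothetical classes Plain and Risky:

```python
from dataclasses import dataclass

@dataclass
class Plain:
    key: str

# With eq=True and frozen=False (the defaults), __hash__ is set to None,
# so instances are unhashable.
print(Plain.__hash__ is None)

@dataclass(unsafe_hash=True)
class Risky:
    key: str

item = Risky("a")
seen = {item}
print(item in seen)  # found under its original hash

item.key = "b"       # mutating a hashed field changes the hash
print(item in seen)  # the lookup now uses a hash the set never stored
print(any(x is item for x in seen))  # yet the object is still in the set
```

The object is stranded: it occupies a place in the set but can no longer be found by membership tests or removed with discard().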
Field ordering still follows Python's rule that non-default parameters must come before defaulted ones. field(default_factory=...) counts as a defaulted field for that purpose.
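The rule is enforced when the class is created, not when it is instantiated. In this sketch, Broken and Fixed are hypothetical classes; the exact error message may differ between Python versions.

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class Broken:
        tags: list[str] = field(default_factory=list)
        name: str  # no default after a defaulted field
except TypeError as exc:
    print("rejected at class creation:", exc)

@dataclass
class Fixed:
    name: str
    tags: list[str] = field(default_factory=list)

print(Fixed("a"))
```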
Use:
- __post_init__ for validation and normalization
- frozen=True for value objects
- slots=True for many small instances after measuring
- field(init=False) for derived stored fields
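Several of these recommendations can be combined in one value object. The sketch below uses a hypothetical Rect class, and it assumes that storing the derived area is preferable to recomputing it in a property.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Rect:
    width: float
    height: float
    # Derived and stored, but never accepted by the constructor.
    area: float = field(init=False)

    def __post_init__(self):
        if self.width < 0 or self.height < 0:
            raise ValueError("dimensions must be non-negative")
        # frozen blocks "self.area = ...", so __post_init__ has to go
        # through object.__setattr__ to fill in the derived field.
        object.__setattr__(self, "area", self.width * self.height)

r = Rect(2.0, 3.0)
print(r)
```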
    # [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Port:
        value: int

        def __post_init__(self):
            if not 0 < self.value < 65536:
                raise ValueError("port out of range")

Field annotations connect directly to , but runtime validation remains your responsibility.
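Dataclasses store whatever value is passed, regardless of the annotation. The sketch below shows this and one way to add an explicit type check. Unchecked and CheckedPort are hypothetical classes, and rejecting bool is an assumption about what a port should accept.

```python
from dataclasses import dataclass

@dataclass
class Unchecked:
    value: int

# The annotation is not enforced: a string is stored without complaint.
print(Unchecked("80"))

@dataclass(frozen=True)
class CheckedPort:
    value: int

    def __post_init__(self):
        # bool is a subclass of int, so it is rejected explicitly here.
        if not isinstance(self.value, int) or isinstance(self.value, bool):
            raise TypeError("port must be an int")
        if not 0 < self.value < 65536:
            raise ValueError("port out of range")

for candidate in (8080, "80", 70000):
    try:
        print(CheckedPort(candidate))
    except (TypeError, ValueError) as exc:
        print(f"{candidate!r} rejected: {exc}")
```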