Dataclass fields and generated behavior

default_factory, frozen, order, ClassVar, InitVar, and __post_init__

The `@dataclass` decorator generates `__init__`, `__repr__`, `__eq__`, and optionally ordering methods and `__hash__`. Decorator flags and `field()` options control the generated behavior. `default_factory` solves the shared-mutable-default problem by calling a factory for each instance. `frozen=True` generates `__setattr__` and `__delattr__` methods that raise `FrozenInstanceError`. `order=True` generates `__lt__`, `__le__`, `__gt__`, and `__ge__`. `slots=True` (Python 3.10+) declares the fields in `__slots__`, which reduces memory per instance by eliminating `__dict__`. `InitVar` pseudo-fields are accepted by `__init__` and forwarded to `__post_init__` but not stored, which is useful for computed initialization. `ClassVar` attributes are not fields and are excluded from all generated methods. Fields without defaults must come before fields with defaults. See also: Compare dataclass with NamedTuple and namedtuple (/classes-data-builders); Understand the mutable default problem that default_factory solves (/language-mutable-defaults).


Python in Depth

An interactive engineering reference for Python internals

Quick note

Treat dataclass flags as public behaviour choices.


You have written a class with __init__, __repr__, __eq__ boilerplate more times than you can count. @dataclass eliminates that busywork. The important technical questions are which methods it synthesizes, what storage model it implies, and what invariants it leaves entirely to you.

Think of a dataclass like a factory machine. You feed it a blueprint (annotations and field() declarations) and it stamps out a class with __init__, __repr__, __eq__, and optional ordering — all generated from the field list. The machine handles the repetitive welding. You still own the design.

Core answer

Use field() when a plain default is not enough: mutable defaults, hidden constructor fields, metadata, comparison control, or factories.

# [OLDER / 3.9, CURRENT - 3.10-3.14] Works on Python 3.9+ [PEP 585]
from dataclasses import dataclass, field

@dataclass
class Batch:
    rows: list[str] = field(default_factory=list)

default_factory runs at instance creation time, which is why it solves the shared-mutable-default problem instead of reproducing it.
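A minimal sketch of that behavior, reusing the Batch class from above. The names a, b, BrokenBatch, and rejected are illustrative only. It also shows that a bare list default is refused when the class is created, rather than silently shared.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    rows: list[str] = field(default_factory=list)


# Each instance gets its own list because list() is called per instance.
a = Batch()
b = Batch()
a.rows.append("first")
print(a.rows, b.rows)    # ['first'] []
print(a.rows is b.rows)  # False

# A bare mutable default is rejected at class creation time.
rejected = None
try:
    @dataclass
    class BrokenBatch:  # hypothetical class, defined only to trigger the error
        rows: list[str] = []
except ValueError as exc:
    rejected = str(exc)
print(rejected)
```

The ValueError message itself points at default_factory, so the failure is hard to miss during development.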

Mechanism and generated methods

Dataclasses inspect class annotations and synthesize methods such as:

  • __init__
  • __repr__
  • __eq__
  • optionally ordering methods
  • optionally hash behavior depending on eq, frozen, and unsafe_hash

# [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int
    patch: int = 0

# Fields are compared in declaration order, like tuples.
print(Version(3, 14) > Version(3, 10))  # True

Ordering follows field order, not domain meaning. If field order does not encode business ordering, order=True can generate deceptively wrong semantics.
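As a rough illustration of that pitfall, the sketch below uses two hypothetical classes, Task and RankedTask. In Task the name field is declared first, so it dominates every comparison. RankedTask declares priority first and excludes name with compare=False.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    name: str      # compared first, which is probably not what was intended
    priority: int


@dataclass(order=True)
class RankedTask:
    priority: int                         # the only field used for comparison
    name: str = field(compare=False)      # excluded from ordering and equality


tasks = [Task("write docs", 1), Task("fix outage", 9)]
ranked = [RankedTask(1, "write docs"), RankedTask(9, "fix outage")]

print(max(tasks).name)   # write docs  (string comparison on name wins)
print(max(ranked).name)  # fix outage  (priority decides)
```

Note that compare=False also removes the field from the generated __eq__, so two RankedTask instances with the same priority compare equal regardless of name. Whether that is acceptable depends on the domain.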

frozen=True prevents ordinary attribute assignment through the generated API, but it is not deep immutability and it is not a security boundary.
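A small sketch of those limits, using a hypothetical Config class. Plain assignment is blocked, but a mutable field can still be changed in place, and object.__setattr__ bypasses the generated check entirely.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class Config:
    name: str
    tags: list[str] = field(default_factory=list)


cfg = Config("prod")

# Ordinary assignment goes through the generated __setattr__ and fails.
try:
    cfg.name = "dev"
    blocked = False
except FrozenInstanceError:
    blocked = True
print(blocked)  # True

# The list object itself is not frozen, so in-place mutation still works.
cfg.tags.append("eu-west")
print(cfg.tags)  # ['eu-west']

# Anyone can sidestep the check deliberately.
object.__setattr__(cfg, "name", "dev")
print(cfg.name)  # dev
```

For value objects that must not change, immutable field types such as tuple or frozenset close the first gap. The second gap is by design.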

Storage model and slots

Without slots=True, a normal dataclass instance usually has an instance __dict__. With slots=True, the instance stores attributes in slot descriptors and does not expose a __dict__ unless you add one deliberately.

Measured locally on CPython 3.12.3, 64-bit Linux for a simple two-field dataclass:

  • plain dataclass instance: 48 bytes plus about 280 bytes for __dict__
  • slotted dataclass instance: 48 bytes and no instance __dict__

# [CURRENT - 3.10-3.14] Requires Python 3.10+
# Example byte counts below were measured on CPython 3.12.3, 64-bit Linux.
import sys
from dataclasses import dataclass

@dataclass
class B:
    x: int
    y: int

@dataclass(slots=True)
class A:
    x: int
    y: int

print(hasattr(B(1, 2), "__dict__"))  # True
print(hasattr(A(1, 2), "__dict__"))  # False
print(sys.getsizeof(B(1, 2)))
print(sys.getsizeof(B(1, 2).__dict__))
print(sys.getsizeof(A(1, 2)))

This is a CPython memory-layout concern, not a language guarantee, but it is one of the main reasons slots=True matters for large object populations.

Version context

Dataclasses were added in Python 3.7 (PEP 557). The slots=True, match_args=True, and kw_only=True parameters were added in Python 3.10. weakref_slot=True was added in Python 3.11, and the docs require it to be paired with slots=True. Current project guidance targets Python 3.10-3.14. Python 3.9 and below are end-of-life.

# [CURRENT - 3.10-3.14] Requires Python 3.10+
from dataclasses import dataclass

@dataclass(slots=True)
class Point:
    x: int
    y: int

Dataclasses also generate __match_args__ by default in Python 3.10+, which makes them participate naturally in structural pattern matching. Keyword-only fields are excluded from __match_args__.

# [CURRENT - 3.10-3.14] Requires Python 3.10+
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

print(Point.__match_args__)  # ('x', 'y')

Field categories

ClassVar marks class attributes that are not dataclass instance fields. InitVar creates an initialization-only parameter that is passed to __post_init__ but not stored as a field.

# [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
from dataclasses import InitVar, dataclass, field
from typing import ClassVar

@dataclass
class User:
    table: ClassVar[str] = "users"  # class attribute, not a field
    name: str
    raw_email: InitVar[str]         # __init__ parameter only, never stored
    email: str = field(init=False)  # stored field, set in __post_init__

    def __post_init__(self, raw_email):
        self.email = raw_email.strip().casefold()

Those distinctions matter because dataclasses are driven by field classification, not just by visible attributes in the class body.
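One way to see the classification is to ask dataclasses.fields() what it considers a field. The sketch below repeats the User class from above so that it runs on its own. The sample name and email values are made up.

```python
from dataclasses import InitVar, dataclass, field, fields
from typing import ClassVar


@dataclass
class User:
    table: ClassVar[str] = "users"
    name: str
    raw_email: InitVar[str]
    email: str = field(init=False)

    def __post_init__(self, raw_email):
        self.email = raw_email.strip().casefold()


user = User("Ada", "  Ada@Example.COM ")

# Only real fields are reported; ClassVar and InitVar entries are skipped.
print([f.name for f in fields(User)])  # ['name', 'email']
print(user)                            # User(name='Ada', email='ada@example.com')
print(hasattr(user, "raw_email"))      # False
print(User.table)                      # users
```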

Edge cases and gotchas

Hash behavior is where many production bugs start. Mutable dataclasses should usually not be hashable. unsafe_hash=True exists, but the name is honest: it is only safe when the fields involved in hashing are effectively immutable.

Do not use unsafe_hash=True as a convenience toggle. If hashed instances can mutate, dict and set behavior can become incorrect after insertion.
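The failure mode can be sketched with three hypothetical classes. By default (eq=True, frozen=False) a dataclass is unhashable, a frozen one gets a hash derived from its fields, and unsafe_hash=True forces a field-based hash onto a mutable class, which breaks set lookups once a field changes.

```python
from dataclasses import dataclass


@dataclass
class MutablePoint:  # eq=True, frozen=False: __hash__ is set to None
    x: int


@dataclass(frozen=True)
class FrozenPoint:   # eq=True, frozen=True: __hash__ is generated
    x: int


@dataclass(unsafe_hash=True)
class RiskyPoint:    # hashable and mutable at the same time
    x: int


try:
    hash(MutablePoint(1))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False
print(mutable_hashable)                          # False
print(hash(FrozenPoint(1)) == hash(FrozenPoint(1)))  # True

p = RiskyPoint(1)
seen = {p}
p.x = 2                       # the stored hash in the set is now stale
print(p in seen)              # False
print(RiskyPoint(1) in seen)  # False
print(len(seen))              # 1
```

The set still holds the object, but neither its old value nor its new value can find it.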

Field ordering still follows Python's rule that non-default parameters must come before defaulted ones. field(default_factory=...) counts as a defaulted field for that purpose.
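A short sketch of that rule with hypothetical classes Bad and Good. The error is raised while the class is being created, not when an instance is constructed.

```python
from dataclasses import dataclass, field

ordering_error = None
try:
    @dataclass
    class Bad:
        rows: list = field(default_factory=list)  # counts as a default
        name: str                                  # non-default after a default
except TypeError as exc:
    ordering_error = str(exc)
print(ordering_error)


@dataclass
class Good:
    name: str
    rows: list = field(default_factory=list)


print(Good("batch-1"))  # Good(name='batch-1', rows=[])
```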

Production usage

Use:

  • __post_init__ for validation and normalization
  • frozen=True for value objects
  • slots=True for many small instances after measuring
  • field(init=False) for derived stored fields

# [OLDER / 3.7-3.8, CURRENT - 3.10-3.14] Works on Python 3.7+
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    value: int

    def __post_init__(self):
        # Valid ports are 1 through 65535.
        if not 0 < self.value < 65536:
            raise ValueError("port out of range")
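Two of the bullets above interact: a frozen dataclass with a derived field(init=False) cannot assign that field normally in __post_init__, because the generated __setattr__ blocks it. The sketch below uses a hypothetical Rect class and the object.__setattr__ workaround.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    width: float
    height: float
    area: float = field(init=False)  # derived, stored once at construction

    def __post_init__(self):
        # self.area = ... would raise FrozenInstanceError here.
        object.__setattr__(self, "area", self.width * self.height)


rect = Rect(2.0, 3.0)
print(rect)       # Rect(width=2.0, height=3.0, area=6.0)
print(rect.area)  # 6.0
```

An alternative design is a read-only property, which avoids storing the value but recomputes it on every access.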

Field annotations describe the intended types, but dataclasses do not enforce them at runtime; validation remains your responsibility.
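A minimal sketch of what that means, with hypothetical classes Loose and Checked. The generated __init__ stores whatever it receives, so a type check has to be written by hand, for example in __post_init__.

```python
from dataclasses import dataclass


@dataclass
class Loose:
    count: int


@dataclass
class Checked:
    count: int

    def __post_init__(self):
        # Explicit runtime check; the annotation alone does nothing here.
        if not isinstance(self.count, int):
            raise TypeError(f"count must be int, got {type(self.count).__name__}")


loose = Loose("three")  # accepted without complaint
print(loose)            # Loose(count='three')

try:
    Checked("three")
    check_error = None
except TypeError as exc:
    check_error = str(exc)
print(check_error)      # count must be int, got str
```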

Further depth
  • dataclasses module
  • dataclasses.field
  • typing.ClassVar
  • PEP 557: Data Classes
  • sys.getsizeof
