You check whether a username is in a blocked list with username in blocked_list. That works — but if blocked_list is a list, Python scans every element one by one. If it is a set, Python computes the hash and checks in one step. The difference between linear scan and hash lookup grows with every element.
Think of a set like a club bouncer with a perfect memory. You say a name, and the bouncer instantly knows whether it is on the list. A list is like the same bouncer reading every name on a clipboard one by one. Dictionary views extend the same hash-powered lookup to the keys and items of a dict — no need to build a separate set first.
Use a set when the real operation is repeated membership testing or set algebra: union, intersection, difference, subset, or superset.
# [CURRENT - 3.10-3.14] Works on Python 3.xblocked = {"root", "admin", "system"}def allowed(username): return username not in blockedprint(allowed("ana"))On this machine, local timeit measurements over a 10000-element collection showed the scale of the difference:
9999 in list(range(10000))repeated10000times: about0.595seconds9999 in set(range(10000))repeated10000times: about0.000236seconds
Those exact times are not portable. The stable lesson is the algorithmic one: list membership is linear scan, set membership is average-case hash lookup.
Sets are hash tables without associated values. The same hashability rules that govern dict keys govern set elements.
# [CURRENT - 3.10-3.14] Works on Python 3.xrole = frozenset({"read", "write"})permissions = {role: "editor"}print(permissions[role])Because sets are hash tables, they also reserve significant extra memory to protect lookup speed. Local CPython 3.12.3 measurements with sys.getsizeof show stepped growth:
- empty set:
216bytes 1to4elements:216bytes5to16elements:728bytes32to64elements:2264bytes
# [CURRENT - 3.10-3.14] Works on Python 3.x# Example byte counts below were measured on CPython 3.12.3, 64-bit Linux.import sysfor n in (0, 1, 4, 5, 16, 32): s = set(range(n)) print(n, sys.getsizeof(s))That overhead is the price of near-constant-time membership.
The set base object (~216 bytes for an empty set on CPython 3.12) is much larger than an empty dict (~72 bytes) because the set object includes inline hash table storage (8 entry slots) even at minimum size.
CPython implements sets using the same underlying hash table machinery as dicts. The set type (PySetObject) is defined in Objects/setobject.c. Unlike dicts, sets store only keys — there are no associated values. The hash table uses the same probing algorithm but a different resize policy (~60% fill trigger instead of dict's 2/3).
The set object contains:
- a pointer to the hash table entries (an array of
_Py_setentrystructs, each holding a key pointer and cached hash) usedandfillcounters that track how many slots are occupied (used) versus ever occupied (fill, which includes dummy entries from deleted elements)
The probing formula for sets uses a hybrid of linear probing and the same perturb algorithm as dicts (Objects/setobject.c):
# 9 steps of linear probing first, then fall back to perturbj = (5 * j) + 1 + perturbperturb >>= PERTURB_SHIFTThe minimum set size on CPython 3.12 is PySet_MINSIZE = 8 slots, with a load factor that triggers resize at ~60% fill. The trigger is based on fill (active entries plus dummy slots left by deletions), not just used active entries.
A key difference from dict: sets use a combined entry struct (_Py_setentry) that stores key and hash together, rather than the split-table layout dicts use (sparse indices + dense entries). Set resizing allocates one new combined table, copies active entries, and frees the old table — no rehashing needed because the hash is cached in each entry.
When you call set() on an iterable, CPython calls set_update_iterable in Objects/setobject.c (Python 3.12), which iterates the source and inserts each element one at a time. The frozenset constructor does the same but returns an immutable wrapper around the same hash table, with no add/discard methods.
frozenset is hashable, which means it can be used as a dict key or set element. This is its primary practical advantage over mutable set — without hashability, you could not nest sets inside sets or use them as mapping keys.
Sets support the expected algebra directly:
# [CURRENT - 3.10-3.14] Works on Python 3.xrequested = {"read", "write", "delete"}granted = {"read", "write"}print(requested - granted)print(requested <= granted)print(requested & granted)Operator forms such as |, &, and - on actual set objects expect set operands. If the right-hand side is a general iterable, use methods such as .union() or .intersection().
# [CURRENT - 3.10-3.14] Works on Python 3.xprint({"a", "b"}.intersection(["b", "c"]))dict.keys(), dict.items(), and dict.values() return dynamic view objects over the underlying dictionary. They are not materialized lists, and they update when the dictionary changes.
# [CURRENT - 3.10-3.14] Works on Python 3.xdata = {"a": 1}keys = data.keys()data["b"] = 2print(list(keys))Key views are set-like because keys are unique and hashable. Item views are also set-like when their (key, value) pairs are hashable. Values views are not set-like because values are not generally unique.
# [CURRENT - 3.10-3.14] Works on Python 3.xexpected = {"id", "email", "role"}payload = {"id": 1, "email": "a@example.com", "debug": True}missing = expected - payload.keys()extra = payload.keys() - expectedprint(missing, extra)One subtle but important distinction from ordinary sets: the Python docs note that set-like dictionary views accept any iterable as the other operand for set operators. That makes view algebra surprisingly ergonomic.
# [CURRENT - 3.10-3.14] Works on Python 3.xd = {"a": 1, "b": 2}print(d.keys() | ["b", "c", "c"])This works because the view's __or__ method builds a set from self and then calls .union() with the other operand, which accepts any iterable. The view itself is not a set — it is a live projection of the dict's current keys — but the set operators on views produce real set objects as results.
Set types and dictionary views are stable Python 3 features. Dictionary insertion order is guaranteed in Python 3.7+. Dictionary views became reversible in Python 3.8. The .mapping attribute on views was added in Python 3.10. Python 3.9 and below are End-of-Life.
CPython-specific notes:
- the empty-set memory footprint (
216bytes on CPython 3.12) is an implementation detail; other interpreters may differ - the probing algorithm and load factor may change across CPython releases
- source for set internals:
Objects/setobject.c
Set iteration order is not a sorting contract. It can vary across runs and table states. If output order matters, sort explicitly.
# [CURRENT - 3.10-3.14] Works on Python 3.xnames = {"zoe", "ana", "bo"}print(sorted(names))frozenset is immutable and hashable, but that does not make it ordered. It is for stable membership sets, not for sequence semantics.
If you only need key comparisons, use dictionary views directly. Building a separate set of keys first often adds allocation without adding meaning.
Use sets for:
- blocklists and allowlists
- schema comparison
- permission algebra
- deduplication when order does not matter
Use dict.keys() and dict.items() views for live schema and key-delta checks against dictionaries that may continue to evolve during processing.
When each key needs associated data, move back to dictionaries; see .