I recently got side-tracked into exploring the basics of namedtuple() after catching a glimpse of its usage in our engineering codebase. Here’s my summary:
Mutable and Hashable
To understand the behavior of namedtuple(), it helps to first visit the concepts of mutability and hashability of Python objects. These two concepts are closely linked.
Hashability: an object is hashable if its hash value never changes during its lifetime.
Most of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not; immutable containers (such as tuples and frozensets) are only hashable if their elements are hashable. Objects which are instances of user-defined classes are hashable by default.
Mutability: an object with a fixed value that cannot be altered is immutable (for example: int, float, str, tuple).
In contrast, an object that can change its value while keeping its id() is mutable (for example: list, dict).
hash() and id()
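Identity: id() returns an object’s identity, which is unique and constant for that object during its lifetime. Two variables holding equal values do not necessarily share an identity; they do only when both names refer to the same object (CPython, for instance, caches small integers, which is why two variables set to the same small int report the same id()).
If two objects (that exist at the same time) have the same identity, they’re actually two references to the same object.
The is operator compares items by identity: a is b is equivalent to id(a) == id(b).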
Hash value: hash() returns an object’s hash value. For built-in types it is based on the object’s value, and it must remain the same for the lifetime of the object. If an object’s value can change, a value-based hash would change with it, so it doesn’t make sense for a mutable object to have a hash.
The hash value is an integer which is used to quickly compare dictionary keys or set elements.
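To make the difference between identity and hash value concrete, here is a minimal sketch; the variable names are made up for illustration.

```python
a = [1, 2]
b = [1, 2]      # equal value, but a separate object
c = a           # a second reference to the same object

same_value = a == b            # True: the values are equal
same_object_ab = a is b        # False: two distinct list objects
same_object_ac = a is c        # True: both names point to one object
is_matches_id = (a is c) == (id(a) == id(c))

# Equal hashable values always produce equal hashes.
equal_hashes = hash((1, 2)) == hash((1, 2))

# A mutable container such as a list defines no hash at all.
try:
    hash(a)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```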
Why Hash?
Hash values are very useful, as they enable quick look-up of values in a large collection. They are commonly used in set and dict.
Consider the test if x in elements:
- In a list, Python needs to go through the list and compare x’s value with each value in elements until it finds a match.
- In a set, Python keeps track of each element’s hash. Python gets the hash value of x, looks that up in an internal structure, and only compares x with the elements that have the same hash as x.
It also means we can have non-hashable objects in a list, but not in a set or as keys in a dict.
Example:
There is no way to change an int object’s value in place; re-assigning the variable simply binds the name to a different object.
x = 5
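As a minimal sketch of this behavior (the helper names id_before and id_after are only for illustration), the identity of x changes as soon as its value does:

```python
x = 5
id_before = id(x)

x += 1                        # does not modify the int 5; rebinds x to the int 6
id_after = id(x)

changed = id_before != id_after
```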
For a list, however, we can edit its value after assignment while keeping its id() the same. (Note: this holds when we use the list’s in-place methods rather than re-assignment; it is the difference between x.sort() and x = sorted(x).)
x = [5]
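A small sketch of the in-place versus re-assignment distinction; the extra name y is only there to keep the original list alive for comparison.

```python
x = [5]
original_id = id(x)

x.append(3)                     # in-place: still the same object
x.sort()                        # in-place: still the same object
same_after_methods = id(x) == original_id

y = x                           # keep a reference to the original list
x = sorted(x, reverse=True)     # sorted() returns a brand-new list
same_after_sorted = x is y
```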
NamedTuple
Data classes are just regular classes that are geared towards storing state rather than containing a lot of logic, and namedtuple() produces one kind of data class.
Every time we create a class that mostly consists of attributes, we make a data class.
With namedtuple(), we can create immutable sequence types that allow us to access their values using descriptive field names and dot notation instead of unclear integer indices.
Initialization
- typename: str, the class name of the namedtuple
- field names: names that are used to access values in the namedtuple; they can be declared using any of the following:
  - an iterable of strings: ["a", "b", "c"]
  - a string with names separated by whitespace: "a b c"
  - a string with names separated by commas: "a, b, c"
Example:
from collections import namedtuple
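The three ways of declaring field names can be sketched as follows. The Point type and its fields are hypothetical, chosen only for illustration.

```python
from collections import namedtuple

# Three equivalent declarations of the same fields.
PointFromList = namedtuple("Point", ["x", "y"])
PointFromSpaces = namedtuple("Point", "x y")
PointFromCommas = namedtuple("Point", "x, y")

fields_match = (
    PointFromList._fields == PointFromSpaces._fields == PointFromCommas._fields
)

# Instances accept positional and keyword arguments.
p = PointFromList(1, y=2)
```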
Access and Edit Value
It is very straightforward to access a namedtuple’s field value using dot notation, which gives namedtuple a great edge over dict or a plain tuple.
Person = namedtuple('Person', 'name children')
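A minimal sketch of the different ways to read values from such a Person; the sample values are made up.

```python
from collections import namedtuple

Person = namedtuple("Person", "name children")
jj = Person("JJ", ["Tobby"])    # hypothetical sample values

by_name = jj.name               # dot notation with a descriptive field name
by_index = jj[0]                # plain tuple indexing still works
name, children = jj             # and so does unpacking
```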
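Since a namedtuple is immutable, we can’t assign a value to one of its attributes. What we can do is use ._replace(), which returns a new instance with the given fields changed. Also note that a field’s value can itself be mutable, like a list, and can then be modified in place.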
>>> jj.children = ['Tobby', 'Wang']
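As a sketch of these three points (sample values are made up): direct assignment fails, ._replace() builds a new instance, and a list stored in a field can still be mutated. A side effect worth knowing is that a namedtuple holding a list is not hashable, because a tuple’s hash depends on its elements.

```python
from collections import namedtuple

Person = namedtuple("Person", "name children")
jj = Person("JJ", ["Tobby"])

# 1. Direct assignment is rejected.
try:
    jj.children = ["Tobby", "Wang"]
    assignment_failed = False
except AttributeError:
    assignment_failed = True

# 2. _replace() returns a new instance; the original is untouched.
jj2 = jj._replace(children=["Tobby", "Wang"])

# 3. The list inside the original instance can still be changed in place.
jj.children.append("Wang")

# A namedtuple with an unhashable element is itself unhashable.
try:
    hash(jj)
    jj_is_hashable = True
except TypeError:
    jj_is_hashable = False
```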
Using ._asdict()
The method ._asdict() converts a namedtuple instance into a dictionary that maps field names to values.
Person = namedtuple("Person", "name age height")
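A minimal sketch of ._asdict() with made-up sample values. The keys come out in field order.

```python
from collections import namedtuple

Person = namedtuple("Person", "name age height")
jane = Person("Jane", 25, 1.75)     # hypothetical sample values

as_dict = jane._asdict()
keys_in_field_order = list(as_dict)
```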
To go the other way and generate a namedtuple object from a dictionary:
d = {
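One common way to do this, sketched here with made-up values, is to unpack the dictionary into keyword arguments with **. This assumes the dictionary keys match the field names exactly.

```python
from collections import namedtuple

Person = namedtuple("Person", "name age height")

d = {"name": "Jane", "age": 25, "height": 1.75}   # hypothetical sample data
jane = Person(**d)                                # keys must match the field names

round_trip = dict(jane._asdict()) == d
```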
@dataclass
@dataclass was introduced in Python 3.7. Data classes are similar to namedtuple, but they are mutable by default, so we can assign a new value to a @dataclass attribute.
from dataclasses import dataclass
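A minimal sketch of a mutable data class; the Person fields are hypothetical. Because a default data class is mutable and defines __eq__, it is also unhashable, which ties back to the hashability discussion above.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


jane = Person("Jane", 25)
jane.age = 26                       # allowed: data classes are mutable by default

try:
    hash(jane)
    jane_is_hashable = True
except TypeError:
    jane_is_hashable = False
```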
frozen attribute
If we want a @dataclass to behave like a namedtuple, with read-only attributes, just use @dataclass(frozen=True).
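A sketch of a frozen data class with hypothetical fields. Assignment raises dataclasses.FrozenInstanceError (a subclass of AttributeError), and with the default eq=True the frozen instance becomes hashable.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Person:
    name: str
    age: int


jane = Person("Jane", 25)

try:
    jane.age = 26
    assignment_failed = False
except FrozenInstanceError:
    assignment_failed = True

# Frozen instances can be used in sets and as dict keys.
people = {jane, Person("Jane", 25)}
```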
override __iter__()
Unlike namedtuple, @dataclass instances are not iterable by default. We can make them iterable by implementing the special method .__iter__():
from dataclasses import astuple, dataclass
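One way to do this, sketched with hypothetical fields, is to delegate to dataclasses.astuple(). Note that astuple() recurses into nested data classes and copies containers, so this is a convenience rather than a zero-cost view.

```python
from dataclasses import astuple, dataclass


@dataclass
class Person:
    name: str
    age: int

    def __iter__(self):
        # Iterate over the field values in declaration order.
        return iter(astuple(self))


jane = Person("Jane", 25)
name, age = jane                    # unpacking now works
values = list(jane)
```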
Subclassing namedtuple
Subclassing namedtuple gives us additional functionality.
BasePerson = namedtuple("BasePerson", "name birthdate country")
In the above example, subclassing from namedtuple gives us better documentation (i.e. Person.__doc__), a better string representation (i.e. print(jane)) and an extra property that is computed from a Person instance’s attribute values.
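A sketch of what such a subclass might look like. The docstring text, the format of __str__ and the age property are assumptions made for illustration, as are the sample values.

```python
from collections import namedtuple
from datetime import date

BasePerson = namedtuple("BasePerson", "name birthdate country")


class Person(BasePerson):
    """A namedtuple subclass to hold a person's data."""

    __slots__ = ()  # keep instances as lightweight as the base namedtuple

    def __str__(self):
        return f"Name: {self.name}, age: {self.age} years old."

    @property
    def age(self):
        # Derived from the birthdate field; not stored on the instance.
        today = date.today()
        had_birthday = (today.month, today.day) >= (
            self.birthdate.month,
            self.birthdate.day,
        )
        return today.year - self.birthdate.year - (0 if had_birthday else 1)


jane = Person("Jane", date(1996, 3, 5), "Canada")
description = str(jane)
```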
__new__() constructor
Zechong Hu’s Blog - Inheritance for Python Namedtuples
To override the constructor of a namedtuple class, for example to supply a default value:
BasePerson = namedtuple("BasePerson", ["name", "birthdate", "country"])
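A minimal sketch of overriding __new__() in a namedtuple subclass. Which field gets a default, and the default value itself, are assumptions for illustration. Because tuples are immutable, the values have to be fixed in __new__() rather than in __init__().

```python
from collections import namedtuple

BasePerson = namedtuple("BasePerson", ["name", "birthdate", "country"])


class Person(BasePerson):
    __slots__ = ()

    def __new__(cls, name, birthdate, country="Canada"):
        # "Canada" is a hypothetical default value.
        return super().__new__(cls, name, birthdate, country)


with_default = Person("Jane", "1996-03-05")
explicit = Person("Linda", "1990-01-01", "Mexico")
```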
__slots__
The special attribute __slots__ explicitly states which attributes we want our class instances to have.
By default, when an instance (object) is created, __dict__ is used to store the object’s (writable) attributes.
A dynamic dictionary:
- requires more memory
- takes longer to create.
Because namedtuple makes immutable instances that are lightweight, we need to prevent the creation of __dict__ when subclassing in order to keep that benefit, which we do by setting __slots__ to an empty tuple.
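A small sketch of the difference, using hypothetical class names: without __slots__ the subclass instances gain a __dict__ and accept arbitrary new attributes; with __slots__ = () they do not.

```python
from collections import namedtuple

Base = namedtuple("Base", "x y")


class WithDict(Base):
    pass                    # no __slots__: every instance gets a __dict__


class WithoutDict(Base):
    __slots__ = ()          # no per-instance __dict__


a = WithDict(1, 2)
b = WithoutDict(1, 2)

a.extra = 3                 # silently allowed on the subclass without __slots__
try:
    b.extra = 3
    b_accepts_new_attrs = True
except AttributeError:
    b_accepts_new_attrs = False
```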
On a more general note, consider using __slots__ when creating a very large number of objects: it saves memory and time when instantiating.
Comparison: __dict__ vs. __slots__
class Person(object):
class Person(object):
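As a sketch of the comparison, here are two otherwise identical classes; the names PersonWithDict and PersonWithSlots and their fields are made up for illustration. The version with __slots__ has no __dict__ and rejects attributes that were not declared.

```python
class PersonWithDict:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class PersonWithSlots:
    __slots__ = ("name", "age")     # the only attributes instances may have

    def __init__(self, name, age):
        self.name = name
        self.age = age


a = PersonWithDict("Jane", 25)
b = PersonWithSlots("Jane", 25)

a.nickname = "JJ"                   # stored in a.__dict__
try:
    b.nickname = "JJ"               # not listed in __slots__
    slots_accepts_new_attrs = True
except AttributeError:
    slots_accepts_new_attrs = False

b.age = 26                          # declared slots remain writable
```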
Reference
Medium, Megha Mohan - Mutable vs Immutable Objects in Python
Stack Overflow - What are data classes and how are they different from common classes?
GeeksforGeeks - Use of __slots__
Stack Overflow - Usage of __slots__?
Stack Overflow - Difference between hash() and id()
Stack Overflow - Two variables in Python have same id, but not lists or tuples