Python's dataclasses module simplifies the creation of classes used for storing data.
Most people know the basic usage, but a lesser-known feature, field(default_factory=...),
is very useful for giving fields of mutable types safe default values.
How It Works
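When defining a dataclass, you might want a field to default to a mutable value, such as a list or a dictionary.
Writing the mutable value directly as the default is a problem: in an ordinary class or function, a single default object is shared across every call, and dataclasses reject list, dict, and set defaults with a ValueError for exactly that reason.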
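To make the problem concrete, here is a minimal sketch of both behaviors. PlainStudent and BrokenStudent are hypothetical names used only for illustration; they are not part of the example that follows.

```python
from dataclasses import dataclass
from typing import List


# Ordinary class: the default list is created once, when __init__ is defined,
# so every instance that relies on the default shares the same list object.
class PlainStudent:
    def __init__(self, name, grades=[]):
        self.name = name
        self.grades = grades


alice = PlainStudent("Alice")
bob = PlainStudent("Bob")
alice.grades.append(95)
shared_grades = bob.grades  # Bob sees Alice's grade too


# Dataclass: a bare list default is rejected while the class is being created.
try:
    @dataclass
    class BrokenStudent:
        name: str
        grades: List[int] = []
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

print(shared_grades)
print(error_message)
```

The dataclass version fails early instead of silently sharing state, and the error message itself points to default_factory as the fix.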
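The default_factory parameter of field() provides a clean way to handle mutable defaults: you pass a zero-argument callable, and it is called to build a fresh default value each time an instance needs one.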
Here’s a simple example:
from dataclasses import dataclass, field
from typing import List


@dataclass
class Student:
    name: str
    grades: List[int] = field(default_factory=list)  # Use default_factory for mutable default


# Create new Student instances
student1 = Student(name="Alice")
student2 = Student(name="Bob", grades=[90, 85])

# Modify student1's grades
student1.grades.append(95)

print(student1)  # Output: Student(name='Alice', grades=[95])
print(student2)  # Output: Student(name='Bob', grades=[90, 85])
In this example, student1 is created without grades, so its grades field starts as a new empty list, while student2 keeps the list that was passed in.
Using field(default_factory=list) ensures that each instance relying on the default gets its own separate list, avoiding the pitfalls of shared mutable defaults.
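The factory does not have to be list. Any zero-argument callable works, including a lambda that builds a non-empty default. The sketch below assumes a hypothetical Course class with made-up field names and values, purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Course:
    # Hypothetical class and fields, used only to illustrate default_factory.
    title: str
    roster: List[str] = field(default_factory=list)
    # A lambda lets the default start with content; it runs once per instance.
    settings: Dict[str, int] = field(default_factory=lambda: {"max_size": 30})


algebra = Course("Algebra")
art = Course("Art")

# Changing one instance's defaults leaves the other instance untouched.
algebra.roster.append("Alice")
algebra.settings["max_size"] = 25

print(algebra)
print(art)
```

Because the factory is called separately for each instance, algebra and art never share a roster or a settings dictionary.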
Why It’s Cool
The default_factory feature avoids the common problems with mutable default arguments.
It ensures that each dataclass instance gets its own default value, which makes your code more predictable and prevents subtle bugs caused by shared state.