Attribute Access and Descriptor Protocol in Python
Description
In Python, attribute access (such as obj.attr) may seem simple, but it involves underlying mechanisms like __getattribute__, __getattr__, and the descriptor protocol. Understanding these mechanisms helps developers control object behavior and avoid subtle bugs. For example, why can @property implement computed attributes? How do descriptors intercept attribute operations? These are key points in interviews to assess the depth of understanding of Python's object model.
Knowledge Explanation
- Basic Attribute Access Flow
  When accessing obj.attr, Python (through object.__getattribute__) searches for the attribute in the following order:
  - First, check whether attr is a data descriptor defined in the object's class or one of its parent classes (i.e., a class attribute whose type implements __set__ or __delete__). If so, its __get__ method is called.
  - If no data descriptor is found, look for an instance attribute in obj.__dict__.
  - If still not found, use the attribute found in the class's __dict__ or along the parent class chain (the MRO): a non-data descriptor (one that implements only __get__) has its __get__ called, and any other class attribute is returned as-is.
  - If all attempts fail, call __getattr__ (if defined); otherwise raise an AttributeError.
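The lookup order can be observed directly. The following is a minimal sketch, not a definitive reference: the class names DataDesc, NonDataDesc, and Demo are hypothetical and exist only for illustration. It writes into the instance dictionary by hand so that each attribute name has competing candidates.

```python
class DataDesc:
    """Data descriptor: defines __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class Demo:
    a = DataDesc()
    b = NonDataDesc()
    c = "class attribute"

    def __getattr__(self, name):
        # Called only after the normal lookup has failed.
        return f"fallback for {name}"


d = Demo()
# Write straight into the instance dictionary, bypassing __set__.
d.__dict__["a"] = "instance value"
d.__dict__["b"] = "instance value"

results = {
    "a": d.a,              # data descriptor beats the instance dictionary
    "b": d.b,              # instance dictionary beats the non-data descriptor
    "c": d.c,              # plain class attribute
    "missing": d.missing,  # nothing found, so __getattr__ runs
}
print(results)
```

Only the name backed by a data descriptor ignores the value stored in the instance dictionary.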
- Core Methods of the Descriptor Protocol
  A descriptor is an object whose class implements any of the following methods:
  - __get__(self, obj, type=None) -> value: Intercepts attribute reading.
  - __set__(self, obj, value) -> None: Intercepts attribute assignment.
  - __delete__(self, obj) -> None: Intercepts attribute deletion.
  Based on whether __set__ or __delete__ is implemented, descriptors are divided into:
  - Data Descriptor (implements __set__ or __delete__): Has higher priority than instance attributes (e.g., @property).
  - Non-Data Descriptor (implements only __get__): Has lower priority than instance attributes (e.g., functions defined in a class, which is how methods get bound to instances).
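To make the two categories concrete, here is a small sketch showing that a plain function is a non-data descriptor while property is a data descriptor. The helper is_data_descriptor and the class Greeter are hypothetical names introduced for this example; the standard library offers inspect.isdatadescriptor for a similar check.

```python
def is_data_descriptor(attr):
    """Hypothetical helper: True if attr's type defines __set__ or __delete__."""
    t = type(attr)
    return hasattr(t, "__set__") or hasattr(t, "__delete__")


class Greeter:
    def hello(self):
        return "hello"

    @property
    def name(self):
        return "greeter"


g = Greeter()

# The raw function object lives in the class dictionary.
func = Greeter.__dict__["hello"]
# Calling __get__ by hand produces the same bound method that g.hello yields.
bound = func.__get__(g, Greeter)
bound_result = bound()

function_is_data = is_data_descriptor(func)                      # only __get__
property_is_data = is_data_descriptor(Greeter.__dict__["name"])  # has __set__

# Functions are non-data descriptors, so an instance attribute can shadow them.
g.hello = lambda: "shadowed"
shadowed_result = g.hello()
```

Because functions define only __get__, assigning g.hello stores the lambda in the instance dictionary, and later lookups find it there before they reach the method on the class.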
- The Essence of the @property Decorator
  @property is a built-in data descriptor. For example:

  ```python
  class Circle:
      def __init__(self, radius):
          self.radius = radius

      @property
      def area(self):
          return 3.14 * self.radius ** 2
  ```

  - When accessing c.area (where c is a Circle instance), the __get__ method of the property class is called to compute the value dynamically.
  - Because property implements __set__ (which raises AttributeError when no setter is defined), it is a data descriptor. Even if an instance has __dict__['area'], the descriptor is called first.
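The behaviour described above can be approximated in pure Python. This is a deliberately simplified sketch: SimpleProperty is a hypothetical getter-only stand-in, not the actual implementation of the built-in property, which is written in C and also supports setters and deleters.

```python
class SimpleProperty:
    """Hypothetical, getter-only imitation of the built-in property."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self        # accessed on the class itself
        return self.fget(obj)  # compute the value on every read

    def __set__(self, obj, value):
        # Defining __set__ is what makes this a data descriptor.
        raise AttributeError("can't set attribute")


class Circle:
    def __init__(self, radius):
        self.radius = radius

    @SimpleProperty
    def area(self):
        return 3.14 * self.radius ** 2


c = Circle(2)
c.__dict__["area"] = -1  # try to shadow the descriptor by hand
computed = c.area        # still computed by __get__, not read from __dict__

try:
    c.area = 10
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False
```

Removing __set__ from SimpleProperty would turn it into a non-data descriptor, and the value planted in c.__dict__ would then win the lookup.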
- Practical Example of a Custom Descriptor
  Suppose type validation is needed for assignment:

  ```python
  class TypedDescriptor:
      def __init__(self, name, expected_type):
          self.name = name
          self.expected_type = expected_type

      def __get__(self, obj, objtype=None):
          if obj is None:
              return self  # Return the descriptor itself when accessed via the class
          return obj.__dict__.get(self.name)

      def __set__(self, obj, value):
          if not isinstance(value, self.expected_type):
              raise TypeError(f"Expected {self.expected_type}")
          obj.__dict__[self.name] = value  # Store in the instance dictionary


  class Person:
      name = TypedDescriptor("name", str)  # Class attribute is a descriptor instance
      age = TypedDescriptor("age", int)

      def __init__(self, name, age):
          self.name = name  # Triggers __set__
          self.age = age
  ```

  - When executing p = Person("Alice", 30), self.name = name actually calls TypedDescriptor.__set__ for type checking.
  - When reading p.name, __get__ is called and returns the value stored in the instance's __dict__.
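One possible variation, shown here as a sketch rather than as the example's intended design, uses the __set_name__ hook (available since Python 3.6) so the attribute name does not have to be repeated in each declaration. The class name Typed and the error message format are assumptions made for this illustration.

```python
class Typed:
    """Hypothetical variant of the type-checking descriptor using __set_name__."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        # Called once when the owning class is created; records the attribute name.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} expected {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value


class Person:
    name = Typed(str)
    age = Typed(int)

    def __init__(self, name, age):
        self.name = name  # goes through Typed.__set__
        self.age = age


p = Person("Alice", 30)

try:
    p.age = "thirty"  # wrong type, rejected by __set__
except TypeError as exc:
    error_message = str(exc)
```

After the failed assignment, p.age still holds the previous valid value, because __set__ raises before it writes to the instance dictionary.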
- Relationship Between Descriptors and __getattribute__
  - Every attribute read on an instance is ultimately handled by the object.__getattribute__ method, which has built-in logic for the descriptor protocol.
  - If you override __getattribute__, you must delegate to super().__getattribute__(name) or handle descriptors yourself; otherwise, the protocol may be broken. Reading self.attr inside the override calls the override again, so a careless implementation can also recurse infinitely.
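A minimal sketch of a safe override follows. The Traced class and its log attribute are hypothetical; the point is only that delegating to super().__getattribute__ keeps descriptors such as property working and avoids recursion.

```python
class Traced:
    def __init__(self):
        self.log = []  # assignment uses __setattr__, so it is not traced
        self.value = 21

    @property
    def doubled(self):
        return self.value * 2

    def __getattribute__(self, name):
        # Fetch the log through the parent implementation; writing self.log
        # here would call this method again and recurse without end.
        log = super().__getattribute__("log")
        if name != "log":
            log.append(name)
        # Delegating keeps the descriptor protocol (and __dict__ lookup) intact.
        return super().__getattribute__(name)


t = Traced()
result = t.doubled      # the property still works through the override
accessed = list(t.log)  # names read so far, in order
```

The log records both the read of doubled and the read of value that the property performs internally, which shows that every instance attribute read passes through the override.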
Summary
The underlying mechanisms of attribute access show how Python builds object behavior on small, overridable protocols. The descriptor protocol is the foundation for advanced features (such as ORM fields and property validation). Key points to remember: data descriptors have the highest priority, followed by instance attributes, then non-data descriptors and other class attributes, with __getattr__ as the last resort.