Describing Descriptors

PyCon 2015, Montréal, Canada

9 Apr 2015

Laura Rupprecht

@LCRcircuit

About me:

Outline:

  • What is a descriptor?
  • Custom Descriptor Example
  • Kinds of descriptors
  • Attribute lookup order
  • @property / @classmethod
  • Usage in ORMs
  • Common Problems

What is a descriptor?

A certain type of attribute

class Foo():
    x = SomeDescriptor(some_args)  # x is the attribute

SomeDescriptor, the thing you're using instead of a normal attribute, might define:

  • __get__
  • __set__
  • __delete__

Descriptor Definition

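class Descriptor():
    def __init__(self, *args, **kwargs):
        # store whatever configuration the descriptor needs
        pass

    def __get__(self, instance, owner):
        # called for obj.x (instance is obj) and Cls.x (instance is None)
        value = None  # placeholder: look up or compute the value here
        return value

    def __set__(self, instance, value):
        # called for obj.x = value
        return None

    def __delete__(self, instance):
        # called for del obj.x
        return None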
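
To see when each hook fires, here is a minimal sketch. LoggedAttribute and Widget are hypothetical names, and keeping the value in the instance __dict__ under a fixed key is just one possible storage choice.

```python
class LoggedAttribute:
    """Hypothetical descriptor that records which protocol method ran."""

    def __init__(self, name):
        self.name = name      # key used in the instance __dict__
        self.calls = []       # protocol methods, in the order they were called

    def __get__(self, instance, owner):
        if instance is None:  # accessed on the class itself
            return self
        self.calls.append('__get__')
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        self.calls.append('__set__')
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        self.calls.append('__delete__')
        instance.__dict__.pop(self.name, None)


class Widget:
    size = LoggedAttribute('size')


w = Widget()
w.size = 3            # runs LoggedAttribute.__set__
current = w.size      # runs LoggedAttribute.__get__
del w.size            # runs LoggedAttribute.__delete__
calls = Widget.size.calls
print(calls)
```

Plain attribute syntax on the instance is all it takes to trigger the three methods.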

Example:

  • Dealing with json YouTube API responses
  • Dictionary access is everywhere

    video['snippet']['thumbnails']['maxres']['url']

  • Hat tip to Jonathan @tushman - "Use Descriptors"
uh...
what's a descriptor?

What is the definition of a descriptor?

Official Version: An object attribute that has some sort of “binding behavior.”

Has any combination of the following methods:

  • __get__
  • __set__
  • __delete__

Example: Making API Use Easier

response = YouTubeApi.get_video_info(video_id='dQw4w9WgXcQ')

{u'etag': u'"9iWEWaGPvvCMMVNTPHF9GiusHJA/5Y8U-tVLJ2KpbH3djNvGCA7vmLs"',
  u'id': u'dQw4w9WgXcQ',
  u'kind': u'youtube#video',
  u'snippet': {u'categoryId': u'10',
   u'channelId': u'UC38IQsAvIsxxjztdMZQtwHA',
   u'channelTitle': u'channelTitle',
   u'description': u'A description',
   u'liveBroadcastContent': u'none',
   u'localized': {u'description': u'A description',
    u'title': u'Title'},
   u'publishedAt': u'2009-10-25T06:45:58.000Z',
   u'thumbnails': {u'default': {u'height': 90,
     u'url': u'https://i.ytimg.com/vi/dQw4w9WgXcQ/default.jpg',
     u'width': 120},
    u'high': {u'height': 360,
     u'url': u'https://i.ytimg.com/vi/dQw4w9WgXcQ/hqdefault.jpg',
     u'width': 480},
    u'maxres': {u'height': 720,
     u'url': u'https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg',
     u'width': 1280},
    u'medium': {u'height': 180,
     u'url': u'https://i.ytimg.com/vi/dQw4w9WgXcQ/mqdefault.jpg',
     u'width': 320},
    u'standard': {u'height': 480,
     u'url': u'https://i.ytimg.com/vi/dQw4w9WgXcQ/sddefault.jpg',
     u'width': 640}},
   u'title': u'Title'},
  u'statistics': {u'commentCount': u'240857',
   u'dislikeCount': u'32927',
   u'favoriteCount': u'0',
   u'likeCount': u'548877',
   u'viewCount': u'111604659'}}

How can we make this prettier?

response = YouTubeApi.get_video_info(video_id)

youtube_id = response['id']
title = response['snippet']['title']
views = response['statistics']['viewCount']
description = response['snippet']['description']

1st Try: put into an object

class YouTubeResponse():
    def __init__(self, response):
        self.youtube_id = response['id']
        self.title = response['snippet']['title']
        self.views = response['statistics']['viewCount']
        self.description = response['snippet']['description']

result = YouTubeApi.get_video_info(video_id)
response = YouTubeResponse(result)
print(response.youtube_id)
# dQw4w9WgXcQ
a little better...

Let's focus on the response

import dict_digger

def get_response(json_response, *path):
    return dict_digger.dig(json_response, *path)


class YouTubeResponse():
    def __init__(self, json_response):
        self.json_response = json_response

        self.youtube_id = get_response(json_response, 'id')
        self.title = get_response(json_response, 'snippet', 'title')
        self.views = get_response(json_response, 'statistics', 'viewCount')
        self.description = get_response(json_response, 'snippet', 'description')

Closer... but still repetitive

Let's focus on the response

import dict_digger

class ResponseDescriptor():
    def __init__(self, *path):
        self.path = path

    def __get__(self, instance, objtype):
        return dict_digger.dig(instance.json_response, *self.path)


class YouTubeResponse():
    def __init__(self, json_response):
        self.json_response = json_response

    youtube_id = ResponseDescriptor('id')
    title = ResponseDescriptor('snippet', 'title')
    views = ResponseDescriptor('statistics', 'viewCount')
    description = ResponseDescriptor('snippet', 'description')

For completeness, add error checking

import dict_digger


class ResponseDescriptor():
    def __init__(self, *path):
        self.path = path

    def __get__(self, instance, objtype):
        try:
            return dict_digger.dig(instance.json_response, *self.path, fail=True)
        except (IndexError, KeyError):
            raise AttributeError("Item not present in instance attribute "
                                 "'json_response'")


class YouTubeResponse():
    def __init__(self, json_response):
        self.json_response = json_response

    youtube_id = ResponseDescriptor('id')
    title = ResponseDescriptor('snippet', 'title')
    views = ResponseDescriptor('statistics', 'viewCount')
    description = ResponseDescriptor('snippet', 'description')

Now things are easier

result = YouTubeApi.get_video_info('dQw4w9WgXcQ')
response = YouTubeResponse(result)
print(response.title)
# Rick Astley - Never Gonna Give You Up
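
To try the pattern without the YouTube API or the dict_digger package, here is a self-contained sketch. The dig helper is a stand-in written for this example (not dict_digger's implementation), and the sample dictionary is a trimmed-down, made-up response.

```python
from functools import reduce


def dig(data, *path):
    """Stand-in helper: follow `path` through nested dicts and lists."""
    return reduce(lambda node, key: node[key], path, data)


class ResponseDescriptor:
    def __init__(self, *path):
        self.path = path

    def __get__(self, instance, objtype=None):
        if instance is None:   # accessed on the class, not an instance
            return self
        try:
            return dig(instance.json_response, *self.path)
        except (IndexError, KeyError, TypeError):
            raise AttributeError("Item not present in instance attribute "
                                 "'json_response'")


class YouTubeResponse:
    youtube_id = ResponseDescriptor('id')
    title = ResponseDescriptor('snippet', 'title')
    views = ResponseDescriptor('statistics', 'viewCount')
    description = ResponseDescriptor('snippet', 'description')

    def __init__(self, json_response):
        self.json_response = json_response


# Made-up sample; note that 'description' is deliberately missing.
sample = {
    'id': 'dQw4w9WgXcQ',
    'snippet': {'title': 'Title'},
    'statistics': {'viewCount': '111604659'},
}
response = YouTubeResponse(sample)
print(response.title, response.views)
print(hasattr(response, 'description'))
```

Because the descriptor raises AttributeError for a missing path, hasattr() and getattr() with a default behave the way they would for an ordinary attribute.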

Two types of descriptors

  1. Data Descriptor:
    • __get__
    • __set__ / __delete__
  2. Non-Data Descriptor:
    • __get__

What does that mean when I look up an attribute?

Take a step back

Looking for foo.x:

  1. Python is all dictionaries, right? So look up the item in the object dictionary: foo.__dict__['x']
Well, that's almost how it works...

Looking for foo.x

Step                                                      Example
1. Check class dict for data descriptor                   type(foo).__dict__['x']
2. Check instance dict                                    foo.__dict__['x']
3. Check class dict for non-data descriptor or attribute  type(foo).__dict__['x']
4. Not found                                              AttributeError

* Look through the MRO (Method Resolution Order) if nothing is found in type(foo)
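
The lookup order can be checked with a small experiment. This is a sketch with hypothetical class names; writing directly into foo.__dict__ is only a way to put a competing value in the instance dict without going through __set__.

```python
class DataDesc:
    """Defines __get__ and __set__, so it is a data descriptor."""

    def __get__(self, instance, owner):
        return 'from data descriptor'

    def __set__(self, instance, value):
        raise AttributeError('read-only')


class NonDataDesc:
    """Defines only __get__, so it is a non-data descriptor."""

    def __get__(self, instance, owner):
        return 'from non-data descriptor'


class Foo:
    x = DataDesc()
    y = NonDataDesc()


foo = Foo()
# Put competing values straight into the instance dict.
foo.__dict__['x'] = 'from instance dict'
foo.__dict__['y'] = 'from instance dict'

x_result = foo.x   # step 1: the data descriptor on the class wins
y_result = foo.y   # step 2: the instance dict beats the non-data descriptor

try:
    foo.z
    z_result = 'found'
except AttributeError:
    z_result = 'AttributeError'   # step 4: nothing found anywhere

print(x_result, y_result, z_result, sep=' | ')
```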

Data/Non-Data Descriptors

Effect of combining function and attribute call mechanics
  1. Call instance.f(x)
  2. Check type(instance).__dict__['f'] for data descriptor
  3. Check instance.__dict__['f'] for object
  4. Find non-data descriptor in type(instance).__dict__['f']
  5. Call f_descriptor.__get__(instance, owner)
  6. Call f(instance, x)
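
The same steps can be walked through by hand. In this sketch Greeter is a hypothetical class; the point is that a plain function stored on a class is itself a non-data descriptor.

```python
class Greeter:
    def f(self, x):
        return (self, x)


instance = Greeter()

# Plain functions define __get__ but not __set__: non-data descriptors.
func = type(instance).__dict__['f']
has_get = hasattr(func, '__get__')
has_set = hasattr(func, '__set__')

# Step 5: __get__ returns a method bound to the instance...
bound = func.__get__(instance, type(instance))

# Step 6: ...and calling it is the same as calling f(instance, x).
via_descriptor = bound(10)
via_attribute = instance.f(10)
via_function = func(instance, 10)
print(via_descriptor == via_attribute == via_function)
```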

@property

A @property transforms your code from this:
class Foo(object):
    def __init__(self):
        self.the_answer = 42

    @property
    def bar(self):
        return self.the_answer
to this:
class Foo(object):
    def __init__(self):
        self.the_answer = 42

    def _bar(self):
        return self.the_answer

    bar = property(fget=_bar, fset=None, fdel=None, doc=None)

@property

which is used like this:
foo = Foo()
print(foo.bar)
# 42
foo.bar = 24
# AttributeError: can't set attribute

Remember, property is always a data descriptor!
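
A quick way to convince yourself: the sketch below checks that property defines all three protocol methods even when no setter is given, and that a value planted in the instance dict cannot shadow it.

```python
class Foo(object):
    def __init__(self):
        self.the_answer = 42

    @property
    def bar(self):
        return self.the_answer


# property has __get__, __set__ and __delete__, with or without a setter.
methods = [hasattr(property, name)
           for name in ('__get__', '__set__', '__delete__')]

foo = Foo()
foo.__dict__['bar'] = 24   # plant a competing value in the instance dict
shadowed = foo.bar         # the property (a data descriptor) still wins

try:
    foo.bar = 24
    outcome = 'assigned'
except AttributeError:
    outcome = 'AttributeError'

print(methods, shadowed, outcome)
```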

@classmethod

A @classmethod transforms your code from this:
class Foo(object):

    @classmethod
    def bar(cls):
        return 42
to this:
class Foo(object):
    def _bar(cls):
        return 42

    bar = classmethod(_bar)

@classmethod

which can be used like this:
foo = Foo()
print(foo.bar())
# 42
but also like this:
print(Foo.bar())
# 42

Note that classmethod is a non-data descriptor!
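
The sketch below pokes at the classmethod object directly. It assumes nothing beyond the standard descriptor protocol; returning cls.__name__ instead of 42 is only there to show what got bound.

```python
class Foo(object):
    @classmethod
    def bar(cls):
        return cls.__name__


raw = Foo.__dict__['bar']            # the classmethod object itself
has_set = hasattr(raw, '__set__')    # no __set__, so a non-data descriptor

# __get__ binds the function to the class, with or without an instance.
from_class = raw.__get__(None, Foo)()
from_instance = raw.__get__(Foo(), Foo)()

foo = Foo()
foo.__dict__['bar'] = 'shadowed'     # instance dict beats a non-data descriptor
shadow_result = foo.bar

print(has_set, from_class, from_instance, shadow_result)
```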

ORMs in Python

Object Relational Mapper
from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=255, blank=True)
    birthday = models.DateField()

ORMs in Python

Example:

import datetime
p = Person(name='Laura', birthday=datetime.datetime.now())
p.save()

p.name
# 'Laura'

ORMs in Python

And then the magic part happens:

p.name = 'Super crazy long name definitely more than 255 chars...'
p.full_clean()  # model validation runs here; save() alone does not run validators
# ValidationError
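
To show how a descriptor can enforce a rule like max_length, here is a minimal sketch. It is not how Django implements its fields: MaxLengthField is a hypothetical class, it validates at assignment time, and it raises ValueError rather than Django's ValidationError. It relies on __set_name__, which needs Python 3.6 or later.

```python
class MaxLengthField:
    """Hypothetical data descriptor that rejects over-long strings."""

    def __init__(self, max_length):
        self.max_length = max_length

    def __set_name__(self, owner, name):
        # Called when the owning class is created; remembers the attribute name.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, '')

    def __set__(self, instance, value):
        if len(value) > self.max_length:
            raise ValueError('%s is longer than %d characters'
                             % (self.name, self.max_length))
        instance.__dict__[self.name] = value


class Person:
    name = MaxLengthField(max_length=255)


p = Person()
p.name = 'Laura'

try:
    p.name = 'x' * 256      # one character too long
    error = None
except ValueError as exc:
    error = str(exc)

print(p.name, '|', error)
```

The rejected assignment leaves the previously stored value untouched.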

Making Custom Descriptors

Why?

  • To check that data has the proper format
  • To automatically perform verification of a field
  • To make it easier to manipulate existing data fields

Why not?

  • Because it looks awesome
  • Job security

Other common exceptions to think about

  • ValueError
  • ValidationError
  • AttributeError
  • NotImplementedError

Other common pitfalls

Confusing class and instance variables

class BadDescriptor():
    def __init__(self):
        self.value = None  # stored on the descriptor, so shared by every instance

    def __get__(self, instance, objtype):
        return self.value

    def __set__(self, instance, value):
        self.value = value


class Foo():
    x = BadDescriptor()  # one descriptor object for the whole class

thing_one = Foo()
thing_one.x = 'hello world'
print(thing_one.x)
# 'hello world'

thing_two = Foo()
print(thing_two.x)
# 'hello world'

Infinite Recursion: Problem

class InfinitelyBadDescriptor():

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        return getattr(instance, self.name)
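
If self.name is the attribute the descriptor is assigned to, getattr(instance, self.name) calls __get__(self, instance, owner), which calls getattr(instance, self.name) again, and so on.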
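
The loop is easy to reproduce. In this sketch Foo is a hypothetical class, and the descriptor is given the same name as the attribute it is assigned to, which is what triggers the recursion. RecursionError exists in Python 3.5 and later.

```python
class InfinitelyBadDescriptor:
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        # getattr on the same name re-enters this method
        return getattr(instance, self.name)


class Foo:
    x = InfinitelyBadDescriptor('x')   # named after its own attribute


try:
    Foo().x
    outcome = 'returned'
except RecursionError:
    outcome = 'RecursionError'

print(outcome)
```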

Infinite Recursion: Solution

class InfinitelyAwesomeDescriptor():

    def __init__(self, name, default):
        self.name = name
        self.default = default

    def __get__(self, instance, owner):
        try:
            return instance.__dict__[self.name]
        except KeyError:
            return self.default

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

  1. Uses instance.__dict__[self.name], which reads the instance dict directly instead of going back through the descriptor
  2. __set__(self, instance, value) is optional
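
Storing the value in the instance dict also fixes the shared-state pitfall from earlier. A short usage sketch, with Foo as a hypothetical class and None as an arbitrary default:

```python
class InfinitelyAwesomeDescriptor:
    def __init__(self, name, default):
        self.name = name
        self.default = default

    def __get__(self, instance, owner):
        try:
            return instance.__dict__[self.name]
        except KeyError:
            return self.default

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class Foo:
    x = InfinitelyAwesomeDescriptor('x', default=None)


thing_one = Foo()
thing_two = Foo()
thing_one.x = 'hello world'

print(thing_one.x)   # value stored on thing_one only
print(thing_two.x)   # still the default
```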

Other Resources

Thanks

Slides at bit.ly/py-descriptors
@LCRcircuit