Extending Schemas¶
Pre-processing and Post-processing Methods¶
Data pre-processing and post-processing methods can be registered using the pre_load
, post_load
, pre_dump
, and post_dump
decorators.
from marshmallow import Schema, fields, pre_load
class UserSchema(Schema):
name = fields.Str()
slug = fields.Str()
@pre_load
def slugify_name(self, in_data):
in_data['slug'] = in_data['slug'].lower().strip().replace(' ', '-')
return in_data
schema = UserSchema()
result, errors = schema.load({'name': 'Steve', 'slug': 'Steve Loria '})
result['slug'] # => 'steve-loria'
Passing “many”¶
By default, pre- and post-processing methods receive one object/datum at a time, transparently handling the many
parameter passed to the schema at runtime.
In cases where your pre- and post-processing methods need to receive the input collection when many=True
, add pass_many=True
to the method decorators. The method will receive the input data (which may be a single datum or a collection) and the boolean value of many
.
Example: Enveloping¶
One common use case is to wrap data in a namespace upon serialization and unwrap the data during deserialization.
from marshmallow import Schema, fields, pre_load, post_load, post_dump
class BaseSchema(Schema):
# Custom options
__envelope__ = {
'single': None,
'many': None
}
__model__ = User
def get_envelope_key(self, many):
"""Helper to get the envelope key."""
key = self.__envelope__['many'] if many else self.__envelope__['single']
assert key is not None, "Envelope key undefined"
return key
@pre_load(pass_many=True)
def unwrap_envelope(self, data, many):
key = self.get_envelope_key(many)
return data[key]
@post_dump(pass_many=True)
def wrap_with_envelope(self, data, many):
key = self.get_envelope_key(many)
return {key: data}
@post_load
def make_object(self, data):
return self.__model__(**data)
class UserSchema(BaseSchema):
__envelope__ = {
'single': 'user',
'many': 'users',
}
__model__ = User
name = fields.Str()
email = fields.Email()
user_schema = UserSchema()
user = User('Mick', email='mick@stones.org')
user_data = user_schema.dump(user).data
# {'user': {'email': 'mick@stones.org', 'name': 'Mick'}}
users = [User('Keith', email='keith@stones.org'),
User('Charlie', email='charlie@stones.org')]
users_data = user_schema.dump(users, many=True).data
# {'users': [{'email': 'keith@stones.org', 'name': 'Keith'},
# {'email': 'charlie@stones.org', 'name': 'Charlie'}]}
user_objs = user_schema.load(users_data, many=True).data
# [<User(name='Keith Richards')>, <User(name='Charlie Watts')>]
Raising Errors in Pre-/Post-processor Methods¶
Pre- and post-processing methods may raise a ValidationError
. By default, errors will be stored on the "_schema"
key in the errors dictionary.
from marshmallow import Schema, fields, ValidationError, pre_load
class BandSchema(Schema):
name = fields.Str()
@pre_load
def unwrap_envelope(self, data):
if 'data' not in data:
raise ValidationError('Input data must have a "data" key.')
return data['data']
sch = BandSchema()
sch.load({'name': 'The Band'}).errors
# {'_schema': ['Input data must have a "data" key.']}
If you want to store and error on a different key, pass the key name as the second argument to ValidationError
.
from marshmallow import Schema, fields, ValidationError, pre_load
class BandSchema(Schema):
name = fields.Str()
@pre_load
def unwrap_envelope(self, data):
if 'data' not in data:
raise ValidationError('Input data must have a "data" key.', '_preprocessing')
return data['data']
sch = BandSchema()
sch.load({'name': 'The Band'}).errors
# {'_preprocessing': ['Input data must have a "data" key.']}
Pre-/Post-processor Invocation Order¶
In summary, the processing pipeline for deserialization is as follows:
@pre_load(pass_many=True)
methods@pre_load(pass_many=False)
methodsload(in_data, many)
(validation and deserialization)@post_load(pass_many=True)
methods@post_load(pass_many=False)
methods
The pipeline for serialization is similar, except that the “pass_many” processors are invoked after the “non-raw” processors.
@pre_dump(pass_many=False)
methods@pre_dump(pass_many=True)
methodsdump(obj, many)
(serialization)@post_dump(pass_many=False)
methods@post_dump(pass_many=True)
methods
Warning
You may register multiple processor methods on a Schema. Keep in mind, however, that the invocation order of decorated methods of the same type is not guaranteed. If you need to guarantee order of processing steps, you should put them in the same method.
from marshmallow import Schema, fields, pre_load
# YES
class MySchema(Schema):
field_a = fields.Field()
@pre_load
def preprocess(self, data):
step1_data = self.step1(data)
step2_data = self.step2(step1_data)
return step2_data
def step1(self, data):
# ...
# Depends on step1
def step2(self, data):
# ...
# NO
class MySchema(Schema):
field_a = fields.Field()
@pre_load
def step1(self, data):
# ...
# Depends on step1
@pre_load
def step2(self, data):
# ...
Handling Errors¶
By default, Schema.dump()
and Schema.load()
will return validation errors as a dictionary (unless strict
mode is enabled).
You can specify a custom error-handling function for a Schema
by overriding the handle_error
method. The method receives the ValidationError
and the original object (or input data if deserializing) to be (de)serialized.
import logging
from marshmallow import Schema, fields
class AppError(Exception):
pass
class UserSchema(Schema):
email = fields.Email()
def handle_error(self, exc, data):
"""Log and raise our custom exception when (de)serialization fails."""
logging.error(exc.messages)
raise AppError('An error occurred with input: {0}'.format(data))
schema = UserSchema()
schema.load({'email': 'invalid-email'}) # raises AppError
Schema-level Validation¶
You can register schema-level validation functions for a Schema
using the marshmallow.validates_schema
decorator. Schema-level validation errors will be stored on the _schema
key of the errors dictonary.
from marshmallow import Schema, fields, validates_schema, ValidationError
class NumberSchema(Schema):
field_a = fields.Integer()
field_b = fields.Integer()
@validates_schema
def validate_numbers(self, data):
if data['field_b'] >= data['field_a']:
raise ValidationError('field_a must be greater than field_b')
schema = NumberSchema()
result, errors = schema.load({'field_a': 1, 'field_b': 2})
errors['_schema'] # => ["field_a must be greater than field_b"]
Validating Original Input Data¶
Normally, unspecified field names are ignored by the validator. If you would like access to the original, raw input (e.g. to fail validation if an unknown field name is sent), add pass_original=True
to your call to validates_schema
.
from marshmallow import Schema, fields, validates_schema, ValidationError
class MySchema(Schema):
foo = fields.Int()
bar = fields.Int()
@validates_schema(pass_original=True)
def check_unknown_fields(self, data, original_data):
unknown = set(original_data) - set(self.fields)
if unknown:
raise ValidationError('Unknown field', unknown)
schema = MySchema()
errors = schema.load({'foo': 1, 'bar': 2, 'baz': 3, 'bu': 4}).errors
# {'baz': 'Unknown field', 'bu': 'Unknown field'}
Storing Errors on Specific Fields¶
If you want to store schema-level validation errors on a specific field, you can pass a field name (or multiple field names) to the ValidationError
.
class NumberSchema(Schema):
field_a = fields.Integer()
field_b = fields.Integer()
@validates_schema
def validate_numbers(self, data):
if data['field_b'] >= data['field_a']:
raise ValidationError(
'field_a must be greater than field_b',
'field_a'
)
schema = NumberSchema()
result, errors = schema.load({'field_a': 2, 'field_b': 1})
errors['field_a'] # => ["field_a must be greater than field_b"]
Overriding how attributes are accessed¶
By default, marshmallow uses the utils.get_value
function to pull attributes from various types of objects for serialization. This will work for most use cases.
However, if you want to specify how values are accessed from an object, you can override the get_attribute
method.
class UserDictSchema(Schema):
name = fields.Str()
email = fields.Email()
# If we know we're only serializing dictionaries, we can
# use dict.get for all input objects
def get_attribute(self, key, obj, default):
return obj.get(key, default)
Custom “class Meta” Options¶
class Meta
options are a way to configure and modify a Schema's
behavior. See the API docs
for a listing of available options.
You can add custom class Meta
options by subclassing SchemaOpts
.
Example: Enveloping, Revisited¶
Let’s build upon the example above for adding an envelope to serialized output. This time, we will allow the envelope key to be customizable with class Meta
options.
# Example outputs
{
'user': {
'name': 'Keith',
'email': 'keith@stones.com'
}
}
# List output
{
'users': [{'name': 'Keith'}, {'name': 'Mick'}]
}
First, we’ll add our namespace configuration to a custom options class.
from marshmallow import Schema, SchemaOpts
class NamespaceOpts(SchemaOpts):
"""Same as the default class Meta options, but adds "name" and
"plural_name" options for enveloping.
"""
def __init__(self, meta):
SchemaOpts.__init__(self, meta)
self.name = getattr(meta, 'name', None)
self.plural_name = getattr(meta, 'plural_name', self.name)
Then we create a custom Schema
that uses our options class.
class NamespacedSchema(Schema):
OPTIONS_CLASS = NamespaceOpts
@pre_load(pass_many=True)
def unwrap_envelope(self, data, many):
key = self.opts.plural_name if many else self.opts.name
return data[key]
@post_dump(pass_many=True)
def wrap_with_envelope(self, data, many):
key = self.opts.plural_name if many else self.opts.name
return {key: data}
Our application schemas can now inherit from our custom schema class.
class UserSchema(NamespacedSchema):
name = fields.String()
email = fields.Email()
class Meta:
name = 'user'
plural_name = 'users'
ser = UserSchema()
user = User('Keith', email='keith@stones.com')
result = ser.dump(user)
result.data # {"user": {"name": "Keith", "email": "keith@stones.com"}}
Using Context¶
The context
attribute of a Schema
is a general-purpose store for extra information that may be needed for (de)serialization. It may be used in both Schema
and Field
methods.
schema = UserSchema()
# Make current HTTP request available to
# custom fields, schema methods, schema validators, etc.
schema.context['request'] = request
schema.dump(user)