This page is available as a Jupyter notebook: tutorials/9-pydantic-validation.ipynb.

Pydantic validation

When building computational workflows with jobflow, you’re often chaining together multiple jobs that pass data between each other. Without proper validation, several problems can occur:

  1. Silent Failures: A job might produce output in an unexpected format, causing downstream jobs to fail with cryptic error messages or produce incorrect results without warning.

  2. Missing Required Data: Without validation, a job can omit a critical field from its output, and the error only appears several steps later in the workflow.

  3. Documentation Drift: Without enforced schemas, it’s unclear what data structure each job expects or produces, making workflows harder to understand and maintain.

Pydantic provides data validation and settings management based on Python type annotations. With it, you can define explicit schemas for job inputs and outputs using Python type hints; catch errors early at job boundaries rather than deep in your workflow; auto-generate documentation of your data structures; ensure data consistency across complex, multi-step workflows; and validate at runtime with clear, informative error messages. For example, atomate2 uses Pydantic to validate the outputs of computational tasks.
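As a small illustration of the documentation point, the sketch below prints the JSON schema that Pydantic generates from a model. It uses Pydantic alone and assumes Pydantic v2 (for model_json_schema). The EnergyOutput model and its fields are hypothetical and are not part of jobflow or atomate2.

```python
from pydantic import BaseModel, Field


class EnergyOutput(BaseModel):
    """Hypothetical output document, used only for illustration."""

    energy: float = Field(..., description="Total energy in eV")
    converged: bool = Field(..., description="Whether the calculation converged")


# Build a JSON-schema dictionary describing the model.
schema = EnergyOutput.model_json_schema()

# Fields without a default are listed as required.
print(schema["required"])

# Each property carries its JSON type and the description given in Field().
for name, spec in schema["properties"].items():
    print(name, spec["type"], spec["description"])
```

The field descriptions written once in the model are reused in the generated schema, which is what keeps the documentation from drifting away from the code.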

In the example below, we define a simple Pydantic model to validate that the output of a job is a float.

[2]:
from pydantic import BaseModel, Field

from jobflow import job, run_locally
from jobflow.core.job import apply_schema


class FloatValidator(BaseModel):
    result: float = Field(..., description="The resulting float value")


@job
def add(a, b):
    return FloatValidator(result=a + b)


job_1 = add(1, 2)
response = run_locally(job_1)

print(response)
2025-10-27 12:32:49,377 INFO Started executing jobs locally
2025-10-27 12:32:49,380 INFO Starting job - add (8ae64cc3-2da4-4914-b967-3c45e808d7d5)
2025-10-27 12:32:49,381 INFO Finished job - add (8ae64cc3-2da4-4914-b967-3c45e808d7d5)
2025-10-27 12:32:49,383 INFO Finished executing jobs locally
{'8ae64cc3-2da4-4914-b967-3c45e808d7d5': {1: Response(output=FloatValidator(result=3.0), detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False, job_dir=PosixPath('/X/jobflow/docs/tutorials'))}}

Equivalently, we can pass the schema through the output_schema parameter of the job decorator. In that case, the job can return its results as a dictionary, which will be validated against the schema.

[18]:
@job(output_schema=FloatValidator)
def add(a, b):
    return {"result": a + b}


# Or equivalently:


@job
def _add(a, b):
    return apply_schema({"result": a + b}, FloatValidator)
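
To get a feel for what validating a dictionary against a schema involves, the sketch below uses Pydantic directly, without jobflow. It assumes Pydantic v2 (for model_validate) and does not claim to reproduce how jobflow applies the schema internally.

```python
from pydantic import BaseModel, Field


class FloatValidator(BaseModel):
    result: float = Field(..., description="The resulting float value")


# A plain dictionary, like the one returned by the job above.
raw_output = {"result": 1 + 2}

# Validate the dictionary and build a model instance from it.
validated = FloatValidator.model_validate(raw_output)

# The integer 3 has been converted to the declared float type.
print(type(validated.result).__name__, validated.result)
```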

If the output does not conform to the schema, an error will be raised:

[8]:
@job
def invalid_add(a, b):
    return FloatValidator(result={"invalid_result": a + b})


invalid_job = invalid_add(1, 2)

response_invalid = run_locally(invalid_job)
2025-10-20 14:20:52,616 INFO Started executing jobs locally
2025-10-20 14:20:52,618 INFO Starting job - invalid_add (a22f7c1f-a80e-404f-9192-302487b4eaf7)
2025-10-20 14:20:52,621 INFO invalid_add failed with exception:
Traceback (most recent call last):
  File "/X/jobflow/src/jobflow/managers/local.py", line 117, in _run_job
    response = job.run(store=store)
  File "/X/jobflow/src/jobflow/core/job.py", line 604, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "/var/folders/zh/3748r38115qb94_pvwg0cc6m0000gn/T/ipykernel_33718/2862673370.py", line 3, in invalid_add
    return FloatValidator(result={"invalid_result": a + b})
  File "/opt/homebrew/Caskroom/miniforge/base/envs/jobflow/lib/python3.14/site-packages/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for FloatValidator
result
  Input should be a valid number [type=float_type, input_value={'invalid_result': 3}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/float_type

2025-10-20 14:20:52,622 INFO Finished executing jobs locally
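
A ValidationError can also be inspected programmatically instead of being read from the log. The sketch below uses Pydantic alone (v2 assumed) and repeats the invalid value from the example above, then reads the structured details from the exception's errors() method.

```python
from pydantic import BaseModel, Field, ValidationError


class FloatValidator(BaseModel):
    result: float = Field(..., description="The resulting float value")


errors = []
try:
    # A dictionary is not a valid value for a float field.
    FloatValidator(result={"invalid_result": 3})
except ValidationError as exc:
    # Each entry describes one failing field.
    errors = exc.errors()

for err in errors:
    print(err["loc"], err["type"], err["msg"])
```

The loc entry names the offending field, which is useful when a schema has many fields and only one of them is wrong.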

Similarly, it is possible to define input schemas using Pydantic models to validate the inputs of your jobs. This ensures that the data being processed meets the expected format and constraints.

[ ]:
class InputValidator(BaseModel):
    a: float = Field(..., description="First float value")
    b: float = Field(..., description="Second float value")


@job
def validated_add(inputs: InputValidator):
    return FloatValidator(result=inputs.a + inputs.b)


validated_job = validated_add(InputValidator(a=3.0, b=4.0))
validated_response = run_locally(validated_job)

print(validated_response)
2025-10-19 14:07:28,934 INFO Started executing jobs locally
2025-10-19 14:07:28,935 INFO Starting job - validated_add (8fd24cd1-82ff-42ac-8bcf-d75504818c71)
2025-10-19 14:07:28,936 INFO Finished job - validated_add (8fd24cd1-82ff-42ac-8bcf-d75504818c71)
2025-10-19 14:07:28,937 INFO Finished executing jobs locally
{'8fd24cd1-82ff-42ac-8bcf-d75504818c71': {1: Response(output=FloatValidator(result=7.0), detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False, job_dir=PosixPath('/X/docs/tutorials'))}}

If the input does not conform to the schema, a validation error is raised as soon as the InputValidator is constructed, before the job is created or executed.

[15]:
input_invalid_job = validated_add(InputValidator(a="a", b=4.0))

run_locally(input_invalid_job)
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[15], line 1
----> 1 input_invalid_job = validated_add(InputValidator(a="a", b=4.0))
      3 run_locally(input_invalid_job)

File /opt/homebrew/Caskroom/miniforge/base/envs/jobflow/lib/python3.14/site-packages/pydantic/main.py:250, in BaseModel.__init__(self, **data)
    248 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    249 __tracebackhide__ = True
--> 250 validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
    251 if self is not validated_self:
    252     warnings.warn(
    253         'A custom validator is returning a value other than `self`.\n'
    254         "Returning anything other than `self` from a top level model validator isn't supported when validating via `__init__`.\n"
    255         'See the `model_validator` docs (https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more details.',
    256         stacklevel=2,
    257     )

ValidationError: 1 validation error for InputValidator
a
  Input should be a valid number, unable to parse string as a number [type=float_parsing, input_value='a', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/float_parsing

By default, Pydantic models ignore extra fields. You can configure a model to forbid them by setting extra='forbid', so that only the defined fields are accepted and any additional field raises a validation error.

[21]:
class InputValidator(BaseModel):
    a: float = Field(..., description="First float value")
    b: float = Field(..., description="Second float value")


class OutputValidator(BaseModel, extra="forbid"):
    result: float = Field(..., description="The resulting float value")


@job
def validated_add(inputs: InputValidator):
    return OutputValidator(result=inputs.a + inputs.b)


validated_job = validated_add(InputValidator(a=3.0, b=4.0, c=5.0, d=6.0))

validated_response = run_locally(validated_job)

print(validated_response)
2025-10-19 14:36:24,898 INFO Started executing jobs locally
2025-10-19 14:36:24,900 INFO Starting job - validated_add (2b361380-3a7b-4f26-8c6e-c76c533fe66c)
2025-10-19 14:36:24,903 INFO Finished job - validated_add (2b361380-3a7b-4f26-8c6e-c76c533fe66c)
2025-10-19 14:36:24,903 INFO Finished executing jobs locally
{'2b361380-3a7b-4f26-8c6e-c76c533fe66c': {1: Response(output=OutputValidator(result=7.0), detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False, job_dir=PosixPath('/X/docs/tutorials'))}}

In the code above, InputValidator uses the default behaviour, so the extra input fields c and d are ignored without raising an error.
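
What "ignored" means can be checked with Pydantic alone. In the sketch below (Pydantic v2 assumed, for model_dump), the extra values are dropped and do not appear on the resulting model.

```python
from pydantic import BaseModel, Field


class InputValidator(BaseModel):
    a: float = Field(..., description="First float value")
    b: float = Field(..., description="Second float value")


# c and d are not declared on the model, so they are silently discarded.
inputs = InputValidator(a=3.0, b=4.0, c=5.0, d=6.0)

dumped = inputs.model_dump()
print(dumped)
print(hasattr(inputs, "c"))
```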

If a job instead returned an OutputValidator with an additional field, an error would be raised, since extra='forbid' has been specified for OutputValidator.
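
The effect of extra='forbid' can be sketched with Pydantic alone, without running a job. The field name unexpected is hypothetical and stands in for any field that is not declared on the model.

```python
from pydantic import BaseModel, Field, ValidationError


class OutputValidator(BaseModel, extra="forbid"):
    result: float = Field(..., description="The resulting float value")


error_type = None
error_loc = None
try:
    # "unexpected" is not declared on the model, so it is rejected.
    OutputValidator(result=7.0, unexpected=1.0)
except ValidationError as exc:
    first = exc.errors()[0]
    error_type = first["type"]
    error_loc = first["loc"]

print(error_type, error_loc)
```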

Finally, if a required field is missing when a Pydantic model is constructed, a validation error will also be raised.

[ ]:
class MissingOutputValidator(BaseModel):
    result: float = Field(..., description="The resulting float value")
    extra_field: float = Field(..., description="An extra required float value")


@job
def invalid_add(a, b):
    return MissingOutputValidator(result=a + b)


invalid_job = invalid_add(1, 2)
run_locally(invalid_job)
2025-10-19 15:02:00,197 INFO Started executing jobs locally
2025-10-19 15:02:00,199 INFO Starting job - invalid_add (d03d4dd6-a66e-4551-862d-ca6f01b2032c)
2025-10-19 15:02:00,200 INFO invalid_add failed with exception:
Traceback (most recent call last):
  File "/X/src/jobflow/managers/local.py", line 117, in _run_job
    response = job.run(store=store)
  File "/X/src/jobflow/core/job.py", line 604, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "/var/folders/zh/3748r38115qb94_pvwg0cc6m0000gn/T/ipykernel_34651/2447134739.py", line 7, in invalid_add
    return MissingOutputValidator(result=a + b)
  File "/X/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for MissingOutputValidator
extra_field
  Field required [type=missing, input_value={'result': 3}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing

2025-10-19 15:02:00,201 INFO Finished executing jobs locally
{}
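
If a field is genuinely optional, one option is to give it a default so that omitting it is no longer an error. The sketch below uses Pydantic alone; OptionalOutputValidator is a hypothetical variant of the model above, not something defined by jobflow.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class MissingOutputValidator(BaseModel):
    result: float = Field(..., description="The resulting float value")
    extra_field: float = Field(..., description="An extra required float value")


class OptionalOutputValidator(BaseModel):
    """Hypothetical variant in which extra_field may be omitted."""

    result: float = Field(..., description="The resulting float value")
    extra_field: Optional[float] = Field(None, description="An optional float value")


# Omitting a required field fails validation.
missing_type = None
try:
    MissingOutputValidator(result=3)
except ValidationError as exc:
    missing_type = exc.errors()[0]["type"]

# With a default, the same input is accepted and the default is used.
relaxed = OptionalOutputValidator(result=3)
print(missing_type, relaxed.extra_field)
```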