This page is available as a Jupyter notebook: tutorials/3-defining-jobs.ipynb.

Defining jobs in jobflow

In this tutorial, you will:

  • Learn about the job decorator.

  • Understand the structure of the Job object.

  • Set the configuration settings of a job.

  • Use the Response object.

  • Learn tips for writing job functions.

The purpose of this tutorial is to delve into the basic functionality of jobs and gain a feeling for what is possible. Later tutorials will describe how to employ jobs in complex workflows.

Creating job objects

The building blocks of jobflow are Job objects. A Job is a delayed call to a Python function whose output is stored in a database. The easiest way to create a job is with the @job decorator, which can be applied to any function, including functions with optional parameters.

[2]:
from jobflow import job


@job
def add(a, b, c=2):
    return a + b + c

Calling the decorated add function does not run the addition. Instead, every call returns a Job object.

[3]:
add_first = add(1, 2, c=5)
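
To make the idea of a "delayed call" concrete, here is a minimal toy sketch using only the standard library. It is not jobflow's implementation: the names DelayedCall, delayed, toy_add and run are hypothetical and exist only for illustration. The sketch shows a decorator that captures a function and its arguments and defers execution until explicitly requested.

```python
import functools
import uuid


class DelayedCall:
    """Toy stand-in for a job: stores a function and its arguments for later."""

    def __init__(self, function, args, kwargs):
        self.function = function
        self.args = args
        self.kwargs = kwargs
        self.uuid = str(uuid.uuid4())  # unique identifier for this call

    def run(self):
        # The wrapped function is only executed here, not at call time.
        return self.function(*self.args, **self.kwargs)


def delayed(function):
    """Hypothetical decorator: calling the function returns a DelayedCall."""

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        return DelayedCall(function, args, kwargs)

    return wrapper


@delayed
def toy_add(a, b, c=2):
    return a + b + c


pending = toy_add(1, 2, c=5)  # nothing is computed yet
result = pending.run()  # the addition happens now: 1 + 2 + 5
```

In this toy model, pending plays the role that add_first plays above: it describes a call without performing it.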

Each job is assigned a unique identifier (UUID).

[4]:
add_first.uuid
[4]:
'01eb0d89-3817-4bac-9897-9fb0ebeea2c7'

Jobs also have an index. This tracks the number of times the job has been “replaced” (replacing is covered in detail in the Dynamic and nested Flows tutorial).

[5]:
add_first.index
[5]:
1

Jobs have outputs that can be accessed using the output attribute. As the job has not yet been executed, the output is currently a reference to the future output.

[6]:
add_first.output
[6]:
OutputReference(01eb0d89-3817-4bac-9897-9fb0ebeea2c7)

The output of a job can be used as the input to another job.

[7]:
add_second = add(add_first.output, 3)

The output does not have to be passed as an argument on its own. It can also be included in a list or a dictionary.

[8]:
@job
def sum_numbers(numbers):
    return sum(numbers)


sum_job = sum_numbers([add_first.output, add_second.output])
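
One way to picture how references inside containers could be filled in before a function runs is the toy sketch below. It is an assumption-laden illustration, not jobflow's actual resolution mechanism: Ref, resolve and the results dictionary are hypothetical names. The values 8 and 13 follow from the add definition above (1 + 2 + 5 and 8 + 3 + 2).

```python
class Ref:
    """Toy placeholder for an output that has not been computed yet."""

    def __init__(self, key):
        self.key = key


def resolve(value, results):
    """Recursively replace Ref placeholders in lists and dicts with results."""
    if isinstance(value, Ref):
        return results[value.key]
    if isinstance(value, list):
        return [resolve(item, results) for item in value]
    if isinstance(value, dict):
        return {key: resolve(item, results) for key, item in value.items()}
    return value  # plain values pass through unchanged


# Hypothetical finished outputs, keyed by a label instead of a real UUID.
results = {"add_first": 8, "add_second": 13}

numbers = resolve([Ref("add_first"), Ref("add_second")], results)
total = sum(numbers)

mixed = resolve({"x": Ref("add_first"), "y": 3}, results)
```

Here numbers becomes a plain list of integers that a function like sum_numbers could consume, and mixed shows the same substitution inside a dictionary.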

Running jobs

Here, we will demonstrate how to run a simple job locally, which can be useful for testing purposes.

[9]:
from jobflow.managers.local import run_locally

response = run_locally(add(1, 2))
2023-06-08 10:09:44,084 INFO Started executing jobs locally
2023-06-08 10:09:44,219 INFO Starting job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished executing jobs locally

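The returned dictionary is keyed by job UUID. Each UUID maps to a second dictionary keyed by job index, whose values are the Response objects holding the job outputs.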

[10]:
print(response)
{'f43ad355-31b1-4e2c-a3fc-7159f180d662': {1: Response(output=5, detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False)}}
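
Given the nested structure printed above, the output value can be reached by indexing first by UUID, then by job index, and finally reading the output attribute. The sketch below uses a SimpleNamespace as a stand-in for the Response object so that it runs without jobflow; get_output is a hypothetical helper, not part of the library.

```python
from types import SimpleNamespace

# Stand-in mirroring the structure printed above. A real run would hold
# jobflow Response objects in place of SimpleNamespace.
job_uuid = "f43ad355-31b1-4e2c-a3fc-7159f180d662"
response = {
    job_uuid: {
        1: SimpleNamespace(output=5, stop_children=False, stop_jobflow=False)
    }
}


def get_output(responses, uuid, index=1):
    """Return the output stored for one job UUID and job index."""
    return responses[uuid][index].output


value = get_output(response, job_uuid)
```

With a dictionary returned by run_locally, the same indexing pattern should apply, using the uuid attribute of the job that was run as the first key.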