This page is available as a Jupyter notebook: tutorials/3-defining-jobs.ipynb.

Defining jobs in jobflow

In this tutorial, you will:

  • Learn about the job decorator.

  • Understand the structure of the Job object.

  • Set the configuration settings of a job.

  • Use the Response object.

  • Learn tips for writing job functions.

The purpose of this tutorial is to delve into the basic functionality of jobs and gain a feeling for what is possible. Later tutorials will describe how to employ jobs in complex workflows.

Creating job objects

The building blocks of jobflow are Job objects. Jobs are delayed calls to Python functions whose outputs are stored in a database. The easiest way to create a job is with the @job decorator, which can be applied to any function, including those with optional parameters.

[2]:
from jobflow import job


@job
def add(a, b, c=2):
    return a + b + c

Calling the decorated add function does not execute it; instead, every call returns a Job object.

[3]:
add_first = add(1, 2, c=5)
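
To build some intuition for what a "delayed call" means, the following is a minimal pure-Python sketch of the idea. It is not how jobflow is implemented: `ToyJob` and `toy_job` are hypothetical names invented for this illustration, and the sketch ignores everything jobflow does beyond recording the call (stores, references, serialization).

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Any, Callable
from uuid import uuid4


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job: a recorded call that has not run yet."""

    function: Callable[..., Any]
    args: tuple
    kwargs: dict
    # Each recorded call gets its own identifier, generated at creation time.
    uuid: str = field(default_factory=lambda: str(uuid4()))

    def run(self):
        # Only at this point is the wrapped function actually called.
        return self.function(*self.args, **self.kwargs)


def toy_job(function):
    """Hypothetical decorator: calling the function records the call instead."""

    @wraps(function)
    def wrapper(*args, **kwargs):
        return ToyJob(function, args, kwargs)

    return wrapper


@toy_job
def add(a, b, c=2):
    return a + b + c


add_first = add(1, 2, c=5)  # nothing has been added yet
print(type(add_first).__name__)  # ToyJob
print(add_first.run())  # 8
```

The point of the sketch is only that the arguments are captured when the decorated function is called and the arithmetic happens later, when something decides to run the recorded call.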

Each job is assigned a unique identifier (UUID).

[4]:
add_first.uuid
[4]:
'01eb0d89-3817-4bac-9897-9fb0ebeea2c7'

Jobs also have an index. This tracks the number of times the job has been “replaced” (replacing is covered in detail in the Dynamic and nested Flows tutorial).

[5]:
add_first.index
[5]:
1

Jobs have outputs that can be accessed using the output attribute. As the job has not yet been executed, the output is currently a reference to the future output.

[6]:
add_first.output
[6]:
OutputReference(01eb0d89-3817-4bac-9897-9fb0ebeea2c7)

The output of a job can be used as the input to another job.

[7]:
add_second = add(add_first.output, 3)

The output does not have to be passed as an argument on its own; it can also be included in a list or a dictionary.

[8]:
@job
def sum_numbers(numbers):
    return sum(numbers)


sum_job = sum_numbers([add_first.output, add_second.output])
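
One way to picture how references nested inside lists and dictionaries can later be swapped for real values is a recursive walk over the arguments. The sketch below is a simplified illustration in plain Python, not jobflow's own resolution logic: `ToyReference`, `resolve`, and the `"job-a"`/`"job-b"` identifiers are hypothetical, and the `results` dictionary stands in for wherever finished outputs would be stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyReference:
    """Hypothetical placeholder for the future output of a job."""

    uuid: str


def resolve(value, results):
    """Replace every ToyReference in a nested structure with its stored result."""
    if isinstance(value, ToyReference):
        return results[value.uuid]
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with each item resolved.
        return type(value)(resolve(item, results) for item in value)
    if isinstance(value, dict):
        return {key: resolve(item, results) for key, item in value.items()}
    # Plain values (numbers, strings, ...) pass through unchanged.
    return value


# Assumed finished outputs: 1 + 2 + 5 = 8 and 8 + 3 + 2 = 13.
results = {"job-a": 8, "job-b": 13}
numbers = [ToyReference("job-a"), ToyReference("job-b")]

print(sum(resolve(numbers, results)))  # 21
```

The values 8 and 13 mirror what add_first and add_second would produce once executed, so the summed result corresponds to what sum_job would compute.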

Running Jobs

Here, we will demonstrate how to run a simple job locally, which can be useful for testing purposes.

[9]:
from jobflow.managers.local import run_locally

response = run_locally(add(1, 2))
2023-06-08 10:09:44,084 INFO Started executing jobs locally
2023-06-08 10:09:44,219 INFO Starting job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished executing jobs locally

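The returned value is a dictionary keyed by job UUID. Each entry is itself a dictionary that maps the job index to the Response object produced by that run of the job.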

[10]:
print(response)
{'f43ad355-31b1-4e2c-a3fc-7159f180d662': {1: Response(output=5, detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False)}}
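
Because the result is nested as UUID, then index, then Response, pulling out the plain output values takes two lookups. The helper below is a small sketch of that unpacking, written against the shape printed above. `latest_outputs` is a hypothetical helper, and `ToyResponse` is a stand-in with only an `output` field so that the example runs without jobflow installed; with a real result, the same attribute access on the Response objects should apply.

```python
from collections import namedtuple

# Hypothetical stand-in for the Response objects shown above;
# only the `output` field is modelled here.
ToyResponse = namedtuple("ToyResponse", ["output"])


def latest_outputs(responses):
    """Map each job UUID to the output of its highest-index response."""
    return {
        job_uuid: by_index[max(by_index)].output
        for job_uuid, by_index in responses.items()
    }


# Same shape as the printed result: {uuid: {index: response}}.
responses = {
    "f43ad355-31b1-4e2c-a3fc-7159f180d662": {1: ToyResponse(output=5)},
}

print(latest_outputs(responses))
```

Taking the highest index is a design choice for this sketch: if a job has been replaced, the entry with the largest index is assumed to be the most recent one.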