This page is available as a Jupyter notebook: tutorials/3-defining-jobs.ipynb.
Defining jobs in jobflow
In this tutorial, you will:
- Learn about the job decorator.
- Understand the structure of the Job object.
- Set the configuration settings of a job.
- Use the Response object.
- Learn tips for writing job functions.
The purpose of this tutorial is to delve into the basic functionality of jobs and gain a feeling for what is possible. Later tutorials will describe how to employ jobs in complex workflows.
Creating job objects
The building blocks of jobflow are Job objects. Jobs are delayed calls to Python functions whose outputs are stored in a database. The easiest way to create a job is with the @job decorator, which can be applied to any function, including those with optional parameters.
[2]:
from jobflow import job


@job
def add(a, b, c=2):
    return a + b + c
Any call to the add function will return a Job object.
[3]:
add_first = add(1, 2, c=5)
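To make the idea of a delayed call concrete, here is a minimal standard-library sketch. It is not jobflow's implementation: the names ToyJob, toy_job, and toy_add are hypothetical and exist only for illustration. The point is that calling the decorated function records the function and its arguments instead of executing it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable
from uuid import uuid4


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job: a function plus its captured arguments."""

    function: Callable[..., Any]
    args: tuple
    kwargs: dict
    uuid: str = field(default_factory=lambda: str(uuid4()))

    def run(self) -> Any:
        # The wrapped function is only called here, not when the job is created.
        return self.function(*self.args, **self.kwargs)


def toy_job(function):
    """Hypothetical decorator: calling the decorated function builds a ToyJob."""

    def wrapper(*args, **kwargs):
        return ToyJob(function, args, kwargs)

    return wrapper


@toy_job
def toy_add(a, b, c=2):
    return a + b + c


delayed = toy_add(1, 2, c=5)  # nothing is computed yet
result = delayed.run()        # the addition happens only now
print(result)  # 8
```

In this sketch the caller triggers execution explicitly with run(); in jobflow, execution is left to a manager, as the section on running jobs shows.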
Each job is assigned a unique identifier (UUID).
[4]:
add_first.uuid
[4]:
'01eb0d89-3817-4bac-9897-9fb0ebeea2c7'
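As a small aside, the identifier shown above can be inspected with Python's standard uuid module. The sketch below only parses the string printed in this tutorial; it makes no claim about how jobflow generates identifiers internally.

```python
from uuid import UUID

# The identifier printed above, copied verbatim.
job_uuid = "01eb0d89-3817-4bac-9897-9fb0ebeea2c7"

parsed = UUID(job_uuid)
print(parsed.version)           # 4, i.e. a randomly generated UUID
print(str(parsed) == job_uuid)  # True: the string is already in canonical form
```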
Jobs also have an index. This tracks the number of times the job has been “replaced” (replacing is covered in detail in the Dynamic and nested Flows tutorial).
[5]:
add_first.index
[5]:
1
Jobs have outputs that can be accessed using the output attribute. As the job has not yet been executed, the output is currently a reference to the future output.
[6]:
add_first.output
[6]:
OutputReference(01eb0d89-3817-4bac-9897-9fb0ebeea2c7)
The output of a job can be used as the input to another job.
[7]:
add_second = add(add_first.output, 3)
The output does not have to be an argument on its own; it can also be included in a list or a dictionary.
[8]:
@job
def sum_numbers(numbers):
    return sum(numbers)


sum_job = sum_numbers([add_first.output, add_second.output])
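One way to picture how references nested inside lists or dictionaries can be filled in before a function runs is the following standard-library sketch. ToyReference, resolve, and the keys "job-a" and "job-b" are hypothetical names for illustration, and this is an assumption about the general technique rather than a description of jobflow's internals. The stored values 8 and 13 are what add(1, 2, c=5) and add(8, 3) from the cells above would evaluate to.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToyReference:
    """Hypothetical placeholder for the output of a job that has not run yet."""

    uuid: str


def resolve(value: Any, outputs: dict) -> Any:
    """Recursively replace ToyReference objects with stored outputs."""
    if isinstance(value, ToyReference):
        return outputs[value.uuid]
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with each item resolved.
        return type(value)(resolve(item, outputs) for item in value)
    if isinstance(value, dict):
        return {key: resolve(item, outputs) for key, item in value.items()}
    return value  # plain values pass through unchanged


# Pretend two earlier jobs have finished and stored these outputs.
stored_outputs = {"job-a": 8, "job-b": 13}

numbers = [ToyReference("job-a"), ToyReference("job-b")]
resolved = resolve(numbers, stored_outputs)
total = sum(resolved)
print(resolved, total)  # [8, 13] 21
```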
Running Jobs
Here, we will demonstrate how to run a simple job locally, which can be useful for testing purposes.
[9]:
from jobflow.managers.local import run_locally
response = run_locally(add(1, 2))
2023-06-08 10:09:44,084 INFO Started executing jobs locally
2023-06-08 10:09:44,219 INFO Starting job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished executing jobs locally
The returned dictionary maps each job's UUID to its responses, which are keyed by job index.
[10]:
print(response)
{'f43ad355-31b1-4e2c-a3fc-7159f180d662': {1: Response(output=5, detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False)}}
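The nested shape of this result, {uuid: {index: response}}, can be navigated with ordinary dictionary lookups. The sketch below rebuilds the same shape with standard-library objects so that it runs without jobflow; SimpleNamespace is only a stand-in for the Response object, and treating the highest index as the most recent response is an assumption based on the description of the job index above.

```python
from types import SimpleNamespace

# Stand-in with the same shape as the printed result: {uuid: {index: response}}.
job_uuid = "f43ad355-31b1-4e2c-a3fc-7159f180d662"
responses = {job_uuid: {1: SimpleNamespace(output=5)}}

by_index = responses[job_uuid]  # all responses recorded for this job
latest_index = max(by_index)    # assumed: the highest index is the most recent
output = by_index[latest_index].output
print(output)  # 5
```

With the real result from the cell above, the same pattern applies to the dictionary returned by run_locally.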