This page is available as a Jupyter notebook: tutorials/3-defining-jobs.ipynb.
Defining jobs in jobflow
In this tutorial, you will:
- Learn about the job decorator.
- Understand the structure of the Job object.
- Set the configuration settings of a job.
- Use the Response object.
- Learn tips for writing job functions.
The purpose of this tutorial is to delve into the basic functionality of jobs and gain a feeling for what is possible. Later tutorials will describe how to employ jobs in complex workflows.
Creating job objects
The building blocks of jobflow are Job objects. Jobs are delayed calls to Python functions whose outputs are stored in a database. The easiest way to create a job is with the @job decorator, which can be applied to any function, including those with optional parameters.
[2]:
from jobflow import job


@job
def add(a, b, c=2):
    return a + b + c
Any call to the add function will return a Job object.
[3]:
add_first = add(1, 2, c=5)
Each job is assigned a unique identifier (UUID).
[4]:
add_first.uuid
[4]:
'01eb0d89-3817-4bac-9897-9fb0ebeea2c7'
Jobs also have an index. This tracks the number of times the job has been “replaced” (replacing is covered in detail in the Dynamic and nested Flows tutorial).
[5]:
add_first.index
[5]:
1
Jobs have outputs that can be accessed using the output attribute. As the job has not yet been executed, the output is currently a reference to the future output.
[6]:
add_first.output
[6]:
OutputReference(01eb0d89-3817-4bac-9897-9fb0ebeea2c7)
The output of a job can be used as the input to another job.
[7]:
add_second = add(add_first.output, 3)
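To build intuition for how a delayed call and a reference to a future output can fit together, here is a toy sketch using only the standard library. It is not jobflow's implementation: ToyJob, ToyReference, toy_job, resolve and run_in_order are hypothetical names invented for this illustration, and the sketch assumes jobs are simply executed in the order given.

```python
import functools
from dataclasses import dataclass, field
from typing import Any, Callable
from uuid import uuid4


@dataclass(frozen=True)
class ToyReference:
    """Placeholder for the future output of a ToyJob (hypothetical class)."""

    job_uuid: str


@dataclass
class ToyJob:
    """Stores a function call without executing it (hypothetical class)."""

    function: Callable[..., Any]
    args: tuple
    kwargs: dict
    uuid: str = field(default_factory=lambda: str(uuid4()))

    @property
    def output(self) -> ToyReference:
        # Nothing has run yet, so hand back a reference keyed by our UUID.
        return ToyReference(self.uuid)


def toy_job(function):
    """Decorator: calling the wrapped function returns a ToyJob instead of running it."""

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        return ToyJob(function, args, kwargs)

    return wrapper


def resolve(value, results):
    """Recursively replace ToyReference objects with stored results."""
    if isinstance(value, ToyReference):
        return results[value.job_uuid]
    if isinstance(value, (list, tuple)):
        return type(value)(resolve(v, results) for v in value)
    if isinstance(value, dict):
        return {k: resolve(v, results) for k, v in value.items()}
    return value


def run_in_order(jobs):
    """Execute jobs sequentially, resolving references to earlier outputs."""
    results = {}
    for j in jobs:
        args = resolve(j.args, results)
        kwargs = resolve(j.kwargs, results)
        results[j.uuid] = j.function(*args, **kwargs)
    return results


@toy_job
def add(a, b, c=2):
    return a + b + c


first = add(1, 2, c=5)         # nothing is computed yet
second = add(first.output, 3)  # depends on the future output of `first`
results = run_in_order([first, second])
print(results[first.uuid], results[second.uuid])
```

The key design point is that the reference only carries an identifier, so a job can be wired to another job's output before either has been executed.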
The output does not have to be passed as an argument on its own; it can also be nested inside a list or a dictionary.
[8]:
@job
def sum_numbers(numbers):
    return sum(numbers)


sum_job = sum_numbers([add_first.output, add_second.output])
Running jobs
Here, we will demonstrate how to run a simple job locally, which can be useful for testing purposes.
[9]:
from jobflow.managers.local import run_locally
response = run_locally(add(1, 2))
2023-06-08 10:09:44,084 INFO Started executing jobs locally
2023-06-08 10:09:44,219 INFO Starting job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished job - add (f43ad355-31b1-4e2c-a3fc-7159f180d662)
2023-06-08 10:09:44,220 INFO Finished executing jobs locally
The returned dictionary is keyed by job UUID. Each value maps the job index to the Response object that holds that job's output.
[10]:
print(response)
{'f43ad355-31b1-4e2c-a3fc-7159f180d662': {1: Response(output=5, detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False)}}