Persistent jobs for Julia.
The package `Persist` allows running jobs independent of the Julia shell. The jobs are run in the background, either on the local machine or via Slurm, and are not interrupted when the Julia shell exits. This is a convenient and safe way to start long-running calculations, without having to write a Julia script.
Programming in Julia typically proceeds in two stages: First one writes some exploratory code in the shell (or via Jupyter). Later, when the code becomes more sophisticated, one converts part of the code to a Julia package that is developed in an editor outside the Julia shell. One still uses the Julia shell to test the package.
As code complexity increases, so do the run times. What takes a few seconds initially turns into minutes and then hours of run time. This then makes things inconvenient:
- While one long-running command is executing, the Julia shell is blocked
- If the command is started in the background, one may accidentally overwrite or delete data that it is accessing
- If the Julia shell exits, or the network connection is lost, the background process is aborted
The `Persist` package circumvents these problems: it allows wrapping a Julia command in a shell script that is run in the background, independent of the Julia shell.
Here is an example:
    using Persist
    # Start a calculation in the background
    job = persist("hello", ProcessManager) do
        sleep(10)                   # Simulate a long-running task
        println("Hello, World!")    # Produce some output
        return [42]                 # Return a value
    end
    # Do something else
    # Check on the background job
    status(job)
    # Get the job's result
    fetch(job)
    getstdout(job)
    getstderr(job)
    wait(job)
    cleanup(job)
You can also use Slurm to submit a job:
    using Persist
    persist("calcpi", SlurmManager) do
        sleep(10)
        big(pi)
    end
    # Jobs are written to file, and can be read back in
    job = readmgr("calcpi")
    jobinfo(job)
    println("pi = $(fetch(job))")
    cleanup(job)
Simple, really.
The Julia expression is serialized and written to a file. A shell script is generated that is executed in the background (or via Slurm). This script reads the expression, executes it, and serializes the result to another file. Various commands examine the status of the job. `fetch` deserializes the result once the job has finished.
This is very similar to the way in which `@spawn` or `@everywhere` works, except that the expression is evaluated independently of the Julia shell. The same caveats regarding defining functions and using modules apply.
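The round trip described above can be sketched with Julia's standard library alone. This is a minimal illustration under stated assumptions, not Persist's actual implementation: the helper `start_job` and the file names `job.expr` and `job.result` are hypothetical, and the child is launched directly as a Julia process rather than through a generated shell script or Slurm.

```julia
using Serialization

# Hypothetical helper (not part of Persist): start `expr` in a separate Julia
# process and return the process handle plus the path of the result file.
function start_job(expr, dir::AbstractString)
    exprfile = joinpath(dir, "job.expr")      # hypothetical file name
    resultfile = joinpath(dir, "job.result")  # hypothetical file name

    # 1. Serialize the expression and write it to a file
    serialize(exprfile, expr)

    # 2. The child reads the expression, evaluates it, and serializes the result
    script = """
    using Serialization
    expr = deserialize($(repr(exprfile)))
    serialize($(repr(resultfile)), Core.eval(Main, expr))
    """
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $script`

    # `detach` runs the child in its own process group, so that it is not
    # tied to the lifetime of the parent; `wait = false` returns immediately
    proc = run(detach(cmd); wait = false)
    return proc, resultfile
end

dir = mktempdir()
proc, resultfile = start_job(:(sum(1:10)), dir)

wait(proc)                          # 3. Wait for the child to finish
result = deserialize(resultfile)    # 4. Read the result back
println(result)
```

The child process starts with an empty `Main` module, so the expression has to be self-contained: any function it calls or module it uses must be defined or loaded by the expression itself. This is the caveat shared with `@spawn` and `@everywhere`.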
    mgr = @persist name manager expression
    mgr = persist(function, name, manager)
- `name::AbstractString`: Job name
- `manager::JobManager`: Either `ProcessManager` or `SlurmManager`
- `expression::Any`, `function::Any`: Expression or function to evaluate
- `mgr::JobManager`: Job manager object
    mgr = readmgr(name)
- `name::AbstractString`: Job name
- `mgr::JobManager`: Job manager object
    st = status(mgr)
- `mgr::JobManager`: Job manager object
- `st::JobStatus`: Job status; one of `job_empty`, `job_queued`, `job_running`, `job_done`, `job_failed`
    st = jobinfo(mgr)
- `mgr::JobManager`: Job manager object
- `st::AbstractString`: Human-readable job status description, as output by e.g. `ps` or `squeue`
    cancel(mgr)
- `mgr::JobManager`: Job manager object
    st = isready(mgr)
- `mgr::JobManager`: Job manager object
- `st::Bool`: Whether the job is done
    wait(mgr)
- `mgr::JobManager`: Job manager object
After waiting, `isready(mgr) == true`.
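`wait` blocks until the job has finished. When an upper bound on the waiting time is wanted, one option is to poll `isready` instead. The sketch below is a hypothetical helper, not part of Persist; it accepts any zero-argument predicate, so it can be tried without a running job. The names `wait_until`, `timeout`, and `interval` are assumptions made for this illustration.

```julia
# Hypothetical helper (not part of Persist): call `isdone()` repeatedly until
# it returns true or `timeout` seconds have passed. Returns true if the
# predicate became true, false if the timeout was reached first.
function wait_until(isdone; timeout::Real = 60.0, interval::Real = 1.0)
    deadline = time() + timeout
    while true
        isdone() && return true
        time() >= deadline && return false
        sleep(interval)
    end
end

# Stand-alone demonstration: a condition that becomes true after 0.2 seconds
started = time()
finished = wait_until(() -> time() - started > 0.2; timeout = 5.0, interval = 0.05)
println(finished ? "done" : "timed out")
```

With a job manager object `mgr` as described above, a call might look like `wait_until(() -> isready(mgr); timeout = 3600)`, followed by `fetch(mgr)` only if it returned `true`.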
    result = fetch(mgr)
- `mgr::JobManager`: Job manager object
- `result::Any`: Job result (i.e. its return value)
Wait for the job to complete, then return the job's result.
    out = getstdout(mgr)
    err = getstderr(mgr)
- `mgr::JobManager`: Job manager object
- `out::AbstractString`: Job output (what the job wrote to `stdout`)
- `err::AbstractString`: Job output (what the job wrote to `stderr`)
Partial job output may (or may not) be available while the job is running.
    cleanup(mgr)
- `mgr::JobManager`: Job manager object
This deletes all information about the job, its result, and its output.