This package provides the function pmapreduce, which is essentially
function pmapreduce(f, op, itrs)
    @distributed op for arg in itrs
        f(arg)
    end
end
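For example, using the signature above, a simple call might compute the sum of squares of 1 to 10, with the map evaluated across the available workers:

pmapreduce(x -> x^2, +, 1:10)    # == sum(abs2, 1:10) == 385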
However, since @distributed partitions itrs evenly across the workers, some workers may sit idle if they differ in computational speed, or equivalently if f takes a different amount of time depending on its input. With the added option

pmapreduce(f, op, itrs...; algorithm = :reduction_local)

the computations are dynamically load balanced, that is, elements of itrs are handed out only to workers that are free. The result of each evaluation of f is stored and reduced locally on the worker until itrs is exhausted; only then is the partial result sent back to the master process, where the results from the workers are further reduced, thus saving communication bandwidth.
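As a rough illustration of the idea (a minimal sketch, not the package's actual implementation, assuming f is already defined on all workers), a dynamically load-balanced, locally-reducing scheme could be built around a shared RemoteChannel job queue:

using Distributed

function pmapreduce_local_sketch(f, op, itr)
    # shared job queue: whichever worker is free takes the next argument
    jobs = RemoteChannel(() -> Channel{Any}(length(itr)))
    foreach(x -> put!(jobs, x), itr)
    close(jobs)

    # each worker folds its own results with op and returns a single partial value
    partials = map(workers()) do w
        remotecall(w, f, op, jobs) do f, op, jobs
            acc = nothing
            while true
                x = try
                    take!(jobs)        # throws once the queue is closed and drained
                catch
                    break
                end
                acc = acc === nothing ? f(x) : op(acc, f(x))
            end
            acc
        end
    end

    # only the per-worker partial results travel back to the master for the final reduction
    reduce(op, filter(!isnothing, fetch.(partials)))
end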
As an example, consider a function whose runtime depends on its argument:

using Distributed
addprocs(2)

function test()
    # f sleeps for n seconds, so its runtime depends on the input
    @everywhere function f(n)
        sleep(n)
        n
    end
    # one slow 5-second task followed by nine 1-second tasks
    args = [5, ones(Int, 9)...]
    val_even = @time pmapreduce(f, +, args)
    val_uneven = @time pmapreduce(f, +, args; algorithm = :reduction_local)
    println(val_even, " ", val_uneven)
end
julia> test()
9.039506 seconds (150 allocations: 6.562 KiB)
7.048571 seconds (1.91 k allocations: 81.703 KiB)
14 14
The option :reduction_master is also available, in which the result of every evaluation of f is sent back to the master process, which then performs the entire reduction.
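Conceptually (not the actual implementation), this mode behaves like mapping with the dynamically load-balanced pmap from Distributed and reducing the collected results on the master:

# rough behavioural analogue of algorithm = :reduction_master:
# every f(arg) is sent back to the master, which performs the reduction
reduce(op, pmap(f, itrs))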