I have a Mathematica notebook that derives some rather massive expressions. I wanted to transform them in parallel using ParallelMap or ParallelTable, but noticed that these commands ran on a single CPU core for hours before they actually started running in parallel and occupying all CPU cores. While only that single core was busy, I could not even abort the evaluation with Alt-. as one usually can: it simply seemed stuck.
(* symbol names cannot contain underscores: x_ is pattern syntax *)
makeMassiveExpression[x_] := ...;
process[x_] := Simplify[x];
a1 = simpleExpression;
a2 = makeMassiveExpression[a1];
a3 = makeMassiveExpression[a2];
as = {a1, a2, a3};
b = ParallelTable[process[as[[i]]], {i, Length[as]}];
As it turns out, before the parallel evaluation starts, Mathematica automatically copies definitions from the main kernel to the parallel kernels, and for massive expressions that seems to be a rather inefficient procedure. So let’s transfer the needed definitions manually and switch the automatic transfer off:
makeMassiveExpression[x_] := ...;
process[x_] := Simplify[x];
a1 = simpleExpression;
a2 = makeMassiveExpression[a1];
a3 = makeMassiveExpression[a2];
as = {a1, a2, a3};
(* explicit transfer of the data and the function the parallel kernels need *)
DistributeDefinitions[as, process];
(* DistributedContexts -> None disables the automatic transfer *)
b = ParallelTable[process[as[[i]]], {i, Length[as]}, DistributedContexts -> None];
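One way to check how the time is split between the two steps is to wrap each of them in AbsoluteTiming. The following is only a minimal sketch, not the original notebook: makeMassiveExpression and the starting symbol z are hypothetical stand-ins for the elided definitions, chosen so that the code runs quickly. With genuinely massive expressions, the first number reflects the transfer cost and the second the actual computation.

```mathematica
(* Hypothetical stand-ins for the elided definitions; small enough to run quickly *)
makeMassiveExpression[x_] := Expand[(1 + x)^6];
process[x_] := Simplify[x];

a1 = z;  (* z is assumed to be an unassigned symbol *)
a2 = makeMassiveExpression[a1];
a3 = makeMassiveExpression[a2];
as = {a1, a2, a3};

(* Start subkernels only if none are running yet *)
If[$KernelCount == 0, LaunchKernels[]];

(* Time the explicit transfer of definitions to the subkernels *)
tDistribute = First[AbsoluteTiming[DistributeDefinitions[as, process];]];

(* Time the parallel computation, with automatic distribution switched off *)
{tTable, b} = AbsoluteTiming[
   ParallelTable[process[as[[i]]], {i, Length[as]},
    DistributedContexts -> None]];

(* Seconds spent on transfer and on computation, respectively *)
{tDistribute, tTable}
```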
Now DistributeDefinitions is slow, but ParallelTable immediately starts running in parallel on multiple kernels. We haven’t gained anything by splitting things up like this, but at least we can now tell exactly where the problem lies: in transferring the massive expressions. So instead of transferring them to the parallel kernels, let’s transfer only the simple expression and have the parallel kernels derive the massive expressions themselves:
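makeMassiveExpression[x_] := ...;
process[x_] := Simplify[x];
a1 = simpleExpression;
DistributeDefinitions[a1, makeMassiveExpression, process];
(* each parallel kernel derives the massive expressions itself; the trailing
   semicolon keeps them from being sent back to the main kernel as the result *)
ParallelEvaluate[(
a2 = makeMassiveExpression[a1];
a3 = makeMassiveExpression[a2];
as = {a1, a2, a3};
), DistributedContexts -> None];
(* as now exists only on the parallel kernels, and the iterator bound is
   evaluated on the main kernel, so the length is written out explicitly *)
b = ParallelTable[process[as[[i]]], {i, 3}, DistributedContexts -> None];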
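To convince oneself that every parallel kernel really holds its own copy of the expressions, one can ask each kernel to report on them. Below is a minimal sketch under stated assumptions: makeMassiveExpression and the symbol z are hypothetical stand-ins for the elided definitions, and the iterator bound is written out as 3 because in this variant as is defined only on the subkernels, not on the main kernel.

```mathematica
(* Hypothetical stand-ins for the elided definitions; small enough to run quickly *)
makeMassiveExpression[x_] := Expand[(1 + x)^6];
process[x_] := Simplify[x];
a1 = z;  (* z is assumed to be an unassigned symbol *)

(* Start subkernels only if none are running yet *)
If[$KernelCount == 0, LaunchKernels[]];

(* Only the small input and the function definitions are transferred *)
DistributeDefinitions[a1, makeMassiveExpression, process];

(* Each subkernel builds the large expressions locally; the trailing
   semicolon makes the call return Null instead of the expressions *)
ParallelEvaluate[
  a2 = makeMassiveExpression[a1];
  a3 = makeMassiveExpression[a2];
  as = {a1, a2, a3};
];

(* Per-kernel report: kernel ID, number of expressions, size in bytes *)
report = ParallelEvaluate[{$KernelID, Length[as], ByteCount[as]}];

(* The actual parallel computation, with an explicit iterator bound *)
b = ParallelTable[process[as[[i]]], {i, 3}, DistributedContexts -> None];
```

Note the trade-off: every kernel repeats the same derivation, so this only pays off when deriving the expressions is cheaper than shipping them.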