
What to do when Mathematica’s ParallelMap/ParallelTable takes a long time to start up

I have a Mathematica notebook that derives some rather massive expressions. I wanted to do some transformations on them in parallel using ParallelMap or ParallelTable, but noticed that these commands were only running on a single CPU core for hours before actually starting to run in parallel and occupy all CPU cores. While it was running on only that single CPU core, I could not even abort the evaluation using Alt-. like one usually can: it simply seemed stuck.

makeMassiveExpression[x_] := ...;
process[x_] := Simplify[x];
a1 = simpleExpression;
a2 = makeMassiveExpression[a1];
a3 = makeMassiveExpression[a2];
as = {a1,a2,a3};

b = ParallelTable[process[as[[i]]], {i,Length[as]}];

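As it turns out, during the startup phase Mathematica copies all definitions from the main kernel to the parallel kernels, and that transfer seems to be rather inefficient for very large expressions. To confirm this, let’s transfer the needed definitions manually and tell ParallelTable not to distribute anything on its own (DistributedContexts -> None).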

makeMassiveExpression[x_] := ...;
process[x_] := Simplify[x];
a1 = simpleExpression;
a2 = makeMassiveExpression[a1];
a3 = makeMassiveExpression[a2];
as = {a1,a2,a3};

DistributeDefinitions[as, process];
b = ParallelTable[process[as[[i]]], {i,Length[as]}, DistributedContexts -> None];

Now DistributeDefinitions is slow, but ParallelTable immediately starts running in parallel on multiple kernels. We haven’t gained anything by splitting things like this, but at least we can now tell exactly where the problem lies. So instead of transferring the massive expressions to the parallel kernels, let’s only transfer the simple expression and have the parallel kernels derive the massive expression themselves:

makeMassiveExpression[x_] := ...;
process[x_] := Simplify[x];
a1 = simpleExpression;

DistributeDefinitions[a1, makeMassiveExpression, process];

(* Each parallel kernel derives the massive expressions locally. *)
ParallelEvaluate[(
   a2 = makeMassiveExpression[a1];
   a3 = makeMassiveExpression[a2];
   as = {a1,a2,a3}
), DistributedContexts -> None];

(* as is defined only on the parallel kernels, so the iterator bound is given explicitly. *)
b = ParallelTable[process[as[[i]]], {i,3}, DistributedContexts -> None];

What to do if GitHub only sends emails for a random subset of notifications

For the past few months, it seemed like GitHub wasn’t sending me notification emails for all activity in my watched repositories. I was only getting a random subset of these emails, at most half of what it should have been. I eventually contacted GitHub’s support and they were immediately able to help me. They told me that they send a certain portion of their notifications not directly, but via a third-party service (judging from the message headers, it’s SendGrid).

This third-party service had apparently added me to their suppression list because one email they sent to me months ago had hard-bounced. There probably was some malfunction on our email server at the time that caused this. I understand why SendGrid does this, but silently discarding any emails GitHub asked them to deliver to me is bad. Really, they should have notified GitHub, which then should have either emailed me at one of the other addresses in my profile or shown a big banner the next time I logged in, asking me to re-confirm my email address to clear the block.

So if this ever happens to you and you’re getting fewer GitHub notification emails than you should be: just email their support and ask them to check. I should have done this sooner, but it really wasn’t clear to me whom to blame (it could just as well have been our mail server’s fault).

Using C++11 on Mac OS X 10.8

Recent Xcode versions for Mac OS X 10.7 and 10.8 ship with Clang, a modern compiler for C/C++/ObjC based on LLVM. It fully supports C++11: simply add -std=c++0x or -std=c++11 to your CXXFLAGS. This already gives you all the new language features such as the auto keyword.

However, once you dig deeper into C++11, you’ll also want to use the new features of the standard library, such as <array> or <random>. Including them fails with a strange error message:

gamelogic/Board.cpp:11:10: fatal error: 'random' file not found
#include <random>
         ^

As it turns out, your code gets compiled and linked against the system-default libstdc++ version (/usr/lib/libstdc++.6.dylib), which is too old to support C++11 and therefore lacks headers such as <random>. However, Mac OS X also includes libc++ (/usr/lib/libc++.1.dylib), a complete reimplementation of the standard library by the LLVM team that is fully C++11 compatible. Simply tell the compiler to use it with -stdlib=libc++ and tell the linker to link against it with -lc++.

So for a qmake .pro project file, all this might look as follows. The conditional makes it compatible with other compilers such as g++ on Linux that already ship with a C++11-compatible standard library.

QMAKE_CXXFLAGS += -std=c++0x
macx {
 contains(QMAKE_CXX, /usr/bin/clang++) {
  message(Using LLVM libc++)
  QMAKE_CXXFLAGS += -stdlib=libc++
  QMAKE_LFLAGS += -lc++
 }
}

UPDATE 2016: Mac OS X 10.9 and higher default to libc++ and don’t require the extra compiler flag. Since Mac OS X 10.8 is out of support anyway, there is no reason to use the flag anymore.

HTML to ePub using Sigil

I was looking for a way to convert HTML books into an ePub file. The general layout of the file should be preserved (including images), while all the stuff that doesn’t make sense on an ebook reader (such as navigation elements and the usual “back to top” links) should be removed.

After trying Calibre rather extensively, I came across an app named Sigil, which does exactly what I want: You just throw in your HTML files (it automatically imports images referenced by them) and add some metadata.

Before proceeding, you should use your favorite scripting language (or modify the attached quick-and-dirty PHP script) to remove everything but the main part of the chapter from the HTML files. (Make sure to remove any tables or divs surrounding the entire content, because they might break page-by-page navigation on your ebook reader.)
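As an illustration of that clean-up step, here is a minimal Python sketch (not the attached PHP script). It assumes that each chapter's main text sits inside a single element with a known id; the id "content" and the names ContentExtractor and extract_main_content are hypothetical, so adjust them to whatever your HTML files actually use. It returns only the inner HTML of that element, which also drops the surrounding div or table. HTML comments inside the container are discarded.

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collect the inner HTML of the first element with the wanted id."""

    def __init__(self, wanted_id):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.wanted_id = wanted_id
        self.tag = None    # tag name of the container, once found
        self.depth = 0     # nesting depth of same-named tags inside it
        self.done = False  # True after the container has been closed
        self.parts = []

    def _capturing(self):
        return self.tag is not None and not self.done

    def handle_starttag(self, tag, attrs):
        if self._capturing():
            if tag == self.tag:
                self.depth += 1
            self.parts.append(self.get_starttag_text())
        elif not self.done and dict(attrs).get("id") == self.wanted_id:
            self.tag = tag  # container found; its own tag is not kept

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if self._capturing():
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self._capturing():
            return
        if tag == self.tag:
            if self.depth == 0:
                self.done = True  # closing tag of the container itself
                return
            self.depth -= 1
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self._capturing():
            self.parts.append(data)

    def handle_entityref(self, name):
        if self._capturing():
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self._capturing():
            self.parts.append("&#%s;" % name)


def extract_main_content(html, wanted_id="content"):
    """Return the inner HTML of the element with id=wanted_id, or ''."""
    parser = ContentExtractor(wanted_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

You would call extract_main_content on the text of each chapter file and write the result back, wrapped in a minimal html/body skeleton, before importing the files into Sigil.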

Sigil works very smoothly if your HTML files are in alphabetical order. If they’re not, don’t despair: take the index.html file that (hopefully) came with them and use your favorite scripting language (or modify the attached quick-and-dirty PHP script) to grab all the links from it (be sure to remove anchors and duplicates) and generate an XML structure like this:

<spine toc="ncx">
<itemref idref="file1.html" />
<itemref idref="file2.html" />
</spine>

Manually replace the spine section in the content.opf file inside the generated ePub with the lines you just created. Then re-open the ePub in Sigil and check whether it found any HTML files you forgot to include (they will show up at the top of the file list). If there are any, move them to the place where you want them.
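The link-gathering step could be sketched in Python roughly as follows (again, this is not the attached PHP script). The names LinkCollector and spine_from_index are made up for this example. It assumes, as in the spine shown above, that the idref values are simply the file names, and it skips pure-anchor links and anything that looks like an external URL.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import quoteattr


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def spine_from_index(index_html):
    """Build a <spine> section from the links found in index_html."""
    collector = LinkCollector()
    collector.feed(index_html)
    collector.close()

    seen = set()
    lines = ['<spine toc="ncx">']
    for href in collector.hrefs:
        filename = href.split("#", 1)[0]  # remove the anchor, if any
        # Skip pure-anchor links, links with a scheme (http:, mailto:),
        # and files that were already listed.
        if not filename or ":" in filename or filename in seen:
            continue
        seen.add(filename)
        lines.append("<itemref idref=%s />" % quoteattr(filename))
    lines.append("</spine>")
    return "\n".join(lines)
```

Feeding it the contents of index.html returns the spine text, with the files in the order in which index.html first links to them.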

Once you have everything the way you want it, check the auto-generated table of contents using the TOC Editor option. Chances are that everything in there is duplicated if the links in your index.html file are recognized as chapter headlines. In that case, just uncheck those. (If you don’t feel like unchecking 500 items, I’ve attached an AppleScript to do that: select the bottom-most line you want unchecked and adjust the number of lines inside the script.)