Yearly Archives: 2018

Snom D735 review

I’ve been using Snom Voice over IP telephones for about 10 years. Their software works reliably and provides all the features you might wish for, and the hardware is solid too. I know it’s 2018 and most people don’t use landline phones anymore, but the audio quality is still much better, you can’t comfortably hold a cell phone between your shoulder and your ear, and cellular reception isn’t great where I live.

I started with a 360, then had an 870 and later a 760. When it was time to get a new phone, my list of requirements was pretty short: it should have a USB port on the side for a headset and it should have a graphical display. That left only the D735, D765 and D785. The latter two are priced rather similarly, while the first one can occasionally be picked up for just around 100€.

This article isn’t going to be about the software running on the phones: it is and has always been great. Also, it’s the same across all of Snom’s models. So I’ll just write what I liked and didn’t like about the hardware.

I first tested the D785 for a few days. It’s rather bulky and while the large display looks great, the software doesn’t really make much use of all that extra space (yet). The self-labeling keys with the second display seem like a neat feature, but they are a bit hard to read when the backlight is off and not as useful as I had expected.

So I decided to settle for the D735. The one obvious downside compared to the D765 is the tiny screen. The entire UI is sized down, and even the phone number displayed while in a call scrolls horizontally because it is too wide to fit. There is still quite a bit of whitespace on the call screen, so if Snom reduced the margins a bit, I think it could actually fit a lot more onto that screen. The downside is also an upside: the phone is smaller, more akin to the D715 than to the D765.

While the D765 has two rows of six speed dial keys each above the keypad, the D735 has four of them on either side of the display. That allows it to display labels for them on the screen, so you can immediately see what would happen if you pressed them. It also lets you have four pages of different speed dial keys. The labels are very narrow, just showing an icon and a few characters of text. However, they tie in with the proximity sensor. Snom has advertised that as a unique and highly innovative feature, which seems overblown, until you actually try it. Just move your hand close to the phone and it displays the full key label (across half the width of the screen). This allows you to put a lot more text into the label than you could on the paper-labeled keys of the D715 or D765, and even more than on the second screen of the D785.

Personally, I think the D735 has the potential to replace the D715 as Snom’s “default” phone. Supposedly the D715 is their best-selling device. Since the D735 only costs a little more and has a color screen and more speed dial keys, preferring it over the D715 seems like a no-brainer. I can also see it cutting into the D765’s sales a bit: if you don’t mind the smaller screen, you get basically the same feature set in a smaller case. The D785 still remains Snom’s top-of-the-line model; if you want a gorgeous huge screen and self-labeling speed dial keys, it offers a great package. The D735, however, may just provide the best value of any device in Snom’s current lineup.

OpenWRT on AVM Fritz!Box 3370

I was looking for a new DSL modem and router as I am switching from cable to VDSL2. I had been eyeing the Ubiquiti EdgeRouter series for a while because it offers a big feature set at a reasonable price. I was a bit reluctant about the X series as it seems to be troubled by its small flash memory, and the Lite doesn’t have an SFP slot, which would have been nice for the fibre-to-the-home future. Also, both the X and the Lite have been available for quite a few years now, so I’m not sure how much longer they would have received firmware updates. The 4 is much more expensive, however, and I’d still need a VDSL2 modem, which seems to cost around 100€ (e.g. Draytek Vigor 130 or Allnet ALL-BM200VDSL2V).

Of course, I could have gotten an off-the-shelf router with an integrated modem, like the AVM Fritz!Box series that’s very popular in Germany and probably paid less in total (standalone VDSL2 modems are rather expensive because not many people want/need them). I had a Fritz!Box on cable for the past few years and am not particularly happy with the quality of their firmware though. The hardware is great, however.

So I decided to go with OpenWRT. The only built-in DSL modems it supports are Lantiq chips, so it had to be one from that list. I wanted something that has at least 64 MB of flash memory (so I could install some extra packages) and Gigabit Ethernet on all four ports. Luckily, OpenWRT recently got full support for the AVM Fritz!Box 3370. AVM announced that model back in 2010 and dropped official support for it in 2015, so they are available for ~25€ on eBay nowadays. Other models that would have been nice but are not currently supported by OpenWRT are the 3390 (simultaneous dual-band WiFi), and 3490/7490 (USB 3.0, 802.11ac, 512 MB flash memory and 256 MB RAM; the 7490 additionally has phone ports which can’t be used with OpenWRT). The ZyXEL P-2812HNU-F1 and P-2812HNU-F3 are quite similar to the AVM 3370, but they are not as readily available on eBay and tend to cost about twice as much.

OpenWRT doesn’t provide much information on the installation procedure, but it’s quite straightforward. First, you need to check that your device is at least hardware revision 2 and that it doesn’t have a certain problematic bootloader version that makes installation more difficult. Go to http://192.168.178.1/support.lua on the original firmware, log in and click “Support-Daten erstellen” (“create support data”). In the resulting file, you should see something like the following at the top:

HWRevision      175
HWSubRevision   5
ProductID       Fritz_Box_3370
[...]
urlader-version 2475

Also, we need to know what kind of flash memory chip the device has. Scroll down to ##### BEGIN SECTION dmesg and look for something like

[ 1.450000] [HSNAND] Hardware-ECC activated
[ 1.450000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1 (Micron NAND 128MiB 3,3V 8-bit)

Download the files corresponding to your flash chip. Set your IP address statically to 192.168.178.20/24 and reboot the router. When the ethernet interface comes up after a few seconds, ftp 192.168.178.1 and upload the firmware as documented by OpenWRT:

quote USER adam2
quote PASS adam2
binary
debug
passive
quote SETENV linux_fs_start 0
quote MEDIA FLSH
put openwrt-lantiq-xrx200-avm_fritz3370-rev2-micron-squashfs-eva-kernel.bin mtd1
put openwrt-lantiq-xrx200-avm_fritz3370-rev2-micron-squashfs-eva-filesystem.bin mtd0
quote REBOOT

OpenWRT is now ready at 192.168.1.1 after a few minutes.

Note that Fritz!Box 3370 support is not in the 18.06 release, only in the current snapshot builds. This means that the LuCI web interface is not pre-installed, and you should only install new packages within the first few days after flashing: the snapshot repositories move on quickly, and newer packages (especially kernel modules) may be incompatible with the build you flashed.
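
To get the web interface on a snapshot build, SSH into the router and install it via opkg (assuming the device already has internet access at this point):

opkg update
opkg install luci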

After using the 3370 for a little while, I unfortunately noticed that it doesn’t handle more than around 60 Mbit/s of routed traffic, apparently because its CPU is slightly underpowered. So be warned that it might not saturate a 100 Mbit/s down, 50 Mbit/s up line.

Root disk spindown on Debian 9

I recently installed Debian 9 on a Seagate PersonalCloud. Because the device will only receive backups once a day, I want its disk to be spun down when it’s not needed. Even on a minimal install, you’ll find quite a few background services that access the root disk every few minutes. Here is what I had to do to keep my disk spun down.

hdparm

First, install hdparm (apt-get install hdparm) and configure /etc/hdparm.conf to spin down your disks after 10 minutes (the value is in units of 5 seconds, so 120 means 600 seconds):

#quiet
spindown_time = 120

Since hdparm isn’t available in the initrd, the settings aren’t applied when the udev rule /lib/udev/rules.d/85-hdparm.rules first fires during early boot. To apply them to the root disk anyway, add /etc/systemd/system/hdparm-sda.service:

[Unit]
Description=hdparm sda
ConditionPathExists=/lib/udev/hdparm

[Service]
Type=forking
Environment=DEVNAME=/dev/sda
ExecStart=/lib/udev/hdparm
TimeoutSec=0
StandardOutput=tty
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

Now systemctl daemon-reload && systemctl enable hdparm-sda && systemctl start hdparm-sda.
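
To verify that the spindown actually happens, you can query the drive’s power state (this reads the state without waking the disk); after ten idle minutes it should report standby:

hdparm -C /dev/sda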

cron.hourly

When you look into /var/log/syslog, you see messages like

CRON[393]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)

Of course, we can’t have these messages written to a log if we want the disk to remain spun down, so edit /etc/crontab and comment out the hourly job. As long as /etc/cron.hourly is empty, this will not do any harm. If you have any hourly jobs, you might want to move them to daily jobs.
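
For reference, the commented-out line on a stock Debian 9 install should look roughly like this:

#17 *   * * *   root    cd / && run-parts --report /etc/cron.hourly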

systemd timers

Systemd has its own cron-like timer mechanism. You can view active timers with systemctl list-timers --all and disable the ones you don’t need, especially those that run more than once a day, e.g. systemctl disable snapper-timeline.timer && systemctl stop snapper-timeline.timer.

smartd

If you have smartmontools installed (apt-get install smartmontools), you’ll see lines like the following appear in the syslog when the disk is spun down:

smartd[294]: Device: /dev/sda [SAT], is in STANDBY mode, suspending checks

Writing these messages causes the disk to spin up, so we need to disable smartd: systemctl stop smartd && systemctl disable smartd. To keep monitoring our disks, put the following into /etc/cron.daily/smart-check and then chmod +x /etc/cron.daily/smart-check:

#!/bin/bash

/usr/sbin/smartctl -q errorsonly -A /dev/sda

systemd-tmpfiles-clean

When you run

inotifywait -m -r -e access -e modify -e create -e delete --timefmt '%c' --format '%T PATH:%w%f EVENTS:%,e' --exclude '/(dev/pts|proc|sys|run)' /

to see what is going on on your disk, you’ll see that your temporary directories are being cleaned every couple of minutes. We can reduce that to once every two weeks by running systemctl edit systemd-tmpfiles-clean.timer and pasting

[Timer]
OnBootSec=5min
OnUnitActiveSec=14d

postfix

If you have Postfix installed, inotify will also show you that it periodically checks its queue directories. So uninstall postfix (apt-get remove postfix) and configure a forwarding MTA that does not run as a daemon.
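
One option (a suggestion on my part, not the only choice) is msmtp, whose msmtp-mta package provides a sendmail-compatible interface without a long-running daemon:

apt-get remove postfix
apt-get install msmtp-mta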

systemd-timesyncd

This last one was tricky to discover because it doesn’t appear in the logs and inotify doesn’t see it. You can enable logging of every disk access to the kernel log to see even more, but you need to disable syslog first; otherwise you’ll get a self-amplifying write cascade, since every logged access is itself a disk write that gets logged again.

systemctl stop syslog.socket
systemctl stop rsyslog.service
dmesg -C
echo 1 | sudo tee /proc/sys/vm/block_dump
dmesg -Tw
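
When you are done, re-disable the block dump and bring syslog back up:

echo 0 | sudo tee /proc/sys/vm/block_dump
systemctl start syslog.socket
systemctl start rsyslog.service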

Here you’ll see that systemd-timesyncd stores its last sync date by touching a file. It only changes metadata, which is why inotify doesn’t see it happening. My solution was to put the following into /etc/tmpfiles.d/zz-clock.conf:

d /run/systemd/timesync 0755 systemd-timesync systemd-timesync - -
f /run/systemd/clock 0644 systemd-timesync systemd-timesync - -
f /run/systemd/timesync/clock 0644 systemd-timesync systemd-timesync - -
L+ /var/lib/systemd/timesync/clock - - - - /run/systemd/timesync/clock
L+ /var/lib/systemd/clock - - - - /run/systemd/clock

/var/lib/systemd/clock is used by systemd 234 and lower, while /var/lib/systemd/timesync/clock is used by systemd 235 and higher. So the latter will only be needed once you upgrade to Debian 10.

Scientific Article: Toward Understanding of Self-Electrophoretic Propulsion under Realistic Conditions: From Bulk Reactions to Confinement Effects

My colleague Patrick and I published a review article in Accounts of Chemical Research:

Toward Understanding of Self-Electrophoretic Propulsion under Realistic Conditions: From Bulk Reactions to Confinement Effects
Michael Kuron, Patrick Kreissl, and Christian Holm
Accounts of Chemical Research 51, 2998 (2018)
DOI: 10.1021/acs.accounts.8b00285

Unfortunately there is no open-access version of this article.

USB WLAN adapter for Snom D7x5 VoIP phones

Snom lists a number of supported USB WLAN chipsets and occasionally adds new ones in firmware updates. I had a BIGtec BIG120, but it died recently (it would overheat after a few days’ use and then fail to see any networks beyond channel 1 until it cooled down). Most of the devices listed by Snom are ancient; my BIG120 is no longer available on the market. Firmware 8.9.3.60 added some newly supported WiFi chips, but Snom’s list of supported devices is rather short. There are two additional complications with USB WiFi adapters: manufacturers often release new revisions under the same name but with a different chip, so you have to be careful to buy exactly the right one. And while the phone (or other Linux device) might support that specific chip, its driver also needs to contain the USB ID of the specific adapter. You can find information about revisions and USB IDs of all kinds of devices at WikiDevi.

Here is a list of all the USB IDs supported by the Snom D715 (and probably all D7x5 and D3x5 phones) that I extracted from firmware 8.9.3.80. I didn’t include the Ralink RT3070/RT2070 that were already supported by the older Snom 7x0 and 8xx series, as they are difficult to find nowadays; unless you have one of the older phones, don’t bother with them.
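
If you want to check which ID an adapter you already own reports, plug it into any Linux machine and run lsusb. For the ASUS adapter recommended at the end of this post, the output contains a line like this (bus/device numbers and description will vary):

Bus 001 Device 004: ID 0b05:17ba ASUSTek Computer, Inc. USB-N10 Nano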

8192cu.ko

04BB:094C
04BB:0950
04F2:AFF7
04F2:AFF8
04F2:AFF9
04F2:AFFA
04F2:AFFB
04F2:AFFC
050D:1004
050D:1102
050D:2102
050D:2103
0586:341F
06F8:E033
06F8:E035
0789:016D
07AA:0056
07B8:8178
07B8:8189
0846:9021
0846:9041
0846:F001
0B05:17AB
0B05:17BA
0BDA:018A
0BDA:0A8A
0BDA:17C0
0BDA:1E1E
0BDA:2E2E
0BDA:317F
0BDA:5088
0BDA:8170
0BDA:8176
0BDA:8177
0BDA:8178
0BDA:817A
0BDA:817B
0BDA:817C
0BDA:817D
0BDA:817E
0BDA:817F
0BDA:8186
0BDA:818A
0BDA:8191
0BDA:8754
0DF6:0052
0DF6:005C
0DF6:0061
0DF6:0070
0E66:0019
0E66:0020
0EB0:9071
103C:1629
1058:0631
13D3:3357
13D3:3358
13D3:3359
2001:3307
2001:3308
2001:3309
2001:330A
2001:330B
2019:1201
2019:4902
2019:AB2A
2019:AB2B
2019:AB2E
2019:ED17
20F4:624D
20F4:648B
2357:0100
4855:0090
4855:0091
4856:0091
7392:7811
7392:7822
CDAB:8010
CDAB:8011

8192eu.ko

0BDA:818B
0BDA:818C
2001:3319
2357:0107
2357:0108
2357:0109

mt7601Usta.ko

148F:6370
148F:7601
148F:7650

rt5572sta.ko

0411:00E8
043E:7A12
043E:7A13
043E:7A22
043E:7A32
043E:7A42
0471:200F
0471:20DD
0471:2104
0471:2126
0471:2180
0471:2181
0471:2182
04BB:0944
04BB:0945
04BB:0947
04BB:0948
04DA:1800
04DA:1801
04DA:23F6
04E8:2018
050D:8053
050D:805C
050D:815C
057C:8501
0586:3416
0586:341A
0586:341E
0586:343E
0789:0162
0789:0163
0789:0164
0789:0166
07AA:002F
07AA:003C
07AA:003F
07B8:2770
07B8:2870
07B8:3070
07B8:3071
07B8:3072
07D1:3C09
07D1:3C0A
07D1:3C0D
07D1:3C0E
07D1:3C0F
07D1:3C11
07D1:3C16
07D1:3C17
07FA:7712
083A:6618
083A:7511
083A:7512
083A:7522
083A:8522
083A:A618
083A:A701
083A:A702
083A:A703
083A:B511
083A:B522
0846:9012
0930:0A07
0B05:1731
0B05:1732
0B05:1742
0B05:1784
0CDE:0022
0CDE:0025
0DB0:3820
0DB0:3821
0DB0:3822
0DB0:3870
0DB0:3871
0DB0:6899
0DB0:821A
0DB0:822A
0DB0:822B
0DB0:822C
0DB0:870A
0DB0:871A
0DB0:871B
0DB0:871C
0DB0:899A
0DF6:0017
0DF6:002B
0DF6:002C
0DF6:002D
0DF6:0039
0DF6:003E
0DF6:003F
0DF6:0041
0DF6:0042
0DF6:0047
0DF6:0048
0DF6:0050
0DF6:005F
0DF6:0065
0DF6:0066
0DF6:0067
0DF6:0068
0E66:0001
0E66:0003
0E66:0021
100D:9031
1044:800B
1044:800D
129B:1828
13B1:002F
13D3:3247
13D3:3273
13D3:3305
13D3:3307
13D3:3321
13D3:3329
13D3:3365
1482:3C09
148F:2770
148F:2870
148F:3070
148F:3071
148F:3072
148F:3370
148F:3572
148F:3573
148F:5370
148F:5372
148F:5572
14B2:3C06
14B2:3C07
14B2:3C09
14B2:3C12
14B2:3C23
14B2:3C25
14B2:3C27
14B2:3C28
157E:300E
15A9:0006
15C5:0008
167B:4001
1690:0740
1690:0744
1690:0761
1690:0764
1737:0070
1737:0071
1737:0078
1737:0079
1740:9701
1740:9702
1740:9703
1740:9705
1740:9706
1740:9707
1740:9708
1740:9709
1740:9801
177F:0302
1875:7733
18C5:0012
1A32:0304
1D4D:000C
1D4D:000E
1D4D:0011
1EDA:2012
1EDA:2210
1EDA:2310
2001:3C15
2001:3C19
2001:3C1A
2001:3C1B
2001:3C1C
2001:3C1D
2019:5201
2019:AB25
2019:ED06
2019:ED19
203D:1480
203D:14A9
20B8:8888
5A57:0280
5A57:0282
5A57:0283
5A57:0284
5A57:5257
7392:4085
7392:7711
7392:7717
7392:7718
7392:7733

My choice fell on the ASUS USB-N10 Nano, which is on the list as 0B05:17BA. It is available for about 10€, is so small that it doesn’t protrude from behind the phone when plugged into the side USB port, works reliably, and does not produce much heat.

AMD Ryzen Threadripper: NUMA architecture, CPU affinity and HTCondor

At the university lab I work at, we usually get desktop computers with Intel Core i7 Extreme CPUs. They have more memory controllers and more cores than Intel’s mainstream CPUs, which is great for us because we do software development and simulations on these machines. Now that it was time to order some new machines, we decided to check out what AMD offers. The last time AMD’s Athlon series was competitive with Intel in the high-end desktop market was probably in the days before the Core i series was introduced a decade ago. Even in the server/HPC market, AMD’s Opteron series hadn’t been a serious Intel competitor after 2012. Now with the Ryzen, AMD finally has something that beats Intel’s high-end offerings both in absolute price and in price per performance.

Today’s high-end CPU market

As prices on Intel’s high-end chips seem to have been increasing with every generation, and after Intel didn’t handle the Spectre/Meltdown disaster very well at all earlier this year, we decided that it really was time to break Intel’s dominance in our lab. I would actually have liked to get something with a non-x86 architecture (because why not), but the requirement of eight CPU cores and four DDR4 memory channels ruled out pretty much everything. The remaining candidates were eliminated either on price (the IBM POWER9 CPU as in the Talos II) or because they are not available on the market (ARM in the form of the Cavium ThunderX2 or Qualcomm Centriq). The POWER9 and ThunderX2 actually have eight memory channels (the Talos’s mainboard only provides access to four though), as does the AMD Epyc (the server version of the Threadripper), so they or their successors might still become interesting options in the future. For comparison, Intel’s current Xeon Scalable series only has six channels, as does the Centriq.

The AMD Ryzen family

We decided to order a Threadripper 1950X with 16 cores. When I unpacked it and started my first benchmark simulation, I was pretty disappointed by the performance though. It turned out that the Threadripper is a NUMA architecture, but you need to toggle a BIOS option (set Memory Interleave to Channel) before it actually presents itself as such to the operating system. In its default mode, memory latencies are very high because a process might be running on a CPU core that is using memory on the other pair of memory controllers. The topology it presents in NUMA mode is roughly like this: out of the 16 cores, four cores each share a common L3 cache to make what AMD calls a core complex (CCX). Two CCXes together share a two-channel memory controller and sit on the same die. Two dies are interconnected with a 50 GB/s link. In NUMA mode and after setting the correct CPU affinity on my simulation processes (which is what the remaining sections of this blog post will be about), I was getting the expected performance. And as you might have expected, the Threadripper really is powerful. You can get something comparable from Intel, the Core i9 Extreme, but these chips are extreme not only in name but also in price.

I also love the simplicity of AMD’s product lineup: the Ryzen series has two memory channels and 4-8 cores (i.e. one CCX), the Ryzen Threadripper has four memory channels and 8-16 cores (two CCXes), and the Epyc has eight memory channels and up to 32 cores. Don’t bother with the second-generation Threadrippers that were just announced: the 2920X is almost identical to the 1920X and the 2950X to the 1950X. The 2970WX and 2990WX are really odd chips: they have four dies (like the Epyc), but two of them don’t have their own memory controllers. So half of these chips would be as fast as my 1950X in NUMA mode, and the other half would be slower than my 1950X in its default mode.
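
To check that the BIOS option took effect, you can inspect the topology the kernel sees (numactl comes from the package of the same name); in NUMA mode the 1950X should show up as two nodes whose core IDs match the HTCondor slot output further below:

numactl --hardware    # lists the NUMA nodes and their memory sizes
lscpu | grep NUMA     # shows which core IDs belong to which node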

CPU affinity for MPI

Usually, you only have to worry about CPU pinning and process affinity on servers, high-performance compute clusters and some workstations. Since AMD introduced the Opteron and Intel introduced the Core architecture, machines with multiple processor sockets have been associating memory controllers with CPUs in such a way that memory access within a socket is fast and between sockets is slower. This is called NUMA (non-uniform memory access). To minimize inter-socket memory accesses, you need to tell your operating system’s scheduler to never move threads or processes between cores. This is called pinning or affinity. On servers, it is usually taken care of by the application, while on HPC clusters the admin configures the MPI library to do the right thing. The Threadripper is probably the first chip that brings NUMA to the desktop market.

If you are using OpenMPI, you just need to set two environment variables

export OMPI_MCA_hwloc_base_binding_policy=numa
export OMPI_MCA_rmaps_base_mapping_policy=numa

so that processes are pinned (bound) to a NUMA domain (usually that’s a socket, but in the Threadripper it’s a die with two CCXes). The second variable tells OpenMPI to create (map) processes on alternating NUMA domains — so the first one is put on the first die, the second one on the second die, the third one on the first die, and so on.

You can also pass the --bind-to numa and --map-by numa command-line arguments to mpiexec/mpirun to achieve the same. Using l3cache instead of numa may actually get you even better performance, because latencies within a CCX are lower than between the CCXes of a die, but that only gains you a few percent and requires that you use more processes and fewer threads, which may be suboptimal for some software. Below is an interesting figure about latency between cores, as measured with this tool (I’m not sure about the yellow squares though; I would have expected them to be more turquoise, and they don’t seem to match what I observe in actual performance).
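
For example, a hybrid MPI+OpenMP run with one process per die on the 1950X (the binary name is made up) would be launched as:

mpirun --bind-to numa --map-by numa -np 2 ./my_simulation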

If you are using Intel MPI (some commercial software we use does that), you only need

export I_MPI_PIN_DOMAIN=numa

to get pinning. The process creation already happens on alternating domains by default.

I like to put these variables in a file in /etc/profile.d so that they are automatically set for everyone who logs into these machines. I didn’t know this before, but both the GNU and the Intel OpenMP library will only create one thread for each core from their affinity mask, so you don’t even need to set OMP_NUM_THREADS=8 manually if you are running hybrid-parallelized codes.
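
A minimal /etc/profile.d/mpi-affinity.sh (filename my choice) collecting the variables from this post:

# Pin OpenMPI processes to NUMA domains, mapping them round-robin
export OMPI_MCA_hwloc_base_binding_policy=numa
export OMPI_MCA_rmaps_base_mapping_policy=numa
# Same pinning for Intel MPI
export I_MPI_PIN_DOMAIN=numa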

CPU affinity with HTCondor

Our machines run 24 hours a day, 365 days a year. When nobody is using them locally or via SSH, they run simulation jobs via the HTCondor job scheduler. Of course, we want to make good use of the resources with these jobs too, so we need CPU pinning as well. Usually, we set

NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = true

in our Condor configuration so that people can decide themselves how many CPUs they want for their jobs. For the Threadripper, it doesn’t make sense to use more than 8 cores per job because we don’t want jobs to cross NUMA domains. This means we need two slots with 50% of the CPU cores, and we want to set SLOT<n>_CPU_AFFINITY to pin the processes to the die. I wrote a Jinja2 template that creates this Condor configuration:

{%- set nodes = salt.file.find('/sys/devices/system/node', name='node[0-9]*', type='d') -%}
NUM_SLOTS = {{ nodes | count }}
ENFORCE_CPU_AFFINITY = True

{% for node in nodes -%}
NUM_SLOTS_TYPE_{{ loop.index }} = 1
{%- set cpus = salt.file.find(node, name='cpu[0-9]*', type='d') -%}
{%- set physical_cpus = [] -%}
{% for cpu in cpus %}
{%- set cpu_id = cpu.replace(node + '/cpu', '') -%}
{%- set siblings = salt.file.read(cpu + '/topology/thread_siblings_list').strip().split(',') -%}
{% if cpu_id == siblings[0] %}
{%- do physical_cpus.append(cpu) -%}
{% endif -%}
{% endfor %}
SLOT_TYPE_{{ loop.index }} = cpus={{ physical_cpus | count }}
SLOT_TYPE_{{ loop.index }}_PARTITIONABLE = true
{%- set cpu_ids = [] -%}
{% for cpu in cpus %}
{%- do cpu_ids.append(cpu.replace(node + '/cpu', '')) -%}
{% endfor %}
SLOT{{ loop.index }}_CPU_AFFINITY = {{ cpu_ids | join(',')}}

{% endfor -%}

If you don’t have a Jinja2-based configuration manager like SaltStack, you can use the following Python script to render the template:

import os, sys
from jinja2 import Environment, FileSystemLoader
import salt
import salt.modules.file

# Make salt.file.find()/salt.file.read() available outside of SaltStack
salt.file = salt.modules.file

# Load templates from the script's directory; the 'do' extension is
# needed for the {%- do ... -%} statements in the template
file_loader = FileSystemLoader(os.path.dirname(__file__))
env = Environment(loader=file_loader, extensions=['jinja2.ext.do'])
template = env.get_template(sys.argv[1])
template.globals['salt'] = salt
print(template.render())
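
Call it with the template name as its only argument and redirect the output into a Condor configuration file, e.g. (both filenames made up):

python render.py condor_slots.j2 > /etc/condor/config.d/50-slots.conf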

This produces

NUM_SLOTS = 2
ENFORCE_CPU_AFFINITY = True

NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=8
SLOT_TYPE_1_PARTITIONABLE = true
SLOT1_CPU_AFFINITY = 0,1,16,17,18,19,2,20,21,22,23,3,4,5,6,7

NUM_SLOTS_TYPE_2 = 1
SLOT_TYPE_2 = cpus=8
SLOT_TYPE_2_PARTITIONABLE = true
SLOT2_CPU_AFFINITY = 10,11,12,13,14,15,24,25,26,27,28,29,30,31,8,9

One additional complication is that some of our users like to set getenv = True in their condor submit file. This means that the variables we set in the global shell profile in the previous section are inherited by the Condor job, overriding the affinity we set for the slots. So we need a job wrapper script that removes these variables if present. Put

#!/bin/bash

for var in $(/usr/bin/env | /bin/grep -E '^(OMPI_|I_MPI_)' | /usr/bin/awk -F= '{print $1}'); do
    echo "Removing environment variable $var" >&2
    unset "$var"
done

exec "$@"
error=$?
echo "Failed to exec($error): $@" > $_CONDOR_WRAPPER_ERROR_FILE
exit 1

into /usr/lib/condor/libexec/condor_pinning_wrapper.sh and add

USER_JOB_WRAPPER = /usr/lib/condor/libexec/condor_pinning_wrapper.sh

to your Condor configuration.

Update 2019-01: Updated the Jinja template to work correctly on non-16-core Threadrippers. These have one (12-core, 24-core) or two (8-core) cores disabled on each CCX, which results in non-consecutive core IDs.

Filtering outgoing traffic with VirtualBox’s NAT interface

Hypervisors like VMware Workstation, VMware Fusion or VirtualBox usually offer three kinds of network interfaces: bridged (to a network on the host), NAT (sharing an IP address with the host via network address translation) and host-only (a connection exclusively between host and guest).

VMware, at least on Linux, realizes NAT entirely in the kernel, using standard IP forwarding and a DHCP server on the host that hands out addresses to the guest. VirtualBox, on the other hand, handles NAT entirely in user space, meaning all packets entering and leaving the VM really seem to be going to and from a process named VBoxHeadless or similar.
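
You can confirm this while a VM is running: all of the guest’s connections show up as belonging to the VirtualBox process (lsof’s -i lists network sockets, -c filters by command name, -a combines the filters with AND):

sudo lsof -a -i -c VBoxHeadless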

If you want to limit what kinds of connections a guest system can make, VMware lets you do that quite easily by adding rules to the FORWARD chain. Note that -I inserts at the top of the chain, so the catch-all REJECT has to be inserted first for the more specific ACCEPT to end up above it:

sudo iptables -I FORWARD -i vmnet8 -j REJECT
sudo iptables -I FORWARD -i vmnet8 -j ACCEPT -d github.com

Unfortunately, things are a lot more difficult with VirtualBox. Since there is no dedicated interface, you need to filter based on the process. iptables can’t do that directly, but you can use the net_cls cgroup to tag packets with a class ID that iptables can match on. Again, the REJECT is inserted before the ACCEPT so that it ends up below it:

sudo mkdir /sys/fs/cgroup/net_cls/virtualbox
echo 86 | sudo tee /sys/fs/cgroup/net_cls/virtualbox/net_cls.classid
sudo iptables -N VIRTUALBOX
sudo iptables -I OUTPUT -j VIRTUALBOX -m cgroup --cgroup 86
sudo iptables -I VIRTUALBOX -j REJECT
sudo iptables -I VIRTUALBOX -j ACCEPT -d github.com

Of course, you now need to create the VirtualBox process inside that cgroup. One way is

sudo apt-get install cgroup-tools
sudo cgexec -g net_cls:virtualbox vboxmanage startvm WindowsXP --type=headless

or you can use a wrapper script instead of vboxmanage:

#!/bin/sh -e
CGROUP_NAME=virtualbox
CGROUP_ID=86

if [ ! -d /sys/fs/cgroup/net_cls/$CGROUP_NAME/ ]; then
  mkdir /sys/fs/cgroup/net_cls/$CGROUP_NAME
  echo $CGROUP_ID > /sys/fs/cgroup/net_cls/$CGROUP_NAME/net_cls.classid
fi

/bin/echo $$ > /sys/fs/cgroup/net_cls/$CGROUP_NAME/tasks

exec /usr/bin/vboxmanage "$@"
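
Save the wrapper e.g. as /usr/local/bin/vboxmanage-cgroup (name my choice) and invoke it exactly like vboxmanage; everything it starts inherits the cgroup:

sudo vboxmanage-cgroup startvm WindowsXP --type=headless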