Yearly Archives: 2015

Digest authentication freezes Apache on OS X Server 5.0

When running a web site on OS X Server 5.0 for a while (on OS X 10.10.5 in my case), you'll eventually notice hundreds of httpd processes in Activity Monitor. One or two might cause a bit of CPU load, while the others don't do anything. When you try to load the web page, it is insanely slow and often throws an HTTP 500 error, an HTTP 502 Proxy Error, or the connection just times out. /var/log/apache2/error_log reports errors like

[Thu Nov 05 13:15:24.435549 2015] [mpm_prefork:error] [pid 60920] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting.

but that’s the only hint you get. To find out more, add the following lines inside the VirtualHost section of /Library/Server/Web/Config/apache2/sites/0000_127.0.0.1_34580_.conf and restart the Websites service in Server.app:

<Location /server-status>
SetHandler server-status
</Location>
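Note that server-status reveals client addresses and request URLs, so if the machine is reachable from the outside you may want to limit who can view it. A sketch, assuming Apache 2.4 (which OS X Server 5.0 ships) and its Require directive:

```apache
<Location /server-status>
SetHandler server-status
# Only allow requests originating from this machine
Require local
</Location>
```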

Now, go to http://localhost/server-status and refresh it occasionally while traffic hits your web site. You'll eventually see dozens of lines like the one below (starting with 52-0 in my case):

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
Srv PID Acc M   CPU SS  Req Conn    Child   Slot    Client  VHost   Request

52-0    80825   0/2/2   C   0.45    51  0   0.0 0.02    0.02    ::1

All httpd processes you’re seeing in Activity Monitor are stuck in “Closing connection”, except for those that cause considerable CPU load. If the server were behaving correctly, you wouldn’t have as many processes and those that aren’t currently handling requests would either be “Waiting for Connection” or “Open slot with no current process”.
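If you'd rather not eyeball the table, mod_status also offers a machine-readable view at /server-status?auto whose Scoreboard line can be tallied. A small sketch; the sample string below is made up for illustration:

```shell
# Tally worker states from a server-status scoreboard string.
# On a real server you would fetch it with something like:
#   curl -s 'http://localhost/server-status?auto' | grep '^Scoreboard:'
scoreboard="CC_CCCC.W_CCC"
closing=$(printf '%s' "$scoreboard" | tr -cd 'C')  # keep only "C" characters
idle=$(printf '%s' "$scoreboard" | tr -cd '_')     # keep only "_" characters
echo "closing=${#closing} idle=${#idle}"
```

A healthy server shows mostly "_" and "."; a wall of "C" entries is the symptom described here.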

Let’s fire up a debugger to see what’s causing the processes to get stuck:

$ sudo lldb -p 80825
(lldb) process attach --pid 80825
Process 80825 stopped
* thread #1: tid = 0x3a69d4, 0x00007fff8fe35902 libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff8fe35902 libsystem_kernel.dylib`__wait4 + 10
libsystem_kernel.dylib`__wait4:
->  0x7fff8fe35902 <+10>: jae    0x7fff8fe3590c            ; <+20>
    0x7fff8fe35904 <+12>: movq   %rax, %rdi
    0x7fff8fe35907 <+15>: jmp    0x7fff8fe30c78            ; cerror
    0x7fff8fe3590c <+20>: retq   

Executable module set to "/usr/sbin/httpd".
Architecture set to: x86_64-apple-macosx.
(lldb) bt
* thread #1: tid = 0x3a69d4, 0x00007fff8fe35902 libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8fe35902 libsystem_kernel.dylib`__wait4 + 10
    frame #1: 0x0000000109b3ee95 libapr-1.0.dylib`apr_proc_wait + 70
    frame #2: 0x000000010aa4290c mod_auth_digest_apple.so`cleanup_server_event + 73
    frame #3: 0x0000000109b3627a libapr-1.0.dylib`apr_pool_destroy + 82
    frame #4: 0x0000000109a74ce2 httpd`clean_child_exit + 50
    frame #5: 0x0000000109a74c59 httpd`child_main + 2393
    frame #6: 0x0000000109a73b2e httpd`make_child + 510
    frame #7: 0x0000000109a74181 httpd`perform_idle_server_maintenance + 1265
    frame #8: 0x0000000109a72887 httpd`prefork_run + 2471
    frame #9: 0x0000000109a26328 httpd`ap_run_mpm + 120
    frame #10: 0x0000000109a1185f httpd`main + 4687
    frame #11: 0x00007fff8d7435c9 libdyld.dylib`start + 1
(lldb) continue
Process 82070 resuming
(lldb) exit

Ah, so mod_auth_digest_apple.so is the culprit. In all *.conf files in /Library/Server/Web/Config/apache2 and its subdirectories, replace every occurrence of AuthType Digest with AuthType Basic and comment out all lines containing mod_auth_digest_apple.so by prepending a # character. Then restart the Websites service in Server.app and keep an eye on http://localhost/server-status: everything should be fine now, with no more connections stuck on “Closing connection”!
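The two substitutions can be scripted with sed. Here's a sketch demonstrated on a throwaway copy; the sample conf content is hypothetical, and for the real thing you would point CONF at /Library/Server/Web/Config/apache2, drive the seds with find to cover subdirectories, and run them with sudo:

```shell
# Demonstrate the fix on a temporary directory with a made-up conf file.
CONF=$(mktemp -d)
cat > "$CONF/0000_example.conf" <<'EOF'
LoadModule auth_digest_apple_module mod_auth_digest_apple.so
AuthType Digest
EOF
# Replace Digest authentication with Basic authentication ...
sed -i.bak 's/AuthType Digest/AuthType Basic/g' "$CONF"/*.conf
# ... and comment out every line loading the module that hangs.
sed -i.bak 's/^\([^#].*mod_auth_digest_apple\.so.*\)$/#\1/' "$CONF"/*.conf
cat "$CONF/0000_example.conf"
```

(`sed -i.bak` works on both the BSD sed that ships with OS X and GNU sed, leaving .bak backups behind.)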

Printer Sharing randomly stops working due to memorystatus_thread killing cupsd

Printers shared in OS X 10.9, 10.10 or 10.11 randomly stop being accessible from remote computers. The system log reports that the cupsd process was terminated by memorystatus_thread:

Oct 26 07:14:33 robert kernel[0]: memorystatus_thread: idle exiting pid 4901 [cupsd]

This can also be triggered manually by executing sudo launchctl stop org.cups.cupsd and fixed manually (and temporarily) by executing sudo launchctl start org.cups.cupsd.

As a workaround, you can edit the CUPS LaunchDaemon so that launchd listens on all interfaces and relaunches cupsd on demand whenever a remote computer attempts to connect. On 10.9 or 10.10:

sudo /usr/libexec/PlistBuddy -c "Delete Sockets:Listeners:0:SockNodeName" /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo /usr/libexec/PlistBuddy -c "Delete Sockets:Listeners:1:SockNodeName" /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo launchctl unload /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo launchctl load /System/Library/LaunchDaemons/org.cups.cupsd.plist

On 10.11, first disable System Integrity Protection in Recovery mode, then run

sudo /usr/libexec/PlistBuddy -c "Add Sockets:Listeners:1 Dict" /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo /usr/libexec/PlistBuddy -c "Add Sockets:Listeners:1:SockServiceName String" /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo /usr/libexec/PlistBuddy -c "Set Sockets:Listeners:1:SockServiceName ipp" /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo launchctl unload /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo launchctl load /System/Library/LaunchDaemons/org.cups.cupsd.plist

ARP and multicast packets lost with OpenVPN in tap mode

After upgrading our OpenVPN server VM from Debian 7 to Debian 8 (moving us from OpenVPN 2.2 to OpenVPN 2.3 and Linux kernel 3.2 to Linux kernel 3.16) and upgrading our virtualization from VMware ESXi 5.5 to ESXi 6.0 and moving the VM to a different host, the VPN got really unreliable: the VPN connection itself worked fine, but any connections established across the VPN were very slow to get established. Once they were established, everything worked fine and you could even create new connections to the same host across the VPN and they would be established quickly.

I wasn’t sure which of the many changes caused the issue, but luckily Wireshark quickly revealed the problem: as we are using OpenVPN in layer 2 mode (i.e. with tap interfaces), ARP packets are quite important. While I could see the ARP requests making it across the interface bridge from tap0 to eth0, the ARP replies went into eth0 and never made it to tap0. The server-side fix is easy: set the bridge's MAC address table ageing time to zero, which effectively disables the table and makes the bridge flood all packets to all ports:

brctl setageing br0 0

Now that ARP was working, I noticed that VPN clients still did not get IPv6 addresses. Evidently, the ICMPv6 multicasts weren't making it across the bridge either. To fix that, enable the multicast querier on the bridge:

echo 1 > /sys/devices/virtual/net/br0/bridge/multicast_querier

Update March 2016: a recent kernel update in Debian Jessie appears to have changed the multicast bridging behavior. I now need to disable the multicast querier instead:

echo 0 > /sys/devices/virtual/net/br0/bridge/multicast_querier
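Both bridge settings are lost on reboot. One way to persist them on Debian is to apply them whenever the bridge comes up, via post-up lines in /etc/network/interfaces; this is a sketch that assumes br0 is defined there with eth0 as its bridge port, so merge it into your actual stanza:

```
iface br0 inet static
    bridge_ports eth0
    post-up brctl setageing br0 0
    post-up echo 0 > /sys/devices/virtual/net/br0/bridge/multicast_querier
```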

Fixing OpenMPI over InfiniBand on Rocks Cluster Linux

We recently got a new small compute cluster at the university, running Rocks Clusters Linux 6.1.1, a CentOS 6 derivative. The nodes are interconnected via an InfiniBand network. Unfortunately, the default configuration of OpenMPI 1.6.2 in the HPC roll wastes a significant amount of performance: it communicates using TCP, which is run over a load-balanced combination of IP over InfiniBand and IP over Ethernet.

Switching to DMA over InfiniBand is simple: just run the following command on all compute nodes and the head node:

sed -i 's/add rocks-openmpi/add rocks-openmpi_ib/g' /etc/profile.d/rocks-hpc.*sh

Now, however, you get a message like this when you run an MPI job:

--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered.  You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

  Local host:              bee.icp.uni-stuttgart.de
  Registerable memory:     32768 MiB
  Total memory:            130967 MiB

Your MPI job will continue, but may be behave poorly and/or hang.
--------------------------------------------------------------------------

To fix that, run

echo "options mlx4_core log_num_mtt=24" >> /etc/modprobe.d/mlx4.conf

on all nodes and reboot. log_mtts_per_seg defaulted to 3 on our kernel and did not need tweaking. To check your current values, run

grep . /sys/module/mlx4_core/parameters/*mtt*
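The registerable-memory figure from the warning follows directly from these parameters: registerable bytes = 2^log_num_mtt × 2^log_mtts_per_seg × page size (4096 here). A quick sanity check; the 2^20 default for log_num_mtt is inferred from the 32768 MiB our warning reported, so verify it against your own module parameters:

```shell
# Registerable memory in MiB = 2^log_num_mtt * 2^log_mtts_per_seg * 4096 B / 2^20
mtt_mib() { echo $(( (1 << $1) * (1 << $2) * 4096 / 1024 / 1024 )); }
mtt_mib 20 3   # apparent default: matches the 32768 MiB from the warning
mtt_mib 24 3   # with log_num_mtt=24: 524288 MiB, ample for our 128 GiB nodes
```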

One warning message that still comes up when running an MPI job is the following:

--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-cma-1" [DAT_INVALID_ADDRESS:]. 
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
bee.icp.uni-stuttgart.de:30104:  open_hca: getaddr_netdev ERROR: No such device. Is ib1 configured?
bee.icp.uni-stuttgart.de:30104:  open_hca: device mthca0 not found
bee.icp.uni-stuttgart.de:30104:  open_hca: device mthca0 not found
DAT: library load failure: libdaplscm.so.2: cannot open shared object file: No such file or directory
DAT: library load failure: libdaplscm.so.2: cannot open shared object file: No such file or directory

As the uDAPL BTL was removed in newer OpenMPI versions anyway, this is fixed by running

echo "btl = ^udapl" >> /opt/openmpi/etc/openmpi-mca-params.conf

on all compute nodes and the head node.

So all in all, you can simply add the following lines to the <post> section of /export/rocks/install/site-profiles/6.1.1/nodes/extend-compute.xml and rebuild your compute node image:

echo "btl = ^udapl" >> /opt/openmpi/etc/openmpi-mca-params.conf
sed -i 's/add rocks-openmpi/add rocks-openmpi_ib/g' /etc/profile.d/rocks-hpc.*sh
echo "options mlx4_core log_num_mtt=24" >> /etc/modprobe.d/mlx4.conf
dracut -f 2.6.32-504.16.2.el6.x86_64 # may need to rebuild the initrd so it picks up the modprobe parameters

Letter to the editor: data retention ("Vorratsdatenspeicherung")

On April 15, 2015, Federal Minister of Justice H. Maas announced that he wanted to reintroduce data retention, a measure he himself had criticized for years and which had been struck down by the Federal Constitutional Court in 2010 and by the European Court of Justice in 2014. On April 27, 2015, the Süddeutsche Zeitung printed a letter to the editor I had written on the subject:

Electronic ankle monitors for everyone

How can the answer to the excessive and unlawful surveillance by Western intelligence agencies such as the NSA, BND, and GCHQ be to amass even larger quantities of data? In a constitutional state, surveillance measures may only be taken once there is concrete suspicion; otherwise there is a risk that citizens will restrict the use of their freedoms just to avoid turning up as a false positive in a dragnet search.

The proponents of data retention argue disingenuously: individual cases are generalized to prove the necessity of its reintroduction. Serious studies attesting that data retention provides no appreciable benefit to law enforcement are ignored. Even freely invented examples are used; see Sigmar Gabriel's claim that data retention helped solve the 2011 attacks in Norway. But if data retention does not actually make it easier to prosecute serious crimes, why do politicians demand it?

If the communications data are of no use, it stands to reason that the state is really after the cell-phone location data: continuous recording of the whereabouts of every mobile phone in Germany, in effect electronic ankle monitors for the entire population without anyone noticing them: a dream for every security politician, a nightmare for the fathers and mothers of the Grundgesetz.

The next two years of the grand coalition will be very hard for the SPD: all of its wishes from the coalition agreement have already been fulfilled, and now the CDU/CSU insists on the implementation of its own projects. Perhaps it is time for the SPD to simply terminate the coalition.

Michael Kuron, Frickenhausen

Scientific Article: Role of Geometrical Shape in Like-Charge Attraction of DNA

My first scientific article has been published. It is available at The European Physical Journal E.

Role of geometrical shape in like-charge attraction of DNA
Michael Kuron, Axel Arnold
Eur. Phys. J. E 38, 20 (2015)
DOI: 10.1140/epje/i2015-15020-9

While the journal is not open-access, I am allowed to provide a PDF version of the author’s accepted manuscript for download on my own website below:

Download

Multiple GPUs on unsupported Mac Pro

The first two generations of Apple's Mac Pro, the MacPro1,1 and MacPro2,1, do not officially support OS X releases later than 10.7.5. However, there is a modified EFI bootloader available that emulates the EFI64 interface on EFI32 machines. The original version, available on Google Code, supports OS X 10.9, and there's a newer one on GitHub that also handles OS X 10.10. You simply drop the new boot.efi into two places and add your board ID to the list of supported systems.

The resulting system works perfectly fine, which makes me wonder why Apple didn’t come up with a solution like this themselves. In any case, it extends the life of 2006 and 2007 Mac Pros beyond last year’s end-of-support for OS X Lion.

Since OS X 10.7 and later include graphics drivers that support not only the official Apple-supplied GPUs with EFI-compatible firmware but pretty much any off-the-shelf Nvidia or AMD GPU, GPUs have become quite easy to upgrade in Mac Pros (the classic cheese-grater tower, not the new black trash-can Mac Pro). The only thing you lose is the boot screen, so you still need to keep the original GPU around to debug the machine if it doesn't boot.

These upgraded GPUs work fine with the modified bootloader as well. However, if you install multiple GPUs (e.g. to drive more than two displays, or to develop CUDA code and run it in the debugger), only one of them will actually output video.

The solution to make multiple GPUs work in Macs running OS X versions they don’t officially support is surprisingly simple:

sudo /usr/libexec/PlistBuddy -c "Add :IOKitPersonalities:AppleGraphicsDevicePolicy:ConfigMap:Mac-F4208DA9 string none" /System/Library/Extensions/AppleGraphicsControl.kext/Contents/PlugIns/AppleGraphicsDevicePolicy.kext/Contents/Info.plist
sudo /usr/libexec/PlistBuddy -c "Add :IOKitPersonalities:AppleGraphicsDevicePolicy:ConfigMap:Mac-F4208DC8 string none" /System/Library/Extensions/AppleGraphicsControl.kext/Contents/PlugIns/AppleGraphicsDevicePolicy.kext/Contents/Info.plist
sudo touch /System/Library/Extensions

So here we are, running a 2006 MacPro1,1 with two EVGA Nvidia GT610 1GB cards, driving three Apple Cinema Displays.

Fixing OS X Server Push Mail

OS X Server 10.7 and later support push mail for iOS devices. This mechanism is neither based on IMAP IDLE (which iOS doesn’t support) nor Exchange ActiveSync (EAS), but on Apple’s Push Notification Service (APNS) infrastructure.

After setting up Mail using the GUI in OS X Server 10.10 Yosemite, I wondered why push didn’t work. From my understanding, it should happen automatically. The only indications something was wrong were the following lines in /Library/Logs/Mail/push_notify.log:

Feb 21 20:13:27 server.example.com push_notify[22848]: ApplePushServiceProvider: Warning: no device map found for 3F2504E0-4F89-41D3-9A0C-0305E82C3301

as well as XAPPLEPUSHSERVICE missing from the IMAP capabilities list:

$ openssl s_client -quiet -connect localhost:993
* OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE AUTH=PLAIN AUTH=LOGIN] Dovecot ready.

This is often the point where you have to break out the disassembler to find out what is wrong. Luckily however, Dovecot is open source, including the modifications Apple made to support APNS. Tracing through the code, the message above is logged if /Library/Server/Mail/Data/mta/guid_device_maps.plist does not contain a section for the user to which the incoming email is addressed. This section is written when Dovecot receives an XAPPLEPUSHSERVICE command. This command is probably only sent by a client when the XAPPLEPUSHSERVICE capability is reported by the server. The reason why the server didn’t report the capability was a simple incorrect (default) setting, easily fixable using

sudo serveradmin settings mail:imap:aps_topic_enabled = yes

Push mail immediately started working for me after this command, and the capability is correctly reported:

$ openssl s_client -quiet -connect localhost:993
* OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE XAPPLEPUSHSERVICE AUTH=PLAIN AUTH=LOGIN] Dovecot ready.