Debugging A Kernel Panic

06 Jan 2018 | Internals

Overview

This post goes step by step through the process of debugging a kernel panic on macOS Sierra across a network connection. The general process is described at length for different configurations in Technical Note TN2118. We’ll use a macOS host as the core dump server and a macOS guest virtual machine as the client (the debuggee).

For building and loading a kernel extension, check the previous tip.

Walkthrough

Configure the dump server

The first step in collecting kernel core dumps is to set up a kernel core dump server. We’ll use a machine running macOS Sierra.
A typical kernel dump is 200-500 MB; the size varies depending on the physical memory size and usage patterns.
The server needs to be accessible from the client over the network.
On the server we need a directory where the cores will be dropped, which needs to be writable by the program dumping the cores. According to the official guide, the following settings will do:

$ sudo mkdir /PanicDumps
$ sudo chown root:wheel /PanicDumps
$ sudo chmod 1777 /PanicDumps
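
Mode 1777 is the sticky bit (1000) plus read, write and execute for everyone (777), which `ls` shows as `drwxrwxrwt`. As a quick sanity check, the sketch below tests a directory for exactly that permission string. The helper name `dump_dir_ready` is made up for this post and is not part of any Apple tooling.

```shell
# Hypothetical helper: succeed only if DIR is a directory whose mode is 1777
# (shown as drwxrwxrwt by ls).
dump_dir_ready() {
    dir=$1
    [ -d "$dir" ] || return 1
    # ls -ld prints the permission string first; macOS may append '@' or '+'.
    case "$(ls -ld "$dir")" in
        drwxrwxrwt*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example: report on the directory created above.
if dump_dir_ready /PanicDumps; then
    echo "/PanicDumps is ready"
else
    echo "/PanicDumps is missing or has the wrong mode"
fi
```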

The next step is to activate kdumpd (the kernel dump server process):

$ sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.kdumpd.plist

By default this will try to dump the cores to the /PanicDumps folder. If you've used a different folder name in the previous step, update its .plist property file.

Next, check that the dump server was started correctly:

$ sudo launchctl list | grep kdump
-	0	com.apple.kdumpd

The default port, also defined in the com.apple.kdumpd.plist file, is 1069. Just to make sure, we can verify the port is open:

$ netstat  -an  | grep 1069
udp4       0      0  *.1069                 *.*
udp6       0      0  *.1069                 *.*
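
If you want to script this check, a small filter over the netstat output is enough. This is only a sketch: `has_udp_listener` is a hypothetical helper name, and the pattern assumes the macOS `netstat -an` layout shown above, where the local address is printed as `*.1069`.

```shell
# Hypothetical helper: read `netstat -an` output on stdin and succeed if a
# UDP socket is bound to PORT (macOS prints the local address as "*.PORT").
has_udp_listener() {
    port=$1
    grep -Eq "^udp[46]*[[:space:]].*[.]${port}[[:space:]]"
}

# Example using the two lines shown above.
sample='udp4       0      0  *.1069                 *.*
udp6       0      0  *.1069                 *.*'
if printf '%s\n' "$sample" | has_udp_listener 1069; then
    echo "port 1069 is open"
fi
```

On the server itself the live check would be `netstat -an | has_udp_listener 1069`.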

Configure the client (the target machine)

On the client machine we need to modify the NVRAM boot-args to include two arguments:

The debug flag, which must be set to a combination of the flags described here. We’ll use 0x444, which is equivalent to DB_KERN_DUMP_ON_PANIC|DB_ARP|DB_NMI.
The IP address of the dump server in the _panicd_ip variable.

Let’s set both using the nvram command:

$ sudo nvram boot-args='debug=0x444 _panicd_ip=192.168.136.1'
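
The debug value is just the bitwise OR of the individual flags. As a quick way to double-check the arithmetic, the shell can compute it from the flag values listed in TN2118 (DB_NMI is 0x4, DB_ARP is 0x40 and DB_KERN_DUMP_ON_PANIC is 0x400). The `debug_flags` function name is made up for this example.

```shell
# Flag values as listed in TN2118.
DB_NMI=0x4
DB_ARP=0x40
DB_KERN_DUMP_ON_PANIC=0x400

# Hypothetical helper: OR the three flags together and print the result in hex.
debug_flags() {
    printf '0x%x\n' "$(( $DB_KERN_DUMP_ON_PANIC | $DB_ARP | $DB_NMI ))"
}

debug_flags   # prints 0x444
```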

A restart is needed, since we’ve modified boot arguments.

$ sudo reboot

After the reboot, verify the parameters have been set correctly:

$ sysctl kern.bootargs
kern.bootargs: debug=0x444 _panicd_ip=192.168.136.1
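
When the boot-args line gets longer, it can help to pull out a single value instead of reading the whole string. A minimal sketch, assuming the `sysctl kern.bootargs` output format shown above; `bootarg_value` is a hypothetical helper name.

```shell
# Hypothetical helper: read `sysctl kern.bootargs` output on stdin and print
# the value of the first KEY=value pair whose key matches KEY.
bootarg_value() {
    key=$1
    awk -v k="$key" '{
        for (i = 1; i <= NF; i++) {
            n = index($i, "=")
            if (n && substr($i, 1, n - 1) == k) {
                print substr($i, n + 1)
                exit
            }
        }
    }'
}

# Example using the output shown above.
line='kern.bootargs: debug=0x444 _panicd_ip=192.168.136.1'
printf '%s\n' "$line" | bootarg_value debug        # prints 0x444
printf '%s\n' "$line" | bootarg_value _panicd_ip   # prints 192.168.136.1
```

On the client the live check would be `sysctl kern.bootargs | bootarg_value debug`.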

Trigger a kernel panic on the client using the following dtrace trick. Note that System Integrity Protection (SIP) needs to be disabled for this to work:

$ sudo dtrace -w -n "BEGIN{ panic(); }"

If everything went well, a panic dump should start to be transferred to the dump server. Wait until the .gz file is written completely:

$ ls -alh /PanicDumps
[..]
-rw-rw----   1 nobody  wheel   126M 21 Mar 23:59 core-xnu-3789.72.11-192.168.136.130-a5001516.gz
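
One rough way to tell when the transfer has finished is to wait until the file stops growing and then let gzip test its integrity. This is a heuristic sketch, not something kdumpd provides: `wait_for_core` is a hypothetical helper, the 5 second default interval is an assumption, and a stalled transfer looks the same as a finished one until the `gzip -t` step rejects the truncated file.

```shell
# Hypothetical helper: block until FILE has the same size on two consecutive
# checks, then succeed only if it passes gzip's integrity test.
wait_for_core() {
    file=$1
    interval=${2:-5}   # seconds between size checks (assumed default)
    prev=-1
    while :; do
        [ -f "$file" ] || return 1
        # The arithmetic expansion strips the padding some wc versions add.
        size=$(( $(wc -c < "$file") ))
        [ "$size" -eq "$prev" ] && break
        prev=$size
        sleep "$interval"
    done
    gzip -t "$file"
}

# Usage on the dump server:
#   wait_for_core /PanicDumps/core-xnu-3789.72.11-192.168.136.130-a5001516.gz
```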

Analyse the core dump

We can analyse the core dump using lldb:

$ gunzip core-xnu-3789.72.11-192.168.136.130-a5001516.gz

$ file core-xnu-3789.72.11-192.168.136.130-a5001516
core-xnu-3789.72.11-192.168.136.130-a5001516: Mach-O 64-bit core x86_64

$ lldb -c core-xnu-3789.72.11-192.168.136.130-a5001516
(lldb) target create --core "core-xnu-3789.72.11-192.168.136.130-a5001516"
Kernel UUID: B814CFE3-B6F6-304F-BFB9-C22EFC948A53
Load Address: 0xffffff8017400000
WARNING: Unable to locate kernel binary on the debugger system.
Core file '/PanicDumps/core-xnu-3789.72.11-192.168.136.130-a5001516' (x86_64) was loaded.

As hinted in the warning message, to get access to symbols and lldbmacros, we would also need the kernel from the KDK.
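
As a sketch of that next step: Kernel Debug Kits normally install under /Library/Developer/KDKs, with the kernel binary inside each kit at System/Library/Kernels/kernel. The hypothetical helper below (`list_kdk_kernels` is a name made up for this post) lists the candidates, and the chosen kernel can then be passed to lldb together with the core. The kit has to match the build that panicked, otherwise the UUID reported by lldb will not match.

```shell
# Hypothetical helper: print the kernel binary of every KDK found under ROOT
# (defaults to the usual KDK install location).
list_kdk_kernels() {
    root=${1:-/Library/Developer/KDKs}
    for k in "$root"/*.kdk/System/Library/Kernels/kernel; do
        if [ -f "$k" ]; then
            printf '%s\n' "$k"
        fi
    done
    return 0
}

list_kdk_kernels

# Usage, once the matching kernel path is known:
#   lldb -c core-xnu-3789.72.11-192.168.136.130-a5001516 /path/to/matching/kernel
```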

Housekeeping

Remove the folder containing the dumps when the analysis is done.
Disable the dump server:

$ sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.kdumpd.plist

If you’ve disabled the firewall to allow communication between the server and the client, or added any permissive rules, make sure to remove them and re-enable the firewall.
Re-enable System Integrity Protection on the client machine.

References

Technical Note TN2118 - Kernel Core Dumps
