Debugging A Kernel Panic
Overview
This post walks step by step through debugging a kernel panic on macOS Sierra over a network connection. The general process is described at length, for different configurations, in Technical Note TN2118. We’ll use a macOS host as the core dump server and a macOS guest virtual machine as the client (the debuggee).
For building and loading a kernel extension, check the previous tip.
Walkthrough
Configure the dump server
- The first step in collecting kernel core dumps is to set up a kernel core dump server. We’ll use a macOS Sierra machine.
- Kernel dumps are typically 200-500 MB, but the size varies depending on the physical memory size and usage patterns.
- The server needs to be accessible from the client over the network.
- On the server we need a directory where the cores will be dropped, which needs to be writable by the program dumping the cores. According to the official guide, the following settings will do:
$ sudo mkdir /PanicDumps
$ sudo chown root:wheel /PanicDumps
$ sudo chmod 1777 /PanicDumps
- The next step is to activate kdumpd (the kernel dump server process):
$ sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.kdumpd.plist
By default, kdumpd will try to dump the cores to the /PanicDumps folder. If you used a different folder name in the previous step, update the com.apple.kdumpd.plist property file accordingly.
- Next, check that the dump server was started correctly:
$ sudo launchctl list | grep kdump
- 0 com.apple.kdumpd
- The default port, also defined in the com.apple.kdumpd.plist file, is 1069. Just to make sure, we can verify that the port is open:
$ netstat -an | grep 1069
udp4 0 0 *.1069 *.*
udp6 0 0 *.1069 *.*
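Before moving on to the client, the server-side requirements above (a world-writable directory with the sticky bit set, and enough room for a dump of a few hundred MB) can be checked with a small helper. This is only a sketch: check_dump_dir is a hypothetical name, and the 512 MB default threshold is an assumption based on the typical dump sizes mentioned earlier, not a value from the official guide.

```shell
# Hypothetical helper: returns 0 if DIR looks usable as a kdumpd target.
# usage: check_dump_dir DIR [MIN_FREE_KB]
check_dump_dir() {
    dir=$1
    min_kb=${2:-524288}   # assumed threshold: 512 MB, expressed in KB

    [ -d "$dir" ] || { echo "$dir: not a directory" >&2; return 1; }

    # chmod 1777 on a directory shows up as drwxrwxrwt in the first 10 columns of ls -ld
    mode=$(ls -ld "$dir" | cut -c1-10)
    [ "$mode" = "drwxrwxrwt" ] || { echo "$dir: mode is $mode, expected drwxrwxrwt" >&2; return 1; }

    # df -P prints one POSIX-format line per filesystem; column 4 is the free space in KB
    free_kb=$(df -Pk "$dir" | awk 'NR==2 { print $4 }')
    [ "$free_kb" -ge "$min_kb" ] || { echo "$dir: only ${free_kb} KB free" >&2; return 1; }
}
```

On the server, running check_dump_dir /PanicDumps after the steps above should return 0 without printing anything. It does not check that kdumpd itself is running; the launchctl and netstat commands above cover that.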
Configure the client (the target machine)
On the client machine we need to modify the NVRAM boot-args to include two arguments:
- The debug flag, which must be set to a combination of the debug flags listed in TN2118. We’ll use 0x444, which is equivalent to DB_KERN_DUMP_ON_PANIC|DB_ARP|DB_NMI.
- The IP address of the dump server in the _panicd_ip variable.
- Let’s set both using the nvram command:
$ sudo nvram boot-args='debug=0x444 _panicd_ip=192.168.136.1'
- A restart is needed, since we’ve modified boot arguments.
$ sudo reboot
- After the reboot, verify that the parameters have been set correctly:
$ sysctl kern.bootargs
kern.bootargs: debug=0x444 _panicd_ip=192.168.136.1
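To double-check the debug value, we can rebuild it from the individual flag bits and compare it with what the boot-args string reports. The sketch below assumes the bit values from the debug flags table in TN2118 (0x4, 0x40 and 0x400 for the three flags used here). bootarg_value is a hypothetical helper, not part of macOS.

```shell
# Bit values of the three flags used in this post (as listed in TN2118).
DB_NMI=0x4
DB_ARP=0x40
DB_KERN_DUMP_ON_PANIC=0x400

# OR the bits together and print the result the way boot-args expects it.
debug_flags=$(( $DB_KERN_DUMP_ON_PANIC | $DB_ARP | $DB_NMI ))
printf 'debug=0x%x\n' "$debug_flags"

# Hypothetical helper: print the value of KEY from a boot-args style string.
# usage: bootarg_value KEY "key1=val1 key2=val2 ..."
bootarg_value() {
    for arg in $2; do
        case "$arg" in
            "$1"=*) printf '%s\n' "${arg#*=}"; return 0 ;;
        esac
    done
    return 1   # key not present
}
```

On the client, the string to inspect would come from sysctl -n kern.bootargs, for example: bootarg_value _panicd_ip "$(sysctl -n kern.bootargs)".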
- Trigger a kernel panic on the client using the following dtrace trick. Note that SIP needs to be disabled for this to work:
$ sudo dtrace -w -n "BEGIN{ panic(); }"
- If everything went well, the panic dump should start transferring to the dump server. On the server, wait until the .gz file is written completely:
$ ls -alh /PanicDumps
[..]
-rw-rw---- 1 nobody wheel 126M 21 Mar 23:59 core-xnu-3789.72.11-192.168.136.130-a5001516.gz
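Watching ls output does not really tell us when the transfer has finished. One way to check, assuming the file is an ordinary gzip stream (which the .gz extension and the later gunzip step suggest), is to let gzip test the archive: a truncated stream fails the integrity test. The helper names and the 5-second polling interval below are arbitrary choices for this sketch.

```shell
# Hypothetical helper: succeeds only if FILE passes gzip's integrity test.
core_is_complete() {
    gzip -t "$1" 2>/dev/null
}

# Poll until the archive is complete.
# usage: wait_for_core FILE [INTERVAL_SECONDS]
wait_for_core() {
    file=$1
    interval=${2:-5}
    until core_is_complete "$file"; do
        sleep "$interval"
    done
}
```

Given the ownership and mode shown in the listing above, the check may need to run as a user who is allowed to read the file. Note that wait_for_core loops forever if the transfer is aborted, so in practice a timeout would be a sensible addition.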
Analyse the core dump
- We can analyse the core dump using lldb:
$ gunzip core-xnu-3789.72.11-192.168.136.130-a5001516.gz
$ file core-xnu-3789.72.11-192.168.136.130-a5001516
core-xnu-3789.72.11-192.168.136.130-a5001516: Mach-O 64-bit core x86_64
$ lldb -c core-xnu-3789.72.11-192.168.136.130-a5001516
(lldb) target create --core "core-xnu-3789.72.11-192.168.136.130-a5001516"
Kernel UUID: B814CFE3-B6F6-304F-BFB9-C22EFC948A53
Load Address: 0xffffff8017400000
WARNING: Unable to locate kernel binary on the debugger system.
Core file '/PanicDumps/core-xnu-3789.72.11-192.168.136.130-a5001516' (x86_64) was loaded.
- As hinted in the warning message, to get access to symbols and lldbmacros, we would also need the kernel from the KDK.
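Picking the right KDK means knowing which kernel build produced the core. Judging by the one file name shown above, the name seems to carry the xnu version and the client IP address, so a small helper can pull them out. The layout is an assumption inferred from that single example, and parse_core_name is a hypothetical name.

```shell
# Assumed layout, inferred from the example in this post:
#   core-xnu-<xnu version>-<client IPv4>-<hex suffix>[.gz]
# usage: parse_core_name FILE
parse_core_name() {
    name=$(basename "$1" .gz)            # drop directory and optional .gz
    rest=${name#core-xnu-}
    [ "$rest" != "$name" ] || return 1   # prefix missing: not a recognised name
    xnu_version=${rest%%-*}              # text up to the first dash
    rest=${rest#*-}
    client_ip=${rest%%-*}                # text up to the next dash
    printf 'xnu=%s client=%s\n' "$xnu_version" "$client_ip"
}
```

The xnu version can then be compared with the output of uname -v on the client. The Kernel UUID printed by lldb remains the more reliable way to confirm that a kernel binary from a KDK matches the core.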
Housekeeping
- Remove the folder containing the dumps when the analysis is done.
- Disable the dump server:
$ sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.kdumpd.plist
- If you’ve disabled the firewall to allow communication between the server and the client, or added any permissive rules, make sure to remove them and re-enable the firewall.
- Re-enable System Integrity Protection on the client machine.