Friday, October 31, 2014

getting kernel crashdumps for hung machines

Debugging hung machines can be a bit tricky. Here I'll document methods to trigger a crashdump when these hangs occur.

What exactly does it mean when a machine 'hangs' or 'freezes-up'? More information can be found in the kernel documentation [1], but overall there are a few types of hangs A "Soft Lock-Up" is when the kernel loops in kernel mode for a duration without giving tasks a chance to run. A "Hard Lock-Up" is when the kernel loops in kernel mode for a duration without letting other interrupts run. In addition a "Hung Task" is when a userspace task has been blocking for a duration. Thankfully the kernel has options to panic on these conditions and thus create a proper crashdump.

In order to setup crashdump, on an Ubuntu machine we can do the following. First we need to install and setup crashdump, more info can be found here [2].
sudo apt-get install linux-crashdump
Select NO unless you really would like to use kexec for your reboots.

Next we need to enable it since by default it is disabled.
sudo sed -i 's/USE_KDUMP=0/USE_KDUMP=1/' /etc/default/kdump-tools

Reboot to ensure the kernel cmdline options are properly setup
sudo reboot

After reboot run the following:
sudo kdump-config show

If this command shows 'ready to dump', then we can test a crash to ensure kdump has enough memory and will dump properly. This command will crash your computer, so hopefully you are doing this on a test machine.
echo c | sudo tee /proc/sysrq-trigger

The machine will reboot and you'll see a crash in /var/crash.

All of this is already documented in [2], so now we need to enable panics for hang and lockup conditions. Now we need to enable crashing on lockups, so we'll enable many cases at once.

Edit /etc/default/grub and change this line to the following:
GRUB_CMDLINE_LINUX="nmi_watchdog=panic hung_task_panic=1 softlockup_panic=1 unknown_nmi_panic"

In addition you could enable these via /proc/sys/kernel or sysctl. For more information about these parameters there is documentation here [3].

If you've used the command line change, update grub and then reboot.
sudo update-grub && sudo reboot

Now your machine should crash when it locks up, and you'll get a nice crashdump to analyze. If you want to test such a setup I wrote a module [4] that induces a hang to see if this works properly.

Happy hacking.

  1. https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt
  2. https://wiki.ubuntu.com/Kernel/CrashdumpRecipe
  3. https://www.kernel.org/doc/Documentation/kernel-parameters.txt
  4. https://github.com/arges/hanger


12 comments:

  1. Thanks for sharing article like this. The way you have stated everything above is quite awesome. Keep blogging like this. Thanks a lot.


    Dot Net Training Institute in Chennai

    ReplyDelete
    Replies
    1. Machine Learning Projects for Final Year machine learning projects for final year

      Deep Learning Projects assist final year students with improving your applied Deep Learning skills rapidly while allowing you to investigate an intriguing point. Furthermore, you can include Deep Learning projects for final year into your portfolio, making it simpler to get a vocation, discover cool profession openings, and Deep Learning Projects for Final Year even arrange a more significant compensation.

      Python Training in Chennai Python Training in Chennai Angular Training Project Centers in Chennai

      Delete
  2. Very informative and It was an awesome post. I love reading your fantastic content. Thanks for sharing it with us. We are so greatful to your sharing.
    Change MAC Address v3.1

    ReplyDelete
  3. I really enjoyed very much with this article here. Really it is an amazing article I had ever read. thanks for sharing it.
    android training in chennai

    ReplyDelete
  4. I really enjoyed very much with this article here. Really it is an amazing article I had ever read. thanks for sharing it. android training in chennai

    ReplyDelete