Tuesday, April 7, 2015

this blog has moved

I will not be updating this blog in the future. I've migrated to the following:
http://chrisarges.net/

Please update your bookmarks!

Thursday, November 13, 2014

using gcov/lcov with the linux kernel

Enable the following in your kernel configuration:
CONFIG_GCOV_KERNEL=y
CONFIG_GCOV_PROFILE_ALL=y
CONFIG_GCOV_FORMAT_AUTODETECT=y

Build the kernel and install the headers, debug, and image packages on your machine. Note that there may be issues if you collect coverage data on a different machine from the one that built the kernel, so consult the official documentation for additional information [1].

Reboot the machine and use the following to see if things work:
sudo apt-get install lcov gcc
SRC_PATH=~/src/linux # or whatever your path to your kernel tree is
gcov ${SRC_PATH}/kernel/sched/core.c -o /sys/kernel/debug/gcov/${SRC_PATH}/kernel/sched/
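
The -o argument above works because the kernel mirrors the absolute path of the build tree under the gcov directory in debugfs. As a minimal sketch of how that path is composed (the helper name gcov_object_dir is made up for illustration, and it assumes debugfs is mounted at /sys/kernel/debug):

```python
import posixpath

GCOV_ROOT = "/sys/kernel/debug/gcov"  # assumes debugfs is mounted in the usual place


def gcov_object_dir(src_root, source_file):
    """Return the debugfs directory holding the coverage data for source_file.

    src_root is the absolute path of the tree the running kernel was built
    from; source_file is relative to it, e.g. "kernel/sched/core.c".
    """
    src_root = posixpath.normpath(src_root)
    if not posixpath.isabs(src_root):
        raise ValueError("src_root must be an absolute path")
    rel_dir = posixpath.dirname(source_file)
    # The counters live under GCOV_ROOT followed by the absolute build path.
    return posixpath.join(GCOV_ROOT, src_root.lstrip("/"), rel_dir)


print(gcov_object_dir("/home/user/src/linux", "kernel/sched/core.c"))
```

The example path /home/user/src/linux is only a placeholder for wherever your tree lives.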

Obviously this would be useful for quick tests, but perhaps we want to test coverage after a test, and display results graphically. We can use lcov [2] to accomplish this using the following:
# reset counters
sudo lcov --zerocounters
# run test
./test.sh
# generate coverage info and generate webpage
sudo lcov -c -o kerneltest.info
sudo genhtml -o /var/www/html kerneltest.info
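
If you only want numbers rather than a webpage, the tracefile written by lcov -c is plain text and easy to summarize. The following is a rough sketch that reads the SF, DA, and end_of_record records of that format; the function name and the sample data are invented for illustration, and it ignores function and branch records entirely.

```python
def summarize_tracefile(text):
    """Return {source_file: (lines_hit, lines_found)} from lcov tracefile text."""
    summary = {}
    current = None
    hit = found = 0
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("SF:"):
            # SF:<path> opens the record for one source file.
            current, hit, found = line[3:], 0, 0
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            count = int(line[3:].split(",")[1])
            found += 1
            if count > 0:
                hit += 1
        elif line == "end_of_record" and current is not None:
            summary[current] = (hit, found)
            current = None
    return summary


# Made-up sample data, standing in for the contents of kerneltest.info.
sample = """TN:
SF:/home/user/src/linux/kernel/sched/core.c
DA:10,4
DA:11,0
DA:12,1
end_of_record
"""
for path, (hit, found) in summarize_tracefile(sample).items():
    print("%s: %d/%d lines" % (path, hit, found))
```
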
  1. https://www.kernel.org/doc/Documentation/gcov.txt
  2. http://ltp.sourceforge.net/documentation/how-to/UsingCodeCoverage.pdf

Friday, October 31, 2014

getting kernel crashdumps for hung machines

Debugging hung machines can be a bit tricky. Here I'll document methods to trigger a crashdump when these hangs occur.

What exactly does it mean when a machine 'hangs' or 'freezes up'? More information can be found in the kernel documentation [1], but overall there are a few types of hangs. A "Soft Lock-Up" is when the kernel loops in kernel mode for a duration (more than 20 seconds by default) without giving other tasks a chance to run. A "Hard Lock-Up" is when the kernel loops in kernel mode for a duration (more than 10 seconds by default) without letting other interrupts run. In addition, a "Hung Task" is when a task has been blocked in uninterruptible sleep for a duration (120 seconds by default). Thankfully the kernel has options to panic on these conditions and thus create a proper crashdump.

On an Ubuntu machine, first install and set up crashdump; more information can be found here [2].
sudo apt-get install linux-crashdump
Select NO unless you really would like to use kexec for your reboots.

Next we need to enable it since by default it is disabled.
sudo sed -i 's/USE_KDUMP=0/USE_KDUMP=1/' /etc/default/kdump-tools

Reboot to ensure the kernel cmdline options are properly setup
sudo reboot

After reboot run the following:
sudo kdump-config show

If this command shows 'ready to kdump', then we can test a crash to ensure kdump has enough memory and will dump properly. This command will crash your computer, so hopefully you are doing this on a test machine.
echo c | sudo tee /proc/sysrq-trigger

The machine will reboot and you'll see a crash in /var/crash.

All of this is already documented in [2]. Now we need to enable panics for hang and lockup conditions, so we'll enable many cases at once.

Edit /etc/default/grub and change this line to the following:
GRUB_CMDLINE_LINUX="nmi_watchdog=panic hung_task_panic=1 softlockup_panic=1 unknown_nmi_panic"
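
If GRUB_CMDLINE_LINUX already contains options you want to keep, you would append the panic parameters rather than replace the whole line. A small sketch of that merge follows; merge_cmdline is a hypothetical helper, it only rewrites a single line of text, and it does not touch /etc/default/grub itself.

```python
import re

PANIC_PARAMS = ["nmi_watchdog=panic", "hung_task_panic=1",
                "softlockup_panic=1", "unknown_nmi_panic"]


def merge_cmdline(grub_line, extra=PANIC_PARAMS):
    """Append any missing parameters to a GRUB_CMDLINE_LINUX="..." line."""
    match = re.match(r'^GRUB_CMDLINE_LINUX="(.*)"\s*$', grub_line)
    if match is None:
        raise ValueError("not a GRUB_CMDLINE_LINUX line: %r" % grub_line)
    params = match.group(1).split()
    for param in extra:
        # Keep existing parameters and their order; only add what is missing.
        if param not in params:
            params.append(param)
    return 'GRUB_CMDLINE_LINUX="%s"' % " ".join(params)


print(merge_cmdline('GRUB_CMDLINE_LINUX="console=ttyS0 hung_task_panic=1"'))
```

The console=ttyS0 option in the example is just a stand-in for whatever is already on your command line.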

In addition you could enable these via /proc/sys/kernel or sysctl. For more information about these parameters there is documentation here [3].
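
For the runtime route, each sysctl name maps to a file under /proc/sys. The sketch below shows that mapping and prints the current values; it assumes the three sysctl names listed exist on your kernel (a missing one is reported as None), and the helper names are made up. The hard lockup case is left out because the post enables it with a boot parameter.

```python
PANIC_SYSCTLS = ["kernel.hung_task_panic", "kernel.softlockup_panic",
                 "kernel.unknown_nmi_panic"]


def sysctl_path(name):
    """Map a sysctl name such as kernel.hung_task_panic to its /proc/sys file."""
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        with open(sysctl_path(name)) as handle:
            return handle.read().strip()
    except IOError:
        return None


for name in PANIC_SYSCTLS:
    print("%s = %s" % (name, read_sysctl(name)))
```

Writing 1 to the same files as root (or using sysctl -w) enables the panic until the next reboot.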

If you've used the command line change, update grub and then reboot.
sudo update-grub && sudo reboot

Now your machine should crash when it locks up, and you'll get a nice crashdump to analyze. If you want to test such a setup I wrote a module [4] that induces a hang to see if this works properly.

Happy hacking.

  1. https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt
  2. https://wiki.ubuntu.com/Kernel/CrashdumpRecipe
  3. https://www.kernel.org/doc/Documentation/kernel-parameters.txt
  4. https://github.com/arges/hanger


Tuesday, July 1, 2014

using ktest.pl with ubuntu

Bisecting the kernel is one of those tasks that's time-consuming and error-prone. Ktest.pl is a script that lives in the linux kernel source tree [1] that helps to automate this process. The script is extremely extensible, and as such it takes time to understand which variables need to be set and where. In this post, I'll go over how to perform a kernel bisection using a VM as the target machine. In this example I'm using 'ubuntu' as the VM name.

First ensure you have all dependencies correctly setup:
sudo apt-get install libvirt-bin qemu-kvm cpu-checker virtinst uvtool git
sudo apt-get build-dep linux-image-`uname -r`

Ensure kvm works:
kvm-ok

In this example we are using uvtool to create VMs using cloud images, but you could just as easily use a preseed install or a manual install via an ISO.
First sync the cloud image:
uvt-simplestreams-libvirt sync release=trusty arch=amd64

Clone the kernel git repository:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux.git

Copy ktest.pl outside of the linux kernel tree (bisecting changes the files in the tree, including the script itself, so this way it remains constant):
cd linux.git
cp tools/testing/ktest/ktest.pl ..
cp -r tools/testing/ktest/examples/include ..
cd ..

Create directories for script:
mkdir configs build
mkdir configs/ubuntu build/ubuntu

Get an appropriate config for the kernels you are bisecting and ensure it works reasonably with 'make oldconfig' on the kernel version you are using. For example, if we are bisecting v3.4 kernels, we can take an Ubuntu v3.5 series kernel config and run yes '' | make oldconfig to ensure it is very close. Put this config into configs/ubuntu/config-min. For convenience I have put a config that works for this example here:
http://people.canonical.com/~arges/amd64-config.flavour.generic

Create the VM, ensure you have ssh keys setup on your local machine first:
uvt-kvm create ubuntu release=trusty arch=amd64 --password ubuntu --unsafe-caching

Ensure the VM can be ssh'ed to via 'ssh ubuntu@ubuntu':
echo "$(uvt-kvm ip ubuntu) ubuntu" | sudo tee -a /etc/hosts

SSH into VM with ssh ubuntu@ubuntu.

Set up the initial target kernel to boot on the VM:
sudo cp /boot/vmlinuz-`uname -r` /boot/vmlinuz-test
sudo cp /boot/initrd.img-`uname -r` /boot/initrd.img-test

Ensure SUBMENUs are disabled on the VM, as the grub2 detection script in ktest.pl fails with submenus, and update grub.
echo "GRUB_DISABLE_SUBMENU=y" | sudo tee -a /etc/default/grub
sudo update-grub

Ensure we have a serial console on the VM with /etc/init/ttyS0.conf, and ensure that agetty automatically logs in as root. If you created the VM with the uvt-kvm command above you can do the following:
sudo sed -i 's/exec \/sbin\/getty/exec \/sbin\/getty -a root/' /etc/init/ttyS0.conf

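Ensure that /root/.ssh/authorized_keys on the VM contains your host user's public key so that ssh root@ubuntu works automatically. Cloud images prefix the key in that file with options that refuse root logins; if you are using the above commands you can strip that prefix with: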
sudo sed -i 's/^.*ssh-rsa/ssh-rsa/g' /root/.ssh/authorized_keys

Finally add a test case to /home/ubuntu/test.sh inside of the ubuntu VM. Ensure it is executable.
#!/bin/bash
# Make a unique string
STRING=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c 32)
> /var/log/syslog
echo $STRING > /dev/kmsg
# Wait for things to settle down...
sleep 5
grep $STRING /var/log/syslog
# This should return 0.

Now exit out of the machine and create the following configuration file for ktest.pl called ubuntu.conf. This will bisect from v3.4 (good) to v3.5-rc1 (bad), and run the test case that we put into the VM.
# Setup default machine
MACHINE = ubuntu

# Use virsh to read the serial console of the guest
CONSOLE = virsh console ${MACHINE}
CLOSE_CONSOLE_SIGNAL = KILL

# Include defaults from upstream
INCLUDE include/defaults.conf
DEFAULTS OVERRIDE

# Make sure we load up our machine to speed up builds
BUILD_OPTIONS = -j8

# This is required for restarting VMs
POWER_CYCLE = virsh destroy ${MACHINE}; sleep 5; virsh start ${MACHINE}

# Use the defaults that update-grub spits out
GRUB_FILE = /boot/grub/grub.cfg
GRUB_MENU = Ubuntu, with Linux test
GRUB_REBOOT = grub-reboot
REBOOT_TYPE = grub2

DEFAULTS

# Do a simple bisect
TEST_START
RUN_TEST = ${SSH} /home/ubuntu/test.sh
TEST_TYPE = bisect
BISECT_GOOD = v3.4
BISECT_BAD = v3.5-rc1
CHECKOUT = origin/master
BISECT_TYPE = test
TEST = ${RUN_TEST}
BISECT_CHECK = 1

Now we should be ready to run the bisection (this will take many, many hours depending on the speed of your machine):
./ktest.pl ubuntu.conf
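
To get a feel for how long this takes, note that a bisection needs roughly log2(N) build-and-boot cycles for N commits. Here is a back-of-the-envelope sketch; the commit count and the minutes per step are invented numbers, the function name is hypothetical, and real bisections can deviate because of merges and skipped commits.

```python
def bisect_steps(num_commits, check_endpoints=True):
    """Estimate how many kernels a bisection over num_commits builds and boots."""
    if num_commits < 1:
        raise ValueError("need at least one commit")
    # Each step halves the candidate range: ceil(log2(n)) via integer arithmetic.
    steps = (num_commits - 1).bit_length()
    # BISECT_CHECK = 1 additionally tests the good and the bad endpoint first.
    return steps + (2 if check_endpoints else 0)


commits = 10000        # hypothetical; count yours with: git rev-list --count v3.4..v3.5-rc1
minutes_per_step = 30  # hypothetical build + boot + test time
steps = bisect_steps(commits)
print("%d steps, about %.1f hours" % (steps, steps * minutes_per_step / 60.0))
```
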
  1. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/tools/testing/ktest?id=HEAD

Wednesday, June 25, 2014

using CRIU to checkpoint a kernel build

CRIU stands for Checkpoint/Restore In Userspace [1]. As the criu package should be landing in utopic soon, I wanted to test drive it to see how it handles.

I thought of an interesting example: being in the middle of a linux kernel build when a security update needs to be installed and the machine rebooted. While most of us could probably just reboot and rebuild, why not checkpoint it and save the progress, then restore after the system update? I admit it's not the most useful example, but it is pretty cool nonetheless.
sudo apt-get install criu
# start build; save the PID for later
cd linux; make clean; make defconfig
make & echo $! > ~/make.pid
# enter a clean directory that isn't tmp and dump state
mkdir ~/cr-build && cd $_
sudo criu dump --shell-job -t $(cat ~/make.pid)
# install and reboot machine
# restore your build
cd ~/cr-build
sudo criu restore --shell-job -t $(cat ~/make.pid)

And you're building again!
  1. http://criu.org/Main_Page

Monday, June 16, 2014

manually deploying openstack with a virtual maas on ubuntu trusty (part 2)

In the previous post, I went over how to setup a virtual MAAS environment using KVM [1]. Here I will explain how to setup Juju for use with this environment.

For this setup, we’ll use the maas-server as the juju client to interact with the cluster.

This guide was very useful:
https://maas.ubuntu.com/docs/juju-quick-start.html

Update to the latest stable tools:
sudo apt-add-repository ppa:juju/stable
sudo apt-get update
Next we want to setup juju on the host machine.
sudo apt-get install juju-core
Create juju environment file.
juju init
Generate a specific MAAS API key by using the following link:
http://192.168.100.10/MAAS/account/prefs/

Write the following to ~/.juju/environments.yaml, replacing ‘maas api key’ with what was generated above:
default: vmaas
environments:
  vmaas:
    type: maas
    maas-server: 'http://192.168.100.10/MAAS'
    maas-oauth: '<maas api key>'
    admin-secret: ubuntu # or something generated with uuid
    default-series: trusty
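
As the comment on admin-secret suggests, a generated value is preferable to a fixed one. The following sketch fills in the same file with a random secret; render_environment and TEMPLATE are illustrative names only, and the output still has to be saved to ~/.juju/environments.yaml by hand.

```python
import uuid

TEMPLATE = """default: vmaas
environments:
  vmaas:
    type: maas
    maas-server: '{server}'
    maas-oauth: '{api_key}'
    admin-secret: {secret}
    default-series: trusty
"""


def render_environment(api_key, server="http://192.168.100.10/MAAS", secret=None):
    """Fill in the environments.yaml shown above; generate a secret if none is given."""
    if secret is None:
        # 32 random hex characters instead of a guessable password.
        secret = uuid.uuid4().hex
    return TEMPLATE.format(server=server, api_key=api_key, secret=secret)


print(render_environment("<maas api key>"))
```
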
Now let’s sync tools and bootstrap a node. Note, if you have multiple juju environments then you may need to specify ‘-e vmaas’ if it isn’t your default environment.
juju sync-tools
juju bootstrap # add --show-log --debug  for more output
See if it works by using the following command:
juju status
You should see something similar to the following:
~$ juju status
environment: vmaas
machines:
  "0":
    agent-state: down
    agent-state-info: (started)
    agent-version: 1.18.4
    dns-name: maas-node-0.maas
    instance-id: /MAAS/api/1.0/nodes/node-e41b0c34-e1cb-11e3-98c6-5254001aae69/
    series: trusty
services: {}
Now we can do a test deployment with the juju-gui to our bootstrap node.
juju deploy juju-gui
While it is deploying you can type the following to get a log.
juju debug-log
I wanted to be able to access the juju-gui from an ‘external’ address, so I edited /etc/network/interfaces on that machine to have a static address:
juju ssh 0
sudo vim /etc/network/interfaces
Add the following to the file:
auto eth0
iface eth0 inet static
  address 192.168.100.11
  netmask 255.255.255.0
Bring that interface up.
sudo ifup eth0
The password can be found here on the host machine:
grep pass .juju/environments/vmaas.jenv
If you used the above configurations it should be ‘ubuntu’.

Log into the service so you can monitor the status graphically during the deployment.

If you get errors saying that servers couldn’t be reached you may have DNS configuration or proxy issues. You’ll have to resolve these first before using Juju. I’ve also had intermittent network issues in my lab. In order to work around those physical issues you may have to retry the bootstrap, or increase the timeout values in ~/.juju/environments.yaml to use the following:
  bootstrap-retry-delay: 5
  bootstrap-timeout: 1200
Now you’re cooking with juju.
  1. http://dinosaursareforever.blogspot.com/2014/06/manually-deploying-openstack-with.html

Friday, June 13, 2014

manually deploying openstack with a virtual maas on ubuntu trusty (part 1)

The goal of this next series of posts is to be able to set up virtual machines to simulate a real-world openstack deployment using maas and juju. This goes through setting up a maas-server in a VM as well as setting up maas-nodes in VMs and getting them enlisted/commissioned into the maas-server. Next juju is configured to use the maas cluster. Finally, openstack is deployed using juju.

Overview

Requirements

Ideally, a large server with 16 cores, 32G memory, 500G disk. Obviously you can tweak this setup to work with less; but be prepared to lock up lesser machines. In addition your host machine needs to be able to support nested virtualization.

Topology

Here are the basics of what will be set up for our virtual maas cluster. Each red box is a virtual machine with two interfaces. The eth0 interface in the VM connects to the NATed maas-internet network, while the VM’s eth1 interface connects to the isolated maas-management network. The number of maas-nodes should match what is required for the deployment; however it is simple enough to enlist more nodes later. I chose to use a public/private network in order to be more flexible later in how openstack networking is set up.

Setup Host Machine

Install Requirements

First install all required programs on the host machine.
sudo apt-get install libvirt-bin qemu-kvm cpu-checker virtinst uvtool

Next, check if kvm is working correctly.
kvm-ok

Ensure nested KVM is enabled. (replace intel with amd if necessary)
cat /sys/module/kvm_intel/parameters/nested

This should output Y, if it doesn’t do the following:
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel nested=1
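
The same check can be scripted for both vendors. This is only a sketch: it assumes the parameter file reads either Y or 1 when nesting is enabled (the exact form may differ between modules and kernel versions), and the function names are made up.

```python
def nested_enabled(value):
    """Interpret the contents of /sys/module/kvm_*/parameters/nested."""
    return value.strip() in ("Y", "1")


def read_nested(vendor="intel"):
    """Return True or False, or None if the kvm module for vendor is not loaded."""
    path = "/sys/module/kvm_%s/parameters/nested" % vendor
    try:
        with open(path) as handle:
            return nested_enabled(handle.read())
    except IOError:
        return None


for vendor in ("intel", "amd"):
    print("kvm_%s nested: %s" % (vendor, read_nested(vendor)))
```
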

Ensure $USER is added to libvirtd group.
groups | grep libvirtd

Ensure host machine has SSH keys generated and setup. (Be careful, don’t overwrite your existing keys)
[ -d ~/.ssh ] || ssh-keygen -t rsa

Virtual Network Setup

This step can be done via virt-manager or from the command line using virsh.
Setup a virtual network which uses NAT to communicate with the host machine with the following parameters:
Network Name: maas_internet
Network: 192.168.100.0/24
Do _not_ Enable DHCP.
Forwarding to physical network; Any physical device; NAT
And set up an isolated virtual network with the following parameters:
Network Name: maas_management
Network: 10.10.10.0/24
Do _not_ Enable DHCP.
Isolated network;

Install the MAAS Server

Download and Start the Install

Ensure you have virt-manager connected to the hypervisor.
While there are many ways we can create virtual machines, I chose the tool uvtool because it works well in Trusty and quickly creates VMs based on the Ubuntu cloud image.

Sync the latest trusty cloud image:
uvt-simplestreams-libvirt sync release=trusty arch=amd64

Create a maas-server VM:
uvt-kvm create maas-server release=trusty arch=amd64 --disk 20 --memory 2048 --password ubuntu

After it boots, shut it down and edit the VM’s machine configuration.
Make the two network interfaces connect to maas_internet and maas_management respectively.

Now edit /etc/network/interfaces to have the following:
auto eth0
iface eth0 inet static
  address 192.168.100.10
  netmask 255.255.255.0
  gateway 192.168.100.1
  dns-nameservers 10.10.10.10 192.168.100.1

auto eth1
iface eth1 inet static
  address 10.10.10.10
  netmask 255.255.255.0
  dns-nameservers 10.10.10.10 192.168.100.1

And follow the instructions here:
http://maas.ubuntu.com/docs/install.html#pkg-install

Which is essentially:
sudo apt-get install maas maas-dhcp maas-dns

MAAS Server Post Install Tasks

http://maas.ubuntu.com/docs/install.html#post-install

First let’s check if the webpage is working correctly. Depending on your installation, you may need to proxy into a remote host hypervisor before accessing the webpage. If you’re working locally you should be able to access this address directly (as the libvirt maas_internet network is already connected to your local machine).

If you need to access it indirectly (and 192.168.100.0 is a non-conflicting subnet):
sshuttle -D -r <hypervisor IP> 192.168.100.0/24

Access the following:
http://192.168.100.10/MAAS
It should remind you that post installation tasks need to be completed.

Let’s create the admin user from the hypervisor machine:
ssh ubuntu@192.168.100.10
sudo maas-region-admin createadmin --username=root --email="user@host.com" --password=ubuntu

If you want to limit the types of boot images that can be created, you need to edit the following file:
sudo vim /etc/maas/bootresources.yaml

Import boot images, using the new root user you created to log in:
http://192.168.100.10/MAAS/clusters/
Now click 'import boot images' and be patient as it will take some time before these images are imported.

Add a key for the host machine here:
http://192.168.100.10/MAAS/account/prefs/sshkey/add/

Configure the MAAS Cluster

Follow instructions here to setup cluster:
http://maas.ubuntu.com/docs/cluster-configuration.html

http://192.168.100.10/MAAS/clusters/
Click on ‘Cluster master’
Click on edit interface eth1.
Interface: eth1
Management: DHCP and DNS
IP: 10.10.10.10
Subnet mask: 255.255.255.0
Broadcast IP: 10.10.10.255
Router IP: 10.10.10.10
IP Range Low: 10.10.10.100
IP Range High: 10.10.10.200

Click Save Interface
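
A typo in these fields is easy to make and hard to spot later, so it can help to sanity-check the values before saving. Here is a rough sketch using the ipaddress module (standard library in Python 3); check_interface is a hypothetical helper and the rules it applies are my own reading of what a consistent setup looks like, not something MAAS itself runs.

```python
import ipaddress


def check_interface(ip, netmask, broadcast, range_low, range_high):
    """Return a list of problems with a MAAS cluster interface definition."""
    network = ipaddress.ip_interface("%s/%s" % (ip, netmask)).network
    problems = []
    if ipaddress.ip_address(broadcast) != network.broadcast_address:
        problems.append("broadcast should be %s" % network.broadcast_address)
    low = ipaddress.ip_address(range_low)
    high = ipaddress.ip_address(range_high)
    if low not in network or high not in network:
        problems.append("DHCP range is outside %s" % network)
    if low > high:
        problems.append("DHCP range is reversed")
    # The server's own static address should not be handed out by DHCP.
    if low <= ipaddress.ip_address(ip) <= high:
        problems.append("static address %s is inside the DHCP range" % ip)
    return problems


print(check_interface("10.10.10.10", "255.255.255.0", "10.10.10.255",
                      "10.10.10.100", "10.10.10.200"))
```

An empty list means the values above are consistent with each other.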
Ensure Nodes Auto-Enlist

Create a MAAS key and use that to log in:
http://192.168.100.10/MAAS/account/prefs/
Click on ‘+ Generate MAAS key’ and copy that down.

Log into the maas-server, and then log into maas using the MAAS key:
maas login maas-server http://192.168.100.10/MAAS

Now set all nodes to auto accept:
maas maas-server nodes accept-all

Setup keys on the maas-server so it can access the virtual machine host
sudo mkdir -p ~maas
sudo chown maas:maas ~maas
sudo -u maas ssh-keygen

Add the pubkey in ~maas/.ssh/id_rsa.pub to the virsh server's authorized_keys and to the maas SSH keys (http://192.168.100.10/MAAS/account/prefs/sshkey/add/)
sudo cat ~maas/.ssh/id_rsa.pub

Now install virsh to test a connection and allow the maas-server to control maas-nodes.
sudo apt-get install libvirt-bin

Test the connection to the hypervisor (replace ubuntu with hypervisor host user)
sudo -u maas virsh -c qemu+ssh://ubuntu@192.168.100.1/system list --all

Confirm Maas-Server Networking

Ensure we can reach important addresses via maas-server:
host streams.canonical.com
host store.juju.ubuntu.com
host archive.ubuntu.com

And that we can download charms if needed:
wget https://store.juju.ubuntu.com/charm-info
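
The hostname checks above can be collected into one function that reports only the failures, which is handy to re-run after changing DNS settings. A minimal sketch follows; the function name is invented, and the resolver is passed in as a parameter so the logic can be tried without network access.

```python
import socket

HOSTS = ["streams.canonical.com", "store.juju.ubuntu.com", "archive.ubuntu.com"]


def unresolved(hosts, resolve=socket.gethostbyname):
    """Return the subset of hosts that fail to resolve."""
    failed = []
    for host in hosts:
        try:
            resolve(host)
        except socket.error:
            # Resolution failures are reported as socket errors.
            failed.append(host)
    return failed
```

On the maas-server, print(unresolved(HOSTS)) should show an empty list when name resolution is working.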

Setup Traffic Forwarding

Setup maas-server to forward traffic from eth1 to eth0:

You can type the following out manually, or, to ensure forwarding is set up properly on each boot, add it as an upstart script in /etc/init/ovs-routing.conf (thanks to Juan Negron):

description "Setup NAT rules for ovs bridge"

start on runlevel [2345]

env EXTIF="eth0"
env BRIDGE="eth1"

task

script
    echo "Configuring modules"
    modprobe ip_tables || :
    modprobe nf_conntrack || :
    modprobe nf_conntrack_ftp || :
    modprobe nf_conntrack_irc || :
    modprobe iptable_nat || :
    modprobe nf_nat_ftp || :

    echo "Configuring forwarding and dynaddr"
    echo "1" > /proc/sys/net/ipv4/ip_forward
    echo "1" > /proc/sys/net/ipv4/ip_dynaddr

    echo "Configuring iptables rules"
    iptables-restore <<-EOM
*nat
-A POSTROUTING -o ${EXTIF} -j MASQUERADE
COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A FORWARD -i ${BRIDGE} -o ${EXTIF} -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A FORWARD -i ${EXTIF} -o ${BRIDGE} -j ACCEPT
-A FORWARD -j LOG
COMMIT
EOM

end script
Then start the service:
sudo service ovs-routing start

Setup Squid Proxy

Ensure squid proxy can access cloud images:
echo "cloud-images.ubuntu.com" | sudo tee /etc/squid-deb-proxy/mirror-dstdomain.acl.d/98-cloud-images
sudo service squid-deb-proxy restart

Install MAAS Nodes

Now we can virt-install each maas-node on the hypervisor such that it automatically PXE boots and auto-enlists into MAAS. You can adjust the script below to create as many nodes as required. I’ve simplified things by creating every node with dual NICs and ample memory and disk space, but you could of course customize the machines per service: compute nodes need more compute power, ceph nodes need more storage, and quantum-gateway needs dual NICs. You could also specify raw disks instead of qcow2, or use storage pools, but in this case I wanted something simple that didn’t allocate all of its disk space up front.

for i in {0..19}; do
  virt-install \
    --name=maas-node-${i} \
    --connect=qemu:///system --ram=4096 --vcpus=1 --hvm --virt-type=kvm \
    --pxe --boot network,hd \
    --os-variant=ubuntutrusty --graphics vnc --noautoconsole --os-type=linux --accelerate \
    --disk=/var/lib/libvirt/images/maas-node-${i}.qcow2,bus=virtio,format=qcow2,cache=none,sparse=true,size=32 \
    --network=network=maas_internet,model=virtio \
    --network=network=maas_management,model=virtio 
done
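
If you want different memory or disk sizes per node, it may be easier to generate the commands than to edit the loop. The sketch below builds the same arguments as the loop above and only prints them; virt_install_args is a made-up helper, and the per-node sizes are parameters I added for illustration.

```python
def virt_install_args(index, ram_mb=4096, disk_gb=32):
    """Return the virt-install argv for maas-node-<index>, as in the loop above."""
    name = "maas-node-%d" % index
    disk = ("/var/lib/libvirt/images/%s.qcow2,bus=virtio,format=qcow2,"
            "cache=none,sparse=true,size=%d" % (name, disk_gb))
    return [
        "virt-install", "--name=" + name,
        "--connect=qemu:///system", "--ram=%d" % ram_mb, "--vcpus=1",
        "--hvm", "--virt-type=kvm", "--pxe", "--boot", "network,hd",
        "--os-variant=ubuntutrusty", "--graphics", "vnc", "--noautoconsole",
        "--os-type=linux", "--accelerate",
        "--disk=" + disk,
        "--network=network=maas_internet,model=virtio",
        "--network=network=maas_management,model=virtio",
    ]


# Print the commands instead of running them, so they can be reviewed first.
for i in range(3):
    print(" ".join(virt_install_args(i)))
```

Once the output looks right, each list can be handed to subprocess.check_call on the hypervisor.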

Now each node needs to be manually enlisted with the proper power configuration.
http://maas.ubuntu.com/docs/nodes.html#virtual-machine-nodes

Host Name: maas-node-${i}.vmaas
Power Type: virsh
Power Address: qemu+ssh://ubuntu@192.168.100.1/system
Power ID: maas-node-${i}
Here we need to match the machines to their MAC addresses and update the power parameters. You can get the MAC addresses of each node by using the following on the hypervisor:

virsh dumpxml maas-node-${i} | grep "mac addr"
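
Instead of grepping, the XML can be parsed so that the addresses come back as a clean list. This is a sketch using only the standard library; the function name is invented and the sample XML is a trimmed, made-up stand-in for real virsh dumpxml output.

```python
import xml.etree.ElementTree as ET


def mac_addresses(domain_xml):
    """Return the MAC address of every <interface> in a libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    return [mac.get("address")
            for mac in root.findall("./devices/interface/mac")]


# Made-up sample; real output contains many more elements.
sample = """<domain type='kvm'>
  <name>maas-node-0</name>
  <devices>
    <interface type='network'>
      <mac address='52:54:00:00:00:01'/>
      <source network='maas_internet'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:00:00:02'/>
      <source network='maas_management'/>
    </interface>
  </devices>
</domain>"""
print(mac_addresses(sample))
```

The addresses are returned in the order the interfaces appear in the XML.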

Here is a script that helps automate some of this process. It can be run from the maas-server (replace the user ubuntu with the appropriate value). It matches the MAC addresses from virsh to the ones in MAAS and then sets up the power parameters accordingly:

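#!/usr/bin/python
# Match libvirt domains to MAAS nodes by MAC address and set virsh power parameters.

import os
import sys
from xml.dom.minidom import parseString

import libvirt

# The MAAS Django settings must be configured before importing its models.
os.environ['DJANGO_SETTINGS_MODULE'] = 'maas.settings'
sys.path.append("/usr/share/maas")
from maasserver.models import Node

# Connection URI for the hypervisor; replace 'ubuntu' with your hypervisor user.
hhost = 'qemu+ssh://ubuntu@192.168.100.1/system'

# Map the first MAC address of each domain to the domain name.
# listDefinedDomains() only returns domains that are defined but not running.
conn = libvirt.open(hhost)
nodes_dict = {}
for node_name in conn.listDefinedDomains():
    node = conn.lookupByName(node_name)
    node_xml = parseString(node.XMLDesc(0))
    interface = node_xml.getElementsByTagName('interface')[0]
    node_mac1 = interface.getElementsByTagName('mac')[0].getAttribute('address')
    nodes_dict[node_mac1] = node_name

# Rename every MAAS node whose primary MAC matches a domain and point its
# power settings at that domain.
for node in Node.objects.all():
    mac = node.get_primary_mac()
    try:
        node_name = nodes_dict[str(mac)]
    except KeyError:
        # No libvirt domain has this MAC address; leave the node untouched.
        continue
    node.hostname = node_name
    node.power_type = 'virsh'
    node.power_parameters = {'power_address': hhost, 'power_id': node_name}
    node.save()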

Note that you will need python-libvirt installed. Run the script with something like the following:
sudo -u maas ./setup-nodes.py

Setup Fastpath and Commission Nodes

You most likely want to use fast-path installer on nodes to speed up installation times. Set all nodes to use fastpath installer using another bulk action on the nodes.

After you have all this done, use the bulk action to commission the nodes.
If you set things up properly, you should see all your machines starting up; give this some time. Once it finishes, all the nodes should be in the 'Ready' state in maas!
http://192.168.100.10/MAAS/nodes/

Confirm DNS setup

One point of trouble can be ensuring DNS is setup correctly. We can test this by starting a maas node and inside of that trying the following:
dig streams.canonical.com
dig store.juju.ubuntu.com

If we can’t hit those, we’ll need to ensure the maas server is setup correctly.
Go to: http://192.168.100.10/MAAS/settings/
Enter the host machine's upstream DNS here if necessary; this should set up the bind configuration file and restart that service. After this, re-test.

In addition I had to disable dnssec-validation for bind. Edit the following file:
sudo vim /etc/bind/named.conf.options

And change the following value:
dnssec-validation no;

And restart the service:
sudo service bind9 restart

Now you have a working virtual maas setup using the latest Ubuntu LTS!