RPi2: Work in progress 1

Here’s a quick status update on working with Raspberry Pi gen 2. The installed operating system is Raspbian Wheezy running kernel 3.18.7-v7+, built on 16 February 2015.

I’m happy to report that I could profile programs using PERF software events. I’m disappointed to report that PERF does not recognize any hardware (performance counter) events. This distro has linux-tools-3.2 installed. I uninstalled 3.2 and installed 3.18, which matches the kernel:

sudo apt-get remove linux-tools-3.2
sudo apt-get install linux-tools-3.18

Still no joy when attempting to use hardware events. If you want to profile your program using PERF software events, please see my current PERF tutorial about finding execution hot-spots. I tried all of the commands and, with the exception of one typo, everything still works!

I’m in the process of troubleshooting my loadable kernel module for user-space performance counter events. I’ve encountered many of the same old stumbling blocks (e.g., finding the correct headers and Module.symvers file). At the present time, the kernel will attempt to load the module, then die. I cannot tell at this stage if there is a problem in the module itself or if there is a bug in Raspbian Wheezy. In case you want to dive into module development yourself, I’ve started a permanent page for building kernel modules on RPi2.

Once again, after two+ years, I want to make a public plea for more open information about the underlying hardware and for guidance and support for end-user device driver development. Quite frankly, Broadcom plays its cards too close to the chest, especially for a computer that’s advertised as a vehicle for learning and education. The dearth of information is stifling. People still struggle to identify and download essential files (e.g., Module.symvers) for device driver development. This is not true of other major Linux distros, and the Raspbian folks really need to take note! Broadcom, in particular, runs the risk of killing off the goose laying the golden eggs.

Before signing off, here is a quick PERF command cheat sheet. I recommend reading the tutorial, but if you really must peck away at the keyboard… All the best!

perf help
perf list
perf stat -e cpu-clock ./program
perf record -e cpu-clock ./program
perf record -e cpu-clock,faults ./program
perf report
perf report --stdio --sort comm,dso --header
perf report --stdio --dsos=program,libc-2.13.so
perf annotate --stdio --dsos=program --symbol=function
perf annotate --stdio --dsos=program --symbol=function --no-source
perf record -e cpu-clock --freq=8000 ./program
perf evlist -F

Replace “program” with the name of your application program and replace “function” with the name of a function in your program.

Performance counter kernel module

As promised, I’ve described the design of a Linux loadable kernel module that allows user-space access to the Raspberry Pi (ARM1176) performance counters. By the way, the design of the module is not specific to Raspbian Wheezy or even the Raspberry Pi for that matter. I believe that the kernel module could be used on the new BeagleBone Black (BBB) to enable user-space counter access on its ARM Cortex-A8 processor under Linux. I just ordered a BBB and will try out the code when possible. (Assuming quick delivery!)

The kernel module alone isn’t enough to measure performance events. In fact, the kernel module doesn’t even touch the counters. It merely flips a privileged hardware bit which lets user-space programs read and write the performance counters and control register. So, I have also written a few user-space C functions to configure, clear, start and stop the performance counters. An application program just needs to call a few functions to choose the events to be measured and start counting, to stop counting, to get the raw counts, and to print the event counts.
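To give a feel for what "configure, clear, start and stop" can boil down to, here is a minimal sketch of packing a control word for the ARM1176 performance monitor control register. This is not the code from rpi_pmu.c. The helper names (pmnc_pack, pmnc_stop_word) are hypothetical, and the bit positions are an assumption based on my reading of the ARM1176JZF-S manual, so please check them against the manual before relying on them. The sketch only builds the 32-bit value; actually writing it to the register takes a coprocessor instruction on the Pi itself and is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED bit layout of the ARM1176 performance monitor control
 * register. Verify against the processor manual before use. */
#define PMNC_ENABLE        (1u << 0)  /* E: enable the counters */
#define PMNC_RESET_COUNTS  (1u << 1)  /* P: reset count registers 0 and 1 */
#define PMNC_RESET_CYCLES  (1u << 2)  /* C: reset the cycle counter */
#define PMNC_EVT1_SHIFT    12         /* event field for counter 1, bits [19:12] */
#define PMNC_EVT0_SHIFT    20         /* event field for counter 0, bits [27:20] */
#define PMNC_FLAG_MASK     (PMNC_ENABLE | PMNC_RESET_COUNTS | PMNC_RESET_CYCLES)

/* Hypothetical helper: pack two event numbers and the control flags
 * into a single 32-bit control word. */
uint32_t pmnc_pack(uint8_t event0, uint8_t event1, uint32_t flags)
{
    assert((flags & ~PMNC_FLAG_MASK) == 0);  /* reject unknown flag bits */
    return ((uint32_t)event0 << PMNC_EVT0_SHIFT) |
           ((uint32_t)event1 << PMNC_EVT1_SHIFT) |
           flags;
}

/* Hypothetical helper: keep the event selection, but clear the enable
 * bit and both reset bits so that stopping does not wipe the counts. */
uint32_t pmnc_stop_word(uint32_t control)
{
    return control & ~PMNC_FLAG_MASK;
}
```

With sample event numbers 0x07 and 0x0B, a "clear and start" word would be pmnc_pack(0x07, 0x0B, PMNC_ENABLE | PMNC_RESET_COUNTS | PMNC_RESET_CYCLES), and the matching "stop" word is pmnc_stop_word() applied to that value. Keeping the bit twiddling in small pure functions like these makes it easy to test on a desktop machine before going anywhere near the hardware.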

I have uploaded the source for both the kernel module (aprof.c) and the user-space functions (rpi_pmu.h and rpi_pmu.c). In addition, there is source for some utility functions that I like to use in benchmark programs (test_common.h and test_common.c). All of this is a work in progress and I will update the source when major enhancements or changes are made.

Speaking of source, I have found a way of organizing and storing source code through WordPress. WordPress is kind of security paranoid and doesn’t allow you to upload source code or even gzip’ed TAR files. I ran into this issue when I attempted to upload a make file and WordPress wouldn’t let me do it (with complaints about potentially malicious code and so forth). WordPress does let you post source for viewing, however.

So, I’ve added a Source menu item to the main menu. I want the menu structure below the Source item to operate like a browsable code repository. The first level of items below Source are projects, like the kernel module. The next level of menu items navigates into the source belonging to a project. Each make file and source file is a separate page. The source code is displayed using the SyntaxHighlighter plug-in in order to keep indentation. No other formatting or highlighting is done, just to keep things simple. I could cut and paste code from these pages, so I hope you can, too!