Linux clock and temperature: An interlude

I’m in the midst of investigating a performance anomaly which seemingly pops up at random. I wrote a pointer chasing program to exercise and measure cache miss performance events. As part of the testing regimen, I run the program several times in a row and compare run time, event counts, etc. and look for inconsistencies.

The program is usually well-behaved/consistent and produces the expected result. For example, when the program is configured to always hit in the level 1 data (L1D) cache, the program measures just a few L1D misses and the run time is short. However, occasionally a run is slow and has a slew of L1D misses. What’s up?

My first thought was re-scheduling, that is, the pointer chasing program starts on one core and is moved by the OS to another core. The cache on the new core is cold and more misses occur. The Linux taskset command launches and pins a program to a core. In fancier language, it sets the CPU affinity for a (running) process. If we pin the program to a particular core, the cache should stay warm.

If you’re an old-timer and haven’t used taskset in a while, please be aware that a user must have CAP_SYS_NICE capability to change the affinity of a process. You can also set CAP_SYS_NICE capability for an application binary using the setcap utility:

    sudo setcap 'cap_sys_nice=eip' <application> 

You can check capabilities with getcap:

    getcap  <application>

The form of the capabilities string is in accordance with the cap_from_text call, so I recommend viewing its man page. The eip flags are case sensitive and specify the effective, inheritabe and permitted sets, respectively.

As to the performance anomoly, setting the CPU (core) affinity did not resolve the issue. Long runs and misses kept popping up. My next thought was “maybe CPU clock throttling?”

There’s quite a bit of on-line material about Raspberry Pi clock throttling and I won’t repeat all of it here. Suffice it to say, the RPi 4 firmware has a so-called CPU scaling governor that kicks in at high temperatures. The governor tries to keep the CPU temperature below 80 . Over-temperature occurs when the temperature rises above 85 . The governor adjusts (throttles) the CPU clock to achieve the configured operating temperature goals.

We do know that Raspberry Pi 4 can run hot. My RPi4 has heat sinks installed, but no case fan. Heat vents out the top of the Canakit plastic enclosure. The heat sinks are warm to the touch, not super hot, really. However, it’s not a bad idea to take the Pi’s temperature.

The following command displays the RPi’s temperature:

    cat /sys/class/thermal/thermal_zone0/temp 

Divide the result by 1000 to obtain the temperature in degrees Celsius. The next command displays the current frequency (kHz):

    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq

Divide the result by 1000 to obtain the frequency in MHz. This frequency is the Linux kernel’s requested frequency. The actual, possibly throttled frequency may be different.

The Pi’s vcgencmd command is even better! vcgencmd is the Swiss Army knife of system info utilities. The following command displays a list of vcgencmd subcommands:

    vcgencmd commands

Here’s a few commands to get you started:

    vcgencmd measure_temp 
vcgencmd measure_clock arm
vcgencmd measure_volts core
vcgencmd get_throttled

See the vcgencmd man page for more.

You may run into permission issues with vcgencmd. I couldn’t tame the permissions and simply ran vcgencmd via sudo.

The get_throttled subcommand returns a bit mask. Here’s the magic decoder ring:

    Bit   Hex value  Meaning 
---- --------- -----------------------------------
0 1 Under-voltage detected (< 4.64V)
1 2 Arm frequency capped (temp > 80'C)
2 4 Currently throttled
3 8 Soft temperature limit active
16 10000 Under-voltage has occurred
17 20000 Arm frequency capping has occurred
18 40000 Throttling has occurred
19 80000 Soft temperature limit has occurred

If all of that isn’t enough, you can install and run cpufrequtils:

    sudo apt install cpufrequtils 
cpufreq-info

After running the workload and measuring both CPU clock and temperature, throttling did not appear to be a problem. My current conjecture has to do with Linux fixes for the Spectre security vulnerability. In short, Spectre is a class of vulnerabilities exploiting observable side-effects of the machine micro-architecture in order to set up clandestine information channels that leak confidential data. One way to supress data cache observables is to flush (clean) and invalidate the data caches during a context switch. If the data cache is invalidated, cache misses and program run time will go up. Stay tuned.

Even though I haven’t found the source of the performance anomoly, I welcomed the chance to learn about vcgencmd, etc. Off to investigate Linux hardware cache flushing…

Copyright © 2021 Paul J. Drongowski