Friday, April 26, 2019

psutil 5.6.2 and getloadavg() on Windows

New psutil 5.6.2 release implements an emulation of os.getloadavg() on Windows which was kindly contributed by Ammar Askar who originally implemented it for cPython's test suite. This idea has been floating around for quite a while. The first proposal dates back to 2010, when psutil was still hosted on Google Code, and it popped up multiple times throughout the years. There was/is a bunch of info on internet mentioning the bits with which it's theoretically possible to do this (the so called System Processor Queue Length), but I couldn't find any real implementation. A Google search tells there is quite some demand for this, but very few tools out there providing this natively (the only one I could find is this sFlowTrend tool and Zabbix), so I'm very happy this finally landed into psutil / Python.

Other improvements and bugfixes in psutil 5.6.2

The full list is here but I would like to mention a couple:
  • 1476: the possibility to set process' high I/O priority on Windows
  • 1458: colorized test output. I admit nobody will use this directly but it's very cool and I'm porting it to a bunch of other projects I work on (e.g. pyftpdlib). Also, perhaps this could be a good candidate for a small module to put on PYPI which can also include some functionalities taken from pytest and which I'm gradually re-implementing in unittest module amongst which:
    • 1478: re-running failed tests
    • display test timings/durations: this is something I'm contributing to cPython, see BPO-4080 and and PR-12271

About me

I'm currently in China (Shenzhen) for a mix of vacation and work, and I will likely take a break from Open Source for a while (likely 2.5 months, during which I will also go to Philippines and Japan - I love Asia ;-)).

External

Tuesday, March 5, 2019

psutil 5.6.0 with Process.parents() is out

Hello world, =)

It was a long time since my last blog post (over 1 year and a half). During this time I moved between Italy, Prague and Shenzhen (China), and also contributed a couple of nice patches for Python I want to blog about when Python 3.8 will be out: zero-copy for shutil.copy*() functions and socket.create_server() utility function. But let's move on and talk about what this blog post is about: the next major psutil version.

Process parents()

From the doc: return the parents of this process as a list of Process instances. If no parents are known return an empty list.
>>> import psutil
>>> p = psutil.Process(5312)
>>> p.parents()
[psutil.Process(pid=4699, name='bash', started='09:06:44'),
 psutil.Process(pid=4689, name='gnome-terminal-server', started='0:06:44'),
 psutil.Process(pid=1, name='systemd', started='05:56:55')] 
Nothing really new here, as it's a convenience method based on the existing parent() method, but still it's something nice to have implemented as a builtin and which can be used to work with process trees in conjunction with children() method. The idea was proposed by Ghislain Le Meur.

Windows

A bunch of interesting improvements occurred on Windows.

The first one is that certain Windows APIs requiring to be dynamically loaded from DLL libraries are now loaded only once on startup (instead of on per function call), significantly speeding up different functions and methods. This is described and implemented in PR #1422 which also provides benchmarks.

Another one is Process' suspend() and resume() methods. Before they were using CreateToolhelp32Snapshot() to iterate over all process' threads which was somewhat unorthodox and didn't work if process was suspended via Process Hacker. Now it relies on undocumented NtSuspendProcess and NtResumeProcess APIs, which is the same approach used by ProcessHacker and other famous Sysinternals tools. The change was proposed and discussed in issue #1379 and implemented in PR #1435. I think I will later propose the addition of suspend() and resume() method in subprocess module in Python.

Last nice improvement about Windows it's about SE DEBUG mode. SE DEBUG mode can be seen as a "bit" which you can set on the Python process on startup so that we have more chances of querying processes owned by other users, including many owned by Administrator and Local System. Practically speaking this means we will get less AccessDenied exceptions for low PID processes.  It turns out the code doing this has been broken presumably for years, and never set SE DEBUG. This is fixed now and the change was made in PR #1429.

Removal of Process.memory_maps() on OSX

This was somewhat controversial. The history about memory_maps() on OSX is a painful one. It was based on an undocumented and probably broken Apple API called proc_regionfilename() which made memory_maps() either randomly raise EINVAL or result in segfault! Also, memory_maps() could only be used for the current process, limiting its usefulness to os.getpid() only. For any other process it raised AccessDenied. This has been a known problem for a long time but sometime over the last few years I got tired of seeing random test failures on Travis that I couldn't reproduce locally, so I commented the unit-test and forget about it until last week, when I realized the real impact this has on production code. I tried looking for a solution once again, spending quite some time looking for public source codes which managed to do this right with no luck. The only tool I'm aware of which does this right is vmmap from Apple, but it's closed source. After careful thinking, since no solution was found, I decided to just remove memory_maps() from OSX. This is not something I took lightly, but considering the alternative is getting a segfault I decided to sacrifice backward compatibility (hence the major version bump).

Improved exceptions

One problem which afflicted psutil maintenance over the years was receiving bug reports including tracebacks which didn't provide any information on what syscall failed exactly. This was especially painful on Windows where a single routine can invoke different Windows APIs. Now the OSError (or WindowsError) exception will include the syscall from which the error originated, see PR #1428

Other important bugfixes

  • #1353: process_iter() is now thread safe
  • #1411: [BSD]segfault could occur on Process instantiation
  • #1427: [OSX] Process cmdline() and environ() may erroneously raise OSError on failed malloc().
  • #1447: original exception wasn't turned into NoSuchProcess / AccessDenied exceptions when using Process.oneshot() ctx manager.
A full list of enhancements and bug fixes is available here.

Thursday, October 12, 2017

psutil 5.4.0 with AIX support is out

After a long time psutil finally adds support for a brand new exotic platform: AIX! Honestly I am not sure how many AIX Python users are out there (I suppose not many) but still, here it is. For this we have to thank Arnon Yaari who started working on the porting a couple of years ago. To be honest I was skeptical at first because AIX is the only platform which I cannot virtualize and test on my laptop so that made me a bit nervous but Arnon did a very good job. The final PR is huge, it required a considerable amount of work on his part and a review process of over 140 messages which were exchanged between me and him over the course of over 1 month during which I was travelling through China. The final result is very good, basically (almost) all original unit tests pass and the quality of the submitted code is awesome which (I must say) is kind of unusual for an external contribution like this one. Kudos to you Arnon! ;-)

Other than AIX support, release 5.4.0 also includes a couple of important bug fixes for sensors_temperatures() and sensors_fans() functions on Linux and the fix of a bug on OSX which could cause a segmentation fault when using Process.open_files(). Complete list of bugfixes is here.

In terms of future contributions for exotic and still unsupported platforms it is worth mentioning a (still incomplete) PR for Cygwin which looks promising and Mingw32 compiler support on Windows. It looks like psutil is gradually getting to a point where the addition of new functionalities is becoming more rare, so it is good that support for new platforms happens now when the API is mature and stable. Future development in this direction can also include Android and (hopefully) IOS support. Now *that* would be really awesome to have! =)

Stay tuned.

Saturday, September 2, 2017

psutil 5.3.0 with full Unicode support is out

psutil 5.3.0 is finally out. This release is a major one, as it includes tons of improvements and bugfixes, probably like no other previous release. It is interesting to notice how huge the diff between 5.2.2 and 5.3.0 is. This is due to the fact that I've been travelling quite a lot this year, so I kept postponing it. It may sound weird but I consider publishing a new release and write a blog post about more stressful than working on the release itself. =). Anyway, here goes.

Full Unicode support

This is the biggest change. In order to achieve this I had to refactor all functions and internals either returning or accepting a string. Incidentally this helped me having a better understanding of how Unicode works and how it should be handled at the C level in terms of differences between Python 2 and 3. Issue #1040 includes all the reasonings I've been through and potentially serves as a documentation for people who are facing a similar task (handling Unicode in C for both Python 2 and 3). Up until version 5.2.x psutil functions returning a string had different problems as they could:
  • raise decoding error on Python 3 in case of non-ASCII string
  • return unicode instead of str (Python 2)
  • return incorrect / invalid encoded data in case of non-ASCII string
5.3.0 fixes these three issues and consolidates the correct handling of Unicode strings. On Windows this was achieved by using Unicode-specific Windows APIs. The notes below describe how Unicode and strings in general are handled internally by psutil and they apply to any API returning a string such as Process.exe or Process.cwd, including non-filesystem related methods such as Process.username or WindowsService.description:
  • all strings are encoded by using the OS filesystem encoding (sys.getfilesystemencoding()) which varies depending on the platform (e.g. "UTF-8" on OSX, "mbcs" on Win)
  • no API call is supposed to crash with UnicodeDecodeError
  • instead, in case of badly encoded data returned by the OS, the following error handlers are used to replace the corrupted characters in the string:
  • on Python 2 all APIs return bytes (str type), never unicode
  • on Python 2 you can go back to unicode by doing:
>>> unicode(proc.exe(), sys.getdefaultencoding(), errors="replace")

Improved process_iter() function

process_iter() accepts two new parameters in order to invoke Process.as_dict() internally: "attrs" and "ad_value". With this you can iterate over all processes in one shot without having to catch NoSuchProcess explicitly. Before:
>>> import psutil
>>> for proc in psutil.process_iter():
...     try:
...         pinfo = proc.as_dict(attrs=['pid', 'name'])
...     except psutil.NoSuchProcess:
...         pass
...     else:
...         print(pinfo)
...
{'pid': 1, 'name': 'systemd'}
{'pid': 2, 'name': 'kthreadd'}
{'pid': 3, 'name': 'ksoftirqd/0'}
...
Now:
>>> import psutil
>>> for proc in psutil.process_iter(attrs=['pid', 'name']):
...     print(proc.info)
...
{'pid': 1, 'name': 'systemd'}
{'pid': 2, 'name': 'kthreadd'}
{'pid': 3, 'name': 'ksoftirqd/0'}
This improves expressiveness as it makes it possible to use nice list/dict comprehensions. Here's some examples.

Processes having "python" in their name::
>>> from pprint import pprint as pp
>>> pp([p.info for p in psutil.process_iter(attrs=['pid', 'name']) if 'python' in p.info['name']])
[{'name': 'python3', 'pid': 21947},
{'name': 'python', 'pid': 23835}]
Processes owned by user::
>>> import getpass
>>> pp([(p.pid, p.info['name']) for p in psutil.process_iter(attrs=['name', 'username']) if p.info['username'] == getpass.getuser()])
(16832, 'bash'),
(19772, 'ssh'),
(20492, 'python')]
Processes actively running::
>>> pp([(p.pid, p.info) for p in psutil.process_iter(attrs=['name', 'status']) if p.info['status'] == psutil.STATUS_RUNNING])
[(1150, {'name': 'Xorg', 'status': 'running'}),
(1776, {'name': 'unity-panel-service', 'status': 'running'}),
(20492, {'name': 'python', 'status': 'running'})]

Automatic overflow handling of numbers

On very busy or long-lived system systems numbers returned by disk_io_counters() and net_io_counters() functions may wrap (restart from zero). Up to version 5.2.x you had to take this into account while now this is automatically handled by psutil (see: #802). If a "counter" restarts from 0 psutil will add the value from the previous call for you so that numbers will never decrease. This is crucial for applications monitoring disk or network I/O in real time. Old behavior can be resumed by passing nowrap=True argument.

SunOS Process environ()

Process.environ() is now available also on SunOS (see #1091).

Other improvements and bug fixes

Amongst others, here's a couple of important bug fixes I'd like to mention:

  • #1044: on OSX different Process methods could incorrectly raise AccessDenied for zombie processes. This was due to poor proc_pidpath OSX API.
  • #1094: on Windows, pid_exists() may lie due to the poor OpenProcess Windows API which can return a handle even when a process PID no longer exists. This had repercussions for many Process methods such as cmdline(), environ(), cwd(), connections() and others which could have unpredictable behaviors such as returning empty data or erroneously raise NoSuchProcess exceptions. For the same reason (broken OpenProcess API), processes could unexpectedly stick around after being terminate()d and wait()ed on.
BSD systems also received some love (NetBSD and OpenBSD in particular). Different memory leaks were fixed and functions returning connected sockets were partially rewritten. The full list of enhancement and bug fixes can be seen here.

About me

I would like to spend a couple more words about my current situation. Last year (2016) I relocated to Prague and remote worked from there the whole year (it's been cool - great city!). This year I have mainly been resting in Turin (Italy) due to some health issues and travelling across Asia once I started to recover. I am currently in Shenzhen, China, and unless the current situation with North Korea gets worse I'm planning to continue my trip until November and visit Taiwan, South Korea and Japan. Once I'm finished the plan is to briefly return to Turin (Italy) and finally return to Prague. By then I will probably be looking for a new (remote) gig again, so if you have anything for me by November feel free to send me a message. ;-)

Wednesday, February 1, 2017

psutil 5.1.0: temperatures, batteries and cpu frequency

OK, here's another psutil release. Main highlights of this release are sensors-related APIs.

Temperatures

It is now possible to retrieve hardware temperatures. The relevant commit is here. Unfortunately this is Linux only. I couldn't manage to implement this on other platforms mainly for two reasons:

  • On Windows it is hard to do this in a hardware agnostic fashion. I bumped into 3 different approaches, all using WMI, and none of them worked with my hardware so I gave up. 
  • On OSX it appears it is possible to retrieve temperatures relatively easy, but I have a virtualized OSX box which does not support sensors, so basically I gave up on this due to lack of hardware. If somebody wants to give it a try be my guest.

>>> import psutil
>>> psutil.sensors_temperatures()
{'acpitz': [shwtemp(label='', current=47.0, high=103.0, critical=103.0)],
 'asus': [shwtemp(label='', current=47.0, high=None, critical=None)],
 'coretemp': [shwtemp(label='Physical id 0', current=52.0, high=100.0, critical=100.0),
              shwtemp(label='Core 0', current=45.0, high=100.0, critical=100.0),
              shwtemp(label='Core 1', current=52.0, high=100.0, critical=100.0),
              shwtemp(label='Core 2', current=45.0, high=100.0, critical=100.0),
              shwtemp(label='Core 3', current=47.0, high=100.0, critical=100.0)]}

Battery status

This works on Linux, Windows and FreeBSD and provides battery status information. The relevant commit is here.
>>> import psutil
>>>
>>> def secs2hours(secs):
...     mm, ss = divmod(secs, 60)
...     hh, mm = divmod(mm, 60)
...     return "%d:%02d:%02d" % (hh, mm, ss)
...
>>> battery = psutil.sensors_battery()
>>> battery
sbattery(percent=93, secsleft=16628, power_plugged=False)
>>> print("charge = %s%%, time left = %s" % (batt.percent, secs2hours(batt.secsleft)))
charge = 93%, time left = 4:37:08

CPU frequency

Available under Linux, Windows and OSX. Relevant commit is here. Linux is the only platform which reports the real-time value (always changing), on all other platforms current frequency is represented as the nominal “fixed” value.

>>> import psutil
>>> psutil.cpu_freq()
scpufreq(current=931.42925, min=800.0, max=3500.0)
>>> psutil.cpu_freq(percpu=True)
[scpufreq(current=2394.945, min=800.0, max=3500.0),
 scpufreq(current=2236.812, min=800.0, max=3500.0),
 scpufreq(current=1703.609, min=800.0, max=3500.0),
 scpufreq(current=1754.289, min=800.0, max=3500.0)]

What CPU a process is on

This will let you know what CPU number a process is currently running on, which is somewhat related to the existent cpu_affinity() functionality. The relevant commit is here. It is interesting to use this method to visualize how the OS scheduler continuously evenly reassigns processes to different CPUs  (see cpu_distribution.py script).

CPU affinity

Process().cpu_affinity([])
...can now be used as an alias for "set affinity against all eligible CPUs". This was implemented because it turns out on Linux it is not always possible to set affinity against all CPUs. Having such an alias is also a shortcut to avoid doing this, which is kinda verbose:
psutil.Process().cpu_affinity(list(range(psutil.cpu_count())))

Other bug fixes

See full list.

Sunday, November 6, 2016

psutil 5.0.0 is around twice as fast

OK, this is a big one. Starting from psutil 5.0.0 you can query multiple Process information around twice as fast than with previous versions (see original ticket and updated doc). It took me 7 months, 108 commits and a massive refactoring of psutil internals (here is the big commit), and I can safely say this is one of the best improvements and long standing issues which have been addressed in a major psutil release. Here goes.

The problem

Except for some cases, the way different process information are retrieved varies depending on the OS. Sometimes it requires reading a file in /proc filesystem (Linux), some other times it requires using C (Windows, BSD, OSX, SunOS), but every time it's done differently. Psutil abstracts this complexity by providing a nice high-level interface so that you, say, call Process.name() without worrying about what happens behind the curtains or on what OS you're on.

Internally, it is not rare that multiple process info (e.g. name(), ppid(), uids(), create_time()) may be fetched by using the same routine. For example, on Linux we read /proc/stat to get the process name, terminal, CPU times, creation time, status and parent PID, but only one value is returned and the others are discarded. On Linux the code below reads /proc/stat 6 times:
>>> import psutil
>>> p = psutil.Process()
>>> p.name()
>>> p.cpu_times()
>>> p.create_time()
>>> p.ppid()
>>> p.status()
>>> p.terminal()
Another example is BSD. In order to get process name, memory, CPU times and other metrics, a single sysctl() call is necessary, but again, because of how psutil used to work so far that same sysctl() call is executed every time (see here, here, and so on), one information is returned (say name()) and the rest is discarded. Not anymore.

Do it in one shot

It appears clear how the approach described above is not efficient, also considering that applications similar to top, htop, ps or glances usually collect more than one info per-process.
psutil 5.0.0 introduces a new oneshot() context manager. When used, the internal routine is executed once (in the example below on name()) and the other values are cached. The subsequent calls sharing the same internal routine (read /proc/stat, call sysctl() or whatever) will return the cached value.
With psutil 5.0.0 the code above can be rewritten like this, and on Linux it will run 2.4 times faster: 
>>> import psutil
>>> p = psutil.Process()
>>> with p.oneshot():
...     p.name()
...     p.cpu_times()
...     p.create_time()
...     p.ppid()
...     p.status()
...     p.terminal()

Implementation

One great thing about psutil design is its abstraction. It is dived in 3 "layers". The first layer is represented by the main Process class (python), which is what dictates the end-user high-level API. The second layer is the OS-specific Python module which is thin wrapper on top of the OS-specific C extension module (third layer). Because this was organized this way (modularly) the refactoring was reasonably smooth. In order to do this I first refactored those C functions collecting multiple info and grouped them in a single function (e.g. see BSD implementation). Then I wrote a decorator which enables the cache only when requested (when entering the context manager) and decorated the "grouped functions" with with it. The whole thing is enabled on request by the highest-level oneshot() context manager, which is the only thing which is exposed to the end user. Here's the decorator:
def memoize_when_activated(fun):
    """A memoize decorator which is disabled by default. It can be
    activated and deactivated on request.
    """
    @functools.wraps(fun)
    def wrapper(self):
        if not wrapper.cache_activated:
            return fun(self)
        else:
            try:
                ret = cache[fun]
            except KeyError:
                ret = cache[fun] = fun(self)
            return ret

    def cache_activate():
        """Activate cache."""
        wrapper.cache_activated = True

    def cache_deactivate():
        """Deactivate and clear cache."""
        wrapper.cache_activated = False
        cache.clear()

    cache = {}
    wrapper.cache_activated = False
    wrapper.cache_activate = cache_activate
    wrapper.cache_deactivate = cache_deactivate
    return wrapper
In order to measure the various speedups I finally wrote a benchmark script (well 2 actually) and kept tuning until I was sure the various changes made psutil actually faster. The benchmark scripts calculate the speedup you can get if you call all the "grouped" methods together (best case scenario).

Linux: +2.56x speedup

Linux process is the only pure-python implementation as (almost) all process info are gathered by reading files in the /proc filesystem. /proc files typically contain different information about the process and /proc/PID/stat and /proc/PID/status are the perfect examples. That's why on Linux we aggregate them in 3 groups. The relevant part of the Linux implementation can be seen here.

Windows: from +1.9x to +6.5x speedup

Windows is an interesting one. In normal circumstances, if we're querying a process owned by our user, we group together only process' num_threads(), num_ctx_switches() and num_handles(), getting a +1.9x speedup if we access those methods in one shot. Windows is particular though, because certain methods use a dual implementation: a "fast method" is attempted first, but if the process is owned by another user it fails with AccessDenied. In that case psutil falls back on using a second "slower" method (see here for example).
The second method is slower because it iterates over all PIDs but differently than "plain" Windows APIs it can be used to get multiple info in one shot: num threads, context switches, handles, CPU times, create time and IO counters. That is why querying processes owned by other users results in an impressive +6.5 speedup.

OSX: +1.92x speedup

On OSX we can get 2 groups of information. With sysctl() syscall we get process parent PID, uids, gids, terminal, create time, name. With proc_info() syscall we get CPU times (for PIDs owned by another user) memory metrics and ctx switches. Not bad.

BSD: +2.18x speedup

BSD was an interesting one as we gather a tons of process info just by calling sysctl() (see implementation). In a single shot we get process name, ppid, status, uids, gids, IO counters, CPU and create times, terminal and ctx switches.

SunOS: +1.37 speedup

SunOS implementation is similar to Linux implementation in that it reads files in /proc filesystem but differently from Linux this is done in C. Also in this case, we can group different metrics together (see here and here).

External links