
Thursday, October 12, 2017

psutil 5.4.0 with AIX support is out

After a long time psutil finally adds support for a brand new exotic platform: AIX! Honestly I am not sure how many AIX Python users are out there (I suppose not many) but still, here it is. For this we have to thank Arnon Yaari, who started working on the port a couple of years ago. To be honest I was skeptical at first, because AIX is the only platform which I cannot virtualize and test on my laptop, and that made me a bit nervous, but Arnon did a very good job. The final PR is huge: it required a considerable amount of work on his part and a review process of over 140 messages exchanged between the two of us over the course of a month, during which I was travelling through China. The final result is very good: basically (almost) all original unit tests pass, and the quality of the submitted code is awesome, which (I must say) is kind of unusual for an external contribution like this one. Kudos to you Arnon! ;-)

Other than AIX support, release 5.4.0 also includes a couple of important bug fixes for the sensors_temperatures() and sensors_fans() functions on Linux, and a fix for an OSX bug which could cause a segmentation fault when using Process.open_files(). The complete list of bug fixes is here.

In terms of future contributions for exotic and still unsupported platforms it is worth mentioning a (still incomplete) PR for Cygwin which looks promising, and Mingw32 compiler support on Windows. It looks like psutil is gradually getting to a point where the addition of new functionality is becoming rarer, so it is good that support for new platforms happens now that the API is mature and stable. Future development in this direction may also include Android and (hopefully) iOS support. Now *that* would be really awesome to have! =)

Stay tuned.

Saturday, September 2, 2017

psutil 5.3.0 with full Unicode support is out

psutil 5.3.0 is finally out. This release is a major one, as it includes tons of improvements and bug fixes, probably like no other previous release. It is interesting to notice how huge the diff between 5.2.2 and 5.3.0 is. This is due to the fact that I've been travelling quite a lot this year, so I kept postponing it. It may sound weird but I consider publishing a new release and writing a blog post about it more stressful than working on the release itself. =). Anyway, here goes.

Full Unicode support

This is the biggest change. In order to achieve it I had to refactor all functions and internals either returning or accepting a string. Incidentally this helped me gain a better understanding of how Unicode works and how it should be handled at the C level in terms of differences between Python 2 and 3. Issue #1040 includes all the reasoning I went through and can potentially serve as documentation for people who are facing a similar task (handling Unicode in C for both Python 2 and 3). Up until version 5.2.x psutil functions returning a string had different problems, as they could:
  • raise decoding error on Python 3 in case of non-ASCII string
  • return unicode instead of str (Python 2)
  • return incorrect / invalid encoded data in case of non-ASCII string
5.3.0 fixes these three issues and consolidates the correct handling of Unicode strings. On Windows this was achieved by using Unicode-specific Windows APIs. The notes below describe how Unicode and strings in general are handled internally by psutil; they apply to any API returning a string, such as Process.exe or Process.cwd, including non-filesystem related methods such as Process.username or WindowsService.description:
  • all strings are encoded by using the OS filesystem encoding (sys.getfilesystemencoding()) which varies depending on the platform (e.g. "UTF-8" on OSX, "mbcs" on Win)
  • no API call is supposed to crash with UnicodeDecodeError
  • instead, in case of badly encoded data returned by the OS, the "surrogateescape" error handler (on POSIX) or the "replace" error handler (on Windows) is used to substitute the corrupted characters in the string
  • on Python 2 all APIs return bytes (str type), never unicode
  • on Python 2 you can go back to unicode by doing:
>>> unicode(proc.exe(), sys.getfilesystemencoding(), errors="replace")
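On Python 3, where the same APIs return str decoded with the filesystem encoding, you can go the other way and recover the raw bytes with os.fsencode(), which applies the surrogateescape handler for you. A minimal sketch (the sample path is made up):
>>> import os, psutil
>>> exe = psutil.Process().exe()  # str, decoded with sys.getfilesystemencoding()
>>> os.fsencode(exe)  # round-trip back to the raw bytes
b'/usr/bin/python3'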

Improved process_iter() function

process_iter() accepts two new parameters in order to invoke Process.as_dict() internally: "attrs" and "ad_value". With this you can iterate over all processes in one shot without having to catch NoSuchProcess explicitly. Before:
>>> import psutil
>>> for proc in psutil.process_iter():
...     try:
...         pinfo = proc.as_dict(attrs=['pid', 'name'])
...     except psutil.NoSuchProcess:
...         pass
...     else:
...         print(pinfo)
...
{'pid': 1, 'name': 'systemd'}
{'pid': 2, 'name': 'kthreadd'}
{'pid': 3, 'name': 'ksoftirqd/0'}
...
Now:
>>> import psutil
>>> for proc in psutil.process_iter(attrs=['pid', 'name']):
...     print(proc.info)
...
{'pid': 1, 'name': 'systemd'}
{'pid': 2, 'name': 'kthreadd'}
{'pid': 3, 'name': 'ksoftirqd/0'}
This improves expressiveness as it makes it possible to use nice list/dict comprehensions. Here are some examples.

Processes having "python" in their name:
>>> from pprint import pprint as pp
>>> pp([p.info for p in psutil.process_iter(attrs=['pid', 'name']) if 'python' in p.info['name']])
[{'name': 'python3', 'pid': 21947},
{'name': 'python', 'pid': 23835}]
Processes owned by user:
>>> import getpass
>>> pp([(p.pid, p.info['name']) for p in psutil.process_iter(attrs=['name', 'username']) if p.info['username'] == getpass.getuser()])
[(16832, 'bash'),
(19772, 'ssh'),
(20492, 'python')]
Processes actively running:
>>> pp([(p.pid, p.info) for p in psutil.process_iter(attrs=['name', 'status']) if p.info['status'] == psutil.STATUS_RUNNING])
[(1150, {'name': 'Xorg', 'status': 'running'}),
(1776, {'name': 'unity-panel-service', 'status': 'running'}),
(20492, {'name': 'python', 'status': 'running'})]

Automatic overflow handling of numbers

On very busy or long-lived systems the numbers returned by the disk_io_counters() and net_io_counters() functions may wrap (restart from zero). Up to version 5.2.x you had to take this into account; now it is automatically handled by psutil (see: #802). If a "counter" restarts from 0 psutil adds the value from the previous call for you, so that numbers never decrease. This is crucial for applications monitoring disk or network I/O in real time. The old behavior can be resumed by passing the nowrap=False argument.
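As a sketch of what this guarantees in practice, here's a minimal throughput monitor: because wrap handling is on by default, the deltas can never go negative even if the kernel counters overflow between two calls.
import time

import psutil

# Print network throughput once per second. The deltas are safe to
# compute because psutil compensates for counter wrap-around; pass
# nowrap=False to get the raw kernel values instead.
last = psutil.net_io_counters()
while True:
    time.sleep(1)
    cur = psutil.net_io_counters()
    print("sent: %8d B/s  recv: %8d B/s" % (
        cur.bytes_sent - last.bytes_sent,
        cur.bytes_recv - last.bytes_recv))
    last = cur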

SunOS Process environ()

Process.environ() is now available also on SunOS (see #1091).

Other improvements and bug fixes

Amongst others, here's a couple of important bug fixes I'd like to mention:

  • #1044: on OSX, different Process methods could incorrectly raise AccessDenied for zombie processes. This was due to the poor proc_pidpath OSX API.
  • #1094: on Windows, pid_exists() may lie due to the poor OpenProcess Windows API, which can return a handle even when a process PID no longer exists. This had repercussions for many Process methods such as cmdline(), environ(), cwd(), connections() and others, which could behave unpredictably, e.g. returning empty data or erroneously raising NoSuchProcess. For the same reason (the broken OpenProcess API), processes could unexpectedly stick around after being terminate()d and wait()ed on.
BSD systems also received some love (NetBSD and OpenBSD in particular). Different memory leaks were fixed and the functions returning connected sockets were partially rewritten. The full list of enhancements and bug fixes can be seen here.

About me

I would like to spend a couple more words about my current situation. Last year (2016) I relocated to Prague and worked remotely from there the whole year (it's been cool - great city!). This year I have mainly been resting in Turin (Italy) due to some health issues, and travelling across Asia once I started to recover. I am currently in Shenzhen, China, and unless the current situation with North Korea gets worse I'm planning to continue my trip until November and visit Taiwan, South Korea and Japan. Once I'm done, the plan is to briefly return to Turin (Italy) and finally return to Prague. By then I will probably be looking for a new (remote) gig again, so if you have anything for me by November feel free to send me a message. ;-)

Wednesday, February 1, 2017

psutil 5.1.0: temperatures, batteries and cpu frequency

OK, here's another psutil release. Main highlights of this release are sensors-related APIs.

Temperatures

It is now possible to retrieve hardware temperatures. The relevant commit is here. Unfortunately this is Linux only. I couldn't manage to implement this on other platforms mainly for two reasons:

  • On Windows it is hard to do this in a hardware-agnostic fashion. I bumped into 3 different approaches, all using WMI, and none of them worked with my hardware, so I gave up.
  • On OSX it appears it is possible to retrieve temperatures relatively easily, but I have a virtualized OSX box which does not support sensors, so basically I gave up on this due to lack of hardware. If somebody wants to give it a try be my guest.

>>> import psutil
>>> psutil.sensors_temperatures()
{'acpitz': [shwtemp(label='', current=47.0, high=103.0, critical=103.0)],
 'asus': [shwtemp(label='', current=47.0, high=None, critical=None)],
 'coretemp': [shwtemp(label='Physical id 0', current=52.0, high=100.0, critical=100.0),
              shwtemp(label='Core 0', current=45.0, high=100.0, critical=100.0),
              shwtemp(label='Core 1', current=52.0, high=100.0, critical=100.0),
              shwtemp(label='Core 2', current=45.0, high=100.0, critical=100.0),
              shwtemp(label='Core 3', current=47.0, high=100.0, critical=100.0)]}

Battery status

This works on Linux, Windows and FreeBSD and provides battery status information. The relevant commit is here.
>>> import psutil
>>>
>>> def secs2hours(secs):
...     mm, ss = divmod(secs, 60)
...     hh, mm = divmod(mm, 60)
...     return "%d:%02d:%02d" % (hh, mm, ss)
...
>>> battery = psutil.sensors_battery()
>>> battery
sbattery(percent=93, secsleft=16628, power_plugged=False)
>>> print("charge = %s%%, time left = %s" % (batt.percent, secs2hours(batt.secsleft)))
charge = 93%, time left = 4:37:08

CPU frequency

Available under Linux, Windows and OSX. The relevant commit is here. Linux is the only platform which reports the real-time value (always changing); on all other platforms the current frequency is represented as the nominal "fixed" value.

>>> import psutil
>>> psutil.cpu_freq()
scpufreq(current=931.42925, min=800.0, max=3500.0)
>>> psutil.cpu_freq(percpu=True)
[scpufreq(current=2394.945, min=800.0, max=3500.0),
 scpufreq(current=2236.812, min=800.0, max=3500.0),
 scpufreq(current=1703.609, min=800.0, max=3500.0),
 scpufreq(current=1754.289, min=800.0, max=3500.0)]

What CPU a process is on

This will let you know which CPU number a process is currently running on, which is somewhat related to the existing cpu_affinity() functionality. The relevant commit is here. It is interesting to use this method to visualize how the OS scheduler continuously (and evenly) reassigns processes to different CPUs (see the cpu_distribution.py script).
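Here's what it looks like in practice. The method name used below (cpu_num()) is my reading of the linked commit, so treat this as a sketch:
>>> import psutil
>>> p = psutil.Process()
>>> p.cpu_num()
1
>>> p.cpu_affinity([0])  # pin the process to CPU 0...
>>> p.cpu_num()          # ...and the reported CPU follows
0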

CPU affinity

Process().cpu_affinity([])
...can now be used as an alias for "set affinity against all eligible CPUs". This was implemented because it turns out that on Linux it is not always possible to set affinity against all CPUs. Having such an alias is also a shortcut to avoid doing this, which is kinda verbose:
psutil.Process().cpu_affinity(list(range(psutil.cpu_count())))

Other bug fixes

See full list.

Sunday, November 6, 2016

psutil 5.0.0 is around twice as fast

OK, this is a big one. Starting from psutil 5.0.0 you can query multiple pieces of process information around twice as fast as with previous versions (see the original ticket and the updated doc). It took me 7 months, 108 commits and a massive refactoring of psutil internals (here is the big commit), and I can safely say this is one of the best improvements and longest standing issues ever addressed in a major psutil release. Here goes.

The problem

Except for some cases, the way process information is retrieved varies depending on the OS. Sometimes it requires reading a file in the /proc filesystem (Linux), other times it requires using C (Windows, BSD, OSX, SunOS), but every time it's done differently. psutil abstracts this complexity by providing a nice high-level interface so that you, say, call Process.name() without worrying about what happens behind the curtains or what OS you're on.

Internally, it is not rare that multiple pieces of process info (e.g. name(), ppid(), uids(), create_time()) are fetched by using the same routine. For example, on Linux we read /proc/PID/stat to get the process name, terminal, CPU times, creation time, status and parent PID, but only one value is returned and the others are discarded. On Linux the code below reads /proc/PID/stat 6 times:
>>> import psutil
>>> p = psutil.Process()
>>> p.name()
>>> p.cpu_times()
>>> p.create_time()
>>> p.ppid()
>>> p.status()
>>> p.terminal()
Another example is BSD. In order to get the process name, memory, CPU times and other metrics, a single sysctl() call is sufficient, but again, because of how psutil used to work, that same sysctl() call was executed every time (see here, here, and so on): one piece of information was returned (say, name()) and the rest was discarded. Not anymore.

Do it in one shot

It appears clear that the approach described above is not efficient, also considering that applications similar to top, htop, ps or glances usually collect more than one piece of information per process.
psutil 5.0.0 introduces a new oneshot() context manager. When used, the internal routine is executed once (in the example below on name()) and the other values are cached. The subsequent calls sharing the same internal routine (read /proc/stat, call sysctl() or whatever) will return the cached value.
With psutil 5.0.0 the code above can be rewritten like this, and on Linux it will run 2.4 times faster: 
>>> import psutil
>>> p = psutil.Process()
>>> with p.oneshot():
...     p.name()
...     p.cpu_times()
...     p.create_time()
...     p.ppid()
...     p.status()
...     p.terminal()

Implementation

One great thing about psutil's design is its abstraction. It is divided in 3 "layers". The first layer is represented by the main Process class (Python), which is what dictates the end-user, high-level API. The second layer is the OS-specific Python module, which is a thin wrapper on top of the OS-specific C extension module (the third layer). Because the code was organized this way (modularly) the refactoring was reasonably smooth. In order to do this I first refactored those C functions collecting multiple info and grouped them into a single function (e.g. see the BSD implementation). Then I wrote a decorator which enables the cache only when requested (when entering the context manager) and decorated the "grouped functions" with it. The whole thing is enabled on request by the highest-level oneshot() context manager, which is the only thing exposed to the end user. Here's the decorator:
import functools


def memoize_when_activated(fun):
    """A memoize decorator which is disabled by default. It can be
    activated and deactivated on request.
    """
    @functools.wraps(fun)
    def wrapper(self):
        if not wrapper.cache_activated:
            return fun(self)
        else:
            try:
                ret = cache[fun]
            except KeyError:
                ret = cache[fun] = fun(self)
            return ret

    def cache_activate():
        """Activate cache."""
        wrapper.cache_activated = True

    def cache_deactivate():
        """Deactivate and clear cache."""
        wrapper.cache_activated = False
        cache.clear()

    cache = {}
    wrapper.cache_activated = False
    wrapper.cache_activate = cache_activate
    wrapper.cache_deactivate = cache_deactivate
    return wrapper
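To show how the pieces are meant to fit together, here's a hypothetical usage sketch of the decorator above (the class and method names are made up, not psutil's actual internals): methods sharing the same expensive routine get decorated, and the cache is switched on only around grouped accesses, which is exactly what oneshot() does at a higher level.
class MyProcess:
    @memoize_when_activated
    def _parse_stat_file(self):
        # pretend this is an expensive read of /proc/PID/stat
        print("expensive parsing")
        return {"name": "python", "ppid": 1}

    def name(self):
        return self._parse_stat_file()["name"]

    def ppid(self):
        return self._parse_stat_file()["ppid"]

p = MyProcess()
p._parse_stat_file.cache_activate()
try:
    p.name()  # parses once ("expensive parsing" is printed)
    p.ppid()  # served from the cache, no second parse
finally:
    p._parse_stat_file.cache_deactivate()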
In order to measure the various speedups I finally wrote a benchmark script (well, 2 actually) and kept tuning until I was sure the various changes actually made psutil faster. The benchmark scripts calculate the speedup you can get if you call all the "grouped" methods together (best case scenario).

Linux: +2.56x speedup

The Linux implementation is the only pure-Python one, as (almost) all process info is gathered by reading files in the /proc filesystem. /proc files typically contain several pieces of information about the process; /proc/PID/stat and /proc/PID/status are the perfect examples. That's why on Linux we aggregate them in 3 groups. The relevant part of the Linux implementation can be seen here.

Windows: from +1.9x to +6.5x speedup

Windows is an interesting one. In normal circumstances, if we're querying a process owned by our user, we group together only the process' num_threads(), num_ctx_switches() and num_handles(), getting a +1.9x speedup if we access those methods in one shot. Windows is peculiar though, because certain methods use a dual implementation: a "fast method" is attempted first, but if the process is owned by another user it fails with AccessDenied. In that case psutil falls back on a second, "slower" method (see here for example).
The second method is slower because it iterates over all PIDs, but unlike the "plain" Windows APIs it can be used to get multiple pieces of info in one shot: num threads, context switches, handles, CPU times, create time and IO counters. That is why querying processes owned by other users results in an impressive +6.5x speedup.

OSX: +1.92x speedup

On OSX we can get 2 groups of information. With the sysctl() syscall we get the process parent PID, uids, gids, terminal, create time and name. With the proc_info() syscall we get CPU times (for PIDs owned by another user), memory metrics and ctx switches. Not bad.

BSD: +2.18x speedup

BSD was an interesting one, as we gather a ton of process info just by calling sysctl() (see implementation). In a single shot we get the process name, ppid, status, uids, gids, IO counters, CPU and create times, terminal and ctx switches.

SunOS: +1.37x speedup

The SunOS implementation is similar to the Linux one in that it reads files in the /proc filesystem, but differently from Linux this is done in C. Also in this case we can group different metrics together (see here and here).

Sunday, October 23, 2016

psutil 4.4.0 released - improved Linux memory metrics and OSX support

OK, here's another psutil release. Main highlights of this release are more accurate memory metrics on Linux and different OSX fixes. Here goes.

Linux virtual memory

This new psutil release sets a milestone regarding virtual_memory() metrics on Linux, which are now calculated way more precisely (see commit). Over the years different people complained that the numbers reported by virtual_memory() were not accurate or did not exactly match the ones reported by the "free" command line utility (see #862, #685, #538). As such, I investigated how "available memory" is supposed to be calculated on Linux, and indeed psutil was doing it wrong. It turns out the "free" cmdline itself, and many other similar tools, also did it wrong up until 2 years ago, when somebody finally decided to accurately calculate the available system memory straight into the Linux kernel and expose this info to user-level applications. Starting from Linux kernel 3.14 a new "MemAvailable" column was added to /proc/meminfo, and this is how psutil now determines available memory. Because of this, both the "available" and "used" memory fields returned by virtual_memory() precisely match the "free" command line utility. For older kernels (< 3.14) psutil tries to determine this value by using the same algorithm which was used in the original Linux kernel commit. The free cmdline utility's source code also inspired an additional fix which prevents available memory from overflowing total memory on LXC containers.

OSX fixes

For many years the OSX development of psutil occurred on a very old OSX 10.5 version, which I emulated via VirtualBox. The OS itself was a hacked version of OSX called iDeneb. After many years I finally managed to get access to a more recent version of OSX (El Capitan) thanks to VirtualBox + Vagrant. With this I finally had the chance to address many long standing OSX bugs. Here's the list:
  • #514: fix Process.memory_maps() segfault (critical!).
  • #783: Process.status() may erroneously return "running" for zombie processes.
  • #908: different process methods could erroneously mask the real error for high-privileged PIDs and raise NoSuchProcess and AccessDenied instead of OSError and RuntimeError.
  • #909: Process open_files() and connections() methods may raise OSError with no exception set if the process is gone.
  • #916: fix many compilation warnings.

Improved procinfo.py script

procinfo.py is a script which shows psutil's capabilities for obtaining different kinds of info about processes. I improved it so that it now reports a lot more info. Here's a sample output:

$ python scripts/procinfo.py
pid           4600
name          chrome
parent        4554 (bash)
exe           /opt/google/chrome/chrome
cwd           /home/giampaolo
cmdline       /opt/google/chrome/chrome
started       2016-09-19 11:12
cpu-tspent    27:27.68
cpu-times     user=8914.32, system=3530.59,
              children_user=1.46, children_system=1.31
cpu-affinity  [0, 1, 2, 3, 4, 5, 6, 7]
memory        rss=520.5M, vms=1.9G, shared=132.6M, text=95.0M, lib=0B,
              data=816.5M, dirty=0B
memory %      3.26
user          giampaolo
uids          real=1000, effective=1000, saved=1000
gids          real=1000, effective=1000, saved=1000
terminal      /dev/pts/2
status        sleeping
nice          0
ionice        class=IOPriority.IOPRIO_CLASS_NONE, value=0
num-threads   47
num-fds       379
I/O           read_count=96.6M, write_count=80.7M,
              read_bytes=293.2M, write_bytes=24.5G
ctx-switches  voluntary=30426463, involuntary=460108
children      PID    NAME
              4605   cat
              4606   cat
              4609   chrome
              4669   chrome
open-files    PATH
              /opt/google/chrome/icudtl.dat
              /opt/google/chrome/snapshot_blob.bin
              /opt/google/chrome/natives_blob.bin
              /opt/google/chrome/chrome_100_percent.pak
              [...]
connections   PROTO LOCAL ADDR            REMOTE ADDR               STATUS
              UDP   10.0.0.3:3693         *:*                       NONE
              TCP   10.0.0.3:55102        172.217.22.14:443         ESTABLISHED
              UDP   10.0.0.3:35172        *:*                       NONE
              TCP   10.0.0.3:32922        172.217.16.163:443        ESTABLISHED
              UDP   :::5353               *:*                       NONE
              UDP   10.0.0.3:59925        *:*                       NONE
threads       TID              USER          SYSTEM
              11795             0.7            1.35
              11796            0.68            1.37
              15887            0.74            0.03
              19055            0.77            0.01
              [...]
              total=47
res-limits    RLIMIT                     SOFT       HARD
              virtualmem             infinity   infinity
              coredumpsize                  0   infinity
              cputime                infinity   infinity
              datasize               infinity   infinity
              filesize               infinity   infinity
              locks                  infinity   infinity
              memlock                   65536      65536
              msgqueue                 819200     819200
              nice                          0          0
              openfiles                  8192      65536
              maxprocesses              63304      63304
              rss                    infinity   infinity
              realtimeprio                  0          0
              rtimesched             infinity   infinity
              sigspending               63304      63304
              stack                   8388608   infinity
mem-maps      RSS      PATH
              381.4M   [anon]
              62.8M    /opt/google/chrome/chrome
              15.8M    /home/giampaolo/.config/google-chrome/Default/History
              6.6M     /home/giampaolo/.config/google-chrome/Default/Favicons
              [...]

NIC netmask on Windows

net_if_addrs() on Windows is now able to return the netmask.
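For example (the NIC name below is hypothetical, and I'm assuming its first address entry is the IPv4 one):
>>> import psutil
>>> psutil.net_if_addrs()['Ethernet'][0].netmask
'255.255.255.0'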

Other improvements and bug fixes

Just take a look at the HISTORY file.

Sunday, May 15, 2016

psutil 4.2.0, Windows services and Python

New psutil 4.2.0 is out. The main feature of this release is the support for Windows services:

>>> import psutil
>>> list(psutil.win_service_iter())
[<WindowsService(name='AeLookupSvc', display_name='Application Experience') at 38850096>,
 <WindowsService(name='ALG', display_name='Application Layer Gateway Service') at 38850128>,
 <WindowsService(name='APNMCP', display_name='Ask Update Service') at 38850160>,
 <WindowsService(name='AppIDSvc', display_name='Application Identity') at 38850192>,
 ...]
>>> s = psutil.win_service_get('alg')
>>> s.as_dict()
{'binpath': 'C:\\Windows\\System32\\alg.exe',
 'description': 'Provides support for 3rd party protocol plug-ins for Internet Connection Sharing',
 'display_name': 'Application Layer Gateway Service',
 'name': 'alg',
 'pid': None,
 'start_type': 'manual',
 'status': 'stopped',
 'username': 'NT AUTHORITY\\LocalService'}

I did this mainly because I find pywin32 APIs too low-level. Having something like this in psutil can be useful to discover and monitor services more easily. The code changes are here and here's the doc. The API for querying a service is similar to psutil.Process. You can get a reference to a service object by using its name (which is unique for every service) and then use name(), status(), etc.:

>>> s = psutil.win_service_get('alg')
>>> s.name()
'alg'
>>> s.status()
'stopped'

Initially I thought about exposing a complete set of APIs to handle all aspects of service management, including start(), stop(), restart(), install(), uninstall() and modify(), but I soon realized that I would have ended up reimplementing what pywin32 already provides, at the cost of overcrowding the psutil API (see my reasoning here). I think psutil should really be about monitoring, not about installing and modifying system stuff, especially something as critical as a Windows service.

Considerations about Windows services

For those of you who are not familiar with Windows, a service is something, generally an executable (.exe), which runs at system startup and keeps running in the background. We can say they are the equivalent of a UNIX init script. All services are controlled by a "manager" which keeps track of their status and metadata (e.g. description, startup type), and with that you can start and stop them. It is interesting to note that since (most) services are bound to an executable (and hence a process) you can reference the underlying process via its PID:

>>> s = psutil.win_service_get('sshd')
>>> s
<WindowsService(name='sshd', display_name='Open SSH server') at 38853046>
>>> s.pid()
1865
>>> p = psutil.Process(1865)
>>> p
<psutil.Process(pid=1865, name='sshd.exe') at 140461487781328>
>>> p.exe()
'C:\\CygWin\\bin\\sshd'

Other improvements

psutil 4.2.0 comes with 2 other enhancements for Linux:
  • psutil.virtual_memory() returns a new "shared" memory field. This is the same value reported by "free" cmdline utility.
  • I changed the way /proc is parsed. Instead of reading /proc/{pid}/status line by line I used a regular expression (a minimal sketch of the idea follows this list). Here are the speedups:
    * Process.ppid() is 20% faster
    * Process.status() is 28% faster
    * Process.name() is 25% faster
    * Process.num_threads() is 20% faster (on Python 3 only; on Python 2 it's a bit slower - I suppose the re module received some improvements)
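Here's the kind of trick I mean, as a standalone sketch (Linux only; this is an illustration of the idea, not psutil's actual code):
import re

# Pull several fields out of /proc/<pid>/status with a single regex
# pass instead of iterating over the file line by line.
with open("/proc/self/status") as f:
    data = f.read()

fields = dict(re.findall(r"^(Name|State|PPid|Threads):\s+(.+)$",
                         data, re.MULTILINE))
print(fields["Name"], fields["PPid"], fields["Threads"])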

Wednesday, February 17, 2016

psutil 4.0.0 and how to get "real" process memory and environ in Python

New psutil 4.0.0 is out, with some interesting news about process memory metrics. I'll just get straight to the point and describe what's new.

"Real" process memory info

Determining how much memory a process really uses is not an easy matter (see this and this). RSS (Resident Set Size), which is what most people usually rely on, is misleading because it includes both the memory which is unique to the process and the memory shared with other processes. What would be more interesting in terms of profiling is the memory which would be freed if the process were terminated right now. In the Linux world this is called USS (Unique Set Size), and this is the major feature introduced in psutil 4.0.0 (not only for Linux but also for Windows and OSX).

USS memory

The USS (Unique Set Size) is the memory which is unique to a process and which would be freed if the process were terminated right now. On Linux it can be determined by parsing all the "private" blocks in /proc/pid/smaps. The Firefox team pushed this further and managed to do the same also on OSX and Windows, which is great. The new version of psutil is now able to do the same:
>>> psutil.Process().memory_full_info()
pfullmem(rss=101990, vms=521888, shared=38804, text=28200, lib=0, data=59672, dirty=0, 
         uss=81623, pss=91788, swap=0)

PSS and swap

On Linux there are two additional metrics which can also be determined via /proc/pid/smaps: PSS and swap. PSS, aka "Proportional Set Size", represents the amount of memory shared with other processes, accounted in a way that divides the amount evenly between the processes that share it. I.e. if a process has 10 MBs all to itself (USS) and 10 MBs shared with another process, its PSS will be 15 MBs. "swap" is simply the amount of memory that has been swapped out to disk. With memory_full_info() it is possible to implement a tool like this, similar to smem on Linux, which provides a list of processes sorted by "USS". It is interesting to notice how RSS differs from USS:
~/svn/psutil$ ./scripts/procsmem.py
PID     User    Cmdline                            USS     PSS    Swap     RSS
==============================================================================
...
3986    giampao /usr/bin/python3 /usr/bin/indi   15.3M   16.6M      0B   25.6M
3906    giampao /usr/lib/ibus/ibus-ui-gtk3       17.6M   18.1M      0B   26.7M
3991    giampao python /usr/bin/hp-systray -x    19.0M   23.3M      0B   40.7M
3830    giampao /usr/bin/ibus-daemon --daemoni   19.0M   19.0M      0B   21.4M
20529   giampao /opt/sublime_text/plugin_host    19.9M   20.1M      0B   22.0M
3990    giampao nautilus -n                      20.6M   29.9M      0B   50.2M
3898    giampao /usr/lib/unity/unity-panel-ser   27.1M   27.9M      0B   37.7M
4176    giampao /usr/lib/evolution/evolution-c   35.7M   36.2M      0B   41.5M
20712   giampao /usr/bin/python -B /home/giamp   45.6M   45.9M      0B   49.4M
3880    giampao /usr/lib/x86_64-linux-gnu/hud/   51.6M   52.7M      0B   61.3M
20513   giampao /opt/sublime_text/sublime_text   65.8M   73.0M      0B   87.9M
3976    giampao compiz                          115.0M  117.0M      0B  130.9M
32486   giampao skype                           145.1M  147.5M      0B  149.6M

Implementation

In order to get these values (USS, PSS and swap) we need to pass over the whole process address space. This usually requires higher user privileges and is considerably slower than getting the "usual" memory metrics via Process.memory_info(), which is probably the reason why tools like ps and top show RSS/VMS instead of USS. A big thanks goes to the Mozilla team, who figured out all this stuff on Windows and OSX, and to Eric Rahm, who put the PRs for psutil together (see #744, #745 and #746). Whoever doesn't use Python and would like to port the code to other languages can find the interesting parts in those PRs.

Memory type percent

After reorganizing process memory APIs I decided to add a new memtype parameter to Process.memory_percent(). With this it is now possible to compare a specific memory type (not only RSS) against the total physical memory. E.g.
>>> psutil.Process().memory_percent(memtype='pss')
0.06877466326787016

Process environ

The second biggest improvement in psutil 4.0.0 is the ability to determine the process environment variables. This opens up interesting possibilities for process recognition and monitoring techniques. For instance, one might start a process by passing a certain custom environment variable, then iterate over all processes to find the one of interest (and figure out whether it's running or whatever):
import psutil
for p in psutil.process_iter():
    try:
        env = p.environ()
    except psutil.Error:
        pass
    else:
        if 'MYAPP' in env:
            ...
Process environ was a long standing issue (from year 2009) which I had given up implementing because the Windows implementation worked for the current process only. Frank Benkstein solved that, and the process environment can now be determined on Linux, Windows and OSX for all processes (of course you may still bump into AccessDenied for processes owned by another user):
>>> import psutil
>>> from pprint import pprint as pp
>>> pp(psutil.Process().environ())
{...
 'CLUTTER_IM_MODULE': 'xim',
 'COLORTERM': 'gnome-terminal',
 'COMPIZ_BIN_PATH': '/usr/bin/',
 'HOME': '/home/giampaolo',
 'PWD': '/home/giampaolo/svn/psutil',
  }
>>>
It must be noted that the resulting dict usually does not reflect changes made after the process started (e.g. os.environ['MYAPP'] = '1'). Again, whoever is interested in doing this in other languages can look at the relevant psutil commits for the interesting parts.

Extended disk IO stats

psutil.disk_io_counters() has been extended to report additional metrics on Linux and FreeBSD:
  • busy_time, which is the time spent doing actual I/Os (in milliseconds).
  • read_merged_count and write_merged_count (Linux only), which are the number of merged reads and writes (see the iostats doc)
With these new metrics it is possible to have a better representation of actual disk utilization, similarly to iostat command on Linux.
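As a rough sketch of what busy_time enables (assuming a Linux or FreeBSD box; remember it is expressed in milliseconds), here's an iostat-style utilization estimate:
import time

import psutil

# Approximate per-disk utilization over a 1-second interval as the
# fraction of wall-clock time the device spent doing I/O.
INTERVAL = 1.0
before = psutil.disk_io_counters(perdisk=True)
time.sleep(INTERVAL)
after = psutil.disk_io_counters(perdisk=True)
for name in after:
    busy_ms = after[name].busy_time - before[name].busy_time
    print("%-10s %5.1f%% busy" % (name, busy_ms / (INTERVAL * 1000) * 100))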

OS constants

Given the increasing number of platform-specific metrics I added a new set of constants to quickly differentiate what platform you're on: psutil.LINUX, psutil.WINDOWS, etc.
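For instance, on a Linux box:
>>> import psutil
>>> psutil.LINUX
True
>>> psutil.WINDOWS
False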

Main bug fixes


  • #734: on Python 3, invalid UTF-8 data was not correctly handled for the process name(), cwd(), exe(), cmdline() and open_files() methods, resulting in UnicodeDecodeError. This was affecting all platforms. Now the surrogateescape error handler is used as a workaround for replacing the corrupted data.
  • #761: [Windows] psutil.boot_time() no longer wraps to 0 after 49 days.
  • #767: [Linux] disk_io_counters() may raise ValueError on 2.6 kernels and it's broken on 2.4 kernels.
  • #764: psutil can now be compiled on NetBSD-6.X.
  • #704: psutil can now be compiled on Solaris sparc.
Complete list of bug fixes is available here.

Porting code

4.0.0 being a major version, I took the chance to (lightly) change / break some APIs.
  • Process.memory_info() no longer returns just an (rss, vms) namedtuple. Instead it returns a namedtuple of variable length, changing depending on the platform (rss and vms are always present though, also on Windows). Basically it returns the same result as the old memory_info_ex(). This shouldn't break your existing code, unless you were doing "rss, vms = p.memory_info()".
  • At the same time memory_info_ex() is now deprecated. The method is still there as an alias for memory_info(), issuing a deprecation warning.
  • psutil.disk_io_counters() returns 2 additional fields on Linux and 1 additional field on FreeBSD. 
  • psutil.disk_io_counters() on NetBSD and OpenBSD no longer returns the write_count and read_count metrics because the kernel does not provide them (we were returning the busy time instead). I don't expect this to be a big issue because NetBSD and OpenBSD support is very recent.

Final notes and looking for a job

OK, this is it. I would like to spend a couple more words to announce to the world that I'm currently unemployed and looking for a remote gig again! =) I want remote because my plan for this year is to remain in Prague (Czech Republic) and possibly spend 2-3 months in Asia. If you know of any company looking for a Python backend dev who can work from afar, feel free to share my LinkedIn profile or mail me at g.rodola [at] gmail [dot] com.

Friday, February 12, 2016

How to always execute exit functions in Python

...or why atexit.register() and signal.signal() are evil

UPDATE (2016-02-13): this recipe no longer handles SIGINT, SIGQUIT and SIGABRT as aliases for "application exit" because it was a bad idea. It only handles SIGTERM. Also, it no longer supports Windows because the signal.signal() implementation is too different from POSIX.

Many people erroneously think that any function registered via the atexit module is guaranteed to always be executed when the program terminates. You may have noticed this is not the case when, for example, you daemonize your app in production and then try to stop it or restart it: the cleanup functions will not be executed. This is because functions registered with the atexit module are not called when the program is killed by a signal:
import atexit, os, signal

@atexit.register
def cleanup():
    print("on exit")  # XXX this never gets printed

os.kill(os.getpid(), signal.SIGTERM)

It must be noted that the same thing would happen if instead of atexit.register() we used a "finally" clause. It turns out the correct way to make sure the exit function is always called in case a signal is received is to register it via signal.signal(). That has a drawback though: in case a third-party module has already registered a function for that signal (SIGTERM or whatever), your new function will overwrite the old one:

import os, signal

def old(*args):
    print("old")  # XXX this never gets printed

def new(*args):
    print("new")

signal.signal(signal.SIGTERM, old)
signal.signal(signal.SIGTERM, new)
os.kill(os.getpid(), signal.SIGTERM)

Also, we would still have to use atexit.register() so that the function is also called on "clean" interpreter exit, and take into account signals other than SIGTERM which would cause the process to terminate. This recipe attempts to address all these issues, so that:
  • the exit function is always executed on SIGTERM and on "clean" interpreter exit
  • any exit function(s) previously registered via atexit.register() or signal.signal() will be executed as well (after the new one)
It must be noted that the exit function will never be executed in case of SIGKILL, SIGSTOP or os._exit().

The code

"""
Function / decorator which tries very hard to register a function to
be executed at interpreter exit.

Author: Giampaolo Rodola'
License: MIT
"""

from __future__ import print_function
import atexit
import os
import functools
import signal
import sys


_registered_exit_funs = set()
_executed_exit_funs = set()


def register_exit_fun(fun=None, signals=[signal.SIGTERM],
                      logfun=lambda s: print(s, file=sys.stderr)):
    """Register a function which will be executed on "normal"
    interpreter exit or in case one of the `signals` is received
    by this process (differently from atexit.register()).
    Also, it makes sure to execute any other function which was
    previously registered via signal.signal(). If any, it will be
    executed after our own `fun`.

    Functions which were already registered or executed via this
    function will be ignored.

    Note: there's no way to escape SIGKILL, SIGSTOP or os._exit(0)
    so don't bother trying.

    You can use this either as a function or as a decorator:

        @register_exit_fun
        def cleanup():
            pass

        # ...or

        register_exit_fun(cleanup)

    Note about Windows: I tested this some time ago and it didn't
    work exactly the same as on UNIX; I haven't cared about or tested
    it since, so it may not work on Windows.

    Parameters:

    - fun: a callable
    - signals: a list of signals for which this function will be
      executed (default SIGTERM)
    - logfun: a logging function which is called when a signal is
      received. Default: print to standard error. May be set to
      None if no logging is desired.
    """
    def stringify_sig(signum):
        if sys.version_info < (3, 5):
            smap = dict([(getattr(signal, x), x) for x in dir(signal)
                         if x.startswith('SIG')])
            return smap.get(signum, signum)
        else:
            return signum

    def fun_wrapper():
        if fun not in _executed_exit_funs:
            try:
                fun()
            finally:
                _executed_exit_funs.add(fun)

    def signal_wrapper(signum=None, frame=None):
        if signum is not None:
            if logfun is not None:
                logfun("signal {} received by process with PID {}".format(
                    stringify_sig(signum), os.getpid()))
        fun_wrapper()
        # Only return the original signal this process was hit with
        # in case fun returns with no errors, otherwise process will
        # return with sig 1.
        if signum is not None:
            if signum == signal.SIGINT:
                raise KeyboardInterrupt
            # XXX - should we do the same for SIGTERM / SystemExit?
            sys.exit(signum)

    def register_fun(fun, signals):
        if not callable(fun):
            raise TypeError("{!r} is not callable".format(fun))
        set([fun])  # raise exc if obj is not hash-able

        signals = set(signals)
        for sig in signals:
            # Register function for this signal and pop() the previously
            # registered one (if any). This can either be a callable,
            # SIG_IGN (ignore signal) or SIG_DFL (perform default action
            # for signal).
            old_handler = signal.signal(sig, signal_wrapper)
            if old_handler not in (signal.SIG_DFL, signal.SIG_IGN):
                # ...just for extra safety.
                if not callable(old_handler):
                    continue
                # This is needed otherwise we'll get a KeyboardInterrupt
                # stack trace on interpreter exit, even if the process
                # exited with sig 0.
                if (sig == signal.SIGINT and
                        old_handler is signal.default_int_handler):
                    continue
                # There was a function which was already registered for this
                # signal. Register it again so it will get executed (after our
                # new fun).
                if old_handler not in _registered_exit_funs:
                    atexit.register(old_handler)
                    _registered_exit_funs.add(old_handler)

        # This further registration will be executed in case of clean
        # interpreter exit (no signals received).
        if fun not in _registered_exit_funs or not signals:
            atexit.register(fun_wrapper)
            _registered_exit_funs.add(fun)

    # This piece of machinery handles 3 usage cases. register_exit_fun()
    # used as:
    # - a function
    # - a decorator without parentheses
    # - a decorator with parentheses
    if fun is None:
        # invoked as a decorator with parentheses; return the function
        # unchanged so the decorated name still refers to it
        def outer(fun):
            register_fun(fun, signals)
            return fun
        return outer
    else:
        register_fun(fun, signals)
        return fun

Usage

As a function:
def cleanup():
    print("cleanup")

register_exit_fun(cleanup)
As a decorator:
@register_exit_fun
def cleanup():
    print("cleanup")

Unit tests

This recipe is hosted on ActiveState and has a full set of unittests. It works with Python 2 and 3.

Notes about Windows

On Windows signals are only partially supported, meaning a function which was previously registered via signal.signal() will be executed only on interpreter exit, but not if the process receives a signal. Apparently this is a limitation either of Windows or of the signal module (most likely Windows).

Because of how different signal.signal() behaves on Windows, this code is UNIX only: http://bugs.python.org/issue26350

Proposal for stdlib inclusion

The fact that the atexit module does not handle signals, and that signal.signal() overwrites previously registered handlers, is unfortunate. It is also confusing, because it is not immediately clear which one you are supposed to use (and it turns out you're supposed to use both). Most of the time you have no idea (or don't care) that you're overwriting another exit function. As a user, I would just want to execute an exit function, no matter what, possibly without messing with whatever a module I've previously imported has done with signal.signal(). To me this suggests there could be space for something like "atexit.register_w_signals".

Tuesday, November 24, 2015

OpenBSD support for psutil

OK, this is a big one: starting from version 3.3.0 (released just now) psutil will officially support OpenBSD platforms. This was contributed by Landry Breuil (thanks dude!) and myself in PR #615. The interesting parts of the code changes are this and this.

Differences with FreeBSD

As expected, the OpenBSD implementation is very similar to FreeBSD's (which was already in place); that is why I decided to merge most of it in a single C file (_psutil_bsd.c) and use 2 separate C files for when the two implementations differ too much: freebsd.c and openbsd.c. In terms of functionality, here are the differences with FreeBSD. Unless specified, these differences are due to the kernel not providing the information natively (meaning we can't do anything about it).
Similarly to FreeBSD, the OpenBSD implementation of Process.open_files() is also problematic, as it is not able to return file paths (FreeBSD can, sometimes). Other than these differences the functionality is all there and pretty much the same, so overall I'm pretty satisfied with the result.

Considerations about BSD platforms

psutil has been supporting FreeBSD basically since the beginning (year 2009). At the time it made sense to support FreeBSD instead of other BSD variants because it is the most popular, followed by OpenBSD and NetBSD. Compared to FreeBSD, OpenBSD appears to be more "minimal" both in terms of facilities provided by the kernel and the number of system administration tools available. One thing which I appreciate a lot about FreeBSD is that the source code of all the CLI tools installed on the system is available under /usr/bin/src, which was a big help for implementing all the psutil APIs. OpenBSD source code is also available, but it uses CVS and I am not sure it includes the source code for all CLI tools. There are still two more BSD variants which may be worth adding support for: NetBSD and DragonflyBSD (in this order). About a year ago some guy provided a patch adding basic NetBSD support, so it is likely that will happen sooner or later.

Other enhancements available in this release

The only other enhancement is issue #558, which allows specifying a different location of the /proc filesystem on Linux.

Saturday, June 13, 2015

psutil 3.0

Here we are. It's been a long time since my last blog post and my last psutil release. The reason? I've been travelling! I mean... a lot. I've spent 3 months in Berlin, 3 weeks in Japan and 2 months in New York City. While I was there I finally had the chance to meet my friend Jay Loden in person. Jay and I originally started working on psutil together 7 years ago.



Back then I didn't know any C (and I'm still a terrible C developer) so he was crucial in developing the initial psutil skeleton, including OSX and Windows support. I'm back home now (but not for long ;-)), so I finally have some time to write this blog post and tell you about the new psutil release. Let's see what happened.

net_if_addrs()

In a few words, we're now able to list network interface addresses, similarly to the "ifconfig" command on UNIX:
>>> import psutil
>>> psutil.net_if_addrs()
{'eth0': [snic(
            family=<AddressFamily.AF_INET: 2>, 
            address='10.0.0.4', 
            netmask='255.0.0.0', 
            broadcast='10.255.255.255'),
          snic(
            family=<AddressFamily.AF_PACKET: 17>, 
            address='9c:eb:e8:0b:05:1f', 
            netmask=None, 
            broadcast='ff:ff:ff:ff:ff:ff')],
 'lo': [snic(
            family=<AddressFamily.AF_INET: 2>, 
            address='127.0.0.1', 
            netmask='255.0.0.0', 
            broadcast='127.0.0.1'),
        snic(
            family=<AddressFamily.AF_PACKET: 17>, 
            address='00:00:00:00:00:00', 
            netmask=None, 
            broadcast='00:00:00:00:00:00')]}
This is limited to the AF_INET (IPv4), AF_INET6 (IPv6) and AF_LINK (Ethernet) address families. If you want something more powerful (e.g. AF_BLUETOOTH) you can take a look at the netifaces extension. The code which does these tricks on POSIX and Windows is in the relevant commits. Also, here's some doc.

net_if_stats()

This will return a bunch of information about network interface cards:
>>> import psutil
>>> psutil.net_if_stats()
{'eth0': snicstats(
    isup=True, 
    duplex=<NicDuplex.NIC_DUPLEX_FULL: 2>, 
    speed=100, 
    mtu=1500),
 'lo': snicstats(
    isup=True, 
    duplex=<NicDuplex.NIC_DUPLEX_UNKNOWN: 0>, 
    speed=0, 
    mtu=65536)}
Again, the code is in the relevant commits, and here's the doc.

Enums

Enums are a nice new feature introduced in Python 3.4. Very briefly (or at least, this is what I appreciate the most about them), they help you write an API with human-readable constants. If you use Python 2 you'll see something like this:
>>> import psutil
>>> psutil.IOPRIO_CLASS_IDLE
3
On Python 3.4 you'll see a more informative:
>>> import psutil
>>> psutil.IOPRIO_CLASS_IDLE
<IOPriority.IOPRIO_CLASS_IDLE: 3>
They are backward compatible, meaning that if you're sending serialized data produced with psutil through the network you can safely use comparison operators and so on. Several psutil APIs return enums on Python >= 3.4; all the other existing constants remained plain strings (STATUS_*) or integers (CONN_*).
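Since these constants are IntEnums, they also compare equal to the plain integers returned on Python 2, e.g. (on Linux):
>>> import psutil
>>> psutil.IOPRIO_CLASS_IDLE == 3
True
>>> int(psutil.IOPRIO_CLASS_IDLE)
3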

Zombie processes

This is a big one. The full story is here but basically the support for zombie processes on UNIX was broken (except on Linux - Windows doesn't have zombie processes). Up until psutil 2.* we could instantiate a zombie process:
>>> pid = create_zombie()
>>> p = psutil.Process(pid)
...but every time we queried it we got a NoSuchProcess exception:
>>> p.name()
  File "psutil/__init__.py", line 374, in _init
    raise NoSuchProcess(pid, None, msg)
psutil.NoSuchProcess: no process found with pid 123
That was misleading though because the PID technically still existed:
>>> psutil.pid_exists(p.pid)
True
Furthermore, depending on what platform you were on, certain process stats could still be queried (instead of raising NoSuchProcess):
>>> p.cmdline()
['python']
Also, process_iter() did not return zombie processes at all. This was probably the worst aspect, because being able to identify them is an important use case, as they signal an issue with a process: if a parent process spawns a child, terminates it (via kill()) but doesn't wait() for it, a zombie will be created. Long story short, the way this changed in psutil 3.0 is that:
  • we now have a new ZombieProcess exception, raised every time we're not able to query a process because it's a zombie
  • it is raised instead of NoSuchProcess (which was incorrect and misleading)
  • it is still backward compatible (meaning you won't have to change your old code) because it inherits from NoSuchProcess
  • process_iter() finally works, meaning you can safely identify zombie processes like this:
import psutil
zombies = []
for p in psutil.process_iter():
    try:
        if p.status() == psutil.STATUS_ZOMBIE:
            zombies.append(p)
    except psutil.NoSuchProcess:
        pass

Removal of deprecated APIs

This is another big one, probably the biggest. In a previous blog post I already talked about deprecated APIs. What I did back then (January 2014) was to rename and officially deprecate different APIs, providing aliases for them so that people wouldn't yell at me because I broke their existing code. The most interesting deprecation was certainly the one affecting module constants and the hack which was used in order to provide "module properties". With this new release I decided to get rid of all those aliases. I'm sure this will cause problems but hey! This is a new major release, right? =). Plus, the amount of crap which was removed is impressive (see the commit). Here are the old aliases which are now gone for good (or bad, depending on how much headache they will cause you):

Removed module functions and constants

Already deprecated name                 New name
psutil.BOOT_TIME()                      psutil.boot_time()
psutil.NUM_CPUS()                       psutil.cpu_count()
psutil.TOTAL_PHYMEM()                   psutil.virtual_memory().total
psutil.avail_phymem()                   psutil.virtual_memory().free
psutil.avail_virtmem()                  psutil.swap_memory().free
psutil.cached_phymem() (Linux only)     psutil.virtual_memory().cached
psutil.get_pid_list()                   psutil.pids()
psutil.get_process_list()               -
psutil.get_users()                      psutil.users()
psutil.network_io_counters()            psutil.net_io_counters()
psutil.phymem_buffers() (Linux only)    psutil.virtual_memory().buffers
psutil.phymem_usage()                   psutil.virtual_memory()
psutil.total_virtmem()                  psutil.swap_memory().total
psutil.used_virtmem()                   psutil.swap_memory().used
psutil.used_phymem()                    psutil.virtual_memory().used
psutil.virtmem_usage()                  psutil.swap_memory()

Process methods (assuming p = psutil.Process()):

Already deprecated name     New name
p.get_children()            p.children()
p.get_connections()         p.connections()
p.get_cpu_affinity()        p.cpu_affinity()
p.get_cpu_percent()         p.cpu_percent()
p.get_cpu_times()           p.cpu_times()
p.get_io_counters()         p.io_counters()
p.get_ionice()              p.ionice()
p.get_memory_info()         p.memory_info()
p.get_ext_memory_info()     p.memory_info_ex()
p.get_memory_maps()         p.memory_maps()
p.get_memory_percent()      p.memory_percent()
p.get_nice()                p.nice()
p.get_num_ctx_switches()    p.num_ctx_switches()
p.get_num_fds()             p.num_fds()
p.get_num_threads()         p.num_threads()
p.get_open_files()          p.open_files()
p.get_rlimit()              p.rlimit()
p.get_threads()             p.threads()
p.getcwd()                  p.cwd()
p.set_cpu_affinity()        p.cpu_affinity()
p.set_ionice()              p.ionice()
p.set_nice()                p.nice()
p.set_rlimit()              p.rlimit()

If your code suddenly breaks with AttributeError after you upgraded psutil, it means you were using one of those deprecated aliases. In that case, just take a look at the table above and rename things accordingly.

Bug fixes

I fixed a lot of stuff; the full list is here.

Ease of development

These are not enhancements you will directly benefit from but I put some effort into making my life easier every time I work on psutil.
  • I care about psutil code being fully PEP 8 compliant, so I added a pre-commit Git hook which runs flake8 on every commit and rejects it if the coding style is not compliant. The way to install this is via make install-git-hooks.
  • I added a make install-dev-deps command which installs all deps and stuff which is useful for testing (ipdb, coverage, etc).
  • A new make coverage command which runs coverage. With this I discovered some parts of the code which weren't covered by tests, and I fixed that.
  • I started using tox to easily test psutil against all supported Python versions (from 2.6 to 3.4) in one shot.
  • I reorganized the tests so that they can now be easily executed with py.test and nose (before, only the unittest runner was fully supported)

Final words

I must say I'm pretty satisfied with how psutil is going and the satisfaction I still get every time I work on it. Right now it gets almost 800,000 downloads a month, which is pretty great for a Python library. As of right now I consider psutil almost "complete" in terms of features, meaning I'm basically running out of ideas about what to add next (see TODO). From now on, future development will probably focus on adding support for more exotic platforms (OpenBSD, NetBSD, Android). There have also been some discussions on the python-ideas mailing list about including psutil in the Python stdlib but, assuming that ever happens, it's still far away in the future, as it would require a lot of time which I currently don't have. That should be all. I hope you will all enjoy this new release.

Friday, June 13, 2014

Python and sendfile

sendfile(2) is a UNIX system call which provides a "zero-copy" way of copying data from one file descriptor (a file) to another (a socket). Because the copying is done entirely within the kernel, sendfile(2) is more efficient than the combination of "file.read()" and "socket.send()", which requires transferring data to and from user space. Copying the data twice imposes performance and resource penalties which the sendfile(2) syscall avoids; it also results in a single system call (and thus only one context switch), rather than the series of read(2) / write(2) system calls (each requiring a context switch) used internally for the data copying. A more exhaustive explanation of how sendfile(2) works is available here, but long story short, sending a file with sendfile() is usually twice as fast as using plain socket.send(). Typical applications which can benefit from using sendfile() are FTP and HTTP servers.

socket.sendfile()

I recently contributed a patch for Python's socket module which adds a high-level socket.sendfile() method (see the full discussion at issue 17552). socket.sendfile() will transmit a file until EOF is reached by attempting to use os.sendfile() if available, else it falls back on using plain socket.send(). Internally, it takes care of handling socket timeouts and provides two optional parameters to move the file offset or to send only a limited number of bytes. I came up with this idea because getting all of that right is a bit tricky, so a generic wrapper seemed convenient to have. socket.sendfile() will make its appearance in Python 3.5.
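Usage-wise, on Python 3.5 this boils down to a couple of lines (assuming a server is listening on localhost:8021):

import socket

with open("somefile.bin", "rb") as f:
    sock = socket.create_connection(("localhost", 8021))
    sent = sock.sendfile(f)  # uses os.sendfile() if possible, plain send() otherwise
    print("sent %d bytes" % sent)
    sock.close()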

sendfile and Python

sendfile(2) made its first appearance in the Python stdlib kind of late: Python 3.3. It was contributed by Ross Lagerwall and me in issue 10882. Since the patch didn't make it into Python 2.x and I wanted to use sendfile() in pyftpdlib, I later decided to release it as a stand-alone module working with older (2.5+) Python versions (see the pysendfile project). Starting with version 3.5, Python will hopefully start using sendfile() more extensively.
Also, Windows provides something similar to sendfile(2): TransmitFile. Now that socket.sendfile() is in place it seems natural to add support for it as well (see issue 21721).

Backport to Python 2.6 and 2.7

For those of you who are interested in using socket.sendfile() with the older Python 2.6 and 2.7 versions, here's a backport. It requires the pysendfile module to be installed. Full code including tests is hosted here.

#!/usr/bin/env python

"""
This is a backport of socket.sendfile() for Python 2.6 and 2.7.
socket.sendfile() will be included in Python 3.5:
http://bugs.python.org/issue17552
Usage:

>>> import socket
>>> file = open("somefile.bin", "rb")
>>> sock = socket.create_connection(("localhost", 8021))
>>> sendfile(sock, file)
42319283
>>>
"""

import errno
import io
import os
import select
import socket
try:
    memoryview  # not available on Python 2.6
except NameError:
    memoryview = lambda x: x

if os.name == 'posix':
    import sendfile as pysendfile  # requires "pip install pysendfile"
else:
    pysendfile = None


_RETRY = frozenset((errno.EAGAIN, errno.EALREADY, errno.EWOULDBLOCK,
                    errno.EINPROGRESS))


class _GiveupOnSendfile(Exception):
    pass


if pysendfile is not None:

    def _sendfile_use_sendfile(sock, file, offset=0, count=None):
        _check_sendfile_params(sock, file, offset, count)
        sockno = sock.fileno()
        try:
            fileno = file.fileno()
        except (AttributeError, io.UnsupportedOperation) as err:
            raise _GiveupOnSendfile(err)  # not a regular file
        try:
            fsize = os.fstat(fileno).st_size
        except OSError as err:
            raise _GiveupOnSendfile(err)  # not a regular file
        if not fsize:
            return 0  # empty file
        blocksize = fsize if not count else count

        timeout = sock.gettimeout()
        if timeout == 0:
            raise ValueError("non-blocking sockets are not supported")
        # poll/select have the advantage of not requiring any
        # extra file descriptor, contrarily to epoll/kqueue
        # (also, they require a single syscall).
        if hasattr(select, 'poll'):
            if timeout is not None:
                timeout *= 1000
            pollster = select.poll()
            pollster.register(sockno, select.POLLOUT)

            def wait_for_fd():
                if pollster.poll(timeout) == []:
                    raise socket._socket.timeout('timed out')
        else:
            # call select() once in order to solicit ValueError in
            # case we run out of fds
            try:
                select.select([], [sockno], [], 0)
            except ValueError as err:
                raise _GiveupOnSendfile(err)

            def wait_for_fd():
                fds = select.select([], [sockno], [], timeout)
                if fds == ([], [], []):
                    raise socket._socket.timeout('timed out')

        total_sent = 0
        # localize variable access to minimize overhead
        os_sendfile = pysendfile.sendfile
        try:
            while True:
                if timeout:
                    wait_for_fd()
                if count:
                    blocksize = count - total_sent
                    if blocksize <= 0:
                        break
                try:
                    sent = os_sendfile(sockno, fileno, offset, blocksize)
                except OSError as err:
                    if err.errno in _RETRY:
                        # Block until the socket is ready to send some
                        # data; avoids hogging CPU resources.
                        wait_for_fd()
                    else:
                        if total_sent == 0:
                            # We can get here for different reasons, the main
                            # one being 'file' is not a regular mmap(2)-like
                            # file, in which case we'll fall back on using
                            # plain send().
                            raise _GiveupOnSendfile(err)
                        raise err
                else:
                    if sent == 0:
                        break  # EOF
                    offset += sent
                    total_sent += sent
            return total_sent
        finally:
            if total_sent > 0 and hasattr(file, 'seek'):
                file.seek(offset)
else:
    def _sendfile_use_sendfile(sock, file, offset=0, count=None):
        raise _GiveupOnSendfile(
            "sendfile() not available on this platform")


def _sendfile_use_send(sock, file, offset=0, count=None):
    _check_sendfile_params(sock, file, offset, count)
    if sock.gettimeout() == 0:
        raise ValueError("non-blocking sockets are not supported")
    if offset:
        file.seek(offset)
    blocksize = min(count, 8192) if count else 8192
    total_sent = 0
    # localize variable access to minimize overhead
    file_read = file.read
    sock_send = sock.send
    try:
        while True:
            if count:
                blocksize = min(count - total_sent, blocksize)
                if blocksize <= 0:
                    break
            data = memoryview(file_read(blocksize))
            if not data:
                break  # EOF
            while True:
                try:
                    sent = sock_send(data)
                except OSError as err:
                    if err.errno in _RETRY:
                        continue
                    raise
                else:
                    total_sent += sent
                    if sent < len(data):
                        data = data[sent:]
                    else:
                        break
        return total_sent
    finally:
        if total_sent > 0 and hasattr(file, 'seek'):
            file.seek(offset + total_sent)


def _check_sendfile_params(sock, file, offset, count):
    if 'b' not in getattr(file, 'mode', 'b'):
        raise ValueError("file should be opened in binary mode")
    if not sock.type & socket.SOCK_STREAM:
        raise ValueError("only SOCK_STREAM type sockets are supported")
    if count is not None:
        if not isinstance(count, int):
            raise TypeError(
                "count must be a positive integer (got %s)" % repr(count))
        if count <= 0:
            raise ValueError(
                "count must be a positive integer (got %s)" % repr(count))


def sendfile(sock, file, offset=0, count=None):
    """sendfile(sock, file[, offset[, count]]) -> sent

    Send a *file* over a connected socket *sock* until EOF is
    reached by using high-performance sendfile(2) and return the
    total number of bytes which were sent.
    *file* must be a regular file object opened in binary mode.
    If sendfile() is not available (e.g. Windows) or file is
    not a regular file socket.send() will be used instead.
    *offset* tells from where to start reading the file.
    If specified, *count* is the total number of bytes to transmit
    as opposed to sending the file until EOF is reached.
    File position is updated on return or also in case of error in
    which case file.tell() can be used to figure out the number of
    bytes which were sent.
    The socket must be of SOCK_STREAM type.
    Non-blocking sockets are not supported.
    """
    try:
        return _sendfile_use_sendfile(sock, file, offset, count)
    except _GiveupOnSendfile:
        return _sendfile_use_send(sock, file, offset, count)

Monday, May 26, 2014

Goodbye Google Code, I'm moving to Github

8 years ago I started hosting my first open source project (pyftpdlib) on Google Code and I later ended up also hosting psutil and pysendfile there. Back then GC had just been released and, similarly to other Google products, I appreciated the clean and minimalistic interface, the excellent bug tracker and the freedom to choose between different revision control systems (SVN, GIT and Mercurial, which is my favourite one). Unfortunately as the years passed Google completely lost interest in maintaining GC, to the point that now GC can basically be considered an abandoned project. If you take a look at the GC bug tracker you can see literally hundreds of issues which have been open for years, even some apparently easy ones such as #60 and #919. The lack of interest from Google is absolutely astonishing and it is the main reason why I ultimately decided to change. After at least a couple of years of thinking about migrating to github I finally bit the bullet and as of today psutil is hosted on github (update: now also pyftpdlib and pysendfile).

What I will miss the most about GC

First of all I must say that despite the unfortunate situation of GC I'm also sad about abandoning it. It started as a really great hosting platform, and it still has some peculiar aspects which I know I will be missing. In order of importance:
  • The bug tracker: it is much more powerful than github's, especially for the extremely customizable labeling system which is pure gold, the excellent searching system and the grid view. GC bug tracker seriously kicks some ass so kudos to whoever was behind its design! By comparison github bug tracker is too minimalistic and it has no good way to order issues or list them in a more compact form. I'm totally gonna miss psutil bug tracker.
  • Mercurial: I'm a big fan of Mercurial and I consider it way more pleasant to work with compared to GIT. I don't know exactly why GIT ended up being so much more used than Mercurial (probably because of github?) but I'm sure that many other guys like me who know both systems will agree that Mercurial is simply so much easier to use. Unfortunately once you decide to stick with github you have no other choice. Mercurial, I'm gonna miss you too!
  • GC layout: it is much simpler than github's! Everything is easy to find, even for a non-geek person. The home page alone is perfect to summarize what the project is about and doesn't have tens of icons all over the place. github layout is more complicated and needs some time to get used to, even for a programmer. If these projects (psutil and others) weren't about programming I wouldn't have chosen github because it's "not for the masses".

What I appreciate about github

  • Travis integration: there's this totally awesome free continuous integration service called Travis which, given a configuration file like this, automatically runs tests on multiple Python versions every time a commit is pushed. They recently added OSX support and Windows support is on the way. This way I will finally be able to quickly test psutil on Linux, OSX and Windows without using virtualized systems, except for FreeBSD and Solaris! To me this is like the ultimate Christmas gift and I couldn't ask for any better. Note: as of today Travis only works with github.
  • forks and pull requests: honestly I'm not a big fan of them (yet?), probably because I'm used to the python-dev development workflow, which consists of uploading patches to the bug tracker and reviewing them (see this for example). Nevertheless, to my understanding most people use pull requests to contribute to open source projects, so basically this is a service I'm glad to offer to my users, who hopefully will be able to contribute back more easily. GC has a cloning system but it isn't anywhere near github's and I'm not even sure how it works (never cared).
  • the "social" side of github including the fact that you can "star" developers and receive notifications about their activity was another big incentive for migrating. The personal landing page collecting all your contributions to different projects is absolutely cool. GC had something similar but they stupidly removed it all of the sudden and never reintroduced it back. A lot of people were angry but again, they didn't care. Actually this was the feature I appreciated the most about GC after the bug tracker and that is when I seriously started thinking about flipping off GC for good. 
  • the enormous user base: the fact that github is the most used code hosting platform out there will hopefully give me and my projects a little more visibility. Also, in many job interviews I've been asked what my github profile was, so it seems github also plays an active part in getting jobs.
  • gists: gists are "a simple way to share code snippets and pastes with others. All gists are Git repositories, so they are automatically versioned, forkable and usable from Git". Seriously, they are beautiful. In order to share my code snippets I've always used Active State but I think I will eventually migrate them as well in order to have everything in one place.
  • the fact that if you mention an issue number as part of your commit message that specific issue will automatically be updated (see the example after this list). As I said GC bug tracker is superior in basically every aspect but since I always took care of updating issues by mentioning the specific cset which fixed them (see for example here) having this little extra feature will save me some time.
  • SSH keys: using Mercurial on GC means using password based authentication. Incredibly they still do not support SSH key based auth. Simply "git push"ing without entering any password when I'm not on my laptop is nice, and of course, it is also much more secure.
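For example, a commit message like the following (both the bug description and the issue number are made up) automatically links to the referenced issue, and keywords like "fixes" or "closes" even close it:

git commit -m "fix fd leak in open_files(); fixes #123"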

Migration

For those of you who are interested in knowing how I did it, here goes: as for moving the issues from the GC bug tracker to github's I used this tool. I managed to preserve the issue IDs but unfortunately not the real owners nor the real issue dates, which kind of sucks. As for migrating the code from Mercurial to git I just used this. The Mercurial -> GIT transition was perfect and I also managed to preserve the original Mercurial named branches and tags, which for me was crucial. In conclusion, psutil is a 5-year-old, medium-sized project with hundreds of issues: the transition in this case is definitely possible but not painless, so if you plan on migrating, the sooner you do it the better.