pyinotify - Python wrapper for inotify

watch file and directory changes with Python under Linux

pyinotify is a simple wrapper for inotify. inotify is a Linux kernel feature (merged in kernel 2.6.13) that makes it easy to detect filesystem changes. The main features of inotify are accessed through three system calls. First, pyinotify wraps these system calls so that they can be called from Python programs. Its second goal is to provide an implementation on top of these system calls that hides the manipulation of inotify and offers higher-level functionality. Pyinotify doesn't require detailed knowledge of inotify. Moreover, it needs only a few statements to initialize, watch, handle notifications (optionally through a separate thread) and process them through an instance of a subclass. The only things you need to know are the paths of the items to watch, the kinds of changes to monitor and the actions to execute on these notifications. Note: pyinotify requires Python 2.3 or above.

You can download pyinotify here.

git repository: http://pyinotify.sourceforge.net/pyinotify.git

To familiarize yourself with pyinotify, run a first example like this:

$ cd pyinotify-x-x-x && python setup.py build
$ python src/pyinotify/pyinotify.py '/my-dir-to-watch'

Where my-dir-to-watch is a path to a valid directory. Now go into this directory and play with files: read one, create another, and so on, then compare your actions with the output produced by pyinotify. Enjoy, you have just been watching your first directory :).

Note: if you want to install pyinotify, just type:

$ cd pyinotify-x-x-x && python setup.py install

Read the README file to find out where the files are installed.

After this first try you might want to develop your own code that watches directories and files, only for some particular kinds of events, with dedicated processing functions accomplishing a specific task such as sending mail notifications or generating RSS output. pyinotify lets you do whatever kind of processing you want through subclassing. It provides base classes that abstract the boring but required parts and lets you keep the focus on your own Python development. Thus, you can add watches on many files, each for a specified set of events, and the notifications are automatically handled and processed by your own dedicated code. Browse the generated documentation (versions 0.4.x, versions 0.5.x) to view the whole class hierarchy, the exposed methods, and the event codes with their descriptions. Then read the next sections to learn how to use pyinotify.

Let's introduce the Python namespaces through which pyinotify can be accessed, which should help you understand its logic. pyinotify is composed of two modules, named inotify and pyinotify:

  • The namespace inotify is a simple raw wrapper for inotify (wrapping 3 system calls and 3 variables available in /proc). You will probably never need or want to import this namespace directly unless you know what you are doing.
  • The namespace pyinotify exposes the higher-level developments built on top of inotify, which are the purpose of pyinotify.

OK, let's start a more detailed example. Say we want to monitor the temp directory '/tmp' and all its subdirectories for every file creation or deletion. Additionally, we want to print a meaningful message on standard output for each notification.

You now have two options. You can run the monitoring from the thread that sets it up: the main benefit is that no new thread is needed, the drawback is that your program blocks in the monitoring task. Or, if you don't want to block your main thread, you can handle the monitoring in a new thread, if this prospect does not frighten you, and it shouldn't. Both ways of doing the same thing are available in pyinotify. It is up to you to choose the one best suited to your needs and consistent with your constraints. Next, we detail the two approaches: first their common code, then their differences.
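
The trade-off between the two styles can be pictured with a small sketch that does not use pyinotify at all. FakeMonitor is a hypothetical stand-in whose loop only counts iterations; the names and the 0.01 second polling interval are assumptions made for illustration. Calling monitor.loop() directly would block the caller (the non-threaded style), whereas handing it to a thread leaves the caller free:

```python
import threading
import time


class FakeMonitor(object):
    """Hypothetical monitor: counts iterations instead of reading inotify."""

    def __init__(self):
        self.iterations = 0
        self._stopped = threading.Event()

    def loop(self):
        # Blocks whichever thread calls it until stop() is invoked.
        while True:
            self.iterations += 1            # stands for "read and process events"
            if self._stopped.wait(0.01):    # True once stop() has been called
                break

    def stop(self):
        self._stopped.set()


# Threaded style: the loop runs in the background.
monitor = FakeMonitor()
worker = threading.Thread(target=monitor.loop)
worker.start()
time.sleep(0.05)    # the main thread is free to do unrelated work here
monitor.stop()      # ask the loop to finish
worker.join()       # wait until the background thread has exited
print(monitor.iterations >= 1)
```

The price of the threaded style is the explicit stop/join step at the end; the non-threaded style avoids it but keeps the calling thread busy inside the loop.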

First we need the Python import statements. ThreadedINotify and SimpleINotify are the main classes; we will instantiate one of them depending on our threading design choice. The former runs the monitoring in a new thread, whereas the latter runs it without instantiating a new thread (as explained above). Our customized processing class, dedicated to processing notifications, will inherit from ProcessEvent. Finally, EventsCodes provides the set of codes, each code being associated with an event.

import os
from pyinotify import ThreadedINotify, SimpleINotify, ProcessEvent, EventsCodes

The following class inherits from ProcessEvent, handles notifications and runs the defined actions. We will watch only two events, IN_CREATE and IN_DELETE. Each can be given its own processing method by providing a method whose name follows the syntax process_*EVENT_NAME*, where *EVENT_NAME* is the name of the event to process. For the sake of simplicity, our two methods are very basic: they only print messages on standard output:

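class PTmp(ProcessEvent):
    # Called for each IN_CREATE event: event_k.path is the watched
    # directory, event_k.name the entry created inside it.
    def process_IN_CREATE(self, event_k):
        print "Create: %s" % os.path.join(event_k.path, event_k.name)

    # Called for each IN_DELETE event, with the same attributes.
    def process_IN_DELETE(self, event_k):
        print "Remove: %s" % os.path.join(event_k.path, event_k.name)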

The value representing the set of events we plan to watch (IN_DELETE and IN_CREATE) is assigned to the variable mask:

mask = EventsCodes.IN_DELETE | EventsCodes.IN_CREATE
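
To make the role of mask concrete, here is a minimal, self-contained sketch of how such a bitmask is built and tested. It does not import pyinotify: the three constants carry the values defined by the Linux kernel header <sys/inotify.h>, and the helper wants() is hypothetical, introduced only for illustration.

```python
# Bit values as defined by the Linux kernel header <sys/inotify.h>.
IN_MODIFY = 0x00000002
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200

# OR-ing the flags yields a single integer describing the whole set.
mask = IN_DELETE | IN_CREATE


def wants(mask, event_bit):
    """Hypothetical helper: True if event_bit is part of mask."""
    return (mask & event_bit) != 0


print(hex(mask))                # 0x300
print(wants(mask, IN_CREATE))   # True
print(wants(mask, IN_MODIFY))   # False
```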

The next two sections describe SimpleINotify and ThreadedINotify respectively:

  • Class SimpleINotify: simple (non-threaded) inotify's monitoring

    This statement instantiates our monitoring class and performs the initializations, in particular the instantiation of inotify.
    ino = SimpleINotify()
    
    The next statement adds a watch on the path given as first parameter and, recursively, on all its subdirectories; note that symlinks are not followed. The recursion is due to the last parameter, rec, being set to True. rec is optional and False by default, so by default only the given directory itself is monitored. The third parameter is an instance of your customized class; this instance is called for each event to process, roughly like this: PTmp()(event), where event is the observed (and enqueued) event; for more details on what an event is, see the dedicated section. The result, a dictionary whose keys are paths and whose values are the corresponding watch descriptors (wd), is assigned to wdd. A new wd is created each time a new file or directory starts being monitored. It is useful (and often necessary) to keep these wds in order to update or remove a particular watch later, see the dedicated section. Obviously, if the monitored item had been a file, the rec parameter would have been ignored whatever its value.
    wdd = ino.add_watch('/tmp', mask, PTmp(), rec=True)
    
    Let's start reading the events and processing them. Note that during the loop we can freely add, update or remove watches on files or directories; we can also do anything else we want, even things unrelated to inotify. We call the close() method when we want to stop monitoring.
    while True: # loop forever
        try:
            # process the queue of events as explained above
            ino.process_events()
            if ino.event_check():
                # read the new incoming events and enqueue them
                # for the next processing session
                ino.read_events()
        except KeyboardInterrupt:
            # destroy the inotify's instance on this interrupt (stop monitoring)
            ino.close()
            break
    
  • Class ThreadedINotify: threaded inotify's monitoring

    The first statement instantiates our monitoring class and performs the initializations. The second statement starts the monitoring thread, which does nothing yet since no directory or file is being monitored.
    ino = ThreadedINotify()
    ino.start()
    
    As above, we add a recursive watch on '/tmp' and keep the returned dictionary:
    tmp_path = '/tmp'
    wdd = ino.add_watch(tmp_path, mask, PTmp(), rec=True)
    
    At any moment we can, for example, remove the watch on '/tmp' like this:
    if wdd[tmp_path] > 0: # test if the wd is valid, this test is not mandatory
        ino.rm_watch(wdd[tmp_path])
    
    Note that its subdirectories (if any) are still being watched. If we wanted to remove the watch on '/tmp' and all the watches on its subdirectories, we could have done it like this:
    ino.rm_watch(wdd[tmp_path], rec=True)
    
    Or, even better, like this:
    ino.rm_watch(wdd.values())
    
    That's it: most of the code is written. From here we can add, update or remove watches on files or directories following the same principles. The only important remaining task is to stop the thread when we wish to stop monitoring; this automatically destroys the inotify instance. Call the following method:
    ino.stop()
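
The bookkeeping behind rec=True and the wdd dictionary can be pictured with the following sketch. It makes no inotify call at all: fake_add_watch is a hypothetical stand-in that walks the tree with os.walk (which does not follow symlinked directories by default) and hands out made-up increasing integers in place of real watch descriptors.

```python
import os
import shutil
import tempfile


def fake_add_watch(path, rec=False):
    """Hypothetical stand-in for add_watch: returns {path: fake_wd}.

    No inotify call is made; the integers only imitate watch descriptors.
    """
    if rec and os.path.isdir(path):
        # One entry per directory of the tree, symlinks not followed.
        paths = [root for root, _dirs, _files in os.walk(path)]
    else:
        # rec is meaningless for a file and off by default for a directory.
        paths = [path]
    return dict((p, wd) for wd, p in enumerate(paths, 1))


# Build a throwaway tree: top/a/b
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "a", "b"))

flat = fake_add_watch(top)             # only the directory itself
deep = fake_add_watch(top, rec=True)   # the directory and its subdirectories
print(len(flat))
print(sorted(deep.values()))

shutil.rmtree(top)
```

Because the returned dictionary holds one descriptor per watched directory, passing all of its values to a removal call covers the whole tree, which is what the rm_watch(wdd.values()) form relies on.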
    

There are a few remarks and special cases worth considering:

  • EventsCodes.ALL_EVENTS isn't a true event, which means that you don't have to implement a method process_ALL_EVENTS (worse, it would be wrong to define this method). It is just an alias for all the events: the real type of each event is kept, and the processing actions remain separate. Obviously, if we want to apply the same actions whatever the kind of event, we only have to implement a process_default method (for a complete example see: src/examples/simple.py).
  • Say we want to process events from the same 'family' with a common processing method, e.g. for this family of events:
    mask = EventsCodes.IN_CLOSE_WRITE | EventsCodes.IN_CLOSE_NOWRITE
    
    It is enough to provide a processing method named process_IN_CLOSE, following the syntax process_IN_*familybasename*. Both events above will be processed by this method. In this case, be careful not to implement either of the methods process_IN_CLOSE_WRITE and process_IN_CLOSE_NOWRITE, because they have a higher precedence (see below): they are looked up first and would be called instead of process_IN_CLOSE (for a complete example see: src/examples/close.py).
  • Look-up precedence of processing methods (by decreasing order of priority): specialized method (e.g. process_IN_CLOSE_WRITE), family method (e.g. process_IN_CLOSE), default method (process_default).
  • For more detailed messages from pyinotify, turn on the DEBUG variable located in src/pyinotify.py.
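
The look-up order described above (specialized method first, then family method, then process_default) can be sketched as follows. This is not pyinotify's actual implementation: Handler stands in for a ProcessEvent subclass, dispatch() is a hypothetical function, and deriving the family name by dropping the last underscore-separated suffix is an assumption made for illustration.

```python
class Handler(object):
    """Stand-in for a ProcessEvent subclass (no pyinotify import needed)."""

    def process_IN_CREATE(self, event_name):
        return "specialized handler got %s" % event_name

    def process_IN_CLOSE(self, event_name):
        return "family handler got %s" % event_name

    def process_default(self, event_name):
        return "default handler got %s" % event_name


def dispatch(handler, event_name):
    """Hypothetical look-up: specialized, then family, then default."""
    # 1. Specialized method, e.g. process_IN_CLOSE_WRITE.
    method = getattr(handler, "process_" + event_name, None)
    if method is None:
        # 2. Family method: IN_CLOSE_WRITE -> IN_CLOSE (assumed rule).
        family = event_name.rsplit("_", 1)[0]
        method = getattr(handler, "process_" + family, None)
    if method is None:
        # 3. Catch-all.
        method = handler.process_default
    return method(event_name)


h = Handler()
print(dispatch(h, "IN_CREATE"))       # specialized handler got IN_CREATE
print(dispatch(h, "IN_CLOSE_WRITE"))  # family handler got IN_CLOSE_WRITE
print(dispatch(h, "IN_DELETE"))       # default handler got IN_DELETE
```

Adding a process_IN_CLOSE_WRITE method to Handler would make the first step succeed for that event, so process_IN_CLOSE would no longer be reached for it, which is the pitfall the remark above warns about.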

Each event is an instance of the class Event and is dispatched to the appropriate processing method (according to its type), which can take actions in response to it. It carries useful information in its attributes:

  • wd: the watch descriptor, a unique identifier representing the watched item through which this event was observed.
  • path: the complete path of the watched item, as given in parameter to the method add_watch.
  • name: not None only if the watched item is a directory and the current event occurred on an element inside this directory.
  • mask: a bitmask of events; it carries all the types of events watched on wd.
  • isdir: a boolean flag set to True if the event occurred on a directory.
  • cookie: a unique identifier that ties together two otherwise separate 'moved to' and 'moved from' events.
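
As an illustration of what the cookie attribute makes possible, the sketch below pairs 'moved from' and 'moved to' notifications into renames. It is self-contained and hypothetical: FakeEvent only imitates a few attributes of Event, and its kind field (a plain string) is an assumption standing in for the real event type.

```python
from collections import namedtuple

# Stand-in for the Event class, limited to the attributes used below.
FakeEvent = namedtuple("FakeEvent", "path name kind cookie")


def pair_moves(events):
    """Match each 'moved from' with the 'moved to' sharing its cookie."""
    pending = {}    # cookie -> source path awaiting its counterpart
    renames = []
    for ev in events:
        full = ev.path + "/" + ev.name
        if ev.kind == "IN_MOVED_FROM":
            pending[ev.cookie] = full
        elif ev.kind == "IN_MOVED_TO" and ev.cookie in pending:
            renames.append((pending.pop(ev.cookie), full))
    return renames


events = [
    FakeEvent("/tmp", "a.txt", "IN_MOVED_FROM", 41),
    FakeEvent("/tmp", "c.txt", "IN_MOVED_FROM", 42),
    FakeEvent("/tmp", "b.txt", "IN_MOVED_TO", 41),
]
print(pair_moves(events))   # [('/tmp/a.txt', '/tmp/b.txt')]
```

The event with cookie 42 stays unpaired: its destination was never reported, for instance because the file was moved to a directory that is not watched.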

Question: among the methods add_watch, update_watch and rm_watch, which one must be called with a string path and which ones with a watch descriptor?

This question should be fairly simple to answer, but it's worth clarifying once and for all with a simple table. It recalls the kind of parameter accepted by each method:

Method: add_watch
  Parameter: path (or list of paths)
  Returned result: {path1: wd1, path2: wd2, ...}
    Where wdx is the watch descriptor associated with pathx.
  Example:
    ra = ino.add_watch('/a-dir', mask)
    if ra['/a-dir'] > 0: print "added"

Method: update_watch
  Parameter: wd (or list of wds)
  Returned result: {wd1: success, wd2: success, ...}
    Where success is True if the operation on wdx succeeded, False otherwise.
  Example:
    ru = ino.update_watch(ra['/a-dir'], new_mask)
    if ru[ra['/a-dir']]: print "updated"

Method: rm_watch
  Parameter: wd (or list of wds)
  Returned result: {wd1: success, wd2: success, ...}
    Where success is True if the operation on wdx succeeded, False otherwise.
  Example:
    rr = ino.rm_watch(ra['/a-dir'])
    if rr[ra['/a-dir']]: print "deleted"

So, we add a watch with a path and get a result that holds every path together with its associated watch descriptor. If a path could not be watched its wd is negative; otherwise its wd is positive.
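
Checking such a result can be sketched as below. The dictionary ra is made-up data shaped like the result described above, and split_results is a hypothetical helper, not part of pyinotify:

```python
def split_results(added):
    """Separate watched paths from failed ones in an add_watch-style result."""
    watched = dict((p, wd) for p, wd in added.items() if wd > 0)
    failed = sorted(p for p, wd in added.items() if wd <= 0)
    return watched, failed


# Made-up result: two paths accepted, one rejected with a negative wd.
ra = {"/a-dir": 1, "/a-dir/sub": 2, "/missing": -1}
watched, failed = split_results(ra)
print(sorted(watched.values()))   # [1, 2]
print(failed)                     # ['/missing']
```

Keeping only the positive descriptors up front means that later update or removal calls are never handed an invalid wd.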

The methods for updating or removing a watch only take watch descriptors and return a dictionary reporting the success or failure of each operation.

In the extreme case where your parameter doesn't fit the expected format, which can happen if you have lost previously returned values, you can use the method get_wd(a_path) and its counterpart get_path(a_wd). The former takes a path and returns its wd; the latter takes a wd and returns its path. For performance and reliability, these methods should be avoided as much as possible.

Whether as an introduction to inotify or as a way to dive into pyinotify, some further reading is worth mentioning. The suggestions below are sorted in reading order, but this is only advice; proceed as you see fit.

  • Read this excellent introduction to inotify written by its co-author Robert Love.
  • Read the pyinotify documentation mentioned above.
  • If you want to write code beyond the basic example, read the Python files in the example directory src/examples/*.py and the tests in src/tests/*.py provided with pyinotify.
  • And finally, you can directly read parts of the src/pyinotify/pyinotify.py source code; it is relatively short and readable (thanks to Python :) ).
sebastien.martini at gmail.com - Last update: 03-2006