Posts in category daemon

Twisted web, thread pools and daemonization

The title of this blog post could have been a ticket summary like: WMI takes 100% of CPU when daemonized (with a lot of "WSGI application error" and EBADF crashes), but everything is OK when it stays attached to the current terminal. Luckily I found the solution before posting a new ticket and pushing changes to the branch ;-)

It gave me quite a headache, because there is no documentation on this subject on the Twisted website. Twisted folks tend to assume everyone will use their twistd for daemonization, which is a broken assumption when Twisted just has to fit into an existing architecture.

I searched a lot on the daemonization side and cleaned up my foundations.process.daemonize() function a little, but this didn't help. Then Rob Golding gave me a hint about it, which I found with a lot of luck via a very generic « twisted django wsgi » Google search.

I just moved the twisted imports after the daemonization call, and everything went fine again. Fortunately, importing twisted.web.version at the beginning of my wmi.py file is still possible without breaking this thread-pool thing, so the "please install twisted.web" message can still be displayed too. Nice!
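To make the import ordering concrete, here is a minimal sketch. It assumes the trouble comes from state that Twisted sets up when its reactor is imported and that does not survive the forks. The daemonize() below is a generic double-fork stand-in (not the real foundations.process.daemonize()), and hello_app is a hypothetical WSGI application in place of the WMI.

```python
import os
import sys


def daemonize():
    """Generic double-fork daemonization (stand-in helper, not Licorn's)."""
    if os.fork() > 0:
        os._exit(0)              # first parent exits
    os.setsid()                  # new session, no controlling terminal
    if os.fork() > 0:
        os._exit(0)              # second parent exits
    sys.stdout.flush()
    sys.stderr.flush()
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):         # point stdin/stdout/stderr to /dev/null
        os.dup2(devnull, fd)


def hello_app(environ, start_response):
    """Hypothetical WSGI application standing in for the WMI."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def serve(port=3356):
    daemonize()
    # Twisted is imported only now, inside the final daemon process, so the
    # reactor and its thread pool are created after the forks.
    from twisted.internet import reactor
    from twisted.web.server import Site
    from twisted.web.wsgi import WSGIResource

    resource = WSGIResource(reactor, reactor.getThreadPool(), hello_app)
    reactor.listenTCP(port, Site(resource))
    reactor.run()
```

Calling serve() detaches the process and only then pulls Twisted in; nothing Twisted-related exists in the process before the forks.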

Upcoming WMI2 internals (and new repo in Trac)

The following diagram tries to explain how the future WMI2 works. This architecture is already fully functional in our development repository and will enter the stable branch soon (official due date: March 5th, 2012). We are currently polishing it before releasing the patch.

The new WMI will be a huge step towards an interactive, web-2.0-like interface: everything can be handled asynchronously, and the WMI can update any part of its interface without refreshing the whole page. It is a fully-featured Django + Jinja2 + jQuery application, which runs on top of our new webserver. The webserver is fully WSGI compliant, and built on top of the great coroutine-based gevent library.

[Figure: RPC Events Arch for WMI2]

As a side note, I've now integrated the WMI2 repository into Trac to help follow the changes and global Licorn® activity. Without this, one could think that Licorn® development is halted, because it all happens in this separate branch until it's merged into the stable one.

New inotifier (v4) finished, and a bunch of (small but cool) new features

I'm proud to announce the new inotifier rewrite (and its bunch of small enhancements), internally and lovingly named "hopefully-this-one-will-work-as-expected" (a private joke of mine). It's shorter than the previous version in terms of lines of code, albeit more complex when dealing with special cases (large directories, multiple concurrent accesses to the same files, reborn just-deleted files or dirs, etc.). The new version is many times faster than all previous ones (including the external C-implemented gamin one). When you untar an archive, you can expect complete ACL application to need roughly the duration of the untar process again, once it has finished. Previously, it could take minutes to do the same (specifically when untarring the Linux kernel in a shared dir). licornd is also very smart about resource consumption: it takes the CPU for ACL-intensive tasks (but only ONE CPU), and doesn't keep it for long. For what it has to do, I find it well balanced from the functionality/resource point of view.

The new inotifier and the related core.classes additions now allow users' homes to be watched, and offer dedicated functionalities to handle configuration files and report *real* changes to them (not every access, which generated a lot of false positives).

The dnsmasq backend and privileges directly benefit from these new functionalities. The shadow configuration file watch is more robust and verifies everything when the files are reloaded (one could create inconsistencies by editing the files manually; this is taken into account).

There are still some rough edges and evil sub-sonic bugs (perhaps they are all the same one, I can't hunt them down for now), but only on very heavily loaded systems, where users and groups pop in and out very fast. I will fix them in the next coding cycle.

Hopefully, you won't need the chk group command anymore. If you do, please provide a full trace:

export LTRACE=std
licornd -rvD

<whatever command in your other terminal>

In the new-but-small-but-cool features category, you'll find the command fuzzy matching:

get u
get us
get usr
get users

(and so on, with identical counterparts for add/mod/del/chk)

All of these will bring you the list of users. In the same vein:

get g     -> groups
get pro   -> profiles
get pri   -> privileges
get kw    -> keywords

And so on. Everything is computed when you type it; there are no so-called "fixed values".
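One possible way to get this behaviour is sketched below. It is not the actual Licorn® algorithm (which isn't shown here), just an assumption that fits the examples above: an abbreviation is first tried as a prefix, then as an ordered subset of the letters of each known name. KNOWN_OBJECTS and resolve() are hypothetical names.

```python
KNOWN_OBJECTS = ("users", "groups", "profiles", "privileges", "keywords")


def is_subsequence(short, word):
    """True if the letters of `short` appear in `word`, in the same order."""
    letters = iter(word)
    return all(ch in letters for ch in short)


def resolve(abbrev, choices=KNOWN_OBJECTS):
    """Expand an abbreviation to the single matching name, or raise ValueError."""
    prefix_matches = [c for c in choices if c.startswith(abbrev)]
    loose_matches = [c for c in choices if is_subsequence(abbrev, c)]
    # Prefix matches win; the looser match is only a fallback.
    for candidates in (prefix_matches, loose_matches):
        if len(candidates) == 1:
            return candidates[0]
        if candidates:
            raise ValueError("ambiguous %r: %s" % (abbrev, ", ".join(candidates)))
    raise ValueError("unknown object %r" % abbrev)


for word in ("u", "usr", "g", "pro", "pri", "kw"):
    print(word, "->", resolve(word))
```

Nothing is hard-coded per abbreviation: adding a new name to the tuple is enough, and an ambiguous input such as "p" is rejected instead of being guessed.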

In the not-so-small-but-very-cool category, you will find that every part of Licorn® is now fully multilingual, on the fly: the daemon starts in the system language, but every thread inside it can switch to another language, and the client languages are pulled in from the web headers or the calling CLI environment. This makes everything dynamic, at will.

Documentation has been updated for the permissions parts.

French translation is progressing well: the WMI part is finished, the CLI is 90% done, and the rest is more or less 70% done (it doesn't matter anyway, as no user really sees it in real life).

I'm deliberately not detailing the core object rewrite. It's very technical and doesn't bring new end-user functionalities, but guarantees that everything is cleaner and easier to extend inside licornd, from the users/groups/profiles/privileges/machines point of view.

I probably forgot many things here, but if I had written a book, you wouldn't have read it anyway. Coding and *using* the code is better. Many bugs have been fixed, and the code is generally more pythonic and lighter than before: there are more generators, fewer hard-coded things, and abstractions (when necessary) landed in the right places. At least, this is how I wanted to implement them.

Enjoy,

Core-rewrite #1 + full-i18n, new inotifier work started

The past 2 weeks have featured a great core rewrite that I had wanted to achieve for a long time. Core objects (Users, Groups & Profiles) are now clean objects, implemented with all the pythonic fanciness that modern code can have (most notably properties, weak references and internal generators where applicable). Controllers manipulate them in a clean way too, doing things the CoreUnitObjects can't do because they are not aware of the controller context (which makes the whole thing totally logical, finally; things as they were meant to be, at last; [put your favorite self-satisfaction sentence here]).

Controllers and unit objects make moderate use of the daemon service facility, to make things feel more immediate to clients and avoid long blocking operations: for example, setting permissive ON/OFF on groups launches a background check on shared data and returns instantly (among others).

As a consequence, the service facilities are initialized very early in the daemon (even before the LMC initializes) and are usable everywhere: each individual object has a licornd read-only property (named this way to avoid a collision with the Thread's daemon attribute), directly offering the {service,aclcheck,network}_* methods. The import overhead and dirtiness of the previous implementation are totally avoided.

The patch lies in the development repository (not referenced in Trac but accessible to SSHers). I will not push it to the stable branch until the new inotifier has landed, but it is very stable (the testsuite has run many times on it).

The Full-i18n milestone will soon be closed (or nearly), because #2, #70, #541, #542 and #544 are implemented or closed in this patch. Thus, besides the full-object rewrite, we now have transparent, on-the-fly, in-thread language switching, and this really rocks for the WMI.
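A rough sketch of the pattern described in the two paragraphs above, under stated assumptions: ServiceFacility, service_enqueue() and the Group class below are hypothetical stand-ins (the real method names are only known to start with service_, aclcheck_ or network_), and a single worker thread plays the role of the daemon's service threads.

```python
import queue
import threading
import traceback


class ServiceFacility:
    """Hypothetical stand-in for the daemon's background service facility."""

    def __init__(self):
        self._jobs = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            func, args = self._jobs.get()
            try:
                func(*args)
            except Exception:
                traceback.print_exc()   # keep the worker alive on job errors
            finally:
                self._jobs.task_done()

    def service_enqueue(self, func, *args):
        """Queue a job for the worker and return immediately."""
        self._jobs.put((func, args))

    def wait(self):
        """Block until every queued job has been processed."""
        self._jobs.join()


class Group:
    def __init__(self, name, licornd):
        self.name = name
        self.permissive = False
        self.checks_done = []
        self._licornd = licornd

    @property
    def licornd(self):
        """Read-only handle on the daemon services (no setter is defined)."""
        return self._licornd

    def set_permissive(self, value):
        self.permissive = value
        # The slow check runs in the background; the caller is not blocked.
        self.licornd.service_enqueue(self._check_shared_data)

    def _check_shared_data(self):
        self.checks_done.append(self.permissive)


services = ServiceFacility()
group = Group("demo", services)
group.set_permissive(True)      # returns at once
services.wait()                 # only needed here to observe the result
print(group.checks_done)
```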

LTRACE enhancement in the daemon

From now on, the only thing you have to do to enable LTRACE is to set the environment variable. If the daemon finds it at start, it will refork itself with the appropriate options to enable it.

The LTRACE status is maintained while restarting from inside the daemon (this was not as easy as it seems to implement).

Bonus: just export LTRACE=std to get a nice but not so verbose LTRACE output. You will notice that the only places where the daemon hangs on stop/restart are justified (__unihibit_udisks() and PyroFinder queue not empty). In this case, just wait. The udisks thing can take up to 10 secs, and the PyroFinder one up to 20 secs (I'm looking into how to lower this one; the Pyro timeout is already set to 5 secs).

Volumes and Rdiff-backup extensions entered the repository

These extensions are ready for wider testing. They have their documentation set up (volumes and rdiffbackup).

A major thread rewrite has been done during this dev cycle.

CLI and WMI related parts are coming in.

Network discovery in the daemon

The daemon now has an auto-discovery capability on its local network(s). It will scan the LAN (on all of its ethernet interfaces) and try to discover all hosts which are up (i.e. which answer ping).

With 10 to 30 network threads in each pool (the default is 5, which is resource-conservative), it can be faster than nmap at doing a full LAN discovery (and you gain ARP resolution and Pyro resolution as a bonus).

You can disable the feature if you don't like it or experience problems with it (don't forget to report bugs in this case).

We finally got rid of nmap and all those resource-consuming subprocess.Popen calls, with a big functionality gain.
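The pooled scan can be sketched as follows. This is only an illustration using the modern standard library (concurrent.futures and ipaddress), not the daemon's own thread pools and queues. The probe callable is an assumption: it stands for whatever "is this host up?" check is used (an ICMP ping in the daemon), and the one shown here is a fake so the example runs anywhere.

```python
import ipaddress
from concurrent.futures import ThreadPoolExecutor


def discover(network, probe, workers=5):
    """Return the hosts of `network` for which probe(host) is true.

    `workers` is the pool size; 5 mirrors the conservative default,
    10 to 30 makes a full LAN scan noticeably faster.
    """
    hosts = [str(ip) for ip in ipaddress.ip_network(network).hosts()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in the same order as `hosts`.
        answers = list(pool.map(probe, hosts))
    return [host for host, up in zip(hosts, answers) if up]


def fake_probe(host):
    """Stand-in for a real ping: pretends only .1 and .4 answer."""
    return host.endswith((".1", ".4"))


print(discover("192.168.0.0/29", fake_probe, workers=10))
```

Since each probe spends most of its time waiting for an answer or a timeout, adding threads shortens the scan almost linearly until the whole range is covered at once.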

Licornd is now fully interactive

You can debug licornd in a live interactive session. Just start it in the foreground (licornd -D), and press 'i'. You can access every object and do whatever you want (trigger a method, dump an object, really whatever the Python language allows you to do; so be careful, you're root...).

Bonus: everything is auto-completed with <TAB>, like any interactive shell can be, and your command history is saved to ~/.licorn/licornd_history (thanks to readline).

Press Control-D when you are done to leave interactive mode and return to standard command mode.

The interactive session is implemented in a separate thread. The licorn daemon stays fully functional during it. Stopping and restarting individual things manually will probably come in the near future, but as of now, if your display gets corrupted by other daemon output, just hit Control-L to clear your screen.
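For the curious, the standard library already provides most of what such a session needs. The sketch below is not the licornd implementation, only an assumption of how it could look: make_console() and start_interactor() are hypothetical helpers, built on code.InteractiveConsole, with readline and rlcompleter providing <TAB> completion when available.

```python
import code
import threading

try:
    import readline
    import rlcompleter
except ImportError:              # platforms without readline
    readline = None


def make_console(namespace):
    """Build a console whose commands run inside `namespace`."""
    return code.InteractiveConsole(locals=namespace)


def start_interactor(namespace, banner="Entering interactive mode."):
    """Run the console in its own thread; the caller keeps working."""
    if readline is not None:
        readline.set_completer(rlcompleter.Completer(namespace).complete)
        readline.parse_and_bind("tab: complete")
    console = make_console(namespace)
    thread = threading.Thread(
        target=console.interact, kwargs={"banner": banner}, daemon=True)
    thread.start()               # interact() returns on Control-D
    return thread


# Non-interactive demonstration: feed one line to the console.
console = make_console({"answer": 41})
console.push("answer += 1")
print(console.locals["answer"])
```

Passing the daemon's live objects in the namespace is what makes them reachable from the prompt; the console works on the real objects, not on copies.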

Example session:

olive@desktop-001 ~/licorn @ licornd -D
 * [2010/04/12 01:01:44.8412] licornd/master@server(5124): starting all threads.
 * [2010/04/12 01:01:44.8454] licornd/wmi(5129): started, waiting for master to become ready.
 * [2010/04/12 01:01:44.9980] licornd/master@server(5124): all threads started, going to sleep waiting for signals.
 * [2010/04/12 01:01:45.4224] licornd/wmi(5129): ready to answer requests at address http://localhost:3356/.
 * [2010/04/12 01:01:47.6714] Entering interactive mode. Welcome into licornd's arcanes…
Licorn® @DEVEL@, Python 2.6.6 (r266:84292, Sep 15 2010, 15:52:39) [GCC 4.4.5] on linux2
licornd> LMC.users.keys()
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 33, 34, 38, 39, 41, 300, 113, 100, 101, 102, 103, 1000, 105, 106, 107, 108, 109, 110, 111, 112, 104, 114, 115, 116, 117, 65534]
licornd> dqueues
{'pyrosys': <Queue.Queue instance at 0x9c7a18c>, 'reverse_dns': <Queue.Queue instance at 0x9c7a04c>, 'arppings': <Queue.Queue instance at 0x9c71eec>, 'pings': <Queue.Queue instance at 0x9c71dac>}
licornd> 
 * [2010/04/12 01:02:24.0383] Leaving interactive mode. Welcome back to Real World™.
 * [2010/04/12 01:02:25.3743] licornd/master@server: signal 2 received, shutting down…
 * [2010/04/12 01:02:25.3748] licornd/wmi: signal 15 received, shutting down…
 * [2010/04/12 01:02:25.3751] licornd/wmi: exiting.
 * [2010/04/12 01:02:25.5468] licornd/master@server: exiting (up 40 secs).

PS: I really love Python. This kind of thing is just amazing.

PS2: the implementation may be incomplete, I didn't really test every object. Just report any bug you find and I will fix it ASAP.

Core refactor 001

As you can read in my last big patch (~rev 593), the Licorn® core has been greatly refactored. There is now a global (but fine-grained locked) object, called the LMC (LicornMasterController), which holds all controllers (UsersController, GroupsController and others). It holds 2 other special controllers:

  • the BackendController (real object pretty name: LMC.backends), which has been pulled out of the LicornConfiguration object and holds all backends. The news is that all backends are now equal (there is no level difference between dnsmasq and ldap, for example). Backend objects are just much cleaner to read and understand. Controllers can use different backends if there's a good reason to. A direct consequence is that LMC.configuration is much simpler than before (and the diet will continue).
  • the LockManager: for now it is just an Enumeration, but it will soon become a smarter object. It holds all locks for all other controllers, and all locks for the controllers' individual records. Locks are stored outside of controllers because Pyro can't pickle lock objects (which seems quite fine if you think about it). The LockManager (real object pretty name: LMC.locks) is not exported via Pyro, because all other controllers use it internally and it is not meant to be accessed directly. To lock an object remotely, just call the <controller_name>.acquire() and release() methods as you would on a normal lock, and the LicornCoreController class will wrap everything for you.

The LMC is accessed directly in the daemon, and remotely via a single LMC.connect() call in the CLI and the WMI. Pyro-related changes and their consequences in the project are nearing their end, and the Licorn® core has benefited from a big clean-up along the way. This is a big piece of work, and a great enhancement: we gained more functionalities, guaranteed consistency and fine-grained locking, with a much simpler codebase. Let the magic (and hard work) continue.
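The lock arrangement described above can be illustrated with a short sketch. The classes below are simplified, hypothetical stand-ins (they share names with the real ones for readability only), and the standard pickle module is used to show the underlying limitation, since lock objects cannot be serialized.

```python
import pickle
import threading


class LockManager:
    """Keeps every lock in one place, outside the objects sent over the wire."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def get(self, name):
        with self._guard:
            if name not in self._locks:
                self._locks[name] = threading.RLock()
            return self._locks[name]


LOCKS = LockManager()            # lives in the daemon, never serialized


class UsersController:
    """Holds only plain data; locking is delegated to the manager."""

    name = "users"

    def __init__(self):
        self.logins = []

    def acquire(self):
        return LOCKS.get(self.name).acquire()

    def release(self):
        LOCKS.get(self.name).release()


try:
    pickle.dumps(threading.Lock())
except TypeError as exc:
    print("locks cannot be pickled:", exc)

controller = UsersController()
controller.acquire()
try:
    controller.logins.append("olive")
finally:
    controller.release()
```

Since the controller carries no lock attribute of its own, its state stays made of plain data, while callers still get the familiar acquire()/release() interface.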

Daemon status available

In my last patch I added the ability to query the daemon status, in two ways:

  • the get status command, which can be called with the --full argument. This command is complemented by get users --dump, get groups --dump and so on, which help debug the daemon's internal data structures without stopping it. This method of getting the daemon status is independent of its state (forked into the background or not).
  • when the daemon is attached to the terminal (launched with -D), you can now type single-key commands to query it:
    • 'f' or 'l' will toggle between normal and full status.
    • [Enter] will just display a newline (useful for manually marking spaces between different operations).
    • Ctrl-L will clear the screen, like in a normal terminal.
    • Ctrl-T will display the current status of the daemon (whether the full status is shown depends on whether you activated it before; it is disabled by default and, once set, remembered for the whole daemon session, until terminate or restart).
    • Ctrl-Y (or space) will do the same, but will clear the screen first. Pressing space repeatedly will emulate a top-like behaviour, which lets you monitor the daemon status even if it is very active.
    • Ctrl-R will reload the daemon (by sending it a USR1 signal). Very useful when you have modified daemon or core code: just hit Ctrl-R in your daemon terminal and you're done, with the new code reloaded.
    • Ctrl-C will break and terminate, as expected.
    • Ctrl-U will terminate the daemon with a traditional signal 15 (simulating a normal kill or killall).
    • (Caution) Ctrl-K will send a real KILL signal, for when the daemon is stuck.

In some rare cases (when the interactor thread has crashed, which never happens ;-) ), you will not be able to use these commands and will need to operate "à l'ancienne" (Ctrl-Z, bg, sudo killall -r licornd et al.).

Pyro changes in stable branch

Pyro work has been committed to the stable branch. News:

  • you don't need to use sudo anymore with licorn commands. If you're a member of group admins, everything should be transparent for you.
  • every CLI tool and the WMI need the daemon to work.
  • CLI tools will launch the daemon if needed and wait for it to be ready before continuing.

Functionality-wise, nothing should have changed (this is guaranteed by the testsuite).

Security-wise, core objects are not yet protected with locks, but this is the next thing on my list. For everyday use, this should not hurt (I've tried to crash it, but didn't succeed).
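The "launch if needed, then wait" behaviour of the CLI tools boils down to a small polling loop. The sketch below is generic and makes no claim about how the real tools detect the daemon: is_ready and launch are hypothetical callables supplied by the caller (for instance "can I connect to the daemon?" and "start licornd").

```python
import time


def ensure_daemon(is_ready, launch, timeout=10.0, interval=0.1):
    """Start the daemon if it is not answering, then wait until it is.

    Returns True if the daemon had to be launched, False if it was
    already running. Raises TimeoutError if it never becomes ready.
    """
    if is_ready():
        return False
    launch()
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError("daemon did not become ready in time")
        time.sleep(interval)
    return True


# Simulated daemon that becomes ready a few polls after being launched.
state = {"launched": False, "polls": 0}


def fake_launch():
    state["launched"] = True


def fake_is_ready():
    if state["launched"]:
        state["polls"] += 1
    return state["polls"] >= 3


print(ensure_daemon(fake_is_ready, fake_launch, interval=0.01))
```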

Work has (re-)begun

I'm back at Licorn® coding after nearly a year of sleep. Many things have evolved since then and I'm coming back with shining new ideas.

First I will merge the new WMI, which has fewer bugs and is prettier than the old one. I will be learning bzr during this period (branching, pushing, merging).

Then I will improve daemon internals, run pylint and fix bugs.

Then I will fully implement the keywords part (internals and GUIs), in parallel with other tasks (client, cluster...). Not a priority, but this will add a killer feature, IMO.