Saturday, January 29, 2005

Alternatives added to pkgsrc

A few days ago (on the 25th), I finally added the alternatives framework to pkgsrc; haven't had a chance to write about it until now.

The actual implementation is very different from what I had in mind when I started. It is composed of an independent utility, pkg_alternatives, and a make module, alternatives.mk. The latter integrates the former tightly with pkgsrc, so that using alternatives during package development is as painless as possible.

First of all, and as I already discussed, this implementation uses wrappers instead of symbolic links for better flexibility (especially across NFS-mounted trees).

WRT its use during development, the package author only has to create an additional file, namely ALTERNATIVES, which is automatically read during installation. Each line of this file is fed to the "pkg_alternatives -w register" command to add the given wrapper/alternative pair to the database. There is no need to define any special variable.
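As a rough illustration of that step, the sketch below turns each line of an ALTERNATIVES file into one "pkg_alternatives -w register" invocation. The line format (whitespace-separated fields), the skipping of blank lines and the sample paths are assumptions made for the example; in pkgsrc the real work is done by alternatives.mk, not by Python.

```python
import shlex

def register_commands(alternatives_text):
    """Build one 'pkg_alternatives -w register' argv per non-empty line."""
    commands = []
    for line in alternatives_text.splitlines():
        fields = shlex.split(line)
        if not fields:
            continue  # assumption: blank lines are ignored
        commands.append(["pkg_alternatives", "-w", "register"] + fields)
    return commands

# Hypothetical file contents: a wrapper followed by the alternative it runs.
sample = "bin/vi /usr/pkg/bin/nvi\n"
for argv in register_commands(sample):
    print(" ".join(argv))
```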

Furthermore, the pkg_alternatives utility has been reworked to support two modes of operation: the group (AKA package) mode and the wrapper mode. The former operates on a whole set of wrappers; each set is identified by the name of the package that created it. The latter operates on independent wrappers, and can be used by the user to tune each wrapper to suit his needs.

Lastly, the alternatives framework has been made completely optional due to lots of complaints (and they were right). It will only be activated when the administrator explicitly installs the pkg_alternatives package. Otherwise, no wrappers will be created anywhere. And all of this is binary package friendly, of course.

At the moment, only vim and nvi have been converted to use this system. I will commit updates for both the Python and Java packages very soon, as there has been no more negative feedback.

Hope this makes pkgsrc a bit more powerful, especially when used on extremely reduced Linux distributions :-)

Wednesday, January 26, 2005

GNOME Panel easter eggs

Today I reinstalled GNOME from scratch on my machine due to a mistake I made yesterday (which blew away half of it). After starting it, I saw a problem I had never noticed before. I have been able to solve it, although I haven't figured out what the root cause is yet. Anyway, while looking for where the problem was, I discovered three easter eggs in the GNOME Panel module. If you don't like spoilers, stop reading now.

Two of them produce exactly the same results, although they are triggered in completely different ways. The first one is to open the "About Panels" dialog (by right-clicking on any visible panel) and, when the about dialog pops up, press the 'f' key three times. The second one is to open the "Run Application..." dialog, type the "free the fish" string and click "Run". After that, you will see a fish appear on screen, which will keep moving from left to right (clicking on it changes its direction). Unfortunately, it seems that the only way to stop it is to kill the panel process.

The last easter egg is triggered in a similar way to the previous one, but the results are more spectacular. You should type "gegls from outer space" in the "Run Application..." dialog and then click "Run". As a result, you will get a "Space Invaders" game clone which uses cows rather than aliens.

Wow! These are the first easter eggs I have discovered by myself :-)

Sunday, January 23, 2005

vr(4) problems

Since I started using the vr(4) driver (i.e., when I switched my network to twisted pair cables), I've been experiencing very annoying crashes when running in promiscuous mode. This happened rarely at boot, while dhclient(8) fetched its first lease. However, if you happen to use any other utility that puts the NIC in promiscuous mode, such as tcpdump(8) or bridge(4), the crashes happen almost 95% of the time.

Yesterday, I got up determined to fix this issue (mainly because I need to set up a bridge). Oh well, what a crazy decision. I spent the whole day tracking down the problem; it took so long because I didn't know anything about the kernel's networking code and because GNU gdb did strange things.

After dealing with the kernel core dump and reading a large amount of code, I started to understand what was going on. (I can say that I learned a lot during the process, although I am still missing many, many details.) And finally I isolated the problem.

For some reason, the card is producing zero-length packets with some unusual flags active. I guess these are raised to notify the driver about a special condition but, without the card specs (they used to be public, but now you have to fill in a form and wait for confirmation to get them), it is very difficult to know what's going on. FWIW, I looked at the Linux driver and they simply ignore these packets. Basically, they assume that if a packet has neither the first fragment nor the last fragment bit set, it's incorrect.

So I've written a patch that discards these packets, just as Linux does. You can find it in this thread (the first patch is incorrect; the second one is almost correct, except for the assertion). I want to do a bit more testing locally before committing the patch.
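To make the idea concrete, here is a minimal sketch of that kind of sanity check. It is not the patch itself: the bit values are hypothetical placeholders (the real vr(4) descriptor layout is not reproduced here), and the function simply follows the rule as stated above, dropping zero-length packets and packets with neither fragment bit set.

```python
# Hypothetical status bits; NOT the real vr(4) descriptor layout.
RX_FIRST_FRAGMENT = 0x0200
RX_LAST_FRAGMENT = 0x0100

def should_discard(status, length):
    """Return True for received packets that the sanity check would drop."""
    has_first = bool(status & RX_FIRST_FRAGMENT)
    has_last = bool(status & RX_LAST_FRAGMENT)
    # Rule as described in the text: neither bit set means a bogus packet.
    if not has_first and not has_last:
        return True
    # Zero-length packets carry no data worth passing up the stack.
    return length == 0
```

With these assumptions, should_discard(0, 0) is true for the odd packets described above, while a normal packet with both bits set and a non-zero length is kept.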

However, I think I still haven't found the root of the problem. All this patch does is add some more sanity checks, but those zero-length packets shouldn't be generated at all (according to what I see in Linux when reproducing a similar scenario).

Friday, January 21, 2005

Alternatives become wrappers

Todd Vierling made some interesting suggestions about the alternatives stuff. I didn't like them at first, but the more I thought about them, the more I realized they were the way to go :-P

So I've rewritten the alternatives framework almost from scratch, to make it use little wrappers instead of symbolic links (you can forget most things from the previous post, especially the example code). For those that don't know it, a wrapper is a little program that encapsulates another one; it does some preliminary tasks, but ends up running the program it's wrapping. Here are some of the advantages of using wrappers (really, some of them are only due to the new implementation):

  • They don't break if the original program disappears. How can this happen, if the alternatives are managed by packages? Suppose you have /usr/pkg shared among several computers, but PKG_SYSCONFDIR is local to them (say /etc/pkg). In this scenario, you have nvi set as the default for vi on machine A, while you have vim on machine B. If you remove one of them from one system, the other machine will end up with dangling symbolic links.

    Wrappers, however, check a list of alternatives when they are run and walk the entire list looking for a valid candidate. This means that the problem described above cannot happen (and if it does, you get a nice error message).

  • If you have multiple candidates available and you choose one manually to be the default, you want it to stay as the default even if you deinstall the package (think of updates, for example). This is now possible, and in fact very easy to achieve.

  • They let the user (not the administrator) configure which utility to use in each case. I.e., the wrapper will first look inside the ~/.alternatives directory for settings, then inside PKG_SYSCONFDIR, and finally inside the database stored in PREFIX/libdata/alternatives.

    The first two locations are optional and can be edited by hand at will (or even through the alternatives utility). The last one, though, is only managed by packages at (de)install time.

  • They don't "break" if the application changes its behavior depending on argv[0]. With symbolic links, you cannot use gvim as an alternative for vi (in fact, you can, but you won't get the graphical window).

  • Alternatives can have command line arguments. Todd showed an example: "emacs -l vip" as an alternative for vi.
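The lookup described in the list above can be sketched roughly as follows. This is only an illustration, not the code of the real wrappers: the on-disk layout (one settings file per wrapper, one candidate command per line, absolute program paths) and the function names are assumptions.

```python
import os
import shlex

def find_candidate(wrapper, config_dirs):
    """Return the argv of the first usable alternative for `wrapper`, or None."""
    for directory in config_dirs:
        path = os.path.join(directory, wrapper)
        if not os.path.isfile(path):
            continue  # this location has no settings for the wrapper
        with open(path) as settings:
            for line in settings:
                argv = shlex.split(line)
                # A candidate may carry arguments, e.g. "emacs -l vip".
                if argv and os.access(argv[0], os.X_OK):
                    return argv
    return None

def run_wrapper(wrapper, args, config_dirs):
    """Replace the current process with the chosen alternative."""
    argv = find_candidate(wrapper, config_dirs)
    if argv is None:
        raise SystemExit(f"{wrapper}: no valid alternative found")
    # The program sees its own path as argv[0], unlike with a symbolic link.
    os.execv(argv[0], argv + list(args))
```

A caller would pass the three locations in priority order (the user's ~/.alternatives, then PKG_SYSCONFDIR, then PREFIX/libdata/alternatives). Because candidates that no longer exist are skipped, a program removed on another machine does not leave anything dangling.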

There are some disadvantages too. The main one is that wrappers are restricted to programs. I.e., you cannot use them to provide alternatives for other stuff such as documentation, libraries, manual pages... But anyway, supporting these is not really a good idea if you think about it. However, one could argue about manual pages; I've mitigated this problem by generating one for each wrapper on the fly. They are simple, but I think they can solve the inconvenience fairly well.

Check this message for some more information as well as links to the existing implementation.

Wednesday, January 19, 2005

Alternatives system in pkgsrc

Inspired by the recent comments of a user, I decided to implement an alternatives system for pkgsrc, similar to the one used by Debian. The purpose of this framework is to manage symbolic links that point to specific programs from a group of utilities with similar behavior.

The easiest example to show what this means comes when looking at Vim and Nvi. Both editors can generally be installed on the same system without conflicts. However, in that case, the system still lacks a generic name, such as vi, to launch one as the default editor for the system. The administrator can make that link by hand, but wouldn't it be nice if the system itself created it for you?

Alternatives come to improve this situation: multiple packages register themselves as providers for a set of links with a generic name (which I have named a class). The administrator is then free to choose which one should be the default, and the system adjusts the symbolic links to point to the preferred utility. This even works in an unattended manner, because the packages adjust the links at (de)installation if needed.

This is, of course, applicable to many other situations. Consider Java virtual machines, MTAs, window managers, Python interpreters, etc. In fact, I hope that this will simplify, in a more flexible way, some stuff that we already have in pkgsrc to handle wrappers to run Java (i.e., lang/java-wrapper) or Python.

So let's look at how this actually works from the point of view of a package (in this case, editors/nvi):

ALTERNATIVE_CLASSES=    vi
ALTERNATIVE_NAME.vi=    nvi
ALTERNATIVE_FILES.vi=   ${PREFIX}/bin/vi ${PREFIX}/bin/nvi
ALTERNATIVE_FILES.vi+=  ${PREFIX}/man/man1/vi.1 ${PREFIX}/man/man1/nvi.1

Simple, eh? ;-) This is internally handled by the new utility, pkg_alternatives, which is not yet finished. I hope it'll be ready tomorrow. Note that this is still not committed to the tree; in fact, I don't even know if other developers will be against it. Anyway, more on this in the next post.

Edit (20th Jan, 10:39): I've had to change the semantics of the Makefile variables to allow more flexibility, so I've adjusted the example above accordingly.

QEMU

Today, I decided to try the QEMU CPU emulator. I installed it and, after fighting several problems caused by incorrect flags passed to the compiler, I got it working.

So I tried to install OS/2 Warp 4 as the guest OS. The emulator is very fast, compared to others such as Bochs (yeah, well, they don't work the same way). I had some problems during the installation (strange crashes in the installer) but, after disabling the network installation, it seems to work properly now.

Hmm... I haven't been able to run this OS in any other emulator :-) It's really worth trying.

Friday, January 14, 2005

Automatic verification of PLISTs

pkgsrc uses static file lists to register the contents of an installed package. A drawback of this approach is that the file list of a package may occasionally become out of sync with reality. I.e., it may omit some files that are really installed, thus leaving them orphaned, or specify extra files that are not really copied into place.

There are multiple reasons that can cause this desynchronization, including careless updates, incorrect detection of dependencies or, even worse, files that are only installed depending on the OS you are running. When some files are not properly installed, pkgsrc can easily tell you that an error happened, because it will not find all the files listed in the static list. However, the opposite situation is hard to detect, and detecting it was not possible until now.

Yesterday, I looked at the stale files left after a bulk build under Linux and I was quite disappointed because the list was nowhere near short. So I decided to add a mechanism to catch these problems as soon as possible. You can benefit from these changes with pkgsrc's HEAD as of this evening.

The new feature, called check-files, is only run when PKG_DEVELOPER is defined and CHECK_FILES is set to YES. The latter is the default (though it might be changed if it continues to show flaws). This will ensure that no package installs more files than expected.

There is also some extra functionality that is only enabled when CHECK_FILES_STRICT is set to YES (not the default). In this case, check-files will ensure that packages do not touch PKG_SYSCONFDIR or VARBASE directly. Touching them directly is completely incorrect from the point of view of binary packages, as they have to use bsd.pkg.install.mk for correct operation. However, these checks cannot be turned on by default (yet) because they will break many packages. Even so, developers should turn on this strict behavior and fix their packages whenever possible.
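The comparison behind such a check can be pictured as a set difference between the static list and what is actually on disk. The sketch below is only an illustration and not how check-files is implemented: it assumes a prefix that contains nothing but the files of one package, and the function names are made up for the example.

```python
import os

def installed_files(prefix):
    """Collect every file under `prefix` as a path relative to it."""
    found = set()
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            full = os.path.join(root, name)
            found.add(os.path.relpath(full, prefix))
    return found

def compare_with_plist(plist_entries, prefix):
    """Return (missing, extra): listed-but-absent and present-but-unlisted files."""
    listed = set(plist_entries)
    present = installed_files(prefix)
    return sorted(listed - present), sorted(present - listed)
```

The first list corresponds to the error pkgsrc could already report (files in the list that were never installed); the second one is the case that used to go unnoticed, i.e. files that are installed but not listed.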

Wednesday, January 12, 2005

Monotone dedicated server

Some weeks ago, I installed Monotone on my main machine to act as a dedicated server for VigiPac's source code. During the process, I had to write an rc.d script and configure multiple things to get everything working safely. The overall process is not difficult once you know how Monotone works, but it is quite time consuming and error prone (due to specific file permissions, for example). So I thought I could share my work to make this process easier for other people and make them love pkgsrc even more ;-)

A simple way to achieve this could have been to include the rc.d script in the Monotone package, alongside a user and a group, using the marvelous bsd.pkg.install.mk framework. However, if I had gone this route, users of Monotone who only want it as a client would have an extra, useless user/group pair in their system. This is wrong, IMHO.

So the other possible approach was to create a new package that provided all the needed bits to make the configuration of a dedicated server as painless as possible. And this is what I've done. The package is named, for obvious reasons, monotone-server, and is found in pkgsrc's devel category.

Basically, all it does is install an rc.d script, register a user/group pair (monotone:monotone) to run the server as and provide a home-grown script (monotone-server-init) to initialize the database. Furthermore, it installs a template file containing the required hooks to authenticate clients against branches and also provides a mechanism to easily define the set of branches (collections) to share (through an "invented" branches.conf file).

What does the monotone-server-init script do, you say? Well, it asks you where the database should be created (which defaults to /var/monotone/monotone.db), creates it, generates a key pair to access the server and tells you what to do to end the configuration. All in an interactive and self-explanatory process.

Hope it's useful to someone :-)

Saturday, January 08, 2005

Kernel debugging tutorial

I recently had two crashes while killing a process, which are very likely to be caused by the kqueue code. I asked NetBSD's current-users@ mailing list about the issue and got an answer suggesting that I read a paper.

Greg Lehey, the author of the post, wrote a nice tutorial and a set of slides dealing with kernel debugging under BSD systems; these were prepared for EuroBSDCon 2004 (though it looks like they were not presented).

The tutorial is 167 pages long (not yet complete!) and is full of examples. I've read some of the very first pages and it looks very interesting for both experienced and inexperienced developers, so it's definitely worth reading.

Edit (Jan 9th, 18:35): The tutorial was, in fact, presented, as seen here; thanks to nbuwe for pointing this out.

Thursday, January 06, 2005

NetBSD 2.0 beats FreeBSD 5.3 in server performance

Gregory McGarry has published an article that benchmarks NetBSD 2.0 and FreeBSD 5.3 in multiple situations that can be of interest in production servers. This includes scalability, reliability and performance in areas such as sockets, threads, process creation...

What's impressive is that NetBSD 2.0 outperforms FreeBSD 5.3 in almost all tests! Moreover, the areas in which it doesn't are not very important when it comes to real production servers. For example, the test that measures thread creation times doesn't reflect reality, because real servers won't manage that many threads. Anyway, it shows areas that are worth optimizing.

Now read the article and see it with your own eyes ;-)

Edit (Jan 7th, 21:24): Added missing link to the article (see above).

Wednesday, January 05, 2005

Tracking down a deadlock

Last night, I packaged Vino, a VNC server that integrates seamlessly with GNOME. After creating the package, I ran vino-preferences and saw it crash with the following assertion:

assertion "next != 0" failed: file
"/home/jmmv/NetBSD/src/lib/libpthread/pthread_run.c", line 130, function "pthread__next"

Hmm, threading problems... so I started looking at Vino's code to see where the problem could be. I saw it was using threads, but it was not until half an hour or so later that I noticed that they were disabled by default. Oh, well. Then I wondered "what? threading problems and the program is not using them?".

So I ran the program under GNU GDB ("why didn't he do this earlier?" you say... well, I don't know) and found:

[...]
#9 0x48315fe5 in IA__gtk_image_set_from_file (image=0x80eca80, filename=0x8103b00 "/usr/pkg/share/icons/Nuvola/scalable/apps/gnome-lockscreen.svg") at gtkimage.c:842
#10 0x0804c1e3 in vino_preferences_dialog_setup_icons (dialog=0xbfbfea34) at vino-preferences.c:675
[...]

Hm... opened the "Theme selector", switched to another theme (Wasp), and voila! The problem had gone away. So the next logical step was to try to move the offending icon from Nuvola to Wasp and try again. Oops, it crashed. "Is the icon corrupt?" I thought. To verify, I tried to replace it with multiple other ones (all of them in SVG format), and it kept crashing. Wow! Here is where it started to get interesting, because the problem was located somewhere deep in the dependency tree. (Am I a bug addict? ;-)

After almost an extra hour of debugging, I isolated the problem: a deadlock in gdk-pixbuf. This library is modular, in the sense that image loaders are "external" to it. Some of them are thread-safe (according to some property in their header), but others aren't. In this case, SVG is not, while PNG is (this is why an SVG icon made the program crash and a PNG one did not). When the loader is not thread-safe, gdk-pixbuf ensures mutually exclusive access to its functions... and this is where the problem lay.

Vino uses the gtk_image_set_from_file function (as seen in frame #9 above), which in turn calls gdk_pixbuf_animation_new_from_file. This function acquires a global lock using the _gdk_pixbuf_lock function and then calls _gdk_pixbuf_generic_image_load. This other function also calls _gdk_pixbuf_lock... there you have it, the deadlock (POSIX mutexes are not recursive by default, unlike Java's locks).
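The pattern is easy to reproduce in miniature. The sketch below is not gdk-pixbuf code; it mimics the two nested functions with hypothetical Python stand-ins, and uses a short timeout in place of the real hang so that the demonstration terminates. threading.Lock plays the role of a non-recursive mutex and threading.RLock the role of a recursive one.

```python
import threading

def load_image(lock):
    """Inner function: takes the global lock itself, like the generic loader."""
    # A timeout stands in for the hang so that this demonstration terminates.
    if not lock.acquire(timeout=0.1):
        return "deadlock"
    try:
        return "loaded"
    finally:
        lock.release()

def new_animation_from_file(lock):
    """Outer function: holds the lock while calling the inner one."""
    with lock:
        return load_image(lock)

print(new_animation_from_file(threading.Lock()))   # non-recursive lock
print(new_animation_from_file(threading.RLock()))  # recursive lock
```

The first call reports the deadlock because the same thread tries to take a lock it already holds; the second succeeds only because the lock is recursive. The recursive variant is shown purely for contrast.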

I fixed this issue by narrowing the critical region in the gdk_pixbuf_animation_new_from_file function. And this morning I decided to submit this back to the authors (while GNOME's anonymous CVS was not working for me), in bug #162999.

When the CVS server came back online... I saw that the problem had already been fixed in HEAD! Heh, well. I spent quite a bit of time filing a bug report that I could have avoided... but, anyway, I'd still have spent a lot of time searching for the problem.

Tuesday, January 04, 2005

Epiphany's scrolling

For a long time, I believed that Epiphany, the GNOME web browser, was broken when installed through pkgsrc. And I was starting to hate it.

The problem I had was that trying to scroll the page using the keyboard resulted in a cursor moving around the document, instead of scrolling the whole page. Believe me, extremely annoying behavior, since it jumped across columns unexpectedly.

My diagnosis was that I had hit some kind of compatibility problem between Epiphany and the version of Mozilla I had installed through the mozilla-gtk2 package. This belief was reinforced this morning when I noticed that the configuration script was not finding some header files properly. I reviewed it and fixed some issues, but the problem did not disappear.

So this evening, while surfing the net, I thought... let's search for a solution; typed "epiphany scrolling" in Google and... ta-da! There it was, the first match contained what I was looking for, a FAQ about Epiphany. Scrolled down to the eighth question from the sixth section, "Scrolling with the keyboard doesn't work, and instead there's a cursor moving around!", and found the solution.

Copying verbatim: 'You have activated "caret browsing" mode. Disable it with the F7 key.' OMG, OMG! What a stupid "problem" (if I can call it that)! Pressed F7 and it went away. No comments. Just take care that it doesn't happen to you ;-)

Sunday, January 02, 2005

VigiPac is now public

First of all, happy new year!

As I said some days ago, a friend of mine and I have been working for a while (a month and a half, more or less) on a computer game as a university assignment. The deadline for the project was December 19th (so the course it belonged to is over), but we decided to continue it as a free software project.

So, finally, I can announce that the game, named VigiPac, is now public. Summarizing, it aims to be a three-dimensional Pacman clone with networking support, licensed under the GPL and written in ANSI C++.

There is still no formal release but the sources are available in a Monotone (yay!) repository. You can check out the current status of the project in its features page, and maybe have some fun trying it ;-)