Saturday, May 31, 2014

Code review culture meets FreeBSD

One of the things that often shocks new engineers at Google is the fact that every change to the source tree must be reviewed before commit. It is hard to internalize such a workflow if you have never been exposed to it, but given enough time —O(weeks) is my estimate— the formal pre-commit code review process becomes a habit and, soon after, something you take for granted.

To me, code reviews have become invaluable and, actually, I feel "naked" when I work on open source projects where this process is not standard practice. This is especially true when developing my own one-person projects, because there is nobody to bounce my code off of for a quick sanity check. Fortunately, at least in FreeBSD, this may no longer be the case, and I am super-happy to see change happening.

A few individuals in the FreeBSD project have set up an instance of Phabricator, an open source code review system, that is reachable at and ready to be used by FreeBSD committers. Instructions and guidelines are in the new CodeReview wiki page.

To the FreeBSD committer:

Even if you are skeptical —I was, back when I joined Google in 2009— I'd like to strongly encourage you to test this workflow for any change you want to commit, be it a one-liner or a multipage patch. Don't be shy: get your code up there and encourage specific reviewers to comment the hell out of it. There is nothing to be ashamed of when (not if) your change receives dozens of comments! (But it is embarrassing to receive the comments post-commit, isn't it?)

Beware of the process, though. There are several caveats to keep in mind if you want to preserve your sanity, and those caveats are what prompted this post. My views on them are detailed below.

Note that the Phabricator setup for FreeBSD is experimental and has not yet been blessed by core. There is also no policy requiring reviews to be made via this tool nor reviews to be made at all. However, I'm hopeful that things may change given enough time.

Let's discuss code reviews per se.

Getting into the habit of the code review process, and not getting mad at it, takes time and a lot of patience. Having gone through thousands of code reviews and performed hundreds of them over the last 5 years, here are my own thoughts on this whole thing.

First of all, why go through the hassle?

Simply put: to get a second and fresh pair of eyes go over your change. Code reviews exist to give someone else a chance to catch bugs in your code; to question your implementation in places where things could be done differently; to make sure your design is easy to read and understand (because they will have to understand it to do a review!); and to point out style inconsistencies.

All of these are beneficial for any kind of patch, be it a seemingly-trivial one-line change or the implementation of a brand-new subsystem. Mind you: I've seen reviews of the former class receive comments that spotted major flaws in the apparently-trivial change being made.

The annoying "cool down" period

All articles out there providing advice on becoming a better writer seem to agree on one thing: you must step away from your composition after finishing the first draft, preferably for hours, before copyediting it. As it turns out, the exact same thing applies to code.

But it's hard to step aside from your code once it is building and running and all that is left for the world to witness your creation is to "commit" it to the tree. But you know what? In the vast majority of cases, nobody cares if you commit your change at that exact moment, or tomorrow, or the week after. It may be hard to hear this, but that pre-commit "high" that rushes you to submit your code is usually counterproductive and dangerous. (Been there, done that, and ended up having to fix the commit soon after for stupid reasons... countless times... and that is shameful.)

What amuses me the most are the times when I've been coding for one or two hours straight, cleaned up the code in preparation for submission, written some tests... only to take a bathroom break and realize, in less than five minutes, that the path I had been taking was completely flawed. Stepping aside helps, and that's why obvious problems in the code magically manifest themselves soon after you hit "commit", requiring a second immediate followup commit to correct them.

Where am I going with all this? Simple: an interesting side-effect of pre-commit reviews is that they force you to step aside from your code; they force you to cool down and thus they allow you to experience the benefits of the disconnect when you get back to your change later on.

Keep multiple reviews open at once

So cooling down may be great, but I hear you cry that it will slow down your development because you will be stuck waiting for approval on a dependent change.

First of all, ask yourself: are you intending to write crappy code in a rush or, alternatively, do you care about getting the code as close to perfect as possible? Because if you are in the former camp, you should probably change your attitude or avoid contributing to a project other people care about; and if you are in the latter camp, you will eventually understand that asking for a review and waiting for your reviewer to get back to you is a reasonable thing to do.

But it is true that code reviews slow you down unless you change your work habits. How? Keep multiple work paths open. Whenever you are waiting for a change to be reviewed, do something else: prepare a dependent commit; write documentation or a blog post; work on a different feature; work on a completely-separate project; etc. In my case at work, I often have 2-3 pending changes at various stages of the review process and 1-2 changes still in the works. It indeed takes some getting used to, but the increased quality of the resulting code pays off.

Learn to embrace comments

Experienced programmers who have not been exposed to a code review culture may get personally offended when their patches are returned to them with more than zero comments. You must understand that you are not perfect (you knew that) and that the comments are being made to ensure you produce the best change possible.

Your reviewers are not there to annoy you: they are there to ensure your code meets good quality standards, that no obvious (and not-so-obvious) bugs sneak in and that it can be easily read. Try to see it this way and accept the feedback. Remember: in a technical setting, reviewers comment on your ideas and code, not on you as a person — it is important to learn to distance yourself from your ideas so that you can objectively assess them.

I guarantee you that you will become a better programmer and team player if you learn to deal well with reviews even when it seems that every single line you touched receives a comment.

Selecting your reviewers

Ah... the tricky part of this whole thing, which is only made worse in the volunteer-based world of open source.

Some background first: because code reviews at Google are a prerequisite for code submission, you must always find a reviewer for your change. This is easy in small team-local projects, but with the very large codebase that we deal with, it is not always so: the original authors of the code you are modifying, who usually are the best reviewers, may no longer be available. FreeBSD also has a huge codebase, older than Google's, so the same problem exists. Ergo, how do you find the reviewer?

Your first choice, again, is to try and find the owner of the code you are modifying. The owner (or owners) may still be the original author if they are still around, but it can be anyone else who has since stepped in to maintain that piece of code.

Finding an individual owner may not be possible: maybe the code is abandoned; maybe it is actively used but no single individual can be considered the owner. This is unfortunate but it is a reality in open source. So do you abandon the code review process?

Of course NOT! Get someone with relevant expertise in the change you are making to look at your code; maybe they won't be able to predict all of the consequences of the change, but their review is lightyears better than nothing. At work, I may "abuse" specific local teammates that I know are thorough in their assessment.

The last thing to consider when selecting your reviewers is: how picky are they? As you go through reviews, you will learn that some reviewers will nitpick every single detail (e.g. "missing dot at end of sentence", "add a blank line here") while others will only glance over the logic of the code and give you a quick approval. Do not actively avoid the former camp; in fact, try to get them involved when your primary reviewer falls in the latter; otherwise, it's certain that you will commit trivial mistakes (if only typos). I'm in the nitpickers' camp, and proudly so, if you ask me.

Should all of the above fail, leaving you without a reviewer: ask for volunteers! There will probably be someone ready to step in.

Set a deadline

Because most committers in open source projects are volunteers, you cannot send out a change for review and wait indefinitely until somebody looks at it. Unless you are forbidden from committing to a specific part of the tree without review, set a deadline for when you will submit the change even if there have been no reviews. After all, the pre-commit review workflow in FreeBSD is not enforced (yet?).

If you end up committing the change after the deadline without having received a review, make sure to say so in the commit message and clearly leave the door open to fixing any issues post-commit.

Learn to say no

Because code reviews happen in the open, anybody is allowed to join the review of a patch and add comments. You should not see this as an annoyance but you must know when to say no and you must clearly know who your actual approvers are and who are just making "advisory" comments.

Also note that comments in a review are not always about pointing out obviously-wrong stuff. Many times, the comments will be in the form of questions asking why you did something in a specific way and not another. In those cases, the comment is intended to start a discussion, not to force you to change something immediately. And in a very few cases, the discussion might degenerate into a back-and-forth between two perfectly valid alternatives. If this happens... you'll either have to push your way through (not recommended) or find a neutral and experienced third reviewer who can break the tie.

Get to the reviews!

Wow, that was way longer than I thought. If you are interested in getting your code for FreeBSD reviewed — and who wouldn't be when we are building a production-quality OS? — read the CodeReview wiki page instructions and start today.

And if you have already started, care to share your point of view? Any questions?

Friday, May 23, 2014

Refocusing Kyua maybe?

The FreeBSD devsummit that just passed by gave me enough insight into Jenkins to question the long-term plans for Kyua. Uh, WHAT?! Let me explain.

In the beginning...

One of the original but unstated goals of Kyua was to fix the "mess" that is the NetBSD releng test run logs site: if you pay close attention, you will notice that various individuals have reinvented the wheel over and over again in an attempt to automate release builds and test suite runs. In other words: different parties have implemented independent continuous integration systems several times with more or less success.

Ending up with such duplication of infrastructure was never an intended result of ATF: ATF should have had this functionality on its own. Unfortunately, my lack of experience with continuous integration when I started ATF seven years ago meant that ATF's plans and design did not cover the true end goal of having a system up and running for all the platforms we care about.

In other words: even if this was never published on the Kyua website, my long-term plan was to turn Kyua into a continuous integration platform. This would involve providing some form of automation to execute builds along the execution of tests, a mechanism to distribute build and test tasks to more than one machine, a comprehensive web interface to interact with the system, a monitoring system to ensure the platform keeps up and running, and much more.

As you can imagine, implementing all of these features is a humongous amount of work and Kyua is currently far from providing any of these. (Note: it is relatively simple to come up with a simple do-one-thing implementation, but productizing and productionizing the result is a completely different story — and I am not here to ship toy-quality software!)

Enter Jenkins

During the May 2014 FreeBSD devsummit, I got to learn about Jenkins: An extendable open source continuous integration server. Thanks to Craig Rodrigues, I had a local instance up and running in less than 20 minutes and was able to hack Kyua to output Jenkins-compatible reports in less than an hour of hands-on work. (Sorry, changes still not pushed to Github.)

As it turns out, Jenkins is no joke. It is a hugely popular piece of software used by high-profile projects and companies. To grasp this, just consider that a conference dedicated to such a niche tool draws 400+ attendees.

My original reaction to the UI was to dismiss it as ugly (it indeed is), but a bit of poking revealed a large amount of hidden-but-easily-reachable functionality. And after visiting the FreeBSD Jenkins deployment, I quickly realized the power of this system once I saw it distributing jobs across a cluster and monitoring their status in real time and in a relatively nice way.

What a discovery. Jenkins does all I ever wanted Kyua to do and more, much more. "Fortunately," Jenkins lacks the pieces to actually define and execute the FreeBSD/NetBSD test suite, and this is exactly what Kyua offers today — which means that Kyua is not irrelevant.

Not enough time

So we have Jenkins out there, which does most of what I ever wanted Kyua to do, and we have Kyua, which currently does none of it but fulfills the one missing piece in Jenkins. Hmm... sounds like an opportunity to rethink Kyua's goals maybe. But why? It all revolves around time and spending it wisely on the things that will have the most impact with the least effort.

The time I can devote to Kyua is minimal these days, so at my current pace I —or, rather, we— will never reach those grandiose goals of having native continuous integration in Kyua. What's worse: the fact that I had such plans in mind but no time for them made me feel bad about Kyua overall (I'm pretty sure there is a name for this apathetic feeling but couldn't find it).

Learning about Jenkins has been a relief. "What if... what if I could reshape Kyua's goals to be less ambitious in scope? What if Kyua did one thing only, which would be to define and execute a test suite in a single configuration on a single machine, and excelled at it? Then you'd be free to plug Kyua into whichever continuous integration system best suits your needs, and Jenkins would be a reasonable choice." Those are the kinds of questions that have been floating around in my mind since last week.

No downsides?

The main problem with Jenkins in the context in which we want to use it (BSD systems) is, probably, the fact that Jenkins is written in Java. Java support on the BSDs has never been great, but things generally work OK under amd64 and i386. This actually is a serious issue considering all the platforms that FreeBSD and NetBSD support. Without going too far, I now would like to run Jenkins at home to offer continuous integration for Kyua itself... but my two servers are PowerPC-based machines — so no Jenkins for them yet, at least under FreeBSD.

Let us not get distracted from the core idea of splitting continuous integration out of Kyua just because Jenkins may not fit some obscure use case out there. Jenkins may be a very good solution to this, but it need not be the only one!


The more I think about it, the more I become convinced that Kyua's goals need to be simplified. Let Kyua focus on what it already is good at: defining a complex test suite and making sure that it can be run easily and deterministically. Leave the remaining scaffolding for continuous integration to the big players out there. Jenkins, in this case, would be just one of the possible options for the continuous integration framework; we'd just need to make sure that Kyua plays well with at least one other system to ensure the design is generic.

If we went this route, Kyua would not need to be as complex as it already is. As a result, there are several things in the current design that could be yanked and/or streamlined to offer a simpler and more reliable piece of software.

It really all boils down to a very simple idea:

Optimize my little and precious time to have the greatest impact on the project and other people's work.

The alternative to this is to remain in the status quo, which is to naively wait for years until Kyua gets all of the necessary continuous integration features and makes Jenkins and other custom scripts unnecessary. And I cannot promise any of this will ever happen, nor can I say it makes sense at all.

I haven't made a decision yet as I would like these thoughts to solidly settle before announcing any changes widely, and I would also like to think about any possible simplifications to Kyua's design.

In the meantime: what are your thoughts?

Wednesday, May 21, 2014

BSDCan 2014 summary

BSDCan 2014 and the accompanying FreeBSD devsummit are officially over. Let's recap.

FreeBSD devsummit

The FreeBSD devsummit at BSDCan is, by far, the largest of them all. It is true that I had already visited a devsummit once —the one at EuroBSDCon 2013— but this is the first time I have participated in the "real deal" while also being a committer.

The first impressive thing about this devsummit is that there were about 120 attendees. The vast majority of these were developers, of course, but there was also a reasonable presence from vendors — including, for example, delegates from Netflix, Isilon, NetApp and even smaller parties like Tarsnap.

The devsummit started with a plenary status talk followed by a new proposal for release engineering and a planning session for features desired in FreeBSD 11. We ended up with three whiteboards full of desired items, ranging from small stuff like adding a feature to a driver to larger stuff like packaging the base system or potentially dropping support for IA64 altogether. In the end, and this being a volunteer project, it is obvious that not all of the features that came up will be ready for 11.0. However —and this is the important part— this was extremely motivating: tons of new ideas for future releases, a collective desire to get them done and dozens of people with the energy required to make them happen.

I digress. We then went off to our working groups.

The Continuous Integration working group

The devsummit session that I most wanted to attend was the working session on Continuous Integration with Jenkins, and it was quite an eye-opener for me.

When I first heard about the Jenkins work that had gone into FreeBSD, I disregarded it because I assumed it overlapped my work on Kyua (NIH syndrome, you know). The thing is that it did actually overlap my plan, but this is a good thing and I will let you know why in an upcoming post.

Fortunately though, my views on Jenkins and the project changed pretty quickly. Jenkins is a very powerful software package with a large community backing it. With help from Craig Rodrigues, I had a local instance up and running, ready for hacking, in less than 20 minutes. Plugging Kyua into Jenkins, which is something I promised to do a couple of months ago but never got to due to procrastination, turned out to be a simple 1-hour task.

So where are we going from here? Integrating the FreeBSD Test suite into Jenkins! This will be a relatively-simple thing to do and will provide a quick and powerful solution to continuous integration for FreeBSD. As Craig mentioned, realizing this alone was worth the entire trip to BSDCan.

My talk on the FreeBSD test suite

My talk this year was on the shiny-new FreeBSD Test Suite. The original conference schedule had me up against a talk on mandoc and a talk on PC-BSD (if I recall correctly), which made me happy because I thought I would not have super-strong competition and I'd be able to attract many attendees looking for a systems talk.

Unfortunately... the speaker for the PC-BSD talk could not make it to Ottawa and, in his place, the conference organized an impromptu talk on LibreSSL — the hot topic these days. As a result, I really was not expecting (m)any attendees.

However, in the end, I think I got around 40 people in the room; not as many as I wished, but let's call it a success. The funny thing, though, is that one of the reasons behind the existence of Heartbleed-like bugs is a poor testing culture, which in turn results in poor APIs and unverified (often pretty obvious) corner cases.

Speaking of talks, this is the first time I experimented with bullet-less slides. My slides had pictures, diagrams and code samples alone, and all the "bullet" content was in my private notes. I hope this was much more engaging for the audience and, in particular, that it reinforced the fact that the slides are not the presentation.

The talk will be online soon I hope, at which point I'll post a link.

To summarize: I think the talk went well. If you were one of the few attendees: thank you and please provide detailed feedback! It's the only way to become a better speaker.

Little NetBSD presence

The lack of NetBSD presentations and attendees is a recurrent topic at pretty much all BSD conferences. Only a very few people show up, and the content of the conference is pretty much all FreeBSD- and OpenBSD-related (the latter being a recent and welcome push).

This is probably a vicious circle. Because there are no NetBSD talks, NetBSD developers don't see the value in attending... and because they don't see the value in attending, they don't propose topics.

But the truth is that attending is not an "us versus them" matter of presentations. Meeting people from other projects and holding hallway conversations with them is incredibly valuable and essential if you want your work and ideas to float around. Many misconceptions are cleared up this way and, who knows, maybe you will find an unexpected collaboration opportunity.

From here, I'd like to encourage the top contributors of the NetBSD project to fly out to the major BSD conferences. You know who you are. There is no need to present, but obviously doing that is a plus because you'll have your expenses covered.

If any of you are reading this, I'd also ask you to attend the FreeBSD devsummit to see how things work on the other side of the fence. I know that at least one board member is interested in hearing how the FreeBSD Foundation drives things on their end as their prior experience surely applies equally to the NetBSD Foundation. If interested, drop me a note when the time comes and I'll extend you an invite.

Hallway conversations

As always, the best thing in these technical conferences are the hallway conversations with other developers. Building personal relationships with the community helps massively in later online interactions, especially in treating nasty-looking-at-first email replies in a more personal way. (Doesn't work with everyone but it does the majority of the time.)

I can't mention all the individuals I talked to because they were a good bunch, but let me say that I learned a lot about the future desires for the ports tree, the status of the FreeBSD powerpc port, "things" about IPv6, internals of Xen...

The food

This year's BSDCan featured catered salad and sandwiches instead of the traditional lunch boxes. I think the widespread consensus was that the food was much better this time. I personally welcomed the change.

Gained motivation

And, lastly, we are back to motivation. I've already hinted at this throughout the post, but I gotta restate it on its own. Being able to meet all the people that really care about the project in a single room, sharing what their desires and goals are; receiving kudos for one's own work and suggestions on how to improve it; and just seeing everyone put effort into their preferred areas is very motivational. It refreshes one's view of the projects and, at least in my case, really makes me want to continue working on what I was doing in the past.

I've gotten this feeling at every single BSD conference I have attended and I hope this continues to be the case in the future.

That's all for now. See you in Sofia, Bulgaria for EuroBSDCon 2014.

Wednesday, March 12, 2014

GSoC 2014 idea: Port FreeBSD's old-style tests to ATF

Are you a student interested in contributing to a production-quality operating system by increasing its overall quality? If so, you have come to the right place!

As you may already know, the Google Summer of Code 2014 program is on and FreeBSD has been accepted as a mentoring organization. As it so happens, I have a project idea that may sound interesting to you.

During the last few months, we have been hard at work adding a standardized test suite to the FreeBSD upstream source tree as described in the TestSuite project page. However, a test suite is of no use if it lacks a comprehensive collection of tests!

Fortunately, the FreeBSD source tree already has a reasonable collection of test programs that could be put to use... but unfortunately, none of these are part of the test suite and they are broken in various ways because they have not been run for years. Here is where you come into play.

My project idea

I would like you to spend the summer working with us in converting all existing legacy tests into modern-style tests, and hooking those into the FreeBSD test suite. The obsolete tests currently live in src/tools/regression/ and src/tools/test/.
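To give a flavor of the target format, here is a minimal sketch of what a converted test could look like when written against the atf-sh interface. Everything about it (the file name, the test case name and the behavior it checks) is illustrative rather than taken from the actual tree:

```shell
# Write a tiny ATF test program to disk.  A real conversion would instead
# adapt the logic of the legacy test being migrated.
cat >demo_test.sh <<'EOF'
atf_test_case cp_preserves_contents
cp_preserves_contents_body() {
    echo "hello" >original

    # atf_check runs a command and fails the test case if the exit status
    # or the output do not match the stated expectations.
    atf_check -s exit:0 cp original copy
    atf_check -o file:original cat copy
}

atf_init_test_cases() {
    atf_add_test_case cp_preserves_contents
}
EOF
```

Kyua (or the deprecated atf-run) then discovers the test cases through atf_init_test_cases and runs each body in its own isolated work directory.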

Sounds boring? Here is what you will achieve:

  • Contribute to a production-quality operating system and be able to reference your work at any point in the future. If you are into systems at all, this is an invaluable experience for yourself and your résumé.
  • Gain a better understanding of Unix and FreeBSD by debugging failing tests. Trust me, debugging problems on a real-world operating system is an excellent way to gain working knowledge of systems and troubleshooting, both of which are valuable skill-sets in the industry.
  • Understand how a test suite is organized and how individual tests become useful in day-to-day operations. Not all tests are created equal and many are not as nice as they should be, so you will be using your coding skills to improve their quality.
  • Get to know ATF, Kyua and other testing technologies.

And if that's not all: the results of your project will be immediately visible to the whole world at the official continuous testing machines and your fellow developers will be infinitely grateful for having one more tool to ensure their changes don't break the system!

So, if you are interested, please navigate to the project proposal page, read the nitty-gritty details and contact us via the freebsd-testing mailing list. And, if you happen to be in Tokyo attending AsiaBSDCon 2014, stop by and talk to me!

Saturday, February 15, 2014

How to merge multiple Git repositories into one

Are you looking for a method to merge multiple Git repositories into a single one? If so, you have reached the right tutorial!

Please bear with me for a second while I provide you with background information and introduce the subject of our experiments. We'll get to the actual procedure soon and you will be able to apply it to any repository of your choice.

In the Kyua project, and with the introduction of the kyua-atf-compat component in the Summer of 2012, I decided to create independent Git repositories for each component. The rationale was that, because each component would be shipped as a standalone distfile, they ought to live in their own repositories.

Unfortunately, this approach is turning out to be a bit of an inconvenience: it is annoying to manage various repositories when all of their code is supposed to be used in unison; it is hard to apply changes that cross component boundaries; and it is "impossible" to reuse code among the various components (e.g. to share autoconf macros) in a clean manner — much less attempt to share the version number between them all.

So what if all components lived in the same repository a la BSD but were still shipped as individual, fine-grained tarballs for packaging's sake? Let's investigate.

The goal

Obviously, the goal is to get two or more Git repositories and merge them together. It's particularly important to not mangle any existing commit IDs nor tags so that history is preserved intact.

For the specifics of our example, Kyua has three repositories: one for kyua-cli (which is the default, unqualified repository in Google Code), one for kyua-atf-compat and one for kyua-testers. The idea is to end up with a single repository that contains three top-level directories, one for each component, and all independent of each other (at least initially).

Process outline

The key idea to merge Git repositories is the following:

  1. Select a repository to act as pivot. This is the one into which you will merge all others.
  2. Move the contents of the pivot repository into a single top-level directory.
  3. Set up a new remote for the secondary repository to be merged.
  4. Fetch the new remote and check it out into a new local branch.
  5. Move the contents of the secondary repository into a single top-level directory.
  6. Check out the master branch.
  7. Merge the branch for the secondary repository.
  8. Repeat from 3 for any additional repository to be merged.

Sounds good? Let's get down to the surgery!

We need to select a pivot. For Kyua, this will be the default Google Code repository in Let's start by checking it out and moving all of its contents into a subdirectory:

$ git clone
$ cd kyua
$ mkdir kyua-cli
$ git mv * kyua-cli
$ git commit -a -m "Move."
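One caveat with the git mv * step above: the shell glob does not match hidden files, so tracked entries such as .gitignore would be left behind at the top level. A sketch of a more thorough variant, wrapped in a helper function of my own invention (move_into_subdir), which enumerates every tracked top-level entry via git ls-tree:

```shell
# Move every tracked top-level entry, dotfiles included, into the given
# subdirectory.  Run it from the root of the repository; move_into_subdir
# is an illustrative helper, not a standard Git command.
move_into_subdir() {
    subdir="$1"
    mkdir "${subdir}"
    # The freshly created directory is untracked, so git ls-tree does not
    # list it and we do not try to move it into itself.
    for entry in $(git ls-tree --name-only HEAD); do
        git mv "${entry}" "${subdir}/"
    done
    git commit -a -m "Move."
}
```

Calling move_into_subdir kyua-cli would then replace the mkdir, git mv and git commit triplet above.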

We are ready to start tackling the merge of a secondary repository. I will use in this example.

The first step is to pull in that secondary repository into our pivot:

$ git remote add origin-testers
$ git fetch origin-testers

And now, check it out into a temporary branch and move all of its contents into a subdirectory:

$ git checkout -b merge-testers origin-testers/master
$ mkdir kyua-testers
$ git mv * kyua-testers
$ git commit -a -m "Move."

Done? It's time to merge the two repositories into one!

$ git checkout master
$ git merge merge-testers

(Note that Git 2.9 and newer refuse to merge unrelated histories by default, so with a modern Git you will have to add the --allow-unrelated-histories flag to the git merge invocation above.)

And clean some stuff up.

$ git branch -d merge-testers
$ git remote remove origin-testers

Voilà. It wasn't that hard, was it? Just repeat the steps above for any other secondary repository you would like to merge.
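If there are several secondary repositories to merge, the repetition invites scripting. Here is a sketch that bundles steps 3 through 7 into a single helper function; merge_component is a name of my own making, and the function assumes a master branch in every repository plus Git 2.9 or newer (which refuses to merge unrelated histories unless explicitly told otherwise):

```shell
# Merge one secondary repository into the current (pivot) repository,
# placing its contents under a same-named top-level directory.  Run it
# from the pivot's root with the master branch checked out.
merge_component() {
    name="$1"; url="$2"

    # Steps 3 and 4: fetch the secondary repository and check it out
    # into a temporary local branch.
    git remote add "origin-${name}" "${url}"
    git fetch -q "origin-${name}"
    git checkout -q -b "merge-${name}" "origin-${name}/master"

    # Step 5: relocate everything (dotfiles included) into the
    # component's directory.
    mkdir "${name}"
    for entry in $(git ls-tree --name-only HEAD); do
        git mv "${entry}" "${name}/"
    done
    git commit -q -a -m "Move ${name} contents into ${name}/."

    # Steps 6 and 7: merge the relocated history into master.  The two
    # histories are unrelated, hence the extra flag for modern Git.
    git checkout -q master
    git merge -q --allow-unrelated-histories -m "Merge ${name}." "merge-${name}"

    # Clean up the temporary branch and remote.
    git branch -q -d "merge-${name}"
    git remote remove "origin-${name}"
}
```

Each additional repository then boils down to one call such as merge_component kyua-testers followed by the corresponding URL.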

Parting words

Note that this procedure achieves the goal of preserving the history of all the individual repositories, the commit IDs and the tags. In other words: all previous history is left intact and all commit logs remain valid after the merge.

Do you know if there is any easier way of doing this? Would it have any differences in the actual results?
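One candidate answer to my own question, hedged because I have not tried it on the actual Kyua repositories: git subtree, bundled with most Git installations since 1.7.11, can collapse the whole remote-branch-move-merge dance into a single command and, as long as you do not pass --squash, it still preserves every original commit; the history ends up grafted under the prefix via a merge commit, much like in the manual procedure. A sketch, with merge_with_subtree being a wrapper name of my own and the target prefix hard-coded for the example:

```shell
# Graft a secondary repository's master branch under kyua-testers/ in a
# single step.  merge_with_subtree is an illustrative wrapper; the real
# work happens in the one git subtree invocation.
merge_with_subtree() {
    # $1 is the URL (or local path) of the secondary repository.
    git subtree add --prefix=kyua-testers "$1" master
}
```

The downside is that, being a contrib tool, git subtree may not be installed everywhere, so the manual procedure above remains the portable route.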

What do you think about doing the merge for Kyua? I see this as a prerequisite for the migration to GitHub.

Thursday, February 06, 2014

Moving projects to GitHub

For a couple of years or so, I have been hosting my open source projects in Google Code. The time to jump ship has come.

The major reason for this move is that Google Code stopped supporting file downloads three weeks ago. This is unfortunate given that "binary" releases are a must for proper software distribution. Sure, I could use a third-party service like Bintray to offer the downloads, but I'd rather consolidate all project data in a single location.

With this, I am moving to GitHub. "Why not <insert-your-favorite-hosting> though?" Because Google prefers us to put Google-sponsored open source projects in GitHub, and all the work I do now is sponsored by my employer as part of my 20% time. (Yes, 20% time still exists even if people out there may try to convince you otherwise.)

GitHub was already hosting some projects of my own (including sysbuild and sysupgrade), so the remaining (Lutok, ATF and Kyua) will move as well. In fact, Lutok was already moved a few days ago as an experiment and I am working on ATF as we speak.

Moving Kyua will be a little bit trickier given that Kyua is composed of more than one Git repository. I still have to decide which of the two options I have in mind is better, so stay tuned for a follow-up post detailing my thoughts.

Fortunately, because this move involves no VCS change, it should be pretty much transparent (unlike the move years ago out of Monotone). The biggest downside may be moving the issue tracker, and that's something I'm just starting to think about.

Stay tuned and consider the current GitHub repositories an experiment.

Wednesday, February 05, 2014

Killing the ATF deprecated tools code

The time to kill the deprecated tools —atf-report and atf-run principally— from the upstream ATF distribution file has come. Unfortunately, this is not the trivial task that it may seem.

But wait, "Why?" and "Why now?"

Because NetBSD still relies on the deprecated tools to run its test suite, they cannot just be killed. Removing them from the upstream distribution, however, is actually a good change for both ATF and NetBSD.

The main reason for removing the tools from the upstream ATF package is simplicity: with this change, ATF becomes a pure distribution of libraries to implement ATF-compliant test programs. At the moment, building upon ATF is difficult because the code of the libraries is overly complicated: the libraries include a lot of internal modules only used by the tools, and this code cannot be lightened unless the tools are gone. Keeping this tools-only utility code inside the base libraries was a bad design decision when the package was first created in 2007 and is a huge mess to untangle.

However, even if the tools are removed from the upstream ATF package, we still need to accommodate the fact that NetBSD still requires them. The easiest way to do this is to just "hand the keys over" to NetBSD and transfer ownership of the tools code to them (which is a weird way to put it given that "them" includes me as well!). The actual effect of this "removal" is that the master copy of the tools code will live in the NetBSD source tree.

So, how is this going to happen?

A naive course of action would be to just remove the old code from the upstream distribution file in the next release. With this done, NetBSD would import the new release and, at the same time, keep the files for the old code without deleting them.

Unfortunately, this approach wouldn't work very well. The problem is, as mentioned above, that the tools have deep tentacles into the atf-c++ code and atf-c++, in turn, is mostly a wrapper over atf-c. If we proceeded this way, changing the upstream code for atf-c++ and/or atf-c at a future date would be incredibly difficult: any minor change to their internals would result in the NetBSD copy of the tools not building any more on the next import, and that would be a huge maintenance nightmare (for me actually, because I'll continue to be the owner of the code regardless of where it lives).

Therefore, the process to perform the removal needs to account for this original deficiency and has to mitigate its effects. The plan is as follows:

  1. Detach the ATF tools from the libraries, which is tricky but necessary. Future changes to the internal structure of atf-c++ must not affect the code of the tools as should have been the case from day one. The way this is going to happen is by duplicating all code in atf-c++ that is required by the tools into the tools themselves and then killing the dependency of the tools on the libraries.
  2. Publish 0.19 as the final release with the deprecated tools: this is the release that will contain the standalone version of the code for use by anyone that may need it.
  3. Remove all ATF tools related code from the upstream tree.
  4. Publish 0.20 immediately after 0.19 as the first release without the deprecated tools. 0.19 should never hit any packaging system and will exist solely as a transition point for NetBSD (or others that may need it).
  5. Import 0.19 into the NetBSD source tree. This will rearrange the code currently existing in src/external/bsd/atf/ to isolate the old tools (with all of their supporting code) in a single and standalone subdirectory.
  6. Import 0.20 into the NetBSD source tree as well, right after 0.19. Because NetBSD does not import any of the autotools generated code, this double-import will result in minimal changes to the tree... yet it will be declaring the tools as owned by NetBSD.
  7. Upgrade existing ATF packages to 0.20, including those in pkgsrc, FreeBSD ports and Fedora, and simplify them significantly to remove any tools code. This may actually involve killing atf-libs and similar packages that were added in the past to cope with the duality of the contents of the atf package.
  8. Simplify the tools code imported into NetBSD to drop most, if not all, of the portability-related code. This implies removing anything that depends on the results of configure, for example, given that the indirection won't be needed any more.

OK, when?

Really soon. I have all the changes ready for 0.19 and 0.20 in my local git tree (which took two long-haul flights within the U.S. last week to finish!) and I'm now working on the integration into NetBSD to ensure all works. When this is ready, expect the new releases out and updated in your favorite packaging systems!

Anything else?

Doing this cleanup will free ATF to evolve more easily regarding new features and much-needed interface cleanups. Of course, updating the deprecated tools in NetBSD will still have to happen whenever/if there are any incompatible changes, but doing so in a self-contained, OS-specific tree will be much simpler and more palatable.

The ultimate goal, really, is to make the upstream atf package as lightweight and simple as possible.

What are your thoughts on this?