Thursday, December 4, 2014

Checkbox [Touch] 1.0.2 now available in the Ubuntu click store.

This is a re-post from my G+ post but I think it's such an important milestone that we should let everyone know about it :-)

Hello everyone. After a bit of silence I'm super pleased to announce that Checkbox [Touch] is now in the Ubuntu Click store. Get it on all your devices running Ubuntu RTM or newer.

Our hybrid QML + Python3 application is available on x86, amd64 and armhf alike, from one .click file. Bugs and any other feedback can be reported on https://bugs.launchpad.net/checkbox-touch/+filebug and the code is available, under the GPLv3 license, in the lp:checkbox repository.

Thanks a lot to everyone that made this possible!

Thursday, October 23, 2014

Update on python-glibc

My pure-python bindings to glibc are progressing at a nice rate. I've made some interesting changes today that I'd like to share.
  • First, there is a clear difference between the raw glibc functions (all in the glibc module) and anything else. You can use them directly just as you would have from C. There's no magic going on and it's all there.
  • Second, we now have a growing collection of python wrappers (in the new pyglibc package) that give low-level primitives a nice, high-level, pythonic API. Some of those are modeled straight on Python 3.4 (but are not a code copy); those include selectors.EpollSelector and select.epoll. Some are custom (there's nothing to base them on), like signalfd and pthread_sigmask. More are on the way.
  • Third, and this is pretty interesting: I've decided to build a PEP 3156 compatible event loop API. This is paramount for how this code can be consumed. It should roughly work out of the box as a drop-in replacement for the Python 3.4-only asyncio module. Did I mention that it works on Python 2.7? A lot is still missing but I am making progress. This ultimately means that once my contraption makes it into plainbox it won't have to be supported forever (aka job security) and can be discarded once we can depend on Python 3.4. It also means there's a clear, well-defined API and a reference implementation (and some others if you look hard enough).
All of that is coming in the 0.6 release that I plan to make later today. The API is stable as I don't like changing my examples over and over so if you want to give it a try, please do so.
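
For readers who have not used the Python 3.4 API that the new wrappers are modeled on, here is a minimal sketch using only the standard library selectors module. It is not pyglibc code, and whether pyglibc exposes the same names under its own import path is an assumption to check against the 0.6 release. The function name read_when_ready is made up for this example.

```python
import os
import selectors


def read_when_ready(timeout=1.0):
    """Write to a pipe, then wait for its read end using a selector."""
    # DefaultSelector picks the best mechanism available (epoll on Linux)
    sel = selectors.DefaultSelector()
    read_fd, write_fd = os.pipe()
    try:
        # Ask to be told when the read end has data; 'data' is a free-form tag
        sel.register(read_fd, selectors.EVENT_READ, data="pipe")
        os.write(write_fd, b"hello")
        chunks = []
        for key, events in sel.select(timeout):
            if events & selectors.EVENT_READ:
                chunks.append((key.data, os.read(key.fd, 4096)))
        return chunks
    finally:
        sel.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(read_when_ready())
```

The same register / select / read pattern is what an event loop builds on, which is why having these primitives on older Python versions matters.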

My ultimate goal is to scratch my itch. I want to build a reliable test launcher that does monitoring and cleanup. My only hard constraint is that I have to support Python 3.2 on Ubuntu 12.04. I'm doing a little bit more by supporting Python 2.7 (since it's not costing me anything) on anything that is running a recent enough glibc.

If you're interested in discussing this, using it, adding patches or the like, ping me please.

Wednesday, October 22, 2014

Launching a process to monitor stdout, stderr and exit code reliably

Recently I've been fixing a rather difficult bug that deals with doing one simple task reliably: run a program and watch (i.e. intercept and process) stdout and stderr until the process terminates.

Doing this is surprisingly difficult and I certainly made a few mistakes the first time I tried it. I recently posted a lengthy comment on the corresponding bug. It took me a few moments to carefully analyze and re-think the situation and how a reliable approach should work. Nonetheless I am only human and I have certainly made my share of mistakes.

Below is a description of my current approach. The implementation is still in progress but it seems to work (I need to implement the termination phase of non-kill-able processes and switch to fully non-blocking I/O). So far I've used epoll(7) and signalfd(2). I'm still planning to use timerfd_create(2) for the timer, perhaps with CLOCK_REALTIME for hard wall-clock-time limit enforcement. I'll post the full, complete examples once I'm done with this but you can see what it mostly looks like today in the python-glibc git tree's demos/ directory.

I'd like to ask everyone that has experience with this part of systems engineering to poke holes in my reasoning and show how this might fail and misbehave. Thanks.

The current approach, which so far works well on all the pathological cases, is as follows.
The general idea is that we're in an I/O loop, using non-blocking I/O and a select-like mechanism to wait for:
 - timeout (optional, new feature)
 - read side of the stdout pipe data
 - read side of the stdout pipe being closed
 - read side of the stderr pipe data
 - read side of the stderr pipe being closed
 - SIGCHLD being delivered with the intent to say that the process is dead
In general we keep looping and terminate only when the set of waited things (stdout depleted, stderr depleted, process terminated) is empty. This is not always true, so see below. The action that we take on each event is obviously different:
If the timeout has elapsed we proceed to send SIGTERM, reset the timer for shutdown period, followed by SIGQUIT and another timer reset. After that we send SIGKILL. This can fail as the process may have elevated itself beyond our capabilities. This is still undecided but perhaps, at this time, we should use an elevated process manager (see below). If we fail to terminate the process special provisions apply (see below).
If we have data to read we just read it and process it (send to log files, process, send to .record.gz). This is a point where we can optimize the process and improve reliability in the event of a sudden system crash. Using more modern facilities we can implement tee in kernel space, which lowers the processing burden on python and, in general, makes it more likely that the log files will see the actual output the process made just prior to its death.
We can also use pipes in O_DIRECT (aka packet mode) here to ensure that all write() calls end up as individual records, which is the intended design of the I/O log record concept. This won't address the inherent buffering that is enabled in all programs that detect when they are redirected and no longer attached to a tty.
Whenever one of the pipes is depleted (which may *never* happen, lesson learned) we just close our side.
When the child dies, and this is the most important part and the actual bugfix, we do the following sequence of events:
 - if we still have the stdout pipe open, read at most one PIPE_BUF. We cannot read more as the pipe may live on forever and we would just hang, as we currently do. Reading one PIPE_BUF ensures that we catch the last moments of what the originally started process intended to tell us. Then we close the pipe. This will likely result in SIGPIPE in any processes that are still attached to it, though we have no guarantee that it will really kill them as that signal can be blocked.
 - if we still have stderr pipe open we follow the same logic as for stdout above.
 - we restore some signal handling that was blocked during the execution of the loop and terminate.
There's one more trick up our sleeve and that is PR_SET_CHILD_SUBREAPER but I'll describe that in a separate bug report that deals with runaway processes. Think dbus-launch or anything that double-forks and daemonizes.
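
The loop described above can be sketched with the Python standard library alone. This is a simplified illustration, not the code from the python-glibc demos/ directory: it checks on the child with Popen.poll() on a short select timeout instead of receiving SIGCHLD through signalfd, and it leaves out the timeout and the SIGTERM / SIGQUIT / SIGKILL escalation entirely. The names run_and_watch and poll_interval are made up for this example, and os.set_blocking needs Python 3.5 or newer.

```python
import os
import select
import selectors
import subprocess
import sys


def run_and_watch(argv, poll_interval=0.1):
    """Run argv and return (exit_code, stdout_bytes, stderr_bytes)."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    sel = selectors.DefaultSelector()
    captured = {"stdout": [], "stderr": []}
    open_pipes = {}
    for name, pipe in (("stdout", proc.stdout), ("stderr", proc.stderr)):
        os.set_blocking(pipe.fileno(), False)  # never hang in read()
        sel.register(pipe, selectors.EVENT_READ, data=name)
        open_pipes[name] = pipe

    def drain_once(name):
        """Read at most one PIPE_BUF; return False if the pipe hit EOF."""
        try:
            chunk = os.read(open_pipes[name].fileno(), select.PIPE_BUF)
        except BlockingIOError:
            return True  # nothing to read right now, pipe still open
        if chunk:
            captured[name].append(chunk)
            return True
        return False

    def close_pipe(name):
        pipe = open_pipes.pop(name)
        sel.unregister(pipe)
        pipe.close()

    # Main loop: keep going until the child is dead
    while proc.poll() is None:
        if not open_pipes:
            proc.wait()  # both pipes are depleted, only the exit is left
            break
        for key, _events in sel.select(poll_interval):
            if not drain_once(key.data):
                close_pipe(key.data)

    # The child is dead. Pipes may still be held open by other processes,
    # so read at most one PIPE_BUF from each and close our side.
    for name in list(open_pipes):
        drain_once(name)
        close_pipe(name)
    sel.close()
    return (proc.returncode,
            b"".join(captured["stdout"]),
            b"".join(captured["stderr"]))


if __name__ == "__main__":
    print(run_and_watch([sys.executable, "-c", "print('hello')"]))
```

The important property is the last loop: once the child has exited we never wait for end-of-file on a pipe, so a lingering grandchild that inherited the pipe cannot make us hang.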

If you have any comments or ideas please post them here (wherever you are reading this), on the launchpad bug report page or via email. Thanks a lot!

Sunday, September 21, 2014

Announcing Morris 1.0

Earlier today I released the first standalone version of Morris (source, documentation). Morris is named after Gabriel Morris, the inventor of Colonne Morris aka the advertisement column. Morris is a simple and proven Python event/signaling library (not for watching sockets or for doing IO but for generic, in-process broadcast messages).

Morris is the first part of the Plainbox project that I've released as a standalone, small library. We've been using that code for two years now. Morris is simple, well-defined and I'd dare to say, complete. Hence the 1.0 version, unlike the traditional 0.1 that many free software projects start with.

Morris works on python 2.7+, pypy and python 3.2+. It comes with tests, examples and extensive docstrings. Currently you can install it from pypi but a Debian package is in the works and should be ready for review later today.

Here's a simple example on how to use the library in practice:

from __future__ import print_function

from morris import signal

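class Processor(object):
    def process(self, thing):
        # Fire the signal; connected listeners get the same arguments
        self.on_processing(thing)

    @signal
    def on_processing(self, thing):
        # Nothing to do here, listeners do the actual work
        pass

def on_processing(thing):
    print("Processing {}".format(thing))

proc = Processor()
# Subscribe the module-level function to this instance's signal
proc.on_processing.connect(on_processing)
proc.process("foo")
proc.process("bar")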
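
To give a feel for what a decorator like this has to do, here is a toy sketch of the idea. It is explicitly not Morris's implementation (Morris does considerably more), and the names MiniSignal, mini_signal and demo are invented for this illustration.

```python
class MiniSignal(object):
    """A bound signal: calls the decorated method, then every listener."""

    def __init__(self, first_responder):
        self._first_responder = first_responder
        self._listeners = []

    def connect(self, listener):
        self._listeners.append(listener)

    def disconnect(self, listener):
        self._listeners.remove(listener)

    def __call__(self, *args, **kwargs):
        self._first_responder(*args, **kwargs)
        # Iterate over a copy so listeners may disconnect themselves
        for listener in list(self._listeners):
            listener(*args, **kwargs)


class mini_signal(object):
    """Descriptor that lazily gives each instance its own MiniSignal."""

    def __init__(self, func):
        self._func = func
        self._name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        bound = MiniSignal(self._func.__get__(instance, owner))
        # Cache on the instance; later lookups find this attribute first
        instance.__dict__[self._name] = bound
        return bound


class Processor(object):
    @mini_signal
    def on_processing(self, thing):
        pass


def demo():
    seen = []
    proc = Processor()
    proc.on_processing.connect(seen.append)
    proc.on_processing("foo")
    proc.on_processing("bar")
    return seen
```

Because the signal object is created per instance, listeners connected to one Processor are not called when another Processor fires the same signal.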


For more information check out morris.readthedocs.org

Thursday, September 4, 2014

PEX, distribute your standalone python executables

I just discovered PEX. It's pretty simple conceptually. Bundle all your python 2/3 modules in a ZIP file. Add a __main__.py inside with bootstrap magic and set the interpreter to #!/usr/bin/env python* and you're done. That's what PEX does for you, with a few extra bells and whistles.
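
The core trick (a ZIP file with a __main__.py inside and a shebang line in front) can be sketched with the standard library. This is only an illustration of the concept, not PEX itself and not how PEX lays out its archives; build_zip_app and demo are names made up for this example.

```python
import os
import stat
import subprocess
import sys
import tempfile
import zipfile


def build_zip_app(target, main_source, interpreter="/usr/bin/env python3"):
    """Write a shebang line followed by a ZIP archive with __main__.py."""
    with open(target, "wb") as stream:
        # ZIP readers locate the archive from the end of the file,
        # so arbitrary bytes (here a shebang) may come before it.
        stream.write(("#!" + interpreter + "\n").encode("ascii"))
        with zipfile.ZipFile(stream, "w") as archive:
            archive.writestr("__main__.py", main_source)
    # Make the result executable for its owner
    mode = os.stat(target).st_mode
    os.chmod(target, mode | stat.S_IXUSR)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "hello.pyz")
        build_zip_app(target, "print('hello from inside the zip')\n")
        # Run it through the current interpreter rather than the shebang
        return subprocess.check_output([sys.executable, target])


if __name__ == "__main__":
    print(demo())
```

Python runs the __main__.py it finds at the top level of the archive, which is what makes a single-file executable possible once your modules are bundled next to it.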

So I did this:

$ pex -r 'plainbox' -r 'xlsxwriter' -r 'lxml' \
  -e plainbox.public:main -o plainbox

And it worked :-) It's super simple and quite convenient for many things I can think of. If you want to play around with the python3 version you may want to apply this patch (python3 is still a stranger to many developers :P)

You can also download the resulting PlainBox executable

Tuesday, September 2, 2014

Personal annoyance personified: we need a Serial Manager

This is a personal annoyance of mine. Everything involving a serial line is preceded by "sudo stop modemmanager". Given that we're talking about a free desktop things should not have to require that.

I've just noticed and read about the "Ubuntu Loves Devs" effort and I think that's something that could be addressed, or at least acknowledged. I've filed a bug report about what could be done to make modems and embedded / specialized development tools less at odds with each other.

If you're interested in embedded development boards or accessing various devices using serial lines I'd like to invite you to join the discussion.

Thursday, August 28, 2014

Checkbox Project Insights

Another day behind us. Another day hacking on the Checkbox Project.

Today we got a few issues on the 3.2 SRU kernel for precise. We're investigating them to see if they are kernel problems or test bugs. I've also recorded a short explanation of what the SRU process looks like from our (Certification) perspective.

I started the day by working on a few code reviews and SRU reviews. The bulk of the time was spent on the new validation subsystem for Checkbox. As before, you can see most of that via the Live Coding videos (specifically episodes #17, #18, #19 and #20) on my YouTube channel.

You can always find us, the checkbox hackers, in #checkbox on freenode. If you care about testing hardware with free software, join us!

Tuesday, August 26, 2014

Live Coding Experiment

Hey.

Last week I started recording videos of me coding live, with screen sharing and background context on everything I do. I did this to increase the transparency of FOSS development as well as to increase awareness of the Checkbox project that I participate in.


I think while the actual videos are a bit too long for casual watching the experiment itself is interesting and worth pursuing.

I'm recording about 3-4 videos a day. I'll try to focus on making the content more interesting for both casual viewers that bail out after a minute or two and my hardcore colleagues that sometimes watch those to get up-to-speed about new feature development.

In any case, it is out there, in the open. If you want to talk to us, join #checkbox on freenode. Ping me on Google+. Browse the code. Improve translations or get involved in any other way you want.

Lastly, for a bit of self promotion, have a look at the latest video

Friday, August 22, 2014

Live coding videos


Today I was experimenting with coding live, on air with google hangouts. It is an interesting idea IMHO as it adds visibility to a process that is done in the open but rarely transparently in a way others can watch and learn from.

I've recorded two videos today: Live coding: adding a man page for the new category unit and Live coding: fixing bug https://bugs.launchpad.net/checkbox-ng/+bug/1360125. If you want to see how I work (including all the mistakes I make :-) do watch them and give me feedback so that I can learn and get better at it.

Monday, June 9, 2014

Pyotherside + QML + Python3 in practice, checkbox-touch code walkthrough

If you're interested in python3, QML and pyotherside then you might be interested in this video I've recorded about checkbox-touch. Checkbox touch is a prototype application built with QML and python3, using the excellent pyotherside library to bridge the gap between the two worlds.


Thursday, June 5, 2014

Moving to my own email address

So I've been using Gmail for a good while. I have three accounts: one personal, one for my Canonical personality and a dead one for my Linaro personality.

Using Google products with more than one account is a frustrating experience. Especially with hangouts that apparently just don't work at all without private browsing. But that's just a minor annoyance.

The Linaro experience taught me that nothing lasts unless you own it. With that in mind I've decided to move my primary personal address away from @gmail.com to my own domain.

My new address is related to my twitter handle @zygoon (since my usual nickname was not available) on my own domain, zygoon.pl. If, by any chance, you have zkrynicki@gmail.com in your address book I'd like to ask you to update it to:


I've published updated GPG keys in case you were wondering.

Wednesday, April 30, 2014

pyotherside = QML + Python3

If you wanted to write a QML / Ubuntu SDK application but had a considerable amount of python3 code that you didn't want to throw out the window, you may find pyotherside utterly fantastic.

In short, you add one .so file (105K on amd64), or install it system-wide, run qmlscene on your qml stuff and you can import and call anything from the python world and get an asynchronous response. You can also use QML signals. This works pretty much like magic: it's speedy, responsive and the code is tiny. If you ever looked at pyside or pyqt then this is everything but.
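
As a rough sketch of the Python half of such an application: the module below is hypothetical (the name backend.py, the functions and the event name are made up), and it falls back to printing when it is imported outside of the QML runtime, where the pyotherside module is not available. On the QML side, the Python element's importModule, call and setHandler are the pieces that would load this module, invoke count_words and receive the event.

```python
# backend.py: a hypothetical module that a QML file could import
try:
    import pyotherside  # available only inside the PyOtherSide QML plugin
except ImportError:
    pyotherside = None  # allows importing and testing outside of QML


def notify(event, *args):
    """Send an asynchronous event to QML, or print it when standalone."""
    if pyotherside is not None:
        pyotherside.send(event, *args)
    else:
        print(event, args)


def count_words(text):
    """A plain function; QML gets the return value in a callback."""
    words = len(text.split())
    notify("counted", words)
    return words
```

Keeping the module importable without QML makes it easy to unit-test the existing python3 code separately from the user interface.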

I strongly recommend watching the Qt Developer Days 2013 presentation by the upstream author. You should also take a look at the documentation. I've filed an ITP (to get it packaged in Debian) and we should see it in Debian and Ubuntu very soon.

The upstream git repository has a number of examples. I love the matplotlib example that shows how you can render arbitrary bitmaps and push them to QML trivially.

If you want to give it a try I've prepared a PPA with the same packages that I uploaded to Debian. Give them a spin and let me know if you find any problems.

Friday, April 11, 2014

Checkbox challenges for 2015

Having a less packed day for the first time in a few weeks I was thinking about the next steps for the Checkbox project. There are a few separate big tasks that I think should happen over the next 6-18 months.

First of all, our large collection of tests needs maintenance. We need to keep adapting it to changing requirements and new hardware. We need to fix bugs and make it more robust. We also need to add some level of polish to the user interface, to make sure all our test programs behave in a uniform way, use correct wording, can be localized, etc. Those are all important to keep the project healthy. We also have a big challenge ahead of us, with the whole touch world entering the Ubuntu ecosystem. We will have to revisit some decisions, decide which libraries, tools and layers to use to test certain features and make sure we don't leave anything behind. This is very challenging as we really have a lot of existing tests. We also need to make them work the same way regardless of how they are started (classic Ubuntu, touch Ubuntu, remote Ubuntu server).

The core tools got an amazing boost over the past 12 months. We went from pretty old technology that was very flexible but hard to understand and modify to something that is probably just as flexible but far easier to understand and work with. Still, it's not all roses. The Ubuntu SDK UI needs a lot of work to get right. It has usability issues and it has architecture design issues. We also have a big disconnect between the core technology (python3) and the Qt+QML C++ codebase of the UI, which talks over D-Bus with the rest of the stack. That brings friction and is 10x harder to modify than an all-python solution. Ideally we'd like to switch to PyQt but how that fares with the future Touch world is hard to say. I suspect that our remote testing story will help us have a smooth transition that won't compromise our existing effort and equally won't collide with the direction set by the first Ubuntu touch release.

Perhaps not in the spotlight, but we definitely need to work on "whitelists" (aka test plans). We need to learn how our users take our stack and remix it to solve their problems. Our test plan technology is ancient and shows its weaknesses. We need 2.0 test plans that allow us to express the problems we need to solve clearly, unambiguously and efficiently. We need to improve our per-device-instance test support. We need to provide rich meta-data for user interfaces. We need a better vocabulary to create true test plans that can react to results in a way unconstrained by the design of the legacy checkbox first written over seven years ago. We also need to execute those changes in a way that has no flag days or burnt bridges. Nobody likes to build on shifting sand and we're here to provide a solid foundation for other teams at Canonical and everyone in the free software ecosystem.

Lastly we have the elephant in the room called deployment. Checkbox doesn't by itself handle deploying system images and configuration onto bare metal (we have a very old and support project for doing that) and the metal is changing very rapidly. Servers are quite unlike desktops, laptops (Ethernet-less ultrabooks?) and most importantly tablets and the whole touch-device ecosystem behind them. In the next 12 months we need a very good story and a solid plan on how to execute the transition from what we have now onto something that keeps us going for the next few years, at least. Canonical luckily has such a project already, MAAS. MAAS was envisioned for big iron hardware but if you look at it from our point of view we really want to have a uniform API for all hardware, from that big-ass server in a Data Centre somewhere across the globe to that development board on your desk, which will be the next tablet or phone product. We want to do the same set of operations on all of the devices in this spectrum: manage, control, track, re-image. The means and technology to do that differ widely and from experience I can tell you this is a zoo with all the queer animals you can think of but I'm confident we can make it work.

So there you have it. Checkbox over the next 12+ months, as seen through my eyes.

Friday, April 4, 2014

Checkbox Project Update

The Checkbox project is undergoing more changes. We had to solve the problem of bug management and feature tracking for releasing each of the many components that now make up the project. We have discussed a number of ideas, including using tags, milestones, series and lastly, using multiple projects. Using multiple projects ended up being the most direct and effective option.

We had to add a twist to that idea though: apart from the existing checkbox project, all of the new projects would have no source, just bugs, blueprints and releases (series, milestones and tarballs). Why? Because we lead a double life and need to take that into account, and because splitting the project into multiple code repositories is a separate transition that we have decided not to do (at this time, though I think that's healthy for us).

So, the double-life aspect. As mentioned in one of my earlier posts, Checkbox has two kinds of releases. One life is about our upstream role. We release tarballs, package them for Debian, get them sponsored and synchronize them to Ubuntu, into the hands of everyone using the platform. The other life is organized around our PPAs, internal customers and project schedules. There Ubuntu deadlines don't matter but it also means that important bugs have two releases they are a part of. They are a part of one (or more) of the upstream components. This is important so that we can properly document what goes into each release. They are also a part of a timestamped delivery for our internal customers, who also care about tracking fixes to the issues blocking their work.

So with that we now have checkbox-project (a launchpad project group) that aggregates our entire stack. You can now see all of the bugs and milestones throughout the project. You can also see how particular bugs or features translate to upcoming, scheduled releases of particular components. We hope that this new arrangement will be more valuable for everyone who tracks our work, despite the added set of projects.

Wednesday, April 2, 2014

PlainBox Target Device


The plainbox-0.6 milestone is full of content but one thing I want to point out is the CEP-4 blueprint. In short, you will be able to run PlainBox on a desktop or laptop computer but execute tests on a server or tablet device you can connect to over ssh or adb.

I'd like to solicit comments and feedback on the proposed design. Development has started but so far just in R&D mode, to check the limitations of adb and see how the proposed design really fits into the current architecture.

So, if you are interested in device or server testing, have a look at the specification (linked from the blueprint) and discuss this in checkbox-dev@lists.launchpad.net. Please help us help you better.

Monday, March 31, 2014

PlainBox Providers for Everyone

With the imminent release of PlainBox 0.5.2, providers with native executables (read: compiled code) are a reality.

Have a look at https://github.com/plainbox-providers/ for two very simple examples. Fork them, star them, share them, edit them, break them.

The final release of PlainBox will be made to pypi and Debian, and synchronized to Ubuntu. Early builds are already available in our PPA (as soon as the recipe builds finish).

About PlainBox: PlainBox is a toolkit consisting of a python3 library, development tools, documentation and examples. It is targeted at developers working on testing or certification applications and authors creating tests for such applications.

Thursday, March 13, 2014

PlainBox 0.5b1 released

I've just released the latest version of PlainBox. The 0.5b1 release is available on pypi. A list of changes, and a lot of other documentation, is available on readthedocs. This release was long in the making, bringing a number of important features and bug fixes.

Updated Debian packages should be made available tomorrow. The final release is expected early next week after which I will try to get a Feature Freeze Exception and sync it to Ubuntu 14.04.

Saturday, February 15, 2014

checkbox 0.17.6 released, new release process

Today I've released the next stable version of the classic Checkbox. Checkbox is a hardware testing tool developed by the Hardware Certification team here at Canonical.

After the initial bumpy morning I have released 0.17.6 and opened 0.17.7 for development. The release was built with launchpad, using this packaging recipe, and is available in both testing and stable PPAs. The daily development PPA is already tracking 0.17.7 builds. The release is tracked on the 2014-feb-14 milestone on launchpad.

This was a rather uneventful release but the release process was anything but. It is the first release built entirely on tags and merge requests. The release candidate 0.17.6c1 was branched from trunk and built from the release branch into the testing PPA. The recipe there is particularly interesting. Here is the relevant text:
# bzr-builder format 0.2 deb-version 0.17.6~c1~ppa
lp:checkbox/release tag:checkbox-v0.17.6c1
merge checkbox-packaging lp:~checkbox-dev/checkbox/checkbox-packaging-release tag:packaging-checkbox-v0.17.6c1
We are taking the release candidate tag from the lp:checkbox/release branch and a similar tag from the release packaging branch. This mechanism allows us to release follow-ups. Previously we could just not release if the release had serious issues. Now we can fix issues and release another candidate version.

The release was tested using our standard testing process (full certification run on reference hardware) and after reviewing results, was green-lit for final release.

Still on the release branch, a version bump was committed (to the final release version), a new tag was added (using the improved releasectl script) and another version bump (to the next development version) was committed. A similar operation was performed in the packaging branch. Both changes were pushed and merged to their respective trunks (for code and for packaging). Those merges kicked off one more build, this time just to have the final version everywhere it matters. That build uses the new pair of tags and the stable release recipe, the relevant portion of which you can see below:
# bzr-builder format 0.2 deb-version 0.17.6
lp:checkbox tag:checkbox-v0.17.6
merge checkbox-packaging lp:~checkbox-dev/checkbox/checkbox-packaging tag:packaging-checkbox-v0.17.6
The situation is very similar to what was quoted above for the release candidate. The essential difference is that now the final tags are being looked up in trunk. This ensures that we have actually merged the tags back to trunk and that the release can be reproduced later.

The release process is a little bit heavier than before, due to the extra builds and the extra tagging of the candidate version. We will be working to improve the automation around releases to alleviate that cost and make it a derivative of the fact that a specific tag was placed in trunk. Everything else is just a side effect of that.

This process has some very nice properties. It naturally prevents, at the source control level, anyone from releasing a duplicate version. It allows multiple releases to be in flight (in testing, preparing for release). It ensures that anyone can rebuild a release or branch off the relevant tag, add a bugfix and release again.

Since this was all pretty much experimental, we haven't written the instructions down in our release policy document but I plan to work on that early next week. We found only two actual issues during this experience. One is that tarmac is not merging or propagating tags, which seems easy enough to fix. The bigger problem is that launchpad is not showing tags in merge requests. In a process where you rely on tags this is a real issue as it requires trust that nobody is sneaking tags behind your back and that all the tags are placed on appropriate revisions.

So this is it, this is the new release process, what do you think? What would you change to make it better?

Friday, February 7, 2014

PlainBox is going local

With the feature freeze approaching quickly and plainbox (and checkbox-ng) being already in the Debian and Ubuntu archives, we're working on getting proper i18n support ready.

I've posted a few initial patches that add gettext_domain to provider definitions. With that we can add additional APIs to expose the domain over DBus (or localized strings, though I don't like that approach) and our python APIs.
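
To illustrate why a per-provider gettext domain is useful, here is a small sketch using only the standard library gettext module. It is not PlainBox code: the function name and the way the domain is passed around are assumptions made for this example.

```python
import gettext


def translate_for_provider(gettext_domain, message,
                           localedir=None, languages=None):
    """Look up 'message' in the catalog of the given gettext domain."""
    # fallback=True returns a pass-through translation object when no
    # catalog for the domain can be found, instead of raising an error
    translation = gettext.translation(
        gettext_domain, localedir=localedir, languages=languages,
        fallback=True)
    return translation.gettext(message)


if __name__ == "__main__":
    # Hypothetical domain name; without a catalog the text is unchanged
    print(translate_for_provider("example-provider", "Check the battery"))
```

With one domain per provider, strings that live in data files can be translated independently of the strings in the program code.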

PlainBox is just a framework for testing applications and we must deal with a lot of data to get everything right (only a fraction of the text on screen is actually in the program code). One of the biggest sources of data are test providers.

One of the challenges that we have yet to solve is how to tie this system with Launchpad's automatic translation system. For python code it's all okay but we need to ensure that launchpad is correctly exporting all the marked strings from our data files. If we manage to do that in time, translators can fill in the blanks and everyone using Ubuntu 14.04 will get new translations as a part of periodic language pack updates.

Tuesday, February 4, 2014

PlainBox 0.5a1 released.

Hello

I'm pleased to announce the availability of PlainBox 0.5 alpha 1.

This is an alpha release of the 0.5 series. It is released for testing and
early feedback. For a list of changes refer to the changelog [1]. Final
release is not scheduled yet but it is expected to follow by the end of
February 2014.

The new release source tarball has been uploaded to pypi [2] and is
immediately available.

The Debian SVN packaging repository has been updated [3]. You may expect
that new versions of PlainBox packages will show up in the Debian and Ubuntu
archives in two or three days.

As always, our official documentation is live on readthedocs [4].

[1] http://plainbox.readthedocs.org/en/latest/changelog.html#plainbox-0-5a1
[2] https://pypi.python.org/pypi/plainbox
[3] http://anonscm.debian.org/viewvc/python-modules/packages/plainbox/
[4] http://plainbox.readthedocs.org/