Integrated subproject sites on readthedocs.org

January 14th, 2017

This has been bothering me for a few years now, but in the FAQ for readthedocs it calls out the celery/kombu projects as an example of subprojects on RTD. And.. ok, I suppose it’s technically true, they are related projects, and they do use the subprojects. But if you didn’t know that kombu existed, you’d never be able to find it from the celery project. But they aren’t good examples, and as far as I can tell, no large project uses subprojects in an integrated/useful/obvious way.

Until I did it in December, anyways.

Now, one problem RobotPy has had is that we have a lot of subprojects. There’s robotpy-wpilib, pynetworktables, the utilities library, pyfrc… and until recently each one had it’s own unique documentation site, and there was some duplication of information between site. But it’s annoying, because you have to search all of these projects to find what you want, and it was difficult to discover new related content across projects.

However, I’m using subprojects on RTD now, and all of the subproject sites now share a unified sidebar that make them seem to be one giant project. There are a few things that make this work:

  • I automatically generate the sidebar, which means the toctree in all of the documentation subproject sites are the same.
  • They use intersphinx to link between the sites
  • But more importantly, the intersphinx links and the sidebar links are all generated based on whether not the project is ‘stable’ or ‘latest’.

The last point is really the most important part and requires you to be a bit disciplined. You don’t want your ‘latest’ documentation subproject pointing to the ‘stable’ subproject, or vice versa — chances are if the user selected one or the other, they want to stay on that through all of your sites. Thankfully, detecting the version of the site is pretty easy:

# on_rtd is whether we are on readthedocs.org, this line of code grabbed from docs.readthedocs.org
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'

# This is used for linking and such so we link to the thing we're building
rtd_version = os.environ.get('READTHEDOCS_VERSION', 'latest')
if rtd_version not in ['stable', 'latest']:
    rtd_version = 'stable'

Then the intersphinx links are generated:

intersphinx_mapping = {
  'robotpy': ('http://robotpy.readthedocs.io/en/%s/' % rtd_version, None),
  'wpilib': ('http://robotpy-wpilib.readthedocs.io/en/%s/' % rtd_version, None),
}

And  as a result, they point to the correct version of the remote sites, as do the sidebar links.

While this approach may not work well for everyone, it has worked really well for the RobotPy project. Feel free to use this technique on your sites, and check out the RobotPy documentation site at robotpy.readthedocs.io!

Detect the ID of the docker container your process is running in

August 16th, 2016

Couldn’t find this exact thing elsewhere, so I’m publishing it in case I ever need it again. This trick depends on the fact that docker uses cgroups and the cgroup ID seems to always be equal to the container ID. This probably only works on linux-based docker containers.

import re

def my_container_id():
    '''
        If this process is in a container, this will return the container's ID,
        otherwise it will return None
    '''
    
    try:
        fp = open('/proc/self/cgroup', 'r')
    except IOError:
        return
    else:
        with fp:
            for line in fp:
                m = re.match(r'^.*\:/docker/(.*)$', line)
                if m:
                    return m.group(1)

Download and unpack source RPM in one command

October 31st, 2015

One thing I constantly find myself doing when developing GTK on python is referring to the C source code to figure out just exactly why it’s doing the things that it does. It’s easier to just download the source package from my distro instead of finding the upstream source repository.. etc etc etc. Of course, downloading and unpacking an RPM is magic that I always seem to forget.

Here’s a script that does it all in one swoop.

BPM autodetection using python + GStreamer’s bpmdetect plugin

August 16th, 2015

I recently found the bpmdetect element in GStreamer, and thought it would be neat to try it out and see how well it works. The GStreamer bpmdetect plugin is mostly undocumented, so I had to dig in the source code to figure out how to extract tags from it. The operation is pretty simple:

  • Setup a pipeline to read in a file, put a fakesink at the end of the pipeline and set ‘sync’ to false
  • Insert the bpmdetect element
    • However, due to this bug, insert a capsfilter before the element to mix it down to a single channel!
  • Attach a message handler, and listen for taglist messages
    • If you already have the BPM tag on the file, then it will be emitted by whatever is decoding the audio too. So, what I do is look for messages that *only* have the beats-per-minute tag in the tag list
    • The bpm is accumulated, so the last BPM message you get will be the calculated rate

Pretty simple! Of course, the results are only as reliable as libsoundtouch’s BPM detection is… but it seems to be correct at least some of the time. Expect to see this code in Exaile soon as a companion to the manual BPM counter! 🙂

Code is available in a gist on github: https://gist.github.com/virtuald/c30032a5b8cdacd1a6c0

GTK3 Composite Widget Templates for Python

May 24th, 2015

Recently, the other developers of the Exaile audio player and I decided to finally migrate to GTK3 and GStreamer 1.x. I mentioned that I wanted to use some code I had developed a few years ago to get rid of the manual UI building code that we had and replace it with GtkBuilder XML files, and @mathbr noted that GTK had added a new feature called ‘composite widget templates’ a few years ago. The ideas were similar to mine, and reading the comments in a blog post about the Vala implementation inspired me to create a working version of this for Python. The first implementation took a few hours, and I’ve been adding improvements ever since as I’ve been integrating this into Exaile.

Here’s how the vala demo code from that blog post looks like in Python, turns out it’s not *that* different:

from __future__ import print_function
from gi.repository import Gtk
from gi_composites import GtkTemplate

@GtkTemplate(ui='mywidget.ui')
class MyWidget(Gtk.Box):

    __gtype_name__ = 'MyWidget'
    
    entry = GtkTemplate.Child()

    def __init__(self, text):
        super(Gtk.Box, self).__init__()
        self.init_template()
        self.entry.set_text(text)
    
    @GtkTemplate.Callback
    def button_clicked(self, widget):
        print("The button was clicked with entry text: %s" % self.entry.get_text())

    @GtkTemplate.Callback
    def entry_changed(self, widget):
        print("The entry text changed: %s" % self.entry.get_text())

The key pieces to note are:

  • Use the @GtkTemplate decorator to load the template for your widget
  • Use GtkTemplate.Child to create attributes on your widget that will be loaded from the XML file (there’s also GtkTemplate.Child.widgets(n) if you need to declare multiple widgets)
  • Use @GtkTemplate.Callback decorator to mark methods to be connected to signals as declared in the XML file

For the full demo + associated GtkBuilder XML file check out the github repo.

I’d love to see this functionality included with GTK’s python bindings, and in fact after creating this I found a bug open on the GNOME bugzilla with a patch to PyGObject to allow python users to use it, but for whatever reason it never got merged. My implementation works on the current release of PyGObject, and possibly older versions too.

Want to get rid of boilerplate in your GTK3 python application? Check out the examples/code on github.

Better python interface to mjpg-streamer using OpenCV

April 3rd, 2015

The FIRST Robotics Team that I work with decided to install two cameras on the robot, but it took awhile for us to figure out the best way to actually stream the camera data. In previous years, we had used Axis IP cameras — but this year we had USB cameras plugged into the control system. Initially we used some streaming code that came from WPILib, but it wasn’t particularly high performance. Then we heard of someone who was using mjpg-streamer, which sounded exactly like what we wanted to use!

Of course, we needed to connect to the stream from python 3. I looked around, and while there were some examples, they didn’t perform quite as well as I would have liked. I believe if you compile with OpenCV with ffmpeg, it has mjpg support builtin, but it was quite laggy for me in the past. So, I wrote a reasonably efficient python mjpg-streamer client — in particular, I partially parse the HTTP stream, and reuse the image buffers when reading in the data, instead of making a bunch of copies. It works pretty well for us, maybe you’ll find it useful the next time you need to read an mjpg-streamer stream from your Raspberry Pi or on your FRC Robot!

I’m not going to explain how to compile/install mjpg-streamer, there’s plenty of docs on the web for that (but, if you want precompiled binaries for the roboRIO, go to this CD post). Here’s the code for the python client (note: this was tested using OpenCV 3.0.0-beta and Python 3):

import re
from urllib.request import urlopen
import cv2
import numpy as np

# mjpg-streamer URL
url = 'http://10.14.18.2:8080/?action=stream'
stream = urlopen(url)
    
# Read the boundary message and discard
stream.readline()

sz = 0
rdbuffer = None

clen_re = re.compile(b'Content-Length: (\d+)\\r\\n')

# Read each frame
# TODO: This is hardcoded to mjpg-streamer's behavior
while True:
      
    stream.readline()                    # content type
    
    try:                                 # content length
        m = clen_re.match(stream.readline()) 
        clen = int(m.group(1))
    except:
        return
    
    stream.readline()                    # timestamp
    stream.readline()                    # empty line
    
    # Reallocate buffer if necessary
    if clen > sz:
        sz = clen*2
        rdbuffer = bytearray(sz)
        rdview = memoryview(rdbuffer)
    
    # Read frame into the preallocated buffer
    stream.readinto(rdview[:clen])
    
    stream.readline() # endline
    stream.readline() # boundary
        
    # This line will need to be different when using OpenCV 2.x
    img = cv2.imdecode(np.frombuffer(rdbuffer, count=clen, dtype=np.byte), flags=cv2.IMREAD_COLOR)
    
    # do something with img?
    cv2.imshow('Image', img)
    cv2.waitKey(1)

 

Connect to RoboRIO USB on Linux

January 3rd, 2015

We got our RoboRIO imaged today, and my first thought was whether I could use it on Linux without needing to plug it into a network. Turns out, this is a pretty simple thing to do!

  1. Plug the RoboRIO in to your computer
  2. Identify the network device, it should be something like enp0s29u1u2
  3. Assign it an IP address like so:
    sudo ip addr add 172.22.11.1/24 dev enp0s29u1u2

Now you need to start an DHCP server to give it an address, because FIRST didn’t give it a static address for some reason. You could modify the configuration on the RoboRIO… but let’s assume you don’t want to do that. Instead:

  1. Download this python script that acts like a DHCP server
  2. Run this:
    sudo python simple-dhcpd -a 172.22.11.1 -i enp0s29u1u2 -f 172.22.11.2 -t 172.22.11.2

And that’s it! When it works, you should get a message that says “Leased: 172.22.11.2”.

If it’s not working, here are some things to watch out for:

  • Make sure your firewall allows port 67 on UDP
  • Make sure dnsmasq or some similar program isn’t listening on port 67 (use “netstat -ln | grep 67” to check)

Things that would be nice to change:

  • It’d be nice if FIRST changed the usb0 device to have a static address instead of depending on dhcp
  • Probably should create a udev rule to make the network device something pretty
  • When you disconnect the device, you have to run the scripts again. Probably should set something more permanent up, but this is good enough for now

Hope this helps you out!

Simple python wrapper to give SSH a password for automation purposes (with output capture)

December 20th, 2014

A very simple thing that most developers want to do at one time or another is automate SSH. However, since OpenSSH reads the password from the terminal, this can make it a very annoying thing to automate. Indeed, searching google for a solution to this problem yields all sorts of bad answers (and very few good ones).

There are a number of ways to solve the problem in python other than a wrapper around the SSH command:

  • Use public keys for authentication (this is a generally good practice for this sort of automation, but sometimes you can’t do this)
  • Use paramiko to talk the SSH protocol directly
  • Use pexpect‘s ssh wrapper to wrap openssh instead of rolling your own wrapper

However, if you can’t use external dependencies for some reason, this script is for you. It’s derived from this script and many other sources of information on the net, but I think it’s a bit simpler to use. It has been tested on Linux and OSX, on Python 2.7 and 3.4.

Let me know if you have problems with the script!

Introducing pyhcl

October 15th, 2014

HCL is a configuration file format created by Hashicorp to “build a structured configuration language that is both human and machine friendly for use with command-line tools”. They created it to use with their tools because they weren’t satisfied with existing solutions, and I think they did a really good job with it.

I also have similar opinions. JSON has parsers in just about any language one can think of, but is a really terrible format for humans to deal with since it doesn’t support comments. YAML is sometimes mentioned as a better choice, but I think it’s generally pretty terrible to use. I’ve only had to use YAML a few times, but in my opinion it’s tricky to get YAML files of any complexity correct without a bit of futzing.

Instead, HCL is super easy to type out and read, and looks quite nice:

variable "ami" {
    # this is a comment
    description = "the AMI to use"
}

I liked the idea of HCL a lot, so I created a python implementation of the parser using ply, which I’ve called pyhcl. When you read an HCL file, pyhcl turns it into a python dictionary (just like JSON), and the dictionary representation for the above HCL is like this:

{
  "variable": {
    "ami": {
      "description": "the AMI to use"
    }
  }
}

To get that result, parsing HCL using pyhcl is super easy, and is pretty much the same as parsing JSON:

import hcl

with open('file.hcl', r) as fp:
     obj = hcl.load(fp)

So far, I’ve been really happy with HCL, and I’ve been using it for some projects with complex configuration requirements, and the end-users of those projects have been quite happy with the simplicity that HCL provides.

pyhcl is provided under the MPL 2.0, just like the golang parser, and can mostly be used in the same places one might use JSON. It’s probably not terribly performant — but parsing files that humans read shouldn’t really require performance. If you do, python’s JSON parser is written in C and should meet your needs quite nicely.

One thing that would be nice is if there was an actual specification for HCL, and in particular it isn’t very well defined on how to convert HCL to/from JSON… but lacking that, pyhcl currently tries to match the golang implementation bug for bug, and its test suite has stolen most of the fixtures from the golang parser to ensure maximum compatibility.

Github site for pyhcl
Pypi site for pyhcl

Easily transfer docker images between two machines over the network

September 29th, 2014

I’ve been using docker a lot, and on occasion I need to transfer images between two machines that are on a local network. If a particular image is large, I might not want to download it twice from two machines, so I download it on one machine and transfer it to the other over the local network.

Now, I could stand up a local docker registry and use that, but it’s a bit of work. Instead, I’ve found that the quickest and easiest solution is to combine the docker ‘save’ and ‘load’ commands with a bit of netcat magic, and it’s pretty fast and easy. (Update: you can do it easily using SSH too, see the end of the post). Check it out.

First, on the destination machine (make sure your firewall allows traffic to the specified port, in this case 1234):

nc -v -l 1234 | docker load

Next, on the source machine, transfer the image (virtuald/etcd:0.4.6) to the destination IP (192.168.0.42):

docker save virtuald/etcd:0.4.6 | nc -v 192.168.0.42 1234

And that’s it!

The sad thing is that docker save/load doesn’t show a status message when saving/loading, so it might look like it’s not doing anything. However, using the -v flag for netcat shows when the connection is successfully opened/closed, so that’s something.

Security warning: Obviously, running netcat like this is a *huge* security hole while its up and listening, as anyone who can connect to the port can upload arbitrary images into your docker registry. This is mitigated a bit since netcat will immediately disconnect after the first client disconnects, but still risky on an untrusted network. Only use this on trusted networks!

Note: due to this bug, you’ll want to be using docker 1.2+, otherwise you may get unexpected results.

Update! As Joshua Barratt points out, since this method generalizes to any transport that allows piping via stdin/stdout, you can also do the transfer via SSH too, which is certainly more secure. Use the -C option to enable compression for faster transfers (thanks Andreas Steffan).

docker save virtuald/etcd:0.4.6 | ssh -C 192.168.0.42 ‘docker load’

Update II: As a number of people have pointed out, you can use PV to show a status message:

docker save virtuald/etcd:0.4.6 | pv | ssh -C 192.168.0.42 'docker load'