Archive for the ‘tips’ Category

Integrated subproject sites on readthedocs.org

Saturday, January 14th, 2017

This has been bothering me for a few years now: the readthedocs FAQ calls out the celery/kombu projects as an example of subprojects on RTD. And... ok, I suppose it’s technically true. They are related projects, and they do use subprojects. But if you didn’t know that kombu existed, you’d never be able to find it from the celery project. They aren’t good examples, and as far as I can tell, no large project uses subprojects in an integrated/useful/obvious way.

Until I did it in December, anyways.

Now, one problem RobotPy has had is that we have a lot of subprojects. There’s robotpy-wpilib, pynetworktables, the utilities library, pyfrc… and until recently each one had its own unique documentation site, with some duplication of information between sites. That was annoying, because you had to search all of these projects to find what you wanted, and it was difficult to discover related content across projects.

However, I’m using subprojects on RTD now, and all of the subproject sites share a unified sidebar that makes them seem to be one giant project. There are a few things that make this work:

  • The sidebar is automatically generated, which means the toctree is the same in all of the subproject documentation sites.
  • They use intersphinx to link between the sites.
  • More importantly, the intersphinx links and the sidebar links are all generated based on whether the site being built is ‘stable’ or ‘latest’.

The last point is really the most important part and requires you to be a bit disciplined. You don’t want your ‘latest’ documentation subproject pointing to the ‘stable’ subproject, or vice versa — chances are if the user selected one or the other, they want to stay on that through all of your sites. Thankfully, detecting the version of the site is pretty easy:

import os

# on_rtd is whether we are on readthedocs.org, this line of code grabbed from docs.readthedocs.org
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'

# This is used for linking and such so we link to the thing we're building
rtd_version = os.environ.get('READTHEDOCS_VERSION', 'latest')
if rtd_version not in ['stable', 'latest']:
    rtd_version = 'stable'

Then the intersphinx links are generated:

intersphinx_mapping = {
  'robotpy': ('http://robotpy.readthedocs.io/en/%s/' % rtd_version, None),
  'wpilib': ('http://robotpy-wpilib.readthedocs.io/en/%s/' % rtd_version, None),
}

As a result, they point to the correct version of the remote sites, as do the sidebar links.
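To make the version-pinning idea concrete, here is a minimal sketch that pulls the two snippets above into helper functions and reuses one URL builder for both intersphinx and sidebar links. The function names, the SUBPROJECTS table, and the (name, url) sidebar format are made up for illustration; the post doesn't show how the RobotPy sidebar is actually generated, so treat that part as an assumption.

```python
import os

# Hypothetical table: intersphinx name -> RTD project slug.
SUBPROJECTS = {
    'robotpy': 'robotpy',
    'wpilib': 'robotpy-wpilib',
}


def resolve_rtd_version(environ=os.environ):
    """Return 'stable' or 'latest'; any other build (tag, branch) maps to 'stable'."""
    version = environ.get('READTHEDOCS_VERSION', 'latest')
    return version if version in ('stable', 'latest') else 'stable'


def project_url(slug, version):
    """Root URL of one subproject site for the given version."""
    return 'http://%s.readthedocs.io/en/%s/' % (slug, version)


def build_intersphinx_mapping(version, subprojects=SUBPROJECTS):
    """Same shape as the intersphinx_mapping dict shown above."""
    return dict((name, (project_url(slug, version), None))
                for name, slug in subprojects.items())


def build_sidebar_links(version, subprojects=SUBPROJECTS):
    """Assumed sidebar format: a sorted list of (name, url) pairs."""
    return sorted((name, project_url(slug, version))
                  for name, slug in subprojects.items())
```

In conf.py this would be used as intersphinx_mapping = build_intersphinx_mapping(resolve_rtd_version()), so every link follows whichever version is being built.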

While this approach may not work well for everyone, it has worked really well for the RobotPy project. Feel free to use this technique on your sites, and check out the RobotPy documentation site at robotpy.readthedocs.io!

Easily transfer docker images between two machines over the network

Monday, September 29th, 2014

I’ve been using docker a lot, and on occasion I need to transfer images between two machines that are on a local network. If a particular image is large, I might not want to download it twice from two machines, so I download it on one machine and transfer it to the other over the local network.

Now, I could stand up a local docker registry and use that, but it’s a bit of work. Instead, I’ve found that the quickest and easiest solution is to combine the docker ‘save’ and ‘load’ commands with a bit of netcat magic. (Update: you can do it easily using SSH too, see the end of the post.) Check it out.

First, on the destination machine (make sure your firewall allows traffic to the specified port, in this case 1234):

nc -v -l 1234 | docker load

Next, on the source machine, transfer the image (virtuald/etcd:0.4.6) to the destination IP (192.168.0.42):

docker save virtuald/etcd:0.4.6 | nc -v 192.168.0.42 1234

And that’s it!

The sad thing is that docker save/load doesn’t show a status message when saving/loading, so it might look like it’s not doing anything. However, using the -v flag for netcat shows when the connection is successfully opened/closed, so that’s something.

Security warning: Obviously, running netcat like this is a *huge* security hole while it’s up and listening, as anyone who can connect to the port can load arbitrary images into your docker daemon. This is mitigated a bit since netcat stops listening once the first client disconnects, but it’s still risky on an untrusted network. Only use this on trusted networks!

Note: due to this bug, you’ll want to be using docker 1.2+, otherwise you may get unexpected results.

Update! As Joshua Barratt points out, this method generalizes to any transport that allows piping via stdin/stdout, so you can also do the transfer over SSH, which is certainly more secure. Use the -C option to enable compression for faster transfers (thanks Andreas Steffan).

docker save virtuald/etcd:0.4.6 | ssh -C 192.168.0.42 'docker load'

Update II: As a number of people have pointed out, you can use pv to show a status message:

docker save virtuald/etcd:0.4.6 | pv | ssh -C 192.168.0.42 'docker load'
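If pv isn’t installed on the source machine, the same "pass bytes through and report progress" stage can be sketched in a few lines of Python 3. This is only an illustration of what the middle of the pipeline does, not a replacement for pv; the function names and the progress.py filename below are hypothetical.

```python
import sys


def copy_with_progress(src, dst, report, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, calling report(total_bytes) after each chunk."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # end of stream
        dst.write(chunk)
        total += len(chunk)
        report(total)
    dst.flush()
    return total


def report_to_stderr(total):
    """Print a running byte count on stderr so stdout stays clean for the pipe."""
    sys.stderr.write('\r%.1f MiB transferred' % (total / (1024.0 * 1024.0)))
    sys.stderr.flush()


def main():
    # The image tarball is binary, so use the raw byte streams.
    copy_with_progress(sys.stdin.buffer, sys.stdout.buffer, report_to_stderr)
    sys.stderr.write('\n')
```

Saved as progress.py with a call to main() added as the last line, it would slot in where pv sits: docker save virtuald/etcd:0.4.6 | python3 progress.py | ssh -C 192.168.0.42 'docker load'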

Automated docker ambassadors with CoreOS + registrator + ambassadord

Monday, July 28th, 2014

I’m just starting to play around with docker, and I’ve been investigating the use of CoreOS for deploying a cluster of docker containers. Though I’ve only been using it for a week, I really like what I’ve seen so far. CoreOS makes it very easy to cluster together a group of machines using etcd, and in particular, I really like their fleet software, which allows you to manage systemd units (which you can use to run docker containers) across an entire CoreOS cluster. Fleet makes it easy to do things like high availability, failure recovery, and other useful things without too much extra effort right out of the box. The one piece missing is how to connect the containers together. There are some ways they’ve documented to do it, but honestly most of the ways I’ve seen on the internet consist of a bunch of shell script glue that feels really hacky to me.

In the docker community, something called the ‘ambassador’ pattern has emerged, which is this idea of proxying connections to container A from container B via container P, and container P has enough smarts in it to transparently redirect connections to many different containers depending on parameters. However, most of the stuff I’ve found on the web is very labor intensive and full of nasty shell scripting that is easy to mess up.

Jeff Lindsay has created the first stage of what I think is a really good general solution to this problem: his projects called registrator and ambassadord. Registrator listens for docker containers to start up, and automatically adds them to something like etcd or consul. You link your containers to ambassadord, and when your container tries to make an outgoing connection, ambassadord does a lookup to figure out where the connection needs to go, and connects you there. It’s pretty easy, with very little configuration needed for the involved containers.

CoreOS already ships with etcd built-in, so CoreOS + registrator + ambassadord seems to be a great combination to me. I’ve modified CoreOS’s sample vagrant cluster to demonstrate how to use these to connect containers together.

(more…)

Transparently use avro schemata (.avsc) files in a python module

Sunday, June 8th, 2014

One of the cool things about avro is that it has bindings in a couple of different languages. However, I think the only one that has native code generation support for working with avro objects is Java, which makes working with avro in the other languages a bit harder. Here’s a simple way to load your schemata dynamically (and if you don’t want to write your schemata by hand, then using maven you can generate them from AVDL files using the tip from my previous post).

This bit of code replaces the module with an object that overrides __getattr__, so any time you try to access a type on the module, it will attempt to load the avro schema from a file of the same name with the .avsc extension. To use it, create a file called __init__.py in your directory of .avsc files, and paste in the following code.

import sys
from os.path import join, dirname
import avro.schema

class AvroSchemaLoader(object):
    '''
        This object allows us to lazily load schemata files in the current
        directory and parse them as needed.
        
        It is intended to be used as a replacement of the current module in
        sys.modules, so usage of this object should be transparent to users.
        
        For example, to access the Foo schema, you would do the
        following:
        
            >>> from this_dir_name import Foo
            >>> type(Foo)
            <class 'avro.schema.RecordSchema'>
    '''
    
    def __init__(self, module):
        # things break in odd ways if you don't keep a reference to the module here
        self.__module = module  

    def __getattr__(self, name):
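        # dunder lookups (e.g. __path__) are not schemata; object has no
        # __getattr__ to delegate to, so raise AttributeError directly
        if name.startswith('__'):
            raise AttributeError(name)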
        
        with open(join(dirname(__file__), '%s.avsc' % name), "r") as fp:
            schema = avro.schema.parse(fp.read())
        
        setattr(self, name, schema)
        return schema


# Replace this module instance with the dynamic loader
sys.modules[__name__] = AvroSchemaLoader(sys.modules[__name__])

There’s a lot you can do to make this better — like load a wrapper around the schema instead of using the schema directly. I’ll leave that as an exercise for the reader. 🙂
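For trying the lazy-loading idea in isolation, here is a minimal standalone sketch of the same technique without the sys.modules swap. The class name and the parse parameter are hypothetical, and json.loads is only a stand-in so the sketch runs without avro installed; passing avro.schema.parse instead should give real schema objects.

```python
import json
import os


class LazySchemaLoader(object):
    """Load '<name>.avsc' from a directory the first time it is accessed."""

    def __init__(self, directory, parse=json.loads):
        # parse is an assumption of this sketch: any callable that turns
        # the file's text into a schema object.
        self._directory = directory
        self._parse = parse

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. nothing is cached yet.
        if name.startswith('_'):
            raise AttributeError(name)
        path = os.path.join(self._directory, '%s.avsc' % name)
        try:
            with open(path, 'r') as fp:
                schema = self._parse(fp.read())
        except IOError:
            # Missing file: behave like any other missing attribute.
            raise AttributeError('no schema file for %r' % name)
        setattr(self, name, schema)  # cache, so later lookups skip __getattr__
        return schema
```

Turning a missing file into AttributeError keeps hasattr() and "from package import Name" behaving the way callers expect. On Python 3.7 and later, a module-level __getattr__ function (PEP 562) can do the same job without replacing the module object.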

Automatically generating avro schemata (avsc files) using maven

Sunday, June 8th, 2014

I’ve been using avro for serialization a bit lately, and it seems like a really useful, flexible, and performant technology. To use avro containers, you have to define a schema for them — but writing out JSON files is a bit of a pain. Avro provides an IDL that you can use to specify the object types instead, and it’s much easier to work with. The avro-maven-plugin is quite useful because you can automatically generate Java objects from the IDL files — but what if you’re working with the same Avro files in a different language that can’t use the IDL?

Until they add the functionality to the maven plugin, there’s an easy way you can automate this yourself using a bit of maven magic. The first thing to do is add the following dependency to your project’s pom.xml:

<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-tools</artifactId>
  <version>1.7.6</version>
</dependency>

Next, you need to add a simple little class that converts an entire directory from avdl files to avsc files. Avro-tools ships with a useful class called IdlToSchemataTool that will convert a single file for you, so converting an entire directory is just a simple wrapper around that. There is a bit of improvement that could be done here, but this gets the job done assuming your directory only has avdl files in it.

package main;

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.tool.IdlToSchemataTool;

/**
 * Converts an entire directory from Avro IDL (.avdl) to schema (.avsc)
 */
public class ConvertIdl {

	public static void main(String [] args) throws Exception {
		IdlToSchemataTool tool = new IdlToSchemataTool();
		
		File inDir = new File(args[0]);
		File outDir = new File(args[1]);
		
		for (File inFile: inDir.listFiles()) {
			List<String> toolArgs = new ArrayList<String>();
			toolArgs.add(inFile.getAbsolutePath());
			toolArgs.add(outDir.getAbsolutePath());
			
			tool.run(System.in, System.out, System.err, toolArgs);
		}
	}
}

Finally, add the following to the plugins section of pom.xml to actually generate the avsc files. This uses the exec-maven-plugin to run the class we created above during compilation. This configuration assumes that you are storing your avdl files in src/main/avro, and that you want to place the generated avsc files in schemata. Obviously you can reconfigure this however you want.

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <version>1.3</version>
    <executions>
        <execution>
            <phase>compile</phase>
            <goals>
                <goal>java</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <mainClass>main.ConvertIdl</mainClass>
        <arguments>
            <argument>${project.basedir}/src/main/avro/</argument>
            <argument>${project.basedir}/schemata/</argument>
        </arguments>
    </configuration>
</plugin>

And that’s it! To actually convert your avdl files to avsc files, run ‘mvn compile’ and the output directory should be filled with avsc files containing the JSON schema for your avro containers. Hope this helps you out, let me know if you find any bugs or have improvements.
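If the consumer of the avsc files is a Python project without a maven build, the same directory-wide conversion can be sketched as a small script that drives the avro-tools jar from the command line. This assumes a downloaded avro-tools jar and its idl2schemata subcommand; the jar path and function name below are placeholders. Unlike the Java wrapper above, it skips files that aren’t .avdl.

```python
import os


def idl_to_schemata_commands(in_dir, out_dir, tools_jar='avro-tools.jar'):
    """Build one 'idl2schemata' command line per .avdl file in in_dir."""
    commands = []
    for filename in sorted(os.listdir(in_dir)):
        if not filename.endswith('.avdl'):
            continue  # ignore anything that isn't Avro IDL
        commands.append([
            'java', '-jar', tools_jar, 'idl2schemata',
            os.path.join(in_dir, filename), out_dir,
        ])
    return commands
```

Each returned list can be handed to subprocess.check_call() to run the conversion; building the commands separately makes it easy to print them first and see what would run.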

VirtualKD 2.8 with VirtualBox 4.2

Monday, March 11th, 2013

If you try to install VirtualKD 2.8 on VirtualBox 4.2.x, you get an error similar to this:

Unable to cast COM object of type ‘VirtualBox.VirtualBoxClass’ to interface type ‘VirtualBox.IVirtualBox’. This operation failed because the QueryInterface call on the COM component for the interface with IID ‘{…}’ failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)).

It turns out it’s a pretty easy thing to fix. The Interop.VirtualBox.dll distributed with VirtualKD is built for the 4.1 VirtualBox interface, so you just have to rebuild it for your version of VirtualBox. Create a C# project, and paste the following code into the Program.cs file to build a new Interop.VirtualBox.dll for 4.2.x.

using System;
using System.Reflection;
using System.Reflection.Emit;
using System.Runtime.InteropServices;

namespace ConvertTypeLibToAssembly
{
    public class App
    {
        private enum RegKind
        {
            RegKind_Default = 0,
            RegKind_Register = 1,
            RegKind_None = 2
        }
    
        [ DllImport( "oleaut32.dll", CharSet = CharSet.Unicode, PreserveSig = false )]
        private static extern void LoadTypeLibEx( String strTypeLibName, RegKind regKind, 
            [ MarshalAs( UnmanagedType.Interface )] out Object typeLib );
        
        public static void Main()
        {
            Object typeLib;
            LoadTypeLibEx( @"C:\Program Files\Oracle\VirtualBox\VBoxC.dll", RegKind.RegKind_None, out typeLib ); 
            
            if( typeLib == null )
            {
                Console.WriteLine( "LoadTypeLibEx failed." );
                return;
            }
                
            TypeLibConverter converter = new TypeLibConverter();
            ConversionEventHandler eventHandler = new ConversionEventHandler();
            //AssemblyBuilder asm = converter.ConvertTypeLibToAssembly( typeLib, "Interop.Virtualbox.dll", 0, eventHandler, null, null, null, null );
            AssemblyBuilder asm = converter.ConvertTypeLibToAssembly(typeLib, "Interop.VirtualBox.dll", TypeLibImporterFlags.SafeArrayAsSystemArray, eventHandler, null, null, "VirtualBox", null); //using assembly name "VirtualBox" and SafeArrayAsSystemArray to be compatible to VisualStudio-Generated Interop-Assembly

            asm.Save("Interop.VirtualBox.dll");
        }
    }

    public class ConversionEventHandler : ITypeLibImporterNotifySink
    {
        public void ReportEvent( ImporterEventKind eventKind, int eventCode, string eventMsg )
        {
            // handle warning event here...
        }
        
        public Assembly ResolveRef( object typeLib )
        {
            // resolve reference here and return a correct assembly...
            return null; 
        }    
    }
}

Some of this code was grabbed from MSDN, and a single line was grabbed from VirtualBoxService. As the latter is GPL, this code may be GPL. I don’t claim any rights to it.

Problems with file descriptors being inherited by default in Python

Wednesday, February 6th, 2013

Have you ever run into a traceback that ends with something like this?

  File "C:\Python27\lib\logging\handlers.py", line 141, in doRollover
    os.rename(self.baseFilename, dfn)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process

I certainly have, in a few places. The basic problem is that when python creates file objects on Windows (and I think on *nix as well), by default Python will mark the handle as being inheritable (I’m sure there’s a reason why… but, doesn’t make a whole lot of sense for this to be the default behavior to me). So if your script spawns a new process, that new process will inherit all the file handles from your script — and of course since it doesn’t realize that it even has those handles, it’ll never close them. A great example of this is launching a process, then exiting. When you launch your script again and try to open that handle… the other process still has it open, and depending on how the file was opened, you may not be able to open it due to a sharing violation.

It looks like they’re trying to provide ways to fix the problem in PEP 433 (proposed for Python 3.4), but that doesn’t help those of us still using Python 2.7. Here’s a snippet that you can put at the very beginning of your script to fix this problem on Windows:

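import sys

if sys.platform == 'win32':
    from ctypes import windll
    import msvcrt
    # use the __builtin__ module directly: __builtins__ is only a module
    # in the main script, and a plain dict in imported modules
    import __builtin__

    HANDLE_FLAG_INHERIT = 1

    __builtins__open = __builtin__.open

    def __open_inheritance_hack(*args, **kwargs):
        result = __builtins__open(*args, **kwargs)
        # clear the inherit flag so child processes don't get this handle
        handle = msvcrt.get_osfhandle(result.fileno())
        windll.kernel32.SetHandleInformation(handle, HANDLE_FLAG_INHERIT, 0)
        return result

    __builtin__.open = __open_inheritance_hack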

Now, I admit, this is a bit of a hack… but it solves the problem for me. Hope you find this useful!
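For anyone who has since moved to Python 3.4 or later: PEP 446 made newly created file descriptors non-inheritable by default, so the hack above shouldn’t be needed there. A minimal sketch for checking the behavior, using os.get_inheritable and os.set_inheritable from the standard library (the helper name is made up for illustration):

```python
import os
import tempfile


def handle_is_inheritable(fileobj):
    """Report whether child processes would inherit this file's descriptor."""
    return os.get_inheritable(fileobj.fileno())


with tempfile.TemporaryFile() as fp:
    # Freshly opened files start out non-inheritable on Python 3.4+.
    inherited_by_default = handle_is_inheritable(fp)

    # Opting back in is explicit, for the rare case a child needs the handle.
    os.set_inheritable(fp.fileno(), True)
    inherited_after_opt_in = handle_is_inheritable(fp)
```

If a log file still turns out to be locked by a child process on a modern Python, checking the flag this way is a quick first step before looking elsewhere.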

Drawing in color in PyGTK

Monday, October 15th, 2012

I’ve been playing with drawing on your own widgets in PyGTK on Windows, and I found it incredibly difficult to figure out how to draw something in color on a gtk.gdk.Drawable object using draw_line, draw_rectangle, etc. You can’t just set the color using the semi-obvious mechanism:

    gc = widget.window.new_gc()
    gc.set_foreground(gtk.gdk.Color(255,0,0))

I think the reason it doesn’t work is that if the color isn’t in the device-specific colormap, GTK will ignore whatever color you set without bothering to warn you that something is wrong. However, I’ve finally hit on something that works. In your expose event (or elsewhere), you can put in something like the following:

    def on_expose_event(self, widget, event):

        gc = widget.window.new_gc()
        colormap = gc.get_colormap()
        color = colormap.alloc_color('yellow')
        gc.set_foreground(color)

        # whatever gtk.gdk.Drawable draw_* functions you call here
        # will use that color
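One more detail worth checking if you want an arbitrary RGB color rather than a named one: as far as I can tell from the PyGTK reference, gtk.gdk.Color takes channel values in the range 0-65535, so Color(255,0,0) would be almost black even once allocated. A small sketch of the conversion from the usual 0-255 values (the helper name is hypothetical, and it is plain Python with no GTK dependency):

```python
def rgb8_to_gdk_channels(red, green, blue):
    """Scale 0-255 channel values to the 0-65535 range gtk.gdk.Color expects."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError('channel out of range: %r' % (value,))
    # 255 * 257 == 65535, so full intensity maps exactly to the 16-bit maximum
    return (red * 257, green * 257, blue * 257)
```

The result should be usable with the red, green, blue form of alloc_color, e.g. colormap.alloc_color(*rgb8_to_gdk_channels(255, 0, 0)), in place of the 'yellow' name above.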

Hope you find this useful!