Concept Map playlist visualization generated by Exaile 3.4 beta3

July 15th, 2014

At work, I’ve been playing a little bit with visualization of data in D3. After hours, I DJ at local Lindy Hop dance events, and it occurred to me recently that it could be interesting to visualize my playlists to understand more about what I actually played. Using Exaile, I’m able to easily add a lot of metadata to the tracks in my collection, and its plugin framework made it easy to take that data and do something interesting with it.

From that initial investigation, I’ve built in a playlist visualization templating engine plugin for Exaile that can do some very simple visualization stuff, and I’m planning on extending the types of things I can do with it. Here’s an interesting one I created from a recent playlist. This visualization was inspired by The Concept Map, except the source code for this isn’t minified. 🙂

Check out the visualization at (works best in Chrome).

Transparently use avro schemata (.avsc) files in a python module

June 8th, 2014

One of the cool things about avro is that it has bindings in a couple of different languages. However, I think the only one that has native code generation support for working with avro objects is Java, which makes working with avro in the other languages a bit harder. Here’s a simple way to load your schemata dynamically (and if you don’t want to write your schemata by hand, then using maven you can generate it from AVDL files using the tip from my previous post).

What this bit of code does is overrides the __getattr__ function on the module, so anytime you try to access a type on the module, it will attempt to load the avro schema from a file of the same name with the avsc extension. To use this code, create a file called in your directory of .avsc files, and paste the following code in.

import sys
from os.path import join, dirname
import avro.schema

class AvroSchemaLoader(object):
        This object allows us to lazily load schemata files in the current
        directory and parse them as needed.
        It is intended to be used as a replacement of the current module in
        sys.modules, so usage of this object should be transparent to users.
        For example, to access the Foo wrapper object, you would do the
            >>> from this_dir_name import Foo
            >>> print type(Foo)
            <avro.schema.RecordSchema at ...>
    def __init__(self, module):
        # things break in odd ways if you don't keep a reference to the module here
        self.__module = module  

    def __getattr__(self, name):
        if name.startswith('__'):
            return object.__getattr__(self, name)
        with open(join(dirname(__file__), '%s.avsc' % name), "r") as fp:
            schema = avro.schema.parse(
        setattr(self, name, schema)
        return schema

# Replace this module instance with the dynamic loader
sys.modules[__name__] = AvroSchemaLoader(sys.modules[__name__])

There’s a lot you can do to make this better — like load a wrapper around the schema instead of using the schema directly. I’ll leave that as an exercise for the reader. 🙂

Automatically generating avro schemata (avsc files) using maven

June 8th, 2014

I’ve been using avro for serialization a bit lately, and it seems like a really useful, flexible, and performant technology. To use avro containers, you have to define a schema for them — but writing out JSON files is a bit of a pain. Avro provides an IDL that you can use to specify the object types instead, and it’s much easier to work with. The avro-maven-plugin is quite useful because you can automatically generate Java objects from the IDL files — but what if you’re working with the same Avro files in a different language that can’t use the IDL?

Until they add the functionality to the maven plugin, there’s an easy way you can automate this yourself using a bit of maven magic. First thing to do is add the following dependency to your project’s pom.xml


Next, you need to add a simple little class that converts an entire directory from avdl files to avsc files. Avro-tools ships with a useful class called IdlToSchemataTool that will convert a single file for you, so converting an entire directory is just a simple wrapper around that. There is a bit of improvement that could be done here, but this gets the job done assuming your directory only has avdl files in it.

package main;

import java.util.ArrayList;
import java.util.List;

import org.apache.avro.tool.IdlToSchemataTool;

 * Converts an entire directory from Avro IDL (.avdl) to schema (.avsc)
public class ConvertIdl {

	public static void main(String [] args) throws Exception {
		IdlToSchemataTool tool = new IdlToSchemataTool();
		File inDir = new File(args[0]);
		File outDir = new File(args[1]);
		for (File inFile: inDir.listFiles()) {
			List<String> toolArgs = new ArrayList<String>();
			toolArgs.add(outDir.getAbsolutePath());, System.out, System.err, toolArgs);

Finally, you add the following to the plugins section of pom.xml to actually generate the avsc files. This uses the exec-maven-plugin to run the class we created above during compilation. This configuration assumes that you are storing your avdl files in src/main/avro, and that you want to place the files in schemata. Obviously you can reconfigure this however you want.


And that’s it! To actually convert your avdl files to avsc files, run ‘mvn compile’ and the output directory should be filled with avsc files containing the JSON schema for your avro containers. Hope this helps you out, let me know if you find any bugs or have improvements.

Downloading and uploading dashboards to/from a graphite server

April 24th, 2014

I’ve been using graphite and statsd lately, and over lunch quickly whipped up this script to download/upload dashboard configurations to/from a graphite server. Here’s the code, use as you wish.

#!/usr/bin/env python
# Author: Dustin Spicuzza
# Use this script to download/upload dashboards to/from a graphite server.

import json

import sys
import urllib
import urllib2

download_url = 'http://%s/dashboard/load/%s'
upload_url = 'http://%s/dashboard/save/%s'

def download(ip, name, fname):
    uf = urllib2.urlopen('http://%s/dashboard/load/%s' % (ip, name))

    data = json.loads(urllib2.unquote(
    data_str = json.dumps(data['state'], sort_keys=True, indent=4, separators=(',',': '))

    if fname == None:
        print data_str
        with open(fname, 'w') as fp:

    return 0

def upload(ip, name, fname):

    if fname is None:
        data_str =
        with open(fname, 'r') as fp:
            data_str =

    data = json.loads(data_str)

    if data['name'] != name:
        print "ERROR: dashboard name doesn't match name in state file"
        return 1

    post_data = urllib.urlencode([('state', json.dumps(data))])

    request = urllib2.Request(upload_url % (ip, name), post_data)
    request.add_header("Content-type", "application/x-www-form-urlencoded")

    print urllib2.urlopen(request).read()
    return 0

def usage():
    print "Usage: [upload graphite_host name filename] | [download graphite_host name [filename]]"

if __name__ == '__main__':

    if len(sys.argv) < 4:

    action = sys.argv[1]
    ip = sys.argv[2]
    name = sys.argv[3]
    fname = None

    if len(sys.argv) > 4:
        fname = sys.argv[4]

    if action == 'upload':
        retval = upload(ip, name, fname)

    elif action == 'download':
        retval = download(ip, name, fname)



Global shared development folder on Vagrant

December 20th, 2013

Vagrant is pretty awesome for development. One thing that I’ve ran into is that I use a lot of vagrant instances at various times, and much of the time I want to access my development files from inside the VM. One thing that is nice about vagrant is that by default it maps the folder where the Vagrantfile is located to /vagrant inside the VM. However, most of the time the content I want to access isn’t in that folder, so I found a good way to allow me to access stuff without needing to copy content all over the place.

What you can do is setup a global Vagrantfile, and all of the VMs that are stood up by your username will get the settings inside that VM. Just create a file  ~/.vagrant.d/Vagrantfile so it looks like the following. This will map some local folder to /src on the vagrant VM — but of course, you should set the paths to values that make sense for you.

Vagrant.configure("2") do |config|

    config.vm.provider :virtualbox do |vbox, override|

        # path on your local machine
        host_folder_name = "~/local/path/to/somewhere"

        # path where the local folder is mapped to inside the VM
        vm_folder_name = "/src"

        # In newer versions of Vagrant, you should use "type" otherwise
        # you may find it rsyncing your computer to the VM
        override.vm.synced_folder File.expand_path(host_folder_name), vm_folder_name, type: "virtualbox"


Of course, if you set something like this up, definitely use the vagrant-rekey-ssh plugin to make sure that nobody else is able to access your VM via SSH using the default insecure vagrant keys.

Beware of using Vagrant VMs on a bridged network

December 10th, 2013

I love Vagrant. If you haven’t used it, Vagrant is pretty awesome. It lets you manage VMs + configurations really easily, and it’s a great development tool to use when you need to do rapid iterative development on disposable environments.

However, last week I realized that because of the way it’s implemented, all vagrant VMs share the same set of credentials to access them. Since most people only do local development with them, this is mostly ok (though it could be used to jump process boundaries and escalate privileges if you were creative and already running on the box) — however, in a bridged configuration, this is a huge security vulnerability as *anyone* on the internet could potentially use this key to SSH into your VM.

Since some of the stuff I do involves using bridged VMs, I wrote a Vagrant plugin to fix this problem. It replaces the default vagrant SSH key with one randomly generated for your user on that host. If you want to fix this security vulnerability on your vagrant installation, just do:

vagrant plugin install vagrant-rekey-ssh

Hope you find it useful! The github site for the plugin has more useful information if you want to read more.

Tool to implement ruby-inspired DSLs in python

November 17th, 2013

This weekend, I created library to aid developers in creating neat little embedded DSLs when using python, without having to do any complex parsing or anything like that. The resulting DSLs look a bit more like english than python does.

The idea for this was inspired by some ruby stuff that I’ve been using lately. I’ve been using ruby quite a bit lately, and while I am still not a huge fan of the language, I do like the idea of easy to code up DSLs that I can use to populate objects without too much effort. Since I wanted a DSL for a python project I was working on, I played with a few ideas and ended up with this tool. Read the rest of this entry »

Versioned chef environments

November 3rd, 2013

I was just recently introduced to chef, and it has turned out to be a pretty useful tool for automating infrastructure. One feature that I’m using in our environment is chef environments. A bunch of developers are all updating this file, and their cookbooks depend on attributes in it*.  I keep getting bit by various user errors that all could be solved if a cookbook could state ‘make sure you have at least this version of the environment’.

I’m sure there’s a good reason for it, but chef does not currently support versioning environments. To work around this problem, I created a cookbook with a library function that can check a node attribute and compare the version information there to determine if the environment is out of date. It’s a bit of a hack for now, but it gets the job done.

I thought someone else might find this useful, so I uploaded the code to github and to the opscode community site. If you have a need for versioned environments in chef, this might work for you too. Let me know if you find it useful!

*Yes, this could be considered an anti-pattern, but for our use case using environments to override attributes makes perfect sense, and allows us to not have to fork every community cookbook that we want to use.

Another Python SIP wrapper for the tesseract OCR library

July 26th, 2013

Tesseract is a pretty decent open source OCR engine that was developed by HP back in the day, but is now maintained as open source by Google. It has a C++ API that you can program to, and as you would expect there are a number of wrappers (of varying quality) that allow you to use libtesseract from Python. For various reasons, none of those fit my needs, so I created my own SIP-based wrapper instead. I will not be maintaining this as a format project, but if you want an apache licensed python wrapper, you can find the code on github. 🙂

GitHub repository for Python Tesseract SIP wrapper

Award Winning FIRST Robotics Robot Control Interface

April 8th, 2013

KwarqsDashboard is an award-winning control system developed in 2013 for FIRST Robotics Team 2423, The Kwarqs. For the second year in a row, Team 2423 won the Innovation in Control award at the 2013 Boston Regional. The judges cited this control system as the primary reason for the award.

It is designed to be used with a touchscreen, and robot operators use it to select targets and fire frisbees at the targets. Check out the shiny screenshot below:

Kwarqs Dashboard ScreenshotThere are a lot of really useful features that we built into this control interface, and once we got our mechanical and electrical bugs ironed out the robot was performing quite well. We’re looking forward to competing with it at WPI Battle Cry 2013 in May!  Here’s a list of some of the many features it has:

  • Written entirely in Python
    • Image processing using OpenCV python bindings, GUI written using PyGTK
    • Cross platform, fully functional in Linux and Windows 7/8
  • All control/feedback features use NetworkTables, so the same robot can be controlled using the SmartDashboard instead if needed
    • SendableChooser compatible implementation for mode switching
  • Animated robot drawing that shows how many frisbees are present, and tilts the shooter platform according to the current angle the platform is actually at.
  • Allows operators to select different modes of operation for the robot using brightly lit toggle switches
  • Operators can choose an autonomous mode on the dashboard, and set which target the robot should aim for in modes that use target tracking
  • Switches operator perspective when robot switches modes
  • Simulated lighted rocker switches to activate robot systems
  • Logfiles written to disk with errors when they occur

Target acquisition image processing features:

  • Tracks the selected targets in a live camera stream, and determines adjustments the robot should make to aim at the target
  • User can click on targets to tell the robot what to aim at
  • Differentiates between top/middle/low targets
  • Partially obscured targets can be identified
  • Target changes colors when the robot is aimed properly

Fully integrated realtime analysis support for target acquisition:

  •  Adjustable thresholding, saves settings to file
  • Enable/disable drawing features and labels on detected targets
  • Show extra threshold images
  • Can log captured images to file, once per second
  • Can load a directory of images for analysis, instead of connecting to a live camera

So, as you can see, lots of useful things. We’re releasing the full source code for it under a GPL license, so go ahead, download it, play with it, and let me know what you think! Hope you find this useful.

A lot of people made this project possible:

  • Some code structuring ideas and PyGTK widget ideas were derived from my work with Exaile
  • Team 341 graciously open sourced their image processing code in 2012, and the image processing is heavily derived from a port of that code to python.
  • Sam Rosenblum helped develop the idea for the dashboard, and helped refine some of the operating concepts.
  • Stephen Rawls helped refine the image processing code and distance calculations.
  • Youssef Barhomi created image processing stuff for the Kwarqs in 2012, and some of the ideas from that code were copied.

The included images were obtained from various places:

  • Linda Donoghue created the robot image
  • The fantastic lighted rocker switches were created by Keith Sereby, and are distributed with permission.
  • The green buttons were obtained via google image search, I don’t recall where

Download it on my FRC resources page!