Archive for September, 2007

Google Documents

Tuesday, September 25th, 2007

While working on the “Green Space” project (which really isn’t about green space) we’ve been using Google Documents extensively to collaborate on press releases, agendas, and other random things. It’s been AWESOME, despite the fact that the user interface for Google Docs really sucks, IMHO.

Ironically, before I actually used it, I couldn’t see why anyone would want to replace their desktop word processor with something like Google Docs. Now I can see why to some extent, though I can’t imagine using it for EVERYTHING. The best and most useful feature is that multiple people can work on the same document at the same time, which is pretty freaking awesome for projects like ours.

The “Green Space” @ WMU

Wednesday, September 19th, 2007

I haven’t been blogging much lately since I’ve been involved with a HUGE project protesting an ugly parking lot at my University, and it’s been taking up all of my time that isn’t already taken up with schoolwork.

We created a Facebook group which has over 800 members now. Facebook has been a great collaboration tool and a good way for us to find interested volunteers to help us. Also, the viral nature of Facebook has allowed us to reach people we ordinarily couldn’t reach.

On Monday night, a group of volunteers gathered at the ugly parking lot in question and proceeded to mark EVERY element of the parking lot with its price using chalk. We also proceeded to chalk around the area to draw attention to the chalk in the parking lot.

The response has been simply phenomenal! We’ve appeared on TV, had newspaper articles, and been invited to radio shows. We’ve had students and staff tell us that not only were they amused by the project, but also that it was very informative. It’s been quite a fun experience for me.

I highly recommend the website I created for it. Check it out at http://greenspace.virtualroadside.com/

Anatomy of a Boing Boing link using OWS

Tuesday, September 11th, 2007

My roommate just got a page of his linked to by Boing Boing, so I just added a better heatmap function to OWS to do some better visual analysis of the hit. As you can see, the results are quite nice.

OWS Heatmap of jonathanryan.org

The initial traffic spike peaked at 1:00pm on 9/8 with roughly 639 visitors that hour. Traffic then fell off until the end of the day, spiked again as people woke up at 9 or 10 the next day, and then fell back to mostly normal levels.

A more interesting observation is the spike in bot/crawler traffic, as shown below.

OWS Heatmap (bot) jonathanryan.org

Apparently people aren’t the only ones who follow links on popular sites such as Boing Boing. 🙂

This heatmap is not yet in the latest release of OWS, but it is stable and available in SVN right now. OWS 0.8.0.4 will include it, and will hopefully be released by the weekend if I have time.

VMWare and my carputer

Wednesday, September 5th, 2007

The biggest problem with the carputer I have is that it’s almost impossible to configure easily, especially when it’s mounted in my car. So, what I’ve done is use VMWare to create a virtual machine that runs Linux, and then I used rsync to do a lil something like so from the virtual machine:

/usr/bin/rsync -apzv --delete --exclude=/dev --exclude=/sys --exclude=/var/log --exclude=/var/lock --exclude=/var/tmp --exclude=/var/run --exclude=/proc --exclude=/tmp -e "ssh" root@carputer_address:/ /

Of course, this command copies practically everything onto the virtual machine. It’s a great solution so far (since you can easily copy changes back); the only real challenge was that I had to recompile the kernel to support the VMWare hardware. I haven’t gotten X working yet either, but I’m pretty sure that will be trivial compared to the fact that my carputer supports SSE2 but the host computer doesn’t, so I had to do the following:

emerge -e world

after adjusting my build settings… grr. Recompiling 632 packages right now actually. Kept getting ‘invalid instruction’ errors all over the place.

Obsessive Website Statistics (OWS) analysis plugin tutorial

Saturday, September 1st, 2007

This is a short tutorial on how you can write an analysis plugin for Obsessive Website Statistics (OWS). OWS is designed first and foremost to be plugin friendly, and as you will see, adding useful functionality in the form of plugins is not hard at all, and can be done in just a few lines of code. We are going to add DNS hostname resolving to OWS.

What is an analysis plugin?

An analysis plugin performs analysis on the parsed logfile data and stores that information in the database dimensions. OWS has wrapped all of this stuff in a nice, easy-to-use abstraction layer so that you won’t need to make actual SQL queries if you don’t want to.

Implementation

All OWS plugins are implemented as PHP classes. This is the bare skeleton that all OWS plugins should define.

class OWSDNS implements iPlugin{

	// This should return a unique ID identifying the plugin, starting with a letter.
	// Use basename() instead of the bare __FILE__, otherwise the ID could expose path information.
	public function getPluginId(){
		return 'p'. md5(basename(__FILE__) . get_class());
	}

	// returns an associative array describing the plugin
	public function getPluginInformation(){

		return array(

			'pluginName' => 'Name of plugin',
			'aboutUrl' => 'http://information.about.plugin',

			'author' => 'author',
			'url' => 'http://developers.website',

			'description' => 'Description of what plugin does'
		);
	}
}

You should notice we define two functions — getPluginId() and getPluginInformation(). These must be defined by any OWS plugin, and are used to identify the plugin in a number of instances. This plugin also implements iPlugin. All interfaces are defined (with plenty of comments) in include/plugin_interfaces.inc.php. A plugin can implement as many interfaces as it needs to. There are a few types, but the one we are going to implement is iAnalysisPlugin. We will do so by changing the first part to:

class OWSDNS implements iPlugin, iAnalysisPlugin {

Additionally, we need to register the plugin with OWS so that it knows what kind of plugin you are defining. Add this to the end of your source file:

register_plugin('analysis',new OWSDNS());

An analysis plugin needs to implement the following functions:

define_dimensions
InitializeAnalysis
preAnalysis
getPrimaryNode
getAttributes
postAnalysis

All of these functions are documented in include/plugin_interfaces.inc.php if you need more comprehensive information.

Now, OWS stores data in multiple dimensions. Each dimension has a ‘primary node’, which is the main data element of the dimension and always has the same name as the dimension. Each primary node can have multiple attributes defined about it. Plugins can define new dimensions or extend existing dimensions.

Right now, OWS stores only the host address, which is an IP address representing the visitor. What our plugin needs to do is resolve this address and store the resulting hostname as an attribute of the dimension. So, we need to extend the dimension ‘host’, which we can do using the function define_dimensions().

// this function should return a set of arrays that define the dimensions
// and attributes that this plugin defines. You should not specify an attribute
// that another plugin defines. This is not website dependent.
public function define_dimensions(){

	return array(
		'host' => array(
			'hostname' => attribute_defn('varchar',254,16)
		)
	);
}

Pretty simple, eh? See, the array returned means that we are defining inside dimension ‘host’, an attribute named ‘hostname’. The function attribute_defn is used to define the SQL type that our attribute has, so the installer can create it for us. Now, we can write the actual analysis part.

At the beginning of analysis, the function InitializeAnalysis is called in case the plugin needs to do something before the analysis begins. This function is called once per website analyzed. Our plugin isn’t going to need this, so we just return true.

public function InitializeAnalysis($website){
	return true;
}

Now, after all plugins are initialized, the logfile lines are read from the logfile (or from the database in the case of an install or reanalysis). They are read in phases, each of which consists of 4 steps:

preAnalysis
getPrimaryNode
getAttributes
postAnalysis

Now, preAnalysis and postAnalysis are only called once per phase, but getPrimaryNode is typically called at least once per logfile line. Our plugin doesn’t use getPrimaryNode — getPrimaryNode is only used for plugins that define new dimensions in define_dimensions. If you don’t define a primary node, then you should return false and show an error.

It should also be noted that our plugin doesn’t need to do any preAnalysis or postAnalysis, so we can just return true.

public function preAnalysis($website,&$ids){
	return true;
}

public function getPrimaryNode($website, $dimension, $line){
	return show_error("Invalid dimension passed to plugin \"" . get_class() . "\" in getPrimaryNode!");
}

public function postAnalysis($website,&$ids){
	return true;
}

Now we get to the part that actually does the work. The function getAttributes needs to return an array representing the attributes that the plugin defines per dimension. The $dimension argument is passed in to the function, and we should only do analysis on the primary node. The contents of the primary node are passed in to the function as well. This makes sense, because attributes of the primary node should be discernible by looking only at the primary node itself. If this is not the case, then you should probably be defining a new dimension instead.

This function should return an array of attributes/values in the form of:

	array('attribute' => 'value', ...)

Note: The returned values can be cached (for performance reasons), so this function may NOT always be called for each row. You should ALWAYS return an array with the same keys each time, in the same order that you defined them in define_dimensions. Of course, if you do not define any attributes in the dimension passed in the $dimension parameter, or if there is an error, then return false.
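To get a feel for why the note above matters, here is a rough sketch of a caller that memoizes results per primary node. This is NOT OWS’s actual caching code; the function cached_attributes() and everything around it is made up for illustration. The point is only that once results are cached, your function may run once for many rows, so it should depend on nothing but the primary node and should always return the same keys.

```php
<?php
// Hypothetical memoizing caller, invented for this example.
// OWS's real caching code may look nothing like this.
function cached_attributes($pnode, $compute, array &$cache) {
	// Only call the plugin the first time a primary node is seen.
	if (!array_key_exists($pnode, $cache)) {
		$cache[$pnode] = call_user_func($compute, $pnode);
	}
	return $cache[$pnode];
}

$calls = 0;
$cache = array();

// Stand-in for a plugin's getAttributes(); always returns the same keys.
$compute = function ($pnode) use (&$calls) {
	$calls++;
	return array('hostname' => 'host-for-' . $pnode);
};

$first  = cached_attributes('192.0.2.1', $compute, $cache);
$second = cached_attributes('192.0.2.1', $compute, $cache);
// The second call is served from the cache, so $compute ran only once.
```

If the stand-in function returned different keys from one call to the next, whichever result happened to be cached first would win, which is exactly the kind of surprise the note is warning about.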

Anyways, here’s the code for this function:


public function getAttributes($website, $dimension, $pnode){

	if ($dimension != 'host')
		return show_error("Invalid dimension passed to plugin \"" . get_class() . "\" in getAttributes!");

	// return the hostname
	return array('hostname' => gethostbyaddr($pnode));

}
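One thing worth knowing about gethostbyaddr(): when the lookup fails it returns the address unchanged, and on malformed input it returns false. Below is a minimal sketch of a helper that normalizes both cases. The name ows_resolve_hostname() is hypothetical (it is not part of OWS), and storing an empty string for unresolved addresses is just an assumption about what you might want in the ‘hostname’ attribute. The resolver is passed in as a parameter so the helper can be tried out without touching DNS.

```php
<?php
// Hypothetical helper, not part of OWS.
// Returns the resolved hostname, or '' when the address could not be resolved.
function ows_resolve_hostname($ip, $resolver = 'gethostbyaddr') {
	$name = call_user_func($resolver, $ip);

	// gethostbyaddr() returns false on malformed input and the
	// unmodified address when the reverse lookup fails.
	if ($name === false || $name === $ip) {
		return ''; // assumption: store an empty hostname for unresolved hosts
	}
	return $name;
}
```

With something like this in place, getAttributes could return array('hostname' => ows_resolve_hostname($pnode)) instead of calling gethostbyaddr() directly. Keep in mind that reverse DNS lookups block while they wait for an answer, so analysis of a large logfile may be noticeably slower with this plugin installed.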

And that’s it! Wasn’t that easy? Of course, there are a lot more useful things we could probably implement to make this more polished. Now, after you install the plugin and run the analysis, the only filter you’ll be able to use on your new dimension attribute in the web interface is the manual analysis, since it allows analysis on all defined dimensions. But it would be a pretty trivial matter to either modify an existing filter plugin or create a new filter plugin. We’ll discuss this in the future.

Hope this helps you out. If you need help with OWS, or developing for OWS, don’t hesitate to ask! Leave your comments, or join the obsessive-compulsive mailing list!

Download this
Obsessive Website Statistics Website