Archive for July, 2007

Faceball

Thursday, July 26th, 2007

Heh.. this is amusing. Faceball…

“At its simplest level Faceball involves two people hitting beachballs at each others faces. At a deeper level its a vehicle for the release of personal animosity, and the Shaming of the Weak.”

I think that’s all the introduction it needs.

http://faceball.org/

http://flickr.com/groups/faceball/

MySQL and indexes

Thursday, July 26th, 2007

I was able to obtain a 1.6GB Apache combined logfile from a colleague, and have been using it to see how well OWS performs on logs of this size. Unfortunately, it looks like OWS does not work well with this much data. What makes it really disappointing is that the site in question only gets around 1000-5000 unique visitors a day.

The biggest performance-related problem right now is that MySQL is ignoring the indexes that I have set up. Through some research, apparently this is an InnoDB-related problem where it tries to use the primary key for everything instead of using the secondary indexes. I’ve seen evidence of this myself: index usage is normal on my tables with only 100,000 rows or so, but on the table with 6.5 million rows MySQL tries to use the primary key (and performs a table scan, which is definitely BAD). When I use FORCE INDEX things seem to work better, but I can’t imagine that’s the proper way to do it. What I’m going to do is try to use clustered indexes, with the date as the primary key (since almost every single query deals with the date in some way), and see what kind of performance increase I get.

I think when it comes right down to it though, using a flat MySQL table ends up having the same types of problems you have with flat files — browsing gigabytes of data is slow. Of course, some of this can probably be eliminated with better queries, but I haven’t quite figured out how to do that.

All of this analysis stuff has led me to examine OLAP and other multidimensional ways of representing this kind of data. Right now, I’m thinking I want to redo the backend storage model of OWS so it’s faster and more efficient, using a different type of data representation (still in MySQL) while maintaining the same easy-to-use interface.

By the way, a great link I’ve found that’s helped me with some of these random issues is http://www.xaprb.com/blog/ , though most of the useful articles were published last year when he wasn’t so busy. I encourage you to check it out.

Apparently Microsoft uses Firefox too!

Tuesday, July 24th, 2007

Haha… I was browsing the Facebook developer site and they linked to Microsoft’s website in relation to MS’s Popfly application. Apparently it has support for Facebook application development or some such thing. Well, if you take one good look at the screenshot they were using…

popfly.jpg

Recognize those tabs? It’s Firefox! Apparently Microsoft likes Firefox too. 😀 Of course, we probably already knew that, but this is further proof!

Link: http://msdn.microsoft.com/vstudio/express/showcase/

Image Link: http://msdn.microsoft.com/vstudio/express/images/facebook/popfly.jpg

Optimizing a really nasty SQL query

Monday, July 23rd, 2007

Ok, so while working on Obsessive Website Statistics (OWS), I’ve hit a situation. See, OWS tries to be semi-intelligent and combine the parameters of all the installed plugins into really giant/nasty SQL queries that make you shudder, but tend to work.

So right now, I’m trying to select the following (at the same time):

  • All pages, grouped
  • COUNT() of all hosts
  • COUNT() of all pages that end with .html, .htm, .php, /
  • COUNT() of all pages
  • SUM() of filesizes

So of course, getting the 1st, 2nd, 4th, and 5th items is pretty easy. However, the third item is getting annoying. I tried using a subquery, but considering the table is 100,000 rows this is a particularly slow query:

SELECT 
	virtualroadside_com.request_str,
	COUNT(DISTINCT virtualroadside_com.host),
	c.b,
	COUNT(virtualroadside_com.filename),
	SUM(virtualroadside_com.bytes) 
FROM 
	(	
		SELECT 
			virtualroadside_com.request_str AS a,
			COUNT(virtualroadside_com.id) AS b 
		FROM 
			virtualroadside_com 
		WHERE 
			(virtualroadside_com.filename LIKE '%.html' 
			OR virtualroadside_com.filename LIKE '%/' 
			OR virtualroadside_com.filename LIKE '%.htm' 
			OR virtualroadside_com.filename LIKE '%.php') 
		GROUP BY virtualroadside_com.request_str 
		ORDER BY virtualroadside_com.request_str DESC 
		LIMIT 0,100
	) c,
	virtualroadside_com 
WHERE 
	virtualroadside_com.request_str = c.a
GROUP BY virtualroadside_com.request_str 
ORDER BY virtualroadside_com.request_str DESC 
LIMIT 0,100;

So, the big question here is: Are there better ways to accomplish this sort of thing without using subqueries? This particular query takes around 40 seconds to execute on 105,000 rows on my computer (Dual PIII, 500 MHz). I’m positive there’s a way to do this with a JOIN of some kind, but I can’t get any of those to work correctly either. Let me know if you have any good ideas! Hopefully I’ll publish a better way to do this in the next few days, once I figure it out. 🙂
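One possible way to avoid the subquery (a sketch, not necessarily what OWS ended up doing) is conditional aggregation: wrap the filename test in a CASE expression and SUM it, so all five values come out of a single pass over the table. The Python below runs that query against an in-memory SQLite database purely as a stand-in for MySQL. The trimmed-down table layout, the sample rows, and the `page_stats` helper are all made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (host, request_str, filename, bytes).
# Only the columns used by the query above are modelled here.
ROWS = [
    ("10.0.0.1", "/index.html", "/index.html", 1200),
    ("10.0.0.2", "/index.html", "/index.html", 1200),
    ("10.0.0.1", "/index.html", "/index.html", 1200),
    ("10.0.0.1", "/logo.png", "/logo.png", 300),
    ("10.0.0.3", "/blog/", "/blog/", 5000),
]

# Single-pass version: the CASE expression yields 1 for "page" hits and 0
# otherwise, so SUM() counts them without a subquery or self-join.
QUERY = """
SELECT
    request_str,
    COUNT(DISTINCT host) AS hosts,
    SUM(CASE WHEN filename LIKE '%.html'
               OR filename LIKE '%.htm'
               OR filename LIKE '%.php'
               OR filename LIKE '%/'
             THEN 1 ELSE 0 END) AS page_hits,
    COUNT(filename) AS hits,
    SUM(bytes) AS total_bytes
FROM virtualroadside_com
GROUP BY request_str
ORDER BY request_str DESC
LIMIT 100
"""


def page_stats(rows):
    """Load rows into a throwaway SQLite table and run the grouped query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE virtualroadside_com ("
        "id INTEGER PRIMARY KEY, host TEXT, request_str TEXT, "
        "filename TEXT, bytes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO virtualroadside_com (host, request_str, filename, bytes) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in page_stats(ROWS):
        print(row)
```

One difference worth knowing about: the subquery version only returns requests that have at least one matching page hit, while this version returns every request and reports 0 for the ones that don't match. Whether it is actually faster on the real table would still need to be measured.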

10 things you can do now to improve the security and performance of your Windows PC

Thursday, July 19th, 2007

Let’s face it, secure and Windows are two words almost never used together, and for good reason. Security on Windows has traditionally been extremely horrible, and has singlehandedly brought about the rise of the antivirus industry and its billions of dollars of revenue. Another problem with Windows is that it gets slow over time. If your computer is slow, most of the time it’s because of one of the following reasons:

  1. You have spyware, viruses, or both
  2. You have too many startup programs
  3. You have too many ‘temp’ files
  4. Your temporary internet files folder has too many files (Internet Explorer)
  5. You have toolbars or other browser helper objects installed
  6. You have antivirus installed (this has a big impact on performance)
  7. You’re using Windows Vista

However, it is possible to live (relatively) securely using a Windows PC with decent performance. Here are some useful tips (most of which you should already know, and some that may surprise you).

Security Tips

  • Do your Windows Updates. This should be a given. If you don’t, you are asking for trouble. Sure they’re annoying at times, and can even cause problems, but you better do them. Personally, I do the optional updates too, just in case.
  • Disable Windows file sharing and the Remote Registry service. If you don’t use it, then disable it. Most users don’t share things on the network, so this isn’t a problem. File and Printer sharing can be dangerous to have on a public network, as shown by the next item.
  • Have a strong Administrator password. Many viruses/exploits rely on the fact that you have file and printer sharing enabled and that the Administrator password is blank. See, Windows shares out all of your hard drives to the world with names like C$ or ADMIN$, and by default makes them available to anyone who has the Administrator password. Check for yourself: Click Start -> Run -> type ‘fsmgmt.msc’ and click on ‘Shared Folders’. I bet you didn’t know those were being shared. If you’re really paranoid, rename the Administrator account.
  • BACK YOUR STUFF UP. Seriously, everyone says to do this, but nobody does it regularly. The best defense you can have against malware and viruses is the ability to restore your information anytime you need to without losing too much data. There’s really no excuse with how cheap CD-R and DVD-R media/burners are now.

Performance Tips

  • Don’t use antivirus. Get rid of it. Conventional wisdom says that you should use antivirus on your Windows PC to make sure that you don’t get any nasty viruses. I disagree. Not only does antivirus routinely fail to protect your machine from viruses (especially new ones), but most vendors’ products slow down your computer a TON, and cause even more problems. I have not had antivirus on my PC for over 5 years, and I have never had problems with viruses or spyware. The key to keeping viruses off your computer is surprisingly simple: don’t visit questionable websites, don’t download questionable software or attachments, and do your Windows Updates. And if you happen to make a mistake, you’ve got those backups, right?
  • Remove Viruses and Spyware. C’mon, you don’t have these on your computer, right? NEVER pay money for an anti-spyware product; there are way too many free resources out there to take care of the problem for you (there are also a LOT of bad free anti-spyware products, so beware!). My small howto details some programs that you can use to do the job manually without too much trouble.
  • Delete those temporary files. Having too many temporary files can slow your computer down a lot, especially on startup and shutdown. For some reason, Windows can sometimes accumulate thousands of files in the temp directory. I’ve seen up to 10,000 files before in someone’s temp directory. Wait, you say, I thought that ‘temporary’ means ‘temporary’? No, not in Windows. Most of the trash is stuff left behind by lazy installers or careless third party programs. I recommend you use a program like Pocket Killbox (http://killbox.net/) to delete the files; it’s much quicker than doing it manually. Also, don’t forget about those temporary internet files. If you’re not using Firefox (you are, aren’t you?), then Internet Explorer can get extremely slow because of its inefficient caching methods. Pocket Killbox can help you here too.
  • Stop programs from running on startup. There are a lot of free tools you can use to examine the programs that start up when your computer does, and more (MSCONFIG even comes with Windows, despite being rather annoying to use). HijackThis and AutoRuns are excellent programs to use for this purpose. Refer to my antivirus and spyware howto for good ways of doing this.
  • Delete your system restore points (All except the most recent one), or just disable it completely. In my experience, System Restore rarely ever fixes problems (though, I’ve heard rumors to the contrary). It just wastes 10% of your hard drive.
  • Defrag regularly. This isn’t as important as it used to be, but it’s still a good way to keep your system running smoothly. I schedule mine to run every night around 4am when I’m not using my computer.

I hope this helps you out. Let me know what you do to improve the performance and security of your system!

TiddlyWiki

Thursday, July 19th, 2007

A friend just gave me a link to a piece of software called TiddlyWiki. Besides the neat name, it is actually a ridiculously useful piece of software. It’s a Wiki inside of an HTML file. Seriously. So you can carry a wiki around with you anywhere, and edit it at any time in any (modern) browser without requiring server software.

Anyways, it’s way cool, and I’ve already thought of a million different uses for it… though really, you have to use it to understand why it’s so freaking sweet. Go there. Now.
http://www.tiddlywiki.com/

Obsessive Web Statistics: Open Source Web 2.0 Website Statistics System

Wednesday, July 18th, 2007

For the last month or so, I’ve been working on a new PHP/MySQL/jQuery web application that I’ve decided to call “Obsessive Web Statistics” (OWS). A project has been created on Sourceforge for it, and I’m happy to finally announce the first file release for OWS! There are a number of features about OWS that give it an advantage over existing website statistics software.

Instead of generating static HTML reports like most website statistics programs, OWS takes your Apache logfiles and puts them into a MySQL database. A dynamic jQuery-driven interface with a PHP backend allows you to manipulate the data and display it in useful ways. The interface is mostly intuitive and simple to use while providing powerful options to manipulate the data. For more information, you can visit the Sourceforge page or subscribe to the obsessive-compulsive mailing list (how crazy of a name is that? LoL. FYI: It’s only about development and help for OWS). An archived version of the list is available at Sourceforge as well.

Links:

Sourceforge website: http://obsessive.sourceforge.net

Demo Site: http://ows.mattas.net/

Many thanks to my friend Tony Mattas (http://www.mattas.net) for providing hosting for the OWS demo site!

A lesson well learned: Don’t seek() on STDIN

Wednesday, July 18th, 2007

The neat thing about fread() and friends in most languages is that you can open up STDIN and pretend it’s a file. However, the following code doesn’t work.. hehe.

<?php
	$file = $argv[1];

	if ($file != '-' && !file_exists($file))
		die("File $file does not exist, foo!\n");
	if ($file == '-')
		$file = 'php://stdin';

	$f = fopen($file,'r');

	// crazy code in here

	// ok, go backwards
	fseek($f, $some_position);

	// Heh, it doesn't always work!?

?>

Yeah, I know it seems quite obvious from this small snippet, but for the record the fopen() and the fseek() were separated by quite a few lines of code and I didn’t quite realize that I was doing that… lol.
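One common workaround, shown here as a Python sketch rather than a fix for the PHP above, is to copy STDIN into a temporary file first and then seek on that instead. The function names `open_seekable` and `spool_to_tempfile` are made up for this example, and the approach assumes the input is small enough to buffer on disk.

```python
import shutil
import sys
import tempfile


def spool_to_tempfile(stream):
    """Copy a (possibly non-seekable) binary stream into an anonymous
    temporary file and rewind it, so the caller can seek() freely."""
    tmp = tempfile.TemporaryFile()
    shutil.copyfileobj(stream, tmp)
    tmp.seek(0)
    return tmp


def open_seekable(path):
    """Open path for binary reading; '-' means STDIN, buffered to disk."""
    if path != "-":
        return open(path, "rb")
    return spool_to_tempfile(sys.stdin.buffer)


if __name__ == "__main__":
    # Usage: python script.py logfile   or   cat logfile | python script.py -
    with open_seekable(sys.argv[1]) as f:
        first_line = f.readline()
        f.seek(0)  # going backwards is safe now, even for piped input
        assert f.readline() == first_line
```

The trade-off is that the whole input has to be read before any processing starts, so this is no good for input that never ends.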

jQuery snippet:

Thursday, July 12th, 2007

Here is a neat jQuery snippet that you can use to create extra elements in a form, underneath (mostly) the previous element. It took longer than it should have to create this since I didn’t quite understand the behavior of “before” and “after”… i.e., a parent node has to exist for them to work.

<input name="field[]" type="text" /> <a href="#" onclick="$(this).before('<br/>').before($(this).prev().prev().clone()); return false;">+</a>

This snippet works especially well inside of a table. 🙂 Unfortunately, I don’t have a demo of the code since I don’t have jQuery enabled on this blog.. yet. 🙂

Visual jQuery: Awesome jQuery Reference

Thursday, July 12th, 2007

This is seriously the coolest form of documentation that I’ve seen. Implemented using jQuery by jQuery developer Yehuda Katz, Visual jQuery presents the API documentation for jQuery as an expanding tree of categorized nodes. It’s quite innovative, IMHO. If you develop using jQuery, it’s definitely a plus to have around. And there is a downloadable copy of it available too. 🙂

http://www.visualjquery.com/