Tip: Pointing pylibmc at a unix domain socket

This is a quick tip that will hopefully save someone else at least a few minutes. If you’re using Django 1.3 with pylibmc and want to point at a local unix domain socket for memcached, this is what your CACHES setting will look like:

CACHES = {
        'default': {
                'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
                'LOCATION': '/tmp/memcached.sock',
        }
}

This also assumes that you have configured memcached to use a unix domain socket (with the -s option in memcached.conf).
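Before pointing Django at the socket, it can help to confirm that memcached actually created it. Here is a minimal sketch using only the Python standard library. The helper name is_unix_socket is made up for illustration, and the path is simply the one from the setting above.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if `path` exists and is a unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file, permission problem, etc.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    # Assumed path: the same one used in the CACHES setting above.
    socket_path = "/tmp/memcached.sock"
    if is_unix_socket(socket_path):
        print("memcached socket found at", socket_path)
    else:
        print("no socket at", socket_path, "(is memcached running with -s?)")
```

If the check fails, the usual suspects are memcached not running, a different path passed to -s, or permissions on the socket file.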

Rabbits for the celery

I run an Arch Linux desktop as my primary development workstation. We use celery pretty heavily on some of our Django projects, and I was working to get my local environment at least somewhat closer to our production setup, only to find there isn’t a non-AUR package for rabbitmq-server.

Of course you can just download an AUR package, but that’s not nerdy enough. I managed to get a GitHub project to act as a pacman repository, and will share it here in case anyone else would like to stay up to date with rabbitmq-server without mucking with the AUR packages themselves.

Just to be clear, there is no real benefit to installing from my repository, other than not having to download PKGBUILDs and mess with compiling/installing yourself. I just yanked the highest-voted rabbitmq-server package off of AUR and sucked it into a repository.

If there are any other common packages that you use that I might also use, comment on here and I might be convinced to pull them in.

Repository: https://github.com/duointeractive/archduo

Tamarin 1.1 released

Tamarin 1.1 was released to account for the fact that there are rare cases where a user’s client doesn’t identify itself at all. I had accounted for one form of this, but failed to handle another. You’ll want to update to prevent ParseException from being raised if your parser runs into this.

Tamarin is a drop-in Django app that is used to parse S3 access log buckets. This is useful for getting the logs into a medium (a DB) that can be more easily queried, filtered, sorted, and so on.

PyPI page is here: http://pypi.python.org/pypi/tamarin/

Sources are on GitHub: https://github.com/duointeractive/tamarin

S3 access log parsing/storage with Tamarin

We have been helping one of our clients move their massive collection of audio and video media to S3 over the last few weeks. After most of the files were in place, we saw that our usage reports for one of the buckets were showing much higher usage than expected. We ran some CSV usage report dumps to try to get a better idea of what was going on, but found ourselves wanting more details. For example:

  • Who are the biggest consumers of our media? (IP Addresses)
  • What are the most frequently downloaded files?
  • Are there any patterns suggesting that we are having our content scraped by bots or malicious users?
  • How do the top N users compare to the average user in resource consumption?

Enter: Bucket Logging

One of S3’s many useful features is Server Access Logging. The basic idea is that you go to the bucket you’d like to log, enable bucket logging, and tell S3 where to dump the logs. You then end up with a bunch of log keys that are in a format that resembles something you’d get from Apache or Nginx. We ran some quick and dirty scripts against a few days’ worth of data, but quickly found ourselves wanting to be able to form more specific queries on the fly without having to maintain a bunch of utility scripts. We also needed to prepare for the scenario where we would need to automatically block users who were consuming disproportionately large amounts of bandwidth.

Tamarin screeches its way into existence

The answer for us ended up being to write an S3 access log parser with pyparsing, dumping the results into a Django model. We did the necessary legwork to get the parser working, and tossed this up on GitHub as Tamarin. Complete documentation is linked at the end of this post.
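To give a feel for what such a parser does, here is a rough sketch that pulls the leading fields out of a single access log line. It uses the standard library's re module rather than pyparsing, so it is not how Tamarin itself is implemented. The field order follows Amazon's documented server access log format, but treat the details as an assumption to check against your own logs. The sample line and the names LOG_LINE and parse_line are made up for illustration.

```python
import re
from datetime import datetime

# Leading fields of an S3 server access log line (assumed field order).
# Quoted fields may also show up as a bare "-" when the value is absent,
# e.g. when a client sends no user agent at all.
LOG_LINE = re.compile(
    r'(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] '
    r'(?P<remote_ip>\S+) (?P<requester>\S+) (?P<request_id>\S+) '
    r'(?P<operation>\S+) (?P<key>\S+) '
    r'(?:"(?P<request_uri>[^"]*)"|-) '
    r'(?P<status>\S+) (?P<error_code>\S+) (?P<bytes_sent>\S+) '
    r'(?P<object_size>\S+) (?P<total_time>\S+) (?P<turnaround_time>\S+) '
    r'(?:"(?P<referrer>[^"]*)"|-) '
    r'(?:"(?P<user_agent>[^"]*)"|-)'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    # S3 writes "-" when no bytes were sent; treat that as zero.
    raw_bytes = record["bytes_sent"]
    record["bytes_sent"] = int(raw_bytes) if raw_bytes.isdigit() else 0
    return record


# Illustrative line only, not taken from a real bucket.
SAMPLE = (
    '0123abcd examplebucket [10/Apr/2011:12:34:56 +0000] 192.0.2.10 - '
    '3E57427F3EXAMPLE REST.GET.OBJECT audio/track01.mp3 '
    '"GET /audio/track01.mp3 HTTP/1.1" 200 - 5242880 5242880 120 15 "-" - -'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Note that the user agent comes back as None when the client did not identify itself, which is the kind of edge case a real parser has to survive.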

Tamarin contains no real analytical tools itself. It is just a parser, two Django models, and a log puller (which retrieves S3 log keys and tosses them at the parser). Our analytical needs are going to be different than the next person’s, and we like to keep apps like this as focused as possible. We very well may release apps in the future that leverage Tamarin for things like the automated blocking of bandwidth hogs we mentioned, or apps that plot out pretty graphs. However, these are best left up to other apps so Tamarin can be light, simple, and easy to tweak as needed.
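As a rough idea of the kind of analysis that could sit on top of parsed records, the sketch below totals bytes per IP address, lists the top consumers, and flags anyone well above the average. It works on plain dicts with remote_ip and bytes_sent keys rather than Tamarin's models, and the function names and the factor threshold are assumptions picked for illustration.

```python
from collections import defaultdict


def bandwidth_by_ip(records):
    """Sum bytes sent per remote IP address."""
    totals = defaultdict(int)
    for record in records:
        totals[record["remote_ip"]] += record["bytes_sent"]
    return dict(totals)


def top_consumers(totals, n):
    """Return the n (ip, bytes) pairs with the highest totals, largest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def heavy_users(totals, factor=3):
    """Return IPs whose total exceeds `factor` times the mean per-IP total."""
    if not totals:
        return []
    average = sum(totals.values()) / len(totals)
    return sorted(ip for ip, sent in totals.items() if sent > factor * average)


if __name__ == "__main__":
    # Made-up records standing in for parsed log entries.
    records = [
        {"remote_ip": "192.0.2.10", "bytes_sent": 1000},
        {"remote_ip": "192.0.2.10", "bytes_sent": 1000},
        {"remote_ip": "192.0.2.11", "bytes_sent": 500},
        {"remote_ip": "192.0.2.13", "bytes_sent": 98000},
    ]
    totals = bandwidth_by_ip(records)
    print(top_consumers(totals, 2))
    print(heavy_users(totals))
```

A single huge consumer drags the mean upward, so with real traffic a median-based threshold may be the safer choice.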

Going back to our customer with higher-than-expected bandwidth usage, we ended up finding that aside from a few bots from Nigeria and Canada, usage patterns were pretty normal. The media that was uploaded into that bucket was never tracked for bandwidth usage on the old setup, so the high numbers were actually legitimate. With this in mind, we were able to go back to our client and present concrete evidence that they simply had a lot more traffic than previously imagined.

Where to go from here

If anyone ends up using Tamarin, please do leave a comment for me with any interesting queries you’ve built. We can toss some of them up on the documentation site for other people to draw inspiration from.

Source: https://github.com/duointeractive/tamarin

Documentation: http://duointeractive.github.com/tamarin/

New IMC and IRC extensions for Evennia MUD server

Evennia, the Twisted+Django MUD server, has just finished bringing in shiny new support for IRC and IMC (Inter-MUD Communication) as of revision 1456. This allows users to bind a local game channel to a remote IRC or IMC room. Evennia transparently sends/receives messages between the game server and the remote IRC/IMC server, while the players are able to talk over said channel just like they would a normal one.

It is even possible to bridge an IRC room to an IMC channel, with the Evennia server acting as a hub for messages. The next step may be to create a Jabber extension (any takers?).

If you’re curious, feel free to drop by #evennia on FreeNode to pester the developers.