Exocet makes code reloading easy

One of the big philosophical “pillars” I’ve been building my tinker Python MUD on is that cold restarts (restart the process, clients are disconnected) should be exceedingly rare. Code re-loading in Python can be challenging, but Allen Short has nailed it with his Exocet module.

The coolest part for me (as it pertains to my game) is as follows:

“Exocet is a new way to load Python modules. It separates the act of naming a dependency from the act of creating a module object. As a result, more than one instance of a module can be created from the same source file.”

As a result of this, we can deal with Python modules as objects, and can easily replace them. This is how it ends up looking for me (lots of things removed for the sake of brevity):

import exocet
# Load the general commands module, stuff it in a variable.
general_cmds = exocet.loadNamed(
   'src.game.commands.general',
   exocet.pep302Mapper
)
# Add a command to the command table.
self.add_command(general_cmds.CmdLook())

This is an excerpt from my game’s global command table. The general command module is loaded by exocet instead of Python’s regular import statement. Since we’ve tossed the module reference in a variable, the normal rules of Python garbage collection apply.

While my game does this a little differently, running the line that populates general_cmds again would mean that the old general command module’s reference count would drop to zero, hence it would be garbage collected. In its place, you have the newly loaded general commands module with any new code updates. Here is all I’d need to do to re-load the general commands module after some modifications:

import exocet
# Re-load the general commands module, replacing the old one.
general_cmds = exocet.loadNamed(
   'src.game.commands.general',
   exocet.pep302Mapper
)
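To see the "more than one module object from the same source" idea without Exocet installed, here is a rough standard-library sketch. It is not how Exocet works internally; it just builds module objects by hand with `types.ModuleType` and `exec`. The names `load_fresh`, `SOURCE_V1`, and `SOURCE_V2` are made up for this illustration.

```python
import types

# Hypothetical stand-in for the source file of the general commands module.
SOURCE_V1 = '''
class CmdLook:
    def describe(self):
        return "You look around."
'''


def load_fresh(name, source):
    """Build a brand-new module object by running source in its namespace.

    Nothing is registered in sys.modules, so each call yields an
    independent module instance.
    """
    module = types.ModuleType(name)
    exec(compile(source, name, "exec"), module.__dict__)
    return module


# First load: keep the module in a plain variable.
general_cmds = load_fresh("general", SOURCE_V1)
old_cls = general_cmds.CmdLook

# Pretend the file was edited, then rebind the same variable.
SOURCE_V2 = SOURCE_V1.replace("You look around.", "You look around carefully.")
general_cmds = load_fresh("general", SOURCE_V2)

print(general_cmds.CmdLook().describe())
print(old_cls is general_cmds.CmdLook)
```

Note that `old_cls` still points at the class from the first load. That is exactly the kind of lingering reference that keeps old code alive after a reload.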

Some disclaimers

Of course there had to be a catch, right? I’m still learning how to best use this, but here are a few pointers:

  • Your Exocet-based imports will probably always look strange to you, because they are. However, the power you get from them is well worth the “different” look to them. This is a pretty silly point, but I am pretty silly about code cosmetics.
  • Be careful about storing references to exocet-loaded modules. If you reference an exocet module from other modules/classes, you may find yourself in a situation where the old module’s reference count never drops to zero, and it isn’t garbage collected. I get around this by using properties that look the module up on each access, replacing things that were previously instance attributes. I probably did a bad job explaining this, sorry!
  • Exocet is still in development. It is only available on Launchpad (sadface). However, some pip+bzr magic can make that less of a problem.
  • Documentation is limited to a series of blog posts by the author. They are pretty useful, but you’ll have to wade through them if you’re looking for something.
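The property trick from the second pointer can be sketched like this. It is a minimal illustration under my own assumptions, not code from the game: `LOADED`, `install`, and `CommandTable` are hypothetical names, and the modules are built with the standard library instead of Exocet so the example runs on its own.

```python
import types

# Hypothetical registry: the only place that holds the current module objects.
LOADED = {}


def install(name, source):
    """Create a fresh module from source and make it the current one."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    LOADED[name] = module


class CommandTable:
    @property
    def general_cmds(self):
        # Looked up on every access, so this object never pins an old module.
        return LOADED["general"]

    def look(self):
        return self.general_cmds.look()


install("general", "def look():\n    return 'old look'\n")
table = CommandTable()
first = table.look()

# Simulate a reload: the registry entry is replaced, the table is untouched.
install("general", "def look():\n    return 'new look'\n")
second = table.look()

print(first, "->", second)
```

Had `CommandTable` stored the module in `self.general_cmds` inside `__init__`, the second call would still run the old code, and the old module would stay alive for as long as the table did.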

Rabbits for the celery

I run an Arch Linux desktop as my primary development workstation. We use celery pretty heavily on some of our Django projects, and I was working to get my local environment at least somewhat closer to our production setup, only to find there isn’t a non-AUR package for rabbitmq-server.

Of course you can just download an AUR package, but that’s not nerdy enough. I managed to get a GitHub project to act as a pacman repository, and will share it here in case anyone else would like to stay up to date with rabbitmq-server without mucking with the AUR packages yourselves.

Just to be clear, there is no real benefit to installing from my repository, other than not having to download PKGBUILDs and mess with compiling/installing yourself. I just yanked the highest voted rabbitmq-server off of AUR and sucked it into a repository.

If there are any other common packages that you use that I might also use, comment on here and I might be convinced into pulling them in.

Repository: https://github.com/duointeractive/archduo

Tamarin 1.1 released

Tamarin 1.1 was released to account for the fact that there are rare cases where a user’s client doesn’t identify itself at all. I had accounted for one form of this, but failed to handle another. You’ll want to update to prevent ParseException exceptions from being raised if your parser runs into this.

Tamarin is a drop-in Django app that is used to parse S3 access log buckets. This is useful for getting the logs into a medium (a DB) that can be more easily queried, filtered, sorted, etc.

PyPI page is here: http://pypi.python.org/pypi/tamarin/

Sources are on GitHub: https://github.com/duointeractive/tamarin

Minecraft, Python, and nerdery

A little over a month ago, I was finally pulled into the rapidly growing thing that is Minecraft. Like many of you, I ended up happily breaking blocks and constructing crude huts and castles into the wee hours of the night.

As is the case with many other things I enjoy, I found myself wondering “What Python nerdery can I get into with Minecraft?” Much like the Minecraft client, the server is written in Java, which is not something I play with for fun. After some Googling around, I stumbled across the Bravo project, an effort to write a custom Minecraft server in Python. “Bingo!”

Bravo is built on top of Twisted and is aimed at being a much more efficient, extendable alternative to the “Notchian” official server. Development is still pretty early, but it is already just about suitable for those wanting to run creative servers.

Lending a hand

One thing I immediately found out about the Bravo community is that they are immensely patient and helpful with any questions or ideas. I lurk on their IRC channel (#Bravo on Freenode), and have been very impressed so far. For these and other reasons, I can strongly recommend this project for Pythonistas looking for a way to apply their talents to one of their hobbies (Minecraft!).

The Bravo issue tracker has all kinds of stuff in it waiting to be implemented, or fixed up. A lot of these are not extremely difficult, and the maintainer has been great handling pull requests and providing good feedback.

If you’re not sure where to start, or have questions, the IRC room (#Bravo on FreeNode) is great.

tl;dr version

Bravo is a custom Minecraft server written in Python. It is early in development, but is already suitable for creative stuff. The community is friendly, and you should consider perusing their issue tracker.

Source: https://github.com/MostAwesomeDude/bravo

Docs: http://www.docs.bravoserver.org/index.html

IRC: #Bravo on FreeNode

S3 access log parsing/storage with Tamarin

We have been helping one of our clients move their massive collection of audio and video media to S3 over the last few weeks. After most of the files were in place, we saw that the usage reports for one of the buckets were showing much higher usage than expected. We ran some CSV usage report dumps to try to get a better idea of what was going on, but found ourselves wanting more details. For example:

  • Who are the biggest consumers of our media? (IP Addresses)
  • What are the most frequently downloaded files?
  • Are there any patterns suggesting that we are having our content scraped by bots or malicious users?
  • How do the top N users compare to the average user in resource consumption?

Enter: Bucket Logging

One of S3’s many useful features is Server Access Logging. The basic idea is that you go to the bucket you’d like to log, enable bucket logging, and tell S3 where to dump the logs. You then end up with a bunch of log keys that are in a format that resembles something you’d get from Apache or Nginx. We ran some quick and dirty scripts against a few days’ worth of data, but quickly found ourselves wanting to be able to form more specific queries on the fly without having to maintain a bunch of utility scripts. We also needed to prepare for the scenario where we need to automatically block users that are consuming disproportionately large amounts of bandwidth.

Tamarin screeches its way into existence

The answer for us ended up being to write an S3 access log parser with pyparsing, dumping the results into a Django model. We did the necessary leg work to get the parser working, and tossed this up on GitHub as Tamarin. Complete documentation may be found here.
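To give a feel for what parsing one of these log lines involves, here is a rough sketch using only the standard library's `re` module. It is not Tamarin's pyparsing grammar, and it only covers the leading fields of the access log format. The sample line, `LOG_LINE`, and `parse_line` are made up for illustration. It also treats a `-` (quoted or bare) as "no value", which covers the client that sends no user agent at all.

```python
import re
from datetime import datetime

# Leading fields of an S3 access log line. Quoted fields may also be a bare "-".
LOG_LINE = re.compile(
    r'(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<remote_ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'(?:"(?P<request_uri>[^"]*)"|-) (?P<status>\S+) (?P<error_code>\S+) '
    r'(?P<bytes_sent>\S+) (?P<object_size>\S+) (?P<total_time>\S+) '
    r'(?P<turnaround_time>\S+) (?:"(?P<referrer>[^"]*)"|-) '
    r'(?:"(?P<user_agent>[^"]*)"|-)'
)

# Illustrative line only, not taken from a real bucket.
SAMPLE = (
    'abc123 media-bucket [06/Feb/2011:00:00:38 +0000] 192.0.2.3 - '
    '3E57427F3EXAMPLE REST.GET.OBJECT audio/track01.mp3 '
    '"GET /audio/track01.mp3 HTTP/1.1" 200 - 5242880 5242880 120 15 "-" "-"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    # Normalize missing values: an unmatched group or "-" both become None.
    record = {
        field: (None if value in (None, "-") else value)
        for field, value in match.groupdict().items()
    }
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    for field in ("status", "bytes_sent", "object_size"):
        if record[field] is not None:
            record[field] = int(record[field])
    return record


record = parse_line(SAMPLE)
print(record["remote_ip"], record["bytes_sent"], record["user_agent"])
```

Returning `None` for unparseable lines is a design choice for this sketch; a real importer might prefer to log or raise instead, so that bad lines don't vanish silently.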

Tamarin contains no real analytical tools itself. It is just a parser, two Django models, and a log puller (retrieves S3 log keys and tosses them at the parser). Our analytical needs are going to be different than the next person’s, and we like to keep apps like this as focused as possible. We very well may release apps in the future that leverage Tamarin for things like the automated blocking of bandwidth hogs we mentioned, or apps that plot out pretty graphs. However, these are best left up to other apps so Tamarin can be light, simple, and easy to tweak as needed.
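The sort of analysis that sits on top of parsed log records can be sketched in a few lines. This is a plain-Python illustration rather than queries against Tamarin's Django models: the `records` list, its field names, and the `bytes_by` helper are all assumptions invented for the example.

```python
from collections import Counter

# Hypothetical parsed records; bytes_sent may be None (e.g. an error response).
records = [
    {"remote_ip": "192.0.2.3", "key": "audio/a.mp3", "bytes_sent": 5_000_000},
    {"remote_ip": "192.0.2.3", "key": "video/b.mp4", "bytes_sent": 20_000_000},
    {"remote_ip": "198.51.100.7", "key": "audio/a.mp3", "bytes_sent": 5_000_000},
    {"remote_ip": "203.0.113.9", "key": "audio/a.mp3", "bytes_sent": None},
]


def bytes_by(records, field):
    """Total bytes sent, grouped by the given record field."""
    totals = Counter()
    for record in records:
        totals[record[field]] += record["bytes_sent"] or 0
    return totals


# Who are the biggest consumers, and how do they compare to the average?
by_ip = bytes_by(records, "remote_ip")
top_ip, top_bytes = by_ip.most_common(1)[0]
average = sum(by_ip.values()) / len(by_ip)

# What are the most frequently requested files?
top_key, top_hits = Counter(r["key"] for r in records).most_common(1)[0]

print(top_ip, top_bytes, average)
print(top_key, top_hits)
```

With the records in a database, the same questions would normally be answered with grouped aggregate queries instead of loading everything into memory.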

Going back to our customer with higher-than-expected bandwidth usage, we ended up finding that aside from a few bots from Nigeria and Canada, usage patterns were pretty normal. The media that was uploaded into that bucket was never tracked for bandwidth usage on the old setup, so the high numbers were actually legitimate. With this in mind, we were able to go back to our client and present concrete evidence that they simply had a lot more traffic than previously imagined.

Where to go from here

If anyone ends up using Tamarin, please do leave a comment for me with any interesting queries you’ve built. We can toss some of them up on the documentation site for other people to draw inspiration from.

Source: https://github.com/duointeractive/tamarin

Documentation: http://duointeractive.github.com/tamarin/

GitHub Project: https://github.com/duointeractive/tamarin