Amazon ElastiCache review

Amazon Web Services has just announced the beta release of Amazon ElastiCache, a hosted/managed memcached service. The offering is similar to the Relational Database Service (RDS) in that management and clustering are handled for you, leaving you with a host/port to point your services at.

Powered by Memcached

The announcement doesn’t provide many details about ElastiCache’s inner workings, but the initial beta appears to be some form of memcached 1.4.5. It’s entirely possible that Amazon has made modifications.

There may be future additions of different engines, such as Redis, but AWS has not provided any indications of this happening in the immediate future.

Unlike some of the recent AWS feature launches, ElastiCache sports a tab on the AWS Management Console, which is great. Administering your clusters looks to be dead simple.

Performance and benchmarks

It’s still pretty early, and I haven’t been able to find any benchmarks comparing memcached on an EC2 instance to ElastiCache. However, keep a few things in mind:

  • You can deploy your ElastiCache to single or multiple availability zones. Instances within the same zone can communicate with ElastiCache just as quickly as they can with neighboring EC2 instances.
  • It is likely that, as with RDS, Amazon has tuned the instances running ElastiCache specifically for this workload.

That said, my uneducated guess is that the performance difference between a properly configured EC2 instance and ElastiCache will be negligible. But that’s not what they’re selling you on; you’re paying the premium for not having to manage that part of your stack.

Use case

For sites with low to moderate traffic, running a dedicated memcached instance may not be worth the money or the increase in your service’s footprint. It is often fine to run a memcached server on the same instance as your app server. Memcached is great about not using much CPU, but it will happily consume as much RAM as you allow it.
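
If you do share an instance, it’s worth keeping an eye on how close memcached is getting to its cap. Here’s a minimal sketch using pylibmc’s get_stats(); the host/port is just the memcached default and is an assumption on my part, so adjust it for your setup:

import pylibmc

# Assumes memcached is listening on the default 127.0.0.1:11211.
client = pylibmc.Client(['127.0.0.1:11211'])

# get_stats() returns (server, stats_dict) pairs; 'bytes' is current usage
# and 'limit_maxbytes' is the cap set with memcached's -m flag.
for server, stats in client.get_stats():
    used = stats.get('bytes')
    cap = stats.get('limit_maxbytes')
    print('%s: %s of %s bytes in use' % (server, used, cap))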

As your site grows, you may find that memcached actively uses RAM that your app server needs, causing resource contention. Worst case, you may find yourself hitting swap space. This can be absolute death for even moderately trafficked sites.

If you find that your memcached process needs more and more RAM to satisfy your application, it’s probably time to break your caching out into its own EC2 instance, or into ElastiCache.

Price

ElastiCache clusters come in sizes similar to EC2 instances: Small, Large, XL, etc. On the lower end, the monthly price for a single-node, Small On-Demand ElastiCache cluster is about $70. To serve as a basis for comparison, a Small On-Demand EC2 instance costs about $63. For those that actually use On-Demand instances over prolonged periods of time, the difference is negligible. You’re paying a $7/month premium to not have to hassle with administering your memcached instance.

However, the price difference between Reserved EC2 instances and On-Demand ElastiCache is large. Let’s say you use a Small Reserved EC2 instance. By the time you take into account the up-front reservation fee and the reduced monthly fee, you’re looking at around $41/month for a 1-year term. Since ElastiCache currently has no reservation capability like EC2’s, this leaves us with a marked difference in price points. A single-node, Small On-Demand ElastiCache cluster would be $756 for the whole year, vs. $491 for the Small Reserved EC2 instance with a 1-year term. That’s a $265 savings for doing it yourself and shelling out the up-front reservation fee.
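
To put the comparison in one place, here is the arithmetic using the rough, launch-era figures quoted above (actual AWS pricing will drift over time, so treat these as illustrative only):

# Approximate yearly totals from the figures above (launch-era pricing).
elasticache_small_on_demand = 756  # single-node Small On-Demand ElastiCache, per year
ec2_small_reserved = 491           # Small Reserved EC2, 1-year term (up-front fee + reduced hourly)

savings = elasticache_small_on_demand - ec2_small_reserved
print('Do-it-yourself savings: $%d/year' % savings)  # -> $265/year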

Cheaper alternatives

For those who find the price of ElastiCache a little too high, there are a few different options that might be more affordable:

  • Run memcached on one of your current instances, or on the same instance as the app server. This assumes lower load/traffic. Make sure that you aren’t running out of RAM.
  • Run memcached on an On-Demand Micro instance. This is one of the few useful tasks I’ve found for Micro instances. Recall that memcached is extremely easy on the CPU, and Micro instances have around 613 MB to play with. This would cost around $15/month On-Demand (less with a Reserved instance). If you are free-tier eligible, this route would be free for a year, if you haven’t already fired up another Micro instance.
  • If you can afford the up-front fee, a Reserved Small EC2 instance is powerful enough to handle even well-trafficked sites. You could even get two Small Reserved EC2 instances for only moderately more than a single-node Small ElastiCache cluster.

The verdict

I love that Amazon is adding this, and I think it’ll be a competitive service eventually. ElastiCache is currently in beta, and the prices are higher while they figure out their costs and margins. The lack of a Reserved ElastiCache cluster option is unfortunately the killer. At the higher end in particular, On-Demand ElastiCache is a good deal more expensive than the equivalent Reserved EC2 instance.

I strongly suggest waiting for the prices to come down and for the ability to purchase Reserved ElastiCache clusters. As it stands, the benefit isn’t yet worth the cost.

Tip: Pointing pylibmc at a unix domain socket

This is a quick tip that will hopefully save someone else a few minutes at least. If you’re using Django 1.3 with pylibmc and want to point at a local Unix domain socket for memcached, this is what your CACHES setting will look like:

CACHES = {
        'default': {
                'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
                'LOCATION': '/tmp/memcached.sock',
        }
}

This also assumes that you have configured memcached to use a unix domain socket (with the -s option in memcached.conf).
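
As a quick sanity check that the socket hookup works, you can poke the cache from a Django shell (python manage.py shell). This is just a minimal smoke test assuming the CACHES setting above; the key and value are arbitrary:

from django.core.cache import cache

cache.set('greeting', 'hello', 30)  # store a value for 30 seconds
print(cache.get('greeting'))        # prints 'hello' if memcached answered over the socket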

A MUD behind a proxy is… potentially great

My perpetual tinker project is a MUD server that may or may not ever see the light of day. In recent adventures, I pursued using Exocet to make my goal of a mostly interruption-less experience for the players a reality. The attempt worked in most cases, but failed horribly in a few others. The failures were bad enough to make me scrap the idea. The next best thing I could think of was to stick a proxy server in front of the MUD server.

The proxy would handle all of the telnet protocol and session stuff, and just dumbly pipe input into the MUD server through Twisted’s AMP (a really neat, simple, two-way protocol). When a user inputs something, the proxy tells the MUD server, “A session attached to an Object with an ID of ‘a20dl3da’ input this.” The MUD server would then have any object matches route the input through whatever command tables they are subscribed to, causing things to happen in-game.

Communication back to the proxy would happen whenever an Object’s emit method is called (i.e., print to any sessions controlling this object). The proxy would check whether it had a session attached to the given object and call the TelnetProtocol’s msg() method with the output.
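
To make that a little more concrete, here is roughly what the two AMP commands could look like. Everything below (command names, argument names) is hypothetical; it’s a sketch of the proxy/server contract, not the project’s actual code:

from twisted.protocols import amp

class SendInput(amp.Command):
    """Proxy -> MUD server: a player typed something."""
    arguments = [
        (b'object_id', amp.Unicode()),  # the in-game Object the session controls, e.g. "a20dl3da"
        (b'input', amp.Unicode()),      # the raw line the player typed
    ]
    response = []

class EmitToObject(amp.Command):
    """MUD server -> proxy: an Object's emit() produced output for its sessions."""
    arguments = [
        (b'object_id', amp.Unicode()),  # which Object emitted
        (b'message', amp.Unicode()),    # text the proxy hands to TelnetProtocol.msg()
    ]
    response = []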

Neat thing #1: Strictly enforced separation

Convention typically dictates that connection and protocol-level things be kept separate from business logic and other more interesting things. Having separate proxy and MUD server sections of the codebase really enforces that separation in my mind.

Keeping session and protocol-level gunk confined to the proxy makes the MUD server easier to understand, maintain, and test. I find this layout a little easier to mentally digest.

Another cool thing for the future is that adding support for other protocols (websockets, anyone?) can be handled in the proxy, hooking the input/output into AMP commands. Protocols are already on their own island with Twisted, but the separation is much more strictly enforced under this arrangement (which, again, I like). The MUD server can speak in a protocol-agnostic format like Markdown or BBCode, and the protocols can format the output for whatever they are serving.

Neat thing #2: Neither proxy nor MUD server care if the other dies

Consider the following two scenarios:

  • MUD server dies, proxy stays up.

    The proxy accepts connections, but all input is met with an error message telling the user to stay put until the MUD comes back up. All sessions are maintained, and Twisted’s auto-reconnection facilities keep trying to get back in touch with the MUD (a rough sketch follows this list). When it does come back up, business continues as usual without interruption. The MUD server doesn’t care about sessions, and the proxy doesn’t care about in-game objects, rooms, and so on.

  • The proxy goes down, but the MUD server stays up.

    This one isn’t quite as neat. In theory, this scenario should be extremely rare. If the proxy goes down, the user is unable to connect to the running game. They’ll need to re-connect once the proxy comes back up. However, the MUD server continues about its business in the meantime, so mobs are moving, the economy is ticking, etc. Once the proxy is back, it re-connects and players can interact with the game world again.
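
The reconnect-forever piece of the first scenario is nearly free with Twisted. Here is a minimal sketch of the proxy’s client factory; the host, port, and class name are placeholders rather than the project’s real code:

from twisted.internet import reactor
from twisted.internet.protocol import ReconnectingClientFactory
from twisted.protocols.amp import AMP

class MudLinkFactory(ReconnectingClientFactory):
    # Keeps the proxy's AMP link to the MUD server alive, retrying with backoff.
    protocol = AMP

    def buildProtocol(self, addr):
        self.resetDelay()  # connected again, so reset the retry backoff
        return ReconnectingClientFactory.buildProtocol(self, addr)

# The proxy keeps retrying this connection until the MUD server is reachable again.
reactor.connectTCP('127.0.0.1', 5000, MudLinkFactory())
reactor.run()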

Neat thing #3: We don’t need to bother with code re-loading

The last, and most important, neat thing is that because of neat things #1 and #2, we don’t need to implement code re-loading. If both proxy and MUD server are monitored/auto-restarted by something like Supervisor, the latest version of the game code can be loaded by silently shutting down the MUD server (but leaving the proxy up). Supervisor (or runit, or launchctl, or whatever) sees the server process down, restarts it, and the proxy automatically re-connects as soon as it’s back up.

The end result is that the user may get an error or two if they’re trying to type stuff while the server is down, but the outage should be short and may go completely unnoticed by some of the players. We don’t need to worry about all of the messiness associated with code reloading, and we can keep the MUD server focused on game logic.

Code to come

I’ve got a proof-of-concept for this arrangement “working”, but it’ll be some time before I am able to restore the existing features of the MUD server to work with the new proxy + MUD server model. I’ll continue to write posts about progress as it happens.