The journey to Quay.io - An introspection and review

We've been putting together version 2.0 of our Continuous Integration/Deploy system at work, which led to me kicking the tires of a bunch of different build/test systems. The goal was to put together a CI/CD pipeline that the entire team would feel comfortable using. Our deploy target was a number of Kubernetes clusters.

With many of our services already adapted to work with Docker and Docker Compose, we were ready to figure out which combination of systems would build and test our images.

Amazon Elastic Transcoder Review

Amazon Elastic Transcoder was released just a few short days ago. Given that we do a lot of encoding at Pathwright, this was of high interest to us. A year or two ago, we wrote media-nommer, which is similar to Amazon’s Transcoder, and it has worked well for us. However, as a small company with manpower constraints, we’ve had trouble finding time to keep maintaining and improving media-nommer.

With the hope that we could simplify and reduce engineering overhead, we took the new Elastic Transcoder service for a spin at Pathwright.

Web-based management console impressions

One of the strongest points of the AWS Elastic Transcoder is its web management console. It makes it very easy for those that don’t have the time or capability to work with the API to fire off some quick transcoding jobs. With that said, I will make a few points:

  • At the time of this article’s writing (Jan 30, 2013), it is extremely tedious to launch multiple encoding jobs. There’s a lot of typing and manual labor involved. For one-off encodings, it’s great.
  • Transcoding presets are easy to set up. I appreciate that!
  • Some of the form validation JavaScript is just outright broken. You may end up typing a value that isn’t OK, hitting “Submit”, and then seeing nothing happen. The probable intended behavior is to show an error message, but the messages are rendered very inconsistently (particularly on the job submission page).
  • The job status information could be a lot more detailed. I’d really like to see a numerical figure for how many minutes AWS calculates the video to be, even if this is only available once a job is completed. This lets you know exactly how much you’re being billed for on a per-video basis. Currently, you just get a lump sum bill, which isn’t helpful. It’d also be nice to see when an encoding job was started/finished on the summary (you can do this by listening to an SNS topic, but you probably aren’t doing that if you’re using the web console).
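For anyone who does wire up the SNS topic, here is a minimal sketch of how start and finish times could be recovered from job state notifications. It assumes the standard SNS envelope (a "Timestamp" field and a JSON string in "Message") and that the transcoder payload carries "jobId" and "state" fields with PROGRESSING and COMPLETED values; check those names against real notifications before relying on them. The function names are my own.

```python
import json
from datetime import datetime

# Timestamp layout assumed for the SNS envelope, e.g. "2013-01-30T15:04:05.123Z".
SNS_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def record_notification(envelope_json, job_times):
    """Store the timestamp of a job state change, keyed by job ID and state."""
    envelope = json.loads(envelope_json)
    # The transcoder's own payload is assumed to be a JSON string in "Message".
    message = json.loads(envelope["Message"])
    when = datetime.strptime(envelope["Timestamp"], SNS_TIME_FORMAT)
    job_times.setdefault(message["jobId"], {})[message["state"]] = when
    return message["jobId"]


def elapsed_seconds(job_times, job_id):
    """Return seconds between PROGRESSING and COMPLETED, or None if unknown."""
    states = job_times.get(job_id, {})
    if "PROGRESSING" not in states or "COMPLETED" not in states:
        return None
    return (states["COMPLETED"] - states["PROGRESSING"]).total_seconds()
```

Note that this measures wall-clock encoding time, not the billable minutes of the video itself.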

Web API impressions

For the purpose of working Elastic Transcoder into Pathwright, we turned to the excellent boto (which we are regular contributors to). For the most part, this was a very straightforward process, with some caveats:

  • The transcoding job state SNS notifications contain zero information about the encoding minutes you were billed for that particular job. In our case, we bill our users for media management in Pathwright, so we must know how much each encoding job is costing us, and who it belonged to. Each customer gets a bill at the end of the month, without needing to hassle with an AWS account (these aren’t technical users, for the most part). Similarly, the “get job” API request shows no minutes figure, either.
  • If you’re writing something that uses external AWS credentials to manage media, you’ve got some setup work to do. Before you can submit job #1, you’re going to need to create an SNS topic, an IAM role, a Transcoder Pipeline, and any presets you need (if the defaults aren’t sufficient). If you make changes to any of these pieces, you need to sync the changes out to every account that you “manage”. These are all currently required to use Transcoder. This is only likely to be a stumbling block for services and applications that manage external AWS accounts (for example, we encode videos for people, optionally using their own AWS account instead of ours).
  • At the time of this article’s writing, the documentation for the web API is severely limited. There is a lack of example request/response cycles with anything but one or two of the most common scenarios. I’d like to see some of the more complex request/responses.
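Since neither the notifications nor the “get job” response report minutes, one workaround is to estimate them yourself from a duration you already know (for example, one measured when the master file was uploaded). The following is a rough sketch under stated assumptions: the per-minute rate is a hypothetical placeholder rather than a real AWS price, and rounding partial minutes up is my assumption about billing, not something confirmed by the service.

```python
import math
from collections import defaultdict

# Hypothetical placeholder rate in dollars per output minute; not a real AWS price.
ASSUMED_RATE_PER_MINUTE = 0.02


def billable_minutes(duration_seconds):
    """Round a duration up to whole minutes (an assumed billing rule)."""
    return math.ceil(duration_seconds / 60)


def estimate_costs(jobs, rate=ASSUMED_RATE_PER_MINUTE):
    """Sum the estimated cost per customer.

    `jobs` is an iterable of (customer_id, duration_seconds) pairs.
    """
    totals = defaultdict(float)
    for customer_id, duration_seconds in jobs:
        totals[customer_id] += billable_minutes(duration_seconds) * rate
    return dict(totals)
```

The result is only an estimate to reconcile against the lump-sum bill, not a replacement for a real per-job figure from AWS.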

Some general struggles/pain points

While this article has primarily focused on the issues we ran into, we’ll criticize a little more before offering praise:

  • As is the case for anyone not paying for premium support, AWS has terrible customer support. If you want help with the Transcoding service, the forums are basically your only option. The responses there so far haven’t been very good or timely. However, it is important to note that this support model is not limited to Elastic Transcoder. It is more of an organizational problem. I am sure this is on their minds, and if there is a group that can figure out how to offer decent support affordably, it’d be Amazon. Just be aware that you’re not going to get the best, fastest support experience without paying up.
  • We do low, medium, and high quality transcodings for each video we serve at Pathwright. Our lower quality encoding is smaller (in terms of dimensions) than the medium and high quality encodings. With media-nommer and ffmpeg, we were able to specify a fixed width and let ffmpeg determine the height (while preserving aspect ratio). The Amazon Transcoder currently requires height and width for each preset, if you want to specify a dimension. Given that our master video files are all kinds of dimensions and aspect ratios, this is a non-starter for us.
  • If you submit an encoding job with an output S3 key name that already exists, the job fails. While you do open yourself up to some issues in doing so, we would appreciate the ability to say “I want to over-write existing files in the output bucket”. There is probably a technical reason for this, but I think this fails the practicality test. A solution can and should be found to allow this.
  • Because of the aforementioned poor support, I still don’t have a good answer to this, but it doesn’t appear that you can do two-pass encodings. This is a bummer for us, as we’ve been able to get some great compression and quality doing this.
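If a preset has to be given both dimensions, the fixed-width behavior described above can be approximated by computing the height yourself for each master file. Here is a minimal sketch; the function name is my own, and rounding to an even number is an assumption based on common H.264 settings expecting even dimensions.

```python
def scaled_height(src_width, src_height, target_width):
    """Height that preserves the source aspect ratio at target_width.

    The result is rounded to the nearest even number (an assumption that
    suits common H.264 settings).
    """
    if src_width <= 0 or src_height <= 0 or target_width <= 0:
        raise ValueError("dimensions must be positive")
    exact = src_height * target_width / src_width
    return max(2, int(round(exact / 2)) * 2)
```

The catch is that this yields a different height for every aspect ratio, which would mean a separate preset per ratio. That is exactly the overhead we were hoping to avoid.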

Overall verdict

For Pathwright, the Amazon Transcoder isn’t capable enough to get the nod just yet. However, the foundation that has been laid is very solid. The encodings themselves execute quickly, and it’s great not having to worry about the state of your own in-house encoding infrastructure.

The prices are very fair, and offer large savings over Zencoder and Encoding.com at low to moderate volumes. The price advantage does taper off as your scale gets very large, and those two services do offer a lot more capabilities. If your needs are basic, Amazon Transcoder is likely to be cheaper and “good enough” for you. If you need live streaming, closed captioning, or anything more elaborate, shell out and go with a more full-featured service.

Once some of the gaping feature gaps are filled and the platform has time to mature and stabilize, this could be a good service. If the customer support improves with the features, this could be an excellent service.

Verdict: Wait and see, but so far, so good.

Amazon ElastiCache review

Amazon Web Services has just announced the beta release of Amazon ElastiCache, a hosted/managed memcached service. This is an offering similar to Relational Database Service (RDS) in that the management and clustering are handled for you, leaving you with a host/port to point your services at.

Powered by Memcached

Their announcement doesn’t provide many details about the inner workings of ElastiCache, but it appears the initial beta is some form of memcached 1.4.5. It’s entirely possible that they have made modifications.

There may be future additions of different engines, such as Redis, but AWS has not provided any indications of this happening in the immediate future.

Unlike some of the recent AWS feature launches, ElastiCache sports a tab on the AWS Management Console, which is great. Administering your clusters looks to be dead simple.

Performance and benchmarks

It’s still pretty early, and I haven’t been able to find any benchmarks comparing an EC2 instance with Memcached to ElastiCache. However, keep a few things in mind:

  • You can deploy your ElastiCache to single or multiple availability zones. Instances within the same zone can communicate with ElastiCache just as fast as they can with neighboring EC2 instances.
  • It is likely that, as with RDS, Amazon has tweaked the instances that run ElastiCache specifically for this workload.

That said, my uneducated guess would be that the performance difference between a properly configured EC2 instance and ElastiCache will be negligible. But that’s not what they’re selling you on. You’re paying the premium for not having to manage that part of your stack.

Use case

For sites with low to moderate traffic, running a dedicated memcached instance may not be worth the money, or an increase in your service’s footprint. It is often fine to run a memcached server on the same instance as your app server. Memcached is great about not using much CPU, but it will happily consume as much RAM as you allow it to.

As your site grows, you may find that memcached actively uses RAM that your app server needs, causing resource contention. Worst case, you may find yourself hitting swap space. This can be absolute death for even moderately trafficked sites.

If you find your memcached process needing more and more RAM to satisfy your application’s needs, it’s probably time to explore breaking your caching out into its own EC2 instance, or ElastiCache.

Price

ElastiCache clusters come in sizes similar to EC2 instances: Small, Large, XL, etc. On the lower end, the monthly price for a single-node, Small On-Demand ElastiCache cluster is about $70. To serve as a basis for comparison, a Small On-Demand EC2 instance costs about $63. For those that actually use On-Demand instances over prolonged periods of time, the difference is negligible. You’re paying a $7/month premium to not have to hassle with administering your memcached instance.

However, the price difference for Reserved instances vs. On-Demand ElastiCache is large. Let’s say you use a Small Reserved EC2 instance. Once you take into account the up-front reservation fee and the reduced monthly fee, you’re looking at around $41/month for a 1-year term. Since ElastiCache has no Reservation capability like EC2 yet, this leaves us with a marked difference in price points. At about $70/month, a single-node, Small On-Demand ElastiCache cluster would be about $840 for the whole year, vs. $491 for the Small Reserved EC2 instance with a 1-year term. That’s a savings of roughly $349 for doing it yourself and shelling out the up-front reservation fee.
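To rerun this comparison as prices change, the arithmetic can be sketched as below. The inputs are the rounded monthly figures quoted in this section, so the totals may be off by a dollar or so; the function names are my own, and the up-front fee and reduced monthly fee passed to `effective_monthly` are whatever AWS currently lists, not values from this article.

```python
MONTHS_PER_YEAR = 12


def annual_cost(monthly_price):
    """Yearly cost of a flat monthly price."""
    return monthly_price * MONTHS_PER_YEAR


def effective_monthly(upfront_fee, monthly_fee, term_months=12):
    """Monthly cost of a reservation with the up-front fee spread over the term."""
    return upfront_fee / term_months + monthly_fee


def annual_savings(on_demand_monthly, reserved_effective_monthly):
    """Yearly difference between an On-Demand price and a Reserved one."""
    return annual_cost(on_demand_monthly) - annual_cost(reserved_effective_monthly)


# Rounded monthly figures quoted in this section (US dollars).
elasticache_small = 70
ec2_small_on_demand = 63
ec2_small_reserved = 41  # effective price, up-front fee included

print("Monthly premium over On-Demand EC2:", elasticache_small - ec2_small_on_demand)
print("Yearly gap vs. Reserved EC2:", annual_savings(elasticache_small, ec2_small_reserved))
```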

Cheaper alternatives

For those that find the price for ElastiCache a little too high, there are a few different options that might be more affordable:

  • Run memcached on one of your current instances, or on the same instance as the app server. This assumes lower load/traffic. Make sure that you aren’t running out of RAM.
  • Run memcached on an On-Demand Micro instance. This is one of the few useful tasks I’ve found for Micro instances. Recall that memcached is extremely easy on the CPU, and Micro instances have around 613 MB to play with. This would cost around $15/month On-Demand (less with a Reserved instance). If you are free-tier eligible, this route would be free for a year, if you haven’t already fired up another Micro instance.
  • If you can afford the up-front fee, a Reserved Small EC2 instance is powerful enough to handle even well-trafficked sites. For moderately more than a single-node Small ElastiCache cluster, you could get two Small Reserved EC2 instances.

The verdict

I love that Amazon is adding this, and think it’ll be a competitive service eventually. ElastiCache is currently in beta, and the prices are higher while they’re figuring out their costs and margins. Unfortunately, the lack of a Reserved ElastiCache cluster option is the killer. At the higher ends in particular, On-Demand ElastiCache is a good deal more expensive than the equivalent Reserved EC2 instance.

I strongly suggest waiting for the prices to come down, and for the ability to purchase Reserved ElastiCache clusters. As it stands, this service isn’t cost-effective yet for the benefit.