Why Open edX hosting is so complicated
Update: This post was originally published in 2015. To get a more up-to-date and detailed picture of the nuances of Open edX, check out these two posts: Want to host Open edX Yourself? Top 3 Things to Consider and The True Cost of Hosting Open edX Yourself. Thanks!
With Amazon Web Services, the Google Cloud, Microsoft Azure and countless other cloud hosting services, it’s been a race to the bottom for super-cheap hosting. Good times!
However, hosting Open edX is considerably more challenging than plain vanilla Web hosting. Many of our clients started out wanting to host edX themselves, and many of them have very experienced tech teams, yet in most cases they later turned to Appsembler to manage their edX hosting. Let’s look at why.
Open edX is complicated
Hosting a WordPress or Drupal site is simple. But edX combines many services to provide its superior learning experience. In this post from the edX team they provide this diagram detailing the edX architecture. (Note: this is the architecture for edX.org, so they have some components like Acquia’s Drupal which would not be in a standard Open edX install.)
That’s a lot of moving parts including multiple databases (multiple instances of MySQL and Mongo), differing code bases (most of edX is coded with Python and Django, but the discussion forums are coded with Ruby), and so on. It’s complicated. So even after choosing a cloud service and getting edX up and running, no simple task by itself, you need to be constantly monitoring it to ensure an always-on, responsive experience.
Open edX is a moving target. There are now quarterly named releases, but given the size of the community working on Open edX, new features, bug fixes and security updates go in every few hours. Seriously! Check out the commit log at https://github.com/edx/edx-platform/commits/master.
You can play it safe and just stick with the named releases (which is what we recommend), but the drawback is that you won’t be able to take advantage of the new features right away.
Scalability and Redundancy
Open edX was designed to be highly scalable to serve the needs of 4M+ students, and reliable even in the face of machine failures. It achieves this scalability and reliability by running redundant services spread across different machines. So hosting costs can multiply, along with the effort required to maintain the whole system.
At the Open edX conference last year, Feanil Patel (DevOps Engineer at edX) gave a great presentation describing the scalability and redundancy considerations when running Open edX.
Open edX is a huge system: multiple database engines, two Internet-facing services and a whole lot of components plumbed together on multiple machines. Package updates to the operating system on all of those machines come out regularly, and they don’t always play nicely with Open edX. Sometimes this means building custom packages in order to integrate an important security fix.
Running a firewall on the server to block ports from attackers, setting up SSL certificates so traffic is encrypted over HTTPS, and performing security audits to ensure the system is safe are all things we do for our customers. If you host Open edX yourself, these are the things that will keep you up at night if you don’t do them.
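One small piece of that security work is making sure certificates don't quietly expire. Here's a minimal sketch of such a check using only the Python standard library. The function names and the 14-day warning window are our own illustrative choices, not part of Open edX or any particular hosting tool.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp (negative if expired)."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect over TLS and return the server certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def certificate_needs_renewal(hostname: str, warn_days: float = 14.0) -> bool:
    """True when the certificate expires within warn_days (assumed threshold)."""
    return days_until_expiry(fetch_not_after(hostname)) < warn_days
```

You would call something like certificate_needs_renewal("lms.example.com") from a scheduled job, where the hostname is a placeholder for your own LMS or Studio domain, and alert when it returns True.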
When running a system as complex as Open edX, there are a lot of things to keep an eye on:
- the health of your server (We use Pingdom and New Relic)
- application performance (We conduct load tests to ensure optimal page load times and identify bottlenecks with New Relic)
- any application errors (We use Sentry and are currently evaluating Opbeat)
- server logs (We use Papertrail)
We run entire servers just to ensure the health of our Open edX servers, and we have staff to watch for errors. Without this sort of visibility into what your Open edX instance is doing, how can you hope to detect and fix problems?
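To give a feel for the simplest version of this kind of check, here's a rough sketch of a health probe in Python. It is not how Pingdom or New Relic work internally. The names, the 2-second threshold and the example URLs are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, NamedTuple, Optional


class CheckResult(NamedTuple):
    url: str
    healthy: bool
    status: Optional[int]  # None when the request failed outright
    elapsed_s: float


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def check(
    url: str,
    fetch: Callable[[str], int] = http_status,
    max_seconds: float = 2.0,  # assumed "responsive enough" threshold
) -> CheckResult:
    """Probe one URL; healthy means a 2xx/3xx answer within max_seconds."""
    start = time.monotonic()
    try:
        status: Optional[int] = fetch(url)
    except OSError:  # DNS failure, refused connection, timeout, ...
        status = None
    elapsed = time.monotonic() - start
    healthy = status is not None and 200 <= status < 400 and elapsed <= max_seconds
    return CheckResult(url, healthy, status, elapsed)
```

A scheduled job could run check() against each Internet-facing service, for example hypothetical addresses like https://lms.example.com/ and https://studio.example.com/, and notify someone whenever healthy is False. The fetch parameter exists so the logic can be tested without touching the network.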
We back up each instance of edX that we’re hosting and store these backups offsite, so even if the server dies, we have your files backed up for safekeeping and quick recovery.
How do you know that your backups are even working? Better connect them up to your monitoring server(s)!
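As an illustration of what "connecting backups to monitoring" can mean at its most basic, here's a sketch that checks whether the newest backup file is recent and not suspiciously small. The file pattern, the 26-hour window and the 1 KB minimum are assumptions for the example, and a freshness check like this is no substitute for periodically test-restoring a backup.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup(directory: Path, pattern: str = "*.sql.gz") -> Optional[Path]:
    """Return the most recently modified file matching pattern, if any."""
    files = [p for p in directory.glob(pattern) if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def backup_problem(
    directory: Path,
    pattern: str = "*.sql.gz",     # hypothetical naming scheme
    max_age_hours: float = 26.0,   # daily backups plus some slack (assumed)
    min_bytes: int = 1024,         # anything smaller is probably empty (assumed)
    now: Optional[float] = None,
) -> Optional[str]:
    """Describe what is wrong with the latest backup, or return None if it looks fine."""
    latest = newest_backup(directory, pattern)
    if latest is None:
        return "no backup files found"
    info = latest.stat()
    current = time.time() if now is None else now
    age_hours = (current - info.st_mtime) / 3600
    if age_hours > max_age_hours:
        return f"newest backup {latest.name} is {age_hours:.1f} hours old"
    if info.st_size < min_bytes:
        return f"newest backup {latest.name} is suspiciously small ({info.st_size} bytes)"
    return None
```

A monitoring job would call backup_problem() on the backup directory and raise an alert whenever it returns a message instead of None.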
What happens when:
- a student claims to have filled in an exam but you can’t find any record of it?
- you wake up one morning to emails saying that your server has been hacked?
- your server gets DDoSed and your network provider cuts it off to protect their network?
- people start getting 500 errors and you don’t know why? Worse, what if you never find out, and they stop using your service?
That’s what we’re working to avoid. No one’s perfect, but it takes considerable effort to prevent these and countless other scenarios from happening.
The point that we’re making is that to run a production-grade Open edX service, you need people with a diverse (yet specialized) set of skills. If you have that team already, great! You don’t need us! If you don’t have a team that can handle all of the above considerations, we’d love for you to give our service a try.
And if you do decide to give it a try, at scale, please read the AWS Ops Checklist!