Blog post:

Docker-powered Open edX devstack

After the Open edX 2014 conference, a hackathon brought together developers from the worldwide community who wanted to work together on the edX codebase: making improvements, building new features and making the experience for new developers more enjoyable.

One of the obstacles to getting started with Open edX is the growing list of dependencies needed to get the software running on your machine. To alleviate this problem, the team at edX built “devstack”, which runs an Ubuntu Linux virtual machine (VM) on your computer (it works on either Mac or Windows).

The virtual machine is bootstrapped using a Vagrantfile to download a “base box” and map the code directories on the VM to local directories on your host machine (Mac or Windows), so that you can use your favorite desktop code editing software to work on the code.

While this works, there are several disadvantages to this approach:

  1. You need to install VirtualBox and Vagrant on your machine, which sometimes presents an obstacle for would-be Open edX developers.
  2. Running an entire virtual machine on your computer consumes a lot of memory, and on computers that don’t have enough memory devstack becomes impossible to use.
  3. The Vagrantfile method of bootstrapping the machine is not foolproof. In theory, all you need to type is “vagrant up”, but in practice something often goes wrong.
  4. When something goes wrong, you need to dig deep into the internals of the deployment process and learn about Ansible (the tool that actually installs and configures the software on the VM).

On the Hackathon Ideas page, Ali Hasan from the Queen Rania Foundation in Jordan (makers of the Open edX-powered Edraak.org site) suggested replacing the Vagrant/VirtualBox solution with Docker.

He says on the wiki page, “Compared to VirtualBox, Docker is much more lightweight and it offers native performance for the app. As a stretch goal, I’d like to add setting up a clustered Docker setup to allow developers to have a closer environment to production. This is especially doable with Docker because of its relatively low memory consumption.”

The advantages of Docker are:

  1. Much lower barrier to entry, especially if you use something like Docker Toolbox (formerly Boot2Docker and Kitematic), which offers a point-and-click way of getting Docker running on your Mac or Windows machine.
  2. Much lower resource consumption. While Docker still requires a VM (on Mac or Windows), the VM has a much smaller memory footprint (27 MB in the case of boot2docker) than running a full Ubuntu distro in a VirtualBox VM.
  3. Docker containers start up in seconds, compared to a VirtualBox VM, which can sometimes take minutes depending on how fast your machine is.
  4. Docker images can be shared via the Docker Hub for easy distribution and run on any machine that has a Docker server.

I responded to Ali’s comments on the hackathon wiki page: “We’ve already built some Open edX Docker images (edx-lite and edx-full), but they’re all-in-one, and we’d like to split the services into different images to have better separation between application and database, for example. Here are the instructions on how these images were created using Ansible. We started working on a multi-container Docker setup in this docker_multi_release branch of our forked configuration repo, with just MongoDB and MySQL for now.” Since then, we have also containerized the Forum, Elasticsearch and RabbitMQ.

Since we already had a working Docker image to run Open edX in a single container (see the updated image for Cypress), we decided to spend the hackathon working on Ali’s “stretch goal”, which was to refine and polish the multi-container approach. Docker was (and still is) a very hot topic, so we attracted a lot of participants in our hackathon project. Present were:

(Forgive me if I missed you in this list. I’m doing this mostly from memory)

At the end of the hackathon, we had a semi-working solution that could spin up a multi-container Open edX deployment using Fig (now called Docker Compose). Thanks to Xavier Antoviaque, who worked on the MySQL and MongoDB containers.
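
For readers unfamiliar with Fig/Docker Compose, here is a minimal sketch of what a multi-container definition might look like. It is not the file we produced at the hackathon: the edxapp image name, the password and the port list are placeholders, and only the MySQL and MongoDB services correspond to containers mentioned above. The snippet only writes the file; starting the stack is left as a commented-out command.

```shell
# Write an illustrative Compose file (all names and values are placeholders).
cat > docker-compose.yml <<'EOF'
services:
  mysql:
    image: mysql                      # pin the version your release needs
    environment:
      MYSQL_ROOT_PASSWORD: changeme   # placeholder, do not use in production
  mongo:
    image: mongo                      # pin the version your release needs
  edxapp:
    image: example/edxapp             # hypothetical image name
    depends_on:
      - mysql
      - mongo
    ports:
      - "80"
      - "18010"
EOF

# With Docker and Docker Compose installed, the stack could then be started with:
# docker-compose up -d
```

Keeping each service in its own container is what gives the separation between application and database described earlier.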

We posted the instructions for spinning up a multi-container Open edX deployment, but please be aware that these instructions have not been maintained, so they may not work anymore.

If you want to create your own all-in-one Docker image for Open edX, we posted instructions for that as well.

Please note that by default, emails won’t get sent. If you want to set up email delivery, you need to create a free Mandrill account and pass the Mandrill API key to the container as an environment variable.

$ docker run -i -t -e MANDRILL_API_KEY=xxxx -p :80 -p :18010 -p :18020 -d -v ~/configuration:/configuration phusion/baseimage /sbin/my_init --enable-insecure-key

Again, these instructions were made for an older version of Docker and an older version of Open edX (Aspen), and may need to be updated. For example, you no longer need the --enable-insecure-key option, since the ‘docker exec’ command now lets you open a shell in a running container.
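
As a rough illustration of that last point, with a reasonably recent Docker you could start the container without the insecure key and then open a shell in it with ‘docker exec’. This is an untested sketch rather than updated instructions: the container name edx-aio is made up, and the small run helper only prints each command unless DRY_RUN=0 is set, so nothing gets started by accident.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

NAME="edx-aio"   # hypothetical container name

# Same image, ports and init process as the command above,
# minus --enable-insecure-key and the interactive flags.
run docker run -d --name "$NAME" -e MANDRILL_API_KEY=xxxx \
  -p :80 -p :18010 -p :18020 \
  -v "$HOME/configuration:/configuration" \
  phusion/baseimage /sbin/my_init

# Open an interactive shell inside the running container (no SSH key needed).
run docker exec -it "$NAME" bash
```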

Conclusion

A Docker-powered devstack would be a very welcome addition to the Open edX community, so if you’re interested in this topic, please join us at the Open edX 2015 conference hackathon.

Ed Zarecor (edX DevOps team) is working on a Docker-powered devstack, and I suspect we’ll work more on this at the Open edX 2015 conference hackathon.

Also, if you’re interested in using Docker containers to power Open edX deployments, we’ve started working on a multi-container Open edX hosting infrastructure using Kubernetes. Check out our blog post Open edX at Scale using Kubernetes.

End of post.