
10 Things Every Open Source Project Should Have

On the second day of the WebFwd accelerator, Ash and Rastin, the dynamic duo from Anahita, cornered Tantek Celik, one of the WebFwd mentors, to grill him with questions about the business model for their open source social networking software Anahita.

I was talking with someone else, so I only caught fragments of their conversation, but at one point I heard Tantek say, “You’re running an open source project, but you don’t have an IRC channel!?”

To some, IRC may be associated with yesteryear’s technology, but for others, IRC is the lifeblood of an open source community, a virtual watering hole where newbies and veterans alike come to drink from the fountain of knowledge. Okay, it’s not always emitting knowledge – often it’s just senseless banter, but it still serves a vital need. Lacking a physical space, IRC provides a venue for people to come together in real-time and not just learn from each other, but get to know each other.

IRC is just one example, but it got me thinking about all the things that I take for granted in a mature open source community such as Plone. How does the community communicate? How do people report problems? How does one get the software running? How do I know the quality of the code? Below is a list of 10 things that I think every open source project should have:

1) Github repository

At the last Google I/O conference, I was surprised to see that Google engineers are putting their own code examples on Github (instead of their own Google Code) and even Microsoft is using Github now instead of their own CodePlex. Why? Because that’s where the developers live!

If you have an open source project, and it’s not on Github, it might as well not exist. What are you waiting for!? If you’re still stuck in the dark age of Subversion, all is not lost! You can easily import from Subversion to Github using svn2git. Even if you can’t switch from Subversion entirely, you can still set up a read-only Github repo.

As a DVCS (distributed version control system), Git also allows for more distributed collaboration and experimentation, since it’s trivial to make new feature branches and try new things without disrupting the work that’s happening on the master branch.

2) Documentation

I remember reading Chris McDonough’s blog post “Documentation is the Differentiator” which basically makes the case that the documentation quality is the differentiator between project success and failure. I love his explanation:

“A choice between good docs and better software is a false choice. The act of continually writing documentation always makes software better, because if it’s hard to explain, it’s probably not very good. Complexity becomes clear very quickly when you need to explain it away; it’s usually easier to change the software to be less complex than it is to document something complex. It’s frighteningly easy to write undocumentable software.”

Looking at Chris’ own docs for the Pyramid project, it’s clear why Pyramid is the fastest growing Python web framework.

At a minimum, your project should have the following documentation:

  • README – tell me who your software is for, what it does and why I might want to use it
  • INSTALL – give me step by step instructions for how to get it running
  • CHANGES – show me a history of changes so I can see the evolution and frequency of updates/releases
  • AUTHORS – is this a solo project or do you have other contributors? give credit where credit is due!
  • LICENSE – make it clear how your software is licensed. See #10 below for more explanation.
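As a rough illustration, the checklist above can be turned into a small script that reports which of the five files a project is still missing. This is only a sketch: the function name `missing_docs` is made up for this example, and it assumes the files live in the top-level project directory, with or without an extension.

```python
from pathlib import Path

# The five files suggested in the list above.
RECOMMENDED_DOCS = ("README", "INSTALL", "CHANGES", "AUTHORS", "LICENSE")


def missing_docs(project_dir):
    """Return the recommended doc names that have no matching file.

    A file matches when its name without the extension equals the doc
    name, ignoring case, so README, README.md and readme.rst all count.
    """
    present = {
        path.stem.upper()
        for path in Path(project_dir).iterdir()
        if path.is_file()
    }
    return [name for name in RECOMMENDED_DOCS if name not in present]
```

Calling `missing_docs(".")` from the root of a checkout returns a list such as `["INSTALL", "AUTHORS"]` when those two files are absent, and an empty list when all five are present.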

These files should either be authored in Markdown (the default markup language on Github) or reStructuredText (which seems to be preferred in the Python community). Both formats are readable in a terminal as plain text files, but also can be rendered as HTML for better readability.

The advantage of using reStructuredText is that you can use Sphinx which allows you to easily make intelligent and beautiful documentation. And it’s even been used to lay out books!

Once you are using Sphinx, you can then instantly publish the latest version of your docs to Read the Docs, and you can even tell Read the Docs to update your docs every time you make a commit.
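For a sense of how little setup Sphinx needs, here is a minimal sketch of its `conf.py` configuration file, which is itself plain Python. The project name, author and version are placeholders, and the two extensions are just a common starting point that ships with Sphinx, not a requirement.

```python
# conf.py: minimal Sphinx configuration (all values are placeholders)
project = "myproject"
author = "Project contributors"
release = "0.1.0"

# Extensions bundled with Sphinx: pull docstrings from the code and
# add links from the docs to the highlighted source.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",
]

html_theme = "alabaster"        # Sphinx's default HTML theme
exclude_patterns = ["_build"]   # do not treat build output as source
```

In practice you would rarely write this file by hand, since running `sphinx-quickstart` generates one along with a starter index page.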

3) Public issue tracker

As soon as you get your first bug report, the clock starts ticking: how long will it take to reproduce the bug, write a test for it that shows it was fixed, and issue the fix? If this cycle takes too long, the person who submitted the bug may have moved on to a different software package.

If you’re the only maintainer, you can probably get by receiving and responding to bug reports by email, but as soon as you get another person helping you, you need a way to assign the bugs to the appropriate person. Again, if there are just a handful of bugs, an email exchange works fine, but when this reaches dozens or hundreds? What about handling feature requests?

This is when an issue tracker comes in handy, as it lets you keep track of all the bugs and feature requests and manage them in a systematic way, instead of searching through your Gmail inbox trying to find the bug report that someone (whose name you’ve forgotten) sent you weeks ago.

While you’re setting up that Github repo, you can also get an issue tracker. Just click on ‘Settings’ and check the ‘Issues’ checkbox under Features. The Github issue tracker used to be quite useless, but it’s come a long way since it was first announced, and is actually quite good now.

The advantage of having your issues in the same place as your code repository is that everything is in one place, and using Github-flavored markdown you can easily reference issues in your commit messages and even close issues with commit messages (“Fixes #45” – would close issue #45).
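To make the closing-by-commit-message idea concrete, here is a small sketch that finds the issue numbers a commit message would close. It is an approximation written for this post, not Github's actual parser: it only handles the keyword-then-`#number` form shown above, and the function name `issues_closed_by` is made up.

```python
import re

# Keywords Github recognises for closing an issue from a commit message.
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

# A keyword, whitespace, then "#" and the issue number, e.g. "Fixes #45".
CLOSING_PATTERN = re.compile(
    r"\b(?:" + "|".join(CLOSING_KEYWORDS) + r")\s+#(\d+)",
    re.IGNORECASE,
)


def issues_closed_by(commit_message):
    """Return the issue numbers a commit message would close, in order."""
    return [int(number) for number in CLOSING_PATTERN.findall(commit_message)]
```

With this sketch, `issues_closed_by("Fixes #45")` returns `[45]`, while a message that merely mentions `#45` without a keyword returns an empty list.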

4) Mailing list

While the issue tracker is great for handling discrete bugs and well-defined issues, it’s not the best medium for having more involved discussions about the roadmap and other topics that are not immediately actionable. A mailing list provides a forum for users and developers to answer each other’s questions. As an author, that means less work for you!

I usually set up mailing lists using Google Groups because it’s free and there’s little friction in getting people to sign up, since most people already have Gmail accounts. Google Groups archives all the past messages and makes them searchable, so this also serves as a knowledge base for newcomers to find answers to their questions.

5) IRC channel

An asynchronous tool like a mailing list allows people to connect at each person’s own convenience and schedule. An IRC channel doesn’t have that benefit, but it does allow for real-time communication which is incredibly beneficial when you have a burning question at 3 in the morning and it’s business hours in Europe.

I would argue that the free support that I’ve received on IRC channels of open source projects is superior to the majority of paid support I receive when calling a toll-free phone number. Usually I’m communicating directly with the author who can answer my question authoritatively rather than navigating a phone tree to be connected with someone who is reading from a script and never touches the code.

By now you might be asking yourself, “But Nate, do people really use IRC anymore? How would they even begin to install an IRC client?” The good news is that you don’t even need to download anything to use IRC – you can use the Webchat client that Freenode provides. Or for a more featureful web-based IRC client that provides search of the conversations, Leah Culver’s Grove.io provides a private IRC server as a commercial service. Or if you want to go with an open source solution, Stephen McDonald is building Gnotty.

The bad news is that IRC still appeals to more of a technical audience, so you’re unlikely to use it as your main support channel for end users. But you can still benefit from real-time chat by using something like SnapEngage or Olark which are one-to-one web-based chat services. Some open source projects like Jenkins and Anahita are even hosting “office hours” via Google Hangouts.

6) Twitter account

This is such a no-brainer, but I’m still surprised that many open source developers don’t set up Twitter accounts for their projects. If I’m using your software and I like it, I want to tweet about it. I could cc your personal Twitter account, but people might not want to read all of your tweets, just the ones about your software. Give them an easy way to follow the updates about your software.

Even as an open source developer, you’re still trying to sell your project to people. No one benefits from software that dies in obscurity. Twitter is an effective and low cost way to spread the word about your project, and more importantly, makes it easy for others to do the same.

7) Automated tests

When evaluating open source software, there are a number of things that I look at to see if it passes the “smell test”. One of those is automated tests. If I don’t see a tests directory, I’ll assume that the author doesn’t take testing seriously and conclude that the software is likely to have more bugs than software with good test coverage.
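If you have never written one, a test file can be very small. The sketch below uses Python's built-in `unittest` module; `slugify` is a hypothetical function standing in for whatever your project actually does, defined inline here only so the example is self-contained.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lower-case a title and join
    its words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello Open Source"), "hello-open-source")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")
```

Saved as something like `tests/test_slugify.py`, a file like this can be picked up and run with `python -m unittest discover`.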

But it’s not enough to just have a test suite. Are those tests being run on every commit, or at least nightly? Are the tests passing? If they’re not passing, how long have they been failing?

While Jenkins (formerly Hudson) is the darling of open source continuous integration software, there’s a new kid on the block called Travis, and it’s growing quickly. Travis is itself open source, but most developers just use the hosted service because it’s free for open source projects.

If you’ve been browsing around on Github, you’ve probably seen Travis’ status messages reporting whether a given project’s tests are passing or not. With this status message on your project’s Github page, you’re wearing your test results on your sleeve. This is a kind of badge that tells me you care about testing enough to broadcast it.

For functional testing using Selenium, Sauce Labs also offers free accounts for open source projects.

8) Internationalization

One could argue that internationalization is not a must-have, and that may be the case for certain kinds of software, especially developer-oriented software. But if you’re making a product that is for a non-technical end user, it will probably appeal to people all over the world for whom English is not their native language.

Open source projects that are built from the start with internationalization in mind enjoy contributions from translators all around the world who want to localize the software so that it’s more applicable to their customers. Don’t assume that code contributions are the only thing your users can offer – you may wake up one morning to find that your software has been translated into Swahili!

Transifex provides an easy translation management service for free to open source projects. See how Roberto, the author of Mayan, set up his translation config file and manages these translations in English, Arabic, French, Romanian and 12 other languages from the Transifex interface.

9) Package or Deploy scripts

The easier you can make it for me to try your software, the more likely I am to invest the time to use it. If there is a 10 page installation doc, I might decide to use another software package that has a lower barrier to entry. So how do you make your software easy to try? Here are a few ideas:

– Provide it as an easily installable package. If you’re building a Python package, make a proper release to PyPI. Don’t be lazy and make people check it out from your Github repo.

– If you’re distributing software that relies on other packages, make sure to pin your versions so that I can be assured to get the same versions of those packages as you intended.

– Consider providing config files to easily deploy your software to one of the popular PaaS providers such as Heroku, Dotcloud or OpenShift. Read my blog post about how to do this.

– Consider providing a downloadable VirtualBox image or at least a Vagrantfile so that someone can create their own VM. This is especially useful if your software has a lot of system dependencies that will be a chore to install.

A VirtualBox image guarantees that the people evaluating your software are using the same operating system, and with a Vagrantfile you can make a repeatable and reliable installation method.

Roberto Rosario, the author of Mayan, an open source document management system, made a VirtualBox appliance and has had a lot of positive feedback from his users.
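On the version-pinning advice above: for a Python project this usually means a requirements file where every line uses `==`. The sketch below flags lines that are not pinned to one exact version. It is deliberately rough (it treats anything more elaborate than `name==version` as unpinned), and the function name `unpinned_requirements` is made up for this example.

```python
import re

# Matches an exact pin such as "Django==1.4.3".
PINNED = re.compile(r"^[A-Za-z0-9._\-\[\],]+==[^=\s]+$")


def unpinned_requirements(lines):
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Given the lines `Django==1.4.3`, `requests>=0.14` and `Pillow`, it returns the last two, since only the first guarantees that I get the same version you tested with.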

10) License

I’m not a lawyer, but I know that software licensing is a messy business. You want to make it very clear to your users how your software is licensed. There are a lot of open source licenses you can choose from but the most common are GPL, BSD, MIT, Apache or Mozilla.

Have other suggestions? Leave your feedback as a comment below! You can also upvote or downvote the article on Hacker News.
