Open Source Sustainability

This page describes how mmpdb was funded to this point, and the reasons I’m looking to crowdfund future development.

Research funds many open source projects

Open source software in cheminformatics often comes from a research project, typically funded by research grants in academic organizations or by internal R&D funds in industry. Once the project is done or published, the software is released for others to try out.

For example, GlaxoSmithKline funded Jameed Hussain and Ceara Rea to develop the “mmpa” package for matched molecular pair analysis. Their work was published in JCIM and contributed to the RDKit project.

Others benefit by building on open source projects

Even though GSK did not have long-term plans to support mmpa, their contribution meant that others could use the software, study it, and even improve upon it. Roche, for example, wanted a version of mmpa with better performance and with database integration, including property change statistics. The code went through several iterations, resulting in “mmpdb”, which also includes improved fragmentation canonicalization, a method to handle chiral structures, and support for matching the attachment point environment.

It would have been much harder to develop mmpdb without access to the mmpa code.

Roche in turn contributed mmpdb to the RDKit project and published in JCIM. Many people are using it.

What happens when funding ends?

Research funding is a great way to start an open source project, and open source is a great way to let others use the software. But research funding usually ends once the research is done, and neither GSK nor Roche will fund someone to handle bug reports, add new features, and answer support questions for the rest of the world.

Both mmpa and mmpdb were contributed to the RDKit project, which primarily runs on volunteer contributions and indirect funding. There is growing awareness that these sorts of projects can provide significantly more benefits when direct funding pays for their continued development and support.

Direct funding can be difficult

It is often difficult to pay for freely available open source software. Can any of the project members accept money? What if they are employed by a competitor? Would funding distort the social structure of the project? If a company decides to fund a project, does it even have a budget process for paying for something that is available for free?

Consultant model doesn’t work well

It’s much easier for a company to justify paying for new features. For example, if you want to add Postgres support to mmpdb, you can pay the main developer (that would be me, Andrew Dalke) to implement the new feature, and integrate it with the main mmpdb code base so others can use it.

Unfortunately, economics gets in the way. It takes two to three times more work to develop a general-purpose feature than one which only implements the specific requirements of a single client. A general-purpose feature also needs more testing and documentation. Few clients are willing to pay that much more for capabilities they don’t need, even though many others would benefit.

How does proprietary software solve the problem?

Commercial proprietary software solves the economics problem by prohibiting people from using the software unless they have paid the vendor. The same features can be sold many times, and the revenue can pay for the software developers, graphic designers, documentation writers, tech support, support staff, and the many others who might be involved in a complex project.

By definition, open source software cannot prohibit further redistribution, which severely restricts that funding model.

Crowdfunding consortium approach

Instead, the mmpdb crowdfunding consortium will test another approach. The main premise is that people will pay for features, and there are some features that multiple people will want.

If 10 organizations or people are willing to pay EUR 5 000 each for a new feature, that brings in EUR 50 000 to the project. If the feature costs EUR 15 000 to develop, the remaining EUR 35 000 can be used to support the overall mmpdb project, especially the parts which aren’t so easy to sell, like a test suite and documentation for the code which already exists.
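The arithmetic in that example can be written out as a small sketch. The function name `consortium_surplus` and its parameters are hypothetical, introduced only for illustration; the figures are the ones from the example above, not actual commitments.

```python
def consortium_surplus(members: int, fee_eur: int, feature_cost_eur: int) -> tuple[int, int]:
    """Return (total income, amount left after paying for the feature), in EUR.

    Hypothetical helper: assumes every member pays the same fee and that
    the feature has a single fixed development cost.
    """
    total = members * fee_eur
    return total, total - feature_cost_eur


# Figures from the example: 10 members at EUR 5 000, feature costs EUR 15 000.
total, surplus = consortium_surplus(members=10, fee_eur=5_000, feature_cost_eur=15_000)
print(total, surplus)  # 50000 35000
```

With fewer members the surplus shrinks quickly: under the same assumptions, three members would bring in exactly the cost of the feature and leave nothing for tests or documentation.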

The key feature of this approach is that – for accounting purposes – this is identical to a standard software sale. Consortium members pay for a new version of mmpdb, using a standard purchase order sent to a software vendor in Sweden.

They then receive that software under the existing open source license. (Some companies have different budgets for capital expenditures vs. operational expenditures. I can also release the software under a restrictive limited-time license if you really want it that way.)

As new members join the consortium, the additional funding will pay for specific mmpdb improvements. These improvements will be distributed to all consortium members.

If EUR 23 000 in funding is received, then the improvements will be distributed to the public by 1 October 2020. The delay is to incentivize companies to pay for development now, rather than wait for nearly a year.

If EUR 50 000 in funding is received then the improvements will be distributed to the public with no further delay.
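The two funding thresholds can be summarized as a short sketch. It rests on two assumptions that are not stated above: each threshold is read as "at least this amount", and because nothing is said about what happens below EUR 23 000, the sketch returns `None` in that case rather than guess. The function name `public_release` and the constant names are hypothetical.

```python
from datetime import date
from typing import Optional

# Thresholds and date taken from the text above.
DELAYED_THRESHOLD_EUR = 23_000
IMMEDIATE_THRESHOLD_EUR = 50_000
DELAYED_RELEASE_DATE = date(2020, 10, 1)


def public_release(funding_eur: int) -> Optional[str]:
    """Describe when improvements reach the public for a given funding level.

    Assumption: thresholds mean "at least this much funding".
    Returns None below the lower threshold, since that case is not specified.
    """
    if funding_eur >= IMMEDIATE_THRESHOLD_EUR:
        return "no further delay"
    if funding_eur >= DELAYED_THRESHOLD_EUR:
        return f"by {DELAYED_RELEASE_DATE.isoformat()}"
    return None  # not specified on this page


for amount in (10_000, 23_000, 50_000):
    print(amount, public_release(amount))
```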

The deadline for joining the consortium is 1 February 2020. Join now!