Open Infrastructure at Open Privacy
05 Oct 2018
Open Privacy believes in using and producing open source solutions and open infrastructure. Many values contribute to our belief and mandate for open solutions including preferring ownership to rent, preferring customizability and control, and desiring to share our work and results with the world. This post describes our process for building our infrastructure in the first half year of starting out as a nascent non-profit.
Even non-technical non-profits require technical infrastructure like communication and document storage. However, as a technology-researching and producing non-profit, we needed quite a lot more. Given the sensitive nature of our work (especially with marginalized communities!), security is a paramount concern for us in all we do. Self-hosted solutions usually allow greater control and security, and for that reason are preferred where possible over 3rd-party cloud options.
Balancing against these were the usual counter-weights:
Easy to set up: Spending our own time on infrastructure work versus actual Open Privacy work is an important balance to strike. Being a small organization, we want to focus the majority of our limited capacity on what we founded Open Privacy for: research and development. Time spent on infrastructure takes away from that work, so we have to weigh each necessity against the time it costs.
Ease of use: There is an unfortunate trend (though not a rule by any means) that open source and self-hosted solutions have poor usability compared to their peers.
Practicality: The solution has to be practical, and fit within the growing infrastructure framework we are building.
3rd Party Solutions
We struck compromises we were comfortable with. We “bought” a domain name (with a yearly renewal cost, “rent” seems more accurate), as there really is no other option. We chose to use 3rd party DNS tools to manage it because self-hosted DNS with BIND, while doable, is a hassle, isn’t exactly user-friendly, and doesn’t really contribute much else. We opted for a 3rd party email hosting service because email and security aren’t ever going to be reconcilable, so for us email is more a means of traditional notifications and non-secure communication for easy public contact. For internal secure communications there are better options, including Signal and Wire. Finally, we chose to rent cloud servers to host the rest of our infrastructure, because running and colocating hardware ourselves just isn’t a good use of our time or funds.
Self-Hosted Open Source Solutions
For website hosting we picked nginx. Apache and nginx are fairly comparable feature-wise and both are mature at this point. However, nginx is a bit lighter, a bit more flexible, and more popular in general these days.
On top of that, we used Let’s Encrypt for free SSL certs (you can get these from the command line with the certbot utility, which can then reconfigure your web server to use them on the fly). Finally, we utilized Mozilla’s SSL Observatory to fine-tune our web server configs for better security.
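Because Let’s Encrypt certificates are short-lived (90 days by default), it can be handy to keep an eye on how long a deployed certificate has left. The following is only a minimal sketch using Python's standard library, not part of our actual setup; the function names `days_until_expiry` and `fetch_not_after` are made up for illustration.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by ssl.getpeercert()["notAfter"],
    for example "Jan  5 09:34:43 2019 GMT". `now` must be timezone-aware.
    A negative result means the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443) -> str:
    """Connect to host:port over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `days_until_expiry(fetch_not_after(host), datetime.now(timezone.utc))` for one of your own hostnames gives the remaining lifetime, which could feed a simple alert if automatic renewal ever stops working.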
We used Jekyll as a static site generating framework. We opted for a statically generated site because such sites are fast, light, and leave us with a minimal attack surface when the site is just HTML files. We picked Jekyll as it is the most mature and popular of the current generation of static site generators.
For document storage, we selected Owncloud as it is the leading open source, self-hosted document storage and syncing option. The only other contender is Nextcloud, a fork of Owncloud that is slowly starting to distinguish itself.
Code / Issue / Wiki
For code, issue, and wiki hosting we chose Gogs, which is a lighter-weight solution than GitLab. Both are good open source, self-hosted code storage and issue tracking solutions.
For mailing lists, we picked mailman, the classic and still popular solution.
Build and CI Server
For a build and CI server we selected Drone CI which is light, Docker based, written in Go, modern, and allows self-hosting.
Tragedy of the Commons in Ops
Contributing back to the field of infrastructure is an important part of our mandate as an open source, non-profit, R&D organization. “Reproducibility” is a key part of science, and contributing back material that others can build on top of is a core part of open source. So many large companies endlessly duplicate effort internally, having each team repeat the same setup and maintenance work, rather than contributing back solutions that would save everyone time. This sometimes leads to the feeling that parts of the infrastructure and ops field are still surprisingly unpolished, as almost no one works on them while everyone uses them. Some “industry standard” tools used in ops for dev teams feel surprisingly unfinished, to the point of leaving you writing shell script. Viewed across the industry, a massive amount of duplicated effort is wasted on working around these tools as-is, rather than contributing fixes upstream to improve them. In some cases, large companies do have internal tooling they have built that may be better, but it’s kept proprietary and in-house so no one else can benefit from it. This is what led us to pick the newer and more actively developed build and continuous integration solution, Drone CI, over the more tried-and-true industry standard, Jenkins, which still requires you to write build steps in shell script (or lose options and functionality by using its DSLs).
Open Privacy and Open Infrastructure
We are somewhat experimental as an organization, as there are not many research-and-technology producing non-profits (the Tor Project stands out as one of the few examples). Thus we are documenting our process where we can, to make it easier for anyone in the future to follow this path. This is why we are discussing our technical setup/choices and their reasons here and, where appropriate, documenting the process and sharing it. When we build new tools, we want to share them (as a core part of our mandate!) to move the field forward. This is all part of what we mean when we say we are an “open society”. To this end, we have created an Open Infrastructure section on our webpage listing work we have done and contributed back. Currently it includes:
- a more comprehensive write-up of our mailman setup, focusing on how to get mailman to use our 3rd party (rather than self-hosted) email infrastructure
- links to two Docker images we have created to work with Drone CI:
  - Drone-Gogs, which posts updates to Gogs PRs about build status
  - Android-Go-Mobile, which is a build environment for Java, Android, Go, and Go mobile, the full toolchain we are using to build the Android Cwtch application
As a final, more personal note, it is a great pleasure for me to be contributing back to infrastructure and ops. I’ve had jobs before that said “open source” was an important value to them, but on the occasions where I made or enhanced tooling internally and requested to contribute it back to the larger community, I was denied, either directly or through a process that was ill-defined and never meant to succeed. This environment is discouraging and stifling for ops improvement work, leaving developers to just survive with the existing state of tooling. It is toxic for developers. So I am very glad to be part of Open Privacy, where I can do this work and release it. It fills me with more energy to continue. I wish for a world where all developers are free to release their ancillary and ops work.
Director of Engineering, Open Privacy Research Society