Stochastic Confidence and the Open Source Network Effect

What has powered open source to become part of almost all software and drive nearly €100 bn of GDP in Europe? Reuse, yes. But that was always possible. Collaboration, definitely. But repositories existed for years before open source was coined in 1998. The software freedom philosophy. Absolutely, but that went 15 years without triggering a software revolution. I suggest it’s something less measurable and observable — developer confidence — and that the effects involved are stochastic, not deterministic.

A bird soars above the greyness over water in the Everglades with the water, mist and sky creating bands of greyness as if devising a scale

The Future Promise Of The Fediverse

There’s an old story about someone in the dark feeling the trunk of an elephant and believing it’s a snake because they can’t see the whole animal. It’s happening again, as people spooked from the Twitter crash try to feel their way around the Fediverse.

A beige room filled with connected globes made from translucent panels held in place by wires.

One of the benefits of the Fediverse is that I can use my preferred system to post things and you can follow and interact with any ActivityPub-compatible system you prefer. Your choice of, say, a photo-sharing platform doesn’t dictate that I have to sign up to the same site, or even to another photo sharing thing. It’s all powered by the ActivityPub standard – which is like RSS you can reply to. But there’s the potential to end the reign of monetized surveillance (AKA advertising) with a switch to user-owned applications.

No platform virality

If I were posting my photos to Instagram, to follow them you would have to sign up too (and since that’s Facebook-owned, submit to all their monetized identity harvesting). But if I post with PixelFed – an ActivityPub system tailored to posting photographs like on Instagram – you can follow from a compatible photo tool for sure. But you can also follow – and comment – from micro-blogging systems like Mastodon or Pleroma or from video-sharing systems like PeerTube or a blogging tool like Plume.

Yes, you have to join the Fediverse somewhere, but you can do it the way that’s comfortable on a platform that shares your values and still interact with people who made different choices, and once you’ve done it you can follow any feed regardless of the platform it’s from. It’s the end of platform virality and lock-in. It means every small app can benefit from a network effect previously only available to gatekeeper platforms.

This is the most important dimension of the Fediverse, and the one we need to develop. We need ActivityPub federated software tools of all kinds, cutting the link between my choices and your choices without also cutting our ability to interact with each other.
I never want to have to leave my social graph behind again.

Composable applications

This detachment goes further. I can segment my posting and use a more appropriate tool for specific content types and interaction styles. For example, I have been putting my travel photos on my new PixelFed server so that followers have the choice of following my micro-blogging feed on Mastodon, my photo feed on PixelFed or indeed both.

This means I don’t have to wait for my microblogging tool to get better support for posting photos; instead I can mix and match tools and build the ideal creative environment for me, and you’re not affected beyond needing to follow me in more than one place. Over time this will get fixed and I’ll be able to offer an aggregated subscription to all my feeds – it just needs someone to write a gadget to do it!

Of course, there’s much more to it than this. Since ActivityPub has two layers, a client-to-server layer and a server-to-server layer, there is great scope for wiring composable applications together so they collaborate better. And then there’s the privacy dimension – I especially like Christine Lemmer-Webber’s OCapPub ideas. I’m sure we will see much innovation both in creating user capabilities and in managing infrastructure needs. Because pretty much everything in the Fediverse is open in every sense, there is plenty of scope for relays and clients to layer fresh capabilities upon the activity stream. It’s the UNIX philosophy revisited.
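To make the two layers a little more concrete, here is a minimal sketch of how I understand the flow: a client hands a bare Note to its own server (client-to-server), the server wraps it in a Create activity, and then prepares one delivery per follower inbox (server-to-server). The example.org-style URLs, the function names and the "/followers" address pattern are all assumptions made for illustration, and nothing here touches the network or signs requests as a real server would.

```python
import json

# The ActivityStreams 2.0 JSON-LD context used by ActivityPub documents.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_note(actor: str, text: str) -> dict:
    """Client side: build the bare Note a client would send to its outbox."""
    return {
        "@context": AS_CONTEXT,
        "type": "Note",
        "attributedTo": actor,
        "content": text,
        # Assumption: the followers collection lives at <actor>/followers.
        "to": [f"{actor}/followers"],
    }


def wrap_in_create(note: dict) -> dict:
    """Server side of client-to-server: wrap a bare object in a Create."""
    return {
        "@context": AS_CONTEXT,
        "type": "Create",
        "actor": note["attributedTo"],
        # The embedded object does not need to repeat the context.
        "object": {k: v for k, v in note.items() if k != "@context"},
        "to": note["to"],
    }


def plan_delivery(activity: dict, follower_inboxes: list[str]) -> list[tuple[str, str]]:
    """Server-to-server: pair each follower inbox with the JSON body to send."""
    body = json.dumps(activity)
    return [(inbox, body) for inbox in follower_inboxes]


if __name__ == "__main__":
    # Hypothetical actor on a photo server, followed from two other platforms.
    note = make_note("https://photos.example/users/alice", "Sunrise over the bay")
    activity = wrap_in_create(note)
    inboxes = [
        "https://micro.example/users/bob/inbox",
        "https://video.example/accounts/carol/inbox",
    ]
    for inbox, body in plan_delivery(activity, inboxes):
        print(inbox, len(body), "bytes")
```

The point of the sketch is that the same activity body goes to every inbox regardless of what kind of application sits behind it, which is where the room for composing tools comes from.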

Open Source and standards done right

This is all powered by the dual merits of Open Source software and truly open standards. ActivityPub is a freely-available, royalty-free W3C standard. All the systems that manipulate it to date are Open Source software, which anyone can enjoy without asking permission first. Together that openness has fueled the wave of change triggered by the collapse of Twitter. But there is much more to it than that.

I’ll not tell you that calling the Fediverse “Mastodon” is a mistake (even if it is!) but I do recommend looking beyond the obvious similarities of Mastodon to Twitter and realizing that the phenomenon it is riding is not only bigger than a single piece of software, it’s bigger than a single category of software. Federation will get smarter and more secure, and new categories of activity will be added. This is not so much an elephant in the dark as a whole zoo in the dark, and we’ve only touched the first few animals.


Akka Ratchets The Rights Away

There’s been a regrettable rise in the number of projects switching to “fauxpensource” proprietary licenses. In the main, this is the inevitable consequence of the rights-ratchet model running its course, and it reflects the growth of open source ten years or so ago, since ten years is the model’s typical life-cycle. The rights-ratchet model offers open source freedoms in the initial years of a product to secure adoption and market acceptance and then gradually removes their viability from customers as the company seeks to control its ecosystem and increase revenue.

So the license change that Lightbend applied to its Akka product to end its open source status was, in retrospect, more probable than not given the available evidence that they were using a rights-ratchet business model, just like Elastic and others before them.

Signs that together seem a clear warning include:

  • VC backing includes VCs who have previously advised portfolio companies to ignore the community rather than leaving money on the table.
  • Used a Contributor Agreement despite also using a license entirely suitable for use without one.
  • Change of CEO recently saw the departure of a respected open source leader who had been at the helm during the community-building years.
  • The web site does not mention open source as a customer benefit.
  • The typical 10-year cycle of the rights-ratchet model from open source to proprietary was nearly up.

Those familiar with the rights-ratchet model will undoubtedly have been preparing for the switch. Anyone else may be surprised by it – this time.

Are bug bounty programs good for Open Source?

Seems obvious doesn’t it? They have to be a good thing surely? They are paying people to work on Open Source software – isn’t that increasing the sustainability of open source?

Stag beetle emerging from elder blossom

Except the money isn’t going to open-source development; it’s going to a cottage industry of hacking that will likely sell the defects to the highest bidder. Even if the highest bidder is part of a white-hat bug bounty program the community that owns the code may well not have members who can fix the problem readily. Even if the community has got commercial contributors they may not have the income necessary to fix all these bugs found by people who are not buying their product.

Even assuming they want to (screenshot):

Bugs and Knowing

It’s good to have bug reports and know about defects, but there can be down sides. If the defects are exploitable vulnerabilities – CVEs – the reports are kept on a private security list until fixed to avoid informing black-hats. But in projects that can’t fix the defects fast enough they may well end up being disclosed before there is a resolution.

They also show up on CVE databases and become a cause for support calls and demands for resolution from volunteer community members, increasingly so as automatic scanning tools spread. Some of the bugs are rejected by the developers who own the software because in some cases being a bug or a feature is truthfully a matter of opinion, resulting in “will not fix” false positives.

A case can be made for bug bounty programs run for a company seeking low-cost penetration testing to be handled by their professional development team for a commercial product. It can also be made for an administration seeking community testing of public systems, like the Swiss government. But community-focussed programs like the European Commission’s Open Source bug bounty are well-meaning but surely miss their mark, creating work for communities and their commercial backers without generating donations or revenues to pay for bug resolution.

Who Can We Fund?

Specifically to that example, there’s nothing in the EU program that actually leads to any maintainers being paid, which is because that’s “software procurement” and has stringent rules. There’s a 20% uplift for bug reporters who propose fixes for their bugs, but the on-ramp for maintainers of the targeted code is usually substantial and a 20% premium is unlikely to be enough to incent someone to come on board as a maintainer.

The EU has built outbound funding programs for Open Source at NLnet and NGI, but they are all earmarked for “the next big thing”, not for the boring job of keeping Open Source maintained. Edgy new features have indeed been paid for by them, but they have no regard for funding bug-bounty burdens. One fix would be to associate an earmarked grant with a matched bug bounty award so the maintainer could go claim it, giving a concrete incentive to invest the time.

Ideally, the grant to the reporter would be at least partly tied to the bug getting fixed as well. But even if the community has members able and willing to take on bounties, haggling with one or multiple unknown parties over acceptance of a solution after the investment has been made is a huge risk for small rewards for implementors.

Conclusion

In the sense they reflect a growing realisation by software consumers that strip-mining Open Source is not sustainable, bug bounties are definitely welcome. But they are not the solution to creating sustainability. Open Source in the supply chain is not mainly or even largely about security. Rather, it has the same profile as Open Source elsewhere – a collaborative matter where beneficiaries share the sunk costs in proportion to the benefits gained.

Bug bounties prioritise the non-contributor’s worldview – the quality of the strip-mined commodity – and neglect the true community view – pooled innovation and shared costs. They are a good first step, but need to be rapidly followed by enlightened self-interest expressed by funding and enabling of the maintainers rather than just rewarding their users.

The Future Of Innovation Has Patent-Free Standards

It may come as a surprise to find that some supposedly “open” standards – including those ratified by standards development organisations (SDOs) like ISO, CEN and ETSI – can’t be implemented without going cap-in-hand to the world’s largest companies to buy a licence. As I explained for OSI, it’s the result of a legacy approach to innovation from the days when it was only really about hardware.

As with any legal loophole, simply existing meant it was exploited and became the norm, even if it was initially temporary (“like income tax”). Once exploitation of a legal loophole becomes competitive, it becomes its own justification for the existence of the regulations (“look at the economic value of this segment”) and they become near impossible to remove – even when the original justification has ceased to need preferential protection.

So today we see a swathe of rich consumer electronics and telecoms companies, addicted to the revenue they get from licensing the patents (SEPs) they have embedded in “open” standards*, lobbying hard to ensure their value to the economy is recognised. They have much to lose from the loss of their special status, so invest much to protect it and to glorify it.

On the other hand, software companies have less to gain by the reformation of this anachronism – to the extent they have flirted with SEPs, maybe even a little to lose. Meanwhile, the new world of Open Source powered innovation lacks rich lobbyists due to its diffusion, and is accustomed to working round the obscenity of valuable standards being taxed by patent cartels. While the freedoms of Open Source mitigate to a degree, this means interoperability and interchangeability are being sacrificed on the altar of SEP protection.

It is not an ideological outlook that makes thoughtful Open Source advocates oppose patents in standards. It’s primarily pragmatic. Requiring a patent license to implement a standard implies that those implementing it must engage in private negotiation to get a license to proceed. That’s super-toxic to Open Source, whose mainspring is code owners giving advance, un-negotiated, equal permission to enjoy the software in any way – use, improve, share, monetise – all protected by a rights license reviewed and approved by OSI. So most projects avoid or work around SEP-encumbered standards and the ones that don’t are industry-specific.

OSI thus takes the position that standards destined to be implemented as Open Source must come with all the rights waived (and has done so for 15+ years). For some, that is already true; for others it is being actively resisted. If you want the crop of innovation you have to get the growing conditions right, and this new crop has different needs to the old hardware world and its long horizons. The future of innovation is open innovation, implemented as Open Source. Using anachronistic patent-centric metrics and regulations will chill that future. How about we don’t do that?


*Reusable Footnote: The word “open” is overloaded here.

(An edited version of this article appeared in the OpenUK survey report 2022)

Briefly: FRAND Is Toxic To Collaboration

I’ve repeatedly heard lawyers arguing about whether Open Source licenses and FRAND terms are compatible. But ultimately it doesn’t matter, because the toxin remains whatever the answer – legal compatibility is the wrong lens. When developers come to an Open Source project, they need to find a level playing field, a uniform surface with no traps, a fully illuminated environment with no shadows. Without them, collaboration is compromised.

But the presence of a standard with embedded patents (standard-essential patents or SEPs) under so-called “Fair Reasonable and Non-Discriminatory” (FRAND) terms introduces inequality. Some developers believe they are unaffected, because their usage is purely personal or they are poorly advised. Others are unconcerned because their employer is part of a cross-licensing cartel with the patent holder. But the remainder must each go privately and under NDA to the patent holder(s) and negotiate individual terms to use the patents. They then can’t publicly share the exact arrangements — or possibly even the existence of the arrangement — because of the NDA. Individual terms and secret rights are the opposite of open collaboration and destroy trust.

It’s this inequality that is toxic, not the precise compliance with the legal terms in the Open Source license. Whether great legal minds find the presence of SEPs compatible or incompatible with the license, the inequality of the participants in the community is what makes it avoid SEP-laden standards. That’s why the Open Standards Requirement for Software says any SEPs have to be waived or freely licensed in advance – to restore the level playing field. It’s not because of ideology or an anti-patent agenda or an attempt at market manipulation. The open source network effect underlying the market depends on it.

So learned dissertations about the compatibility of FRAND terms with Open Source licenses may be academically interesting, but they aren’t relevant. All SEPs in standards intended to be implemented as Open Source must be waived or freely pre-licensed, or the standard won’t be implemented by open communities.

Briefly: On Overloading “Open”

The word “open” is overloaded. In the domain of standardisers, a process that permits any company to participate (even if doing so is punitively expensive) is considered “open” and the resulting deliverable is considered an “open standard” even if you have to pay to read it and negotiate patent licenses to implement it.

In the domain of software and APIs, it is the deliverable that has to be open – usable for any purpose without negotiation with its rights-holders. This overloading of the term is the origin of many of today’s issues, since – properly understood – Open Source and open standards are conceptually orthogonal.

This variation in how “open” is understood within linked and overlapping domains is why “Open Source” is treated as a term of art with a consensually-agreed meaning in the domain of technology – a noun – and not as a descriptive adverbial phrase. If you see a hyphen in the middle of open-source it’s about military/political intelligence and not technology.

AI Code Is Like Public Domain Code

GitHub’s Copilot tool may well be revolutionary, according to Bradley Kuhn. An AI trained by reading a massive and unidentified corpus of code, assumed to mostly be open source and licensed for any use to GitHub under their terms of use, it is able to watch what you are coding in your IDE and make suggestions on how to autocomplete the code – potentially at length. It is a kind of Clippy for code. It has just had the ultimate validation; Amazon copied it.

Spitfire in Guildhall Square, Southampton (ironically with no space for a co-pilot)
No room for a co-pilot

Sure, quit GitHub

While that may seem an unalloyed good to many programmers, there is an outbreak of moral panic surrounding it, as evidenced by the recent call to boycott GitHub because of it. Now, I am all in favour of people using distributed tools instead of centralised ones. Git itself is intended as a distributed tool and in a way it’s offensive for GitHub to have annexed its name to create a centralised and proprietary control point.

I am also keen for everyone as far as they can to exercise self-sufficiency over their computing and control of their personal data, and given Git was written as a response to the final abridgement of that self-sovereignty by the author of an earlier tool that the Linux developers were dependent on, GitHub is again somewhat offensive. Those would both be fine reasons to encourage people to move on from GitHub and to escape the social honeypot of carefully crafted network effect funnels that it embodies.

… but not because of Copilot

But Copilot is not a great reason to quit, or at least not for the reasons people insist on articulating. Those reasons seem strong on copyleft maximalism and the homeopathic thinking that assumes because there was GPL vapour in the air everything written at the time is infused. They also seem laced with a residual mistrust of Microsoft.

  • Copilot is unlikely to be infringing copyright. Certainly not in the USA. Probably not in most other places (although see Brown for more nuance). Even for humans, learning patterns doesn’t infringe copyrights, and quoting minimal or essential fragments rarely rises to the level needed for protection by copyright. Copyrights are not the same as patents, and re-expressing the same idea does not amount to infringement – even if such infringement were possible for a machine. Which it is not, so all these considerations are moot in many jurisdictions.
  • Copilot is unlikely to be breaching the GPL. That could only happen if copyright was being infringed. Just because the author of a work doesn’t like use of their code by Microsoft’s tool, that doesn’t somehow create an infringement that triggers the license.
  • Copilot is not morally bankrupt for using open source code for training. The whole point of Open Source Free Software is to give anyone the unconditional right to study the code and learn from it. If that’s via an automated tool that makes the matter more efficient, it makes no difference.

Making a new thing that does the same as my patented widget is always an infringement of my patent, but making a new thing that does the same as my copyrighted code is not. An unfortunate consequence of the propaganda term “Intellectual Property” is that non-specialists munge all the concepts for all of {Copyright, Patents, Trade Secrets, Trademarks, Database rights} into one big hairball and assume anything matching the hairball triggers some form of infringement of any/all of the concepts. So arguments that mix-and-match IP concepts to imply an infringement are … problematic.

You shouldn’t use it for Open Source though

AI code helpers like Copilot are thus very unlikely to infringe rights per se. But that doesn’t mean code made by them should be welcome in Open Source projects.

To summarise a long article, Reda concludes that the output of an AI like Copilot is best understood as Public Domain. But ironically, that’s the real problem with Copilot for an Open Source developer. Public Domain is not Open Source, and AI-generated code introduces friction that works against the Open Source network effect for just the same reasons. As Brown explains, not every jurisdiction has the same degree of certainty or the same attributes to its conclusion about AI-generated works as seems commonly understood in the USA.

So while you may feel comfortable using AI-generated blocks in your code, what will you write in the pull-request to give others the same confidence? Even GitHub (and indeed Amazon) are at pains to point out that’s your responsibility, not theirs. Their tool may be a very helpful learning aid, but it’s something of a trap for the responsible Open Source contributor.

There’s a different case to be understood in every jurisdiction both about the code origin and the threshold for copyrightability. While the (many) lawyers I have heard from have largely waved a hand and said the arguments would never stand up in court, the arguable cases create a context where a community can’t rely on AI-generated code without further advice. Just like Public Domain, that added friction makes it non-viable for any community serious about provenance.

The biggest challenges are the ones exerting subtle, systemic steering effects that people don’t take seriously. GitHub may not be a digital scofflaw, but their tool is a Siren tempting you onto rocks that can ruin communities.

(Thanks to the Patreon backers who made this post possible)

Why we all need AlmaLinux

… whether we use it or not!

The AlmaLinux OS Foundation continues to make license-compliant releases of a fully RHEL-compatible Linux distribution within one or two days of Red Hat’s releases. This (and indeed any independent downstream of RHEL) is actually good for everyone, including Red Hat. The most recent release, AlmaLinux 9.0, appeared within a week of the release of RHEL 9.

AlmaLinux 9 Splash Screen

It validates Red Hat’s good faith

The AlmaLinux community validates that Red Hat’s product is indeed a true, forkable Open Source project and not a bad-faith hack like some other self-described open source products (for example, Forgerock, who appear to actively engineer their code to be unforkable by failing to document which parts are proprietary and which are just the CDDL-licensed Sun/Oracle code they took, and by failing to provide tools for debranding).

It provides those who need a self-maintained Linux with something that has an off-ramp

Not everyone wants Red Hat’s subscription. Some Linux users – notably in the cloud hosting market – are happy to self-support and have the skills and resources to do so. They could base their work on Debian or another distribution, but as a RHEL downstream their customers retain a freedom of choice of support provider, including being free to switch to Red Hat at any time.

It creates an on-ramp for RHEL

Red Hat benefits from the growth of its adoption base, as users of downstream distributions can and do become customers.

It creates a no-negotiation zone for innovative hacking

Some users need access to RHEL for skunkworks hacking that does not affect their licensing accounting under their Red Hat agreement.

This flexibility used to be included within Red Hat’s licensing universe but a hacker at a hedge fund on Wall Street ruined things for everyone by gaming Red Hat’s original trust in their customers and using a single licence to support an entire company. Red Hat was forced to reword its customer agreement to embrace all systems running RHEL.


  • Disclosure: I am a director of the AlmaLinux OS Foundation and its founder CloudLinux is a client. This article represents my own opinion and is in no way endorsed by either entity.

Legally Ignoring The License

Perhaps the biggest current challenge to open source software is companies which ignore open source software licenses. That sounds so “yesterday” from an era of license scanners and compliance scares. But the issue is as relevant today as it was 20 years ago – just not the way you think!

Contributor agreements have been a controversial topic throughout their history. The choices by Elastic (and others before them) to relicense previously open source software under a licensing arrangement that discriminates against certain users threw the use of contributor agreements into sharp relief. But the controversy around them focused too much on the wrong problem. The main problem with a copyright-assigning agreement is not it giving the right to the aggregator to relicense the work (although that is a problem as it enables the end game of a rights ratchet). The main problem is it allows the aggregator, uniquely in the community, to ignore the license altogether.

A Brief History Of Scareware

All Open Source licenses grant unconditional permission in advance to those who comply with their terms to use, improve and share the software in any way and for any purpose. At a stroke, scope for artificially making the (inherently non-rivalrous) software scarce is eliminated. Of course, that’s a serious problem if you’re an entrepreneur whose imagination only extends to directly monetising access to the software.

Right from the start wily entrepreneurs realised that Copyleft licenses scared and confused some people, especially lawyers. So they sold customers the right to replace the open source license with a proprietary one – for some reason something customers’ lawyers found less scary. The pioneer of this approach was probably Sleepycat Software Inc whose BerkeleyDB embeddable database came under a source-available arrangement that left their users in no doubt that they had to make their own private work available to the public. Sleepycat sold a “commercial use” license that didn’t have the same requirement but which also left the user with none of the four freedoms. Selling indulgences had been profitable in the middle ages and it also worked for Sleepycat, all the way to acquisition.

Inspired by that success, many other companies sold indulgences. As the market wised up to the GPL and corporate counsel was no longer scared by it, companies transitioned to using other scareware licenses such as AGPL as well as to using “open core” approaches where the commercially valuable functions were not in the open source code at all. By using the no-charge availability of the software to gain adoption, free adopters could be converted to paying customers and ultimately to lock-in. Some users of this strategy – notably SugarCRM – were able to ratchet back the freedom over time until they had an old-style proprietary software business.

Controlling The “Community”

However, there was an inconvenience. For much software, gaining adoption meant persuading cautious, picky developers to use the code — hand-waving to the boss was no longer enough. Once they were using the source, developers might well improve it. Inspired by the likes of Apache and Mozilla they then might well share their improvements, thus forming a community to produce better code than you. So it was smart to invite and use their improvements and thus keep control of the community.

But then the presence of these contributions under the GPL would make you subject to the GPL yourself and unable to sell indulgences or ignore the license. The fix to this was to speak to the sense of fairness and desire for an easy life (and pleasure of recognition) and claim it was in everyone’s interests for the core company to own all the copyrights. Apache and the FSF helped things along by socialising the idea of copyright assignments (see Footnote 1). All these factors led developers to agree to gift the IP rights to their work to the core company. The name of such a document is a contributor license agreement or CLA.

Once they have a CLA, a company is able to aggregate all the rights to the software as if they own it. This has several consequences for the project:

  • They can sell indulgences, so that some community members are able to ignore the license.
  • They can ignore the license as well, enabling open core models that could otherwise be impossible.
  • They can do secret deals with other companies to treat the code as their own or even sell the complete rights, including to a company that actually wants to end the project. Because they can act secretly they can potentially preempt forks.
  • They can make releases without community consensus, making it impossible for peers to join in.
  • They can change the license by fiat, including to one that harms bona fide contributors they want to disadvantage.
  • They can end the public project completely, as SugarCRM did.

Socially Unacceptable

An open source licence is a multi-lateral constitution of a community, setting norms that apply equally to all. Having every developer and user subject to the same terms is one of the pillars of community. A copyright assignment provides unqualified and unappealable immunity to all that. The presence of one in a commercially-backed project is almost certain to mean someone doesn’t want to be subject to the rules and norms everyone else must abide by, usually as part of a rights ratchet. They and their sham freedoms should no longer be tolerated by open source contributors.


Footnote 1: In both cases the CLA is – at best – marginal to the community. At Apache, the CLA is redundant with section 5 of the Apache License which many people believe grants all the rights the community needs. Folklore at Apache says that IBM’s lawyers were not sure of that and just to be certain insisted there be a CLA as well. At the FSF, the CLA is also redundant with GPLv3 (and likely with GPLv2 as well) but it has long argued that the FSF needs to own the copyrights in the USA in order to pursue license compliance — even though they don’t do so much and the surrender of copyright reduces the ability of the actual developers to choose to enforce. Both are frequently cited by abusers as justification for their actions.