Briefly: On Overloading “Open”

The word “open” is overloaded. In the domain of standardisers, a process that permits any company to participate (even if doing so is punitively expensive) is considered “open” and the resulting deliverable is considered an “open standard” even if you have to pay to read it and negotiate patent licenses to implement it.

In the domain of software and APIs, it is the deliverable that has to be open – usable for any purpose without negotiation with its rights-holders. This overloading of the term is the origin of many of today’s issues, since – properly understood – Open Source and open standards are conceptually orthogonal.

This variation in how “open” is understood within linked and overlapping domains is why “Open Source” is treated as a term of art with a consensually-agreed meaning in the domain of technology – a noun – and not as a descriptive adverbial phrase. If you see a hyphen in the middle of open-source it’s about military/political intelligence and not technology.

AI Code Is Like Public Domain Code

GitHub’s Copilot tool may well be revolutionary, according to Bradley Kuhn. An AI trained by reading a massive and unidentified corpus of code, assumed to mostly be open source and licensed for any use to GitHub under their terms of use, it is able to watch what you are coding in your IDE and make suggestions on how to autocomplete the code – potentially at length. It is a kind of Clippy for code. It has just had the ultimate validation: Amazon copied it.

Spitfire in Guildhall Square, Southampton (ironically with no space for a co-pilot)
No room for a co-pilot

Sure, quit GitHub

While that may seem an unalloyed good to many programmers, there is a good deal of moral panic surrounding it, as evidenced by the recent call to boycott GitHub because of it. Now, I am all in favour of people using distributed tools instead of centralised ones. Git itself is intended as a distributed tool and in a way it’s offensive for GitHub to have annexed its name to create a centralised and proprietary control point.

I am also keen for everyone to exercise, as far as they can, self-sufficiency over their computing and control of their personal data. Given that Git was written as a response to the final abridgement of that self-sovereignty by the author of an earlier tool that the Linux developers were dependent on, GitHub is again somewhat offensive. Those would both be fine reasons to encourage people to move on from GitHub and to escape the social honeypot of carefully crafted network effect funnels that it embodies.

… but not because of Copilot

But Copilot is not a great reason to quit, or at least not for the reasons people insist on articulating. Those reasons seem strong on copyleft maximalism and the homeopathic thinking that assumes because there was GPL vapour in the air everything written at the time is infused. They also seem laced with a residual mistrust of Microsoft.

  • Copilot is unlikely to be infringing copyright. Certainly not in the USA. Probably not in most other places (although see Brown for more nuance). Even for humans, learning patterns doesn’t infringe copyrights, and quoting minimal or essential fragments rarely rises to the level needed for protection by copyright. Copyrights are not the same as patents, and re-expressing the same idea does not amount to infringement – even if such infringement were possible for a machine. Which it is not, so all these considerations are moot in many jurisdictions.
  • Copilot is unlikely to be breaching the GPL. That could only happen if copyright was being infringed. Just because the author of a work doesn’t like use of their code by Microsoft’s tool, that doesn’t somehow create an infringement that triggers the license.
  • Copilot is not morally bankrupt for using open source code for training. The whole point of Open Source Free Software is to give anyone the unconditional right to study the code and learn from it. If that’s via an automated tool that makes the matter more efficient, it makes no difference.

Making a new thing that does the same as my patented widget is always an infringement of my patent, but making a new thing that does the same as my copyrighted code is not. An unfortunate consequence of the propaganda term “Intellectual Property” is that non-specialists munge all the concepts for all of {Copyright, Patents, Trade Secrets, Trademarks, Database rights} into one big hairball and assume anything matching the hairball triggers some form of infringement of any/all of the concepts. So arguments that mix-and-match IP concepts to imply an infringement are … problematic.

You shouldn’t use it for Open Source though

AI code helpers like Copilot are thus very unlikely to infringe rights per se. But that doesn’t mean code made by them should be welcome in Open Source projects.

To summarise a long article, Reda concludes that the output of an AI like Copilot is best understood as Public Domain. But ironically, that’s the real problem with Copilot for an Open Source developer. Public Domain is not Open Source, and AI-generated code introduces friction that works against the Open Source network effect for just the same reasons. As Brown explains, not every jurisdiction has the same degree of certainty or the same attributes to its conclusion about AI-generated works as seems commonly understood in the USA.

So while you may feel comfortable using AI-generated blocks in your code, what will you write in the pull-request to give others the same confidence? Even GitHub (and indeed Amazon) are at pains to point out that’s your responsibility, not theirs. Their tool may be a very helpful learning aid, but it’s something of a trap for the responsible Open Source contributor.

There’s a different case to be understood in every jurisdiction both about the code origin and the threshold for copyrightability. While the (many) lawyers I have heard from have largely waved a hand and said the arguments would never stand up in court, the arguable cases create a context where a community can’t rely on AI-generated code without further advice. Just like Public Domain, that added friction makes it non-viable for any community serious about provenance.

The biggest challenges are the ones exerting subtle, systemic steering effects that people don’t take seriously. GitHub may not be a digital scofflaw, but their tool is a Siren tempting you onto rocks that can ruin communities.

(Thanks to the Patreon backers who made this post possible)

Why we all need AlmaLinux

… whether we use it or not!

The AlmaLinux OS Foundation continues to make license-compliant releases of a fully RHEL-compatible Linux distribution within one or two days of Red Hat’s releases. This (and indeed any independent downstream of RHEL) is actually good for everyone, including Red Hat. The most recent release, AlmaLinux 9.0, appeared within a week of the release of RHEL 9.

AlmaLinux 9 Splash Screen

It validates Red Hat’s good faith

The AlmaLinux community validates that Red Hat’s product is indeed a true, forkable Open Source project and not a bad-faith hack like some other self-described open source products (for example, ForgeRock, who appear to actively engineer their code to be unforkable by failing to document which parts are proprietary and which are just the CDDL-licensed Sun/Oracle code they took, and by failing to provide tools for debranding).

It provides those who need a self-maintained Linux with something that has an off-ramp

Not everyone wants Red Hat’s subscription. Some Linux users – notably in the cloud hosting market – are happy to self-support and have the skills and resources to do so. They could base their work on Debian or another distribution, but as a RHEL downstream their customers retain a freedom of choice of support provider, including being free to switch to Red Hat at any time.

It creates an on-ramp for RHEL

Red Hat benefits from the growth of its adoption base, as users of downstream distributions can and do become customers.

It creates a no-negotiation zone for innovative hacking

Some users need access to RHEL for skunkworks hacking that does not affect their licensing accounting under their Red Hat agreement.

This flexibility used to be included within Red Hat’s licensing universe but a hacker at a hedge fund on Wall Street ruined things for everyone by gaming Red Hat’s original trust in their customers and using a single licence to support an entire company. Red Hat was forced to reword its customer agreement to embrace all systems running RHEL.

  • Disclosure: I am a director of the AlmaLinux OS Foundation and its founder CloudLinux is a client. This article represents my own opinion and is in no way endorsed by either entity.

Legally Ignoring The License

Perhaps the biggest current challenge to open source software is companies which ignore open source software licenses. That sounds so “yesterday” from an era of license scanners and compliance scares. But the issue is as relevant today as it was 20 years ago – just not the way you think!

Contributor agreements have been a controversial topic throughout their history. The choices by Elastic (and others before them) to relicense previously open source software under a licensing arrangement that discriminates against certain users threw the use of contributor agreements into sharp relief. But the controversy around them focused too much on the wrong problem. The main problem with a copyright-assigning agreement is not that it gives the aggregator the right to relicense the work (although that is a problem, as it enables the end game of a rights ratchet). The main problem is that it allows the aggregator, uniquely in the community, to ignore the license altogether.

A Brief History Of Scareware

All Open Source licenses grant unconditional permission in advance to those who comply with their terms to use, improve and share the software in any way and for any purpose. At a stroke, the scope for artificially making the (inherently non-rivalrous) software scarce is eliminated. Of course, that’s a serious problem if you’re an entrepreneur whose imagination only extends to directly monetising access to the software.

Right from the start, wily entrepreneurs realised that Copyleft licenses scared and confused some people, especially lawyers. So they sold customers the right to replace the open source license with a proprietary one – something that, for some reason, customers’ lawyers found less scary. The pioneer of this approach was probably Sleepycat Software Inc, whose BerkeleyDB embeddable database came under a source-available arrangement that left their users in no doubt that they had to make their own private work available to the public. Sleepycat sold a “commercial use” license that didn’t have the same requirement but which also left the user with none of the four freedoms. Selling indulgences had been profitable in the Middle Ages and it also worked for Sleepycat, all the way to acquisition.

Inspired by that success, many other companies sold indulgences. As the market wised up to the GPL and corporate counsel was no longer scared by it, companies transitioned to using other scareware licenses such as AGPL as well as to using “open core” approaches where the commercially valuable functions were not in the open source code at all. By using the no-charge availability of the software to gain adoption, free adopters could be converted to paying customers and ultimately to lock-in. Some users of this strategy – notably SugarCRM – were able to ratchet back the freedom over time until they had an old-style proprietary software business.

Controlling The “Community”

However, there was an inconvenience. For much software, gaining adoption meant persuading cautious, picky developers to use the code — hand-waving to the boss was no longer enough. Once they were using the source, developers might well improve it. Inspired by the likes of Apache and Mozilla they then might well share their improvements, thus forming a community to produce better code than you. So it was smart to invite and use their improvements and thus keep control of the community.

But then the presence of these contributions under the GPL would make you subject to the GPL yourself and unable to sell indulgences or ignore the license. The fix to this was to speak to the sense of fairness and desire for an easy life (and pleasure of recognition) and claim it was in everyone’s interests for the core company to own all the copyrights. Apache and the FSF helped things along by socialising the idea of copyright assignments (see Footnote 1). All these factors led developers to agree to gift the IP rights to their work to the core company. The name of such a document is a contributor license agreement or CLA.

Once they have a CLA, a company is able to aggregate all the rights to the software as if they own it. This has several consequences for the project:

  • They can sell indulgences, so that some community members are able to ignore the license.
  • They can ignore the license as well, enabling open core models that would otherwise be impossible.
  • They can do secret deals with other companies to treat the code as their own or even sell the complete rights, including to a company that actually wants to end the project. Because they can act secretly they can potentially preempt forks.
  • They can make releases without community consensus, making it impossible for peers to join in.
  • They can change the license by fiat, including to one that harms bona fide contributors they want to disadvantage.
  • They can end the public project completely, as SugarCRM did.

Socially Unacceptable

An open source licence is a multi-lateral constitution of a community, setting norms that apply equally to all. Having every developer and user subject to the same terms is one of the pillars of community. A copyright assignment provides unqualified and unappealable immunity to all that. The presence of one in a commercially-backed project is almost certain to mean someone doesn’t want to be subject to the rules and norms everyone else must abide by, usually as part of a rights ratchet. They and their sham freedoms should no longer be tolerated by open source contributors.

Footnote 1: In both cases the CLA is – at best – marginal to the community. At Apache, the CLA is redundant with section 5 of the Apache License, which many people believe grants all the rights the community needs. Folklore at Apache says that IBM’s lawyers were not sure of that and, just to be certain, insisted there be a CLA as well. At the FSF, the CLA is also redundant with GPLv3 (and likely with GPLv2 as well), but the FSF has long argued that it needs to own the copyrights in the USA in order to pursue license compliance – even though it does not do so much, and the surrender of copyright reduces the ability of the actual developers to choose to enforce.

Rights Ratchet Talk

Simon delivered a talk for the new Tidelift conference “Upstream”. In it he drew together the threads of several earlier posts about the rights ratchet model (“bait & switch meets boiling frogs”), using the history of the now-defunct SugarCRM open source project as an initial case study. He then examined the various ratchets that remove rights from open source project participants, ways to detect that a project is actually a rights ratchet, and steps to mitigate the consequences, including promoting permission in advance.

Don’t Call It Relicensing!

Using open source elsewhere is not relicensing, it’s overlaying a second license.

So you’re considering taking some open source code under a minimal, non-reciprocal OSI-approved license and putting it under a different open source license, hopefully in combination with your original code (or another form of larger project). 

Don’t call this “relicensing” – it is not! The original license will continue to apply and you remain responsible for complying with its requirements. Only the copyright holder can change the license. You’re not relicensing – instead you are using the rights the license has given you and applying an additional license to the combination of the earlier work and your work. 


An End To API Gaslighting?

The Supreme Court decision in Oracle vs Google ends a decade-long nightmare for open source developers.

Sunlight or gaslight?*

The decision of the US Supreme Court (SCOTUS) to reverse the erroneous conclusion of the US Federal Circuit appeals court (CAFC) that Google’s use of the Java SE API in Android was a copyright infringement comes as a great relief to open source programmers everywhere. Software developers have always assumed that merely including a function prototype in their code does not require copyright permission as it’s just a fact about the implementation.
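To make the distinction concrete, here is a minimal sketch of what programmers mean by a function prototype as opposed to its implementation. The function names are hypothetical and chosen only for illustration; nothing here is taken from the Java SE API or from the case itself.

```python
# Hypothetical example: one declared signature, two independent implementations.
# The "prototype" is just the name, the parameters and the return type.

def max_original(a: int, b: int) -> int:
    """A first implementation behind the declared signature."""
    return a if a >= b else b


def max_reimplemented(a: int, b: int) -> int:
    """An independently written body exposing the same signature."""
    return (a + b + abs(a - b)) // 2


# Callers only rely on the signature, so either body can be swapped in.
print(max_original(3, 7), max_reimplemented(3, 7))
```

Both functions accept and return the same things, so code written against one works unchanged against the other, even though the bodies share no code.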


FLOSS Weekly 622: Keith Packard

Simon joined Doc Searls to host episode 622 of FLOSS Weekly featuring Keith Packard, one of the key figures of the open source software movement. They talked about Keith’s involvement in the X Window System and strayed into related topics including the many projects Keith has helped and his interest in rocketry!

One significant discussion considered the thread joining the fork of XFree86, the recent vote to change the board of Nominet in the UK and the controversy over the reinstatement of Richard M Stallman to the board of the Free Software Foundation this week. Each represents a significant entity to the open movement which has leadership that was established as a “club” between activists and failed to progress into a well-governed organisation representing and controlled by the community.

The Week In Review: OSPOs

Our focus this week has been the Open Source Program Office (OSPO). While at Sun Microsystems, Simon led their OSPO and this week he got the team back together, including original founder Danese Cooper, to write about what they all did during the decade the Sun Open Source Program Office existed. This was a very popular article and it’s been read thousands of times this week. There’s scope to zoom in on specific topics mentioned in this article – let us know which would interest you.

A Rights Ratchet Score Card

A draft scorecard for determining if a software project is open as bait for a business pivot or genuinely keeping your freedoms protected.

Open or closed? You decide.

The seven signs a project is following the rights-ratchet route to riches and the framework for going beyond licensing can be augmented by some straightforward indicators of an issue. None of these alone is necessarily a cause for concern, but the more clicks, the more risks. Here’s a rough-and-ready first draft of a scorecard to check whether your software supplier considers you a community peer and will respect and protect your essential freedoms, or visualises you more like one of those pods in The Matrix. Just count the clicks; the more clicks, the higher the risk this is a rights-ratchet that will end up closed.
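The counting idea can be sketched in a few lines. The indicator names below are placeholders loosely based on practices discussed elsewhere on this page; they are an assumption for illustration and are not the items on the actual scorecard.

```python
from typing import Mapping

# Placeholder indicators only; NOT the scorecard's real items.
HYPOTHETICAL_INDICATORS = (
    "copyright_assigning_cla",
    "sells_licence_exceptions",
    "open_core_features",
)


def count_clicks(answers: Mapping[str, bool]) -> int:
    """Count how many of the listed indicators a project exhibits."""
    return sum(1 for name in HYPOTHETICAL_INDICATORS if answers.get(name, False))


# Example project description (made up for the sketch).
example = {"copyright_assigning_cla": True, "open_core_features": True}
print(count_clicks(example))
```

No single indicator settles the matter; the tally is just a rough prompt for a closer look.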
