Recently, Meta released Llama 3.1 to much fanfare, since the new LLM is not only competitive with others on the market but also free and open source. They even released a long technical document describing its training, bringing a much higher level of transparency than we’ve seen from the big tech companies. Even more recently, Mistral released Mistral Large 2, and Google released Gemma 2 2B. These are open source models that are competitive with closed-source alternatives like ChatGPT. This move towards open source and transparency is certainly a step in the right direction, so some celebration is justified. But I want to share my view, which I don’t think is widely held, on why this development isn’t so surprising. I think that open source AI models aren’t only the right or just way forward; they’re the most obvious eventuality. AI wants to be free.
There are three parts to this argument, which I get into below. The TLDR is: first, software wants to be free; people usually end up paying for a service or hardware, not just software. Second, infrastructure tends towards open source; it’s more reliable, secure, and popular. Third, AI is mostly infrastructure software; that is what it has been historically and what it is best at.
Software Wants to Be Free
AI, practically speaking, is software. And history has shown that software wants to be free, at least free as in “free beer”. That’s how the evolution of software development and distribution has played out. Software that once commanded premium prices has become freely available, either through open source initiatives or strategic decisions by the software owners. Major tech companies have shifted their revenue sources from software licenses to cloud compute, services or platform access, and hardware. It’s a recurring theme, and right now we’re seeing it play out with AI at the insane speed and scale that characterizes everything in recent AI.
I’m not an economist, but there are a few clear arguments here that have been made over the years. The trend towards free software is driven by market dynamics. Software, unlike physical goods, can be replicated infinitely at low cost. Even for big AI models, that is true; just look at how quickly Llama 3.1 has been replicated everywhere. As technologies mature, they often become commoditized, losing their value as standalone products. This is particularly true for base technologies like operating systems, which I’ll explore next. Recently, the trend has been for companies to provide free versions of their software to attract users to their platforms or ecosystems. The real revenue often comes from selling access to the platform or related services. Software is now rarely sold as a standalone product; rather, the product is a service or a platform. Finally, as I’ll discuss more in the context of infrastructure, free or open source software encourages adoption because its user base attracts developers and companies who build complementary products and services on top of it. In other words, network effects encourage the adoption of free software.
Consider the trajectory of operating systems. Linux was once a niche OS for enthusiasts and server rooms. It is now the most used operating system by a large margin, mostly due to cloud computing and Android. Even Microsoft has shifted its strategy with Windows, focusing more on services and cloud computing. Windows 11, the latest version of their OS, is a free upgrade from Windows 10. Software licenses for Windows and Microsoft Office are still a large part of Microsoft’s revenue, but their cloud computing platform Azure is a much larger one. Apple might seem like the exception to the rule, since macOS and iOS are both proprietary. But I’d argue that people don’t buy macOS or iOS. They buy MacBooks and iPhones. What is the value of the software layer in that product? If it were free, would people still buy iPhones? My guess, and what Android demonstrates, is yes. Operating systems, over time, have trended towards free.
Database management software, one of the behemoths of proprietary software, has also experienced this shift towards free and open source software like MySQL and PostgreSQL. Of course, there is still a huge market for proprietary software licenses, and it would be false to claim that software has no market value. Rather, my argument is that access to software is the real commodity; the software itself is only part of the product. The market value of software therefore depends on how it is deployed as a product, and in many cases it is better off being free as in free beer. Most AI companies haven’t even tried selling AI models wrapped in software licenses; rather, the product has been access to models running on GPUs as a service. Even that is now shifting towards free models.
Infrastructure Wants to Be Open Source
For software that is infrastructure, the case for free gets even stronger. Not only does infrastructure software want to be free, it wants to be open source. By infrastructure, I mean software that is used as a building block for other software or digital goods. Programming languages are almost exclusively open source, especially those used in critical software. Most software frameworks, like Ruby on Rails, the Mozilla Application Framework, and even .NET, are open source; I discuss .NET in detail below. In cloud computing, Kubernetes has become the standard for cloud application development. All of this software is used to build, maintain, or deploy other software; it is a part of software infrastructure. Infrastructure benefits from developers adhering to its standards and from having its security vetted by the developer community.
Let’s look at the .NET framework, developed by Microsoft, as an example of how infrastructure software tends to move toward open source over time in order to become more standardized and adopted. First developed in the late 1990s and released in 2002, the .NET Framework was proprietary, intended to strengthen Microsoft’s hold on the software development ecosystem by providing a platform for building Windows applications. However, as software development practices evolved and the demand for cross-platform compatibility grew, the limitations of a proprietary framework became apparent. Over the following two decades, the .NET Framework moved towards open source and has today been replaced by .NET, licensed under the open source MIT license. .NET now has applications in software ranging from the Office suite to the Unity Game Engine to Chipotle’s web platform.
In the early 2000s, Microsoft began standardizing parts of the .NET framework, leading to the establishment of open standards for C# and the Common Language Infrastructure. This move aimed to encourage broader adoption and interoperability, but significant portions of the .NET framework were still proprietary, complicating .NET application development. In 2007, Microsoft announced that the source code for the .NET Framework libraries would be made available, although still under a restrictive license; developers could view the source code, but it wasn’t fully open source. Then, in 2014, Microsoft extended its patent grants and clarified that implementations of .NET technologies would not be subject to patent litigation, opening the door for similar frameworks like Mono to better align with .NET without fear of legal repercussions. The most significant leap towards open source came in 2016 when Microsoft acquired Xamarin, Mono’s developer, and announced that Mono would be relicensed under the MIT License, effectively open-sourcing the entire .NET ecosystem. This change was accompanied by a broader commitment to cross-platform support, making .NET a fully open source framework.
The shift of the .NET framework towards open source has not only enhanced its reliability and security through community contributions but also expanded its reach and relevance. .NET went from a Windows-centric development tool to a cross-platform framework for macOS and Linux. And it could be argued that if .NET had been open source all along, if developers hadn’t avoided it for lack of understanding of the code or fear of legal repercussions, it would be even more popular today. It’s notable that Java, which went through its own proprietary-to-open-source transformation in 2007, far outpaced the .NET framework in application development during the 2010s. Infrastructure software naturally gravitates towards open source to achieve greater standardization and widespread use.
One reason that open source infrastructure is superior is that it only takes one good open source alternative to disrupt an entire market. Once an open source infrastructure project reaches a certain level of maturity and functionality, it can quickly supplant proprietary alternatives. When a reliable, free option exists, it becomes harder for proprietary software to justify its cost, leading to broader adoption of the open source alternative and further driving the trend toward free software. When that software is infrastructure, which greatly benefits from the network effects of developers using it, the pull of an open source alternative is even stronger.
Over time, tech companies have learned the lesson that infrastructure is better when it is open source. When Google released Kubernetes, it didn’t take twenty years of piecemeal moves towards full open source; it was licensed under the very permissive, open source Apache 2.0 license. Now, Kubernetes is one of the most widely deployed software systems in the world, used not just by Google, but by virtually every tech company. It adds value to Google’s revenue-generating products without requiring developers to pay for Kubernetes itself or for the development pipelines they build on it.
AI is Infrastructure
The last part of this argument is that AI, most of the time, is infrastructure. As software, it is closer to .NET than to Office; it’s an image filter, not Photoshop; it’s a layer in a web app, not the app itself. This argument runs counter to the current trend of investment in AI, so I’ll take a bit of time on it, but my belief is that AI isn’t great at being a software product on its own. Like Kubernetes, it can be used to build software.
Algorithms have always been the backbone of software, often becoming commoditized and freely available over time. A*, a foundational search algorithm in computer science and AI, is now a standard tool freely available to anyone; you can find plenty of open source implementations of it. Pathfinding algorithms like A* aren’t apps; rather, they’re a part of other software, like video games.
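To make the point concrete, here is a minimal sketch of grid-based A* pathfinding, the kind of routine that ships inside game engines rather than being sold on its own. The grid, heuristic, and function interface are illustrative choices, not drawn from any particular library.

```python
import heapq

def a_star(grid, start, goal):
    """A* over a 2D grid of 0 (free) and 1 (blocked) cells,
    using Manhattan distance as the heuristic."""
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start, [start])]  # (f, g, cell, path so far)
    seen = set()
    while frontier:
        f, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cell in seen:
            continue
        seen.add(cell)
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(
                    frontier,
                    (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]),
                )
    return None  # no path exists

# Example: route around a wall on a tiny 3x3 map.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```

A few dozen lines of standard-library Python is all it takes; the algorithm itself is a commodity, and the value lies in the game or application wrapped around it.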
Or look at spam filters. Spam filters are a standard part of a mail server’s software stack, and some email applications include additional filters on top. They are standalone software and even a product in their own right, although usually bundled as part of a server security package. Free and open source spam filters like Apache SpamAssassin have been in widespread use for decades. In the toolkit of these spam-filtering programs is Bayesian filtering, a type of statistical learning; something that used to be called AI. The Bayesian filter even requires data to learn from, just like deep neural networks, meaning the prepackaged software either comes with a trained model or a way to train one. But the model isn’t the product: the spam filter also looks at email signature information, checks blacklists, and takes actions like blocking or labeling the email. The AI, the Bayesian filter, is only one part of the software package.
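Here is a toy sketch of just that Bayesian component: a naive Bayes classifier trained on a handful of made-up messages. The training data and the model choice are placeholders for illustration; a real filter like SpamAssassin combines a score like this with header checks, blacklists, and hand-written rules.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny placeholder training set: 1 = spam, 0 = ham.
train_messages = [
    "win a free prize now", "cheap loans click here",   # spam
    "meeting moved to 3pm", "lunch tomorrow?",           # ham
]
train_labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()                 # bag-of-words features
features = vectorizer.fit_transform(train_messages)
model = MultinomialNB().fit(features, train_labels)

def spam_probability(message: str) -> float:
    """Return P(spam | message) under the trained naive Bayes model."""
    return model.predict_proba(vectorizer.transform([message]))[0, 1]

print(spam_probability("free prize, click now"))      # high
print(spam_probability("are we still on for lunch"))  # low
```

The classifier is a few lines; everything else a spam filter does, from parsing mail headers to quarantining messages, sits around it.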
Translation applications are a more recent example. A sophisticated translation application like Google Translate can take multiple forms of input - text, audio, images - and translate the text in them between a myriad of languages. AI is hugely important in this. Different AI models detect the words in the audio or image data, then another AI model translates those words into a different language, and yet another can turn the translated words back into speech. This single application contains many AI models, which in recent years have become deep neural networks trained on huge amounts of data. However, none of those models is being sold to the end user. The product is a software package, not an AI model.
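A minimal sketch of that kind of pipeline, chaining two openly released models through the HuggingFace transformers library: speech recognition followed by translation. The specific model checkpoints and the input file name are illustrative choices, not the stack any particular product uses.

```python
from transformers import pipeline

# Speech-to-text: an open Whisper checkpoint transcribes the audio clip.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# Text-to-text translation: an open Helsinki-NLP model, English to German.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

def translate_clip(path: str) -> str:
    """Transcribe an English audio clip and translate it into German."""
    english_text = asr(path)["text"]
    return translator(english_text)[0]["translation_text"]

if __name__ == "__main__":
    print(translate_clip("clip.wav"))  # "clip.wav" is a placeholder file
```

Each model here is free to download and swap out; the thing a user would actually pay for is the polished application built around this chain.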
Better AI models can drive user adoption, especially in a crowded AI market. DeepL offers higher quality translation than Google Translate, and its huge user adoption over the years, without Google’s name recognition, has garnered it a $2 billion valuation. But how long do better AI models suffice as a business plan? There are now plenty of good, open source AI models available for download on HuggingFace that can perform translation perfectly well; even models that don’t focus on translation, like Llama 3.1, are pretty good at it. It’s worth noting that DeepL is still not profitable and has recently been shifting towards general writing assistance, beyond translation. In their new product, DeepL Write Pro, the AI model is again just one part.
Of course, one difference in the current AI landscape is that these models are prohibitively expensive to train; not everyone has access to the massive GPU farms that Meta does. But deep learning for computer vision went through a similar transformation. When AlexNet, one of the first deep convolutional neural networks, was trained on the ImageNet dataset, it required adapting the training to run in parallel on two GPUs, because the 3GB of memory on a single GTX 580 was insufficient. Training took “five to six days” on those two GPUs. Now, GPUs have vastly improved in memory and efficiency, and there are ways of training on parallel CPUs that reduce training time to minutes. More importantly, models trained on this dataset are abundant online. If you want to build an AI software package that tracks football players as they move across the field, all of the components, including trained AI models, are available online as free and open source. Computer vision AI models are infrastructure in many applications now, in fields ranging from finance to medicine to agriculture.
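As a hedged sketch of those building blocks: an off-the-shelf, openly licensed detector from torchvision finding people in a single video frame. A real player-tracking system would add a tracker that matches detections across frames; the frame path and the confidence threshold here are placeholders.

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

# Freely downloadable weights pretrained on the COCO dataset.
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
preprocess = weights.transforms()

frame = read_image("frame.jpg")  # placeholder path to one video frame
with torch.no_grad():
    detections = model([preprocess(frame)])[0]

# Keep confident detections of the COCO "person" class (label id 1).
for box, label, score in zip(
    detections["boxes"], detections["labels"], detections["scores"]
):
    if label == 1 and score > 0.8:
        print([round(v) for v in box.tolist()], float(score))
```

The detector, the pretrained weights, and the surrounding tooling are all open source; the proprietary part of such a product is the analytics and interface layered on top.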
Furthermore, small models like Gemma 2 are getting attention, because many applications don’t need the sort of massive models we’ve seen recently. Training a model the size of Gemma 2 is within the reach of academic teams and groups far smaller than Google. Even if the big tech companies weren’t pushing open source models, open alternatives would eventually be created. Mistral has been pushing excellent open source models for the past year and now has a $6 billion valuation. For open source infrastructure to dominate, there only needs to be one good open source alternative.
Moving forward
It is a good thing that Meta published Llama 3.1, and an even better thing that it came with a lengthy technical description, going beyond just open source to genuine transparency. But this wasn’t because they hold great ideals about benefiting society; it is a lesson learned in tech, one that aligns with the history of software. If open source is the new normal, or at least will be, then the question becomes how to train, release, and maintain open source models in the most responsible way. Ongoing vigilance and responsible stewardship are essential to ensure safety and sustainability.
First, it is better and safer to have applications built on open source AI than ones built on closed source AI. ChatGPT has already been integrated into other software products through its API. There are applications that rely on its models and interface, which are completely opaque. We have already crossed that threshold, whether we should have or not; now we need to understand the tools these models are being built into. It was highly irresponsible of StabilityAI to release models without even trying to reduce harmful content in them, but open source releases can at least be studied and their safety or legal concerns laid bare. If the information is available, then irresponsible actors can be held accountable and safe models can be promoted. If all of the models are hidden behind APIs, the best we can do is guess.
HuggingFace has been a laudable actor in this, providing the platform and standards for open source AI model releases. Major tech companies will continue to play a crucial role as model providers and maintainers, but they will not always act responsibly. Again, the initial Stable Diffusion release was highly irresponsible because the data and the model hadn’t been vetted for copyright infringement or illegal content. Organizations, academia, and regulatory bodies should be allowed to enforce standards on open source models, and policy makers need to work with safety researchers to figure out those standards now.
Let’s not give Meta a free pass on mistakes made along the way or praise Zuck as a benevolent open source freedom fighter. It is also premature to say that Llama kills off OpenAI, because running LLMs as a service is still a useful product (if you can make it profitable). LLMs are hard to run correctly, and many companies will choose to pay a service to run them rather than pay to figure out how to run them internally. Nevertheless, the trajectory towards open source AI seems not only justified but inevitable. AI wants to be free; if done safely and sustainably, that freedom can foster rapid innovation, collaboration, and progress.