Kubernetes In The Age Of AI - KubeCon EU 2024

Sat Mar 23 2024 · 7 minutes read.

The Cloud Native Community Heads To Paris 🇫🇷

After running across a frozen lake in northern Sweden during the morning, arriving in warm and sun-kissed Paris by evening served as a clear sign that spring has fully blossomed. With the signs of spring comes the return of the annual KubeCon + CloudNativeCon in Europe. I was fortunate enough to attend for the second consecutive year and thoroughly enjoyed meeting some friends from last year, and also making a lot of new ones.

Me standing in front of a KubeCon sign.

Despite KubeCon being predominantly an industry-focused event, it remains a valuable platform for academic researchers like myself. It allows us to engage in meaningful discussions and explore up-and-coming projects. Understanding the pulse of the community is crucial: What challenges captivate people’s attention? What innovations are in progress? And which technologies are waning in popularity? These inquiries matter not only to industry leaders but also to us academics. After all, why invest research efforts in topics without any meaningful community interest?

This post compiles some of my thoughts and learnings from this year's event.

There Was A Clear Theme For KubeCon This Year ...

... and it is spelled AI. The programming for the first-day keynote at KubeCon left no room for ambiguity: the prevailing theme was unmistakably AI. Each of the six scheduled items centered on AI or Kubernetes’ pivotal role in training AI and ML models. Reading between the lines, it became evident that the CNCF is fervently promoting Kubernetes as an ideal platform for AI and ML workloads in the new AI-powered world.

KubeCon keynote stage.

Kubernetes is perhaps not the most elegant platform for partitioning and sharing of GPU (more like NVIDIA) resources. There's a decent chunk of proprietary drivers and other software, and we thus have to use whatever we're given. This is unfortunate in an ecosystem that so heavily revolves around not only open source, but also open governance. It's not likely any time soon, but NVIDIA could really use some competition in the GPU market.

Right now, my impression is that the CNCF wants Kubernetes to become the de facto "cluster operating system." I'm not sure I agree that the best course of action is to shoehorn Kubernetes into every place the sun doesn't shine. Sure, it would provide job security for anyone who has spent the time to actually learn the platform, but I also fear that things are becoming complicated enough that the average sysadmin (or platform engineer) will require a PhD in clusterology just to operate a simple CRUD app. Do we really need that? Should Kubernetes run traditional SLURM workloads? Should it run real-time compute workloads?

As a Kubernetes fan and user, I'm saying no. We don't need a tool that can solve every problem and should be wary of Abraham Maslow's famous quote:

"If the only tool you have is a hammer, it is tempting to treat everything as if it were a nail."

Even if Kubernetes is a good platform for AI workloads, it might not be for other "hot" computing trends in the future. We should keep this in mind.

Open Source vs "Source Available"

There has been an alarming trend among popular open source projects in the community. Red Hat started meddling with RHEL licensing and killed CentOS. HashiCorp switched the license of Terraform to the "Business Source License," and earlier this week, news hit that Redis is changing its license from an open source license to a "source available" license.

Luckily, the community has quickly taken action and forked or created alternatives to the above projects. Alma and Rocky are alternative Linux distributions to CentOS. OpenTofu, with a large presence at this year's KubeCon, is a fork of Terraform. And even though the Redis license change was made public only 2 days before writing this, a fork called Redict has already been gaining traction.

However, the issue is larger than that. We are experiencing an economic downturn, and under stakeholder pressure, companies are looking for new ways of capitalizing on their assets. Closing off their source code is one of them. There are of course two sides to this coin. Companies have put massive amounts of capital into developing technology and growing a user base, and without any way of making a return on that investment, there is a clear incentive to pull the liquidity plug. However, something as drastic as a source license change is a big hit to the users and contributors who have engaged in the project over the years. As an example, there are 703 contributors to Redis at the time of writing. Only a small fraction of those will see any benefit from the license change. The rest will see their voluntary work locked behind a paywall.

I think it's a good thing that open governance projects emerge from the ashes. If we're lucky, we might just end up better off than we were before.

Platform Engineering Is A Thing! 🛠️

The role of platform engineer is quickly becoming an industry standard in tech companies of all sizes. With the introduction of the first ever Platform Engineering Day, internal developer platforms and the people developing them were in focus.

The Eiffel Tower

While this is cool, I want to revisit the topic of complexity. Kubernetes is a tool that serves a clear use case. That's proven. However, its complexity means people are building internal developer platforms in house to make it easier to use. This might be fine in very large corporations as they will have specific needs, but when companies of fewer than 100 people build platforms, something is off. How come there is no universal solution here? Why should application developers learn a new internal platform that might be buggy and unstable whenever they switch jobs?

I am not experienced in this area. I have never built or used an in-house developer platform. However, as an outsider looking in, it seems to me that there is room for some project to take the helm and provide this functionality in the general case.

Breadth Over Detail

Another unfortunate theme, in my opinion, was a perceived reduction in technical talks. This might have just been in the areas I'm interested in, but compared to last year, many more talks kept their content at a high, introductory level. I find this unfortunate as someone who really enjoys technical talks. Show me how you managed to reduce operational costs by some clever invention or recovered from that production-breaking disaster.

Contributors Wanted!

The CNCF launched contributor cards, which list GitHub users' contributions to the Kubernetes project. At the time of writing, there are 59.7k contributors to Kubernetes. I am one of them, even if my contribution is small!

My contributor card

It should be noted that this only counts contributions made directly to Kubernetes and related projects, not to other cloud native landscape projects under the CNCF such as k3s. This is another push from the CNCF to get more people to contribute to the OSS projects they use. In Friday's keynote, it was stated that more than 200k people have passed the CKA certification. The real number of Kubernetes users is higher, and I think we all wish we could contribute more.

This is where Bob Killen's talk "Why Is This so HARD? Conveying the Business Value of Open Source" fits in really nicely. He provides tools for engineers to convey the value of contributing back to OSS on company time. I will not go into more detail about it here, just watch the talk! :)

Final Note

That's a wrap for another KubeCon + CloudNativeCon Europe! Overall, I got more out of the event this year than I thought I would as a researcher. I am looking forward to continuing to work with the community and increasing my OSS contributions in the coming year. With some luck, I can come back next year and visit London in early April 2025! Au revoir for this year, KubeCon!

Concorde at CDG airport