Podcast: Storage functionality in Kubernetes 1.31

We talk to Sergey Pronin of Percona about new storage functionality in Kubernetes 1.31, including volume attributes class and changes to persistent volumes, as well as other long-term fixes

Antony Adshead, Storage Editor

Published: 13 Aug 2024

In this podcast, we look at storage features expected in Kubernetes 1.31 with Sergey Pronin, group products manager at Percona, which develops open source products for SQL and NoSQL databases.

Pronin talks about storage functionality expected in this week’s 1.31 release, but also what he sees as some of the gaps in terms of storage for databases and more generally for storage in Kubernetes. He also discusses compliance and security gaps he thinks still need to be addressed in Kubernetes.

How would you sum up the forthcoming additions to Kubernetes that are of interest to people who deal with data storage?

Sergey Pronin: I don’t believe there are lots of storage-related improvements, because the 1.31 release was heavily focused on a major removal of legacy code. Something like 1.5 million lines of code were removed from the core code base. That code was created mostly for legacy container storage interfaces (CSIs) built by various cloud providers, which have since moved to a plugin structure. That was the main focus of this release.

There are some storage improvements. I believe the biggest one, and the most interesting for me, is volume attributes class. It allows users to modify existing volumes on the fly, for example if you want to change the number of IOPS of a volume. You know how on Amazon you have EBS volumes and they have IOPS. Previously, to do that in Kubernetes, you would create a new storage class and then migrate your application to a new volume.

It was quite the process. Now, through Kubernetes, you can just change the IOPS for that specific volume and that’s it. This feature was in alpha, and in 1.31 it graduates to beta, so it’s getting closer to the GA, or stable, version.
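As a rough illustration of the workflow Pronin describes, the Python sketch below builds the two objects involved: a volume attributes class, and a patch that points an existing persistent volume claim at it. It is a minimal sketch under stated assumptions, not a definitive recipe. The class name `fast-io` and the IOPS value are hypothetical, the `iops` parameter key is assumed to be one the AWS EBS CSI driver accepts, and the beta API version shown may need to be enabled explicitly on a 1.31 cluster.

```python
import json

# Assumption: the API version under which the feature is served while in beta.
VAC_API_VERSION = "storage.k8s.io/v1beta1"


def volume_attributes_class(name, driver, parameters):
    """Build a VolumeAttributesClass manifest as a plain dict."""
    return {
        "apiVersion": VAC_API_VERSION,
        "kind": "VolumeAttributesClass",
        "metadata": {"name": name},
        "driverName": driver,
        # Parameters are opaque strings that the CSI driver interprets.
        "parameters": {key: str(value) for key, value in parameters.items()},
    }


def pvc_class_patch(class_name):
    """Build a merge patch that points an existing PVC at another class."""
    return {"spec": {"volumeAttributesClassName": class_name}}


# Hypothetical name and value, chosen only for illustration.
fast = volume_attributes_class("fast-io", "ebs.csi.aws.com", {"iops": 6000})
patch = pvc_class_patch("fast-io")

if __name__ == "__main__":
    print(json.dumps(fast, indent=2))
    print(json.dumps(patch))
```

The printed JSON could be applied with standard tooling. The point is that the claim, and the application using it, stay in place while only the class reference changes.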

That’s one major storage characteristic that’s changing. Are there any others in 1.31?

There are some additions to persistent volume status. In 1.31, there was a new “last phase transition time” field [lastPhaseTransitionTime] added to the status of persistent volumes.

This allows you to measure the time between various statuses of the persistent volume. It can be in a pending state, it can be bound, it can be in error, and so on. Now that this last phase transition time is in the status, cluster administrators can use it to measure various service-level objectives and so on much more easily.

Again, it’s not a huge improvement, but it’s definitely something the community, and especially cluster administrators, had been waiting on for quite some time. Persistent volumes are maturing in the Kubernetes environment, and something you would expect from day zero was not there. Now it’s added, so it’s a good thing.
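To show how the transition timestamp might feed a service-level measurement, here is a small Python sketch. It assumes the persistent volume status has been read into a dict with `phase` and `lastPhaseTransitionTime` keys, and that the timestamp is a UTC value in the form `2024-08-13T10:00:00Z`. The two snapshots are hypothetical. Because the field holds only the most recent transition, measuring the time between two phases means recording the value each time it changes.

```python
from datetime import datetime, timezone


def parse_k8s_time(value):
    """Parse a UTC timestamp such as 2024-08-13T10:00:05Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def seconds_in_current_phase(pv, now):
    """Return how long the volume has been in its current phase."""
    changed = parse_k8s_time(pv["status"]["lastPhaseTransitionTime"])
    return (now - changed).total_seconds()


def seconds_between_transitions(earlier, later):
    """Return the time between two recorded phase transitions."""
    start = parse_k8s_time(earlier["status"]["lastPhaseTransitionTime"])
    end = parse_k8s_time(later["status"]["lastPhaseTransitionTime"])
    return (end - start).total_seconds()


# Hypothetical snapshots of the same volume, recorded at two points in time.
pending = {"status": {"phase": "Pending",
                      "lastPhaseTransitionTime": "2024-08-13T10:00:00Z"}}
bound = {"status": {"phase": "Bound",
                    "lastPhaseTransitionTime": "2024-08-13T10:00:42Z"}}

time_to_bind = seconds_between_transitions(pending, bound)
```

A monitoring job could compare a value like `time_to_bind` against a target, or alert when `seconds_in_current_phase` grows too large for a volume that is still pending.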

Are there any other additions that you would group with these?

I have some, but they’re not really major and they’re not in GA, so I don’t think it’s worth mentioning those.

What do you think are the remaining challenges in Kubernetes for people who want to administer storage?

I think one of the issues I see lies in the realm of automated scaling and storage. Historically, Kubernetes was designed as a tool to remove toil from administrators, and for various compute resources like CPU or RAM, it is quite easy to implement automated scaling for those.

If you see that you reach a certain threshold, you can either add more nodes into the picture or you can perform vertical scaling by adding more CPU resources or RAM to the container.

But for storage, that’s not really the case. If you look at managed services from public cloud providers, like Amazon RDS or Aurora [databases], for example, they have had automated storage scaling from day zero, and it’s just super surprising for me that there is nothing like that in Kubernetes as of now.

There are some ad hoc solutions developed by companies, but they are either very limited or they are not maintained any longer. It’s more like, “Hey, I created a POC. Now, community, go figure it out!”

And for me as a developer of various [Kubernetes] Operators for databases, I definitely want to provide the same level of user experience to my users in Kubernetes, because sometimes they think, “OK, if I move from this nice Aurora from Amazon to Operators, what are the trade-offs I’m going to make?” This is one of those.
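To make the gap concrete, the Python sketch below shows the kind of threshold logic an automated storage scaler might apply to a persistent volume claim. It is purely illustrative and not taken from any existing project. The function names, the 80% threshold, the 1.5x growth factor and the 1TiB ceiling are all hypothetical. It also assumes the storage class allows volume expansion, and it only ever grows a volume, since claims cannot be shrunk.

```python
import math

GIB = 1024 ** 3


def next_capacity(capacity_bytes, used_bytes, threshold=0.8, growth=1.5,
                  max_bytes=1024 * GIB):
    """Return a larger capacity once usage crosses the threshold, else None."""
    if used_bytes / capacity_bytes < threshold:
        return None  # enough headroom, nothing to do
    if capacity_bytes >= max_bytes:
        return None  # already at the ceiling, leave it to a human
    # Grow by the given factor, rounded up to a whole number of GiB.
    grown = math.ceil(capacity_bytes * growth / GIB) * GIB
    return min(grown, max_bytes)


def expansion_patch(new_bytes):
    """Build a PVC merge patch that requests the new size in GiB."""
    size = f"{new_bytes // GIB}Gi"
    return {"spec": {"resources": {"requests": {"storage": size}}}}


# Hypothetical reading: a 100GiB volume that is 85% full.
target = next_capacity(100 * GIB, 85 * GIB)
patch = expansion_patch(target) if target else None
```

The arithmetic is the easy part. What is missing, as Pronin notes, is a maintained component that gathers usage data reliably and applies such a patch safely across different storage drivers.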

Are there any developments in Kubernetes that head towards this, or is there just nothing?

There are always some activities going on in various fields in Kubernetes, but unfortunately, there are just discussions as of now. I haven’t seen a single line of code created for that.

Also, I’m not 100% sure whether it should be driven by the Kubernetes community, or whether it could be something in the CNCF ecosystem, like the Keda project, for example.

Keda is Kubernetes Event-driven Autoscaling. It’s a CNCF project, and they do compute scaling quite successfully. So, I would think, why not add storage? We discussed it with them some time ago, but it hasn’t gone anywhere yet.

Are there any other major areas you think are yet to be solved in Kubernetes with regard to storage?

I think overall standardisation across how various Operators interact with storage would definitely help. But again, I don’t believe it’s something the Kubernetes community alone should be solving. It should be a wider community, involving various SIGs, because if I look at how various Kubernetes Operators or Kubernetes projects interact with storage, the majority of them use stateful sets, while some of them create deployments and mount PVCs.

So, from a technical standpoint, it’s very different, and the reason for that is the underlying technology that these applications power. It can be some MySQL database or some MongoDB database, and for those you might want to play with storage a bit differently.

But the end result you should be getting is just stability. Your storage should be available all the time, your data should be consistent and you should be able to inspire confidence in users that if you run something related to storage in Kubernetes, it’s just going to work. There’s no voodoo magic to it.

Having been in this field for quite some time, I still feel that we have not reached the point where companies and enterprises would be confident saying, “Oh, yes, running databases in Kubernetes is for us. We believe it’s the way forward.” There are still a lot of questions [like] how stable it is, how robust the solutions are and what trade-offs they would be making.
