The ephemeral stack - DataStax: Compose yourself with serverless data.
This is a guest post for the Computer Weekly Developer Network written by Patrick McFadin, in his role as vice president for developer relations at DataStax.
McFadin’s original title for this piece was: Please Compose Yourself (A Serverless Data Approach Might Help).
Arguing that the idea of composable computing infrastructure has been around for the past decade, McFadin says that today, microservices application design methodologies make it easier to think about how to combine lots of different components.
Cloud computing services (compute, network, and storage), software containers, and serverless functions can all be used to provide those application building blocks. APIs can bring these components together and connect them, while GraphQL provides a query language for interacting with data through those APIs.
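To make that concrete, here is a minimal sketch of one component fetching data through a GraphQL API from Python. The endpoint, token and schema (an "orders" collection) are hypothetical stand-ins for whatever data API your platform exposes, not any specific product's interface.

```python
# Minimal sketch: querying application data over a GraphQL API.
# The endpoint, token and schema below are hypothetical placeholders.
import requests

GRAPHQL_URL = "https://example.com/api/graphql"   # hypothetical endpoint
API_TOKEN = "REPLACE_WITH_YOUR_TOKEN"             # hypothetical credential

query = """
query RecentOrders($limit: Int!) {
  orders(limit: $limit) {
    id
    customerId
    total
  }
}
"""

response = requests.post(
    GRAPHQL_URL,
    json={"query": query, "variables": {"limit": 10}},
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=10,
)
response.raise_for_status()
print(response.json()["data"]["orders"])
```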
So far so good, he suggests… but now the rest of IT has to catch up.
McFadin writes as follows…
The starting point for extending composability to infrastructure should be the same question you ask of your application: what do I do about data? Applications are data-creating machines and it all has to go somewhere. When you’re scaling your application using containers, each new instance will create its own data that needs to be stored. This requires a data management plan that matches your use case.
This would normally involve planning ahead, knowing what you want to implement, and having that infrastructure supported over time. If you are comfortable with this, then great. However, many developers today don’t want to spend time on building and maintaining infrastructure when they are tasked with – and measured on – building applications. Instead, they would like to treat their infrastructure in the same way as their apps.
DBaaS: a methodological fit
Running databases as a service that can be implemented and supported in containers fits with this methodology.
However, there is still overhead on the management side. To eliminate the operations requirements, you can go with a serverless data approach, where the database instance is provided as a service in response to actual demand levels. The difference between running databases in your own containers and in serverless instances is that a serverless deployment should operate similarly to a serverless function – for the developer, the service should just be there rather than requiring any set-up.
A serverless database requires separating the computational components from the underlying storage, all while providing a single interface for end users. This separation means costs accrue on demand rather than having to pre-provision capacity ahead of time. When there is heavy demand on the database for read and write capacity, it scales to maintain service levels. When demand is low, the underlying persistent storage is the only cost component, keeping your overall data infrastructure spend in line with actual use.
This approach delivers the same database experience for developers, but without the operational overhead.
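From the developer’s side, that experience amounts to little more than a connection and a query. The sketch below assumes a hypothetical HTTP document API and placeholder credentials, not any particular vendor’s client library; the point is what is missing from it: no nodes, replicas or storage volumes to configure.

```python
# Minimal sketch of the developer experience against a serverless database:
# no cluster to size or provision, just a connection and a query.
# The endpoint, token and document API shape are hypothetical placeholders.
import os
import requests

DB_ENDPOINT = os.environ.get("DB_ENDPOINT", "https://example.com/db/v1")  # hypothetical
DB_TOKEN = os.environ.get("DB_TOKEN", "REPLACE_WITH_YOUR_TOKEN")          # hypothetical

session = requests.Session()
session.headers.update({"Authorization": f"Bearer {DB_TOKEN}"})

# Write a document; the service handles capacity behind the scenes.
session.post(
    f"{DB_ENDPOINT}/collections/users",
    json={"id": "42", "name": "Ada"},
).raise_for_status()

# Read it back; nothing was provisioned by hand to make this work.
user = session.get(f"{DB_ENDPOINT}/collections/users/42").json()
print(user)
```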
Capacity planning & developer experience
What does this all mean in practice, and how should it help your developers perform better?
The first benefit is that this helps with one of the biggest headaches for IT: capacity planning. Even with the flexibility that cloud computing services offer, IT teams still have to estimate the level of demand they expect to see and plan for expansion when needed.
This is problematic because you still need to be ready for seasonal peaks and growth. Typically, this means planning for peak load levels at their highest points. But what happens to all that excess capacity held as safety overhead? While cloud service providers point out that you can scale up and down, this requires end users to initiate that action, and that isn’t trivial. Couple this with peaks and troughs in demand during the day, and you can easily find your average over-provisioning running at more than 50 percent of the capacity you actually need. Using a truly serverless data service, you should be able to scale capacity back to match demand and pay only for that level of use.
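A back-of-the-envelope example, using made-up traffic numbers, shows how quickly that over-provisioning adds up once you size for the peak plus a safety margin:

```python
# Back-of-the-envelope illustration of over-provisioning, with made-up numbers.
# Hourly demand (requests/sec) over a day: quiet nights, a midday peak.
hourly_demand = [200, 150, 120, 100, 100, 150, 300, 500, 800, 1000, 1200, 1400,
                 1500, 1400, 1300, 1200, 1000, 900, 800, 700, 600, 500, 400, 300]

peak = max(hourly_demand)
average = sum(hourly_demand) / len(hourly_demand)

# Pre-provisioned model: size for the peak plus a 20% safety margin, all day long.
provisioned = peak * 1.2

over_provisioning = (provisioned - average) / average
print(f"average demand:      {average:.0f} req/s")
print(f"provisioned for:     {provisioned:.0f} req/s")
print(f"over-provisioned by: {over_provisioning:.0%} of what is actually used")
```

With these illustrative numbers, the pre-provisioned capacity works out at roughly two and a half times average demand, comfortably past the 50 percent mark.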
The second benefit arises from supporting non-production environments and CI/CD deployment pipelines more efficiently. When your developers have to set up their own database instances, this can be time-consuming and expensive. In fact, it is not uncommon for enterprises to spend as much on non-production hardware and software as they do on production, or more, and yet still find that capacity is outstripped by the needs of many engineers working on many independent microservices components.
By taking that element out of developers’ hands and automating the process, time spent on infrastructure can be put back into application development work. This also helps avoid the problem of old test instances hanging around, forgotten, in the cloud, consuming budget when they are not needed.
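In a CI/CD pipeline, that automation can be as simple as a test fixture that creates a throwaway database for the run and deletes it when the tests finish. The management API, environment variables and endpoints in this sketch are hypothetical placeholders, not a real service’s interface:

```python
# Sketch of an ephemeral test database for a CI pipeline run, assuming a
# hypothetical management API: created for the tests, deleted afterwards so
# nothing is left running (and billing) once the pipeline finishes.
import os
import uuid
import pytest
import requests

MGMT_URL = os.environ.get("DB_MGMT_URL", "https://example.com/db-admin/v1")  # hypothetical
MGMT_TOKEN = os.environ.get("DB_MGMT_TOKEN", "REPLACE_WITH_YOUR_TOKEN")      # hypothetical

@pytest.fixture(scope="session")
def test_database():
    headers = {"Authorization": f"Bearer {MGMT_TOKEN}"}
    name = f"ci-{uuid.uuid4().hex[:8]}"

    # Create a throwaway database instance for this pipeline run.
    created = requests.post(f"{MGMT_URL}/databases", json={"name": name}, headers=headers)
    created.raise_for_status()
    db = created.json()

    yield db  # tests run against this instance

    # Tear it down so no forgotten instance keeps consuming budget.
    requests.delete(f"{MGMT_URL}/databases/{db['id']}", headers=headers).raise_for_status()
```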
The last benefit is around how to get the most out of your database for your application. Each service you build may have different needs. For example, an application with a high write workload needs a different approach to one with a high-read or hybrid workload. With a traditional database implementation, you would have to scale up the size of the machine or the cloud instance as the data on disk grew.
Tuning, breaking and tweaking
For a database like Apache Cassandra that runs in Java, all of the functions the database needs to perform are normally contained in a single JVM process. Breaking those functions out into their own processes and running them independently means that they can be scaled up based on application performance requirements. Data storage nodes can then be treated as independent processes and scaled up as needed. This is easier for most developers than tuning their own database instances, while still delivering the performance they want from a composable approach.
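As a purely conceptual sketch (the scaling rule and numbers here are made up, not Cassandra’s actual mechanics), decoupling means each function gets its own scaling decision, driven by its own signal, instead of resizing one monolithic node:

```python
# Conceptual sketch only: with functions decoupled, each one scales on its
# own utilisation signal. The rule and figures below are illustrative.
def desired_replicas(current, utilisation, target=0.6):
    """Simple proportional scaling rule, purely for illustration."""
    return max(1, round(current * utilisation / target))

# Hypothetical metrics for a write-heavy service.
coordinators = desired_replicas(current=3, utilisation=0.9)   # request handling scales up
storage_nodes = desired_replicas(current=6, utilisation=0.4)  # storage stays put or shrinks
print(coordinators, storage_nodes)
```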
As developers continue to adopt microservices and cloud, composable application design will become more popular. Everyone wants to be agile and use the best of what is available. However, we have to think beyond applications alone and about the whole approach to infrastructure, particularly around data. Any organisation looking for developer acceleration should consider serverless databases as a part of that picture.
Eliminate tradeoffs and build for the future.