Auto-tech series - Cockroach Labs: IaC shifts the path for (automation-driven) data availability

This is a guest post for the Computer Weekly Developer Network written by Rob Reid in his position as technical evangelist for Cockroach Labs, maker of the enterprise-grade distributed SQL database, CockroachDB.

Reid writes in full as follows…

A shift has happened  

As we move to Infrastructure as Code (IaC), the topic of database automation feels, to quote Dame Shirley Bassey with The Propellerheads, like just a little bit of history repeating.

The race to automate database functions and activities is decades old and IaC has seen the baton pass from traditional vendors to those floating relational and NoSQL cloud alternatives.

Server partitioning, patching, cluster deployment and recovery were long ago made capable of running unattended as vendors sought to relieve DBAs of administrative load. IaC with multi-cloud, however, has changed the shape of what’s needed, with data availability as the subject du jour – and with that change, the automation debate has also shifted. Developers and DBAs must manage new services arriving at a faster rate than before and ensure seamless performance – regardless of the patchwork of clouds underneath. They’re on the hook for ensuring infrastructure resilience with zero-downtime recovery in the event of outages. All that while storing, serving and securing a growing mountain of data.

Yet automation is used relatively sparingly in development and deployment; many management processes that could – or should – be automated are still run manually.

This is hurting businesses, eating up DBAs’ time and hitting budgets – around 10% of IT spend is assigned to data management, with 75% of that going on labour.

Manhandling manual sharding

Nothing illustrates this better than manual sharding – where data is split and stored on different machines or nodes. Sharding is complex and difficult to make work on-prem; the distributed and diverse nature of multi-cloud amplifies that challenge, as data must be moved between shards and rebalanced to maintain performance as user numbers, infrastructure and data volumes grow.
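To make that concrete, here is a minimal Go sketch of the hash-modulo scheme behind much manual sharding – the keys, node counts and hash function are illustrative, not any particular product’s approach:

```go
// Manual hash sharding in miniature: with n shards, each key lives on
// shard hash(key) % n. The pain appears when n changes.
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a key to one of n shards using a simple hash-modulo rule.
func shardFor(key string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % n
}

func main() {
	keys := []string{"user:42", "user:43", "order:7", "invoice:19"}

	// Growing the cluster from 3 to 4 nodes remaps keys; every remapped
	// key is data that must be physically moved between machines.
	moved := 0
	for _, k := range keys {
		before, after := shardFor(k, 3), shardFor(k, 4)
		fmt.Printf("%-10s shard %d -> shard %d\n", k, before, after)
		if before != after {
			moved++
		}
	}
	fmt.Printf("%d of %d keys must be rehomed after adding one node\n", moved, len(keys))
}
```

Consistent hashing reduces that movement but doesn’t remove the operational burden of tracking, moving and verifying the data – which is why sharding is a prime candidate for automation.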

Accuracy and consistency of data are another challenge. Active-active architectures allow database nodes to read, write and shard data, but delays between writes and reads, for example, can weaken the integrity of that data, undermining the reliability of transactions. Non-relational approaches that attempt to tackle scale in the cloud do so at the expense of data consistency.
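A toy Go example of that risk, assuming asynchronous replication (two maps stand in for a primary node and a lagging replica):

```go
// A stale read in miniature: the write has "succeeded" on the primary,
// but a reader routed to a lagging replica still sees the old value.
package main

import "fmt"

func main() {
	primary := map[string]int{"balance": 100}
	replica := map[string]int{"balance": 100} // replication is delayed

	// The write lands on the primary only.
	primary["balance"] = 40

	fmt.Println("primary:", primary["balance"]) // 40
	fmt.Println("replica:", replica["balance"]) // 100 - stale
}
```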

I’ve been in the trenches on this: rolling out and running a global data infrastructure, juggling seasonal loads without cloud-native tooling capable of working in such a distributed setting. It was a nigh-on impossible task to achieve safely without taking the entire system offline.

Overcoming such performance and operational hurdles means rethinking the data architecture: it means building a single, logical database founded on replication of data – not the database – with capabilities for deployment, scale and operations built in and driven by the machine – not you.

Machine-driven automation should be felt in two key areas.

A cloud-neutral database

First, in deployment: you need a cloud-neutral database.

It sounds obvious, but that means a database that can be deployed and operated using consistent procedures regardless of the cloud provider’s name on the tin. No custom tooling and no manual intervention – it’s real IaC pets-versus-cattle stuff. In our container-driven world of IaC, automation also entails being a good Kubernetes citizen. That means using tools like a Kubernetes Operator to take on ‘human’ tasks such as cluster and database security, storage configuration and upgrades.
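To illustrate the cloud-neutral idea at the application layer – assuming a CockroachDB cluster reachable over its PostgreSQL wire protocol on the default port 26257, with purely hypothetical hostnames – the same Go code runs unchanged whichever cloud hosts the cluster:

```go
// The same connection and health-check logic works against every
// deployment; only the hostname differs between clouds.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/jackc/pgx/v5/stdlib" // PostgreSQL-compatible driver
)

func main() {
	// Hypothetical endpoints - one cluster per cloud.
	hosts := []string{
		"crdb.aws.example.com",
		"crdb.gcp.example.com",
		"crdb.azure.example.com",
	}
	for _, h := range hosts {
		dsn := fmt.Sprintf("postgresql://app@%s:26257/defaultdb?sslmode=verify-full", h)
		db, err := sql.Open("pgx", dsn)
		if err != nil {
			log.Fatal(err)
		}
		// An identical health check, regardless of the provider.
		var one int
		if err := db.QueryRow("SELECT 1").Scan(&one); err != nil {
			log.Printf("%s unreachable: %v", h, err)
		}
		db.Close()
	}
}
```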

Second, operations.

That translates as growing with demand – scaling while (looking at you, Team NoSQL) ensuring the data is available reliably and consistently. In a single, logical database architecture, copies of data would be held on clusters of nodes and replicated automatically. Nodes would communicate with each other to eliminate any inconsistencies between copies and reach consensus on the accuracy of the data.
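In practice CockroachDB uses the Raft protocol for this; the toy Go sketch below shows only the majority-quorum rule at the heart of such consensus (the replicas and outcomes are invented for illustration):

```go
// Quorum in miniature: a write commits only when a strict majority of
// replicas acknowledge it, so no minority partition can diverge.
package main

import "fmt"

// commit reports whether a write acknowledged by the given replicas
// reaches a strict majority - the core rule of quorum-based consensus.
func commit(acks []bool) bool {
	votes := 0
	for _, ok := range acks {
		if ok {
			votes++
		}
	}
	return votes > len(acks)/2
}

func main() {
	// Three replicas, one partitioned away: 2 of 3 still commits.
	fmt.Println(commit([]bool{true, true, false})) // true
	// Two replicas down: no quorum, the write must not commit.
	fmt.Println(commit([]bool{true, false, false})) // false
}
```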

Cockroach Labs’ Rob Reid: Deep in the auto code motherlode.

This is all great for the organisation – but what about the DBA? The debate about the ‘death’ of the DBA at the hands of automation has Dame Bassey’s fingerprints all over it.

Get the machine pieces right and the DBA can shift from workhorse to strategist. Leave the automated tools to identify slow queries, find hotspots and route queries to specific nodes, while the DBA gets on with the high-value work of architecting and refining workloads.
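As a minimal sketch of that first step – the latency budget is an assumption, and the sleeps stand in for real database calls:

```go
// Flagging slow queries automatically: time each query against a
// budget and surface the outliers, rather than eyeballing them.
package main

import (
	"log"
	"time"
)

const slowThreshold = 100 * time.Millisecond // assumed latency budget

// flagIfSlow times an arbitrary query function and reports it when it
// blows the budget - the raw signal that automated tuning acts on.
func flagIfSlow(name string, query func()) {
	start := time.Now()
	query()
	if d := time.Since(start); d > slowThreshold {
		log.Printf("slow query candidate %q took %v", name, d)
	}
}

func main() {
	flagIfSlow("indexed lookup", func() { time.Sleep(10 * time.Millisecond) })
	flagIfSlow("full table scan", func() { time.Sleep(250 * time.Millisecond) })
}
```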

Automation opens the door to declarative design – specifying a task or outcome at the click of a button. Simply state the purpose of a workload and leave the machine to work through clustering, load balancing, partitioning, recovery and replication in regions permitted by regulators.
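Purely as an illustration of that declarative shape – every type and field below is hypothetical, not a real CockroachDB or IaC API:

```go
// Declarative intent in miniature: say what the workload needs, not
// how to build it; a platform would reconcile reality toward the spec.
package main

import "fmt"

// WorkloadSpec declares the desired outcome; the machine, not the DBA,
// decides the clustering, balancing and replication steps.
type WorkloadSpec struct {
	Name              string
	Replicas          int      // desired copies of each range of data
	AllowedRegions    []string // where regulators permit the data to live
	RecoveryObjective string   // e.g. "zero-RPO" (hypothetical label)
}

func main() {
	spec := WorkloadSpec{
		Name:              "payments",
		Replicas:          3,
		AllowedRegions:    []string{"eu-west-1", "eu-central-1"},
		RecoveryObjective: "zero-RPO",
	}
	// Here we only print the declared intent; a real platform would
	// continuously reconcile actual state toward it.
	fmt.Printf("declared: %+v\n", spec)
}
```

That reconciliation loop is the same one Kubernetes already applies to containers – which is exactly why being a good Kubernetes citizen matters.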

Operationalising workloads

This will have two outcomes: first, DBAs factoring more business-level considerations into their role operationalising workloads; and second, DBAs working even more closely with architects and DevOps on the design of workloads and the IaC underneath. We’ve come full circle.

Automation has come a long way, but history is not repeating as we move up a gear to IaC.

Machine-driven systems are evolving from running features to operating at an architectural level – providing the means to forklift workloads onto the cloud without vast and costly rewrites.

All that and DBAs can get on with the important part – making workloads go faster with fewer lines of code.