
How Hive balances platform stability with innovation

The smart home thermostat division of British Gas has made developers fully responsible for supporting and monitoring the code they deploy

Hive has become one of the big successes of Centrica’s British Gas business, establishing the company as a viable alternative to Google’s Nest.

But being at the forefront of smart home technology means Hive requires a 24x7 way of working and an approach to software development that ensures there are no incidents on the back-end software platform on which Hive runs, while giving developers the freedom to create, build and deploy new features quickly and efficiently.

It starts with DevOps, but monitoring has become a key aspect of the DevOps process, and developers are expected to take full responsibility for the code they push into production.

“The challenge with Hive is that we are in quite an innovative space,” says Chris Livermore, head of the cyber reliability engineering team for Hive Home at Centrica. “We know what good looks like and have a very clear idea of things we want to do and things we don’t want to do. But as we innovate there is a grey in the middle.

“There is no up-front approval process at Hive,” he says. “Instead, the developer teams are provided with a set of guard rails that give our developers a lot of freedom, so long as they are doing everything right. We have a lot of continual compliance.”

As an example, Hive runs a million compliance checks an hour. For Livermore, monitoring is a joint responsibility. “The only people who know if their software is working are the people who wrote the code,” he says. “They have to make sure they send the right data to the monitoring system and they set the right thresholds.

“More and more people are running 24x7 services. The days of turning up to work at 9:00 and going home at 5:30 are a rarity. In my job, I work 24x7. If there is an issue with the system out of hours, there is an expectation we fix it.”

Livermore’s role is to run all of the infrastructure that keeps Hive running. He says this involves supporting all the teams developing for the Hive platform. “My job is to give the developers an environment where they can focus on their code.”

“We are very much trying to empower our developers to be responsible for the software and the services we develop. We want the developer teams to be 100% focused on delivering value and features to the customer.”

This involves providing an environment for developers to build, test and deploy the code they create. “I worry about monitoring, log aggregation, security and compliance,” says Livermore.

The cyber reliability engineering team provides a set of tools to support developer teams. He says the developer teams are “absolutely responsible” for monitoring the software they produce. “When there is a problem with their software out of hours, they are on call to fix it.”

The company is a big user of VMware’s real-time cloud monitoring tool, Wavefront, and also uses CloudHealth, a cloud cost management product that VMware has announced it will be acquiring. Livermore says Wavefront has transformed the way the Hive platform is monitored.

“We define an incident as the software not doing what it is supposed to,” he says. “Sometimes, we can correct an incident before it becomes a problem, which is why Wavefront is useful.” If the system monitoring is trending in a way that could lead to an incident, the problem can be fixed before any issues arise, according to Livermore.
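The idea of acting on a trend before it becomes an incident can be illustrated with a minimal sketch. This is not Hive's or Wavefront's implementation; the function name `minutes_until_breach`, the evenly spaced samples and the straight-line fit are all assumptions made for illustration. It fits a line to recent readings of a metric and estimates how long remains before the metric would reach an alert threshold.

```python
from typing import Optional, Sequence


def minutes_until_breach(samples: Sequence[float], threshold: float,
                         interval_minutes: float = 1.0) -> Optional[float]:
    """Estimate the minutes left before a metric reaches `threshold`.

    Hypothetical helper: assumes `samples` are evenly spaced readings,
    oldest first, taken `interval_minutes` apart.
    Returns 0.0 if the latest reading is already at or above the threshold,
    and None if there are fewer than two readings or the trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= threshold:
        return 0.0

    # Ordinary least-squares fit of y = intercept + slope * x, with x = 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var
    if slope <= 0:
        return None  # not trending towards the threshold

    intercept = mean_y - slope * mean_x
    # Position on the x axis where the fitted line meets the threshold,
    # measured from the most recent sample (x = n - 1).
    x_cross = (threshold - intercept) / slope
    return max(0.0, (x_cross - (n - 1)) * interval_minutes)
```

With readings of 10, 20, 30 and 40 taken a minute apart and a threshold of 100, the sketch estimates six minutes of headroom, which is the window in which a team could act before anything breaks.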

The entire end-to-end infrastructure on which the Hive platform is based – including marketing and support websites, data collection services, and the real-time store for user and analytics data – runs on Amazon Web Services (AWS).

“We’ve been in the AWS cloud from day one,” says Livermore. The core technologies used to power Hive are Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (Amazon RDS) and Amazon Simple Storage Service (Amazon S3).

A choice between private and public cloud

According to Livermore, businesses have until now needed to choose between a private and a public cloud.

Having seen a demonstration of VMware orchestration on top of AWS at VMworld in Barcelona, he says he can see big benefits, because it is no longer a case of having to choose between on-premise and off-premise.

“Our developers don’t have to care about where their code runs,” he says. “As I look at all the products bridging physical on-prem and hybrid clouds, it is really powerful not to have to worry where your workloads are. You can have the best of both worlds and leverage all your legacy investments.”

Given that pretty much 100% of Hive runs on AWS, Livermore says the company takes a proactive view of cost management. For instance, it uses a system that analyses AWS spending on a daily basis and flags spending anomalies.
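The article does not say how Hive's daily spending analysis works, so the following is only a sketch of one common approach, under stated assumptions: each day's cost is compared with the mean and standard deviation of the preceding week. The function name `spending_anomalies`, the seven-day window and the three-sigma limit are hypothetical choices, not details from Hive.

```python
from statistics import mean, stdev
from typing import List, Sequence


def spending_anomalies(daily_costs: Sequence[float], window: int = 7,
                       sigmas: float = 3.0) -> List[int]:
    """Return the indexes of days whose cost is unusually high.

    Hypothetical helper: a day is flagged when its cost exceeds the mean of
    the previous `window` days by more than `sigmas` standard deviations.
    """
    if window < 2:
        raise ValueError("window must cover at least two days")

    anomalies = []
    for day in range(window, len(daily_costs)):
        history = daily_costs[day - window:day]
        baseline = mean(history)
        spread = stdev(history)
        # Floor the spread at 1% of the baseline so that a nearly flat
        # history does not turn tiny fluctuations into alerts.
        limit = baseline + sigmas * max(spread, 0.01 * baseline)
        if daily_costs[day] > limit:
            anomalies.append(day)
    return anomalies
```

Given a week of costs close to 100 followed by a day at 180, the sketch flags only that day. A rolling window is used here so the baseline follows gradual, legitimate growth in spending rather than treating it as an anomaly.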

He adds that the cyber reliability engineering team’s role is not to become a blocker: “I am trying to provide a set of tooling that enables developers to do their work.”

However, there still needs to be some form of process. “I’m not a fan of process for process’s sake, but I believe good process can empower a business,” he says.

Livermore works with the developers to create a process that works both for the teams and for the business. This means developers can deploy their own code. “We don’t work in a traditional environment where someone else deploys code. Our developers have access to their production environment to deploy their code,” he says.

Hive on Alexa speakers

Hive was selected by Amazon to be one of the Alexa Smart Home Launch Partners for the Amazon Echo in the UK in 2016.

Livermore admits his wife is not a fan of Alexa. “There are lots of gimmicky things on Alexa, but then you find some really useful things,” he says.

For Livermore, one of those useful features is being able to boost his Hive heating system. But this raises an interesting question, which harks back to the launch of the Amazon smart speaker. The company was required to develop a set of default heating control phrases for users to speak to enable Alexa to control Hive and other smart heating products.

Unfortunately, this default vocabulary lacked one of Hive’s most useful features: the one-touch boost option to switch on hot water or heating for an hour at a preset temperature. “We were very proud to be an Alexa launch partner, but we had feedback from customers that they couldn’t boost heating,” he says.

This resulted in the company receiving plenty of negative feedback about the Hive skill for Alexa, even though the problem was actually with Amazon and its specifications for smart home voice control. To fix the problem, Hive needed to release a second Hive skill for Alexa, so that it could implement a voice command to support “boost”.
