Twitter plots Hadoop cluster migration to Google Cloud Platform

Social media giant plans to offload some of its Hadoop clusters to the Google Cloud Platform to boost the resiliency of its infrastructure

Twitter has outlined plans to move a sizeable slice of its Hadoop data-processing platform to the Google Cloud to boost the resiliency of the social media site’s underlying infrastructure.

The company is known to operate multiple Hadoop clusters containing more than 300 petabytes of data, and confirmed on its engineering blog in January 2017 that the largest of them comprises 10,000 nodes.

These clusters have traditionally run in Twitter’s own datacentres, and are used to carry out data warehousing and processing tasks, as well as supporting the real-time aspects of the micro-blogging platform’s inner workings.

In a follow-up post, published on 3 May 2018, the company said it is now working with Google to shift its cold storage and flexible compute Hadoop clusters into the search giant’s cloud platform.

“Over the past few years, we have been assessing our platform and infrastructure needs to make sure we are well positioned to keep up with the growing needs of our service,” wrote Twitter CTO Parag Agrawal in the blog post.

“This migration, when complete, will enable faster capacity provisioning; increased flexibility; access to a broader ecosystem of tools and services; improvements to security; and enhanced disaster recovery capabilities.

“Architecturally, we will also be able to separate compute and storage for this class of Hadoop workloads, which has a number of long-term scaling and operational benefits.”

Brian Stevens, CTO of Google Cloud, said the partnership should pave the way for even greater levels of technological collaboration between the two companies in years to come.

“There is strong alignment with Twitter’s engineering strategy to meet the demands of its platform and the services Google Cloud offers at a global scale,” said Stevens.

“Google Cloud Platform’s data solutions and trusted infrastructure will provide Twitter with the technical flexibility and consistency that its platform requires, and we look forward to an ongoing technical collaboration with their team.”

News of the tech tie-up between the two firms coincides with Twitter’s admission that a bug in its systems may have resulted in some user passwords being stored insecurely. While the company claims there are no signs that the affected passwords have been misused or breached, Twitter is advising users to consider changing their login details as a precaution.
