Data engineering - Hasura: Defeating the data doom loop with supergraph architecture

This is a guest post for the Computer Weekly Developer Network written by Ken Stott in his role as field CTO at Hasura.

Hasura is known for its GraphQL platform, which is designed to provide data connectivity and access. GraphQL itself is an open source query language commonly used to let APIs connect to various data sources, including databases.

Stott is a recognised speaker on data fabric design and enterprise architecture, specialising in financial services, healthcare and energy… he is also known for his views on supergraph patterns and wider next-generation data architectures. He writes in full as follows…

Many organisations find themselves trapped in a “data doom loop,” where increasing complexity and inefficiency erode trust in data. 

I want to explore the “supergraph architecture” as a key element of a modern approach to data engineering. It is a transformative option that offers a centralised data access service, streamlines data consumption and enhances quality, ultimately enabling organisations to use their data assets more effectively.

Data doom loop defined

To manage their vast amounts of data, organisations are investing heavily in data governance and infrastructure modernisation. Yet many find themselves trapped in a frustrating cycle of increasing complexity and diminishing returns. This phenomenon, aptly termed the “data doom loop”, is hindering the ability of enterprises to serve high-quality data efficiently and to adapt to the pressure for effective data use that has come with the advent of AI and large language models (LLMs).

At its core, the data doom loop is a self-perpetuating cycle where increased investments in data infrastructure inadvertently lead to more complex systems, inefficient data environments and a continuous justification for further spending without achieving the desired data-driven value. This negative reinforcing loop manifests in various ways across the organisation.

Inside the doom loop, data producers find themselves spending more time fixing data problems and operating platforms than fulfilling high-quality data requests. Consumers struggle to identify correct data sources and owners, often engaging multiple stakeholders with disparate delivery technologies and request processes. Data scientists spend an inordinate amount of time on data preparation, sometimes even building additional systems to acquire, validate and aggregate data.

Meanwhile (and we’re still inside the doom loop here, folks), data governance teams grapple with defining clear lineage from origin to critical reports, constantly investigating quality issues without implementing systemic solutions. Executives, frustrated by the lack of trust in their data, may create shadow IT initiatives that ultimately exacerbate the problem.

Journey to Mount Doom

Several factors contribute to the persistence of the data doom loop. The sheer scale of enterprise operations presents significant challenges, with thousands of applications, dozens of critical data warehouses and a rapidly growing volume of data that typically doubles every two years. The average lifespan of applications and data feeds, ranging from five to ten years, further compounds the complexity over time.

While data governance has evolved to address regulatory compliance, it often falls short of delivering tangible business value. Organisations therefore struggle to capitalise on governance efforts for direct improvements, hindered by persistent data quality issues and the complexity of ever-changing data environments.

The proliferation of database technologies and data movement tools, while aimed at improving access and performance, often overwhelms organisations with increased complexity and integration challenges. Similarly, adopting microservices and API architectures, though designed for agility, can contribute to data fragmentation and impede effective governance.

What is supergraph architecture?

To escape the data doom loop, organisations need a paradigm shift in their approach to enterprise data management. Enter the supergraph architecture – a centralised data access service that alters the relationship between data producers and consumers.

The supergraph acts as a unified interface to access data across diverse sources, breaking down silos and simplifying data discovery and consumption. It emphasises domain-based data ownership, encouraging reuse and reducing duplication. By offering standardised data access features like composition, selection and aggregation, the supergraph enables data consumers to build tailored datasets without complex engineering tasks or overreliance on data producers. A key feature of the supergraph is its metadata-driven infrastructure, which enables automation, streamlines governance processes and provides the flexibility to integrate with emerging technologies like AI and LLMs.
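To make the idea concrete, here is a minimal sketch in Python of a unified access layer with domain ownership, selection, composition and aggregation. It is a toy model under stated assumptions, not Hasura's API or a real supergraph implementation: the `Supergraph` class, its methods, the `aggregate` helper, the domain names and the sample rows are all hypothetical, and a production supergraph would typically expose these operations through GraphQL queries rather than Python calls.

```python
from collections import defaultdict
from typing import Any, Callable, Optional

Row = dict[str, Any]


class Supergraph:
    """Hypothetical toy access layer: domains register sources, consumers query by name."""

    def __init__(self) -> None:
        self._sources: dict[str, Callable[[], list[Row]]] = {}
        self._owners: dict[str, str] = {}  # metadata: which domain owns which source

    def register(self, name: str, owner: str, fetch: Callable[[], list[Row]]) -> None:
        """A producing domain publishes a source once, for every consumer to reuse."""
        self._sources[name] = fetch
        self._owners[name] = owner

    def owner_of(self, name: str) -> str:
        """Look up the owning domain from metadata."""
        return self._owners[name]

    def select(self, name: str, fields: list[str],
               where: Optional[Callable[[Row], bool]] = None) -> list[Row]:
        """Selection: keep only the requested fields of rows that pass the filter."""
        rows = self._sources[name]()
        if where is not None:
            rows = [row for row in rows if where(row)]
        return [{field: row[field] for field in fields} for row in rows]

    def compose(self, left: str, right: str, key: str) -> list[Row]:
        """Composition: inner-join two sources on a shared key."""
        index: dict[Any, list[Row]] = defaultdict(list)
        for row in self._sources[right]():
            index[row[key]].append(row)
        return [
            {**l_row, **r_row}
            for l_row in self._sources[left]()
            for r_row in index.get(l_row[key], [])
        ]


def aggregate(rows: list[Row], group_by: str, field: str) -> dict[Any, float]:
    """Aggregation: sum a numeric field per group."""
    totals: dict[Any, float] = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[field]
    return dict(totals)


# Two hypothetical domains publish their data through the same interface.
graph = Supergraph()
graph.register("customers", "sales-domain", lambda: [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": "APAC"},
])
graph.register("orders", "finance-domain", lambda: [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 50.0},
])

# A consumer builds a tailored dataset without asking either producer for a new feed.
joined = graph.compose("customers", "orders", key="customer_id")
revenue_by_region = aggregate(joined, group_by="region", field="amount")
emea_customers = graph.select(
    "customers", ["customer_id"], where=lambda row: row["region"] == "EMEA"
)
```

The point of the sketch is the division of labour: each domain registers and owns its source once, while consumers combine sources through a small, standard set of operations instead of commissioning bespoke pipelines.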

Implementing a supergraph architecture can yield significant benefits for organisations struggling with data management. It accelerates time-to-market for new data products and insights, enhancing agility and responsiveness to market changes. By centralising data governance and reducing development efforts, it can contribute to lower data management and governance costs.

The unified data access layer promotes consistent data quality practices and provides stakeholders with a single source of truth, enhancing trust in data and enabling better decision-making. Moreover, the supergraph’s flexibility and scalability enable organisations to use AI, LLMs and other advanced technologies more effectively to derive greater value from their data.

Domain-centric data dominance

The data doom loop is a pervasive challenge that has long plagued organisations striving to become truly data-driven. By addressing the root causes of data management issues and embracing a holistic, domain-centric approach like the supergraph architecture, enterprises can transform their data environments.

This transformation creates new value and efficiencies in data utilisation, enabling organisations to leverage their data assets as a strategic advantage for innovation, competitive differentiation and data-driven decision-making. Those who successfully break free from the data doom loop will be well-positioned to deliver real business value to their organisation, customers and stakeholders.

Ken Stott’s career in data engineering has seen him guide Fortune 500 companies in implementing cutting-edge data management strategies through supergraph architectures. His career spans Wall Street trading tech leadership, CIO roles at Koch Industries, Enron and Scottish Re, plus 13 years leading data architecture initiatives at Bank of America.