Why graph technology is for all our digital futures

This is a guest blogpost by Neo4j’s Jim Webber, who says graphs are a way of managing complexity that is all around us, so why not work with it directly?

As DB-Engines has been tracking, graph databases have been quietly rising in popularity for years. Surprisingly, despite all this success, there is solid evidence that graphs are only at the start of their acceleration. The reason is the close match between how graph technology works with information and how the digital and physical worlds deal with complexity.

Graphs all around us

This isn’t hubris. Open up Wikipedia, and look up ‘domain model’. You’ll find that a domain model is defined as “a system of abstractions that describes selected aspects of a sphere of knowledge, influence or activity.” This sounds a lot like a Digital Twin of a supply chain, or a real-time model of a complex financial instrument, or a research project to better understand a complex disease like Covid, all of which are being accomplished with graphs right now.

But the Wikipedia article also uses a diagram to illustrate the concept, an abstract sketch of a health insurance plan with patients, doctors, and providers. Anyone working in modern data applications will see at once that it is very clearly a graph model—it’s made up of nodes with attributes linked by relationships. You can try the same experiment with Google Images. If you search for ‘domain model’, what you will see is page after page of graphs.
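To make "nodes with attributes linked by relationships" concrete, here is a minimal sketch of the health insurance example as a tiny in-memory property graph in Python. The class names, the sample people, and the relationship types (`TREATED_BY`, `WORKS_FOR`) are illustrative assumptions; they are not taken from the Wikipedia diagram or from any graph database API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with a label and free-form attributes (hypothetical model)."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed link between two nodes."""
    start: Node
    type: str
    end: Node


# Assumed example data: names and relationship types are invented.
alice = Node("Patient", {"name": "Alice"})
dr_bob = Node("Doctor", {"name": "Dr Bob"})
clinic = Node("Provider", {"name": "City Clinic"})

relationships = [
    Relationship(alice, "TREATED_BY", dr_bob),
    Relationship(dr_bob, "WORKS_FOR", clinic),
]


def neighbours(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


def providers_for(patient):
    """Patient -> doctors -> providers, expressed as two hops."""
    return [
        provider.properties["name"]
        for doctor in neighbours(patient, "TREATED_BY")
        for provider in neighbours(doctor, "WORKS_FOR")
    ]


print(providers_for(alice))
```

The question "which providers does this patient depend on?" reads as a walk along relationships, which is the shape the diagram already has.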

This intuitively makes sense since complex domain models are always mapped as graphs. And where the domain model is a graph, why not elegantly store the domain model in a graph database? In my view, a graph database is in fact the default technology for applications because there is no cognitive gap between the domain and data models.

Up until now, the world has tried to capture domain models in relational database format, or latterly as disconnected NoSQL database records. In theory, this is, of course, possible. In fact, it’s been mathematically proven that you can take any data and represent it in a relational database if you want to. But in practice, doing so means you have to break it down into tables (aka relations), add foreign keys to connect the rows, and deal with all the arcane complexity of the relational database world. It may be possible, but it’s awkward at best.
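As a rough illustration of that overhead, the sketch below stores a single "patient is treated by doctor" relationship using Python's built-in `sqlite3` module. The schema and the sample data are hypothetical; the point is only that the relationship turns into a separate table of foreign keys, and following it requires joins.

```python
import sqlite3

# In-memory database; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE doctor  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The relationship itself needs its own table of foreign keys.
    CREATE TABLE treated_by (
        patient_id INTEGER NOT NULL REFERENCES patient(id),
        doctor_id  INTEGER NOT NULL REFERENCES doctor(id),
        PRIMARY KEY (patient_id, doctor_id)
    );
    INSERT INTO patient VALUES (1, 'Alice');
    INSERT INTO doctor  VALUES (10, 'Dr Bob'), (11, 'Dr Carol');
    INSERT INTO treated_by VALUES (1, 10), (1, 11);
    """
)


def doctors_of(patient_name):
    """Reassemble the relationship with two joins through the link table."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM patient p
        JOIN treated_by t ON t.patient_id = p.id
        JOIN doctor d     ON d.id = t.doctor_id
        WHERE p.name = ?
        ORDER BY d.name
        """,
        (patient_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(doctors_of("Alice"))
```

Each further hop in the domain (doctor to provider, provider to plan) would add another link table or foreign key and another join to the query.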

It’s the same story for NoSQL databases. If you move to a document model, every single time there’s a connection or relationship, you have to manually encode that. You, the application developer, have to write a database layer on top of your database. And maintain it.
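A small sketch of what that hand-written layer can look like, using plain Python dictionaries to stand in for a document store. The document ids, field names, and the `resolve_doctors` helper are all hypothetical and do not represent any particular NoSQL product.

```python
# Documents keyed by id; relationships exist only as id strings inside them.
# All ids and field names here are invented for illustration.
documents = {
    "patient:1": {"name": "Alice", "doctor_ids": ["doctor:10", "doctor:99"]},
    "doctor:10": {"name": "Dr Bob"},
}


def resolve_doctors(patient_id):
    """Manually follow stored ids; the store itself does not check them."""
    patient = documents[patient_id]
    found, dangling = [], []
    for doc_id in patient["doctor_ids"]:
        doc = documents.get(doc_id)
        if doc is None:
            # Nothing stopped us from saving a reference to a missing document.
            dangling.append(doc_id)
        else:
            found.append(doc["name"])
    return found, dangling


print(resolve_doctors("patient:1"))
```

The lookup, the handling of dangling references, and any reverse lookup (which patients does a doctor treat?) are all application code that has to be written and maintained.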

With graphs, your domain model has a one-to-one relationship with the data model. At some point, most developers trying to model a complex process, or trying to build a model scalable and accurate enough for prediction, will recognise that there’s no value in encoding a domain model in a document or relational form. They might as well skip straight to the one that works and is performant, secure and stable: graph technology.

Digitise any domain model we want

The long arc of business IT history points to more applications and more use cases being accepted as great fits for graph databases or graph data science (GDS).

In the future, more enterprises and governments will be adopting graph technology. They will be leveraging relationships in data that map ever more tightly onto the real world to digitise complex domain models where they want to make predictions and spot important patterns and trends.

The graph database has not just had a stunningly successful first ten-plus years; it also looks set to be a major player in the way we work with complex data for a long time to come.

The author is Chief Scientist at Neo4j, a graph database company, and co-author of Graph Databases (O’Reilly) and Graph Databases For Dummies (Wiley).