Neo4j on ISO GQL: A defining moment in the history of database innovation
This is a guest post for the Computer Weekly Developer Network written by Philip Rathle in his role as chief technology officer at Neo4j.
Neo4j offers developer services to power applications with knowledge graphs, backed by a graph database with vector search.
Rathle writes in full as follows…
Early this year, ISO published a new standard for database languages: ISO GQL.
It’s a peer language to SQL and the first new database language ISO has approved since 1987, when it first ratified SQL. This is a really big deal.
Database management systems (DBMS) that work with your data as a graph are called graph databases. The ISO GQL (Graph Query Language) standard works with a particular type of graph called a property graph, which the standard itself also defines. The property graph model dates back over two decades and it has gained significant popularity in recent years.
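To make the idea of a property graph concrete, here is a minimal Python sketch. It is only an illustration of the general model (nodes and relationships that each carry labels or a type plus key-value properties), not the formal definition in the GQL standard, which has more detail. The class names, field names and sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """A node: a set of labels plus key-value properties."""
    node_id: int
    labels: frozenset[str]
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection that can carry its own properties."""
    rel_id: int
    rel_type: str
    start: int  # node_id of the source node
    end: int    # node_id of the target node
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    """Illustrative container; a real graph database stores this very differently."""
    nodes: dict[int, Node] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_node(self, node_id, labels, **props):
        node = Node(node_id, frozenset(labels), dict(props))
        self.nodes[node_id] = node
        return node

    def add_relationship(self, rel_type, start, end, **props):
        # Relationships may only connect nodes that already exist.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must already exist")
        rel = Relationship(len(self.relationships), rel_type, start, end, dict(props))
        self.relationships.append(rel)
        return rel


# Hypothetical sample data
graph = PropertyGraph()
graph.add_node(1, {"Person"}, name="Alice")
graph.add_node(2, {"Person"}, name="Bob")
graph.add_node(3, {"Company"}, name="Acme")
graph.add_relationship("KNOWS", 1, 2, since=2019)
graph.add_relationship("WORKS_AT", 2, 3, role="Engineer")
```

The point to notice is that the relationships are first-class records with their own properties, rather than something reconstructed at query time from foreign keys.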
Having an international standard for the language behind graph databases is timely and valuable in an information landscape where data is increasingly dynamic and interconnected. The fact that ISO has invested more than five years in creating this standard is a testament to the generational importance of this technology.
Why graphs, why now?
Standards channel trends – that is, they acknowledge an emerging technological need or movement and help put uniform methods in place to accelerate technology development.
Graph databases are certainly trending, thanks to the macro trend that the world, and therefore its data, is becoming ever more connected. They are increasingly being used by a wide variety of organisations to solve all kinds of problems, from fraud detection to real-time recommendations and supply chain management. Most recently, they’re helping improve the quality and security of generative AI applications, where knowledge graphs and GraphRAG appear to have a lasting role in the success of AI within the enterprise. Just this year, Gartner placed knowledge graphs, something for which graph databases are purpose-built, at the centre of its 2024 Technology Impact Radar (which ranks the 30 most impactful technologies with the potential for disruption), as well as its 2024 ‘impact radar’ for generative AI.
Graph technology is having its moment as it solves real-world, high-value problems across a broad range of use cases and applications – helping to find hidden patterns and relationships across billions of data connections, deeply, easily and quickly. This is invaluable across both the operational database world and the world of analytics and AI.
The power of standards
Standards shape industries and are a friend to the CIO, the developer and the ecosystem alike. For CIOs, they are the best antidote to vendor lock-in and guarantee access to a large pool of common skills. For developers, they make it possible to concentrate on building the best apps, rather than learning new technologies. And for the ecosystem, they provide clear integration patterns that amplify the value and reach of technology investments.
All three are protection against obsolescence.
So, GQL has made it much easier for those looking to experiment with – and depend on – graph technology by providing a uniform, standardised language.
What does GQL look like?
As a peer, complementary language to SQL that originates from the same organisation and committee as the one behind SQL, it won’t surprise you to learn that GQL resembles SQL in many ways. The two languages share the same data types and many of the same keywords and commands.
Of course, GQL also includes parts that cater specifically to the unique aspects of graph databases. The absolute core of the language is the ASCII-art-inspired way of defining patterns – that is, using plain text to draw the connections. A few graph database languages share this approach, which traces back to 2011 and Neo4j’s Cypher language. Through the openCypher project, Cypher has become the de facto standard graph query language used by Neo4j, Amazon Neptune and numerous others.
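As a rough illustration of what an ASCII-art pattern looks like, the Python sketch below holds a query string written in the Cypher style described above, with round brackets for nodes and an arrow for the relationship, and then evaluates the same single-hop pattern by hand over a tiny in-memory graph. The query text, the sample data and the match_one_hop helper are all hypothetical; the exact syntax should be checked against the GQL or Cypher documentation before relying on it.

```python
# Nodes are drawn in round brackets, relationships as arrows with the
# relationship type in square brackets: (node)-[:TYPE]->(node).
# The query text is an assumption written in Cypher style, not quoted from the standard.
QUERY = "MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name"

# Hypothetical sample data: node id -> labels and a name property.
nodes = {
    1: {"labels": {"Person"}, "name": "Alice"},
    2: {"labels": {"Person"}, "name": "Bob"},
    3: {"labels": {"Company"}, "name": "Acme"},
}

# Directed relationships as (start node id, type, end node id).
edges = [
    (1, "KNOWS", 2),
    (1, "WORKS_AT", 3),
    (2, "WORKS_AT", 3),
]


def match_one_hop(nodes, edges, start_label, rel_type, end_label):
    """Hand-rolled stand-in for (a:start_label)-[:rel_type]->(b:end_label)."""
    rows = []
    for start, kind, end in edges:
        if (
            kind == rel_type
            and start_label in nodes[start]["labels"]
            and end_label in nodes[end]["labels"]
        ):
            rows.append((nodes[start]["name"], nodes[end]["name"]))
    return rows


rows = match_one_hop(nodes, edges, "Person", "KNOWS", "Person")
print(rows)  # [('Alice', 'Bob')]
```

The loop spells out what the one-line pattern asks for: the shape of the query text mirrors the shape of the data it is looking for, which is what makes the notation readable.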
Path to GQL
The good news is that the shortest path to GQL is Cypher, which over the last decade has become the de facto database language for graphs.
It is not always the case that a de jure standard inherits strongly from the de facto standard. When this does happen, it is clearly a best-case scenario for end users and vendors, who can then double down on skills and technology rather than doing a 180-degree pivot.
This was possible thanks to a number of factors.
First, Cypher was a major input into GQL. And as inputs go, it was of especially high quality. By the time the standards work kicked off in 2019, Cypher had already undergone nearly a decade of real-world trial-by-fire maturation.
Second, Cypher itself was originally modelled on SQL. The “no idle variance from SQL” principle that drove Cypher’s development from the start turned out to be prescient. In the early days of Cypher, no one would have imagined that it would become a significant input to an ISO standard – let alone end up on a convergence course with one. But the principle made sense. Why invent something new to do something people are already used to doing?
There is one more reason: the team behind Cypher and openCypher has been deeply involved in developing the GQL standard. During the five years or so it took to produce GQL, around half a dozen Neo4j engineers collaborated with numerous other industry experts, joining up as full-time standards committee members. As aspects of the GQL standard crystallised, Cypher itself was made to align with the forthcoming standard. The roads were smoothed in anticipation of GQL.
As a result, the majority of graph applications written today – together with the skills of those who wrote them – are already highly aligned with GQL.
Closing thoughts
This is a significant milestone for the database industry and one that paves the way for the easier and wider adoption of graph technology.
Standards, after all, are meant as a foundation for enduring technologies. GQL’s ratification by ISO is a firm vote of confidence in graphs.
We’re excited about the possibilities and look forward to seeing what innovation and disruption GQL has the potential to drive.