Data engineering - SAS: The install & rise of the citizen data engineer
This is a guest post for the Computer Weekly Developer Network written by Gordon Robinson in his capacity as senior director of data management at SAS.
Robinson writes in full as follows…
Over the past couple of decades, the role of the “data engineer” has emerged as businesses have become increasingly data-driven, creating a need for specialised skills to manage the large volumes of data being processed.
Google popularised the title itself around 2010, underscoring the necessity for experts to design, build and maintain scalable data systems. The advent of technologies like Hadoop and Spark further emphasised the importance of these roles, which bridge raw data and actionable insights, aiding data scientists and analysts.
Demand for data engineers
In recent years, substantial advancements in AI, increased AI accessibility and rapid growth of data-driven decision-making have led to greater – and mounting – demands on data engineering teams. In fact, job postings for AI-related roles have surged by 119% over the past two years and, mirroring that change, data engineering positions have seen a 98% increase, says LinkedIn.
Many businesses are struggling to meet the demand for data engineers and fill these critical positions. Skilled data engineers are hard to find because demand outpaces supply, and the most talented often command a premium. As a result, other employees, such as software developers and data scientists, often take on responsibilities typically managed by data engineers. These tasks include managing data pipelines, ensuring data quality and building infrastructure for data processing.
The term “citizen data engineer” refers to these non-specialist professionals, typically from roles such as business analysts, operations staff or software developers, who develop the necessary skills to perform basic data engineering tasks.
Businesses are increasingly relying on these often self-trained individuals to alleviate some of the workload of their data engineering team (if they have one).
Tools & training
Given the lack of formal training for citizen data engineers, the onus falls on companies to ensure that these employees have the appropriate tools they need to simplify their tasks and bolster their knowledge. Some key requirements for these tools are:
- Low-code/no-code: An LC/NC interface is crucial for a citizen data engineer as it enables them to perform data-related tasks efficiently without requiring extensive programming knowledge, thereby bridging the gap between technical and non-technical professionals.
- Generative AI: It is essential for data engineering tools to offer generative AI functionalities to support citizen engineers. This should encompass features such as natural language queries for locating relevant data sets, automatic suggestions for transformations and the generation of data flows.
- Collaboration: Collaboration capabilities are vital for data engineering tools, enabling cooperation among data engineers, data scientists, business analysts and end users.
- Data governance: Integrated data governance and lineage are essential for organisations to understand their data, its quality and usage. Tools for citizen data engineers should simplify this management and help the organisation comply with any regulations that apply to it.
Given the above, organisations that want to enable citizen data engineers will need to look beyond the traditional coding-only interfaces that data engineers may use.
Looking to the future
Whilst the number of citizen data engineers will continue to rise, this will not reduce the demand for qualified data engineers.
Colleges and businesses will need to continue developing individuals with skills in this area, as the number of complex data engineering tasks that cannot be managed by a citizen data engineer will continue to increase.
The citizen data engineer is an important, yet still under-discussed, part of organisations of all kinds. By investing in tools that support the function of a data engineer while allowing less technically skilled employees to step up into those roles, organisations will be building lightweight and efficient workflows for their data strategy.
The increasing number of citizen data engineers points to a larger trend affecting the technology space as a whole: data analytics for all.
For employees in legal, HR or even compliance, being able to aggregate and draw conclusions from their data will only grow in importance.
This skill will also be a limiting factor on how well AI is deployed across an enterprise, since the best results will only come from models fed with good-quality, clean data.