
Why developers must work smarter, not just faster, with generative AI

With much at stake, including legal precedents yet to be set, how can dev teams approach GenAI risks and build strong mitigation strategies?

Managing generative artificial intelligence (GenAI) tools will entail big changes in culture and procedures as their use continues to spread like wildfire through developer teams.

According to Kiran Minnasandram, vice-president and chief technology officer for Wipro FullStride Cloud, this is not just about adopting new tools, but transforming how developers interact with technology, solve problems and create new paradigms in software engineering.

A “comprehensive cultural and procedural metamorphosis” is needed, he says, to properly manage the risks associated with GenAI, which range from hallucinations, technical bloat, data poisoning and input manipulation or prompt injection to intellectual property (IP) violations and theft of GenAI models themselves.

“You’ve got to worry about the validity of the model,” says Minnasandram. “You’ve got to worry about model drift or model hallucinations. Every model is based on data, and data inherently has bias. Even if it is a small percentage of bias, and you start to extrapolate that to more and more and more data, the bias is only going to increase.”

For that reason, organisations must be “very careful” about the data with which they engage the models, because bias will find its way in. When organisations extrapolate from limited datasets, results are restricted to that data’s quality and quantity. The data needed may be sensitive and private, and anything not explicitly available in your own datasets can easily introduce model hallucination.

“You therefore need good mitigation strategies, but it’s all on a case-by-case basis,” says Minnasandram. “We’ve got to be very cautious. For instance, if it’s sensitive data, how do you anonymise it without losing data quality?”

Generated content can need guardrails, too. Even with source-code generation, where the machine completes code a developer has started, that code is not finished work. Appropriate guardrails may entail measuring the quality of that generated content, he says.

Responsibility frameworks

Realising enterprise value will require responsibility frameworks that cover individual use, as well as the technology and its technicalities in a given environment. Wipro has developed its own framework, and considers how it should be adopted and implemented, including internally and while maintaining responsiveness to clients.

That includes working to fully understand risk exposures around code review, security and auditing, regulatory compliance and more, in order to develop guardrails.

The good news is that more code quality and performance improvement tools are emerging, including code and compiler optimisation, for integration into CI/CD pipelines, says Minnasandram.
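As a minimal sketch of how such a gate could sit in a pipeline, the script below runs a linter and the test suite, and fails the build if either reports problems. The tool choices (flake8, pytest) and the src/ layout are illustrative assumptions, not anything Minnasandram prescribes.

```python
"""Minimal CI quality gate sketch: run static checks and tests over a
codebase (including AI-generated code) and fail the pipeline on errors.
flake8/pytest and the src/ path are illustrative assumptions."""
import subprocess
import sys

CHECKS = [
    ["flake8", "src/"],     # style and basic static analysis
    ["pytest", "--quiet"],  # generated code must still pass the tests
]

def main() -> int:
    for cmd in CHECKS:
        print("Running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("Quality gate failed on:", cmd[0])
            return result.returncode
    print("All quality gates passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```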

It cannot be a matter of simply setting GenAI aside, however. Demand for tasks such as code refactoring, and for more advanced techniques such as predictive coding or collaborative coding, where a machine “sits with the dev” and does the initial code lifting, is rising.

Don Schuerman, chief technology officer (CTO) of workflow automation company Pegasystems, says the key challenges are not from a lack of code so much as “a mountain of technical debt”, with poorly managed GenAI simply increasing tech burdens.

For that reason, he sees GenAI as better used for tasks other than “cranking out code”.

“Far better to use GenAI to step back into the business problem that code is trying to solve: how do we optimise a process for efficiency? What’s the fastest way to support our customers while adhering to regulatory guidelines?” he says. “Design the optimal workflows of the future, rather than cranking out code to automate processes we already know are broken.”

Workplace pressures

Even with experienced and skilled oversight at all levels, with code edited and checked after it has been written, workplace pressures can introduce errors and mean things get missed, he agrees.

Ensure users have “safe versions of the tools”, then use GenAI to “get ahead of the business”. With low-code tools, IT teams often found themselves cleaning up shadow IT failures, and the same could be true with GenAI. It is more useful to deploy it deliberately to deliver speed and innovation within guardrails that also ensure compliance and maintainability, Schuerman points out.

Adopt methods such as retrieval-augmented generation (RAG) to help control how GenAI accesses knowledge without the overhead of building and maintaining a custom large language model (LLM), creating knowledge “buddies” that answer questions based on a designated set of enterprise knowledge content. RAG can help prevent hallucinations while ensuring citations and traceability.
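A minimal sketch of the pattern is below, assuming a toy document store and naive word-overlap retrieval in place of real vector search; the document IDs, content and `build_prompt` helper are invented for illustration, and the model call itself is left out.

```python
"""Minimal retrieval-augmented generation (RAG) sketch.
Retrieval here is naive word-overlap scoring for illustration;
a real system would use vector embeddings. The final prompt is
passed to whichever LLM API the organisation has approved."""

DOCS = {
    "policy-001": "Refunds are processed within 14 days of a valid claim.",
    "policy-002": "Customer data must be encrypted at rest and in transit.",
}

def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    """Return the k documents sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(
        DOCS.items(),
        key=lambda item: len(q & set(item[1].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Ground the prompt in retrieved passages and demand citations."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(query))
    return (
        "Answer using ONLY the sources below, citing their IDs. "
        "If the answer is not in the sources, say so.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

print(build_prompt("How quickly are refunds processed?"))
```

Because every answer must cite a retrieved source ID, hallucinated claims become easier to spot and trace back.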

Use GenAI to generate the models – workflows, data structures, screens – that can be executed by scalable, model-driven platforms. The risk comes from using GenAI to “turn everyone into developers”, creating more bloat and technical debt, says Schuerman.

Limit it to generating workflows, data models, user experiences and so on that represent the optimal customer and employee experience, grounded in industry best practices. If you do that, you can execute the resulting applications in enterprise-grade workflow and decisioning platforms that are designed to scale.

“And if you need to make changes, you aren’t going into a bunch of generated code to figure out what’s happening – you simply update business-friendly models that reflect the workflow steps or data points in your application,” says Schuerman.
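To make that concrete, here is a toy sketch of the approach, assuming an invented workflow format and a stand-in for the kind of engine Schuerman describes; a real model-driven platform would be far richer.

```python
"""Sketch of the model-driven idea: GenAI emits a declarative workflow
model (plain data), and a platform engine interprets it, so changes mean
editing the model, not wading through generated code. The format and
steps here are invented for illustration."""

# A workflow model a GenAI tool might generate: data, not code.
workflow = {
    "name": "customer_refund",
    "steps": [
        {"action": "collect", "fields": ["order_id", "reason"]},
        {"action": "decide", "field": "amount", "max": 100},
        {"action": "notify", "channel": "email"},
    ],
}

def run_workflow(model: dict, context: dict) -> None:
    """A toy stand-in for an enterprise engine: interpret each step."""
    for step in model["steps"]:
        if step["action"] == "collect":
            print("Collecting fields:", step["fields"])
        elif step["action"] == "decide":
            approved = context[step["field"]] < step["max"]
            print(f"Auto-approve under {step['max']}: {approved}")
        elif step["action"] == "notify":
            print("Notifying via", step["channel"])

run_workflow(workflow, {"amount": 42})
```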

Chris Royles, Europe, Middle East and Africa (EMEA) field CTO at data platform provider Cloudera, says it’s important to also train people to augment their prompts with better, more relevant information. That may mean providing a limited, thoroughly vetted collection of datasets and instructing the generative tool to only use data that can be explicitly found in those datasets and no others.

Without this, it can be tough to ensure your own best practice, standards and consistent principles when building new applications and services with GenAI, he says.
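As an illustrative sketch of that kind of constraint, assuming an invented set of vetted snippets and prompt wording (not Cloudera guidance), a grounded prompt might be assembled like this:

```python
"""Sketch of a grounded prompt: confine the model to a vetted dataset
and tell it to refuse anything outside it. The snippets and wording
are illustrative assumptions."""

VETTED_SNIPPETS = [
    "Standard: all services expose /healthz for liveness checks.",
    "Standard: secrets are read from the environment, never hardcoded.",
]

def grounded_prompt(question: str) -> str:
    """Build a prompt that restricts answers to the vetted data."""
    data = "\n".join(f"- {s}" for s in VETTED_SNIPPETS)
    return (
        "Use ONLY the vetted data below. If the answer cannot be found "
        "explicitly in it, reply 'not in the vetted data'.\n"
        f"Vetted data:\n{data}\n\nQuestion: {question}"
    )

print(grounded_prompt("Where should services read secrets from?"))
```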

“Organisations should think quite clearly about how they bring AI into their own product,” says Royles. “And with GenAI, you’re using credentials to call third-party applications. That is a real concern, and protecting credentials is a concern.”

You always want to be able to override what the GenAI does, he says.

Make development teams broader, with more accessibility and shorter test cycles. Built applications should be testable for validation features, such as whether the right encryption frameworks have been used and whether credentials have been protected appropriately.
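One hedged sketch of such a validation check is below, assuming illustrative regex patterns and a src/ layout; a real pipeline would use a dedicated secret scanner and a proper secrets manager.

```python
"""Sketch of an automated validation check: scan source files for
hardcoded credentials before release. The patterns and src/ path
are illustrative assumptions, not an exhaustive scanner."""
import re
from pathlib import Path

SECRET_PATTERNS = [
    re.compile(r"(api[_-]?key|password|secret)\s*=\s*['\"][^'\"]+['\"]", re.I),
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
]

def scan(root: str = "src") -> list[str]:
    """Return a finding per line that matches a credential pattern."""
    findings = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if any(p.search(line) for p in SECRET_PATTERNS):
                findings.append(f"{path}:{lineno}: possible hardcoded credential")
    return findings

if __name__ == "__main__":
    for finding in scan():
        print(finding)
```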

Royles adds that GenAI can be used for other dev-related tasks, such as querying complex contracts, or checking whether it is in fact legal to build or use the application in the first place. This, too, must be managed carefully due to the risk of hallucinated, non-existent legal proofs or precedents.


Bans won’t work

Tom Fowler, CTO at consultancy CloudSmiths, agrees that forbidding devs to use GenAI will not work. People will typically choose to use tech they perceive as making their lives easier or better, whether that flies in the face of company policy or not.

However, organisations should still apply themselves to avoiding the slippery slope to mediocrity or the “rubbish middle” that is a real risk when inadequate oversight or a team with too much technical debt seeks to use GenAI to patch over a gap in their dev skillset. “Organisations need to be cognisant of and guard against that,” says Fowler. “You need to try to understand what LLMs are good at and what they’re bad at.”

While capabilities are evolving quickly, LLMs are still “bad” at helping people write code and get it into production. Some restrictions might need to be placed on their use by developer teams, and organisations will still have a requirement for software engineering, including good engineers with solid experience and strong code review practices.

“For me, you can use GenAI to help you solve lots of small problems,” says Fowler. “You can solve a very small task very, very quickly, but they just don’t have the capability of holding large amounts of complexity – inherited systems, engineering systems designed to be able to solve big problems. That’s where humans are good. You need insight, you need reasoning, you need the capability to hold this big picture in your head.”

This can actually mean you’ll be looking at upskilling your dev teams, rather than hollowing them out to save money, he agrees.

A good engineer can functionally decompose what he or she is trying to do into lots of small problems, and GenAI can be applied to those individual chunks. When GenAI is asked for help with a big, complex problem, or to do something end to end, “you can get rubbish”.

“You either get code that’s not going to work without some massaging, or just get bad ‘advice’,” says Fowler. “It’s about helping to scale your team and do more with less [partly as a result]. And the advent of multiple modalities, and domain-specific models, whether built from scratch or fine-tuned, will be 100% the future.”
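As a hedged illustration of that decomposition, assuming an invented invoice-import task: each small function below is specified tightly enough for a single GenAI prompt, while the overall composition, the big picture, stays with the engineer.

```python
"""Sketch of functional decomposition before involving GenAI:
break one large job into small, independently promptable units.
The task and function names are invented for illustration."""

# Instead of one prompt ("build an invoice importer"), define
# small, well-specified pieces and prompt for each body separately.

def parse_invoice_line(line: str) -> dict:
    """Parse 'id,customer,amount' into a dict. One small prompt."""
    invoice_id, customer, amount = line.split(",")
    return {"id": invoice_id, "customer": customer, "amount": float(amount)}

def validate_invoice(invoice: dict) -> bool:
    """Reject non-positive amounts. Another single, reviewable chunk."""
    return invoice["amount"] > 0

def import_invoices(lines: list[str]) -> list[dict]:
    """Human-owned composition: the engineer holds the big picture."""
    return [inv for inv in map(parse_invoice_line, lines) if validate_invoice(inv)]

print(import_invoices(["42,Acme,99.50", "43,Globex,-5"]))
```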

Copyright considerations

Big players are beginning to offer enterprise products with protections against data leakage and the like, which is “fantastic”, yet relatively little attention has so far been paid to copyright and other IP risks as they pertain to code, says Fowler.

Look at what happened when Oracle sued Google over its use of the Java API. Organisations might want to examine similarities and precedents to head off potentially nasty surprises in future.

“There’ll be precedents around what’s OK in terms of how much of it has been tweaked and changed enough to be able to say that it’s not exactly the same as something else – but we don’t know yet,” he points out.

With the generic, broad uses of GenAI, data can easily come from something on Google or Stack Overflow, and somewhere amid all that, someone else’s IP can be replicated via the algorithm. Organisations building an LLM-based tool into their offering may need guardrails on that.

“All of that being said, I’m not convinced it’s a large risk that will deter most organisations,” says Fowler.
