Grab taps GPT-4o to improve mapping service
Regional ride-hailing giant and super app Grab is leveraging OpenAI’s GPT-4o model to build hyperlocal, dynamic maps, cutting costs and boosting accuracy
Regional ride-hailing giant and super app Grab has tapped OpenAI’s GPT-4o model to improve its GrabMaps service, addressing the mapping challenges presented by Southeast Asia’s complex road networks.
GrabMaps relies on a vast network of motorbike riders and pedestrian partners equipped with 360-degree cameras to collect millions of street-level images. These images are used to train and fine-tune models that localise traffic signs, count lane dividers and refine road geometries.
The hyperlocal approach addresses the limitations of conventional mapping providers, which often struggle with the region’s rapidly changing urban landscapes and intricate road systems optimised for motorbikes and pedestrians.
Grab’s initial experiments with GPT-4o focused on matching speed limit signs to their corresponding roads. Its teams fine-tuned the model on just 100 sample cases that combined street-level imagery with map tiles, adjusting hyperparameters between rounds to improve accuracy.
Starting from a baseline of 67%, Grab raised the model’s accuracy to 80% after two rounds of fine-tuning. The model excelled at complex scenarios such as elevated roads and occlusions, which previously required manual intervention. By cross-referencing street imagery with map tiles, it made context-aware decisions akin to those of human operators.
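Grab has not published its pipeline, but the kind of workflow described – fine-tuning GPT-4o on a small set of image-grounded examples – can be sketched with OpenAI’s vision fine-tuning API. The file name, image URLs, prompt wording, label format and hyperparameter values below are illustrative assumptions, not Grab’s actual setup.

```python
# Minimal sketch of fine-tuning GPT-4o on image-grounded mapping examples.
# Only the OpenAI fine-tuning API calls are real; the data is illustrative.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each JSONL record pairs street-level imagery with the expected answer,
# e.g. which road segment a speed limit sign belongs to.
example = {
    "messages": [
        {"role": "system",
         "content": "Match the speed limit sign in the image to a road ID on the map tile."},
        {"role": "user", "content": [
            {"type": "text", "text": "Candidates: road_121, road_122 (elevated)."},
            {"type": "image_url", "image_url": {"url": "https://example.com/street_view_0001.jpg"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/map_tile_0001.png"}},
        ]},
        {"role": "assistant", "content": "road_122"},
    ]
}

with open("training_data.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")  # ~100 such cases in Grab's experiments

# Upload the dataset and start a fine-tuning job; the number of epochs is
# the kind of hyperparameter that would be adjusted between rounds.
train_file = client.files.create(file=open("training_data.jsonl", "rb"),
                                 purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    model="gpt-4o-2024-08-06",        # snapshot that supports vision fine-tuning
    training_file=train_file.id,
    hyperparameters={"n_epochs": 3},  # illustrative value
)
print(job.id, job.status)
```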
The results have been impressive. Grab managed to reduce manual mapping efforts while improving lane count accuracy by 20% and speed limit sign localisation by 13%. These improvements translate to greater trust in data quality, reduced operational costs and enhanced navigation.
“To meet the needs of the region, we had to build something hyperlocal and dynamic – mapping Southeast Asia as it evolves,” said Adrian Margin, head of data science for geo-mapping at Grab, adding that fine-tuning GPT-4o was essential to handle complex geometries and reduce manual interventions.
Beyond mapping, Grab is expanding its artificial intelligence (AI) capabilities to make its platform more accessible and responsive. A voice assistant for visually impaired and elderly users is being developed, along with an advanced support chatbot that can handle complex enquiries and improve the user experience.
Grab’s use of GPT-4o comes on the heels of plans by OpenAI to expand in Southeast Asia by localising its AI models for the region and opening a regional hub in Singapore later this year. The company elaborated on its plans during a recent press briefing, highlighting the region’s high adoption of ChatGPT and its thriving tech ecosystem as key drivers for the strategic move.
“Singapore has the highest per capita usage of ChatGPT globally, with about one in four people using it weekly,” said Jake Wilczynski, OpenAI’s head of communications in Asia-Pacific, pointing to the city-state’s technological leadership and AI-forward environment. “Choosing Singapore as our regional hub was a no-brainer.”
Read more about AI in APAC
- Manulife has been on a billion-dollar digital transformation journey over the past few years, leveraging the power of AI to streamline its operations and enhance customer experiences.
- Vietnam’s Techcombank is targeting $1bn in profit in 2024 while maintaining a flat headcount and reducing physical branches, thanks to its investments in AI and a data engine.
- At the Gartner IT Symposium 2024 in Gold Coast, experts call for CIOs to pace their adoption of AI, manage costs and embrace ‘augmented leadership’.
- Nvidia is deepening its presence in Japan and Indonesia through partnerships with local cloud providers and tech companies to build sovereign AI infrastructure and local large language models.
At the briefing, Olivier Godement, OpenAI’s platform product lead, showcased advancements in the company’s AI capabilities, including GPT-4o, its flagship multimodal model, and the recently launched o1 model, which uses “chain-of-thought” reasoning for complex problem-solving. The model’s ability to “think before responding” – analysing multiple hypotheses before delivering an answer – marks a significant leap in AI reasoning capabilities and paves the way for agentic AI.
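For illustration – this is not code shown at the briefing – a call to an o1 model looks like a standard chat completion. At launch, the o1 preview models accepted a single user message and handled the intermediate reasoning internally; the routing question below is a made-up example.

```python
# Illustrative call to an o1 reasoning model via the Chat Completions API.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        # o1 preview models took a single user message at launch; the model
        # works through intermediate reasoning steps before answering.
        {"role": "user",
         "content": "A courier can take road A (12 km at 40 km/h) or road B "
                    "(8 km at 25 km/h, plus a 5-minute wait). Which is "
                    "faster, and by how many minutes?"}
    ],
)
print(response.choices[0].message.content)
```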
Romain Huet, head of developer experience at OpenAI, highlighted the company’s advancements in speech-to-speech technology with a real-time application programming interface that allows developers to integrate natural, human-like conversations – including those in Singlish, a colloquial form of Singapore English – into their applications.
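A minimal sketch of that real-time API is shown below, assuming the beta WebSocket protocol available at launch; the Singlish prompt is illustrative, and a production voice app would stream microphone audio rather than send text instructions.

```python
# Minimal sketch of connecting to the Realtime API over WebSocket.
# Event names follow the beta protocol at launch; the prompt is made up.
import asyncio, json, os
import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    # On websockets versions before 13, pass extra_headers= instead.
    async with websockets.connect(URL, additional_headers=HEADERS) as ws:
        # Request a reply; a real voice app would instead stream microphone
        # audio to the session using input_audio_buffer.append events.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["text"],
                "instructions": "Reply in casual Singlish: where got good "
                                "chicken rice near Tiong Bahru?",
            },
        }))
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "response.text.delta":
                print(event["delta"], end="", flush=True)
            elif event["type"] == "response.done":
                break

asyncio.run(main())
```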
On the context window race among AI competitors, Godement acknowledged the importance of expanding the amount of information models can process, while maintaining accuracy and managing computational costs. He said techniques such as retrieval-augmented generation (RAG) will continue to play a role alongside the larger context windows OpenAI is working towards.
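The trade-off Godement described is easy to see in a minimal RAG sketch: rather than loading an entire corpus into a large context window, the application embeds documents once, retrieves only the most relevant one at query time and sends just that as context. The two-document corpus and model choices below are placeholders.

```python
# Minimal retrieval-augmented generation sketch: embed documents, retrieve
# the best match for a query, and pass only that context to the model.
import numpy as np
from openai import OpenAI

client = OpenAI()
docs = [
    "GrabMaps refreshes road geometry from rider-collected imagery.",
    "Speed limit signs are matched to road segments using map tiles.",
]  # placeholder corpus

def embed(texts):
    out = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in out.data])

doc_vecs = embed(docs)
query = "How are speed limit signs localised?"
q_vec = embed([query])[0]

# Cosine similarity ranks documents; only the top hit enters the prompt,
# instead of stuffing the whole corpus into a large context window.
scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
context = docs[int(np.argmax(scores))]

answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": f"Context: {context}\n\nQuestion: {query}"}],
)
print(answer.choices[0].message.content)
```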
Looking ahead, Huet outlined four key focus areas for OpenAI to realise its vision of AI that is more personalised and can act on behalf of users, beyond answering questions: continued advances in reasoning with the o1 model series; expanding multi-modal capabilities – spanning text, images, video and voice – in GPT-4o; enabling agent-based workflows built on improved reasoning; and significantly driving down the cost of AI technology.
“We’ve seen that every time there’s a price drop, developers will build more applications and features that they were not able to build before,” said Huet, adding that the price per token has fallen by 99% since OpenAI released its first DaVinci models.
“Even though o1, our most capable model today, is slightly more expensive than GPT-4o, it’s still way cheaper than GPT-4 when it first came out,” he said. “We will continue to drive down the cost of everything we build as we want developers to build more with AI and scale what they’ve built into production.”