
AI leaders urged to integrate local data models for diversity's sake

Incorporating the SEA-LION large language model, for instance, will help ensure GenAI-generated responses more accurately reflect the Southeast Asian population.
Written by Eileen Yu, Senior Contributing Editor

Tech giants currently pushing out generative artificial intelligence (GenAI) tools are urged to incorporate regional and local data models to ensure their products better reflect a diverse global population. 

Integrating the Southeast Asian Languages in One Network (SEA-LION) large language model (LLM), for instance, will help these GenAI tools generate more accurate responses, according to Laurence Liew, director of AI innovation at AI Singapore. 

He recounted a test in which his team put a question about a recent Asian election to both SEA-LION and a more popular global public GenAI platform, asking each to predict the outcome. SEA-LION generated the more accurate result, he said.

The current iteration of SEA-LION is available as two base models: a 3-billion-parameter model and a 7-billion-parameter model. Its training data comprises 981 billion language tokens, which AI Singapore defines as fragments of words created by breaking down text during the tokenization process. The total includes 623 billion English tokens, 128 billion Southeast Asian language tokens, and 91 billion Chinese tokens.
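To put those token counts in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the figures quoted above; the variable and function names are made up for this example, and the article does not say what makes up the tokens outside the three listed categories.

```python
# Token counts in billions, exactly as quoted in the article.
TOTAL_TOKENS_B = 981
quoted_mix_b = {
    "English": 623,
    "Southeast Asian languages": 128,
    "Chinese": 91,
}


def token_shares(mix_b, total_b):
    """Return each category's share of the total in percent, rounded to one decimal."""
    return {name: round(100 * count / total_b, 1) for name, count in mix_b.items()}


shares = token_shares(quoted_mix_b, TOTAL_TOKENS_B)

# Tokens not covered by the three quoted categories (the article gives no breakdown).
unlisted_b = TOTAL_TOKENS_B - sum(quoted_mix_b.values())

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"Not broken down in the article: {unlisted_b} billion tokens")
```

On these figures, English accounts for roughly 63.5% of the training tokens, Southeast Asian languages for about 13%, and Chinese for about 9.3%, leaving 139 billion tokens that the quoted breakdown does not cover.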

Liew said that most public GenAI tools today are not Asia-focused and, hence, may have inherent data bias. LLMs such as SEA-LION are more "culturally sensitive", which he said will ensure GenAI-generated responses better reflect the region's societal mix.

Asian countries such as Thailand and India also have developed their own LLMs.

Noting that SEA-LION is open source, Liew added that AI Singapore hopes the likes of Microsoft and Google can incorporate such regional and local LLMs for organizations that operate in this region.  

Launched in May 2017, AI Singapore is a government-wide collaboration that gathers all Singapore-based research institutions, startups, and companies that develop AI products. The national program aims to drive the country's AI capabilities and is supported by several government agencies, including the Smart Nation and Digital Government Office and Economic Development Board.

Liew further highlighted growing interest among businesses in adopting GenAI products. He pointed to the multiple discussions his team has had with Microsoft, which had noted a flood of requests from enterprise customers for "thousands of seats" to roll out Copilot.

Not every organization, though, feels it has the IT infrastructure to support its GenAI adoption. 

Just 30% of respondents feel their company has the IT assets essential to deploy GenAI, according to a study commissioned by Telstra International. Conducted by MIT Technology Review Insights between November and December 2023, the study polled 300 C-level executives and business leaders, with 66% from Asia-Pacific markets -- including Australia, Singapore, South Korea, Japan, and India -- and 17% each from Europe and the Americas.

Some 67% of early GenAI adopters felt their hardware was "modestly conducive" for rapid adoption. About half said likewise for their storage infrastructure. 

While 76% said they had worked with GenAI in some way last year, only 9% said their organization had adopted it widely. Most cited automation of low-value tasks as the key reason for doing so, while others pointed to customer service, strategic analysis, and product innovation.

Some 60% acknowledged that GenAI would substantially disrupt their industry over the next five years, with 78% treating the technology as a competitive opportunity. Only 8% viewed it as a threat. Some 65% said their organization was actively looking at new and innovative ways to use GenAI to extract value from data. 

Asked about barriers to adoption, 77% pointed to regulatory, compliance, and data privacy considerations, while 56% highlighted budget as a constraint. 

Respondents also noted a shortage of the necessary skills, with machine learning engineers, AI data scientists, and translators among the top talent needed.

"As the world becomes increasingly digitized and human-to-machine interactions flourish, being able to process data to drive informed real-time or near real-time business decisions is paramount," said Geraldine Kor, Telstra International's South Asia managing director and head of global enterprise. 

"However, building end-to-end capabilities to handle large datasets, accurately contextualize the data for business value, and ensure the responsible and ethical application of AI is extremely challenging," said Kor, who discussed the survey findings at a media briefing Monday that included Liew as a guest. 

In addition, organizations continue to operate in a global climate that is fraught with geopolitical issues and an uncertain economic landscape. This will affect their decisions on how to allocate their IT spend and whether it should include GenAI, Kor said. She recommends that companies start by identifying one specific function where GenAI can be applied.

"Singapore, like most countries, is still in the early stages of adopting GenAI, with the technology only recently becoming available in productivity suites suitable for a wider audience," Liew said. "The requirements for effective implementation of GenAI include access to real datasets, AI engineers, and computer infrastructure."

Companies also face a dilemma in acquiring the necessary hardware, he noted. "Choices include outright purchase and pay-as-you-go outsourcing, both of which carry their own risks. Additionally, data quality, storage, and talent remain bottlenecks for effective deployment," he said. 

He added that AI Singapore is looking to plug some of these gaps with initiatives such as the AI Apprenticeship Programme and LLM Application Developer Programme. 

Singapore in January also published a handbook to help local companies, including small and midsize businesses, navigate their adoption of GenAI and acquire the necessary skill sets to support such initiatives. 
