Baidu’s Ernie Bot 4.0 and start-up Zhipu AI’s GLM-4 rank top among Chinese large language models (LLMs), but their foreign rivals still lead in overall capabilities, according to a new test by Tsinghua University in Beijing.
The SuperBench assessment report examined 14 representative LLMs – the technology underpinning generative artificial intelligence (AI) chatbots – and found that overseas models, such as OpenAI’s GPT-4 and Anthropic’s Claude-3, came out on top in multiple capabilities, including semantic comprehension, coding abilities and alignment with human commands.
Researchers found “obvious gaps” between domestic and first-class foreign models in code-writing and in the ability to operate in real-world environments.
The report aims to “provide objective and scientific evaluation criteria” to examine a growing number of LLMs that have emerged recently, according to a WeChat post published by Tsinghua’s Basic Model Research Centre, which conducted the assessment with the state-backed Zhongguancun Laboratory.
Chinese tech giants and start-ups have been racing to improve their LLMs since OpenAI, a US start-up backed by Microsoft, launched a series of innovative tools powered by generative AI, including ChatGPT and text-to-video service Sora.
According to government figures, around 200 LLMs have been introduced in China, where OpenAI’s services are officially unavailable.
Established in 2019, Zhipu AI has raised 2.5 billion yuan (US$347 million) since last year, according to its founder, from backers that include state-affiliated investors, venture capitalists and Big Tech companies such as Alibaba, Tencent Holdings and Meituan.
Moonshot AI, also based in Beijing, raised US$1 billion in a funding round in February, according to multiple Chinese media reports.