Google's Gemini continues the dangerous obfuscation of AI technology

The company's lack of disclosure, while not surprising, is made more striking by one very large omission: model cards. Here's why.
Written by Tiernan Ray, Senior Contributing Writer
Google Gemini website on laptop reads, welcome to the Gemini era
Maria Diaz/ZDNET

Until this year, it was possible to learn a lot about artificial intelligence technology simply by reading research documentation published by Google and other AI leaders with each new program they released. Open disclosure was the norm for the AI world. 

All that changed in March of this year, when OpenAI elected to announce its latest program, GPT-4, with almost no technical detail. The research paper provided by the company obscured just about every important detail of GPT-4 that would allow researchers to understand its structure and to attempt to replicate its effects. 

Also: AI in 2023: A year of breakthroughs that left no human thing unchanged

Last week, Google continued that new approach of obfuscation, announcing the formal release of its newest generative AI program, Gemini, developed in conjunction with its DeepMind unit, which was first unveiled in May. The Google and DeepMind researchers offered a blog post devoid of technical specifications, and an accompanying technical report almost completely devoid of any relevant technical details. 

Much of the blog post and the technical report cite a raft of benchmark scores, with Google boasting of beating out OpenAI's GPT-4 on most measures, and beating Google's former top neural network, PaLM. 

Neither the blog nor the technical paper include key details that have been customary in years past, such as how many neural net "parameters," or, "weights," the program has, a key aspect of its design and function. Instead, Google refers to three versions of Gemini, with three different sizes, "Ultra," "Pro," and "Nano." The paper does disclose that Nano is trained with two different weight counts, 1.8 billion and 3.25 billion, while failing to disclose the weights of the other two sizes. 

Also: These 5 major tech advances of 2023 were the biggest game-changers

Numerous other technical details are absent, just as with the GPT-4 technical paper from OpenAI. In the absence of technical details, online debate has focused on whether the boasting of benchmarks means anything. 

OpenAI researcher Rowan Zellers wrote on X (formerly Twitter) that Gemini is "super impressive," and added, "I also don't have a good sense on how much to trust the dozen or so text benchmarks that all the LLM papers report on these days." 

😂 in a more serious note though --
the gemini model is super impressive (can't wait to play with the new multimodality aspects!)
I also don't have a good sense on how much to trust the dozen or so text benchmarks that all the LLM papers report on these days 😀

— Rowan Zellers (@rown) December 7, 2023

Tech news site TechCrunch's Kyle Wiggers reports anecdotes of poor performance by Google's Bard search engine, enhanced by Gemini. He cites posts on X by people asking Bard questions such as movie trivia or vocabulary suggestions and reporting the failures. 

Also: I asked DALL-E 3 to create a portrait of every US state, and the results were gloriously strange

Mud-slinging is a common phenomenon in the introduction of a new technology or product. In the past, however, technical details allowed outsiders to make a more-informed assessment of capabilities by assessing the technical differences between the latest program and the program's immediate predecessors, such as PaLM. 

For lack of such information, assessments are being made in hit-or-miss fashion, by people randomly typing things to Bard. 

The sudden swing to secrecy by Google and OpenAI is becoming a major ethical issue for the tech industry because no one knows, outside the vendors -- OpenAI and its partner Microsoft, or, in this case, Google's Google Cloud unit -- what is going on in the black box in their computing clouds. 

In October, scholars Emanuele La Malfa at the University of Oxford and collaborators at The Alan Turing Institute and the University of Leeds, warned that the obscurity of GPT-4 and other models "causes a significant problem" for AI for society, namely that, "the most potent and risky models are also the most difficult to analyze." 

Google's lack of disclosure, while not surprising given its commercial battle with OpenAI, and partner Microsoft, for market share, is made more striking by one very large omission: model cards. 

Also: Two breakthroughs made 2023 tech's most innovative year in over a decade

Model cards are a form of standard disclosure used in AI to report on the details of neural networks, including potential harms of the program (hate speech, etc.) While the GPT-4 report from OpenAI omitted most details, it at least made a nod to model cards with a "GPT-4 System Card" section in the paper, which it said was inspired by model cards.

Google doesn't even go that far, omitting anything resembling model cards. The omission is particularly strange given that model cards were invented at Google by a team that included Margaret Mitchell, formerly co-lead of Ethical AI at Google, and former co-lead Timnit Gebru. 

Instead of model cards, the report offers a brief, rather bizarre passage about the deployment of the program with vague language about having model cards at some point:

Following the completion of reviews, model cards ?? [emphasis Google's] for each approved Gemini model are created for structured and consistent internal documentation of critical performance and responsibility metrics as well as to inform appropriate external communication of these metrics over time.

If Google puts question marks next to model cards in its own technical disclosure, one has to wonder what the future of oversight and safety is for neural networks.

Editorial standards