Large Language Models like ChatGPT say The Darnedest Things

Publish date: 2023-01-16 Monday
Last updated: 2023-01-16 Monday

Tags:

Gary Marcus
Ernest Davis

On November 30, 2022, OpenAI released ChatGPT, a chatbot powered by the large language model GPT-3. Although it was not particularly more intelligent or powerful than earlier versions of GPT-3, it caught the attention of the public and the press to an unprecedented degree.

Needless to say — or, rather, this should be needless to say, but this obvious fact can get buried beneath the ongoing avalanche of hype — ChatGPT made all the same kinds of mistakes that its predecessors did.

We recently decided to put together a community-facing corpus of ChatGPT errors (and to include other large language models as well). We enlisted a few friends (Jim Hendler, William Hsu, Evelina Leivada, Vered Shwartz, and Michael Witbrock) and the technical help of Michael Ma, and put together this site. Importantly, we have structured it so that anyone can look at the data at any time.

There is an interface for adding examples and a separate interface for viewing the collection in database format. When adding an example, you must provide the model that generated it, a brief description of the error, and either a screenshot or the text of the example. For verification purposes, you must also supply an email address, which is not published. A categorization of error type, a link to a relevant external site (e.g. a posting on social media), and additional comments are optional.
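The required and optional fields described above can be pictured as a small record type. The following is only a sketch in Python: the class name, field names, and validation rules are hypothetical and are not taken from the actual form or spreadsheet.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ErrorSubmission:
    """Hypothetical record for one submitted LLM error (names are assumptions)."""

    # Required fields, per the description of the submission form.
    model: str
    description: str
    email: str  # for verification only; not published

    # Evidence: at least one of these two must be supplied.
    screenshot_path: Optional[str] = None
    example_text: Optional[str] = None

    # Optional fields.
    error_category: Optional[str] = None
    external_link: Optional[str] = None
    comments: Optional[str] = None

    def validation_errors(self) -> List[str]:
        """Return a list of problems; an empty list means the record is valid."""
        problems = []
        for name in ("model", "description", "email"):
            if not getattr(self, name).strip():
                problems.append(f"{name} is required")
        if not (self.screenshot_path or self.example_text):
            problems.append("either a screenshot or the example text is required")
        return problems

    def public_view(self) -> dict:
        """Return the record as a dict without the unpublished email address."""
        record = asdict(self)
        del record["email"]
        return record
```

A record with the three required fields and either a screenshot or example text passes validation, and `public_view()` drops the email address, mirroring the statement that it is not published.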

interface for adding examples: https://researchrabbit.typeform.com/llmerrors?typeform-source=garymarcus.substack.com
looking at results: https://docs.google.com/spreadsheets/d/1kDSERnROv5FgHbVN8z_bXH9gak2IXRtoqz0nwhrviCw/edit#gid=1302320625

related: AI Incident Database

what do I think about it

This is great, because the LLMs are closed and little information about them is publicly available.

type of link:

blogpost