The biggest problem with GPT-4 is philosophical: What is truth? And do we trust AI to tell us?
- Microsoft Research published a paper showing off GPT-4, a new language model produced by OpenAI.
- GPT-4’s capabilities are wide-ranging, from performing complex math problems to drawing stick figures to creating misinformation.
- The biggest question about its capabilities is philosophical rather than technical: Do we trust AI programmers to tell society what is true?
Microsoft’s report on the new GPT-4 AI program is fascinating. It’s a long read, containing many demonstrations of the new large language model (LLM) created by OpenAI, a company in which Microsoft is the largest investor. Microsoft rolled out the first GPT-4 implementation — Bing Search’s “Sydney” — in February, with chaotic, funny, and sometimes scary results. In this work, researchers tested an early version of the GPT-4 model.
The big takeaway is that Microsoft’s researchers claim that “GPT-4 attains a form of general intelligence” and shows “sparks of artificial general intelligence.” But does it?
My concern is twofold. First, the report doesn’t meet the criteria for a scientific study, perhaps because its researchers are employed by a company that partially owns the technology. Despite the technical tone, it reads more like an advertisement of capability. Second, the researchers don’t address some of the bigger-picture questions that society really wants answered. Is GPT-4 truly suited to take over human tasks? Should we even consider training and employing language models for such purposes? And most importantly, is GPT-4 telling us the truth? Let’s keep these questions in mind as we dig in.
GPT-4 is impressive
Right off the bat, it’s clear that GPT-4 is even more capable than GPT-3 and ChatGPT. GPT-4’s output tokens are often noticeably better than those of the other models when given the same input tokens. That’s a technical way of saying that GPT-4 can write better poetry, generate more lucid dialog between ancient Greek philosophers, give fewer wrong answers about stacking objects, and draw more detailed cartoons of animals.
Does this mean that the new LLM possesses common sense and can “see”? The researchers appear to believe so, though these sorts of questions are nearly impossible to answer. GPT-4 is more capable than its predecessors at mimicking the outputs of a person who has common sense and can see. Its creations can be complex, realistic, and indeed creative — if we allow ourselves to anthropomorphize the model. It also seems to get more answers “right.” (More on this below.) Human judges usually chose the GPT-4 answer over the GPT-3 answer when asked, though they also described both as incorrect a quarter of the time.
GPT-4 can say weird things — about you
Data privacy did not appear to be a big concern for GPT-4’s programmers. If you have ever posted anything online, including personal information, like the fact that you’re feeling depressed, GPT-4 probably knows. OpenAI has already fed giant amounts of this material into GPT, without permission.
The researchers also test whether the LLM can identify personally identifiable information (PII) — things like your location, bank account and credit card numbers, medical records, criminal history, financial records, and passwords. GPT-4 often recognizes PII but fails about one quarter of the time, meaning the model misses such information in roughly 25% of the cases it encounters. And a model that is trained on your private data and communications might blather them to anyone who asks nicely enough. Perhaps even more troublingly, the model can also tell a bizarre, detailed, and convincing hallucinated lie about you.
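To make the flagging task concrete: here is a minimal, purely illustrative sketch of PII detection using regex heuristics. This is not how GPT-4 or the researchers do it (the paper evaluates the model itself, not rule-based code); the pattern names and thresholds here are my own assumptions.

```python
import re

# Toy heuristics for two common PII categories. A real system, or an LLM,
# would need to handle far more types (addresses, medical records, etc.).
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    # 13-16 digits, optionally separated by spaces or dashes
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def flag_pii(text: str) -> list[str]:
    """Return the names of PII categories detected in `text`."""
    return [name for name, pattern in PII_PATTERNS.items()
            if pattern.search(text)]
```

A detector like this, or a language model doing the same job, is only useful to the extent that it catches PII before the text is ingested or repeated; a 25% miss rate, as reported for GPT-4, leaves a lot of private data exposed.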
One of the more fascinating but frightening features is GPT-4’s ability to produce misinformation. GPT-4 is great at writing propaganda and manipulative statements. When prompted, it convincingly argues that vaccines cause autism and devises a way to bully a child into doing what his friends are pressuring him to do. But if GPT-4 can be manipulated into saying wrong or nasty things, can it also be manipulated into saying things that are right and honorable?
Do all questions have “right” answers?
GPT-4 undergoes a battery of tests to gauge its ability to discriminate between categories of things, distinguish correct and incorrect answers, and produce “true” answers. The researchers claim that GPT-4 produces the desired answers to modestly tricky questions somewhere between 30% and 70% of the time, which is a little better than GPT-3. But there’s a catch.
The hardest question ever asked is: What is truth? GPT-4’s programmers think they have the answer. That’s what disturbs me the most. It turns out that the biggest problem of all is philosophical rather than technical, and I get the impression that the researchers have put little thought into this. The researchers’ perspective appears to be that all questions do have “right” answers, hence the model just needs to be trained and properly queried to say the right thing.
But this entirely misses the point that in human society “right” answers and “socially acceptable” answers often aren’t the same thing. What is factually true might also be offensive. Other “correct” answers are dictated by faith, intuition, belief, dogma, political power, and social trends. How should GPT-4 handle these situations? Whose version of truth is the actual truth?
In an example that is illustrative of the times we live in, the researchers asked GPT-4 to create sentences about hypothetical people and then grilled it on how often it chose different pronouns to describe them. For example, when asked to write a sentence in Portuguese (which uses gendered grammar) about an imaginary doctor, the researchers wanted to know whether it would use the masculine médico or the feminine médica. Now apply that logic to all the other hyper-political issues we face today, and you can see how GPT-4 might become politically biased. And indeed, that’s exactly what people have found using ChatGPT.
The biggest lingering question in my mind about GPT-4 isn’t if it’s going to destroy jobs or take over the world. Instead, it is this: Do we trust AI programmers — and the tech companies they work for — to tell society what is true?