Big tech fails transparency test: Gary Marcus on what we should demand of AI
- A team of researchers from MIT, Stanford, and Princeton rated the transparency of 10 AI companies; each one received a failing grade.
- AI expert Gary Marcus argues we need more transparency surrounding AI if we are to understand these systems and their effect on society.
- He recommends that citizens demand transparency around several key factors, including the environmental impact of these systems, the data they are trained on, and how their algorithms work.
Transparency — being clear about what you’ve done and what the impact is. It sounds wonky, but it matters enormously. Companies like Microsoft often pay lip service to “transparency,” but provide precious little actual transparency into how their systems work, how they are trained, or how they are tested internally, let alone what trouble they may have caused.
We need to know what goes into [AI] systems, so we can understand their biases (political and social), their reliance on purloined works, and how to mitigate their many risks. We need to know how they are tested, so we can know whether they are safe.
Companies don’t really want to share, though that doesn’t stop them from pretending otherwise.
For example, in May 2023, Microsoft’s president Brad Smith announced a new “5 point plan for governing AI,” allegedly “promoting transparency”; the CEO immediately amplified his remarks, saying, “We are taking a comprehensive approach to ensure we always build, deploy, and use AI in a safe, secure, and transparent way.”
But as I write this, you can’t find out what Microsoft’s major systems were trained on. You can’t find out how much they relied on copyrighted materials. You can’t find out what kind of biases might follow from their choice of materials. And you can’t find out enough about what they were trained on to do good science (e.g., in order to figure out how well the models are reasoning versus whether they simply regurgitate what they are trained on). You also can’t find out whether they have caused harm in the real world. Have large language models been used, for example, to make job decisions, and have they done so in a biased way? We just don’t know.
In an interview with Joanna Stern of The Wall Street Journal, OpenAI’s CTO Mira Murati wouldn’t even give the most basic answers about what data had been used in training their system Sora, claiming, improbably, to have no idea.
Not long ago, in a briefing on AI that I gave at the UN, I highlighted this gap between words and action. Since then, a team with members from Stanford University, MIT, and Princeton, led by computer scientists Rishi Bommasani and Percy Liang, created a careful and thorough index of transparency, looking at ten companies across 100 factors, ranging from the nature of the data that was used to the origins of the labor involved to what had been done to mitigate risks.
Every single AI company received a failing grade. Meta had the highest score (54 percent), but even it failed on factors such as transparency of its data, labor, usage policy, and feedback mechanisms. Not a single company was truly transparent about what data it used, not even Microsoft (despite its lip service to transparency) or OpenAI, despite its name.
The report’s conclusions were scathing:
“The status quo is characterized by a widespread lack of transparency across developers. … Transparency is a broadly-necessary condition for other more substantive societal progress, and without improvement opaque foundation models are likely to contribute to harm. Foundation models are being developed, deployed, and adopted at a frenetic pace: for this technology to advance the public interest, real change must be made to rectify the fundamental lack of transparency in the ecosystem.”
Worse, as the Stanford/Princeton/MIT team put it, “While the societal impact of these models is rising, transparency is on the decline.”
While I was sketching [my book], a nonprofit reassuringly called the Data & Trust Alliance — sponsored by 20-plus big tech companies — managed to get coverage in a New York Times article titled “Big Companies Find a Way to Identify A.I. Data They Can Trust.” When I checked out the alliance’s webpage, it had all the right buzzwords (like “[data] provenance” and “privacy and protection”), but the details were, at best, geared toward protecting companies, not consumers.
With something like GPT-4, it would tell you almost nothing you actually wanted to know, for example, about copyrighted sources, likely sources of bias, or other issues. It would be like saying for a Boeing 787: “source of parts: various, U.S. and abroad; engineering: Boeing and multiple subcontractors.” True, but so vague as to be almost useless. To actually be protected, we would need much more detail.
What should we, as citizens, demand?
Data transparency. At the bare minimum, we should have a manifest of the data that systems are trained on; it should be easy for any interested person to see what copyrighted materials have been used. It should also be easy for any researcher to investigate likely sources of bias or to figure out how well the models are reasoning versus simply regurgitating what they were trained on. In essence, as several have argued, we need “nutrition labels for data” that explain where datasets come from, what appropriate use cases might exist, what limitations there might be, and other factors.
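To make the idea concrete, here is a minimal sketch of what a machine-readable “nutrition label” for a training dataset might contain. The schema and field names are hypothetical illustrations of the kind of disclosure being proposed, not an existing standard or any company’s actual documentation:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical "nutrition label" for a training dataset.
# The fields below are illustrative only -- not an existing standard.
@dataclass
class DatasetLabel:
    name: str                     # dataset identifier
    sources: List[str]            # where the data came from (crawls, archives, licensed corpora)
    collection_period: str        # when the data was gathered
    licenses: List[str]           # licensing / copyright status of the material
    copyrighted_content: str      # what copyrighted works are included, if known
    intended_uses: List[str]      # appropriate use cases
    known_limitations: List[str]  # gaps, skews, and likely sources of bias
    labor_notes: str              # who labeled or filtered the data, and under what terms

# Example of what a filled-in label might look like (entirely made up).
example = DatasetLabel(
    name="example-web-corpus-v1",
    sources=["public web crawl", "licensed news archive"],
    collection_period="2021-2023",
    licenses=["mixed; see per-source manifest"],
    copyrighted_content="includes licensed news articles; web-crawled text of unknown status",
    intended_uses=["research on language-model pretraining"],
    known_limitations=["English-heavy", "underrepresents non-Western sources"],
    labor_notes="filtering and labeling by contracted annotators",
)
```

Even a short, standardized label like this would let researchers and regulators ask the questions that are currently unanswerable: what copyrighted material went in, where the likely biases come from, and who did the underlying labor.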
Algorithmic transparency. When a driverless car has an accident, or a consumer’s loan application is denied, we should be able to ask what went wrong. The big trouble with the black box algorithms that are currently in vogue is that [nobody] knows exactly why an LLM or generative model produces what it does. Guidelines like the White House’s Blueprint for an AI Bill of Rights, UNESCO’s Recommendation on the Ethics of Artificial Intelligence, and the Center for AI and Digital Policy’s Universal Guidelines for AI all decry this lack of interpretability. The EU AI Act represents real progress in this regard, but so far in the United States, there is little legal requirement for algorithms to be disclosed or interpretable (except in narrow domains such as credit decisions).
To their credit, Senator Ron Wyden (D-OR), Senator Cory Booker (D-NJ), and Representative Yvette Clarke (D-NY) introduced an Algorithmic Accountability Act in February 2022 (itself an update of an earlier proposal from 2019), but it has not become law. If we took interpretability seriously — as we should — we would wait to deploy these systems until better technology was available. In the real world, in the United States, the quest for profits is basically shoving aside consumer needs and human rights.
Source transparency. In the coming years, there’s going to be a huge amount of propaganda, including deep-fake videos that are increasingly convincing, and loads of scams, such as the voice-cloning scams […]. Unfortunately, few people are trained to recognize machine-generated content, and there is no automated way to do so with certainty. Worse, by using simple tricks like personal pronouns and emojis, AI can fool a lot of people a lot of the time. Increasingly, we will see what the late philosopher Dan Dennett called “counterfeit people.” Similarly, journalist Devin Coldewey proposed that “software be prohibited from engaging in pseudanthropy, the impersonation of humans,” and I concur. In this new era, everyone needs to be on their guard. But governments need to help, insisting that AI-generated content be labeled as such, as Michael Atleson at the Federal Trade Commission (FTC) has encouraged; in his straightforward words, “People should know if they’re communicating with a real person or a machine.” (As he notes, we should also be told what is an ad, and what is not: “any generative AI output should distinguish clearly between what is organic and what is paid.”)
Environmental and labor transparency. Every large generative AI system (say, the size of GPT-4, Claude, or Gemini) should report on its environmental impact, including its use of water, energy, and other resources, as well as its carbon emissions. Chip manufacturers like Nvidia should also be more forthcoming about their impact across the full life cycle of their products. We should likewise demand transparency around labor practices for the data workers who do the data labeling and provide human feedback.
Corporate transparency. We also need transparency regarding what the companies know about the risks of their own systems. In the famous Ford Pinto saga, Ford knew its cars’ rear gas tanks might sometimes explode, but didn’t share what it knew with the public. As tech analyst (and publisher) Tim O’Reilly has pointed out, tech companies should be required to be forthcoming about the risks they know about and about the internal work they have done around those risks, “an ongoing process by which the creators of AI models fully, regularly, and consistently disclose the metrics that they themselves use to manage and improve their services and to prohibit misuse.” We also need every corporation to contribute to a public database of known incidents, and perhaps a government-sponsored global AI observatory to track these things. (The AI Incident Database is a good start.) As Marietje Schaake has sharply observed, without corporate transparency, no regulatory framework can really work.
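As a rough illustration of the kind of record a shared public incident database could standardize on, here is a hypothetical sketch; the field names are my own invention and do not reflect the actual schema of the AI Incident Database:

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical entry for a public AI incident database.
# Illustrative only -- not the AI Incident Database's actual schema.
@dataclass
class IncidentRecord:
    incident_id: str       # stable identifier for the incident
    date_reported: str     # ISO date, e.g., "2024-03-01"
    system: str            # which model or product was involved
    developer: str         # company responsible for the system
    description: str       # what happened, in plain language
    harms: List[str]       # e.g., ["biased hiring recommendation"]
    affected_parties: str  # who was harmed
    developer_response: Optional[str] = None  # disclosed mitigation, if any
```

The point is less the particular fields than the obligation: companies would have to file such records regularly, and the public would be able to read them.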
Writing good transparency bills takes hard work. As Archon Fung and his co-authors put it in Full Disclosure, “to be successful, transparency policies must be accurate, keep ahead of disclosers’ efforts to find loopholes, and, above all, focus on the needs of ordinary citizens” — and it is work that absolutely must be done.
The good news is there is some motion here. In December 2023, Representatives Anna Eshoo (D-CA) and Don Beyer (D-VA) introduced an important bill on transparency; in February 2024, Senator Ed Markey (D-MA) and Senator Martin Heinrich (D-NM), working with Representatives Eshoo and Beyer, introduced a bill for environmental transparency. I hope these bills make their way into law.