Digitizing Old Books Using Human Computation and reCAPTCHA, with Luis von Ahn

Luis von Ahn, CEO of Duolingo and one of the inventors of CAPTCHA, explains how reCAPTCHA harnesses the abilities of both humans and computers in order to accomplish tasks such as the digitization of old books.

Luis von Ahn: So human computation, the idea, is that there are problems that computers cannot yet solve. It's funny because some of these problems are seemingly very simple. For example, a computer cannot tell you what's inside an image. It can tell you some things, but it can't really quite tell you there's a cat next to a dog and they're both running. A computer can't do that. Well, humans, we can do it super easily. And there are many things that computers cannot do that humans can. Conversely, there are also things that computers can do that humans can't do. I mean, computers can multiply humongous numbers; humans may be able to do it, but very slowly, and we're error-prone. And so the idea with human computation is to combine both humans and computers together on a very large scale to solve problems that neither can solve alone.

My project that has been used by the most people is a project called reCAPTCHA, where the idea is that we take a problem that neither humans nor computers can solve by themselves, which is fully digitizing books. The idea there is we would like to digitize books. And the way this process works is you start with a book and then you scan it. The next step in the process is that the computer needs to be able to decipher all of the words in this picture. It's a picture of words. The problem is that sometimes the computer cannot decipher these words because for older books the ink has faded a little or the pages have turned yellow, so the computer cannot decipher all of the words. But humans can.

So what we're doing with reCAPTCHA: if you've ever seen these distorted letters that you have to type all over the Internet, for example, when you buy tickets on Ticketmaster or whenever you get a Facebook account or something, you have to type these distorted characters. That thing is called a CAPTCHA, and I was one of the people who helped invent it. And the reason it's there, there's a primary purpose, which is to make sure that you're a human and not a computer. And it's because humans can read these squiggly characters but computers can't. This is a security mechanism and it has been there for a while, but at some point I realized its second use, which is helping to digitize books. The idea is that nowadays some of these words are actually coming from books, words that the computer could not recognize in this process, and we're using what people enter to help us digitize the books. So that's the idea.

And so this is a project where about 1.1 billion people in the world have helped us digitize at least one word out of a book. So here we're taking a very large number of humans to do precisely the step that computers cannot do in the book digitization process. This is a company that was bought by Google. By now Google is digitizing the equivalent of about 2 million books a year, with basically humans typing every now and then some of the words through CAPTCHAs all over the Internet. So that's the idea of human computation.
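
The process von Ahn describes, where words the computer could not read are shown to people and their typed answers are reused, can be sketched in a few lines of Python. This is a minimal illustration, not the actual reCAPTCHA implementation. It assumes each challenge pairs an unreadable word with a control word whose answer is already known, and that answers are accepted by simple majority vote. The talk does not spell out either detail, and every name and threshold here (`UnknownWord`, `submit`, `resolved`, `min_votes`, `min_share`) is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UnknownWord:
    """A scanned word that the OCR step could not decipher (hypothetical type)."""
    image_id: str
    votes: Counter = field(default_factory=Counter)


def normalize(text: str) -> str:
    """Compare answers case-insensitively, ignoring surrounding whitespace."""
    return text.strip().lower()


def submit(control_answer: str, typed_control: str,
           typed_unknown: str, word: UnknownWord) -> bool:
    """Check the human test and, if it passes, record a vote for the unknown word.

    Assumption: the control word has a known answer, so it serves as the
    security check; the unknown word is the one being digitized.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False  # failed the check, so the guess is discarded
    word.votes[normalize(typed_unknown)] += 1
    return True


def resolved(word: UnknownWord, min_votes: int = 3, min_share: float = 0.6):
    """Return the agreed transcription, or None if people do not yet agree."""
    total = sum(word.votes.values())
    if total < min_votes:
        return None
    text, count = word.votes.most_common(1)[0]
    return text if count / total >= min_share else None


if __name__ == "__main__":
    word = UnknownWord("scan-0001")
    submit("morning", "morning", "upon", word)
    submit("morning", "mornin", "xxxx", word)   # fails the control word
    submit("morning", "Morning", "upon", word)
    submit("morning", "morning", "upom", word)
    print(resolved(word))
```

The design choice worth noting is that a single person's answer is never trusted on its own: a guess counts only if the same person passed the known word, and a transcription is accepted only once enough independent people agree.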

Directed/Produced by Jonathan Fowler, Elizabeth Rodd, and Dillon Fitton

 

Back at the beginning of the century, Luis von Ahn helped invent CAPTCHA, the online security device featuring squiggly letters that you have to re-type in order to prove you're human. In 2007, von Ahn invented reCAPTCHA, a new form of CAPTCHA that serves a second purpose: the digitization of old books.


In this video clip, von Ahn describes how reCAPTCHA works while discussing the power of human computation, a term he helped coin that describes the harnessing of both human and computer abilities in order to accomplish difficult tasks.
