How the inventor of software CAPTCHA solved two wildly different problems with this one solution

I recently listened to an interview by Guy Raz with Luis von Ahn, who invented the software CAPTCHA as a PhD student and later founded Duolingo, a language learning app valued at over US$9 billion.

In 2000, Luis von Ahn was a 21-year-old computer science student when he learned, at a talk he attended, about one of Yahoo's biggest problems: automated bots were signing up for millions of free email accounts and generating huge amounts of spam.

Finding a solution became his PhD research topic. How can bots be stopped from opening new email accounts?

Luis boiled the problem down to answering this question:

What can a computer generate automatically that only a human can solve, so that solving it proves you are human?

Luis found that computers could create distorted images of text, but they could not read them.

This became the idea behind the software CAPTCHA: the squiggly letters and images that appear on a website and that we need to decipher to prove we are human and not a bot.

He wrote some code to prove his idea would work and sent it to Yahoo. Within a week, it was functional and live. Surprisingly, he gave the idea away for free.

After a few years of people giving him a hard time for wasting 10 seconds of their life whenever they needed to prove their human credentials to a computer, Luis pondered how to do something useful with the roughly 500,000 hours a day that people across the globe were collectively spending on CAPTCHAs.

He then became aware of another problem. It was proving very hard and costly to digitise an archive of over 100 million published books, because computers at the time found it difficult to read all the words on scanned pages.

Roughly 30% of the words on every scanned page were unreadable to a computer because they were blurry.

This new problem, and a chance meeting with someone from the New York Times, led to the development of reCAPTCHA.  

As well as typing a known word to prove their human credentials, users were given a second word to decipher: one that a computer couldn’t read, taken from a scanned New York Times archive article.

If 10 people typed the same word for the same image, it was assumed to be correct. This became an early form of crowdsourcing.
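To make the mechanism concrete, here is a minimal sketch in Python of how the two-word check and the 10-person agreement rule could fit together. It is an illustration only, not reCAPTCHA's actual code: the names `UnknownWord`, `submit` and `AGREEMENT_THRESHOLD` are made up for this example, and ignoring case and surrounding spaces when comparing answers is my own assumption.

```python
from collections import Counter

# From the story above: 10 matching answers are treated as correct.
AGREEMENT_THRESHOLD = 10


def normalise(answer: str) -> str:
    """Assumption: compare answers ignoring case and surrounding spaces."""
    return answer.strip().lower()


class UnknownWord:
    """A scanned word the computer could not read (hypothetical helper)."""

    def __init__(self, image_id: str) -> None:
        self.image_id = image_id
        self.votes = Counter()  # normalised answer -> number of people who typed it

    def record(self, answer: str) -> None:
        self.votes[normalise(answer)] += 1

    def accepted_reading(self):
        """Return the agreed word once enough people typed it, else None."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= AGREEMENT_THRESHOLD else None


def submit(control_word: str, control_answer: str,
           unknown: UnknownWord, unknown_answer: str) -> bool:
    """Check the known word; only count the unknown-word answer if it passes."""
    if normalise(control_answer) != normalise(control_word):
        return False  # failed the human check, so the second answer is ignored
    unknown.record(unknown_answer)
    return True
```

The known word does the job of the original CAPTCHA, and the unknown word rides along for free. Only answers from people who pass the first check are counted, so a bot typing nonsense never gets a vote on what the blurry word says.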

Because Facebook was using reCAPTCHA software at the time, it took only a week to digitise an entire year of New York Times archive articles. Luis later sold reCAPTCHA to Google.

It was a truly elegant way of solving two problems with one solution.

Luis then went on to use his crowdsourcing experience to create Duolingo, the most popular language learning app.
