
CAPTCHA: A story of old books, traffic lights and self-driving cars




Before we talk more about CAPTCHA, we need to talk quickly about the Turing Test. Long before Artificial Intelligence was the cool thing to work on, Alan Turing, the great British mathematician, devised a test to tell a computer apart from a human. As part of the Turing Test, a human interrogator asks the same question to another human and a computer. The interrogator does not know which one is the human and which is the computer. Based on the responses, the interrogator has to guess which response is from the human. If the computer is able to fool the interrogator into believing that it is the human for more than 50% of the responses, then the machine is considered to have achieved AI. So, in a world where we have not achieved AI that can match human intelligence, a machine or computer would fail at a task that humans can do easily. CAPTCHA uses this philosophy at its core.



CAPTCHA is a software test that validates whether the subject executing it is a human being or a machine. A pass corresponds to successful authentication as a human being. A fail is an authentication failure and prevents the machine from exploiting a protected resource such as account sign-ups or server time.

In the early 2000s, when I was a college student, I remember websites being filled with spam comments of all kinds. These were the days when blogging was the cool thing to do and Orkut was the social media of choice. There was no Facebook yet. Rogue programs could be written to spam blogs with comments, perform numerous sign-ups and, in the worst case, engage in unsophisticated Denial of Service attacks that brought down entire websites. A website, or any front-end interface, is meant for human use. But when machines can mimic human actions, it can be exploited for nefarious purposes. Using a free automation tool like Selenium WebDriver, one can easily build a script in 5 minutes to drive actions on a web client.


CAPTCHA prevented spam attacks in a novel way. Present a squiggly image of a word or a phrase to the subject and challenge them to type it out. The task is simple for a human but was near impossible for a machine at the time. That's sublime and brilliant! CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. Luis von Ahn and his team coined this term in 2000 while at Carnegie Mellon University. Von Ahn turned out to be a pioneer of crowdsourcing software and went on to found the language learning platform Duolingo.
The CAPTCHA technology was then used to digitize thousands of old books in a very smart way. A standard CAPTCHA challenge was accompanied by a picture of a word or phrase from an old book and presented to users. This way, thousands of old books were digitized by Google. And we helped do that unknowingly. That's a great use of the hundreds of thousands of hours of human effort that would otherwise have been wasted in solving CAPTCHA challenges. This technology is called reCAPTCHA and it is owned by Google.
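
The pairing idea above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Google's actual implementation: the class name `WordDigitizer`, the `votes_needed` threshold and the "accept a word once enough users agree" rule are all hypothetical. The known word acts as the real test, and answers for the scanned word are only counted when the known word was typed correctly.

```python
from collections import Counter, defaultdict


class WordDigitizer:
    """Toy model: one known control word plus one unknown scanned word."""

    def __init__(self, votes_needed=3):
        # Hypothetical threshold: matching answers needed to accept a word.
        self.votes_needed = votes_needed
        # Maps an unknown word's id to a tally of the transcriptions typed for it.
        self.votes = defaultdict(Counter)
        # Unknown words that have reached agreement, id -> accepted text.
        self.accepted = {}

    def submit(self, control_answer, typed_control, unknown_id, typed_unknown):
        """Return True if the user passed the control word (treated as human)."""
        # The control word is the actual test; a wrong answer fails the challenge
        # and the guess for the unknown word is discarded.
        if typed_control.strip().lower() != control_answer.lower():
            return False

        guess = typed_unknown.strip().lower()
        if guess and unknown_id not in self.accepted:
            self.votes[unknown_id][guess] += 1
            # Accept the transcription once enough users typed the same thing.
            if self.votes[unknown_id][guess] >= self.votes_needed:
                self.accepted[unknown_id] = guess
        return True


digitizer = WordDigitizer(votes_needed=2)
digitizer.submit("river", "river", "scan-17", "upon")
digitizer.submit("river", "River", "scan-17", "Upon")
print(digitizer.accepted)
```

With two agreeing answers required, the second submission is enough for the scanned word `scan-17` (a made-up identifier) to be recorded as digitized.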

Traffic lights and self-driving cars
The latest CAPTCHA challenges that we see these days ask us to identify traffic lights and cars in an urban image. Now why always traffic lights and cars? 

Turns out Waymo, the self-driving car company owned by Google's parent Alphabet, is training its AI models to better identify traffic lights and cars on the road. And like digitizing books in the past, we are helping Google make self-driving cars a reality in the future.

No CAPTCHA reCAPTCHA
Another type of CAPTCHA that Google has been using is the No CAPTCHA reCAPTCHA. Google knows our online behavior and, in the score-based version (reCAPTCHA v3), uses it to generate a score between 0.0 and 1.0, where a score near 1.0 suggests a human and a score near 0.0 suggests a bot. As a developer, depending on the score returned, you can decide whether to present a challenge or not.
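
Here is a rough sketch of what that developer-side decision could look like. It assumes, as in reCAPTCHA v3, that scores near 1.0 indicate a likely human and scores near 0.0 a likely bot. The function name `decide_action`, the 0.5 threshold and the three outcomes are my own illustrative choices, and the input is a plain dictionary modeled on a verification result with `success` and `score` fields, so no network call is made here.

```python
def decide_action(verify_result, threshold=0.5):
    """Return 'allow', 'challenge' or 'block' for a verification result.

    verify_result is a dict with a boolean 'success' and a float 'score'
    between 0.0 and 1.0 (assumed: higher means more likely human).
    threshold is a hypothetical cut-off that each site would tune itself.
    """
    # A failed verification (bad or expired token) is rejected outright.
    if not verify_result.get("success", False):
        return "block"

    score = float(verify_result.get("score", 0.0))
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")

    # Likely human: let the request through without friction.
    if score >= threshold:
        return "allow"
    # Suspicious: fall back to an explicit challenge instead of blocking.
    return "challenge"


print(decide_action({"success": True, "score": 0.9}))
print(decide_action({"success": True, "score": 0.2}))
```

Falling back to a challenge rather than blocking low scores outright gives a real person with unusual browsing behavior a way to still get through.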

The "I am not a robot" checkbox is a variation of the No CAPTCHA as well, in which Google asks you to click on a checkbox as a challenge. Google uses attributes of your mouse movement to determine if you are a machine or a human.

Breaking CAPTCHA
There are shops with humans sitting in front of computers to perform fake sign-ups, likes, reviews etc. It can be a challenge for any technology to catch these, because it is not computers but real humans doing the interacting. Artificial intelligence and image recognition have also seen amazing advancements recently, which have made some of the traditional CAPTCHAs ineffective or vulnerable. The challenges presented will keep changing as modern machines become more and more efficient at imitating humans.
