How does Google reCAPTCHA v2 work behind the scenes?

Question

This post refers to Google reCAPTCHA v2 (not the latest version).

Recently Google introduced a simplified "captcha" verification system (video) that enables users to pass the "captcha" just by clicking on it.

But how can it differentiate a bot from a person just by a click?

According to this answer (and assuming a similar implementation), reCAPTCHA first generates a key and attaches it to a hidden input element. It also lazily renders a checkbox (a div, not an actual checkbox input) tied to the same key. When the checkbox is clicked, it sends an asynchronous request (XHR) to Google's backend servers to mark the key as valid, that is, as a key to be validated when the form is submitted.
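To make the flow described above concrete, here is a minimal sketch of the server side under the same assumption (a key issued at render time, marked on click, checked on submit). The class and method names (`ChallengeStore`, `issue_key`, `mark_clicked`, `redeem`) are invented for illustration and the single-use behaviour is my own assumption; nothing here is Google's actual code.

```python
import secrets


class ChallengeStore:
    """Toy server-side store for widget keys (hypothetical, not Google's code)."""

    def __init__(self):
        # key -> True once the click request has marked it as verified
        self._verified = {}

    def issue_key(self):
        # Generated when the widget is rendered; the page would place it
        # in a hidden input and tie it to the checkbox element.
        key = secrets.token_urlsafe(16)
        self._verified[key] = False
        return key

    def mark_clicked(self, key):
        # Stands in for the asynchronous request sent when the box is clicked.
        if key not in self._verified:
            return False
        self._verified[key] = True
        return True

    def redeem(self, key):
        # Called when the form is submitted. Keys are treated as single-use
        # (an assumption), so a replayed key fails the second time.
        return self._verified.pop(key, False)


store = ChallengeStore()
key = store.issue_key()
clicked_ok = store.mark_clicked(key)
first_submit = store.redeem(key)    # key was clicked, so it is accepted once
second_submit = store.redeem(key)   # replay of the same key is refused
unclicked_submit = store.redeem(store.issue_key())  # never clicked
```

Note that this sketch alone does nothing to stop a browser-based bot from triggering the click, which is exactly the gap the question is asking about.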

But why can't bots automate that click (at least, browser-based bots)?

How might this work?

Probably similar to the way they were sending simple captchas to humans, and hard captchas to bots — mukunda, Commented Dec 4, 2014 at 4:24
The way I understood it is - there still is a captcha, but unless you make suspicious requests - you will never have to solve it. — Kelm, Commented Dec 4, 2014 at 4:25
@Louie What was "stolen"? Someone asked the same question, and linked the same (and only) post—but it's worded a bit differently, and none of the answers are the same. What's more, the original Quora question was posted on December 3, a day before this question. What seems to be the problem? — wchargin, Commented Feb 11, 2015 at 5:42
@CiroSantilli六四事件法轮功 what the heck? the link is gone — TechLife, Commented Apr 10, 2015 at 9:44
@TechLife true! Seems to have moved to github.com/neuroradiology/InsideReCaptcha ? Reminder to self: always fork stuff. — Ciro Santilli OurBigBook.com, Commented Apr 10, 2015 at 10:13

AgmLauncher · Accepted Answer · 2014-12-04 16:50:38Z

210

This is speculation, but based on Google's reference to the "risk analysis engine" they use (http://googleonlinesecurity.blogspot.com/2014/12/are-you-robot-introducing-no-captcha.html), I would assume it looks at how you behaved prior to clicking, how your cursor moved on its way to the checkbox (organic path/acceleration), which part of the checkbox was clicked (random places, or dead on center every time), browser fingerprint, Google cookies and their contents, click location history tied to your fingerprint or account if it detects one, etc.

It's fairly difficult to fake "organic" behavior in such a way that it would fool a continuously learning pattern detection engine. In the cases where it's not sure, it still prompts you to match an actual CAPTCHA string.
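As a rough illustration of the kind of signals speculated about above, the sketch below computes a few simple features from a cursor path and a click position. The feature names (`straightness`, `speed_variance`) and the helper functions are my own assumptions, and a real risk engine would presumably combine far more inputs; this only shows why a perfectly straight, constant-speed path ending dead on center stands out.

```python
import math


def path_features(points):
    """Summarise a cursor path given as [(x, y, t_seconds), ...]."""
    if len(points) < 2:
        raise ValueError("need at least two samples")
    step_lengths = []
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        distance = math.hypot(x1 - x0, y1 - y0)
        step_lengths.append(distance)
        dt = t1 - t0
        if dt > 0:
            speeds.append(distance / dt)
    travelled = sum(step_lengths)
    direct = math.hypot(points[-1][0] - points[0][0],
                        points[-1][1] - points[0][1])
    # 1.0 means a perfectly straight line; wandering paths score lower.
    straightness = direct / travelled if travelled > 0 else 1.0
    if speeds:
        mean_speed = sum(speeds) / len(speeds)
        speed_variance = sum((s - mean_speed) ** 2 for s in speeds) / len(speeds)
    else:
        speed_variance = 0.0
    return {"straightness": straightness, "speed_variance": speed_variance}


def click_offset(click, box_center):
    """Distance in pixels between the click and the centre of the checkbox."""
    return math.hypot(click[0] - box_center[0], click[1] - box_center[1])


# A scripted path: straight line at constant speed, click dead on centre.
scripted = path_features([(0, 0, 0.0), (50, 50, 0.5), (100, 100, 1.0)])
# A more organic path: curved, with uneven speed, click slightly off centre.
organic = path_features([(0, 0, 0.0), (30, 60, 0.5), (100, 100, 1.0)])
scripted_offset = click_offset((100, 100), (100, 100))
organic_offset = click_offset((103, 98), (100, 100))
```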

answered Dec 4, 2014 at 16:50

AgmLauncher


77

That seems right and should explain why I always have to type a string on my PSVita with the sticks. It doesn't move like a normal mouse.
– Domino
Commented Mar 25, 2015 at 0:10
3

I'm wondering how Google would react to a sufficiently huge amount of recorded organic behaviour.
– Markus Malkusch
Commented Apr 17, 2015 at 15:21
16

Mouse movement definitely does not contribute to this. Place the cursor right on the spot where the checkbox would appear. Navigate to the site without moving your cursor. Click the checkbox and it will pass.
– Derek 朕會功夫
Commented Jun 12, 2015 at 2:36
3

@Derek, I don't think that is proof of anything. Cookies, IP and many other factors might contribute to letting you pass before they fall back to mouse movement. I don't feel like testing it, but if you were to fire a fresh computer from a fresh IP and not use the mouse at all, I'm willing to bet it would fail.
– Caimen
Commented Oct 1, 2015 at 20:57
16

Note that you can also tab over to it and press space.
– JSideris
Commented Dec 22, 2015 at 12:37


barbolo · Accepted Answer · 2018-06-08 13:22:37Z

A new paper has been released with several tests against reCAPTCHA:

https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf

Some highlights:

By keeping a cookie active for more than 9 days (by browsing sites with Google resources), you can then pass reCAPTCHA just by clicking the checkbox;
There are no restrictions based on requests per IP;
The browser's user agent must be real, and Google runs tests against your environment to ensure it matches the user agent;
Google tests whether the browser can render a Canvas;
Screen resolution and mouse events don't affect the results;
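The user-agent point in the list above can be pictured as a consistency check between what the browser claims to be and what its environment actually exposes. The sketch below is only a guess at the shape of such a check: the feature names (`canvas`, `moz_prefixed_api`) and the expectations table are hypothetical and not taken from the paper.

```python
# Hypothetical table: features each claimed browser family should expose.
EXPECTED_FEATURES = {
    "firefox": {"canvas": True, "moz_prefixed_api": True},
    "msie": {"canvas": True, "moz_prefixed_api": False},
}


def claimed_family(user_agent):
    """Very rough classification of a User-Agent string."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "msie " in ua or "trident/" in ua:
        return "msie"
    return None


def environment_matches(user_agent, observed):
    """True if the observed features agree with the claimed browser."""
    family = claimed_family(user_agent)
    if family is None:
        # Unknown or non-browser agents are treated as a mismatch here.
        return False
    expected = EXPECTED_FEATURES[family]
    return all(observed.get(name) == value for name, value in expected.items())


FIREFOX_UA = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0"
honest = environment_matches(FIREFOX_UA, {"canvas": True, "moz_prefixed_api": True})
spoofed = environment_matches(FIREFOX_UA, {"canvas": False, "moz_prefixed_api": False})
```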

Google has already fixed the cookie vulnerability and is probably restricting some behaviors based on IPs.

Another interesting finding is that Google runs a VM in JavaScript that obfuscates much of reCAPTCHA code and behavior. This VM is known as botguard and is used to protect other services besides reCAPTCHA:

https://github.com/neuroradiology/InsideReCaptcha

UPDATE 2017

A recent paper (from August) was published at WOOT 2017, achieving 85% accuracy in solving noCAPTCHA reCAPTCHA audio challenges:

http://uncaptcha.cs.umd.edu/papers/uncaptcha_woot17.pdf

UPDATE 2018

Google is introducing reCAPTCHA v3, which looks like a "human score prediction engine" that is calibrated per website. It can be installed into different pages of a website (working like a Google Analytics script) to help reCAPTCHA and the website owner to understand the behaviour of humans vs. bots before filling a reCAPTCHA.

https://www.google.com/recaptcha/intro/v3beta.html
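For a sense of how a site might consume such a score, here is a small sketch. It assumes the verification response has already been fetched and parsed into a dict with `success`, `score` (0.0 to 1.0) and `action` fields, which is how the released v3 API reports results; the function name, the 0.5 threshold and the three outcome labels are my own choices for illustration.

```python
def classify_v3_result(result, expected_action, threshold=0.5):
    """Turn a parsed v3 verification response into a site-side decision."""
    if not result.get("success"):
        # The token was invalid, expired, or already used.
        return "reject"
    if result.get("action") != expected_action:
        # A token issued for another page or action should not be accepted here.
        return "reject"
    score = result.get("score", 0.0)
    # Higher scores mean "more likely human"; low scores get extra friction.
    return "allow" if score >= threshold else "challenge"


likely_human = classify_v3_result(
    {"success": True, "score": 0.9, "action": "login"}, "login")
likely_bot = classify_v3_result(
    {"success": True, "score": 0.2, "action": "login"}, "login")
bad_token = classify_v3_result({"success": False}, "login")
```

Because the score is calibrated per website, the threshold is something each site owner would tune rather than a fixed constant.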

"Mouse events don't affect the results": that's interesting, as I (and I believe many others) had thought that was the main thing that affected results. I thought that on mobile, users were asked to select all images that are alike instead of ticking the checkbox, because there are no mouse movements on a touchscreen. However, looking at the introductory blog post again, it appears that might not be the case. Perhaps selecting images replaces typing distorted text, not checking a box. Do you (or anyone) know whether reCAPTCHA ever allows simply checking a box on mobile? – Nateowami, Commented Dec 26, 2016 at 16:00
Mouse events do affect the results. If you press Tab and Enter to select the checkbox, it will show the image captcha for you to select images based on a criterion. – mbomb007, Commented Sep 11, 2017 at 14:04
@mbomb007 Mouse events might affect the results, but pressing Tab and Enter will not show the image captcha all the time. Most of the time pressing Tab and Enter is accepted. – Manish Ojha, Commented Mar 6, 2018 at 7:09

Ingo · Accepted Answer · 2016-05-13 22:27:31Z

28

My bots are running well against reCAPTCHA.

Here is my solution.

Have your bot follow these steps:

First, write a human mouse-move function that moves the mouse along a B-spline (ask me for the source code). This is the most important point.

For better results, also use a VPN such as https://www.purevpn.com

For every reCAPTCHA, do these steps:

1. If you use a VPN, switch IP first.
2. Clear all browser cookies.
3. Clear the browser cache.
4. Set one of these user agents at random:

a. Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)

b. Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0

5. Move the mouse with the human mouse-move function from a random point into the "I'm not a robot" image, each time to a different point within a 10x10 random range.
6. Then click, always with a random delay between WM_LBUTTONDOWN and WM_LBUTTONUP.
7. Take a screenshot of the image captcha.
8. Send the screenshot to http://www.deathbycaptcha.com or https://2captcha.com and let them solve it.
9. After receiving the click coordinates from the captcha solver, use the human mouse-move function to move to and click the reCAPTCHA images.
10. Use the human mouse-move function to move to and click the reCAPTCHA Verify button.

In 75% of all tries, the reCAPTCHA will be solved.

Cheers, Google

Tom

edited May 13, 2016 at 22:27

answered May 13, 2016 at 22:21

Ingo


2

Why do you need the "Human Mouse Move Function"? It looks unnecessary to accomplish your goal.
– barbolo
Commented May 20, 2016 at 13:29
11

The 'human mouse move' function is the most important point. Inside the captcha, Google detects mouse speed, mouse path, mouse button down and up events, click positions, the mouse's entry point into the captcha, and so on, and sends this information via JavaScript to a Google database with many millions of real human mouse-movement traces. After interpreting all the captured information, the captcha is marked as solved only if the Google algorithm says it was a human.
– Ingo
Commented May 21, 2016 at 9:32
4

@ barbolo: Please Check this official Google Blog security.googleblog.com/2014/12/… -> Google says "To counter this, last year we developed an Advanced Risk Analysis backend for reCAPTCHA that actively considers a user’s entire engagement with the CAPTCHA—before, during, and after—to determine whether that user is a human. "
– Ingo
Commented May 25, 2016 at 0:10
10

Step 8 uses an external API where humans solve the captcha for you. The bot is not solving anything.
– Andrea Lazzarotto
Commented Jul 21, 2017 at 23:47
19

Is it just me, or is it both disturbing and fascinating to anyone else that bot writers are using Stack Overflow to help solve (and debate about!) reCAPTCHAs?
– Ogre Psalm33
Commented Nov 22, 2017 at 20:16


4ae1e1 · Accepted Answer · 2015-11-16 00:41:06Z

3

May I present my guess, since this is not an open technology.

Google says it's about combining information from before, during, and after the click to distinguish humans from robots. But I am more interested in that final click on the checkbox.

Say the POST data (the solved CAPTCHA) has a field called fingerprint, a string calculated from user behavior. I think there may also be a field for the checkbox location. My guess is that the checkbox sits in a coordinate system randomly generated by Google's back end and encrypted with my site's public key. A robot may "guess/calculate" a location for this box, but when the site owner makes the GET query with the private key to verify the user's identity, Google will decrypt the coordinate system and say whether the user clicked in the right place. So there is only one possible correct click location (with some offset, since it is a square box) in this random coordinate system, which is known only to Google and the site owner.

edited Nov 16, 2015 at 0:41

4ae1e1


answered Jan 3, 2015 at 4:26

hakunami


If the browser is good enough to actually show the box and detect clicks, then why would a hacking robot not be able to do the same? I could however set the position of the checkbox to a very precise position (in decimals) so if a click is detected with the same decimals, it means it's a robot who didn't bother adding random decimals to the click position. But again, that's not foolproof.
– Domino
Commented Mar 25, 2015 at 0:14
Google is supposedly using a 'learning' algorithm, so that if some clients with the same characteristics seem to take the same general path and the same general time to get there, and it happens 100,000 times a day, they're probably not legit.
– Allison
Commented Mar 26, 2015 at 3:51
1

It should be relatively easy to simulate clicking in the square area. Doesn't matter how google encrypts the data before sending.
– Eugene C
Commented Jul 14, 2015 at 18:39


Ingo · Accepted Answer · 2019-11-29 00:17:35Z

1

Please remember that Google also uses reCAPTCHA together with canvas fingerprinting to uniquely recognize users/browsers without cookies!
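The idea behind canvas fingerprinting can be sketched in a few lines: a page draws a fixed test image, reads back the pixels, and hashes them, relying on small rendering differences between machines. The snippet below simulates that with made-up pixel bytes, since the real readback happens in the browser; the function name and the sample data are illustrative assumptions only.

```python
import hashlib


def canvas_fingerprint(pixel_bytes):
    """Hash raw pixel data from a rendered test image into a short ID."""
    return hashlib.sha256(pixel_bytes).hexdigest()[:16]


# Two hypothetical devices draw the same test image, but a rendering
# difference (for example anti-aliasing) changes a single byte of output.
device_a_pixels = bytes([255, 255, 255, 255, 10, 20, 30, 255])
device_b_pixels = bytes([255, 255, 255, 255, 10, 21, 30, 255])

fp_a = canvas_fingerprint(device_a_pixels)
fp_a_again = canvas_fingerprint(device_a_pixels)  # stable across visits
fp_b = canvas_fingerprint(device_b_pixels)        # differs between devices
```

The same device yields the same ID on every visit with no cookie stored, which is what makes the technique usable for recognition.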

edited Nov 29, 2019 at 0:17

answered Oct 28, 2019 at 13:28

Ingo


