Captcha
Answered
Is there any way to automate web scraping when CAPTCHA codes are involved?
-
Hi Anca,
These types of tasks can be solved with OCR technology. Please refer to this article, which describes how to use the Foxtrot OCR action:
https://support.foxtrotalliance.com/hc/en-us/articles/115003456025-How-To-Use-The-Foxtrot-OCR-Action
The article also has references to more advanced solutions.
I must stress that this depends on whether the OCR engine can recognize the particular CAPTCHA reliably. We have automated processes that included CAPTCHA recognition, but in those cases the image quality was fairly high and the engine recognized every symbol.
-
You could OCR it with multiple engines and compare the results. I started working with this idea the other day for another purpose, but it seems like it would be a great way to handle CAPTCHA.
Foxtrot OCR it
Tesseract OCR it -- https://support.foxtrotalliance.com/hc/en-us/articles/360025120592
Windows 10 OCR it -- https://github.com/HumanEquivalentUnit/PowerShell-Misc/blob/master/Get-Win10OcrTextFromImage.ps1
Compare the three results. Maybe: if all three disagree, request a new CAPTCHA; if 2/3 agree, try that answer; and if the site limits failed attempts, only submit when you have 3/3 agreement. Or maybe 3/3 agreement is common enough that you can just keep requesting new CAPTCHAs until you get one.
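The comparison step could be sketched roughly like this in Python. This is only an illustration, not a Foxtrot feature: it assumes the three engines have already returned their text as plain strings, and the names `normalize`, `pick_captcha_answer`, and `min_agree` are made up for the example.

```python
from collections import Counter
from typing import Optional, Sequence


def normalize(text: str) -> str:
    """Remove all whitespace, which OCR engines often add around or inside a code."""
    return "".join(text.split())


def pick_captcha_answer(readings: Sequence[str], min_agree: int = 2) -> Optional[str]:
    """Return the reading shared by at least `min_agree` engines, else None.

    `readings` holds one string per OCR engine (hypothetical input; how each
    engine is called is outside this sketch). None means "request a new CAPTCHA".
    """
    # Drop engines that returned nothing usable.
    cleaned = [c for c in (normalize(r) for r in readings) if c]
    if not cleaned:
        return None
    # Most common reading and how many engines produced it.
    answer, votes = Counter(cleaned).most_common(1)[0]
    return answer if votes >= min_agree else None


if __name__ == "__main__":
    readings = ["X7kP2", "X7kP2 ", "X7kF2"]  # made-up engine outputs
    print(pick_captcha_answer(readings, min_agree=2))
    print(pick_captcha_answer(readings, min_agree=3))
```

With `min_agree=2` the function accepts a 2/3 majority; setting `min_agree=3` gives the stricter rule for sites with a fail limit. Case is left untouched because some CAPTCHAs are case-sensitive.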