reCAPTCHA: Human-Based Character Recognition via Web Security Measures
Science (09/12/08) Vol. 321, No. 5895, P. 1465; von Ahn, Luis; Maurer, Benjamin; McMillen, Colin
as it appeared in the September 24, 2008 edition of ACM TechNews.
The reCAPTCHA project employs CAPTCHAs to help digitize scanned typeset texts by having people decipher the words that computers cannot recognize, say Carnegie Mellon University's Luis von Ahn and colleagues. CAPTCHAs are distorted word puzzles that humans can solve but current computer programs cannot, and they are used to prevent automated programs from abusing online services. Von Ahn notes that reCAPTCHA "is used by more than 40,000 Web sites and demonstrates that old print material can be transcribed, word by word, by having people solve CAPTCHAs throughout the World Wide Web." ReCAPTCHA presents the user with two words: one for which the answer is unknown, and a second "control" word for which the answer is known. If the user types the control word correctly, the system assumes that the user is human and gains confidence that the other word was also typed correctly. ReCAPTCHA accounts for human error in the digitization process by sending every unrecognizable or suspicious word to multiple users, each time with a different random distortion. The authors have learned from a large-scale deployment of the reCAPTCHA system that deciphering words using CAPTCHAs can match the highest-quality guarantee provided by dedicated human transcription services. Von Ahn and colleagues conclude that reCAPTCHA clearly shows that "'wasted' human processing power can be harnessed to solve problems that computers cannot yet solve."
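The two-word scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names UnknownWord, submit, and resolved, the case-insensitive comparison, and the min_agreement threshold of 3 are all assumptions made for illustration, since the abstract says only that each unknown word is sent to multiple users.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UnknownWord:
    """A scanned word OCR could not read, plus the human guesses collected so far."""
    word_id: str
    guesses: Counter = field(default_factory=Counter)


def submit(unknown, control_answer, typed_control, typed_unknown):
    """Record one user's response; return True if the user passed the control word."""
    # Assumed normalization: ignore case and surrounding whitespace.
    if typed_control.strip().lower() != control_answer.strip().lower():
        return False  # failed the known word: treat as non-human, discard the guess
    unknown.guesses[typed_unknown.strip().lower()] += 1
    return True


def resolved(unknown, min_agreement=3):
    """Return the agreed transcription once enough users concur, else None.

    min_agreement is an illustrative threshold, not a value from the article.
    """
    if not unknown.guesses:
        return None
    text, votes = unknown.guesses.most_common(1)[0]
    return text if votes >= min_agreement else None
```

In this sketch, a guess for the unknown word counts only when the same user also typed the control word correctly, and the word is considered transcribed once enough independent users give the same answer.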