3. Definition
CAPTCHA stands for Completely Automated
Public Turing test to tell Computers and Humans
Apart
A program that can tell whether its user is a
human or a computer.
The challenge: develop a software program that
can create and grade challenges most humans can
pass but computers cannot
4. Background
First used by AltaVista in 1997
Reduced spam submissions to its add-URL feature by over 95%
CMU/Yahoo!
Automated the creating and grading of challenges
PARC
Relies on document image degradation to prevent
successful OCR
Conducted user-focused studies to assess the
effectiveness of CAPTCHAs
5. Background
CAPTCHAs are based on open AI problems
Breaking CAPTCHAs helps advance AI by
solving these open problems
Improving CAPTCHAs helps tell
computers and humans apart
Win-win situation
7. Types of CAPTCHAs
Text based
Gimpy, ez-gimpy
Gimpy-r, Google CAPTCHA
Simard’s HIP (MSN)
Graphic based
Bongo
PIX
Audio based
8. Text Based CAPTCHAs
Gimpy, ez-gimpy
Pick a word or words from a small dictionary
Distort them and add noise and background
Gimpy-r, Google’s CAPTCHA
Pick random letters
Distort them, add noise and background
Simard’s HIP
Pick random letters and numbers
Distort them and add arcs
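The "pick random letters, add noise, grade" steps above can be sketched roughly as follows. This is a minimal illustration, not any vendor's actual scheme: the function names, the 0/1 bitmap representation, and the pixel-flip noise model are all assumptions, and rendering the text into an image and warping it are left out.

```python
import random
import string

def make_text_challenge(length=6, rng=None):
    """Pick random letters and digits for the challenge text."""
    rng = rng or random.SystemRandom()
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))

def add_noise(bitmap, flip_prob=0.05, rng=None):
    """Flip each pixel of a 0/1 bitmap with probability flip_prob.

    Hypothetical noise model: a stand-in for the 'add noise' step only.
    """
    rng = rng or random.SystemRandom()
    return [
        [(1 - px) if rng.random() < flip_prob else px for px in row]
        for row in bitmap
    ]

def grade(expected, response):
    """Accept the response if it matches, ignoring case and outer whitespace."""
    return response.strip().upper() == expected.upper()
```

The challenge text must stay on the server; only the distorted image goes to the user, and `grade` compares the typed response against the stored text.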
10. Graphic Based CAPTCHAs
Bongo
Display two series of blocks
User must find the characteristic that sets the two
series apart
User is asked to determine which series each of four
single blocks belongs to
Difference? thick vs. thin lines
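A rough sketch of how a Bongo-style challenge could be generated and graded, using the thick vs. thin example. The block model (a dict holding only a line width), the width ranges, and the function names are assumptions made for illustration; a real Bongo puzzle draws visual patterns.

```python
import random

def make_bongo_challenge(rng=None, series_len=4, n_tests=4):
    """Build two series of blocks plus test blocks with known answers."""
    rng = rng or random.Random()

    def thick():
        return {"line_width": rng.randint(4, 6)}  # hypothetical "thick" range

    def thin():
        return {"line_width": rng.randint(1, 2)}  # hypothetical "thin" range

    left = [thick() for _ in range(series_len)]
    right = [thin() for _ in range(series_len)]
    tests = []
    for _ in range(n_tests):
        side = rng.choice(["left", "right"])
        tests.append((thick() if side == "left" else thin(), side))
    return left, right, tests

def grade_bongo(tests, answers):
    """Pass only if every test block is assigned to the correct series."""
    return len(answers) == len(tests) and all(
        answer == side for (_, side), answer in zip(tests, answers)
    )
```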
11. Graphic Based CAPTCHAs
PIX
Create a large database of labeled images
Pick a concrete object
Pick four images of the object from the images database
Distort the images
Ask the user to pick the object from a list of words
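The PIX steps above might look like this in outline. The tiny `IMAGE_DB`, its file names, and the function names are hypothetical placeholders, and the image distortion step is omitted.

```python
import random

# Hypothetical labeled image database: object label -> image file names.
IMAGE_DB = {
    label: [f"{label}_{i}.jpg" for i in range(6)]
    for label in ("dog", "chair", "tree", "car", "cup")
}

def make_pix_challenge(db, rng=None, n_images=4, n_choices=4):
    """Pick an object, four of its images, and a shuffled list of words."""
    rng = rng or random.Random()
    labels = sorted(db)
    answer = rng.choice(labels)
    images = rng.sample(db[answer], n_images)
    distractors = rng.sample([w for w in labels if w != answer], n_choices - 1)
    choices = distractors + [answer]
    rng.shuffle(choices)
    return {"images": images, "choices": choices, "answer": answer}

def grade_pix(challenge, picked):
    """The user passes by picking the word that labels all four images."""
    return picked == challenge["answer"]
```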
13. Audio Based CAPTCHAs
Pick a word or a sequence of numbers at random
Render them into an audio clip using text-to-speech (TTS) software
Distort the audio clip
Ask the user to identify and type the word or
numbers
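A minimal sketch of the audio flow above, assuming the caller supplies a `tts` function that turns text into a list of samples in [-1, 1]. That function, the uniform-noise distortion, and all names here are assumptions; no particular TTS library is implied.

```python
import random

def make_audio_challenge(tts, rng=None, n_digits=5, noise=0.1):
    """Pick random digits, render them with the supplied TTS, add noise."""
    rng = rng or random.SystemRandom()
    digits = "".join(rng.choice("0123456789") for _ in range(n_digits))
    samples = tts(digits)  # hypothetical callable: text -> list of floats
    # Distort: add uniform noise to each sample and clip to [-1, 1].
    noisy = [
        max(-1.0, min(1.0, s + rng.uniform(-noise, noise))) for s in samples
    ]
    return digits, noisy

def grade_audio(expected, response):
    """Compare the typed digits to the expected ones, ignoring spaces."""
    return "".join(response.split()) == expected
```

For a quick trial without real speech, a stub such as `lambda text: [0.5] * (10 * len(text))` can stand in for `tts`.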
14. Breaking CAPTCHAs
Most text based CAPTCHAs have been broken by
software
OCR
Segmentation
Other CAPTCHAs were broken by relaying the tests
to unsuspecting users to solve.
15. Properties
A CAPTCHA should be automatically generated
and graded
The test can be taken quickly and easily by human
users
The test will accept virtually all human users and
reject software agents
The test will resist automatic attack for many years
despite advances in technology and prior
knowledge of the algorithms
16. Final Thoughts
They are crucial to preventing bot attacks
Hopefully, they will become more user-friendly to
people with disabilities (visual, mental)
CAPTCHAs are mainly implemented with AJAX and
PHP
Various algorithms are present
Use of XML