Tuesday, October 16, 2012

hackyou CTF: PPC 300

This challenge was very similar to PPC 100 in that it was supposed to be an anti-human captcha: a timed challenge to factor a very large number and send one of the factors back to the website. This time, however, the number was presented as an image, which meant I needed to do some OCR. I used Sage to do the math and pytesser (which uses the Tesseract OCR program) to do the character recognition. I was having trouble getting tesseract to work in wine, so I compiled it from source (which takes FOREVER), and after that it was fairly successful! It occasionally turned an 8 into an S and a 0 into an O, but a short time later I had a Python script that was running successfully:
import os
from subprocess import *
from pytesser import *
import urllib2, urllib

# Helper that runs a shell command and returns its stdout
def run_cmd(cmd):
        p = Popen(cmd, shell=True, stdout=PIPE)
        output = p.communicate()[0]
        return output

# This gets the html of the captcha
response = urllib2.urlopen('http://misteryou.ru/ppc300/')
html = response.read()
# Get the url of the image so I can download it
pic = html.split('img src=\'')[1].split("'")[0]
# Get the trueanswer field which acts as a cookie
trueanswer = html.split("'trueanswer' value='")[1].split("'")[0]
# Download the image and save it to win.png
urllib.urlretrieve('http://misteryou.ru'+pic, "win.png")

print 'Converting image'
# Convert the png to a tif so that pytesser can work
# (save() returns None, so there is nothing useful to assign here)
Image.open('win.png').convert('RGB').save('win.tif')
# Get a handle for the tif
image = Image.open('win.tif')
# Convert the image to a string
num = image_to_string(image)
# Isolate the number we want from the rest of the image
num = num.split('\n')[0].split(' ')[1]
print num    # Print the original string
# Replace known image conversion issues with the correct number
num = num.replace('O','0').replace('S', '8').replace('-','')
print num    # Print the corrected string

# Do the math by asking sage to factor our problem
result = run_cmd('sage -c "print factor('+num+')"')
print result
# Get a factor to send to the server
result = result.split(' * ')[0]
print result
print 'Making web request'

# Set up the post variables
values = {
	'captchatype' : 'refactor',
	'trueanswer' : trueanswer,
	'answer' : result
}

header = {'User-Agent':'Mozilla/4.0'}
data = urllib.urlencode(values)
req = urllib2.Request('http://misteryou.ru/ppc300/', data, header)
response = urllib2.urlopen(req)
print response.read() # Read the result!
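The OCR clean-up step in the script above can be pulled out into a small helper. This is only a sketch: the name `clean_ocr_number` and the `OCR_FIXES` table are made up for illustration, and the digits-only check is an addition of mine, not part of the original script. Since the recognized text ends up inside a shell command for sage, refusing anything that is not purely digits seems like a sensible precaution.

```python
import re

# Substitutions for the misreads mentioned in the write-up (8 -> S, 0 -> O),
# plus stray hyphens. The table itself is a hypothetical helper structure.
OCR_FIXES = {'O': '0', 'S': '8', '-': ''}

def clean_ocr_number(raw):
    """Fix known OCR misreads in raw and return a digits-only string.

    Raises ValueError if anything other than 0-9 is left afterwards.
    """
    cleaned = raw.strip()
    for wrong, right in OCR_FIXES.items():
        cleaned = cleaned.replace(wrong, right)
    # Reject anything that is not purely decimal digits, so a bad read
    # never reaches the shell command that calls sage.
    if not re.match(r'^[0-9]+$', cleaned):
        raise ValueError('unexpected characters in OCR output: %r' % cleaned)
    return cleaned

print(clean_ocr_number('1O5S-37'))  # prints 105837
```

In the script, a call like this could stand in for the chained `replace` calls, with a retry (fetching a fresh captcha) whenever `ValueError` is raised.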
The result from the web server was:
Ok, u are robot
Secret is:
1101011
1101001
1101100
1101100
1011111
110001
1011111
1101000
1110101
1101101
1100001
1101110
which translates to kill_1_human as 7-bit ASCII.
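The decoding is easy to check by hand or with a few lines of Python. A minimal sketch, assuming each line of the secret is one character written in base 2 (the function name `decode_binary_lines` is just for illustration):

```python
# The lines printed by the server after "Secret is:", copied verbatim.
secret_bits = """1101011
1101001
1101100
1101100
1011111
110001
1011111
1101000
1110101
1101101
1100001
1101110"""

def decode_binary_lines(text):
    """Read each whitespace-separated token as a base-2 integer and
    map it to the character with that code."""
    return ''.join(chr(int(token, 2)) for token in text.split())

print(decode_binary_lines(secret_bits))  # prints kill_1_human
```

Note that the sixth line, 110001, has only six digits because the leading zero was dropped; `int(token, 2)` handles that without padding.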

-- suntzu_II
