Tuesday, February 5, 2013

HackIM CTF 2013: Reverse Engineering 300

RE 300 was extremely interesting - it was essentially a javascript vm for an invented language. The js file is available here for you to look at if you want, but I will explore the relevant sections of code here.

A couple of variables I will refer to are pc (program counter), opcode (the first digit of any piece of memory), addr (called mailbox in their program and is the other two digits of any piece of memory), acc (the accumulator), and input_counter (the counter to see which digit of input we are examining).

The memory at the beginning of the program is:
var code=[901,340,505,140,305,461,901,722,340,539,723,241,339,540,238,142,342,901,722,901,722,606,000,542,243,243,244,340,830,653,553,140,145,353,546,140,653,000,17,10,000,001,000,400,60,459,41,22,76,76,75,75,37,417,560,140,145,360,547,140,417,567,140,145,367,548,140,417,574,140,145,374,549,140,417,581,140,145,381,550,140,417,588,140,145,388,551,140,417,595,140,145,395,552,140,417,423];
Viewed more clearly as:
0123456789
0901340505140305461901722340539
1723241339540238142342901722901
2722606000542243243244340830653
35531401453535461406530001710
40000010004006045941227676
5757537417560140145360547140
6417567140145367548140417574140
And the rest doesn't really matter
Now, to abstract the exact code, I have come up with some basic assembly-type ideas to represent what the opcodes are doing:
1 = ADD ACC,[ADDR] (Add data at memory location ADDR to accumulator) 
2 = SUB ACC,[ADDR] (Subtract data at ADDR from accumulator)
3 = STACC ADDR (Store accumulator in ADDR)
5 = LDACC [ADDR] (Load accumulator with data from ADDR)
6 = JMP ADDR (Change the pc to ADDR)
7 = BRZ ADDR (Branch if zero to ADDR)
8 = BRGEZ ADDR (Branch if greater than or equal to zero to ADDR)
9 = INPUT (stored in accumulator) or OUTPUT (Dependent on ADDR)
else = DIE
Now we can start analyzing what the program does. The first important string of commands is 901, 340, 505, 140, 305, 461.
901: INPUT 
340: STACC 40 (Stores the input at memory 40)
505: LDACC [5] (Loads the accumulator with the contents of addr=5, 461)
140: ADD ACC, [40] (Address 40 becomes a temporary variable of sorts. Think of it as RAM.
     This command adds the contents of memory 40 to the accumulator, 461 + input)
305: STACC 5 (Stores 461+input in memory 5)

461+input: This command has now changed based on our input. Since there are no
opcodes that begin with 4, we know we must input a value that makes this be greater
than 500. However, looking ahead to the next command (a 901), we know that the value
of the accumulator will be ignored anyway. So any ASCII character will do as the
first character.
Now comes the interesting part. The next string of commands is 901, 722, 340, 539, 723, 241, 339, 540, 238, 142, 342, 901, 722, 901, 722, 606.
901: INPUT  
722: BRZ 22 (If the input is zero, branch to addr 22 which is a zero. This is essentially an exit() command)
340: STACC 40 (Store the input in address memory 40)

This next part is essentially a loop counter that gets a number,
checks if it is zero, and decrement. Think of it as for (int i = 10; i > 0; i--) type code. 539: LDACC [39] (Load accumulator with the contents of addr 39. This is a loop counter that counts down from 10)
723: BRZ 23 (If the value in accumulator is 0, branch to addr 23. This will happen after looping 10 times)
241: SUB ACC, [41] (Addr 41 has a value of 1, so this is essentially a decrement of the accumulator)
339: STACC 39 (Stores the decremented value back into memory)

This next part is summing the ASCII values of inputted characters
and storing the result in memory location 42. 540: LDACC [40] (Load accumulator with input we stored earlier)
238: SUB ACC, [38] (Subtract 17 every time)
142: ADD ACC, [42] (Add the previous sum into the accumulator)
342: STACC 42 (Store the sum into it's memory location)
901: INPUT (This input is ignored)
722: BRZ 22 (DIE)
901: INPUT (This input is ignored)
722: BRZ 22 (DIE)
606: JMP 6 (Jump back to the beginning of this section)
After analyzing this section of code, we know that the program sums the first of every set of three inputs (and subtracts 17) 10 times and stores the result somewhere. So the result is going to be sum = input * 10 - 170. With that information in mind, we look start ADDR=23 where we jump to once the loop counter reaches 0. The string of commands is 542, 243, 243, 244, 340, 830, 653, 553, 140, 145, 353, 546, 140, 653.
542: LDACC [42]  (This loads the sum into the accumulator) 
243: SUB ACC,[43] (Subtract 400 from the sum)
243: SUB ACC,[43] (Subtract 400 from the sum)
244: SUB ACC,[44] (Subtract 60 from the sum)
340: STACC 40 (Store the result from the subtractions in addr 40)
830: BRGEQZ 30 (Branch if the result is greater or equal to zero. THIS MUST HAPPEN.)
653: JMP 53 (This will die at this point. The instruction at 53 does not work)
553: LDACC [53] (Load the accumulator with 417)
140: ADD ACC,[40] (Add the result of the subtractions to the accumulator)
145: ADD ACC,[45] (Add 459 to the accumulator. We now have 876+result in the accumulator)
353: STACC 53 (Store 876+result in the addr 53)
546: LDACC [46] (Load accumulator with 41)
140: ADD ACC,[40] (Add the remainder to the accumulator)
653: JMP 53 (Jump to addr 53, which now contains 876+result)
So what just happened? The sum of every third character (minus 170) needs to be greater than 860, else this chunk of code will die. Also, 876+result is going to be a piece of code that we run, so what should we put there? Well, knowing that the code 902 outputs data (and no 902 exists in memory at the beginning of this program), I guessed that we need to put 902 in that spot. So the result needs to be 26 to force this piece of data to be 902. An ASCII value that will make this happen is somewhere around the letter i. Using the built-in javascript debugger in Chrome, the string I ended up using to get 902 was aiaaiaaiaaiaajaajaajaajaajaajaajaa, which immediately spit out the key "C0ffee?".

-- suntzu_II

No comments:

Post a Comment