Fetusmaze

Overview

We are given a binary written in assembly, so the decompilation produced by ghidra at first is quite bad. The code at the entry point is simple, so after a bit of analysis we discover it simply asks for input, calls a function and prints the flag if the return value is 1.

The decompilation of this function looks like this:

As we can see, it consumes one byte of the input, and depending on its value it jumps to another function. The cool thing is here is that ghidra can compute the possible jumps for us:

Out of the four possible paths, three of them are just dead ends and return 0. The other one is a function exactly like the one we saw: it takes one byte and jumps to other place. So that’s the maze: it is composed of functions, and the first one is the entry of the maze. There seem to be thousands of them, so we must figure out a way of automating this.

Solving

Leveraging the fact that ghidra could compute each possible jump for us, I decided this was a good opportunity to learn about its python api. The idea is to create a script that iterates the maze saving the current path until it gets to the end.

In order to know the address of the end function (the one that finally returns 1), I first wrote a simple script that creates every function in the binary. Given the address of the entry point, it iterates every possible jump that is not a dead end and creates the function. I first tried to implement this recursively, but ghidra exploded. I then took an iterative approach using a queue, which seemed more efficient. This failed again: I was not recording which functions I had already visited, so I was in an infinite loop. After solving those small issues, the resulting script for creating the functions looks like this:

from ghidra.program.model.symbol.FlowType import *

def analyze(beginning_addr):
    queue = [beginning_addr]
    while queue:
        pc = queue.pop(0)

        # If it exists, we have already visited it before
        func = getFunctionAt(pc)
        if func:
            #print "omiting existing", pc
            continue
        func = createFunction(pc, None)

        # Iterate instructions until finding a RET
        addr = pc
        while True:
            inst = getInstructionAt(addr)
            if not inst:
                addr = addr.next()
                continue
            if str(inst) == "RET":
                break
            if inst.getFlowType() == COMPUTED_JUMP: # JMP RAX
                for ref in inst.getReferencesFrom():
                    # For each reference, add it to the queue if it is not a dead end
                    addr_ref = ref.getToAddress()
                    inst = getInstructionAt(addr_ref)
                    if str(inst) != "XOR EAX,EAX":
                        queue.append(addr_ref)
            addr = addr.next()

pc = askAddress("Address", "Enter MAZE initial address (hex, don't use 0x)")
analyze(pc)

Once I had every function defined, I wrote another script that iterates them and prints every function that is not a dead end nor a normal path function.

f = getFirstFunction()
while f:
    addr = f.getEntryPoint()
    inst = getInstructionAt(addr)
    if str(inst) != "MOV AL,byte ptr [RDI]" and str(inst) != "XOR EAX,EAX":
        print(f)
    f = getFunctionAfter(f)

It prints the addresses of the entry function and the end function. Once I knew that, I did small modifications to the first script so it records the current path and stops when it gets to the end function. It turns out that the computed references are ordered: the first one matches the input ‘1’, the second one the input ‘2’, and so on, so we don’t have to worry about it. The result:

from ghidra.program.model.symbol.FlowType import *

WIN = 0x408b40

def analyze(beginning_addr):
    queue = [(beginning_addr, "")]
    while queue:
        pc, path = queue.pop(0)

        # We've got to the end
        if pc.getOffset() == WIN:
            print "WIN:", pc, path
            return

        # If it exists, we have already visited it before
        func = getFunctionAt(pc)
        if func:
            #print "omiting existing", pc
            continue
        func = createFunction(pc, None)
        
        # Iterate instructions until finding a RET
        addr = pc
        while True:
            inst = getInstructionAt(addr)
            if not inst:
                addr = addr.next()
                continue
            if str(inst) == "RET":
                break
            if inst.getFlowType() == COMPUTED_JUMP: # JMP RAX
                i = 1
                for ref in inst.getReferencesFrom():
                    # For each reference, add it to the queue if it is not a dead end
                    addr_ref = ref.getToAddress()
                    new_inst = getInstructionAt(addr_ref)
                    if str(new_inst) != "XOR EAX,EAX":
                        new_elem = (addr_ref, path+str(i))
                        queue.append(new_elem)
                    i+=1
            addr = addr.next()

pc = askAddress("Address", "Enter MAZE initial address (hex, don't use 0x)")
analyze(pc)

Scripts are not very elegant or efficient, but they work pretty good. Output of this last one:

Using the result path as input remotely gives us the flag:

flag{d1d_y0u_g3t_l0st?}

Leave a comment

Your email address will not be published. Required fields are marked *