Introduction to reverse engineering with Radare2
Radare2 is a binary analysis framework that bundles a large number of utilities. It was originally developed as a hexadecimal editor for searching and recovering data, then gained functionality over time and has become a powerful framework for analyzing binaries. In this article I will show how to analyze the logic of a program using Radare2, and describe the main elements of assembly language that are necessary for reverse engineering.
Radare2 is a bundle of several utilities:
radare2 (r2) - Hexadecimal editor, disassembler and debugger with an enhanced command line interface. Allows you to work with various input / output devices, such as disks, remote devices, debugged processes, etc., and also work with them as with simple files.
rabin2 - Used to get information about executable binaries.
rasm2 - Converts assembly instructions to machine code and back (an assembler and disassembler). Supports a wide variety of architectures.
rahash2 - A utility for calculating checksums. Supports many algorithms and can compute the checksum of an entire file, part of it, or an arbitrary string.
radiff2 - A utility for comparing binary files, supports many algorithms, can compare blocks of code of executable files.
rafind2 - Utility for finding a sequence of bytes.
ragg2 - A utility for compiling small programs.
rarun2 - A utility capable of running the analyzed program with different environment settings.
rax2 - A small calculator that allows simple calculations in various number systems.
The main drawback that limits the adoption of the framework is the lack of a quality GUI. There are third-party implementations, but unfortunately they are not very convenient. It is also worth noting that there is a built-in web interface.
Radare2 is most often used as a reverse engineering tool, as an advanced disassembler. We will consider Radare2 exactly as a disassembler and will analyze a simple crackme.
Introduction to assembler
Before starting to analyze the program, it is worth covering the main points needed to understand assembly code. A description of the basic assembly language instructions deserves a separate article, so only the main groups of instructions are listed here.
Copy instructions (mov, movsx, movzx)
Boolean instructions (and, or, xor, test)
Arithmetic instructions (add, sub)
Sequence control instructions (jmp, jne, ret)
Interrupt instructions (int)
I / O instructions (in, out)
By default Radare2 uses Intel syntax, which has the following format:
instruction operand ; comment
Basic instructions can have one or two operands. With two operands, the format looks like this:
instruction operand1, operand2 ; comment
Many instructions, such as and, sub, and add, store the result of the calculation in the first operand.
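The notation above can be illustrated with a small parser. This is only a sketch for reading disassembly listings by eye: the function name parse_instruction is made up for this example, and it handles only the simple "mnemonic operands ; comment" shape described here.

```python
def parse_instruction(line):
    """Split 'mnemonic op1, op2 ; comment' into (mnemonic, operands, comment)."""
    # Everything after the first ';' is treated as a comment.
    code, _, comment = line.partition(";")
    # The mnemonic is the first word; the rest is the operand list.
    parts = code.strip().split(None, 1)
    mnemonic = parts[0] if parts else ""
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return mnemonic, operands, comment.strip()


print(parse_instruction("xor eax, 0x6c ; the result is stored in eax"))
# ('xor', ['eax', '0x6c'], 'the result is stored in eax')
print(parse_instruction("ret"))
# ('ret', [], '')
```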
In general, x86 instructions cannot take two memory operands. Therefore, at least one of the values has to be placed in a register first, and the register is then used as an operand. This brings us to the definition of registers.
Registers are very fast storage locations that reside in the processor. They are much faster than RAM or cache, but the amount of data they can hold is very small. The x86 (x86-32) architecture has 8 general-purpose registers of 32 bits. The amd64 (x86-64) processors have 16 general-purpose registers of 64 bits. More details are provided in the table below.
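One detail matters for the analysis later in this article: the lower parts of a general-purpose register have their own names. On x86, ax is the low 16 bits of eax, al is its low byte, and ah is the byte above it. The sketch below models this with bit masks; the function name sub_registers is invented for illustration.

```python
def sub_registers(eax):
    """Return the named parts of a 32-bit eax value."""
    eax &= 0xFFFFFFFF  # keep only 32 bits, like the real register
    return {
        "eax": eax,
        "ax": eax & 0xFFFF,      # low 16 bits
        "ah": (eax >> 8) & 0xFF,  # bits 8..15
        "al": eax & 0xFF,        # low 8 bits
    }


for name, value in sub_registers(0x12345678).items():
    print(f"{name} = {value:#x}")
# eax = 0x12345678
# ax = 0x5678
# ah = 0x56
# al = 0x78
```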
Examining the crackme
Let's look at the analysis of executable files using a simple crackme taken from https://github.com/geyslan/crackmes . Let's run the program and observe its behavior. It immediately prompts for a password, so let's try entering 123456.
We did not guess the password: the program asks us to try again and exits. To start the analysis, launch radare2 with the command "r2 -A crackme". The -A argument makes radare2 analyze functions right after loading, which is equivalent to running the aa command manually (recent radare2 releases run the more thorough aaa). Use the izz command to display the strings contained in the program.
Here we see several strings, two of which we already met when running the program. We also see a string that is presumably displayed when the correct password is entered. This string is stored at 0x08048888; let's remember this address.
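The idea behind a string search can be sketched in a few lines of Python: scan the raw bytes for runs of printable ASCII characters. This is not how radare2 implements izz, only a rough approximation. The sample buffer and its strings are invented for the example and are not taken from the crackme.

```python
import re


def ascii_strings(data, min_len=4):
    """Return (offset, text) for every run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


# Invented sample data: a few non-printable bytes mixed with two strings.
sample = b"\x7fELF\x01\x00Password: \x00\x90\x90Try again\x00ok\x00"

for offset, text in ascii_strings(sample):
    print(f"{offset:#06x} {text!r}")
# 0x0006 'Password: '
# 0x0013 'Try again'
```

Short runs such as "ELF" and "ok" are skipped because they are below the minimum length, which is the usual way to reduce noise in this kind of search.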
Let's execute the afl command to get a list of functions.
Here, in addition to library functions, we also see the entry0 function, which, as the name implies, is the entry point of the program, and the main function, which is the starting point for the execution of all programs written in C / C++. From the names of the remaining functions, it is difficult to infer their role in the program.
Let's look at the code of the main function by executing pdf @ main. Here we see several function calls. The first call is fwrite, which outputs the prompt string. Second, the fgets function reads from the input device and puts the input into memory. This is followed by calls to two functions of unknown purpose, and then two more calls to fwrite. We are interested in the section of code that references the address of the string we remembered earlier.
Here we see that the string is output only if the conditional jump "jne 0x804875e" is not taken. For that, the eax register must be 0 when "test eax, eax" executes. We can assume that the function fcn.08048675, called earlier, checks the password and writes 0 to eax if the password is correct. Therefore, if the conditional jump is removed, the program will treat any entered password as correct. This can be done in several ways: force the eax register to 0 before the check, change the jump target, or simply remove the jump by replacing it with nop opcodes.
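The behavior of this instruction pair can be modeled in Python. The test instruction computes a bitwise AND of its operands and sets the zero flag (ZF) if the result is 0; jne jumps when ZF is clear. The function names below are invented for this sketch.

```python
def zero_flag_after_test(a, b):
    """Model of 'test a, b': ZF is set when (a AND b) == 0. The result itself is discarded."""
    return ((a & b) & 0xFFFFFFFF) == 0


def jne_taken(zf):
    """'jne' (jump if not equal / not zero) is taken when ZF is clear."""
    return not zf


def success_branch_reached(eax):
    """True if execution falls through the jne and reaches the success message."""
    zf = zero_flag_after_test(eax, eax)
    return not jne_taken(zf)


print(success_branch_reached(0))           # True: eax == 0, jump not taken
print(success_branch_reached(0xFFFFFFFF))  # False: eax == -1, jump taken
```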
We will try the last option. Reopen the file in write mode by running the oo+ command, then go to the address 0x08048735 and execute the command "wa nop; nop". As a result, the conditional jump is replaced with two nop opcodes.
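At the byte level, this patch can be sketched as follows. The sketch assumes the jump uses the 2-byte short form (opcode 0x75 followed by a signed 8-bit displacement), which matches the fact that two nop opcodes (0x90) replace it. It also assumes that "test eax, eax" (bytes 85 c0) sits directly before the jump. The 4-byte buffer is a hand-built fragment, not bytes read from the real file.

```python
NOP = 0x90  # single-byte x86 nop


def encode_short_jne(addr, target):
    """Encode 'jne target' located at addr in the short form: 0x75, rel8."""
    rel = target - (addr + 2)  # displacement is relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target is out of range for a short jump")
    return bytes([0x75, rel & 0xFF])


def nop_out(code, offset, length):
    """Return a copy of code with length bytes at offset replaced by nop."""
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


jne = encode_short_jne(0x08048735, 0x0804875E)
fragment = b"\x85\xc0" + jne  # assumed layout: test eax, eax ; jne 0x804875e
print(fragment.hex(), "->", nop_out(fragment, 2, 2).hex())
# 85c07527 -> 85c09090
```

Replacing exactly as many bytes as the original instruction occupies keeps every following instruction at its original address, which is why nop padding is a common choice for this kind of patch.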
Let's start the program and try to enter the password.
Great, we have successfully patched the program. In the case of a more complex program, such a solution may not work quite correctly, and as a result, the program may behave completely differently than it was intended. You can go a more complicated way and find out the correct password, for this you need to analyze the fcn.08048675 and fcn.08048642 functions. Let's start with fcn.08048642, execute pdf @ fcn.08048642.
After analyzing the code, we see that the function takes two arguments, although one of them is not used. The function body contains a loop with a counter. mov dword [local_4h], 0 initializes the counter to 0. Next, an unconditional jump to address 0x0804866d is performed, where the counter is compared with the value 5. If the counter does not exceed 5, the jump to address 0x08048651 is taken. There, the counter value is written to the edx register, then the value of the second argument is written to the eax register; most likely this is a pointer to the string we entered. The values of these registers are then added, which gives the address of the character at the counter's offset from the start of our string.
The result of the addition is stored in the edx register. Then a similar action is performed, only the result is stored in eax. On the next line, the movzx instruction copies the byte at the address in eax into the lower part of this register, al, and zeroes the rest of the register. After that, an exclusive or (xor) operation is performed between the byte in the eax register and 0x6c. The result is written to the address stored in edx. Then 1 is added to the counter. If the counter still does not exceed 5, the loop repeats.
Once the counter exceeds 5, the loop exits and the function ends. Thus, the function walks over the string we entered and changes each character in it. Since the maximum value of the counter is 5 and it starts at 0, we conclude that the password consists of 6 bytes.
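The loop described above can be written out in Python. This is a sketch of the described behavior, not decompiler output. It assumes, following the conclusion in the text, that exactly 6 bytes are processed, with the counter running from 0 to 5. The name transform is invented.

```python
KEY = 0x6C  # the constant used in the xor instruction


def transform(buf):
    """XOR the first 6 bytes of buf with KEY, leaving the rest untouched."""
    out = bytearray(buf)
    counter = 0                # mov dword [local_4h], 0
    while counter <= 5:        # loop while the counter does not exceed 5
        out[counter] ^= KEY    # read the byte at string + counter, xor it, write it back
        counter += 1           # add 1 to the counter
    return bytes(out)


typed = b"123456\n"            # the password tried earlier, as fgets would store it
print(transform(typed).hex())
print(transform(transform(typed)) == typed)  # True: XOR with the same key is its own inverse
```

Because XOR with a fixed key undoes itself, applying the same operation to the stored bytes recovers the original characters.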
Next, the fcn.08048675 function is called. It takes 2 arguments: the address of the transformed password we entered and the address 0x8049b60. Let's call them string 1 and string 2, and the addresses of the characters inside them pointer 1 and pointer 2, respectively. The function consists of a loop in which several checks are performed. At the start of each iteration, pointer 1 is written to eax, then the byte it points to is written to edx. The same is repeated for string 2, except that the byte is written to eax. Then the low bytes of these registers are compared.
If the values differ, the loop exits. Otherwise, control passes to address 0x0804867a, where the bytes referenced by both pointers are checked for zero. If both bytes are nonzero, the pointers are incremented by 1. If the bytes are not equal or one of them is 0, the code at address 0x080486b0 is executed, which checks the byte referenced by pointer 2. If that byte is 0, then 0 is written to the eax register, otherwise 0xffffffff, that is, -1. Then the function returns.
As you can see, this function simply compares two strings and returns 0 if they are the same, otherwise -1. We can also conclude that the correct password is stored at 0x8049b60. As we learned earlier, its length is 6 bytes, so let's read it.
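The comparison logic described above can be sketched in Python. This follows the textual description only, so details of the real function may differ. Both inputs are assumed to be NUL-terminated byte strings, and the name compare is invented.

```python
def compare(string1, string2):
    """Return 0 if the strings match as described in the text, otherwise -1."""
    i = 0
    # Advance while the bytes are equal and neither of them is the terminator.
    while string1[i] == string2[i] and string1[i] != 0 and string2[i] != 0:
        i += 1
    # After the loop, only the byte referenced by pointer 2 is checked.
    return 0 if string2[i] == 0 else -1


print(compare(b"abc\x00", b"abc\x00"))  # 0
print(compare(b"abd\x00", b"abc\x00"))  # -1
print(compare(b"ab\x00", b"abc\x00"))   # -1
```

Under this reading, only the end of string 2 is checked, so an input that starts with the whole of string 2 and continues further also yields 0. That would explain why a trailing newline left in the buffer by fgets does not break the check, although this is an inference from the description rather than something verified against the binary.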
Let's try to reverse the transformation of the first character by executing the command "? 0x1b ^ 0x6c", which gives us the first character, "w".
Applying the same operation to the remaining bytes, we get the string whyn0t. Let's check it, first replacing the patched version with the original one.
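The same reverse transformation can be reproduced in Python. Only the first stored byte, 0x1b, is quoted in this article. The remaining bytes below are reconstructed by encoding the recovered password with the same key, so treat them as derived values rather than a dump of the memory at 0x8049b60.

```python
KEY = 0x6C


def decode(stored):
    """Undo the XOR transformation and return the plaintext password."""
    return bytes(b ^ KEY for b in stored).decode("ascii")


# Reconstructed from the password given in the article (assumption, see above).
stored = bytes(c ^ KEY for c in b"whyn0t")

print(" ".join(f"{b:02x}" for b in stored))  # 1b 04 15 02 5c 18
print(chr(0x1B ^ KEY))                       # w
print(decode(stored))                        # whyn0t
```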
The password is correct, we have successfully solved this crackme.