Mastering Malware Analysis

Mastering Malware Analysis: The complete malware analyst's guide to combating malicious software, APT, cybercrime, and IoT attacks

By Alexey Kleymenov and Amr Thabet
Jun 2019, 562 pages, 1st Edition

A Crash Course in CISC/RISC and Programming Basics

Before diving into the malware world, we need to have a complete understanding of the core of the machines we are analyzing malware on. For reverse engineering purposes, it makes sense to focus largely on the architecture and the operating system it supports. Of course, there are multiple devices and modules that comprise a system, but it is mainly these two that define the set of tools and approaches used during the analysis. The physical representation of any architecture is a processor. A processor is like the heart of any smart device or computer in that it keeps them alive.

In this chapter, we will cover the basics of the most widely used architectures, from the well-known x86 and x64 Instruction Set Architectures (ISAs) to solutions powering multiple mobile and Internet of Things (IoT) devices that are often misused by malware families, such as Mirai and many others. It will set the tone for your journey into malware analysis, as static analysis is impossible without understanding assembly instructions. Although modern decompilers indeed become better and better, they don't exist for all platforms that are targeted by malware. Additionally, they will probably never be able to handle obfuscated code. Don't be daunted by the complexity of assembly; it just takes time to get used to it, and after a while, it becomes possible to read it like any other programming language. While this chapter provides a starting point, it always makes sense to deepen your knowledge by practicing and exploring further.

This chapter is divided into the following sections to facilitate the learning process: 

  • Basic concepts
  • Assembly languages
  • Becoming familiar with x86 (IA-32 and x64)
  • Exploring ARM assembly
  • Basics of MIPS
  • Covering the SuperH assembly
  • Working with SPARC
  • Moving from assembly to high-level programming languages

Basic concepts

Most people don't really understand that the processor is pretty much a smart calculator. If you look at most of its instructions (whatever the assembly language is), you will find many of them dealing with numbers and doing some calculations. However, there are multiple features that actually differentiate processors from usual calculators:

  • Processors have access to a bigger memory space compared to traditional calculators. This memory space gives them the ability to store billions of values, which allows them to perform more complex operations. Additionally, they have multiple fast and small memory storage units embedded inside the processors' chip called registers.
  • Processors support many instruction types other than arithmetic instructions, such as changing the execution flow based on certain conditions.
  • Processors are able to communicate with other devices (such as speakers, mics, hard disks, graphics card, and so on).

Armed with such features in conjunction with great flexibility, processors became the go-to smart machines for technologies such as AI, machine learning, and others. In the following sections, we will explore these features and later will dive deeper into different assembly languages and how these features are manifested in these languages' instruction set.

Registers

As most of the processors have access to a huge memory space storing billions of values, it takes longer for the processor to access the data (and it gets complex, as we will see later). So, to speed up the processor operations, they contain small and fast internal memory storage units called registers.

Registers are built into the processor chip and are able to store the immediate values that are needed while performing calculations and data transfer from one place to another.

Registers may have different names, sizes, and functions, depending on the architecture. Here are some of the types that are widely used:

  • General data registers: General data registers are registers that are used to save values or results from different arithmetic and logical operations.
  • Stack and frame pointers: These are registers that point to the current top of the stack and to the base of the current function's stack frame, respectively.
  • Instruction pointer/program counter: The instruction pointer is used to point to the start of the next instruction to be executed by the processor.

Memory

Memory plays an important role in the development of all smart devices that we see nowadays. The ability to manage lots of values, text, images, and videos on a fast and volatile memory allows processors to process more information and display graphical interfaces in 3D and virtual reality.

Virtual memory

In modern operating systems, whether they are 32-bit, 64-bit, or whatever the size of the physical memory, the operating system allocates a fixed size, isolated virtual memory (in which its pages are mapped to the physical memory pages) for each application to secure the operating system's and the other applications' data. 

Each application only has the ability to access their own virtual memory. They have the ability to read, write, or execute instructions in their virtual memory pages. Each virtual memory page has a set of permissions assigned to it that represent the type of operations that the application is allowed to execute on this page. These permissions are read, write, and execute. Additionally, multiple permissions can be assigned to each memory page.

For an application to access any stored value inside a memory address, it needs a virtual address, which is basically the address of where this value is stored in memory.

Despite knowing the virtual address, access can be hindered by another issue, which is storing this virtual address. The size of the virtual address is 4 bytes in 32-bit systems and 8 bytes in 64-bit systems. This means we need to allocate another space in memory to store that virtual address. For this new space in memory, we will need to store its own memory address in yet another memory space, which leads us to an infinite loop, as shown in the following figure:

Figure 1: Virtual memory addresses

To solve this condition, multiple solutions are used nowadays, and in the next section, we will cover one of them, which is the stack.

Stack

Stack literally means a pile of objects. In computer science, a stack is basically a data structure that helps to save different values in memory with the same size in a pile structure using the principle of Last in First Out (LIFO). 

A stack is pointed to by two registers: the stack pointer points to its current top, and the frame pointer points to the base of the current function's frame.

A stack is common between all known assembly languages and it has several functions. For example, it may help in solving mathematical equations, such as X = 5*6 + 6*2 + 7*(4 + 6), by calculating each value and pushing it onto the stack, and later popping (or pulling) the values back to calculate their sum and save it in variable X.
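The push-and-pop evaluation described above can be sketched in Python, using a list as the stack. This is only an illustration of the LIFO principle; the variable names are made up for the example and a real processor would use push and pop instructions instead.

```python
# A Python list used as a LIFO stack: append() pushes, pop() pulls from the top.
stack = []

# Calculate each term and push the intermediate result.
stack.append(5 * 6)        # 30
stack.append(6 * 2)        # 12
stack.append(7 * (4 + 6))  # 70

# Pop the values back (last in, first out) and accumulate the sum in x.
popped_order = []
x = 0
while stack:
    value = stack.pop()
    popped_order.append(value)
    x += value

print(popped_order)  # [70, 12, 30]
print(x)             # 112
```

Note that the last value pushed (70) is the first one popped.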

It is also commonly used to pass arguments (especially if there are a lot of them) and store local variables.

A stack is also used to save the return addresses just before calling a function or a subroutine. So, after this routine finishes, it pops the return address back from the top of the stack and returns to where it was called from to continue the execution.

While the stack pointer is generally pointing to the current top of the stack, the frame pointer is keeping the address of the top of the stack before the subroutine call, so it can be easily restored after it is returned.

Branches, loops, and conditions

The second feature that processors have is the ability to change the execution flow of a program based on a given condition. In every assembly language, there are multiple comparison instructions and flow control instructions. The flow control instructions can be divided into the following categories:

  • Unconditional jump: This is a type of instruction that forcefully changes the flow of the execution to another address (without any given condition).
  • Conditional jump: This is like a logical gate that switches to another branch based on the given condition (such as equal to zero, greater than, or lower than), as shown in the following figure:

Figure 2: An example of a conditional jump
  • Call: This changes the execution to another function and saves the return address in the stack.
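To make the three categories concrete, here is a minimal sketch of a toy interpreter in Python. The opcode names (set, dec, jmp, jnz, call, ret, halt) and the single acc register are hypothetical and do not correspond to any real instruction set; the point is only to show how each category changes the program counter.

```python
def run(program):
    """Execute a toy instruction list; opcodes here are hypothetical."""
    pc = 0        # program counter: index of the next instruction
    acc = 0       # a single data register
    stack = []    # holds return addresses for call/ret
    trace = []    # indexes of executed instructions, in order
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        pc += 1                   # by default, fall through to the next instruction
        if op == "set":
            acc = arg
        elif op == "dec":
            acc -= 1
        elif op == "jmp":         # unconditional jump
            pc = arg
        elif op == "jnz":         # conditional jump: taken only if acc != 0
            if acc != 0:
                pc = arg
        elif op == "call":        # save the return address, then jump
            stack.append(pc)
            pc = arg
        elif op == "ret":         # go back to the saved return address
            pc = stack.pop()
        elif op == "halt":
            break
    return acc, trace


program = [
    ("set", 2),      # 0: acc = 2
    ("call", 5),     # 1: call the subroutine at index 5
    ("jnz", 1),      # 2: repeat the call while acc != 0
    ("jmp", 7),      # 3: skip over index 4
    ("set", 99),     # 4: never executed
    ("dec", None),   # 5: subroutine body
    ("ret", None),   # 6: return to the caller
    ("halt", None),  # 7: stop
]

acc, trace = run(program)
print(acc)    # 0
print(trace)  # [0, 1, 5, 6, 2, 1, 5, 6, 2, 3, 7]
```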

Exceptions, interrupts, and communicating with other devices

In assembly language, communication with different hardware devices is done through what's called interrupts.

An interrupt is a signal to the processor sent by the hardware or software indicating that there's something happening or there is a message to be delivered. The processor suspends its current running process, saving its state, and executes a function called an interrupt handler to deal with this interrupt. Interrupts have their own notation and are widely used to communicate with hardware for sending requests and dealing with their responses.

There are two types of interrupts. Hardware interrupts are generally used to handle external events when communicating with hardware. Software interrupts are caused by software, usually by calling a particular instruction. The difference between an interrupt and an exception is that exceptions take place within the processor rather than externally. An example of an operation generating an exception can be a division by zero.

Assembly languages

There are two big groups of architectures defining assembly languages that we will cover in this section, and they are Complex Instruction Set Computer (CISC) and Reduced Instruction Set Computer (RISC).

CISC versus RISC

Without going into too many details, the main difference between CISC assemblies, such as Intel IA-32 and x64, and RISC assembly languages associated with architectures such as ARM, is the complexity of their instructions.

CISC assembly languages have more complex instructions. They focus on completing tasks using as few lines of assembly instructions as possible. To do that, CISC assembly languages include instructions that can perform multiple operations, such as mul in Intel assembly, which performs data access, multiplication, and data store operations.

In the RISC assembly language, assembly instructions are simple and generally perform only one operation each. This may lead to more lines of code to complete a specific task. However, it may also be more efficient, as this omits the execution of any unnecessary operations.

Types of instructions

In the following sections, we will cover the main structure of each assembly language, the three basic types of assembly instructions, and how they are translated into each of these languages:

  • Data manipulation:
    • Arithmetic manipulation
    • Logic and bit manipulation
    • Shifts and rotations
  • Data transfers:
    • Transfers between memory and registers
    • Transfers between registers
  • Execution of flow control:
    • Jumps or calls
    • Branches based on a condition

Becoming familiar with x86 (IA-32 and x64)

Intel x86 (IA-32 and x64) is the most common architecture used in PCs and powers many servers, so it is no surprise that most of the malware samples we have at the moment target it. x86 is a CISC architecture, and it includes multiple complex instructions in addition to simple ones. In this section, we will introduce the most common of them, along with how compilers take advantage of them in their calling conventions.

Registers

Here is a table showing the relationship between registers in IA-32 and x64 architectures:

Figure 3: Registers used in the x86 architecture

r8 to r15 are available only in x64 and not in IA-32, and spl, bpl, sil, and dil can be accessed only in x64.

The first four registers (rax, rbx, rcx, and rdx) are General-Purpose Registers (GPRs), but some of them have the following special use for certain instructions:

  • rax/eax: This is used to store information and it's a special register for some calculations
  • rcx/ecx: This is used as a counter register in loop instructions
  • rdx/edx: This is used in division to return the modulus

In x64, the registers from r8 to r15 are also GPRs that were added to the available GPRs.

The rsp/esp register is used as a stack pointer that points to the top of the stack. It moves toward lower addresses when a value is pushed, and toward higher addresses when a value is pulled out from the stack. The rbp/ebp register is used as a frame pointer, which means it points to the base of the current function's stack frame, and it's helpful for accessing the function's local variables, as we will see later in this section. In addition to this, rbp/ebp is sometimes used as a GPR for storing any kind of data.

rsi/esi and rdi/edi are used mostly to define the addresses when copying a group of bytes in memory. The rsi/esi register always plays the role of the source and the rdi/edi register plays the role of the destination. Both registers are non-volatile and are also GPRs.

The instruction structure

For Intel x86 assembly (IA-32 or x64), the common structure of its instructions is opcode dest, src.

Let's get deeper into them.

opcode

opcode is the name of the instruction. Some instructions have only opcode without any dest or src such as the following:

nop
pushad
popad
movsb

pushad and popad are not available in x64.

dest

dest represents the destination or where the result of the calculations will be saved, as well as becoming part of the calculations themselves like this:

add eax, ecx ;eax = (eax + ecx)
sub rdx, rcx ;rdx = (rdx - rcx)

Also, it could play a role of a source and a destination with some opcode instructions that take only dest without a source:

inc eax
dec ecx

Or, it could be only a source or only a destination, as in these instructions that save a value to the stack and pull it back:

push rdx
pop rcx

dest could look like the following:

  • REG: A register such as eax and edx.
  • r/m: A place in memory such as the following:
DWORD PTR [00401000h]
BYTE PTR [EAX + 00401000h]
WORD PTR [EDX*4 + EAX + 30]
  • A value in the stack (used to represent local variables), such as the following:
DWORD PTR [ESP+4]
DWORD PTR [EBP-8]
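The memory operands above all follow the same pattern: base + index * scale + displacement. The following Python sketch shows how such an address could be computed; the function name and the register values are assumptions made for the example.

```python
def effective_address(base=0, index=0, scale=1, displacement=0, bits=32):
    """Compute base + index * scale + displacement, wrapped to the address size."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only supports scale factors of 1, 2, 4, and 8")
    return (base + index * scale + displacement) & ((1 << bits) - 1)


# WORD PTR [EDX*4 + EAX + 30], assuming eax = 0x401000 and edx = 3
addr1 = effective_address(base=0x401000, index=3, scale=4, displacement=30)
print(hex(addr1))  # 0x40102a

# DWORD PTR [EBP-8], assuming ebp = 0x12FF80 (a local variable slot)
addr2 = effective_address(base=0x12FF80, displacement=-8)
print(hex(addr2))  # 0x12ff78
```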

src

src represents the source or another value in the calculations, but it doesn't save the results afterward. It may look like this:

  • REG: For instance, add rcx, r8
  • r/m: For instance, add ecx, dword ptr [00401000h]
  • imm: An immediate value, such as mov eax, 00100000h

The instruction set

Here, we will cover the different types of instructions that we listed in the previous section.

Data manipulation instructions

Some of the arithmetic instructions are as follows:

Instruction  Structure          Description
add/sub      add/sub dest, src  dest = dest + src / dest = dest - src
inc/dec      inc/dec dest       dest = dest + 1 / dest = dest - 1
mul          mul src            (Unsigned multiply) rdx:rax = rax * src
div          div src            rdx:rax / src (returns the result in rax and the remainder/modulus in rdx)
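The way mul and div use the rdx:rax register pair is easy to misread, so here is a minimal Python sketch of the 64-bit unsigned forms. The function names are hypothetical; the behavior modeled is only what the table describes, plus the divide error that the processor raises when the divisor is zero or the quotient does not fit in rax.

```python
MASK64 = (1 << 64) - 1


def mul64(rax, src):
    """Unsigned mul: return (rdx, rax) holding the 128-bit product rax * src."""
    product = (rax & MASK64) * (src & MASK64)
    return (product >> 64) & MASK64, product & MASK64


def div64(rdx, rax, src):
    """Unsigned div: divide rdx:rax by src; return (quotient, remainder)."""
    if src == 0:
        raise ZeroDivisionError("the processor raises a divide error here")
    dividend = (rdx << 64) | rax
    quotient, remainder = divmod(dividend, src)
    if quotient > MASK64:
        raise OverflowError("quotient does not fit in rax (divide error)")
    return quotient, remainder  # quotient goes to rax, remainder to rdx


rdx, rax = mul64(0xFFFFFFFFFFFFFFFF, 2)
print(hex(rdx), hex(rax))  # 0x1 0xfffffffffffffffe

quotient, remainder = div64(0, 17, 5)
print(quotient, remainder)  # 3 2
```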

Additionally, for logic and bits manipulation, they are like this:

Instruction  Structure             Description
or/and/xor   or/and/xor dest, src  dest = dest | src / dest = dest & src / dest = dest ^ src
not          not dest              dest = ~dest (the bits are flipped)

And, lastly, for shifts and rotations they are like this:

Instruction  Structure                Description
shl/shr      shl/shr dest, imm or cl  dest = dest << src / dest = dest >> src (shifts the dest register's bits to the left or the right, which has the same effect as multiplying or dividing by two src times). The shift count is limited by the dest register's number of bits, such as 32 or 64.
rol/ror      rol/ror dest, imm or cl  Rotates the dest register's bits left or right (same operands as shl and shr)
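The difference between a shift and a rotation is easiest to see on concrete values. The following Python sketch models the 32-bit forms; the helper names are made up for the example, and the count is masked to 5 bits because that is how x86 treats shift counts for 32-bit operands.

```python
MASK32 = 0xFFFFFFFF


def shl32(value, count):
    """Shift left: bits pushed past bit 31 are lost."""
    count &= 31
    return (value << count) & MASK32


def shr32(value, count):
    """Logical shift right: zeros are shifted in from the left."""
    count &= 31
    return (value & MASK32) >> count


def rol32(value, count):
    """Rotate left: bits leaving on the left re-enter on the right."""
    count &= 31
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def ror32(value, count):
    """Rotate right: bits leaving on the right re-enter on the left."""
    count &= 31
    value &= MASK32
    return ((value >> count) | (value << (32 - count))) & MASK32


print(shl32(3, 2))                # 12 (3 multiplied by 2 twice)
print(hex(shl32(0x80000001, 1)))  # 0x2 (the top bit is lost)
print(hex(rol32(0x80000001, 1)))  # 0x3 (the top bit wraps around)
print(hex(ror32(0x80000001, 1)))  # 0xc0000000
```

Rotations such as rol and ror are worth recognizing, as they frequently appear in hashing and simple encryption loops.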

Data transfer instructions

There's a mov instruction, which copies a value from src to dest. This instruction has multiple forms, as we can see in this table:

Instruction  Structure              Description
mov          mov dest, src          dest = src
movsx/movzx  movsx/movzx dest, src  src is smaller than dest (for example, src is 16 bits and dest is 32 bits). movzx: Sets the remaining bits in dest to zero. movsx: Preserves the sign of the src value.
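As a minimal sketch of the difference between zero extension and sign extension, the following Python functions model movzx and movsx for a 16-bit source and a 32-bit destination by default. The function and parameter names are assumptions made for the example.

```python
def movzx(src, src_bits=16):
    """Zero-extend: keep the low src_bits, the upper bits of dest become 0."""
    return src & ((1 << src_bits) - 1)


def movsx(src, src_bits=16, dest_bits=32):
    """Sign-extend: copy the sign bit of src into the upper bits of dest."""
    src &= (1 << src_bits) - 1
    if src & (1 << (src_bits - 1)):   # sign bit set: the value is negative
        src -= 1 << src_bits
    return src & ((1 << dest_bits) - 1)


# 0xFFFE is -2 as a signed 16-bit value
print(hex(movzx(0xFFFE)))  # 0xfffe     (65534 as a 32-bit value)
print(hex(movsx(0xFFFE)))  # 0xfffffffe (still -2 as a signed 32-bit value)
print(hex(movsx(0x7FFF)))  # 0x7fff     (positive values are unchanged)
```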

 

Other instructions related to stack are like this:

Instruction   Structure      Description
push/pop      push/pop dest  Pushes the value onto the top of the stack (esp = esp - 4) / pulls the value out of the stack (esp = esp + 4)
pushad/popad  pushad/popad   Saves all general-purpose registers to the stack / pulls all of them out of the stack (in 32-bit x86 only)

For string manipulation, they are like this:

Instruction              Structure                Description
lodsb/lodsw/lodsd/lodsq  lodsb/lodsw/lodsd/lodsq  Loads a byte, 2 bytes, 4 bytes, or 8 bytes from the address in rsi/esi into al/ax/eax/rax
stosb/stosw/stosd/stosq  stosb/stosw/stosd/stosq  Stores a byte, 2 bytes, 4 bytes, or 8 bytes from al/ax/eax/rax at the address in rdi/edi
movsb/movsw/movsd/movsq  movsb/movsw/movsd/movsq  Copies a byte, 2 bytes, 4 bytes, or 8 bytes from the address in rsi/esi to the address in rdi/edi

Flow control instructions

Some of the unconditional redirections are as follows:

Instruction  Structure                                Description
jmp          jmp <relative address>                   The relative address is calculated from the start of the next instruction after jmp to the destination
             jmp DWORD/QWORD ptr [Absolute Address]
call         call <relative address>                  Same as jmp, but it saves the return address in the stack
             call DWORD/QWORD ptr [Absolute Address]
ret/retn     ret imm                                  Pulls the return address from the stack, cleans the stack from the pushed arguments, and jumps to that address
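Since the relative address is counted from the start of the next instruction, it helps to work through the arithmetic once. The sketch below assumes the common 5-byte near jmp form (opcode E9 followed by a 32-bit little-endian displacement) in 32-bit code; the function names and the sample addresses are made up for the example.

```python
import struct


def encode_jmp_rel32(jmp_address, target):
    """Encode a 5-byte near jmp (opcode E9) located at jmp_address."""
    next_instruction = jmp_address + 5        # 1 opcode byte + 4 displacement bytes
    displacement = target - next_instruction  # may be negative for backward jumps
    return b"\xE9" + struct.pack("<i", displacement)


def decode_jmp_rel32(jmp_address, encoded):
    """Recover the destination of a 5-byte near jmp found at jmp_address."""
    (displacement,) = struct.unpack("<i", encoded[1:5])
    return (jmp_address + 5 + displacement) & 0xFFFFFFFF


forward = encode_jmp_rel32(0x401000, 0x401020)
backward = encode_jmp_rel32(0x401020, 0x401000)
print(forward.hex())   # e91b000000
print(backward.hex())  # e9dbffffff
print(hex(decode_jmp_rel32(0x401020, backward)))  # 0x401000
```

The same calculation applies to call with a relative address (opcode E8), which is handy when following calls manually in a hex dump.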

 

Some of the conditional redirections are as follows:

Instruction   Structure                          Description
jnz/jz/jb/ja  jz/jnz <relative address>          Similar to jmp, but jumps based on a condition
loop          loop <relative address>            Similar to jmp, but it decrements rcx/ecx and jumps if it didn't reach zero (uses rcx/ecx as a loop counter)
rep           rep opcode dest, src (if needed)   rep is a prefix that is used with string instructions; it decrements rcx/ecx, and repeats the instruction until rcx/ecx reaches zero
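Combining the rep prefix with a string instruction gives a compact copy loop. Here is a minimal Python sketch of what rep movsb does, assuming the direction flag is clear so that both pointers move forward; the function name and the flat bytearray standing in for memory are assumptions made for the example.

```python
def rep_movsb(memory, rsi, rdi, rcx):
    """Copy rcx bytes from memory[rsi] to memory[rdi], one byte per iteration."""
    while rcx != 0:
        memory[rdi] = memory[rsi]  # movsb: copy a single byte
        rsi += 1                   # direction flag assumed clear: pointers advance
        rdi += 1
        rcx -= 1                   # rep: decrement the counter and repeat
    return rsi, rdi, rcx


memory = bytearray(b"malware\x00" + bytes(8))  # 8 source bytes + 8 empty bytes
rsi, rdi, rcx = rep_movsb(memory, rsi=0, rdi=8, rcx=7)

print(bytes(memory[8:15]))  # b'malware'
print(rsi, rdi, rcx)        # 7 15 0
```

After the loop, rsi and rdi point just past the copied data, and rcx is zero.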

Arguments, local variables, and calling conventions (in x86 and x64)

There are multiple ways in which compilers represent functions, calls, local variables, and more. We will not cover all of them. We will start with the standard call (stdcall), which is only used in x86, and then cover how the other calling conventions differ from it.

stdcall

The stack, rsp/esp, and rbp/ebp registers do most of the work when it comes to arguments and local variables. The call instruction saves the return address at the top of the stack before transferring the execution to the new function, and the ret instruction at the end of the function returns the execution back to the caller function using the return address saved in the stack.

Arguments

For stdcall, the arguments are also pushed in the stack from the last argument to the first like this:

Push Arg02
Push Arg01
Call Func01

In the called function, the arguments can be accessed through rsp/esp, keeping in mind how many values have been pushed onto the stack in the meantime, with something like this:

mov eax, [esp + 4] ;Arg01
push eax
mov ecx, [esp + 8] ; Arg01 keeping in mind the previous push

In this case, the value located at the address specified by the value inside the square brackets is transferred. Fortunately, modern static analysis tools, such as IDA Pro, can detect which argument is being accessed in each instruction, as in this case.

The most common way to access arguments, as well as local variables, is by using rbp/ebp. First, the called function needs to save the current rsp/esp in rbp/ebp register and then access them this way:

push ebp
mov ebp, esp
...
mov ecx, [ebp + 8] ;Arg01
push eax
mov ecx, [ebp + 8] ;still Arg01 (no changes)
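To see why [ebp + 8] keeps pointing at the first argument while esp-based offsets drift, the stack layout can be modeled in a few lines of Python. The Stack32 class, the starting address, and the placeholder values are all hypothetical; only the 4-byte slot arithmetic mirrors the 32-bit code above.

```python
class Stack32:
    """A toy downward-growing stack with 4-byte slots (hypothetical helper)."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4              # push moves esp toward lower addresses
        self.mem[self.esp] = value

    def read(self, address):
        return self.mem[address]


s = Stack32()
s.push("Arg02")           # push Arg02
s.push("Arg01")           # push Arg01
s.push("return address")  # call Func01 saves the return address
s.push("saved ebp")       # push ebp
ebp = s.esp               # mov ebp, esp

first = s.read(ebp + 8)        # [ebp + 8]  -> Arg01
second = s.read(ebp + 12)      # [ebp + 12] -> Arg02
s.push("eax")                  # push eax moves esp, but not ebp
still_first = s.read(ebp + 8)  # [ebp + 8] is still Arg01
via_esp = s.read(s.esp + 12)   # the esp-based offset had to change to 12

print(first, second, still_first, via_esp)  # Arg01 Arg02 Arg01 Arg01
```

In this layout, [ebp] holds the saved ebp and [ebp + 4] holds the return address, which is why the first argument starts at offset 8.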

And, at the end of the called function, it returns back the original value of rbp/ebp and the rsp/esp like this:

mov esp,ebp
pop ebp
ret

As it's a common function epilogue, Intel created a special instruction for it, which is leave, so it became this:

leave
ret

Local variables

For local variables, the called function allocates space for them by decreasing the value of the rsp/esp register. To allocate space for two variables of four bytes each, the code will be this:

push ebp
mov ebp,esp
sub esp, 8

Additionally, the end of the function will be this:

mov esp,ebp
pop ebp
ret

Figure 4: An example of a stack change at the beginning and at the end of the function

Additionally, if there are arguments, the ret instruction cleans the stack given the number of bytes to pull out from the top of the stack like this:

ret 8 ;2 Arguments, 4 bytes each

cdecl

cdecl (which stands for c declaration) is another calling convention that was used by many C compilers in x86. It's very similar to stdcall, with the only difference being that the caller cleans the stack after the callee function (the called function) returns like this:

Caller:
push Arg02
push Arg01
call Callee
add esp, 8 ;cleans the stack
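The only difference between stdcall and cdecl is who puts esp back after the arguments were pushed. The following Python sketch tracks esp through a call under both conventions; the function name and the starting value of esp are assumptions made for the example.

```python
def call_sequence(convention, arg_count, esp=0x1000):
    """Track esp through a 32-bit call; return (final esp, bytes cleaned by the caller)."""
    esp -= 4 * arg_count  # the caller pushes the arguments
    esp -= 4              # call pushes the return address
    # ... the callee body runs here ...
    esp += 4              # ret pops the return address
    if convention == "stdcall":
        esp += 4 * arg_count            # ret imm: the callee removes the arguments
        caller_cleanup = 0
    elif convention == "cdecl":
        caller_cleanup = 4 * arg_count  # add esp, imm executed by the caller
        esp += caller_cleanup
    else:
        raise ValueError("unknown calling convention")
    return esp, caller_cleanup


print(call_sequence("stdcall", 2))  # (4096, 0): ret 8 already restored esp
print(call_sequence("cdecl", 2))    # (4096, 8): the caller ran add esp, 8
```

In a disassembly, an add esp, imm right after a call is therefore a good hint that the callee uses cdecl.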

fastcall

The __fastcall calling convention is also widely used by different compilers, including Microsoft C++ compiler and GCC. This calling convention passes the first two arguments in ecx and edx, and pushes the remaining arguments in the stack. It's only used in x86 as there's only one calling convention for x64.

thiscall

For object-oriented programming and for non-static member functions (such as class methods), the C++ compiler needs to pass the address of the object whose attributes will be accessed or manipulated by the function as an argument.

In the GCC compiler, thiscall is almost identical to the cdecl calling convention and it passes the object address as the first argument. But in the Microsoft C++ compiler, it's similar to stdcall and it passes the object address in ecx. It's common to see such patterns in some object-oriented malware families.

The x64 calling convention

In x64, the calling convention is more dependent on the registers. For Windows, the caller function passes the first four arguments in registers in this order: rcx, rdx, r8, r9, and the rest are pushed onto the stack. For the other operating systems, the first six arguments are usually passed in registers in this order: rdi, rsi, rdx, rcx, r8, r9, and the remaining ones go to the stack.

In both cases, the caller cleans the stack after the called function returns, and this is the standard calling convention for these operating systems in x64.
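As a quick reference, the register assignment can be sketched in Python. The register orders below are the ones defined by the Microsoft x64 convention and by the System V AMD64 ABI (used by Linux and macOS); the function name is hypothetical, and the sketch only covers integer and pointer arguments, not floating-point ones.

```python
WINDOWS_X64 = ["rcx", "rdx", "r8", "r9"]
SYSTEM_V = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def assign_arguments(args, registers):
    """Split integer/pointer arguments between registers and the stack."""
    in_registers = dict(zip(registers, args))
    on_stack = list(args[len(registers):])
    return in_registers, on_stack


args = [1, 2, 3, 4, 5, 6, 7]

win_regs, win_stack = assign_arguments(args, WINDOWS_X64)
print(win_regs)   # {'rcx': 1, 'rdx': 2, 'r8': 3, 'r9': 4}
print(win_stack)  # [5, 6, 7]

sysv_regs, sysv_stack = assign_arguments(args, SYSTEM_V)
print(sysv_regs)   # {'rdi': 1, 'rsi': 2, 'rdx': 3, 'rcx': 4, 'r8': 5, 'r9': 6}
print(sysv_stack)  # [7]
```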

Exploring ARM assembly

Most readers are probably more familiar with the x86 architecture, which implements the CISC design, and may wonder—why do we actually need something else? The main advantage of RISC architectures is that processors that implement them generally require fewer transistors, which eventually makes them more energy and heat efficient and reduces the associated manufacturing costs, making them a better choice for portable devices. We start our introduction to RISC architectures with ARM for a good reason—at the moment, this is the most widely used architecture in the world.

The explanation is simple: processors implementing it can be found on multiple mobile devices and appliances such as phones, video game consoles, or digital cameras, heavily outnumbering PCs. For this reason, multiple IoT malware families and mobile malware targeting Android and iOS platforms have payloads for ARM architecture; an example can be seen in the following screenshot:

Figure 5: Disassembled IoT malware targeting ARM-based devices

Thus, in order to be able to analyze them, it is necessary to understand how ARM works first.

ARM originally stood for Acorn RISC Machine, and later for Advanced RISC Machine. Acorn was a British company considered by many as the British Apple, producing some of the most powerful PCs of that time. It was later split into several independent entities with Arm Holdings (currently owned by SoftBank Group) supporting and extending the current standard.

There are multiple operating systems supporting it, including Windows, Android, iOS, various Unix/Linux distributions, and many other lesser known embedded OSes. The support for a 64-bit address space was added in 2011 with the release of the ARMv8 standard.

Overall, the following ARM architecture profiles are available:

  • Application profiles (suffix A, for example, the Cortex-A family): This implements a traditional ARM architecture and supports a virtual memory system architecture based on a Memory Management Unit (MMU). These profiles support both ARM and Thumb instruction sets (as discussed later).
  • Real-time profiles (suffix R, for example, the Cortex-R family): This implements a traditional ARM architecture and supports a protected memory system architecture based on a Memory Protection Unit (MPU).
  • Microcontroller profiles (suffix M, for example, the Cortex-M family): This implements a programmers' model and is designed for integration into Field Programmable Gate Arrays (FPGAs).

Each family has its own corresponding set of associated architectures (for example, the Cortex-A 32-bit family incorporates ARMv7-A and ARMv8-A architectures), which in turn incorporate several cores (for example, ARMv7-R architecture incorporates Cortex-R4, Cortex-R5, and so on).

Basics

Here, we will cover both the original 32-bit and the newer 64-bit architectures. There were multiple versions released over time, starting from the ARMv1. In this book, we will focus on the recent versions of them.

ARM is a load-store architecture; it divides all instructions into the following two categories:

  • Memory access: Moves data between memory and registers
  • Arithmetic Logic Unit (ALU) operations: Does computations involving registers

ARM supports arithmetic operations for adding, subtracting, and multiplying, and some new versions, starting from ARMv7, also support division operations. It supports big-endian order, and uses the little-endian format by default.
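Byte order matters as soon as you read multi-byte values from a memory dump or a binary file. The following Python sketch shows how the same 32-bit value is laid out in memory under each format; the sample value is arbitrary.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # little-endian: least significant byte first
big = struct.pack(">I", value)     # big-endian: most significant byte first

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Reading the bytes back with the wrong byte order gives a different number.
misread = int.from_bytes(little, "big")
print(hex(misread))  # 0x78563412
```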

There are 16 registers visible at any time on the 32-bit ARM: R0-R15. This number is convenient as it takes only 4 bits to define which register is going to be used. Out of them, 13 (sometimes referred to as 14 when R14 is included, or 15 when R13 is included as well) are general-purpose registers: R13 and R15 each have a special function, while R14 can take one occasionally. Let's have a look at them in greater detail:

  • R0-R7: Low registers are the same in all CPU modes.
  • R8-R12: High registers are the same in all CPU modes except the Fast Interrupt Request (FIQ) mode, and are not accessible by 16-bit instructions.
  • R13 (also known as SP): Stack pointer—points to the top of the stack, and each CPU mode has its own version of it. It is discouraged to use it as a GPR.
  • R14 (also known as LR): Link register. In user mode, it contains the return address for the current function, mainly when BL (Branch with Link) or BLX (Branch with Link and eXchange) instructions are executed. It can also be used as a GPR if the return address is stored on the stack. Each CPU mode has its own version of it.
  • R15 (also known as PC): Program counter, points to the currently executed command. It's not a GPR.

Altogether, there are 30 general-purpose 32-bit registers on most of the ARM architectures overall, including the same name instances in different CPU modes.

Apart from these, there are several other important registers, as follows:

  • Current Program Status Register (CPSR): This contains bits describing a current processor mode, a processor state, and some other values.
  • Saved Program Status Registers (SPSR): This stores the value of CPSR when the exception is taken, so it can be restored later. Each CPU mode has its own version of it, except the user and system modes, as they are not exception-handling modes.
  • Application Program Status Register (APSR): This stores copies of the ALU status flags, also known as condition code flags, and on later architectures, it also holds the Q (saturation) and the greater than or equal to (GE) flags.

The number of Floating-Point Registers (FPRs) for a 32-bit architecture may vary, depending on the core, up to 32.

ARMv8 (64-bit) has 31 general-purpose registers, X0-X30 (R0-R30 notation can also be found), and 32 FPRs accessible at all times. The lower 32 bits of each register can be accessed using the W prefix, as W0-W30.

There are several registers that have a particular purpose, as follows:

Name     Size                      Description
XZR/WZR  64/32 bits, respectively  Zero register
PC       64 bits                   Program counter
SP/WSP   64/32 bits, respectively  Current stack pointer
ELR      64 bits                   Exception link register
SPSR     32 bits                   Saved processor state register

ARMv8 defines four exception levels (EL0-EL3), and each of the last three registers (SP, ELR, and SPSR) has its own copy for each of them; ELR and SPSR don't have a separate copy for EL0.

There is no register called X31 or W31; the number 31 in many instructions represents the zero register, ZR (WZR/XZR). X29 can be used as a frame pointer (which stores the original stack position), and X30 as a link register (which stores the return address of a function).

Regarding the calling convention, R0-R3 on the 32-bit ARM and X0-X7 on the 64-bit ARM are used to store argument values passed to functions, while R0-R1 and X0-X7 (and X8, also known as XR, indirectly) are used to hold return results. If the type of the returned value is too big to fit them, then space needs to be allocated and returned as a pointer. Apart from this, R12 (32-bit) and X16-X17 (64-bit) can be used as intra-procedure-call scratch registers (by so-called veneers and procedure linkage table code), and R9 (32-bit) and X18 (64-bit) can be used as platform registers (for OS-specific purposes) if needed; otherwise, they are used the same way as other temporaries.

As previously mentioned, there are several CPU modes implemented according to the official documentation, as follows:

Operating mode name  Abbreviation  Description
User                 usr           Usual program execution state, used by most of the programs
Fast interrupt       fiq           Supports data transfer or channel process
Interrupt            irq           Used for general-purpose interrupt handling
Supervisor           svc           Protected mode for the OS
Abort                abt           Is entered after a data or instruction Prefetch Abort
System               sys           Privileged user mode for the OS. Can be entered only from another privileged mode by modifying the mode bit of the CPSR
Undefined            und           Is entered when an undefined instruction is executed

Instruction sets

Two main instruction sets are available for ARM processors: ARM and Thumb. A processor that is executing ARM instructions is said to be operating in the ARM state, and one that is executing Thumb instructions is said to be in the Thumb state. ARM processors that support both states typically start in the ARM state, and then a program can switch to the Thumb state by using a BX instruction. Thumb Execution Environment (ThumbEE) was introduced later, in ARMv7, and is based on Thumb, with some changes and additions to facilitate dynamically generated code.

ARM instructions are 32 bits long (for both AArch32 and AArch64), while Thumb and ThumbEE instructions are either 16 or 32 bits long (originally, almost all Thumb instructions were 16-bit, while Thumb-2 introduced a mix of 16- and 32-bit instructions).
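Because Thumb code mixes 16-bit and 32-bit instructions, a disassembler has to decide the length from the first halfword. The following minimal Python sketch applies the Thumb-2 encoding rule that a halfword whose top five bits are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction; the function name is hypothetical.

```python
def thumb_instruction_length(first_halfword):
    """Return the length in bytes (2 or 4) of a Thumb instruction."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


print(thumb_instruction_length(0x4770))  # 16-bit encoding of BX LR
print(thumb_instruction_length(0xF000))  # first halfword of a 32-bit BL
```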

All instructions can be split into the following categories according to the official documentation:

Instruction group | Description | Examples
Branch and control | Used to follow subroutines, go forward and backward for conditional structures and loops, make instructions conditional, and switch between ARM and Thumb states | B: Branch; BX: Branch and exchange instruction set; CBZ: Compare against zero and branch; IT: If-then, makes up to four following instructions conditional (32-bit Thumb)
Data processing | Operate with GPRs, support data movement between registers and arithmetic operations | ADD: Add; MOV: Move data; MUL: Multiply
Register load and store | Move data between registers and memory | LDRB: Load register (1 byte); STRB: Store register (1 byte); SWP: Swap register and memory content
Multiple register load and store | Load or store multiple GPRs from or to memory | STM/LDM: Store and load multiple registers to and from memory; PUSH/POP: Push and pop registers to and from the stack
Status register access | Move the content of a status register (CPSR or SPSR) to or from a GPR | MRS: Move the contents of the CPSR or SPSR to a GPR; MSR: Load specified fields of the CPSR or SPSR with an immediate value or another register's value
Coprocessor | Extend the ARM architecture; enable control of the system control coprocessor registers (CP15) | CDP/CDP2: Coprocessor data operations

In order to interact with the OS, syscalls can be accessed using the Software Interrupt (SWI) instruction, which was later renamed the Supervisor Call (SVC) instruction.

See the official ARM documentation (a link is provided later) to get the exact syntax for any instruction. Here is an example of how it may look:

SVC{cond} #imm

The {cond} code in this case will be a condition code. There are several condition codes supported by ARM, as follows:

  • EQ: Equal to
  • NE: Not equal to
  • CS/HS: Carry set or unsigned higher or same
  • CC/LO: Carry clear or unsigned lower
  • MI: Negative
  • PL: Positive or zero
  • VS: Overflow
  • VC: No overflow
  • HI: Unsigned higher
  • LS: Unsigned lower or same
  • GE: Signed greater than or equal to
  • LT: Signed less than
  • GT: Signed greater than
  • LE: Signed less than or equal to
  • AL: Always (normally omitted)

An imm value stands for the immediate value.
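Each condition code is a test on the N, Z, C, and V flags of the status register. The following Python sketch shows one way to model those tests; the table and function names are hypothetical, and the flag values in the example are what a CMP of two equal operands would be expected to leave behind.

```python
# Each predicate receives the four condition flags as 0 or 1:
# n (negative), z (zero), c (carry), v (overflow).
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,   # also written HS
    "CC": lambda n, z, c, v: c == 0,   # also written LO
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}


def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with this condition suffix would execute."""
    return CONDITIONS[cond](n, z, c, v)


# Flags after comparing two equal values: N=0, Z=1, C=1, V=0.
for cond in ("EQ", "NE", "HI", "LS", "GE", "GT"):
    print(cond, condition_passed(cond, n=0, z=1, c=1, v=0))
```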

Basics of MIPS

Microprocessor without Interlocked Pipelined Stages (MIPS) was developed by MIPS Technologies (formerly MIPS Computer Systems). Similar to ARM, it started as a 32-bit architecture, with 64-bit functionality added later. Taking advantage of the RISC ISA, MIPS processors are characterized by low power consumption and heat dissipation. They can often be found in embedded systems such as routers and gateways, and several video game consoles, such as the Sony PlayStation, also incorporated them. Unfortunately, due to the popularity of this architecture, the systems implementing it became a target of multiple IoT malware families. An example can be seen in the following screenshot:

Figure 6: IoT malware targeting MIPS-based systems

As the architecture evolved, there were several versions of it, starting from MIPS I and going up to V, and then several releases of the more recent MIPS32/MIPS64. MIPS64 remains backward-compatible with MIPS32. These base architectures can be further supplemented with optional architectural extensions called Application Specific Extension (ASE) and modules to improve performance for certain tasks that are generally not used by the malicious code much. MicroMIPS32/64 are supersets of MIPS32 and MIPS64 architectures respectively, with almost the same 32-bit instruction set and additional 16-bit instructions to reduce the code size. They are used where code compression is required, and are designed for microcontrollers and other small embedded devices.

Basics

MIPS supports bi-endianness. The following registers are available:

  • 32 GPRs r0-r31: 32-bit size on MIPS32 and 64-bit size on MIPS64.
  • A special-purpose PC register that can be affected only indirectly by some instructions.
  • Two special-purpose registers to hold the results of integer multiplication and division (HI and LO). These registers and related instructions were removed from the base instruction set in Release 6 and now exist in the Digital Signal Processor (DSP) module.

The reason behind 32 GPRs is simple—MIPS uses 5 bits to specify the register, so this way, we can have a maximum of 2^5 = 32 different values. Two of the GPRs have a particular purpose, as follows:

  • Register r0 (sometimes referred to as $0 or $zero) is a constant register and always stores zero, and provides read-only access. It can be used as a /dev/null analog to discard the output of some operation, or as a fast source of a zero value.
  • r31 (also known as $ra) stores the return address during the procedure call branch/jump and link instructions.

Other registers are generally used for particular purposes, as follows:

  • r1 (also known as $at): Assembler temporary, used when resolving pseudo-instructions
  • r2-r3 (also known as $v0 and $v1): Values, hold function return values
  • r4-r7 (also known as $a0-$a3): Arguments, used to deliver function arguments
  • r8-r15 (also known as $t0-$t7, or $a4-$a7 and $t4-$t7): Temporaries; the first four can also be used to provide function arguments in the N32 and N64 calling conventions (the other calling convention, O32, uses only the r4-r7 registers; subsequent arguments are passed on the stack)
  • r16-r23 (also known as $s0-$s7): Saved temporaries—preserved across function calls
  • r24-r25 (also known as $t8-$t9): Temporaries
  • r26-r27 (also known as $k0-$k1): Generally reserved for the OS kernel
  • r28 (also known as $gp): Global pointer—points to the global area (data segment)
  • r29 (also known as $sp): Stack pointer
  • r30 (also known as $s8 or $fp): Saved value/frame pointer—stores the original stack pointer (before the function was called).
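Disassemblers differ in whether they print raw register numbers or symbolic names, so a lookup table helps when comparing listings. This is a minimal Python sketch built from the names listed above, assuming the O32 naming ($t0-$t7 for r8-r15 and $fp for r30); the table and function names are hypothetical.

```python
# Symbolic names for r0-r31, following the O32 naming described above.
O32_NAMES = (
    ["$zero", "$at", "$v0", "$v1"]
    + [f"$a{i}" for i in range(4)]   # r4-r7
    + [f"$t{i}" for i in range(8)]   # r8-r15
    + [f"$s{i}" for i in range(8)]   # r16-r23
    + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
)


def register_name(number):
    """Translate a register number (0-31) into its symbolic name."""
    return O32_NAMES[number]


print(register_name(4), register_name(29), register_name(31))
```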

MIPS also has the following co-processors available:

  • CP0: System control
  • CP1: FPU
  • CP2: Implementation-specific
  • CP3: FPU (has dedicated COP1X opcode type instructions)

The instruction set

The majority of the main instructions were introduced in MIPS I and II. MIPS III introduced 64-bit integers and addresses, and MIPS IV and V improved floating-point operations and added a new set of instructions to boost overall efficiency. Every instruction has the same length, 32 bits (4 bytes), and starts with an opcode that takes 6 bits. The three major instruction formats are R, I, and J:

Instruction category | Syntax | Description
R-type | Specifies three registers, an optional shift amount field (for shift and rotate instructions), and an optional function field (for control codes to differentiate between instructions sharing the same opcode) | Used when all the data values used are located in registers
I-type | Specifies two registers and an immediate value | Used when the instruction operates with a register and an immediate value, for example, in memory operations, where the immediate stores the offset value
J-type | Has a jump target address after the opcode that takes the remaining bits | Used to affect the control flow

For the FPU-related operations, the analogous FR and FI types exist.

Apart from this, several other less common formats exist, mainly coprocessor-related and extension-related formats.

In the documentation, registers usually have the following suffixes:

  • Source (s)
  • Target (t)
  • Destination (d)
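To make the three formats more tangible, here is a minimal Python sketch that splits a 32-bit instruction word into its fields, using the standard field layout (6-bit opcode; 5-bit rs, rt, and rd; 5-bit shift amount; 6-bit function; 16-bit immediate; 26-bit target). It is a simplification: opcode 0 is treated as R-type, opcodes 2 and 3 (J and JAL) as J-type, and everything else as I-type with a sign-extended immediate, which is not accurate for coprocessor opcodes or for logical instructions that zero-extend. The function name is hypothetical.

```python
def decode_mips(word):
    """Split a 32-bit MIPS instruction word into its format-specific fields."""
    opcode = (word >> 26) & 0x3F
    if opcode == 0:
        return {
            "format": "R",
            "rs": (word >> 21) & 0x1F,    # source register
            "rt": (word >> 16) & 0x1F,    # target register
            "rd": (word >> 11) & 0x1F,    # destination register
            "shamt": (word >> 6) & 0x1F,  # shift amount
            "funct": word & 0x3F,         # selects the actual operation
        }
    if opcode in (2, 3):  # J and JAL
        return {"format": "J", "opcode": opcode, "target": word & 0x03FFFFFF}
    imm = word & 0xFFFF
    if imm & 0x8000:      # sign-extend the 16-bit immediate
        imm -= 0x10000
    return {
        "format": "I",
        "opcode": opcode,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": imm,
    }


print(decode_mips(0x012A4020))  # add $t0, $t1, $t2
print(decode_mips(0x27BDFFE0))  # addiu $sp, $sp, -32
```

The second word is a typical function prologue instruction: rs and rt are both 29 ($sp) and the immediate is the negative stack frame size.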

All instructions can be split into the following several groups depending on the functionality type:

  • Control flow—mainly consists of conditional and unconditional jumps and branches:
    • JR: Jump register (R format)
    • BLTZ: Branch on less than zero (I format)
  • Memory access—load and store operations:
    • LB: Load byte (I format)
    • SW: Store word (I format)
  • ALU—covers various arithmetic operations:
    • ADDU: Add unsigned (R format)
    • XOR: Exclusive or (R format)
    • SLL: Shift left logical (R format)
  • OS interaction via exceptions—interacts with the OS kernel:
    • SYSCALL: System call (custom format)
    • BREAK: Breakpoint (custom format)

Floating-point instructions have similar names for the same types of operations in most cases, for example, ADD.S. Some instructions are more specific, such as Check for Equal (C.EQ.D).

As we can see here and later, the same basic groups can be applied to virtually any architecture, and the only difference will be in the implementation. Some common operations may get their own instructions to benefit from optimizations and, in this way, reduce the size of the code and improve the performance.

As the MIPS instruction set is pretty minimalistic, assemblers also provide macros called pseudo-instructions. Here are some of the most commonly used:

  • ABS: Absolute value; translates to a combination of ADDU, BGEZ, and SUB
  • BLT: Branch on less than; translates to a combination of SLT and BNE
  • BGT/BGE/BLE: Similar to BLT
  • LI/LA: Load immediate/address; translates to a combination of LUI and ORI, or to ADDIU for a 16-bit LI
  • MOVE: Moves the content of one register into another; translates to ADD/ADDIU with a zero value
  • NOP: No operation; translates to SLL with zero values
  • NOT: Logical NOT; translates to NOR
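Recognizing these expansions matters in practice, because a disassembler may show either the pseudo-instruction or the real instructions behind it. The following Python sketch shows how an assembler might expand LI for a 32-bit value; the function name and the exact selection logic are assumptions, and real assemblers can make different choices.

```python
def expand_li(reg, value):
    """Return the real instructions a 32-bit `li reg, value` could expand to."""
    value &= 0xFFFFFFFF
    signed = value - 0x100000000 if value & 0x80000000 else value
    if -0x8000 <= signed <= 0x7FFF:
        # Fits in a sign-extended 16-bit immediate: one ADDIU is enough.
        return [f"addiu {reg}, $zero, {signed}"]
    if value <= 0xFFFF:
        # Fits in a zero-extended 16-bit immediate: one ORI is enough.
        return [f"ori {reg}, $zero, 0x{value:04x}"]
    upper, lower = value >> 16, value & 0xFFFF
    lines = [f"lui {reg}, 0x{upper:04x}"]   # load the upper 16 bits
    if lower:
        lines.append(f"ori {reg}, {reg}, 0x{lower:04x}")  # merge the lower 16 bits
    return lines


for line in expand_li("$t0", 0x12345678):
    print(line)
```

Seeing a LUI immediately followed by an ORI or ADDIU on the same register is therefore a strong hint that a single 32-bit constant or address is being loaded.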

Diving deep into PowerPC

PowerPC stands for Performance Optimization With Enhanced RISC – Performance Computing, and is sometimes abbreviated as PPC. It was created in the early 1990s by the alliance of Apple, IBM, and Motorola (commonly abbreviated as AIM). It was originally intended to be used in PCs and powered Apple products, including PowerBooks and iMacs, up until 2006. The CPUs implementing it can also be found in game consoles such as the Sony PlayStation 3, Xbox 360, and Wii, in IBM servers, and in multiple embedded devices, such as car and plane controllers, and even in the famous ASIMO robot. Later, the administrative responsibilities were transferred to an open standards body, Power.org, whose members included some of the original creators, such as IBM and Freescale (which was spun off from Motorola and later acquired by NXP Semiconductors), as well as many new entities. The OpenPOWER Foundation is a newer initiative by IBM, Google, NVIDIA, Mellanox, and Tyan, which aims to facilitate collaboration in the development of this technology.

PowerPC was mainly based on IBM POWER ISA and, later, a unified Power ISA was released, which combined POWER and PowerPC into a single ISA that is now used in multiple products under a Power Architecture umbrella term.

There are plenty of IoT malware families that have payloads for this architecture.

Basics

The Power ISA is divided into several categories; each category can be found in a certain part of the specification or book. CPUs implement a set of these categories depending on their class; only the base category is an obligatory one.
Here is a list of the main categories and where they are defined in the latest release of version 2 of the standard:

  • Base: Covered in Book I (Power ISA User Instruction Set Architecture) and Book II (Power ISA Virtual Environment Architecture)
  • Server: Covered in Book III-S (Power ISA Operating Environment Architecture – Server Environment)
  • Embedded: Covered in Book III-E (Power ISA Operating Environment Architecture – Embedded Environment)

There are many more granular categories covering aspects such as floating-point operations and caching for certain instructions.

Another book, Book VLE (Power ISA Operating Environment Architecture – Variable Length Encoding (VLE) Instructions Architecture), defines alternative instructions and definitions intended to increase the density of the code by using 16-bit instructions as opposed to the more common 32-bit ones.

Power ISA version 3 consists of three books with the same names as Books I to III of the previous standard, without distinctions between environments.

The processor starts in the big-endian mode but can switch by changing a bit in the MSR (Machine State Register), so that bi-endianness is supported.
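Endianness matters as soon as raw bytes from a memory dump or a file have to be interpreted as integers. The following Python sketch, using the standard `struct` module, shows the same 32-bit value stored in both byte orders and what happens when the wrong order is assumed; the sample value is arbitrary.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())
print(little.hex())

# Reading big-endian bytes with the wrong byte order yields a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))
```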

There are many sets of registers documented in Power ISA, mainly grouped around either an associated facility or a category. Here is a basic summary of the most commonly used ones:

  • 32 GPRs for integer operations, generally used by their number only (64-bit)
  • 64 Vector Scalar Registers (VSRs) for vector operations and floating-point operations:
    • 32 Vector Registers (VRs) as part of the VSRs for vector operations (128-bit)
    • 32 FPRs as part of the VSRs for floating-point operations (64-bit)
  • Special purpose fixed-point facility registers, such as the following:
    • Fixed-point exception register (XER)—contains multiple status bits (64-bit)
  • Branch facility registers:
    • Condition Register (CR)—consists of 8 4-bit fields, CR0-CR7, involving things like control flow and comparison (32-bit)
    • Link Register (LR): provides the branch target address (64-bit)
    • Count Register (CTR): holds a loop count (64-bit)
    • Target Access Register (TAR): specifies branch target address (64-bit)
  • Timer facility registers:
    • Time Base (TB): is incremented periodically with the defined frequency (64-bit)
  • Other special purpose registers from a particular category, including the following:
    • Accumulator (ACC) (64-bit)—the Signal Processing Engine (SPE) category

Generally, functions can pass all arguments in registers for non-recursive calls; additional arguments are passed on the stack.

The instruction set

Most of the instructions are 32 bits long; only the Variable-Length Encoding (VLE) group is smaller, in order to provide a higher code density for embedded applications. All instructions are split into the following three categories:

  • Defined: All of the instructions are defined in the Power ISA books.
  • Illegal: Available for future extensions of the Power ISA. An attempt to execute them will invoke the illegal instruction error handler.
  • Reserved: Allocated to specific purposes that are outside the scope of the Power ISA. An attempt to execute them will either perform an implemented action or invoke the illegal instruction error handler if the implementation is not available.

Bits 0 to 5 always specify the opcode, and many instructions also have an extended opcode. A large number of instruction formats are supported; here are some examples:

  • I-FORM [OPCD+LI+AA+LK]
  • B-FORM [OPCD+BO+BI+BD+AA+LK]

Each instruction field has its own abbreviation and meaning; it makes sense to consult the official Power ISA document to get a full list of them and their corresponding formats. In the case of the previously mentioned I-FORM, they are as follows:

  • OPCD: Opcode
  • LI: Immediate field used to specify a 24-bit signed two's complement integer
  • AA: Absolute address bit
  • LK: Link bit affecting the link register
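The I-FORM fields listed above can be extracted with a few shifts and masks. Here is a minimal Python sketch, assuming the Power ISA convention that bit 0 is the most significant bit, so OPCD occupies the top six bits, LI the next 24 bits, and AA and LK the last two. The function name and the `cia` (current instruction address) parameter are hypothetical names for illustration.

```python
def decode_i_form(word, cia=0):
    """Decode a 32-bit I-FORM branch and compute its target address."""
    opcd = (word >> 26) & 0x3F
    li = (word >> 2) & 0xFFFFFF
    aa = (word >> 1) & 1
    lk = word & 1
    # The displacement is LI with two zero bits appended, sign-extended.
    offset = li << 2
    if offset & 0x02000000:
        offset -= 0x04000000
    # AA=1 means the displacement is an absolute address; otherwise it is
    # relative to the address of the branch instruction itself.
    target = offset if aa else cia + offset
    return {"opcd": opcd, "li": li, "aa": aa, "lk": lk, "target": target}


print(decode_i_form(0x48000010, cia=0x1000))  # b  +16
print(decode_i_form(0x4BFFFFF1, cia=0x1000))  # bl -16
```

With LK set, the branch also stores the address of the following instruction in the link register, which is how function calls are usually recognized in PowerPC listings.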

Instructions are also split into groups according to the associated facility and category, making them very similar to registers:

  • Branch instructions:
    • b/ba/bl/bla: Branch
    • bc/bca/bcl/bcla: Branch conditional
    • sc: System call
  • Fixed-point instructions:
    • lbz: Load byte and zero
    • stb: Store byte
    • addi: Add immediate
    • ori: Or immediate
  • Floating-point instructions:
    • fmr: Floating move register 
    • lfs: Load floating-point single
    • stfd: Store floating-point double
  • SPE instructions:
    • brinc: Bit-reversed increment

Covering the SuperH assembly

SuperH, often abbreviated as SH, is a RISC ISA developed by Hitachi. SuperH went through several iterations, starting from SH-1 and moving up to SH-4. The more recent SH-5 has two modes of operation, one of which is identical to the user-mode instructions of SH-4, while another, SHmedia, is quite different. Each family takes its own market niche:

  • SH-1: Home appliances
  • SH-2: Car controllers and video game consoles such as Sega Saturn
  • SH-3: Mobile applications such as car navigators
  • SH-4: Car multimedia terminals and video game consoles such as Sega Dreamcast
  • SH-5: High-end multimedia applications

Microcontrollers and CPUs implementing it are currently produced by Renesas Electronics, a joint venture of the Hitachi and Mitsubishi Semiconductor groups. As IoT malware mainly targets SH-4-based systems, we will focus on this SuperH family.

Basics

In terms of registers, SH-4 offers the following:

  • 16 general registers R0-R15 (32-bit)
  • 7 control registers (32-bit):
    • Global Base Register (GBR)
    • Status Register (SR)
    • Saved Status Register (SSR)
    • Saved Program Counter (SPC)
    • Vector Base Register (VBR)
    • Saved General Register 15 (SGR)
    • Debug Base Register (DBR) (only from the privileged mode)
  • 4 system registers (32-bit):
    • MACH/MACL: Multiply-and-accumulate registers
    • PR: Procedure register
    • PC
    • FPSCR: Floating-point status/control register
  • 32 FPU registers FR0-FR15 (also known as DR0/2/4/... or FV0/4/...) and XF0-XF15 (also known as XD0/2/4/... or XMTRX); two banks of either 16 single-precision (32-bit) or eight double-precision (64-bit) FPRs and FPUL (floating-point communication register) (32-bit)

Usually, R4-R7 are used to pass arguments to a function with the result returned in R0. R8-R13 are saved across multiple function calls. R14 serves as the frame pointer and R15 as a stack pointer.

Regarding the data formats, in SH-4, a word takes 16 bits, a long word takes 32 bits, and a quad word takes 64 bits.

Two processor modes are supported: user mode and privileged mode. SH-4 generally operates in the user mode and switches to the privileged mode in case of an exception or an interrupt.

The instruction set

The SH-4 instruction set is upward-compatible with the SH-1, SH-2, and SH-3 families. It uses 16-bit fixed-length instructions in order to reduce the program code size. Except for BF and BT, all branch instructions and the RTE (return from exception) instruction implement so-called delayed branches, where the instruction following the branch is executed before the branch destination instruction.
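Delayed branches are easy to misread in a disassembly, because the instruction placed after the branch (the delay slot) still runs. The following Python sketch is a toy model of that execution order, not an SH-4 emulator: the program format, the `bra` marker, and the instruction strings are hypothetical, and it assumes the delay slot never contains another branch.

```python
def run(program):
    """Execute a toy program and return the order in which instructions ran.

    Each entry is ("op", text) for an ordinary instruction or
    ("bra", target_index) for a delayed branch.
    """
    trace = []
    pc = 0
    while pc < len(program):
        kind, arg = program[pc]
        if kind == "bra":
            trace.append(f"bra {arg}")
            if pc + 1 < len(program):
                # The delay slot executes before control reaches the target.
                trace.append(program[pc + 1][1])
            pc = arg
        else:
            trace.append(arg)
            pc += 1
    return trace


program = [
    ("op", "mov #1, r0"),
    ("bra", 4),
    ("op", "add #2, r0"),    # delay slot: still executed
    ("op", "add #100, r0"),  # skipped by the branch
    ("op", "nop"),
]
for step in run(program):
    print(step)
```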

All instructions are split into the following categories (with some examples):

  • Fixed-point transfer instructions:
    • MOV: Move data (or particular data types specified)
    • SWAP: Swap register halves
  • Arithmetic operation instructions:
    • SUB: Subtract binary numbers
    • CMP/EQ: Compare conditionally (in this case on equal to)
  • Logic operation instructions:
    • AND: AND logical
    • XOR: Exclusive or logical
  • Shift instructions:
    • ROTL: Rotate left
    • SHLL: Shift logical left
  • Branch instructions:
    • BF: Branch if false
    • JMP: Jump (unconditional branch)
  • System control instructions:
    • LDC: Load to control register
    • STS: Store system register
  • Floating-point single-precision instructions:
    • FMOV: Floating-point move
  • Floating-point double-precision instructions:
    • FABS: Floating-point absolute value
  • Floating-point control instructions:
    • LDS: Load to FPU system register
  • Floating-point graphics acceleration instructions:
    • FIPR: Floating-point inner product

Working with SPARC

Scalable Processor Architecture (SPARC) is a RISC ISA that was originally developed by Sun Microsystems (now part of Oracle Corporation). The first implementation was used in Sun's own workstation and server systems. Later, it was licensed to multiple other manufacturers, one of them being Fujitsu. As Oracle terminated SPARC design in 2017, all future development continued with Fujitsu as the main provider of SPARC servers.

Several fully open source implementations of SPARC architecture exist. Multiple operating systems are currently supporting it, including Oracle Solaris, Linux, and BSD systems, and multiple IoT malware families have dedicated modules for it as well.

Basics

According to the Oracle SPARC Architecture documentation, a particular implementation may contain between 72 and 640 general-purpose 64-bit R registers. However, only 32 GPRs are immediately visible at any one time (31 if the hardwired zero register is not counted): 8 are global registers, R[0] to R[7] (also known as g0-g7), with the first register, g0, hardwired to 0, and 24 belong to the current register window, as follows:

  • Eight in registers in[0]-in[7] (R[24]-R[31]): For passing arguments and returning results
  • Eight local registers local[0]-local[7] (R[16]-R[23]): For retaining local variables
  • Eight out registers out[0]-out[7] (R[8]-R[15]): For passing arguments and returning results

The CALL instruction writes its own address into the out[7] (R[15]) register.

In order to pass arguments to a function, the caller places them in its out registers; when the function gets control, it accesses them through its in registers. Additional arguments can be provided through the stack. The result is placed in the function's first in register, which becomes the caller's first out register again when the function returns. The SAVE and RESTORE instructions are used in this switch to allocate a new register window and later restore the previous one, respectively.
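The overlap between the caller's out registers and the callee's in registers can be modeled in a few lines. This Python sketch is a toy model under simplifying assumptions: window overflow and underflow traps are ignored, the window pointer simply increments on SAVE (the direction differs between SPARC versions), and the stack pointer arithmetic that SAVE and RESTORE also perform is omitted. The class and method names are hypothetical.

```python
class RegisterFile:
    """Toy model of SPARC R registers with overlapping register windows."""

    def __init__(self, nwindows=8):
        self.nwindows = nwindows
        self.globals = [0] * 8
        # Each window owns 8 in and 8 local registers; its out registers
        # are physically the in registers of the next window.
        self.windowed = [0] * (nwindows * 16)
        self.cwp = 0  # current window pointer

    def _slot(self, r):
        if 8 <= r <= 15:     # out registers
            window, offset = (self.cwp + 1) % self.nwindows, r - 8
        elif 16 <= r <= 23:  # local registers
            window, offset = self.cwp, 8 + (r - 16)
        else:                # in registers, R[24]-R[31]
            window, offset = self.cwp, r - 24
        return window * 16 + offset

    def read(self, r):
        if r < 8:
            return 0 if r == 0 else self.globals[r]  # g0 always reads 0
        return self.windowed[self._slot(r)]

    def write(self, r, value):
        if r == 0:
            return  # writes to g0 are discarded
        if r < 8:
            self.globals[r] = value
        else:
            self.windowed[self._slot(r)] = value

    def save(self):
        self.cwp = (self.cwp + 1) % self.nwindows

    def restore(self):
        self.cwp = (self.cwp - 1) % self.nwindows


regs = RegisterFile()
regs.write(8, 42)            # caller puts the argument in out[0]
regs.save()                  # callee gets a new window
argument = regs.read(24)     # callee sees it in in[0]
regs.write(24, argument + 1) # callee leaves its result in in[0]
regs.restore()               # back to the caller's window
print(regs.read(8))          # caller reads the result from out[0]
```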

SPARC also has 32 single-precision FPRs (32-bit), 32 double-precision FPRs (64-bit), and 16 quad-precision FPRs (128-bit), some of which overlap.

Apart from that, there are many other registers that serve specific purposes, including the following:

  • FPRS: Contains the FPU mode and status information
  • Ancillary state registers (ASR 0, ASR 2-6, ASR 19-22, and ASR 24-28 are not reserved): Serve multiple purposes, including the following:
    • ASR 2: Condition Codes Register (CCR)
    • ASR 5: PC
    • ASR 6: FPRS
    • ASR 19: General Status Register (GSR)
  • Register-Window PR state registers (PR 9-14): Determine the state of the register windows including the following:
    • PR 9: Current Window Pointer (CWP)
    • PR 14: Window State (WSTATE)
  • Non-register-Window PR state registers (PR 0-3, PR 5-8 and PR 16): Visible only to software running in the privileged mode

32-bit SPARC uses big-endianness, while 64-bit SPARC uses big-endian instructions but can access data in any order. SPARC also uses a notion of traps that implement a transfer of control to privileged software using a dedicated table that may contain the first 8 instructions (32 for some frequently used traps) of each trap handler. The base address of the table is set by software in a Trap Base Address (TBA) register.

The instruction set

The instruction from the memory location, which is specified by the PC, is fetched and executed, and then new values are assigned to the PC and the Next Program Counter (NPC), which is a pseudo-register.

Detailed instruction formats can be found in the individual instruction descriptions.

Here are the basic categories of instructions supported with examples:

  • Memory access:
    • LDUB: Load unsigned byte
    • ST: Store
  • Arithmetic/logical/shift integers:
    • ADD: Add
    • SLL: Shift left logical
  • Control transfer:
    • BE: Branch on equal 
    • JMPL: Jump and link
    • CALL: Call and link
    • RETURN: Return from the function
  • State register access:
    • WRCCR: Write CCR
  • Floating-point operations:
    • FOR: Logical or for F registers
  • Conditional move:
    • MOVcc: Move if the condition is True for the selected condition code (cc)
  • Register window management:
    • SAVE: Save caller's window
    • FLUSHW: Flush register windows
  • Single Instruction Multiple Data (SIMD) instructions:
    • FPSUB: Partitioned integer subtraction for F registers

From assembly to high-level programming languages

Developers mostly don't write in assembly. Instead, they write in higher-level languages, such as C or C++, and the compiler converts this high-level code into a low-level representation in assembly language. In this section, we will look at different code blocks represented in the assembly.

Arithmetic statements

Now we will look at different C statements and how they are represented in the assembly. We will take Intel IA-32 as an example and the same concept applies to other assembly languages as well:

  • X = 50 (assuming 0x00010000 is the address of the X variable in memory):
mov eax, 50
mov dword ptr [00010000h],eax
  • X = Y+50 (assuming 0x00010000 represents X and 0x00020000 represents Y):
mov eax, dword ptr [00020000h]
add eax, 50
mov dword ptr [00010000h],eax
  • X = Y+ (50*2): 
mov eax, dword ptr [00020000h]
push eax ;save Y for now
mov eax, 50 ;do the multiplication first
mov ebx,2
imul ebx ;the result is in edx:eax
mov ecx, eax
pop eax ;gets back Y value
add eax,ecx
mov dword ptr [00010000h],eax
  • X = Y+ (50/2):
mov eax, dword ptr [00020000h]
push eax ;save Y for now
mov eax, 50
xor edx, edx ;div divides edx:eax, so clear the upper half first
mov ebx,2
div ebx ;the result is in eax, and the remainder is in edx
mov ecx, eax
pop eax
add eax,ecx
mov dword ptr [00010000h],eax
  • X = Y+ (50 % 2) (% represents the remainder or the modulus):
mov eax, dword ptr [00020000h]
push eax ;save Y for now
mov eax, 50
xor edx, edx ;div divides edx:eax, so clear the upper half first
mov ebx,2
div ebx ;the remainder is in edx
mov ecx, edx
pop eax
add eax,ecx
mov dword ptr [00010000h],eax

Hopefully, this explains how the compiler converts these arithmetic statements to assembly language.
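The DIV instruction deserves a closer look, because with a 32-bit operand it divides the 64-bit value held in edx:eax rather than eax alone. The following Python sketch models that behavior for unsigned division; the function name is hypothetical, and the exceptions stand in for the divide error (#DE) that the CPU raises.

```python
MASK32 = 0xFFFFFFFF


def div32(edx, eax, divisor):
    """Model unsigned `div r32`: returns (quotient for eax, remainder for edx)."""
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("divide error: division by zero")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        raise OverflowError("divide error: quotient does not fit in eax")
    return quotient, remainder


print(div32(0, 50, 2))   # edx cleared: 50 / 2
print(div32(0, 51, 2))   # quotient and remainder
print(div32(1, 50, 2))   # stale edx changes the dividend and the result
```

The last call shows why the upper half of the dividend has to be zero (or a sign extension, for the signed IDIV) when only a 32-bit value is meant to be divided.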

If conditions

Basic If statements may look like this:

  • If (X == 50) (assuming 0x00010000 represents the X variable):
mov eax, 50
cmp dword ptr [00010000h],eax
  • If (X & 00001000b) (& represents the bitwise AND operation):
mov eax, 00001000b
test dword ptr [00010000h],eax

In order to understand the branching and flow redirection, let's take a look at the following diagram to see how it's manifested in pseudocode:

Figure 7: Conditional flow redirection

To apply this branching sequence in assembly, the compiler uses a mix of conditional and unconditional jumps, as follows:

  • IF.. THEN.. ENDIF:
cmp dword ptr [00010000h],50
jnz Block_3 ; if not true
; ... some code (THEN body) ...
Block_3:
; ... some code ...
  • IF.. THEN.. ELSE.. ENDIF:
cmp dword ptr [00010000h],50
jnz Else_Block ; if not true
; ... some code (THEN body) ...
jmp Block_4 ; jump over the ELSE body
Else_Block:
; ... some code (ELSE body) ...
Block_4:
; ... some code ...
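Conditional jumps do not look at the operands of the comparison; they only test the flags that CMP leaves behind. The following Python sketch models the flags of a 32-bit `cmp a, b` (which computes a - b and discards the result) and a few common jumps; the function and table names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Return the ZF, CF, SF, and OF flags produced by a 32-bit `cmp a, b`."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": result == 0,                             # operands are equal
        "CF": a < b,                                   # unsigned borrow
        "SF": bool(result >> 31),                      # sign bit of the result
        "OF": bool(((a ^ b) & (a ^ result)) >> 31),    # signed overflow
    }


JUMPS = {
    "jz": lambda f: f["ZF"],
    "jnz": lambda f: not f["ZF"],
    "jb": lambda f: f["CF"],             # unsigned less than
    "jae": lambda f: not f["CF"],        # unsigned greater than or equal
    "jl": lambda f: f["SF"] != f["OF"],  # signed less than
    "jge": lambda f: f["SF"] == f["OF"], # signed greater than or equal
}

flags = cmp_flags(0xFFFFFFFF, 1)
print("jb taken:", JUMPS["jb"](flags))  # unsigned view: 0xFFFFFFFF vs 1
print("jl taken:", JUMPS["jl"](flags))  # signed view: -1 vs 1
```

The choice between jb/jae and jl/jge in a disassembly is therefore a useful hint about whether the original variable was unsigned or signed.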

While loop conditions

The while loop conditions are quite similar to if conditions in terms of how they are represented in assembly:

While (X == 50){

}

Loop_Start:
cmp dword ptr [00010000h],50
jnz Loop_End ; if not true
; ... loop body ...
jmp Loop_Start
Loop_End:

Do{
}While(X == 50)

Loop_Start:
; ... loop body ...
cmp dword ptr [00010000h],50
jz Loop_Start ; if true

Summary

In this chapter, we covered the essentials of computer programming and described universal elements shared between multiple CISC and RISC architectures. Then, we went through multiple assembly languages including the ones behind Intel x86, ARM, MIPS, and others, and understood their application areas, which eventually shaped the design and structure. We also covered the fundamental basics of each of them, learned the most important notions (such as the registers used and CPU modes supported), got an idea of how the instruction sets look, discovered what opcode formats are supported there, and explored what calling conventions are used.

Finally, we went from the low-level assembly languages to their high-level representations in C or other similar languages, and became familiar with a set of examples for universal blocks, such as if conditions and loops.

After reading this chapter, you should have the ability to read the disassembled code of different assembly languages and be able to understand what high-level code it could possibly represent. While not aiming to be completely comprehensive, the main goal of this chapter is to provide a strong foundation, as well as a direction that you can follow in order to deepen your knowledge before starting analysis on actual malicious code. It should be your starting point for learning how to perform static code analysis on different platforms and devices.

In Chapter 2, Basic Static and Dynamic Analysis for x86/x64, we will start analyzing the actual malware for particular platforms, and the instruction sets we have become familiar with will be used as languages describing its functionality.


Key benefits

  • Set up and model solutions, investigate malware, and prevent it from occurring in future
  • Learn core concepts of dynamic malware analysis, memory forensics, decryption, and much more
  • A practical guide to developing innovative solutions to numerous malware incidents

Description

With the ever-growing proliferation of technology, the risk of encountering malicious code or malware has also increased. Malware analysis has become one of the most trending topics in businesses in recent years due to multiple prominent ransomware attacks. Mastering Malware Analysis explains the universal patterns behind different malicious software types and how to analyze them using a variety of approaches. You will learn how to examine malware code and determine the damage it can possibly cause to your systems to ensure that it won't propagate any further. Moving forward, you will cover all aspects of malware analysis for the Windows platform in detail. Next, you will get to grips with obfuscation and anti-disassembly, anti-debugging, as well as anti-virtual machine techniques. This book will help you deal with modern cross-platform malware. Throughout the course of this book, you will explore real-world examples of static and dynamic malware analysis, unpacking and decrypting, and rootkit detection. Finally, this book will help you strengthen your defenses and prevent malware breaches for IoT devices and mobile platforms. By the end of this book, you will have learned to effectively analyze, investigate, and build innovative solutions to handle any malware incidents.

Who is this book for?

If you are an IT security administrator, forensic analyst, or malware researcher looking to secure against malicious software or investigate malicious code, this book is for you. Prior programming experience and a fair understanding of malware attacks and investigation is expected.

What you will learn

  • Explore widely used assembly languages to strengthen your reverse-engineering skills
  • Master different executable file formats, programming languages, and relevant APIs used by attackers
  • Perform static and dynamic analysis for multiple platforms and file types
  • Get to grips with handling sophisticated malware cases
  • Understand real advanced attacks, covering all stages from infiltration to hacking the system
  • Learn to bypass anti-reverse engineering techniques

Product Details

Publication date : Jun 06, 2019
Length: 562 pages
Edition : 1st
Language : English
ISBN-13 : 9781789614879

Packt Subscriptions

$12.99 billed monthly
  • Unlimited access to Packt's library of 6,500+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

$129.99 billed annually
  • Unlimited access to Packt's library of 6,500+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

$179.99 billed in 18 months
  • Unlimited access to Packt's library of 6,500+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

Frequently bought together


  • Mastering Malware Analysis: $54.99
  • Learning Malware Analysis: $54.99
  • Mastering Reverse Engineering: $48.99
Total: $158.97

Table of Contents

12 Chapters
  1. A Crash Course in CISC/RISC and Programming Basics
  2. Basic Static and Dynamic Analysis for x86/x64
  3. Unpacking, Decryption, and Deobfuscation
  4. Inspecting Process Injection and API Hooking
  5. Bypassing Anti-Reverse Engineering Techniques
  6. Understanding Kernel-Mode Rootkits
  7. Handling Exploits and Shellcode
  8. Reversing Bytecode Languages: .NET, Java, and More
  9. Scripts and Macros: Reversing, Deobfuscation, and Debugging
  10. Dissecting Linux and IoT Malware
  11. Introduction to macOS and iOS Threats
  12. Analyzing Android Malware Samples

Customer reviews

Rating distribution
4.5 out of 5 (10 Ratings)
  • 5 star: 70%
  • 4 star: 20%
  • 3 star: 0%
  • 2 star: 10%
  • 1 star: 0%

Top Reviews

Bogart, Jan 16, 2020
Rating: 5 out of 5
Beginners will find it very useful as it easy to follow. Very good reference for advanced analysts.
Amazon Verified review

sgcl, Mar 10, 2021
Rating: 5 out of 5
The book was excellent on both breadth and depth on malware analysis topics. It also provided the inspiring thinking beside techniques. The only regret was that the book did not provide many practical examples so a bit hard for beginner to follow. But the explanation on the complex concept was neat and clear, so it helped greatly on learning for all levels of the readers.
Amazon Verified review

Sydney G., Dec 10, 2019
Rating: 5 out of 5
Clean, logical, easy to read. If you have an interest in learning or MA old soul, you will enjoy this book and you will learn.
Amazon Verified review

Amazon Customer, Sep 29, 2019
Rating: 5 out of 5
The best book available for mastering Malware Analysis. It helped me a lot for strengthening my skills. A little bit complex but really helpful and recommend to all.
Amazon Verified review

Anthony Richardson, Aug 02, 2019
Rating: 5 out of 5
Currently going through book while taking the Malware Analyst’s Mindset course by Amr Thabet. Both the book and course have been informative thus far. Fills a lot of the holes left after reading books like Practical Malware Analysis and Malware analysts cookbook.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, clicking the link will download and open the PDF file directly. If you don't, save the PDF file to your machine and download Adobe Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it, we have tried to balance your need for a usable eBook with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website?

If you want to purchase a video course, eBook, or bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment can be made using Credit Card, Debit Card, or PayPal).
Where can I access support around an eBook?
  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower in price than print
  • They save resources and space
What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.