Malware Analysis Series 0x1

Mahmoud Elfawair
8 min read · Mar 4, 2024


Hello You.

This is the second article in the malware analysis series, and you may wonder why it is labeled 0x1 (we programmers start counting from zero). If you haven’t read the previous article, go back and read it; it is only a 3-minute read.

Today I’m writing about Chapter 1 of the book Practical Malware Analysis, which covers Basic Static Techniques. Let’s dive in.

Basic Static Techniques

Basic static analysis is the first step in analyzing malware. It means examining the code or structure of a program, without running it, to determine its function.

The topics we are going to discuss today are:

  1. Using antivirus tools to confirm maliciousness
  2. Using hashes to identify malware
  3. Gleaning information from a file’s strings, functions, and headers

Each technique has its own strengths, and each one provides different information.

Using Antivirus Tools To Confirm Maliciousness

It’s a good idea to start with VirusTotal, a website that lets us upload a file for scanning by multiple antivirus engines. You may find that some of them have already identified it.

But you have to know that antiviruses are not perfect. They have two methods to determine whether a file is malicious: the first relies on a database of hashes or indicators of known suspicious files (signatures), and the second relies on behavioral and pattern-matching analysis (heuristics).

The problem with the first method is that malware authors can easily modify their code, thereby changing their program’s signature and evading virus scanners.

The problem with the second method is that new and unique malware may still be able to bypass it (though it is much better than the first method).
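
To make the weakness of the first method concrete, here is a minimal sketch of a hash-based signature check. The sample bytes, the known_bad_hashes set, and the is_known_bad helper are all made up for illustration; real antivirus engines are far more elaborate than a single set lookup.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of some bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical bytes standing in for a known malicious file.
original = b"MZ\x90\x00 pretend this is a malicious program"

# Hypothetical signature database: hashes of files already known to be bad.
known_bad_hashes = {md5_hex(original)}


def is_known_bad(data: bytes) -> bool:
    """Signature check: is this exact file already in the database?"""
    return md5_hex(data) in known_bad_hashes


# The author appends a single byte. The program could behave the same,
# but its hash no longer matches anything in the database.
modified = original + b"\x00"

print(is_known_bad(original))  # True
print(is_known_bad(modified))  # False
```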

Using Hashes To Identify Malware

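For those who don’t know what a hash is: a hash function is a one-way mathematical function that turns any input into a fixed-size value. Running it on a file produces a hash value, a sort of fingerprint for that malware. In practice two different files will almost never share a hash (collisions are possible, and MD5 in particular has known collision attacks, but it still works fine as a label for a sample). Remember, a hash is irreversible: you cannot recover the file from it.

There are many hashing algorithms, but we will work with Message-Digest Algorithm 5 (MD5). Let’s see how to do it on Windows.

I used the Get-FileHash cmdlet. Note that it defaults to SHA256, so pass -Algorithm MD5 to get an MD5 hash.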

Note: sometimes we are dealing with sensitive data, and the malware itself may be that sensitive data. In that case we don’t want to upload the file to anyone else, especially not to an online service like VirusTotal, so we compute the hash and search for the hash instead of uploading the file.
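
If you are not on Windows, or just prefer a script, the same fingerprint can be computed with Python’s standard hashlib module. This is a small sketch rather than a replacement for Get-FileHash; the file_hashes name and the chunk size are my own choices.

```python
import hashlib
import sys


def file_hashes(path: str, chunk_size: int = 65536) -> dict:
    """Compute the MD5 and SHA-256 of a file without loading it all into memory."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read the file in fixed-size chunks until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


if __name__ == "__main__":
    # Usage: python hashes.py <file> [<file> ...]
    for path in sys.argv[1:]:
        print(path, file_hashes(path))
```

The resulting hex string is what you would paste into a search box instead of uploading the sample.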

Gleaning Information From A File’s Strings

Before we do that, let’s understand what strings are. They are simply sequences of characters in a program: a URL, data used by the program, an IP address, and so on.

To get the strings of a file you need to download the strings binary first. You can get it from https://learn.microsoft.com/en-us/sysinternals/downloads/strings , and I really suggest you download the whole Sysinternals suite because we will use some of its tools later on.

The strings command searches for characters that are typically stored in either ASCII or Unicode format. The main difference is that ASCII uses one byte per character, while the Unicode format used on Windows takes two bytes per character. Both end with a NULL terminator, a zero value (one zero byte for ASCII, two for Unicode) that marks the end of the string.

[Figures: ASCII representation; Unicode representation]

Now let’s run the strings command on one of the files we have.

[Screenshot: strings output; there is more than shown]

strings searches for a sequence of three or more ASCII or Unicode characters followed by a string termination character.

As you can see, not all the strings are useful, but we can easily pick out the useful ones: UPX, which we will talk about later, a possible URL, and MalService, which hints that this malware might install or run something as a Windows service.
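
To see roughly what strings does under the hood, here is a simplified sketch in Python. It looks for runs of three or more printable characters, stored either one byte per character (ASCII) or two bytes per character (UTF-16LE, the usual Windows "Unicode"). It is an approximation: unlike the description above, it does not require a terminator after the run, and the extract_strings name is my own.

```python
import re


def extract_strings(data: bytes, min_len: int = 3) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # One printable byte per character.
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # One printable byte followed by a zero byte per character.
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


# Made-up bytes: one ASCII string, one wide string, and a run that is too short.
sample = (
    b"\x00\x01UPX0\x00\xff"
    + "MalService".encode("utf-16-le")
    + b"\x00\x00ab\x00"
)
print(extract_strings(sample))  # ['UPX0', 'MalService']
```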

Packed & Obfuscated Malware

Malware writers may use packing or obfuscation to make their files harder to detect or analyze statically. Packing is the process of compressing the program, so that you can’t see or understand the purpose of the code by inspecting the file.

Note: legitimate programs almost always include many strings. Malware that is packed or obfuscated contains very few strings.

Packed and obfuscated code will often include at least the functions LoadLibrary and GetProcAddress, which are used to gain access to additional functions.

When the packed program is run, a small wrapper program also runs to decompress the packed file and then run the unpacked file.

When a packed program is analyzed statically, only the small wrapper program can be dissected.
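
A toy experiment shows why packed files have so few strings. The sketch below uses zlib as a stand-in for a packer (real packers such as UPX work differently and also add the wrapper stub) and counts printable runs before and after compression. The "program" bytes are invented for the demo.

```python
import re
import zlib

# A run of three or more printable ASCII bytes counts as a string.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{3,}")


def count_strings(data: bytes) -> int:
    return len(PRINTABLE_RUN.findall(data))


# Made-up program body: the same four strings repeated, separated by NULL bytes.
strings_in_program = [
    b"http://example.com/update",
    b"CreateProcessA",
    b"URLDownloadToFileA",
    b"MalService",
]
program = b"\x00".join(strings_in_program * 20)

# "Pack" it. zlib here only illustrates the effect of compression.
packed = zlib.compress(program)

print("strings before packing:", count_strings(program))
print("strings after packing: ", count_strings(packed))

# The wrapper's job at run time: restore the original bytes.
assert zlib.decompress(packed) == program
```

The strings are still there once the data is decompressed, which is why static analysis of the packed file only sees the wrapper.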

Detecting Packers With PEiD

The book’s author uses PEiD to detect packers. I don’t really recommend it, since it is very old and has a lot of bugs, but because the book uses it I’m going to do the same here. Say you have installed the program and loaded the file you are analyzing; this is how it would look:

Personally, I’ll use the unpac.me website. It will unpack the file if it is packed and give you a rich report about the file itself.

Portable Executable File Format

Like ELF files on Linux, Windows has the PE file format. The file format can reveal a lot about the program’s functionality.

The Portable Executable (PE) file format is used by Windows executables, object code, and DLLs (Dynamic Link Libraries). It is a data structure that holds the information the Windows OS loader needs to manage the wrapped executable code.
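
To get a feel for that data structure, here is a small sketch that reads the first few header fields by hand. It relies on the documented PE layout: the file starts with "MZ", the 4-byte value at offset 0x3C points to the "PE\0\0" signature, and the COFF file header follows it. The read_pe_header name and the tiny MACHINE_NAMES table are my own; dedicated tools show far more than this.

```python
import struct
import sys
from datetime import datetime, timezone

# Only two common machine types; anything else is shown as a hex number.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def read_pe_header(data: bytes) -> dict:
    """Read machine type, section count, and compile timestamp from a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, section_count, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": section_count,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }


if __name__ == "__main__":
    # Usage: python peheader.py <file> [<file> ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, read_pe_header(f.read()))
```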

Linked Libraries & Functions

One of the most useful pieces of information we can gather about an executable is the list of functions it imports. Imports are functions used by one program that are actually stored in a different program, such as code libraries that contain functionality common to many programs. Code libraries are connected to the main executable by linking.

Imports are linked into programs by programmers to avoid having to re-implement particular functionality. It is possible to connect code libraries dynamically, at runtime, or statically.

Static, Runtime, and Dynamic Linking

Static linking is popular on UNIX-like OSs. The idea is that all the libraries used by the binary are copied into the binary itself. This makes the binary bigger and makes it harder to tell the user’s code apart from the library code.

Runtime linking is uncommon in friendly programs but popular in packed or obfuscated binaries. The idea is that the program connects to a library only when a function from it is needed.

With dynamic linking, the OS loads the needed libraries when the program starts, and when the program calls a linked function, that function executes within the library.

Functions

Several Microsoft Windows functions allow programmers to import linked functions not listed in a program’s file header. The most commonly used are LoadLibrary, GetProcAddress, LdrGetProcAddress, and LdrLoadDll. These functions let the program access any function in any library on the system, which makes it very hard to understand what is going on using static analysis alone.
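
A loose Python analogy may help show why this hurts static analysis. It is only an analogy, not the Windows API: importlib.import_module plays the role of LoadLibrary and getattr plays the role of GetProcAddress. Because the names are built at run time, nothing in the source says outright which function ends up being called.

```python
import importlib


def resolve_at_runtime(library_name: str, function_name: str):
    """Look up a function by name while the program is running."""
    library = importlib.import_module(library_name)  # analogous to LoadLibrary
    return getattr(library, function_name)           # analogous to GetProcAddress


# The names are assembled from pieces, so a plain text search of the source
# for "math" or "sqrt" finds nothing.
lib_name = "ma" + "th"
func_name = "sq" + "rt"

sqrt = resolve_at_runtime(lib_name, func_name)
print(sqrt(9.0))  # 3.0
```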

Knowing which functions a program uses allows us to make an educated guess about what type of malware it is and what it is doing.

For example, if a program imports the function URLDownloadToFile, you might guess that it is downloading something from the internet. If it uses CreateProcessA, you may guess that it is creating another process, so you should watch out for the launch of additional programs.
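
That kind of educated guess can be written down as a small lookup table. The sketch below only contains the functions mentioned in this article, and the IMPORT_HINTS table and guess_behavior helper are illustrative names of my own. A hint is a starting point for analysis, not proof of behavior.

```python
# Hints taken from the examples in this article; far from a complete list.
IMPORT_HINTS = {
    "URLDownloadToFile": "may download something from the internet",
    "CreateProcess": "may launch another process",
    "LoadLibrary": "may load libraries at run time (common in packed code)",
    "GetProcAddress": "may resolve functions at run time (common in packed code)",
}


def guess_behavior(imports: list) -> list:
    """Return a hint for every imported function we know something about."""
    hints = []
    for name in imports:
        for known, hint in IMPORT_HINTS.items():
            # startswith also matches suffixed variants such as CreateProcessA.
            if name.startswith(known):
                hints.append(f"{name}: {hint}")
    return hints


for line in guess_behavior(["CreateProcessA", "URLDownloadToFileA", "Sleep"]):
    print(line)
```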

Function Naming Conventions

You might see functions with an Ex suffix. When Microsoft updates a function and the new version is incompatible with the old one, the old function stays supported and the new one gets the same name with an Ex suffix added.

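Other functions carry an A or W suffix, as in CreateDirectoryW. These appear on functions that take strings as parameters: the A version accepts ASCII (ANSI) strings and the W version accepts wide-character strings. Note that these letters do not appear in the function’s name in the documentation, so when searching for it remember to drop the trailing A or W.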
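
If you look up a lot of imports, dropping the suffix can be automated. This is a rough heuristic with a made-up helper name: it removes a trailing A or W only when the letter before it is lowercase, which handles the examples in this article but could misfire on a function whose real name happens to end that way.

```python
import re


def documented_name(function_name: str) -> str:
    """Strip a trailing A/W suffix to get the name used in the documentation."""
    return re.sub(r"(?<=[a-z])[AW]$", "", function_name)


for name in ["CreateDirectoryW", "CreateProcessA", "GetProcAddress"]:
    print(name, "->", documented_name(name))
```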

Common DLLs

  1. Kernel32.dll — This is a very common DLL that contains core functionality, such as access and manipulation of memory, files, and hardware.
  2. Advapi32.dll — This DLL provides access to advanced core Windows components such as the Service Manager and Registry.
  3. User32.dll — This DLL contains all the user-interface components, such as buttons, scroll bars, and components for controlling and responding to user actions.
  4. Gdi32.dll — This DLL contains functions for displaying and manipulating graphics.
  5. Ntdll.dll — This DLL is the interface to the Windows kernel. Executables rarely import it directly, although it is always imported indirectly by Kernel32.dll. If an executable does import this file, it means the author wanted to use functionality not normally available to Windows programs. Some tasks, such as hiding functionality or manipulating processes, use this interface.
  6. WSock32.dll and Ws2_32.dll — These are networking DLLs. A program that accesses either of these most likely connects to a network or performs network-related tasks.
  7. Wininet.dll — This DLL contains higher-level networking functions that implement protocols such as FTP and HTTP.
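
The list above can double as a quick triage table. Here is a sketch, with the descriptions shortened from this article and a hypothetical describe_dlls helper, that annotates whatever DLL names you pulled out of a sample’s imports. DLL names are compared case-insensitively.

```python
# Short summaries of the DLLs listed above, keyed by lowercase name.
DLL_HINTS = {
    "kernel32.dll": "core functionality: memory, files, hardware",
    "advapi32.dll": "Service Manager and Registry access",
    "user32.dll": "user-interface components",
    "gdi32.dll": "graphics display and manipulation",
    "ntdll.dll": "interface to the Windows kernel (rarely imported directly)",
    "wsock32.dll": "networking",
    "ws2_32.dll": "networking",
    "wininet.dll": "higher-level networking (e.g. FTP, HTTP)",
}


def describe_dlls(imported_dlls: list) -> dict:
    """Map each imported DLL name to a one-line hint, if we have one."""
    return {
        dll: DLL_HINTS.get(dll.lower(), "unknown, look it up")
        for dll in imported_dlls
    }


for dll, hint in describe_dlls(["KERNEL32.dll", "WININET.dll", "foo.dll"]).items():
    print(f"{dll}: {hint}")
```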

PE File Sections

  • .text : This section contains the executable code.
  • .rdata : This section holds read-only data that can be accessed globally within the program.
  • .data : Global data that is accessed throughout the program is stored in this section.
  • .idata : This section, sometimes present, stores import function information. If absent, import function information is stored in the .rdata section.
  • .edata : Sometimes present, this section stores export function information. If not found, export function information is stored in the .rdata section.
  • .pdata : Present only in 64-bit executables, this section stores exception-handling information.
  • .rsrc : This section stores resources needed by the executable.
  • .reloc : Information for relocating library files is contained in this section.
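
To list the sections of a real file yourself, you can walk the section table by hand. The sketch below assumes the standard PE layout (a 20-byte COFF header after the "PE\0\0" signature, then the optional header, then one 40-byte entry per section) and skips validation to stay short. The list_sections name is my own.

```python
import struct
import sys


def list_sections(data: bytes) -> list:
    """Return (name, virtual size, raw size) for every section in a PE file."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    coff = pe_offset + 4  # skip the 4-byte "PE\0\0" signature
    (section_count,) = struct.unpack_from("<H", data, coff + 2)
    (optional_size,) = struct.unpack_from("<H", data, coff + 16)
    # The section table starts right after the optional header.
    table = coff + 20 + optional_size

    sections = []
    for i in range(section_count):
        entry = table + i * 40  # each section header is 40 bytes
        raw_name, virtual_size, _address, raw_size = struct.unpack_from(
            "<8sIII", data, entry
        )
        name = raw_name.rstrip(b"\x00").decode("ascii", errors="replace")
        sections.append((name, virtual_size, raw_size))
    return sections


if __name__ == "__main__":
    # Usage: python sections.py <file> [<file> ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for name, virtual_size, raw_size in list_sections(f.read()):
                print(f"{path}: {name:8} virtual={virtual_size:#x} raw={raw_size:#x}")
```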

Other Tools

  • PEview — a tool used to inspect and analyze the structure and properties of Portable Executable (PE) files. We can read the program’s compile time from it, though that value is not reliable: the malware author can easily change it, and some compilers, such as Delphi, always write the same fixed time.
  • Resource Hacker — allows us to examine the data in the .rsrc section. Data in this section can be icons, images, other binaries, or even drivers.

I will show how to use the tools I talked about today when I release the solutions to the labs.

and yeah that’s it, thanks for reading. cya
