# Malware Analysis Fundamentals

* Malware: code that is used to perform malicious actions.

#### Malware Analysis Goals

* Assess the nature of malware threats
* Determine the scope of the incident
* Eradicate malicious artifacts
* Strengthen your ability to handle malware incidents
* Stages of malware analysis techniques increase in complexity, listed here from most to least complex:
  * Manual code reversing
  * Interactive behavior analysis
  * Static properties analysis
  * Fully automated analysis

### Static Properties Analysis

* **Static properties** are also called metadata
* Entails examining the strings embedded into the file
* Overall file structure
* Header data
* Does not include running the actual file
* Automated and static analysis allow an analyst to justify allocating additional time to taking a closer look at the specimen
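
As a rough illustration of examining embedded strings without running the file, the sketch below mimics what the `strings` utility does. The function names and the sample bytes are hypothetical, and the four-character minimum is an assumed default rather than anything the tools above mandate.

```python
import re


def extract_ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def extract_utf16le_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable characters stored as UTF-16LE (common in Windows binaries)."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


# Hypothetical bytes standing in for a specimen; the file is never executed.
sample = b"\x00\x01http://example.com/c2\x00\xffk\x00e\x00r\x00n\x00e\x00l\x003\x002\x00"
ascii_hits = extract_ascii_strings(sample)
wide_hits = extract_utf16le_strings(sample)
```

In practice you would read the specimen with `open(path, "rb").read()` and review both lists for URLs, file names, and API names.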

### Interaction with other Infosec Professions

* **Input to REM staff**
  * Verbal reports
  * Suspicious files
  * File system image
  * Memory image
  * Network logs
  * Anomaly observations
* **Output from REM staff**
  * What malware does
  * How to identify it
  * Attacker's profile
  * IR recommendations
  * Reports and IOCs
  * Malware trends

#### What to include in a malware analysis report

* Determine how detailed and formal the output of your malware analysis needs to be
* If a formal report is needed:
  * **Summary of the analysis:** Key takeaways the reader should get from the report regarding the specimen's nature, origin, capabilities, and other relevant characteristics
  * **Identification:** The type of file, its name, size, and hashes, malware name, and current antivirus detection capabilities
  * **Characteristics:** The specimen's capabilities for infecting files, self-preservation, spreading, leaking data, and interacting with the attacker
  * **Dependencies:** Files and network resources related to the specimen's functionality, such as supported OS version and required initialization files, custom DLLs, executables, URLs, and scripts
  * **Behavioral and code analysis findings:** Overview of the analyst's behavioral, as well as static and dynamic code analysis observations
  * **Supporting figures:** Logs, screenshots, string excerpts, function listings, and other exhibits that support the investigator's findings
  * **Incident recommendations:** Indicators for detecting the specimen on other systems and networks, and possible eradication steps
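
The Identification section of a report can be filled in mechanically. The following is a minimal sketch, assuming the specimen's bytes are already in memory; `identify_sample` and the field names are hypothetical and not part of any template linked in these notes.

```python
import hashlib


def identify_sample(name: str, data: bytes) -> dict:
    """Collect the basic identification facts a report usually lists."""
    return {
        "name": name,
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Hypothetical usage with an empty buffer; a real run would pass the file's bytes.
facts = identify_sample("empty.bin", b"")
```

The hashes can then be searched in the repositories and reputation services listed under Open Source Malware Research.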

**Templates**

* <https://github.com/MAECProject/schemas/wiki/Malware-Capabilities>
* <https://github.com/MBCProject/mbc-markdown>
* [Malware\_Analysis\_Template.docx](https://github.com/jtaubs1/Reverse-Engineering-Malware/files/8701238/Malware_Analysis_Template.docx)

### Open Source Malware Research

* **Malware data repositories:** VirusTotal (VT), #totalhash
* **Multi-engine antivirus scanners:** VT, MetaDefender, VirSCAN, AVCaesar
* **File Reputation:** Malware Hash Registry, Hashsets, Winbindex
* **Automated sandboxes:** Any.run, CAPE, Intezer Analyze, Hybrid Analysis
* **Website investigation:** vURL, Quttera, urlscan.io
* **Other threat intelligence:** Shodan, Open Threat Exchange, RiskIQ Community Edition

#### Don't use your normal internet connection when interacting with sites outside your lab during the investigation

* Tor is a reasonable option, but adversaries can track exit nodes
* Commercial VPN services can work
* Set up your own VPN in the cloud, for example with Algo <https://github.com/trailofbits/algo>
* You could also deploy a transient lab in a public cloud
* **Note:** Check for DNS leakage, in case the adversary is watching authoritative DNS server logs for queries to malicious domains <https://dnsleaktest.com/>

#### Self Defending Malware

* Malware might include self-defending capabilities
  * Detect virtualization, monitoring, and analysis tools
  * Detect or confuse code analysis tools such as debuggers or disassemblers
* **If it detects it's being analyzed it might**
  * Terminate itself
  * Put itself to sleep
  * Interfere with analysis tools
  * Exhibit different characteristics
* **There are ways to deal with malware that detects analysis tools**
  * Could use a physical system
  * Clone a clean partition with Clonezilla, FOG, or dd
  * Store the clean copy away from the system you are infecting
  * At the end of the analysis, overwrite the infected partition with the clean copy
  * Can also restore the clean image over the network with PXE booting
  * **Instructions** <https://mariohenkel.medium.com/using-cape-sandbox-and-fog-to-analyze-malware-on-physical-machines-4dda328d4e2c>
* **The lab should include tools that can examine the sample statically and dynamically**
  * **Static Properties Analysis** - PeStudio, strings, CFF Explorer, peframe, Detect It Easy, etc
  * **Behavioral Analysis** - Process Hacker, Process Monitor, RegShot, Wireshark, fakedns, etc
  * **Code Analysis** - Ghidra, x64dbg/x32dbg, OllyDumpEx, runsc, Scylla, etc

### Static Properties Analysis

* See [brbbot-analysis.md](https://github.com/jtaubs1/Reverse-Engineering-Malware/blob/main/malware/day1/brbbot-analysis.md)
* **EXE Info Download** - <https://github.com/ExeinfoASL/ASL>
* **Remnux Static Properties Tools** - <https://docs.remnux.org/discover-the-tools/examine+static+properties>

### Behavioral Analysis

* See [brbbot-analysis.md](https://github.com/jtaubs1/Reverse-Engineering-Malware/blob/main/malware/day1/brbbot-analysis.md)
* **Process Hacker** - Replaces the built-in Task Manager; similar to Microsoft's Process Explorer
* **Process Monitor** - Records interactions of processes with the registry, file system, and other processes. <https://learn.microsoft.com/en-us/sysinternals/downloads/procmon>
* **RegShot** - Highlights changes to the file system and the registry <https://sourceforge.net/projects/regshot/>
* **ProcDOT** - Visualizes Process Monitor logs for easier analysis [https://procdot.com](https://procdot.com/)
* **Wireshark** - Sniffs the network and captures packets
* **TCPLogView** - "monitors" the open TCP connections on your system <http://www.nirsoft.net/utils/tcp_log_view.html>

#### fakedns

* It can redirect the traffic from the infected host to an additional VM you control
* Resolves all hostname queries to the IP address of your REMnux VM
* Use `nslookup example.com` to test that it is working; the answer should point to the REMnux VM
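
The same check can be scripted. This is a minimal sketch, assuming the lab host uses the system resolver; `resolves_to` is a hypothetical helper and `10.0.0.5` is a made-up address standing in for the REMnux VM.

```python
import socket


def resolves_to(hostname: str, expected_ip: str, resolver=socket.gethostbyname) -> bool:
    """Return True when hostname resolves to expected_ip using the given resolver."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # Covers socket.gaierror, raised when the name cannot be resolved at all.
        return False


# Hypothetical stand-in resolver that behaves like fakedns: every name maps to one IP.
REMNUX_IP = "10.0.0.5"  # assumed lab address, replace with your own
redirected = resolves_to("example.com", REMNUX_IP, resolver=lambda name: REMNUX_IP)
```

On the infected VM you would call `resolves_to("example.com", REMNUX_IP)` with the default resolver, so the query actually goes through fakedns.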

### Code Analysis Essentials

* **IDA** - A popular disassembler <https://www.hex-rays.com/ida-pro/>
* **WinDbg** - A powerful free Windows debugger from Microsoft <https://learn.microsoft.com/en-us/windows-hardware/drivers/download-the-wdk>
* **Cutter** - An open source code analysis toolkit installed on REMnux [https://cutter.re](https://cutter.re/)
* **Binary Ninja** - A commercial disassembler that is especially strong for automated analysis tasks [https://binary.ninja](https://binary.ninja/)
* **Hopper** - A commercial disassembler and decompiler that runs on OS X and Linux [https://hopperapp.com](https://hopperapp.com/)
* **Analysts employ several code analysis approaches when reverse engineering malicious software**
  * **Disassembling** - Translating binary machine-level instructions into human-readable assembly language code
  * **Decompiling** - Going a step further to generate an approximation of the original program's source code
  * Dynamic code-level analysis, sometimes called **debugging**, involves examining the code while running the program
  * Static code-level analysis involves using a **disassembler** or **decompiler** to examine the code without actually executing it
  * **Emulating** the execution of the code uses specialized tools to preview the key actions the specimen will take when it runs
* **Emulate the execution of a program to preview its capabilities**
  * Emulators don't replace disassemblers and debuggers, but they can
    * Provide an overview of the capabilities
    * Suggest code areas worth analyzing more, e.g. functions
  * Emulators are especially useful for examining API-level activity
  * Emulators are confused by unfamiliar instructions or API calls
  * `speakeasy` is an example of an emulator; `capa`, covered below, identifies capabilities by static analysis rather than emulation

### speakeasy

```
run_speakeasy.py -t brbbot.exe -o speakeasy.json 2> speakeasy.txt
```

* Parse .json output with `jq`

```
jq ".entry_points[].apis[].api_name" speakeasy.json | more
```

* <https://github.com/mandiant/speakeasy>
* `-o` directs `speakeasy` to save its output using the json format
* The script sends the rest of its output to stderr
* Load the output into vscode

```
code speakeasy.txt
```

* Look into interesting API calls, as they can guide later `ghidra` or `x64dbg`/`x32dbg` analysis
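
If `jq` is not at hand, the same field can be pulled out with a few lines of Python. This sketch assumes only the `entry_points[].apis[].api_name` layout implied by the `jq` filter above; the helper names and the sample report are hypothetical.

```python
import json
from collections import Counter


def load_report(path: str) -> dict:
    """Read a speakeasy JSON report from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def api_names(report: dict) -> list:
    """Mirror the jq filter .entry_points[].apis[].api_name."""
    names = []
    for entry_point in report.get("entry_points", []):
        for api in entry_point.get("apis", []):
            names.append(api.get("api_name"))
    return names


# Hypothetical report fragment; real reports come from load_report("speakeasy.json").
report = {
    "entry_points": [
        {"apis": [{"api_name": "kernel32.GetModuleHandleA"},
                  {"api_name": "wininet.InternetOpenA"}]},
        {"apis": [{"api_name": "wininet.InternetOpenA"}]},
    ]
}
names = api_names(report)
most_common = Counter(names).most_common(1)
```

Counting the names is one way to spot the calls worth a breakpoint later.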

### Capa

* <https://github.com/mandiant/capa>
* Capa does not work well on packed samples
* Capa automatically identifies the specimen's capabilities that malware analysts typically want to see
* Maps observed capabilities to ATT\&CK and MBC frameworks for additional insight
* <https://github.com/MBCProject/mbc-markdown>
* <https://attack.mitre.org/>

```
capa brbbot.exe | more
```

* Use `-vv` with capa for additional insights
  * It displays the name of the corresponding API call and the location in the code that invoked that function
  * For example, if you see `InternetReadFile` at `0x140001840`, that is a great place to set a breakpoint, as malware often uses that call for C2

### x64dbg/x32dbg

* First pay attention to the handle when a sample is loaded into the debugger

<figure><img src="https://user-images.githubusercontent.com/75596877/173685523-8b3a4424-5c01-4ad7-ac16-e7121c497be3.png" alt=""><figcaption></figcaption></figure>

* Here the current module that is being analyzed is `ntdll.dll`, which is a Windows DLL that has been imported by the malware
* This is not code written by the malware author.
* We are only interested in analyzing code that has been written by the malware author
* Select `Debug --> Run` to get to the `Main()` function of the sample.
* Confirm it is the `Entrypoint` of the malware by examining the thread, which should have changed

<figure><img src="https://user-images.githubusercontent.com/75596877/173685830-622fff95-be8f-4e97-b28b-33e4849f9ae1.png" alt=""><figcaption></figcaption></figure>

**Step Over/Into**

<figure><img src="https://user-images.githubusercontent.com/75596877/173683638-b56762a8-f22e-417d-8c86-035c84cc1425.png" alt=""><figcaption></figcaption></figure>

* Understanding `Step Over` and `Step Into`
* When you reach a `call` instruction, you have to decide whether to `step over` or `step into` it
* If you step into, you move into the called function, see its instructions execute one by one, and eventually return to where you left off
* If you step over, the function still executes, but you do not see the instructions it runs

#### Shortcuts:

```
Run - F9
Run until selection - F4
Pause - F12
Restart - Ctrl+F2
Step Into - F7
Step Over - F8
Run to user code - Alt+F9
Go to your current position - Shift + 8
```

**Commands:**

```
SetBPX ReadFile --> Sets a breakpoint on API call ReadFile
bp ReadFile --> Shortcut for above command
```

* Note that API names are case sensitive
* `RIP` on x64 is the register that contains the address of the current instruction
* `EIP` on x86 is the register that contains the address of the current instruction
* **Registers** are special locations in the CPU that are very efficient at storing small amounts of data

#### Note About Breakpoints in x64dbg/x32dbg

* When you set a BP on an API call and then `Run` or run until selection, execution pauses inside the API function itself
* We want to see the code implemented by the malware's developer, called `user code`
* To accomplish this, we let the API call finish executing and pause once execution returns to `user code`
* To do this, once you hit your BP, click `Debug --> Run to user code` or press `Alt+F9`

#### Process Injection

* Always look for `VirtualAlloc`, as this is where the malware allocates memory for the code it is about to unpack or inject.
* When you `Run` and hit the breakpoint ensure you check the `Thread` that is running at the top bar
* If the malware is not going to take a `jmp` hit the enter key to force it to `jmp`
* Example: you would want it to take the bottom `jmp`

<figure><img src="https://user-images.githubusercontent.com/75596877/173686515-e4a2eb29-d415-4005-bebb-55d620cc0d88.png" alt=""><figcaption></figcaption></figure>

* When dealing with `VirtualAlloc`, set a `bp` on its `ret` instruction (the end of the function)
* When you hit the `bp` on the `ret`, either `step into` or `step over` will work
* The goal is to get the `EIP` to point to `VirtualAlloc`
* After the memory has been allocated, check the parameters being passed (highlighted below) to see whether the unpacked malware is being stored in them

<figure><img src="https://user-images.githubusercontent.com/75596877/173687103-fccfbdc1-2de2-4c61-a5d5-1c9df1e14be2.png" alt=""><figcaption></figcaption></figure>

* Right-clicking `[esp+28]` and then selecting `Follow in Dump` displays the contents of this parameter. The `Selected Address` and `Address: ESP+28` options give the same result.

<figure><img src="https://user-images.githubusercontent.com/75596877/173687251-c15233ae-dbb4-4232-8bf2-14b8af957f9c.png" alt=""><figcaption></figcaption></figure>

* What we are now looking for in the dump is the header of an executable file.
* A Windows executable always begins with the bytes `4D 5A` in hexadecimal, which is `MZ` in ASCII.
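
Once a memory region has been saved from the dump view, the same header check can be automated. This is a minimal sketch, assuming the region is available as raw bytes; the helper names are hypothetical. It relies on two facts of the PE format: the DOS header starts with `MZ`, and the 4-byte little-endian field at offset `0x3C` (`e_lfanew`) points to the `PE\0\0` signature.

```python
import struct


def looks_like_pe(dump: bytes) -> bool:
    """Check for the MZ magic and a PE signature at the offset stored in e_lfanew."""
    if len(dump) < 0x40 or dump[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", dump, 0x3C)
    return dump[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


def find_pe_offsets(dump: bytes) -> list:
    """Return every offset in the dump where a plausible PE header begins."""
    offsets = []
    start = dump.find(b"MZ")
    while start != -1:
        if looks_like_pe(dump[start:]):
            offsets.append(start)
        start = dump.find(b"MZ", start + 1)
    return offsets


# Build a tiny fake header for illustration: MZ, e_lfanew = 0x40, then the PE signature.
header = bytearray(0x48)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x40)
header[0x40:0x44] = b"PE\x00\x00"
region = b"\x90" * 16 + bytes(header)  # hypothetical allocated region with leading filler
hits = find_pe_offsets(region)
```

Checking the `PE` signature as well as `MZ` cuts down on false positives, since the two bytes `4D 5A` can occur by chance in arbitrary data.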

### API Monitor

* Allows you to execute a program and "spy" on the API calls made
* Can also attach to a running process, but there is a chance you will miss the relevant API calls
* <http://www.rohitab.com/apimonitor>
* Will display what parameters were passed to the external function and what data was returned by the function

### CyberChef

* Free open source tool to help decode, de-obfuscate, and even decrypt the data you might encounter during an investigation
* <https://gchq.github.io/CyberChef/>
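
To give a sense of what a decoding recipe does, the sketch below chains two steps that correspond to CyberChef's `From Base64` and `XOR` operations. The key, the plaintext, and the function names are invented for illustration and do not come from any specimen discussed here.

```python
import base64


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def decode_blob(blob_b64: str, key: bytes) -> bytes:
    """Undo Base64 encoding, then remove the XOR layer."""
    return xor_bytes(base64.b64decode(blob_b64), key)


# Hypothetical obfuscated value, built here so the example is self-checking.
key = b"\x5a"
plaintext = b"http://example.com/gate.php"
obfuscated = base64.b64encode(xor_bytes(plaintext, key)).decode("ascii")
recovered = decode_blob(obfuscated, key)
```

In a real investigation the key is usually unknown at first, and trying candidate keys interactively is where CyberChef is most convenient.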

### InetSim

* InetSim can emulate many protocols from the server side of the connection
* Can emulate HTTP, HTTPS, SMTP, FTP, POP3, TFTP, IRC and more
* Can tweak the config by editing `/etc/inetsim/inetsim.conf`
* Files used by its emulators are stored in `/var/lib/inetsim`
* Saves detailed log files in `/var/log/inetsim`
* Use it in conjunction with fakedns or enable its built-in DNS emulator by editing `inetsim.conf`
* Free copy [https://inetsim.org](https://inetsim.org/)
* Post infection look at `/var/log/inetsim/service.log`
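
To illustrate what emulating the server side of a connection means, here is a deliberately tiny stand-in for one such service. It is not how InetSim is implemented; it is a sketch using Python's standard `http.server`, with a hypothetical canned body, that answers every GET or POST identically and records what was requested.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

CANNED_BODY = b"<html><body>OK</body></html>"  # hypothetical placeholder response


class CannedHandler(BaseHTTPRequestHandler):
    """Answer every GET or POST with the same body and remember the request."""

    def _reply(self):
        # Drain any request body so a POSTing client is not left waiting.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        self.server.seen.append((self.command, self.path))
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(CANNED_BODY)))
        self.end_headers()
        self.wfile.write(CANNED_BODY)

    do_GET = _reply
    do_POST = _reply

    def log_message(self, fmt, *args):
        pass  # keep stderr quiet; requests are recorded in server.seen


def start_fake_http(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the fake service in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CannedHandler)
    server.seen = []
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(host: str, port: int, path: str):
    """Issue one GET request and return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Calling `start_fake_http()` and then `fetch(*server.server_address, "/some/path")` shows the round trip; `server.seen` plays the role that `service.log` plays for InetSim.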

### Fiddler

* Can also observe the client side connection with Fiddler
* Proxy-like debugging capabilities are also useful when assessing web applications
* If you do not have web services active, you can use the AutoResponder to generate HTTP/HTTPS responses
* Fiddler's HTTPS settings must have `Capture HTTPS Connects`, `Decrypt HTTPS traffic`, and `Ignore Server certificate errors` enabled
* Access those options from `Tools --> Options`
* <https://www.telerik.com/download/fiddler>
* Can auto-generate responses to HTTP/HTTPS requests; to enable this, go to the `Autoresponder` tab and enable the checkbox called `Enable rules`
