Firmware Extraction Methodology
This report on firmware extraction methodology was written by team ENVY (Kim Chan-in, Park Myung-hoon, Shin Myung-jin, Yang Gang-min, Lee Yu-kyeong), which carried out the BoB 12th "NVR Vulnerability Analysis" project.
Firmware extraction is one of the key steps before analyzing vulnerabilities in embedded devices. Without the firmware, analysis is limited to a black-box approach, which makes vulnerabilities harder to find. This report therefore presents a step-by-step methodology for extracting firmware from embedded devices.
Because the purpose of firmware extraction is white-box vulnerability analysis, extraction can be considered successful once the file system and the major binaries have been obtained.
There are two ways to obtain the file system, listed below. This report describes obtaining a Kernel Shell over UART serial communication and extracting the file system through it.
Flash Memory dump
Obtaining Kernel Shell
In an embedded device, the printed circuit board (PCB) is the board on which components such as the CPU, RAM, and flash memory are mounted and interconnected. Together they operate as a single computer and perform the core functions of the device.
Obtaining a flash memory dump or a Kernel Shell requires identifying the chip type and the debugging port, so PCB analysis must precede firmware extraction.
Embedded devices generally use four types of flash memory, and information such as the manufacturer, chip type, and serial number can be read from the chip surface.
When a SOP or SON chip like those above is used as flash memory, it is easily identified while checking the manufacturer and chip type of each chip on the PCB. However, if a System on Chip (SoC) is used on the PCB, the flash memory cannot be identified on the PCB, and a Kernel Shell must be obtained to extract the firmware.
Once the flash memory has been identified, its datasheet can be looked up using the manufacturer and chip type.
Universal Asynchronous Receiver-Transmitter (UART) is one of the simplest serial communication interfaces and is commonly used for debugging. Through it, a Bootloader Shell and a Kernel Shell can be obtained over serial communication.
The pins used by UART are as follows.
GND: 0V
VCC: 3.3V or 5V
RX: Receive data
TX: Send data
UART pins usually appear as four consecutive pins on the PCB, and TX and RX are sometimes labeled on the board.
After finding candidate groups of four consecutive pins on the PCB, the actual UART pins can be identified with a multimeter and a signal analyzer.
Once the UART candidates have been found through PCB analysis, GND, VCC, TX, and RX can be distinguished by measuring voltage and current.
A signal analyzer is needed for precise pin identification, but UART pins can be roughly identified with a light-emitting diode (LED) alone, so this report describes identification using a multimeter and an LED.
A multimeter measures voltage, current, and resistance, and is used here to identify GND among the UART pins.
Connecting pairs of pins to the positive (+) and negative (-) probes of the multimeter gives the voltages shown in the table below. Since GND is at 0V, if the multimeter reads 3.3V, the pin connected to the negative probe is GND.
The other three pins may all read 3.3V, or only one of them may read 3.3V. Since VCC is 3.3V, if only one pin reads 3.3V, that pin is likely VCC.
voltage
3.3V
3.3V or Low
3.3V or Low
-3.3V
After GND has been identified, VCC, TX, and RX can be identified using an LED.
Since VCC supplies the highest voltage and current, the pin that makes the LED glow brightest can be taken as VCC.
TX is the pin that transmits data, and this characteristic distinguishes it from RX.
While the device initializes during boot, a variety of data passes through the TX pin, so the current on the pin keeps changing and a connected LED flickers. In other words, the pin on which the LED blinks intermittently immediately after the device is booted can be taken as TX.
The above is summarized in the table below.
GND: 0V
VCC: 3.3V or 5V
TX: Pin on which the LED flashes intermittently when the device is booted
RX: The one remaining pin
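The identification rules summarized above can be sketched as a small decision procedure. This is only an illustration under stated assumptions: the pin labels (P1 to P4), the relative brightness values, and the function name are hypothetical, and GND is assumed to have been found with the multimeter beforehand.

```python
def classify_uart_pins(gnd_pin, led_readings):
    """Assign UART roles from LED observations.

    led_readings maps each non-GND pin to (relative_brightness, blinks_on_boot).
    All names and values here are hypothetical illustration data.
    """
    roles = {gnd_pin: "GND"}

    # TX: the pin whose LED blinks intermittently right after boot.
    blinking = [pin for pin, (_, blinks) in led_readings.items() if blinks]
    if len(blinking) != 1:
        raise ValueError("expected exactly one blinking pin (TX)")
    roles[blinking[0]] = "TX"

    # Of the two steady pins, the brighter one is taken as VCC, the other as RX.
    steady = {pin: level for pin, (level, blinks) in led_readings.items() if not blinks}
    if len(steady) != 2:
        raise ValueError("expected two non-blinking pins (VCC and RX)")
    vcc_pin, rx_pin = sorted(steady, key=steady.get, reverse=True)
    if steady[vcc_pin] == steady[rx_pin]:
        raise ValueError("cannot tell VCC from RX; use a signal analyzer")
    roles[vcc_pin] = "VCC"
    roles[rx_pin] = "RX"
    return roles


observed = {"P2": (1.0, False), "P3": (0.4, True), "P4": (0.2, False)}
print(classify_uart_pins("P1", observed))
```

The function raises an error when the observations are ambiguous, which reflects the point made earlier that the LED method gives only a rough identification.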
After all UART pins have been identified, wires must be attached to each pin by soldering or with connectors, and a converter such as a serial-to-USB converter is required for serial communication.
The role of each pin on the serial-to-USB converter is as follows.
GND
TX
RX
VCC
One caveat applies when connecting the device's UART pins to the serial-to-USB converter. GND and VCC connect to the pins of the same name, but because TX and RX carry transmitted and received data, the TX of the embedded device must be connected to the RX of the converter, and the RX of the embedded device to the TX of the converter.
After all pins are connected, plug the converter into the PC and check the name of the USB device in Device Manager.
Once the device name is known, serial communication is possible through PuTTY. The baud rate (speed) must be specified; it is the communication speed between the embedded device and the PC. 115200 is the most common value, and if the output is garbled during serial communication, the baud rate must be changed.
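When the correct baud rate is unknown, one simple heuristic is to capture the same boot output at several settings and keep the one that produces the most readable text. The sketch below assumes the captures have already been collected as raw bytes; the sample data and function names are hypothetical.

```python
import string

# Byte values that count as readable terminal output.
_PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Return the fraction of bytes that are printable ASCII."""
    if not data:
        return 0.0
    return sum(byte in _PRINTABLE for byte in data) / len(data)


def pick_baud_rate(captures: dict) -> int:
    """Return the baud rate whose captured bytes look most like text."""
    return max(captures, key=lambda baud: printable_ratio(captures[baud]))


# Hypothetical captures of the same boot output at two settings.
captures = {
    9600: b"\xfe\x00\x80\xf8\xe0\x00",
    115200: b"booting...\r\n",
}
print(pick_baud_rate(captures))
```

A wrong baud rate typically shows up as a stream of non-printable bytes, which is why the printable ratio works as a rough score.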
When the device is booted with the UART serial connection in place, the U-Boot log and the kernel boot log appear in the terminal.
During U-Boot execution, the following message appears in the boot log, and at that point the Bootloader Shell can be accessed by entering a specific magic key.
The magic key varies by chip manufacturer; examples include the following.
Any key
ctrl+u
shift+8
In most cases, entering the magic key is enough to access the Bootloader Shell. However, if the bootdelay environment variable is set to 0 or Secure UART is applied, a method such as a side-channel attack must be used. This method is covered in detail in 5.1.
Once in the Bootloader Shell, the help command lists the available commands, and the printenv command lists the environment variables used at boot time.
The available commands may include "nand" or "md", which are used for memory dumps, and the environment variables may include ones related to the kernel boot mode or kernel log output.
From the Bootloader Shell, a Kernel Shell can be obtained by changing the parameters in the environment variable that is passed to the kernel at boot. This variable is usually named "bootargs", but depending on the file system or U-Boot in use, it may be another variable with "args" in its name.
The form of "bootargs" is as follows. After the file system is mounted, the binary specified by the init parameter is executed first. If the init parameter is changed to "/bin/sh", "/bin/sh" runs instead of the device's services after the kernel boots, so a Kernel Shell is obtained. If the embedded device uses ramfs, "/bin/sh" should be set with the "rdinit" parameter instead of the "init" parameter.
If the Kernel Shell still does not appear after both "init" and "rdinit" have been changed to "/bin/sh", the services may be starting automatically right after the shell and making interaction impossible. In this case, "single init" or "single rdinit" can be used so that only the "/bin/sh" binary is executed when the kernel starts.
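The "bootargs" changes described above amount to a small string rewrite. The following sketch shows one way to express it, assuming a space-separated parameter list; the sample "bootargs" value and the function name are hypothetical and do not come from any particular device.

```python
def patch_bootargs(bootargs, shell="/bin/sh", ramfs=False, single=False):
    """Return bootargs with init (or rdinit for ramfs) pointing at a shell."""
    key = "rdinit" if ramfs else "init"
    # Drop any existing value for the chosen key and any earlier "single".
    kept = [
        token
        for token in bootargs.split()
        if not token.startswith(key + "=") and token != "single"
    ]
    if single:
        kept.append("single")
    kept.append(f"{key}={shell}")
    return " ".join(kept)


# Hypothetical original value, as it might be read from the environment.
original = "console=ttyS0,115200 root=/dev/mtdblock2 init=/linuxrc"
print(patch_bootargs(original))
print(patch_bootargs(original, single=True))
```

Keeping every other parameter unchanged matters because the rest of the boot process still depends on them; the patched string would then be written back with "setenv bootargs" before booting.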
If the Kernel Shell still cannot be obtained, the manufacturer may have used environment variables to suppress the output of the Kernel Shell prompt. The solution is described in more detail in 5.4.
Because a Kernel Shell lets a user obtain the file system, observe how the device operates, and analyze the major binaries, manufacturers use various methods to make the Kernel Shell difficult to acquire.
This section deals with the problems that may arise between accessing the Bootloader Shell and acquiring the Kernel Shell, and with how to bypass each of them.
Acquiring the Kernel Shell on a typical embedded device requires access to the Bootloader Shell, because the environment variable passed to the kernel at boot time must be changed. Manufacturers therefore sometimes use several methods to block access to the Bootloader Shell.
To access the Bootloader Shell from U-Boot, the magic key must be entered before the "bootdelay" countdown reaches zero. Before a product is released, however, the manufacturer may set "bootdelay" to zero or less so that the magic key cannot be entered, or apply Secure UART so that entering the magic key requires separate authentication.
A glitch attack can be used in these cases.
A glitch attack is a type of side-channel attack that induces unexpected behavior by applying electrical interference to a chip.
With a glitch attack, the Bootloader Shell can be accessed even when protections such as Secure UART are in place.
In the U-Boot source code, the autoboot_command() function is what performs the automatic boot.
autoboot_command() calls abortboot() to check whether the magic key was entered. If the key was entered and Secure UART is applied, the logic requiring authentication runs; if the key was not entered, run_command_list() is executed.
run_command_list() executes the commands specified in the "bootcmd" environment variable in order, performing tasks such as loading the kernel and loading the file system.
However, there is no error-handling logic around run_command_list(), so if an error occurs while it is running, run_command_list() and autoboot_command() return before booting completes, and main_loop() goes on to call cli_loop(), which runs the Bootloader Shell.
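The fall-through described above can be illustrated with a toy model. This is not the U-Boot source: it is a simplified Python sketch that reuses the function names from the text, models each "bootcmd" step as a callable returning success or failure, and represents a successful boot as a return value even though a real boot never returns.

```python
def run_command_list(commands):
    """Run each boot step in order; stop at the first failure."""
    for step in commands:
        if not step():
            return False
    return True


def autoboot_command(bootcmd, magic_key_pressed):
    """Return True only if every boot step succeeded."""
    if magic_key_pressed:
        return False  # autoboot aborted by the user
    return run_command_list(bootcmd)


def main_loop(bootcmd, magic_key_pressed=False):
    """Model of the top-level flow: boot, or fall through to the shell."""
    if autoboot_command(bootcmd, magic_key_pressed):
        return "kernel running"
    # No error handling: a failed boot simply ends up in the shell loop.
    return "cli_loop (Bootloader Shell)"


def load_kernel():
    return True


def decompress_kernel_corrupted():
    return False  # simulated failure


print(main_loop([load_kernel, load_kernel]))
print(main_loop([load_kernel, decompress_kernel_corrupted]))
```

In this model, pressing the magic key and a failing boot step lead to the same place, which is why forcing an error is equivalent to being allowed to interrupt the boot.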
A glitch attack can be used to cause such an error.
The normal kernel boot process on an embedded device is shown in the left figure below. When run_command_list() is executed in U-Boot, the compressed kernel image in flash memory is read into RAM, and the CPU decompresses it to run the kernel.
However, as shown in the figure on the right, if electrical interference is applied to the flash memory by glitching while the compressed kernel image is being read, abnormal data is loaded into RAM, and an error occurs when the CPU decompresses it.
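The effect of corrupted data on decompression can be reproduced in software. The sketch below is only an analogy under stated assumptions: zlib stands in for whatever compression the real kernel image uses, and flipping bits in part of the data stands in for the electrical interference.

```python
import zlib


def read_from_flash(flash: bytes, glitch: bool) -> bytes:
    """Copy the image from 'flash' to 'RAM', corrupting part of it on a glitch."""
    ram = bytearray(flash)
    if glitch:
        start, end = len(ram) // 4, len(ram) // 2
        for i in range(start, end):
            ram[i] ^= 0xFF  # simulated interference on the data line
    return bytes(ram)


def boot(flash: bytes, glitch: bool = False) -> str:
    """Decompress the image read into RAM and report what happened."""
    try:
        kernel = zlib.decompress(read_from_flash(flash, glitch))
    except zlib.error:
        return "decompression error"
    return f"kernel started ({len(kernel)} bytes)"


# Placeholder payload standing in for a kernel image.
image = zlib.compress(bytes(range(256)) * 64)
print(boot(image))
print(boot(image, glitch=True))
```

The decompressor either hits an invalid stream or fails its integrity check, so the corrupted image is rejected rather than executed.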
As described in the analysis of the kernel boot process, a glitch attack can be performed by shorting the CS (Chip Select) pin and the DO (Data Out) pin of the flash memory with a conductor at the moment the compressed kernel image is being loaded into RAM during boot.
The moment at which data is read from the flash memory can be identified by watching the boot log over the UART serial connection.
When the conductor is removed from the pins a few seconds later, a corrupted compressed kernel image has been loaded into RAM, so an error occurs while decompressing it and the Bootloader Shell is executed.
Afterwards, commands such as "help" or "printenv" can be executed.
Normally, commands such as "help" or "printenv" can be executed once the Bootloader Shell has been accessed. However, some manufacturers modify the U-Boot code so that certain commands are unavailable. When commands such as "printenv" or "md" cannot be used, it is impossible to check environment variables or dump memory.
Modifying the environment variable to obtain a Kernel Shell requires knowing the original contents of "bootargs", because the rest of the existing boot process must be preserved. If the "printenv" command does not exist, the original contents cannot be checked, and so the variable cannot be changed safely.
When the contents of an environment variable cannot be checked in this way, the "setenv" command can be used instead. "setenv" is the Bootloader Shell command that sets an environment variable, and the contents of the variable can be checked in the log output when the command is executed.
Once the environment variable has been confirmed, a Kernel Shell can be obtained by changing the "init" or "rdinit" parameter.
Beyond the case where only a limited set of commands is available, there are cases where no ordinary Bootloader Shell command works at all. Only a few specific commands, such as "debug" or "shell", can be executed, and entering them requires authentication known only to the manufacturer.
When the Bootloader Shell blocks command execution, the filter may be bypassed with command injection if the filtering itself is flawed. As mentioned in 5.2, "setenv" prints its result to the log, so if "setenv" is executable and command injection is possible, the results of other commands can be viewed by chaining them with a semicolon (;).
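The kind of filtering flaw described above can be modeled as follows. This is a hypothetical filter written for illustration, not any vendor's actual code: it validates only the first word of the input line and then hands the whole line to an interpreter that splits on semicolons. The environment contents are also hypothetical.

```python
# Hypothetical environment held by the modeled bootloader.
ENV = {"bootargs": "console=ttyS0,115200 init=/linuxrc"}


def do_setenv(args):
    ENV[args[0]] = " ".join(args[1:])
    return f"{args[0]}={ENV[args[0]]}"


def do_printenv(args):
    return "\n".join(f"{name}={value}" for name, value in ENV.items())


HANDLERS = {"setenv": do_setenv, "printenv": do_printenv}


def naive_filter(line, allowed=("setenv",)):
    """Flawed check: only the first word of the whole line is validated."""
    words = line.split()
    return bool(words) and words[0] in allowed


def execute(line):
    """Run a line if it passes the filter; commands are split on ';'."""
    if not naive_filter(line):
        return ["command blocked"]
    outputs = []
    for command in line.split(";"):
        words = command.split()
        if words and words[0] in HANDLERS:
            outputs.append(HANDLERS[words[0]](words[1:]))
    return outputs


print(execute("printenv"))
print(execute("setenv dummy 1;printenv"))
```

A direct "printenv" is blocked, while the same command appended after an allowed "setenv" runs and reveals "bootargs".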
With this bypass, the "bootargs" environment variable can be read, and a Kernel Shell can be obtained after modifying it.
Even after accessing the Bootloader Shell and setting the environment variables correctly, the Kernel Shell prompt may not appear. If no output at all is visible, including the boot log as well as the prompt, the manufacturer may have blocked log output.
Normally, many log lines are printed as the kernel and service binaries start, but if log output is blocked, nothing is visible after the kernel starts. In this case, there is very likely an environment variable that controls the kernel log output mode. It is usually a variable whose value is 0 or 1, and log output can be restored by modifying it.
However, even when an environment variable related to the kernel log output mode is found, it may not be changeable with the ordinary "setenv" command. In this case, the "-f" option can be used to force the change.
If the changed variable really does control log output, the Kernel Shell prompt becomes visible after the "bootargs" environment variable is modified.
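Since the variable controlling log output is usually one whose value is 0 or 1, a quick way to narrow down candidates is to filter the environment listing for such values. The sketch below assumes the listing is available as "name=value" lines; the sample listing, including the variable name "quiet_log", is hypothetical.

```python
def find_flag_variables(env_listing):
    """Return variables from a name=value listing whose value is 0 or 1."""
    flags = {}
    for line in env_listing.splitlines():
        name, separator, value = line.partition("=")
        if separator and value.strip() in ("0", "1"):
            flags[name.strip()] = int(value.strip())
    return flags


# Hypothetical environment listing.
listing = """bootdelay=0
bootargs=console=ttyS0,115200 init=/linuxrc
quiet_log=1"""
print(find_flag_variables(listing))
```

The result is only a candidate list: each variable still has to be toggled and tested to see whether it actually affects log output.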
Once a Kernel Shell has been obtained, extracting the file system is essential for reverse engineering the major binaries and their libraries.
There are two main ways to extract the file system.
Obtaining the file system from update firmware
Obtaining the file system from an actual device
Some manufacturers encrypt all the firmware they distribute, which makes it difficult to extract the file system from update firmware.
An actual device, however, ships with the decryption logic it needs to run its services. The file system can therefore be extracted either by analyzing that logic or by copying the file system to the analyst's storage device after decryption has finished.
This section describes a methodology for copying the file system from the actual device to a storage device after the decryption logic has run.
When mounting a USB drive on an embedded device, the kernel may block the mount for certain file systems. A method for bypassing this protection is described here.
Certain devices, such as NVRs, require a hard disk. However, when a USB device was connected as shown in the picture below, the file system could not be extracted through USB because the mount was blocked by the kernel.
Therefore, the USB drive was formatted as XFS so that the NVR recognized it as a hard disk, after which it could be mounted and the file system extracted.
In some cases the device has no port for connecting an external storage device, or the actual service code cannot be accessed by virtually mounting a hard disk. NVR devices provide the ability to mount an external NAS server in case the hard disk is physically damaged or full. The NAS is mounted using iSCSI or NFS, which lets an external file system be used as if it were local. This function is a good solution when virtually mounting a hard disk is difficult.
After setting up your own NFS server, enter the following commands in the device shell.
After the commands have run, the mount command shows whether the mount succeeded, and the files placed on the NFS server can be seen from the device.
After that, the "cp -a" command copies the file system to the NFS server while preserving all attributes, completing the extraction.
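The mount-and-copy sequence can be sketched as a helper that generates the shell commands to type on the device. These are not the report's own commands: the server address, export path, mount point, source directories, and the "-o nolock" option are assumptions chosen for illustration and would need to match the actual environment.

```python
import shlex


def build_extraction_commands(server, export, mountpoint, sources):
    """Return shell commands that mount an NFS export and copy directories to it."""
    remote = f"{server}:{export}"
    commands = [
        f"mkdir -p {shlex.quote(mountpoint)}",
        # "-o nolock" is an assumption; minimal systems often lack a lock daemon.
        f"mount -t nfs -o nolock {shlex.quote(remote)} {shlex.quote(mountpoint)}",
    ]
    for source in sources:
        # "cp -a" preserves ownership, permissions, timestamps, and symlinks.
        commands.append(f"cp -a {shlex.quote(source)} {shlex.quote(mountpoint)}/")
    commands.append("sync")
    return commands


# Hypothetical values for the analyst's NFS server and the target directories.
for command in build_extraction_commands(
    "192.168.0.10", "/srv/nfs/dump", "/mnt/nfs", ["/usr", "/etc"]
):
    print(command)
```

Copying selected directories rather than "/" avoids recursing into the mount point itself and into pseudo file systems such as /proc.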