A New Approach to Enterprise Server Firmware Security

Implementing Platform Firmware Resiliency (PFR) in hardware, using an FPGA-based root-of-trust device that follows the NIST SP 800-193 standard, gives server firmware a new level of protection against cyberattacks. Lattice’s PFR development kit enables simple and fast implementation of FPGA-based PFR solutions.

Overview

A typical enterprise server contains multiple processing components, each of which uses its own non-volatile SPI flash memory to hold its firmware (i.e. the software the processing component needs in order to start immediately after power-up). While flash memory is convenient for field upgrades and fixes, it is also vulnerable to malicious attacks. Hackers can gain unauthorized access to firmware and plant malicious code in a component’s flash memory. Such code can easily evade standard system detection methods and cannot be removed even by a software update or a hard drive replacement, so it can cause permanent damage to the system.

To address this problem, some processing components employ on-chip hardware circuitry to detect unauthorized firmware modifications. However, other processing components on the circuit board that do not use this scheme still lack effective protection, and the entire server remains vulnerable. In 2018, the National Institute of Standards and Technology (NIST) released the NIST SP 800-193 standard, which defines a standard security mechanism called Platform Firmware Resiliency (PFR). It is based on three guiding principles: protection, detection, and recovery.

The PFR function relies mainly on an external hardware “root of trust” (RoT) device. Implementing PFR with an FPGA-based RoT device is more secure, more scalable, and more reliable than using an MCU-based RoT device. The new PFR development kit from Lattice enables server OEMs to quickly add PFR capabilities to their existing designs and take advantage of this powerful security technology. System architects and system integrators can now more easily design, implement, and maintain PFR-compliant FPGA RoT devices without requiring specialized security expertise.

Server Firmware Is Vulnerable to Cyberattacks

By 2021, losses from cybercrime are expected to reach $6 trillion [1]. Hackers are constantly finding new ways to circumvent security measures, with the aim of:

• Viewing or stealing proprietary data stored on servers (credit card numbers, company intellectual property, etc.)
• Bypassing servers to view or steal data
• Hijacking servers to launch DDoS attacks against other targets
• Damaging a server by rendering one or more of its hardware components inoperable (called “bricking”)

Because operating systems and applications are regularly updated to include new features or fix bugs, they are easily the biggest targets for hackers to break into servers. As a result, an organization’s security resources and strategies will generally be geared toward protecting operating systems and applications. However, there is another, lesser-known attack vector for compromising servers: firmware.

Firmware is the first boot code executed immediately after a server component (e.g. a CPU, network controller, or on-chip RAID solution) is powered up. The component’s processor treats the firmware as a valid and reliable starting point from which to boot, and uses it to verify and load higher-level functionality in stages, depending on the server’s configuration. In some cases, the processing component uses firmware to perform its required functions throughout its runtime.

According to a 2016 survey by the Information Systems Audit and Control Association (ISACA), more than half of the respondents who said that hardware security was a top priority for their organization “reported at least one instance of malware-influenced firmware being introduced into corporate systems,” and 17% of respondents said “these incidents had a material impact” [2].

Firmware Security Status

Server firmware can be compromised at various stages of the supply chain, including:

• At the OEM: an operator maliciously implants infected firmware during production
• At the system integrator: unauthorized firmware is installed while the server is configured to customer requirements
• In transit to the customer: hackers can unpack the server, load unauthorized firmware through a wired connection, and plant malicious code in a component’s SPI memory
• During live operation: hackers can take advantage of automatic firmware updates, replacing legitimate updates with fake firmware that bypasses any existing protection mechanisms

Typical server motherboards currently use at least two standard instances of firmware: the Unified Extensible Firmware Interface (UEFI) and the Baseboard Management Controller (BMC). Although these provide some protection for the firmware, that protection is very limited.

Unified Extensible Firmware Interface (UEFI)

UEFI (formerly BIOS) is the software program that connects a server’s firmware to its operating system. UEFI is installed during production and is used to check which hardware components a server has, wake them up, and hand them over to the OS [3]. The standard detects unauthorized firmware through a process called Secure Boot, which prevents hardware components from booting if unauthorized firmware is detected [4]. However, the implementation and support of Secure Boot vary by component and vendor, which can leave gaps in component security that hackers can exploit. Additionally, if unauthorized firmware manages to bypass Secure Boot, UEFI cannot restore the component’s firmware to the last authorized version so that the component can continue to operate.

Baseboard Management Controller (BMC)

A baseboard management controller is a dedicated microcontroller (MCU) on a motherboard that communicates with system administrators through a separate connection and uses sensors to monitor “a computer, network server, or other hardware device” [5]. Many BMCs screen their own firmware installations to make sure the firmware is legitimate, but they can do nothing for the other firmware on the server: a BMC cannot prevent malicious code from attacking other firmware on the board. For example, if malicious code is planted in an unused partition of a component’s SPI memory, the BMC cannot prevent the code from entering the server’s overall code flow.

Figure 1: The Unified Extensible Firmware Interface and the baseboard management controller provide only limited firmware protection

Platform Firmware Resiliency (NIST SP 800-193 Standard)

To address the security shortcomings of current firmware standards, NIST released a new standard in May 2018 that provides comprehensive protection for all firmware, including UEFI and BMC firmware. The new standard, NIST SP 800-193, known as PFR, is designed to “provide technical guidance and recommendations to support the recovery of platform firmware and data against potentially damaging intrusions” [6]. It provides a unified method for protecting all firmware in a system and can be configured to be non-intrusive to normal system operation. Once it determines that unauthorized firmware is being installed, it halts any affected components. PFR also operates independently of any security features that the individual components may support.

The standard outlines three key principles for securing firmware:

• Protection: Ensures that a component’s firmware remains in a stable state by preventing unauthorized writes to protected areas of the component’s SPI memory and by preventing the erasure of all or part of the firmware. In some cases, even read operations on the protected area are prohibited.

• Detection: The firmware update package from the OEM can be verified before the component’s processor boots from the firmware. If corrupted or unauthorized firmware is detected, the recovery process is initiated.

• Recovery: If tampered or corrupted firmware is detected, the processor boots from the last trusted firmware version (i.e. the “golden image”), or new firmware is obtained through a trusted process to initiate a system-wide recovery.
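
The detection and recovery principles can be illustrated with a minimal Python sketch. It is not taken from the standard or from any vendor implementation. It assumes that each firmware image is accompanied by an authentication tag and that a golden image is available as a fallback. An HMAC-SHA256 tag stands in for the digital signature that a real design would verify, because Python’s standard library has no asymmetric signature support. The key and all function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key held inside the RoT. A real design would verify a
# digital signature against a public key instead of using a shared secret.
ROT_KEY = b"example-rot-key"


def sign(image: bytes) -> bytes:
    """Produce the authentication tag for a firmware image (HMAC stand-in)."""
    return hmac.new(ROT_KEY, image, hashlib.sha256).digest()


def is_authentic(image: bytes, tag: bytes) -> bool:
    """Detection: compare the image against its tag in constant time."""
    return hmac.compare_digest(sign(image), tag)


def select_boot_image(active: bytes, active_tag: bytes,
                      golden: bytes, golden_tag: bytes) -> bytes:
    """Recovery: boot the active image if it verifies, else the golden image."""
    if is_authentic(active, active_tag):
        return active
    if is_authentic(golden, golden_tag):
        return golden
    raise RuntimeError("no trusted firmware image available")
```

Calling `select_boot_image` with an active image that no longer matches its tag returns the golden image; if neither image verifies, the sketch refuses to boot at all rather than run untrusted code.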

PFR requires a hardware-based root of trust

According to this NIST standard, implementing secure PFR functionality requires a root of trust (RoT) device to perform protection, detection, and recovery operations on the server’s firmware. A NIST-compliant RoT device must also perform these operations on its own firmware before booting, without relying on any other external component.

A hardware RoT solution must have the following characteristics:

• Scalable: The RoT device must provide protection, detection, and recovery for firmware images held in external SPI memory, with nanosecond response times. This requires dedicated processing and I/O interfaces to ensure that server performance is not compromised.
• Non-bypassable: Unauthorized firmware cannot bypass the RoT device, so the server cannot boot from compromised firmware.
• Self-protecting: The RoT device must respond dynamically to a changing attack surface (all the points in a device or system that unauthorized users can access), protecting itself from external attacks.
• Self-detecting: The RoT device must be able to detect unauthorized firmware using non-bypassable cryptographic hardware blocks.
• Self-recovering: When it discovers unauthorized firmware, the RoT device must be able to switch automatically to the last golden firmware image, ensuring that the server continues to operate.

Protection scheme | Detects defective firmware before boot? | Recovers from defective firmware? | Protects all firmware during in-system updates at runtime?
UEFI embedded solution | Yes | No | No
BMC embedded hardware module | Yes | No | No
NIST PFR using root of trust | Yes | Yes | Yes

Figure 2: Current firmware standards fail to protect component firmware at all stages of operation

Figure 3: NIST SP 800-193 Standard: Platform Firmware Resiliency

As shown in Figure 3, the RoT device powers up first and cryptographically checks the firmware of all components for unauthorized modifications. If the RoT device detects any corruption, it initiates the trusted firmware recovery process. In extreme cases, if all firmware on the board is compromised, the RoT device can also perform a full system recovery (via the BMC) using trusted firmware stored in the device. After the BMC starts from the trusted firmware, it obtains known, trusted firmware from outside the system to replace the corrupted firmware versions. The RoT device then verifies all firmware again and initiates the board’s power-up procedure, during which all components on the board are powered up, forced to boot from a known good firmware image, and then begin to function normally.
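
The ordering of this boot flow (verify everything, recover what failed, verify again, then release the board) can be sketched in a few lines of Python. This is an illustrative model only: the component names, the digest manifest, and the function names are hypothetical, and for simplicity the golden image of each component is assumed to be the same release that the trusted manifest describes.

```python
import hashlib


def digest(image: bytes) -> str:
    """SHA-256 digest of a firmware image, as a hex string."""
    return hashlib.sha256(image).hexdigest()


def secure_boot(flash: dict[str, bytes],
                golden: dict[str, bytes],
                manifest: dict[str, str]) -> list[str]:
    """Verify every component's firmware before releasing the board.

    flash    -- current firmware image per component (hypothetical names)
    golden   -- trusted fallback image per component, kept by the RoT
    manifest -- expected SHA-256 digest per component, kept by the RoT
    Returns a log of the actions taken, in order.
    """
    log = []
    # Step 1: check each image while the components are still held in reset.
    for name in sorted(flash):
        if digest(flash[name]) == manifest[name]:
            log.append(f"{name}: verified")
        else:
            # Step 2: restore the trusted image over the corrupted one.
            flash[name] = golden[name]
            log.append(f"{name}: recovered from golden image")
    # Step 3: verify everything again before powering up the board.
    if any(digest(flash[name]) != manifest[name] for name in flash):
        raise RuntimeError("recovery failed; board stays in reset")
    log.append("board: power-up sequence released")
    return log
```

The second verification pass matters: the board is released only after every image, including any that were just restored, matches the manifest.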

To ensure that the SPI memory is not compromised again, the RoT actively monitors all activity between each SPI memory and its corresponding processor, and blocks the update if it detects a malicious firmware update.
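
Runtime monitoring of this kind can be modeled as a filter on SPI flash commands. The sketch below is a software illustration only; an actual RoT would perform the check in hardware, at bus speed. It assumes the opcodes commonly used by SPI NOR flash devices (0x02 page program, 0x20 sector erase, 0xD8 block erase, 0xC7 chip erase) and a hypothetical protected address range. The region table and function names are assumptions, not part of any standard or product.

```python
# Commonly used SPI NOR flash opcodes that modify memory contents.
PAGE_PROGRAM = 0x02
SECTOR_ERASE = 0x20   # erases 4 KiB
BLOCK_ERASE = 0xD8    # erases 64 KiB
CHIP_ERASE = 0xC7
ERASE_SIZE = {SECTOR_ERASE: 4 * 1024, BLOCK_ERASE: 64 * 1024}

# Hypothetical protected region holding the golden image: [start, end).
PROTECTED = [(0x000000, 0x100000)]


def touches_protected(start: int, length: int) -> bool:
    """True if [start, start + length) overlaps any protected region."""
    end = start + length
    return any(start < p_end and end > p_start for p_start, p_end in PROTECTED)


def allow(opcode: int, address: int = 0, length: int = 0) -> bool:
    """Decide whether the monitor lets an SPI command reach the flash."""
    if opcode == CHIP_ERASE:
        return False  # a whole-chip erase would wipe the protected region
    if opcode in ERASE_SIZE:
        size = ERASE_SIZE[opcode]
        base = address - address % size  # erases act on aligned blocks
        return not touches_protected(base, size)
    if opcode == PAGE_PROGRAM:
        # Treat a zero-length write as one byte; page wrap-around is ignored.
        return not touches_protected(address, max(length, 1))
    return True  # reads and status commands pass through
```

For example, `allow(PAGE_PROGRAM, 0x0FFFF0, 32)` is rejected because the write would spill into the protected range, while the same write at `0x100000` is allowed.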

Implementing NIST Compliant PFR Solutions

The challenge in implementing a root of trust on a PLD is doing so without placing too much burden on the OEM. A root-of-trust hardware solution, including a PLD-based one, must be scalable, meaning that it can protect all firmware on a server with nanosecond response times. It also needs to use an unmodifiable cryptographic module to determine whether the firmware has been tampered with. Combining PFR with full boot-sequence control of all server components makes the RoT impossible to bypass. Finally, the solution should be able to switch back automatically to the most recent golden firmware image so that the server can continue to run if the current firmware is found to be compromised.

By definition, hardware-based RoT devices must be implemented in silicon. The most commonly used chip platforms for this purpose are microcontrollers (MCUs) and field-programmable gate arrays (FPGAs). After weighing the operational characteristics of FPGAs and MCUs, we found that FPGAs are considerably better suited to PFR solutions.

Root of Trust using MCU

MCUs have traditionally been used in server hardware to build a root of trust. Put simply, part of the MCU is reserved as a trusted execution environment. This part of the MCU is physically isolated from the rest of the chip and continuously monitors the firmware to make sure it is authorized and functioning properly. Typically, PFR functionality is added to a server by adding a RoT MCU to the existing hardware architecture.

MCUs often struggle to verify multiple firmware instances in a server. An MCU cannot respond to an in-system attack on all of a server’s firmware instances without the help of an external device such as a PLD, whereas a PLD can monitor SPI memory traffic in real time and detect and respond to intrusions simultaneously.

As shown in Figure 4, the three components for implementing PFR using an MCU are:

• RoT MCU: Performs the detection, recovery, and protection functions; it is the core component for implementing the RoT.
• Protection PLD: Implements PFR at scale by monitoring the activity between all component processors and their SPI memory devices in real time to fully protect the board.
• Control PLD: Integrates all board-level power-up and reset sequencing functions, including fan control, SGPIO, I2C buffering, signal integration, and out-of-band communications, which are necessary to start the board. The RoT MCU commands the control PLD to power up the board. If recovery is required in extreme cases, the RoT MCU commands the control PLD to supply power only to the parts of the board used in the trusted recovery process.

Figure 4: If all components must be started at the same time, a PFR-compliant server that uses an MCU as its root of trust requires additional components (FPGAs) to provide the necessary performance; in large-scale server deployments, this scheme is not scalable

This MCU-based PFR scheme has many limitations. For example, the control PLD used in the circuit of Figure 4 cannot protect its own firmware, which means that this architecture is not fully NIST PFR compliant. The code in the control PLD could still be modified to disable the RoT MCU. There is also the potential for a permanent denial-of-service (PDoS) attack, which erases the contents of these PLDs and leaves the server unbootable. Security gaps in the protection and control PLDs make it difficult to prevent attacks on firmware during shipping or system integration. To meet the NIST SP 800-193 standard, the RoT MCU must implement the PFR function for both the control PLD and the protection PLD, and implementing recovery and protection functions on these devices using an MCU is very difficult. Finally, MCU-based schemes require additional system-level processes to detect attacks that attempt to bypass the entire RoT circuit.

Root of Trust using FPGA

Figure 5: The RoT FPGA solution integrates the functions of the RoT MCU, control PLD, and protection PLD on a single chip, which is reliable, scalable, and impossible to bypass. On a server with a PFR-compliant PLD, the PLD’s performance is sufficient to monitor the firmware of all components in parallel without the need for an additional FPGA

Advantages of PFR Scheme Based on Root of Trust FPGA

As the name suggests, a programmable logic device (PLD) is an integrated circuit that can be reprogrammed remotely and almost instantaneously to adapt to changing scenarios. Because a PLD can alter its circuitry at the hardware level, it can prevent unauthorized firmware from being installed as soon as that firmware is detected.

Because PLDs are designed to be reprogrammable, they have more I/O interfaces than MCUs, which allows them to run multiple functions in parallel rather than sequentially. As a result, they detect and respond to unauthorized firmware faster.

In addition, PLD designs are supported by advanced simulation software that lets engineers verify the functionality of their designs. Engineers can also use these tools to test whether their designs protect the PLD itself against various firmware attacks. Firmware updates for MCUs require more complex testing and verification than PLD updates because MCUs do not support functional verification through simulation. Instead, any update to MCU firmware must go through multiple rounds of regression testing (a trial-and-error process) to ensure that the new firmware does not adversely affect other functions in the MCU; this process is far more tedious than running PLD simulation software.

Comparing the characteristics of PLDs and MCUs shows that a PLD provides a better and more reliable platform for implementing a hardware-based root of trust; it has also become a necessary device for meeting the PFR standard.

Responding to Supply Chain Attacks: MCU vs. FPGA PFR Solutions

In the event of a firmware attack, two different types of PFR systems take the following countermeasures (in order of implementation):

Detection

RoT MCU: The RoT MCU performs sequential cryptographic checks of all SPI memory devices to detect unauthorized firmware. A control PLD that was compromised in the supply chain can bypass the RoT MCU’s detection and allow the BMC to boot from a compromised image.

RoT FPGA: The RoT FPGA performs sequential cryptographic checks of all SPI memory devices to detect unauthorized firmware. The FPGA records fault conditions in its on-chip non-volatile memory for later analysis. The RoT FPGA protects itself from supply chain attacks.

Recovery

RoT MCU: If corrupted firmware is detected, the recovery process is initiated by the protection PLD that manages the boot-source SPI memory, or by the control or protection PLD, and is monitored by the RoT MCU.

RoT FPGA: FPGA-based systems integrate this functionality into their hardware. No external communication is required between the RoT and the control PLD, which makes the solution more reliable and immune to external attacks.

Protection

RoT MCU: After the server is fully booted, the protection PLD actively monitors all SPI activity in real time to block subsequent attacks and notifies the RoT MCU when an intrusion is detected.

RoT FPGA: The resulting solution is simpler and fully NIST compliant.


PFR Development Kit Simplifies Implementation of FPGA Root-of-Trust Solutions

Lattice now offers a PFR development kit that simplifies the implementation of FPGA RoT solutions. OEMs and system integrators of server components can now quickly implement FPGA-based PFR to meet time-to-market requirements. The kit includes a software function library, associated IP, and three development boards for implementing PFR (including protection PLD functions). Users can add board control PLD functionality to RoT FPGA designs through the Lattice Diamond software tool. The Lattice PFR development kit includes the following boards:

• A RoT FPGA development board

• An ECP5 FPGA board running a Python script to emulate the server’s BMC. Developers can simulate an attack on a component’s SPI memory by executing commands through the Python script.

• A PFR adapter card that stores the BMC code in SPI memory. The PFR functionality implemented in the development board’s RoT FPGA protects the PFR adapter card’s firmware from attacks (meaning that this FPGA-based solution is NIST PFR compliant).

The Lattice kit enables users to design, implement, and maintain NIST-compliant custom PFR schemes without the need for specialized security expertise.

Figure 6: The Lattice FPGA RoT Development Kit has three development boards: the RoT FPGA development board, the ECP5 development board that emulates the server BMC, and the SPI flash board that stores the emulated BMC firmware

Summary

Cybersecurity is a critical issue for businesses and organizations involved in the digital space. Hackers today exploit corporate server firmware to gain unauthorized access to server data, or simply take down servers permanently. PFR implemented through FPGA-based RoT devices can effectively solve this problem, providing a safe, reliable, easily scalable, and complete solution to protect the firmware of server components at any point in the supply chain. The new Lattice PFR Development Kit provides an easy way to accelerate and simplify the development of RoT devices, keeping your servers safe and secure.
