### Characterization of COTS FPGA in MEO Earth orbit onboard TRISAT-R mission

## Klemen Bravhar<sup>(1)</sup>, dr. Iztok Kramberger<sup>(2)</sup>, dr. David Selčan<sup>(3)</sup>

<sup>(1)</sup> University of Maribor, FERI, Koroška cesta 46, 2000 Maribor, Slovenia, +386 (2) 22 07 204, klemen.bravhar@um.si

<sup>(2)</sup> University of Maribor, FERI, Koroška cesta 46, 2000 Maribor, Slovenia, +386 (2) 22 07 178, iztok.kramberger@um.si

<sup>(3)</sup> Skylabs d.o.o., Zagrebška cesta 104, 2000 Maribor, Slovenia, david.selcan@skylabs.si

### ABSTRACT

The space sector uses many types of Field Programmable Gate Array (FPGA), and the most widely used FPGA technology is based on Static Random Access Memory (SRAM). Tests in nuclear laboratories show that SRAM cells are poorly suited to harsh radiation environments, since an energetic radiation particle can change the state of a cell. Furthermore, the same particles can cause a Single Event Transient (SET) in combinatorial or sequential logic. Two mitigation techniques protect such circuits: Triple Modular Redundancy (TMR), which protects logic by triplication, and a memory scrubber, whose task is to detect and correct corrupted SRAM cells. The aim of this research is to evaluate a Xilinx Artix-7 FPGA in Medium Earth Orbit (MEO) at an altitude of 6000 km as part of the TRISAT-R satellite mission. The research includes four experiments: three use a simple snake architecture (negators and D-Flip-Flops connected in series) with different levels of TMR protection, while the fourth tests a Radio Frequency (RF) application. The onboard control unit monitors the number of errors in the configuration memory and in the snake's data path in the Artix-7. The results will be compared with results obtained from radiation tests in nuclear laboratories. Furthermore, we will correlate the data obtained from the satellite with simulations for the given orbit.

### **1 INTRODUCTION**

Reconfigurable FPGA integrated circuits are becoming an essential component of electronic devices due to their flexibility, high density of reconfigurable resources, and ability to change functionality "on the fly". The most popular FPGAs for terrestrial hardware applications store their configuration in SRAM-based configuration RAM (CRAM); however, their reliability is questionable in a harsh radiation environment, since a high-energy charged particle may cause a bitflip in CRAM or in the logic.

A bitflip is a soft error [1] known as a Single Event Upset (SEU), and it is non-destructive. An SEU may alter the functionality of the onboard system. SEUs take several forms in semiconductor devices. If an SEU affects only one memory cell, the event is called a Single Bit Upset (SBU) [2]; if it changes the state of more than one memory cell, the event is called a Multiple Cell Upset (MCU) [3], also referred to as a Multiple Bit Upset (MBU). A particle strike does not only affect configuration memory; it can also affect the functionality of a device by generating a glitch on a data or clock path, which is called an SET event. A D-Flip-Flop (DFF) can capture such a glitch and store corrupted information. The TMR [4] mitigation technique protects digital designs against unpredictable SET events. TMR triplicates combinatorial and sequential components and inserts voters. A voter is a combinatorial digital circuit that detects and corrects ("on the fly") an error caused by an SEU event. To some extent, TMR can also mask a change of state [5] in CRAM. Since CRAM is not immune to SEU events, some FPGAs integrate a protection mechanism called a memory scrubber. A memory scrubber reads the state of CRAM and, in case of an SBU or MBU, detects the event and restores the corrupted CRAM cells to their original state.
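The two mechanisms described above can be illustrated with a minimal Python sketch. It is a behavioral model written for this explanation, not the Artix-7 hardware or the Xilinx SEM implementation. `majority_vote` models a bitwise two-out-of-three voter, and `scrub` models a scrubber that restores corrupted words. Comparing against a stored golden copy is a simplifying assumption, and both function names are hypothetical.

```python
from typing import List


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def scrub(cram: List[int], golden: List[int]) -> int:
    """Restore every word that differs from the golden copy.

    Assumption: a golden copy is available for comparison. This stands in
    for whatever detection scheme a real scrubber uses.
    Returns the number of repaired words.
    """
    repaired = 0
    for index, (word, reference) in enumerate(zip(cram, golden)):
        if word != reference:
            cram[index] = reference
            repaired += 1
    return repaired


# One of three redundant copies is upset; the voter still returns 0b1010.
voted = majority_vote(0b1010, 0b1010, 0b0010)

# A single bitflip in the second configuration word is found and repaired.
golden_image = [0x11, 0x22, 0x33]
cram_image = list(golden_image)
cram_image[1] ^= 0x04
repairs = scrub(cram_image, golden_image)
```

The voter masks an error without removing it from the upset copy, while the scrubber removes the error from memory. This is why the two techniques are usually combined.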

Before a spacecraft is launched into Earth orbit, FPGA designers must estimate the error rate and the risk that errors affect other functionality. Two of the best ways to estimate an error rate are simulating Register Transfer Level (RTL) designs with software tools and testing FPGAs directly in nuclear laboratories. A considerable body of work evaluates FPGAs in nuclear laboratories with radiation particles such as heavy ions, neutrons, high-energy electrons, and protons.

In this paper, we focus on the Xilinx Artix-7 (XC7A100T) SRAM FPGA, manufactured in 28-nm bulk CMOS technology, which is a Commercial-Off-The-Shelf (COTS) component. COTS FPGAs do not have any radiation protection, but some integrate IP cores, such as the Soft Error Mitigation (SEM) core by Xilinx, which help detect and correct bitflips (up to two) in CRAM. The Artix-7 has gone through many nuclear tests, which have characterized the device well; however, it has never been tested in MEO at an altitude of 6000 km. The proposed experiments are part of the TRISAT-R mission.

The Artix-7 will host four experiments in Medium Earth Orbit (MEO) as part of the TRISAT-R (TRISAT – Radiation) mission. Three experiments test the Artix-7 with the Snake Architecture RTL design, each with a different mitigation technique, while the fourth experiment tests an RF application, which involves the Artix-7 and an RF Agile Transceiver.

# 2 RELATED WORKS

The 28-nm bulk CMOS technology is the manufacturing technology of interest, since the Artix-7 FPGA uses it and has been extensively tested in nuclear laboratories. The FPGA fabric includes digital components such as Look-Up Tables (LUTs), DFFs, and Block RAM (BRAM). RTL test designs help evaluate the FPGA fabric in hazardous environments full of charged radiation particles.

A simple counter design [2] was used to test LUT and DFF components with the associated CRAM under neutron irradiation (flux between $1.43 \times 10^6$ and $2.35 \times 10^6$ cm<sup>-2</sup> s<sup>-1</sup> and fluence between $0.43 \times 10^9$ and $3.37 \times 10^9$ cm<sup>-2</sup>). The test also included BRAM. The results show that the highest number of bitflips occurred in CRAM, while DFFs and BRAM had up to 3 bitflips per test.

A satellite in Earth orbit is bombarded with charged particles from all directions, either because of its own rotation or because the particles come from unknown sources in outer space. The study in [6] tested the Artix-7 in the laboratory with low Linear Energy Transfer (LET) heavy ions and showed that a particle's angle of incidence influences the behavior of an FPGA. The reported charts show that the angle of incidence does not affect the SEU cross-section, while the MBU cross-section worsens as the angle of incidence increases (this applies to both CRAM and BRAM).

Digital logic must be protected; otherwise, radiation particles interfere with the implemented design and change its functionality. The most commonly used mitigation methods in FPGAs are TMR and CRAM scrubbing. For example, a Kintex-7 FPGA [7] was exposed to a neutron fluence between $1.08 \times 10^{11}$ and $2.83 \times 10^{11}$ cm<sup>-2</sup>. The unmitigated design had a twenty times higher failure rate than the design with TMR and a CRAM scrubber. Furthermore, the design mitigated with TMR and a memory scrubber improved its cross-section by a factor of fifty-three compared to the unmitigated design.
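The cross-section figures quoted above relate observed events to the particle fluence. As a hedged illustration, the sketch below uses the common definition of a cross-section as the number of observed events divided by the fluence (in cm²). The event counts are invented purely to show the arithmetic and are not data from [7]. Only the fluence range and the factor of fifty-three come from the text, and the function names are hypothetical.

```python
def cross_section_cm2(events: int, fluence_per_cm2: float) -> float:
    """Cross-section in cm^2: observed events divided by particle fluence."""
    if fluence_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return events / fluence_per_cm2


def improvement_factor(sigma_unmitigated: float, sigma_mitigated: float) -> float:
    """How many times smaller the mitigated cross-section is."""
    return sigma_unmitigated / sigma_mitigated


# Hypothetical numbers: a fluence inside the quoted range and invented
# event counts chosen only so that the ratio matches the quoted factor.
FLUENCE = 2.0e11  # cm^-2
sigma_plain = cross_section_cm2(1060, FLUENCE)  # invented event count
sigma_tmr = cross_section_cm2(20, FLUENCE)      # invented event count
print(f"improvement: {improvement_factor(sigma_plain, sigma_tmr):.1f}x")
```

Because both designs are compared at the same fluence in this sketch, the improvement factor reduces to the ratio of the event counts.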

In the past, several space missions [8], [9] have analyzed FPGA immunity in the radiation environment of Low Earth Orbit (LEO). For example, the CFESat satellite tested nine Xilinx Virtex FPGAs at an altitude of 560 km. The onboard FPGAs ran four different experiments, and on average the monitoring circuits of the experiments, CRAM, and BRAM detected 0.39 SEU events per day.

# **3 TRISAT-R MISSION**

In this paper, we evaluate and test the Artix-7 in Earth orbit. Our experiment will be launched into MEO at an altitude of 6000 km as a payload of the TRISAT-R (TRISAT – Radiation monitor) satellite mission. TRISAT-R is a 3U CubeSat, and its objective is to analyze the MEO radiation environment with four scientific radiation monitoring payloads and to perform in-orbit demonstration (IOD) of several other technologies. The mission is a collaboration between three European institutions (European Space Agency, CERN, and University of Maribor) and one industrial partner (SkyLabs). On top of the project's direct scientific and technical contributions, the mission also has a substantial educational outreach component. Several undergraduate and postgraduate students will actively participate in developing the TRISAT-R satellite environment.



Figure 1. Picture of assembled TRISAT-R satellite

## 3.1 ATTM board

The ATTM board is a miniaturized satellite communication subsystem. It supports a full-duplex satellite Software Defined Radio (SDR) communication link and allows an in-orbit programmable SDR architecture. The device that supports the programmable SDR architecture is an SRAM-based Xilinx Artix-7 FPGA, which also serves as the testbed for our four experiments. In addition, the board has embedded intelligence provided by the PicoSkyFT microcontroller, which is implemented on a radiation-tolerant Microchip ProASIC3 FPGA. The microcontroller monitors error events in the Artix-7 caused by Single Event Effects (SEE). The Artix-7 can experience two types of soft errors (bitflips in a snake's data path or in CRAM) and one hard error (latch-up), which has destructive consequences for the FPGA. The PicoSkyFT's task is to detect a soft error by reading the status of the Soft Error Mitigation (SEM) core [10] or the error registers that record bitflips on a data path. The Latch-up Current Limiter (LCL) protects the Artix-7 against the hard error. If any error occurs, the PicoSkyFT always reprograms the Artix-7 with the currently active FPGA image from the NAND FLASH memory.

The NAND FLASH memory is part of the ATTM board and has sixteen partitions for FPGA images. All sixteen partitions are occupied: the FPGA image of each of the four experiments is stored in four copies, so each image takes up four partitions. The four copies provide redundancy in case one copy gets corrupted during the flight. Three experiments (the Snake Architecture RTL) use the Artix-7 alone as the test device, while the fourth uses the Artix-7 together with the AD9364 RF Agile Transceiver as its testbed.



Figure 2. Block schematic of the ATTM board. The board includes a Xilinx Artix-7 FPGA as the testbed, a Microchip ProASIC3 FPGA implementing the PicoSkyFT microcontroller, NAND FLASH holding the FPGA images of the four experiments, and the AD9364 RF Agile Transceiver
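The partition scheme described in this section (sixteen partitions, four experiments, four copies each) can be sketched as follows. The text does not say how partitions are ordered or how the firmware decides that a copy is corrupted, so the contiguous layout, the CRC-32 check, and all names below are assumptions made for illustration only.

```python
import zlib
from typing import Dict, List, Optional

COPIES_PER_IMAGE = 4
# Hypothetical labels for the four experiments.
EXPERIMENTS = ("snake_unprotected", "snake_partial_tmr",
               "snake_full_tmr", "rf_application")


def partitions_for(experiment_index: int) -> List[int]:
    """Partitions holding the copies of one image (assumed contiguous)."""
    start = experiment_index * COPIES_PER_IMAGE
    return list(range(start, start + COPIES_PER_IMAGE))


def first_intact_copy(flash: Dict[int, bytes], experiment_index: int,
                      expected_crc: int) -> Optional[int]:
    """Return the first partition whose CRC-32 matches, or None.

    Assumption: integrity is checked with CRC-32 against a known value.
    """
    for partition in partitions_for(experiment_index):
        if zlib.crc32(flash[partition]) == expected_crc:
            return partition
    return None


# Simulated flash: every partition holds the same placeholder image,
# and the first copy of experiment 1 has been corrupted.
image = b"placeholder-bitstream"
flash = {partition: image for partition in range(16)}
flash[4] = b"corrupted-bitstream!!"
selected = first_intact_copy(flash, 1, zlib.crc32(image))
```

With four experiments and four copies per image, `partitions_for` covers exactly the sixteen partitions, which matches the statement that all partitions are occupied.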

## 3.2 Getting data from TRISAT-R satellite

TRISAT-R has a highly constrained onboard power budget, so the experiments will be enabled only for a limited time. During run time, the control logic of the experiments will log the number of errors that occur and store the logs in the satellite's main memory. When TRISAT-R passes over the ground station, the main communication computer will transmit the stored logs to Earth.

## **4 ONBOARD EXPERIMENTS**

The ATTM board carries four scientific experiments. Three snake data path experiments target the COTS Xilinx Artix-7 FPGA, while the fourth experiment evaluates an RF application in MEO and involves the Artix-7 and the AD9364 RF Agile Transceiver.

#### 4.1 Snake Architecture RTL experiments

The Artix-7 FPGA hosts all three Snake Architecture RTL (hereafter Snake) experiments. The controlling device is the ProASIC3 FPGA with the PicoSkyFT, which controls the behavior of the image uploaded to the Artix-7. At the abstract level, the Snake architecture consists of an SPI Controller, Register, Chain Control, SNAKE, and SEM. The SPI Controller acts as a slave device in the Artix-7 (the master on the SPI bus is the PicoSkyFT) and connects the Register with the PicoSkyFT's firmware. The Register is an array of memory cells that holds information about the snake's status and controls when the chain is active. Among the many status registers, the PicoSkyFT constantly monitors two: the error registers for the SEM and for the snake. The SEM and Chain Control blocks are directly connected to the Register.

Soft Error Mitigation (SEM) is an IP core provided by Xilinx for the Artix-7. The SEM IP core monitors CRAM and acts when an SEU causes a bitflip in CRAM: it automatically detects the bitflip, and it can also correct the error and classify the SEU event. Since the research topic is the location of an error caused by an SEU, the SEM is used only to detect bitflips in CRAM, and all its other functions are disabled. In addition, the PicoSkyFT constantly communicates with the SEM through a UART interface. The Chain Control block generates a pseudo-random bitstream, shifts it into the snake's data path, and monitors whether an SEU has caused a bitflip on that path. If the sequence of bits received from the snake does not correspond to the transmitted bitstream, the error detector registers an anomaly and increments the error counter. The PicoSkyFT can access the error counter through the registers at any time.



Figure 3. Abstract presentation of the testbed architecture implemented in the Xilinx Artix-7 FPGA. The architecture has five main blocks: SPI Controller, REGISTER, Snake Controller, SEM, and the test environment SNAKE.
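The role of the chain control logic can be illustrated with a small behavioral simulation in Python. This is a sketch under stated assumptions, not the RTL of the experiment: the text does not specify the bit generator, so a 16-bit linear feedback shift register is assumed, and the names `lfsr_bits` and `run_chain` as well as the fault injection interface are hypothetical. Each chain link is modeled as a negator followed by a DFF, and the detector compares the chain output with the delayed transmitted bit, inverted once per link.

```python
from collections import deque
from typing import Dict, Iterator, Optional


def lfsr_bits(seed: int = 0xACE1) -> Iterator[int]:
    """Yield bits from a 16-bit Fibonacci LFSR (assumed generator)."""
    state = seed & 0xFFFF
    while True:
        yield state & 1
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)


def run_chain(n_links: int, n_cycles: int,
              upsets: Optional[Dict[int, int]] = None) -> int:
    """Shift bits through n_links negator+DFF stages and count mismatches.

    upsets maps a clock cycle to the index of the DFF flipped in that cycle.
    """
    upsets = upsets or {}
    source = lfsr_bits()
    dffs = [0] * n_links
    sent = deque()  # transmitted bits that have not reached the output yet
    errors = 0
    for cycle in range(n_cycles):
        bit_in = next(source)
        sent.append(bit_in)
        # Clock edge: every DFF captures the negated output of its predecessor.
        previous = [bit_in] + dffs[:-1]
        dffs = [1 - value for value in previous]
        if cycle in upsets:
            dffs[upsets[cycle]] ^= 1  # injected single event upset
        if cycle >= n_links - 1:      # chain is filled with valid data
            expected = sent.popleft() ^ (n_links & 1)
            if dffs[-1] != expected:
                errors += 1
    return errors


print(run_chain(8, 200))           # fault-free run
print(run_chain(8, 200, {50: 3}))  # one upset in the fourth DFF
```

The experiments use three thousand chain links. A short chain is used here only to keep the example fast, and the same call works with `n_links=3000`.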

#### 4.1.1 Snake Architecture RTL

Each snake experiment is a chain of series-connected chain links, with a different mitigation technique applied at the RTL level. A chain link has a negator at the input and a DFF at the output. All three proposed snake architectures have the same data path length of three thousand chain links. One experiment does not include any radiation mitigation method, while the other two use Triple Modular Redundancy (TMR) with voters. TMR triplicates selected or all RTL components and inserts voters. All three experiments use a snake architecture; however, each uses its own chain link. The following descriptions detail each chain link's RTL and the applied mitigation technique:

1. The Non-protection RTL chain link does not have any radiation protection or mitigation technique. The RTL has a negator at the input, and the chain link's output is equal to the output of the DFF.



Figure 4. Simple snake non-protected RTL chain link

2. The Partial-protection RTL chain link applies TMR at the level of the DFF: the DFFs are triplicated. The chain link has only one negator at the input, and the negator's output drives the inputs of all three DFFs. The chain link's output is not taken directly from the DFFs' outputs; instead, a voter sits between the DFFs' outputs and the chain link's output.



Figure 5. Partial protection of the chain link's RTL. TMR is applied at the level of the DFF, and an error in the DFFs is eliminated by the voter at the chain link's output.

3. The Full-protection RTL chain link has an additional level of protection compared to the Partial-protection RTL chain link: TMR applies to all components in the chain link. The third experiment's RTL design has triplicated negators, DFFs, and voters. Each negator's output connects to the corresponding DFF's input, and each DFF's output connects to every voter in the chain link, as Figure 6 shows. The chain link's outputs have the same states as the voters' outputs.



Figure 6. Full protection of the chain link's RTL. TMR is applied to all components, which means that negators, DFFs, and voters are triplicated.
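The difference between the three chain links can be made concrete with a one-clock-cycle behavioral model in Python. This is an illustrative sketch, not the experiment's RTL. The function names and the fault parameters (`dff_upset`, `negator_glitch`) are hypothetical, and a glitch on a negator is assumed to coincide with the clock edge so that the DFF captures it.

```python
from typing import Optional, Tuple


def vote(a: int, b: int, c: int) -> int:
    """Two-out-of-three majority voter for single bits."""
    return (a & b) | (a & c) | (b & c)


def unprotected_link(bit_in: int, dff_upset: bool = False) -> int:
    """Negator followed by one DFF; any upset reaches the output."""
    return (1 - bit_in) ^ int(dff_upset)


def partial_tmr_link(bit_in: int, dff_upset: Optional[int] = None,
                     negator_glitch: bool = False) -> int:
    """One shared negator, three DFFs, one voter."""
    d = (1 - bit_in) ^ int(negator_glitch)  # glitch reaches all three DFFs
    q = [d, d, d]
    if dff_upset is not None:
        q[dff_upset] ^= 1
    return vote(*q)


def full_tmr_link(bits_in: Tuple[int, int, int],
                  dff_upset: Optional[int] = None,
                  negator_glitch: Optional[int] = None) -> Tuple[int, int, int]:
    """Three negators, three DFFs, three voters that each see all DFFs."""
    d = [1 - bit for bit in bits_in]
    if negator_glitch is not None:
        d[negator_glitch] ^= 1              # glitch stays in one domain
    q = list(d)
    if dff_upset is not None:
        q[dff_upset] ^= 1
    voted = vote(*q)
    return (voted, voted, voted)


# Input 1 should produce output 0 in every variant.
print(unprotected_link(1, dff_upset=True))       # upset is visible
print(partial_tmr_link(1, dff_upset=2))          # upset is masked
print(partial_tmr_link(1, negator_glitch=True))  # shared negator is a weak point
print(full_tmr_link((1, 1, 1), negator_glitch=0))
```

In this model, the partially protected link masks a single DFF upset but not a captured glitch on its shared negator, whereas the fully protected link masks a single fault in either component.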

#### 4.2 **RF Application**

Our fourth experiment tests the onboard RF application integrated in the ATTM board. The experiment involves the Artix-7 FPGA and the AD9364 RF Agile Transceiver. It uses an SDR design that has proved itself on the first TRISAT satellite mission in LEO at an altitude of 400 km. However, TRISAT-R targets MEO at an altitude of 6000 km, where the environment has a higher population of charged radiation particles, with higher energies, than at 400 km. The test scenario for the SDR design uses a loopback on the AD9364, where the device's RF output loops back to its RF input. The controlling component is the PicoSkyFT, which sends data packets to the testbed and receives data packets from it. If the data in a received packet does not match the data in the transmitted packet, the PicoSkyFT first checks the Artix-7's CRAM. If the bitstream in CRAM differs from the uploaded bitstream, the PicoSkyFT attributes the error in the received packet to a change in CRAM. Otherwise, the PicoSkyFT classifies the error as a change in the data path.

## 5 FIRMWARE FLOW FOR EXPERIMENT

### 5.1 PicoSkyFT's actions in case of an error in the Snake design

The PicoSkyFT is a microcontroller that runs on the ProASIC3 FPGA, and its firmware is written in C. The firmware's task is to monitor the status of the Artix-7 FPGA and to receive commands from a user via the CAN bus. To control the experiments, a user can choose between two types of command messages:

- Start or stop the experiment
- Select the active experiment by choosing an FPGA image located in the NAND FLASH memory

Figure 7 presents the actions of the PicoSkyFT's firmware when an SEU has generated an error in the Artix-7 FPGA. In the IDLE state, the firmware waits for a command from a user; when it receives the START command, it moves to the next state, PROGRAM\_FPGA, in which it loads a new FPGA image into the FPGA's CRAM. When the FPGA confirms that the image was stored correctly in CRAM, the firmware changes to the READ\_FPGA\_STATUS state, and the logic in the Artix-7 starts shifting pseudo-random bits through the Snake's data path. Meanwhile, the microcontroller reads the error register and the SEM's status. The SEM IP core's task is to detect bitflips in configuration memory, and it replies to the firmware with one of two answers: the configuration memory has stayed intact, or a bitflip has occurred in one of the CRAM cells. If either readback reports an error, the firmware performs a final read of the error register and the SEM's status and jumps to the LOG state. In the LOG state, the firmware classifies errors into three groups:

- If the error register's value is greater than 0 and the SEM reports that CRAM has stayed intact, the firmware classifies the error as an error in the snake's data path
- If the SEM reports a bitflip in a CRAM cell and the error register's value is equal to 0, the firmware logs the error as an error in CRAM
- If both the SEM and the error register report an error, the firmware saves the event as an error in both the snake's data path and CRAM

The classified errors are then sent to the control unit of the TRISAT-R satellite. The firmware then jumps back to the PROGRAM\_FPGA state and repeats the described steps until a user sends a new command (either uploading a new FPGA image or the STOP command).
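The flow described in this subsection can be summarized as a small state machine. The firmware itself is written in C, and the Python model below is only a sketch of the described states and classification rules. The enum and function names are hypothetical, and two behaviors that the text does not specify are assumptions: retrying PROGRAM\_FPGA when programming is not confirmed, and returning to IDLE on a STOP command.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    IDLE = auto()
    PROGRAM_FPGA = auto()
    READ_FPGA_STATUS = auto()
    LOG = auto()


class ErrorClass(Enum):
    DATA_PATH = auto()
    CRAM = auto()
    DATA_PATH_AND_CRAM = auto()


def classify(error_register: int, sem_bitflip: bool) -> Optional[ErrorClass]:
    """Apply the three classification rules; None means no error."""
    if error_register > 0 and sem_bitflip:
        return ErrorClass.DATA_PATH_AND_CRAM
    if error_register > 0:
        return ErrorClass.DATA_PATH
    if sem_bitflip:
        return ErrorClass.CRAM
    return None


def next_state(state: State, command: Optional[str] = None,
               programmed_ok: bool = False, error_register: int = 0,
               sem_bitflip: bool = False) -> State:
    """Return the state that follows `state` for the given inputs."""
    if command == "STOP":
        return State.IDLE  # assumption: STOP always returns to IDLE
    if state is State.IDLE:
        return State.PROGRAM_FPGA if command == "START" else State.IDLE
    if state is State.PROGRAM_FPGA:
        # Assumption: programming is retried until the FPGA confirms it.
        return State.READ_FPGA_STATUS if programmed_ok else State.PROGRAM_FPGA
    if state is State.READ_FPGA_STATUS:
        has_error = classify(error_register, sem_bitflip) is not None
        return State.LOG if has_error else State.READ_FPGA_STATUS
    return State.PROGRAM_FPGA  # after LOG the FPGA is reprogrammed


state = next_state(State.IDLE, command="START")
state = next_state(state, programmed_ok=True)
state = next_state(state, error_register=2, sem_bitflip=False)
print(state, classify(2, False))
```

Keeping the classification separate from the transitions mirrors the text, where the LOG state applies the three rules after the final readback.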



Figure 7. The PicoSkyFT's firmware flow when the ATTM board tests one of the three snake experiments

### 5.2 PicoSkyFT's actions in case of an error in the RF Application

The currently active experiment determines the firmware's flow. The snake experiments share the same firmware flow, while the RF application has its own flow, since its error detection is done in firmware.

The firmware flow follows the steps shown in Figure 8. When the PicoSkyFT's firmware receives a user command to activate the RF experiment, it sets itself to RF experiment mode and enters the IDLE state, where it waits for the START command. When the START command is received, it jumps to the PROGRAM\_FPGA state, where it programs the Artix-7 with the RF application FPGA image. The next state is SEND\_PACKET, in which the firmware starts sending a data packet. When the last character of the data packet has been transmitted, the firmware goes to the RECEIVE\_PACKET state. There it waits for and reads data from the Artix-7 and compares the received data with the transmitted data. If the content of the received packet is equal to that of the transmitted packet, the firmware jumps back to the SEND\_PACKET state. Otherwise, the firmware jumps to the READ\_SEM state. This state reads the status of the SEM, which replies with one of the two statuses described in the previous section (PicoSkyFT's actions in case of an error in the Snake design). When the firmware gets the reply from the SEM, it jumps to the LOG state, where it logs the errors and sends them to the central unit of the TRISAT-R satellite. The LOG state classifies errors into the following two groups:

- If the transmitted and received data packets have different content and the SEM reports no changes in CRAM, the error is classified as an error in the data path
- If the transmitted and received data packets have different content and the SEM reports an error in CRAM, the error is classified as an error in CRAM

After the LOG state, the firmware jumps back to the PROGRAM\_FPGA state and repeats the described steps.
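One pass through the RF flow can be sketched as a single Python function that records the states it visits. The firmware is written in C, so this is only a model of the described behavior. The function name `rf_pass` and its two callback parameters, which stand in for the loopback through the testbed and for the SEM status read, are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def rf_pass(payload: bytes,
            loopback: Callable[[bytes], bytes],
            sem_reports_bitflip: Callable[[], bool]
            ) -> Tuple[List[str], Optional[str]]:
    """Send one packet, compare the echo, and classify a mismatch.

    Returns the visited states and the error class (None if the packet
    came back unchanged).
    """
    trace = ["SEND_PACKET", "RECEIVE_PACKET"]
    received = loopback(payload)
    if received == payload:
        # No error: the firmware goes straight back to sending.
        return trace + ["SEND_PACKET"], None
    trace.append("READ_SEM")
    error = "CRAM" if sem_reports_bitflip() else "DATA_PATH"
    # The error is logged, then the FPGA is reprogrammed.
    trace += ["LOG", "PROGRAM_FPGA"]
    return trace, error


def flip_first_bit(packet: bytes) -> bytes:
    """Hypothetical fault model: corrupt one bit of the echoed packet."""
    return bytes([packet[0] ^ 0x01]) + packet[1:]


clean = rf_pass(b"TRISAT-R", lambda packet: packet, lambda: False)
faulty = rf_pass(b"TRISAT-R", flip_first_bit, lambda: True)
print(clean)
print(faulty)
```

Unlike the snake experiments, the SEM is consulted only after a packet mismatch, which is why this flow has two error classes instead of three.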



Figure 8. The PicoSkyFT's firmware flow when the ATTM board tests the RF application

## 7 IMPLEMENTATIONS

We implemented the experiments using the Xilinx Vivado development tool. The research includes two types of testbeds: the RF experiment and the snake architecture RTL experiments. The RF experiment was synthesized and implemented without any constraints, while the snake architecture was divided into two parts with the following constraints applied:

- The first part covers communication and control and includes the SPI, Register, Chain Control, and SEM blocks. These blocks have full TMR protection on combinatorial and sequential components. Constraints restrict the SPI, Register, Chain Control, and SEM blocks to CLBs in the upper two banks
- The second part contains one of the proposed chains (Non-protection, Partial-protection, or Full-protection snake), and every component (negators, D-flip-flops, and voters) in the design has its unique address location in the FPGA floorplan. We used a Python script to generate the unique addresses. The script places components in rows 0 to 149 of the Artix-7 floorplan
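A placement script of the kind mentioned above might look like the following minimal sketch. It is not the authors' script. It assumes that placement is fixed with Vivado XDC `LOC` constraints on slice sites, that one cell is placed per slice, and that the chosen column range exists on the device. The cell names and the column range are hypothetical; only the row range 0 to 149 comes from the text.

```python
from typing import Iterable, List


def placement_constraints(cell_names: Iterable[str],
                          columns: Iterable[int] = range(0, 20),
                          rows: Iterable[int] = range(0, 150)) -> List[str]:
    """Give every cell its own slice site and emit XDC LOC constraints.

    Sites are handed out row by row. A real script would also have to
    check that each site exists on the target device (assumption here).
    """
    column_list = list(columns)
    sites = ((x, y) for y in rows for x in column_list)
    lines = []
    for name in cell_names:
        try:
            x, y = next(sites)
        except StopIteration:
            raise ValueError("not enough slice sites for all cells") from None
        lines.append(
            f"set_property LOC SLICE_X{x}Y{y} [get_cells {{{name}}}]"
        )
    return lines


# Hypothetical hierarchy names for the first two chain links.
cells = [f"snake/link{i}/{part}" for i in range(2) for part in ("neg", "dff")]
for line in placement_constraints(cells):
    print(line)
```

Generating the constraints from a script keeps the placement reproducible across the three snake variants, so that differences in error counts are less likely to stem from differences in floorplanning.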



Figure 9. Occupation of the Artix-7 FPGA by all three snake implementations. (A) shows the floorplan when the non-protected Snake architecture RTL is implemented, (B) the floorplan for the partially protected snake architecture, and (C) the floorplan for the fully protected snake architecture

# 9 CONCLUSIONS

We believe that the experiments described here will give us a characterization of the Artix-7 FPGA in MEO. The Artix-7 has not been developed for space applications; however, it has an integrated SEM IP core that mitigates SEU events in CRAM. Sometimes, a particle strike generates a glitch on the data or clock path, resulting in a bitflip in combinatorial or sequential logic. The TMR mitigation method addresses this issue by triplicating combinatorial or sequential logic. TMR is evaluated in the three proposed Snake architecture RTL experiments: one experiment has no TMR protection, the second has triplicated DFFs, and the third has all components triplicated, including voters. The outcome of the experiments will give us a good overview of which of the implemented TMR mitigation methods provides sufficient protection relative to its cost when an Artix-7-like device is placed in MEO. From the experiments, we want to evaluate how much a change in CRAM due to an SEU affects the functionality of the design implemented in the FPGA and whether TMR can mask a change in CRAM. The final experiment, the RF application, will test the radiation immunity of the SDR application implemented on the Artix-7 and the AD9364 RF Agile Transceiver. We will compare the data obtained from TRISAT-R with data from simulation environments. To simulate the radiation hardness of our RTL designs, we will use the VERI-Place [11] software tool, and to simulate SET events in the Artix-7, we will use data obtained from SPENVIS [12].

## 8 **REFERENCES**

- [1] F. Smith and S. Gaffoor, "An amalgamation of hardening methods for single event upset mitigation in memory elements," in *Proceedings of the European Conference on Radiation and its Effects on Components and Systems, RADECS*, Oct. 2017, vol. 2016-September, pp. 1–4. doi: 10.1109/RADECS.2016.8093150.
- [2] J. C. Fabero *et al.*, "Single Event Upsets under 14-MeV Neutrons in a 28-nm SRAM-Based FPGA in Static Mode," *IEEE Transactions on Nuclear Science*, vol. 67, no. 7, pp. 1461–1469, Jul. 2020, doi: 10.1109/TNS.2020.2977874.
- [3] D. G. Mavis, P. H. Eaton, M. D. Sibley, R. C. Lacoe, E. J. Smith, and K. A. Avery, "Multiple bit upsets and error mitigation in ultra-deep submicron SRAMS," in *IEEE Transactions on Nuclear Science*, Dec. 2008, vol. 55, no. 6, pp. 3288–3294. doi: 10.1109/TNS.2008.2006893.
- [4] L. Sterpone and L. Boragno, "Analysis of Radiation-induced Cross Domain Errors in TMR Architectures on SRAM-based FPGAs," in *IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS)*, 2017.
- [5] R. Glein *et al.*, "Reliability of space-grade vs. COTS SRAM-based FPGA in N-modular redundancy," in *2015 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2015*, 2015. doi: 10.1109/AHS.2015.7231159.
- [6] J. Tonfat *et al.*, "Analyzing the influence of the angles of incidence on SEU and MBU events induced by low LET heavy ions in a 28-nm SRAM-based FPGA," in *Proceedings of the European Conference on Radiation and its Effects on Components and Systems, RADECS*, Oct. 2017, vol. 2016-September, pp. 1–6. doi: 10.1109/RADECS.2016.8093186.
- [7] A. M. Keller, T. A. Whiting, K. B. Sawyer, and M. J. Wirthlin, "Dynamic SEU Sensitivity of Designs on Two 28-nm SRAM-Based FPGA Architectures," *IEEE Transactions on Nuclear Science*, vol. 65, no. 1, pp. 280–287, Jan. 2018, doi: 10.1109/TNS.2017.2772288.
- [8] H. Quinn *et al.*, "On-orbit results for the xilinx virtex-4 FPGA," 2012. doi: 10.1109/REDW.2012.6353715.
- [9] M. Caffrey *et al.*, "On-orbit flight results from the reconfigurable cibola flight experiment satellite (CFESat)," in *Proceedings IEEE Symposium on Field Programmable Custom Computing Machines, FCCM 2009*, 2009, pp. 3–10. doi: 10.1109/FCCM.2009.22.
- [10] Xilinx, "LogiCORE IP Soft Error Mitigation Controller v3.1."
- [11] L. Sterpone, "VERI-Place Algorithm," Feb. 2021. https://staff.polito.it/luca.sterpone/tools.html (accessed Apr. 11, 2022).
- [12] Royal Belgian Institute for Space Aeronomy, "SPENVIS." https://www.spenvis.oma.be/ (accessed Apr. 11, 2022).