University of Surrey

ACEDR: Automatic Compiler Error Detection and Recovery for COTS CPU and Caches

Nezzari, Yasser and Bridges, Christopher (2019) ACEDR: Automatic Compiler Error Detection and Recovery for COTS CPU and Caches. IEEE Transactions on Reliability.

08766120.pdf - Accepted version Manuscript


Recently, there has been an increasing demand for more powerful processors for next-generation space missions, such as communication and Earth observation. The challenge is how to improve the reliability of the processor under “single event effects” in orbit. We have previously proposed a new way of implementing any traditional software error detection and correction technique at instruction level, capable of covering both the CPU and caches of “commercial off the shelf” processors. In this paper, a novel way of evaluating the software protection is presented, based on a theoretical model and software injection experiments that predict the reliability of the whole processing architecture. The fault injection evaluates the ability of the protection code to detect and recover from errors, as well as the accuracy of the reliability models, by comparing the reliability given by the theoretical predictions with the reliability measured in the injection experiments. Automatic compiler error detection and recovery improves the reliability of the system by reducing the error rate of “single event upsets.” In some benchmarks, the error rate was reduced to less than 1%. This research has been tested on two machines: an Intel Core i5-3470 and a Raspberry Pi 3. On the first processor, the overhead was less than 15%, and on the second, the overhead was less than 17%. This research can also be ported to multiple high-level languages, with the ability to cover multiple instructions and datatypes.

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Date : 29 August 2019
Copyright Disclaimer : © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission
Uncontrolled Keywords : Cache, compiler, CPU, error detection and recovery, processing architecture, reliability, single event upsets (SEUs).
Depositing User : James Marshall
Date Deposited : 04 Feb 2020 09:54
Last Modified : 04 Feb 2020 16:39
