University of Surrey


Modelling processor reliability using LLVM compiler fault injection

Nezzari, Y and Bridges, Christopher (2018) Modelling processor reliability using LLVM compiler fault injection In: 2018 IEEE Aerospace Conference, 3 - 10 March 2018, Big Sky, MT, USA.

Full text not available from this repository.

Abstract

The use of commercial off-the-shelf (COTS) processors is increasingly attractive for the space domain, especially with emerging high-demand applications in Earth observation and communications. An order-of-magnitude improvement in on-board processing capability with less size, mass, and power is possible; however, COTS parts still lag in reliability in the space environment, and costly protection techniques are required to ensure resilience to single event effects (SEEs). Whereas current software reliability techniques can only detect errors and perform partial recovery, our research offers a step change in both error detection and recovery without degradation in fault coverage, and it targets modern multicore processors. We have previously shown how to create additional passes in the compiler's intermediate representation layer that automatically add differing protection codes at compile time using the LLVM compiler framework. LLVM supports multiple processor architectures and multiple high-level languages, so the approach can be ported beyond space applications to aerospace, defence, medical, and automotive domains. In this paper, a new LLVM fault injection tool is presented to validate and measure software protection methods, either statically at compile time or dynamically at runtime, for multiple error types such as silent data corruption (SDC), control-flow errors, and crashes. We use our tool to inject faults into unprotected and protected codes and make quantitative comparisons of the errors and the associated statistical confidence. Our protection method shows high coverage, up to 100% for some benchmarks, and does not assume that the memory system is protected via typical triple modular redundancy (TMR) hardware approaches; this means that we protect all memory instructions that read and write. Another reason for the high coverage is the inclusion of multiple data and instruction types (i32, i32*, i1, i8, i8*, i64, float and double, and float and double pointers).
This research has been implemented on two processing platforms: an Intel Core i5-3470 at 3.2 GHz and a Raspberry Pi 3. On the first platform the overhead was less than 15%, and on the second platform it was less than 17%.
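
The comparison described in the abstract (inject faults into unprotected and protected code, classify the outcomes, and attach a statistical confidence) can be illustrated with a small simulation. The sketch below is not the authors' LLVM tool or their protection method, neither of which is detailed in this record. It assumes a single-bit-flip fault model on i32 values, uses plain duplicate-and-compare as a stand-in protection scheme, and reports a Wilson score interval as one possible confidence measure. All function names and the toy benchmark are hypothetical.

```python
import random
from math import sqrt


def flip_bit_i32(value, bit):
    """Flip one bit of a 32-bit two's-complement integer."""
    raw = (value & 0xFFFFFFFF) ^ (1 << bit)
    return raw - (1 << 32) if raw & 0x80000000 else raw


def count_above(data, threshold, fault=None):
    """Toy benchmark: count the values above a threshold.

    fault = (index, bit) flips one bit of one loaded value
    (an assumed fault model, not taken from the paper).
    """
    count = 0
    for i, v in enumerate(data):
        if fault is not None and fault[0] == i:
            v = flip_bit_i32(v, fault[1])
        if v > threshold:
            count += 1
    return count


def protected_count_above(data, threshold, fault=None):
    """Duplicate-and-compare stand-in: the fault hits only the primary copy."""
    primary = count_above(data, threshold, fault)
    shadow = count_above(data, threshold)
    return primary, primary != shadow


def run_campaign(data, threshold, trials, protected, seed=0):
    """Inject one random fault per trial and classify each outcome."""
    rng = random.Random(seed)
    golden = count_above(data, threshold)  # fault-free reference result
    counts = {"benign": 0, "sdc": 0, "detected": 0}
    for _ in range(trials):
        fault = (rng.randrange(len(data)), rng.randrange(32))
        if protected:
            result, detected = protected_count_above(data, threshold, fault)
            if detected:
                counts["detected"] += 1
                continue
        else:
            result = count_above(data, threshold, fault)
        # A wrong result that nothing flagged is silent data corruption.
        counts["sdc" if result != golden else "benign"] += 1
    return counts


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    data = list(range(-50, 50))
    for protected in (False, True):
        counts = run_campaign(data, 0, 1000, protected)
        print(protected, counts, wilson_interval(counts["sdc"], 1000))
```

Because both campaigns draw the same faults from the same seed, every fault that silently corrupts the unprotected run appears as a detection in the protected run. Crashes and control-flow errors, which the abstract also lists, are outside what this toy model can represent.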

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name | Email | ORCID
Nezzari, Y | |
Bridges, Christopher | C.P.Bridges@surrey.ac.uk |
Date : January 2018
DOI : 10.1109/AERO.2018.8396489
Depositing User : Melanie Hughes
Date Deposited : 21 Sep 2018 11:19
Last Modified : 21 Sep 2018 11:19
URI: http://epubs.surrey.ac.uk/id/eprint/849398


