SINGLE-EVENT MULTIPLE-TRANSIENT CHARACTERIZATION AND
MITIGATION VIA STANDARD CELL PLACEMENT METHODS

By
Bradley T. Kiddie

Dissertation
Submitted to the Faculty of the
Graduate School of Vanderbilt University
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
in
Electrical Engineering
December, 2016
Nashville, Tennessee

Approved:
William H. Robinson, Ph.D.
Bharat L. Bhuva, Ph.D.
Ronald D. Schrimpf, Ph.D.
Gabor Karsai, Ph.D.
Mark N. Ellingham, Ph.D.
ACKNOWLEDGEMENTS

Thank you to Vanderbilt University and my advisor, Dr. William H. Robinson, for the continued instruction and support over the years: for sparking my interest in this field, for pointing me toward resources useful for research, and for instilling a thirst for new knowledge in the face of unanswered research questions.

Thanks are due as well to my committee, Dr. Bharat L. Bhuva, Dr. Ronald D. Schrimpf, Dr. Gabor Karsai, and Dr. Mark N. Ellingham, for their support in the intermediate and final stages of this research and for their guidance in shaping this dissertation. The Vanderbilt Radiation Effects and Reliability (RER) research group was also instrumental in providing research and social support during this work, and in providing a foundation of experimental knowledge in which to root my theoretical and modeling-based work.

Financial support for this dissertation research was provided by Vanderbilt University and by the National Science Foundation.
# TABLE OF CONTENTS

<table>
<thead>
<tr>
<th>Section</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>ACKNOWLEDGEMENTS</td>
<td>ii</td>
</tr>
<tr>
<td>LIST OF ACRONYMS</td>
<td>v</td>
</tr>
<tr>
<td>LIST OF FIGURES</td>
<td>vi</td>
</tr>
<tr>
<td>LIST OF TABLES</td>
<td>xi</td>
</tr>
<tr>
<td>Chapter</td>
<td></td>
</tr>
<tr>
<td>I. INTRODUCTION</td>
<td>1</td>
</tr>
<tr>
<td>Research Contributions</td>
<td>4</td>
</tr>
<tr>
<td>Organization of Dissertation</td>
<td>4</td>
</tr>
<tr>
<td>II. FUNDAMENTALS OF SINGLE-EVENT EFFECTS</td>
<td>6</td>
</tr>
<tr>
<td>Sources of Radiation</td>
<td>6</td>
</tr>
<tr>
<td>Single-Event Upsets and Single-Event Transients</td>
<td>9</td>
</tr>
<tr>
<td>Effects of Technology Scaling</td>
<td>13</td>
</tr>
<tr>
<td>Summary</td>
<td>18</td>
</tr>
<tr>
<td>III. FUNDAMENTALS OF EDA AND PHYSICAL DESIGN</td>
<td>19</td>
</tr>
<tr>
<td>Logic Synthesis and Standard Cells</td>
<td>20</td>
</tr>
<tr>
<td>Placement and Routing</td>
<td>22</td>
</tr>
<tr>
<td>Logical Reconvergence</td>
<td>24</td>
</tr>
<tr>
<td>Pulse Quenching</td>
<td>25</td>
</tr>
<tr>
<td>Summary</td>
<td>27</td>
</tr>
<tr>
<td>IV. SINGLE-EVENT MULTIPLE-TRANSIENT CHARACTERIZATION</td>
<td>28</td>
</tr>
<tr>
<td>Related SEMT Modeling Methods</td>
<td>33</td>
</tr>
<tr>
<td>IC Simulation Methods and Tradeoffs</td>
<td>35</td>
</tr>
<tr>
<td>Transient Masking Factors in Logic</td>
<td>37</td>
</tr>
<tr>
<td>SEMT Modeling</td>
<td>40</td>
</tr>
<tr>
<td>A. SEMT Optimized for Physical Design</td>
<td>43</td>
</tr>
<tr>
<td>B. Characterization and Error Propagation Probability Reporting</td>
<td>46</td>
</tr>
<tr>
<td>Comparison to Previous Models</td>
<td>48</td>
</tr>
<tr>
<td>Conclusion</td>
<td>50</td>
</tr>
<tr>
<td>V. SINGLE-EVENT MULTIPLE-TRANSIENT RESULTS</td>
<td>52</td>
</tr>
<tr>
<td>SEMT Characterization Results</td>
<td>52</td>
</tr>
<tr>
<td>Simulation Parameter Experiments</td>
<td>63</td>
</tr>
<tr>
<td>Conclusion</td>
<td>66</td>
</tr>
<tr>
<td>VI. IMPACTS OF PLACEMENT ON SEMT RELIABILITY</td>
<td>67</td>
</tr>
<tr>
<td>Experiment Overview</td>
<td>68</td>
</tr>
<tr>
<td>Macro Constraints: Plan Groups</td>
<td>69</td>
</tr>
<tr>
<td>A. Plan Groups: Reliability and Performance</td>
<td>70</td>
</tr>
<tr>
<td>B. Plan Groups: Selective Path Hardening</td>
<td>76</td>
</tr>
<tr>
<td>Micro Constraints: Cell Binding</td>
<td>81</td>
</tr>
<tr>
<td>C. Cell Binding: Reliability and Performance</td>
<td>83</td>
</tr>
<tr>
<td>D. Impact of Density on Placement and Reliability</td>
<td>87</td>
</tr>
<tr>
<td>Conclusion</td>
<td>89</td>
</tr>
<tr>
<td>VII. PLACEMENT ALGORITHM FOR SEMT MITIGATION</td>
<td>91</td>
</tr>
<tr>
<td>Placement Design Mechanisms for SEMT Mitigation</td>
<td>91</td>
</tr>
<tr>
<td>A. Cell Bounds for SEMT Mitigation</td>
<td>92</td>
</tr>
<tr>
<td>B. Logic Design for SEMT Mitigation</td>
<td>98</td>
</tr>
<tr>
<td>C. Plan Groups for SEMT Mitigation</td>
<td>101</td>
</tr>
<tr>
<td>Relative Placement for Pulse Quenching</td>
<td>104</td>
</tr>
<tr>
<td>D. Boolean Logic Analysis</td>
<td>106</td>
</tr>
<tr>
<td>E. Boolean Reliability Automated Design Flow</td>
<td>109</td>
</tr>
<tr>
<td>F. Results</td>
<td>113</td>
</tr>
<tr>
<td>Conclusion</td>
<td>116</td>
</tr>
<tr>
<td>VIII. CONCLUSIONS</td>
<td>118</td>
</tr>
<tr>
<td>REFERENCES</td>
<td>120</td>
</tr>
<tr>
<td>PUBLICATIONS AND PRESENTATIONS</td>
<td>126</td>
</tr>
</tbody>
</table>
LIST OF ACRONYMS

<table>
<thead>
<tr>
<th>Acronym</th>
<th>Definition</th>
</tr>
</thead>
<tbody>
<tr>
<td>ASIC</td>
<td>Application-Specific Integrated Circuit</td>
</tr>
<tr>
<td>APR</td>
<td>Automated Place and Route</td>
</tr>
<tr>
<td>ATPG</td>
<td>Automated Test Pattern Generation</td>
</tr>
<tr>
<td>CMOS</td>
<td>Complementary Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>DCC</td>
<td>Direct Charge Collection</td>
</tr>
<tr>
<td>DEF</td>
<td>Design Exchange Format</td>
</tr>
<tr>
<td>ECC</td>
<td>Error-Correcting Codes</td>
</tr>
<tr>
<td>EDA</td>
<td>Electronic Design Automation</td>
</tr>
<tr>
<td>EPP</td>
<td>Error Propagation Probability</td>
</tr>
<tr>
<td>GCR</td>
<td>Galactic Cosmic Ray</td>
</tr>
<tr>
<td>GDS</td>
<td>Graphic Database System</td>
</tr>
<tr>
<td>HDL</td>
<td>Hardware Description Language</td>
</tr>
<tr>
<td>IC</td>
<td>Integrated Circuit</td>
</tr>
<tr>
<td>IP</td>
<td>Intellectual Property</td>
</tr>
<tr>
<td>LET</td>
<td>Linear Energy Transfer</td>
</tr>
<tr>
<td>MBU</td>
<td>Multiple-Bit Upset</td>
</tr>
<tr>
<td>MCU</td>
<td>Multiple-Cell Upset</td>
</tr>
<tr>
<td>MRED</td>
<td>Monte Carlo Radiative Energy Deposition</td>
</tr>
<tr>
<td>MT</td>
<td>Multiple Transient</td>
</tr>
<tr>
<td>NMOS</td>
<td>n-type Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>P&amp;R</td>
<td>Place and Route</td>
</tr>
<tr>
<td>PDK</td>
<td>Process Design Kit</td>
</tr>
<tr>
<td>PMOS</td>
<td>p-type Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>PPA</td>
<td>Power, Performance, Area</td>
</tr>
<tr>
<td>RHBD</td>
<td>Radiation Hardened by Design</td>
</tr>
<tr>
<td>RP</td>
<td>Relative Placement</td>
</tr>
<tr>
<td>RTL</td>
<td>Register-Transfer Level</td>
</tr>
<tr>
<td>SAIF</td>
<td>Switching Activity Interchange Format</td>
</tr>
<tr>
<td>SDF</td>
<td>Standard Delay Format</td>
</tr>
<tr>
<td>SE</td>
<td>Single Event</td>
</tr>
<tr>
<td>SEE</td>
<td>Single-Event Effect</td>
</tr>
<tr>
<td>SEMT</td>
<td>Single-Event Multiple-Transient</td>
</tr>
<tr>
<td>SER</td>
<td>Soft Error Rate</td>
</tr>
<tr>
<td>SET</td>
<td>Single-Event Transient</td>
</tr>
<tr>
<td>SEU</td>
<td>Single-Event Upset</td>
</tr>
<tr>
<td>SOI</td>
<td>Silicon-on-Insulator</td>
</tr>
<tr>
<td>TCAD</td>
<td>Technology Computer-Aided Design</td>
</tr>
<tr>
<td>TID</td>
<td>Total Ionizing Dose</td>
</tr>
<tr>
<td>TMR</td>
<td>Triple Modular Redundancy</td>
</tr>
<tr>
<td>VCD</td>
<td>Value Change Dump</td>
</tr>
<tr>
<td>VCS</td>
<td>Synopsys Verilog Compiler Simulator</td>
</tr>
<tr>
<td>VPD</td>
<td>Value Change Dump Plus</td>
</tr>
<tr>
<td>WCSI</td>
<td>Well-Collapse Source-Injection</td>
</tr>
</tbody>
</table>
LIST OF FIGURES

<table>
<thead>
<tr>
<th>Figure</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>II-1.</td>
<td>Demonstration of cosmic shower for terrestrial effects</td>
<td>7</td>
</tr>
<tr>
<td>II-2.</td>
<td>Demonstration of ionized carriers perturbing the depletion region, leading to enhanced collection via drift processes</td>
<td>8</td>
</tr>
<tr>
<td>II-3.</td>
<td>Observed MCU patterns from testing of a 65-nm SRAM array with Kr, angled at 78.5° from normal, parallel to the n-well</td>
<td>10</td>
</tr>
<tr>
<td>II-4.</td>
<td>Predicted error rate as a function of frequency for combinational and sequential logic elements as well as their sum</td>
<td>12</td>
</tr>
<tr>
<td>II-5.</td>
<td>Case study of error rates for memory and logic versus frequency, showing the linear increase of SET errors with frequency</td>
<td>12</td>
</tr>
<tr>
<td>II-6.</td>
<td>TCAD representation of an MRED-generated nuclear event</td>
<td>14</td>
</tr>
<tr>
<td>II-7.</td>
<td>Measured MCU probabilities are plotted as a function of maximum MCU cluster size for planar 32-nm and tri-gate 22-nm technologies</td>
<td>15</td>
</tr>
<tr>
<td>II-8.</td>
<td>3x3 inverter layout in 65-nm bulk, showing different patterns of transients from single neutron strike events</td>
<td>16</td>
</tr>
<tr>
<td>II-9.</td>
<td>Heavy ion testing of 65-nm bulk inverters showing horizontal and vertical sizes of SEMTs</td>
<td>17</td>
</tr>
<tr>
<td>III-1.</td>
<td>Example of a transistor-level NAND gate and a standard cell used for automated placement</td>
<td>21</td>
</tr>
<tr>
<td>III-2.</td>
<td>Example of standard cell placement with shared wells as common locations for charge sharing</td>
<td>23</td>
</tr>
<tr>
<td>III-3.</td>
<td>Example of logical reconvergence of two pulses leading to a broadened or attenuated output error</td>
<td>25</td>
</tr>
<tr>
<td>III-4.</td>
<td>Example of pulse quenching mechanism</td>
<td>26</td>
</tr>
<tr>
<td>IV-1.</td>
<td>ISCAS85 circuit c7552 demonstrating shared-rail standard cell placement optimized for area, timing, and power</td>
<td>31</td>
</tr>
<tr>
<td>IV-2.</td>
<td>ISCAS85 circuit c7552 demonstrating final snapshot, post placement, routing, DRC and LVS checking, optimized for area, timing, and power</td>
<td>32</td>
</tr>
<tr>
<td>IV-3.</td>
<td>Comparison of SEMT simulation methods on the ISCAS85 circuit c3540</td>
<td>49</td>
</tr>
<tr>
<td>V-1.</td>
<td>Error propagation probability (EPP) for each output of ISCAS85 circuit c1908 for simulations injecting a particular number of physically-adjacent transients per test</td>
<td>53</td>
</tr>
<tr>
<td>V-2.</td>
<td>SEMT characterization of ISCAS85 circuit c1908, reconfigured to demonstrate SET/SEMT sensitivity of different functional categories of datapaths</td>
<td>54</td>
</tr>
<tr>
<td>V-3.</td>
<td>SEMT characterization of ISCAS85 circuit c2670, a 12-bit ALU and controller with 370 standard logic cells as synthesized</td>
<td>55</td>
</tr>
<tr>
<td>V-4.</td>
<td>SEMT characterization of ISCAS85 circuit c3540, an 8-bit ALU with 549 standard logic cells as synthesized and 22 outputs</td>
<td>56</td>
</tr>
<tr>
<td>V-5.</td>
<td>SEMT characterization of ISCAS85 circuit c5315, a 9-bit ALU with 819 standard logic cells as synthesized and 103 outputs</td>
<td>57</td>
</tr>
<tr>
<td>V-6.</td>
<td>SEMT characterization of ISCAS85 circuit c7552, a 32-bit adder/comparator with 869 standard logic cells as synthesized and 56 outputs</td>
<td>57</td>
</tr>
<tr>
<td>V-7.</td>
<td>SEMT characterization of floating-point multiplier circuit cf_fp_mul_c_3_4, an 8-bit multiplier with 295 standard logic cells as synthesized and 8 outputs</td>
<td>58</td>
</tr>
<tr>
<td>V-8.</td>
<td>SEMT characterization of floating-point multiplier circuit cf_fp_mul_c_11_52, a 64-bit IEEE-754 compliant multiplier with 20,717 standard logic cells as synthesized and 64 outputs</td>
<td>59</td>
</tr>
<tr>
<td>V-9.</td>
<td>Aggregate data for all 4 floating-point multipliers, showing average output error propagation probability (EPP) for simulations injecting a particular number of physically-adjacent transients per test</td>
<td>60</td>
</tr>
<tr>
<td>V-10.</td>
<td>Demonstration of traditional SET error propagation probability (EPP) vs. SEMT EPP at different modeled radiation event radii. Average number of standard cells within the radius plotted on right axis. Data from characterization of cf_fp_3_4, 8-bit floating-point multiplier</td>
<td>61</td>
</tr>
<tr>
<td>V-11.</td>
<td>SET/SEMT modeling for selected circuits of Table IV-1. Cf_fp_5_10, cf_fp_3_4, c6288 perform arithmetic functions; c1908, c2670, c3540, c5315, and c7552 are multi-function with controls</td>
<td>62</td>
</tr>
<tr>
<td>V-12.</td>
<td>SET pulsewidth data for the ISCAS85 circuits tested</td>
<td>63</td>
</tr>
<tr>
<td>V-13.</td>
<td>SEMT pulsewidth data for the ISCAS85 circuits tested</td>
<td>64</td>
</tr>
<tr>
<td>V-14.</td>
<td>SEMT characterization of 8-bit floating-point multiplier. The constant pulsewidth method (CPW) is contrasted with a variable pulsewidth method (VPW) that scales linearly with distance</td>
<td>65</td>
</tr>
<tr>
<td>VI-1.</td>
<td>Example of plan grouping placement strategy</td>
<td>69</td>
</tr>
<tr>
<td>VI-2.</td>
<td>Aggregate data for 9 multipliers and benchmark circuits, showing the EPP difference seen between the best and the worst of three plan group placement alternatives</td>
<td>73</td>
</tr>
<tr>
<td>VI-3.</td>
<td>Results for cf_fp_mul_c_8_23, the 32-bit multiplier, showing beneficial results from placement alternatives</td>
<td>75</td>
</tr>
<tr>
<td>VI-4.</td>
<td>Results for cf_fp_mul_c_11_52, the 64-bit multiplier, showing detrimental results from placement alternatives</td>
<td>75</td>
</tr>
<tr>
<td>VI-5.</td>
<td>Comparison of SEMT characterization results for the 64-bit floating-point multiplier, micro-plan grouped placement to generic placement. Up to 5 transients injected per test</td>
<td>77</td>
</tr>
<tr>
<td>VI-6.</td>
<td>Comparison of SEMT characterization results for the c1908 benchmark circuit, plan group PG placement variant over other placement variants</td>
<td>78</td>
</tr>
<tr>
<td>VI-7.</td>
<td>SEMT Characterization data for PG placement alternative of the c3540 benchmark circuit (8-bit ALU). SEMT results shown for up to 5 transients injected per test</td>
<td>79</td>
</tr>
<tr>
<td>VI-8.</td>
<td>SEMT characterization data for MPG placement alternative of the 32-bit floating-point multiplier. SEMT results shown for up to 5 transients injected per test</td>
<td>80</td>
</tr>
<tr>
<td>VI-9.</td>
<td>Longest-delay timing path generated by Design Compiler for a given output of benchmark c1908</td>
<td>82</td>
</tr>
<tr>
<td>VI-10.</td>
<td>Example of the effect of cell bounds, where 4 gates with 3 progressive pairs of create_bounds commands results in a closer placement for this selection</td>
<td>83</td>
</tr>
<tr>
<td>VI-11.</td>
<td>Best error propagation probability improvement taken for each c1908 output from the 25 logic path binding placement alternatives created, as compared to the average error probability of each output</td>
<td>84</td>
</tr>
<tr>
<td>VI-12.</td>
<td>Decrease in output error propagation probability for different c3540 outputs achieved via path binding placement alternatives</td>
<td>86</td>
</tr>
<tr>
<td>VI-13.</td>
<td>Measurements of upper-level metal layer usage for placement alternatives of c6288 multiplier circuit using varying numbers of cell bound constraints</td>
<td>87</td>
</tr>
<tr>
<td>VI-14.</td>
<td>Impact of increasing numbers of cell bound constraints on wirelength for c6288 (left) and c3540 (right)</td>
<td>89</td>
</tr>
<tr>
<td>VI-15.</td>
<td>Impact of increasing numbers of cell bound constraints on dynamic power for c6288 (left) and c3540 (right)</td>
<td>89</td>
</tr>
<tr>
<td>VII-1.</td>
<td>Results of combining two sets of create_bound commands that lead to improvements in output error propagation probability, shown to be additive to a certain practicable degree</td>
<td>95</td>
</tr>
<tr>
<td>VII-2.</td>
<td>A second example of combining two sets of create_bound commands to harden multiple outputs through bind commands, showing repeatability to this method</td>
<td>95</td>
</tr>
<tr>
<td>VII-3.</td>
<td>Following Figures VII-1 and VII-2, this plot combines bind constraints for four output best-case scenarios, for a partially-hardened result</td>
<td>96</td>
</tr>
<tr>
<td>VII-4.</td>
<td>Difference in output transient error pulsewidth from models that have been modified by binding together cells at the same logic depth from the specified output, as compared to a base, unhardened model</td>
<td>97</td>
</tr>
<tr>
<td>VII-5.</td>
<td>Reduction of SEMT errors achievable using create_bound techniques on c6288, at 70% density, 90% density, and 90% density resynthesized with higher LMR</td>
<td>100</td>
</tr>
<tr>
<td>VII-6.</td>
<td>Example of LMR-based plan group shaping (left) and LD-based plan group shaping (right) of the ISCAS85 c3540 8-bit ALU benchmark circuit</td>
<td>102</td>
</tr>
<tr>
<td>VII-7.</td>
<td>SEMT errors reduced by using LMR, LD, and combined factors to inform plan group shaping for c3540, as compared to an unconstrained placement</td>
<td>103</td>
</tr>
<tr>
<td>VII-8.</td>
<td>Illustration of using RP constraints to designate a 2-cell group (rp1, rp2, or rp3) and add components to specific locations (U1-U6)</td>
<td>105</td>
</tr>
<tr>
<td>VII-9.</td>
<td>Example of a pair of logic gates with a direct fan-in connection to a common NOR gate, in layout and schematic form</td>
<td>106</td>
</tr>
<tr>
<td>VII-10.</td>
<td>Inserting SEMT analysis into the EDA workflow for automated placement-based hardening against SEMT-induced errors</td>
<td>110</td>
</tr>
<tr>
<td>VII-11.</td>
<td>Boolean reliability automated design flow: RP cell nomination, placement, success statistics per circuit</td>
<td>112</td>
</tr>
<tr>
<td>VII-12.</td>
<td>Example of relative placement automation results for the ISCAS85 circuit c7552</td>
<td>113</td>
</tr>
<tr>
<td>VII-13.</td>
<td>Reduction in SEMT-induced errors for different outputs of cf_fp_mul_c_3_4 8-bit floating-point multiplier using RP strategy versus unconstrained design</td>
<td>114</td>
</tr>
<tr>
<td>VII-14.</td>
<td>Reductions in SEMT-induced errors for all benchmark circuits and modeled radiation event radii</td>
<td>115</td>
</tr>
</tbody>
</table>
## LIST OF TABLES

<table>
<thead>
<tr>
<th>Table</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>IV-1</td>
<td>Circuits selected for simulation in this dissertation</td>
<td>29</td>
</tr>
<tr>
<td>IV-2</td>
<td>Post-place and route design characteristics of 28/32-nm circuits used for initial SEMT characterization experiments</td>
<td>32</td>
</tr>
<tr>
<td>IV-3</td>
<td>Case study of random input vector selection and ATPG methods normalized and compared to exhaustive testing</td>
<td>38</td>
</tr>
<tr>
<td>VI-1</td>
<td>Plan group placement alternatives: timing differences</td>
<td>71</td>
</tr>
<tr>
<td>VI-2</td>
<td>Plan group placement alternatives: power differences</td>
<td>71</td>
</tr>
<tr>
<td>VI-3</td>
<td>Plan group placement alternatives: differences in error propagation probability from generic for characterization of 5-transient SEMT</td>
<td>74</td>
</tr>
<tr>
<td>VII-1</td>
<td>Boolean logic truth tables for NOR2, NAND2, XOR2, and XNOR2. Inputs AB, gate output, gate output given a SEMT event, and the error count at the output of the gate are listed</td>
<td>107</td>
</tr>
<tr>
<td>VII-2</td>
<td>Relative placement automated flow statistics, and associated performance costs</td>
<td>113</td>
</tr>
<tr>
<td>VII-3</td>
<td>Relative placement flow statistics for SEMT-induced error masking at simulated 100-nm radiation event radius</td>
<td>116</td>
</tr>
</tbody>
</table>
CHAPTER I

INTRODUCTION

Examination and evaluation of radiation effects on the reliability of integrated circuits (ICs) have become necessary design steps in the development of electronics built to operate in space and other harsh environments. These steps are becoming increasingly relevant for the operation of ICs in terrestrial environments as well [Barth et al. 2003]. Ionizing particles passing through semiconductor devices deposit charge in the device. If this charge is accumulated in sensitive regions, then it may cause single-event effects (SEE): single-event upsets (SEU) in memory, or single-event transients (SET) in combinational logic [Ferlet-Cavrois et al. 2013].

designs are vulnerable, with multiple upsets experimentally measured from a single ion at the 22-nm node [Seifert et al. 2012].

Traditionally, reliability analyses of ICs have focused on hardening against SEUs in memory [Baumann TDMR 2005]. With increased clock speeds, however, SETs in combinational logic become more of a risk. Experiments have demonstrated that SET-induced errors may surpass SEU-induced errors in the multi-GHz range [Gill et al. 2009, Mahatme et al. 2011]. Modeling transient errors is a significant design challenge: logical masking, electrical masking, and latch-window masking are all at play, and the model must also capture logical reconvergence during transient propagation. Combined with the potential for multiple upset nodes in modern technologies, characterization of multiple-transient interaction is a complex and active research concern [Black et al. 2013].

Previous experimental work has explored and established the interaction of multiple transients in small circuit structures and their capability for “pulse quenching” [Ahlbin et al. 2009]. The charge deposited by a particle strike in an IC is collected primarily within a shared well [Zhu et al. 2007], and when this charge diffuses to adjacent, logically connected nodes, the multiple generated transients can interact and “quench” the errant pulse to a significantly shorter pulsewidth. The pulse quenching mechanism has been measured experimentally with laser testing [Ahlbin et al. 2013] and heavy ion testing [Du and Chen 2016].

The previously mentioned concerns give rise to the unique challenge of modeling single-event multiple-transients (SEMT), as well as understanding the vulnerability of logic to these events. Several recent studies have addressed efficient multiple-transient injection in modern logic circuits [Miskov-Zivanov and Marculescu 2010, Ebrahimi et al. 2013, Casey et al. 2008, Pagliarini et al. 2011]. The research presented in this dissertation further integrates the physical mechanisms of a particle strike into the SEMT characterization flow. It advances the state of the art by combining: (1) layout analysis with physically-informed transient injection as opposed to bit-flip injection, (2) selection of transient injection points and simulation values based on Monte Carlo analysis, and (3) implementation using a modern technology library. The approach achieves physically-realistic and computationally feasible results with up to a 40% improvement in accuracy over random-based methods previously described in the literature and an 11% improvement in accuracy over the previous state of the art.
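
As an illustration of what layout-aware selection of injection points can look like, the following Python sketch draws random strike locations over a placed design and collects the cells whose centers fall within a modeled event radius. It is a minimal sketch under stated assumptions, not the characterization flow developed in this dissertation: the cell names, coordinates, radius, and the helper names `cells_within_radius` and `sample_injection_sets` are all hypothetical, and a uniform strike distribution over the placement bounding box is assumed.

```python
import math
import random

# Hypothetical placement data: cell name -> (x, y) center in micrometers.
# In a real flow these would be read from the placed design; values are made up.
PLACEMENT = {
    "U1": (0.0, 0.0),
    "U2": (0.5, 0.0),
    "U3": (1.0, 0.0),
    "U4": (0.0, 1.2),
    "U5": (0.5, 1.2),
    "U6": (3.0, 3.0),
}


def cells_within_radius(placement, strike, radius):
    """Return the sorted names of cells whose center lies within `radius` of `strike`."""
    sx, sy = strike
    return sorted(
        name
        for name, (x, y) in placement.items()
        if math.hypot(x - sx, y - sy) <= radius
    )


def sample_injection_sets(placement, radius, trials, seed=0):
    """Draw uniform random strike points over the bounding box (an assumption)
    and collect the group of cells affected by each strike."""
    rng = random.Random(seed)
    xs = [x for x, _ in placement.values()]
    ys = [y for _, y in placement.values()]
    injection_sets = []
    for _ in range(trials):
        strike = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        hit = cells_within_radius(placement, strike, radius)
        if hit:  # strikes that reach no cell inject nothing
            injection_sets.append(hit)
    return injection_sets
```

Each returned group would then serve as a candidate set of physically adjacent nodes for simultaneous transient injection in a logic simulation.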

This dissertation subsequently presents simulation experiments to demonstrate the importance of standard cell placement in the reliability of combinational logic. Experimental simulations explore and quantify how placement can be modified in the electronic design automation (EDA) physical design phase to increase reliability against SEMT effects with minimal cost to circuit performance.

SEMT modeling and placement experiments also reveal a unique design space for a new method of radiation hardening. Pulse quenching strategies have been used to design hardened standard cells [Atkinson et al. 2011], but at a cost of 10-40% higher area and with impacts on power and timing. The application of pulse quenching concepts to standard cell placement, however, has not previously been explored. A study of Boolean logic relations identifies node pairs in a design netlist that are particularly conducive to pulse quenching effects, and the study of standard cell placement techniques provides methods of intelligently and unobtrusively modifying the standard EDA flow to include these relative placement constraints. By rearranging standard logic cells according to simple logic considerations, this work demonstrates that a large proportion of SEMT-induced soft errors can be masked without any area cost to the circuit, and with timing costs that remain well within design margins, while maintaining compatibility with most other forms of radiation-hardening-by-design (RHBD) methods.
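
The logic-level intuition can be illustrated with a toy model. The Python sketch below assumes, purely for illustration, that a SEMT event inverts both inputs of a 2-input gate at the same time, and counts the input vectors for which the gate output then changes. This simultaneous-inversion model and the function name `double_flip_errors` are assumptions introduced here; the sketch ignores electrical effects such as pulsewidth and timing skew and is not the analysis or algorithm presented in Chapter VII.

```python
from itertools import product

# Boolean models of four common 2-input gates (inputs and outputs are 0 or 1).
GATES = {
    "NOR2": lambda a, b: int(not (a or b)),
    "NAND2": lambda a, b: int(not (a and b)),
    "XOR2": lambda a, b: a ^ b,
    "XNOR2": lambda a, b: int(not (a ^ b)),
}


def double_flip_errors(gate):
    """Count the input vectors (out of 4) whose output changes when BOTH
    inputs are inverted simultaneously (the assumed SEMT model)."""
    return sum(
        gate(a, b) != gate(1 - a, 1 - b)
        for a, b in product((0, 1), repeat=2)
    )


if __name__ == "__main__":
    for name, fn in GATES.items():
        print(name, double_flip_errors(fn))
    # Prints: NOR2 2, NAND2 2, XOR2 0, XNOR2 0 (one gate per line)
```

Under this assumption, a gate whose output is unchanged when both inputs flip masks the double transient logically, so the cells driving its inputs would be plausible candidates for adjacent placement.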

Research Contributions

The primary contributions of this research to the fields of reliability and EDA can be summarized in four themes:

1. A scalable, layout-aware, and physically realistic simulation methodology for characterizing the sensitivity of logic to radiation-induced single-event multiple-transients;

2. An examination of SEMT sensitivity for a variety of combinational logic circuits, to better understand connections between logic functionality and transient propagation;

3. An extensive analysis of the methods available in commercial EDA tools for: (a) modifying standard cell placement, (b) evaluating strategies for implementation, and (c) quantifying their impact on both performance and SEMT reliability; and

4. An algorithm for standard cell placement to increase reliability by identifying and constraining SEMT vulnerabilities in order to mask charge-sharing-induced errors with minimal cost to circuit performance.

Organization of Dissertation

Chapter I introduces motivation for this work. Chapter II provides a brief background on radiation sources, radiation effects and error mechanisms within an IC. Chapter III provides a primer on important EDA concepts including placement, routing, and signal reconvergence mechanisms. Chapter IV discusses SEMT modeling challenges as well as the novel SEMT characterization framework that has been developed. Chapter V continues with the SEMT
characterization results of benchmark circuits, comparison to other models, and valuable observations provided by these results. Chapter VI explores the methods available for standard cell placement modification in EDA and their impacts on performance and reliability. Chapter VII harnesses these methods and pairs their results with a Boolean logic study to create a standard cell placement algorithm that allows for near-zero cost masking of SEMT-induced errors, then demonstrates the RHBD capabilities of this algorithm through a SEMT characterization comparison of hardened versus unhardened designs. Chapter VIII summarizes the work and describes future directions in this field of research.
CHAPTER II

FUNDAMENTALS OF SINGLE-EVENT EFFECTS

In this work, we are particularly interested in: (1) the modeling of multiple transients from a single particle strike in combinational (intra-pipeline) logic circuits, and (2) the mitigation of these effects. Paramount to the characterization of SEMT effects is a discussion of the source of the problem – radiation that causes SEMTs. Radiation originates from multiple sources, and its effects on electronics are studied particularly in the context of space applications. With modern technologies leading to smaller feature sizes, lower operating voltages, and lower critical charge, terrestrial applications are increasingly at risk as well. This chapter will discuss these sources of radiation, their effects on memory and logic circuits, and the issue of worsening effects seen with technology scaling, all particularly as they pertain to multiple affected nodes from single radiation events.

Sources of Radiation

Radiation that affects the behavior of electronics originates from a variety of sources and has several different effects depending upon the severity of the radiation, as well as the technology and location of exposure. In space, trapped radiation areas such as the Van Allen belt can be a particularly hostile environment. Galactic cosmic rays (GCRs) and solar particles produce heavy ions and other effects. In terrestrial environments, GCRs pass through the atmosphere and strike nitrogen and oxygen atoms, creating a “shower” of particles, as illustrated in Figure II-1. The resultant protons, neutrons, heavy ions, and other secondary particles of different energies can
affect circuit behavior in several ways: small amounts of deposited charge cause transitory effects, accumulated dose causes longer-lasting total ionizing dose (TID) effects, and a single ionizing particle that deposits enough charge can cause single-event effects (SEE). Alpha particles also originate from contaminants and processing materials, especially in older technologies. Effects not covered in this dissertation include displacement damage, latch-up, and snapback. More details on radiation risks in different environments can be found in other sources [Barth et al. 2003]; this dissertation is primarily concerned with SEE in terrestrial and space environments.

Figure II-1. Demonstration of cosmic shower for terrestrial effects. [Barth et al. 2003]

Traditionally, circuits can be characterized for resilience against radiation effects by testing with heavy ions, neutrons, alphas, or laser spot testing. A given particle has a linear energy transfer (LET) that defines the amount of energy deposited, or its mass-stopping power. As a particle passes
through a material, it loses energy and creates a dense track of electron/hole pairs. The reverse-biased junction is the most charge-sensitive part of circuits [Baumann TDMR 2005]. An electron/hole track close to the depletion region causes a transient current/voltage at that node, as shown in Figure II-2. The terms “funneling” or “well-collapse” are often used to describe this effect and the related charge sharing phenomena where multiple nodes can be affected by a single particle.
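
As a rough worked example of the relationship between LET and deposited charge, the sketch below converts an LET value into charge deposited per micron of track in silicon. It assumes a silicon density of about 2.33 g/cm³ and an average of 3.6 eV per electron/hole pair; the function name and constants are illustrative and are not taken from the tool flow used in this work.

```python
# Illustrative constants (assumptions, not values from this dissertation).
SI_DENSITY_MG_PER_CM3 = 2330.0   # silicon density, about 2.33 g/cm^3
EV_PER_PAIR = 3.6                # mean energy per electron/hole pair in silicon
ELECTRON_CHARGE_C = 1.602e-19    # elementary charge in coulombs


def let_to_charge_fc_per_um(let_mev_cm2_per_mg: float) -> float:
    """Convert LET (MeV-cm^2/mg) to deposited charge per micron (fC/um) in Si."""
    mev_per_cm = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3
    ev_per_um = mev_per_cm * 1e6 / 1e4               # MeV -> eV, cm -> um
    pairs_per_um = ev_per_um / EV_PER_PAIR
    return pairs_per_um * ELECTRON_CHARGE_C * 1e15   # C -> fC


if __name__ == "__main__":
    for let in (1.0, 9.0, 34.0):
        print(f"LET {let:5.1f} -> {let_to_charge_fc_per_um(let):8.1f} fC/um")
```

Under these assumptions, an LET of roughly 97 MeV-cm²/mg corresponds to about 1 pC of deposited charge per micron of track length.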

![Diagram](image)

Figure II-2. Demonstration of ionized carriers perturbing the depletion region, leading to enhanced collection via drift processes. [Loveless and Massengill 2012]

Radiation that affects memory and logic circuits of interest for this dissertation generally stems from two main sources – alpha particles in packaging, and neutrons from cosmic rays that cause by-product interactions. Certain complex effects, such as multiple-bit upsets (MBU) or SEMT, cannot generally be terrestrially induced by alpha particles, and are typically due to high-energy neutron effects [Baumann TDMR 2005]. Ultimately, high-energy cosmic neutron radiation defines the SER limit in terrestrial environments [Baumann D&T 2005].

When a neutron strikes a silicon nucleus, the excitation reaction creates ionizing particles that spread in different directions and may affect the circuit. Early work has shown that 98% of these interactions that result in errors are from neutrons that induce a single “daughter” particle
[Wrobel et al. 2000], and later studies have continued to confirm that observation; even for more modern technologies of 45-nm and 90-nm, neutron interactions that create a single daughter ion are an order of magnitude more common than multiple ion by-products [Cannon 2010]. When a charged particle (alpha, heavy ion, or other neutron interaction by-product) passes through a circuit, it deposits charge along a trail, which can be collected by sensitive volumes [Black et al. 2013]. Neutron particle interactions, in particular, are complex, and many different scenarios are possible. However, experimental studies have commonly shown that, when multiple nodes in an IC are affected by a single event, these nodes are most likely to share a common well [Tipton et al. TNS 2008, Harada et al. 2011, Black et al. 2008, Evans 2016]. This behavior for particle interactions, which has been captured in multiple experimental studies, is used as a basis for the assumptions behind sources of transient faults in this dissertation’s work.

Single-Event Upsets and Single-Event Transients

Collected charge from a single event can result in the bit flip of a storage element, or a single-event upset (SEU) [Dodd and Massengill 2003]. Upsets in memory were first observed in the 1970s [May and Woods 1979], and since then, hardening techniques have focused on software error-correcting codes or hardware designs such as guard rings [Anelli et al. 1999], triple modular redundancy (TMR) [Lyons and Vanderkulk 1962], or solutions such as the Dual Interlocked Cell (DICE) latch [Calin et al. 1996]. Errors in memory have generally been considered the primary contributor to system SER, although errors in logic become more prevalent at higher clock speeds.

Traditional work in the field of reliability and circuit design assumes that particle strike events will cause at most a single upset or transient. However, with more modern technologies, single events can affect multiple sensitive volumes. Mechanisms such as direct charge collection
(DCC) and well-collapse source-injection (WCSI) describe charge sharing that can upset multiple sensitive nodes from a single particle strike event. Studies show that the probability of multiple cell upsets (MCU) or multi-bit upsets (MBU) increases at grazing angles, where a particle can affect multiple cells; for example, a single neutron event can affect up to 17 SRAM bits in a 90-nm technology [Tipton et al. TDMR 2008]. Furthermore, particle strikes at angled incidence to the device under test (DUT) demonstrate that MCU are likely to share a common well [Tipton et al. TNS 2008], as abstracted within Figure II-3.

Figure II-3. Observed MCU patterns from testing of a 65-nm SRAM array with Kr (LET=28.9 MeV-cm²/mg), angled at 78.5° from normal, parallel to the n-well. The MCU patterns show a constant string of upsets (i.e., DCC) where the ion strike occurred, with surrounding alternating upsets where WCSI was observed. [Black et al. 2008]

In older technologies, single upsets are the norm, but as detailed in the following section on technology scaling, modern technologies see increasing rates of MCU over SEU. As inter-cell distances decrease, the ratios of MCU to SEU by device simulation and experimental results exponentially increase. Simulation studies suggest that MCU dominates below 0.5-μm cell distances, with corresponding neutron testing of a 65-nm bulk process presenting slightly more
conservative results [Zhang et al. 2014]. Heavy ion testing of a bulk 65-nm SRAM at an LET of 34 MeV-cm²/mg has attributed 90% of errors to MCU events, with 3 cells upset being the most common at that test level [Uznanski et al. 2010]. Future characterization of SEU reliability and mitigation of these effects will require acknowledgement of multiple simultaneous upsets. For the work presented in this dissertation, the model of MCU will be helpful for understanding similar effects in logic.

The mechanism that causes upsets in memory can also lead to errant values in combinational logic and is termed a single-event transient (SET). In logic, however, values are transitory and are subject to various masking factors before a single-event transient is latched into an error in a register or memory. Logical, electrical, and latch-window masking must be accounted for in considering SETs and SEMTs; their role in this work is discussed in Chapter IV.
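
Logical masking can be illustrated with a small exhaustive check: a transient at a gate input only propagates when the other inputs hold non-controlling values. The sketch below uses a hypothetical two-gate circuit (not one of the benchmark circuits studied here) and counts the fraction of input vectors for which inverting an internal node changes the output. Electrical and latch-window masking are not modeled.

```python
from itertools import product


def circuit(a: int, b: int, c: int, flip_n1: bool = False) -> int:
    """Hypothetical circuit: n1 = NAND(a, b); out = AND(n1, c)."""
    n1 = 1 - (a & b)
    if flip_n1:
        n1 ^= 1          # model an SET as an inversion of internal node n1
    return n1 & c


def propagation_fraction() -> float:
    """Fraction of input vectors for which an SET on n1 reaches the output."""
    vectors = list(product((0, 1), repeat=3))
    hits = sum(
        circuit(a, b, c) != circuit(a, b, c, flip_n1=True)
        for a, b, c in vectors
    )
    return hits / len(vectors)


if __name__ == "__main__":
    print(f"SET on n1 is logically unmasked for "
          f"{propagation_fraction():.0%} of input vectors")
```

In this toy circuit the transient is logically masked whenever `c` is 0, because 0 is the controlling value of the AND gate.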

Until recently, SEUs have been the largest contributing factor to circuit SER. In 1997, Buchner predicted that, with the increase of clock speeds, single-event transients are bound to become more common in modern technologies than SEUs, shown in Figure II-4 [Buchner et al. 1997]. In 2001, Seifert made a similar prediction from simulation studies; with increasing clock speeds, combinational logic SER would increase and memory SER would decrease [Seifert et al. 2001]. In 2011, these predictions were further confirmed with experimental measurements in a 40-nm technology [Mahatme et al. 2011]. In Figure II-5, logic SER surpassed flip-flop SER in the single-GHz range. Hardened designs with better SEU immunity are even more sensitive to the effects of logic SER [C.-H. Chen et al. 2014]. These studies have demonstrated an increased need for examination of SET effects in circuits and mitigation of these effects for modern technologies.

Figure II-4. Predicted error rate as a function of frequency for combinational and sequential logic elements as well as their sum. [Buchner et al. 1997]

Figure II-5. Case study of error rates for memory and logic versus frequency, showing the linear increase of SET errors with frequency. [Mahatme et al. 2011]

When a particle causes an SET at a combinational logic node, that node will maintain an errant value for a particular transient pulsewidth, which corresponds to a different value for different technologies and LET values. Studies have shown that the ratio of an SET pulsewidth to latch delay may increase significantly with decreasing feature sizes and supply voltages [Benedetto et al. 2006]. As SET pulsewidth varies widely based upon these factors and circuit design, up to multiple clock cycles in some studies [Narasimham 2008], it is important to simulate the impact of different pulsewidths on a circuit’s SER.
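
The dependence of captured errors on pulsewidth and clock period can be sketched with a simple first-order latch-window model. The model below is an assumption made for illustration, not the model used in this dissertation: it supposes that a transient is captured only if it covers the entire setup-and-hold window, so the capture probability grows linearly with clock frequency and saturates at 1 when the pulse spans multiple clock cycles. The function and parameter names are hypothetical.

```python
def latch_probability(pulse_ps: float, window_ps: float, clock_ps: float) -> float:
    """First-order chance that an SET arriving at a latch is captured.

    Assumed model: the pulse must span the whole setup-and-hold window,
    and its arrival time is uniformly distributed over the clock period.
    """
    if clock_ps <= 0:
        raise ValueError("clock period must be positive")
    effective_ps = max(0.0, pulse_ps - window_ps)
    return min(1.0, effective_ps / clock_ps)


if __name__ == "__main__":
    # Same 150-ps pulse and 50-ps window at three clock periods.
    for period_ps in (2000.0, 1000.0, 500.0):
        p = latch_probability(150.0, 50.0, period_ps)
        print(f"clock period {period_ps:6.0f} ps -> capture probability {p:.3f}")
```

Halving the clock period doubles the capture probability in this model, which is consistent with the linear trend of logic SER versus frequency described above.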

Beyond SETs, modern technology with shrinking feature sizes has given rise to the occurrence of multiple transients from single particle strikes (SEMTs). As mentioned earlier, there has already been some work completed on analyzing multiple-node upsets in memory [Black et al. 2005, 2008], but there is not yet an agreed-upon method of applying these principles to modeling and mitigation in logic. For additional detail on single-transient (SET) modeling and analysis, see [Ferlet-Cavrois et al. 2013]. Characterizing additional SEMT effects requires information on transient pulsewidth, logical and electrical connections in the circuit, and, for accurate results, physical circuit topology information. Previously proposed methods for simulating SEMTs will be compared and contrasted with this dissertation’s work in Chapter V.

Effects of Technology Scaling

In the preceding section, it was shown that SEUs and SETs each have their own contribution to SER, but in more advanced technologies, multiple devices are commonly impacted by a single particle strike. Although particle size does not change over time within the same environment, the scaling of technology means that sensitive areas of standard cells are now smaller, closer together, and require less charge to upset; this trend is expected to continue as the
technology continues to shrink. 3D TCAD simulations have shown that high-LET particles deposit charge that may affect devices within a radius of up to 2 μm [Amusan et al. 2006]. Recent studies have even indicated the need for a new term coined “effective sensitive area” [S. Chen et al. 2014], which realizes that the size and construction of current devices creates larger sensitivities to charge deposition than ordinarily assumed, especially when considering charge sharing among devices.

A large number of studies have been performed to begin to create an understanding of the effects of single-event charge deposition on memory and logic [Black et al. 2013]. In Tipton et al., TCAD studies allow for an understanding of the interaction of multiple particles in sensitive volumes [TDMR 2008]; Figure II-6 shows the shower of particles possible from an incident neutron, as well as the track of a single resultant alpha particle that upsets multiple volumes.

Of primary interest is the relationship between the number of affected sensitive areas or nodes and the distance between these nodes. In Sheshadri et al., MRED simulations were
conducted that showed an increase in multiple-node charge collection with decreasing distance between sensitive volumes in flip-flops, due to technology scaling [2010]. Charge sharing and parasitic bipolar mechanisms have been found to be major contributors to multiple upsets, and charge deposited at a passive node increases with decreasing distance, according to TCAD simulations [Massengill et al. 2007] and laser experiments [Amusan et al. 2009].

Beyond simulation studies, heavy ion and other experiments have also shown increased proportions of multiple upset nodes in more advanced technologies. Heavy ion testing in a 90-nm bulk technology showed that multiple-cell upsets outnumber single-cell upsets above an LET of 7 (MeV-cm²)/mg [Lawrence and Kelly 2008]. In a 65-nm bulk technology, heavy ion experiments with an LET of 34 (MeV-cm²)/mg reported error rates with an over 90% multiple cell upset (MCU) percentage in memory, with 3 upset cells being the most common [Uznanski et al. 2010]. Even in new tri-gate technologies, MCU probabilities in SRAM are generally in line with last-generation planar technologies [Seifert et al. 2012]. In Figure II-7, MCU sizes up to 4 bits occurred during proton testing, with up to a 1.3-µm affected area.

Figure II-7. Measured cosmic radiation-induced MCU probabilities are plotted as a function of maximum MCU cluster size for planar 32-nm and tri-gate 22-nm technologies. Triangles denote solid data patterns and squares checkerboard patterns. [Seifert et al. 2012]

With modern and upcoming technologies, CMOS standard cells are small enough that a single particle may deposit charge that encompasses the area of multiple logic cells, as illustrated in Figure II-8. In the cited study [Harada et al. 2011], at the 65-nm technology node, up to 6 inverter cells could experience transient pulses due to a single particle strike.

Figure II-8. 3x3 inverter layout in 65-nm bulk CMOS, showing different patterns of transients from single neutron strike events. Transients can be induced even with a separation of 1.5 microns between nodes in this technology. [Harada et al. 2011]

The key consideration in modeling SEMTs over traditional SET modeling is capturing the physical mechanism of charge sharing. Complex effects, such as SEMT, cannot generally be induced by alpha particles, and are typically due to high-energy neutron or heavy ion effects [Baumann 2005]. When a single charged particle passes through a circuit and deposits charge, this charge creates a local well-collapse event. TCAD simulations at the 130-nm technology node have shown that upsets with NMOS-NMOS and PMOS-PMOS transistor pairs are possible at multiple simulated LETs and angles of incidence, but PMOS-NMOS upsets do not occur in any case
[Amusan et al. 2007]; heavy ion irradiation at the 65-nm node draws similar conclusions [Tipton et al. 2008]. These charge sharing observations have been backed up by more recent experiments in combinational logic circuits. Heavy ion testing at multiple LET levels of 65-nm bulk CMOS inverter chains can induce transients at up to 5 physically adjacent logic cells that horizontally share the same well, and a maximum of 2 physically adjacent cells that vertically span a well, as shown in Figure II-9 [Evans 2016].

Figure II-9. Heavy ion irradiation of a 65-nm bulk CMOS grid of inverters captures various degrees of SEMT at different LET levels. Up to 5 transients are observed in horizontally adjacent cells, along a shared well (top figure), and up to 2 transients are observed in vertically adjacent cells, across a shared well (bottom figure), from a single event. [Evans 2016]

Summary

Circuit-level simulations, device-level simulations, and irradiation experiments generally agree that lower supply voltages, smaller feature sizes, and decreasing device spacing lead to an increased risk of charge sharing and multiple upset nodes in bulk CMOS technologies. Development of future reliable circuit designs, therefore, requires an awareness of spatial dependence and strategies for mitigating charge sharing effects. Though standard logic cells are larger than memory cells and traditionally have posed less of a reliability risk, increasing clock frequencies have increased the contribution of logic to SER, and physical design of logic therefore has become more relevant in achieving reliability standards for digital ICs.
CHAPTER III

FUNDAMENTALS OF EDA AND PHYSICAL DESIGN

With the scaling of technology nodes causing multiple-node charge collection to be likely or even typical during a particle strike, it becomes readily apparent that physical design is now of greater importance than before in creating reliable and robust ICs. Physical design refers to the design step following synthesis and leading into chip manufacturing, otherwise known as RTL-to-GDSII. Circuit netlists are translated into device geometries in an IC, producing physical positions for logical components.

Circuit hardening techniques tend to fall into three broad categories: (1) technology-level hardening, such as silicon-on-insulator (SOI) or unique materials, (2) individual node hardening, focusing on identified critical vulnerabilities [Lunardini et al. 2004], or (3) software-level hardening, such as error-correcting codes (ECC) [Chen and Hsiao 1984]. Physical design of standard cells for radiation hardening has been investigated in a few studies, but physical design of an IC for radiation hardening is a largely untouched research area. If radiation hardening can be achieved in this area, then the enhancements may be additive to other technology-, node-, or software-level hardening, for more robust designs overall.

To introduce the reader to the fields of EDA and physical design as they may pertain to radiation hardening studies, this chapter will briefly describe several EDA concepts, including logic synthesis, standard cells, placement, and routing. For a more in-depth discussion of the EDA field, see [Jess 2000], and for more detail on the use of EDA tools for IC physical design, see [MacMillen et al. 2000]. Creating a means to successfully mask SETs via EDA requires a clear
understanding of transient pulse interaction at the logic level, termed logical reconvergence, as well as the documented physical occurrences of charge sharing that enable this mechanism, termed pulse quenching.

Logic Synthesis and Standard Cells

The EDA process of converting a hardware description language (HDL) file describing a circuit into a completed, physical GDSII file ready for IC manufacturing begins with a process known as logic synthesis [Micheli 1994]. A high-level description of a process is first compiled down into individual logic functions, and then synthesized with a selected technology library into specific logic gates. Logic synthesis is usually an automated process using a commercial tool, such as Synopsys Design Compiler or Cadence Encounter RTL Compiler. Design constraints, such as timing, area, and power, are accepted alongside the submitted HDL design, and the compiler seeks to honor the constraints when possible while ensuring that the design can be legally mapped to the selected technology cell library. The end result is a cell-level netlist, which is a file (typically in Verilog or VHDL format) describing the inputs, outputs, logic cells, and the wires (i.e., signals) connecting the listed components. After logic synthesis, the next steps are design floorplanning, cell placement, routing, and chip finishing to produce the final GDSII file.

Logic synthesis is an initial process in the design chain, but in recent years, even this process has been examined for possible impacts on the reliability of a final design under radiation effects. Recent studies have examined synthesis constraints and discovered that attributes, such as drive strength, map effort, cell selection, and timing constraints, can all have an impact on the reliability and error propagation probabilities in combinational logic designs [Limbrick et al. 2011, Limbrick et al. 2013].
Besides the netlist and design constraints, the other required piece for design synthesis of an IC is a technology library of standard cells to which to map. A given technology library process design kit (PDK) has a set of standard cells corresponding to several different logic functions. A standard NAND gate implementation is shown in Figure III-1. Although a logic function could be implemented with only two or three gates for the entire design, having a variety of gates available allows for flexibility in meeting area, timing, and power constraints. Standard cells are also available with different drive strengths or supply voltage requirements in order to help meet the design constraints.

Figure III-1. Example of a transistor-level NAND gate (left) and a standard cell used for automated placement (right).

Some basic cells, such as the inverter (i.e., INV), 2-input NAND, or 2-input NOR, are small, while more advanced libraries have combined AND-OR-INV, FADD (i.e., full adder), or MUX (i.e., multiplexor) cells that combine multiple logical connections into a condensed space. Figure III-1 (right) also demonstrates the placement of n-well and p-substrate in a standard cell, with VDD and VSS rail connections. The location of radiation-sensitive volumes and the size and
positioning of standard cells become relevant in modern designs for studying charge sharing within a circuit. A large number of small cells may be affected by a single particle strike versus a smaller number of large cells.

Placement and Routing

After synthesis has provided a final list of standard cells in a design, EDA tools, such as Synopsys IC Compiler or Cadence Encounter, are used to conduct automated placement of the components to meet PPA (power, performance, and area) constraints. Standard or default placement (IC Compiler command “create_fp_placement”) minimizes area, timing, and power in a circuit. Few commands are available in EDA tools to modify placement for non-PPA goals, but IC Compiler has been explored as part of this work to determine what placement modification commands exist and how they can be appropriated for gains in reliability.

Modifying placement for reliability gains depends on the mechanism of charge sharing. When a particle strikes a circuit and deposits charge, multiple physical studies (such as shown in Figures II-8 and II-9) have shown that the most common outcome is charge accumulated within a well. Cells that share this well can be affected. Figure III-2 shows a simple example of a circuit layout superimposed with some possible affected areas from a particle strike at different LETs or incidence angles. Charge sharing is typically considered a negative effect in a circuit, but when it occurs between logically related cells, there exists a chance that the induced transients could interact and cancel, to be discussed in the following sections. If placement of cells can be changed by way of modifying the EDA placement engine, then it can be possible to take advantage of charge sharing when it occurs by pairing logical adjacency with physical adjacency, to mitigate transient errors that would otherwise propagate unhindered. The key challenge would be to achieve
this effect while incurring minimal cost, in order to account for the still-low occurrence rates of SEMTs.
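
The idea of identifying which placed cells could be affected together by a single strike can be sketched as a simple geometric query. The example below is a simplified illustration under stated assumptions: cells in the same placement row are treated as sharing a well, the affected region is a one-dimensional span along that row, and the cell names, coordinates, and radius are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    row: int         # placement row; cells in a row are assumed to share a well
    x_um: float      # left edge of the cell, in microns
    width_um: float


# Hypothetical placement used only for illustration.
EXAMPLE_CELLS = [
    Cell("U1", 0, 0.0, 1.0),
    Cell("U2", 0, 1.0, 1.0),
    Cell("U3", 0, 2.0, 1.5),
    Cell("U4", 1, 1.0, 1.0),
    Cell("U5", 0, 5.0, 1.0),
]


def affected_cells(cells, strike_row, strike_x_um, radius_um):
    """Return names of cells in the struck row that overlap the affected span."""
    lo, hi = strike_x_um - radius_um, strike_x_um + radius_um
    return [
        c.name
        for c in cells
        if c.row == strike_row and c.x_um < hi and c.x_um + c.width_um > lo
    ]


if __name__ == "__main__":
    print(affected_cells(EXAMPLE_CELLS, strike_row=0, strike_x_um=1.5, radius_um=1.0))
```

A placement strategy that pairs logical adjacency with physical adjacency would aim for the cells returned by such a query to be logically related, so that the transients they generate have a chance to interact.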

Figure III-2. Example of standard cell placement with shared wells as common locations for charge sharing. A particle strike of a particular LET and/or incidence angle (some simple affected area examples shown) may strike a well and “collapse” it to the effect that several cells are temporarily shorted to a low value or a high value.

After placement, the next step in the circuit design process is routing, to: (1) legalize VDD/VSS cell connections, (2) fine-tune clock tree connections, and (3) connect cells within the design according to the synthesized netlist. The goal of automated routing is to accomplish these tasks while meeting the submitted timing constraints, minimizing power usage, and minimizing signal skew.

Static power of a circuit is determined by the choice of components or standard cells. Routing has a significant impact on the dynamic power usage of the circuit as well as the timing.
Collecting and using routing information is important for an accurate treatment of electrical masking and latch-window masking factors in transient injection and circuit simulation, and will be discussed in the next chapter. Routing and wiring choices can also have an impact on transient pulsewidth and other reliability factors [Limbrick et al. 2013]. It is even possible to increase the wirelength of a circuit to act as a low-pass filter and decrease circuit SER, but of course this reliability method comes with high delay penalties [Bhattacharya and Ranganathan 2011].

After the placement and routing stages have been completed, circuit designs proceed through EDA finishing stages to extract a final netlist (Verilog), predicted timing information (SDF), parasitics information (SPEF), and physical placement information (DEF & GDS). These files can be used for physically-realistic modeling and simulation, and the extracted GDSII file is submitted for manufacture of the design.
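
To illustrate how physical placement information might be recovered from a DEF file for later modeling, the sketch below reads the COMPONENTS section and records each instance's cell type, coordinates, and orientation. It is a minimal sketch that assumes one component per line with a simple `+ PLACED ( x y ) orient` clause; real DEF files can carry additional properties per component, and the instance and cell names in the sample text are hypothetical.

```python
import re

# Matches lines such as: - U1 NAND2X1 + PLACED ( 1000 2000 ) N ;
COMPONENT_RE = re.compile(
    r"-\s+(\S+)\s+(\S+)\s+\+\s+(?:PLACED|FIXED)\s+"
    r"\(\s*(-?\d+)\s+(-?\d+)\s*\)\s+(\w+)"
)

# Hypothetical DEF fragment used only for illustration.
SAMPLE_DEF = """\
COMPONENTS 2 ;
- U1 NAND2X1 + PLACED ( 1000 2000 ) N ;
- U2 INVX1 + PLACED ( 1400 2000 ) FS ;
END COMPONENTS
"""


def parse_def_components(def_text: str) -> dict:
    """Map instance name -> (cell type, x, y, orientation) in DEF database units."""
    placements = {}
    in_components = False
    for line in def_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("COMPONENTS"):
            in_components = True
        elif stripped.startswith("END COMPONENTS"):
            in_components = False
        elif in_components:
            match = COMPONENT_RE.match(stripped)
            if match:
                inst, cell, x, y, orient = match.groups()
                placements[inst] = (cell, int(x), int(y), orient)
    return placements


if __name__ == "__main__":
    for inst, info in parse_def_components(SAMPLE_DEF).items():
        print(inst, info)
```

Coordinates are left in DEF database units; converting them to microns would require the UNITS statement from the same file.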

Logical Reconvergence

The important element of physical design as it pertains to SEMT characterization and mitigation is logical reconvergence. When a transient pulse is split via a fanout to multiple nodes, or if a particle strike event deposits charge that upsets multiple nodes, then multiple transients will be simultaneously propagating through a circuit. If these two or more transients are in separate logic paths, then they may propagate to output latches and produce separate errors. If they are in the same logic path, then depending on the timing of the pulses, they may reconverge.

Figure III-3 illustrates two possible cases of simple logical reconvergence. If two pulses (red) are of reinforcing values, then they may produce a single, but longer pulse at the output. If two pulses (blue) are of opposing values and still overlap, then they could serve to reduce the
output transient pulsewidth or effectively cancel it entirely. More complex cases of logical reconvergence are possible, especially when considering radiation-induced transients. In Figure III-3, gates 3 and 5 both feed into gate 4. If a radiation event causes near-simultaneous transients at the outputs of gates 3 and 5, then they may arrive closely at gate 4 with either a reinforcing or canceling effect.
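
The reinforcing and canceling cases can be sketched with an idealized, zero-delay gate model. In the example below, each transient is represented as a time interval during which one gate input is inverted, and the output error width is the number of 1-ps steps in which the gate output differs from its fault-free value. The gate choices, nominal input values, and pulse timings are assumptions chosen for illustration; electrical attenuation and gate delay are not modeled.

```python
def or_gate(a: int, b: int) -> int:
    return a | b


def and_gate(a: int, b: int) -> int:
    return a & b


def error_width_ps(gate, nominal, pulses, horizon_ps=400):
    """Count 1-ps steps where the gate output differs from its fault-free value.

    nominal: tuple of fault-free input values (0 or 1)
    pulses:  one (start_ps, end_ps) interval per input, or None for no transient;
             the input is inverted for start_ps <= t < end_ps
    """
    good = gate(*nominal)
    width = 0
    for t in range(horizon_ps):
        values = [
            v ^ 1 if p is not None and p[0] <= t < p[1] else v
            for v, p in zip(nominal, pulses)
        ]
        if gate(*values) != good:
            width += 1
    return width


if __name__ == "__main__":
    # Reinforcing: two overlapping 0->1 pulses at an OR gate broaden the error.
    print(error_width_ps(or_gate, (0, 0), [(0, 100), (60, 160)]))
    # Canceling: opposing pulses at an AND gate shorten the output error.
    print(error_width_ps(and_gate, (1, 0), [(20, 100), (0, 100)]))
```

In the canceling case, the output is only in error while the second input's transient is present and the first input's opposing transient has not yet arrived; with fully overlapping pulses, the output error disappears in this idealized model.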

![Figure III-3](image)

Figure III-3. Example of logical reconvergence of two pulses leading to a broadened or attenuated output error.

Pulse Quenching

Previous experimental work has explored and established the interaction of multiple transients in a circuit and their capability towards “pulse quenching” [Ahlbin et al. 2009]. A particle strike in an IC primarily collects charge within a shared well, and when this charge diffuses to adjacent, logically connected nodes, it is possible for the multiple generated transients to interact and “quench” the errant pulse to a significantly shortened pulsewidth. In the bulk CMOS inverter chain example of Figure III-4, PMOS2 is initially off and vulnerable to SEE. When a particle
strikes this transistor, Out2 is shorted HIGH. The electrical signal propagates to Out3 to turn off PMOS3, contemporaneously with the charge diffusion in the shared n-well, which produces a second transient pulse that quenches the first. The pulse quenching behavior of multiple related transients is well known and has been measured experimentally with laser testing [Ahlbin et al. 2013] and heavy ion testing [Du and Chen 2016].

Figure III-4. Example of pulse quenching mechanism, schematic on left and TCAD/SPICE simulation on right. An ion strike on a PMOS transistor diffuses charge to the adjacent, connected transistor (left figure), and the resultant output transient at Out3 is significantly quenched versus the original induced transient at Out2 (right figure). [Ahlbin et al. 2009]

Further work has noted that pulse quenching can occur in radiation events with an LET as low as 9 MeV-cm²/mg in a 65-nm bulk technology [Ahlbin et al. 2013]. Encouraging charge sharing has been shown to be a more effective hardening technique than guard bands or well contacts, especially among PMOS transistors. As devices are spaced more closely, pulse quenching effects become more pronounced [Xueyan et al. 2011].

Charge sharing effects have inspired the modification of layout on a small scale for sensitive node active charge cancelation [Blaine et al. 2011] and differential charge cancelation
[Blaine et al. 2012], but each of these assumes small-scale charge sharing and has impacts on circuit area. Pulse quenching strategies have also been used to design hardened standard cells [Atkinson et al. 2011, Du et al. 2014], but at a cost of 10-40% higher area to the cells and therefore the total area of an IC, and impacting other performance metrics.

Alternatively, the application of charge sharing/pulse quenching concepts to standard cell placement has heretofore not been explored in depth. The example in Figure III-4 is simple, showing pulse quenching due to charge sharing between two closely spaced inverters. Encouraging pulse quenching in the general case of combinational logic cells, however, is not straightforward [Black et al. 2013] and is an active research area. Limited attempts have been made at placing complementary cells adjacently [Pagliarini et al. 2013], but with limited scope and therefore limited effect.

Summary

This dissertation describes a novel study of Boolean logic relations to identify node pairs that are particularly conducive towards pulse quenching-related effects, and, pairing this method with a study of standard cell placement techniques, paves the way for easily and unobtrusively modifying the standard EDA flow to include these additional placement constraints. By rearranging standard logic cells according to simple logic considerations, it is demonstrated that a large proportion of SEMT-induced soft errors can be masked via this modified standard cell placement algorithm without any area cost to the circuit, and with timing costs that remain well within design margins.
CHAPTER IV

SINGLE-EVENT MULTIPLE-TRANSIENT CHARACTERIZATION

The initial and fundamental goal of this dissertation is to investigate and characterize the response of combinational logic systems to single-event radiation strikes that induce multiple transients. Characterizing the SEMT response of a circuit will build upon previous attempts in literature in order to create a novel contribution that utilizes circuit topology in conjunction with the netlist for a physically realistic simulation model. Characterizing charge sharing accurately at the circuit level requires: (1) a proper understanding of the physical mechanisms behind charge deposition, sensitive volumes, and charge sharing, (2) the inclusion of relevant information for cell location and cell orientation in simulations, (3) a transient injection methodology that is well informed by physical testing, and (4) a careful emulation of the circuit netlist and different masking factors, resulting in a tenable characterization time for circuits similar in size to modern intra-pipeline designs.

This chapter will begin by setting up the physical design environment in which simulation experiments will take place. It then covers: (1) a discussion of previous work on SEMT modeling; (2) a discussion of IC simulation methods and tradeoffs; (3) a discussion of transient masking factors that make logic modeling a complex topic; and (4) this dissertation’s SEMT modeling work flow, which was optimized for physical design and reporting results. The chapter will conclude with a comparison to previous work, and the subsequent chapter will describe other results of interest from this work flow.

Table IV-1. Circuits selected for simulation in this dissertation.

<table>
<thead>
<tr>
<th>ID</th>
<th>Circuit</th>
<th>In</th>
<th>Out</th>
<th>Exponent: Mantissa</th>
<th>Gates</th>
</tr>
</thead>
<tbody>
<tr>
<td>cf_fp_mul_c_3_4</td>
<td>8-bit floating-point multiplier</td>
<td>16</td>
<td>8</td>
<td>3:4</td>
<td>295</td>
</tr>
<tr>
<td>cf_fp_mul_c_5_10</td>
<td>16-bit floating-point multiplier</td>
<td>32</td>
<td>16</td>
<td>5:10</td>
<td>1079</td>
</tr>
<tr>
<td>cf_fp_mul_c_8_23</td>
<td>32-bit (IEEE-754 single)</td>
<td>64</td>
<td>32</td>
<td>8:23</td>
<td>4704</td>
</tr>
<tr>
<td>cf_fp_mul_c_11_52</td>
<td>64-bit (IEEE-754 double)</td>
<td>128</td>
<td>64</td>
<td>11:52</td>
<td>20717</td>
</tr>
<tr>
<td>c1355</td>
<td>32-bit SEC</td>
<td>41</td>
<td>32</td>
<td>n/a</td>
<td>898</td>
</tr>
<tr>
<td>c1908</td>
<td>16-bit SEC/DED</td>
<td>33</td>
<td>25</td>
<td>n/a</td>
<td>880</td>
</tr>
<tr>
<td>c2670</td>
<td>12-bit ALU and controller</td>
<td>233</td>
<td>140</td>
<td>n/a</td>
<td>1193</td>
</tr>
<tr>
<td>c3540</td>
<td>8-bit ALU</td>
<td>50</td>
<td>22</td>
<td>n/a</td>
<td>1669</td>
</tr>
<tr>
<td>c5315</td>
<td>9-bit ALU</td>
<td>178</td>
<td>123</td>
<td>n/a</td>
<td>2406</td>
</tr>
<tr>
<td>c6288</td>
<td>16-bit multiplier</td>
<td>32</td>
<td>32</td>
<td>n/a</td>
<td>2406</td>
</tr>
<tr>
<td>c7552</td>
<td>32-bit adder/comparator</td>
<td>207</td>
<td>108</td>
<td>n/a</td>
<td>3512</td>
</tr>
</tbody>
</table>

In order to study the effects of multiple transients on a variety of combinational logic circuits, this dissertation uses a selection of standard benchmark circuits as well as some open-source arithmetic circuits. When possible, hierarchical netlists were used so as to open up more possibilities for automated placement modification for later chapters of this research. Several reverse-engineered circuits from the ISCAS85 benchmark suite [Hayes 1997] were used in order to understand trends within combinational logic circuits in general, and a selection of floating-point multipliers was used from the OpenCores repository [Hawkins 2009], including the 32-bit and 64-bit varieties that are IEEE-754 compliant, in order to examine trends among similar circuits. Since studies have shown that single events on recent technology nodes generate at most 6 transients [Harada et al. 2011], larger circuits are not necessary to test SEMT effects; for simplicity and computational speed, the ISCAS85 benchmark circuits were used without loss of accuracy. Table IV-1 provides pertinent information about the selected circuits and their relative sizes. Each multiplier is supplied in VHDL format, and each output has a particular precision for the exponent and mantissa as shown. Each ISCAS85 circuit is supplied in Verilog format, and the unsynthesized gate count is listed.

The intent of this SEMT characterization and mitigation research is to produce a workflow that can be applied to typical circuit design. Therefore, Synopsys tools were used in conjunction with modular scripts previously used to manufacture industrial chips in several technologies. This approach ensures both realistic designs and applicability to real design flows. Design Compiler is used for area/power optimization and synthesis to a modern 28/32-nm bulk library by Synopsys [Synopsys 2012]. Design synthesis constraints and their impact on reliability have been covered by other work [Limbrick et al. 2011, Limbrick et al. 2013]. Hence, in this dissertation, synthesis constraints were simplified in order to focus on the subsequent step of automated place and route (APR). Because all physical data on SEMT events currently in the literature focus on the observance of SEMTs in inverters, large standard cells (e.g., full adders, half adders, multiplexers) are excluded from synthesis solutions (set_dont_use command). Without timing constraints set, the majority of designs referenced in this dissertation are synthesized with minimum-drive standard cells.

To complete the RTL-to-GDS flow, IC Compiler is used to produce a final netlist (Verilog), timing information (SDF), parasitics information (SPEF), and physical placement information (DEF & GDS) from the finished design. An extensive, industry-validated script was developed and utilized that:

- allows for the selection of specific technology libraries and circuit density,
- proceeds through rectilinear floorplanning, power meshing, and routing,
- conducts multi-threaded standard cell placement and extensive signal routing for minimization of timing and circuit signal congestion, then
- extracts design information after chip finishing and DRC/LVS checks.

Although this dissertation examines SEMT behavior via simulation, this standard flow was also used to create ASICs for manufacture in 32-nm SOI, 45-nm SOI, and 65-nm bulk processes. SEMT characterization for this body of work has been conducted at 32-nm, 45-nm, 65-nm, and 90-nm technology nodes, but is presented in this dissertation primarily at the 32-nm node. Figure IV-1 illustrates a completed standard cell placement phase of a combinational logic circuit, and Figure IV-2 illustrates the completed design, post-routing and design checking. Basic post-place and route design characteristics for these circuits at the 32-nm node are presented in Table IV-2.

Figure IV-1. ISCAS85 circuit c7552 demonstrating shared-rail standard cell placement optimized for area, timing, and power.
Figure IV-2. ISCAS85 circuit c7552 demonstrating final snapshot, post placement, routing, DRC and LVS checking, optimized for area, timing, and power.

Table IV-2. Post-place and route design characteristics of 28/32-nm circuits used for initial SEMT characterization experiments.

<table>
<thead>
<tr>
<th>ID</th>
<th>Gates</th>
<th>Core Area (µm²)</th>
<th>Core Utilization</th>
<th>Timing (ps)</th>
<th>Dynamic Power (W)</th>
<th>Total Power (µW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>cf_fp_mul_c_3_4</td>
<td>295</td>
<td>1128.40</td>
<td>0.65</td>
<td>4860</td>
<td>2.15E+07</td>
<td>68.01</td>
</tr>
<tr>
<td>cf_fp_mul_c_5_10</td>
<td>1079</td>
<td>3883.57</td>
<td>0.70</td>
<td>13580</td>
<td>8.05E+07</td>
<td>341.30</td>
</tr>
<tr>
<td>cf_fp_mul_c_8_23</td>
<td>4704</td>
<td>15896.71</td>
<td>0.75</td>
<td>70600</td>
<td>3.65E+08</td>
<td>2035.50</td>
</tr>
<tr>
<td>cf_fp_mul_c_11_52</td>
<td>20717</td>
<td>66691.45</td>
<td>0.80</td>
<td>194790</td>
<td>1.63E+09</td>
<td>11953.00</td>
</tr>
<tr>
<td>c1355</td>
<td>910</td>
<td>1896.42</td>
<td>0.80</td>
<td>2220</td>
<td>4.99E+07</td>
<td>171.92</td>
</tr>
<tr>
<td>c1908</td>
<td>210</td>
<td>1023.69</td>
<td>0.55</td>
<td>2100</td>
<td>1.68E+07</td>
<td>67.73</td>
</tr>
<tr>
<td>c2670</td>
<td>370</td>
<td>1525.63</td>
<td>0.60</td>
<td>2270</td>
<td>2.68E+07</td>
<td>90.05</td>
</tr>
<tr>
<td>c3540</td>
<td>549</td>
<td>1889.81</td>
<td>0.70</td>
<td>2640</td>
<td>3.68E+07</td>
<td>123.58</td>
</tr>
<tr>
<td>c5315</td>
<td>819</td>
<td>3061.16</td>
<td>0.70</td>
<td>1870</td>
<td>5.54E+07</td>
<td>229.11</td>
</tr>
<tr>
<td>c6288</td>
<td>1440</td>
<td>3495.75</td>
<td>0.80</td>
<td>6180</td>
<td>1.13E+08</td>
<td>478.96</td>
</tr>
<tr>
<td>c7552</td>
<td>869</td>
<td>3309.46</td>
<td>0.75</td>
<td>3850</td>
<td>6.64E+07</td>
<td>279.98</td>
</tr>
</tbody>
</table>
The SEMT characterization workflow itself utilizes the Verilog, SDF, and DEF files for a given circuit. This approach allows for the simulation of a circuit with all three masking factors (i.e., logical, electrical, and latch-window), thus building upon previous netlist-only methodologies by adding placement information, as well as building upon other recent SEMT characterization attempts by adding cell orientation information for better quantification of charge sharing and logical reconvergence phenomena.

Related SEMT Modeling Methods

A selection of works has been included here for comparison and as a snapshot of development in the field of single-event multiple-transient (SEMT) simulation. Traditional fault injection studies inject single transients or upsets as glitches, by simply inverting the value held at a node. When injecting multiple transients, however, the physical mechanism of well collapse [Tipton et al. TNS 2008] dictates that these multiple transients will tie the affected nodes all to logic 1 or all to logic 0. “Well collapse” refers to when a particle strike deposits sufficient charge in a shared well such that several adjacent cells using the same well are affected in a similar manner. SEMT simulation and characterization is a new and complex research problem, so there is no gold standard against which to compare; every method entails its own accuracy and speed tradeoffs.

An initial attempt towards modeling SEMT over traditional SET methods was to model charge depositions at multiple nodes in a circuit [Casey et al. 2008]. This methodology varies deposited charge amounts, but is lacking in that the multiple nodes chosen for fault injection are random. A particle strike will in reality deposit charge at physically adjacent or close nodes, not two random nodes. Physically close nodes have a much higher chance of being logically connected, and SETs at these locations may interact and cancel via logical reconvergence. This method therefore results in higher error rates. Pagliarini et al. report that random SEMT injection can cause up to 40% inaccuracy [2011].

As an alternative to randomly selecting nodes, Miskov-Zivanov and Marculescu proposed utilizing the netlist to determine node adjacency [Miskov-Zivanov and Marculescu 2009, 2010]. This method bypasses traditional EDA simulation tools in favor of a mathematical simulation model, and as such, is fairly scalable. However, the intrinsic assumption that fan-in/fan-out neighbors are also physically adjacent means that multiple injected transients will always interact and therefore have an artificially high rate of logical reconvergence, leading to lower error rates. As reported in [Ebrahimi et al. 2013], netlist-based fault injection can result in up to 36% inaccuracy.

More state-of-the-art work has been done to couple cell placement information with netlist information to run simulations using actual physical adjacency knowledge [Pagliarini et al. 2011, Kiddie 2012, Kiddie et al. 2013, Ebrahimi et al. 2013]. The Pagliarini work neglects possible pulse quenching behavior due to logical reconvergence, inducing error of 4-16% due to this simulation choice [Yankang et al. 2012]. In terms of physical adjacency, the Ebrahimi work is likely to have very close-to-realistic levels of logical reconvergence. However, both of these works assume that a particle strike will inject a bit-flip error at each affected node, rather than modeling the effects of charge sharing along a well.

Using standard cell placement information in conjunction with the netlist has been established as necessary for proper reliability testing of modern circuits, but this dissertation furthers the state of the art by adding n-well/p-substrate layout geometry to the simulation workflow. Simulation using this netlist-, placement-, and layout-inclusive method results in higher error propagation rates than those previously reported in [Ebrahimi et al. 2013, Pagliarini et al. 2011], which are based on a glitch-injection model. In SEMT analysis, injecting multiple glitches presents a higher chance of transient effects canceling out, while injecting with a set-high/set-low model has a lower and more realistic chance of transient cancellation.

IC Simulation Methods and Tradeoffs

Physical manufacture and radiation testing of IC designs produce an often untenable design turnaround time, resulting in the development of different modeling methods to characterize radiation response prior to fabrication. Towards the goal of providing SEMT characterization of intra-pipeline combinational logic blocks, it is necessary to evaluate and contrast different modeling-based means towards reaching these characterization results in order to determine a proper tradeoff of simulation accuracy and tenable runtimes.

Tools exist that have a more physics-based approach [Reed et al. 2013] to represent the interaction of radiation with a semiconductor and then propagate it as a pulse through the constructed circuit. Technology Computer-Aided Design (TCAD) refers to a 2D, 3D, or mixed-mode approach to solving the semiconductor equations in a unified manner for a very small system. Of particular interest to this dissertation, TCAD studies have provided a great amount of detail on the interaction of multiple transients and pulse quenching on a small scale [Ahlbin et al. 2009]. MRED is a Monte Carlo engine used to track energy deposition within specified volumes, built upon Geant4 physics [Agostinelli et al. 2003]. Physics-based models are generally limited to simulating single devices, short events, or few sensitive volumes, and they can provide a lot of detail. However, they are unrealistic to apply to a large IC because of their long runtimes, and other approximated methods are deemed necessary [Black et al. 2013].

Capturing the behavior of a logically-diverse, intra-pipeline combinational block under normal operation may best be accomplished through modeling by analytical simulation or functional verification [Quinn et al. 2013]. SPICE is valuable for the simulation of a small number of devices, understanding the behavior and shape of transient pulses, or observing the propagation and interaction of pulses after they have been generated in a circuit. Further abstracting, purely analytical, user-built models can offer high levels of speed and can be conjoined with fault injection methods for radiation characterization, but they may lack much detail depending on the assumptions made.

To understand the impact of multiple transients on a large combinational logic system, as in this dissertation, an abstracted simulation methodology is necessary that maintains accuracy on critical simulation points such as masking factors, timing, and circuit performance, but still provides a full characterization of the circuit with a tenable simulation time. Synopsys VCS (Verilog Compiler Simulator) provides a functional testbench for logical simulation of the necessary details with a good level of performance [Synopsys 2013]. VCS compiles a design netlist with a technology library and post-synthesis timing information, along with adaptable minimum pulsewidth propagation requirements. The compiled executable can be used with scripts for circuit stimulation and fault injection at selected nodes, and the output results can be tabulated, compared, and used to generate detailed parasitics and power data for future design verification and methodology modification.

With the versatility of Synopsys VCS, the SEMT characterization methodology presented in this dissertation has been used to model SEMT behavior at a variety of technology nodes, including Synopsys 90-nm and 28/32-nm bulk libraries [Synopsys 2012] and a generic 65-nm bulk library [Sika et al. 2013], and is also used in Chapter VI to generate data for initial detailed circuit performance measurement experiments with PrimeTime [Synopsys 2015].

Transient Masking Factors in Logic

In combinational logic, three masking factors may mask transients before they propagate to the output of a circuit: logical masking, electrical masking, and latch-window masking [Diehl et al. 1983]. Some of these have been observed to have less effect in more modern technologies, yet all three must still be considered and accounted for in validating an SEMT modeling method.

Logical masking describes how a transient may be masked by the lack of a sensitive path between the strike location and the output. For example, if one input of an AND gate is 0, then any transient pulse on the other input is masked – the output of the gate is still 0, and the transient will not affect the output. Logical masking does not change as technology scales, and is captured in our simulation using Synopsys VCS and analyzing different masking by using different input vectors.
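The AND-gate example above can be sketched in a few lines of Python. This is only an illustration of the masking condition; the function names are hypothetical and are not part of the VCS-based flow used in this work.

```python
def and_gate(a: int, b: int) -> int:
    """Two-input AND gate on logic values 0/1."""
    return a & b


def transient_is_masked(side_input: int, struck_value: int) -> bool:
    """True if inverting the struck input leaves the AND output unchanged."""
    fault_free = and_gate(side_input, struck_value)
    faulty = and_gate(side_input, struck_value ^ 1)  # transient inverts the struck input
    return fault_free == faulty


# Enumerate all side-input / struck-input combinations.
for side in (0, 1):
    for struck in (0, 1):
        print(f"side={side} struck={struck} masked={transient_is_masked(side, struck)}")
```

Whenever the side input is 0 the transient is masked, regardless of the value held at the struck input; with the side input at 1 the path is sensitized and the transient reaches the gate output.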

Traditionally in logic simulation, input vectors for internal fault coverage are either generated by automated test pattern generation (ATPG) tools, such as Synopsys TetraMAX [Synopsys 2010], or enumerated by exhaustive simulation. However, ATPG vectors are useful only for single-fault models or manufacturing defects, and exhaustive simulation does not scale to larger circuits. Accurately capturing the possible inputs to a circuit is a required step to understanding its SER vulnerability [Rezaei et al. 2014]. An alternative uses the principles of Monte Carlo statistical coverage to capture the effects of logical masking with high confidence and a lower simulation time than exhaustive testing.
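As a minimal sketch of the Monte Carlo idea, the following Python fragment estimates how often a struck node propagates to the output of a toy circuit, first exhaustively and then from random input vectors. The circuit, its width, and all names are assumptions made purely for illustration; they do not correspond to any circuit in this dissertation.

```python
import random
from itertools import product

N_INPUTS = 10  # hypothetical toy circuit width


def propagates(bits) -> bool:
    """Toy circuit: out = (x0 AND x1) OR (x2 AND x3); the struck node is (x0 AND x1).

    Returns True if inverting the struck node changes the output,
    i.e. the transient is not logically masked for this input vector.
    """
    node = bits[0] & bits[1]
    side = bits[2] & bits[3]
    return (node | side) != ((node ^ 1) | side)


def exhaustive_rate() -> float:
    """Propagation rate over all 2**N_INPUTS input vectors."""
    vectors = list(product((0, 1), repeat=N_INPUTS))
    return sum(propagates(v) for v in vectors) / len(vectors)


def monte_carlo_rate(n_samples: int, seed: int = 0) -> float:
    """Propagation rate estimated from uniformly random input vectors."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        vector = [rng.randint(0, 1) for _ in range(N_INPUTS)]
        hits += propagates(vector)
    return hits / n_samples


print("exhaustive :", exhaustive_rate())
print("monte carlo:", monte_carlo_rate(5000))
```

The exhaustive loop grows as 2^n with the number of inputs, whereas the sampled estimate uses a fixed number of vectors, which is the scalability argument made above.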

As a case study, we compared a Monte Carlo method, exhaustive simulation, and TetraMAX vectors (internal and random patterns for defect detection) on circuits from the classic 74XXX series, which are small enough to enable exhaustive methods. A basic SEMT injection process was applied in the same manner to each circuit, with one particle strike modeled per simulation cycle. The number of simulation cycles is bounded by the number of input vectors of each method. Table IV-3 lists the sizes of these small circuits, the runtime of each method, and the speedup and error of each method relative to exhaustive simulation as the gold model.

Table IV-3. Case study of random input vector selection and ATPG methods normalized and compared to exhaustive testing, contrasting speedup and error of a Monte Carlo-based approach.

<table>
<thead>
<tr>
<th>Circuit #</th>
<th>Gates</th>
<th>Inputs</th>
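<th>Exhaustive (runtime h:mm:ss, speedup, error)</th>
<th>Tmax internal (runtime h:mm:ss, speedup, error)</th>
<th>Tmax random (runtime h:mm:ss, speedup, error)</th>
<th>10% Random (runtime h:mm:ss, speedup, error)</th>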
</tr>
</thead>
<tbody>
<tr>
<td>74182</td>
<td>19</td>
<td>9</td>
<td>0:31:07 1 0.0%</td>
<td>0:00:42 44 2.9%</td>
<td>0:00:55 34 3.5%</td>
<td>0:01:47 17 0.2%</td>
</tr>
<tr>
<td>74283</td>
<td>36</td>
<td>9</td>
<td>0:36:58 1 0.0%</td>
<td>0:01:36 23 1.8%</td>
<td>0:01:00 37 2.6%</td>
<td>0:02:26 15 2.7%</td>
</tr>
<tr>
<td>74L85</td>
<td>33</td>
<td>11</td>
<td>6:31:56 1 0.0%</td>
<td>0:09:45 40 19.3%</td>
<td>0:09:26 42 15.4%</td>
<td>0:48:04 8 2.2%</td>
</tr>
<tr>
<td>74181</td>
<td>61</td>
<td>14</td>
<td>53:21:56 1 0.0%</td>
<td>0:14:55 215 2.7%</td>
<td>0:16:25 195 2.0%</td>
<td>6:03:02 9 1.0%</td>
</tr>
</tbody>
</table>

Compared to the exhaustive methodology, Monte Carlo-based random simulation has a high degree of input vector coverage: in this small dataset it averages 1.5% error with speedups of 8x to 17x (roughly 12x on average), providing significant scalability with little loss of simulation accuracy. Studies in the literature have produced similar results [Asadi et al. 2012].
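The summary figures for the Monte Carlo approach can be checked directly from the Exhaustive and 10% Random columns of Table IV-3. The short Python sketch below recomputes the per-circuit speedups and the mean error from the tabulated runtimes; the variable names are illustrative only.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss runtime string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# circuit: (exhaustive runtime, 10% random runtime, 10% random error in %), from Table IV-3
rows = {
    "74182": ("0:31:07", "0:01:47", 0.2),
    "74283": ("0:36:58", "0:02:26", 2.7),
    "74L85": ("6:31:56", "0:48:04", 2.2),
    "74181": ("53:21:56", "6:03:02", 1.0),
}

speedups = {
    name: to_seconds(exhaustive) / to_seconds(sampled)
    for name, (exhaustive, sampled, _) in rows.items()
}
mean_speedup = sum(speedups.values()) / len(speedups)
mean_error = sum(error for _, _, error in rows.values()) / len(rows)

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.1f}x")
print(f"mean speedup {mean_speedup:.1f}x, mean error {mean_error:.3f}%")
```

The recomputed speedups round to the values listed in the table, and the mean error of the four circuits is about 1.5%.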
Electrical masking describes the possible attenuation of a pulse by propagating through gates on the way to an output, due to the electrical properties of those gates. Generally, so long as a transient pulse is longer than the combined rise and fall time of a typical gate delay in a circuit, it is likely to propagate unattenuated [Massengill and Tuinenga 2008]. As technology scales, the characteristic time constant decreases, and narrower SET pulses propagate more freely.

It has been predicted [Shivakumar et al. 2002] that electrical masking becomes much less effective in more recent technologies. Other studies [Cavrois et al. 2008] have shown that, while the physical mechanisms behind electrical masking typically attenuate a transient, this is not always true; they can even broaden it. Therefore, accurate modeling of electrical masking using gate information and pulse propagation is still required in order to observe any possible pulse quenching or broadening.

Lastly, latch-window masking refers to the fact that a transient that propagates to the output of a combinational logic block must be present for the setup and hold time of the output flip-flop in order to actually be latched into memory. Simulating circuits with accurate timing information is critical to understanding the contribution of particular nodes to the system’s SER [Krishnaswamy et al. 2008]. In this dissertation, purely combinational logic is analyzed in order to understand and help mitigate SEMT behavior in intra-pipeline designs; but to capture this timing information, the duration of injected pulses and observed output pulses are recorded and compared. As discussed in [Mahatme et al. 2011], latch-window masking is less of a concern with the increasing operating speeds of newer technologies, since induced transients may now last further beyond setup-and-hold times, and in some extreme cases, potentially even for multiple clock cycles [Narasimham 2008].
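The latching condition described above can be written as a simple interval check. The sketch below is a simplified illustration, assuming a single capturing clock edge and hypothetical setup and hold times; none of the numeric values are taken from the libraries used in this work.

```python
def is_latched(pulse_start_ps: float, pulse_width_ps: float,
               clock_edge_ps: float, setup_ps: float, hold_ps: float) -> bool:
    """True if the transient is present for the entire setup-and-hold window."""
    window_start = clock_edge_ps - setup_ps
    window_end = clock_edge_ps + hold_ps
    pulse_end = pulse_start_ps + pulse_width_ps
    return pulse_start_ps <= window_start and pulse_end >= window_end


# Hypothetical flip-flop: 30 ps setup, 20 ps hold, capturing edge at 1000 ps.
print(is_latched(900, 250, 1000, 30, 20))  # long pulse spans the window
print(is_latched(900, 64, 1000, 30, 20))   # short pulse ends before the window opens
```

Under these assumed numbers, the 250 ps pulse covers the whole window and is captured, while the 64 ps pulse arriving at the same time is latch-window masked.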
In this dissertation, simulations are conducted in Synopsys VCS with input vector assignments synchronized to a standard clock, and output values are logged and analyzed to observe errors and duration of errors at each output corresponding to particular test parameters.

SEMT Modeling

Two different approaches for SEMT modeling are developed and utilized in this dissertation. For a logic-based perspective of multiple transient interaction in a circuit, the first simulation method compares SET injection to discrete multiple-transient injection. In other words, a given circuit is characterized under the assumption that every modeled particle strike generates a fixed number of transients (1, 2, 3, 4, or 5). This perspective is useful for studying logic composition, logic depth, and function of a particular design.

To optimize towards a more physical design-based perspective, the second simulation method compares SET injection to transients injected in cells selected by multiple radiation event radii. This perspective is useful for work presented in upcoming chapters on physical design, so as to understand the impact of rearranging placement of cells that are different sizes.

In simulating the ability of a single particle strike to induce multiple transients in combinational logic, several particular physical phenomena are important to consider. With modern and upcoming technologies, standard cells are small enough that a single particle may deposit charge that encompasses the area of multiple cells, as discussed in earlier chapters. In [Harada et al. 2011], at the 65-nm technology node, up to 6 inverter cells could experience transient pulses due to a single particle strike. In [Black et al. 2008], multiple upsets in memory could be seen at an even higher quantity. Although particle size does not change over time within the same environment, the scaling of technology means that sensitive areas of standard cells are now smaller and require less charge to upset, and this trend is expected to continue as the technology continues to shrink. Recent studies have even indicated the need for a new term coined “effective sensitive area” [Chen et al. 2014], which further incorporates potential for charge sharing between devices.

In the previously quoted cases, as well as in [Tipton et al. 2008], the mechanism observed of multiple induced transients or upsets is commonly termed “well collapse”, where a particle strike deposits sufficient charge in a shared well such that several adjacent cells using the same well are also affected in the same way. Traditional fault injection studies inject transients or upsets as glitches, by simply inverting the value held at a node. When injecting multiple transients, however, the physical mechanism of well collapse dictates that these multiple transients will tie the affected nodes all to 1 or all to 0. This behavior is captured by our method of choosing to inject either all set-high transients or all set-low transients in a particular simulation run.
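The difference between glitch injection and the set-high/set-low model can be illustrated on a toy case in which two struck nodes reconverge at a single XOR gate. The Python sketch below is an illustration under that assumption only; it is not the injection code used in this work.

```python
from itertools import product


def xor_gate(a: int, b: int) -> int:
    return a ^ b


def glitch(values):
    """Glitch model: invert the value held at every struck node."""
    return tuple(v ^ 1 for v in values)


def set_high(values):
    """Well-collapse model: tie every struck node to logic 1."""
    return tuple(1 for _ in values)


def set_low(values):
    """Well-collapse model: tie every struck node to logic 0."""
    return tuple(0 for _ in values)


def error_count(inject) -> int:
    """Count held-value combinations for which the XOR output is corrupted."""
    errors = 0
    for held in product((0, 1), repeat=2):
        if xor_gate(*inject(held)) != xor_gate(*held):
            errors += 1
    return errors


for model in (glitch, set_high, set_low):
    print(model.__name__, error_count(model), "of 4 cases corrupt the output")
```

In this toy case the two glitches always cancel at the reconvergent gate, whereas tying both nodes to the same rail corrupts the output in half of the cases, which is the direction of the effect described above.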

After full placement and routing of a circuit, the extracted DEF file is analyzed using an automated script to determine which cells are locally adjacent along and across each power rail for each standard cell in a design. Traditionally, studies have used just the netlist for multiple-transient injection, choosing to inject transients in random locations in a circuit [Casey et al. 2008], or in gates that are fan-in/fan-out neighbors [Miskov-Zivanov and Marculescu 2010]. However, as discussed earlier in this chapter, the layout must be used for both physical cell adjacency and rail location information so that transients can be injected using a common set-high/set-low model.

For each circuit, 100,000 gates are selected for SET injection, then SEMT injection with 2, 3, 4, and up to 5 adjacent transients in each given physical location, with a total of 500,000 tests run per circuit per layout. For SEMT injection, each cell in the design is given a list of neighboring and locally close cells across each rail – VDD as a set-high transient possibility, VSS as set-low. Figure III-2 demonstrated some examples of how a group of cells may be chosen for 2-, 3-, 4-, or 5-transient SEMT injection. A particle strike in a circuit will deposit charge in the well of that region, potentially elongated depending on the angle of the strike, causing well collapse, which is modeled by this cell selection process.

**ALGORITHM 1.** SEMT Transient Injection

**Input:** Standard 28/32nm gate library, circuit DEF files (G, PG, MPG), simulation parameters

**Output:** Scripts for 500,000 radiation events, x3 for each circuit

- Load Synopsys 32nm library LEF, parse text data
- Build database with all standard cells and associated sizes
- **for each in** [G, PG, MPG] **do**
  - Load DEF file, parse text data
  - Build database with each component, x- and y-coordinates, orientation
  - Update with x-right and left coordinates using 28/32nm library lookup
  - Identify VDD and VSS rails with orientation and y-coordinates
  - **for each in** [components] **do**
    - Identify adjacent, physically close cells on same row
    - Identify cells across shared VSS, VDD rails separately
    - Create lookup tables for set-hi and set-lo injection
  - **end**
- **end**
- Collect user parameters: #, length of transients to inject per test, # of tests to run, simulation speed (period)
- **for each in** [G, PG, MPG] **do**
  - Start new set of TCL files
  - Write simulation initialization commands
  - **for** [# of tests to run] **do**
    - Begin simulation cycle
    - Select gate from netlist at random, select VDD or VSS at random
    - Identify adjacent gates across/along well from lookup tables
    - Inject specified # of transients for specified length
    - Complete simulation cycle
  - **end**
- **end**

In order to characterize SEMT behavior of the same circuit with different placement alternatives and compare them more directly, fault injection scripts for one or more placement alternatives are generated at the same time, as detailed in Algorithm 1 for three simultaneous characterizations. Note that simulation parameters are provided to this algorithm, allowing the
circuit designer to indicate: (1) how many tests are to be run at each SET/SEMT injection level, (2) how many transients are to be injected at that level, and (3) the injected transient pulsewidth.
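The lookup-table and selection steps of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration, not the script used in this work: the `Cell` record, the rail convention in `row_across`, the nearest-by-x-center ordering, and the example coordinates are all assumptions, and DEF/LEF parsing is omitted.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    row: int       # placement row index (hypothetical)
    x_left: int    # left edge in nm
    x_right: int   # right edge in nm

    @property
    def x_center(self) -> float:
        return (self.x_left + self.x_right) / 2


def row_across(row: int, rail: str) -> int:
    """Assumed rail convention: even rows share VDD with the row above and
    odd rows with the row below; VSS sharing is the opposite."""
    shares_up = (row % 2 == 0) if rail == "VDD" else (row % 2 == 1)
    return row + 1 if shares_up else row - 1


def neighbor_table(cells, rail):
    """For each cell, list cells in its row or across the given rail, nearest first."""
    table = {}
    for c in cells:
        rows = {c.row, row_across(c.row, rail)}
        near = [o for o in cells if o.name != c.name and o.row in rows]
        near.sort(key=lambda o: (abs(o.x_center - c.x_center), o.name))
        table[c.name] = [o.name for o in near]
    return table


def pick_event(cells, tables, n_transients, rng):
    """Pick a random struck cell and rail; return the cell group and driven value."""
    target = rng.choice(cells)
    rail = rng.choice(["VDD", "VSS"])
    group = [target.name] + tables[rail][target.name][: n_transients - 1]
    return group, (1 if rail == "VDD" else 0)  # VDD: set-high, VSS: set-low


cells = [
    Cell("U1", 0, 0, 600), Cell("U2", 0, 600, 1500),
    Cell("U3", 1, 0, 900), Cell("U4", 1, 900, 1500),
    Cell("U5", 2, 0, 1500),
]
tables = {rail: neighbor_table(cells, rail) for rail in ("VDD", "VSS")}
group, value = pick_event(cells, tables, n_transients=3, rng=random.Random(1))
print(group, "driven to", value)
```

Building one table per rail keeps the set-high and set-low candidate groups separate, so that every cell in an injected group shares the collapsed well with the struck cell.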

Injected transient pulsewidth is initially chosen to be long enough to meet minimum propagation requirements as in [Massengill and Tuinenga 2008], and full sets of transient injection simulations are run using transient durations of 500 ps, 250 ps, 125 ps and 64 ps, to study any possible effect of pulsewidth on propagation. When a particle strikes an IC, it is likely to produce multiple transients of different pulsewidths, dependent upon factors including gate drive strengths, load, values held at a node, the distance between a node and the particle strike center, the angle of the strike, and the energy. For a clear view into the impact of logic and placement on SEMT vulnerability of a circuit, simulations completed for this dissertation generally inject multiple transients all of the same duration. However, some simulations have been run where a maximum specified transient pulsewidth is induced at the affected cell at the center of a strike, with scaled-down pulsewidths impacting adjacent or nearby cells, and later results will demonstrate how this does not significantly affect the identification of vulnerable nodes or logic paths.

In addition to these transient injection scripts, which were used: (1) to characterize for SEMT behavior, (2) to test transient pulsewidth distribution behavior, and (3) to compare placement alternatives, transient injection scripts were also generated to compare our method against methods of random glitch injection [Casey et al. 2008] and layout-based glitch injection [Ebrahimi et al. 2013, Pagliarini et al. 2011].

A. SEMT Optimized for Physical Design

While the method described in the prior section is useful for understanding logic connections in a combinational netlist and their impact on sensitivity to radiation events, further optimization of the method was performed to evaluate the effect of physical design on a circuit’s radiation sensitivity. This optimization of the model is particularly relevant for the simulation experiments presented in Chapter VII.

**ALGORITHM 2.** SEMT Transient Injection, Physical Design Optimization

**Input:** Standard 28/32nm gate library, circuit DEF files (G, PG, MPG), simulation parameters

**Output:** Scripts for rasterscan of radiation events, x3 for each circuit

- Load Synopsys 32nm library LEF, parse text data
- Build database with all standard cells and associated sizes
- **for each in** [G, PG, MPG] **do**
  - Load DEF file, parse text data
  - Build database with each component, x- and y-coordinates, orientation
  - Update with x-right and left coordinates using 28/32nm library lookup
  - Identify VDD and VSS rails with orientation and y-coordinates
  - **for each in** [radiation event radii] **do**
    - row = 0
    - **while** row < rowmax **do**
      - xcord = xmin - radius/2
      - **while** xcord < xmax + radius/2 **do**
        - Identify cells in row, across VSS rail, within radius; add to array
        - Identify cells in row, across VDD rail, within radius; add to array
        - xcord = xcord + 100
      - **end**
      - row = row + 1
    - **end**
  - **end**
- **end**
- Collect user parameters: length of transients to inject per test, simulation speed (period)
- **for each in** [G, PG, MPG] **do**
  - Start new set of TCL files
  - Write simulation initialization commands
  - **for** [length of VSS array] **do**
    - Begin simulation cycle
    - Inject transient drive to 0 at all cells in array[x] for specified length
    - Complete simulation cycle
    - x = x + 1
  - **end**
  - **for** [length of VDD array] **do**
    - Begin simulation cycle
    - Inject transient drive to 1 at all cells in array[x] for specified length
    - Complete simulation cycle
    - x = x + 1
  - **end**
- **end**
From a physical design standpoint, Algorithm 1 is simply modified to utilize various radiation event radii, to investigate the effect of different particle energies on a circuit. The general flow of this method is shown in Algorithm 2. Rather than injecting transients in a cluster centering around each cell progressing through the netlist, a raster scan of the circuit is performed, with a scaled overlap of the core area in order to meter edge-related effects on the simulation results. At each 100-nm step along each row, all cells within the specified radius across each well are added to an array. In this way, the circuit layout is covered by a fine mesh of simulated radiation strikes, which results in more strikes affecting larger gates and fewer strikes affecting smaller gates, an effect not considered from the logic perspective.
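A one-row version of the raster scan can be sketched in Python to show why wider cells receive more strikes. This is a simplified illustration: it treats distance along the row only, ignores the rail-based grouping, and uses hypothetical cell names and dimensions.

```python
from collections import Counter


def raster_scan_row(cells, x_min, x_max, radius, step=100):
    """Scan one row in fixed steps (nm) and list the cells hit by each strike.

    cells: list of (name, x_left, x_right) tuples, coordinates in nm.
    A cell counts as hit when the strike x-coordinate lies within
    `radius` of the cell's extent (1-D simplification).
    """
    strikes = []
    x = x_min - radius / 2              # start with an overlap beyond the core edge
    while x < x_max + radius / 2:
        hit = [name for name, x_left, x_right in cells
               if x_left - radius <= x <= x_right + radius]
        strikes.append(hit)
        x += step
    return strikes


# Hypothetical row: a narrow inverter next to a wide 4-input NAND.
row = [("INV", 0, 300), ("NAND4", 300, 1500)]
strikes = raster_scan_row(row, x_min=0, x_max=1500, radius=200)
hits = Counter(name for group in strikes for name in group)
print(len(strikes), "strikes;", dict(hits))
```

With these assumed dimensions the wide cell is included in roughly twice as many strike groups as the narrow one, so its sensitivity is weighted by area, in contrast to the per-cell selection of Algorithm 1.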

To create the simulation TCL script, transient pulsewidths (of same width or of scaled pulsewidth based on cell distance from the raster scan coordinate) are injected at each cell in an array, driving the outputs of each cell to the specified shared rail. A full raster scan is simulated multiple times, either by: (1) varying inputs during the scan in order to quickly characterize for SEMT behavior, or (2) using a static set of input vectors (i.e., one vector used for each full raster scan) in order to compare and contrast placement strategies, which eliminates variance in the results originating from differences in logically-activated paths. For comparisons of different placements (here labeled G, PG, MPG, discussed in Chapter VI), a static set of 200 input vectors is generated for each ISCAS85 circuit, to provide a reasonable balance of runtime and logical coverage. Each raster scan simulates quickly and multiple scans can be performed in parallel, thereby enabling speedy characterization.
B. Characterization and Error Propagation Probability Reporting

Once each circuit has proceeded through the EDA flow and TCL script generation, a VHDL testbench is created to simulate the circuit, back-annotated with the SDF timing information and parasitics and utilizing a Monte Carlo method of input value assignments throughout the simulation. Within the VHDL wrapper, the Verilog design under test is called twice, once for transient injection tests, and once as a “Golden” circuit, to be simulated in parallel without injected transients, in order to enable a direct comparison and capture of all fault behavior. The VHDL wrapper also handles simulation initialization and clock switching at the specified period, with fresh input conditions applied to each circuit at the beginning of each clock period.

In this research, Synopsys VCS is operated in a detailed timing mode to allow for a balance between full circuit simulation and the capture of electrical masking and timing information – not as detailed as a full SPICE simulation, but maintaining relevant information for a circuit-level quantification of results, while running significantly more quickly. Post-P&R, an executable is compiled for each design of interest, with the specified design library (Synopsys 28/32-nm bulk CMOS), a constructed VHDL testbench, an SDF file for the unit under test, and compilation parameters set for +transport_path_delays +pulse_e/0 +pulse_r/0 -sdf. As these are intra-pipeline combinational circuits, timing-annotated simulations with Synopsys VCS run fairly speedily and are easily parallelizable, completing 500,000 tests for a full SEMT characterization in roughly 1-1.5 hours for most circuits shown in Table IV-1 on a single Intel Xeon 2.8 GHz core, or in less than 5 minutes on a 32-core server.

Results are (1) collected in proprietary VPD (Value Change Dump Plus) form, (2) translated to VCD (Value Change Dump) for readability, then (3) processed for both overall circuit responses and differences in individual logic path behavior. Post-simulation, analyzing the results is primarily concerned with two parameters: (1) the number of tests (combination of input vectors and transient injections) that result in an error at a given output and (2) the duration of these errors as compared to the duration of the injected transient(s). Algorithm 3 covers the basic process in simplified form.

**Algorithm 3. VCD Transient Results Aggregator**

**Input:** Synopsys VCS VCD output, simulation parameters  
**Output:** # of errors, duration of errors for each output  
Initialize all error counts to 0

**for each file in [*.VCD] do**

**for line in file do**

  Update timestamp, inputs A, B, outputs S_GLD, S_TST
  Bitwise compare S_GLD to S_TST
  if S_GLD[x] != S_TST[x] then
    errorcount[x]++
    Flag errorcount[x] for transient measurement
  if S_GLD[x] == S_TST[x] then
    End transient measurement for errorcount[x], store
    Sum errors, durations
  end

**end**

Write aggregate data to report
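
A minimal Python sketch of the comparison loop in Algorithm 3 is given below, under stated assumptions. VCD parsing is omitted: the samples are assumed to have already been extracted as `(timestamp_ps, S_GLD, S_TST)` tuples with the output bits packed into integers. The function name `aggregate_errors` is hypothetical, and counting each error once at its onset (rather than on every mismatching line) is a choice made for this sketch.

```python
def aggregate_errors(samples, n_outputs):
    """Count errors and measure their durations for each output bit.

    samples: iterable of (timestamp_ps, s_gld, s_tst), ordered by time,
             where s_gld and s_tst are the golden and test output words.
    Returns (error_count, durations), both indexed by output bit.
    """
    error_count = [0] * n_outputs
    durations = [[] for _ in range(n_outputs)]
    error_start = [None] * n_outputs  # onset time of an error in progress

    for timestamp, s_gld, s_tst in samples:
        diff = s_gld ^ s_tst  # bitwise compare of golden and test outputs
        for x in range(n_outputs):
            mismatch = (diff >> x) & 1
            if mismatch and error_start[x] is None:
                # New error at output x: count it and start timing it
                error_count[x] += 1
                error_start[x] = timestamp
            elif not mismatch and error_start[x] is not None:
                # Outputs agree again: close the measurement and store it
                durations[x].append(timestamp - error_start[x])
                error_start[x] = None
    return error_count, durations


# Toy trace: a 250-ps error on bit 0, then a 100-ps error on bit 1.
trace = [
    (0, 0b00, 0b00),
    (100, 0b01, 0b00),
    (350, 0b01, 0b01),
    (400, 0b10, 0b00),
    (500, 0b10, 0b10),
]
counts, widths = aggregate_errors(trace, n_outputs=2)
```

An error that is still in progress at the end of a trace is counted but has no stored duration in this sketch.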

Results can be examined individually to see if particular input vector combinations or transient injection combinations are more prone to errors than others, or in aggregate in order to more succinctly study each circuit. Algorithm 3 does not produce a soft error rate (SER) or cross-section as is seen in common literature reporting of experimental results; as primarily a study on logic propagation and circuit design, this dissertation reports the ratio of errors detected to simulations run as an error propagation probability (EPP):

\[
EPP = \frac{\sum \text{input vector and transient injection combinations resulting in an error at the output}}{\sum \text{all simulated input vector and transient injection combinations}} \tag{1}
\]

in a manner similar to that of [Limbrick et al. 2011].
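
As a worked instance of Equation (1), the short Python sketch below computes a per-output EPP from error counts. The output names and error counts are invented purely for illustration; only the total of 500,000 tests echoes the characterization size mentioned earlier in this chapter.

```python
def error_propagation_probability(error_tests, total_tests):
    """EPP = tests with an error at the output / all simulated tests."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return error_tests / total_tests


# Hypothetical error counts per output for one characterization run.
total_tests = 500_000
errors_at_output = {"S[0]": 12_500, "S[1]": 40_000}

epp_per_output = {
    name: error_propagation_probability(count, total_tests)
    for name, count in errors_at_output.items()
}
```

With these illustrative numbers, `S[0]` has an EPP of 2.5% and `S[1]` has an EPP of 8%.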
Collecting the EPP is important for a logical evaluation of the system; examining the duration of transient errors at the output provides a look into which outputs exhibit more pulse quenching or pulse broadening. As an exercise to determine if the duration of injected transients is a significant parameter, injected durations were varied, and the output results can be examined to see if there is a sizeable trend.

Comparison to Previous Models

Previous works that used only the logical netlist for simulation result in an over- or under-estimation of charge sharing effects. If multiple transients are injected that are fan-in/fan-out neighbors as in [Miskov-Zivanov and Marculescu 2010], then these transients are guaranteed to always be logically related, will interact, and have a higher-than-typical chance of canceling out and giving lower error rates. As reported in [Ebrahimi et al. 2013], netlist-based fault injection can result in up to 36% inaccuracy. In contrast, multiple transients could be injected at random nodes without concern for netlist relation; in this case, as in [Casey et al. 2008], selected nodes are much less likely than typical to be logically related, so there is a lower chance of canceling out, and thus higher error rates are produced. [Pagliarini et al. 2011] reports that random SEMT injection can cause up to 40% inaccuracy.

When two or more transients are caused by a particle strike in an IC, these transients are due to a well collapse that ties adjacent gates to VDD or VSS. Simulations from this dissertation that capture this mechanism result in higher error propagation rates than those previously reported in [Ebrahimi et al. 2013, Pagliarini et al. 2011], which are based on a glitch-injection model. In SEMT analysis, injecting multiple glitches presents a higher chance of transient effects canceling out, while injecting with a set-high/set-low model has a lower and more realistic chance of transient cancellation.

Scripts were created in a similar manner to this dissertation’s SEMT analysis scripts, to emulate previous research on glitch injection and random node selection. (1) The simplest script, “Random/Glitch,” models an SEMT event by analyzing the netlist and randomly selecting the specified number of nodes, then writes a command to invert the value held at that node during the test cycle. (2) Including placement information, “Layout/Glitch” selects a node at random, analyzes the layout to determine which nodes are adjacent, and inverts the value held at a specified number of nodes within that list during the test cycle. (3) For a common well, charge-sharing-based model, “Random/Well” analyzes the netlist and randomly selects the specified number of nodes, then shunts the value of those nodes all to 1 or all to 0 (randomly selected) during the test cycle. (4) The last model, “Layout/Well” is this dissertation’s model, already described.
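
The four models differ along two axes: how nodes are selected (random or layout-adjacent) and what is injected (an inversion or a shared rail value). A minimal Python sketch of these two axes follows. The function names, the dictionary-based netlist and adjacency representations, and the decision to include the seed node in the layout neighborhood are assumptions of this sketch, not a description of the actual scripts.

```python
import random


def select_nodes(nodes, adjacency, count, use_layout, rng):
    """Pick the nodes to disturb, either at random or from a layout neighborhood."""
    if not use_layout:
        return rng.sample(nodes, count)
    seed = rng.choice(nodes)
    # Assumption: the seed node itself is included along with its neighbors.
    neighborhood = [seed] + list(adjacency[seed])
    return neighborhood[:count]


def inject(values, targets, model, rng):
    """Return a copy of the node values with the chosen fault model applied."""
    disturbed = dict(values)
    if model == "glitch":
        for node in targets:
            disturbed[node] ^= 1      # invert the value held at the node
    elif model == "well":
        rail = rng.randint(0, 1)      # all targets are tied to the same rail
        for node in targets:
            disturbed[node] = rail
    else:
        raise ValueError(f"unknown model: {model}")
    return disturbed


# Toy example: four nodes placed in a single row, a-b-c-d.
rng = random.Random(0)
values = {"a": 0, "b": 1, "c": 1, "d": 0}
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

layout_well = inject(
    values, select_nodes(list(values), adjacency, 2, True, rng), "well", rng
)
random_glitch = inject(
    values, select_nodes(list(values), adjacency, 2, False, rng), "glitch", rng
)
```

Here `layout_well` corresponds to the "Layout/Well" combination and `random_glitch` to "Random/Glitch"; the other two models follow by swapping the arguments.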

Figure IV-3. Comparison of SEMT simulation methods on the ISCAS85 circuit c3540. An average error rate for the circuit is shown using each method of glitch injection or high/low injection, using layout information for adjacency selection or random selection.

Figure IV-3 shows a comparison of the differences between using layout information versus random selection in choosing multiple transients, and the differences between injecting multiple glitches versus injecting multiple set-low transients or multiple set-high transients. There is a clear asymptotic behavior of layout-based methods as the number of injected transients increases; this kind of testing captures the instances where injected transients are logically related and overlap or cancel out, lowering error rates. These cases may actually be encouraged through changing the placement, as discussed in subsequent chapters.

It should also be clear that the “Layout/Glitch” method, here performed similarly to that in [Ebrahimi et al. 2013, Pagliarini et al. 2011], attenuates more quickly (has a lower slope) than our proposed “Layout/Well” method, which more accurately captures charge sharing effects between physically adjacent cells. Transient interaction that results in cancellation occurs at a higher but less realistic rate in the “Layout/Glitch” method. Glitch-based injection methods will inject a transient at every nominated cell, while set-high/set-low-based methods will only see an error when the chosen injected value is the opposite of that currently held at the node(s), so glitch-based methods should have approximately double the error rate of high/low-based methods, assuming a 50/50 spread of 1’s and 0’s at nodes. Even accounting for this scenario, the “Layout/Glitch” method has up to an 11% difference from the “Layout/Well” method for larger numbers of transients injected.
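
The factor of two in this argument can be checked by exhaustive enumeration. The Python sketch below counts how many nominated nodes actually change value under each model, averaged over every equally likely node state (and, for the well model, both rail choices). It counts disturbed nodes only, not errors propagated to an output, and the function name is hypothetical.

```python
from itertools import product


def expected_disturbed_nodes(n_nodes):
    """Average number of nodes whose value changes, for glitch and well injection."""
    states = list(product((0, 1), repeat=n_nodes))

    # Glitch model: every nominated node is inverted, whatever it held.
    glitch = sum(n_nodes for _ in states) / len(states)

    # Well model: nodes are forced to one rail, so only nodes that held
    # the opposite value are disturbed. Average over both rail choices.
    well_total = sum(
        sum(1 for value in state if value != rail)
        for state in states
        for rail in (0, 1)
    )
    well = well_total / (2 * len(states))
    return glitch, well


glitch_avg, well_avg = expected_disturbed_nodes(3)
```

For three nominated nodes, the glitch model disturbs 3 nodes on average and the well model 1.5, which is the 2:1 ratio assumed in the comparison above.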

Conclusion

Experimental work [Harada et al. 2011, Evans 2016, Seifert et al. 2012] has demonstrated the increased sensitivity of modern combinational logic systems to single-event-induced multiple-transients (SEMT). Data shows that a particle collects charge primarily in a shared well, inducing transient shorts at physically adjacent devices. The SEMT characterization suite in this dissertation abstracts and captures this experimentally-observed behavior for a simulative model.

The new simulation method presented in this dissertation makes use of a series of automated scripts to provide an easy-to-use implementation, and the modular construction allows for parallelizable simulation, providing quick characterization runtimes. SEMT sensitivity characterization of combinational logic circuits is achieved with an 11% increase in accuracy over the previous state-of-the-art. Beyond reliability characterization, this script suite provides a valuable framework for studying SEMT reliability differences between circuits due to standard cell placement techniques.
CHAPTER V

SINGLE-EVENT MULTIPLE-TRANSIENT RESULTS

The backbone of this body of work is the SEMT injection and simulation methodology covered in the prior chapter. Products of this workflow include (1) general SEMT characterization of circuits from this method, (2) results from the physical design optimization of this method, and (3) investigation of some simulation parameter choices on timing.

SEMT Characterization Results

As an initial demonstration, Figure V-1 shows a general SEMT characterization of ISCAS85 circuit c1908. This is a 16-bit SEC/DED circuit, with control, parity, and data outputs. The circuit as synthesized to the 28/32-nm bulk CMOS library consists of 210 standard logic cells, and runs nominally at 476 MHz. Plotted on the y-axis is the output Error Propagation Probability (EPP), or the probability that a single event causing one or more transients (SET or SEMT) will result in an error present at that output of the combinational logic block, scaled by any modulation of the transient pulsewidth. As demonstrated, different outputs have different EPP, or sensitivity to radiation-induced errors, and EPP increases with larger degrees of SEMT, or more transients produced by a single event. The average EPP of the circuit is reported as a sum of errors observed at all outputs of the circuit, divided by the number of outputs of the circuit.
Traditional SET testing injects a single transient in a circuit at a time, corresponding to the “1” mark on the x-axis, for 1 injected transient per test. Generally, SET error propagation probabilities (EPP) are low, <2% errors seen per output on average, but they do not take into account any charge sharing or multiple transient effects that are likely in modern technologies. When 2, 3, 4, or 5 transients are caused by a single particle strike, the output EPP increases overall, as expected.

Some output EPPs may increase more than others with increasing SEMT severity (# of injected transients per test). Note that the outputs of c1908 generally follow one of two trends. The 16-bit OUT data output has primarily low error rates, all grouped near the bottom of the plot, while the miscellaneous outputs and the 6-bit SC output see higher error rates. This figure is reformatted to demonstrate this observation more clearly in Figure V-2. Analyses such as these are useful for determining which outputs are more prone to error, as well as which are more volatile under SEMT effects versus traditional SET testing.

Figure V-2. SEMT characterization of ISCAS85 circuit c1908, reconfigured to demonstrate SET/SEMT sensitivity of different functional categories of datapaths.

Note also that, as in [Ebrahimi et al. 2013, Pagliarini et al. 2011], error rates increase with SEMT testing over SET results, but error rates do not double going from 1 injected transient to 2, or triple from 1 to 3 – increases are asymptotic, and at a shallower angle than glitch injection testing would produce otherwise. These simulations take into account logical masking and electrical masking. Outputs are measured to give output error observability windows, as in latch-window masking.

Similar results are observed in other circuits. A variety of combinational logic circuits from the ISCAS85 suite were characterized for vulnerability to various degrees of SEMT, in order to study the impact logical function may have on SEMT vulnerability. Several multipliers of different scales from an OpenCores suite were also characterized, for further investigation into the effect that logic depth and increased circuit complexity may have on transient propagation and masking.
Figure V-3 shows SEMT characterization results for ISCAS85 circuit c2670, which is a 12-bit ALU and controller, densely synthesized into 370 logic gates. Larger circuits (measured by number of functional outputs) tend to have lower error probabilities, since a single event strike is fairly unlikely to affect the majority of logic paths. However, of course, error propagation probabilities are still high for some particular outputs; the two indicated in Figure V-3 are simple comparator outputs and are more vulnerable due to their low logic depth and therefore low transient masking probability.

![Figure V-3. SEMT characterization of ISCAS85 circuit c2670, a 12-bit ALU and controller with 370 standard logic cells as synthesized.](image)

Following are SEMT characterization results for several other ISCAS85 combinational logic circuits: c3540 is an 8-bit ALU with 22 outputs, synthesized to 549 gates; c5315 is a 9-bit ALU with 103 outputs, synthesized to 819 gates; and c7552 is a 32-bit adder/comparator with 56 outputs, synthesized to 869 gates.

Like c2670, c3540 has two outputs that are especially sensitive to SET or SEMT and that demonstrate a highly asymptotic EPP increase with increasing degree of SEMT. Vulnerable logic paths like these are particularly sensitive to radiation events, and the asymptotic behavior demonstrates that they are also particularly impacted by logical reconvergence, and therefore could likewise benefit from placement-based hardening.
Figure V-5. SEMT characterization of ISCAS85 circuit c5315, a 9-bit ALU with 819 standard logic cells as synthesized and 103 outputs.

Figure V-6. SEMT characterization of ISCAS85 circuit c7552, a 32-bit adder/comparator with 869 standard logic cells as synthesized and 56 outputs.
Though the circuits simulated in the ISCAS85 benchmark suite show SET and SEMT vulnerabilities of different combinational logic circuits, it is difficult to evaluate specific attributes that may impact the SET or SEMT vulnerability of each arithmetic function implemented in these circuits. The OpenCores suite of floating-point multipliers allows a closer look. Each multiplier produces a final output with 3 elements: a sign bit, an exponent, and a mantissa. For the 8-bit multiplier shown in Figure V-7, the exponent is 3 bits and the mantissa 4. For the 64-bit multiplier shown in Figure V-8, the exponent is 11 bits and the mantissa 52.

Figure V-7. SEMT characterization of floating-point multiplier circuit cf_fp_mul_c_3_4, an 8-bit multiplier with 295 standard logic cells as synthesized and 8 outputs.

In all four multiplier circuits simulated, the sign bit, determined by a one-bit logic path, has essentially the same EPP for SET and SEMT testing. The exponent is a complex but balanced function, such that the exponent bits all have approximately the same sensitivity. At a high modeled radiation event, 5-SEMT, the exponent of the 8-bit multiplier has a 38% EPP, and the exponent of the 64-bit multiplier has a 3.2% EPP. The difference in these two figures is due to the difference in size of the circuits; a radiation event in a large circuit is unlikely to affect many paths as compared to one in a small circuit, so the error rate given a random radiation event will be smaller in a large circuit than in a small one.

Figure V-8. SEMT characterization of floating-point multiplier circuit cf_fp_mul_c_11_52, a 64-bit IEEE-754 compliant multiplier with 20,717 standard logic cells as synthesized and 64 outputs.

Analyzing the data represented in Figure V-8 shows that the arithmetic function of the mantissa in these multipliers has more complex behavior that leads to a dynamic SEMT rate. The mantissa is not composed of balanced datapaths; as synthesized, some bits of the mantissa have shorter logic depths than others. The effect is such that the MSB, bit 51, has a lower EPP (2.3% at 5-SEMT) than the LSB (15.1% at 5-SEMT). The MSB has a greater logic depth, and therefore transients injected in this path have a higher chance of being masked as they propagate towards the output, hence the lower Error Propagation Probability.

Figure V-9 shows an aggregate collection of data for the floating-point multiplier circuits simulated in this dissertation. Individual output results are not shown here for brevity, but we see that larger circuits produce lower error rates than smaller circuits; this result is again due to the higher chances of logical masking and reconvergence in circuits with a higher logic depth, and because a random particle strike is less likely to affect a given output. More complex circuits, such as the ISCAS85 series, show that this observation also holds true when comparing outputs of similar functionality between circuits, but that complex circuits as a whole can have unique responses.

Figure V-9. Aggregate data for all 4 floating-point multipliers, showing average output error propagation probability (EPP) for simulations injecting a particular number of physically-adjacent transients per test.
The figures shown thus far focus on the logic-based approach to evaluating SEMT rates, which is also largely used as the basis for initial experiments on modifying standard cell placement in the following chapter. The penultimate chapter of this dissertation focuses on developing a standard cell placement algorithm specifically informed by physical design, and therefore the SEMT evaluation method was optimized to calculate the SET and SEMT vulnerability of a circuit by modeling radiation event radii. The following figures illustrate results generated using this kind of perspective.

Figure V-10. Demonstration of traditional SET error propagation probability (EPP) vs. SEMT EPP at different modeled radiation event radii. Average number of standard cells within the radius plotted on right axis. Data from characterization of cf_fp_3_4, 8-bit floating-point multiplier.

As plotted in Figure V-10, a traditional SET error rate methodology would report the SET propagation susceptibility of this circuit at 10.2%, but SEMT testing (performed here at radii of 100, 200, 500, 1000, and 1500 nm) shows that the error probability is higher when considering charge sharing effects, and continues to increase corresponding to an increasing radiation event radius. For this 32-nm technology node, even a small radiation event radius encompasses multiple cells, with larger radii extending up to 4 or more cells.

Figure V-11. SET/SEMT modeling for selected circuits of Table IV-1. The circuits cf_fp_5_10, cf_fp_3_4, and c6288 perform arithmetic functions; c1908, c2670, c3540, c5315, and c7552 are multi-function with controls.

Figure V-11 demonstrates that arithmetic circuits are more susceptible to error than multi-function decoders and ALUs, because a greater proportion of the circuit is sensitized in an average test. In all cases, it is clear that traditional SET testing under-reports error susceptibility, and that SEMT-induced error is an increasing concern (a) as cell sizes shrink and (b) as single-event charge collection increases (i.e., with higher-LET particle environments).
Simulation Parameter Experiments

As an exercise to see if the injected transient duration has a significant impact on propagation, all tests for the ISCAS85 circuits were replicated four times, with injected transient pulsewidths of 500 ps, 250 ps, 125 ps, and 64 ps. Each output of each circuit was observed for errors, and the durations of these errors were measured. Depending on the injected transient pulsewidth, some pulse broadening or quenching may occur [Cavrois et al. 2008]. Figures V-12 and V-13 show these results for SET testing (1 injected transient) and SEMT testing (5 injected transients). Quenching/broadening is plotted versus the circuit period, as there is a clear linear relationship between the two parameters.

Figure V-12. SET pulsewidth data for the ISCAS85 circuits tested. For these tests, one transient at a time was injected with a length of 500 ps, 250 ps, 125 ps and 64 ps, and the outputs were observed to measure the average error pulsewidth. The change in output transient pulsewidth is reported against the speed of the circuit.
For this technology library and these synthesized drive strengths, SET and SEMT testing give similar results. The 500-ps transients tend to be quenched slightly, to about 450 ps at the outputs, and quenched more for slower circuits, i.e., circuits with a deeper pipeline. The 250-ps transients see very little quenching or broadening, and the 125-ps transients are broadened slightly, to ~135 ps, or more for slower circuits. The 64-ps transients are broadened more significantly, to ~70 ps for SET testing and ~80 ps for SEMT testing. Typically, longer injected transients experience quenching and shorter injected transients experience broadening.

Additionally, simulation experiments were performed to evaluate the effect of scaling pulsewidth with the distance a cell is from a modeled strike location. This can be achieved within the physical-design-optimized SEMT methodology. A particle strike that passes through an IC will deposit charge in a gradient fashion, such that nodes closer to the center will receive more charge, and nodes further away will receive less charge. This approach will have the effect of producing a lower-drive and/or shorter-length transient pulsewidth at further away cells.

Most of the work presented in this dissertation uses the assumption that a particle strike affects the selected cells evenly, since this allows for a much cleaner and clearer evaluation of placement strategy effects. However, for a more physically precise SEMT characterization, the methodology can be adapted to scale the injected transient pulsewidth for each node nominated for injection, according to the distance between the modeled strike location and the specified node. Figure V-14 shows a comparison of results using this method. The variable pulsewidth method scales the injected transient pulsewidth linearly according to the distance of the cell from the center of the modeled radiation strike.
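
One possible form of this linear scaling is sketched below in Python. The text above states only that the pulsewidth scales linearly with distance; the particular choice made here (full pulsewidth at the strike center, falling to zero at the edge of the modeled radius) and the function name are assumptions of this sketch.

```python
def scaled_pulsewidth(base_ps, distance_nm, radius_nm):
    """Linearly scale an injected pulsewidth by distance from the strike center.

    Assumed profile: base_ps at distance 0, decreasing linearly to 0 ps at
    radius_nm, and 0 ps for any cell beyond the modeled radius.
    """
    if radius_nm <= 0:
        raise ValueError("radius_nm must be positive")
    fraction = max(0.0, 1.0 - distance_nm / radius_nm)
    return base_ps * fraction


# Three cells at increasing distance from a modeled 1000-nm strike.
widths = [scaled_pulsewidth(500, d, 1000) for d in (0, 500, 1500)]
```

Under this assumed profile, a 500-ps base transient is injected at full width at the strike center and at 250 ps for a cell 500 nm away, and a cell outside the radius receives no transient.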

Figure V-14. SEMT characterization of 8-bit floating-point multiplier. The constant pulsewidth method (CPW) is contrasted with a variable pulsewidth method (VPW) that scales linearly with distance. Raw number of errors (bold lines) and number of errors scaled by pulsewidth modulation (square markers) are plotted.
When multiple transients are induced by a single particle strike, regardless of whether they are the same pulsewidth or not, the number of errors observed at the output remains roughly the same (blue and red bold lines in Figure V-14). When variable injected pulsewidths are used, the timing-scaled results are observably lower. However, investigation of the individual output data shows that the most vulnerable outputs remain the same between the two methods, and vulnerability identification is the key purpose of the SEMT characterization methodology of this dissertation.

Conclusion

SEMT characterization results via simulation have been provided for a variety of benchmark and arithmetic circuits, to show general trends within circuit families as well as with combinational logic circuits in general. Pulse broadening and attenuation due to transient pulsewidth and pulse propagation has been measured via simulation. These results lend useful insight into the radiation sensitivity of combinational logic intellectual property (IP) blocks prior to chip redesign or tape-out.

Performing SEMT characterization and simulation parameter experiments reveals a variety of particular observations about SEMT reliability in combinational logic circuits. SEMT error probabilities are observed to increase with an increasing number of transients induced by a single-event, or with an increasing radiation event radius, as expected. However, logical reconvergence due to multiple transient interaction also increases with increasing SEMT severity, providing a potential mechanism for error rate reduction through physical design, to be explored in the following chapters.
CHAPTER VI

IMPACTS OF PLACEMENT ON SEMT RELIABILITY

Once a methodology has been developed for characterization of SEMT effects, it becomes clear that standard cell placement in the circuit design stage is an important factor for the reliability of ICs manufactured at modern technology nodes. When charge is deposited into a circuit, multiple adjacent nodes may be affected; this is primarily an issue in bulk technologies. When multiple transients are induced in logic, two cases are possible: (1) these transients are in different logic paths, and may therefore propagate to separate outputs and potentially cause multiple errors, or (2) these transients are in the same logic path, and may propagate and reconverge, either overlapping for a single error, or masking for a reduced or no error. Adjusting standard cell placement to encourage Case (2) could improve reliability of circuits in terms of SEMT vulnerability. Prior to this dissertation, little research had been done to achieve radiation hardening of an IC via a standard cell placement algorithm.

This chapter of the dissertation is therefore concerned with the impact of standard cell placement on SEMT reliability. The goal for this chapter is not to directly achieve reliability through placement; the intent is to explore physical design, to find and to evaluate methods to modify standard cell placement. When a noteworthy mechanism for modified placement is achieved, studies are performed to observe the effect on (1) SEMT reliability and (2) circuit performance metrics. Observations gained from these design and simulation experiments are used in the following chapter to inform the creation of a standard cell placement algorithm that achieves radiation hardening while maintaining minimal impacts on area, timing, and power.
Experiment Overview

In this chapter, several hierarchical, combinational logic circuits from open-source repositories and benchmark sources, listed in Table IV-1, are modified in the standard cell placement EDA phase. Multiple, unique placements of the same circuit are created and compared to study how attributes relate, including: (1) logical hierarchy, (2) core utility ratio, (3) interconnect length and circuit congestion, (4) static and dynamic power, (5) circuit timing, and (6) circuit functionality. Also, different tradeoffs are examined to determine the reliability of a circuit under single-event multiple-transients.

Synopsys IC Compiler typically performs standard cell placement with the goals of minimizing interconnect, which minimizes area, power usage, and timing. There are few avenues available to modify placement in such a way that charge sharing opportunities between logically-related cells are increased (and therefore SEMT reliability is enhanced), but skipping commercial placement tools entirely as in other works [Pagliarini and Pradhan 2014] reduces the performance of the circuit by an unacceptable degree. In the referenced work, wirelength is increased by 2-3x, plus associated power and timing costs. For both useful performance and reliability improvements, the work described in this chapter investigated the ways that standard cell placement can be modified, rather than rewritten.

Placement modification in IC Compiler can be visualized from two different perspectives: (1) macro-level modification and (2) micro-level modification. In the physical design flow, large designs are normally broken down into individual IP macros, which can be placed, routed, and finalized individually. Soft macros, or “plan groups,” are pliable and adaptable towards reliability-centric experiments. On a much smaller scale, “cell bounds” are used to direct relative placement of individual standard logic cells to bring selected pairs closer together in the placement phase. Both of these mechanisms will be explored and contrasted in this chapter.

Macro Constraints: Plan Groups

In IC Compiler, plan groups are typically used to organize different components in a larger design by physically relegating cells under the same hierarchical logic module to a specified physical location. They can also be harnessed on a smaller scale by indicating a size and shape of a quadrant for a logic module. For example, in a complex circuit such as an ALU, logic that pertains to an adder can be placed in one location, a comparator can be placed in another, and an error-checker can be placed in a third. If done properly, then this approach could increase the chance that a particle strike affects cells that are more closely related logically than in an unconstrained circuit, and therefore increase the chance of logical reconvergence and decrease error counts.

Figure VI-1. Example of plan grouping placement strategy. Default placement with no plan group actions (i.e., generic, “G”) is shown on the left, first-level plan grouping of logical hierarchies (i.e., plan-grouped, “PG”) is shown in the center, and second-level plan grouping of logical hierarchies (micro-plan-grouped, “MPG”) is shown on the right.

Several circuits in Table IV-1 were placed and characterized for SEMT sensitivity with generic, plan-grouped, and micro-plan-grouped placement alternatives. Figure VI-1 shows an example of these placement strategies. Generic, default placement (IC Compiler command create_fp_placement) uses no plan group constraints, and offers a balance of minimized congestion and wirelength. Then, plan groups are created for each first-level module available in the logic hierarchy of the circuit (“PG”), and each of these is placed in its own rectangular sector of the circuit, thus relegating logically-related components to closer locations. Finally, this restriction is made tighter by creating plan groups for all second-level modules in the hierarchy (“MPG”), when made available in the benchmark netlist. Each plan group is automatically placed and shaped rectilinearly by IC Compiler. Although PG and MPG sometimes share a few of the same plan groups in limited-hierarchy designs, shapes differ between each placement alternative, and MPG always has more plan groups than PG.

After the placement stage has been completed, each of the three design alternatives is sent through routing and finishing stages for precise calculation of timing and power effects, and final Verilog, DEF, SDF, and parasitics files are extracted. These results will be examined and discussed to contrast the SEMT reliability and performance costs (Subsection VI.A) and applications for selective path hardening (Subsection VI.B).

A. Plan Groups: Reliability and Performance

When placing standard cells for a design in IC Compiler or another EDA tool, several design factors are minimized: area, timing, and power, in order of default priority. Although this order can be changed, generally a design is placed for the smallest area first, then finalized for the best timing possible and the lowest power usage (static and dynamic).

For this dissertation, we created a generic placement for each circuit, then, holding to the same area constraints, created PG and MPG alternative placements in order to compare SEMT behavior. For each of the three total placements, area remains constant; no components are added or removed – they are simply rearranged. Any benefit that comes from using one placement over another comes at zero cost to circuit area. In terms of timing and power, there are fairly small penalties when using the PG and MPG placements over the generic placement. Table VI-1 shows the timing tradeoffs and Table VI-2 shows the power tradeoffs.

Table VI-1. Plan group placement alternatives: timing differences

<table>
<thead>
<tr>
<th>ID</th>
<th>Area (µm²)</th>
<th>Timing G (ns)</th>
<th>Timing PG Δ</th>
<th>Timing MPG Δ</th>
</tr>
</thead>
<tbody>
<tr>
<td>cf_fp_mul_c_3_4</td>
<td>1128</td>
<td>4.86</td>
<td>0.62%</td>
<td>0.62%</td>
</tr>
<tr>
<td>cf_fp_mul_c_5_10</td>
<td>3884</td>
<td>13.58</td>
<td>-0.37%</td>
<td>0.29%</td>
</tr>
<tr>
<td>cf_fp_mul_c_8_23</td>
<td>15897</td>
<td>70.60</td>
<td>-1.35%</td>
<td>1.36%</td>
</tr>
<tr>
<td>cf_fp_mul_c_11_52</td>
<td>66691</td>
<td>194.79</td>
<td>5.80%</td>
<td>2.85%</td>
</tr>
<tr>
<td>c1908</td>
<td>1024</td>
<td>2.10</td>
<td>1.43%</td>
<td>0.95%</td>
</tr>
<tr>
<td>c2670</td>
<td>1526</td>
<td>2.27</td>
<td>3.52%</td>
<td>0.88%</td>
</tr>
<tr>
<td>c3540</td>
<td>1890</td>
<td>2.64</td>
<td>1.14%</td>
<td>3.03%</td>
</tr>
<tr>
<td>c5315</td>
<td>3061</td>
<td>1.87</td>
<td>1.60%</td>
<td>3.21%</td>
</tr>
<tr>
<td>c7552</td>
<td>3309</td>
<td>3.85</td>
<td>6.49%</td>
<td>4.16%</td>
</tr>
<tr>
<td>Average</td>
<td></td>
<td></td>
<td>2.10%</td>
<td>1.93%</td>
</tr>
</tbody>
</table>

As shown in Table VI-1, timing costs to produce these alternative placements are minimal, averaging about 2% for PG or MPG versus generic. Some layouts benefit from the alternative gate placement and run faster than the generic placement. As shown in Table VI-2, costs to power are slightly higher, with an average increase of 8.5% in total power for an alternative placement over generic, and an 8.2% increase in net switching power. There is no significant difference between PG and MPG placement alternatives in terms of timing or switching power, but the MPG placement methodology does often increase the total power usage over PG placement, due to the increased interconnect. Among the three placement alternatives, the components remain the same; the only difference is the location of some components and therefore the amount of interconnect between connected gates.
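
The averages quoted above can be reproduced directly from the rows of Table VI-1. The following is a minimal Python sketch, not part of the original tool flow; the dictionary layout and variable names are assumptions made for illustration, while the numbers are transcribed from the table.

```python
# Timing change versus generic placement, in percent, transcribed from
# Table VI-1 as (PG delta, MPG delta). The data layout is an assumption.
timing_delta = {
    "cf_fp_mul_c_3_4": (0.62, 0.62),
    "cf_fp_mul_c_5_10": (-0.37, 0.29),
    "cf_fp_mul_c_8_23": (-1.35, 1.36),
    "cf_fp_mul_c_11_52": (5.80, 2.85),
    "c1908": (1.43, 0.95),
    "c2670": (3.52, 0.88),
    "c3540": (1.14, 3.03),
    "c5315": (1.60, 3.21),
    "c7552": (6.49, 4.16),
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


pg_avg = mean(pg for pg, _ in timing_delta.values())
mpg_avg = mean(mpg for _, mpg in timing_delta.values())

# Circuits whose PG placement is faster than generic (negative delta).
pg_faster = [name for name, (pg, _) in timing_delta.items() if pg < 0]

print(f"PG average:  {pg_avg:.2f}%")
print(f"MPG average: {mpg_avg:.2f}%")
print("Faster than generic under PG:", pg_faster)
```

Rounded to two decimals, the computed means match the Average row of the table, and the negative entries identify the layouts that run faster than generic.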

SEMT characterization on each circuit and placement style was performed, and error propagation probabilities from generic, PG, and MPG placement strategies were compared to see what differences occurred in SEMT testing. Nodes that are logically related are more likely to be physically closer in PG and MPG varieties, presenting a possibility that MPG or PG circuits may therefore be more reliable than a generic placement under SEMT effects, and ideally both PG and MPG should lend more reliability to some logic paths over the default placement algorithm. MPG circuits, with a close pairing of logical adjacency and physical adjacency between a majority of cells, may see more logical reconvergence and lower error rates than PG. PG circuits may also see lower error rates than G, for the same reason. Yet, since plan grouping is intended primarily for larger-scale design separation rather than studies of effects between individual cells, these plan grouping methods can still be too coarse to have a beneficial effect over G, and the MPG alternative will not always necessarily exhibit higher reliability than PG.

As an example, Figure VI-2 shows the maximum differences in SEMT testing that are possible among the three placement alternatives for each circuit tested, by comparing the most and the least reliable of the three. The EPP differences at 5 injected transients for PG and MPG placements of these circuits, relative to the generic placement as a base, are additionally listed in Table VI-3. Modifying the standard placement will not have any effect for SET testing (1 injected transient on the x-axis), but the output Error Propagation Probability (EPP) can decrease by as much as 5% for 2 injected transients, or 9% for higher degrees of SEMT, due simply to the placement of cells within the same area. ISCAS85 circuit c1908 sees the greatest improvements in this case, with the PG placement variety achieving EPP reduction over the other two placement strategies, with no change in area and only a 1.4% reduction in circuit speed. These results serve to show that placement has an impact on SEMT vulnerability, but that one strategy of G, PG, or MPG may not necessarily always provide the most reliable result.

![Output EPP Change vs. Degree of SEMT](image-url)

Figure VI-2. Aggregate data for 9 multipliers and benchmark circuits, showing the error propagation probability (EPP) difference seen between the best and the worst of three plan group placement alternatives, for simulations injecting a particular number of physically-adjacent transients per test.

Investigating the change in Error Propagation Probability for individual outputs in each circuit, it becomes clear that plan group placement modification generally entails tradeoffs – some logic paths may be hardened against charge-sharing-induced errors, while others may become more vulnerable. Figures VI-3 and VI-4 investigate the floating-point multiplier circuits and show the differences in SEMT testing provided by PG and MPG placement as compared to G as a base. Again, there is 0% difference in error rates between G, PG and MPG testing with 1 injected transient per test. But when multiple, physically adjacent transients are induced by a single particle strike, placement has an effect. In Figures VI-3 and VI-4, we show a selection of output EPP difference extrema as well as the average circuit EPP change for each level of SEMT severity. The 32- or 64-bit O output represents the product output of the multiplier. Figure VI-3 shows that, for the 32-bit IEEE-754 multiplier, (1) a PG-placement circuit can perform better than a generic placement in cases where a large number of transients are produced by a particle strike, and (2) an MPG-placement circuit can see even greater improvements with 3 or more transients, with as much as 6% lower error propagation probability for a particular logic path.

Figure VI-4 shows that, for even larger circuits, simple plan grouping does not always provide beneficial results, but finer MPG efforts, in this case, still provide an improvement over coarser PG circuits. These differences are small, but the placement changes for this case study are non-invasive and straightforward. No change in area and minimal change in timing and power can provide these improvements. Future work involving more micro-scale gate placement could see substantial improvements.

Figure VI-3. Results for cf_fp_mul_c_8_23, the 32-bit multiplier, showing beneficial results from placement alternatives. Comparison of generic placement (G) to plan grouped (PG) and micro-plan grouped (MPG) and the resultant EPP change for each set of transient injection simulations. Bold lines represent average circuit EPP differences; thinner lines represent output extrema of interest. Negative EPP difference means a lower EPP versus generic.

Figure VI-4. Results for cf_fp_mul_c_11_52, the 64-bit multiplier, showing detrimental results from placement alternatives, using same format as Figure VI-3. Positive EPP difference means a higher EPP versus generic.

The results of Figures VI-3 and VI-4 show that alternative placements can provide different SEMT results, but across the 9 circuits tested, no single method (G, PG, or MPG) emerges as a universally best strategy. For Figure VI-3, MPG clearly provides the best result, but in Figure VI-4, the generic placement performs better than PG or MPG. Among the 9 circuits simulated, G, PG, and MPG each provide the best SEMT reliability behavior for some circuits.

When a charged particle strikes a circuit and induces multiple events, the resultant SEMT will affect several cells along a shared well. N-wells run horizontally in a standard cell placement. As seen earlier in Figure VI-1, plan groups that are automatically created by IC Compiler can sometimes be stretched vertically. In these cases, a particle strike may affect multiple plan groups and therefore result in a larger number of non-interacting transients. When plan groups are shaped more horizontally along wells, particle strikes may have a higher chance of affecting a single group, and logical reconvergence among the resultant transients can reduce error rates. However, plan grouping in IC Compiler is intended primarily for design separation, not reliability, so prioritizing the shape is a challenge. The result is three alternative placements that can be compared, but are not necessarily ranked in terms of reliability. The conclusions drawn from these observations are that: (1) choice of placement strategy is important for SEMT performance, so this step must be more closely analyzed in future circuit design, and (2) creating strict rules for plan group shaping may have potential for increased radiation hardening.

B. Plan Groups: Selective Path Hardening

Overall performance penalties were shown in Tables VI-1 and VI-2. Depending on a circuit designer’s priorities, the timing and power tradeoffs are well worth it for the general increases in reliability shown thus far. However, these plan group placement alternatives can produce even more worthwhile increases in reliability for specific outputs, useful to help guarantee performance for particular logic paths within a design, while still maintaining very low performance cost.

A deeper look into individual circuit results shows the behavior of the circuit in these tests. SEMT characterization results for the MPG placement of the 64-bit multiplier as compared to SEMT characterization results for the generic placement are shown in Figure VI-5. Each output of the multiplier in increasing magnitude is plotted on the x-axis. Generally, placement modification operates as a series of tradeoffs – some output logic paths see an increase in SEMT reliability, while others will see a decrease. Circuit designers that use selective data path hardening [Srinivasan et al. 2005, Mahatme et al. 2013] could see potential in using methods such as this to increase the reliability of particular paths while trading off the reliability of others.

Figure VI-5. Comparison of SEMT characterization results for the 64-bit floating-point multiplier, micro-plan grouped placement to generic placement. Up to 5 transients injected per test. MPG placement induces a tradeoff that improves the reliability (decreases the EPP) of higher-magnitude mantissa bits and decreases the reliability of lower-magnitude mantissa bits, a beneficial result in approximate computing.

In Figure VI-5, we see that the more significant bits of the mantissa output see a decrease in error propagation probability, while the less significant bits see an increase in EPP. In the research field of approximate computing [Venkataramani et al. 2013], this can be a very useful result, since it increases the reliability of important components at minimal cost. This kind of result was seen in multiple circuits in this study, including each of the 4 floating-point multipliers. Especially observed with MPG plan group placement, this indicates that tighter plan grouping in the placement design phase could lead to increased impact on SEMT reliability for more significant bits of arithmetic functions, due to increased logic masking.

![Graph showing PG Output EPP Change Per Each Circuit Output](image)

Figure VI-6. Comparison of SEMT characterization results for the c1908 benchmark circuit, plan group PG placement variant over other placement variants. For this SEC/DED circuit, the output MSB sees a significant increase in SEMT reliability (15% lower EPP) and the LSB sees a decrease (23% higher EPP), a trend that is also generally scaled between the intermediate bits.

Figure VI-6 shows SEMT characterization results for the 16-bit data output for the ISCAS85 c1908 SEC/DED circuit, demonstrating that for the PG placement variant, the more significant bits of the output can also see a significant decrease in errors as compared to other placement variants, while the less significant bits see an increase in error. Of course, the ideal result would be a decrease in EPP for all outputs, but in exploring ultra-low-cost solutions such as this (zero area, minimal timing and power change) where tradeoffs are more permissible, this is a very interesting result. Also in cases such as these, where the entire placement was modified merely to produce placement alternatives, the resultant reliability benefits are smaller than if placement were modified on more of a micro scale for advantages to particular logic paths. Work presented in the next section will investigate micro-scale placement.

Figure VI-7. SEMT Characterization data for PG placement alternative of the c3540 benchmark circuit (8-bit ALU). SEMT results shown for up to 5 transients injected per test. The number of tests resulting in errors for each output (out of 100,000 tests run) are shown on the left axis; the change in EPP as compared to other placement alternatives is shown on the right axis, positive change indicating % increase in errors and negative change indicating % decrease in errors. Few c3540 outputs see an increase in EPP, but those that do (OParA, MiscOuts[3-4]) see extremely low rates of error initially.

In some cases examined in this dissertation, placement modification was shown to decrease the EPP for nearly all logic paths, with increases in only a few. Upon further examination, it can be seen that this already minimal tradeoff is even more desirable. Figure VI-7 shows SEMT characterization results for the c3540 circuit with the PG placement variant. This placement strategy produces zero change in area and a 1.1% penalty to timing as compared to generic placement. Plotted on the left axis is the number of tests resulting in an error at that output (blue bars); plotted on the right axis is the change in EPP for that logic path (green bars). Most logic paths see a decrease in EPP, a beneficial result. Only 3 logic paths see a significant increase in EPP: OParA and MiscOuts[3-4]. But this graph shows that these outputs are generally very reliable regardless of placement – MiscOuts[3-4] only see about 300 errors each in 100,000 modeled radiation events, so an increase of approximately 30% in EPP represents a still minimal error rate. Data outputs Z[7-0], however, average 9400 errors per output out of 100,000 radiation events, and see an average decrease in EPP of 6%, or 550 fewer errors per output – a potentially acceptable trade.
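
The comparison between relative EPP change and absolute error count can be made concrete with a short calculation. This is a minimal Python sketch using only the rounded figures quoted above; the function name is an assumption, and because the inputs are rounded, the results are approximate rather than exact reproductions of the measured data.

```python
TESTS = 100_000  # modeled radiation events per placement


def error_count_change(baseline_errors, relative_epp_change):
    """Change in error count for a relative EPP change (e.g. -0.06 = 6% lower)."""
    return baseline_errors * relative_epp_change


# Z[7-0]: about 9400 errors per output, about 6% lower EPP under PG.
z_delta = error_count_change(9400, -0.06)

# MiscOuts[3-4]: about 300 errors per output, about 30% higher EPP under PG.
misc_delta = error_count_change(300, 0.30)

print(f"Z outputs:    {z_delta:+.0f} errors per output "
      f"({100 * z_delta / TESTS:+.3f} percentage points of EPP)")
print(f"MiscOuts[3-4]: {misc_delta:+.0f} errors per output "
      f"({100 * misc_delta / TESTS:+.3f} percentage points of EPP)")
```

With these rounded inputs, each Z output loses several hundred errors while each MiscOuts output gains fewer than one hundred, which is the asymmetry that makes the trade attractive.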

Figure VI-8. SEMT characterization data for MPG placement alternative of the 32-bit floating-point multiplier. SEMT results shown for up to 5 transients injected per test. Same format as Figure VI-7. The outputs that are most prone to error, the mantissa, see decreases in error rate for almost all outputs.

Figure VI-8 shows SEMT characterization results from the 32-bit floating-point multiplier that exhibit similar behavior. The outputs of this circuit include the sign bit, the exponent, and the mantissa. Most of the mantissa data output sees a decrease in EPP for this placement alternative versus the generic placement. The only significant increases in EPP are for a couple of exponent outputs and primarily the sign bit. But an investigation of the absolute numbers shows that the sign bit is unlikely to experience an error anyway – a 5.45% increase in error propagation probability represents merely 3 more errors in 100,000 radiation events. The mantissa outputs are the ones that are most prone to error, and almost all of these experience a decrease in EPP.

These results show that placement is an important design consideration for SEMT reliability. In many cases, using plan groups for fine-grained logic-based placement alternatives results in improved SEMT reliability over generic, congestion-minimized standard placement, with zero cost to circuit area and minimal or negligible cost to circuit timing or power. However, plan group strategies can also produce higher performance costs without a guaranteed lowering of charge-sharing-induced error rates, due to the lack of fine control over plan group selection and shaping. Work in the next chapter will seek to address these concerns.

Micro Constraints: Cell Binding

A common technique in circuit radiation hardening methods is to identify critical vulnerabilities and separate them to reduce the possibility of multiple errors caused simultaneously. Unfortunately, IC Compiler has no mechanisms in place to simply separate individual standard cells. It does, however, have a command to place cells closely together, which can potentially be used to increase logical reconvergence rates and therefore drive SEMT-induced error rates down. Minimal experiments have been performed with this command before [Entrena et al. 2012], with a reported reduction in SEMT error rate of 2.22%, using glitch-based error injection as opposed to a charge-sharing-based modeling approach. This dissertation expands upon the premise substantially, by investigating several alternatives in further detail.

To bind together two standard cells in the circuit, a “create_bounds” command is submitted to IC Compiler, with an “ultra” effort to attempt binding those two cells together closely. Upon overall placement of the circuit, IC Compiler will attempt to honor the bind commands and place the listed cells adjacently while balancing overall placement constraints.

Figure VI-9. Longest-delay timing path generated by Design Compiler for a given output of benchmark c1908. Placement of this path can be constrained with bind commands to create an alternative placement that, depending on the path topology, may modulate the SEMT resiliency of the logic path.

Previous work on using the create_bounds command [Entrena et al. 2012] analyzes error rates after several thousand clock cycles before making recommendations for paired nodes; this dissertation delves more deeply to see if transient errors can be mitigated closer to their initial generation. Of main interest in this initial chapter on placement methods, individual logic paths are investigated to see if it is possible to selectively harden outputs of interest. For a given logic path, a longest-delay timing path is generated, as in Figure VI-9. In this example, for the 14 gates shown, 13 total create_bounds commands are written, one for each consecutive pair of nodes.

```
create_bounds -name bound0 -effort ultra {Ckt1908/M1/U49 Ckt1908/M1/U50}
...
create_bounds -name bound12 -effort ultra {Ckt1908/M3/U16 Ckt1908/M4/U16}
```
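
The pairing scheme described above (N gates in a path yield N-1 bind commands) can be sketched as a small generator script. This is a minimal Python illustration under stated assumptions: the function name and the cell instance names are hypothetical, and only the `-name` and `-effort` options already shown in the listing above are used.

```python
def bounds_for_path(path_cells, effort="ultra"):
    """Return one create_bounds command string per consecutive pair of cells.

    path_cells is an ordered list of cell instance names along one timing
    path, from the first gate to the last.
    """
    commands = []
    pairs = zip(path_cells, path_cells[1:])
    for index, (driver, load) in enumerate(pairs):
        commands.append(
            f"create_bounds -name bound{index} -effort {effort} "
            f"{{{driver} {load}}}"
        )
    return commands


# Hypothetical 4-gate path; these instance names are placeholders only.
example_path = ["top/U1", "top/U2", "top/U3", "top/U4"]
for command in bounds_for_path(example_path):
    print(command)
```

For the 14-gate path of Figure VI-9, the same function would emit 13 commands, named bound0 through bound12.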

This method is used to create 25 placement alternatives, one for each output of c1908; likewise, in similar manner, a placement alternative is created focused on each output for c3540 and c6288. The end result is that, when automatic placement is performed with these constraints using IC Compiler, progressive pairs of gates may be brought closer in placement, as illustrated in Figure VI-10.

Figure VI-10. Example of the effect of cell bounds, where 4 gates with 3 progressive pairs of create_bounds commands result in a closer placement for this selection.

Plan group experiments allowed for an investigation of modifying logical reconvergence and the possible performance penalties that may be endured through these placement modifications. Beyond this, the next series of simulation experiments provides a space to measure and test a variety of other design attributes to look for correlations in specifications, reliability, and performance. A standard multiplier benchmark circuit, a SEC/DED circuit, and an ALU circuit are each placed and characterized for SEMT sensitivity, using the create_bounds command to tie together cells in the longest delay path for each logic path.

In addition, placement experiments are performed at multiple core utility ratios in order to investigate the impact of cell placement density on performance and reliability metrics. When simulated and compared, measurements include wirelength, circuit congestion, timing, static power, and dynamic (switching) power, to allow for a comparison of all of these attributes. The goal is to develop a way to indicate what placement modifications are likely to cause particular changes in reliability and performance.

C. Cell Binding: Reliability and Performance

Initial create_bounds experiments were performed on the c1908 SEC/DED circuit. 25 placement varieties were created, each focused on constraining a particular logic path. After SEMT transient injection and characterization of the circuit, each placement variety was compared back to the original, unconstrained model, and the best SEMT reliability improvement for each logic path was taken. These improvements are plotted against the overall average EPP performance of that logic path in Figure VI-11. This demonstrates the SEMT hardening that is possible with this methodology.
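
The selection step described above, taking the best improvement for each logic path across all placement varieties, can be sketched as follows. This is a minimal Python illustration and not the characterization suite itself: the function name, data layout, output names, and EPP values are all hypothetical, and the difference is assumed to be a simple subtraction from the unconstrained baseline.

```python
def best_improvement(baseline_epp, alternatives_epp):
    """For each output, return the most negative EPP difference vs. baseline.

    baseline_epp:     {output_name: epp} for the unconstrained placement.
    alternatives_epp: {placement_name: {output_name: epp}}.
    A negative value means fewer propagated errors than the baseline.
    """
    best = {}
    for output, base in baseline_epp.items():
        differences = [alt[output] - base for alt in alternatives_epp.values()]
        best[output] = min(differences)
    return best


# Hypothetical EPP values for two outputs and two bound placements.
baseline = {"Out[0]": 0.20, "Out[1]": 0.10}
alternatives = {
    "bind_Out[0]": {"Out[0]": 0.17, "Out[1]": 0.11},
    "bind_Out[1]": {"Out[0]": 0.21, "Out[1]": 0.08},
}

for output, diff in best_improvement(baseline, alternatives).items():
    print(f"{output}: {diff:+.2f}")
```

Note that the best value for each output generally comes from a different placement, so the per-output improvements are not all available in a single layout.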

Figure VI-11. Best error propagation probability improvement taken for each c1908 output from the 25 logic path binding placement alternatives created, as compared to the average error propagation probability of each output. EPP difference vs. average of each logic path of c1908 is plotted versus # of injected transients per test.

On average for the case study circuit, outputs can be hardened by 7.7% against two-transient cases, and up to 12.5% for five-transient cases (the black, bold line in Figure VI-11). Looking at individual logic paths, most outputs accomplish a 10-20% reduction in SEMT error probability. When a particle strikes a gate that is directly adjacent to a logically-connected cell, due to a bind command, the two induced transients will interact and therefore stand a chance of canceling or reducing the error. In addition, the SEMT-hardening results from binding cells together in specified logic paths as in Figure VI-11 are repeatable by using the same series of create_bounds commands.

The improvements seen cannot all be accomplished at once, of course. Hardening one logic
path in this manner typically (but not always) comes at the cost of a potentially less-than-optimal
placement for another. Reliability consideration with these strategies generally requires a tradeoff.
In this experiment, the c1908 data output was more easily hardened and at less performance cost
than other outputs. Studies into the topology of the circuit demonstrate that there are more gates
between the input and output for the Out[x] data outputs, allowing the possibility of more logical
masking opportunities that can be attuned with bind commands.

With the c3540 ALU circuit, placement alternatives were created in the same way, with
each placement constructed with bind commands contained within a particular logic path, then
characterized for SEMT sensitivity. C3540 is a multi-function circuit; the outputs include a general
data output for the ALU, Z; carry and support bits, X/C/P; parity computation outputs; and
miscellaneous functions of the inputs A and B.

Of the 21 placement alternatives, Figure VI-12 plots the Error Propagation Probability
difference achieved on average for each functional group of outputs, for varying distances over
which a radiation event is modeled. The circuit on average is hardened roughly 5% against
SEMT-induced errors. ALU sum and logical functions are observed to be particularly conducive towards
this SEMT hardening approach, but parity operations are less so as observed in Figure VI-12.
Additionally, this figure demonstrates that the small scale of placement modification achieved with
cell bounds has the effect of hardening circuits against likewise small-scale radiation events. The
largest decrease in errors occurs for radiation events modeled with a 500-nm radiation event radius,
though lesser degrees of hardening are still achieved with larger radiation event radii. Fortunately,
lower-charge sources of radiation are generally more common than particles with a higher LET, so this serves to focus on more important vulnerabilities.

C1908 and C3540 both have multiple logic functions contained within their circuitry, which allows for greater malleability of the placement for SEMT hardening. C6288 performs a single function, and is less impacted by these techniques, but still helps to round out the analyses of performing placement modification on different circuits and their effect on performance metrics.

Cell binding techniques focus on modifying small portions of the circuit placement and leave the majority of the circuit otherwise unrestrained. The effect of small changes in placement on circuit performance is minimal. As with plan groups, there is no change in circuit area or static power, since no components are added or removed. Timing is very minimally impacted; across the 127 placement varieties of the c1908, c3540, and c6288 circuits, post-P&R timing was increased on average by 0.15%.

Figure VI-12. Decrease in output error propagation probability for different c3540 outputs achieved via path binding placement alternatives.

With increased perturbations in the standard cell placement design phase, IC Compiler must exert a small amount of additional effort in order to achieve a legal placement honoring most or all of the offered constraints. IC Compiler does not guarantee honoring all create_bounds constraints set, even with the “ultra” effort setting. However, investigating congestion in terms of higher-layer metal usage shows a loose correlation between wirelength usage in upper layers and an increased number of placement constraints, shown in Figure VI-13.

Figure VI-13. Measurements of upper-level metal layer usage for placement alternatives of c6288 multiplier circuit using varying numbers of cell bound constraints.

D. Impact of Density on Placement and Reliability

In addition to area, timing, and congestion impacts of placement modifications, replicating create_bounds experiments at multiple circuit core utility ratios allowed for an investigation of the effect of placement modification and cell placement density on performance metrics, such as wirelength and power.

For a precise measurement of power usage of a particular logical and physical design, Synopsys PrimeTime uses a switching activity interchange format (SAIF) file, output from actual simulation results performed using Synopsys VCS. Since this dissertation’s SEMT characterization suite utilizes VCS simulation, this approach allowed for a fairly simple analysis of power usage using PrimeTime. Initial experiments compared several different placements of c6288 with switching activity in PrimeTime. The results indicated that the differences in power usage between placement alternatives were approximately the same as the differences produced using the built-in power estimation provided by IC Compiler, so IC Compiler was used for the remainder of experiments in the interest of time.

Cell bounds are not strict constraints that must be 100% honored, and therefore the effects that their usage has on performance should be viewed as correlations within statistical variation. Results are shown for these placement and performance correlations in Figures VI-14 and VI-15. As shown, cell placement density does tend to have an impact on the metrics of wirelength and dynamic power. Circuits that are “loosely” placed at a 70% core utility ratio generally experience a slight decrease in wirelength and dynamic power as the number of placement constraints increases. Circuits that are “densely” placed at a 90% core utility ratio see a slight increase in wirelength and dynamic power with added placement constraints.

These results demonstrate that the performance impact of micro-level constraints such as cell bounds is extremely low. Though the impact demonstrated in this chapter on hardening against SEMT errors was also low, if directed placement as performed in the following chapter can achieve greater SEMT hardening via micro-placement, the potential cost to circuit performance could be a minimized concern.
Conclusion

In this chapter, a large number of alternative standard cell placements were generated for a variety of combinational logic circuits, to explore the impact of placement on a number of performance metrics as well as on the sensitivity of the circuit to SEMT. Two primary placement mechanisms were explored: (1) plan groups and (2) cell bounds.

Plan group experiments demonstrated that rectilinear constraints on placement of logical hierarchies can increase or decrease charge sharing within a specified cluster or group of logic gates, with associated impacts on the SEMT error probability. Performance costs to modulating the placement of an entire IC or logic IP block in this way were higher than desired. However, these simulation experiments illustrated that shared well-collapse is a dominant mechanism in SEMT vulnerability. Taking advantage of shared wells to encourage multiple transient interaction in beneficial cell groups will be key for a successful SEMT mitigation strategy.

Cell bound experiments demonstrated that small, local perturbations to a default standard cell placement algorithm have very low costs to timing or dynamic power. At low placement densities, the use of cell bounds may improve some performance metrics. However, implementation is not strict enough to provide sharp differences in SEMT reliability. Making small, local changes to placement will be key for implementing a low-cost SEMT-aware placement algorithm.
CHAPTER VII

PLACEMENT ALGORITHM FOR SEMT MITIGATION

The first aim of this dissertation (Chapters IV and V) focused on exploring and quantifying the effects of radiation on combinational logic, specifically in modern technologies where single-events have a high potential of inducing multiple transients. Given that these are physically-adjacent transients that then propagate through logic, Chapter VI then demonstrated that physical design of ICs has a distinct impact on SEMT sensitivity and propagation. This present chapter builds upon these analyses to investigate and develop a specific placement design flow to harden circuits against SEMT effects with minimal cost.

Simulation experiments investigated localized placement, global placement, and logic manipulation. Placement experiments were run on the variety of circuits listed in Table IV-1. SEMT characterization of each placement strategy was performed as described in Chapter IV for comparison of SEMT behavior, sensitivity, and design complexity. The end result is a streamlined flow that intercepts the standard design placement step to provide a modified placement methodology, substantially increasing SEMT reliability while minimally perturbing commercial-standard minimized area, power, and timing.

Placement Design Mechanisms for SEMT Mitigation

Plan groups and cell bounds have been established in the previous chapter as primary ways to adjust placement in IC Compiler while still allowing the software to maintain optimization in area, power, and timing. Experiments in this section investigate the targeted use of these mechanisms for SEMT reduction, including: (1) binding placement of selected vulnerable pairs of cells, (2) combining placement modification strategies for additive hardening results, (3) binding cell pairs based on logic depth, (4) logic resynthesis for greater design malleability, (5) shaping plan groups to take better advantage of charge sharing aligned along wells, and (6) well-based plan group charge sharing enhancement based on logic analysis and function. Each strategy towards reducing SEMT vulnerability in a logic block will be explored to determine what can be achieved in a standard cell placement design flow.

Experiments in this section will be evaluated and compared, then successful mechanisms will be integrated together into a final standard cell placement algorithm in the following section.

Key components of physical design here include: (1) micro-scale placement, (2) logic design, and (3) macro-scale placement.

A. Cell Bounds for SEMT Mitigation

Initial efforts to achieve SEMT mitigation via placement focus on the use of cell bounds. The create_bounds command offers discrete placement perturbations that can be scripted from analysis of a netlist and, as per the results of Chapter VI, have minimal performance costs. Experiments here focus on the ISCAS85 c1908 16-bit SEC/DED circuit, synthesized to the Synopsys 28/32-nm library. For all placement versions created in this subsection, the circuit had an area of 1023.69 µm². There is no cost to area for any of the placement strategies; since placement modification only rearranges cells, it does not add, remove, or change cells in the design. Circuit timing is reported by IC Compiler at 2,100 ps for a standard model with no placement modifications. Once modified, timing changes minimally, with an average increase from these experiments to 2,111 ps, or a 0.53% penalty to operating speed.

The fundamental means of reducing errors caused by SEMT is to encourage overlap between the transients generated in a single event. If multiple transients are generated in separate paths, then multiple errors may result. If multiple transients are generated in the same path, then at most one output will experience an error, and this error may potentially be truncated. Experimental work in pulse quenching (see Figure III-4 for review) shows that SEMT impacting physically adjacent cells that are directly connected logically is a fairly ideal case, as compared to when logically separate cells are affected by the same particle.

The first create_bounds experiment aims to maximize physical adjacency between fan-in/fan-out neighbors, in order to see if this approach can reduce charge-sharing-induced errors. Commercial P&R tools generally aim to place logically-connected cells closely in order to minimize wirelength, but this aim is balanced with spreading components in order to minimize congestion. Therefore, an automated script was built to analyze the c1908 netlist and produce two lists of create_bounds commands: (1) for each node $i$, bind to each fan-out of that node in the circuit, (2) for each node $j$, bind to each fan-in of that node in the circuit. In theory, these two lists of bound commands would be the same, and repeating each experiment showed that all 4 total results behaved similarly. Each list of bound commands was implemented to generate a placement of c1908, which was then evaluated for SEMT sensitivity.
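The claim that the two lists should coincide in theory can be illustrated with a minimal sketch. The toy netlist and the helper names below are hypothetical and are not the script used in this work; the sketch only enumerates the unordered cell pairs that each list would bind, and does not emit IC Compiler commands.

```python
# Hypothetical toy netlist: each gate maps to the gates it drives (fan-out).
FANOUT = {
    "U1": ["U3"],
    "U2": ["U3", "U4"],
    "U3": ["U5"],
    "U4": ["U5"],
    "U5": [],
}


def invert(fanout):
    """Derive the fan-in map (gate -> driving gates) from the fan-out map."""
    fanin = {gate: [] for gate in fanout}
    for driver, loads in fanout.items():
        for load in loads:
            fanin[load].append(driver)
    return fanin


def pairs_by_fanout(fanout):
    """List (1): for each node, pair it with each of its fan-out gates."""
    return {frozenset((node, load))
            for node, loads in fanout.items() for load in loads}


def pairs_by_fanin(fanin):
    """List (2): for each node, pair it with each of its fan-in gates."""
    return {frozenset((node, driver))
            for node, drivers in fanin.items() for driver in drivers}


out_pairs = pairs_by_fanout(FANOUT)
in_pairs = pairs_by_fanin(invert(FANOUT))
print(len(out_pairs), out_pairs == in_pairs)
```

Because every fan-out edge is also a fan-in edge seen from the other end, the two pair sets are identical; any difference between the resulting placements would come from command ordering or from how the placer resolves competing constraints.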

The fan-in/fan-out experiment established that implementation of placement constraints has limits. A list of 400+ constraints on a circuit with 210 gates means that IC Compiler cannot honor all constraints during automatic placement. The result is that constraints are split, and while overall performance is not greatly impacted, error rates in 2-transient SEMT simulation are increased by 5.7%. Error rates in 5-transient SEMT simulation are increased by 10.2%. Further experiments must be aware of over-constraining designs.

Rather than attempting to harden an entire IC, a solution that showed promise was to harden individual paths. In Chapter VI, this dissertation investigated using timing reports to generate lists of logically-connected gates, then using create_bounds commands to tie successive pairs of these gates together. The result was several placements; each alternative is associated with a list of bound commands and a particular SEMT error rate for the circuit as a whole and for individual paths.

Aggregate results for individual path hardening with bound commands were shown in Figure VI-11. The improvements seen cannot all be achieved at once; hardening one path against SEMT errors typically comes at the cost of a potentially less-than-optimal placement for another.

Lists of create_bounds commands for some of the most-hardened paths in Figure VI-11 were combined, and new placements implemented, to see if effects from placement constraints are additive. Figures VII-1 and VII-2 show examples of this process. In the first example, a list of 13 bound constraints produced a placement where Out[7] was hardened against SEMT errors. A second list of 13 bound constraints likewise hardened Out[2]. Combining these into a list of 23 commands (excluding overlap) produced a placement that achieved reduced error propagation probabilities in both logic paths as compared to an unconstrained placement, for different levels of injected transients in the SEMT simulation workflow. As illustrated, this comes at the cost of a slightly higher average circuit error probability.

This experiment was repeated with two other create_bounds lists corresponding to hardening of two other logic paths; Figure VII-2 demonstrates the combination of lists for Out[11] and Out[13] to achieve lower error probabilities than an unconstrained placement in both logic paths. The two selected logic paths are hardened against SEMT errors by 15.5%, with a penalty of increasing the average circuit SEMT EPP by 2.3%.

![Figure VII-1](image1.png)

Figure VII-1. Results of combining two sets of `create_bounds` commands that lead to improvements in output error propagation probability, shown to be additive to a certain practicable degree. Each of the two outputs is shown, as well as the average circuit EPP change.

![Figure VII-2](image2.png)

Figure VII-2. A second example of combining two sets of `create_bounds` commands to harden multiple outputs through bind commands, showing the repeatability of this method.

These experiments show that use of placement constraints for reduction of charge-sharing-induced errors can be additive. However, there are limits to this method as well. These initial lists of bind commands are simply produced from timing reports and do not pertain directly to strict logical connections. Figure VII-3 was produced by combining bind constraints from 4 placements that each hardened a particular logic path, then evaluating for SEMT sensitivity versus default placement. The placement produced with this combined list of 27 create_bounds commands resulted in a very slightly-reduced EPP for the circuit as a whole; three of the selected outputs were hardened against most SEMT-induced errors, but this strategy failed to reduce SEMT-induced errors in all selected outputs.

![Logic Path Binding: Combined 4 Outputs](image)

Figure VII-3. Following Figures VII-1 and VII-2, this plot combines bind constraints for four output best-case scenarios, for a partially-hardened result. Additive operations are limited with this coarse methodology.

Analysis shows that IC Compiler may not be able to handle large numbers of intersecting create_bounds commands, and that another, more stringent placement mechanism focusing on isolated pairs may be more routinely successful.

Lastly, bind command experiments were performed to see if transient errors in individual logic paths can be truncated by binding together cells located at the same logical depth from the output. When multiple transients are produced by a single particle and these transients interact and overlap as they propagate, they may partially cancel and create smaller overall transients. If small enough, these remnants can be attenuated via electrical masking or missed entirely due to latch-window masking.

For a selected output, this study calculated the gates leading to that output and their respective depths, using the command “report_transitive_fanin -to outx”, and bound together all cells at each level of depth from the output. Several placements were created, some aiming to harden a single output and some hardening multiple outputs, to 3-5 levels of depth. During SEMT characterization, transient pulsewidth was measured at the output for each observed error and compared to SEMT characterization of a base, unconstrained model. Figure VII-4 shows some of the results from this experiment. The majority of outputs chosen for selective hardening saw only a minimal reduction in transient error pulsewidth, ~1%, due to “hardening” via this placement method, and thus this method seems inefficient as a hardening tool. Out[14] in one placement experiment showed a more significant reduction in output errors, but in general this method did not conclusively reduce transient pulsewidths.
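The depth grouping step can be sketched as a breadth-first traversal backwards from the selected output. This is an illustrative reconstruction only: the fan-in map, gate names, and function name are hypothetical, and assigning a gate that is reachable at several depths to its shallowest depth is an assumption that the text does not specify.

```python
from collections import deque

# Hypothetical fan-in map: gate -> gates that drive it.
FANIN = {
    "out_gate": ["U7", "U8"],
    "U7": ["U3", "U4"],
    "U8": ["U4", "U5"],
    "U3": [],
    "U4": [],
    "U5": [],
}


def cells_by_depth(fanin, output_gate, max_depth):
    """Group gates by logic depth from output_gate, up to max_depth levels.

    Assumption: a gate reachable along several paths is assigned its
    shallowest depth (first visit in breadth-first order).
    """
    depth = {output_gate: 0}
    queue = deque([output_gate])
    while queue:
        gate = queue.popleft()
        if depth[gate] == max_depth:
            continue  # do not expand beyond the requested depth
        for driver in fanin.get(gate, []):
            if driver not in depth:
                depth[driver] = depth[gate] + 1
                queue.append(driver)
    levels = {}
    for gate, d in depth.items():
        levels.setdefault(d, []).append(gate)
    return {d: sorted(cells) for d, cells in sorted(levels.items())}


levels = cells_by_depth(FANIN, "out_gate", max_depth=3)
for d, cells in levels.items():
    print(d, cells)
```

Each returned level is a candidate set of cells to bind together; levels containing a single cell would be skipped when generating constraints.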

Figure VII-4. Difference in output transient error pulsewidth from models that have been modified by binding together cells at the same logic depth from the specified output, as compared to a base, unhardened model.

In this section, placement experiments and SEMT characterization simulations have shown that placement constraints can be repeatable and additive in having an effect on reducing charge-sharing-induced error vulnerability, but that there are also limits in physical design-based hardening. Overlapping placement constraints are difficult for IC Compiler to implement, and mass quantities of constraints negate intended hardening attempts.

B. Logic Design for SEMT Mitigation

Using physical design techniques to restructure a circuit so that charge sharing masks multiple errors rather than causing multiple errors has a definite dependence on the logic composition of the specimen circuit. Logic resynthesis for radiation hardening has been investigated in depth by other works [Limbrick et al. 2013], so experiments performed by this dissertation are brief. This section will focus on the impact that including or excluding logic cells from the synthesis library has on physical design implementation.

In the pulse quenching example of Figure III-4, the two gates investigated, inverters, are directly complementary. If transients are induced at two adjacent inverters, as in the example, then the transient at the fan-out inverter will mask the transient at the fan-in inverter. This case is ideal for pulse quenching, and therefore a model case for “positive” charge sharing. Logic circuits, however, being composed of a much larger variety of logic gates, present a more challenging situation for SEMT-aware design.

In this subsection and the next, circuits are quantified and investigated in terms of the average Logic Masking Ratio (LMR) of their constituent gates, in an effort to quantify complementary capabilities of gates. LMR is here defined as the minimum ratio of possible outputs, that is, the fraction of truth-table rows that produce the less common output value. An inverter has two possible outputs, 0 and 1. The LMR of these outputs is therefore 0.5. In like manner, an XOR2 gate has possible outputs of 0, 1, 1, 0, and also an LMR of 0.5. AND2 and OR2, with outputs of 0, 0, 0, 1 and 0, 1, 1, 1, respectively, have an LMR of 0.25. 3-input AND and OR gates have an LMR of 0.125. The maximum LMR of a gate is 0.5.
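The LMR values quoted above can be reproduced by enumerating each gate's truth table. The sketch below assumes that "minimum ratio of possible outputs" means the count of the less frequent output value divided by the number of input combinations, which is the reading consistent with the quoted values; the function and dictionary names are illustrative.

```python
from itertools import product


def lmr(gate_fn, n_inputs):
    """Logic Masking Ratio: share of truth-table rows with the rarer output."""
    outputs = [gate_fn(*bits) for bits in product((0, 1), repeat=n_inputs)]
    ones = sum(outputs)
    zeros = len(outputs) - ones
    return min(ones, zeros) / len(outputs)


# Gate function and input count for the gates discussed in the text.
GATES = {
    "INV": (lambda a: 1 - a, 1),
    "XOR2": (lambda a, b: a ^ b, 2),
    "AND2": (lambda a, b: a & b, 2),
    "OR2": (lambda a, b: a | b, 2),
    "AND3": (lambda a, b, c: a & b & c, 3),
    "OR3": (lambda a, b, c: a | b | c, 3),
}

LMR = {name: lmr(fn, n) for name, (fn, n) in GATES.items()}
for name, value in LMR.items():
    print(f"{name}: {value}")
```

Averaging these per-gate values over all gates in a netlist gives the circuit-level average LMR used in the comparisons that follow.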

One simplified hypothesis following from the pulse quenching example is that, assuming an even spread of 1/0 net values in an IC, higher-LMR gates may be more conducive toward pulse quenching-like behavior when a SEMT occurs. The actual case is highly dependent on input values and logic synthesis, of course, but in general, higher-LMR gates “flip” output value more often, leading to a higher potential for multiple, related “flips” to reduce final error counts.

The ISCAS85 c6288 16-bit multiplier circuit, synthesized as previously performed in this dissertation to the Synopsys 28/32-nm library, has 1440 gates, primarily INV, XOR, XNOR, AO/OA (AND-OR/OR-AND), NAND2/3, and NOR2/3 gates. As previously mentioned, NAND3 and NOR3 gates have very low LMR ratings at 0.125. The c6288 circuit was resynthesized excluding these gates from the legal set. The original circuit had an average LMR of 0.26; the new circuit, with 1875 gates, had an average LMR of 0.37.

Placement alternatives were created focused on each output, as part of Chapter VI; each placement alternative implements a series of create_bounds constraints developed from a timing report for one output of the circuit. SEMT characterization of each placement alternative was performed, and the results of these placement alternatives were aggregated together. The same steps were performed for regularly-synthesized c6288 circuits at multiple core utility ratios. Each placement alternative is compared to an unconstrained placement in order to quantify the number of SEMT errors reduced due to placement constraints.

Figure VII-5 demonstrates results from these characterization simulations. For each category of placements, the average number of errors reduced by using placement bounds is shown, all normalized to the same number of SEMT events modeled on each circuit. The aggregate of all of these results is fairly abstracted in this plot, but two points are suggested, albeit without statistical significance: (1) placement bounds can more effectively reduce SEMT errors in circuits with a higher density, and (2) placement bounds can more effectively reduce SEMT errors in circuits with a higher LMR.

![Output EPP Decrease vs. Charge Collection](chart.png)

Figure VII-5. Reduction of SEMT errors achievable using create_bounds techniques on c6288, at 70% density, 90% density, and 90% density resynthesized with higher LMR.

The results of these simulation experiments were not strong enough to suggest resynthesis of circuits as a solution to the SEMT problem. Resynthesis of circuits with a restricted standard cell library incurs noteworthy area, timing, and power penalties. However, SEMT-aware physical design could be directed to take advantage of nonhomogeneous circuits by focusing efforts on encouraging “positive” charge sharing opportunities in high-LMR sections of a circuit and discouraging charge sharing opportunities in lower-LMR sections.

C. Plan Groups for SEMT Mitigation

HDL netlists for an IC generally section designs into multiple hierarchies, for simpler, step-by-step implementation, rather than a flattened netlist requiring global placement and routing. This approach allows for a circuit designer to implement different constraints on different segments of a design. Following the plan group experiments of Chapter VI, and the logic analyses of the previous section, the next step is to investigate plan group shaping to take advantage of local differences in logic composition of a circuit.

As an experiment, several circuits from Table IV-1 were analyzed in terms of the logic gates in each module or submodule of their design hierarchy. Important metrics include the average logic masking ratio (LMR) of the gates within the module as previously described, the average logic depth (LD) of the module (# gates / # outputs), and the arithmetic/control functional nature of the module. These factors can be used to judge the chance that transient interaction/cancelation may occur in a module given a SEMT.

If LMR or LD factors indicate that a module of logic cells has a higher-than-average chance of transient interaction, then its plan group can be constrained to be elongated horizontally, which minimizes the number of wells and maximizes the chance that charge-sharing-induced SEMT remain within a module. Alternatively, if these factors determine that multiple transients induced in a cell group are unlikely to be able to interact and cancel, then its plan group may be elongated vertically, maximizing the number of wells and physically separating vulnerable cell pairs.
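A minimal sketch of this shaping decision follows. The module names and numbers are hypothetical, and the threshold is an assumption: a module is treated as having a "higher-than-average chance of transient interaction" when its metric strictly exceeds the mean of that metric across modules. The text does not state the exact threshold used.

```python
from statistics import mean

# Hypothetical per-module statistics (not measured values from this work).
MODULES = {
    "alu_add": {"lmr": 0.40, "gates": 120, "outputs": 8},
    "alu_ctrl": {"lmr": 0.22, "gates": 30, "outputs": 10},
    "alu_mux": {"lmr": 0.30, "gates": 60, "outputs": 8},
}


def average_lmr(module):
    """Average logic masking ratio of the gates in the module."""
    return module["lmr"]


def logic_depth(module):
    """Average logic depth as defined in the text: # gates / # outputs."""
    return module["gates"] / module["outputs"]


def shape_plan_groups(modules, metric):
    """Choose an elongation direction per module from one metric.

    Assumption: above the cross-module mean -> horizontal (fewer wells),
    otherwise -> vertical (more wells).
    """
    values = {name: metric(m) for name, m in modules.items()}
    threshold = mean(values.values())
    return {name: "horizontal" if v > threshold else "vertical"
            for name, v in values.items()}


by_lmr = shape_plan_groups(MODULES, average_lmr)
by_ld = shape_plan_groups(MODULES, logic_depth)
print(by_lmr)
print(by_ld)
```

The two metrics need not agree for a given module, which is one reason the LMR-based and LD-based placements are characterized separately before being combined.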

Figure VII-6 illustrates an example of using the LMR and LD classifications to inform horizontal or vertical skewing of plan groups in the placement design phase. Because the entire circuit is modified in this kind of placement strategy, the effect of LMR- and LD-based design choices may not be clear on the global SEMT sensitivity characteristics of the circuit. However, it is possible with benchmarks of these sizes to identify particular modules as primary contributors to particular IC outputs, and to observe those outputs for more direct impacts of plan group shaping strategies.

![Figure VII-6](image)

Figure VII-6. Example of LMR-based plan group shaping (left) and LD-based plan group shaping (right) of the ISCAS85 c3540 8-bit ALU benchmark circuit.

Placements were created with the LMR strategy and the LD strategy, and characterized for SEMT vulnerability at varying radiation event radii. Errors observed at the outputs were classified according to which functional group of outputs they are located at in a manner similar to Figure VI-12, and compared to SEMT error counts from the characterization of an unconstrained placement. Figure VII-7 shows the results from these simulation experiments, showing that the LMR strategy and LD strategy each reduce the number of observed SEMT errors in some logic paths to varying degrees, and have different effects on the overall circuit.

Figure VII-7. SEMT errors reduced at a modeled radiation event radius of 500 nm by using LMR, LD, and combined factors to inform plan group shaping for c3540, as compared to an unconstrained placement.

Transient interaction has a significant dependence on logic depth. If multiple transients are induced by a particle strike and then are presented directly to outputs, then multiple errors will be produced. If, instead, they have a large number of gates to propagate through (high logic depth), then the chances for transient interaction and/or logical masking are increased, and the transients may be masked.

Combining the results from LMR and LD shaping studies with a functional analysis of each module allows for the strongest effect on reducing SEMT errors via plan group regulation, also shown in Figure VII-7. Constraining placement of an entire core placement with plan groups, however, incurs performance costs in the region of 20-40%, and is not easily automated. Therefore, plan group-based experiments serve to suggest the importance of considering shared wells among adjacent cells, but may not be the most valuable means toward hardening circuits against SEMTs via an automated workflow.

Relative Placement for Pulse Quenching

Simulation experiments reported thus far in this chapter have provided evidence toward several observations: (1) small, local changes to placement have minimized performance costs; (2) reliability impacts of placement constraints can be additive and repeatable; (3) Boolean logic composition has an impact on the effectiveness of placement constraints; (4) charge sharing in common wells indicates that well-sensitive placement is important; and (5) global placement modification induces both positive and negative impacts on SEMT vulnerability, and is not a good candidate for common success. Altogether, these observations demonstrate that localized EDA placement modification has promise in encouraging “positive” charge sharing that produces pulse quenching of multiple transients, but a stricter implementation and a strategic Boolean logic analysis are required for optimal execution.

The positive properties of plan groups and cell bounds can be realized together with the application of relative placement constraints in Synopsys IC Compiler. A relative placement group is an association of cell instances and/or other relative placement groups, defined by the number of cell rows and columns that it uses, illustrated in Figure VII-8. With a strong methodology centered on multiple transient interaction between Boolean logic pairs, even higher SEMT hardening rates than those thus far reported can be attained, by defining design rules for local optimizations and automating scripted commands.

Relative placement groups (RP groups) are similar to cell bounds in that a group can consist of as few as 2 standard cells without incurring the performance penalties associated with larger plan groups, and hence many small groups can be constrained. RP groups surpass the benefits of cell bounds by providing additional opportunity for cell orientation locking and precise placement of a cell relative to another. RP groups are similar to plan groups in that they regulate placement within rectilinear sectors, but placement and shaping can be performed via discrete commands, allowing increased automation over plan groups.

The following subsections describe how RP groups can be used to encourage positive charge sharing opportunities for quenched transients and lower SEMT-induced error rates, through (1) an introduction and Boolean logic analysis, (2) development of a modular script for automated, reliability-aware placement modification, and (3) generation of consistently SEMT-hardened placement results.

D. Boolean Logic Analysis

Figure VII-9 (left) demonstrates the placement of two inverters in an RP group within a placed-and-routed IC. Three compiler commands are required. First, a relative placement group is created and defined to consist of one column and two rows. Second, U1470 is added to row 0, with allowed orientations of S or FS (south or flipped south, with VSS on “top” and VDD on “bottom”). Third, U1471 is added to row 1, with allowed orientations of N or FN (north or flipped north, with VDD on “top” and VSS on “bottom”). Upon running automated standard cell placement, the compiler honors these strict constraints while pursuing normal area/timing/power optimization, and places U1470 and U1471 such that they share a common n-well.

```
create_rp_group groupname -columns 1 -rows 2 -move_effort high -group_orient N
add_to_rp_group Circuit6288_pg::groupname -leaf U1470 -row 0 -orientation {S FS}
add_to_rp_group Circuit6288_pg::groupname -leaf U1471 -row 1 -orientation {N FN}
```

Figure VII-9. Example of a pair of logic gates with a direct fan-in connection to a common NOR gate, in layout and schematic form.

Relative placement of cells with a specified shared well becomes noteworthy when compared to the pulse quenching example of Figure III-4. In Figure III-4, deposited charge from a particle strike diffuses through a shared well, affecting adjacent, sensitized devices. In a very similar fashion in Figure VII-9, a particle strike event in the circled region will induce transient drives to VDD (logic HIGH) for U1470 and U1471 if they are not already held HIGH. If one gate is already held HIGH, then only one device will be sensitized and one transient will be produced, or none if both gates are already held HIGH. Other gates within reach of charge diffusion may also experience transient drives to HIGH.

Table VII-1. Boolean logic truth tables for NOR2, NAND2, XOR2, and XNOR2. Inputs AB, gate output, gate output given a SEMT event, and the error count at the output of the gate are listed. SEMT event for NOR2 considers fan-in gates with shared VDD rail; NAND2 with shared VSS rail; and XOR/XNOR unconstrained to a particular well.

<table>
<thead>
<tr>
<th>AB</th>
<th>NOR</th>
<th>NOR MT (VDD)</th>
<th>Errors</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>01</td>
<td>0</td>
<td>0</td>
<td>masked</td>
</tr>
<tr>
<td>10</td>
<td>0</td>
<td>0</td>
<td>masked</td>
</tr>
<tr>
<td>11</td>
<td>0</td>
<td>0</td>
<td>masked</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>AB</th>
<th>NAND</th>
<th>NAND MT (VSS)</th>
<th>Errors</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>1</td>
<td>1</td>
<td>masked</td>
</tr>
<tr>
<td>01</td>
<td>1</td>
<td>1</td>
<td>masked</td>
</tr>
<tr>
<td>10</td>
<td>1</td>
<td>1</td>
<td>masked</td>
</tr>
<tr>
<td>11</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>AB</th>
<th>XOR</th>
<th>XOR MT</th>
<th>Errors</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>0</td>
<td>0</td>
<td>masked</td>
</tr>
<tr>
<td>01</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>10</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>11</td>
<td>0</td>
<td>0</td>
<td>masked</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>AB</th>
<th>XNOR</th>
<th>XNOR MT</th>
<th>Errors</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>1</td>
<td>1</td>
<td>masked</td>
</tr>
<tr>
<td>01</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>10</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>11</td>
<td>1</td>
<td>1</td>
<td>masked</td>
</tr>
</tbody>
</table>

In this scenario, both U1470 and U1471 are fan-in of the nearby U1602 NOR2 gate. If two inputs to a NOR2 gate are logic HIGH, as with the example single-event, then the output of U1602 will be 0. Consulting the simple digital logic truth tables in Table VII-1, three out of the four possible outputs of a NOR2 gate are 0. Hence, in three of the four possible input conditions, a small-radius SEMT event in the circled region will induce two transients, at U1470 and U1471, that converge at U1602 and are masked to produce zero errors. In the remaining input condition, the 2-transient SEMT event will produce only a single error that propagates to other portions of the circuit.

Relative placement in the manner described effectively operates under the concept that if radiation-induced charge sharing occurs on a gate, then it is best to surround that gate with closely-connected gates so as to encourage masking of those charge-sharing-induced transients. The RP technique used in this dissertation does not prevent charge sharing. The goal is to increase the number of areas in the chip where transients are likely to be masked if charge sharing does occur. This effect is achieved through identifying and constraining the placement of common fan-in pairs that meet useful logical masking requirements.

Combinational circuits, at a simple level, are composed largely of NAND, NOR, XOR, XNOR, and INV gates, while other permutations are produced at lower rates in design synthesis. Table VII-1 lists four particularly valuable cases that can be appropriated for SEMT-masking by using RP groups. The case of two fan-ins with a common n-well to a NOR gate has already been described. Two fan-ins with a common p-substrate to a NAND gate also mask 2-transient SEMT events for three out of four input conditions, with a single error for the fourth condition. Two fan-ins with either a shared n-well or a shared p-substrate for an XOR or XNOR gate mask 50% of SEMT events, with a single error for the other 50%.
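The masking fractions in Table VII-1 can be checked by brute-force enumeration under the same simplified model: the SEMT drives both fan-in nets to one common logic value (HIGH for a shared n-well, LOW for a shared p-substrate), and the event is counted as masked when the gate output is unchanged. The names below are illustrative, and the sketch ignores pulsewidth, drive strength, and fan-out to other gates.

```python
from itertools import product

GATES = {
    "NOR2": lambda a, b: 1 - (a | b),
    "NAND2": lambda a, b: 1 - (a & b),
    "XOR2": lambda a, b: a ^ b,
    "XNOR2": lambda a, b: 1 - (a ^ b),
}

# Logic value(s) that both fan-in nets may be driven to by the event:
# NOR2 pairs share an n-well (HIGH), NAND2 pairs share a p-substrate (LOW),
# and XOR2/XNOR2 pairs may share either.
FORCED = {"NOR2": (1,), "NAND2": (0,), "XOR2": (0, 1), "XNOR2": (0, 1)}


def masked_fraction(name):
    """Fraction of (well, input) cases in which the gate output is unchanged."""
    gate = GATES[name]
    cases = masked = 0
    for forced in FORCED[name]:
        for a, b in product((0, 1), repeat=2):
            cases += 1
            if gate(forced, forced) == gate(a, b):
                masked += 1
    return masked / cases


MASKING = {name: masked_fraction(name) for name in GATES}
for name, fraction in MASKING.items():
    print(f"{name}: {fraction:.0%} masked")
```

Input conditions in which both nets already hold the forced value produce no transient at all and are counted here as masked, matching the convention of Table VII-1.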

Table VII-1 describes localized, simplified cases. In physical operation, an ion strike event that generates multiple transients (MT) will produce transients of different pulsewidths and drive strengths, according to the capacitance of the node and the collected charge, related to the distance between the particle strike location and the sensitive node(s) of the gate. Also, while two signals may converge at a common fan-in gate, they may propagate elsewhere as well, in which case those transient errors may propagate and still produce errors. However, this dissertation’s SEMT characterization model can account for different transient insertion and duration based on single-event intensity and location as described at the end of Chapter V. The simplified cases are highly relevant; using RP groups to maximize these identified transient masking opportunities creates “safe spaces” where transients are more likely to be quenched, with performance costs similar to cell bounds.

Beyond these 2-transient cases, it is possible to additionally create RP groups structured with more than two logic gates, to take advantage of other logical connections that, if affected by SEMT, may also mask the induced transients. However, RP groups, as defined with strict columns and rows, are fairly rigid structures. Placement of a 2-component RP group is roughly akin to placement of a doubly-sized standard cell, with some whitespace room provided if the two included gates are of different sizes. Placement of 2-component RP groups comes at a very low cost to both performance and placement complexity. The placement phase for logic IP of the size reported in this dissertation experiences only a short delay on the order of 10% in implementation.

E. Boolean Reliability Automated Design Flow

The goal of this dissertation is to produce an automated design flow that allows for simple constraining of standard cell placement in order to achieve more SEMT-reliable designs. The Boolean logic pairs described in the prior section achieve this for localized, 2-element cases, and RP group commands may be scripted to provide an elegant SEMT placement constraint interjection into the EDA flow, as illustrated in Figure VII-10.

A simple algorithm script is created to analyze a given netlist and identify occurrences of NAND/NOR/XOR/XNOR fan-in sensitive cell pairs. NAND and NOR fan-in cases are prioritized, given their ideal 75% SEMT masking, then RP groups are created for remaining XOR and XNOR fan-in pairs. If a gate is already present within an RP group, then it is excluded from being added to an additional RP group, in order to prevent overlapping constraints and to allow for successful placement by IC Compiler. Algorithm 4 lists the steps in this netlist processing script.

ALGORITHM 4.  RP Command Generator

**Input:** Standard 28/32-nm library, IC Compiler-generated cell connections

**Output:** RP constraint commands, statistic information

Load standard cell library, parse for list of legal cells
Load ICC_circuitnetlist_connections.txt

for each in [cell connections] do

  Create new array element for cell, add gate type
  Add fan-in connections
  Add fan-out connections
end

for each in [cells] do

  if cells[x].type == NAND then
    write create_rp_group -columns 1 -rows 2 -group_orient N
    write add_to_rp_group fanin1 -row 1 -orientation {S FS}
    write add_to_rp_group fanin2 -row 0 -orientation {N FN}
    remove fanin1, fanin2 from cells
  end

  if cells[x].type == NOR then
    write create_rp_group -columns 1 -rows 2 -group_orient N
    write add_to_rp_group fanin1 -row 1 -orientation {N FN}
    write add_to_rp_group fanin2 -row 0 -orientation {S FS}
    remove fanin1, fanin2 from cells
  end

  if cells[x].type == (XOR or XNOR) then
    write create_rp_group -columns 1 -rows 2
    write add_to_rp_group fanin1 -row 1
    write add_to_rp_group fanin2 -row 0
    remove fanin1, fanin2 from cells
  end

end

Save RP command file
Report number of write commands (RP cells)
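
The steps of Algorithm 4 can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the script used in this work: the netlist dictionary format, the `rp_<n>` group names, the `top` design name, and matching gate types by name prefix are all hypothetical. Only 2-input gates whose fan-ins are both cells are considered, and the prioritization described in the text is realized as two passes, NAND/NOR first and XOR/XNOR second. The emitted command strings follow the form of the `create_rp_group` and `add_to_rp_group` example shown earlier.

```python
def rp_commands(cells, design):
    """Return (command list, number of RP cells) for a parsed netlist.

    cells: dict mapping cell name -> {"type": str, "fanin": [cell names]}
    """
    # Pass 1: 75%-masking cases; pass 2: 50%-masking cases.
    # Each entry gives the orientations for (fanin1 in row 1, fanin2 in row 0).
    passes = [
        {"NAND": ("{S FS}", "{N FN}"), "NOR": ("{N FN}", "{S FS}")},
        {"XOR": (None, None), "XNOR": (None, None)},
    ]
    used = set()  # cells already placed in an RP group
    commands = []
    for rules in passes:
        for cell in cells.values():
            kind = next((k for k in rules if cell["type"].startswith(k)), None)
            fanin = cell["fanin"]
            if kind is None or len(fanin) != 2 or fanin[0] == fanin[1]:
                continue
            if any(f in used or f not in cells for f in fanin):
                continue  # avoid overlapping constraints
            orient1, orient0 = rules[kind]
            group = f"rp_{len(used) // 2}"
            create = f"create_rp_group {group} -columns 1 -rows 2"
            if orient1:
                create += " -group_orient N"
            commands.append(create)
            for leaf, row, orient in ((fanin[0], 1, orient1),
                                      (fanin[1], 0, orient0)):
                cmd = f"add_to_rp_group {design}::{group} -leaf {leaf} -row {row}"
                if orient:
                    cmd += f" -orientation {orient}"
                commands.append(cmd)
            used.update(fanin)
    return commands, len(used)


# Hypothetical toy netlist: U2 is a fan-in of both an XOR2 and a NOR2 gate.
CELLS = {
    "U1": {"type": "INV", "fanin": []},
    "U2": {"type": "INV", "fanin": []},
    "U3": {"type": "INV", "fanin": []},
    "U4": {"type": "XOR2", "fanin": ["U1", "U2"]},
    "U5": {"type": "NOR2", "fanin": ["U2", "U3"]},
}

commands, rp_cells = rp_commands(CELLS, design="top")
print("\n".join(commands))
print("RP cells:", rp_cells)
```

In the toy netlist, U2 is claimed by the NOR2 pair in the first pass, so the XOR2 pair that shares it is skipped in the second pass, which is the exclusion behavior described above.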

For this series of tests, the placement density target for each circuit was set at 80%.

For each circuit, floorplanning is conducted, RP constraints are applied, then automated standard cell placement is performed. create_fp_placement performs local and global optimizations while honoring the placement and orientation constraints of the specified RP groups. For large numbers of nominated RP groups, not all constraints can be honored by compiler-powered placement, but the RP placement success rate for all circuits tested for this dissertation is 95% or above. Statistics related to the RP nomination/placement algorithm are given in Figure VII-11.

Figure VII-11. Boolean reliability automated design flow: RP cell nomination, placement, and success statistics per circuit. Number of cells plotted on left axis, RP placement success rate on right axis.

Figure VII-12 (left) shows a close-up example of cell placement produced with RP groups. Identified gate fan-in pairs are placed across specified shared wells precisely when possible, with small deviations permitted by the compiler for placement legalization and performance enhancement. Figure VII-12 (right) demonstrates the large degree of netlist coverage within the IC core area achieved by this algorithm, with green cells highlighting 75% masking NAND/NOR cases, and yellow cells highlighting 50% masking XOR/XNOR cases. Yet, because RP groups are each a mere two cells, IC Compiler maintains the ability to rearrange these gate pairs similarly to unconstrained placement, and global optimizations still meet near-optimal results. Because cells are simply rearranged, area cost and static power cost for this placement algorithm are zero. As compared to an unconstrained placement, the cost to circuit speed averages 6.9%. Individual statistics are listed in Table VII-2.

Figure VII-12. Example of relative placement automation results for the ISCAS85 circuit c7552. Full core area on right with NAND/NOR RP groups highlighted in green and XOR/XNOR RP groups highlighted in yellow.

Table VII-2. Relative placement automated flow statistics, and associated performance costs.

<table>
<thead>
<tr>
<th>Circuit</th>
<th>RP Cells Placed</th>
<th>% RP Cells</th>
<th>Area Cost</th>
<th>Static Power Cost</th>
<th>Timing Cost</th>
</tr>
</thead>
<tbody>
<tr>
<td>cf_fp_5_10</td>
<td>1442</td>
<td>77%</td>
<td>0%</td>
<td>0%</td>
<td>11.9%</td>
</tr>
<tr>
<td>cf_fp_3_4</td>
<td>254</td>
<td>65%</td>
<td>0%</td>
<td>0%</td>
<td>7.9%</td>
</tr>
<tr>
<td>c1908</td>
<td>142</td>
<td>60%</td>
<td>0%</td>
<td>0%</td>
<td>6.9%</td>
</tr>
<tr>
<td>c2670</td>
<td>326</td>
<td>73%</td>
<td>0%</td>
<td>0%</td>
<td>5.6%</td>
</tr>
<tr>
<td>c3540</td>
<td>710</td>
<td>75%</td>
<td>0%</td>
<td>0%</td>
<td>4.4%</td>
</tr>
<tr>
<td>c6288</td>
<td>816</td>
<td>52%</td>
<td>0%</td>
<td>0%</td>
<td>3.0%</td>
</tr>
<tr>
<td>c5315</td>
<td>814</td>
<td>68%</td>
<td>0%</td>
<td>0%</td>
<td>7.1%</td>
</tr>
<tr>
<td>c7552</td>
<td>804</td>
<td>68%</td>
<td>0%</td>
<td>0%</td>
<td>8.4%</td>
</tr>
</tbody>
</table>
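
The 6.9% average timing cost quoted above can be checked directly against the per-circuit values in Table VII-2. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative only and are not part of the design flow.

```python
# Timing cost (%) per circuit, copied from Table VII-2.
timing_cost_pct = {
    "cf_fp_5_10": 11.9,
    "cf_fp_3_4": 7.9,
    "c1908": 6.9,
    "c2670": 5.6,
    "c3540": 4.4,
    "c6288": 3.0,
    "c5315": 7.1,
    "c7552": 8.4,
}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


average_timing_cost = mean(timing_cost_pct.values())
# Circuit with the largest timing penalty relative to unconstrained placement.
worst_circuit = max(timing_cost_pct, key=timing_cost_pct.get)

print(f"Average timing cost: {average_timing_cost:.1f}%")
print(f"Largest timing cost: {worst_circuit} ({timing_cost_pct[worst_circuit]}%)")
```

The average here is unweighted, so each circuit counts equally regardless of its size.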

F. Relative Placement Results

Post-placement, each RP-optimized circuit is characterized for SET and SEMT sensitivity rates at several modeled radiation event radii, representing different severities of radiation, as per the methodology described in Chapter IV. Each circuit is also characterized with an unconstrained placement as a baseline for comparison. For each radiation event radius, SEMT-induced errors are defined as the number of errors observed beyond those found in traditional SET testing. Therefore, a 30% reduction in SEMT-induced errors means that the circuit error propagation probability in the specified environment has been reduced 30% of the way towards the base SET rate.
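
The definition above can be made concrete with a short sketch. The error counts below are hypothetical and chosen only to give round numbers, and the function names are illustrative. The sketch also assumes that the same SET baseline applies to both the unconstrained and the RP-optimized placement.

```python
def semt_induced_errors(total_errors, set_errors):
    """Errors attributable to multiple transients: those above the SET baseline."""
    return total_errors - set_errors


def semt_reduction(set_errors, unconstrained_total, rp_total):
    """Fractional reduction in SEMT-induced errors of RP versus unconstrained placement.

    Assumes one shared SET baseline for both placements (an assumption of this sketch).
    """
    base = semt_induced_errors(unconstrained_total, set_errors)
    hardened = semt_induced_errors(rp_total, set_errors)
    if base <= 0:
        raise ValueError("unconstrained placement shows no SEMT-induced errors")
    return (base - hardened) / base


# Hypothetical counts at one radiation event radius (not measured data).
set_errors = 1000            # errors seen in traditional single-transient testing
unconstrained_total = 1500   # 500 SEMT-induced errors above the SET baseline
rp_total = 1350              # 350 SEMT-induced errors above the SET baseline

reduction = semt_reduction(set_errors, unconstrained_total, rp_total)
print(f"Reduction in SEMT-induced errors: {reduction:.0%}")
```

In this hypothetical case the total error count falls by only 10% (1500 to 1350), but the SEMT-induced portion falls by 30%, which is the quantity reported in this section.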

Figure VII-13. Reduction in SEMT-induced errors for different outputs of cf_fp_mul_c_3_4 8-bit floating-point multiplier using RP strategy versus unconstrained design.

Figure VII-13 shows the SEMT hardening achieved for each logic path of the 8-bit floating-point multiplier benchmark circuit, with mantissa bits shown in cool colors and exponent bits in warm colors. The sign bit logic path consists of one gate and experiences no SEMT impact due to placement. The figure shows that SEMT errors are reduced for every output as compared to an unconstrained placement, for several modeled radiation event radii. Because RP groups are optimized for two-element pairs, the effectiveness of this placement strategy drops off as the modeled radiation event radius increases beyond this size, but the number of SEMT errors for the circuit as a whole remains reduced as compared to an unconstrained design up to a modeled radius of 1500 nm, far beyond most common radiation particle strike sizes.
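
The drop-off with increasing radius can be illustrated with a simplified geometric sketch. It assumes a circular event footprint and treats each cell as a single center point, with hypothetical coordinates and cell names; the characterization methodology of Chapter IV may differ in its details, including its treatment of well geometry.

```python
import math


def cells_within_radius(cell_centers, strike_xy, radius_nm):
    """Return the names of cells whose center lies within radius_nm of the strike."""
    sx, sy = strike_xy
    return sorted(
        name
        for name, (x, y) in cell_centers.items()
        if math.hypot(x - sx, y - sy) <= radius_nm
    )


# Hypothetical row of four cells with centers 400 nm apart (not real layout data).
cell_centers = {
    "U1": (0, 0),
    "U2": (400, 0),
    "U3": (800, 0),
    "U4": (1200, 0),
}
strike = (200, 0)  # strike location midway between U1 and U2

for radius in (100, 250, 700, 1500):
    hit = cells_within_radius(cell_centers, strike, radius)
    print(f"radius {radius:>4} nm -> {len(hit)} cell(s): {hit}")
```

Under these assumptions, once the radius is large enough to cover more than an adjacent pair, cells outside any two-element RP group are also affected, which is consistent with the reduced benefit at larger radii.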
Figure VII-14. Reductions in SEMT-induced errors for all benchmark circuits and modeled radiation event radii. Average 30% reduction in SEMT error for most common radiation event radii, and decreasing as the radius increases.

Figure VII-14 shows the levels of SEMT hardening achieved for all modeled benchmark circuits in aggregate. Generally, there is a correlation between the percentage of algorithm-nominated RP groups (Table VII-1 column 2) and the percentage of SEMT errors masked (Table VII-3 column 5). Because these are small, localized changes to placement, the ability of this RP placement strategy to mask SEMT-induced errors is again most effective with small radiation events. As the modeled radiation event radius increases and the number of affected cells increases, localized changes in placement become less impactful. However, given that low-LET ion events are more common than high-LET strikes, it is more important to achieve higher levels of radiation hardening for events with a smaller radiation event radius. Given a 100-nm radius, the RP strategy reduces SEMT-induced errors by as much as 56% (in the cf_fp_5_10 floating-point multiplier) and by approximately 30% on average across the benchmark circuits.
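
The correlation noted above can be estimated from the tabulated data. The sketch below pairs each circuit's percentage of RP cells, taken from Table VII-2, with its percentage of SEMT errors masked at a 100-nm radius, taken from Table VII-3, and computes a Pearson correlation coefficient. The choice of Pearson's coefficient is an assumption of this sketch, and with only eight circuits the result should be read as indicative rather than conclusive.

```python
import math

# circuit: (% RP cells from Table VII-2, % SEMT errors masked from Table VII-3)
rp_vs_masked = {
    "cf_fp_5_10": (77, 56.4),
    "cf_fp_3_4": (65, 35.5),
    "c1908": (60, 15.7),
    "c2670": (73, 28.3),
    "c3540": (75, 32.3),
    "c6288": (52, 12.2),
    "c5315": (68, 35.5),
    "c7552": (68, 20.0),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rp_pct = [pair[0] for pair in rp_vs_masked.values()]
masked_pct = [pair[1] for pair in rp_vs_masked.values()]
r = pearson(rp_pct, masked_pct)
print(f"Pearson r between % RP cells and % errors masked: {r:.2f}")
```

For these eight data points the coefficient comes out at approximately 0.77, a positive relationship in line with the trend described in the text.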
Table VII-3. Relative placement flow statistics for SEMT-induced error masking at simulated 100-nm radiation event radius.

<table>
<thead>
<tr>
<th>Circuit</th>
<th>% RP Cells</th>
<th># of SEMT Errors</th>
<th># Errors Masked</th>
<th>% Errors Masked</th>
</tr>
</thead>
<tbody>
<tr>
<td>cf_fp_5_10</td>
<td>77%</td>
<td>80234</td>
<td>45248</td>
<td>56.4%</td>
</tr>
<tr>
<td>cf_fp_3_4</td>
<td>65%</td>
<td>15308</td>
<td>5440</td>
<td>35.5%</td>
</tr>
<tr>
<td>c3540</td>
<td>75%</td>
<td>183525</td>
<td>59334</td>
<td>32.3%</td>
</tr>
<tr>
<td>c2670</td>
<td>73%</td>
<td>78341</td>
<td>22195</td>
<td>28.3%</td>
</tr>
<tr>
<td>c5315</td>
<td>68%</td>
<td>197036</td>
<td>69897</td>
<td>35.5%</td>
</tr>
<tr>
<td>c7552</td>
<td>68%</td>
<td>210724</td>
<td>42168</td>
<td>20.0%</td>
</tr>
<tr>
<td>c1908</td>
<td>60%</td>
<td>65736</td>
<td>10306</td>
<td>15.7%</td>
</tr>
<tr>
<td>c6288</td>
<td>52%</td>
<td>775363</td>
<td>94954</td>
<td>12.2%</td>
</tr>
</tbody>
</table>
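
The percentages in the final column of Table VII-3, and the approximately 30% average quoted in the text, follow from the error counts in the table. The sketch below recomputes them; the names are illustrative, and the average is taken as an unweighted mean across circuits, which is an assumption about how the quoted figure was obtained.

```python
# circuit: (# of SEMT errors, # errors masked) at a 100-nm radius, from Table VII-3.
semt_counts = {
    "cf_fp_5_10": (80234, 45248),
    "cf_fp_3_4": (15308, 5440),
    "c3540": (183525, 59334),
    "c2670": (78341, 22195),
    "c5315": (197036, 69897),
    "c7552": (210724, 42168),
    "c1908": (65736, 10306),
    "c6288": (775363, 94954),
}

# Percentage of SEMT-induced errors masked by the RP strategy, per circuit.
pct_masked = {
    name: 100.0 * masked / errors
    for name, (errors, masked) in semt_counts.items()
}

# Unweighted mean: each circuit counts equally, regardless of its error count.
mean_pct_masked = sum(pct_masked.values()) / len(pct_masked)

for name, pct in pct_masked.items():
    print(f"{name:>10}: {pct:5.1f}% masked")
print(f"Mean across circuits: {mean_pct_masked:.1f}%")
```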

One strength of the RP algorithm for placement-based hardening against SEMT-induced errors is the ability to take advantage of design complexity in modern intra-pipeline logic. Note the lowest level of masking, 12.2% of SEMT errors in the ISCAS85 c6288 design, which also corresponds to the lowest ratio of RP cells placed, 52%. The c6288 circuit is a homogeneous multiplier, formed purely of 240 full and half adders. All other circuits are heterogeneous, multi-level hierarchies formed of many different standard cells and logic structures. Varied compositions allow for greater give-and-take during design synthesis and place-and-route, producing a high percentage of RP cells and a correspondingly high percentage of masked SEMT errors at low performance costs, while the homogeneous c6288 can only be ~50% constrained by RP placement rules before constraints begin to overlap, limiting RP group nomination.

Conclusion

The logic-aware relative placement design flow represents a cost-effective method to mask SEMT vulnerability in logic IP. As-is, the timing cost of the RP hardening strategy keeps performance well within typical design margins. Area and static power costs remain at zero for a design strategy that rearranges components rather than adding or removing them. The increasing complexity of modern ICs and logic IP is promising for increased applicability of placement-based SEMT hardening, as this strategy naturally takes advantage of placement heterogeneity and broader logic composition to ensure non-overlapping constraints. Finally, the automated implementation allows a circuit designer to submit a logic IP netlist and automatically produce a list of constraints, which are implemented in standard cell placement with minimal delay in computation.
CHAPTER VIII

CONCLUSIONS

As fabrication technology scales towards smaller transistor sizes and lower critical charge, single-event radiation effects are more likely to cause errant behavior in multiple, physically adjacent devices in modern integrated circuits. With higher operating frequencies, this risk increasingly impacts design logic over memory as well. In order to increase future IC reliability, circuit designers need greater awareness of the effects of single-event multiple-transients (SEMTs) in logic. Understanding the behavior of this class of error will provide guidance in implementing mitigation strategies in the early stage of standard cell placement in the design process.

To measure the propagation and observability of multiple transients from single radiation events and characterize the response of a circuit to SEMTs, this work produces a novel method of combining netlist, placement, and timing information with statistical input vector coverage and scalable transient injection testing, with automated simulation implementation and test result processing. Several intra-pipeline combinational logic circuits at the 28/32-nm technology node are simulated and characterized, using multiple different standard cell placements of each design, to show the effects of multiple simultaneous transients in logic and how SEMT reliability can be modulated and improved by changing cell placement. It was shown that layout-informed SEMT simulation provides increased accuracy over netlist-only simulation, and that including n-well/p-substrate geometry models charge-sharing effects more accurately than previous methods in the literature.

Although a variety of methods can be used to harden circuits against radiation effects, one level of the design hierarchy where hardening has not previously been thoroughly explored is standard cell placement. This dissertation explored the different methods available within the commercial EDA tool Synopsys IC Compiler to modify the design step of standard cell placement, and compared these methods on the basis of circuit performance and impact on charge sharing sensitivity at a circuit level. It is shown that modifying the standard cell placement of a logic circuit has an impact on the propagation and potential reconvergence of single-event multiple-transients. SEMT hardening useful for approximate computing applications is achieved through plan grouping, and hardening of selected logic paths is made possible with cell bounds.

Drawing together the experiments on placement modification and SEMT reliability, an automated standard cell placement algorithm has been developed to give the circuit designer an easily implemented method for placing a circuit design with reliability as a constraint, while maintaining traditional PPA constraints. The relative placement automated design flow, based on complementary relationships inherent to Boolean logic, was applied to several circuits, achieving an average of 30% masking of SEMT-induced errors at a 100-nm radiation event radius for the circuits tested in this dissertation, with no cost to area or static power and an average timing cost of 6.9%.