Reduced On-chip Storage of Seeds for Built-in Test Generation

Published: 14 March 2024

Abstract

Logic built-in self-test (LBIST) approaches use an on-chip logic block for test generation and thus enable in-field testing. Recent reports of silent data corruption underline the importance of in-field testing. In a class of storage-based LBIST approaches, compressed tests are stored on-chip and decompressed by an on-chip decompression logic. The on-chip storage requirements may become a bottleneck when the number of compressed tests is large. In this case, using each compressed test for applying several different tests allows the storage requirements to be reduced. However, producing different tests from each compressed test has a hardware overhead. This article suggests a new on-chip storage scheme for compressed tests that eliminates the additional hardware overhead. Under the new storage scheme, a set of N B-bit compressed tests targeting a set of faults F0 is translated into a sequence S of N ⋅ B bits. Every B consecutive bits of S are considered as a compressed test. The sequence S thus yields close to N ⋅ B compressed tests, magnifying the test data stored in S almost B times. Taking advantage of the extra tests, the article describes a software procedure that is applied offline to reduce S without losing fault coverage of F0. Experimental results for benchmark circuits demonstrate significant reductions in the storage requirements of S and significant increases in the fault coverage of a second set of faults, F1.

1 Introduction

Testing of electronic chips is important after manufacturing, to detect defects that were introduced by fabrication processes, as well as in-field, to detect defects that escaped manufacturing testing or occurred during the lifetime of a chip. Test application can be performed by automatic test equipment (ATE) or using on-chip logic. Logic built-in self-test (LBIST) approaches [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28] use an on-chip logic block for test generation, making it unnecessary to use external test data. This facilitates in-field test application. It also addresses security concerns related to loading and unloading of test data [10]. Recent reports of silent data corruption underline the importance of LBIST for in-field testing.
A linear-feedback shift-register (LFSR) is typically an important part of the on-chip test generation logic. A basic LBIST approach uses the LFSR for applying pseudo-random tests [1]. The fault coverage achieved by the LFSR is increased significantly if multiple initial states, referred to as seeds, are used for the LFSR. Each seed produces a different subset of tests and contributes to the detection of a different subset of faults [2, 5, 7, 17].
A linear logic block, such as an LFSR, is also used as part of test data compression approaches. In this case, an ATE stores compressed tests similar to seeds for an LFSR. On-chip decompression logic transforms a compressed test provided by the ATE into a test that can be applied to the circuit. In [12, 21, 22] and [28], compressed tests are stored on-chip and decompressed using on-chip decompression logic to apply tests to the circuit. This results in the LBIST approach illustrated in Figure 1 and referred to as storage-based LBIST.
Fig. 1. Storage-based LBIST approach.
To reduce the storage requirements for compressed tests or increase the fault coverage, each compressed test in [21, 22] and [28] is used for applying several different tests to the circuit. The LBIST approach from [21] complements scan vectors produced by the on-chip decompression logic to allow each compressed test to produce several different tests. The LBIST approach from [22] complements values in the compressed tests before they are decompressed. The approach described in [28] partitions compressed tests (seeds for an LFSR) into subvectors. It combines subvectors on-chip pseudo-randomly to form new seeds that are then used by the on-chip decompression logic for forming different tests. The variations possible with pseudo-random combinations of subvectors allow the number of subvectors and storage requirements to be reduced significantly.
Other LBIST approaches also modify LFSR seeds or the pseudo-random tests they produce to apply different tests and thus increase the fault coverage [5, 6, 7, 13, 18].
In all the approaches discussed thus far, additional hardware is required for forming additional tests that improve the fault coverage or allow the storage requirements to be reduced. In the storage-based LBIST approaches, the extra hardware is used for complementing bits of compressed or applied tests or for forming seeds from subvectors.
This article suggests a new storage scheme for LFSR seeds in the configuration from Figure 1 that does not require additional hardware other than that required for storage. The new storage scheme is applied to a compressed deterministic test set \(\Psi\) consisting of N seeds that was generated for a B-bit LFSR targeting a set of faults \(F_0\) , with every seed in \(\Psi\) producing one test. The set \(\Psi\) is translated into a sequence of \(N \cdot B\) consecutive bits. The sequence is denoted by S and referred to as a seed sequence. Every B consecutive bits of S are considered as a seed. The seed sequence S thus yields close to \(N \cdot B\) seeds, magnifying the test data stored in S almost B times. By storing S in a shift register and using the B leftmost bits of the shift register as seeds, it is possible to perform test application without storing any additional information and without requiring additional hardware support. Thus, the first contribution of this article is a storage scheme that magnifies the stored seeds without requiring additional hardware support.
The second contribution of this article is a software procedure that modifies the seed sequence S to reduce its storage requirements. The procedure is applied offline to reduce S before it is stored on-chip. Whereas general-purpose storage schemes do not allow the stored data to be modified, the modification of S is accompanied by a fault simulation procedure to ensure that no loss of fault coverage occurs for \(F_0\) when S is modified. After translating \(\Psi\) into a seed sequence S, the procedure omits from S bits that are not needed as part of any seed and reorders S to obtain a new seed sequence from which additional bits can be omitted. When bits are omitted from S, the number of applied tests is reduced as well.
The third contribution of this article is to consider two sets of faults from different fault models: \(F_0\) , for which \(\Psi\) was generated, and a second set of faults, \(F_1\) , that was not targeted when \(\Psi\) was generated. Detecting faults from a second fault model is important for the quality of the test set. Compared with \(\Psi\) , the initial seed sequence S yields a significantly higher fault coverage for \(F_1\) . As the seed sequence is reduced, the procedure simulates the faults in \(F_1\) to determine the effects of the reduction on the fault coverage of \(F_1\) .
For the experiments in this article, \(F_0\) consists of stuck-at faults, and \(F_1\) consists of single-cycle gate-exhaustive faults. The storage scheme and software procedure are not limited to these fault models, and other fault models can be considered instead.
The article is organized as follows. Section 2 describes the storage scheme for seeds. Section 3 describes the software procedure for producing a seed sequence with reduced storage requirements. Section 4 presents experimental results. Section 5 concludes the article.

2 Storage of Seeds

This section discusses the storage scheme for seeds. A given set of seeds \(\Psi = \lbrace \Psi _0 , \Psi _1 , \ldots , \Psi _{N-1} \rbrace\) consists of N seeds. Each seed has B bits. A seed is represented as \(\Psi _i = \psi _{i,0} \psi _{i,1} \ldots \psi _{i,B-1}\) .
For simplicity, it is assumed that a test is compressed into a single seed. In general, a test may be compressed into several seeds. With m seeds per test and N tests, the number of seeds in \(\Psi\) would be \(m \cdot N\) , and m seeds would be used for applying a test.
Figure 2 shows a set of seeds \(\Psi\) with \(N = 3\) and \(B = 4\) as a two-dimensional array. If \(\Psi\) is stored in an on-chip memory, the used part of the memory is of dimensions \(N \times B\) , and a seed is obtained by accessing a memory word.
Fig. 2. Set of seeds.
The entries of a two-dimensional array are commonly stored consecutively. The set \(\Psi\) with the seeds stored consecutively is shown at the top of Figure 3.
Fig. 3. Seed sequence.
The bits of the seeds in \(\Psi\) are renumbered as shown by the array S in Figure 3. Considering every B consecutive bits of S as a seed, S yields \(N \cdot B - B + 1 = 9\) different seeds, \(S_0 = s_0 s_1 s_2 s_3 = \Psi _0\) , \(S_1 = s_1 s_2 s_3 s_4\) , \(S_2 = s_2 s_3 s_4 s_5, \ldots , S_8 = s_8 s_9 s_{10} s_{11} = \Psi _2\) . Four of these seeds are shown in Figure 3.
In general, the number of bits in S is denoted by n. With N seeds and B bits per seed, \(n = N \cdot B\) initially. The procedure described in Section 3 produces a seed sequence S with a reduced value of n. Using every B consecutive bits of \(S = \langle s_0 s_1 \ldots s_{n-1} \rangle\) as a seed, the possible seeds are \(S_i = s_i s_{i+1} \ldots s_{i+B-1}\) , for \(0 \le i \le n-B\) .
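The seed extraction described above can be sketched in a few lines of Python. This is only an illustration: the bit values in `psi` and the function names `seed_sequence` and `extract_seeds` are assumptions made for the example and do not come from the article.

```python
def seed_sequence(seeds):
    """Concatenate equal-length seeds (bit strings) into one seed sequence S."""
    return "".join(seeds)


def extract_seeds(s, b):
    """Return S_i = s[i : i + b] for 0 <= i <= n - b, where n = len(s)."""
    return [s[i:i + b] for i in range(len(s) - b + 1)]


# Hypothetical bit values for a set of N = 3 seeds with B = 4 bits each.
psi = ["1011", "0010", "1101"]
s = seed_sequence(psi)       # n = N * B = 12 bits
seeds = extract_seeds(s, 4)  # N * B - B + 1 = 9 seeds
```

In this sketch the original seeds reappear at positions 0, B, and 2B of the extracted list, and the six seeds in between are the additional ones that straddle two original seeds.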
Storage of S on-chip can be implemented in one of several ways. The method for extracting the seeds from S depends on the way S is stored on-chip. If S is stored in an on-chip memory, it is necessary to be able to address every B consecutive bits of the memory. A counter can be used for pointing to the first bit of a seed. Instead, it is possible to implement S as a shift register. As the shift register is shifted left one bit at a time, the B leftmost bits of the shift register provide the next seed. Figure 4 illustrates this implementation for the set of seeds from Figure 3. Figure 4 shows the first five seeds obtained from S. Compared with storage of \(\Psi\) in an on-chip memory, both storage schemes of S are more complex. The benefit is that significantly fewer bits need to be stored.
Fig. 4. Storage of seeds using a shift register.
Not every seed that can be extracted from S is needed for detecting target faults. In Figure 5, an additional bit, denoted by \(a_i\) , indicates whether the seed \(S_i\) should be used for producing a test, and \(A = \langle a_0 a_1 \ldots a_{n-1} \rangle\) . Only when \(a_i = 1\) is \(S_i\) used for producing a test. In the example of Figure 5, the test set consists of the tests based on \(S_0\) , \(S_1\) , and \(S_7\) . The vector A does not need to be stored if it is acceptable to apply all \(n-B+1\) tests based on S.
Fig. 5. Storage of seeds and tests to apply.
It should be noted that circular shift allows n seeds to be obtained from S. Experimental results indicate that the addition of \(B - 1\) seeds obtained with circular shift does not have a significant effect on the results. Therefore, circular shift is not used in this article.
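For completeness, the circular variant mentioned above can be sketched as follows. The function name `extract_seeds_circular` and the bit values are assumptions for illustration; the article itself does not use circular shift.

```python
def extract_seeds_circular(s, b):
    """Return n seeds of b bits each, wrapping around the end of s."""
    n = len(s)
    wrapped = s + s[:b - 1]  # append the first b - 1 bits to emulate a circular shift
    return [wrapped[i:i + b] for i in range(n)]


s = "101100101101"  # hypothetical 12-bit seed sequence, B = 4
circular_seeds = extract_seeds_circular(s, 4)
```

The first \(n-B+1\) entries are the seeds available without wrapping; the last \(B-1\) entries are the extra seeds contributed by the circular shift.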

3 Procedure for Reducing the Seed Sequence

This section describes a software procedure for producing a seed sequence with reduced storage requirements from a given set of seeds. The procedure produces the seed sequence S that will be stored on-chip.

3.1 Procedure Overview

The given set of seeds is denoted by \(\Psi\) , and it consists of N seeds for a B-bit LFSR. The set \(\Psi\) targets a set of faults denoted by \(F_0\) . The same set of target faults is used while reducing the seed sequence S. A second set of target faults, \(F_1\) , is used for fault simulation only.
The procedure is outlined in Figure 6. Initially, all the seeds in \(\Psi\) are concatenated to form the initial seed sequence S. The procedure forms a test set T based on S by using all the seeds from S to produce tests and including in T tests that detect faults from \(F_0\) . The procedure reduces S by identifying bits that do not contribute to any test in T and removing them from S.
Fig. 6. Procedure for constructing a seed sequence.
After S is reduced, new seeds may be obtained from the reduced seed sequence. With the new seeds it is possible to form a new test set T. Based on the new test set T, additional bits of S may be removed. This process is repeated as long as S can be reduced.
When no additional reduction of S is possible, the test set T is used for partitioning S into non-overlapping subsequences. The subsequences are such that every test from T has a subsequence from which its seed can be obtained. This implies that, as long as the subsequences remain intact, their order can be changed without losing the ability to detect any fault from \(F_0\) . The procedure reorders the subsequences to obtain a new seed sequence.
With the new seed sequence, it is possible to obtain new seeds and a new test set. The procedure repeats the process of reducing and reordering S until a termination condition is met. The details of the procedure are described next.

3.2 Fault Simulation

A sequence \(S = \langle s_0 s_1 \ldots s_{n-1} \rangle\) yields \(n-B+1\) seeds, \(S_0\) , \(S_1, \ldots , S_{n-B}\) . Each one of the seeds can be used for producing a test. The test produced by \(S_i\) is denoted by \(t_i\) . Let \(U = \lbrace t_i : 0 \le i \le n-B \rbrace\) be the set of all the tests that can be produced by S. A fault simulation procedure for S is given as Procedure 1 and described next.
Given a set of target faults \(F_0\) , fault simulation with fault dropping of the tests in U is used for finding a test set T. A test \(t_i \in U\) is included in T only if it detects new faults from \(F_0\) .
After computing T, forward-looking reverse order fault simulation is used for removing unnecessary tests from T. This procedure simulates the test set T in reverse order. The procedure avoids simulation of a test \(t_i\) if all the faults it detected in the forward order are already detected or will be detected by tests that appear before it in the test set. A test that does not detect any new faults is removed from T.
Based on T, it is possible to define the sequence \(A = \langle a_0 a_1 \ldots a_{n-1} \rangle\) such that \(a_i = 1\) if \(t_i \in T\) , and \(a_i = 0\) otherwise.
Procedure 1: Fault simulation
(1) Let \(F_0\) be the set of target faults. Let S be a given seed sequence of length n. Assign \(T = \emptyset\) .
(2) For \(0 \le i \le n-B\) :
    (a) Find the test \(t_i\) produced by the seed \(S_i\) .
    (b) Simulate \(F_0\) under \(t_i\) and remove detected faults from \(F_0\) .
    (c) If any faults were removed from \(F_0\) when \(t_i\) was simulated, add \(t_i\) to T.
(3) Apply to T forward-looking reverse order fault simulation and remove unnecessary tests.
(4) For \(0 \le i \le n-B\) , if \(t_i \in T\) , assign \(a_i = 1\) ; otherwise, assign \(a_i = 0\) .
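The steps of Procedure 1 can be sketched in Python as shown next. Several parts are assumptions made for illustration only: the callback `detects(seed)` stands in for LFSR decompression followed by fault simulation and returns the set of faults detected by the resulting test, `TOY_TABLE` is an invented detection table, and Step 3 is simplified to a plain reverse-order pass that re-simulates every selected test, without the forward-looking shortcut that avoids some simulations.

```python
def fault_simulation(s, b, faults, detects):
    """Sketch of Procedure 1; returns (indices of kept tests, vector A)."""
    n = len(s)
    remaining = set(faults)
    selected = []
    # Step 2: forward pass with fault dropping.
    for i in range(n - b + 1):
        newly_detected = detects(s[i:i + b]) & remaining
        if newly_detected:
            remaining -= newly_detected
            selected.append(i)
    # Step 3 (simplified): reverse-order pass; a test is kept only if it
    # detects a fault not yet covered by the tests kept so far.
    detectable = set(faults) - remaining
    covered = set()
    kept = []
    for i in reversed(selected):
        gain = (detects(s[i:i + b]) & detectable) - covered
        if gain:
            covered |= gain
            kept.append(i)
    kept.sort()
    # Step 4: a_i = 1 exactly for the tests that remain in T.
    kept_set = set(kept)
    a = [1 if i in kept_set else 0 for i in range(n)]
    return kept, a


# Hypothetical detection table: seed -> set of detected fault identifiers.
TOY_TABLE = {"1011": {1, 2}, "0110": {2, 3}, "1100": {3},
             "0010": {4}, "1101": {1, 4, 5}}


def toy_detects(seed):
    return TOY_TABLE.get(seed, set())


kept, a = fault_simulation("101100101101", 4, {1, 2, 3, 4, 5}, toy_detects)
```

In this toy run the forward pass selects the tests at positions 0, 1, 4, and 8, and the reverse-order pass keeps only positions 1 and 8, which together still cover all five faults.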

3.3 Reducing S

A test \(t_i \in T\) is produced by the seed \(S_i = s_i s_{i+1} \ldots s_{i+B-1}\) . Thus, the inclusion of \(t_i\) in T requires the bits \(s_i\) , \(s_{i+1}, \ldots , s_{i+B-1}\) to be retained in S.
For a bit \(s_j\) of S, let \(m_j = 1\) indicate that there is a test \(t_i \in T\) whose seed \(S_i\) includes \(s_j\) . Let \(M = \langle m_0 m_1 \ldots m_{n-1} \rangle\) . If \(m_j = 0\) , \(s_j\) can be removed from S without losing fault coverage.
For illustration, the sequence A from Figure 5 is used in Figure 7 for computing the sequence M. For every seed \(S_i\) with \(a_i = 1\) , Figure 7 shows the bits that result in \(m_j = 1\) .
Fig. 7. Reducing the seed sequence.
In Figure 7, \(m_j = 0\) is obtained for \(j = 5\) , 6, and 11. Therefore, \(s_5\) , \(s_6\) , and \(s_{11}\) can be removed to obtain the sequence S shown at the bottom of Figure 7.
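The computation of M and the removal of unused bits can be reproduced with a short Python sketch. The function names are assumptions for illustration, and the bits of S are represented by their labels rather than by values, since only positions matter here.

```python
def mark_used_bits(a, b):
    """Compute M: m_j = 1 if some seed S_i with a_i = 1 includes bit s_j."""
    n = len(a)
    m = [0] * n
    for i, a_i in enumerate(a):
        if a_i:
            for j in range(i, min(i + b, n)):
                m[j] = 1
    return m


def drop_unused_bits(s, m):
    """Keep only the bits of S whose entry in M is 1."""
    return [bit for bit, keep in zip(s, m) if keep]


# Example with n = 12, B = 4 and tests based on S_0, S_1 and S_7.
labels = [f"s{j}" for j in range(12)]
a = [1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
m = mark_used_bits(a, 4)
reduced = drop_unused_bits(labels, m)
```

With these inputs the sketch marks positions 5, 6, and 11 as unused, leaving a nine-bit sequence.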

3.4 Iterative Reduction of S

Considering the seed sequences in Figure 7, the new sequence at the bottom of Figure 7 has several seeds that do not exist in the initial sequence at the top of Figure 7. Whereas the seeds \(s_0 s_1 s_2 s_3\) , \(s_1 s_2 s_3 s_4\) , and \(s_7 s_8 s_9 s_{10}\) exist in both sequences, the new sequence at the bottom of Figure 7 has the seeds \(s_2 s_3 s_4 s_7\) , \(s_3 s_4 s_7 s_8\) , and \(s_4 s_7 s_8 s_9\) that do not exist in the initial sequence. These seeds may result in new tests that will allow additional bits to be removed from S. To benefit from this observation, the procedure repeats the fault simulation process to compute a new test set and update S, as long as bits can be removed from S. The procedure for reducing S is given as Procedure 2.
Procedure 2: Reducing S
(1) Call Procedure 1 for S to produce the test set T.
(2) Assign \(m_j = 0\) for \(0 \le j \lt n\) .
(3) For every test \(t_i \in T\) , assign \(m_j = 1\) for \(i \le j \le i+B-1\) .
(4) For \(0 \le j \lt n\) , if \(m_j = 0\) , remove \(s_j\) from S.
(5) If any bits were removed from S, go to Step 1.
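A minimal Python sketch of the loop in Procedure 2 follows. The callback `select_tests(s, b)` is a hypothetical stand-in for Procedure 1 that returns the indices i of the tests in T; the toy version used here performs only a forward pass with fault dropping over an invented detection table.

```python
def reduce_sequence(s, b, select_tests):
    """Repeat: select tests, mark the bits they use, drop unmarked bits."""
    while True:
        used = [False] * len(s)
        for i in select_tests(s, b):       # Step 1
            for j in range(i, i + b):      # Step 3
                used[j] = True
        if all(used):                      # Step 5: nothing left to remove
            return s
        s = "".join(bit for bit, u in zip(s, used) if u)  # Step 4


# Hypothetical detection table: seed -> set of detected fault identifiers.
TOY_DETECTS = {"1011": {1}, "1101": {2}, "1111": {1, 2}}


def toy_select(s, b):
    """Toy stand-in for Procedure 1 (forward pass with fault dropping only)."""
    remaining = {1, 2}
    chosen = []
    for i in range(len(s) - b + 1):
        newly_detected = TOY_DETECTS.get(s[i:i + b], set()) & remaining
        if newly_detected:
            remaining -= newly_detected
            chosen.append(i)
    return chosen


result = reduce_sequence("1011001101", 4, toy_select)
```

In this toy run the first pass removes two bits, which creates the seed 1111 that was not present in the initial sequence; the second pass uses that seed and removes two more bits, so the sequence shrinks from 10 to 8 to 6 bits.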

3.5 Reordering of the Seed Sequence

In Figure 8, the reduced seed sequence from Figure 7 is renumbered, and tests are computed for detecting the faults in \(F_0\) . The seeds for these tests are shown in Figure 8.
Fig. 8. Partitioning of the seed sequence.
Let \(S[i,j] = \langle s_i s_{i+1} \ldots s_j \rangle\) denote the subsequence of S that consists of bits \(s_i\) , \(s_{i+1}, \ldots , s_j\) . To be able to obtain \(S_0\) from S, the subsequence \(S[0,3] = \langle s_0 s_1 s_2 s_3 \rangle\) must remain intact. In a similar way, \(S_1\) requires the subsequence \(S[1,4] = \langle s_1 s_2 s_3 s_4 \rangle\) to remain intact, and \(S_5\) requires the subsequence \(S[5,8] = \langle s_5 s_6 s_7 s_8 \rangle\) to remain intact.
The subsequences \(S[0,3]\) and \(S[1,4]\) overlap, implying that the subsequence \(S[0,4] = S[0,3] \cup S[1,4]\) must remain intact. This partitions S into two non-overlapping subsequences, \(S[0,4]\) and \(S[5,8]\) .
In general, a partition of S is obtained as follows. For every test \(t_i \in T\) , the seed \(S_i\) that produces \(t_i\) is used for defining a subsequence \(S[i,i+B-1]\) . In an iterative process, every two overlapping subsequences, \(S[i_0,j_0]\) and \(S[i_1,j_1]\) , are merged into a single subsequence \(S[i_2,j_2] = S[i_0,j_0] \cup S[i_1,j_1]\) such that \(i_2 = \min \lbrace i_0 , i_1 \rbrace\) and \(j_2 = \max \lbrace j_0 , j_1 \rbrace\) .
After obtaining non-overlapping subsequences, S can be reordered by changing the order of the subsequences. The reordering procedure is given as Procedure 3.
Procedure 3: Reordering S
(1) Call Procedure 1 for S to produce the test set T.
(2) Assign \(R = \emptyset\) . For every test \(t_i \in T\) , add to R the range \([i,i+B-1]\) .
(3) For every two ranges \([i_0,j_0] \in R\) and \([i_1,j_1] \in R\) with an index x such that \(i_0 \le x \le j_0\) and \(i_1 \le x \le j_1\) , merge \([i_0,j_0]\) and \([i_1,j_1]\) into a range \([i_2,j_2]\) such that \(i_2 = \min \lbrace i_0 , i_1 \rbrace\) and \(j_2 = \max \lbrace j_0 , j_1 \rbrace\) .
(4) Reorder the ranges in R. Reorder S based on R.
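Steps 2 to 4 of Procedure 3 can be sketched in Python as follows. The function names are assumptions for illustration, and the merge is done in a single pass over the ranges sorted by start index, which yields the same result as merging overlapping pairs iteratively. The sketch assumes S has already been reduced, so that every bit belongs to some range.

```python
def merge_ranges(test_indices, b):
    """Build the ranges [i, i + b - 1] and merge those that share an index."""
    ranges = sorted((i, i + b - 1) for i in test_indices)
    merged = []
    for lo, hi in ranges:
        if merged and lo <= merged[-1][1]:  # overlaps the previous range
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def reorder(s, ranges, order):
    """Rebuild S by concatenating the intact subsequences in a new order."""
    return "".join(s[lo:hi + 1] for lo, hi in (ranges[k] for k in order))


# Figure 8 example: tests based on S_0, S_1 and S_5 with B = 4.
ranges = merge_ranges([0, 1, 5], 4)
# Position labels instead of bit values, to make the reordering visible.
swapped = reorder("012345678", ranges, [1, 0])
```

For the Figure 8 example the sketch produces the two subsequences \(S[0,4]\) and \(S[5,8]\) ; swapping them keeps each subsequence intact while creating new seeds across the boundary.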
Considering the reordering strategy, experimental results with various reordering heuristics indicate that a random reordering produces the best results overall for the first iterations. This can be explained as follows. Pairs of consecutive subsequences create options for seeds that are not available from each subsequence alone. Therefore, reordering of the subsequences in S creates new options for seeds that were not available before reordering. Without performing fault simulation, it is not possible to predict which pairs of subsequences should be placed consecutively to create new seeds that detect target faults. A random reordering explores the search space without requiring fault simulation.
For many benchmark circuits, random reordering reduces the number of subsequences to a small number. As the number of subsequences is reduced, every bit of the seed sequence is utilized for more seeds. As a result, the number of bits in the seed sequence approaches its minimum value.
When random reordering leaves a large number of subsequences, a heuristic that was found to be useful experimentally is to reorder the subsequences from low to high number of bits. When subsequences with small numbers of bits appear at the beginning of the seed sequence, they tend to be better utilized. This occurs since fault simulation of the seeds is performed in the order \(S_0\) , \(S_1\) , \(\ldots\) with fault dropping. Consequently, the first seeds tend to detect the largest numbers of faults, and as additional seeds are simulated, fewer faults remain to be detected. Considering subsequences instead of seeds, the same effect occurs. Therefore, the first subsequences tend to be utilized better for detecting more faults, and later subsequences, with more bits and fewer detected faults, can be reduced.
To benefit from both strategies, the procedure initially uses random reordering. After an iteration with random reordering where S is not reduced, reordering from low to high number of bits is used for the next iteration. The strategies alternate after every iteration where S is not reduced.
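The two reordering strategies and the rule for alternating between them might be sketched as follows. The strategy names `"random"` and `"short_first"`, the function names, and the example subsequences are assumptions for illustration.

```python
import random


def reorder_subsequences(subsequences, strategy, rng):
    """Return the subsequences in a new order according to the strategy."""
    if strategy == "random":
        shuffled = list(subsequences)
        rng.shuffle(shuffled)
        return shuffled
    if strategy == "short_first":
        # Low to high number of bits; sorted() keeps ties in their original order.
        return sorted(subsequences, key=len)
    raise ValueError(f"unknown strategy: {strategy}")


def next_strategy(strategy, reduced):
    """Keep the strategy after a reducing iteration; otherwise switch."""
    if reduced:
        return strategy
    return "short_first" if strategy == "random" else "random"


subsequences = ["10110", "01", "1110"]  # hypothetical subsequences of S
by_length = reorder_subsequences(subsequences, "short_first", random.Random(0))
shuffled = reorder_subsequences(subsequences, "random", random.Random(0))
```

Passing an explicit `random.Random` instance makes a run repeatable, which is convenient when comparing results across circuits.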

3.6 Termination Condition

In the procedure from Figure 6, after reordering the seed sequence, the new seed sequence is reduced and reordered again. The procedure terminates after a constant number of consecutive iterations where the number of bits in S is not reduced.
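The termination condition can be sketched as a loop that counts consecutive iterations without a reduction. The callback `step` is a hypothetical stand-in for one reduce-and-reorder iteration of the procedure from Figure 6, and `max_stalled` is the constant number of non-reducing iterations.

```python
def run_until_stalled(s, step, max_stalled):
    """Iterate until max_stalled consecutive iterations fail to shorten S."""
    stalled = 0
    iterations = 0
    while stalled < max_stalled:
        new_s = step(s)
        iterations += 1
        if len(new_s) < len(s):
            stalled = 0      # S was reduced, so the count restarts
        else:
            stalled += 1
        s = new_s
    return s, iterations


def toy_step(s):
    """Toy iteration: drop one bit until only three bits remain."""
    return s[:-1] if len(s) > 3 else s


final_s, total_iterations = run_until_stalled("101101", toy_step, 2)
```

In the toy run three iterations reduce the sequence and two further iterations do not, so the loop stops after five iterations.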

4 Experimental Results

The procedure from Figure 6 was applied to benchmark circuits as described in this section.

4.1 Setup

The compressed test set \(\Psi\) is a compact set of seeds targeting single stuck-at faults. The set of target faults \(F_0\) also consists of single stuck-at faults. A second set of faults, \(F_1\) , consists of single-cycle gate-exhaustive faults. Single-cycle gate-exhaustive faults are a superset of single-cycle cell-aware faults when the same gates or cells are used. Single-cycle gate-exhaustive faults are more difficult to detect accidentally than single stuck-at faults since they have more activation conditions that need to be satisfied. Fault simulation of \(F_1\) is carried out under \(\Psi\) . In addition, for the initial seed sequence S, and at the end of every iteration of the procedure from Figure 6, fault simulation of \(F_1\) is carried out using all the seeds that can be produced from S.
The procedure terminates after 128 iterations where the seed sequence S is not reduced. This large number of iterations is used for demonstrating the extent to which S can be reduced. The results demonstrate that the procedure can terminate earlier as discussed at the end of this section.

4.2 Results and Comparison

The results are shown in Tables 1-3. There are several rows for every circuit. The first row, with a dash under column I, uses the set \(\Psi\) containing N seeds, each one stored separately. This row provides a baseline for comparison with an approach that stores compressed tests. The storage requirements of \(\Psi\) are reduced by test data compression, but without the hardware overhead of using each stored compressed test to apply several different tests. This is an appropriate comparison with the procedure from Figure 6 that also has no additional hardware requirements beyond storing a set of compressed tests.
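Table 1. Experimental Results ( \(|S| \lt 2,500\) )

| circuit | inp | B | I | reor | part | bits | frac | stuck-at tests | stuck-at f.c. | gate-exhaustive tests | gate-exhaustive f.c. | ntime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| sasc | 132 | 13 | - | 0 | 0 | 533 | 1.000 | 41 | 100.000 | 41 | 67.374 | 1.00 |
| sasc | 132 | 13 | 0 | 0 | 0 | 533 | 1.000 | 41 | 100.000 | 267 | 97.675 | 2.00 |
| sasc | 132 | 13 | 1 | 0 | 0 | 181 | 0.340 | 51 | 100.000 | 152 | 87.602 | 12.50 |
| sasc | 132 | 13 | 3 | 0 | 3 | 156 | 0.293 | 45 | 100.000 | 132 | 84.502 | 29.50 |
| usb_phy | 112 | 18 | - | 0 | 0 | 648 | 1.000 | 36 | 100.000 | 36 | 82.536 | 1.00 |
| usb_phy | 112 | 18 | 0 | 0 | 0 | 648 | 1.000 | 36 | 100.000 | 177 | 99.159 | 2.60 |
| usb_phy | 112 | 18 | 1 | 0 | 0 | 363 | 0.560 | 48 | 100.000 | 166 | 98.383 | 21.00 |
| usb_phy | 112 | 18 | 2 | 0 | 11 | 301 | 0.465 | 52 | 100.000 | 157 | 98.254 | 37.00 |
| usb_phy | 112 | 18 | 6 | 1 | 3 | 248 | 0.383 | 51 | 100.000 | 153 | 96.636 | 87.00 |
| s35932 | 1,763 | 13 | - | 0 | 0 | 741 | 1.000 | 57 | 89.809 | 57 | 98.509 | 1.00 |
| s35932 | 1,763 | 13 | 0 | 0 | 0 | 741 | 1.000 | 57 | 89.809 | 110 | 100.000 | 1.32 |
| s35932 | 1,763 | 13 | 1 | 0 | 0 | 175 | 0.236 | 62 | 89.809 | 102 | 99.971 | 14.05 |
| b04 | 78 | 28 | - | 0 | 0 | 952 | 1.000 | 34 | 99.851 | 34 | 79.263 | 1.00 |
| b04 | 78 | 28 | 0 | 0 | 0 | 952 | 1.000 | 34 | 99.851 | 139 | 89.516 | 4.00 |
| b04 | 78 | 28 | 1 | 0 | 0 | 701 | 0.736 | 49 | 99.851 | 129 | 89.055 | 12.50 |
| b04 | 78 | 28 | 2 | 0 | 15 | 585 | 0.614 | 47 | 99.851 | 116 | 88.537 | 23.00 |
| b04 | 78 | 28 | 3 | 0 | 10 | 557 | 0.585 | 46 | 99.851 | 118 | 88.537 | 35.50 |
| b04 | 78 | 28 | 21 | 0 | 4 | 461 | 0.484 | 46 | 99.851 | 105 | 88.249 | 123.00 |
| s1423 | 91 | 18 | - | 0 | 0 | 990 | 1.000 | 55 | 99.076 | 55 | 90.381 | 1.00 |
| s1423 | 91 | 18 | 0 | 0 | 0 | 990 | 1.000 | 55 | 99.076 | 189 | 97.641 | 2.70 |
| s1423 | 91 | 18 | 1 | 0 | 0 | 494 | 0.499 | 60 | 99.076 | 169 | 96.370 | 13.00 |
| s1423 | 91 | 18 | 9 | 1 | 9 | 343 | 0.346 | 60 | 99.076 | 150 | 95.402 | 63.75 |
| systemcdes | 320 | 14 | - | 0 | 0 | 1,428 | 1.000 | 102 | 100.000 | 102 | 93.391 | 1.00 |
| systemcdes | 320 | 14 | 0 | 0 | 0 | 1,428 | 1.000 | 102 | 100.000 | 298 | 99.801 | 1.83 |
| systemcdes | 320 | 14 | 1 | 0 | 0 | 343 | 0.240 | 108 | 100.000 | 244 | 98.262 | 8.20 |
| systemcdes | 320 | 14 | 6 | 1 | 4 | 266 | 0.186 | 109 | 100.000 | 197 | 96.790 | 29.88 |
| b07 | 53 | 36 | - | 0 | 0 | 1,476 | 1.000 | 41 | 99.915 | 41 | 59.459 | 1.00 |
| b07 | 53 | 36 | 0 | 0 | 0 | 1,476 | 1.000 | 41 | 99.915 | 210 | 79.158 | 5.25 |
| b07 | 53 | 36 | 1 | 0 | 0 | 965 | 0.654 | 50 | 99.915 | 176 | 76.559 | 30.00 |
| b07 | 53 | 36 | 3 | 0 | 15 | 817 | 0.554 | 55 | 99.915 | 171 | 75.624 | 68.00 |
| b07 | 53 | 36 | 6 | 0 | 8 | 721 | 0.488 | 55 | 99.915 | 164 | 75.156 | 135.00 |
| simple_spi | 146 | 38 | - | 0 | 0 | 1,748 | 1.000 | 46 | 100.000 | 46 | 55.066 | 1.00 |
| simple_spi | 146 | 38 | 0 | 0 | 0 | 1,748 | 1.000 | 46 | 100.000 | 663 | 97.138 | 4.13 |
| simple_spi | 146 | 38 | 1 | 0 | 0 | 1,077 | 0.616 | 63 | 100.000 | 591 | 94.883 | 12.83 |
| simple_spi | 146 | 38 | 2 | 0 | 16 | 991 | 0.567 | 64 | 100.000 | 593 | 94.959 | 20.17 |
| simple_spi | 146 | 38 | 7 | 1 | 9 | 870 | 0.498 | 66 | 100.000 | 559 | 93.769 | 54.67 |
| des_area | 367 | 14 | - | 0 | 0 | 2,212 | 1.000 | 158 | 100.000 | 158 | 72.424 | 1.00 |
| des_area | 367 | 14 | 0 | 0 | 0 | 2,212 | 1.000 | 158 | 100.000 | 1,103 | 99.920 | 1.77 |
| des_area | 367 | 14 | 1 | 0 | 0 | 443 | 0.200 | 155 | 100.000 | 377 | 86.803 | 9.84 |
| des_area | 367 | 14 | 2 | 0 | 3 | 428 | 0.193 | 155 | 100.000 | 364 | 86.019 | 15.35 |
| i2c | 145 | 43 | - | 0 | 0 | 2,494 | 1.000 | 58 | 100.000 | 58 | 73.974 | 1.00 |
| i2c | 145 | 43 | 0 | 0 | 0 | 2,494 | 1.000 | 58 | 100.000 | 385 | 92.564 | 5.20 |
| i2c | 145 | 43 | 1 | 0 | 0 | 1,980 | 0.794 | 66 | 100.000 | 359 | 91.352 | 22.14 |
| i2c | 145 | 43 | 17 | 0 | 17 | 1,733 | 0.695 | 75 | 100.000 | 356 | 90.884 | 251.57 |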
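Table 2. Experimental Results ( \(2,500 \le |S| \lt 20,000\) )

| circuit | inp | B | I | reor | part | bits | frac | stuck-at tests | stuck-at f.c. | gate-exhaustive tests | gate-exhaustive f.c. | ntime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| systemcaes | 928 | 29 | - | 0 | 0 | 3,161 | 1.000 | 109 | 99.995 | 109 | 79.613 | 1.00 |
| systemcaes | 928 | 29 | 0 | 0 | 0 | 3,161 | 1.000 | 109 | 99.995 | 1,375 | 98.701 | 2.94 |
| systemcaes | 928 | 29 | 1 | 0 | 0 | 1,292 | 0.409 | 189 | 99.995 | 837 | 93.746 | 19.00 |
| systemcaes | 928 | 29 | 2 | 0 | 13 | 1,232 | 0.390 | 192 | 99.995 | 816 | 93.100 | 36.80 |
| systemcaes | 928 | 29 | 8 | 1 | 3 | 930 | 0.294 | 193 | 99.995 | 656 | 91.381 | 101.80 |
| wb_dma | 738 | 47 | - | 0 | 0 | 3,948 | 1.000 | 84 | 100.000 | 84 | 73.840 | 1.00 |
| wb_dma | 738 | 47 | 0 | 0 | 0 | 3,948 | 1.000 | 84 | 100.000 | 985 | 92.756 | 3.02 |
| wb_dma | 738 | 47 | 1 | 0 | 0 | 3,227 | 0.817 | 123 | 100.000 | 945 | 91.224 | 18.12 |
| wb_dma | 738 | 47 | 2 | 0 | 44 | 2,704 | 0.685 | 145 | 100.000 | 875 | 90.449 | 41.34 |
| wb_dma | 738 | 47 | 21 | 1 | 11 | 2,356 | 0.597 | 160 | 100.000 | 866 | 89.314 | 299.03 |
| s5378 | 214 | 36 | - | 0 | 0 | 4,788 | 1.000 | 133 | 99.131 | 133 | 67.605 | 1.00 |
| s5378 | 214 | 36 | 0 | 0 | 0 | 4,788 | 1.000 | 133 | 99.131 | 1,226 | 95.114 | 3.64 |
| s5378 | 214 | 36 | 1 | 0 | 0 | 3,041 | 0.635 | 201 | 99.131 | 1,084 | 92.703 | 42.81 |
| s5378 | 214 | 36 | 2 | 0 | 34 | 2,756 | 0.576 | 215 | 99.131 | 1,059 | 92.034 | 59.06 |
| s5378 | 214 | 36 | 6 | 0 | 14 | 2,372 | 0.495 | 226 | 99.131 | 970 | 90.571 | 137.51 |
| s9234 | 247 | 75 | - | 0 | 0 | 10,050 | 1.000 | 134 | 93.475 | 134 | 70.825 | 1.00 |
| s9234 | 247 | 75 | 0 | 0 | 0 | 10,050 | 1.000 | 134 | 93.475 | 1,128 | 85.887 | 7.93 |
| s9234 | 247 | 75 | 1 | 0 | 0 | 8,769 | 0.873 | 173 | 93.475 | 1,102 | 85.187 | 103.82 |
| s9234 | 247 | 75 | 4 | 0 | 60 | 7,918 | 0.788 | 200 | 93.475 | 1,068 | 84.772 | 295.15 |
| s9234 | 247 | 75 | 44 | 1 | 40 | 7,023 | 0.699 | 215 | 93.475 | 1,058 | 84.783 | 1,951.33 |
| s9234 | 247 | 75 | 311 | 1 | 21 | 6,007 | 0.598 | 239 | 93.475 | 1,069 | 84.309 | 9,027.45 |
| aes_core | 788 | 28 | - | 0 | 0 | 10,640 | 1.000 | 380 | 100.000 | 380 | 98.259 | 1.00 |
| aes_core | 788 | 28 | 0 | 0 | 0 | 10,640 | 1.000 | 380 | 100.000 | 1,026 | 100.000 | 1.60 |
| aes_core | 788 | 28 | 1 | 0 | 0 | 1,927 | 0.181 | 587 | 100.000 | 1,015 | 99.958 | 16.28 |
| s15850 | 611 | 57 | - | 0 | 0 | 11,229 | 1.000 | 197 | 96.682 | 197 | 73.285 | 1.00 |
| s15850 | 611 | 57 | 0 | 0 | 0 | 11,229 | 1.000 | 197 | 96.682 | 1,504 | 82.185 | 6.80 |
| s15850 | 611 | 57 | 1 | 0 | 0 | 9,635 | 0.858 | 246 | 96.682 | 1,423 | 81.543 | 93.85 |
| s15850 | 611 | 57 | 3 | 0 | 104 | 8,787 | 0.783 | 269 | 96.682 | 1,347 | 81.096 | 180.76 |
| s15850 | 611 | 57 | 29 | 0 | 61 | 7,813 | 0.696 | 289 | 96.682 | 1,325 | 80.909 | 1,077.27 |
| s15850 | 611 | 57 | 155 | 0 | 33 | 6,717 | 0.598 | 309 | 96.682 | 1,252 | 80.440 | 4,001.87 |
| s13207 | 700 | 47 | - | 0 | 0 | 11,797 | 1.000 | 251 | 98.462 | 251 | 62.259 | 1.00 |
| s13207 | 700 | 47 | 0 | 0 | 0 | 11,797 | 1.000 | 251 | 98.462 | 2,105 | 87.602 | 6.61 |
| s13207 | 700 | 47 | 1 | 0 | 0 | 7,361 | 0.624 | 328 | 98.462 | 1,887 | 82.325 | 76.19 |
| s13207 | 700 | 47 | 2 | 0 | 80 | 6,930 | 0.587 | 336 | 98.462 | 1,864 | 82.443 | 134.07 |
| s13207 | 700 | 47 | 6 | 0 | 32 | 5,881 | 0.499 | 355 | 98.462 | 1,779 | 80.457 | 305.02 |
| s13207 | 700 | 47 | 55 | 1 | 6 | 4,686 | 0.397 | 377 | 98.462 | 1,662 | 78.411 | 1,584.23 |
| spi | 274 | 44 | - | 0 | 0 | 15,840 | 1.000 | 360 | 99.985 | 360 | 67.073 | 1.00 |
| spi | 274 | 44 | 0 | 0 | 0 | 15,840 | 1.000 | 360 | 99.985 | 3,065 | 95.565 | 3.49 |
| spi | 274 | 44 | 1 | 0 | 0 | 3,813 | 0.241 | 459 | 99.985 | 2,205 | 87.826 | 17.67 |
| spi | 274 | 44 | 2 | 0 | 34 | 2,849 | 0.180 | 472 | 99.985 | 1,915 | 85.186 | 33.14 |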
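Table 3. Experimental Results ( \(|S| \ge 20,000\) )

| circuit | inp | B | I | reor | part | bits | frac | stuck-at tests | stuck-at f.c. | gate-exhaustive tests | gate-exhaustive f.c. | ntime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| s38584 | 1,464 | 98 | - | 0 | 0 | 21,364 | 1.000 | 218 | 95.852 | 218 | 88.313 | 1.00 |
| s38584 | 1,464 | 98 | 0 | 0 | 0 | 21,364 | 1.000 | 218 | 95.852 | 2,629 | 98.998 | 4.69 |
| s38584 | 1,464 | 98 | 1 | 0 | 0 | 14,481 | 0.678 | 449 | 95.852 | 2,487 | 98.589 | 81.58 |
| s38584 | 1,464 | 98 | 3 | 0 | 49 | 12,007 | 0.562 | 529 | 95.852 | 2,475 | 98.403 | 217.85 |
| s38584 | 1,464 | 98 | 7 | 0 | 15 | 10,524 | 0.493 | 532 | 95.852 | 2,358 | 98.222 | 463.96 |
| b20 | 527 | 119 | - | 0 | 0 | 28,322 | 1.000 | 238 | 93.304 | 238 | 64.250 | 1.00 |
| b20 | 527 | 119 | 0 | 0 | 0 | 28,322 | 1.000 | 238 | 93.304 | 1,957 | 85.734 | 9.22 |
| b20 | 527 | 119 | 1 | 0 | 0 | 24,922 | 0.880 | 288 | 93.304 | 1,920 | 85.506 | 144.43 |
| b20 | 527 | 119 | 10 | 0 | 138 | 22,582 | 0.797 | 295 | 93.304 | 1,911 | 85.173 | 885.98 |
| b20 | 527 | 119 | 351 | 1 | 85 | 19,759 | 0.698 | 316 | 93.308 | 1,820 | 84.717 | 14,763.71 |
| b15 | 483 | 113 | - | 0 | 0 | 30,058 | 1.000 | 266 | 98.580 | 266 | 59.444 | 1.00 |
| b15 | 483 | 113 | 0 | 0 | 0 | 30,058 | 1.000 | 266 | 98.580 | 2,517 | 71.206 | 17.13 |
| b15 | 483 | 113 | 1 | 0 | 0 | 22,741 | 0.757 | 349 | 98.620 | 2,423 | 70.384 | 131.70 |
| b15 | 483 | 113 | 4 | 0 | 126 | 20,874 | 0.694 | 362 | 98.620 | 2,418 | 70.318 | 399.29 |
| s38417 | 1,664 | 111 | - | 0 | 0 | 31,413 | 1.000 | 283 | 99.471 | 283 | 70.723 | 1.00 |
| s38417 | 1,664 | 111 | 0 | 0 | 0 | 31,413 | 1.000 | 283 | 99.471 | 4,600 | 85.155 | 13.86 |
| s38417 | 1,664 | 111 | 1 | 0 | 0 | 28,873 | 0.919 | 341 | 99.471 | 4,540 | 84.963 | 100.46 |
| s38417 | 1,664 | 111 | 2 | 0 | 212 | 27,738 | 0.883 | 365 | 99.471 | 4,495 | 84.990 | 196.70 |
| s38417 | 1,664 | 111 | 7 | 0 | 136 | 24,931 | 0.794 | 423 | 99.471 | 4,353 | 84.540 | 556.14 |
| s38417 | 1,664 | 111 | 26 | 0 | 75 | 21,898 | 0.697 | 477 | 99.471 | 4,222 | 84.180 | 1,710.20 |
| s38417 | 1,664 | 111 | 114 | 0 | 30 | 18,764 | 0.597 | 519 | 99.471 | 4,039 | 83.935 | 4,947.02 |
| b14 | 280 | 128 | - | 0 | 0 | 37,120 | 1.000 | 290 | 94.960 | 290 | 72.068 | 1.00 |
| b14 | 280 | 128 | 0 | 0 | 0 | 37,120 | 1.000 | 290 | 94.960 | 982 | 83.476 | 14.13 |
| b14 | 280 | 128 | 1 | 0 | 0 | 28,388 | 0.765 | 328 | 94.960 | 955 | 83.296 | 168.40 |
| b14 | 280 | 128 | 6 | 0 | 153 | 25,962 | 0.699 | 334 | 94.960 | 945 | 83.283 | 677.34 |
| b14 | 280 | 128 | 341 | 1 | 97 | 22,253 | 0.599 | 337 | 94.970 | 926 | 83.251 | 16,067.46 |
| tv80 | 372 | 109 | - | 0 | 0 | 40,330 | 1.000 | 370 | 99.527 | 370 | 77.264 | 1.00 |
| tv80 | 372 | 109 | 0 | 0 | 0 | 40,330 | 1.000 | 370 | 99.527 | 2,986 | 91.365 | 9.66 |
| tv80 | 372 | 109 | 1 | 0 | 0 | 28,982 | 0.719 | 491 | 99.527 | 2,823 | 90.310 | 100.67 |
| tv80 | 372 | 109 | 2 | 0 | 156 | 26,541 | 0.658 | 513 | 99.527 | 2,754 | 90.091 | 188.91 |
| tv80 | 372 | 109 | 5 | 0 | 99 | 24,086 | 0.597 | 530 | 99.527 | 2,730 | 89.447 | 378.33 |
| tv80 | 372 | 109 | 20 | 0 | 51 | 20,006 | 0.496 | 560 | 99.527 | 2,653 | 88.843 | 1,234.77 |
| tv80 | 372 | 109 | 200 | 0 | 10 | 16,094 | 0.399 | 578 | 99.527 | 2,489 | 87.963 | 6,867.42 |
| b17 | 1,444 | 94 | - | 0 | 0 | 44,086 | 1.000 | 469 | 93.863 | 469 | 44.293 | 1.00 |
| b17 | 1,444 | 94 | 0 | 0 | 0 | 44,086 | 1.000 | 469 | 93.863 | 5,936 | 53.200 | 22.50 |
| b17 | 1,444 | 94 | 1 | 0 | 0 | 30,800 | 0.699 | 744 | 94.902 | 5,604 | 52.833 | 171.90 |
| b17 | 1,444 | 94 | 4 | 0 | 99 | 25,867 | 0.587 | 818 | 95.513 | 5,432 | 53.538 | 454.67 |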
The case where the initial seed sequence S, with \(N \cdot B\) bits, is used without any reduction is shown in the second row. Column I has a zero for this case.
Additional rows of Tables 1-3 show the results of the procedure from Figure 6 as it reduces S. The row with \(I = 1\) corresponds to iteration 1. The next rows correspond to the iterations where the number of bits in S is decreased below 0.9, 0.8, 0.7, \(\ldots\) of the number of bits required for \(\Psi\) .
For every iteration, after the name of the circuit, column inp shows the number of inputs (including primary inputs and present-state variables or flip-flops). Column B shows the number of LFSR bits. Column I shows the iteration of the procedure from Figure 6. For \(I \ge 2\) , column reor has a zero for random reordering and a one for reordering from low to high number of bits. Column part shows the number of subsequences into which the seed sequence S is partitioned. Column bits shows the number of bits in \(\Psi\) or S. Column frac shows the number of bits as a fraction of the number of bits in \(\Psi\) . Column stuck-at shows the number of tests (and seeds) required for detecting single stuck-at faults, followed by the stuck-at fault coverage. Column gate-exhaustive shows the number of tests (and seeds) required for detecting single-cycle gate-exhaustive faults, followed by the single-cycle gate-exhaustive fault coverage. Column ntime shows the normalized runtime, defined as follows. Let the runtime for fault simulation of \(\Psi\) be \(\rho _0\) . Let the total runtime of the procedure from Figure 6 up to iteration I be \(\rho _I\) . The normalized runtime is \(\rho _I / \rho _0\) .

4.3 Discussion

The following points can be seen from Tables 1-3. Iteration 1 reduces the number of bits in the initial seed sequence significantly. This is achieved by considering approximately \(N \cdot B\) seeds based on the initial seed sequence S instead of the N seeds included in \(\Psi\) , and without changing the order of the seed sequence. In the case of tv80, iteration 1 reduces the number of bits to 0.719 of the number of bits required for \(\Psi\) , which is already a compressed test set. Thus, the additional storage reductions are achieved on top of those achieved by test data compression that is used for \(\Psi\) .
For iteration \(I \ge 2\), the seed sequence is partitioned into a number of subsequences that depends on the circuit. The number of subsequences varies from a few to over 100. Even with a small number of subsequences, additional iterations typically achieve an additional reduction in the number of bits by reordering the seed sequence before attempting to reduce it again. In the case of tv80, the last iteration reduces the storage requirements to 0.399 relative to \(\Psi\).
The number of subsequences typically decreases as additional iterations are performed and the number of bits in S is reduced. The reduction in the number of subsequences implies that more overlapping seeds are obtained, and every bit of the seed sequence is utilized for more seeds.
The fault coverage of \(F_1\) is increased significantly when S is considered instead of \(\Psi\) in iteration 0, before S is reduced. As S is reduced, the fault coverage of \(F_1\) may decrease, but it remains significantly higher than that of \(\Psi\). The fault coverage of \(F_0\) is not allowed to decrease. The same can be required for \(F_1\). This option was not implemented so as not to limit the reduction of S.
The number of tests available from S is equal to \(n-B+1\), where n is the number of bits in S. For LBIST, these numbers of tests are acceptable. Tables 1–3 also show the numbers of tests needed for detecting stuck-at and single-cycle gate-exhaustive faults. These numbers are lower than \(n-B+1\), with more tests needed for single-cycle gate-exhaustive faults than for stuck-at faults. The advantage of applying all \(n-B+1\) tests is the potential for defect detection.
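The count \(n-B+1\) follows from taking every window of B consecutive bits of S as a seed. The following is a minimal sketch of that window extraction, assuming S is represented as a string of '0' and '1' characters. The function name `seeds_from_sequence` and the three 4-bit seeds are hypothetical and serve only as an illustration; they are not taken from the experiments.

```python
def seeds_from_sequence(s: str, b: int) -> list[str]:
    """Return every window of b consecutive bits of s, in order of position."""
    if b <= 0 or b > len(s):
        raise ValueError("need 0 < b <= len(s)")
    # A sequence of n bits has n - b + 1 windows of b consecutive bits.
    return [s[i:i + b] for i in range(len(s) - b + 1)]


# Hypothetical example: N = 3 seeds for a B = 4-bit LFSR.
psi = ["1011", "0010", "1101"]
S = "".join(psi)               # initial seed sequence with N * B = 12 bits
seeds = seeds_from_sequence(S, 4)

print(len(S), len(seeds))      # number of stored bits, number of seeds
print(seeds[0], seeds[4], seeds[8])  # the original seeds sit at offsets 0, B, 2B
```

In this sketch, the original seeds of \(\Psi\) remain available at offsets that are multiples of B, and the windows at the other offsets are the extra seeds obtained from the same stored bits.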
It is interesting to note that the number of tests required for detecting single stuck-at faults increases as the seed sequence is reduced. This is a result of the fact that the seeds have more overlaps as the seed sequence is reduced.
Random reordering is used in more iterations than reordering from low to high number of bits. This explains why column reor has more zeros than ones. Considering all the iterations (and not only the ones reported), reordering from low to high number of bits is used in a significant number of iterations and has a significant effect on the results.

4.4 Computational Effort

The normalized runtime measures the computational effort of the procedure from Figure 6 in terms of fault simulation time. This is appropriate since the procedure performs fault simulation to support the reduction of S.
The runtime for iteration 0 is higher than the runtime for simulating \(\Psi\) because of the number of simulated tests. The runtime per iteration decreases as additional iterations are performed and the number of bits in S is reduced.
The normalized runtime for iteration 0 does not increase with the size of the circuit, and larger circuits sometimes have a lower normalized runtime. This indicates that the procedure scales similarly to a fault simulation procedure, which is manageable for circuits of any size.
The highest normalized runtimes in Tables 1–3 are obtained when the procedure performs a large number of iterations. It is possible to limit the number of iterations to limit the runtime. Figure 9 illustrates this point by considering the reduction in the storage requirements of S as a function of the normalized runtime for b14. This circuit was selected since it has the highest normalized runtime at termination. Iteration 19 is marked in Figure 9.
Fig. 9. Results for b14.
Based on the data in Figure 9, and considering the results for other circuits in Tables 1–3, it is possible to terminate the procedure after 20 iterations without a significant cost in terms of the storage requirements. In general, it is possible to set a target for the minimum acceptable reduction in storage requirements per iteration. When the reduction drops below the minimum, the procedure can terminate. A similar approach can be applied to the fault coverage of \(F_1\). When this fault coverage drops below a preselected bound relative to the fault coverage obtained in iteration 0, the procedure can terminate. In this case, it is possible to select a solution where the fault coverage of \(F_1\) has a local maximum. For example, for s13207, instead of a solution with a storage reduction to 0.499 and a fault coverage of 80.457% for \(F_1\), it is possible to use a solution with a storage reduction to 0.493 and a fault coverage of 80.882% for \(F_1\).
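The two termination options discussed above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the per-iteration bit counts in `bits_history`, and the threshold values are hypothetical, and the sketch is not part of the procedure from Figure 6. The two (storage fraction, \(F_1\) coverage) pairs for s13207 are the ones quoted in the text.

```python
def first_iteration_below(bits: list[int], min_reduction: float) -> int:
    """Index of the first iteration whose relative reduction in bits is below
    min_reduction, or the last index if every iteration meets the target."""
    for i in range(1, len(bits)):
        if (bits[i - 1] - bits[i]) / bits[i - 1] < min_reduction:
            return i
    return len(bits) - 1


def best_f1_solution(solutions, max_frac):
    """Among (frac, f1_coverage) pairs with frac <= max_frac, return the pair
    with the highest F1 coverage, or None if no pair qualifies."""
    eligible = [s for s in solutions if s[0] <= max_frac]
    return max(eligible, key=lambda s: s[1]) if eligible else None


# Hypothetical number of bits in S after each iteration.
bits_history = [1000, 720, 650, 600, 595, 590]
stop_at = first_iteration_below(bits_history, min_reduction=0.02)

# The two s13207 solutions quoted in the text: (storage fraction, F1 coverage in %).
s13207 = [(0.499, 80.457), (0.493, 80.882)]
chosen = best_f1_solution(s13207, max_frac=0.5)
print(stop_at, chosen)
```

The bound `max_frac` plays the role of an acceptable storage fraction; among the solutions that satisfy it, the one with the highest \(F_1\) coverage is selected.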

4.5 Further Comparison

In Tables 1–3, the first row corresponds to an approach that stores a set \(\Psi\) of compressed tests on-chip and uses each compressed test from \(\Psi\) to apply a single test to the circuit. The procedure from Figure 6 is compared with this case to demonstrate the reduced storage requirements relative to a compressed test set and the improved fault coverage achieved for \(F_1\).
In Table 4, the procedure from Figure 6 is compared with the LBIST approach from [28]. In [28], the compressed tests from \(\Psi\) are partitioned into subvectors, and subvectors are stored on-chip. Compressed tests are formed on-chip using pseudo-random combinations of subvectors. In this approach, each subvector may contribute to the formation of several different compressed tests. A software procedure described in [28] takes advantage of the ability to use each subvector multiple times to reduce the storage requirements and improve the fault coverage of gate-exhaustive faults. The main hardware cost of the approach from [28] is the need for multiplexers, controlled by an LFSR, to select combinations of subvectors that will form compressed tests. For parameters l, p, and \(|V|\) , the approach from [28] requires p l-bit multiplexers with \(|V|\) data inputs and \(\lceil log_2(|V|) \rceil\) select inputs.
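The multiplexer requirement stated above can be evaluated for given parameters with the following minimal sketch. The function name `selection_hardware` and the parameter values in the example are illustrative assumptions; only the expressions (p multiplexers, each l bits wide, with \(|V|\) data inputs and \(\lceil log_2(|V|) \rceil\) select inputs) come from the text.

```python
def selection_hardware(l: int, p: int, v: int) -> dict[str, int]:
    """Multiplexer requirements for subvector length l, p subvectors per
    compressed test, and v stored subvectors (v stands for |V|)."""
    if min(l, p, v) < 1:
        raise ValueError("l, p and v must be positive")
    return {
        "multiplexers": p,
        "width_bits": l,
        "data_inputs": v,
        # (v - 1).bit_length() equals ceil(log2(v)) for every integer v >= 1
        # and avoids floating-point rounding.
        "select_inputs": (v - 1).bit_length(),
    }


# Illustrative parameter values.
print(selection_hardware(l=5, p=8, v=22))
```

With these illustrative values, the sketch reports eight 5-bit multiplexers, each with 22 data inputs and 5 select inputs.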
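Table 4. Comparison with [28]

| circuit | l | p | \(\lvert V \rvert\) | bits [28] | bits S | tests [28] | tests S | stuck-at [28] | stuck-at S | gate-exh [28] | gate-exh S |
|---|---|---|---|---|---|---|---|---|---|---|---|
| sasc | 1 | 13 | 2 | 2 | 156 | 416 | 143 | 100.000 | 100.000 | 93.923 | 84.502 |
| usb_phy | 1 | 18 | 2 | 2 | 248 | 2,913 | 230 | 100.000 | 100.000 | 99.677 | 96.636 |
| s35932 | 1 | 13 | 2 | 2 | 175 | 232 | 162 | 89.809 | 89.809 | 99.996 | 99.971 |
| b04 | 1 | 28 | 2 | 2 | 461 | 46,318 | 433 | 99.851 | 99.851 | 98.560 | 88.249 |
| s1423 | 1 | 18 | 2 | 2 | 343 | 25,579 | 325 | 99.076 | 99.076 | 99.758 | 95.402 |
| systemcdes | 1 | 14 | 2 | 2 | 266 | 1,297 | 252 | 100.000 | 100.000 | 99.646 | 96.790 |
| b07 | 5 | 8 | 22 | 110 | 721 | 963,258 | 685 | 99.915 | 99.915 | 99.636 | 75.156 |
| simple_spi | 2 | 19 | 3 | 6 | 870 | 620,445 | 832 | 100.000 | 100.000 | 99.873 | 93.769 |
| des_area | 1 | 14 | 2 | 2 | 428 | 739 | 414 | 100.000 | 100.000 | 93.432 | 86.019 |
| i2c | 15 | 3 | 35 | 525 | 1,733 | 158,072 | 1,690 | 100.000 | 100.000 | 97.797 | 90.884 |
| systemcaes | 1 | 29 | 2 | 2 | 930 | 39,418 | 901 | 99.995 | 99.995 | 99.989 | 91.381 |
| wb_dma | 10 | 5 | 277 | 2,770 | 2,356 | 986,814 | 2,309 | *99.989 | 100.000 | 99.577 | 89.314 |
| s5378 | 1 | 36 | 2 | 2 | 2,372 | 53,432 | 2,336 | 99.131 | 99.131 | 99.533 | 90.571 |
| s9234 | 3 | 25 | 8 | 24 | 6,007 | 996,368 | 5,932 | *92.796 | 93.475 | 96.131 | 84.309 |
| aes_core | 1 | 28 | 2 | 2 | 1,927 | 3,548 | 1,899 | 100.000 | 100.000 | 99.997 | 99.958 |
| s15850 | 4 | 15 | 15 | 60 | 6,717 | 999,797 | 6,660 | *96.119 | 96.682 | 92.853 | 80.440 |
| s13207 | 1 | 47 | 2 | 2 | 4,686 | 190,271 | 4,639 | 98.462 | 98.462 | 97.623 | 78.411 |
| spi | 6 | 8 | 51 | 306 | 2,849 | 683,691 | 2,805 | 99.985 | 99.985 | 99.931 | 85.186 |
| s38584 | 5 | 20 | 28 | 140 | 10,524 | 995,504 | 10,426 | *95.796 | 95.852 | 99.766 | 98.222 |
| b20 | 5 | 24 | 29 | 145 | 19,759 | 997,430 | 19,640 | *88.861 | 93.308 | 86.057 | 84.717 |
| b15 | 5 | 23 | 32 | 160 | 20,874 | 999,779 | 20,761 | *97.708 | 98.620 | 83.513 | 70.318 |
| s38417 | 4 | 28 | 14 | 56 | 18,764 | 998,535 | 18,653 | *99.343 | 99.471 | 95.892 | 83.935 |
| b14 | 3 | 43 | 5 | 15 | 22,253 | 973,582 | 22,125 | *85.883 | 94.970 | 78.680 | 83.251 |
| tv80 | 7 | 16 | 124 | 868 | 16,094 | 998,072 | 15,985 | *98.919 | 99.527 | 97.424 | 87.963 |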
Table 4 is organized as follows. After the name of the circuit, the values of the parameters l, p, and \(|V|\) from [28] are given. Next, the following parameters of the approach from [28] and the approach suggested in this article are compared. Column bits shows the number of bits that need to be stored. Column tests shows the number of applied tests. For the approach suggested in this article, the number of applied tests is computed as \(n-B\), assuming that all the tests based on S are applied. Column stuck-at shows the stuck-at fault coverage achieved. When the approach from [28] loses stuck-at fault coverage, the fault coverage is marked with an asterisk. Column gate-exh shows the single-cycle gate-exhaustive fault coverage.
The procedure from [28] applies up to 1 million tests. For several of the circuits considered, this bound on the number of tests is not sufficient for achieving complete single stuck-at fault coverage. This limitation is common with LBIST, and a small fault coverage loss is tolerated by many LBIST approaches. This shortcoming does not exist with the approach suggested in this article, which is guaranteed to achieve the single stuck-at fault coverage of the deterministic test set \(\Psi\).
The advantages of the approach from [28] over the approach suggested in this article are a reduced number of bits and an increased single-cycle gate-exhaustive fault coverage. Both of these advantages are achieved at a cost of a significantly increased number of applied tests.
Considering the hardware cost, the need for additional hardware is typical of LBIST approaches, and exists in [28] as well. It does not exist with the storage scheme suggested in this article.

5 Concluding Remarks

This article described a new storage scheme for LFSR seeds as part of a storage-based LBIST approach. Under the suggested storage scheme, a set \(\Psi\) of N seeds for a B-bit LFSR is translated into a sequence of \(N \cdot B\) consecutive bits denoted by S. Every B consecutive bits of S are considered as a seed. With \(N \cdot B\) bits in S, the sequence S yields close to \(N \cdot B\) seeds, magnifying the test data stored in S almost B times. The article described a procedure that uses the extra tests to reduce S without losing fault coverage for a set of target faults \(F_0\). The reduction is achieved by forming a test set T and removing from S bits that do not contribute to T. When no further reduction is possible, the sequence is reordered by finding independent subsequences and changing their order. A second set of target faults, \(F_1\), is simulated to demonstrate that this approach increases the fault coverage for \(F_1\). Experimental results for benchmark circuits showed significant reductions in the storage requirements with a significant increase in the fault coverage of \(F_1\).

References

[1]
P. H. Bardell, W. H. McAnney, and J. Savir. 1987. Built-in Test for VLSI Pseudorandom Techniques. Wiley Interscience.
[2]
S. Hellebrand, S. Tarnick, J. Rajski, and B. Courtois. 1992. Generation of vector patterns through reseeding of multiple-polynomial linear feedback shift register. In Proc. Intl. Test Conf., 120–129.
[3]
J.-M. Lu and C.-W. Wu. 2000. Cost and benefit models for logic and memory BIST. In Proc. Conf. on Design, Automation and Test in Europe, 710–714.
[4]
L. Chen, S. Dey, P. Sanchez, K. Sekar, and Y. Cheng. 2000. Embedded hardware and software self-testing methodologies for processor cores. In Proc. Design Automation Conf., 625–630.
[5]
S. Hellebrand, H.-G. Liang, and H.-J. Wunderlich. 2001. A mixed mode BIST scheme based on reseeding of folding counters. Journal of Electronic Testing 17 (2001), 341–349.
[6]
N. A. Touba and E. J. McCluskey. 2001. Bit-fixing in pseudorandom sequences for scan BIST. IEEE Transactions on Computer-Aided Design 20, 4 (2001), 545–555.
[7]
H.-G. Liang, S. Hellebrand, and H.-J. Wunderlich. 2001. Two-dimensional test data compression for scan-based deterministic BIST. In Proc. Intl. Test Conf., 894–902.
[8]
I. Pomeranz and S. M. Reddy. 2002. A storage based built-in test pattern generation method for scan circuits based on partitioning and reduction of a precomputed test set. IEEE Transactions on Computers 51, 11 (2002), 1282–1993.
[9]
A. A. Al-Yamani and E. J. McCluskey. 2003. Seed encoding with LFSRs and cellular automata. In Proc. Design Automation Conf., 560–565.
[10]
S. Pateras. 2004. Security vs. test quality: Fully embedded test approaches are the key to having both. In Proc. Intl. Test Conf., Panel P2.2, 1413.
[11]
B. Cheon, E. Lee, L.-T. Wang, X. Wen, P. Hsu, J. Cho, J. Park, H. Chao, and S. Wu. 2005. At-speed logic BIST for IP cores. In Proc. Conf. on Design, Automation and Test in Europe, 860–861.
[12]
D. Jose Costa Alves and E. Barros. 2009. A logic built-in self-test architecture that reuses manufacturing compressed scan test patterns. In Proc. Symp. on Integrated Circuits and System Design, Art. 21, 1–6.
[13]
D. Xiang, M. Chen, and H. Fujiwara. 2007. Using weighted scan enable signals to improve test effectiveness of scan-based BIST. IEEE Transactions on Computers 56, 12 (2007), 1619–1628.
[14]
L.-T. Wang, X. Wen, S. Wu, H. Furukawa, H.-J. Chao, B. Sheu, J. Guo, and W.-B. Jone. 2010. Using launch-on-capture for testing BIST designs containing synchronous and asynchronous clock domains. IEEE Transactions on Computer-Aided Design 29, 2 (2010), 299–312.
[15]
R. S. Oliveira, J. Semiao, I. C. Teixeira, M. B. Santos, and J. P. Teixeira. 2011. On-line BIST for performance failure prediction under aging effects in automotive safety-critical applications. In Proc. Latin American Test Workshop, 1–6.
[16]
Y. Sato, H. Yamaguchi, M. Matsuzono, and S. Kajihara. 2011. Multi-cycle test with partial observation on scan-based BIST structure. In Proc. Asian Test Symp., 54–59.
[17]
O. Acevedo and D. Kagaris. 2012. Using the Berlekamp-Massey algorithm to obtain LFSR characteristic polynomials for TPG. In Proc. Intl. Symp. on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 233–238.
[18]
M. E. Imhof and H. Wunderlich. 2014. Bit-flipping scan—A unified architecture for fault tolerance and offline test. In Proc. Design, Automation & Test in Europe Conf., 1–6.
[19]
N. Li, E. Dubrova, and G. Carlsson. 2015. A scan partitioning algorithm for reducing capture power of delay-fault LBIST. In Proc. Design, Automation & Test in Europe Conf., 842–847.
[20]
C. Shiao, W. Lien, and K. Lee. 2016. A test-per-cycle BIST architecture with low area overhead and no storage requirement. In Proc. Intl. Symp. on VLSI Design, Automation and Test, 1–4.
[21]
Y. Liu, N. Mukherjee, J. Rajski, S. M. Reddy, and J. Tyszer. 2018. Deterministic stellar BIST for in-system automotive test. In Proc. Intl. Test Conf., 1–9.
[22]
B. Kaczmarek, G. Mrugalski, N. Mukherjee, J. Rajski, Ł. Rybak, and J. Tyszer. 2020. Test sequence-optimized BIST for automotive applications. In Proc. European Test Symp., 1–6.
[23]
A. Koneru and K. Chakrabarty. 2020. An interlayer interconnect BIST and diagnosis solution for monolithic 3-D ICs. IEEE Transactions on Computer-Aided Design 39, 10 (2020), 3056–3066.
[24]
I. Pomeranz. 2021. Storage-based built-in self-test for gate-exhaustive faults. IEEE Transactions on Computer-Aided Design 40, 10 (2021), 2189–2193.
[25]
A. Chaudhuri, S. Banerjee, J. Kim, H. Park, B. W. Ku, S. Kannan, K. Chakrabarty, and S. K. Lim. 2021. Built-in self-test and fault localization for inter-layer vias in monolithic 3D ICs. ACM Journal on Emerging Technologies in Computing Systems 18, 1 (2021), Art. 22, 1–37.
[26]
D. K. Maity, S. K. Roy, and C. Giri. 2022. A cost-effective built-in self-test mechanism for post-manufacturing TSV defects in 3D ICs. ACM Journal on Emerging Technologies in Computing Systems 18, 4 (2022), Art. 70, 1–23.
[27]
S. Wang, X. Zhou, Y. Higami, H. Takahashi, H. Iwata, Y. Maeda, and J. Matsushima. 2023. Test point insertion for multi-cycle power-on self-test. ACM Transactions on Design Automation of Electronic Systems 28, 3 (2023), Art. 46, 1–21.
[28]
I. Pomeranz. 2023. Storage-based logic built-in self-test with partitioned deterministic compressed tests. IEEE Transactions on VLSI Systems 31, 9 (2023), 1259–1268.

      Published In

      ACM Transactions on Design Automation of Electronic Systems  Volume 29, Issue 3
      May 2024
      374 pages
      EISSN:1557-7309
      DOI:10.1145/3613613
      • Editor:
      • Jiang Hu
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 14 March 2024
      Online AM: 01 February 2024
      Accepted: 27 January 2024
      Revised: 02 January 2024
      Received: 19 August 2023
      Published in TODAES Volume 29, Issue 3

      Author Tags

      1. Linear-feedback shift-register (LFSR)
      2. logic built-in self-test (LBIST)
      3. on-chip storage
      4. on-chip test generation
      5. test data compression
