[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[f-cpu] Verification, Testing, And Random Numbers
- To: f-cpu@seul.org
- Subject: [f-cpu] Verification, Testing, And Random Numbers
- From: Michael Riepe <michael+fcpu@stud.uni-hannover.de>
- Date: Tue, 03 Feb 2004 17:18:30 +0100
- Delivered-to: archiver@seul.org
- Delivered-to: f-cpu-outgoing@seul.org
- Delivered-to: f-cpu@seul.org
- Delivery-date: Tue, 03 Feb 2004 11:18:41 -0500
- Organization: mach keine witze!
- Reply-to: f-cpu@seul.org
- Sender: owner-f-cpu@seul.org
- User-agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.3.1) Gecko/20030425
[I hope the mail doesn't bounce again]
Hi F-gang!
You seem to be talking about a lot of different things at the same time:
a) verification of a design and/or its actual implementation (in VHDL),
b) testing the generated circuit (in silicon), c) the famous F-CPU POST,
and d) the use of random numbers.
For a), you have to verify that a unit does what it is supposed to do:
adders shall add, multipliers shall multiply and so on. There are
several methods to do that. An exhaustive test is most reliable, but
takes too long: Each unary operation would take 2**64 individual tests,
binary operations take 2**128, and ternary (like `mac') take 2**192 of
them. But we definitely can't afford to wait years for the result, do we?
In my testbenches, I use "systematic" tests. They test only specific
cases, and make use of "don't cares". E.g. if both operands of an adder
have their MSBits set, you *know* that a carry-out will be generated, no
matter what the other input bits are. Therefore, you can set them to 'X'
at the input (and ignore the output bits). Similarly, the LSBit of the
result will always be the XOR of the operands' LSBits, no matter what
the other bits are. And so on...
Systematic tests are less reliable than exhaustive tests, but they're
still quite good at detecting errors in a design. And they're a lot
faster (but still take too long in some cases).
When using random numbers for verification, you have a general problem:
If you feed the unit with random operands, you get a result that is
likewise random. How can you verify that it is correct? You have to
compare (that is, bitwise XOR or something like that) it with the output
of another unit that is known to work correctly (so-called "golden
code"). If the results differ, your code is broken.
But in general, there is no "golden" code. There is only another
implementation of the same problem that may or may not be correct. You
have no way to verify that, unless you use one of the methods outlined
above. In that case, you can as well verify your circuit directly.
The second problem with random number tests is that its "error coverage"
is too low. You only check a small percentage of the 2**64 (or more)
possible input variants, but errors may hide in any of them, so you
probably won't catch them all. This leads to Pentium FDIV bugs and the like.
Functional symbolic verification would be the real thing. It transforms
both the operands and the result into a special representation (using
BDDs oder one of their derivatives), performs the operation in either
representation, and compares the results. You need only a single run per
operation, and the result is as reliable as that of an exhaustive test,
only much faster. But there are no tools for it (yet).
Let's come to b), testing. There's no time for doing an exhaustive test:
even at 1000 GHz it would take 2**128 / 10**12 = ~3*10**26 seconds to
test all variants of a single binary operation like `add'. A year has
only ~3*10**7 seconds, so you would have to wait 10**19 years. Are any
immortals listening here?
The usual way is to use one of the common "fault models" and build as
many test vectors as you need to test for all detectable faults of a
unit. But this requires that you know its actual implementation, which
is not available before synthesis. Actually, this is another case of a
"systematic" test which is fast but not fully reliable.
Random numbers may work for boundary scan testing as well, but still has
the same problems as with verification: there is no "golden" code, and
the error coverage is too low.
c) The famous F-CPU POST can't use an exhaustive test, nor can it use
test vectors (there is not enough room on the chip - and, after all,
errors may occur anywhere, even in the ROM that would hold the vectors).
It doesn't make sense to use random numbers, because there is no way to
verify the result of an operation - putting "golden" code on a chip is
simply impossible.
The only way to perform a reasonable POST is to start from a well-known
state, perform a well-known sequence of operations - that is, execute a
"built-in" test program -, and compare the result with a well-known
value. How that comparision is done is more or less a matter of taste;
the LFSR approach outlined by Yann is one possible solution (reminds me
of a CRC generator, by the way).
Finally, we come to d), random numbers. If you've listened carefully to
what I said above, you will probably come to the conclusion that they're
not useful at all for verification, testing, or POST.
Q.E.D.
Michael.
*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu in the body. http://f-cpu.seul.org/