Re: [f-cpu] SIMD and exception
[...]
So here is a problem:
The first stage contains the exponent subtraction. For single, the
exponent size is 8 bits, so the subtractor delay is 6. So it just fits
into the stage.
But for double it takes longer, so I will have to split it between
stage 1 and stage 2 (putting CSV and CLA). So the two datapaths
(single and double) will not arrive at the main adder (mantissa
adder) at the same time. So I cannot use the same adder for both
operations (unless I buffer the single datapath and "wait" for the
double one).
Is there any restriction on the number of registers? This will add
a lot of registers (for the double and single datapaths)...
[...]
The subtractor will cross the first pipeline register in either case.
Don't forget that you'll have to select different slices from the
operands:
Unfortunately that's true... I'll have to cut a lot of parts...
-- assuming separate adders for chunk 0 and 1
-- suffixes -0/-1 indicate the chunk number
Ea1 := (others => '0'); -- at least 11 bits
Eb1 := (others => '0');
Ea0 := (others => '0'); -- at least 8 bits
Eb0 := (others => '0');
if mode = double then
    Ea1(10 downto 0) := A(62 downto 52);
    Eb1(10 downto 0) := B(62 downto 52);
    -- Ea0/Eb0 not used in double mode
else
    Ea1(7 downto 0) := A(62 downto 55);
    Eb1(7 downto 0) := B(62 downto 55);
    Ea0(7 downto 0) := A(30 downto 23);
    Eb0(7 downto 0) := B(30 downto 23);
end if;
-- and then subtract:
De1 := Ea1 - Eb1; -- for upper single or double
De0 := Ea0 - Eb0; -- for lower single
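To check the bit slices above against real floating-point values, here is a minimal bit-level model in Python (a software sketch, not HDL). It assumes IEEE 754 layouts in a 64-bit register: either one double, or two singles with chunk 1 in bits 63..32 and chunk 0 in bits 31..0. The helper names (bits, exponent_diffs, double_bits, single_pair_bits) are made up for this illustration.

```python
import struct

def bits(word, hi, lo):
    """Return word(hi downto lo) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def exponent_diffs(a, b, double_mode):
    """Return (De1, De0) as signed integers, mirroring the slices above."""
    if double_mode:
        ea1, eb1 = bits(a, 62, 52), bits(b, 62, 52)  # 11-bit exponent
        ea0 = eb0 = 0                                # chunk 0 unused
    else:
        ea1, eb1 = bits(a, 62, 55), bits(b, 62, 55)  # upper single
        ea0, eb0 = bits(a, 30, 23), bits(b, 30, 23)  # lower single
    return ea1 - eb1, ea0 - eb0

def double_bits(x):
    """64-bit pattern of one IEEE 754 double."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def single_pair_bits(upper, lower):
    """64-bit pattern of two IEEE 754 singles (upper in bits 63..32)."""
    return struct.unpack(">Q", struct.pack(">ff", upper, lower))[0]

# 8.0 = 2^3 and 1.0 = 2^0, so the exponents differ by 3.
print(exponent_diffs(double_bits(8.0), double_bits(1.0), True))
# Upper pair 4.0 vs 1.0, lower pair 0.5 vs 2.0.
print(exponent_diffs(single_pair_bits(4.0, 0.5),
                     single_pair_bits(1.0, 2.0), False))
```

The model returns plain signed integers; a hardware subtractor would of course deliver a fixed-width two's-complement result plus a borrow.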
OK, I understand...
But since the decoder works with a 2-bit control vector, it should take
d=2 delay (in the general case)...
That's why I cannot decode in every stage, so I will have 2 datapaths in
parallel. Except for the mantissa adders, everything else will be doubled...
or similar. That will add some latency before the subtractor. After
that, you should have enough room for a row of 4-bit adders (d=4/t=6)
and input inverters for one operand (d=1/t=1), maybe even for the
final CLA (d=5/t=6 from the beginning of the adder).
mode decoder: d=1 (or 2)
inverter:     d=1
4-bit adder:  d=4
That's already d=6 (or 7)...
But the CSV will have to reside in stage #2.
yes it's true
The same will be true for the "el cheapo" variant using a single
16-bit subtractor (with an 8+8 split for SIMD):
Ea := (others => '0'); -- 16 bits
Eb := (others => '0'); -- 16 bits
-- left-align operands
Ea(15 downto 8) := A(62 downto 55);
Eb(15 downto 8) := B(62 downto 55);
if mode = double then
    -- add least significant bits
    Ea(7 downto 5) := A(54 downto 52);
    Eb(7 downto 5) := B(54 downto 52);
    split_adder := '0';
else
    -- add second chunk
    Ea(7 downto 0) := A(30 downto 23);
    Eb(7 downto 0) := B(30 downto 23);
    split_adder := '1';
end if;
-- and then subtract:
De := SIMD_subtract(Ea, Eb, split_adder);
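SIMD_subtract is not defined in this message. The Python model below is one guess at its behaviour, assuming the split flag simply stops the borrow from crossing between bit 7 and bit 8. The names pack_exponents and simd_subtract are hypothetical, and the result is kept as a 16-bit two's-complement value.

```python
def bits(word, hi, lo):
    """Return word(hi downto lo) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def pack_exponents(word, double_mode):
    """Build the left-aligned 16-bit operand (Ea or Eb) from a 64-bit word."""
    e = bits(word, 62, 55) << 8          # Ea(15 downto 8)
    if double_mode:
        e |= bits(word, 54, 52) << 5     # Ea(7 downto 5), low 5 bits stay 0
    else:
        e |= bits(word, 30, 23)          # Ea(7 downto 0), second chunk
    return e

def simd_subtract(ea, eb, split):
    """16-bit ea - eb; with split set, the two bytes are independent."""
    if split:
        lo = ((ea & 0xFF) - (eb & 0xFF)) & 0xFF
        hi = ((ea >> 8) - (eb >> 8)) & 0xFF
        return (hi << 8) | lo
    return (ea - eb) & 0xFFFF

# Double: exponents 1026 and 1023; the 11-bit difference sits in bits 15..5.
a, b = 1026 << 52, 1023 << 52
de = simd_subtract(pack_exponents(a, True), pack_exponents(b, True), False)
print(de >> 5)

# Two singles: upper exponents 129/127, lower exponents 126/128.
a = (129 << 55) | (126 << 23)
b = (127 << 55) | (128 << 23)
de = simd_subtract(pack_exponents(a, False), pack_exponents(b, False), True)
print(hex(de))
```

Without the split, the borrow from the lower byte would corrupt the upper difference (0x01fe instead of 0x02fe in the second case), which is why the flag is needed.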
Yet another solution is this one:
De_1 := A(62 downto 55) - B(62 downto 55); -- 8 bits
De_2 := A(54 downto 52) - B(54 downto 52); -- 3 bits
De_3 := A(30 downto 23) - B(30 downto 23); -- 8 bits
Then, in stage #2, select De := De_1 & De_2 for double (requires
another CLA/CSV step), and one of (De_1, De_3) for single.
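A small Python model of this three-subtractor variant follows. It rests on one assumption about what the extra CLA/CSV step does: when the 3-bit difference De_2 produces a borrow, stage #2 selects De_1 - 1 instead of De_1 before concatenating. The function names (stage1, stage2_double, stage2_single) are invented for the sketch.

```python
def bits(word, hi, lo):
    """Return word(hi downto lo) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def stage1(a, b):
    """Three independent subtractions; also report the borrow out of De_2."""
    d1 = bits(a, 62, 55) - bits(b, 62, 55)
    d2 = bits(a, 54, 52) - bits(b, 54, 52)
    d3 = bits(a, 30, 23) - bits(b, 30, 23)
    return d1 & 0xFF, d2 & 0x7, d2 < 0, d3 & 0xFF

def stage2_double(de1, de2, borrow2):
    """De := De_1 & De_2, with De_1 decremented if De_2 borrowed."""
    hi = (de1 - 1) & 0xFF if borrow2 else de1   # carry-select style choice
    return (hi << 3) | de2                      # 11-bit two's complement

def stage2_single(de1, de3, chunk):
    """Pick the difference for chunk 1 (upper) or chunk 0 (lower)."""
    return de1 if chunk == 1 else de3

# Double with exponents 1026 and 1023: the low 3 bits are 2 - 7, so De_2 borrows.
de1, de2, borrow2, de3 = stage1(1026 << 52, 1023 << 52)
print(stage2_double(de1, de2, borrow2))
```

The borrow selection is the part that cannot finish in stage #1, which matches the point that some of stage #2 is lost whichever variant is chosen.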
But whatever you do, you'll have to sacrifice part of stage #2 if you
implement variable operand sizes.
Will I have to make a carry-select tree for the mantissa adder, like in
the CSAdd, if I split the mantissa into 8x8-bit adders?
OK, thank you very much!!
If I count correctly, I'm far from 4 stages for the fadder... it will
take at least 6 cycles... :'(
Michael.
*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu in the body. http://f-cpu.seul.org/
--
~~ Gaetan ~~
http://www.xeberon.net