# Re: [f-cpu] Shifting Unit

```
Hello,

> "Richard E. Hartny" wrote:
>
> I have a comment that may or may not be of use.
>
> A shift of one to eight is one logic level;
>
> A shift of eight to sixteen requires an additional logic level;
>
> A shift of seventeen to 32 requires an additional logic level;
>
> And a shift of 33 to 64 requires one additional logic level;
>
> As you can see; it is straight binary progression.
>
> Hope this helps some.

Thank you. I will add some other facts:

* if you use a 2-1 mux for each shift level, you need
2 levels of metal connections and 6 mux levels.
If you use a 4-1 mux, you need 3 mux levels but 4 metal levels.
At this point, the first level (shift by 0 to 3) is
relatively easy but the other stages are getting very large.

* the area used by a simple barrel shifter (like the one
we are talking about) grows as O(n*log2(n)).
Doing a small shift is relatively easy, but large amounts
use a lot of room because the wires have a width. If you multiply
the wire width (plus the minimum spacing between them) by
the length, you realise that the area becomes significant.
The wire length is proportional to the shift amount, and the
32-bit shift takes almost as much area as all the preceding
stages together. Over such long wires the signals are slower
and weaker, and more prone to alteration by interference.

* I am still trying to figure out a way to perform the 16-bit
and 32-bit shifts with intermediate stages. The signals would
be stronger and the propagation time reduced. Curiously, for
a wide shifter, the more stages there are, the faster it gets...
And maybe we can add a pipeline flip-flop if it's too slow.

* Michael said he was designing the shifter as two shift
units: one shifts left (by 32 bits, I think) and the other
shifts right. By playing with the recombination step
(yet another mux), we can shift 64-bit words left or right
by any amount.

* shift and memory: the instruction I recently proposed
(simply called "sh") has another very important use:
aligning words before/after memory access (explicitly,
with explicit instructions). For example, if you want to
store a 64-bit register to a non-aligned address, use the
pointer's LSBs as the shift amount for the data and issue
the shift. Because the shift amount is truncated to the
meaningful LSBs (i.e. 5 bits are kept if we shift 32-bit
words), and the pointer counts bytes while the shift counts
bits, we first have to shift the pointer left by 3 to turn
its byte offset into a bit count.

* The instruction I recently proposed was 2r2w: the result
was an "expanded" version of the source data, without the OR
of the two shifters' outputs (they are only swapped depending on
the shift direction). If we are going to use this instruction
to help the store instruction with non-aligned data,
we can also do the reverse: a 3r1w version would do the reverse for
a load instruction. This version would take two consecutive
data words (obtained with two consecutive loads), shift them and
combine the result. For example, if a 64-bit word is stored across an
8-byte boundary, we can reconstruct it with 2 loads and 1 shift.
From this point of view, these shifter instructions can be
named "align" and "unalign". What do you think about that?

> Regards;
> Dick Hartney
WHYGEE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PS; hope you're having/you had fun at the MPF
```
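To make the two main ideas of the message concrete, here is a small software sketch in Python. It is only an illustration under stated assumptions, not the F-CPU design: the names `barrel_shift_left` and `align_load` are hypothetical, memory is assumed to be little-endian, and the hardware muxes are modelled with plain integer operations. The first function walks through the six 2-1 mux stages of a 64-bit shifter, each stage controlled by one bit of the shift amount. The second rebuilds a 64-bit word stored across an 8-byte boundary from two aligned loads, using the pointer's low bits (shifted left by 3) as the bit shift amount.

```python
MASK64 = (1 << 64) - 1  # keep values within a 64-bit word


def barrel_shift_left(value: int, amount: int, width: int = 64) -> int:
    """Shift left through log2(width) stages of 2-to-1 muxes (sketch)."""
    mask = (1 << width) - 1
    stages = width.bit_length() - 1   # 6 stages for a 64-bit word
    amount &= width - 1               # truncate to the meaningful LSBs
    for stage in range(stages):
        step = 1 << stage             # this stage shifts by 1, 2, 4, ...
        shifted = (value << step) & mask
        # 2-to-1 mux: one bit of the amount selects shifted or unshifted
        value = shifted if (amount >> stage) & 1 else value
    return value


def align_load(low_word: int, high_word: int, address: int) -> int:
    """Rebuild a 64-bit word that straddles an 8-byte boundary.

    Assumes little-endian memory. low_word is the aligned word that
    contains `address`, high_word is the next aligned word.
    """
    bit_shift = (address & 7) << 3    # byte offset (pointer LSBs) -> bits
    if bit_shift == 0:
        return low_word & MASK64      # already aligned, nothing to combine
    low_part = (low_word & MASK64) >> bit_shift
    high_part = barrel_shift_left(high_word, 64 - bit_shift)
    return (low_part | high_part) & MASK64
```

For example, with memory holding the bytes 0 to 15, calling `align_load` on the two aligned words with an address ending in 3 returns the word made of bytes 3 to 10. In this sketch the stage with the largest step (32) is the one whose wires would be longest in silicon, which is the cost the message describes.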