
[f-cpu] whygee's Nth slaughtered ROP2 version



hi,

so i did it again...
You know i can't keep myself from overdoing it...

This time i checked with Vanilla, Simili and ncvhdl.

there are some french comments because i want to
use these files as a tutorial for a french magazine...
this is why i skipped the xbar buffer and LUT.

btw, nicO, could you please check if Synopsys groks it ?

@+
WHYGEE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------------------------------------------------------------
-- f-cpu/vhdl/eu_rop2/rop2_unit.vhdl - ROP2 Execution Unit for the F-CPU
-- Copyright (C) 2000-2001 Yann GUIDON (whygee@f-cpu.org)
--
-- v0.2: Michael Riepe reorganized the main for-generate loop
-- + corrected the lookup table (wrong op for ORN)
-- v0.3: YG replaced UMAX/8 with MAXSIZE :-)
-- v0.4: 11/17/2000, YG wants to rewrite the unit with MR's gate library ...
--  -> abandoned. we stick to high-level coding.
-- v0.5: 8/12/2001, YG modifies the interface, the names, adds MUX,...
-- Sun Aug 12 01:16:11 2001: still untested but it includes
-- the latest updates to the FC0 core.
-- Tue Aug 21 08:45:16 2001: trying to make something that works reasonably.
-- Mon Sep  3 08:49:45 2001: YG fixed some silly compile bugs :-/
-- vanillaHDL script and testbench added.
-- Sun Oct  7 05:39:23 2001: changed ROP2 function to MUX4
-- Mon Oct  8 01:39:45 2001: merged SELECT with the FANOUT loop.
-- Sat Nov 24 04:40:35 2001: cleanup
--------------------------BEGIN-VHDL-LICENCE-----------------------------
-- This program is free software; you can redistribute it and/or modify
-- it under the terms of the GNU General Public License as published by
-- the Free Software Foundation; either version 2 of the License, or
-- (at your option) any later version.
--
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-- GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public License
-- along with this program; if not, write to the Free Software
-- Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
---------------------------END-VHDL-LICENCE------------------------------
--
-- It should be easily synthesizable, but synthesis has not been attempted yet.
-- What matters most today is that it compiles and behaves correctly.
-- Warning: this code is and should remain purely combinational.
-- There is no latching here; it must be done at another level.
-- Furthermore, the function lookup table has been moved earlier
-- in the pipeline, in parallel with the Xbar cycle: see the
-- f-cpu/vhdl/eu_rop2/rop2_xbar.vhdl file.
-- The big fanout problem (propagation of the opcode from 1 to 64 bits)
-- overlaps the Xbar cycle, so we can build a nice "signal tree".
-- Finally, only byte combines are possible for now. The COMBINE
-- instruction is still not completely specified in the manual.
--------------------------------------------------------------------------

-- include the standard libraries
LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;
    USE ieee.numeric_std.all;
-- include the global F-CPU definitions
LIBRARY work;
    USE work.FCPU_config.ALL;

-- definition of the circuit's interface:
Entity EU_ROP2 is
  port(
    ROP2_in_A,
    ROP2_in_B,
    ROP2_in_C : in F_VECTOR;    -- the 3 operands
    ROP2_function_bit0,
    ROP2_function_bit1,   -- pre-buffered boolean function bits
    ROP2_function_bit2,
    ROP2_function_bit3 : in Std_ulogic_vector((MAXSIZE *2) downto 0); -- fanout=4
      -- note: only indices (MAXSIZE*2)-1 downto 0 are read below (index i/4)
    ROP2_mode : in Std_ulogic_vector(1 downto 0);  -- 2 function bits from the instruction
--    Combine_size : in Std_ulogic_vector(1 downto 0);   -- unused ATM. Byte chunks only.
    ROP2_out     : out F_VECTOR     -- the result
  );
end EU_ROP2;

Architecture arch1 of EU_ROP2 is
-- internal signals, i.e. "local variables":
  signal
    partial_ROP,
    partial_MUX : F_VECTOR;  -- the partial results.
  signal
    partial_OR1,
    partial_OR2,
    partial_AND1,
    partial_AND2 : Std_ulogic_vector(MAXSIZE-1 downto 0);
  signal
    partial_OR,
    partial_AND : Std_ulogic_vector((MAXSIZE*2)-1 downto 0);
    -- these signals help for the fanout problem

  subtype sv2 is std_ulogic_vector(1 downto 0);
    -- this solves simili's syntax limit for the with .. select
begin

-- this part is in "concurrent" mode:
-- the order of the statements does not matter since we only describe
-- connections or data dependencies. This lets us write
-- only two loops instead of 3.

--------------------------------------------------------------------------
-- ROP2 cycle : (combinational part only)
--------------------------------------------------------------------------

-- 1 : last fanout for the function bits + the ROP2 operator itself :
-- (the former loop was merged with the ROP2/MUX operation)
  mux_loop : for i in UMAX-1 downto 0 generate
    with sv2'(ROP2_in_A(i) & ROP2_in_B(i)) select
    partial_ROP(i) <=
      ROP2_function_bit0(i/4) when "00",
      ROP2_function_bit1(i/4) when "10",
      ROP2_function_bit2(i/4) when "01",
      ROP2_function_bit3(i/4) when others; -- "11"

-- 2 bis : the "select" MUX
    with ROP2_in_C(i) select
      partial_MUX(i) <=
        ROP2_in_B(i) when '1',
        ROP2_in_A(i) when others; -- '0'

-- 4 : final selection stage :
    with ROP2_mode select
      ROP2_out(i) <=
        partial_ROP(i)   when ROP2_DIRECT_MODE,
        partial_AND(i/4) when ROP2_AND_MODE,
        partial_OR(i/4)  when ROP2_OR_MODE,
        partial_MUX(i)   when others; -- MUX
-- YG> warning : huge fanout on ROP2_mode ! 1->64 for 4 signals,
-- i hope that the synthesiser will generate the proper buffer tree.
  end generate mux_loop;

-- 3 : partial ORs and ANDs on the byte chunks :
  BYTE_COMBINE : for i in MAXSIZE-1 downto 0 generate
    partial_AND1(i) <= partial_ROP(8*i)
                   and partial_ROP(8*i+1) 
                   and partial_ROP(8*i+2) 
                   and partial_ROP(8*i+3);
    partial_AND2(i) <= partial_ROP(8*i+4)
                   and partial_ROP(8*i+5) 
                   and partial_ROP(8*i+6) 
                   and partial_ROP(8*i+7);
    partial_OR1(i) <= partial_ROP(8*i)
                   or partial_ROP(8*i+1) 
                   or partial_ROP(8*i+2) 
                   or partial_ROP(8*i+3);
    partial_OR2(i) <= partial_ROP(8*i+4)
                   or partial_ROP(8*i+5) 
                   or partial_ROP(8*i+6) 
                   or partial_ROP(8*i+7);

    partial_AND(i*2)   <= partial_AND1(i) and partial_AND2(i);
    partial_AND(i*2+1) <= partial_AND1(i) and partial_AND2(i);
    partial_OR(i*2)    <= partial_OR1(i)  or  partial_OR2(i);
    partial_OR(i*2+1)  <= partial_OR1(i)  or  partial_OR2(i);

-- YG> I'm still uncertain about the best way to write a multi-size version
-- YG> because the added latency might ruin the ROP2 unit's performance.
-- YG> So the multi-size version is dropped until it becomes necessary.
-- YG> Let's stick to plain bytes...
-- YG> Note : rop2.eps explains the trick to relieve the fanout (1->8) problem.

  end generate BYTE_COMBINE;
end;
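
A minimal self-checking testbench for the unit above could look like the
sketch below. It is not part of the original files: the entity name
test_rop2_sketch and the signal names are made up for illustration, and it
assumes that EU_ROP2 and FCPU_config are compiled into the work library.
The function bits are driven directly (only the "11" entry of the truth
table is set, which gives A and B), so nothing is assumed about the lookup
table in rop2_xbar.vhdl.

```vhdl
LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;
LIBRARY work;
    USE work.FCPU_config.ALL;

-- hypothetical testbench entity, not part of the F-CPU sources
entity test_rop2_sketch is
end test_rop2_sketch;

architecture sketch of test_rop2_sketch is
  signal A, B, C, R : F_VECTOR := (others => '0');
  -- same range as the ports of EU_ROP2
  signal f0, f1, f2, f3 :
    Std_ulogic_vector((MAXSIZE*2) downto 0) := (others => '0');
  signal mode : Std_ulogic_vector(1 downto 0) := ROP2_DIRECT_MODE;

  -- byte 0 has bit 0 cleared, byte 1 is all ones, the rest is zero
  constant PATTERN    : F_VECTOR := (15 downto 1 => '1', others => '0');
  constant BYTE1_ONLY : F_VECTOR := (15 downto 8 => '1', others => '0');
  constant BYTES_0_1  : F_VECTOR := (15 downto 0 => '1', others => '0');
  constant ALL_ONES   : F_VECTOR := (others => '1');
begin

  dut : entity work.EU_ROP2
    port map (
      ROP2_in_A => A, ROP2_in_B => B, ROP2_in_C => C,
      ROP2_function_bit0 => f0, ROP2_function_bit1 => f1,
      ROP2_function_bit2 => f2, ROP2_function_bit3 => f3,
      ROP2_mode => mode,
      ROP2_out  => R);

  stimulus : process
  begin
    -- boolean function = AND : output is '1' only when A(i) & B(i) = "11"
    f3 <= (others => '1');
    A  <= ALL_ONES;
    B  <= PATTERN;

    mode <= ROP2_DIRECT_MODE;
    wait for 1 ns;
    assert R = PATTERN report "DIRECT mode failed" severity error;

    -- byte-wise AND of the partial result : only byte 1 is all ones
    mode <= ROP2_AND_MODE;
    wait for 1 ns;
    assert R = BYTE1_ONLY report "AND mode failed" severity error;

    -- byte-wise OR of the partial result : bytes 0 and 1 are non-zero
    mode <= ROP2_OR_MODE;
    wait for 1 ns;
    assert R = BYTES_0_1 report "OR mode failed" severity error;

    -- MUX : C(i) = '1' selects B(i), otherwise A(i)
    mode <= ROP2_MUX_MODE;
    C <= ALL_ONES;
    wait for 1 ns;
    assert R = PATTERN report "MUX mode (C=1) failed" severity error;
    C <= (others => '0');
    wait for 1 ns;
    assert R = ALL_ONES report "MUX mode (C=0) failed" severity error;

    report "test_rop2_sketch finished" severity note;
    wait;
  end process;

end sketch;
```

The constants are built with aggregates rather than hex literals so that the
sketch does not depend on the value of UMAX.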

scripts.sh

-------------------------------------------------------------------------
-- WARNING !  This file is a manual, shortened version !
--------------------------BEGIN-VHDL-LICENCE-----------------------------
-- This program is free software; you can redistribute it and/or modify
-- it under the terms of the GNU General Public License as published by
-- the Free Software Foundation; either version 2 of the License, or
-- (at your option) any later version.
--
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-- GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public License
-- along with this program; if not, write to the Free Software
-- Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
---------------------------END-VHDL-LICENCE------------------------------
--
-- (c) Yann GUIDON, oct. 21, 2000 <whygee@f-cpu.org>
-- v0.2 : Michael Riepe changed F_RANGE
-- v0.3 : YG specified the user-modifiable constants + GPL
-- v0.4 : MR proposed LOGMAXSIZE, YG added the ROP2 constants.
-- v0.5 : nov. 17, 2000, YG added SR_IRQ_BASE, SR_TRAP_BASE,
--         SR_SYSCALL_BASE, SR_URL etc.
-- v0.6 : nov. 26, 2000, YG moved some SR stuff to /VHDL/EU_sr
-- v0.7 : aug. 19, 2001, YG hacked for m4 preprocessing.
--        run f-cpu/configure.sh to update this file.
-- v0.8 : aug. 28, 2001 : YG + MR modified some stuff.
--        MR hinted the "eval(radix)" trick, status is satisfactory.
--
-- This version is for demonstration only.
--
-- This package defines the "characteristic widths" of
-- the internal units. Please respect the restrictions.
--
-- **************************************************************
-- WARNING : All the user-modifiable values are defined in the 
-- f-cpu/configuration/f-cpu_user.m4 file.
-- **************************************************************
--
--  * LOGMAXSIZE : Log2 of the Size of the registers in bytes.
--  Can be any integer above or equal to 2. 2 corresponds to
--  a 32-bit implementation, 3 corresponds to a 64-bit version.
--  This is the most important parameter, the first with
--  which one can play. Be careful anyway. The 32-bit version will
--  not work yet.
--

LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;

package FCPU_config is

------------------------------------------------------
-- Most important F-CPU constants :
------------------------------------------------------

-- Number >=2, 3 corresponds to 64-bit registers
  constant LOGMAXSIZE : natural := 3;
    -- defined in f-cpu/configuration/f-cpu_user.m4

-- Size of the registers in bytes
  constant MAXSIZE : natural := 2**LOGMAXSIZE;

-- Size of the registers in bits.
  constant UMAX : natural := MAXSIZE * 8;

-- Range of a register width declaration.
  subtype F_RANGE is natural range UMAX-1 downto 0 ;

-- shortcut for a very common declaration.
  subtype F_VECTOR is std_ulogic_vector(F_RANGE) ;


-------------------------------------------------------
-- The ROP2 unit : these constants specify the
-- correspondence between the binary code and the actual
-- operation. These data are copied here for convenience
-- only, for example if you want to make an assembler in
-- VHDL. Check the file rop2_xbar.vhdl for more information.
--------------------------------------------------------

  constant ROP2_DIRECT_MODE : std_ulogic_vector(1 downto 0) := "00";
  constant ROP2_AND_MODE :    std_ulogic_vector(1 downto 0) := "01";
  constant ROP2_OR_MODE :     std_ulogic_vector(1 downto 0) := "10";
  constant ROP2_MUX_MODE :    std_ulogic_vector(1 downto 0) := "11";

  constant ROP2_AND   : std_ulogic_vector(2 downto 0) := "000";
  constant ROP2_ANDN  : std_ulogic_vector(2 downto 0) := "001";
  constant ROP2_XOR   : std_ulogic_vector(2 downto 0) := "010";
  constant ROP2_OR    : std_ulogic_vector(2 downto 0) := "011";
  constant ROP2_NOR   : std_ulogic_vector(2 downto 0) := "100";
  constant ROP2_XNOR  : std_ulogic_vector(2 downto 0) := "101";
  constant ROP2_ORN   : std_ulogic_vector(2 downto 0) := "110";
  constant ROP2_NAND  : std_ulogic_vector(2 downto 0) := "111";

  constant ROP2_VALUE_AND   : std_ulogic_vector(3 downto 0) := "0001";
  constant ROP2_VALUE_ANDN  : std_ulogic_vector(3 downto 0) := "0010";
  constant ROP2_VALUE_XOR   : std_ulogic_vector(3 downto 0) := "0110";
  constant ROP2_VALUE_OR    : std_ulogic_vector(3 downto 0) := "0111";
  constant ROP2_VALUE_NOR   : std_ulogic_vector(3 downto 0) := "1000";
  constant ROP2_VALUE_XNOR  : std_ulogic_vector(3 downto 0) := "1001";
  constant ROP2_VALUE_ORN   : std_ulogic_vector(3 downto 0) := "1011";
  constant ROP2_VALUE_NAND  : std_ulogic_vector(3 downto 0) := "1110";

end FCPU_config;


package body FCPU_config is

-- empty for this "short" version.

end FCPU_config;
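
As an illustration of how the ROP2 constants above might be used (for
example in an assembler or a testbench), here is a small sketch that
translates the 3-bit operation code into the matching 4-bit ROP2_VALUE_*
constant. The package name rop2_lookup_sketch and the function name
rop2_value are hypothetical. The real lookup table lives in rop2_xbar.vhdl,
which is not included here, so this says nothing about how the four bits
are routed to ROP2_function_bit0..3.

```vhdl
LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;
LIBRARY work;
    USE work.FCPU_config.ALL;

-- hypothetical helper package, not part of the F-CPU sources
package rop2_lookup_sketch is
  -- returns the 4-bit truth-table value for a 3-bit ROP2 operation code
  function rop2_value(op : std_ulogic_vector(2 downto 0))
    return std_ulogic_vector;
end rop2_lookup_sketch;

package body rop2_lookup_sketch is

  function rop2_value(op : std_ulogic_vector(2 downto 0))
    return std_ulogic_vector is
  begin
    -- an if/elsif chain is used so that the choices do not have
    -- to be locally static, as a case statement would require
    if    op = ROP2_AND  then return ROP2_VALUE_AND;
    elsif op = ROP2_ANDN then return ROP2_VALUE_ANDN;
    elsif op = ROP2_XOR  then return ROP2_VALUE_XOR;
    elsif op = ROP2_OR   then return ROP2_VALUE_OR;
    elsif op = ROP2_NOR  then return ROP2_VALUE_NOR;
    elsif op = ROP2_XNOR then return ROP2_VALUE_XNOR;
    elsif op = ROP2_ORN  then return ROP2_VALUE_ORN;
    elsif op = ROP2_NAND then return ROP2_VALUE_NAND;
    else
      -- op contains 'U', 'X', 'Z', ... : propagate the unknown
      return "XXXX";
    end if;
  end rop2_value;

end rop2_lookup_sketch;
```

With this package compiled, rop2_value(ROP2_XOR) returns "0110", the value
of ROP2_VALUE_XOR defined in FCPU_config.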