PIPE_MULT
Pipelined Multiplier with generic width and depth
Rev. 1.3
Copyright 2014 www.zipcores.com

Key Design Features

- Synthesizable, technology independent VHDL IP Core
- Function y = a * b
- Input values as signed or unsigned numbers
- Output values as signed or unsigned numbers
- Configurable data width and pipeline depth
- Supports both LUT-based and hard multiplier blocks
- Includes a classic shift-add multiplier for larger width implementations
- High-speed fully pipelined architecture

Applications

- Fixed-point mathematics
- Fundamental building block in all digital processing functions

Block Diagram

Figure 1: Pipelined multiplier architecture (conceptual model)

General Description

PIPE_MULT (Figure 1) is a general purpose multiplier with a configurable data width and configurable number of pipeline stages. Input values are accepted as either signed or unsigned numbers depending on the generic setting use_signed. Likewise, output values are either signed or unsigned depending on the same setting.

The number of pipeline stages may be programmed using the generic parameter levels. By changing this value, a multiplier may be generated which trades off latency against maximum attainable clock frequency.

In addition, the pipelined multiplier component also includes a compiler hint generic setting. By modifying this setting, the compiler can be instructed to infer LUT-based or hard multiplier/DSP resources.

Values are sampled on the rising clock-edge of clk when en is high. The function has a clock-cycle latency which is equal to the number of pipeline levels.

Pin-out Description

Pin name               I/O   Description         Active state
clk                    in    Synchronous clock   rising edge
en                     in    Clock enable        high
a_in [dw - 1:0]        in    Input value 1       data
b_in [dw - 1:0]        in    Input value 2       data
product [dw*2 - 1:0]   out   Output product      data

Generic Parameters

Generic name   Description                         Type      Valid range
dw             Input data width                    integer   ≥ 1
levels         Number of pipeline stages           integer   ≥ 1
style          Multiplier style (compiler hint)    string    Altera: dsp, logic
                                                             Xilinx: auto, block, lut, pipe_lut, etc.
use_signed     Use signed or unsigned arithmetic   boolean   TRUE/FALSE

Functional Timing

Figure 2 demonstrates the computation of a_in * b_in, where a = 0xA92F (-22225 in decimal) and b = 0x712C (28972 in decimal). In this example, the parameters have been set to dw = 16, levels = 3, use_signed = true. The result, 0xD99ED314 (-643902700), has a latency of 3 clock cycles.

Figure 2: Calculation of a * b

Source File Description

All source files are provided as text files coded in VHDL. The following table gives a brief description of each file.

Source file                      Description
pipe_mult_reg.vhd                Pipeline register block
pipe_mult_classic.vhd            Classic pipelined multiplier (more suited to large width LUT-based implementations)
pipe_mult_classic_unsigned.vhd   Classic pipelined multiplier (unsigned version)
pipe_mult.vhd                    Top-level block
pipe_mult_bench.vhd              Top-level test bench

Functional Testing

An example VHDL testbench is provided for use in a suitable VHDL simulator. The compilation order of the source code is as follows:

1. pipe_mult_reg.vhd
2. pipe_mult.vhd
3. pipe_mult_bench.vhd

The VHDL testbench instantiates the multiplier component and the user may modify the generic parameters as required. The simulation must be run for at least 2 ms, during which time the multiplier will be driven with a randomized sequence of input values. The test terminates automatically.

The simulation generates two text files called pipe_mult_in.txt and pipe_mult_out.txt. These files respectively contain the input and output data samples captured at the interfaces during the test.

Figure 3 shows the results of the multiplier used to implement the function f(x) = x^2. Results are shown for the first 1000 samples.

Figure 3: Plot of test results for function f(x) = x^2

Synthesis

The source files required for synthesis and the design hierarchy are shown below:

pipe_mult.vhd
    pipe_mult_reg.vhd

The VHDL core is designed to be technology independent. However, as a benchmark, synthesis results have been provided for the Xilinx Virtex 6 and Spartan 6 FPGA devices. Synthesis results for other FPGAs and technologies can be provided on request.

Synthesis results are shown with the generic parameters set to: dw = 32, levels = 5, style = auto, use_signed = true. Note that increasing the number of pipeline levels will increase the maximum attainable clock frequency (up to a point) for a given multiplier data width.

Two additional Classic implementations of the pipelined multiplier are also provided with the source code. In some instances these may give better results than the standard pipe_mult.vhd component. These files are called pipe_mult_classic.vhd and pipe_mult_classic_unsigned.vhd. These versions of the multiplier have a fixed latency of 4 and 3 cycles respectively. They are coded as a series of partial products, shifts and adds and are generally more suited to LUT-based or very wide multiplier implementations.

Resource usage is specified after Place and Route.

VIRTEX 6

Resource type              Quantity used
Slice register             49
Slice LUT                  17
Block RAM                  0
DSP48                      4
Occupied slices            10
Clock frequency (approx)   650 MHz

SPARTAN 6

Resource type              Quantity used
Slice register             49
Slice LUT                  22
Block RAM                  0
DSP48                      4
Occupied slices            9
Clock frequency (approx)   200 MHz
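The worked example under Functional Timing (dw = 16, levels = 3, signed arithmetic) can be cross-checked with a small software model. The sketch below is a Python behavioural model written for illustration only; it is not derived from the VHDL source. The class name PipeMultModel, the helper to_signed, and the assumption that the pipeline registers start at zero are all hypothetical. It models one register per pipeline level, so a product appears on the output after `levels` rising clock edges with en high.

```python
from collections import deque


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


class PipeMultModel:
    """Hypothetical behavioural model of y = a * b with `levels` register stages."""

    def __init__(self, dw=16, levels=3, use_signed=True):
        self.dw = dw
        self.use_signed = use_signed
        self.mask_in = (1 << dw) - 1
        self.mask_out = (1 << (2 * dw)) - 1
        # One entry per pipeline register. A zero start value is an assumption.
        self.pipe = deque([0] * levels, maxlen=levels)

    def clock(self, a_in, b_in, en=True):
        """Model one rising edge of clk and return the product bus after it."""
        if en:
            a, b = a_in & self.mask_in, b_in & self.mask_in
            if self.use_signed:
                a, b = to_signed(a, self.dw), to_signed(b, self.dw)
            # New product enters stage 1; the oldest value falls off the end.
            self.pipe.appendleft((a * b) & self.mask_out)
        return self.pipe[-1]


# Datasheet example: a = 0xA92F, b = 0x712C, dw = 16, levels = 3, signed.
mult = PipeMultModel(dw=16, levels=3, use_signed=True)
outputs = [mult.clock(0xA92F, 0x712C) for _ in range(3)]
result = outputs[-1]
print(hex(result), to_signed(result, 32))  # 0xd99ed314 -643902700
```

The first two edges return the assumed reset value and the third returns the product, which matches the 3-cycle latency quoted for levels = 3. Setting use_signed=False in the same model treats both operands as unsigned 16-bit values.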