FIR_NTAP_MUX
N-Channel Multiplexed FIR Filter
Rev. 1.2
Copyright 2013 www.zipcores.com

Key Design Features

- Synthesizable, technology independent VHDL Core
- N-channel FIR filter core implemented as a systolic array for speed and scalability
- Support for one or more independent channels (8 maximum)
- Configurable and independent coefficient sets for each channel
- Configurable data width and number of taps
- Symmetric arithmetic rounding limits DC-bias problems
- Output saturation or wrap modes
- Much cheaper than implementing separate FIR filters in parallel as hardware resources are shared between all channels
- Supports 550 MHz+ sample rates (550/N MHz per channel) (see Note 1)

Note 1: Xilinx Virtex 6 FPGA used as a benchmark.

Applications

- High-speed filtering applications where hardware resources are limited, e.g. when it becomes impractical to use multiple FIR filters in parallel
- Dual-channel inputs such as complex valued I/Q in digital communications systems
- Parallel DSP processor architectures
- General purpose FIR filters with odd or even numbers of taps
- Filters with arbitrary sets of coefficients (e.g. non-symmetrical)

Block Diagram

Figure 1: N-Channel FIR filter architecture

Pin-out Description

Pin name          I/O   Description                                                                 Active state
clk               in    System clock (F_S)                                                          rising edge
reset             in    System reset                                                                low
en                in    Clock enable                                                                high
x_val             in    Filter inputs valid (coincident with first valid input sample at x0_in)     high
xN_in [dw-1:0]    in    Filter input samples                                                        data (signed number)
y_val             out   Filter outputs valid (coincident with first valid output sample at y0_out)  high
yN_out [dw-1:0]   out   Filter output samples                                                       data (signed number)

Generic Parameters

Generic name                        Description                                                    Type               Valid range
num_channels                        Number of filter channels (N)                                  integer            1 ≤ N ≤ 8
num_taps                            Number of filter taps                                          integer            > 2
dw                                  Width of input/output data samples                             integer            ≥ 2
cw                                  Width of coefficients                                          integer            ≥ 2
fw                                  Number of coefficient fraction bits                            integer            ≥ 0 (fw < cw)
coeff_A to coeff_H [num_taps-1:0]   Filter coefficients (one coefficient set per filter channel)   integer array x N  Any integer in range ±2^(cw-1)
USE_ROUNDING                        Use symmetric arithmetic rounding (not truncate)               Boolean            TRUE/FALSE
USE_SATURATE                        Saturate outputs (not wrap)                                    Boolean            TRUE/FALSE

General Description

FIR_NTAP_MUX is an N-channel multiplexed FIR filter designed for high sample rate applications where hardware resources are limited. The main filter core is organized as a scalable systolic array permitting the user to specify large order filters without compromising maximum attainable clock-speed.

Essentially the filter functions as if it were N separate FIR filters. Each input sample is multiplexed into the filter at a sample rate equal to F_S/N, where F_S is the sampling frequency of the main filter core. Likewise, output samples are updated at a frequency of F_S/N.
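
The statement that the core behaves like N separate FIR filters can be illustrated with a small behavioural model. The Python sketch below is not the VHDL implementation: the function names `fir_reference` and `fir_multiplexed` are hypothetical, arithmetic is done in full-precision integers, and pipeline latency, rounding and saturation are deliberately left out. It only shows that one shared multiply-accumulate loop, fed with a channel-interleaved stream, gives the same per-channel results as independent filters.

```python
from collections import deque


def fir_reference(samples, coeffs):
    """Direct-form FIR on a single channel: y[n] = sum over k of h[k] * x[n-k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:              # samples before n = 0 are taken as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_multiplexed(stream, coeff_sets):
    """Filter a channel-interleaved stream (ch0, ch1, ..., chN-1, ch0, ...).

    Hypothetical behavioural model: the multiply-accumulate loop is shared,
    while each channel keeps its own delay line and coefficient set.
    """
    n_ch = len(coeff_sets)
    # One zero-initialised delay line per channel, as long as its tap count.
    delay = [deque([0] * len(h), maxlen=len(h)) for h in coeff_sets]
    out = []
    for i, x in enumerate(stream):
        ch = i % n_ch                   # channel served on this cycle
        delay[ch].appendleft(x)         # newest sample first, oldest one dropped
        out.append(sum(h * d for h, d in zip(coeff_sets[ch], delay[ch])))
    return out
```

De-interleaving the output of `fir_multiplexed` (every N-th sample, starting at the channel index) should reproduce what `fir_reference` returns for that channel on its own, which is the sense in which each channel runs at F_S/N.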
The first sample into the filter is aligned by asserting the signal x_val high. The signal y_val is asserted with the first valid output sample. Data samples are advanced in the pipeline on the rising clock-edge of clk when en is active high. When en is low, all data samples are stalled. The clock-enable signal may be used to temporarily disable the filter, or possibly to modify the effective sampling frequency of the system clock. If the clock-enable is not needed, it is recommended that this signal be tied high as it will improve overall circuit performance.

Mathematically, the filter implements the difference equation:

$$y[n] = h_0\,x[n] + h_1\,x[n-1] + \dots + h_N\,x[n-N]$$

In the above equation, the input signal is x[n], the output signal is y[n] and h_0 to h_N represent the filter coefficients. The number N is the filter order, the number of filter taps being equal to N+1. (In this equation only, N denotes the filter order rather than the number of channels.)

Filter coefficients and I/O specification

Filter coefficients (see Note 2) are defined as signed fixed-point numbers in [cw fw] format, where cw is the total number of coefficient bits and fw is the number of bits in the fractional part. In all cases, cw must be at least 2 bits and fw must be less than cw to accommodate the sign bit. For instance, a coefficient in [10 8] format is a 10-bit word whose lower 8 bits form the fractional part, leaving 2 integer bits including the sign bit.

The number of bits in the input and output samples is controlled by the parameter dw. Inputs and outputs are signed values (their format is purely relative). Unused inputs should be tied to zero. For instance, if a filter design only requires four channels, then inputs x4_in to x7_in should be tied low.

Note 2: The design is supplied with Matlab scripts for the easy generation of coefficient sets using FDAtool. Please see application note app_note_zc002.pdf for more details.

Filter implementation options

Output samples may be truncated to dw bits or rounded depending on the implementation option USE_ROUNDING. If the rounding option is selected, then symmetric arithmetic rounding is used. This means that the fraction 0.1000... is added to positive numbers and 0.0111... is added to negative numbers. Filters implemented with the rounding option help to reduce the small amplitude offset introduced at DC (0 Hz baseband frequency) attributable to rounding error.

In addition, the option USE_SATURATE determines what happens if the output samples are too large. If the saturate option is enabled, then in the event of an overflow, the output samples saturate to the largest positive or negative number permitted by dw. With the saturate option disabled, the output samples simply wrap around. Note that, depending on the format of the coefficients and the data width relative to the magnitude of the input samples, the filter outputs may never overflow. In this case, the user may not require the saturation logic.

Filter latency

The latency of the filter defined here is the latency in system clock-cycles from the point at which the first input sample is valid to the point at which the first output sample is valid. The total latency is defined by the following formula:

$$\mathrm{Lat}_{TOT} = (N \times \mathrm{Taps}) + (N \times 2) + \mathrm{Lat}_{RND} + \mathrm{Lat}_{SAT} + 2$$

N = Number of channels
Taps = Number of filter taps
Lat_RND = 2 if rounding enabled, 0 otherwise
Lat_SAT = 1 if saturate enabled, 0 otherwise

As an example, consider a 4-channel, 50 tap filter with rounding and saturation enabled. The total latency would be calculated as: (4*50) + (4*2) + 2 + 1 + 2 = 213 clock cycles.

Sampling frequency considerations

The system clock frequency is the sampling frequency of the internal filter core. Let this be denoted as F_S. It follows that the sampling frequency of the input and output samples is dependent on the number of multiplexed channels, N. In particular, the following formula must be observed for correct filter operation:

$$F_S(\text{one channel}) = \frac{F_S}{N}$$

Functional Timing

Figure 2 shows a sequence of input samples for an 8-channel filter. Note that the signal x_val is used to align the first data sample at the filter input. From that point onwards, the remaining inputs are sampled sequentially in turn. If the user wishes to re-align the filter inputs, then a system reset must be performed before x_val is reasserted with the new first sample.

Figure 2: Input timing - all 8 channels active

The output samples follow a similar pattern. Figure 3 shows the corresponding outputs for an 8-channel filter. From the point at which y_val is asserted, the downstream circuit must sample the filter outputs on consecutive clock cycles.
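
The rounding and overflow behaviour described under Filter implementation options can be sketched in a few lines of Python. This is an illustrative model only, and it rests on one assumption that the datasheet does not state explicitly: that the output is formed by discarding the fw fraction bits of a full-precision accumulator. The function names are hypothetical and do not correspond to signals or entities in the VHDL core.

```python
def truncate(acc, fw):
    """Drop fw fraction bits without rounding (arithmetic shift, rounds down)."""
    return acc >> fw


def round_symmetric(acc, fw):
    """Symmetric arithmetic rounding before dropping fw fraction bits.

    Adds 0.1000... (binary) to non-negative values and 0.0111... to negative
    values, so that ties are rounded away from zero for both signs.
    Assumption: exactly fw fraction bits are discarded.
    """
    if fw == 0:
        return acc                      # nothing to round
    half = 1 << (fw - 1)                # 0.1000... in units of the LSB
    bias = half if acc >= 0 else half - 1
    return (acc + bias) >> fw


def saturate(value, dw):
    """Clamp to the signed range representable in dw bits."""
    hi = (1 << (dw - 1)) - 1
    lo = -(1 << (dw - 1))
    return max(lo, min(hi, value))


def wrap(value, dw):
    """Keep the low dw bits and reinterpret them as a signed number."""
    value &= (1 << dw) - 1
    if value & (1 << (dw - 1)):         # sign bit set after masking
        value -= 1 << dw
    return value
```

With fw = 1, an accumulator value of 3 (1.5) rounds to 2 and a value of -3 (-1.5) rounds to -2, whereas plain truncation gives 1 and -2 respectively. That asymmetry in truncation is the source of the DC offset that symmetric rounding is intended to limit.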
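
As a cross-check of the Filter latency formula and the per-channel rate given under Sampling frequency considerations, the short Python sketch below evaluates both. The function and parameter names are chosen here for illustration and are not part of the core; only the arithmetic is taken from the datasheet.

```python
def total_latency(num_channels, num_taps, use_rounding, use_saturate):
    """Total latency in system clock cycles, per the datasheet formula."""
    lat_rnd = 2 if use_rounding else 0      # extra cycles for rounding
    lat_sat = 1 if use_saturate else 0      # extra cycle for saturation
    return num_channels * num_taps + num_channels * 2 + lat_rnd + lat_sat + 2


def per_channel_rate(f_s_hz, num_channels):
    """Sample rate seen by each channel: F_S / N."""
    return f_s_hz / num_channels


# Worked example from the text: 4 channels, 50 taps, rounding and saturation on.
example_latency = total_latency(4, 50, True, True)
```

Here `example_latency` evaluates to 213, matching the worked example. For a 550 MHz system clock and 4 channels, `per_channel_rate(550e6, 4)` gives 137.5 MHz per channel.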