# DESIGN OF RNS FREQUENCY SAMPLING FILTER BANKS Uwe Meyer-Bäse, Jon Mellott, and Fred Taylor University of Florida, High Speed Digital Architecture Laboratory 405 SE BLDG 42, Gainesville, FL 32611-6130 USA {uwe,jon,fjt}@alpha.ee.ufl.edu ## **ABSTRACT** Frequency sampling filters (FSF) are of interest to the designers of multirate filter banks due to their intrinsic loworder, complexity, and linear phase behavior. Fast FSFs residing in smaller packages will be required to support future high-bandwidth, mobile image and signal processing applications. Since FSF designs rely on the exact annihilation of selected poles-zeros, a new facilitating technology is required which is fast, compact, and numerically exact. Exact FSF pole-zero annihilation is guaranteed by implementing polynomial filters over an integer ring in the residue arithmetic system (RNS). The design methodology is evaluated as an ASIC. Based on an FPGA technology, at least an 86% complexity reduction can be achieved with even greater advantages gained as a custom VLSI. An RNS-based FSF implementation of an eight channel cochlea filter bank is presented which demonstrates both the performance and packaging advantages of the new FSF paradigm. # 1. INTRODUCTION A classical frequency sampling filter (FSF) consists of a comb filter cascaded with a bank of frequency selective resonators [1, 2]. The resonators independently produce a collection of poles which annihilate the zeros produced by the comb pre-filter. The gains applied to the output of the resonators are chosen so as to approximately profile the magnitude frequency response of a desired filter. For stability reasons, the poles and zeros are generally designed to be slightly interior to the unit circle. An FSF can also be created by cascading all-pole filter sections with all-zero filter (comb) sections as suggested in Figure 1. The delay of the comb-section $1 \pm z^{-D}$ are chosen that its zeros cancel the poles of the all-pole prefilter as shown in Figure 2. It can be observed that wherever there is a complex pole there also exists an annihilating complex zero which results in an all-zero FIR input-output behavior, with the usual linear phase and constant group delay properties. FSF filters of this type are known to provide very efficient multirate interpolation and decimation solutions as well as serve as high-decimation rate filters for RF to baseband conversion of radio signals [3]. Figure 2: Example of pole/zero-compensation for a pole-angle of $60^{\circ}$ and Comb-delay D=6. [4]. ## 2. FREQUENCY SELECTIVE PROPERTIES The poles of the FSF filter developed in this paper will reside on the periphery of the unit circle. This is in contrast with the customary practice of forcing the poles and zeros to reside at interior locations to guard against possible inexact pole-zero cancellation. It will be shown the stability is not an issue if the FSF is implemented using the exact residue number system (RNS). The RNS [5] is an exact arithmetic system which is also known to possess a bandwidth/area ratio which greatly exceeds that obtainable using conventional fixed-point system (e.g., two's complement) in FIR-like applications, especially when complex arithmetic is performed. Arithmetic in the RNS is performed in a modular sense within a set of relatively-prime, independent, small wordlength channels. An example of an This work was supported by the German DFG under grand ME1419/2-1. Table 1: Filters with integer coefficients producing unique angular pole locations up to order six. | $C_k(z)$ | Order | $a_0$ | a <sub>1</sub> | <b>a</b> <sub>2</sub> | a <sub>3</sub> | a4 | <b>a</b> <sub>5</sub> | a <sub>6</sub> | $\psi_1$ | $\psi_2$ | $\psi_3$ | |------------------------|-------|-------|----------------|-----------------------|----------------|----|-----------------------|----------------|----------|----------|----------| | $-C_1(z)$ | 1 | 1 | -1 | | | | | | 08 | | | | $C_2(z)$ | 1 | 1 | 1 | | | | | | 180° | | | | $C_6(z)$ | 2 | 1 | -1 | 1 | | | | | 60° | | | | $C_4(z)$ | 2 | 1 | 0 | 1 | | | | | 90° | | | | $C_3(z)$ | 2 | 1 | 1 | 1 | | | | | 120° | | | | $\overline{C_{12}(z)}$ | 4 | 1 | 0 | -1 | 0 | 1 | | | 30° | 150° | | | $C_{10}(z)$ | 4 | 1 | -1 | 1 | -1 | 1 | | } | 36° | 108° | | | $C_8(z)$ | 4 | 1 | 0 | 0 | 0 | 1 | | ł | 45° | 135° | | | $C_5(z)$ | 4 | 1 | 1 | 1 | 1 | 1 | | | 72° | 144° | | | $C_{16}(z)$ | 6 | 1 | 0 | 0 | -1 | 0 | 0 | 1 | 20.00° | 100.00° | 140.00° | | $C_{14}(z)$ | 6 | 1 | -1 | 1 | -1 | 1 | -1 | 1 | 25.71° | 77.14° | 128.57° | | $C_7(z)$ | 6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 51.42° | 102.86° | 154.29° | | $C_9(z)$ | 6 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 40.00° | 80.00° | 160.00° | Figure 1: Cascading of frequency sampling filter to save a factor of R delays for multirate signal processing [1, Sec. 3.4]. RNS systolic array of multiply-accumulate (MAC) cells is shown in Figure 3. In addition, by allowing the FSF poles and zeros to reside on the unit circle, a multiplier-less FSF can be realized with an attendant reduction in complexity an increase in data bandwidth. In Figure 1, first-order filter sections are used to produce poles the angles 0° and 180° (i.e., $z=\pm 1$ ). Second-order sections with integer coefficients can produce poles at angles 60°, 90°, 120° according to $2\cos(2\pi K/D)=1$ , 0, and -1. For sections of higher order, there are no single frequency selective filters as shown in Table 1. Here the results of complete search are reported for all polynomials up to order six, with integer coefficients and roots on the unit circle which deliver additional (new) angular frequencies. From this list of filters, up to order twenty-four, with integer coefficients and poles residing on the periphery of the unit circle, an efficient and compact FSF can be designed and implemented. ## 3. CYCLOTOMIC POLYNOMIALS The integer polynomials from Table 1 found by computer search are known from number theory as cyclotomic polynomials which are defined by [6, p.158-160]: $$C_k(z) = \prod_{\gcd(r,k)=1} z - W_k^r \tag{1}$$ where $\gcd(r,k)$ is the greatest common divisor of r and k. A useful property of $C_k(z)$ is that the order of the polynomial is equal to the Euler totient function $\phi(k)$ , which is the number of relative primes not exceeding k. Because $\phi(k)$ is an increasing function with a lower bound of $\phi(k) \geq 0.215481 \cdot k^{1.0077} + 1.36$ , for all k > 5, only cyclotomic polynomials up to k = 18 are to be evaluated for polynomials of order six and k = 30 for polynomials of order eight, see Figure 4. The $C_k(z)$ were used to verify the results shown in Table 1. It's interesting to notice [7, p.74] that the coefficients of $C_k(z)$ , up to k = 105, are one of 0,1, or -1 and Figure 3: RNS systolic array chip. [4]. only polynomials with order higher than twenty-four have coefficients greater than 1. FSF in the RNS are therefore multiplier free up to order twenty-four. #### 4. DESIGN EXAMPLE An RNS filter bank, developed for use as a cochlea implant, was designed using a Stanford-Implant eight filter model having logarithmic coverage of the frequency range from 900-8000 Hz [8, 9]. Using the developed design procedure, and a maximal pole positional error criterion of 5.7% (see Table 2) from the Stanford-Implant ideal, a sixth order solu- Table 2: Error for different maximal order of the pole section. | | mean error | Maximum error | |--------------------------------|------------|---------------| | First angle<br>up to order six | 3.6% | 5.7% | | All angle<br>up to order six | 2.8% | 4.7% | | All angle<br>up to order eight | 2.4% | 3.7% | tion was found at a sampling frequency of 16624 Hz. An integer coefficient half-band filter HB6 [10] anti-aliasing filter and third order multiplier-free CIC-filter (a.k.a. Hogenauer filter [11]), was added to the design to suppress unwanted frequency components as shown in Figure 5. The bandwidth of each resonator can be independently tuned by the number of stages and the delays in the comb-section. The number of stages and delay was optimized to meet the lis- Figure 4: Order of $C_k(z)$ , which is $\phi(k)$ . Computation of lower bound by starting with a high k value (1000) and computing $\phi(k)$ down to k=1. Lower bound approximation for k>5 is $\phi(k)\geq 0.215481\cdot k^{1.0077}+1.36$ . tener bandwidth requirements [12]. All frequency selective filters have two stages and delays. Table 3: Bit width of the single filter. | Filter | Stage 1 | Stage 2 | Stage 3 | Stage 4 | |--------|---------|---------|---------|---------| | F20 | 20 | 17 | 15 | 14 | | F25 | 20 | 18 | 15 | 14 | | F36 | 20 | 17 | 15 | 14 | | F51 | 19 | 16 | 15 | 14 | | F72 | 20 | 16 | 15 | 14 | | F90 | 20 | 17 | 15 | 14 | | F120 | 20 | 17 | 15 | 14 | | F180 | 19 | 16 | 15 | 14 | | III | 19 | 18 | 17 | | | D4,D5 | 16 | 16 | 15 | | | HB6 | 22 | | | | The design was prototyped using Xilinx XC4000 FP-GAs with the complexity reported in Table 3 and Table 4. Using high-level design tools, the number of used CLBs is typically 20% more than the theoretical prediction obtained by counting adder, flip-flops, ROMs and RAMs. A custom CMOS RNS design, based on the cells used in the device shown in Figure 3, is being pursued for comparative purposes. ### 5. CONCLUSION A methodology for efficiently implementing FSF filters having poles (up to order twenty-four) and zeros on the unit circle is presented. The implementation was based on the use of the exact RNS arithmetic system which insured pole-zero annihilation was complete. The RNS-based FSF processor was studied in the context of an FPGA design, consisting of 1572 CLBs of a Xilinx XC4000 and a custom CMOS device. A custom VLSI device is also being developed to further reduce the complexity of an FSF system. The resulting sys- Figure 5: Design of an artificial cochlea consisting of a half-band and CIC prefilter and FSF comb-resonator sections. Table 4: Number of used CLBs of Xilinx XC4000 FPGAs (Notation: F20D90 means filter pole-angle $20.00^{\circ}$ delay Comb D=90). Total Practice: 1572 CLBs. Total non-recursive FIR: 11292 CLBs. F25D70 F36D60 F51D49 F20D90 | 122 | 184 | 128 | 164 | | |--------|-----------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--| | 160 | 271 | 190 | 240 | | | 2256 | 1836 | 1924 | 1140 | | | | | | | | | F72D40 | F90D40 | F120D33 | F180D14 | | | 124 | 65 | 86 | 35 | | | 190 | 93 | 120 | 53 | | | 1039 | 1287 | 1260 | 550 | | | l IIDa | 777 | ъ. | To # | | | HR6 | 111 | D4 | D5 | | | 122 | 31 | 24 | 24 | | | | | 33 | 33 | | | | 160<br>2256<br>F72D40<br>124<br>190<br>1039<br>HB6<br>122 | 160 271 2256 1836 F72D40 F90D40 124 65 190 93 1039 1287 HB6 III | 160 271 190 2256 1836 1924 F72D40 F90D40 F120D33 124 65 86 190 93 120 1039 1287 1260 HB6 III D4 122 31 24 | | tem, compared to conventional implementations, are shown to possess a superior bandwidth/complexity (area) metric which is important in applications having severe packaging constraints (i.e., mobile applications). # 6. REFERENCES - [1] U. Meyer-Bäse, Der Einsatz komplexer Algorithmen zur Realisierunguniverseller Abtastempfänger in FPGA-Technik (The Use of Powerful Algorithms in the Realisation of Universal Sampling Receivers in FPGA Technics). Ph.D. thesis, VDI-press, Reihe 10, Nr. 404, 1995. - [2] F. Taylor. Digital Filter Design Handbook. Marcel Dekker, 1983. - [3] Harris Semiconductor, Data Sheet HSP43220 Decimating Digital Filter, Aug. 1992. - [4] J. Mellot, M. Lewis, F. Taylor, and P. Coffield. ASAP A 2D DFT VLSI Processor and Architecture. In ICASSP, page VLSI 3.3, May 1996. - [5] M. Soderstrand, W. Jenkins, G. Jullien, and F. Taylor. Residue Number System Arithmetic: Modern Applications in Digital Signal Processing. IEEE Press Reprint Series. IEEE Press, 1986. - [6] T. Nagell, Introduction to Number Theory. Chelsea, 1964. - [7] J. McClellan and C. Rader, Number Theory in Digital Signal Processing. Prentice Hall, 1979. - [8] S. Dworak. Entwurf und Realisierung einer FPGA-Schaltung für eine neue Klasse von digitalen Frequenzabtastfiltern zur Sprachvorverarbeitung. Master's thesis, TH-Darmstadt, 1996. - O. Six. Entwicklung und Aufbau einer Universalplatine für Xilinx XC-4000 FPGAs. Master's thesis, TH-Darmstadt, 1996. - [10] D. J. Goodman and M. J. Carey. Nine Digital Filters for Decimation and Interpolation. IEEE Transactions ASSP, pages 121-126, April 1977. - [11] E. B. Hogenauer. An Economical Class of Digital Filters for Decimation and Interpolation. IEEE Transactions ASSP, pages 155-162, April 1981. - [12] B. Delgutte. Matlab data for digital filters, personal communication, 1996.