Pipelined and Parallelized FPGA 102-Tap FIR Filter
High-Performance 102-Tap FIR Filter with Pipelined and Parallelized Architectures
This project designs and implements a low-pass FIR filter with 102 taps, targeting FPGA platforms. The filter achieves a transition region of 0.2π–0.23π rad/sample, >80 dB stopband attenuation, and robust noise suppression for a 1 kHz sine wave input. Leveraging MATLAB, Python, and Verilog, the design explores pipelined, parallel (L=2/L=3), and hybrid architectures to optimize throughput, area, and power efficiency.
Objectives
- Design a 102-tap FIR filter with MATLAB, ensuring >80 dB stopband attenuation
- Quantize coefficients to 32-bit fixed-point and manage arithmetic overflow
- Implement four architectures: Traditional, Pipelined, L=2/L=3 Parallel, and Combined Pipelined & L=3
- Analyze hardware metrics (area, frequency, power) across architectures
- Validate noise removal using ModelSim and MATLAB frequency response analysis
Project Process
MATLAB Filter Design & Quantization:
- Generated ideal coefficients using MATLAB’s FIR design tools
- Quantized coefficients to 32-bit signed format with scaling to prevent overflow
- Analyzed frequency response deviations post-quantization (<1 dB ripple in passband)
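The quantization step can be sketched in a few lines of Python. This is a minimal illustration, not the project's script: the function names `quantize_coeffs` and `accumulator_bits` are hypothetical, and the scaling rule (the largest number of fractional bits that keeps the peak coefficient inside the signed 32-bit range) is an assumption about how "scaling to prevent overflow" might be done.

```python
def quantize_coeffs(coeffs, total_bits=32):
    """Quantize float coefficients to signed two's-complement integers.

    Returns (ints, frac_bits). frac_bits is chosen as large as possible
    such that the largest-magnitude coefficient still fits in total_bits.
    """
    max_int = (1 << (total_bits - 1)) - 1
    min_int = -(1 << (total_bits - 1))
    peak = max(abs(c) for c in coeffs)
    frac_bits = total_bits - 1
    while peak * (1 << frac_bits) > max_int:
        frac_bits -= 1
    scale = 1 << frac_bits
    # Round to nearest, then clamp as a safety net against edge cases.
    ints = [min(max_int, max(min_int, round(c * scale))) for c in coeffs]
    return ints, frac_bits


def accumulator_bits(int_coeffs, input_bits=16):
    """Signed accumulator width that can never overflow.

    Worst case: every input sample sits at full-scale magnitude
    2**(input_bits - 1) with the sign that maximizes the sum.
    """
    worst = sum(abs(c) for c in int_coeffs) * (1 << (input_bits - 1))
    return worst.bit_length() + 1


# Toy 3-tap example (not the project's coefficients).
q, frac = quantize_coeffs([0.25, -0.5, 0.25])
acc = accumulator_bits(q, input_bits=16)
```

Sizing the accumulator from the sum of absolute coefficient values is conservative. A real design may use a narrower accumulator if the input statistics are known.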
Architecture Implementation:
- Traditional: Direct-form MAC structure with 102 taps
- Pipelined: Inserted pipeline stages in MAC units to reduce critical path delays
- Parallel L=2/L=3: Split coefficients into sub-filters via Python-based polyphase decomposition
- Combined Pipelined & L=3: Merged pipelining with parallel processing for maximum throughput
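To make the polyphase idea concrete, here is a hedged Python sketch that splits a coefficient list into L sub-filters and checks that an L-parallel block computation reproduces the direct-form output. The function names are illustrative and not taken from the project's repository. The sketch uses the plain polyphase form with L×L sub-filter convolutions, not a reduced-complexity fast FIR algorithm, which the actual design may or may not use.

```python
def fir_direct(h, x):
    """Reference direct-form FIR: y[n] = sum_k h[k] * x[n - k]."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def polyphase_split(h, L):
    """Sub-filter p holds taps h[p], h[p + L], h[p + 2L], ..."""
    return [h[p::L] for p in range(L)]


def fir_parallel(h, x, L):
    """L-parallel block FIR: L output samples per block step."""
    sub = polyphase_split(h, L)
    pad = (-len(x)) % L                      # zero-pad to a whole number of blocks
    xp = list(x) + [0] * pad
    phases = [xp[j::L] for j in range(L)]    # x_j[n] = x[L*n + j]
    # u[p][j] = sub-filter p applied to input phase j
    u = [[fir_direct(sub[p], phases[j]) for j in range(L)] for p in range(L)]
    y = []
    for n in range(len(xp) // L):
        for i in range(L):
            acc = 0
            for p in range(L):
                if p <= i:
                    acc += u[p][i - p][n]            # same block
                elif n >= 1:
                    acc += u[p][i - p + L][n - 1]    # one-block delay
            y.append(acc)
    return y[:len(x)]


# Integer test data so the comparison is exact (not real filter taps).
h_demo = list(range(1, 103))                 # 102 taps
x_demo = [(7 * n) % 11 - 5 for n in range(40)]
assert fir_parallel(h_demo, x_demo, 2) == fir_direct(h_demo, x_demo)
assert fir_parallel(h_demo, x_demo, 3) == fir_direct(h_demo, x_demo)
```

With 102 taps, the split yields two 51-tap sub-filters for L=2 and three 34-tap sub-filters for L=3, so no zero-padding of the coefficient set is needed in either case.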
Hardware Synthesis:
- Synthesized designs using Synopsys Design Compiler (45nm technology)
- Compared area utilization: 811 cells (Pipelined) vs 13,831 cells (Combined L=3)
- Met timing at a 47 kHz clock (period ≈ 21,277 ns) with >21,000 ns of setup slack in every design
- Reported power consumption: 8.22 μW leakage (Pipelined) vs 351.29 μW total (Combined)
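The timing figures can be sanity-checked with simple arithmetic. The sketch below assumes the usual definition of setup slack (clock period minus data arrival time minus setup time) and ignores clock uncertainty and skew; the helper name `setup_budget_ns` is hypothetical.

```python
def setup_budget_ns(clock_hz, slack_ns):
    """Upper bound on (data arrival + setup time) implied by a slack figure."""
    period_ns = 1e9 / clock_hz
    return period_ns - slack_ns


period_ns = 1e9 / 47e3                     # about 21276.6 ns
budget_ns = setup_budget_ns(47e3, 21000)   # about 276.6 ns
implied_fmax_mhz = 1e3 / budget_ns         # about 3.6 MHz
```

Under these assumptions, a slack above 21,000 ns means the critical path plus setup time takes less than roughly 277 ns, so the designs would be expected to close timing well above the 47 kHz constraint.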
Validation & Analysis:
- Verified noise removal using ModelSim with 16-bit noisy sine wave input
- Confirmed stopband attenuation >80 dB post-quantization via MATLAB analysis
- Demonstrated 2.3x throughput improvement in L=3 vs Traditional architecture
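The stopband check was done in MATLAB; an equivalent measurement can be sketched in Python by evaluating the filter's frequency response directly from its coefficients. The function names and the 512-point frequency grid are assumptions made for illustration. The stopband edge defaults to the 0.23π rad/sample figure quoted in the overview.

```python
import cmath
import math


def freq_response_db(h, omega):
    """Magnitude of H(e^{j*omega}) in dB for coefficient list h."""
    H = sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(h))
    return 20 * math.log10(max(abs(H), 1e-300))  # floor avoids log10(0)


def stopband_attenuation_db(h, stop_edge=0.23 * math.pi, points=512):
    """DC gain minus the largest stopband gain, in dB.

    Normalizing to the DC gain means the same function works on
    floating-point and on integer (quantized) coefficients.
    """
    dc_db = freq_response_db(h, 0.0)
    worst_db = max(
        freq_response_db(h, stop_edge + (math.pi - stop_edge) * i / (points - 1))
        for i in range(points)
    )
    return dc_db - worst_db


# Toy 3-tap smoother, far too short to reach 80 dB; it only exercises the code.
toy = [0.25, 0.5, 0.25]
attenuation = stopband_attenuation_db(toy)
```

Running the same function on the ideal and the quantized 102-tap coefficient sets would show how much attenuation, if any, is lost to quantization.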
Conclusion and Future Improvements
The pipelined architecture achieved the best balance between area (811 cells) and leakage power (8.22 μW), while the combined pipelined L=3 design maximized throughput at a much higher resource cost (13,831 cells). Future enhancements could explore higher-order parallelization (L=4 and above), adaptive coefficient tuning, or ASIC implementation for ultra-low-power edge applications. The modular Verilog codebase and automated Python/MATLAB workflows provide a scalable foundation for real-time DSP systems.
Project Information
- Category: Digital Signal Processing / VLSI Design
- Client: Rensselaer Polytechnic Institute
- Project Date: March 14, 2025
- GitHub Repository: View Implementation