Pipelined and Parallelized FPGA 102-Tap FIR Filter

High-Performance 102-Tap FIR Filter with Pipelined and Parallelized Architectures

This project designs and implements a low-pass FIR filter with 102 taps, targeting FPGA platforms. The filter achieves a transition region of 0.2π–0.23π rad/sample, >80 dB stopband attenuation, and robust noise suppression for a 1 kHz sine wave input. Leveraging MATLAB, Python, and Verilog, the design explores pipelined, parallel (L=2/L=3), and hybrid architectures to optimize throughput, area, and power efficiency.

Objectives

  • Design a 102-tap FIR filter with MATLAB, ensuring >80 dB stopband attenuation
  • Quantize coefficients to 32-bit fixed-point and manage arithmetic overflow
  • Implement four architectures: Traditional, Pipelined, L=2/L=3 Parallel, and Combined Pipelined & L=3
  • Analyze hardware metrics (area, frequency, power) across architectures
  • Validate noise removal using ModelSim and MATLAB frequency response analysis
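
The quantization and overflow objectives above can be illustrated with a minimal Python sketch. It is not the project's MATLAB flow. The Hamming-windowed sinc design, the 0.215π cutoff (midpoint of the stated transition band), the Q1.31 scaling, and all function names are assumptions chosen for illustration. A placeholder design like this is not expected to meet the 80 dB stopband target. The 16-bit input width matches the input used in the ModelSim validation step.

```python
import math

NUM_TAPS = 102
CUTOFF = 0.215 * math.pi   # assumed: midpoint of the 0.2*pi to 0.23*pi transition band
FRAC_BITS = 31             # assumed Q1.31 scaling for the 32-bit coefficients


def design_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc low-pass (placeholder design); cutoff in rad/sample."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = cutoff / math.pi if x == 0 else math.sin(cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def quantize(taps, total_bits=32, frac_bits=FRAC_BITS):
    """Round to signed fixed-point integers, saturating at the word limits."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(t * (1 << frac_bits)))) for t in taps]


def accumulator_bits(q_taps, input_bits=16):
    """Signed width that holds the worst-case sum of products without overflow."""
    worst = (1 << (input_bits - 1)) * sum(abs(c) for c in q_taps)
    return worst.bit_length() + 1


taps = design_lowpass(NUM_TAPS, CUTOFF)
q_taps = quantize(taps)

# Largest difference between a real-valued tap and its fixed-point value.
max_err = max(abs(q / (1 << FRAC_BITS) - t) for q, t in zip(q_taps, taps))
print("max coefficient error:", max_err)
print("accumulator bits needed:", accumulator_bits(q_taps))
```

Rounding to nearest bounds the per-coefficient error by half an LSB (2^-32 at this scaling). Sizing the accumulator from the sum of absolute coefficient values is a conservative bound, so a narrower accumulator may be enough for a known input signal.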

Project Process

  1. MATLAB Filter Design & Quantization:
    • Generated ideal coefficients using MATLAB’s FIR design tools
    • Quantized coefficients to 32-bit signed format with scaling to prevent overflow
    • Analyzed frequency response deviations post-quantization (<1 dB ripple in passband)
  2. Architecture Implementation:
    • Traditional: Direct-form MAC structure with 102 taps
    • Pipelined: Inserted pipeline stages in MAC units to reduce critical path delays
    • Parallel L=2/L=3: Split coefficients into sub-filters via Python-based polyphase decomposition
    • Combined Pipelined & L=3: Merged pipelining with parallel processing for maximum throughput
  3. Hardware Synthesis:
    • Synthesized designs using Synopsys Design Compiler (45 nm technology)
    • Compared area utilization: 811 cells (Pipelined) vs 13,831 cells (Combined L=3)
    • Met timing at a 47 kHz clock (period ≈ 21,277 ns) with >21,000 ns setup slack across all designs
    • Compared power consumption: 8.22 μW leakage (Pipelined) vs 351.29 μW total (Combined)
  4. Validation & Analysis:
    • Verified noise removal using ModelSim with 16-bit noisy sine wave input
    • Confirmed stopband attenuation >80 dB post-quantization via MATLAB analysis
    • Demonstrated 2.3x throughput improvement in L=3 vs Traditional architecture
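
The polyphase decomposition behind the L=2 and L=3 parallel architectures can be sketched in Python as follows. This is a minimal illustration, not the repository's actual script: the function names and the integer test data are hypothetical. It uses the straightforward form with L² sub-filter operations per block, and the source does not say whether the hardware uses a reduced-complexity variant instead. Integer data keeps the comparison with the direct-form reference exact.

```python
def fir_direct(h, x):
    """Reference direct-form FIR: y[n] = sum_k h[k] * x[n-k], zero initial state."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def polyphase_split(h, L):
    """Sub-filter k takes every L-th coefficient starting at index k."""
    return [h[k::L] for k in range(L)]


def fir_parallel(h, x, L):
    """L-parallel FIR: each block consumes L samples and produces L outputs."""
    assert len(x) % L == 0, "pad the input to a multiple of L"
    subs = polyphase_split(h, L)
    phases = [x[p::L] for p in range(L)]          # phases[p][n] = x[L*n + p]
    # u[k][p] is sub-filter k applied to input phase p.
    u = [[fir_direct(subs[k], phases[p]) for p in range(L)] for k in range(L)]
    y = [0] * len(x)
    for n in range(len(x) // L):
        for i in range(L):
            acc = 0
            for k in range(L):
                if k <= i:
                    acc += u[k][i - k][n]
                elif n >= 1:
                    # This term comes from the previous block (one block delay).
                    acc += u[k][i - k + L][n - 1]
            y[L * n + i] = acc
    return y


# Hypothetical integer coefficients and input, used only to check equivalence.
h = [((7 * k) % 23) - 11 for k in range(102)]
x = [((5 * n) % 17) - 8 for n in range(240)]

y_ref = fir_direct(h, x)
for L in (2, 3):
    sizes = [len(s) for s in polyphase_split(h, L)]
    print("L =", L, "sub-filter lengths:", sizes,
          "matches direct form:", fir_parallel(h, x, L) == y_ref)
```

With 102 taps, the split gives two sub-filters of 51 coefficients for L=2 and three of 34 for L=3, so each sub-filter has a shorter accumulation chain than the full direct-form filter.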

Conclusion and Future Improvements

The pipelined architecture achieved the best balance between area (811 cells) and leakage power (8.22 μW), while the combined L=3 design maximized throughput at a higher resource cost. Future enhancements could explore higher-order parallelization (L=4+), adaptive coefficient tuning, or ASIC implementation for ultra-low-power edge applications. The modular Verilog codebase and automated Python/MATLAB workflows provide a scalable foundation for real-time DSP systems.

Project Information

  • Category: Digital Signal Processing / VLSI Design
  • Client: Rensselaer Polytechnic Institute
  • Project Date: March 14, 2025
  • GitHub Repository: View Implementation