GitHub link for our lab projects: https://github.com/Intelligent-Computing-Lab-Panda
SPIKING NEURAL NETWORKS
- Algorithms: [CVPR’25], [DATE’24], [NeurIPS’23], [Frontiers’23], [ECCV’22], [ECCV’22-ORAL PRESENTATION], [Frontiers’21]
- Theory: [AAAI’23], [Neural Networks’21], [Nature SREP’21]
- Applications
- Wearable Diagnostics: [NeurIPS’22], [Frontiers’23]
- Vision (Videos): [arXiv’24], [ECCV’22]
- Hardware
- Digital ASIC Accelerators: [MICRO’24], [IEEE TETCI’24], [ASP-DAC’24 BEST PAPER NOMINATION], [SATA - IEEE TCAD’22]
- Compute-In-Memory Memristive Accelerators: [ISLPED’25], [SpikeSim - IEEE TCAD’23 BEST PAPER AWARD], [ISLPED’22 BEST PAPER AWARD]
- Presentations and Talks: [SNUFA’22 Seminar], [ESWEEK’20 Tutorial], [Rutgers’21 Seminar]
EFFICIENT AI: LLMs & VLMs
Our research explores the development and optimization of efficient Transformer and Mamba architectures for large language models (LLMs), vision-language models (VLMs), and computer vision tasks. We aim to balance performance and energy efficiency by leveraging quantization, sparsity, and dynamic computation, pushing the boundaries of generative AI models. We follow both a top-down, algorithm-centric approach, designing novel algorithms that are aware of hardware capabilities, and a bottom-up, hardware-centric approach, designing accelerators with efficient dataflows and architectures. This project targets lower memory, power, and latency through an algorithm-hardware co-design philosophy, which is crucial for the next generation of AI applications.
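As a concrete illustration of one of these techniques, the minimal sketch below shows symmetric, per-tensor INT8 post-training weight quantization of a single linear layer in PyTorch. It is for intuition only and is not code from the papers listed below; the 8-bit width, per-tensor scaling, and function names are assumptions made for the example.
```python
import torch
import torch.nn as nn
from typing import Optional

@torch.no_grad()
def quantize_weight_int8(w: torch.Tensor):
    # Single symmetric scale for the whole tensor (an assumption for brevity).
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

@torch.no_grad()
def int8_linear(x: torch.Tensor, q: torch.Tensor, scale: torch.Tensor,
                bias: Optional[torch.Tensor] = None) -> torch.Tensor:
    # Reference dequantize-then-matmul; a real accelerator keeps INT8 arithmetic.
    y = x @ (q.float() * scale).t()
    return y if bias is None else y + bias

# Usage: quantize one toy layer and check the output error against FP32.
layer = nn.Linear(64, 32)
x = torch.randn(8, 64)
q, scale = quantize_weight_int8(layer.weight)
err = (layer(x) - int8_linear(x, q, scale, layer.bias)).abs().max()
print(f"max abs error after INT8 weight quantization: {err.item():.5f}")
```
Per-channel scales, activation quantization, and sparsity-aware kernels build on this same idea while keeping the arithmetic in low precision on the accelerator.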
- Algorithms: [ICML’25], [arXiv’25], [Neural Networks’23], [ECCV’24], [ECCV’24]
- Hardware
- FPGA: [MLSys’25], [DAC’24]
- Digital ASIC Accelerator: [DAC’25], [IEEE TVLSI’24]
- Compute-in-Memory Memristive Accelerator: [IEEE TETC’24], [IEEE TCAD’24]
CROSS-LAYER CO-DESIGN & AUTOMATION
To enable ubiquitous intelligence, our research focuses on developing automated software-hardware co-design and co-search toolflows that maximize efficiency and throughput under tight resource constraints. These tools also shorten AI deployment cycles by optimizing performance across the entire stack. We validate our techniques on real-world systems, using commercial devices or designing custom digital ASIC, FPGA, and compute-in-memory accelerators for seamless system integration. Our end-to-end design flow not only targets efficiency but also ensures robustness against hardware non-idealities and software vulnerabilities, which is critical for reliable on-device intelligence.
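The toy sketch below illustrates what a co-search toolflow iterates over: a joint space of software knobs (bit-width, sparsity) and hardware knobs (processing elements, on-chip memory), pruned by latency and memory budgets and ranked by an accuracy proxy. The configuration spaces, cost model, and proxy are hypothetical placeholders, not our released toolflows.
```python
from itertools import product

# Hypothetical joint search space: software knobs x hardware knobs.
MODEL_SPACE = [{"bits": b, "sparsity": s} for b in (4, 8) for s in (0.0, 0.5)]
HW_SPACE = [{"pes": p, "sram_kb": m} for p in (64, 256) for m in (256, 1024)]

def accuracy_proxy(model):
    # Assumed proxy: lower precision and higher sparsity cost some accuracy.
    return 1.0 - 0.02 * (8 - model["bits"]) - 0.30 * model["sparsity"]

def latency_ms(model, hw):
    # Assumed analytical cost model: effective work over compute throughput.
    work = 1e6 * (model["bits"] / 8) * (1.0 - model["sparsity"])
    return work / (hw["pes"] * 1e3)

def fits_on_chip(model, hw):
    # Assumed memory model: compressed weights must fit in on-chip SRAM.
    footprint_kb = 512 * (model["bits"] / 8) * (1.0 - model["sparsity"])
    return footprint_kb <= hw["sram_kb"]

def co_search(latency_budget_ms=3.0):
    # Exhaustive co-search: keep the most accurate feasible (model, hw) pair.
    best = None
    for model, hw in product(MODEL_SPACE, HW_SPACE):
        if not fits_on_chip(model, hw) or latency_ms(model, hw) > latency_budget_ms:
            continue
        score = accuracy_proxy(model)
        if best is None or score > best[0]:
            best = (score, model, hw)
    return best

print(co_search())
```
Real toolflows replace the exhaustive loop with guided search and the analytical models with calibrated simulators or measurements, but the structure of the optimization is the same.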
- Compute-in-Memory Toolflows: [IEEE TETC’24], [IEEE TCAD’24], [DAC’23], [DAC’23]
- Digital ASIC Toolflows: [DAC’25], [IEEE TETCI’24]
- FPGA Toolflows: [MLSys’25], [DAC’24]
- Presentations and Talks: [PNNL’23 Seminar]
DISTRIBUTED LEARNING
Our research tackles the complexities of collaborative machine learning, with a strong focus on optimizing federated learning for distributed edge-cloud environments. By addressing the challenges of training both spiking neural networks (SNNs) and conventional models across decentralized edge devices, we target efficiency, robustness, privacy, and security. Through algorithmic advances combined with hardware- and system-level solutions, we improve communication efficiency and training convergence, paving the way for scalable and secure distributed learning in real-world applications.
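For intuition, the sketch below shows the standard FedAvg aggregation step, in which a server forms a weighted average of client model parameters by local dataset size. It is a generic illustration of federated averaging, not the specific method of any paper listed below.
```python
import torch
import torch.nn as nn

def fedavg(client_states, client_sizes):
    # Weighted average of client state_dicts, weighted by local dataset size.
    total = float(sum(client_sizes))
    return {
        key: sum(state[key].float() * (n / total)
                 for state, n in zip(client_states, client_sizes))
        for key in client_states[0]
    }

# Usage: aggregate two toy clients whose datasets hold 200 and 100 samples.
clients = [nn.Linear(4, 2).state_dict(), nn.Linear(4, 2).state_dict()]
global_state = fedavg(clients, client_sizes=[200, 100])
print({name: tuple(p.shape) for name, p in global_state.items()})
```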
- Algorithms
- Spiking Neural Networks: [AAAI’22], [IEEE Signal Processing’21]
- Transformers/DNNs: [TMLR’25], [Nature Machine Intelligence’24], [Neural Networks’23]
- Hardware
- FPGA: [IEEE TETCI’24]
- Compute-in-Memory: [DATE’24]