GitHub link for our lab projects: https://github.com/Intelligent-Computing-Lab-Yale
SPIKING NEURAL NETWORKS
Spiking neural networks (SNNs) have emerged as a promising bio-inspired alternative to conventional AI due to their substantial energy-efficiency benefits. However, energy-efficient training and inference on conventional platforms remain immature, which limits the utility of SNNs in real-world applications. To fill this gap, we are designing new SNN training and inference algorithms and applications, grounded in fundamental theory, that improve the accuracy, robustness, and efficiency of SNNs. To truly leverage the energy-efficiency benefits of SNNs for large-scale AI applications, we need hardware simulators and accelerators that enable energy-latency-area-accuracy tradeoff analyses for different SNN models across a variety of tasks. We are designing end-to-end SNN algorithm-hardware co-design solutions that leverage sparsity, quantization, and dynamic computation to build ultra-low-power, spike-powered AI technologies (a minimal code sketch of spiking computation follows the publication list below).
- Algorithms: [DATE’24], [NeurIPS’23], [Frontiers’23], [ECCV’22], [BNTT - Frontiers’21]
- Theory: [AAAI’23], [Neural Networks’21], [Nature SREP’21]
- Applications
- Wearable Diagnostics: [NeurIPS’22], [Frontiers’23]
- Vision (Videos): [arXiv’24], [ECCV’22]
- Hardware
- Digital ASIC Accelerators: [MICRO’24], [IEEE TETCI’24], [ASP-DAC’24 BEST PAPER NOMINATION], [SATA - IEEE TCAD’22]
- Compute-In-Memory Memristive Accelerators: [SpikeSim - IEEE TCAD’23], [ISLPED’22 BEST PAPER AWARD]
- Presentations and Talks: [SNUFA’22 Seminar], [ESWEEK’20 Tutorial], [Rutgers’21 Seminar]
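As a concrete illustration of the event-driven, sparse computation that makes SNNs attractive, the sketch below implements a generic leaky integrate-and-fire (LIF) layer in PyTorch. The leak, threshold, and soft-reset choices are illustrative assumptions rather than the formulations used in the papers above; training such a layer in practice additionally requires a surrogate gradient through the non-differentiable threshold.

```python
# Minimal leaky integrate-and-fire (LIF) layer sketch in PyTorch.
# The leak/threshold values and the soft-reset rule are illustrative choices,
# not the specific formulations used in the publications above.
import torch
import torch.nn as nn

class LIFLayer(nn.Module):
    def __init__(self, in_features, out_features, leak=0.9, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.leak = leak          # membrane leak factor per timestep
        self.threshold = threshold

    def forward(self, x_seq):
        # x_seq: [timesteps, batch, in_features] of binary spikes
        mem = torch.zeros(x_seq.shape[1], self.fc.out_features, device=x_seq.device)
        spikes = []
        for x_t in x_seq:
            mem = self.leak * mem + self.fc(x_t)   # integrate input current
            spk = (mem >= self.threshold).float()  # fire when threshold is crossed
            mem = mem - spk * self.threshold       # soft reset after a spike
            spikes.append(spk)
        return torch.stack(spikes)                 # sparse, binary activations

# Usage: 8 timesteps, batch of 4, 128 spiking inputs -> 64 spiking outputs
out = LIFLayer(128, 64)(torch.bernoulli(torch.rand(8, 4, 128)))
```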
EFFICIENT TRANSFORMERS for LLMs & VISION
Our research explores the development and optimization of efficient Transformer architectures for large language models (LLMs) and computer vision tasks. We aim to balance performance and energy efficiency by leveraging quantization, sparsity, and dynamic computation, pushing the boundaries of generative AI models. We follow both a top-down, algorithm-centric approach, designing novel algorithms that are aware of hardware capabilities, and a bottom-up, hardware-centric approach, designing accelerators with efficient dataflows and architectures. This project focuses on achieving lower memory, power, and latency through an algorithm-hardware co-design philosophy, which is crucial for the next generation of AI applications (a small quantization sketch follows the publication list below).
- Algorithms: [Neural Networks’23], [ECCV’24], [ECCV’24]
- Hardware
- FPGA: [DAC’24]
- Digital ASIC Accelerator: [IEEE TVLSI’24]
- Compute-in-Memory Memristive Accelerator: [IEEE TETC’24], [IEEE TCAD’24]
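As a small, self-contained example of one of the techniques mentioned above, the sketch below applies symmetric, per-tensor int8 post-training quantization to a weight matrix. The bit-width, granularity, and rounding rule are illustrative assumptions and do not reproduce the specific quantization or sparsity methods from the publications listed here.

```python
# Minimal sketch of symmetric per-tensor int8 weight quantization.
# Bit-width and rounding scheme are illustrative, not the papers' methods.
import torch

def quantize_int8(w: torch.Tensor):
    scale = w.abs().max() / 127.0                        # map largest weight to +/-127
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.float() * scale                             # approximate reconstruction

w = torch.randn(4096, 4096)                              # e.g. an LLM projection matrix
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print("mean abs quantization error:", (w - w_hat).abs().mean().item())
```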
CROSS-LAYER CO-DESIGN & AUTOMATION
To enable ubiquitous intelligence, our research focuses on developing automated software-hardware co-design and co-search toolflows that maximize efficiency and throughput under limited resource constraints. These tools also shorten AI deployment cycles by optimizing performance across the entire stack. We validate our techniques on real-world systems, using commercial devices or designing custom digital ASIC/FPGA/compute-in-memory accelerators for seamless system integration. Our end-to-end design flow targets not only efficiency but also robustness against hardware non-idealities and software vulnerabilities, which is critical for reliable on-device intelligence (a toy co-search loop is sketched after the publication list below).
- Compute-in-Memory Toolflows: [IEEE TETC’24], [IEEE TCAD’24], [DAC’23], [DAC’23]
- Digital ASIC Toolflows: [IEEE TETCI’24]
- FPGA Toolflows: [DAC’24]
- Presentations and Talks: [PNNL’23 Seminar]
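The toy loop below sketches the basic shape of a co-search: enumerate candidate (model, hardware) configurations, discard those that violate a resource budget, and score the rest on accuracy and an analytical energy proxy. The design space, cost model, and accuracy stub are hypothetical placeholders, not the toolflows described in the publications above.

```python
# Hypothetical co-search sketch: score (model, hardware) pairs and keep the
# best candidate under an area budget. All numbers here are toy placeholders.
from itertools import product

model_space = [{"bits": b, "sparsity": s} for b in (4, 8) for s in (0.0, 0.5, 0.9)]
hw_space = [{"pe_rows": r, "pe_cols": c} for r, c in ((8, 8), (16, 16), (32, 32))]

def evaluate_accuracy(model_cfg):        # placeholder: plug in real evaluation
    return 0.9 - 0.02 * (8 - model_cfg["bits"]) - 0.05 * model_cfg["sparsity"]

def estimate_cost(model_cfg, hw_cfg):    # toy analytical energy/area proxies
    macs = hw_cfg["pe_rows"] * hw_cfg["pe_cols"]
    energy = macs * model_cfg["bits"] * (1.0 - model_cfg["sparsity"])
    area = macs * 1.0
    return energy, area

AREA_BUDGET = 600.0
best = None
for m_cfg, h_cfg in product(model_space, hw_space):
    energy, area = estimate_cost(m_cfg, h_cfg)
    if area > AREA_BUDGET:
        continue                         # violates the resource constraint
    score = evaluate_accuracy(m_cfg) - 1e-4 * energy   # weighted accuracy/energy objective
    if best is None or score > best[0]:
        best = (score, m_cfg, h_cfg)

print("best configuration:", best)
```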
DISTRIBUTED LEARNING
Our research tackles the complexities of collaborative machine learning, with a strong focus on optimizing federated learning for distributed edge-cloud environments. By addressing the challenges of training both spiking neural networks (SNNs) and conventional models across decentralized edge devices, we target efficiency, robustness, privacy, and security. Through algorithmic advances combined with hardware- and system-level solutions, we improve communication efficiency and training convergence, paving the way for scalable and secure distributed learning in real-world applications (a minimal federated-averaging sketch follows the publication list below).
- Algorithms
- Spiking Neural Networks: [AAAI’22], [IEEE Signal Processing’21]
- Deep Neural Networks: [Nature Machine Intelligence’24], [Neural Networks’23]
- Hardware
- FPGA: [IEEE TETCI’24]
- Compute-in-Memory: [DATE’24]
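The sketch below shows plain federated averaging (FedAvg), the standard baseline aggregation scheme in federated learning: each simulated client takes a few local SGD steps, and the server averages the resulting weights. The toy model, random client data, and hyperparameters are placeholders and do not reflect the SNN- or DNN-specific federated setups in the papers above.

```python
# Minimal FedAvg sketch in PyTorch. Model, data, and hyperparameters are toy
# placeholders, not the federated setups from the publications above.
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, targets, lr=0.1, steps=5):
    model = copy.deepcopy(global_model)                 # client starts from global weights
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(data), targets)
        loss.backward()
        opt.step()
    return model.state_dict()

def fed_avg(client_states):
    avg = copy.deepcopy(client_states[0])
    for key in avg:                                     # element-wise average of weights
        avg[key] = torch.stack([s[key] for s in client_states]).mean(dim=0)
    return avg

global_model = nn.Linear(10, 1)
for _ in range(3):                                      # a few communication rounds
    states = [local_update(global_model,
                           torch.randn(32, 10), torch.randn(32, 1))
              for _ in range(4)]                        # 4 simulated clients
    global_model.load_state_dict(fed_avg(states))
```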