Github Website Link for our lab projects: https://github.com/Intelligent-Computing-Lab-Yale 


SPIKING NEURAL NETWORKS

Spiking neural networks (SNNs) have emerged as a promising bio-inspired alternative to conventinal AI due to their huge energy efficiency benefits. However, the development of energy-efficient training and inference on conventional platforms is still immature, which limits its utility in real-world applications. To fill the gap, we are designing new SNN training and inference algorithms and applications grounded with fundamental theories that improve the accuracy, robustness, and efficiency of SNNs. To truly leverage the energy-efficiency benefits of SNNs for large-scale AI applications, we need to build  hardware simulators and accelerators that will enable us to run tradeoff analysis between energy-latency-area-accuracy for different SNN models across various tasks. We are designing end-to-end SNN algorithm-hardware co-design solutions leveraging sparsity, quantization and dynamic computation to build ultra low power spike-powered AI technologies.


EFFICIENT TRANSFORMERS for LLMs & VISION

Our research explores the development and optimization of efficient Transformer architectures for large language models (LLMs) and computer vision tasks. We aim to balance performance and energy efficiency by leveraging quantization, sparsity, and dynamic computation techniques, pushing the boundaries of generative AI models. We follow a top-down algorithm centric approach innovating on novel algorithm design aware of the hardware capabilities and a bottom up hardware centric approach innovating on accelerator design with efficient dataflows and architectures. This project focuses on achieving lower memory, power, and latency solutions using algorithm-hardware co-design philosophy, crucial for the next generation of AI applications.


CROSS-LAYER Co-DESIGN & AUTOMATION

To enable ubiquitous intelligence,our research focuses on developing automated software-hardware co-design and co-search toolflows that maximize efficiency and throughput under limited resource constraints. These tools also accelerate AI deployment cycles by optimizing performance across the entire stack. We validate our techniques on real-world systems, utilizing commercial devices or designing custom digital ASIC/FPGA/compute-in-memory accelerators for seamless system integration. Our end-to-end design flow not only focuses on efficiency but also ensures robustness against hardware non-idealities and software vulnerabilities, critical for reliable on-device intelligence.


DISTRIBUTED LEARNING

Our research tackles the complexities of collaborative machine learning, with a strong focus on optimizing federated learning for distributed edge-cloud environments. By addressing the challenges of training both spiking neural networks (SNNs) and conventional models across decentralized edge devices, we ensure maximum efficiency, robustness, privacy, and security. Through innovative algorithmic advancements and cutting-edge hardware and system-level solutions, we achieve significant breakthroughs in communication efficiency and training convergence, paving the way for scalable and secure distributed learning in real-world applications.