Systems for Artificial Intelligence Lab @ Princeton
SAIL@Princeton is an interdisciplinary research group focused on developing next-generation AI/ML systems that are efficient, accurate, reliable, scalable, and secure. As a collaboration between systems and machine learning researchers, our work considers all parts of the compute stack, from the design of AI/ML algorithms and pipelines, to runtime systems for training and inference, to intelligent applications of ML that tackle core systems challenges.
High-throughput, low-latency, and low-cost inference systems and techniques for ML models (traditional DNNs and modern LLMs).
Efficient, high-throughput, fair, and cost-effective training and fine-tuning systems for ML models (traditional DNNs and modern LLMs).
Designing and optimizing systems that tackle AI tasks by combining multiple interacting components, including retrievers (Retrieval-Augmented Generation), multiple calls to models, or external tools (Agents).
Applying ML (reinforcement learning, Bayesian optimization, LLMs, etc.) to key systems decisions like caching, resource management, anomaly detection, and network protocol design.
Full-stack innovations on State Space Models (e.g., Mamba) and, more generally, models with linear attention/RNN layers.
Video analytics, video conferencing, and video streaming.
Multimodal models, diffusion models, MoEs, inference-time scaling, etc.