SAIL@Princeton

Systems for Artificial Intelligence Lab @ Princeton

About Us

SAIL@Princeton is an interdisciplinary research group focused on developing next-generation AI/ML systems that are efficient, accurate, reliable, scalable, and secure. As a collaboration between systems and machine learning researchers, we consider all parts of the compute stack, from the design of AI/ML algorithms and pipelines, to runtime systems for training and inference, to applications of ML that tackle core systems challenges.

Directions

Efficient ML Inference

High-throughput, low-latency, and low-cost inference systems and techniques for ML models (traditional DNNs and modern LLMs).

Efficient ML Training

Efficient, high-throughput, fair, and cost-effective training and fine-tuning systems for ML models (traditional DNNs and modern LLMs).

Compound AI Systems

Designing and optimizing systems that tackle AI tasks by combining multiple interacting components, such as retrievers (Retrieval-Augmented Generation), multiple model calls, and external tools (Agents).

ML for Systems

Applying ML (reinforcement learning, Bayesian optimization, LLMs, etc.) to key systems decisions such as caching, resource management, anomaly detection, and network protocol design.

Novel Model Architectures

Full-stack innovations on state space models (e.g., Mamba) and, more generally, models with linear attention/RNN layers.

Edge AI Systems

Video analytics, video conferencing, and video streaming.

Emerging Paradigms

Multimodal models, diffusion models, MoEs, inference-time scaling, etc.

© 2025 SAIL@Princeton. Powered by Bootstrap.