SAIL@Princeton

Systems for Artificial Intelligence Lab @ Princeton

About Us

SAIL@Princeton is an interdisciplinary research group focused on developing next-generation AI/ML systems that are efficient, accurate, reliable, scalable, and secure. As a collaboration between systems and machine learning researchers, our work considers all parts of the compute stack: the design of AI/ML algorithms and pipelines, runtime systems for training and inference, and applications of ML to core systems challenges.

Directions

Efficient ML Inference

High-throughput, low-latency, and low-cost inference systems and techniques for ML models (traditional DNNs and modern LLMs).

Efficient ML Training

Efficient, high-throughput, fair, and cost-effective training and fine-tuning systems for ML models (traditional DNNs and modern LLMs).

Compound AI Systems

Designing and optimizing systems that tackle AI tasks by combining multiple interacting components, such as retrievers (retrieval-augmented generation), multiple model calls, and external tools (agents).

ML for Systems

Applying ML (reinforcement learning, Bayesian optimization, LLMs, etc.) to key systems decisions such as caching, resource management, anomaly detection, and network protocol design.

Novel Model Architectures

Full-stack innovations on state space models (e.g., Mamba) and, more generally, models with linear attention or RNN layers.

Edge AI Systems

Video analytics, video conferencing, and video streaming.

Emerging Paradigms

Multimodal models, diffusion models, MoEs, inference-time compute, etc.

Hardware Design for ML

Tools for exploring the hardware design space, silicon-cost-aware hardware design for ML workloads, and hardware for serving large models.

LLM Safety

Building systems and datasets for improving LLM robustness, misuse prevention, and transparency.

Sequence Modeling

Developing model architectures for sequence modeling and optimizing the processing of long context windows.

Intelligent Resource Management

Scheduling, multi-tenant GPU sharing, and learned cache eviction for ML workloads.

Security and Privacy

Privacy preservation for ML to prevent the leakage and misuse of sensitive information, and efficient ML models for detecting threats.

Novel ML Applications

New and emerging applications of ML to domains such as IoT, healthcare, and security.

Data Systems

Leveraging ML and optimization techniques to build novel designs for data storage layouts, database indexes, and end-to-end data systems.

© 2025 SAIL@Princeton.