Systems for Artificial Intelligence Lab @ Princeton
SAIL@Princeton is an interdisciplinary research group focused on developing next-generation AI/ML systems that are efficient, accurate, reliable, scalable, and secure. As a collaboration between systems and machine learning researchers, our work considers all parts of the compute stack, from the design of AI/ML algorithms and pipelines, to runtime systems for training and inference, to intelligent applications of ML that tackle core systems challenges.
High-throughput, low-latency, and low-cost inference systems and techniques for ML models (traditional DNNs and modern LLMs).
Efficient, high-throughput, fair, and cost-effective training and fine-tuning systems for ML models (traditional DNNs and modern LLMs).
Designing and optimizing systems that tackle AI tasks by combining multiple interacting components, including retrievers (Retrieval-Augmented Generation), multiple calls to models, or external tools (Agents).
Applying ML (reinforcement learning, Bayesian optimization, LLMs, etc.) to key systems decisions like caching, resource management, anomaly detection, and network protocol design.
Full-stack innovations on State Space Models (e.g., Mamba) and, more generally, models with linear attention/RNN layers.
Video analytics, video conferencing, and video streaming.
Multimodal models, diffusion models, MoEs, inference-time scaling, etc.
Tools for exploring the hardware design space, silicon-cost-aware hardware design for ML workloads, and hardware for serving large models.
Building systems and datasets for improving LLM robustness, misuse prevention, and transparency.
Developing model architectures for sequence modeling and optimizing the processing of long context windows.
Scheduling, multi-tenant GPU sharing, and learned cache eviction for ML workloads.
Privacy preservation for ML to prevent the leakage and misuse of sensitive information, and efficient ML models for detecting threats.
Leveraging ML and optimization techniques to build novel designs for data storage layouts, database indexes, and end-to-end data systems.