I am a computer science researcher at
Microsoft Research in Redmond and the
Office of the CTO, Azure for Operators.
My research is primarily in networked systems, with a focus on enhancing them with
practical machine learning (ML) algorithms.
Video is a major thrust of my work owing to its synergy with ML.
- Practical ML for networked systems
- Video and ML
Bio: Francis Y. Yan is a Senior Researcher at
Microsoft Research in Redmond and the Office of the CTO, Azure for
Operators. His research is primarily in networked systems, with a
focus on enhancing them with practical machine learning algorithms.
Francis received his Ph.D. in computer science from Stanford
University, where he was advised by Keith Winstein and Philip Levis.
Before that, he completed his undergraduate studies at Tsinghua
University (Yao Class) and MIT. His work has engaged hundreds of
thousands of real users and also found wide use in academia, aiding
researchers in publishing many papers at top-tier
conferences. He is a recipient of an IRTF Applied Networking
Research Prize, a USENIX NSDI Community Award, a USENIX ATC Best
Paper Award, and an APNet Best Paper Award.
Zibo Wang, Pinghe Li, Chieh-Jan Mike Liang, Feng Wu, Francis Y. Yan
To appear in USENIX Symposium on Networked Systems Design and Implementation (NSDI), April 2024
Autothrottle is a bi-level learning-assisted resource management framework
that autoscales CPUs for microservice applications with latency SLOs.
It uses a lightweight learned controller at the application level to
assist agile per-microservice controllers, practically saving CPU
resources without violating SLOs.
Zhiying Xu, Francis Y. Yan, Rachee Singh, Justin T. Chiu, Alexander M. Rush, Minlan Yu
Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM), September 2023
Teal is a deep learning-based traffic engineering (TE) scheme that
accelerates TE optimization on large WANs by several orders of magnitude
while achieving near-optimal traffic allocation.
Michael Rudow, Francis Y. Yan, Abhishek Kumar, Ganesh Ananthanarayanan,
Martin Ellis, K.V. Rashmi
USENIX Symposium on Networked Systems Design and Implementation (NSDI),
April 2023
Tambur is a new approach to forward error correction (FEC) for
videoconferencing built upon streaming codes and machine learning.
Tambur reduces decoding failures while consuming less bandwidth for redundancy.
Zhengxu Xia*, Yajie Zhou*, Francis Y. Yan, Junchen Jiang (*equal contribution)
Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM), August 2022
Genet is a novel training framework for reinforcement learning (RL)
algorithms in networking.
Genet builds on curriculum learning with judicious use of rule-based baselines.
It substantially improves the performance and generalization of simulation-trained
RL algorithms under unseen workloads and in real environments.
Jeongyoon Eo, Zhixiong Niu, Wenxue Cheng, Francis Y. Yan,
Rui Gao, Jorina Kardhashi, Scott Inglis, Michael Revow, Byung-Gon Chun,
Peng Cheng, Yongqiang Xiong
Asia-Pacific Workshop on Networking (APNet), July 2022
OpenNetLab is an open platform for
training, validating, and evaluating RL-based congestion-control
(bandwidth estimation) algorithms for real-time communications (RTC) such as
videoconferencing. It has successfully aided the development of novel RL-based
congestion-control algorithms for RTC during our Grand Challenge hosted at
ACM MMSys '21.
Francis Y. Yan, Hudson Ayers, Chenzhi Zhu, Sadjad Fouladi, James Hong,
Keyi Zhang, Philip Levis, Keith Winstein
USENIX Symposium on Networked Systems Design and Implementation (NSDI),
February 2020
We built Puffer, a free, publicly accessible website that live-streams
television channels and operates as a randomized experiment of adaptive
bitrate (ABR) algorithms.
As of June 2020, Puffer has attracted 120,000 real users and streamed 60 years
of video across the internet. Using Puffer, we developed an ML-based ABR algorithm,
Fugu, that robustly outperformed existing schemes by learning in situ,
on real data from its actual deployment environment.
Francis Y. Yan, Jestin Ma, Greg D. Hill, Deepti Raghavan, Riad S. Wahby,
Philip Levis, Keith Winstein
USENIX Annual Technical Conference (ATC), July 2018
Pantheon is a “training ground” for congestion-control research and has
assisted four schemes from other research groups in publishing at
NSDI 2018 (Copa and Vivace), ICML 2019 (Aurora), and
SIGCOMM 2020 (TCP-TACK).
It also enabled our own ML-based
congestion-control algorithm, Indigo, which was trained to imitate expert
congestion-control algorithms we created in emulation
and achieved good performance over the real internet.