Research Scientist
I am a Research Scientist at Google, New York. My research interests lie broadly in machine learning, algorithmic statistics, and information theory. Recently, I have focused on theoretical and practical aspects of foundation models, including LLM efficiency and responsible AI. I am also interested in the tradeoffs between different resources in modern machine learning systems, including samples, privacy, communication, memory, and computation. I obtained my Ph.D. in Electrical and Computer Engineering from Cornell University, where I was extremely fortunate to be advised by Prof. Jayadev Acharya. Before that, I spent four wonderful years at Tsinghua University, where I received a Bachelor of Science degree in Electronic Engineering.
Authorship order: [C]: Contribution-based; [A]: Alphabetical.
InfAlign: Inference-aware language model alignment
[C] Ananth Balashankar*, Ziteng Sun*, Jonathan Berant, Jacob Eisenstein, Michael Collins, Adrian Hutter, Jong Lee, Chirag Nagpal, Flavien Prost, Aradhana Sinha, Ananda Theertha Suresh, Ahmad Beirami
arXiv:2412.19792
*Equal contribution
Block Verification Accelerates Speculative Decoding
[C] Ziteng Sun, Uri Mendlovic, Yaniv Leviathan, Asaf Aharoni, Ahmad Beirami, Jae Hun Ro, Ananda Theertha Suresh
arXiv:2403.10444, to appear at ICLR 2025 (acceptance rate: 32%)
Asymptotics of language model alignment
[C] Joy Qiping Yang, Salman Salamatian, Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami
arXiv:2404.01730, ISIT 2024
The importance of feature preprocessing for differentially private linear optimization
[C] Ziteng Sun, Ananda Theertha Suresh, Aditya Krishna Menon
arXiv:2307.11106, ICLR 2024 (acceptance rate: 31%)
SpecTr: Fast Speculative Decoding via Optimal Transport
[C] Ziteng Sun*, Ananda Theertha Suresh*, Jae Hun Ro, Ahmad Beirami, Himanshu Jain, Felix Yu
arXiv:2310.15141, NeurIPS 2023 (acceptance rate: 26.1%)
*Equal contribution
Unified Lower Bounds for Interactive High-dimensional Estimation under Information Constraints
[A] Jayadev Acharya, Clément L. Canonne, Ziteng Sun, Himanshu Tyagi
arXiv:2010.06562, NeurIPS 2023 (acceptance rate: 26.1%)
Subset-Based Instance Optimality in Private Estimation
[A] Travis Dick, Alex Kulesza, Ziteng Sun, Ananda Theertha Suresh
arXiv:2303.01262, ICML 2023 (acceptance rate: 27.8%)
User-level Private Stochastic Convex Optimization with Optimal Rates
[A] Raef Bassily, Ziteng Sun
Conference version, ICML 2023 (acceptance rate: 27.8%)
Federated Heavy Hitter Recovery under Linear Sketching
[A] Adria Gascon, Peter Kairouz, Ziteng Sun, Ananda Theertha Suresh
arXiv:2307.13347, ICML 2023 (acceptance rate: 27.8%)
Concentration Bounds for Discrete Distribution Estimation in KL Divergence
[A] Clément L. Canonne, Ziteng Sun, Ananda Theertha Suresh
arXiv:2302.06869, ISIT 2023
Sample Complexity of Distinguishing Cause from Effect
[A] Jayadev Acharya, Sourbh Bhadane, Arnab Bhattacharyya, Saravanan Kandasamy, Ziteng Sun
Conference version, AISTATS 2023 (acceptance rate: 29%)
Discrete Distribution Estimation under User-level Local Differential Privacy
[A] Jayadev Acharya, Yuhan Liu, Ziteng Sun
arXiv:2211.03757, AISTATS 2023 (acceptance rate: 29%)
The Role of Interactivity in Structured Estimation
[A] Jayadev Acharya, Clément L. Canonne, Ziteng Sun, Himanshu Tyagi
arXiv:2203.06870, COLT 2022
Correlated quantization for distributed mean estimation and optimization
[C] Ananda Theertha Suresh, Ziteng Sun, Jae Hun Ro, Felix Yu
arXiv:2203.04925, ICML 2022 (acceptance rate: 19.8%)
Distributed estimation with multiple samples per user: sharp rates and phase transition
[A] Jayadev Acharya, Clément L. Canonne, Yuhan Liu, Ziteng Sun, Himanshu Tyagi
Conference version, NeurIPS 2021 (acceptance rate: 26%)
Learning with User-Level Privacy
[C] Daniel Levy*, Ziteng Sun*, Kareem Amin, Satyen Kale, Alex Kulesza, Mehryar Mohri, Ananda Theertha Suresh
arXiv:2102.11845, NeurIPS 2021 (acceptance rate: 26%)
*Equal contribution
Robust Testing and Estimation under Manipulation Attacks
[A] Jayadev Acharya, Ziteng Sun, Huanyu Zhang
arXiv:2104.10740, ICML 2021 (acceptance rate: 21.5%)
Interactive Inference under Information Constraints
[A] Jayadev Acharya, Clément L. Canonne, Yuhan Liu, Ziteng Sun, Himanshu Tyagi
Journal version at IEEE Transactions on Information Theory
arXiv:2007.10976. Preliminary version appeared at ISIT 2021
Estimating Sparse Discrete Distributions Under Local Privacy and Communication Constraints
[A] Jayadev Acharya, Peter Kairouz, Yuhan Liu, Ziteng Sun
arXiv:2011.00083, ALT 2021 (acceptance rate: 29.3%)
Differentially Private Assouad, Fano, and Le Cam
[A] Jayadev Acharya, Ziteng Sun, Huanyu Zhang
arXiv:2004.06830, ALT 2021 (acceptance rate: 29.3%)
Poster presentation at Theory and Practice of Differential Privacy (TPDP 2020)
Inference under information constraints III: Local privacy constraints
[A] Jayadev Acharya, Clément L. Canonne, Cody Freitag, Ziteng Sun, Himanshu Tyagi
arXiv:2101.07981
Journal version at IEEE Journal on Selected Areas in Information Theory
Context-Aware Local Differential Privacy
[A] Jayadev Acharya, K. A. Bonawitz, Peter Kairouz, Daniel Ramage, Ziteng Sun
arXiv:1911.00038, ICML 2020 (acceptance rate: 21.8%)
Poster presentation at Theory and Practice of Differential Privacy (TPDP 2019)
Domain Compression and its Application to Randomness-Optimal Distributed Goodness-of-Fit
[A] Jayadev Acharya, Clément L. Canonne, Yanjun Han, Ziteng Sun, Himanshu Tyagi
arXiv:1907.08743, COLT 2020 (acceptance rate: 30.7%)
Social Distancing Has Merely Stabilized COVID-19 in the US
[C] Aaron B. Wagner, Elaine L. Hill, Sean E. Ryan, Ziteng Sun, Grace Deng, Sourbh Bhadane, Victor Hernandez Martinez, Peter Wu, Dongmei Li, Ajay Anand, Jayadev Acharya, David S. Matteson
medRxiv, Journal version at Stat
Advances and Open Problems in Federated Learning
[C] with Peter Kairouz, H. Brendan McMahan et al.
arXiv:1912.04977
Journal version at Foundations and Trends in Machine Learning
Can You Really Backdoor Federated Learning?
[C] Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, H. Brendan McMahan
arXiv:1911.07963, preprint
Poster presentation at Workshop on Federated Learning for Data Privacy and Confidentiality
Estimating Entropy of Distributions in Constant Space
[A] Jayadev Acharya, Sourbh Bhadane, Piotr Indyk, Ziteng Sun
arXiv:1911.07976, NeurIPS 2019 (acceptance rate: 21.2%)
Communication Complexity in Locally Private Distribution Estimation and Heavy Hitters
[A] Jayadev Acharya, Ziteng Sun
arXiv:1905.11888, ICML 2019 (acceptance rate: 22.6%)
Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication
[A] Jayadev Acharya, Ziteng Sun, Huanyu Zhang
arXiv:1802.04705, AISTATS 2019 (acceptance rate: 32.4%)
Talk presentation at Privacy in Machine Learning and Artificial Intelligence (PiMLAI 2018, 15.4% of all submissions)
Code available here
Differentially Private Testing of Identity and Closeness of Discrete Distributions
[A] Jayadev Acharya, Ziteng Sun, Huanyu Zhang
arXiv:1707.05128, spotlight presentation at NeurIPS 2018 (4% of all submissions)
INSPECTRE: Privately Estimating the Unseen
[A] Jayadev Acharya, Gautam Kamath, Ziteng Sun, Huanyu Zhang
arXiv:1803.00008, talk presentation at ICML 2018 (acceptance rate: 25.1%)
Journal version at Journal of Privacy and Confidentiality 10 (2).
Poster presentation at Theory and Practice of Differential Privacy (TPDP 2018)
Code available here
Improved Bounds on Minimax Risk of Estimating Missing Mass
[A] Jayadev Acharya, Yelun Bao, Yuheng Kang, Ziteng Sun
Talk presentation at ISIT 2018 (TPC choice session, 4% of all submissions)
ECE 4950 (Spring 2019): Machine Learning and Pattern Recognition (Teaching assistant)
ECE 4200 (Spring 2020): Fundamentals of Machine Learning (Teaching assistant)
Conferences: ISIT 2018, 2019, 2020, 2021; COLT 2019, 2020, 2021; ICML 2019, 2020, 2021; NeurIPS 2019, 2020; FOCS 2020; ICLR 2020, 2021; ALT 2021. Journals: TIT; TPAMI; JSAIT.