I am a fourth-year CS PhD student at EPFL working with Michael Kapralov. I am broadly interested in reducing the memory and runtime costs of LLM inference and fine-tuning. In the past I have also worked on fast algorithms for large-scale, high-dimensional data analysis and numerical linear algebra.
I spent Fall 2024 as an Applied Scientist intern at Amazon Luxembourg, where I deployed ML- and optimization-based solutions to production on AWS infrastructure for internal customers. I have also worked with Anirban Dasgupta and Dinesh Garg (IBM Research, Bengaluru) on randomized linear algebra. In the distant past, I spent a summer at Caltech on a SURF fellowship, working with Ashish Mahabal on deep learning for astronomy.
Streaming Attention Approximation via Discrepancy Theory
In Advances in Neural Information Processing Systems (NeurIPS), 2025 (Spotlight).
A KV‑cache compression method based on discrepancy theory, offering provable approximation guarantees and strong empirical performance on long‑context benchmarks.
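To make the setting concrete, here is a toy NumPy sketch of what KV-cache compression means: approximating a query's attention output while retaining only a small subset of the cached (key, value) pairs. The selection rule below is a generic heavy-hitter heuristic chosen purely for illustration, not the discrepancy-based method from the paper, and all sizes and names are made up.

```python
# Toy illustration of KV-cache compression (NOT the paper's algorithm):
# approximate attention for one query using only `budget` cached pairs.
import numpy as np

rng = np.random.default_rng(0)
d, n, budget = 64, 4096, 256                   # head dim, cached tokens, kept pairs

K = rng.standard_normal((n, d)) / np.sqrt(d)   # cached keys
V = rng.standard_normal((n, d))                # cached values
q = rng.standard_normal(d)                     # current query

def attention(q, K, V):
    s = K @ q
    w = np.exp(s - s.max())                    # softmax over cached tokens
    w /= w.sum()
    return w @ V

full = attention(q, K, V)

# Hypothetical compression rule: keep the keys scoring highest for this query.
# (The paper instead gives provable guarantees that hold for all queries.)
keep = np.argsort(K @ q)[-budget:]
approx = attention(q, K[keep], V[keep])

print("relative error:", np.linalg.norm(full - approx) / np.linalg.norm(full))
```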
Improved Algorithms for Kernel Matrix-Vector Multiplication
In the International Conference on Learning Representations (ICLR), 2025 (Poster). Best Paper Award at the ICML 2024 Workshop on Long-Context Foundation Models.
Subquadratic‑time algorithms for kernel/attention matrix–vector multiplication, enabling faster attention computations for long‑context LLMs with theoretical guarantees.
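For readers unfamiliar with the primitive, the sketch below shows the baseline these algorithms improve on: multiplying an implicit n x n Gaussian kernel matrix by a vector by materializing it explicitly, which costs O(n^2) time and memory. This is only a statement of the problem under assumed toy parameters, not the subquadratic method from the paper.

```python
# Naive kernel matrix-vector multiplication with a Gaussian kernel
# K_ij = exp(-||x_i - x_j||^2); the paper targets subquadratic alternatives.
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 16
X = rng.standard_normal((n, d))                     # data points
v = rng.standard_normal(n)                          # vector to multiply by

sq = np.sum(X**2, axis=1)
dist2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
Kv = np.exp(-dist2) @ v                             # explicit O(n^2) matvec
print(Kv[:5])
```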
Sublinear Time Low-Rank Approximation of Hankel Matrices
In the ACM–SIAM Symposium on Discrete Algorithms (SODA), 2026.
Sublinear Time Low-Rank Approximation of Toeplitz Matrices
In the ACM–SIAM Symposium on Discrete Algorithms (SODA), 2024.
Toeplitz Low-Rank Approximation with Sublinear Query Complexity
In the ACM–SIAM Symposium on Discrete Algorithms (SODA), 2023.
Towards Non-Uniform k-Center with Constant Types of Radii
In the Symposium on Simplicity in Algorithms (SOSA), 2022.
Deep-learnt classification of light curves
In the IEEE Symposium Series on Computational Intelligence (SSCI), 2017.