About
I am a Staff Research Scientist in the Adaptive Experimentation group at Meta, where I develop sample-efficient optimization methods with a focus on efficient AutoML for large machine learning models. Before joining Meta, I completed my PhD in computer science at Cornell University, working with Professor Carla Gomes. My research interests include Bayesian optimization, sparse optimization, and active learning, with applications in materials science and beyond. My work at Meta has also involved using Bayesian optimization to improve the sustainability of the concrete mixes used in Meta's data centers, reducing environmental impact while increasing strength and stability.
When I am not penciling Greek letters or hunting down missing minus signs in code, I enjoy cycling, dancing tango, and playing the piano. Hear me play a tango that I transcribed here.
Selected Projects
Sustainable Concrete via Bayesian Optimization
Meta Blog Post · Paper · GitHub
Eight percent of global CO₂ emissions come from cement production — the dominant source of emissions in data center construction. In collaboration with Amrize and the University of Illinois Urbana-Champaign, we used Bayesian optimization to discover lower-carbon concrete formulations that are 13% stronger, 43% faster-setting, and 19% lower in CO₂ — at no extra cost and using only standard materials. The optimized mix has been deployed at Meta's data center in Rosemount, MN.
LogEI: Unexpected Improvements to Expected Improvement
Paper · NeurIPS Poster · AI at Meta on LinkedIn
Expected Improvement (EI) is the most widely used acquisition function in Bayesian optimization, yet it suffers from numerical pathologies — vanishing gradients and poor acquisition optimization. LogEI reformulates the EI family with principled transformations that enable reliable gradient-based optimization. The result is substantially improved sample efficiency across a wide range of tasks, even outperforming state-of-the-art entropy search acquisition functions. LogEI generalizes to the noisy, multi-objective, and constrained settings and is the default in BoTorch and Meta's Ax platform, powering large-scale adaptive experimentation at Meta and beyond.
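To make the numerical issue concrete, here is a minimal Python sketch, not the BoTorch implementation. It assumes a maximization problem with Gaussian posterior mean `mu`, standard deviation `sigma`, and incumbent value `best`, so that EI = sigma * h(z) with z = (mu - best) / sigma and h(z) = φ(z) + z Φ(z). The function names (`naive_log_ei`, `log_h`, `log_ei`) and the cutoff of -20 are illustrative choices, and the far tail uses a truncated asymptotic series as a simplification rather than the exact scheme from the paper.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z - LOG_SQRT_2PI)


def normal_cdf(z):
    """Standard normal CDF Phi(z), via erfc to retain accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def naive_log_ei(mu, sigma, best):
    """Log of EI computed directly; underflows to -inf far below the incumbent."""
    z = (mu - best) / sigma
    ei = sigma * (normal_pdf(z) + z * normal_cdf(z))
    return math.log(ei) if ei > 0.0 else -math.inf


def log_h(z, tail_cutoff=-20.0):
    """log(phi(z) + z * Phi(z)), switching to a series in the far lower tail.

    For z -> -infinity, h(z) ~ phi(z) / z^2 * (1 - 3/z^2 + 15/z^4 - 105/z^6).
    The cutoff is an assumption made for this sketch.
    """
    if z > tail_cutoff:
        return math.log(normal_pdf(z) + z * normal_cdf(z))
    inv_z2 = 1.0 / (z * z)
    series = 1.0 - 3.0 * inv_z2 + 15.0 * inv_z2**2 - 105.0 * inv_z2**3
    return -0.5 * z * z - LOG_SQRT_2PI - 2.0 * math.log(-z) + math.log(series)


def log_ei(mu, sigma, best):
    """Log of EI that stays finite even when EI itself underflows."""
    z = (mu - best) / sigma
    return math.log(sigma) + log_h(z)


# A candidate 40 posterior standard deviations below the incumbent.
print(naive_log_ei(0.0, 1.0, 40.0))
print(log_ei(0.0, 1.0, 40.0))
```

In the direct computation, EI underflows to exactly zero for such a point, so a gradient-based optimizer gets no signal there. The log-space version stays finite and still varies with `mu` and `sigma`.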
Scientific Autonomous Reasoning Agent (SARA)
SARA is an autonomous experimentation system that integrates robotic materials synthesis with a hierarchy of AI methods to accelerate scientific discovery. By combining lateral gradient laser spike annealing with nested active learning cycles and end-to-end uncertainty quantification, SARA autonomously maps synthesis phase diagrams — achieving orders-of-magnitude acceleration in exploring metastable materials. We demonstrated SARA's capabilities by mapping the Bi₂O₃ system, including conditions for stabilizing δ-Bi₂O₃ at room temperature, a critical development for electrochemical technologies.
Publications
- BOxCrete: A Bayesian Optimization Open-Source AI Model for Concrete Strength Forecasting and Mix Optimization
- Empirical Gaussian Processes
- Autonomous Materials Exploration by Integrating Automated Phase Identification and AI-Assisted Human Reasoning
- Ax: A platform for adaptive experimentation
- Scalable Gaussian processes with latent Kronecker structure
- Probabilistic phase labeling and lattice refinement for autonomous materials research
- Data from: Probabilistic Phase Labeling and Lattice Refinement for Autonomous Materials Research
- Enhancing predictive capabilities in fusion burning plasmas through surrogate-based optimization in core transport solvers
- Robust Gaussian processes via relevance pursuit
- Scaling Gaussian processes for learning curve prediction via latent Kronecker structure
- Unexpected improvements to expected improvement for Bayesian optimization
- Bayesian optimization over high-dimensional combinatorial spaces via dictionary-based embeddings
- Sustainable concrete via Bayesian optimization
- Efficient projection algorithms onto the weighted ℓ1 ball
- Scalable first-order Bayesian optimization via structured automatic differentiation
- The fast kernel transform
- Generalized matching pursuits for the sparse optimization of separable objectives
- Advances in Sparse and Bayesian Optimization for Autonomous Scientific Discovery
- Autonomous materials synthesis via hierarchical active learning of nonequilibrium phase diagrams
- Automating crystal-structure phase mapping by combining deep learning with constraint reasoning
- Sparse Bayesian Learning via Stepwise Regression
- On the optimality of backward regression: Sparse recovery and subset selection
- Constrained Machine Learning: The Bagel Framework
- Data from: Autonomous synthesis of metastable materials
- Deep reasoning networks for unsupervised pattern de-mixing with constraint reasoning
- Optical identification of materials transformations in oxide thin films
- Deep reasoning networks: thinking fast and slow, for pattern de-mixing
- CRYSTAL: a multi-agent AI system for automated mapping of materials' crystal structures
- Multi-component background learning automates signal detection for spectroscopic data
- Exponentially-modified Gaussian mixture model: applications in spectroscopy
- Accurate and efficient numerical calculation of stable densities via optimized quadrature and asymptotics
- An efficient relaxed projection method for constrained non-negative matrix factorization with application to the phase-mapping problem in materials science
- Solving the stochastic Landau-Lifshitz-Gilbert-Slonczewski equation for monodomain nanomagnets: A survey and analysis of numerical techniques
- Sparse Bayesian Learning via Stepwise Regression: Supplementary Materials