

Hello! I am a fourth-year PhD student at UC Berkeley working on machine learning and NLP. I am advised by Dan Klein and Dawn Song, and I am fortunate to also work with external collaborators such as Sameer Singh, Nicholas Carlini, and Colin Raffel. My research is supported by the Apple Scholars in AI Fellowship. Outside of Berkeley, I am a student researcher at Google Brain during 2023, and I have previously interned at FAIR and AI2.
Starting Fall 2023, I'm on the job market for academia and industry!
Current Research Interests
I focus on improving language models, enhancing the security, privacy, and reliability of ML, and exploring the intersection of these topics. Some of my recent research directions include:
- Memorization & Privacy: We've shown that LMs and diffusion models can memorize their training data [1,2,3,4], raising concerns regarding privacy, copyright, the GDPR, and more.
- Prompting & Decoding: We've done some of the early work on prompting LMs, including prompt design [4,5], parameter efficiency [6], and understanding failure modes [7].
- Robustness: We've studied both natural [8] and adversarial distribution shifts [9,10,11], and we have traced model failures back to quality and diversity issues in the training data [12,13,14,15].
- New Threat Models: We've explored and refined new types of adversarial vulnerabilities, including stealing model weights [16] and poisoning training sets [17].
Selected Publications
Here are some of my representative papers. See my Google Scholar page for a complete list.
-
Extracting Training Data from Diffusion Models
arXiv preprint
TLDR: We show how to extract hundreds of memorized images from popular diffusion models like Imagen and Stable Diffusion.
@article{carlini2023extracting, title={Extracting training data from diffusion models}, author={Carlini, Nicholas and Hayes, Jamie and Nasr, Milad and Jagielski, Matthew and Sehwag, Vikash and Tram{\`e}r, Florian and Balle, Borja and Ippolito, Daphne and Wallace, Eric}, journal={arXiv preprint arXiv:2301.13188}, year={2023}}
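The detection recipe can be sketched in a few lines: generate many images for the same prompt and flag generations that are near-duplicates of one another, since memorized training images tend to be re-generated almost pixel-for-pixel. Below is a minimal illustration, not the paper's exact pipeline; the distance metric and threshold are assumptions.

```python
import numpy as np

def flag_possible_memorization(images, threshold=0.05):
    """Flag pairs of suspiciously similar generations for one prompt.

    images: list of HxWxC float arrays in [0, 1], all the same shape.
    The mean-squared-error metric and 0.05 threshold are illustrative.
    """
    flagged = []
    for i in range(len(images)):
        for j in range(i + 1, len(images)):
            dist = np.mean((images[i] - images[j]) ** 2)
            if dist < threshold:  # two independent samples nearly identical
                flagged.append((i, j, float(dist)))
    return flagged

# Usage with stand-in data; in practice the images would come from a real
# generator such as Stable Diffusion.
rng = np.random.default_rng(0)
memorized = rng.random((64, 64, 3))  # pretend this image was memorized
samples = [memorized + 0.01 * rng.standard_normal((64, 64, 3)) for _ in range(3)]
samples += [rng.random((64, 64, 3)) for _ in range(3)]  # fresh, unrelated images
print(flag_possible_memorization(samples))  # flags only the first three pairs
```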
-
Automated Crossword Solving
ACL 2022
TLDR: We create an AI for solving crossword puzzles that outperforms the world's best human players.
@inproceedings{Wallace2022Crosswords, title={Automated Crossword Solving}, author={Wallace, Eric and Tomlin, Nicholas and Xu, Albert and Yang, Kevin and Pathak, Eshaan and Ginsberg, Matthew L. and Klein, Dan}, booktitle={Association for Computational Linguistics}, year={2022}}
-
Calibrate Before Use: Improving Few-shot Performance of Language Models
ICML 2021. Oral Presentation, top 3%
TLDR: We are the first to show that GPT-3's accuracy has high variance across different choices of the prompt. We propose a calibration procedure that reduces this variance and substantially improves average accuracy.
@inproceedings{Zhao2021Calibrate, Title = {Calibrate Before Use: Improving Few-shot Performance of Language Models}, Author = {Tony Z. Zhao and Eric Wallace and Shi Feng and Dan Klein and Sameer Singh}, booktitle={International Conference on Machine Learning}, Year = {2021}}
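The calibration itself is tiny: query the model with a content-free input such as "N/A" to estimate its bias over the label space, then divide that bias out of every real prediction and renormalize. A simplified sketch with made-up probabilities:

```python
import numpy as np

def calibrate(probs, content_free_probs):
    """Contextual calibration, simplified: divide out the bias the model
    shows on a content-free input, then renormalize to a distribution."""
    q = probs / content_free_probs
    return q / q.sum()

# Hypothetical numbers: the prompt biases the model toward label 0.
p_cf = np.array([0.7, 0.3])  # label probs when the input is just "N/A"
p_x = np.array([0.6, 0.4])   # label probs for a real test input
print(calibrate(p_x, p_cf))  # ~[0.39, 0.61]: the surface bias is removed
```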
-
Extracting Training Data From Large Language Models
USENIX Security 2021
TLDR: We create a method for extracting verbatim training examples from a language model.
@inproceedings{carlini2020extracting, title={Extracting Training Data from Large Language Models}, author={Nicholas Carlini and Florian Tram\`er and Eric Wallace and Matthew Jagielski and Ariel Herbert-Voss and Katherine Lee and Adam Roberts and Tom Brown and Dawn Song and \'Ulfar Erlingsson and Alina Oprea and Colin Raffel}, booktitle={USENIX Security Symposium}, year={2021}}
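At a high level, the attack samples many generations from the model and then ranks them with a membership signal; one signal from the paper compares LM perplexity against zlib compression entropy. A rough sketch using GPT-2 as a stand-in target (the sampling settings and tiny pool size are illustrative):

```python
import zlib
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Step 1: sample a pool of unconditional generations from the model.
out = model.generate(do_sample=True, max_length=64, top_k=40,
                     num_return_sequences=20,
                     pad_token_id=tokenizer.eos_token_id)
texts = [tokenizer.decode(o, skip_special_tokens=True) for o in out]

# Step 2: rank candidates. Text the LM finds easy (low perplexity) but that
# compresses poorly (i.e., is not just repetition) is a likelier memorized string.
scored = sorted(texts, key=lambda t: perplexity(t) / len(zlib.compress(t.encode())))
print(scored[0])  # strongest extraction candidate in this pool
```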
-
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
EMNLP 2020
TLDR: We propose a method for automatically designing prompts for large language models.
@inproceedings{Shin2020Autoprompt, Author = {Taylor Shin and Yasaman Razeghi and Robert L. Logan IV and Eric Wallace and Sameer Singh}, BookTitle={Empirical Methods in Natural Language Processing}, Year = {2020}, Title = {{AutoPrompt}: Eliciting Knowledge from Language Models with Automatically Generated Prompts}}
-
Universal Adversarial Triggers for Attacking and Analyzing NLP
EMNLP 2019
TLDR: We create phrases that cause a model to produce a specific prediction when concatenated to any input. Triggers reveal egregious and insightful errors for text classification, reading comprehension, and text generation.
@inproceedings{Wallace2019Triggers, Author = {Eric Wallace and Shi Feng and Nikhil Kandpal and Matt Gardner and Sameer Singh}, Booktitle = {Empirical Methods in Natural Language Processing}, Year = {2019}, Title = {Universal Adversarial Triggers for Attacking and Analyzing {NLP}}}
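The underlying search is gradient-guided token replacement (building on HotFlip): the gradient of the loss with respect to the trigger embeddings estimates, for every vocabulary token, how swapping it into each trigger slot would change the loss. A toy sketch below; the bag-of-embeddings classifier is an illustrative stand-in for a real NLP model, and the actual attack averages gradients over a batch and beam-searches candidates.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
embed = torch.nn.Embedding(100, 16)  # toy vocabulary of 100 tokens
clf = torch.nn.Linear(16, 2)         # toy classification head

trigger = torch.tensor([0, 0, 0])        # three trigger slots, arbitrary init
inputs = torch.tensor([10, 11, 12, 13])  # one stand-in input sentence
target = torch.tensor([1])               # the prediction we want to force

for _ in range(5):
    trig_emb = embed(trigger).detach().requires_grad_(True)
    feats = torch.cat([trig_emb, embed(inputs).detach()]).mean(0)
    loss = F.cross_entropy(clf(feats)[None], target)
    loss.backward()
    # First-order (HotFlip) estimate: swapping slot i to vocab token w
    # changes the loss by about (embed.weight[w] - trig_emb[i]) . grad[i],
    # so per slot we pick the token minimizing embed.weight @ grad[i].
    scores = embed.weight @ trig_emb.grad.T  # [vocab_size, num_slots]
    trigger = scores.argmin(dim=0)           # greedily update every slot
print("trigger token ids:", trigger.tolist())
```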
-
AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
EMNLP 2019. Best Demo Award
TLDR: We build an open-source toolkit on top of AllenNLP that makes it easy to interpret NLP models.
@inproceedings{Wallace2019AllenNLP, Author = {Eric Wallace and Jens Tuyls and Junlin Wang and Sanjay Subramanian and Matt Gardner and Sameer Singh}, Booktitle = {Empirical Methods in Natural Language Processing}, Year = {2019}, Title = {{AllenNLP Interpret}: A Framework for Explaining Predictions of {NLP} Models}}
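The simplest interpretation such a toolkit supports is gradient saliency: score each input token by the gradient of the model's prediction with respect to that token's embedding. A generic PyTorch sketch of that idea (a toy model, not the AllenNLP Interpret API):

```python
import torch

torch.manual_seed(0)
embed = torch.nn.Embedding(50, 8)  # toy embeddings
clf = torch.nn.Linear(8, 2)        # toy classifier

tokens = torch.tensor([3, 17, 42])
emb = embed(tokens).detach().requires_grad_(True)
logits = clf(emb.mean(0))
logits[logits.argmax()].backward()  # d(predicted-class score)/d(embeddings)

saliency = emb.grad.norm(dim=1)     # one importance score per token
print(saliency / saliency.sum())    # normalized token importances
```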
-
Pathologies of Neural Models Make Interpretations Difficult
EMNLP 2018
TLDR: Saliency maps, a popular interpretation technique, can be negatively impacted by certain pathological behavior present in neural models, such as prediction overconfidence.
@inproceedings{Feng2018Pathological, Author = {Shi Feng and Eric Wallace and Alvin Grissom II and Mohit Iyyer and Pedro Rodriguez and Jordan Boyd-Graber}, Booktitle = {Empirical Methods in Natural Language Processing}, Year = {2018}, Title = {Pathologies of Neural Models Make Interpretations Difficult}}
Teaching & Mentoring
I am passionate about mentoring and teaching. I currently don't have the capacity to advise new research projects, but I am happy to answer questions over email about my work and research at Berkeley.
-
CS288: Natural Language Processing
Spring 2023
-
Interpreting Predictions of NLP Models
Tutorial, EMNLP 2020
Selected Media Coverage
Here are a few news articles that feature my work, including interviews with me or my colleagues.
-
What a Crossword AI Reveals About Humans
-
Privacy & Security for Diffusion and LMs
-
What does GPT-3 “know” about me?
-
Neil deGrasse Tyson Podcast (Crosswords)
-
Does GPT-2 Know Your Phone Number?
-
AI models spit out photos of people and copyrighted images
-
Privacy Considerations in Language Models
-
Neural Crossword Solver Outperforms Humans For First Time