Inverting Language Models

Jack Morris
Cornell Tech

Time: 2024-04-24, 12:15 - 13:00 ET
Location: Rice 540 and Zoom

Abstract: How much information do the outputs of NLP models contain about their inputs? We investigate this problem in two scenarios: recovering text inputs from the embeddings produced by sentence embedders, and recovering them from the next-token probability outputs of language models. In both cases, our methods can fully recover some inputs given only the model output.

Bio: Jack Morris is a third-year PhD student at Cornell Tech. He works on NLP and machine learning with applications to security and privacy, with a focus on text-based dense information retrieval systems.