Google at EMNLP 2023
Google is proud to be a Diamond Sponsor of Empirical Methods in Natural Language Processing (EMNLP 2023), a premier annual conference, which is being held this week in Sentosa, Singapore. Google has a strong presence at this year’s conference with over 65 accepted papers and active involvement in 11 workshops and tutorials. Google is also happy to be a Major Sponsor for the Widening NLP workshop (WiNLP), which aims to highlight global representations of people, perspectives, and cultures in AI and ML. We look forward to sharing some of our extensive NLP research and expanding our partnership with the broader research community.
We hope you’ll visit the Google booth to chat with researchers who are actively pursuing the latest innovations in NLP, and check out some of the scheduled booth activities (e.g., demos and Q&A sessions listed below). Visit the @GoogleAI X (Twitter) and LinkedIn accounts to find out more about the Google booth activities at EMNLP 2023.
Take a look below to learn more about the Google research being presented at EMNLP 2023 (Google affiliations in bold).
Board & Organizing Committee
- Shyam Upadhyay, Sponsorship Chair
- Imed Zitouni, Industry Track Chair
- Roee Aharoni, Senior Program Committee
- Annie Louis, Senior Program Committee
- Vinodkumar Prabhakaran, Senior Program Committee
- Shruti Rijhwani, Senior Program Committee
- Brian Roark, Senior Program Committee
- Partha Talukdar, Senior Program Committee
Accepted papers
SynJax: Structured Probability Distributions for JAX
Miloš Stanojević, Laurent Sartran
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Clifton Poth, Hannah Sterz, Indraneil Paul, Sukannya Purkayastha, Leon Engländer, Timo Imhof, Ivan Vulić, Sebastian Ruder, Iryna Gurevych, Jonas Pfeiffer
DocumentNet: Bridging the Data Gap in Document Pre-training
Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, Alexander Hauptmann, Hanjun Dai, Wei Wei
AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-Powered Applications
Bhaktipriya Radharapu, Kevin Robinson, Lora Aroyo, Preethi Lahoti
CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks
Mete Ismayilzada, Debjit Paul, Syrielle Montariol, Mor Geva, Antoine Bosselut
Large Language Models Can Self-Improve
Jiaxin Huang*, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Alon Jacovi, Avi Caciularu, Omer Goldman, Yoav Goldberg
Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models
Yichao Zhou, James Bradley Wendt, Navneet Potti, Jing Xie, Sandeep Tata
Measuring Attribution in Natural Language Generation Models
Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter
Inverse Scaling Can Become U-Shaped
Jason Wei*, Najoung Kim, Yi Tay*, Quoc Le
INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback
Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei Li
On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-Based Method
Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart
Investigating Efficiently Extending Transformers for Long-Input Summarization
Jason Phang*, Yao Zhao, Peter J Liu
DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta*, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler
MultiTurnCleanup: A Benchmark for Multi-Turn Spoken Conversational Transcript Cleanup
Hua Shen*, Vicky Zayats, Johann C Rocholl, Daniel David Walker, Dirk Padfield
q2d: Turning Questions into Dialogs to Teach Models How to Search
Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb
Emergence of Abstract State Representations in Embodied Sequence Modeling
Tian Yun*, Zilai Zeng, Kunal Handa, Ashish V Thapliyal, Bo Pang, Ellie Pavlick, Chen Sun
Evaluating and Modeling Attribution for Cross-Lingual Question Answering
Benjamin Muller*, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang
Weakly-Supervised Learning of Visual Relations in Multimodal Pre-training
Emanuele Bugliarello, Aida Nematzadeh, Lisa Anne Hendricks
How Do Languages Influence Each Other? Studying Cross-Lingual Data Sharing During LM Fine-Tuning
Rochelle Choenni, Dan Garrette, Ekaterina Shutova
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić
IC3: Image Captioning by Committee Consensus
David Chan, Austin Myers, Sudheendra Vijayanarasimhan, David A Ross, John Canny
The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models
Aviv Slobodkin, Omer Goldman, Avi Caciularu, Ido Dagan, Shauli Ravfogel
Evaluating Large Language Models on Controlled Generation Tasks
Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Wieting, Nanyun Peng, Xuezhe Ma
Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration
Daniel Deutsch, George Foster, Markus Freitag
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay*, Jason Wei*, Hyung Won Chung*, Vinh Q. Tran, David R. So*, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani
Data Similarity is Not Enough to Explain Language Model Performance
Gregory Yauney*, Emily Reif, David Mimno
Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar*, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, Partha Talukdar
ReTAG: Reasoning Aware Table to Analytic Text Generation
Deepanway Ghosal, Preksha Nema, Aravindan Raghuveer
GATITOS: Using a New Multilingual Lexicon for Low-Resource Machine Translation
Alex Jones*, Isaac Caswell, Ishank Saxena
Video-Helpful Multimodal Machine Translation
Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li
Symbol Tuning Improves In-Context Learning in Language Models
Jerry Wei*, Le Hou, Andrew Kyle Lampinen, Xiangning Chen*, Da Huang, Yi Tay*, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma*, Quoc V Le
"Don't Take This Out of Context!" On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola, Xuhui Zhou, Elizabeth Clark, Maarten Sap
QAmeleon: Multilingual QA with Only 5 Examples
Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision
Eugene Kharitonov, Damien Vincent, Zalán Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matt Sharifi, Marco Tagliasacchi, Neil Zeghidour
AnyTOD: A Programmable Task-Oriented Dialog System
Jeffrey Zhao, Yuan Cao, Raghav Gupta, Harrison Lee, Abhinav Rastogi, Mingqiu Wang, Hagen Soltau, Izhak Shafran, Yonghui Wu
Selectively Answering Ambiguous Questions
Jeremy R. Cole, Michael JQ Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein
PRESTO: A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs (see blog post)
Rahul Goel, Waleed Ammar, Aditya Gupta, Siddharth Vashishtha, Motoki Sano, Faiz Surani*, Max Chang, HyunJeong Choe, David Greene, Chuan He, Rattima Nitisaroj, Anna Trukhina, Shachi Paul, Pararth Shah, Rushin Shah, Zhou Yu
LM vs LM: Detecting Factual Errors via Cross Examination
Roi Cohen, May Hamri, Mor Geva, Amir Globerson
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding
Andrea Burns*, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Bryan A. Plummer, Kate Saenko, Jianmo Ni, Mandy Guo
AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Ifeoluwa Adelani, Seid Muhie Yimam, Ibrahim Said Ahmad, Meriem Beloucif, Saif M. Mohammad, Sebastian Ruder, Oumaima Hourrane, Alipio Jorge, Pavel Brazdil, Felermino D. M. A. Ali, Davis David, Salomey Osei, Bello Shehu-Bello, Falalu Ibrahim Lawan, Tajuddeen Gwadabe, Samuel Rutunda, Tadesse Destaw Belay, Wendimu Baye Messelle, Hailu Beshada Balcha, Sisay Adugna Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Stephen Arthur
Optimizing Retrieval-Augmented Reader Models via Token Elimination
Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark, Shruti Rijhwani, Sebastian Gehrmann, Joshua Maynez, Roee Aharoni, Vitaly Nikolaev, Thibault Sellam, Aditya Siddhant, Dipanjan Das, Ankur P Parikh
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Joshua Ainslie, James Lee-Thorp, Michiel de Jong*, Yury Zemlyanskiy, Federico Lebron, Sumit Sanghai
CoLT5: Faster Long-Range Transformers with Conditional Computation
Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontanon, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai
Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting
Preethi Lahoti, Nicholas Blumm, Xiao Ma, Raghavendra Kotikalapudi, Sahitya Potluri, Qijun Tan, Hansa Srinivasan, Ben Packer, Ahmad Beirami, Alex Beutel, Jilin Chen
Universal Self-Adaptive Prompting (see blog post)
Xingchen Wan*, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan O. Arik, Tomas Pfister
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor
Hierarchical Pre-training on Multimodal Electronic Health Records
Xiaochen Wang, Junyu Luo, Jiaqi Wang, Ziyi Yin, Suhan Cui, Yuan Zhong, Yaqing Wang, Fenglong Ma
NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders
Livio Baldini Soares, Daniel Gillick, Jeremy R. Cole, Tom Kwiatkowski
How Does Generative Retrieval Scale to Millions of Passages?
Ronak Pradeep*, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, Vinh Q. Tran
Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets
Irina Bejan*, Artem Sokolov, Katja Filippova
Findings of EMNLP
Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs
Jiefeng Chen*, Jinsung Yoon, Sayna Ebrahimi, Sercan O Arik, Tomas Pfister, Somesh Jha
A Comprehensive Evaluation of Tool-Assisted Generation Strategies
Alon Jacovi*, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva
1-PAGER: One Pass Answer Generation and Evidence Retrieval
Palak Jain, Livio Baldini Soares, Tom Kwiatkowski
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo, Linting Xue, Michal Yarom, Ashish V. Thapliyal, Idan Szpektor, Julien Amelot, Xi Chen, Radu Soricut
SDOH-NLI: A Dataset for Inferring Social Determinants of Health from Clinical Notes
Adam D. Lelkes, Eric Loreaux*, Tal Schuster, Ming-Jun Chen, Alvin Rajkomar
Machine Reading Comprehension Using Case-based Reasoning
Dung Ngoc Thai, Dhruv Agarwal, Mudit Chaudhary, Wenlong Zhao, Rajarshi Das, Jay-Yoon Lee, Hannaneh Hajishirzi, Manzil Zaheer, Andrew McCallum
Cross-lingual Open-Retrieval Question Answering for African Languages
Odunayo Ogundepo, Tajuddeen Gwadabe, Clara E. Rivera, Jonathan H. Clark, Sebastian Ruder, David Ifeoluwa Adelani, Bonaventure F. P. Dossou, Abdou Aziz DIOP, Claytone Sikasote, Gilles HACHEME, Happy Buzaaba, Ignatius Ezeani, Rooweither Mabuya, Salomey Osei, Chris Chinenye Emezue, Albert Kahira, Shamsuddeen Hassan Muhammad, Akintunde Oladipo, Abraham Toluwase Owodunni, Atnafu Lambebo Tonja, Iyanuoluwa Shode, Akari Asai, Anuoluwapo Aremu, Ayodele Awokoya, Bernard Opoku, Chiamaka Ijeoma Chukwuneke, Christine Mwase, Clemencia Siro, Stephen Arthur, Tunde Oluwaseyi Ajayi, Verrah Akinyi Otiende, Andre Niyongabo Rubungo, Boyd Sinkala, Daniel Ajisafe, Emeka Felix Onwuegbuzia, Falalu Ibrahim Lawan, Ibrahim Said Ahmad, Jesujoba Oluwadara Alabi, CHINEDU EMMANUEL MBONU, Mofetoluwa Adeyemi, Mofya Phiri, Orevaoghene Ahia, Ruqayya Nasir Iro, Sonia Adhiambo
On Uncertainty Calibration and Selective Generation in Probabilistic Neural Summarization: A Benchmark Study
Polina Zablotskaia, Du Phan, Joshua Maynez, Shashi Narayan, Jie Ren, Jeremiah Zhe Liu
Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation
Markus Freitag, Behrooz Ghorbani*, Patrick Fernandes*
Sources of Hallucination by Large Language Models on Inference Tasks
Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman
Don’t Add, Don’t Miss: Effective Content Preserving Generation from Pre-selected Text Spans
Aviv Slobodkin, Avi Caciularu, Eran Hirsch, Ido Dagan
What Makes Chain-of-Thought Prompting Effective? A Counterfactual Study
Aman Madaan*, Katherine Hermann, Amir Yazdanbakhsh
Understanding HTML with Large Language Models
Izzeddin Gur, Ofir Nachum, Yingjie Miao, Mustafa Safdari, Austin Huang, Aakanksha Chowdhery, Sharan Narang, Noah Fiedel, Aleksandra Faust
Improving the Robustness of Summarization Models by Detecting and Removing Input Noise
Kundan Krishna*, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu
In-Context Learning Creates Task Vectors
Roee Hendel, Mor Geva, Amir Globerson
Pre-training Without Attention
Junxiong Wang, Jing Nathan Yan, Albert Gu, Alexander M Rush
MUX-PLMs: Data Multiplexing for High-Throughput Language Models
Vishvak Murahari, Ameet Deshpande, Carlos E Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik R Narasimhan
PaRaDe: Passage Ranking Using Demonstrations with LLMs
Andrew Drozdov*, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler*, Kai Hui
Long-Form Speech Translation Through Segmentation with Finite-State Decoding Constraints on Large Language Models
Arya D. McCarthy, Hao Zhang, Shankar Kumar, Felix Stahlberg, Ke Wu
Unsupervised Opinion Summarization Using Approximate Geodesics
Somnath Basu Roy Chowdhury*, Nicholas Monath, Kumar Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi
SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data
Ruoxi Sun, Sercan O. Arik, Rajarishi Sinha, Hootan Nakhost, Hanjun Dai, Pengcheng Yin, Tomas Pfister
Retrieval-Augmented Parsing for Complex Graphs by Exploiting Structure and Uncertainty
Zi Lin, Quan Yuan, Panupong Pasupat, Jeremiah Zhe Liu, Jingbo Shang
A Zero-Shot Language Agent for Computer Control with Structured Reflection
Tao Li, Gang Li, Zhiwei Deng, Bryan Wang*, Yang Li
Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches
Daniel Fried, Nicholas Tomlin, Jennifer Hu, Roma Patel, Aida Nematzadeh
Improving Classifier Robustness Through Active Generation of Pairwise Counterfactuals
Ananth Balashankar, Xuezhi Wang, Yao Qin, Ben Packer, Nithum Thain, Jilin Chen, Ed H. Chi, Alex Beutel
mmT5: Modular Multilingual Pre-training Solves Source Language Hallucinations
Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, Sebastian Ruder
Scaling Laws vs Model Architectures: How Does Inductive Bias Influence Scaling?
Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler
TaTA: A Multilingual Table-to-Text Dataset for African Languages
Sebastian Gehrmann, Sebastian Ruder, Vitaly Nikolaev, Jan A. Botha, Michael Chavinda, Ankur P Parikh, Clara E. Rivera
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean Michel Amath Sarr, Xinyi Wang, John Frederick Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David Ifeoluwa Adelani, Vera Axelrod, Isaac Rayburn Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev, Partha Talukdar
On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval
Jiayi Chen*, Hanjun Dai, Bo Dai, Aidong Zhang, Wei Wei*
Workshops
- Major Sponsor
  The Seventh Widening NLP Workshop (WiNLP)
  Organizer: Sunipa Dev
  Panelist: Preethi Lahoti
- The Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC)
  Invited Speaker: Bernd Bohnet
- The 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS)
  Organizer: Geeticka Chauhan
- Combined Workshop on Spatial Language Understanding and Grounded Communication for Robotics (SpLU-RoboNLP)
  Invited Speaker: Andy Zeng
- Natural Language Generation, Evaluation, and Metrics (GEM)
  Organizer: Elizabeth Clark
- The First Arabic Natural Language Processing Conference (ArabicNLP)
  Organizer: Imed Zitouni
- The Big Picture: Crafting a Research Narrative (BigPicture)
  Organizers: Nora Kassner, Sebastian Ruder
- BlackboxNLP 2023: The 6th Workshop on Analysing and Interpreting Neural Networks for NLP
  Organizer: Najoung Kim
  Panelist: Neel Nanda
- The SIGNLL Conference on Computational Natural Language Learning (CoNLL)
  Co-Chair: David Reitter
  Areas and ACs: Kyle Gorman (Speech and Phonology), Fei Liu (Natural Language Generation)
- The Third Workshop on Multi-lingual Representation Learning (MRL)
  Organizers: Omer Goldman, Sebastian Ruder
  Invited Speaker: Orhan Firat
Google Research booth activities
This schedule is subject to change. Please visit the Google booth for more information.
- Fri, Dec 8 | 10:30AM - 11:00AM SST
  Developing and Utilizing Evaluation Metrics for Machine Translation & Improving Multilingual NLP
  Presenters: Isaac Caswell, Dan Deutsch, Jan-Thorsten Peter, David Vilar Torres
- Fri, Dec 8 | 3:30PM - 4:00PM SST
  Differentiable Search Indexes & Generative Retrieval
  Presenters: Sanket Vaibhav Mehta, Vinh Tran, Kai Hui, Ronak Pradeep*
- Sat, Dec 9 | 10:30AM - 11:00AM SST
  Retrieval and Generation in a Single Pass
  Presenters: Palak Jain, Livio Baldini Soares
- Sat, Dec 9 | 12:30PM - 1:45PM SST
  Amplifying Adversarial Attacks
  Presenter: Anu Sinha
- Sat, Dec 9 | 3:30PM - 4:00PM SST
  Automate Prompt Design: Universal Self-Adaptive Prompting
  Presenters: Xingchen Wan*, Ruoxi Sun
* Work done while at Google