 
                Shachi Dave
            Shachi Dave is a Software Engineer in the Natural Language Understanding group at Google Research India. She received her Master's degree in Computer Science from the University of Southern California, Los Angeles. Her research interests include natural language understanding, conversational AI, and data mining/modeling, and their applications to products such as Search and Assistant.
          
        
        
      Authored Publications
    
  
  
  
    
    
  
      
        
        
    
    
        
        
          
          
          
              While large, generative, multilingual models are rapidly being developed and deployed, their safety and fairness evaluations primarily hinge on resources collected in the English language and some limited translations. This has been demonstrated to be insufficient, and severely lacking in nuances of unsafe language and stereotypes prevalent in different languages and the geographical pockets they are prevalent in. Gathering these resources, at scale, in varied languages and regions also poses a challenge as it requires expansive sociolinguistic knowledge and can also be prohibitively expensive. We utilize an established methodology of coupling LLM generations with distributed annotations to overcome these gaps and create the resource SeeGULL Multilingual, spanning 20 languages across 23 regions.
              
  
          
        
      
    
        
          
            
              Beyond Aesthetics: Cultural Competence in Text-to-Image Models
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
              Nithish Kannen, Marco Andreetto, Adji Bousso Dieng
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            2024
          
          
        
        
        
          
          
          
              The use of Text-to-Image (T2I) models is expanding beyond generating generic objects, as they are increasingly being adopted by diverse global communities to create visual representations of their unique culture. Current T2I benchmarks primarily evaluate image-text alignment, aesthetics and fidelity of generations for complex prompts with generic objects, overlooking the critical dimension of cultural understanding. In this work, we address this gap by defining a framework to evaluate cultural competence of T2I models, and present a scalable approach to collect cultural artifacts unique to a particular culture from Knowledge Graphs and Large Language Models in tandem. We assess the ability of state-of-the-art T2I models to generate culturally faithful and realistic images across 8 countries and 3 cultural domains. Furthermore, we emphasize the importance of T2I models reflecting a culture's diversity and introduce cultural diversity as a novel metric for T2I evaluation, drawing inspiration from the Vendi Score. We introduce T2I-GCube, a first-of-its-kind benchmark for T2I evaluation. T2I-GCube includes cultural prompts, metrics, and cultural concept spaces, enabling comprehensive assessment of T2I models' cultural knowledge and diversity. Our evaluations reveal significant gaps in the cultural knowledge of existing models and provide valuable insights into the diversity of image outputs for under-specified prompts. By introducing a novel approach to evaluating cultural diversity and knowledge in T2I models, T2I-GCube will be instrumental in fostering the development of models with enhanced cultural competence.
              
  
          
        
      
    
        
          
            
              ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
              Akshita Jha, Sarah Laszlo, Rida Qadri, Chandan Reddy
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            ACL (2024)
          
          
        
        
        
          
          
          
              Recent studies have highlighted the issue of varying degrees of stereotypical depictions for different identity groups. However, these existing approaches have several key limitations, including a noticeable lack of coverage of identity groups, and of the range of their associated stereotypes, in their evaluation. Additionally, these studies often lack a critical distinction between inherently visual stereotypes, such as 'brown' or 'sombrero', and culturally influenced stereotypes like 'kind' or 'intelligent'. In this work, we address these limitations by grounding our evaluation of regional, geo-cultural stereotypes in the generated images from Text-to-Image models by leveraging existing textual resources. We employ existing stereotype benchmarks to evaluate stereotypes and focus exclusively on the identification of visual stereotypes within the generated images spanning 135 identity groups. We also compute the offensiveness across identity groups, and check the feasibility of identifying stereotypes automatically. Further, through a detailed case study and quantitative analysis, we reveal how the default representations of all identity groups have a more stereotypical appearance, and for historically marginalized groups, how the images across different attributes are visually more similar than those of other groups, even when explicitly prompted otherwise.
              
  
          
        
      
    
        
          
            
          
        
      
    
        
        
          
          
          
              Stereotypes are oversimplified beliefs and ideas about particular groups of people. These cognitive biases are omnipresent in our language, reflected in human-generated datasets and potentially learned and perpetuated by language technologies. Although mitigating stereotypes in language technologies is necessary for preventing harms, stereotypes can pose varying levels of risk for targeted individuals and social groups by appearing in various contexts. Technical challenges in detecting stereotypes are rooted in the societal nuances of stereotyping, making it impossible to capture all intertwined interactions of social groups in diverse cultural contexts in one generic benchmark. This paper delves into the nuances of detecting stereotypes in an annotation task with humans from various regions of the world. We iteratively disambiguate our definition of the task, refining it as detecting "generalizing language", and contribute a multilingual, annotated dataset consisting of sentences mentioning a wide range of social identities in 9 languages and labeled on whether they make broad statements and assumptions about those groups. We experiment with training generalizing language detection models, which provide insight into the linguistic context in which stereotypes can appear, facilitating future research in addressing the dynamic, social aspects of stereotypes.
              
  
          
        
      
    
        
          
            
          
        
      
    
        
          
            
              Parameter-Efficient Finetuning for Robust Continual Multilingual Learning
            
          
        
        
          
            
              
                
                  
                    
                
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
    
    
    
    
    
            Findings of the Association for Computational Linguistics: ACL 2023
          
          
        
        
        
          
          
          
              We introduce and study the problem of Continual Multilingual Learning (CML), where a previously trained multilingual model is periodically updated using new data arriving in stages. If the new data is present only in a subset of languages, we find that the resulting model shows improved performance only on the languages included in the latest update (and a few closely related languages) while its performance on all the remaining languages degrades significantly. We address this challenge by proposing LAFT-URIEL, a parameter-efficient finetuning strategy which aims to increase the number of languages on which the model improves after an update, while reducing the magnitude of loss in performance for the remaining languages. LAFT-URIEL uses linguistic knowledge to balance overfitting and knowledge sharing across languages, resulting in a 25% increase in the number of languages whose performance improves during an update and a 78% relative decrease in the average magnitude of losses on the remaining languages.
              
  
          
        
      
    
        
          
            
              Building Stereotype Repositories with Complementary Approaches
            
          
        
        
          
            
              
                
                  
                    
                
              
            
              
                
                  
                    
                    
    
    
    
    
    
                      
              Akshita Jha, Jaya Goyal, Dinesh Tewari
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            C3NLP workshop at EACL 2023 (2023)
          
          
        
        
        
          
          
          
              Measurements of fairness in NLP have been critiqued for lacking concrete definitions of biases or harms measured, and for perpetuating a singular, Western narrative of fairness globally. To combat some of these pivotal issues, methods for curating datasets and benchmarks that target specific harms are rapidly emerging. However, these methods still face the significant challenge of achieving coverage over global cultures and perspectives at scale. To address this, in this paper, we highlight the utility and importance of complementary approaches in these curation strategies, which leverage both community engagement and large generative models. We specifically target the harm of stereotyping and demonstrate a pathway to build a benchmark that covers stereotypes about diverse and intersectional identities.
              
  
          
        
      
    
        
          
            
              Bootstrapping Multilingual Semantic Parsers using Large Language Models
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
              Abhijeet Awasthi, Bidisha Samanta, Sunita Sarawagi
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            Conference of the European Chapter of the Association for Computational Linguistics (EACL) (2023)
          
          
        
        
        
          
          
          
              Despite cross-lingual generalization demonstrated by pre-trained multilingual models, the translate-and-train paradigm of transferring English datasets across multiple languages remains the key ingredient for training task-specific multilingual models. However, for many low-resource languages, the availability of a reliable translation service entails significant amounts of costly human-annotated translation pairs. Further, the translation services for low-resource languages may continue to be brittle due to domain mismatch between the task-specific input text and the general-purpose text used while training the translation models. We consider the task of multilingual semantic parsing, and demonstrate the effectiveness and the flexibility offered by large language models (LLMs) for translating English datasets into several languages via few-shot prompting. We provide (i) extensive comparisons with prior translate-and-train methods across 50 languages, demonstrating that LLMs can serve as highly effective data translators, outperforming prior translation-based methods on 40 out of 50 languages; and (ii) a comprehensive study of the key design choices that enable effective data translation via prompted LLMs.
              
  
          
        
      
    
        
        
          
          
          
              With the rapid development and deployment of generative language models in global settings, there is an urgent need to also scale our measurements of harm, not just in the number and types of harms covered, but also in how well they account for local cultural contexts, including marginalized identities and the social biases experienced by them. This growth in our evaluation paradigms thus needs to be enhanced and calibrated by including people from different cultures and societies worldwide. In this work, we demonstrate this socio-culturally aware expansion in the Indian societal context for the harm of stereotyping. We devise a community-engaged effort to build a resource which contains stereotypes for axes of disparity that are uniquely present in India. The resultant resource increases the number of stereotypes known for and in the Indian context manyfold and is consequently beneficial for evaluations of generative AI.
              
  
          
        
      
    