Mihai Christodorescu

Mihai Christodorescu is a research scientist working in the areas of security & privacy at Google. His research interests include data confidentiality and integrity, program analysis, and new primitives for secure operating systems.
Authored Publications
    Large Language Models have been able to replicate their success from text generation to coding tasks. While a lot of work has made it clear that they have remarkable performance on tasks such as code completion and editing, it is still unclear why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We propose CAPP, a counterfactual testing framework to evaluate whether large code models understand programming concepts. With only black-box access to the model, we use CAPP to evaluate 10 popular large code models for 5 different programming concepts. Our findings suggest that current models lack understanding of concepts such as data flow and control flow.
    Formal Analysis of the API Proxy Problem
    Anh Pham
    Somesh Jha
    arXiv, Google LLC (2023)
    Implementing a security mechanism on top of APIs requires a clear understanding of the semantics of each API, to ensure that security entitlements are enforced consistently and completely across all APIs that could perform the same function for an attacker. Unfortunately, APIs are not designed to be “semantically orthogonal” and they often overlap, for example by offering different performance points for the same functionality. This leaves it to the security mechanism to discover and account for API proxies, i.e., groups of APIs which together approximate the functionality of some other API. Lacking a complete view of the structure of the API-proxy relationship, current security mechanisms address it in an ad-hoc and reactive manner, by updating the implementation when new API proxies are uncovered and abused by attackers. We analyze the problem of discovering API-proxy relationships and show that it is NP-complete, which makes computing exact information about API proxies prohibitively expensive for modern API surfaces that consist of tens of thousands of APIs. We then propose a simple heuristic algorithm to approximate the same API-proxy information and argue that this overapproximation can be safely used for security purposes, with only the downside of some utility loss. We conclude with a number of open problems of both theoretical and practical interest and with potential directions towards new solutions for the API-proxy problem.
    Identifying and Mitigating the Security Risks of Generative AI
    Clark Barrett
    Brad Boyd
    Brad Chen
    Jihye Choi
    Amrita Roy Chowdhury
    Anupam Datta
    Soheil Feizi
    Kathleen Fisher
    Tatsunori B. Hashimoto
    Dan Hendrycks
    Somesh Jha
    Daniel Kang
    Florian Kerschbaum
    Eric Mitchell
    John Mitchell
    Zulfikar Ramzan
    Khawaja Shams
    Dawn Song
    Ankur Taly
    Diyi Yang
    Foundations and Trends in Privacy and Security, 6 (2023), pp. 1-52
    Every major technical invention resurfaces the dual-use dilemma: the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive; rather, it reports on some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides a launching point on this important topic and provides interesting problems that the research community can work to address.