Pei Wang

I am a software engineer at Google. Previously, I was a Security Researcher at Baidu X-Lab. I finished my Ph.D. at the College of Information Sciences and Technology, The Pennsylvania State University, in 2018. I received my master's degree from the University of Waterloo and my bachelor's degree from Peking University. My research interests include dependable software engineering, program analysis, formal methods, trusted execution environments, and software obfuscation. Recently I started to work on web security, mostly to fight XSS. I am actively exploring multiple security research domains. My work has been regularly recognized by both industry and academia. See my publication list for the latest peer-reviewed work. Visit my personal website to learn more about my research.
Authored Publications
    If It’s Not Secure, It Should Not Compile: Preventing DOM-Based XSS in Large-Scale Web Development with API Hardening
    Julian Bangert
    Proceedings of the 43rd International Conference on Software Engineering (ICSE '21), IEEE (2021)
    Cross-site scripting (XSS) is one of the most intractable security vulnerabilities in web applications. Considerable effort has been spent to mitigate XSS, yet it remains one of the most prevalent security threats on the Internet. Decades of exploitation and remediation have demonstrated that code inspection and testing do not ensure the absence of XSS vulnerabilities in complex web applications with a high degree of confidence. This paper introduces Google's secure-by-design engineering paradigm that effectively prevents DOM-based XSS vulnerabilities in large-scale web development. Our approach, named API hardening, enforces a series of company-wide secure coding practices. We provide a set of secure APIs to replace native DOM APIs that are prone to XSS vulnerabilities. Through a combination of type contracts and appropriate validation and escaping, the secure APIs ensure that applications based thereon are free of XSS vulnerabilities. We deploy a simple yet capable compile-time checker to guarantee that developers exclusively use our hardened APIs to interact with the DOM. We make various efforts to scale this approach to tens of thousands of Google engineers without significant productivity impact. By offering rigorous tooling and consultant support, we help developers adopt the secure coding practices as seamlessly as possible. We present empirical results showing how API hardening has helped reduce the occurrences of XSS vulnerabilities in Google's enormous code base over the course of a two-year deployment.
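The combination of type contracts and escaping described in this abstract can be illustrated with a minimal TypeScript sketch. Everything here is an assumption made for illustration: `SafeHtml`, `HtmlSink`, and `setInnerHtml` are hypothetical names, not the APIs from the paper, and a plain object stands in for a DOM element so the sketch runs outside a browser.

```typescript
// Hypothetical sketch of a "type contract" for HTML sinks.
// None of these names are taken from the paper.
class SafeHtml {
  // Private constructor: values can only be produced by the vetted
  // factory below, so a raw string can never be passed off as SafeHtml.
  private constructor(private readonly value: string) {}

  // Builds a SafeHtml by escaping every HTML metacharacter in `text`.
  static escape(text: string): SafeHtml {
    const escaped = text
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;")
      .replace(/'/g, "&#39;");
    return new SafeHtml(escaped);
  }

  toString(): string {
    return this.value;
  }
}

// Minimal stand-in for a DOM element (assumption, keeps the sketch DOM-free).
interface HtmlSink {
  innerHTML: string;
}

// Hardened wrapper: the only accepted argument type is SafeHtml.
function setInnerHtml(target: HtmlSink, html: SafeHtml): void {
  target.innerHTML = html.toString();
}

const sink: HtmlSink = { innerHTML: "" };
setInnerHtml(sink, SafeHtml.escape('<img src=x onerror="alert(1)">'));

// The next line would be rejected by the type checker, because a plain
// string is not assignable to SafeHtml:
// setInnerHtml(sink, "<b>raw</b>");
```

In this sketch the type checker plays the role of the compile-time check: code that tries to write an unvetted string into the sink does not compile. A real deployment would additionally need a rule forbidding direct assignments to `innerHTML`, which this sketch does not attempt.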
    Adopting Trusted Types in Production Web Frameworks to Prevent DOM-Based Cross-Site Scripting: A Case Study
    Bjarki Ágúst Guðmundsson
    Proceedings of the 2021 IEEE European Symposium on Security and Privacy Workshops, IEEE (to appear)
    Cross-site scripting (XSS) is a common security vulnerability found in web applications. DOM-based XSS, one of the variants, is becoming increasingly prevalent with the boom of single-page applications where most of the UI changes are achieved by modifying the DOM through in-browser scripting. It is very easy for developers to introduce XSS vulnerabilities into web applications since there are many ways for user-controlled, unsanitized input to flow into a Web API and get interpreted as HTML markup and JavaScript code. An emerging web security standard called Trusted Types aims to prevent DOM XSS by making Web APIs secure by default. Unlike other XSS mitigations that mostly focus on post-development protection, Trusted Types direct developers to write XSS-free code in the first place. One of the common concerns when adopting a new security mechanism is how much effort is required to refactor existing applications. In this paper, we report a case study on adopting Trusted Types in a well-established web framework. Our experience can help the web community better understand the benefits of making web applications compatible with Trusted Types, while also getting to know the related challenges and resolutions. We focused our work on Angular, which is one of the most popular web development frameworks available on the market.
    Building and maintaining a third-party library supply chain for productive and secure SGX enclave development
    Mingshen Sun
    Huibo Wang
    Tongxin Li
    Rundong Zhou
    Zhaofeng Chen
    Yiming Jing
    Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice (2020), pp. 100-109
    The big data industry is facing new challenges as concerns about privacy leakage soar. One of the remedies to privacy breach incidents is to encapsulate computations over sensitive data within hardware-assisted Trusted Execution Environments (TEE). Such TEE-powered software is called secure enclaves. Secure enclaves hold various advantages over competing privacy-preserving computation solutions. However, enclaves are much more challenging to build compared with ordinary software. The reason is that the development of TEE software must follow a restrictive programming model to make effective use of strong memory encryption and segregation enforced by hardware. These constraints transitively apply to all third-party dependencies of the software. If these dependencies do not officially support TEE hardware, TEE developers have to spend additional engineering effort in porting them. High development and maintenance cost is one of the major obstacles to adopting TEE-based privacy protection solutions in production. In this paper, we present our experience and achievements with regard to constructing and continuously maintaining a third-party library supply chain for TEE developers. In particular, we port a large collection of Rust third-party libraries into Intel SGX, one of the most mature trusted computing platforms. Our supply chain accepts upstream patches in a timely manner with SGX-specific security auditing. We have been able to maintain the SGX ports of 159 open-source Rust libraries with reasonable operational costs. Our work can effectively reduce the engineering cost of developing SGX enclaves for privacy-preserving data processing and exchange.
    Towards Memory Safe Enclave Programming with Rust-SGX
    Huibo Wang
    Mingshen Sun
    Yiming Jing
    Ran Duan
    Long Li
    Yulong Zhang
    Tao Wei
    Zhiqiang Lin
    Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (2019), pp. 2333-2350
    Intel Software Guard eXtension (SGX), a hardware-supported trusted execution environment (TEE), is designed to protect security-critical applications. However, it does not eliminate traditional memory corruption vulnerabilities in software running inside an enclave, since enclave software is still developed with type-unsafe languages such as C/C++. This paper presents RUST-SGX, an efficient and layered approach to exterminating memory corruption for software running inside SGX enclaves. The key idea is to enable the development of enclave programs in Rust, an efficient memory-safe systems language, with a RUST-SGX SDK, by solving the key challenges of how to (1) make the SGX software memory safe and (2) meanwhile run as efficiently as with the SDK provided by Intel. We therefore propose to build RUST-SGX atop Intel SGX SDK, and tame unsafe components with formally proven memory safety. We have implemented RUST-SGX and tested it with a series of benchmark programs. Our evaluation results show that RUST-SGX imposes little extra overhead (less than 5% with respect to the SGX-specific features and services compared to software developed with Intel SGX SDK), and meanwhile has stronger memory safety.
    Generating Precise Dependencies for Large Software
    Jinqiu Yang
    Lin Tan
    Robert Kroeger
    J. David Morgenthaler
    Proceedings of the Fourth International Workshop on Managing Technical Debt, IEEE (2013), pp. 47-50
    Intra- and inter-module dependencies can be a significant source of technical debt in long-term software development, especially for large software with millions of lines of code. This paper designs and implements a precise and scalable tool that extracts code dependencies and their utilization for large C/C++ software projects. The tool extracts both symbol-level and module-level dependencies of a software system and identifies potential underutilized and inconsistent dependencies. Such information points to potential refactoring opportunities and helps developers perform large-scale refactoring tasks.
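As a rough illustration of how symbol-level references could be rolled up into a module-level utilization figure, the TypeScript sketch below flags declared dependencies of which a client uses only a small fraction. The data model (`ModuleInfo`, `SymbolRef`), the utilization ratio, and the 0.5 threshold are all assumptions chosen for the example; the abstract does not specify how the tool defines or measures underutilization.

```typescript
// Hypothetical data model; not taken from the paper.
type SymbolRef = { module: string; symbol: string };

interface ModuleInfo {
  name: string;
  exports: string[];       // symbols this module provides
  declaredDeps: string[];  // modules this module declares a dependency on
  refs: SymbolRef[];       // symbol-level references made by this module
}

// Fraction of `dep`'s exported symbols that `client` actually references.
function utilization(client: ModuleInfo, dep: ModuleInfo): number {
  if (dep.exports.length === 0) return 0;
  const used = new Set(
    client.refs.filter((r) => r.module === dep.name).map((r) => r.symbol)
  );
  const usedExports = dep.exports.filter((s) => used.has(s));
  return usedExports.length / dep.exports.length;
}

// Declared dependencies whose utilization falls below `threshold`
// (the threshold value is an arbitrary assumption).
function underutilizedDeps(
  client: ModuleInfo,
  all: Map<string, ModuleInfo>,
  threshold = 0.5
): string[] {
  return client.declaredDeps.filter((d) => {
    const dep = all.get(d);
    return dep !== undefined && utilization(client, dep) < threshold;
  });
}

// Toy project: `app` uses one of four `util` symbols and both `net` symbols.
const util: ModuleInfo = {
  name: "util",
  exports: ["log", "fmt", "hash", "sort"],
  declaredDeps: [],
  refs: [],
};
const net: ModuleInfo = {
  name: "net",
  exports: ["open", "close"],
  declaredDeps: [],
  refs: [],
};
const app: ModuleInfo = {
  name: "app",
  exports: [],
  declaredDeps: ["util", "net"],
  refs: [
    { module: "util", symbol: "log" },
    { module: "net", symbol: "open" },
    { module: "net", symbol: "close" },
  ],
};
const modules = new Map<string, ModuleInfo>([
  ["util", util],
  ["net", net],
  ["app", app],
]);

const flagged = underutilizedDeps(app, modules);
```

With this toy data, `app` uses a quarter of `util` and all of `net`, so only `util` falls below the assumed threshold and becomes a candidate for refactoring.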