Jump to Content
Petros Maniatis

Petros Maniatis

Petros Maniatis is a Senior Staff Research Scientist at Google DeepMind, in the Learning for Code Team. Prior to that, he was a Senior Research Scientist at Intel Labs, working in Intel's Berkeley Research Lab and then at the Intel Science and Technology Center on Secure Computing at UC Berkeley. He received his MSc and Ph.D. from the Computer Science Department at Stanford University. Before Stanford, he obtained his BSc with honors at the Department of Informatics of the University of Athens in Greece. His current research interests lie primarily in the confluence of machine learning and software engineering.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    CodeQueries: A Dataset of Semantic Queries over Code
    Surya Prakash Sahu
    Madhurima Mandal
    Shikhar Bharadwaj
    Aditya Kanade
    Shirish Shevade
    Innovations in Software Engineering (ISEC), ACM, Bangalore, India (2024)
    Preview abstract Developers often have questions about semantic aspects of code they are working on, e.g., “Is there a class whose parent classes declare a conflicting attribute?”. Answering them requires understanding code semantics such as attributes and inheritance relation of classes. An answer to such a question should identify code spans constituting the answer (e.g., the declaration of the subclass) as well as supporting facts (e.g., the definitions of the conflicting attributes). The existing work on question-answering over code has considered yes/no questions or method-level context. We contribute a labeled dataset, called CodeQueries, of semantic queries over Python code. Compared to the existing datasets, in CodeQueries, the queries are about code semantics, the context is file level and the answers are code spans. We curate the dataset based on queries supported by a widely-used static analysis tool, CodeQL, and include both positive and negative examples, and queries requiring single-hop and multi-hop reasoning. To assess the value of our dataset, we evaluate baseline neural approaches. We study a large language model (GPT3.5-Turbo) in zero-shot and few-shot settings on a subset of CodeQueries. We also evaluate a BERT style model (CuBERT) with fine-tuning. We find that these models achieve limited success on CodeQueries. CodeQueries is thus a challenging dataset to test the ability of neural models, to understand code semantics, in the extractive question-answering setting View details
    Resolving Code Review Comments with Machine Learning
    Alexander Frömmgen
    Peter Choy
    Elena Khrapko
    Marcus Revaj
    2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) (to appear)
    Preview abstract Code reviews are a critical part of the software development process, taking a significant amount of the code authors’ and the code reviewers’ time. As part of this process, the reviewer inspects the proposed code and asks the author for code changes through comments written in natural language. At Google, we see millions of reviewer comments per year, and authors require an average of ∼60 minutes active shepherding time between sending changes for review and finally submitting the change. In our measurements, the required active work time that the code author must devote to address reviewer comments grows almost linearly with the number of comments. However, with machine learning (ML), we have an opportunity to automate and streamline the code-review process, e.g., by proposing code changes based on a comment’s text. We describe our application of recent advances in large sequence models in a real-world setting to automatically resolve code-review comments in the day-to-day development workflow at Google. We present the evolution of this feature from an asynchronous generation of suggested edits after the reviewer sends feedback, to an interactive experience that suggests code edits to the reviewer at review time. In deployment, code-change authors at Google address 7.5% of all reviewer comments by applying an ML-suggested edit. The impact of this will be to reduce the time spent on code reviews by hundreds of thousands of engineer hours annually at Google scale. Unsolicited, very positive feedback highlights that the impact of ML-suggested code edits increases Googlers’ productivity and allows them to focus on more creative and complex tasks. View details
    Snowcat: Efficient Kernel Concurrency Testing using a Learned Coverage Predictor
    Sishuai Gong
    Dinglan Peng
    Pedro Fonseca
    Symposium on Operating Systems Principles (SOSP) (2023)
    Preview abstract Random-based approaches and heuristics are commonly used in kernel concurrency testing due to the massive scale of modern kernels and corresponding interleaving space. The lack of accurate and scalable approaches to analyze concurrent kernel executions makes existing testing approaches heavily rely on expensive dynamic executions to measure the effectiveness of a new test. Unfortunately, the high cost incurred by dynamic executions limits the breadth of the exploration and puts latency pressure on finding effective concurrent test inputs and schedules, hindering the overall testing effectiveness. This paper proposes Snowcat, a kernel concurrency testing framework that generates effective test inputs and schedules using a learned kernel block-coverage predictor. Using a graph neural network, the coverage predictor takes a concurrent test input and scheduling hints and outputs a prediction on whether certain important code blocks will be executed. Using this predictor, Snowcat can skip concurrent tests that are likely to be fruitless and prioritize the promising ones for actual dynamic execution. After testing the Linux kernel for over a week, Snowcat finds ∼17% more potential data races, by prioritizing tests of more fruitful schedules than existing work would have chosen. Snowcat can also find effective test inputs that expose new concurrency bugs with higher probability (1.4×∼2.6×), or reproduce known bugs more quickly (15×) than state-ofart testing tools. More importantly, Snowcat is shown to be more efficient at reaching a desirable level of race coverage in the continuous setting, as the Linux kernel evolves from version to version. In total, Snowcat discovered 17 new concurrency bugs in Linux kernel 6.1, of which 13 are confirmed and 6 are fixed. View details
    Predicting Dynamic Properties of Heap Allocations Using Neural Networks Trained on Static Code
    Christian Navasca
    Guoqing Harry Xu
    2023 ACM SIGPLAN International Symposium on Memory Management (ISMM 2023)
    Preview abstract Memory allocators and runtime systems can leverage dynamic properties of heap allocations – such as object lifetimes, hotness or access correlations – to improve performance and resource consumption. A significant amount of work has focused on approaches that collect this information in performance profiles and then use it in new memory allocator or runtime designs, both offline (in ahead-of-time compilers) and online (in JIT compilers). This is a special instance of profile-guided optimization. This approach has significant disadvantages: 1) The profiling oftentimes introduces substantial overheads, which are prohibitive in many production scenarios, 2) Creating a representative profiling run adds significant engineering complexity and reduces deployment velocity, and 3) Profiles gathered ahead of time or during the warm-up phase of a server are often not representative of all workload behavior and may miss important corner cases. In this paper, we investigate a fundamentally different approach. Instead of deriving heap allocation properties from profiles, we explore the ability of neural network models to predict them from the statically available code. As an intellectual abstract, we do not offer a conclusive answer but describe the trade-off space of this approach, investigate promising directions, motivate these directions with data analysis and experiments, and highlight challenges that future work needs to overcome. View details
    Learning to Answer Semantic Queries over Code
    Surya Prakash Sahu
    Madhurima Mandal
    Shikhar Bharadwaj
    Aditya Kanade
    Shirish Shevade
    Google Research (2022)
    Preview abstract During software development, developers need answers to queries about semantic aspects of code. Even though extractive question-answering using neural approaches has been studied widely in natural languages, the problem of answering semantic queries over code using neural networks has not yet been explored. This is mainly because there is no existing dataset with extractive question and answer pairs over code involving complex concepts and long chains of reasoning. We bridge this gap by building a new, curated dataset called CodeQueries, and proposing a neural question-answering methodology over code. We build upon state-of-the-art pre-trained models of code to predict answer and supporting-fact spans. Given a query and code, only some of the code may be relevant to answer the query. We first experiment under an ideal setting where only the relevant code is given to the model and show that our models do well. We then experiment under three pragmatic considerations: (1) scaling to large-size code, (2) learning from a limited number of examples and (3) robustness to minor syntax errors in code. Our results show that while a neural model can be resilient to minor syntax errors in code, increasing size of code, presence of code that is not relevant to the query, and reduced number of training examples limit the model performance. We are releasing our data and models to facilitate future work on the proposed problem of answering semantic queries over code. View details
    Preview abstract Designing a suitable representation for code-reasoning tasks is challenging in aspects such as the kinds of program information to model, how to combine them, and how much context to consider. We propose CodeTrek, a deep learning approach that addresses these challenges by representing codebases as databases that conform to rich relational schemas. The relational representation not only allows CodeTrek to uniformly represent diverse kinds of program information, but also to leverage program-analysis queries to derive new semantic relations, which can be readily incorporated without further architectural engineering. CodeTrek embeds this relational representation using a set of walks that can traverse different relations in an unconstrained fashion, and incorporates all relevant attributes along the way. We evaluate CodeTrek on four diverse and challenging Python tasks: variable misuse, exception prediction, unused definition, and variable shadowing. CodeTrek achieves an accuracy of 91%, 63%, 98%, and 94% on these tasks respectively, and outperforms state-of-the-art neural models by 2--19% points. View details
    Learning to Walk over Relational Graphs of Source Code
    Pardis Pashakhanloo
    Aaditya Naik
    Mayur Naik
    Deep Learning for Code (DL4C) Workshop @ ICLR 2022 (2022)
    Preview abstract Information-rich relational graphs have shown great potential in designing effective representations of code for program-understanding tasks. However, the wealth of structural and semantic information in such graphs can overwhelm models, because of their limited input size. A promising approach for overcoming this challenge is to gather presumed-relevant but smaller context from a larger graph, and random walks over graphs was one of the first such approaches discovered. We propose a deep-learning approach that improves upon random walks by learning task-specific walk policies that guide the traversal of the graph towards the most relevant context. In the setting of relational graphs representing programs and their semantic properties, we observe that models that employ learned policies for guiding walks are 6--36% points more accurate than models that employ uniform random walks, and 0.2--3.5% points more accurate than models that employ expert knowledge for guiding the walks. View details
    Preview abstract Graph representations of programs are commonly a central element of machine learning for code research. We introduce an open source Python library python_graphs that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. Our library admits the construction of control-flow graphs, data-flow graphs, and composite "program graphs" that combine control-flow, data-flow, syntactic, and lexical information about a program. We present the capabilities and limitations of the library, perform a case-study applying the library to millions of competitive programming submissions, and showcase the library's utility for machine learning research. View details
    Preview abstract Spreadsheet formula prediction has been an important program synthesis problem with many real-world applications. Previous works typically utilize input-output examples as the specification for spreadsheet formula synthesis, where each input-output pair simulates a separate row in the spreadsheet. However, this formulation does not fully capture the rich context in real-world spreadsheets. First, spreadsheet data entries are organized as tables, thus rows and columns are not necessarily independent from each other. In addition, many spreadsheet tables include headers, which provide high-level descriptions of the cell data. However, previous synthesis approaches do not consider headers as part of the specification. In this work, we present the first approach for synthesizing spreadsheet formulas from tabular context, which includes both headers and semi-structured tabular data. In particular, we propose SpreadsheetCoder, a BERT-based model architecture to represent the tabular context in both row-based and column-based formats. We train our model on a large dataset of spreadsheets, and demonstrate that SpreadsheetCoder achieves top-1 prediction accuracy of 42:51%, which is a considerable improvement over baselines that do not employ rich tabular context. Compared to a rule-based system, SpreadsheetCoder assists 82% more users in composing formulas on Google Sheets. View details
    PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair
    Zimin Chen
    Vincent J Hellendoorn
    Subhodeep Moitra
    Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021) (2021)
    Preview abstract Machine learning for understanding and editing source code has recently attracted significant interest, with many developments in new models, new code representations, and new tasks. This proliferation can appear disparate and disconnected, making each approach seemingly unique and incompatible, thus obscuring the core machine learning challenges and contributions. In this work, we demonstrate that the landscape can be significantly simplified by taking a general approach of mapping a graph to a sequence of tokens and pointers. Our main result is to show that 16 recently published tasks of different shapes can be cast in this form, based on which a single model architecture achieves near or above state-of-the-art results on nearly all tasks, outperforming custom models like code2seq and alternative generic models like Transformers. This unification further enables multi-task learning and a series of cross-cutting experiments about the importance of different modeling choices for code understanding and repair tasks. The full framework, called PLUR, is easily extensible to more tasks, and will be open-sourced (link). View details
    Snowboard: Finding Kernel Concurrency Bugs through Systematic Inter-thread Communication Analysis
    Sishuai Gong
    Pedro Fonseca
    Proceedings of the 28th ACM Symposium on Operating Systems Principles (2021) (to appear)
    Preview abstract Kernel concurrency bugs are challenging to find because they depend on very specific thread interleavings and test inputs. While separately exploring kernel thread interleavings or test inputs has been closely examined, jointly exploring interleavings and test inputs has received little attention, in part due to the resulting vast search space. Using precious, limited testing resources to explore this search space and execute just the right concurrent tests in the proper order is critical. This paper proposes Snowboard a testing framework that generates and executes concurrent tests by intelligently exploring thread interleavings and test inputs jointly. The design of Snowboard is based on a concept called potential memory communication (PMC), a guess about pairs of tests that, when executed concurrently, are likely to perform memory accesses to shared addresses, which in turn may trigger concurrency bugs. To identify PMCs, Snowboard runs tests sequentially from a fixed initial kernel state, collecting their memory accesses. It then pairs up tests that write and read the same region into candidate concurrent tests. It executes those tests using the associated PMC as a scheduling hint to focus interleaving search only on those schedules that directly affect the relevant memory accesses. By clustering candidate tests on various features of their PMCs, Snowboard avoids testing similar behaviors, which would be inefficient. Finally, by executing tests from small clusters first, it prioritizes uncommon suspicious behaviors that may have received less scrutiny. Snowboard discovered 14 new concurrency bugs in Linux kernels 5.3.10 and 5.12-rc3, of which 12 have been confirmed by developers. Six of these bugs cause kernel panics and filesystem errors, and at least two have existed in the kernel for many years, showing that this approach can uncover hard-to-find, critical bugs. Furthermore, we show that covering as many distinct pairs of uncommon read/write instructions as possible is the test-prioritization strategy with the highest bug yield for a given test-time budget. View details
    Learning and Evaluating Contextual Embedding of Source Code
    Aditya Kanade
    International Conference on Machine Learning (ICML), Vienna, Austria (2020)
    Preview abstract Recent research has achieved impressive results on understanding and improving source code by building up on machine-learning techniques developed for natural languages. A significant advancement in natural-language understanding has come with the development of pre-trained contextual embeddings, such as BERT, which can be fine-tuned for downstream tasks with less labeled data and training budget, while achieving better accuracies. However, there is no attempt yet to obtain a high-quality contextual embedding of source code, and to evaluate it on multiple program-understanding tasks simultaneously; that is the gap that this paper aims to mitigate. Specifically, first, we curate a massive, deduplicated corpus of 6M Python files from GitHub, which we use to pre-train CuBERT, an open-sourced code-understanding BERT model; and, second, we create an open-sourced benchmark that comprises five classification tasks and one program-repair task, akin to code-understanding tasks proposed in the literature before. We fine-tune CuBERT on our benchmark tasks, and compare the resulting models to different variants of Word2Vec token embeddings, BiLSTM and Transformer models, as well as published state-of-the-art models, showing that CuBERT outperforms them all, even with shorter training, and with fewer labeled examples. Future work on source-code embedding can benefit from reusing our benchmark, and comparing against CuBERT models as a strong baseline. View details
    überSpark: Practical, Provable, End-to-End Guarantees on Commodity Heterogenous Interconnected Computing Platforms
    Amit Vasudevan
    Ruben Martins
    ACM SIGOPS Operating Systems Review, vol. Vol. 54, No. 1 (2020), pp. 8-22
    Preview abstract Today’s computing ecosystem, comprising commodity heterogeneous interconnected computing (CHIC) platforms, is increasingly being employed for critical applications, consequently demanding fairly strong end-to-end assurances.However, the generality and system complexity of today’s CHIC stack seem to outpace existing tools and methodologies towards provable end-to-end guarantees. This paper describes our on-going research, and presents überSpark, a system architecture that argues for structuring the CHIC stack around Universal Object Abstractions (üobjects), a fundamental system abstraction and building block towards practical and provable end-to-end guarantees. überSpark is designed to be realizable on heterogeneous hardware platforms with disparate capabilities, and facilitates compositional end-to-end reasoning and efficient implementation. überSpark also supports the use of multiple verification techniques towards properties of different flavors, for development compatible, incremental verification,co-existing and meshing with unverified components, at a fine granularity, and wide applicability to all layers of the CHIC stack. We discuss the CHIC stack challenges, illustrate our design decisions, describe the überSpark architecture, present our foundational steps, and outline on-going and future research activities. We anticipate überSpark to retrofit and unlock a wide range of unprecedented end-to-end provable guarantees on today’s continuously evolving CHIC stack. View details
    Preview abstract We present a new program synthesis approach that combines an encoder-decoder based synthesis architecture with a differentiable program fixer. Our approach is inspired from the fact that human developers seldom get their program correct in the first attempt, and perform iterative testing-based program fixing to get to the desired program functionality. Similarly, our approach first learns a distribution over programs conditioned on an encoding of a set of input-output examples, and then iteratively performs fix operations using the program fixer module. The fixer module takes as input the original examples and the current program outputs on example inputs, and generates a new distribution over the programs with the goal of reducing the discrepancies between the current program outputs and the desired example outputs. We train our architecture end-to-end on the RobustFill domain, and show that the addition of fixer module leads to a significant improvement of 7% on synthesis compared to using beam search. View details
    Global Relational Models of Source Code
    Vincent Josua Hellendoorn
    Rishabh Singh
    International Conference on Learning Representations (ICLR) (2020)
    Preview abstract Models of code can learn distributed representations of a program’s syntax and semantics to predict many non-trivial properties of a program. Recent state-of-the-art models leverage highly structured representations of programs, such as trees, graphs and paths therein (e.g., data-flow relations), which are precise and abundantly available for code. This provides a strong inductive bias towards semantically meaningful relations, yielding more generalizable representations than classical sequence-based models. Unfortunately, these models primarily rely on graph-based message passing to represent relations in code, which makes them de facto local due to the high cost of message-passing steps, quite in contrast to modern, global sequence-based models, such as the Transformer. In this work, we bridge this divide between global and structured models by introducing two new hybrid model families that are both global and incorporate structural bias: Graph Sandwiches, which wrap traditional (gated) graph message-passing layers in sequential message-passing layers; and Graph Relational Embedding Attention Transformers (GREAT for short), which bias traditional Transformers with relational information from graph edge types. By studying a popular, non-trivial program repair task, variable-misuse identification, we explore the relative merits of traditional and hybrid model families for code representation. Starting with a graph-based model that already improves upon the prior state-of-the-art for this task by 20%, we show that our proposed hybrid models improve an additional 10–15%, while training both faster and using fewer parameters. View details
    Preview abstract Due to its potential to improve programmer productivity and software quality, automated program repair has been an active topic of research. Newer techniques harness neural networks to learn directly from examples of buggy programs and their fixes. In this work, we consider a recently identified class of bugs called variable-misuse bugs. The state-of-the-art solution for variable misuse enumerates potential fixes for all possible bug locations in a program, before selecting the best prediction. We show that it is beneficial to train a model that jointly and directly localizes and repairs variable-misuse bugs. We present multi-headed pointer networks for this purpose, with one head each for localization and repair. The experimental results show that the joint model significantly outperforms an enumerative solution that uses a pointer based model for repair alone. View details
    Oscar: A Practical Page-Permissions-Based Scheme for Thwarting Dangling Pointers
    Thurston H.Y. Dang
    David Wagner
    USENIX Security, USENIX, Vancouver, BC, Canada (2017) (to appear)
    Preview abstract Using memory after it has been freed opens programs up to both data and control-flow exploits. Recent work on temporal memory safety has focused on using explicit lock-and-key mechanisms (objects are assigned a new lock upon allocation, and pointers must have the correct key to be dereferenced) or corrupting the pointer values upon free(). Placing objects on separate pages and using page permissions to enforce safety is an older, well-known technique that has been maligned as too slow, without comprehensive analysis. We show that both old and new techniques are conceptually instances of lock-and-key, and argue that, in principle, page permissions should be the most desirable approach. We then validate this insight experimentally by designing, implementing, and evaluating Oscar, a new protection scheme based on page permissions. Unlike prior attempts, Oscar does not require source code, is compatible with standard and custom memory allocators, and works correctly with programs that fork. Also, Oscar performs favorably – often by more than an order of magnitude – compared to recent proposals: overall, it has similar or lower runtime overhead, and lower memory overhead than competing systems. View details
    Glimmers: Resolving the Privacy/Trust Quagmire
    David Lie
    ACM Hot Topics in Operating Systems (HotOS), ACM SIGOPS, Whistler, British Columbia, Canada (2017)
    Preview abstract Users today enjoy access to a wealth of services that rely on user-contributed data, such as recommendation services, prediction services, and services that help classify and interpret raw data. The quality of such services inescapably relies on trustworthy contributions from users. However, validating the trustworthiness of contributions can often rely on supporting contextual data, which may contain privacy-sensitive information, such as a user's location or usage habits, creating a conflict between privacy and trust: users benefit from a higher-quality service that identifies and removes illegitimate user contributions, but, at the same time, they may be reluctant to let the service access their private information to achieve this high quality. We argue that this conflict can be resolved with a pragmatic Glimmer of Trust, which allows services to validate user contributions in a trustworthy way without forfeiting user privacy. We describe how trustworthy hardware such as Intel's SGX can be used on the client-side---in contrast to much recent work exploring SGX in cloud services---to realize the Glimmer architecture, and demonstrate how this realization is able to resolve the tension between privacy and trust in a variety of cases. View details
    Prochlo: Strong Privacy for Analytics in the Crowd
    Andrea Bittau
    Úlfar Erlingsson
    Ilya Mironov
    Ananth Raghunathan
    David Lie
    Ushasree Kode
    Julien Tinnes
    Bernhard Seefeld
    Proceedings of the Symposium on Operating Systems Principles (SOSP) (2017), pp. 441-459
    Preview abstract The large-scale monitoring of computer users’ software activities has become commonplace, e.g., for application telemetry, error reporting, or demographic profiling. This paper describes a principled systems architecture—Encode, Shuffle, Analyze (ESA)—for performing such monitoring with high utility while also protecting user privacy. The ESA design, and its Prochlo implementation, are informed by our practical experiences with an existing, large deployment of privacy-preserving software monitoring. With ESA, the privacy of monitored users’ data is guaranteed by its processing in a three-step pipeline. First, the data is encoded to control scope, granularity, and randomness. Second, the encoded data is collected in batches subject to a randomized threshold, and blindly shuffled, to break linkability and to ensure that individual data items get “lost in the crowd” of the batch. Third, the anonymous, shuffled data is analyzed by a specific analysis engine that further prevents statistical inference attacks on analysis results. ESA extends existing best-practice methods for sensitive-data analytics, by using cryptography and statistical techniques to make explicit how data is elided and reduced in precision, how only common-enough, anonymous data is analyzed, and how this is done for only specific, permitted purposes. As a result, ESA remains compatible with the established workflows of traditional database analysis. Strong privacy guarantees, including differential privacy, can be established at each processing step to defend against malice or compromise at one or more of those steps. Prochlo develops new techniques to harden those steps, including the Stash Shuffle, a novel scalable and efficient oblivious-shuffling algorithm based on Intel’s SGX, and new applications of cryptographic secret sharing and blinding. We describe ESA and Prochlo, as well as experiments that validate their ability to balance utility and privacy. View details
    Preview abstract We present überSpark (üSpark), an innovative architecture for compositional verification of security properties of extensible hypervisors written in C and Assembly. üSpark comprises two key ideas: (i) endowing low-level system software with abstractions found in higher-level languages (e.g., objects, interfaces, function-call semantics for implementations of interfaces, access control on interfaces, concurrency and serialization), enforced using a combination of commodity hardware mechanisms and light-weight static analysis; and (ii) interfacing with platform hardware by programming in Assembly using an idiomatic style (called CASM) that is verifiable via tools aimed at C, while retaining its performance and low-level access to hardware. After verification, the C code is compiled using a certified compiler while the CASM code is translated into its corresponding Assembly instructions. Collectively, these innovations enable compositional verification of security invariants without sacrificing performance. We validate üSpark by building and verifying security invariants of an existing open-source commodity x86 micro-hypervisor and several of its extensions, and demonstrating only minor performance overhead with low verification costs. View details
    Mantis: Efficient Predictions of Execution Time, Energy Usage, Memory Usage and Network Usage on Smart Mobile Devices
    Yongin Kwon
    Sangmin Lee
    Hayoon Yi
    Donghyun Kwon
    Seungjun Yang
    Byung-Gon Chun
    Ling Huang
    Mayur Naik
    Yunheung Paek
    IEEE Transactions on Mobile Computing, vol. 14 (2015), pp. 2059-2072
    Preview abstract We present Mantis, a framework for predicting the computational resource consumption (CRC) of Android applications on given inputs accurately, and efficiently. A key insight underlying Mantis is that program codes often contain features that correlate with performance and these features can be automatically computed efficiently. Mantis synergistically combines techniques from program analysis and machine learning. It constructs concise CRC models by choosing from many program execution features only a handful that are most correlated with the program’s CRC metric yet can be evaluated efficiently from the program’s input. We apply program slicing to reduce evaluation time of a feature and automatically generate executable code snippets for efficiently evaluating features. Our evaluation shows that Mantis predicts four CRC metrics of seven Android apps with estimation error in the range of 0-11.1 percent by executing predictor code spending at most 1.3 percent of their execution time on Galaxy Nexus. View details
    The Performance Cost of Shadow Stacks and Stack Canaries
    Thurston H.Y. Dang
    David Wagner
    Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security (ASIACCS), ACM (2015), pp. 555-566
    Preview abstract Control flow defenses against ROP either use strict, expensive, but strong protection against redirected RET instructions with shadow stacks, or much faster but weaker protections without. In this work we study the inherent overheads of shadow stack schemes. We find that the overhead is roughly 10% for a traditional shadow stack. We then design a new scheme, the parallel shadow stack, and show that its performance cost is significantly less: 3.5%. Our measurements suggest it will not be easy to improve performance on current x86 processors further, due to inherent costs associated with RET and memory load/store instructions. We conclude with a discussion of the design decisions in our shadow stack instrumentation, and possible lighter-weight alternatives. View details
    Making programs forget: Enforcing Lifetime for Sensitive Data
    Jayanthkumar Kannan
    Gautam Altekar
    Byung-Gon Chun
    Proceedings of the 13th USENIX conference on Hot topics in operating systems, USENIX Association, Berkeley, CA, USA (2013)
    Preview abstract This paper introduces guaranteed data lifetime, a novel system property ensuring that sensitive data cannot be retrieved from a system beyond a specified time. The trivial way to achieve this is to "reboot"; however, this is disruptive from the user's perspective, and may not even eliminate disk copies. We discuss an alternate approach based on state re-incarnation where data expiry is completely transparent to the user, and can be used even if the system is not designed a priori to provide the property. View details
    Towards verifiable resource accounting for outsourced computation
    Chen Chen
    Adrian Perrig
    Amit Vasudevan
    Vyas Sekar
    VEE (2013), pp. 167-178
    Mantis: Automatic Performance Prediction for Smartphone Applications
    Yongin Kwon
    Sangmin Lee
    Hayoon Yi
    Donghyun Kwon
    Seungjun Yang
    Byung-Gon Chun
    Ling Huang
    Mayur Naik
    Yunheung Paek
    USENIX Annual Technical Conference (2013), pp. 297-308
    Symbolic software model validation
    Cynthia Sturton
    Rohit Sinha
    Thurston H. Y. Dang
    Sakshi Jain
    Michael McCoyd
    Wei Yang Tan
    Sanjit A. Seshia
    David Wagner
    MEMOCODE (2013), pp. 97-108
    Verification with small and short worlds
    Rohit Sinha
    Cynthia Sturton
    Sanjit A. Seshia
    David Wagner
    FMCAD (2012), pp. 68-77
    Path-exploration lifting: hi-fi tests for lo-fi emulators
    Lorenzo Martignoni
    Stephen McCamant
    Pongsin Poosankam
    Dawn Song
    ASPLOS (2012), pp. 337-348
    Small trusted primitives for dependable systems
    Byung-Gon Chun
    Operating Systems Review, vol. 45 (2011), pp. 126-141
    Verifiable resource accounting for cloud computing services
    Vyas Sekar
    CCSW (2011), pp. 21-26
    Do You Know Where Your Data Are? Secure Data Capsules for Deployable Data Protection
    Devdatta Akhawe
    Kevin Fall
    Elaine Shi
    Dawn Song
    HotOS (2011)
    Secure Data Preservers for Web Services
    Jayanthkumar Kannan
    Byung-Gon Chun
    WebApps (2011)
    CloneCloud: elastic execution between mobile device and cloud
    Byung-Gon Chun
    Sunghwan Ihm
    Mayur Naik
    Ashwin Patti
    EuroSys (2011), pp. 301-314
    Secure Data Preservers for Web Services
    Jayanthkumar Kannan
    Byung-Gon Chun
    Proceedings of the 2nd USENIX conference on Web Application Development, USENIX Association, Berkeley, CA, USA (2011)
    CloneCloud: Boosting Mobile Device Applications Through Cloud Clone Execution
    Byung-Gon Chun
    Sunghwan Ihm
    Mayur Naik
    CoRR, vol. abs/1009.3088 (2010)
    Verifiable Network-Performance Measurements
    Katerina J. Argyraki
    Ankit Singla
    CoRR, vol. abs/1005.3148 (2010)
    A Data Capsule Framework For Web Services: Providing Flexible Data Access Control To Users
    Jayanthkumar Kannan
    Byung-Gon Chun
    CoRR, vol. abs/1002.0298 (2010)
    Verifiable network-performance measurements
    Katerina J. Argyraki
    Ankit Singla
    CoNEXT (2010), pp. 1
    Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression
    Ling Huang
    Jinzhu Jia
    Bin Yu
    Byung-Gon Chun
    Mayur Naik
    NIPS (2010), pp. 883-891
    Mantis: Predicting System Performance through Program Analysis and Modeling
    Byung-Gon Chun
    Ling Huang
    Sangmin Lee
    Mayur Naik
    CoRR, vol. abs/1010.0019 (2010)
    Not-a-Bot: Improving Service Availability in the Face of Botnet Attacks
    Ramakrishna Gummadi
    Hari Balakrishnan
    Sylvia Ratnasamy
    NSDI (2009), pp. 307-320
    Declarative networking
    Boon Thau Loo
    Tyson Condie
    Minos N. Garofalakis
    David E. Gay
    Joseph M. Hellerstein
    Raghu Ramakrishnan
    Timothy Roscoe
    Ion Stoica
    Commun. ACM, vol. 52 (2009), pp. 87-95
    Zeno: Eventually Consistent Byzantine-Fault Tolerance
    Atul Singh
    Pedro Fonseca 0001
    Petr Kuznetsov
    Rodrigo Rodrigues
    NSDI (2009), pp. 169-184
    Augmented Smartphone Applications Through Clone Cloud Execution
    Byung-Gon Chun
    HotOS (2009)
    Tiered Fault Tolerance for Long-Term Integrity
    Byung-Gon Chun
    Scott Shenker
    John Kubiatowicz
    FAST (2009), pp. 267-282
    Evita raced: metacompilation for declarative networks
    Tyson Condie
    David Chu
    Joseph M. Hellerstein
    PVLDB, vol. 1 (2008), pp. 1153-1165
    BFT Protocols Under Fire
    Atul Singh
    Tathagata Das
    Peter Druschel
    Timothy Roscoe
    NSDI (2008), pp. 189-204
    Diverse Replication for Single-Machine Byzantine-Fault Tolerance
    Byung-Gon Chun
    Scott Shenker
    USENIX Annual Technical Conference (2008), pp. 287-292
    Friday: Global Comprehension for Distributed Replay
    Dennis Geels
    Gautam Altekar
    Timothy Roscoe
    Ion Stoica
    NSDI (2007)
    Attested append-only memory: making adversaries stick to their word
    Byung-Gon Chun
    Scott Shenker
    John Kubiatowicz
    SOSP (2007), pp. 189-204
    Loss and Delay Accountability for the Internet
    Katerina J. Argyraki
    O. Irzak
    S. Ashish
    Scott Shenker
    ICNP (2007), pp. 194-205
    Proof Sketches: Verifiable In-Network Aggregation
    Minos N. Garofalakis
    Joseph M. Hellerstein
    ICDE (2007), pp. 996-1005
    Public Health for the Internet (PHI)
    Joseph M. Hellerstein
    Tyson Condie
    Minos N. Garofalakis
    Boon Thau Loo
    Timothy Roscoe
    CIDR (2007), pp. 332-340
    Using queries for distributed monitoring and forensics
    Atul Singh
    Timothy Roscoe
    Peter Druschel
    EuroSys (2006), pp. 389-402
    Declarative networking: language, execution and optimization
    Boon Thau Loo
    Tyson Condie
    Minos N. Garofalakis
    David E. Gay
    Joseph M. Hellerstein
    Raghu Ramakrishnan
    Timothy Roscoe
    Ion Stoica
    SIGMOD Conference (2006), pp. 97-108
    Induced Churn as Shelter from Routing-Table Poisoning
    Tyson Condie
    Varun Kacholia
    Sriram Sank
    Joseph M. Hellerstein
    NDSS (2006)
    A fresh look at the reliability of long-term digital storage
    Mary Baker
    Mehul A. Shah
    David S. H. Rosenthal
    Mema Roussopoulos
    Thomas J. Giuli
    Prashanth P. Bungale
    EuroSys (2006), pp. 221-234
    The Many Faces of Systems Research - and How to Evaluate Them
    Aaron B. Brown
    Anupam Chanda
    Rik Farrow
    Alexandra Fedorova
    Michael L. Scott
    HotOS (2005)
    A Fresh Look at the Reliability of Long-term Digital Storage
    Mary Baker
    Mehul A. Shah
    David S. H. Rosenthal
    Mema Roussopoulos
    Thomas J. Giuli
    Prashanth P. Bungale
    CoRR, vol. abs/cs/0508130 (2005)
    The Architecture of PIER: an Internet-Scale Query Processor
    Ryan Huebsch
    Brent N. Chun
    Joseph M. Hellerstein
    Boon Thau Loo
    Timothy Roscoe
    Scott Shenker
    Ion Stoica
    Aydan R. Yumerefendi
    CIDR (2005), pp. 28-43
    The LOCKSS peer-to-peer digital preservation system
    Mema Roussopoulos
    Thomas J. Giuli
    David S. H. Rosenthal
    Mary Baker
    ACM Trans. Comput. Syst., vol. 23 (2005), pp. 2-50
    Implementing declarative overlays
    Boon Thau Loo
    Tyson Condie
    Joseph M. Hellerstein
    Timothy Roscoe
    Ion Stoica
    SOSP (2005), pp. 75-90
    Attrition Defenses for a Peer-to-Peer Digital Preservation System
    Thomas J. Giuli
    Mary Baker
    David S. H. Rosenthal
    Mema Roussopoulos
    USENIX Annual Technical Conference, General Track (2005), pp. 163-178
    Notes On The Design Of An Internet Adversary
    David S. H. Rosenthal
    Mema Roussopoulos
    Thomas J. Giuli
    Mary Baker
    CoRR, vol. cs.DL/0411078 (2004)
    2 P2P or Not 2 P2P?
    Mema Roussopoulos
    Mary Baker
    David S. H. Rosenthal
    Thomas J. Giuli
    IPTPS (2004), pp. 33-43
    Impeding attrition attacks in P2P systems
    Thomas J. Giuli
    Mema Roussopoulos
    David S. H. Rosenthal
    Mary Baker
    ACM SIGOPS European Workshop (2004), pp. 12
    2 P2P or Not 2 P2P?
    Mema Roussopoulos
    Mary Baker
    David S. H. Rosenthal
    Thomas J. Giuli
    IPTPS (2004), pp. 33-43
    Preserving peer replicas by rate-limited sampled voting
    David S. H. Rosenthal
    Mema Roussopoulos
    Mary Baker
    Thomas J. Giuli
    Yanto Muliadi
    SOSP (2003), pp. 44-59
    2 P2P or Not 2 P2P?
    Mema Roussopoulos
    Mary Baker
    David S. H. Rosenthal
    Thomas J. Giuli
    CoRR, vol. cs.NI/0311017 (2003)
    Authenticated Append-only Skip Lists
    Mary Baker
    CoRR, vol. cs.CR/0302010 (2003)
    A Historic Name-Trail Service
    Mary Baker
    WMCSA (2003), pp. 88-99
    2 P2P or Not 2 P2P?
    Mema Roussopoulos
    Mary Baker
    David S. H. Rosenthal
    Thomas J. Giuli
    CoRR, vol. cs.NI/0311017 (2003)
    Preserving Peer Replicas By Rate-Limited Sampled Voting in LOCKSS
    Mema Roussopoulos
    Thomas J. Giuli
    David S. H. Rosenthal
    Mary Baker
    Yanto Muliadi
    CoRR, vol. cs.DC/0303026 (2003)
    Secure History Preservation Through Timeline Entanglement
    Mary Baker
    CoRR, vol. cs.DC/0202005 (2002)
    Peer-to-Peer Caching Schemes to Address Flash Crowds
    Tyron Stading
    Mary Baker
    IPTPS (2002), pp. 203-213
    Secure History Preservation Through Timeline Entanglement
    Mary Baker
    USENIX Security Symposium (2002), pp. 297-312
    Enabling the Archival Storage of Signed Documents
    Mary Baker
    FAST (2002), pp. 31-45
    A Historic Name-Trail Service
    Mary Baker
    CoRR, vol. cs.NI/0210019 (2002)
    Enabling the Long-Term Archival of Signed Documents through Time Stamping
    Thomas J. Giuli
    Mary Baker
    CoRR, vol. cs.DC/0106058 (2001)
    The mobile people architecture
    Mema Roussopoulos
    Edward Swierk
    Kevin Lai
    Guido Appenzeller
    Xinhua Zhao
    Mary Baker
    Mobile Computing and Communications Review, vol. 3 (1999), pp. 36-42
    Person-level Routing in the Mobile People Architecture
    Mema Roussopoulos
    Edward Swierk
    Kevin Lai
    Guido Appenzeller
    Mary Baker
    USENIX Symposium on Internet Technologies and Systems (1999)