Keun Soo YIM

K. S. Yim is an experimental computer scientist working as a software engineer (technical lead) at Google. His current research interests include reliability, quality, security, and productivity techniques for software development (e.g., video-on-demand, IoT client, web, big data, and machine learning applications). KS holds 30+ United States patents on intelligent management of mobile and cloud systems and has published 18+ technical papers in top-ranked journals and conferences. He obtained his Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign. Dr. Yim has served as co-chair of the industry track of three international symposia (in software reliability and dependable computing) and has also served on multiple program committees of international conferences and workshops (in fault-tolerant computing, software engineering, computer systems, and parallel and distributed computing). Dr. Yim is a senior member of IEEE.
Authored Publications
    Task-oriented queries (e.g., one-shot queries to play videos, order food, or call a taxi) are crucial for assessing the quality of virtual assistants, chatbots, and other large language model (LLM)-based services. However, a standard benchmark for task-oriented queries is not yet available, as existing benchmarks in the relevant NLP (Natural Language Processing) fields have primarily focused on task-oriented dialogues. Thus, we present a new methodology for efficiently generating the Task-oriented Queries Benchmark (ToQB) using existing task-oriented dialogue datasets and an LLM service. Our methodology involves formulating the underlying NLP task to summarize the original intent of a speaker in each dialogue, detailing the key steps to perform the devised NLP task using an LLM service, and outlining a framework for automating a major part of the benchmark generation process. Through a case study encompassing three domains (i.e., two single-task domains and one multi-task domain), we demonstrate how to customize the LLM prompts (e.g., omitting system utterances or speaker labels) for those three domains and characterize the generated task-oriented queries. The generated ToQB dataset is made available to the public. We further discuss new domains that can be added to ToQB by community contributors, as well as its practical applications.
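A minimal sketch of the dialogue-to-query summarization step described above, assuming a generic llm_complete() client; the prompt wording and the helper name are illustrative assumptions, not the prompts used for ToQB.

```python
# Hypothetical sketch: summarize a task-oriented dialogue into a one-shot query.
# llm_complete() is an assumed stand-in for any LLM completion client.

def dialogue_to_query(turns, llm_complete, keep_system_utterances=False):
    """Summarize the speaker's original intent in a dialogue as one short query."""
    lines = []
    for speaker, utterance in turns:
        if speaker == "SYSTEM" and not keep_system_utterances:
            continue  # prompt customization: omit system utterances
        lines.append(utterance)  # prompt customization: omit speaker labels
    prompt = ("Summarize the user's original intent in the following dialogue "
              "as one short, self-contained query:\n" + "\n".join(lines))
    return llm_complete(prompt)

# Example with a stubbed client:
# dialogue_to_query(
#     [("USER", "I need a taxi to the airport at 7am."),
#      ("SYSTEM", "Booked. Anything else?")],
#     llm_complete=lambda p: "Book a taxi to the airport for 7 am.")
```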
    This chapter explores the possibility of building a unified assessment methodology for software reliability and security. The fault injection methodology originally designed for reliability assessment is extended to quantify and characterize the security defense aspects of native applications, i.e., system software written in the C/C++ programming languages. Specifically, software fault injection is used to measure the portion of injected software faults caught by the built-in error detection mechanisms of a target program (e.g., the detection coverage of assertions). To automatically activate as many injected faults as possible, a gray-box fuzzing technique is used. Using dynamic analyzers during fuzzing further helps us catch the critical error propagation paths of injected (but undetected) faults and identify code fragments as targets for security hardening. Because conducting software fault injection experiments for fuzzing is an expensive process, a novel, locality-based fault selection algorithm is presented. The presented algorithm increases the fuzzing failure ratios by 3–19 times, accelerating the experiments. The case studies use all of the above experimental techniques to compare the effectiveness of fuzzing and testing, and consequently to assess the security defenses of native benchmark programs.
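As an illustration of the detection-coverage measurement described above, the sketch below assumes a harness, run_with_fault() (a hypothetical placeholder), that rebuilds the target with one injected fault and runs it on fuzz-generated inputs, then reports what fraction of activated faults the built-in detectors caught.

```python
from collections import Counter

def detection_coverage(fault_sites, fuzz_inputs, run_with_fault):
    """Fraction of activated injected faults caught by built-in detectors (e.g., assertions)."""
    outcomes = Counter()
    for site in fault_sites:
        for data in fuzz_inputs:
            # run_with_fault is assumed to return one of:
            # "not_activated", "detected", "crash", or "silent"
            outcomes[run_with_fault(site, data)] += 1
    activated = sum(n for label, n in outcomes.items() if label != "not_activated")
    return outcomes["detected"] / activated if activated else 0.0
```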
    TREBLE: Fast Software Updates by Creating an Equilibrium in an Active Software Ecosystem of Globally Distributed Stakeholders
    Iliyan Batanov Malchev
    Andrew Hsieh
    Dave Burke
    ACM Transactions on Embedded Computing Systems, 18(5s) (2019), 23 pages
    This paper presents our experience with TREBLE, a two-year initiative to build the modular base in Android, a Java-based mobile platform running on the Linux kernel. Our TREBLE architecture splits the hardware-independent core framework written in Java from the hardware-dependent vendor implementations (e.g., user-space device drivers, vendor native libraries, and kernel code written in C/C++). Cross-layer communication between them is done via versioned, stable inter-process communication interfaces whose backward compatibility is tested using two API compliance suites. Based on this architecture, we repackage the key Android software components that suffered from crucial post-launch security bugs as separate images. This not only enables separate ownership but also independent updates of each image by interested ecosystem entities. We discuss our experience of delivering the TREBLE architectural changes to silicon vendors and device makers using a yearly release model. Our experiments and industry rollouts support our hypothesis that giving more freedom to all ecosystem entities and creating an equilibrium are necessary to further scale the world's largest open ecosystem with over two billion active devices.
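The notion of a versioned, backward-compatible vendor interface can be pictured with a small compatibility check; the data model below (method name mapped to a signature string) is an assumption for illustration, not the actual compliance-suite implementation.

```python
def is_backward_compatible(old_iface, new_iface):
    """A newer interface version may add methods but must keep existing signatures."""
    for name, signature in old_iface.items():
        if new_iface.get(name) != signature:
            return False  # a removed or changed method breaks older framework callers
    return True

# Example: v1.1 of a hypothetical vendor HAL keeps every v1.0 method and adds one.
v1_0 = {"getChipRevision": "() -> int", "setPowerMode": "(mode: int) -> bool"}
v1_1 = dict(v1_0, setLowLatencyMode="(enable: bool) -> bool")
assert is_backward_compatible(v1_0, v1_1)
```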
    A Taste of Android Oreo (v8.0) Device Manufacturer
    Iliyan Batanov Malchev
    Dave Burke
    ACM Symposium on Operating Systems Principles (SOSP) - Tutorial (2017)
    In 2017, over two billion Android devices developed by more than a thousand device manufacturers (DMs) around the world are actively in use. Historically, silicon vendors (SVs), DMs, and telecom carriers extended the Android Open Source Project (AOSP) platform source code and used the customized code in final production devices. Such forking, however, makes it hard to accept upstream patches (e.g., security fixes). In order to reduce such software update costs, starting from Android v8.0, the new Vendor Test Suite (VTS) splits the hardware-independent framework and the hardware-dependent vendor implementation by using versioned, stable APIs (namely, the vendor interface). Android v8.0 thus opens the possibility of a fast upgrade of the Android framework as long as the underlying vendor implementation passes VTS. This tutorial teaches how to develop, test, and certify a compatible Android vendor interface implementation running below the framework. We use an Android Virtual Device (AVD) emulating an Android smartphone to implement a user-space device driver that uses formalized interfaces and RPCs, develop VTS tests for that component, execute the extended tests, and certify the extended vendor implementation.
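As a rough flavor of the hands-on exercise, the sketch below shows a host-side unit test against a stubbed driver proxy; the LightsStub class and its methods are hypothetical stand-ins and do not reflect the real VTS framework APIs.

```python
import unittest

class LightsStub:
    """Hypothetical stand-in for an RPC proxy to a user-space driver on the emulator."""
    def set_brightness(self, level):
        if not 0 <= level <= 255:
            raise ValueError("brightness out of range")
        self._level = level
        return True

    def get_brightness(self):
        return getattr(self, "_level", 0)

class LightsHalTest(unittest.TestCase):
    def test_set_and_get_brightness(self):
        hal = LightsStub()
        self.assertTrue(hal.set_brightness(128))
        self.assertEqual(hal.get_brightness(), 128)

if __name__ == "__main__":
    unittest.main()
```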
    The Rowhammer Attack Injection Methodology
    In Proceedings of the IEEE Symposium on Reliable Distributed Systems (SRDS) (2016), pp. 1-10
    This paper presents a systematic methodology to identify and validate security attacks that exploit user-influenceable hardware faults (i.e., rowhammer errors). We break down rowhammer attack procedures into nine generalized steps, where some steps are designed to increase the attack success probabilities. Our framework can perform those nine operations (e.g., pressuring system memory and spraying landing pages) as well as inject rowhammer errors, which are modeled as ≥3-bit errors. When one of the injected errors is activated, it can cause control- or data-flow divergences that are then caught by a prepared landing page and thus lead to a successful attack. Our experiments conducted against a guest operating system of a typical cloud hypervisor identified multiple reproducible targets for privilege escalation, shell injection, memory and disk corruption, and advanced denial-of-service attacks. Because the presented rowhammer attack injection (RAI) methodology uses error injection and thus statistical sampling, RAI can quantitatively evaluate the modeled rowhammer attack success probabilities for any given target software state.
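A toy sketch of the error model mentioned above: rowhammer errors injected as flips of three or more bits, with the attack success probability estimated by repeated sampling. run_trial() is a hypothetical placeholder for one end-to-end injection experiment.

```python
import random

def inject_multi_bit_error(word, num_bits=3, word_width=64):
    """Flip num_bits distinct bits of a memory word, modeling a >=3-bit rowhammer error."""
    for bit in random.sample(range(word_width), num_bits):
        word ^= 1 << bit
    return word

def estimate_attack_success(run_trial, trials=1000):
    """Fraction of injection trials ending in a successful attack (statistical sampling)."""
    return sum(bool(run_trial(inject_multi_bit_error)) for _ in range(trials)) / trials
```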
    Evaluation Metrics of Service-Level Reliability Monitoring Rules of a Big Data Service
    In Proceedings of the IEEE International Symposium on Software Reliability Engineering (ISSRE) (2016), pp. 376-387
    This paper presents new metrics to evaluate the reliability monitoring rules of a large-scale big data service. Our target service uses manually tuned, service-level reliability monitoring rules. Using the measurement data, we identify two key technical challenges in operating our target monitoring system. In order to improve operational efficiency, we characterize how those rules were manually tuned by the domain experts. The characterization results provide useful information to operators who are expected to regularly tune such rules. Using actual production failure data, we evaluate the same monitoring rules using both standard metrics and the presented metrics. Our evaluation results show the strengths and weaknesses of each metric, and show that the presented metrics can further help operators recognize when and which rules need to be re-tuned.
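For context, the standard metrics that the presented metrics are compared against can be sketched as alert-level precision and incident-level recall; the windowing model below is an assumption for illustration only.

```python
def rule_precision_recall(alert_windows, incident_windows, overlaps):
    """Precision over fired alerts and recall over real incidents for one monitoring rule.

    overlaps(alert, incident) is a caller-supplied predicate (assumed here) that
    decides whether an alert window matches a production incident window.
    """
    matched_alerts = sum(any(overlaps(a, i) for i in incident_windows) for a in alert_windows)
    caught_incidents = sum(any(overlaps(a, i) for a in alert_windows) for i in incident_windows)
    precision = matched_alerts / len(alert_windows) if alert_windows else 0.0
    recall = caught_incidents / len(incident_windows) if incident_windows else 0.0
    return precision, recall
```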
    Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Graphics Processing Units
    IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2014), pp. 458-467
    In N-body programs, trajectories of simulated particles have chaotic patterns if errors are present in the initial conditions or occur during some computation steps. It was believed that the global properties (e.g., total energy) of simulated particles are unlikely to be affected by a small number of such errors. In this paper, we present a quantitative analysis of the impact of transient faults in GPU devices on a global property of simulated particles. We experimentally show that a single-bit error in non-control data can change the final total energy of a large-scale N-body program with ~2.1% probability. We also find that the corrupted total energy values have certain biases (e.g., the values do not follow a normal distribution), which can be used to reduce the expected number of re-executions. We also present a data error detection technique for N-body programs that utilizes two types of properties that hold in the simulated physical models. The presented technique and an existing redundancy-based technique together cover many data errors (e.g., >97.5%) with a small performance overhead (e.g., 2.3%).
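One physical property such a detector can exploit is (approximate) conservation of total energy between steps; the sketch below is an illustrative check, and the tolerance value is an assumption rather than the paper's threshold.

```python
def total_energy(masses, positions, velocities, G=1.0):
    """Kinetic plus pairwise gravitational potential energy of an N-body system."""
    kinetic = sum(0.5 * m * sum(v_i * v_i for v_i in v)
                  for m, v in zip(masses, velocities))
    potential = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            r = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j])) ** 0.5
            potential -= G * masses[i] * masses[j] / r
    return kinetic + potential

def energy_looks_consistent(prev_energy, curr_energy, rel_tol=1e-3):
    """Flag likely data corruption when total energy drifts beyond rel_tol in one step."""
    return abs(curr_energy - prev_energy) <= rel_tol * abs(prev_energy)
```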
    Norming to Performing: Failure Analysis and Deployment Automation of Big Data Software Developed by Highly Iterative Models
    IEEE International Symposium on Software Reliability Engineering (ISSRE) (2014), pp. 144-155
    We observe many interesting failure characteristics in Big Data software developed and released using highly iterative development models (e.g., agile). ~16% of failures occur due to faults in software deployments (e.g., packaging and pushing to production). Our analysis shows that many such production outages are at least partially due to human errors rooted in the high frequency and complexity of software deployments. ~51% of the observed human errors (e.g., transcription, education, and communication error types) are avoidable through automation. We thus develop a fault-tolerant automation framework that makes it efficient to automate end-to-end software deployment procedures. We apply the framework to two Big Data products. Our case studies show the complexity of the deployment procedures of multi-homed Big Data applications and help us study the effectiveness of the validation and verification techniques for user-provided automation programs. We analyze the production failures of the two products again after the automation. Our experimental data shows how the automation and the associated procedure improvements reduce the deployment faults and overall failure rate, and improve the feature launch velocity. Automation facilitates more formal, procedure-driven software engineering practices, which not only reduce manual work and human-induced, avoidable production outages but also help engineers better understand the overall software engineering procedures, making them more auditable, predictable, reliable, and efficient. We discuss two novel metrics for evaluating progress in mitigating human errors, as well as the conditions that indicate when to start such a transition away from an owner-driven deployment practice.
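The flavor of a fault-tolerant, procedure-driven deployment runner can be sketched as below; this is an illustrative outline under assumed step and retry semantics, not the production framework.

```python
import logging
import time

def run_procedure(steps, max_retries=2, backoff_sec=5):
    """Run (name, validate_fn, execute_fn) steps in order with pre-checks and retries."""
    for name, validate, execute in steps:
        if not validate():
            raise RuntimeError(f"pre-check failed for step: {name}")
        for attempt in range(max_retries + 1):
            try:
                execute()
                logging.info("step %s succeeded (attempt %d)", name, attempt + 1)
                break
            except Exception as err:
                logging.warning("step %s failed: %s", name, err)
                if attempt == max_retries:
                    raise  # unrecoverable; stop here so the procedure stays auditable
                time.sleep(backoff_sec)
```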
    HTAF: Hybrid Testing Automation Framework to Leverage Local and Global Computing Resources
    David Hreczany
    Ravishankar K. Iyer
    Lecture Notes in Computer Science, 6784 (2011), pp. 479-494
    In web application development, testing forms an increasingly large portion of software engineering costs due to the growing complexity and short time-to-market of these applications. This paper presents a hybrid testing automation framework (HTAF) that can automate routine work in testing and releasing web software. Using this framework, an individual software engineer can easily describe routine software engineering tasks and schedule them efficiently across both a local machine and global cloud computers. The framework is applied to commercial web software development processes. Our industry practice shows four example cases where the hybrid and decentralized architecture of HTAF is helpful in effectively managing both the hardware resources and the manpower required for testing and releasing web applications.
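The local-versus-cloud scheduling idea behind HTAF can be illustrated with a minimal dispatcher; the task fields and the two runner callbacks are assumptions made for the sake of the example.

```python
def schedule(tasks, run_local, submit_to_cloud, local_budget_sec=300):
    """Dispatch each described task to the local machine or to shared cloud workers."""
    for task in tasks:
        if task["estimated_sec"] <= local_budget_sec:
            run_local(task)        # short tasks: fast feedback on the engineer's machine
        else:
            submit_to_cloud(task)  # long test/release jobs: run on cloud computers
```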