Kun Lin

Working on improving fleet efficiency.
Authored Publications
    A Case Against CXL Memory Pooling
    Philip Levis
    Amy Tai
    Twenty-Second ACM Workshop on Hot Topics in Networks (HotNets) (2023)
    CXL is a new computer bus protocol, designed to replace PCIe. Because it has much lower latency than PCIe (hundreds of nanoseconds) and hardware support for cache coherence, CXL can provide efficient bus access to remote memory. These capabilities have opened the possibility of CXL memory pools in datacenter and cloud networks, consisting of a large memory pool, dynamically shared between multiple machines. In this paper, we give three reasons why CXL memory pools will not help datacenter or cloud applications: cost, complexity, and limited utility. While CXL memory pools can potentially decrease RAM costs through multiplexing, they require a new cabling and switching infrastructure in parallel to Ethernet, whose cost outweighs any savings. Experimental results show that CXL is substantially higher latency than main memory, such that using it without harming application performance will require either greater application or system complexity. Finally, as modern servers have hundreds of cores and TB of RAM, they provide great flexibility in job and VM placement, such that a good system scheduler does not strand resources or have difficulty finding servers with enough RAM.
    WSMeter: A Fast, Accurate, and Low-Cost Performance Evaluation for Warehouse-Scale Computers
    Jaewon Lee
    Changkyu Kim
    Rama Govindaraju
    Jangwoo Kim
    Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (2018) (to appear)
    A warehouse-scale computer (WSC) is a vast collection of tightly networked computers providing modern internet services, and it is becoming increasingly popular as the most cost-effective way to serve users at global scale. It is, however, extremely difficult to accurately measure the holistic performance of a WSC. The existing load-testing benchmarks are tailored towards a dedicated machine model and do not address shared infrastructure environments. Evaluating the performance of a live shared production WSC environment presents many challenges due to the lack of holistic performance metrics, high evaluation costs, and the potential service disruptions that evaluation may cause. WSC providers and customers need a cost-effective methodology to accurately evaluate the holistic performance of their platforms and hosted services. To address these challenges, we propose WSMeter, a cost-effective framework and methodology to accurately evaluate the holistic performance of a WSC in a live production environment. We define a new performance metric to accurately reflect the holistic performance of a WSC running a wide variety of unevenly distributed jobs. We propose a model to statistically embrace the performance variances amplified by co-located jobs, to evaluate holistic performance with minimum costs. To validate our approach, we analyze two real-world use cases and show that WSMeter accurately discerns 7% and 1% performance improvements, using only 0.9% and 6.6% of the machines in the WSC, respectively. We also present a Cloud customer case study in which WSMeter helped quantify the performance benefits of service software optimization with minimal costs.