Sean Dorward

Authored Publications
    Very large data sets often have a flat but regular structure and span multiple disks and machines. Examples include telephone call records, network logs, and web document repositories. These large data sets are not amenable to study using traditional database techniques, if only because they can be too large to fit in a single relational database. On the other hand, many of the analyses done on them can be expressed using simple, easily distributed computations: filtering, aggregation, extraction of statistics, and so on. We present a system for automating such analyses. A filtering phase, in which a query is expressed using a new programming language, emits data to an aggregation phase. Both phases are distributed over hundreds or even thousands of computers. The results are then collated and saved to a file. The design -- including the separation into two phases, the form of the programming language, and the properties of the aggregators -- exploits the parallelism inherent in having data and computation distributed across many machines. Animation: The paper references a movie showing how the distribution of requests to google.com around the world changed through the day on August 14, 2003.
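    The two-phase design described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's own language or implementation. The record format ("country,bytes"), the function names, and the choice of a summing aggregator are all assumptions made for illustration. The sketch shows a filter that looks at one record at a time and emits key/value pairs, and an aggregator whose partial results can be merged in any order.

```python
from collections import Counter

def filter_phase(records):
    """Examine one record at a time and emit (key, value) pairs."""
    for record in records:
        # Hypothetical record format: "country,bytes".
        country, size = record.split(",")
        yield country, int(size)

def aggregate(emitted):
    """Sum the emitted values per key."""
    totals = Counter()
    for key, value in emitted:
        totals[key] += value
    return totals

def run(shards):
    """Process each shard independently, then collate the partial results."""
    # In a real deployment each shard could live on a different machine;
    # this sketch simply loops over them in one process.
    partials = [aggregate(filter_phase(shard)) for shard in shards]
    merged = Counter()
    for partial in partials:
        # Addition is commutative and associative, so merge order is irrelevant.
        merged.update(partial)
    return merged

shards = [["us,10", "de,5"], ["us,7", "jp,3"]]
totals = run(shards)
```

    Because the filter never looks at more than one record and the aggregator is order-independent, the shards can be processed in any order, which is the property the abstract credits for the available parallelism.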
    Sandbridge Software Tools
    C. John Glossner
    Sanjay Jinturkar
    Mayan Moudgill
    Erdem Hokenek
    Michael J. Schulte
    Stamatis Vassiliadis
    SAMOS (2005), pp. 269-278
    Venti: A New Approach to Archival Storage
    FAST (2002), pp. 89-101
    Low Delay Perpetually Lossless Coding of Audio Signals
    Dawei Huang
    Serap A. Savari
    Gerald Schuller
    Bin Yu
    Data Compression Conference (2001), pp. 312-
    Inferno: la commedia interattiva
    Rob Pike
    Phil Winterbottom
    Dave Presotto
    Dennis Ritchie
    Howard Trickey
    ATEC '97: Proceedings of the annual conference on USENIX Annual Technical Conference, USENIX Association, Berkeley, CA, USA (1997), pp. 26-26
    Plan 9 from Bell Labs
    Rob Pike
    David L. Presotto
    Bob Flandrena
    Ken Thompson
    Howard Trickey
    Phil Winterbottom
    Computing Systems, vol. 8 (1995), pp. 221-254
    Control Software for Virtual-Circuit Switches: Call Processing
    Ravi Sethi
    Roy H. Campbell
    Anand Iyengar
    Charles R. Kalmanek
    Gary J. Murakami
    Ce-Kuen Shieh
    See-Mong Tan
    25th Anniversary of INRIA (1992), pp. 175-186
    Adding New Code to a Running C++ Program
    Ravi Sethi
    Jonathan E. Shopiro
    C++ Conference (1990), pp. 279-292