Carl Staelin

Carl Staelin works for Google and is an assistant editor for the Journal of Electronic Imaging. He was previously Chief Technologist for HP Labs Israel, working on digital commercial printing, automatic image analysis and enhancement, and enterprise IT management. His research interests include storage systems, machine learning, image analysis and processing, document and information management, and performance analysis. He received his PhD in Computer Science from Princeton University in 1991, with a dissertation on high-performance file system design.

Authored Publications
    Models for Neural Spike Computation and Cognition
    David H. Staelin
    CreateSpace, Seattle, WA (2011), 142 pp.
    Abstract: This monograph addresses the intertwined mathematical, neurological, and cognitive mysteries of the brain. It first evaluates the mathematical performance limits of simple spiking neuron models that both learn and later recognize complex spike excitation patterns in less than one second without using training signals unique to each pattern. Simulations validate these models, while theoretical expressions validate their simpler performance parameters. These single-neuron models are then qualitatively related to the training and performance of multi-layer neural networks that may have significant feedback. The advantages of feedback are then qualitatively explained and related to a model for cognition. This model is then compared to observed mild hallucinations that arguably include accelerated time-reversed video memories. The learning mechanism for these binary threshold-firing “cognon” neurons is spike-timing-dependent plasticity (STDP) that depends only on whether the spike excitation pattern presented to a given single “learning-ready” neuron within a period of milliseconds causes that neuron to fire or “spike”. The “false-alarm” probability that a trained neuron will fire for a random unlearned pattern can be made almost arbitrarily low by reducing the number of patterns learned by each neuron. Models that use and that do not use spike timing within patterns are evaluated. A Shannon mutual information metric (recoverable bits/neuron) is derived for binary neuron models that are characterized only by their probability of learning a random input excitation pattern presented to that neuron during learning readiness, and by their false-alarm probability for random unlearned patterns. Based on simulations, the upper bounds to recoverable information are ~0.1 bits per neuron for optimized neuron parameters and training.
    This information metric assumes that: 1) each neural spike indicates only that the responsible neuron input excitation pattern (a pattern lasts less than the time between consecutive patterns, say 30 milliseconds) had probably been seen earlier while that neuron was “learning ready”, and 2) information is stored in the binary synapse strengths. This focus on recallable learned information differs from most prior metrics such as pattern classification performance and metrics relying on pattern-specific training signals other than the normal input spikes. This metric also shows that neuron models can recall useful Shannon information only if their probability of firing randomly is lowered between learning and recall. Also discussed are: 1) how rich feedback might permit improved noise immunity, learning and recognition of pattern sequences, compression of data, associative or content-addressable memory, and development of communications links through white matter, 2) extensions of cognon models that use spike timing, dendrite compartments, and new learning mechanisms in addition to spike-timing-dependent plasticity (STDP), 3) simulations that show how simple optimized neuron models can have optimum numbers of binary synapses in the range between 200 and 10,000, depending on neural parameters, and 4) simulation results for parameters like the average bits/spike, bits/neuron/second, maximum number of learnable patterns, optimum ratios between the strengths of weak and strong synapses, and probabilities of false alarms.
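The learning and false-alarm behavior the abstract describes can be sketched in a few lines of Python. Everything below (the class name, parameter values, and the recall experiment) is an illustrative assumption, not the book's actual simulation code: synapses are binary, a learning-ready neuron strengthens every synapse a pattern excites, and the false-alarm rate is estimated over random unlearned patterns.

```python
import random

class Cognon:
    """Toy binary threshold neuron loosely inspired by the 'cognon' model.

    Synapse strengths are binary (weak=0, strong=1). Learning is a crude
    STDP-like rule: when a learning-ready neuron is shown a pattern, the
    excited synapses are strengthened. All names and parameter values are
    illustrative assumptions, not the book's model."""

    def __init__(self, n_synapses=200, threshold=6, seed=0):
        self.n = n_synapses
        self.threshold = threshold
        self.strong = [0] * n_synapses   # binary synapse strengths
        self.rng = random.Random(seed)

    def fires(self, pattern):
        # pattern: set of synapse indices receiving a spike
        return sum(self.strong[i] for i in pattern) >= self.threshold

    def learn(self, pattern):
        # learning-ready: strengthen every synapse excited by the pattern
        for i in pattern:
            self.strong[i] = 1

    def random_pattern(self, k=10):
        return set(self.rng.sample(range(self.n), k))

neuron = Cognon()
learned = [neuron.random_pattern() for _ in range(3)]
for p in learned:
    neuron.learn(p)

# Every learned pattern is recalled; a random unlearned pattern rarely
# overlaps enough strong synapses to reach threshold, so the false-alarm
# probability stays low while the number of learned patterns is small.
recall = all(neuron.fires(p) for p in learned)
false_alarms = sum(neuron.fires(neuron.random_pattern()) for _ in range(1000))
```

With only 3 patterns learned, at most 30 of 200 synapses are strong, so a random 10-synapse pattern overlaps the strong set by about 1.5 synapses on average, far below the threshold of 6; learning more patterns raises this false-alarm rate, which is the trade-off the information metric captures.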
    Clustered-Dot Halftoning With Direct Binary Search
    P. Goyal
    M. Gupta
    M. Fischer
    O. Shacham
    J.P. Allebach
    IEEE Transactions on Image Processing, 22 (2013), pp. 473-487
    Abstract: In this paper, we present a new algorithm for aperiodic clustered-dot halftoning based on direct binary search (DBS). The DBS optimization framework has been modified for designing clustered-dot texture by using filters with different sizes in the initialization and update steps of the algorithm. Following an intuitive explanation of how the clustered-dot texture results from this modified framework, we derive a closed-form cost metric which, when minimized, equivalently generates stochastic clustered-dot texture. An analysis of the cost metric and its influence on the texture quality is presented, which is followed by a modification to the cost metric to reduce computational cost and to make it more suitable for screen design.
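The core DBS loop the abstract builds on can be sketched as follows. This is a deliberately simplified, assumed implementation: it uses single-pixel toggles and a small Gaussian filter standing in for the human-visual-system model, and recomputes the full error per trial, whereas real DBS (and the paper's clustered-dot variant with its differing filters) uses swap moves and efficient incremental error updates.

```python
import numpy as np

def lowpass(img, g):
    """Separable 'same' convolution, a stand-in for the HVS model filter."""
    tmp = np.apply_along_axis(lambda r: np.convolve(r, g, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode='same'), 0, tmp)

def dbs_halftone(cont, iters=3, seed=0):
    """Toy direct binary search: greedy pixel toggles accepted only when
    they reduce the filtered (perceived) squared error."""
    rng = np.random.default_rng(seed)
    half = (rng.random(cont.shape) < cont).astype(float)  # random initial halftone
    k = np.arange(-2, 3)
    g = np.exp(-k**2 / 2.0)
    g /= g.sum()
    err = (lowpass(half - cont, g) ** 2).sum()
    for _ in range(iters):
        improved = False
        for i in range(cont.shape[0]):
            for j in range(cont.shape[1]):
                half[i, j] = 1 - half[i, j]                 # trial toggle
                trial = (lowpass(half - cont, g) ** 2).sum()
                if trial < err:
                    err, improved = trial, True             # keep the toggle
                else:
                    half[i, j] = 1 - half[i, j]             # revert
        if not improved:
            break
    return half, err
```

For a flat 50% gray patch the search converges toward a binary texture whose filtered appearance approximates the gray level; the paper's contribution is choosing the filters so the resulting texture is clustered rather than dispersed.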
    Cost function analysis for stochastic clustered-dot halftoning based on direct binary search
    Puneet Goyal
    Madhur Gupta
    Mani Fischer
    Omri Shacham
    Jan Allebach
    Proc. SPIE 7866, Color Imaging XVI: Displaying, Processing, Hardcopy, and Applications, Society of Photo-Optical Instrumentation Engineers (SPIE) (2011)
    Abstract: Most electrophotographic printers use periodic, clustered-dot screening for rendering smooth and stable prints. However, periodic, clustered-dot screening suffers from the problem of periodic moiré resulting from interference between the component periodic screens superposed for color printing. An approach called CLU-DBS has been proposed for stochastic, clustered-dot halftoning and screen design based on direct binary search. This method deviates from conventional DBS in its use of different filters in different phases of the algorithm. In this paper, we derive a closed-form expression for the cost metric which is minimized in CLU-DBS. The closed-form expression provides us with a clearer insight on the relationship between input parameters and processes, and the output texture, thus enabling us to generate better quality texture. One of the limitations of the CLU-DBS algorithm proposed earlier is the inversion in the distribution of clusters and voids in the final halftone with respect to the initial halftone. In this paper, we also present a technique for avoiding the inversion by negating the sign of one of the error terms in the newly derived cost metric, which is responsible for clustering. This not only simplifies the CLU-DBS screen design process, but also significantly reduces the number of iterations required for optimization.
    Local Gray Component Replacement Using Image Analysis
    Pavel Kisilev
    Yohanan Sivan
    Michal Aharon
    Renato Keshet
    Gregory Braverman
    Shlomo Harush
    19th Color and Imaging Conference Final Program and Proceedings, Society for Imaging Science and Technology (2011), pp. 234-238
    Abstract: In printing, ink is one of the most important cost factors, accounting for approximately 25-30% of the cost per page; therefore, reducing ink consumption is of great interest. The traditional approach to this problem is to modify the ICC profile to increase the use of black ink instead of the combination of cyan, yellow, and magenta; this approach is known as gray component replacement (GCR). While this strategy reduces ink consumption, it often results in visually grainy images in otherwise smooth regions, and is therefore of limited use or even unacceptable for many applications, such as photoprinting. In this work, we propose a novel, context-sensitive and spatially variant GCR method, which yields ink consumption figures similar to an aggressive GCR but, in contrast, produces perfectly acceptable print quality. Our approach is based on the visual masking effect: image areas with high activity level, such as high contrast textures, mask the increased graininess and other inaccuracies such as (small) color shifts. Therefore, we propose to dynamically vary the amount of gray replacement across the image as a function of the local “activity” of the image. In lighter, smoother regions, less aggressive GCR is applied, and the image quality is preserved, while in more active regions where the change is not visible, more aggressive GCR is applied. The performance of the proposed method is tested on images randomly chosen from several photo collections. The initial results indicate about 15% reduction in overall ink consumption with perfectly acceptable print quality.
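The idea of scaling GCR aggressiveness by local activity can be sketched as below. The activity measure (local standard deviation) and the linear activity-to-fraction mapping are assumptions chosen for illustration, not the paper's exact method:

```python
import numpy as np

def local_activity(gray, win=5):
    """Local standard deviation as a crude measure of image 'activity'."""
    pad = win // 2
    g = np.pad(gray, pad, mode='edge')
    out = np.empty_like(gray, dtype=float)
    for i in range(gray.shape[0]):
        for j in range(gray.shape[1]):
            out[i, j] = g[i:i + win, j:j + win].std()
    return out

def local_gcr(cmy, max_replace=1.0, activity_scale=0.1):
    """Spatially varying gray component replacement (illustrative sketch).

    The replaceable gray component at each pixel is min(C, M, Y); the
    fraction actually moved into K grows with local activity, so smooth
    regions get a conservative GCR while busy regions, where graininess
    is visually masked, get an aggressive one."""
    c, m, y = cmy[..., 0], cmy[..., 1], cmy[..., 2]
    act = local_activity(cmy.mean(axis=-1))
    frac = np.clip(act / activity_scale, 0.0, max_replace)  # more GCR where busy
    k = frac * np.minimum(np.minimum(c, m), y)
    return np.stack([c - k, m - k, y - k, k], axis=-1)
```

Because k never exceeds min(C, M, Y), the output channels stay non-negative, and each output pixel reproduces the original colorant totals (C' + K = C, and likewise for M and Y).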
    Design of color screen tile vector sets
    Jin-Young Kim
    Yung-Yao Chen
    Mani Fischer
    Omri Shacham
    Kurt Bengston
    Jan Allebach
    Proc. SPIE 7866, Color Imaging XVI: Displaying, Processing, Hardcopy, and Applications, 78661C, Society of Photo-Optical Instrumentation Engineers (SPIE) (2011)
    Abstract: For electrophotographic printers, periodic clustered screens are preferable due to their homogeneous halftone texture and their robustness to dot gain. In traditional periodic clustered-dot color halftoning, each color plane is independently rendered with a different screen at a different angle. However, depending on the screen angle and screen frequency, the final halftone may have strong visible moiré due to the interaction of the periodic structures associated with the different color planes. This paper addresses the problem of finding optimal color screen sets that produce minimal visible moiré and homogeneous halftone texture. To achieve these goals, we propose new features including halftone microtexture spectrum analysis, common periodicity, and twist factor. The halftone microtexture spectrum is shown to predict the visible moiré more accurately than the conventional moiré-free conditions. Common periodicity and twist factor are used to determine whether the halftone texture is homogeneous. Our results demonstrate significant improvements to clustered-dot screens in minimizing visible moiré and having smooth halftone texture.
    ICC Profile Extension for Device MTF Characterization
    Lior Shapira
    Ron Banner
    Proceedings NIP & Digital Fabrication Conference, 2011 International Conference on Digital Printing Technologies, pp. 24-28
    Abstract: We propose a method to maintain image sharpness consistency across different devices, analogous to the way ICC profiles are used to maintain color consistency. The method is based on creating a profile of the perceived blur caused by the printing and viewing process. We show how to measure and analyze the sharpness profile of a printer, and demonstrate that the profile consists of multiple 2D modulation transfer functions (MTFs), varying over different ink combinations and colors. We propose storing the sharpness profile within an ICC profile. The profile can then be used to adaptively pre-compensate for the induced blur, and maintain sharpness consistency.
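The pre-compensation step the abstract mentions can be illustrated in one dimension: boost each frequency by the reciprocal of the device's MTF, clamped so noise is not amplified without bound. This is a minimal sketch under assumed conditions; a real sharpness profile is two-dimensional and varies with ink combination, as the abstract notes.

```python
import numpy as np

def precompensate(signal, mtf, max_boost=4.0):
    """Pre-compensate a 1-D signal for a device MTF (illustrative sketch).

    'mtf' gives the device's attenuation per rFFT frequency bin (1.0 means
    passed unchanged). Boosting each bin by 1/MTF, clamped to max_boost to
    avoid amplifying noise, makes the printed-and-viewed result approximate
    the original signal."""
    spec = np.fft.rfft(signal)
    boost = np.minimum(1.0 / np.maximum(mtf, 1e-6), max_boost)
    return np.fft.irfft(spec * boost, n=len(signal))
```

If the device itself is modeled as multiplying the spectrum by the same MTF, applying the device blur to the pre-compensated signal recovers the original whenever 1/MTF stays below the clamp.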
    Detecting and exploiting near-sortedness for efficient relational query evaluation
    Sagi Ben-Moshe
    Yaron Kanza
    Eldar Fischer
    Arie Matsliah
    Mani Fischer
    Proceedings of the 14th International Conference on Database Theory, ACM, New York, NY, USA (2011), pp. 256-267
    CSched: Real-time disk scheduling with concurrent I/O requests
    Gidi Amir
    David Ben-Ovadia
    Ram Dagan
    Michael Melamed
    Dave Staas
    Hewlett-Packard Laboratories (2011)
    Abstract: We present a new real-time disk scheduling algorithm, Concurrent Scheduler or CSched, which maximizes throughput for modern storage devices while providing real-time access guarantees, with computational costs of O(log n). To maximize performance it ensures request concurrency at the device and maximizes the depth of a new Limited Cyclical SCAN (L-CSCAN) queue that optimizes the request sequence sent to the device. For real-time requests there is an additional SCAN-EDF queue in front of the L-CSCAN queue to absorb bursts of real-time requests until they can be drained to the L-CSCAN queue. The real-time guarantees are provided by managing the worst-case latency at each stage of the pipeline: SCAN-EDF, L-CSCAN, and device. CSched is configured by the tuple {λ, σ, δ, τ(r), N}, where λ and σ are the minimal initial slack time and workload burstiness and are properties of the workload, and where δ, τ(r), and N are the device worst-case latency, worst-case throughput rate time for a request, and maximal number of concurrent requests, and are experimentally determined properties of the storage device. An experimental evaluation of CSched shows that given sufficient initial slack time, the system throughput performance costs of providing real-time guarantees are negligible.
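The cyclical-SCAN ordering at the heart of the L-CSCAN stage can be sketched with two heaps: one for requests still ahead of the disk head in the current sweep, one for requests that must wait for the next sweep. This is a simplified stand-in; the paper's latency-bounding ("limited") logic and the SCAN-EDF front queue are omitted, and all names here are illustrative.

```python
import heapq

class CScanQueue:
    """Minimal cyclical-SCAN (C-SCAN) request queue.

    Requests are served in increasing block order; a request that arrives
    behind the current head position is deferred to the next sweep, which
    bounds starvation while keeping seeks short."""

    def __init__(self):
        self.this_sweep = []   # min-heap: blocks at or ahead of the head
        self.next_sweep = []   # min-heap: blocks behind the head
        self.head = 0

    def add(self, block):
        heap = self.this_sweep if block >= self.head else self.next_sweep
        heapq.heappush(heap, block)

    def pop(self):
        if not self.this_sweep:                       # sweep done: wrap around
            self.this_sweep, self.next_sweep = self.next_sweep, []
        self.head = heapq.heappop(self.this_sweep)
        return self.head
```

Serving requests in sweep order is what lets the scheduler trade a bounded per-request latency for much higher sustained throughput than strict FIFO or pure deadline order.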
    Electro-photographic model based stochastic clustered-dot halftoning with direct binary search
    P. Goyal
    M. Gupta
    M. Fischer
    O. Shacham
    T. Kashti
    J. Allebach
    18th IEEE International Conference on Image Processing (ICIP) (2011), pp. 1721-1724
    Abstract: Most electrophotographic printers use periodic, clustered-dot screening for rendering smooth and stable prints. However, when used for color printing, this approach suffers from the problem of periodic moiré resulting from interference between the periodic halftones of individual color planes. An approach called CLU-DBS has been proposed for stochastic, clustered-dot halftoning and screen design based on direct binary search. We propose a methodology to embed a printer model within this halftoning algorithm to account for dot-gain and dot-loss effects. Without accounting for these effects, the printed image will not have the appearance predicted by the halftoning algorithm. We incorporate a measurement-based stochastic model for dot interactions of an electrophotographic printer within the iterative CLU-DBS binary halftoning algorithm. The stochastic model developed is based on microscopic absorptance and variance measurements. The experimental results show that electrophotography-model based stochastic clustered-dot halftoning improves the homogeneity and reduces the graininess of printed halftone images.
    Automatic visual inspection and defect detection on variable data prints
    Marie Vans
    Sagi Schein
    Pavel Kisilev
    Steve Simske
    Ram Dagan
    Shlomo Harush
    Journal of Electronic Imaging, 20 (2011)
    Abstract: We present a system for automatic, on-line visual inspection and print defect detection for variable data printing (VDP). This system can be used to automatically stop the printing process and alert the operator to problems. We lay out the components required for constructing a vision-based inspection system and show that our approach is novel for the high-speed detection of defects on variable data. When implemented in a high-speed digital printing press, the system allows a single skilled operator to monitor and maintain several presses, reducing the number of operators required to run a shop floor of presses as well as reducing the consumables wasted when a defect goes undetected.
    Design of color screen sets for robustness to color plane misregistration
    Jin-Young Kim
    Yung-Yao Chen
    M. Fischer
    O. Shacham
    J.P. Allebach
    18th IEEE International Conference on Image Processing (ICIP) (2011), pp. 1733-1736
    Abstract: Periodic clustered-dot screens are widely used for electrophotographic printers due to their homogeneous halftone texture and their robustness to dot gain. However, when applied to color printing, there are two important phenomena that limit the quality of printed color halftones generated using a screening technology: (1) moiré due to the superposition of halftone patterns corresponding to different periodicity matrices, and (2) appearance changes due to misregistration between different colorant planes. This paper focuses on analyzing the registration sensitivity of periodic, clustered-dot screens. To quantitatively measure the effect of registration errors, we introduce two new functions: (1) cost, and (2) risk of registration errors. We propose the notion of “visual equivalence”, and derive three propositions under which visual equivalence can be achieved, even when registration errors occur.
    Creating the knowledge about IT events
    Gilad Barash
    Ira Cohen
    Eli Mordechai
    Rafael Dakar
    Proceedings of the 2010 workshop on Managing systems via log analysis and machine learning techniques, USENIX Association, Berkeley, CA, USA, pp. 1-1
    PDF document restoration and optimization during image enhancement
    Hui Chao
    Sagi Schein
    Marie Vans
    John Lumley
    Proceedings of the eighth ACM symposium on Document engineering, ACM, New York, NY, USA (2008), pp. 150-153
    Memory hierarchy performance measurement of commercial dual-core desktop processors
    Lu Peng
    Jih-Kwon Peir
    Tribuvan K. Prakash
    Yen-Kuang Chen
    David M. Koppelman
    Journal of Systems Architecture - Embedded Systems Design, 54 (2008), pp. 816-828
    Biblio: automatic meta-data extraction
    Michael Elad
    Darryl Greig
    Oded Shmueli
    Marie Vans
    IJDAR, 10 (2007), pp. 113-126
    lmbench: an extensible micro-benchmark suite
    Software: Practice and Experience, 35 (2005), pp. 1079-1105
    mkpkg: A software packaging tool
    LISA (1998), pp. 243-252
    mhz: Anatomy of a micro-benchmark
    Larry McVoy
    USENIX Annual Technical Conference, USENIX, Berkeley, CA (1998)
    Abstract: Mhz is a portable ANSI/C program that determines the processor clock speed in a platform-independent way. It measures the execution time of several different C expressions and finds the greatest common divisor to determine the duration of a single clock tick. Mhz can be used by anyone who wants or needs to know the processor clock speed. In large installations it is often easier to experimentally determine the clock speed of a given machine than to keep track of each computer. For example, a platform-independent database system optimizer may use the clock speed while calculating the performance tradeoffs of various optimization techniques. To run the benchmark long enough for timing to be accurate, mhz executes each expression in a loop. To minimize the loop overhead the expression is repeated a hundred times. Unfortunately, repetition enables many hardware and compiler optimizations that can have surprising effects on the experimental results. While writing mhz, much of the intellectual effort went into the design of expressions that minimize the opportunities for compiler and hardware optimization. Mhz utilizes lmbench 2.0’s new timing harness, which manages the benchmarking process. The harness automatically adjusts the benchmark to minimize run time while preserving accuracy, determines the necessary timing duration to get accurate results from the system clock, and measures and accounts for both loop overhead and measurement overhead. It is used throughout lmbench 2.0 and can be used to measure the performance of other applications.
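The greatest-common-divisor trick at the core of mhz can be shown with hypothetical, noise-free measurements. Real mhz measures C expressions and must cope with timing noise and compiler optimization; the numbers below are assumed for illustration.

```python
from functools import reduce
from math import gcd

def clock_period_ns(durations_ns):
    """Estimate the clock period as the GCD of measured expression durations.

    Assumes each duration is an exact integer multiple of the clock period
    in nanoseconds. For the GCD to equal the period (rather than a multiple
    of it), the expressions' cycle counts must themselves be coprime."""
    return reduce(gcd, durations_ns)

# Hypothetical measurements on a 500 MHz CPU (2 ns period): expressions
# taking 10, 14, and 7 clock cycles measure as 20, 28, and 14 ns.
period = clock_period_ns([20, 28, 14])
frequency_mhz = 1000 / period   # 1000 MHz·ns / period_ns
```

Note that if all cycle counts shared a common factor (say 10, 14, and 6 cycles), the GCD would overestimate the period by that factor, which is one reason the choice of expressions matters so much in the real tool.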
    lmbench: Portable tools for performance analysis
    Larry McVoy
    USENIX Annual Technical Conference, USENIX, Berkeley, CA (1996), pp. 279-294
    Abstract: lmbench is a micro-benchmark suite designed to focus attention on the basic building blocks of many common system applications, such as databases, simulations, software development, and networking. In almost all cases, the individual tests are the result of analysis and isolation of a customer’s actual performance problem. These tools can be, and currently are, used to compare different system implementations from different vendors. In several cases, the benchmarks have uncovered previously unknown bugs and design flaws. The results have shown a strong correlation between memory system performance and overall performance. lmbench includes an extensible database of results from systems current as of late 1995.
    Data Replication in Mariposa
    Jeff Sidell
    Paul M. Aoki
    Adam Sah
    Michael Stonebraker
    Andrew Yu
    ICDE (1996), pp. 485-494
    Mariposa: A Wide-Area Distributed Database System
    Michael Stonebraker
    Paul M. Aoki
    Witold Litwin
    Avi Pfeffer
    Adam Sah
    Jeff Sidell
    Andrew Yu
    VLDB J., 5 (1996), pp. 48-63
    The HP AutoRAID hierarchical storage system
    Richard Golding
    Tim Sullivan
    ACM Transactions on Computer Systems (TOCS), 14 (1996), pp. 108-136
    The HP AutoRAID hierarchical storage system
    J. Wilkes
    R. Golding
    T. Sullivan
    SIGOPS Oper. Syst. Rev., 29 (1995), pp. 96-108
    Idleness is not sloth
    Richard A. Golding
    Peter Bosch II
    Tim Sullivan
    USENIX Winter (1995), pp. 201-212
    An Economic Paradigm for Query Processing and Data Migration in Mariposa
    Michael Stonebraker
    Robert Devine
    Marcel Kornacker
    Witold Litwin
    Avi Pfeffer
    Adam Sah
    PDIS (1994), pp. 58-67
    HighLight: Using a Log-structured File System for Tertiary Storage Management
    John T. Kohl
    Michael Stonebraker
    USENIX Winter (1993), pp. 435-448
    HighLight: a file system for tertiary storage
    J. Kohl
    M. Stonebraker
    Proceedings of the Twelfth IEEE Symposium on Mass Storage Systems: Putting All That Data to Work (1993), pp. 157-161
    Abstract: HighLight, a file system combining secondary disk storage and tertiary robotic storage that is being developed as part of the Sequoia 2000 Project, is described. HighLight is an extension of the 4.4BSD log-structured file system (LFS), which provides hierarchical storage management without requiring any special support from applications. The authors present HighLight's design and various policies for automatic migration of file data between the hierarchy levels. The performance of HighLight was compared with that of the 4.4BSD LFS implementation. The initial results indicate that HighLight's performance is comparable to that of 4.4BSD LFS for disk-resident data, and the overhead associated with accessing data from the tertiary cache is negligible.
    An Implementation of a Log-Structured File System for UNIX
    Margo I. Seltzer
    Keith Bostic
    Marshall K. McKusick
    USENIX Winter (1993), pp. 307-326
    DataMesh parallel storage servers
    Chia Chao
    Robert English
    David Jacobson
    Bart Sears
    Alex Stepanov
    SIGOPS Oper. Syst. Rev., 26 (1992), p. 11
    Smart Filesystems
    Hector Garcia-Molina
    USENIX Winter (1991), pp. 45-52
    File system design using large memories
    H. Garcia-Molina
    Proceedings of the 5th Jerusalem Conference on Information Technology: 'Next Decade in Information Technology' (Cat. No.90TH0326-9) (1990), pp. 11-21
    Abstract: It is shown using experimental data that file activity is fairly stable over time, and the implications of this finding for file system design are examined. Several file access patterns are shown, along with how they may be exploited to improve file system performance. In particular, it is shown that current file temperature can be used to predict future file temperature. The design of the iPcress file system, which uses both a large disk cache and other techniques to improve file system performance, is outlined. iPcress has a variety of cache staging algorithms and can choose the one most appropriate for each file. iPcress also stores access histories for each file to guide decisions such as file layout on DASD and caching. Preliminary performance figures for iPcress are presented.
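One common way to turn per-file access histories into a "temperature" is an exponentially decayed access rate, where recent accesses dominate and old ones fade. This formulation is an illustrative assumption, not iPcress's actual definition; the finding that current temperature predicts future temperature is what makes ranking files by such a statistic useful for cache staging and disk layout.

```python
from math import exp, log

class FileTemperature:
    """Exponentially decayed access count as a crude 'file temperature'.

    Each access adds 1; the total decays with the given half-life in
    seconds, so a file accessed steadily stays hot while an idle file
    cools. Parameter choices here are illustrative."""

    def __init__(self, half_life=3600.0):
        self.decay = log(2) / half_life
        self.temp = 0.0
        self.last = 0.0

    def access(self, now):
        # decay the running total up to 'now', then count this access
        self.temp = self.temp * exp(-self.decay * (now - self.last)) + 1.0
        self.last = now

    def value(self, now):
        return self.temp * exp(-self.decay * (now - self.last))
```

A file system would periodically compare these values to decide which files to keep in the disk cache and which to stage to slower storage.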
    Data Management with Massive Memory: A Summary
    Hector Garcia-Molina
    Robert K. Abbott
    Chris Clifton
    Kenneth Salem
    PRISMA Workshop (1990), pp. 63-70
    Evaluation of Monitor Complexity for Concurrently Testing Microprogrammed Control Units
    Alexander Albicki
    ITC (1985), pp. 733-736