Jan Wassenberg
Authored Publications
Vectorized and performance-portable Quicksort
Joachim Giesen
Mark Blacher
Peter Sanders
Software: Practice and Experience (2022), to appear
Abstract
Recent work has shown that Quicksort implementations using
vector CPU instructions can outperform the non-vectorized algorithms in
widespread use. However, these implementations are typically single-threaded,
tied to a particular instruction set, and restricted to a small set of
key types. We lift these three restrictions: our proposed vqsort
algorithm integrates into the state-of-the-art parallel sorter ips4o,
speeding it up by a factor of 1.5 to 1.8. The same implementation works on seven
instruction sets (including SVE and RISC-V V) across four platforms. It also
supports floating-point and 16- to 128-bit integer keys. To the best of our
knowledge, this is the fastest sort for non-tuple keys on CPUs, up to 20 times
as fast as the sorting algorithms implemented in standard libraries. This paper
focuses on the practical engineering aspects enabling the speed and portability,
which we have not yet seen demonstrated for a Quicksort implementation.
Furthermore, we introduce compact and transpose-free sorting networks for
in-register sorting of small arrays, and a vector-friendly pivot sampling
strategy that is robust against adversarial input.
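As a rough illustration of the in-register sorting networks mentioned in the abstract, here is a minimal scalar C++ sketch of a four-element sorting network built from compare-exchange steps. This is a textbook network, not the paper's actual transpose-free networks; in a vectorized implementation, each compare-exchange becomes a pair of SIMD min/max instructions applied to whole vectors of keys at once.

```cpp
#include <algorithm>
#include <array>
#include <cstdio>

// Compare-exchange: after this call, a <= b. Sorting networks consist
// entirely of such data-independent steps, which map directly onto
// SIMD min/max instructions when applied to whole vectors of keys.
template <typename T>
void CompareExchange(T& a, T& b) {
  if (b < a) std::swap(a, b);
}

// The optimal 5-comparator sorting network for 4 elements.
template <typename T>
void SortingNetwork4(std::array<T, 4>& v) {
  CompareExchange(v[0], v[1]);
  CompareExchange(v[2], v[3]);
  CompareExchange(v[0], v[2]);
  CompareExchange(v[1], v[3]);
  CompareExchange(v[1], v[2]);
}

int main() {
  std::array<int, 4> keys = {3, 1, 4, 2};
  SortingNetwork4(keys);
  for (int k : keys) std::printf("%d ", k);  // prints: 1 2 3 4
}
```

Because the comparison sequence is fixed regardless of the data, there are no branches to mispredict, which is what makes such networks attractive for sorting small arrays entirely in vector registers.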
Benchmarking JPEG XL lossy/lossless image compression
Evgenii Kliuchnikov
Evgeniy Upenik
Jon Sneyers
Luca Versari
Touradj Ebrahimi
Optics, Photonics and Digital Technologies for Imaging Applications VI, SPIE (2020)
Abstract
JPEG XL is a practical, royalty-free codec for scalable web distribution and efficient compression of high-quality photographs. It also supports previews, progressive decoding, animation, transparency, wide gamut, and high bit depth.
Experiments performed during standardization have shown the feasibility of economical storage without perceptible quality loss, lossless recompression of existing JPEGs, and fast software encoders and decoders. We disclose the results of subjective and objective evaluations.
Users expect faithful reproductions of ever-larger images. JPEG XL is faster to share and more economical to store: 60% savings vs. JPEG at equivalent visual quality. We quantify this impact using a subjective evaluation against existing codecs, including HEVC-HM-YUV444 and JPEG.
New image codecs have to coexist with the previous generation for several years. JPEG XL is unique in providing value both for existing JPEGs and for new users. It includes coding tools that reduce the transmission and storage costs of JPEG by 22% while allowing byte-for-byte exact reconstruction of the original JPEG. Avoiding transcoding and additional artifacts helps to preserve our digital heritage.
Applications require fast and low-power decoding. JPEG XL was designed to benefit from multicore and SIMD, and actually decodes faster than JPEG. We report the resulting speeds on ARM and x86 CPUs. To enable reproduction of these results, we open-sourced the JPEG XL software in 2019.
JPEG XL next-generation image compression architecture and coding tools
Ruud van Asseldonk
Moritz Firsching
Thomas Fischbacher
Sebastian Gomez
Evgenii Kliuchnikov
Robert Obryk
Krzysztof Potempa
Alexander Rhatushnyak
Jon Sneyers
Zoltan Szabadka
Luca Versari
SPIE Applications of Digital Image Processing, SPIE (2019)
Abstract
An update on the JPEG XL standardization effort: JPEG XL is a practical approach focused on scalable web distribution and efficient compression of high-quality images. It will provide various benefits over existing image formats: significantly smaller size at equivalent subjective quality; fast, parallelizable decoding and encoding configurations; features such as progressive decoding, lossless coding, animation, and reversible transcoding of existing JPEG; and support for high-quality applications including wide gamut, higher resolution/bit depth/dynamic range, and visually lossless coding. Additionally, a royalty-free baseline is an important goal. The JPEG XL architecture is traditional block-transform coding with upgrades to each component. We describe these components and analyze decoded image quality.
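Since the abstract characterizes the architecture as block-transform coding, a minimal textbook sketch of that generic idea may be helpful. The C++ example below applies a plain 8x8 DCT-II separably (rows, then columns), concentrating a block's energy into few coefficients that quantization can then shrink; it is a generic illustration only, not JPEG XL's actual variable-size transforms.

```cpp
#include <array>
#include <cmath>
#include <cstdio>

constexpr int kN = 8;
using Block = std::array<std::array<double, kN>, kN>;

// Naive 1-D DCT-II of kN samples (O(n^2); real codecs use fast
// factorizations). Orthonormal scaling keeps the transform unitary.
std::array<double, kN> Dct1D(const std::array<double, kN>& in) {
  static const double kPi = std::acos(-1.0);
  std::array<double, kN> out{};
  for (int k = 0; k < kN; ++k) {
    double sum = 0.0;
    for (int n = 0; n < kN; ++n) {
      sum += in[n] * std::cos(kPi * (n + 0.5) * k / kN);
    }
    out[k] = sum * ((k == 0) ? std::sqrt(1.0 / kN) : std::sqrt(2.0 / kN));
  }
  return out;
}

// Separable 2-D DCT: transform every row, then every column.
Block Dct2D(Block block) {
  for (int r = 0; r < kN; ++r) block[r] = Dct1D(block[r]);
  for (int c = 0; c < kN; ++c) {
    std::array<double, kN> col;
    for (int r = 0; r < kN; ++r) col[r] = block[r][c];
    col = Dct1D(col);
    for (int r = 0; r < kN; ++r) block[r][c] = col[r];
  }
  return block;
}

int main() {
  Block b{};
  for (auto& row : b) row.fill(128.0);  // flat block: all energy in DC
  const Block coeffs = Dct2D(b);
  // For a flat block, only the DC coefficient is non-zero.
  std::printf("DC = %.1f, first AC = %.1f\n", coeffs[0][0], coeffs[0][1]);
}
```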
Abstract
Algorithms that rely on a pseudorandom number generator often lose their performance guarantees when adversaries can predict the behavior of the generator. To protect non-cryptographic applications against such attacks, we propose 'strong' pseudorandom generators characterized by two properties: computational indistinguishability from random, and backtracking resistance. Some existing cryptographically secure generators also meet these criteria, but they are too slow to be accepted for general-purpose use. We introduce a new open-source generator called 'Randen' and show that it is 'strong' in addition to outperforming Mersenne Twister, PCG, ChaCha8, ISAAC and Philox in real-world benchmarks. This is made possible by hardware acceleration. Randen is an instantiation of Reverie, a recently published robust sponge-like random generator, with a new permutation built from an improved generalized Feistel structure with 16 branches. We provide new bounds on active s-boxes for up to 24 rounds of this construction, made possible by a memory-efficient search algorithm. Replacing existing generators with Randen can protect randomized algorithms such as reservoir sampling from attack. The permutation may also be useful for wide-block ciphers and hash functions.
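To illustrate the generalized Feistel structure with 16 branches mentioned in the abstract, here is a toy C++ sketch of one round of a Type-2 generalized Feistel network: each even branch feeds a round function whose output is XORed into the following odd branch, after which the branches are permuted. The round function and the plain cyclic branch rotation here are deliberate simplifications; Randen's actual permutation uses AES rounds and an improved branch permutation with faster diffusion, which this sketch does not attempt to reproduce.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

constexpr int kBranches = 16;
using State = std::array<uint64_t, kBranches>;

// Toy round function; a real design (e.g. Randen) uses AES rounds here.
uint64_t RoundF(uint64_t x, uint64_t round_key) {
  x ^= round_key;
  x *= 0x9E3779B97F4A7C15ull;  // multiply by an odd constant to mix bits
  return x ^ (x >> 29);
}

// One Type-2 generalized Feistel round with 16 branches: mix each even
// branch into its odd neighbor via F, then permute all branches.
State FeistelRound(State s, uint64_t round_key) {
  for (int i = 0; i < kBranches; i += 2) {
    s[i + 1] ^= RoundF(s[i], round_key + i);
  }
  State out;
  for (int i = 0; i < kBranches; ++i) {
    out[i] = s[(i + 1) % kBranches];  // simple cyclic branch rotation
  }
  return out;
}

int main() {
  State s{};  // all-zero input state
  for (uint64_t round = 0; round < 24; ++round) {
    s = FeistelRound(s, round);
  }
  std::printf("%016llx %016llx\n",
              static_cast<unsigned long long>(s[0]),
              static_cast<unsigned long long>(s[1]));
}
```

The appeal of this structure is that the round function F never needs to be inverted: the whole network is a permutation by construction, so F can be any fast, well-mixing function.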
Guetzli: Perceptually Guided JPEG Encoder
Robert Obryk
Ostap Stoliarchuk
Zoltan Szabadka
arXiv (2017)
Abstract
Guetzli is a new JPEG encoder that aims to produce visually indistinguishable images at a lower bit rate than other common JPEG encoders. It optimizes both the JPEG global quantization tables and the DCT coefficient values in each JPEG block using a closed-loop optimizer. Guetzli uses Butteraugli, our perceptual distance metric, as the source of feedback in its optimization process. We reach a 29-45% reduction in data size for a given perceptual distance, according to Butteraugli, in comparison to other compressors we tried. Guetzli's computation is currently extremely slow, which limits its applicability to compressing static content and serving as a proof-of-concept that we can achieve significant reductions in size by combining advanced psychovisual models with lossy compression techniques.
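As a rough sketch of the closed-loop idea described in the abstract, the encoder can be viewed as searching for the smallest encoding whose perceptual distance from the original stays below a target. All helper functions below (EncodeJpegAtQuality, DecodeJpeg, PerceptualDistance) are hypothetical toy stand-ins for a real JPEG codec and the Butteraugli metric; Guetzli itself goes much further, also tuning quantization tables and per-block DCT coefficients, but the feedback loop is the same shape.

```cpp
#include <cstdio>
#include <string>
#include <utility>

// Toy stand-ins so this sketch compiles; a real implementation would
// call an actual JPEG encoder and the Butteraugli metric instead.
struct Image { int width = 0, height = 0; };

std::string EncodeJpegAtQuality(const Image&, int quality) {
  return std::string(1000 + 10 * quality, 'x');  // placeholder bytes
}
Image DecodeJpeg(const std::string&) { return Image{}; }
double PerceptualDistance(const Image&, const Image&) { return 1.0; }

// Closed-loop search: lower the quality setting until the perceptual
// distance between original and decoded image exceeds max_distance,
// then keep the last encoding that was still acceptable.
std::string EncodeClosedLoop(const Image& original, double max_distance) {
  std::string best = EncodeJpegAtQuality(original, 95);
  for (int quality = 94; quality >= 50; --quality) {
    std::string jpeg = EncodeJpegAtQuality(original, quality);
    Image decoded = DecodeJpeg(jpeg);
    if (PerceptualDistance(original, decoded) > max_distance) {
      break;  // quality dropped too far; keep the previous encoding
    }
    best = std::move(jpeg);  // still acceptable; smaller file wins
  }
  return best;
}

int main() {
  Image img;
  std::string jpeg = EncodeClosedLoop(img, 1.5);
  std::printf("encoded %zu bytes\n", jpeg.size());
}
```

Re-encoding and re-measuring in every iteration is what makes this loop slow, which matches the abstract's note that Guetzli's computation currently limits it to static content.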
Abstract
HighwayHash is a new pseudo-random function based on AVX2 multiply and permute instructions for thorough and fast hashing. It is 5.2 times as fast as SipHash for 1 KiB inputs. An open-source implementation is available under a permissive license. We discuss design choices and provide statistical analysis, speed measurements and preliminary cryptanalysis. Assuming it withstands further analysis, strengthened variants may also substantially accelerate file checksums and stream ciphers.
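As a conceptual illustration of multiply-and-permute mixing, here is a toy scalar C++ sketch; it is not HighwayHash's actual AVX2 algorithm, state layout, or permutation. Each update multiplies 32-bit halves to produce 64 bits of thoroughly mixed product bits, then permutes words so that every input bit influences the whole state after a few updates.

```cpp
#include <cstdint>
#include <cstdio>

// Toy multiply-and-permute mixer (conceptual only; HighwayHash's real
// AVX2 version operates on 256-bit lanes with a different permutation).
struct ToyState {
  uint64_t v0, v1;

  void Update(uint64_t packet) {
    v0 += packet;
    // 32x32 -> 64-bit multiply: the upper product bits depend on all
    // input bits, which is what makes multiplication a strong mixer.
    const uint64_t mul = (v0 & 0xFFFFFFFFu) * (v1 >> 32);
    v1 += mul;
    // Permute: swap 32-bit halves so later multiplies see fresh bits.
    v0 = (v0 >> 32) | (v0 << 32);
  }

  uint64_t Finalize() const { return v0 ^ v1; }
};

int main() {
  ToyState s{0x0706050403020100ull, 0x0F0E0D0C0B0A0908ull};
  const uint64_t packets[] = {1, 2, 3};
  for (uint64_t p : packets) s.Update(p);
  std::printf("%016llx\n",
              static_cast<unsigned long long>(s.Finalize()));
}
```

Multiplication is cheap on modern CPUs yet mixes far more thoroughly per instruction than shifts and XORs alone, which is the intuition behind the speed advantage the abstract reports over SipHash.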