Zuckerli: A New Compressed Representation for Graphs

Luca Versari
Alessio Conte
Roberto Grossi
IEEE Access, 8(2020), pp. 219233-219243


Zuckerli is a scalable compression system meant for large real-world graphs. Graphs are notoriously challenging structures to store efficiently due to their linked nature, which makes it hard to separate them into smaller, compact components. Therefore, effective compression is crucial when dealing with large graphs, which can have billions of nodes and edges. Furthermore, a good compression system should give the user fast and reasonably flexible access to parts of the compressed data without requiring full decompression, which may be infeasible on their system. Zuckerli improves multiple aspects of WebGraph, the current state of the art in compressing real-world graphs, by using advanced compression techniques and novel heuristic graph algorithms. It can produce both a compressed representation for storage and one which allows fast direct access to the adjacency lists of the compressed graph without decompressing the entire graph. We validate the effectiveness of Zuckerli on real-world graphs with up to a billion nodes and 90 billion edges, conducting an extensive experimental evaluation of both compression density and decompression performance. We show that Zuckerli-compressed graphs are 10% to 29% smaller than their WebGraph-compressed counterparts, and more than 20% smaller in most cases, with resource usage for decompression comparable to that of WebGraph.