Iulia-Maria Comșa
Iulia Comșa is a Research Scientist at Google Research in Zürich working on biologically-inspired computing. She has developed learning algorithms for spiking neural networks and is currently exploring the limits of human-like cognition in large language models. She has a BSc in computer science from Babeș-Bolyai University and a PhD in neuroscience from Cambridge University.
Authored Publications
Spatial reasoning is a fundamental building block of human cognition, used in representing, grounding, and reasoning about physical and abstract concepts. We propose a novel benchmark focused on assessing inferential properties of statements with spatial prepositions. The benchmark includes original datasets in English and Romanian. Our aim is to probe the limits of foundational reasoning in large language models. We use prompt engineering to study the performance of two families of large language models, PaLM and GPT3, on our benchmark. Our results show considerable variability in the performance of smaller and larger models, as well as across prompts and languages. We also examine the performance of the largest model, PaLM-540b, in a generative setting and find that it can approach human-level performance with few-shot prompting.
We propose a benchmark to assess the capability of large language models to reason with metaphor. Our benchmark combines the previously isolated topics of metaphor detection and commonsense reasoning into a single task that requires a model to make inferences by accurately selecting between the literal and metaphorical register. We examine the performance of state-of-the-art pretrained models on forced-choice tasks and find a large discrepancy between small and very large models, going from chance- to human-level performance. However, upon examining the generative performance of the largest model, we find that there is still a gap to bridge before human performance is reached in a more natural conversational setting.
Spiking neural networks with temporal coding schemes process information based on the relative timing of neuronal spikes. In supervised learning tasks, temporal coding allows learning through backpropagation with exact derivatives, and achieves accuracies on par with conventional artificial neural networks. Here we introduce spiking autoencoders with temporal coding and pulses, trained using backpropagation to store and reconstruct images with high fidelity from compact representations. We explore the effect of different spike time target latencies, data noise levels and embedding sizes, as well as classification performance from the embeddings. The spiking autoencoder performs similarly to conventional autoencoders and exceeds their reconstruction performance on inverted-brightness images. We find that inhibition is essential in the functioning of the spiking autoencoders, particularly when the input needs to be memorised for a longer time before the expected output spike times. To reconstruct images with a high target latency, the network learns to accumulate negative evidence and to use the pulses as excitatory triggers for producing the output spikes at the required times. Our results highlight the potential of spiking autoencoders as building blocks for more complex biologically-inspired architectures.
Zuckerli is a scalable compression system meant for large real-world graphs. Graphs are notoriously challenging structures to store efficiently due to their linked nature, which makes it hard to separate them into smaller, compact components. Therefore, effective compression is crucial when dealing with large graphs, which can have billions of nodes and edges. Furthermore, a good compression system should give the user fast and reasonably flexible access to parts of the compressed data without requiring full decompression, which may be unfeasible on their system. Zuckerli improves multiple aspects of WebGraph, the current state-of-the-art in compressing real-world graphs, by using advanced compression techniques and novel heuristic graph algorithms. It can produce both a compressed representation for storage and one which allows fast direct access to the adjacency lists of the compressed graph without decompressing the entire graph. We validate the effectiveness of Zuckerli on real-world graphs with up to a billion nodes and 90 billion edges, conducting an extensive experimental evaluation of both compression density and decompression performance. We show that Zuckerli-compressed graphs are 10% to 29% smaller, and more than 20% in most cases, with a resource usage for decompression comparable to that of WebGraph.
SO(8) Supergravity and the Magic of Machine Learning
Moritz Firsching
Thomas Fischbacher
Journal of High Energy Physics, 2019:57 (August 2019)
Using de Wit-Nicolai D=4 N=8 SO(8) supergravity as an example, we show how modern Machine Learning software libraries such as Google's TensorFlow can be employed to greatly simplify the analysis of high-dimensional scalar sectors of some M-Theory compactifications.
We provide detailed information on the location, symmetries, and particle spectra and charges of 192 critical points on the scalar manifold of SO(8) supergravity, including one newly discovered N=1 vacuum with SO(3) residual symmetry, one new potentially stabilizable non-supersymmetric solution, and examples of "Galois conjugate pairs" of solutions, i.e. solution-pairs that share the same gauge group embedding into SO(8) and minimal polynomials for the cosmological constant. Where feasible, we give analytic expressions for solution coordinates and cosmological constants.
As the authors' aspiration is to present the discussion in a form that is accessible to both the Machine Learning and String Theory communities and allows adopting our methods towards the study of other models, we provide an introductory overview of the relevant Physics as well as Machine Learning concepts. This includes short pedagogical code examples. In particular, we show how to formulate a requirement for residual Supersymmetry as a Machine Learning loss function and effectively guide the numerical search towards supersymmetric critical points. Numerical investigations suggest that there are no further supersymmetric vacua beyond this newly discovered fifth solution.
Temporal coding in spiking neural networks with alpha synaptic function
Krzysztof Potempa
Luca Versari
Thomas Fischbacher
arXiv:1907.13223 (2019)
The timing of individual neuronal spikes is essential for biological brains to make fast responses to sensory stimuli. However, conventional artificial neural networks lack the intrinsic dimension of temporal coding present in biological networks. We propose a spiking neural network model that encodes information in the relative timing of individual neuron spikes. An image can be encoded in this manner by an input layer where each neuron spikes at a time proportional to the brightness of an individual pixel. In classification tasks, the output of the network is indicated by the first neuron to spike in the output layer. By encoding information in time in this manner, we are able to train the network to perform supervised learning with backpropagation, using exact derivatives of the postsynaptic spike times with respect to presynaptic spike times. The network operates using a biologically-plausible alpha synaptic transfer function. Additionally, we use trainable synchronisation pulses that provide bias, add more flexibility during the training process and allow the exploitation of the decay part of the alpha function. We show that such spiking networks can be trained successfully on noisy temporal Boolean logic problems. Moreover, they perform better than comparable spiking models on the MNIST benchmark when encoded in time. During training, we find that the network spontaneously discovers two operating regimes: a slow regime, where a decision is taken after all hidden neurons have spiked and the accuracy is very high, and a fast regime, where a decision is taken very fast but the accuracy is lower. These results demonstrate the computational power of spiking networks with biological characteristics that encode information in the timing of individual neurons. By studying temporal coding in spiking networks, we aim to create building blocks towards energy-efficient, state-based and more complex biologically-inspired neural architectures.
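The encoding and readout described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `t_max` and `tau` parameters, and the specific kernel form `t * exp(-t / tau)` (one common way to write an alpha-shaped synaptic response) are assumptions made for illustration only.

```python
import math

def encode_spike_times(pixels, t_max=1.0):
    """Map brightness values in [0, 1] to input spike times.

    Following the abstract, each input neuron spikes at a time
    proportional to the brightness of its pixel. t_max is a
    hypothetical scaling constant.
    """
    return [t_max * p for p in pixels]

def first_to_spike(output_spike_times):
    """Return the index of the output neuron that spikes earliest.

    In classification, this index is taken as the predicted class.
    """
    return min(range(len(output_spike_times)),
               key=output_spike_times.__getitem__)

def alpha_kernel(t, tau=1.0):
    """Alpha-shaped response to a spike that arrived t time units ago.

    Assumed form: zero before the spike, then a rise followed by a
    decay, peaking at t == tau.
    """
    return t * math.exp(-t / tau) if t > 0 else 0.0

# Usage: three pixels become three input spike times, and the
# earliest output spike selects the class.
input_times = encode_spike_times([0.0, 0.5, 1.0])
predicted_class = first_to_spike([0.9, 0.2, 0.5])
```

The sketch covers only the input encoding and the readout rule; training with exact spike-time derivatives and the synchronisation pulses mentioned in the abstract are not shown.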
JPEG XL next-generation image compression architecture and coding tools
Ruud van Asseldonk
Moritz Firsching
Thomas Fischbacher
Sebastian Gomez
Evgenii Kliuchnikov
Robert Obryk
Krzysztof Potempa
Alexander Rhatushnyak
Jon Sneyers
Zoltan Szabadka
Luca Versari
SPIE Applications of Digital Image Processing, SPIE (2019)
An update on the JPEG XL standardization effort: JPEG XL is a practical approach focused on scalable web distribution and efficient compression of high-quality images. It will provide various benefits compared to existing image formats: significantly smaller size at equivalent subjective quality; fast, parallelizable decoding and encoding configurations; features such as progressive, lossless, animation, and reversible transcoding of existing JPEG; support for high-quality applications including wide gamut, higher resolution/bit depth/dynamic range, and visually lossless coding. Additionally, a royalty-free baseline is an important goal. The JPEG XL architecture is traditional block-transform coding with upgrades to each component. We describe these components and analyze decoded image quality.