Konstantinos Bousmalis
Authored Publications
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Paul Wohlhart
Matthew Kelcey
Mrinal Kalakrishnan
Laura Downs
Julian Ibarz
Peter Pastor Sampedro
Kurt Konolige
Sergey Levine
ICRA (2018)
Instrumenting and collecting annotated visual grasping datasets to train modern machine learning algorithms is prohibitively expensive. An appealing alternative is to use off-the-shelf simulators to render synthetic data for which ground-truth annotations are generated automatically.
Unfortunately, models trained purely on simulated data often fail to generalize to the real world. To address this shortcoming, prior work introduced domain adaptation algorithms that attempt to make the resulting models domain-invariant. However, such works were evaluated primarily on offline image classification datasets. In this work, we adapt these techniques for learning, primarily in simulation, robotic hand-eye coordination for grasping. Our approaches generalize to diverse and previously unseen real-world objects.
We show that, by using synthetic data and domain adaptation, we are able to reduce the number of real-world samples needed to reach a given level of performance by up to 50 times. We also show that our suggested methodology achieves good grasping results using no labeled real-world data at all.
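As one concrete illustration of the feature-level domain adaptation the abstract refers to, the sketch below shows a DANN-style gradient-reversal setup in PyTorch: a domain classifier tries to separate simulated from real features, while the reversed gradient drives the feature extractor toward domain invariance. This is a generic sketch from the domain adaptation literature, not the paper's architecture; the module sizes and the names features, grasp_head, and domain_head are assumptions.

```python
# Minimal DANN-style sketch (assumed shapes/names, not the paper's model).
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

features = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, 256), nn.ReLU())
grasp_head = nn.Linear(256, 1)    # predicts grasp success (logit)
domain_head = nn.Linear(256, 2)   # predicts sim vs. real

def losses(sim_imgs, sim_labels, real_imgs):
    # sim_labels: float tensor of 0/1 grasp outcomes for the simulated batch.
    f_sim, f_real = features(sim_imgs), features(real_imgs)
    task_loss = nn.functional.binary_cross_entropy_with_logits(
        grasp_head(f_sim).squeeze(1), sim_labels)
    # The domain classifier separates sim from real; the reversed gradient
    # pushes the feature extractor to make the two indistinguishable.
    f_all = GradientReversal.apply(torch.cat([f_sim, f_real]))
    domain_labels = torch.cat(
        [torch.zeros(len(f_sim)), torch.ones(len(f_real))]).long()
    domain_loss = nn.functional.cross_entropy(domain_head(f_all), domain_labels)
    return task_loss + domain_loss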
XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings
Amelie Royer
Stephan Gouws
Fred Bertsch
ICML Workshop (2017)
Style transfer usually refers to the task of applying color and texture information from a specific style image to a given content image while preserving the structure of the latter. Here we tackle the more generic problem of semantic style transfer: given two unpaired collections of images, we aim to learn a mapping between the corpus-level style of each collection, while preserving semantic content shared across the two domains. We introduce XGAN ("Cross-GAN"), a dual adversarial autoencoder, which captures a shared representation of the common domain semantic content in an unsupervised way, while jointly learning the domain-to-domain image translations in both directions. We exploit ideas from the domain adaptation literature and define a semantic consistency loss which encourages the model to preserve semantics in the learned embedding space. We report promising qualitative results for the task of face-to-cartoon translation. The cartoon dataset we collected for this purpose is in the process of being released as a new benchmark for semantic style transfer.
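To make the semantic consistency loss concrete, here is a minimal PyTorch sketch of the round-trip idea: a face is encoded into the shared embedding space, decoded into the cartoon domain, and re-encoded, and the loss penalizes any drift in the embedding. The modules enc_face, enc_cartoon, and dec_cartoon are simple placeholders assumed for illustration, not XGAN's actual architecture.

```python
# Semantic consistency sketch (placeholder encoders/decoder, assumed shapes).
import torch
import torch.nn as nn

embed_dim = 128
enc_face = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, embed_dim))
enc_cartoon = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, embed_dim))
dec_cartoon = nn.Linear(embed_dim, 64 * 64 * 3)

def semantic_consistency_loss(face_batch):
    z = enc_face(face_batch)        # shared-semantics embedding
    cartoon = dec_cartoon(z)        # face -> cartoon translation
    z_back = enc_cartoon(cartoon)   # re-encode the translated image
    # Encourage the round trip to preserve the semantic embedding.
    return (z - z_back).abs().mean()
```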
Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks
CVPR (2017)
Collecting well-annotated image datasets to train modern machine learning algorithms is prohibitively expensive for many tasks. One appealing alternative is rendering synthetic data where ground-truth annotations are generated automatically. Unfortunately, models trained purely on rendered images often fail to generalize to real images. To address this shortcoming, prior work introduced unsupervised domain adaptation algorithms that attempt to map representations between the two domains or learn to extract features that are domain-invariant. In this work, we present a new approach that learns, in an unsupervised manner, a transformation in the pixel space from one domain to the other. Our generative adversarial network (GAN)-based method adapts source-domain images to appear as if drawn from the target domain. Our approach not only produces plausible samples, but also outperforms the state-of-the-art on a number of unsupervised domain adaptation scenarios by large margins. Finally, we demonstrate that the adaptation process generalizes to object classes unseen during training.
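A minimal PyTorch sketch of the pixel-space idea follows: a generator conditioned on a source image plus a noise vector outputs a target-styled image, a discriminator supplies the adversarial signal, and the task classifier trains on the adapted pixels. All shapes, module choices, and the 10-way classifier are illustrative assumptions, not the paper's model.

```python
# Pixel-level adaptation sketch (assumed toy shapes and linear modules).
import torch
import torch.nn as nn

img_dim, noise_dim = 32 * 32 * 3, 16
G = nn.Sequential(nn.Linear(img_dim + noise_dim, img_dim), nn.Tanh())
D = nn.Linear(img_dim, 1)      # real target image vs. adapted source image
task = nn.Linear(img_dim, 10)  # e.g., a 10-way classifier (assumption)

bce = nn.functional.binary_cross_entropy_with_logits

def generator_step(src, src_labels):
    noise = torch.randn(len(src), noise_dim)
    fake = G(torch.cat([src, noise], dim=1))  # source image "repainted"
    fool = bce(D(fake).squeeze(1), torch.ones(len(src)))  # fool D
    cls = nn.functional.cross_entropy(task(fake), src_labels)
    return fool + cls  # adversarial term plus task loss on adapted pixels

def discriminator_step(src, tgt):
    noise = torch.randn(len(src), noise_dim)
    fake = G(torch.cat([src, noise], dim=1)).detach()
    return bce(D(tgt).squeeze(1), torch.ones(len(tgt))) + \
           bce(D(fake).squeeze(1), torch.zeros(len(fake)))
```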
Domain Separation Networks
George Trigeorgis
Nathan Silberman
Dilip Krishnan
NIPS (2016)
The cost of large scale data collection and annotation often makes the application of machine learning algorithms to new tasks or datasets prohibitively expensive. One approach circumventing this cost is training models on synthetic data where annotations are provided automatically. Despite their appeal, such models often fail to generalize from synthetic to real images, necessitating domain adaptation algorithms to manipulate these models before they can be successfully applied. Existing approaches focus either on mapping representations from one domain to the other, or on learning to extract features that are invariant to the domain from which they were extracted. However, by focusing only on creating a mapping or shared representation between the two domains, they ignore the individual characteristics of each domain. We suggest that explicitly modeling what is unique to each domain can improve a model's ability to extract domain-invariant features. Inspired by work on private-shared component analysis, we explicitly learn to extract image representations that are partitioned into two subspaces: one component which is private to each domain and one which is shared across domains. Our model is trained not only to perform the task we care about in the source domain, but also to use the partitioned representation to reconstruct the images from both domains. Our novel architecture results in a model that outperforms the state-of-the-art on a range of unsupervised domain adaptation scenarios and additionally produces visualizations of the private and shared representations enabling interpretation of the domain adaptation process.
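The partitioned representation can be sketched as follows: a private encoder per domain, a shared encoder across domains, a decoder that must reconstruct each image from the concatenation of its private and shared codes, and a soft orthogonality ("difference") penalty that keeps the two subspaces distinct. Layer sizes, names, and weights are assumptions; the full model also carries the task and similarity losses mentioned in the abstract.

```python
# Private/shared partition sketch (assumed toy sizes, not the paper's CNNs).
import torch
import torch.nn as nn

dim, code = 32 * 32 * 3, 64
shared_enc = nn.Linear(dim, code)
private_src = nn.Linear(dim, code)
private_tgt = nn.Linear(dim, code)
decoder = nn.Linear(2 * code, dim)

def dsn_losses(x_src, x_tgt):
    hs_src, hp_src = shared_enc(x_src), private_src(x_src)
    hs_tgt, hp_tgt = shared_enc(x_tgt), private_tgt(x_tgt)
    # Reconstruct each image from its private + shared codes together.
    recon = sum(((decoder(torch.cat([hp, hs], dim=1)) - x) ** 2).mean()
                for hp, hs, x in [(hp_src, hs_src, x_src),
                                  (hp_tgt, hs_tgt, x_tgt)])
    # Difference loss: a squared Frobenius norm of the cross-correlation
    # discourages the shared and private codes from encoding the same factors.
    diff = (hs_src.T @ hp_src).pow(2).sum() + (hs_tgt.T @ hp_tgt).pow(2).sum()
    return recon, diff  # to be combined with task and similarity losses
```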
A Deep Matrix Factorization Method for Learning Attribute Representations
George Trigeorgis
Stefanos Zafeiriou
Björn W. Schuller
IEEE Trans. Pattern Analysis and Machine Intelligence, 39 (2016), pp. 417-429
Semi-Non-negative Matrix Factorization is a technique that learns a low-dimensional representation of a dataset that lends itself to a clustering interpretation. It is possible that the mapping between this new representation and our original data matrix contains rather complex hierarchical information with implicit lower-level hidden attributes, which classical one-level clustering methodologies cannot interpret. In this work we propose a novel model, Deep Semi-NMF, that is able to learn such hidden representations, which lend themselves to an interpretation of clustering according to different, unknown attributes of a given dataset. We also present a semi-supervised version of the algorithm, named Deep WSF, which incorporates (partial) prior information for each of the known attributes of a dataset, allowing the model to be used on datasets with mixed attribute knowledge. Finally, we show that our models are able to learn low-dimensional representations that are better suited not only for clustering but also for classification, outperforming not just Semi-Non-negative Matrix Factorization but other state-of-the-art methodologies as well.
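The hierarchy described here can be written as X ≈ W1 W2 H2, where the deepest factor H2 is non-negative and the intermediate factor H1 ≈ W2 H2 exposes a second, coarser level of attributes. The toy NumPy example below illustrates that structure with crude alternating least-squares updates and clipping; the paper itself derives proper multiplicative update rules, and all dimensions here are arbitrary assumptions.

```python
# Two-layer Deep Semi-NMF structure, X ~ W1 @ W2 @ H2 with H2 >= 0.
# Naive alternating updates for illustration only (not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))          # features x samples (assumed)
k1, k2 = 40, 10                              # per-layer dimensions (assumed)
W1 = rng.standard_normal((100, k1))          # mixed-sign bases, layer 1
W2 = rng.standard_normal((k1, k2))           # mixed-sign bases, layer 2
H2 = np.abs(rng.standard_normal((k2, 500)))  # deepest factor, kept >= 0

for _ in range(200):
    W = W1 @ W2                                   # effective two-layer basis
    H2 = np.clip(np.linalg.pinv(W) @ X, 0, None)  # non-negative attributes
    W1 = X @ np.linalg.pinv(W2 @ H2)              # unconstrained update
    W2 = np.linalg.pinv(W1) @ X @ np.linalg.pinv(H2)

print("reconstruction error:", np.linalg.norm(X - W1 @ W2 @ H2))
```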