Mar Gonzalez-Franco

Mar Gonzalez-Franco, PhD, is a Computer Scientist and Neuroscientist at Google working on a new generation of immersive technologies. With a background in real-time systems, she works to build better interactions for immersive technologies by drawing on several disciplines: virtual reality, augmented reality, AI, computer graphics, computer vision, avatars, and haptics, all while studying human behavior, perception, and neuroscience. She was awarded the 2022 IEEE VGTC VR New Researcher Award and the NAE Frontiers Engineer. She leads the BIRD lab, working on Blended Interactions Research and Devices.
Authored Publications
    Online-EYE: Multimodal Implicit Eye Tracking Calibration for XR
    Baosheng James Hou
    Lucy Abramyan
    Prasanthi Gurumurthy
    Khushman Patel
    Haley Adams
    Andrea Colaco
    Ken Pfeuffer
    Hans Gellersen
    Karan Ahuja
    2025
    Unlike other inputs for VR that work out of the box, eye tracking typically requires custom calibration per user or session. We present a multimodal approach for implicit calibration of eye trackers in VR, leveraging UI interaction for continuous, background calibration. Our method analyzes gaze data alongside controller interaction with UI elements and, employing ML techniques, continuously refines the calibration matrix without interrupting users' current tasks, potentially eliminating the need for explicit calibration. We demonstrate the accuracy and effectiveness of this implicit approach across various tasks and real-time applications, achieving eye tracking accuracy comparable to native, explicit calibration.
    Eye-based interaction techniques for extended reality, such as gaze and pinch, are simple to use but suffer from input precision issues. We present H2E, a fine and coarse-grained pointing technique that cascades Hand, Head, and Eye inputs. As users initiate a pinch gesture, a cursor appears at the gaze point that can be dragged by head pointing before pinch confirmation. This has the potential advantage that it can add a precision component without changing the semantics of the technique. In this paper, we describe the design and implementation of the technique. Furthermore, we present an evaluation of our method in a Fitts-based user study, exploring the speed-accuracy trade-offs against a gaze and pinch interaction baseline.
    Despite the surge in popularity of virtual reality (VR), mobile phones remain the primary medium for accessing digital content, offering both privacy and portability. This short paper presents Beyond the Phone, a novel framework that enhances mobile phones in VR with context-aware controls and spatial augmentation. We first establish a comprehensive design space through brainstorming and iterative discussions with VR experts. We then develop a proof-of-concept system that analyzes UI layouts to offer context-aware controls and spatial augmentation, targeting six key application areas within our design space. Finally, we demonstrate that our system can effectively adapt to a broad spectrum of applications at runtime, and discuss future directions informed by reviews with seven experts.
    Geometry Fidelity for Spherical Images
    Anders Christensen
    Nooshin Mojab
    Khushman Patel
    Karan Ahuja
    Zeynep Akata
    Ole Winther
    Andrea Colaco
    ECCV (2024)
    Spherical, or omni-directional, images offer an immersive format appealing to a wide range of computer vision applications. However, the geometric properties of spherical images pose a major challenge for existing models and metrics designed for 2D images. Concretely, we demonstrate that the established generative evaluation metric FID fails to quantify shortcomings in these properties. To this end, we introduce two quantitative evaluation metrics accounting for geometric constraints of spherical images, namely Omnidirectional FID (OmniFID) and Discontinuity Score (DS). OmniFID is an extension of FID, tailored to additionally capture field-of-view requirements of the spherical format by leveraging cubemap projections. DS is a kernel-based seam alignment score of continuity across borders of 2D representations of spherical images. In experiments, OmniFID and DS detect issues with spherical structure better than previously utilized metrics.
    WindowMirror is a framework for using XR headsets in productivity scenarios. The toolkit provides users with simulated, extended screen real estate. It allows users to interact with multiple desktop applications in real time within an XR environment. Our architecture has two main modules, a Unity package and a Python backend, which make it easy to use and extend. WindowMirror supports traditional desktop interaction methods such as mouse, keyboard, and hand tracking. Furthermore, it features a Cylindrical Window Layout, an emerging design pattern which is particularly effective for single-user, egocentric perspectives. The introduction of WindowMirror aims to set a foundation for future research in XR screen-focused productivity scenarios.
    Hovering Over the Key to Text Input in XR
    Diar Abdlkarim
    Arpit Bhatia
    Stuart Macgregor
    Jason Fotso-Puepi
    Hasti Seifi
    Massimiliano Di Luca
    Karan Ahuja
    2024
    Virtual, Mixed, and Augmented Reality (XR) technologies hold immense potential for transforming productivity beyond the PC. Therefore, there is a critical need for improved text input solutions for XR. However, achieving efficient text input in these environments remains a significant challenge. This paper examines the current landscape of XR text input techniques, focusing on the importance of keyboards (both physical and virtual) as essential tools. We discuss the unique challenges and opportunities presented by XR, synthesizing key trends from existing solutions.
    Augmented Object Intelligence with XR-Objects
    Mustafa Doga Dogan
    Karan Ahuja
    Andrea Colaco
    Johnny Lee
    Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST), ACM (2024), pp. 1-15
    Seamless integration of physical objects as interactive digital entities remains a challenge for spatial computing. This paper explores Augmented Object Intelligence (AOI) in the context of XR, an interaction paradigm that aims to blur the lines between digital and physical by equipping real-world objects with the ability to interact as if they were digital, where every object has the potential to serve as a portal to digital functionalities. Our approach utilizes real-time object segmentation and classification, combined with the power of Multimodal Large Language Models (MLLMs), to facilitate these interactions without the need for object pre-registration. We implement the AOI concept in the form of XR-Objects, an open-source prototype system that provides a platform for users to engage with their physical environment in contextually relevant ways using object-based context menus. This system enables analog objects to not only convey information but also to initiate digital actions, such as querying for details or executing tasks. Our contributions are threefold: (1) we define the AOI concept and detail its advantages over traditional AI assistants, (2) detail the XR-Objects system’s open-source design and implementation, and (3) show its versatility through various use cases and a user study.
    We present XDTK, an open-source Unity/Android toolkit for prototyping multi-device interactions in extended reality (XR). With the Unity package and Android app provided in XDTK, data from any number of devices (phones, tablets, or wearables) can be streamed to and surfaced within a Unity-based XR application. ARCore-supported devices also provide self-tracked pose data. Devices on the same local network are automatically discovered by the Unity server and their inputs are routed using a custom event framework. We designed XDTK to be modular and easily extendable to enable fast, simple, and effective prototyping of multi-device experiences by both researchers and developers.
    Interactions with Extended Reality Head Mounted Device (XR HMD) applications require precise, intuitive, and efficient input methods. Current approaches either rely on power-intensive sensors, such as cameras for hand-tracking, or specialized hardware in the form of handheld controllers. As an alternative, past works have explored the use of devices already present with the user, in the form of smartphones and smartwatches, as practical input solutions. However, this approach risks interaction overload: how can one determine whether the user’s interaction gestures on the watch-face or phone screen are directed toward control of the mobile device itself or the XR device? To this end, we propose a novel framework for cross-device input routing and device arbitration by employing Inertial Measurement Units (IMUs) within these devices. We validate our approach in a user study with six participants. By making use of the relative orientation between the headset and the target input device, we can estimate the intended device of interaction with 93.7% accuracy. Our method offers a seamless, energy-efficient alternative for input management in XR, enhancing user experience through natural and ergonomic interactions.
    Guidelines for Productivity in Virtual Reality
    Andrea Colaco
    ACM Interactions, 31 (2024), pp. 46-53
    Most of our interactions with digital content currently occur inside 2D screens; however, moving from that format to immersive setups brings a paradigm shift: from content inside the screen to users inside the content. This change requires revisiting how we blend the analog and the digital and how we transfer content between the two modes. Perhaps it even asks for new guidelines. While different solutions appear in the space, the dynamic range only seems to widen. We can start to see what works and what does not work so well, in an empirical or ethnographic approach, beyond laboratory studies. But if we want to accelerate adoption, we need to further our understanding of how current tasks can be improved and how this new form of interaction can increase productivity. In this paper we focus on analyzing and converging on what we think works, and envisioning how this new set of immersive devices and interactions can enable productivity beyond already existing tools.