Alex Olwal

I am a Tech Lead/Manager in Google’s Augmented Reality team and a founder of the Interaction Lab. I direct research and development of interaction technologies based on advancements in display technology, low-power and high-speed sensing, wearables, actuation, electronic textiles, and human-computer interaction. I am passionate about accelerating innovation and disruption through tools, techniques and devices that enable augmentation and empowerment of human abilities. Research interests include augmented reality, ubiquitous computing, mobile devices, 3D user interfaces, interaction techniques, interfaces for accessibility and health, medical imaging, and software/hardware prototyping.

Google I/O 2022 Keynote: Augmented Language
Our Augmented Language project was featured in the I/O 2022 Keynote.
"Let's see what happens when we take our advances in translation and transcription, and deliver them in your line-of-sight", Sundar Pichai, CEO.

· 2020-Now Augmented Reality
· 2018-2020 Google AI: Research & Machine Intelligence
· 2017-2018 ATAP (Advanced Technology and Projects)
· 2016-2017 Wearables, Augmented and Virtual Reality
· 2015-2016 Project Aura, Glass and Beyond
· 2014-2015 Google X

My work builds on my experience from research labs and institutions, including MIT Media Lab, Columbia University, University of California, Santa Barbara, KTH (Royal Institute of Technology), and Microsoft Research. I have taught at Stanford University, Rhode Island School of Design and KTH.

Portfolio: olwal.com
Authored Publications
    Experiencing InstructPipe: Building Multi-modal AI Pipelines via Prompting LLMs and Visual Programming
    Zhongyi Zhou
    Jing Jin
    Xiuxiu Yuan
    Jun Jiang
    Jingtao Zhou
    Yiyi Huang
    Kristen Wright
    Jason Mayes
    Mark Sherwood
    Ram Iyengar
    Na Li
    Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, ACM, pp. 5 (to appear)
    Foundational multi-modal models have democratized AI access, yet the construction of complex, customizable machine learning pipelines by novice users remains a grand challenge. This paper demonstrates a visual programming system that allows novices to rapidly prototype multimodal AI pipelines. We first conducted a formative study with 58 contributors and collected 236 proposals of multimodal AI pipelines that served various practical needs. We then distilled our findings into a design matrix of primitive nodes for prototyping multimodal AI visual programming pipelines, and implemented a system with 65 nodes. To support users' rapid prototyping experience, we built InstructPipe, an AI assistant based on large language models (LLMs) that allows users to generate a pipeline by writing text-based instructions. We believe InstructPipe enhances novice users' onboarding experience of visual programming and the controllability of LLMs by offering non-experts a platform to easily update the generation.
    ChatDirector: Enhancing Video Conferencing with Space-Aware Scene Rendering and Speech-Driven Layout Transition
    Brian Moreno Collins
    Karthik Ramani
    Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, ACM, pp. 16 (to appear)
    Remote video conferencing systems (RVCS) are widely adopted in personal and professional communication. However, they often lack the co-presence experience of in-person meetings. This is largely due to the absence of intuitive visual cues and clear spatial relationships among remote participants, which can lead to speech interruptions and loss of attention. This paper presents ChatDirector, a novel RVCS that overcomes these limitations by incorporating space-aware visual presence and speech-aware attention transition assistance. ChatDirector employs a real-time pipeline that converts participants' RGB video streams into 3D portrait avatars and renders them in a virtual 3D scene. We also contribute a decision tree algorithm that directs the avatar layouts and behaviors based on participants' speech states. We report on results from a user study (N=16) where we evaluated ChatDirector. The satisfactory algorithm performance and complimentary subjective user feedback imply that ChatDirector significantly enhances communication efficacy and user engagement.
    UI Mobility Control in XR: Switching UI Positionings between Static, Dynamic, and Self Entities
    Siyou Pei
    Yang Zhang
    Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, ACM, pp. 12 (to appear)
    Extended reality (XR) has the potential for seamless user interface (UI) transitions across people, objects, and environments. However, the design space, applications, and common practices of 3D UI transitions remain underexplored. To address this gap, we conducted a need-finding study with 11 participants, identifying and distilling a taxonomy based on three types of UI placements: affixed to static, dynamic, or self entities. We further surveyed 113 commercial applications to understand the common practices of 3D UI mobility control, where only 6.2% of these applications allowed users to transition UI between entities. In response, we built interaction prototypes to facilitate UI transitions between entities. We report on results from a qualitative user study (N=14) on 3D UI mobility control using our FingerSwitches technique, which suggests that perceived usefulness is affected by types of entities and environments. We aspire to tackle a vital need in UI mobility within XR.
    Rapsai: Accelerating Machine Learning Prototyping of Multimedia Applications through Visual Programming
    Na Li
    Jing Jin
    Michelle Carney
    Scott Joseph Miles
    Maria Kleiner
    Xiuxiu Yuan
    Anuva Kulkarni
    Xingyu “Bruce” Liu
    Ahmed K Sabie
    Ping Yu
    Ram Iyengar
    Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), ACM
    In recent years, there has been a proliferation of multimedia applications that leverage machine learning (ML) for interactive experiences. Prototyping ML-based applications is, however, still challenging, given complex workflows that are not ideal for design and experimentation. To better understand these challenges, we conducted a formative study with seven ML practitioners to gather insights about common ML evaluation workflows. This study helped us derive six design goals, which informed Rapsai, a visual programming platform for rapid and iterative development of end-to-end ML-based multimedia applications. Rapsai is based on a node-graph editor to facilitate interactive characterization and visualization of ML model performance. Rapsai streamlines end-to-end prototyping with interactive data augmentation and model comparison capabilities in its no-coding environment. Our evaluation of Rapsai in four real-world case studies (N=15) suggests that practitioners can accelerate their workflow, make more informed decisions, analyze strengths and weaknesses, and holistically evaluate model behavior with real-world input.
    InstructPipe: Building Visual Programming Pipelines with Human Instructions
    Zhongyi Zhou
    Jing Jin
    Xiuxiu Yuan
    Jun Jiang
    Jingtao Zhou
    Yiyi Huang
    Kristen Wright
    Jason Mayes
    Mark Sherwood
    Ram Iyengar
    Na Li
    arXiv, vol. 2312.09672 (2023)
    Visual programming provides beginner-level programmers with a coding-free experience to build their customized pipelines. Existing systems require users to build a pipeline entirely from scratch, implying that novice users need to set up and link appropriate nodes all by themselves, starting from a blank workspace. We present InstructPipe, an AI assistant that enables users to start prototyping machine learning (ML) pipelines with text instructions. We designed two LLM modules and a code interpreter to execute our solution. LLM modules generate pseudocode of a target pipeline, and the interpreter renders a pipeline in the node-graph editor for further human-AI collaboration. Technical evaluations reveal that InstructPipe reduces user interactions by 81.1% compared to traditional methods. Our user study (N=16) showed that InstructPipe empowers novice users to streamline their workflow in creating desired ML pipelines, reduce their learning curve, and spark innovative ideas with open-ended commands.
    Modeling and Improving Text Stability in Live Captions
    Xingyu "Bruce" Liu
    Jun Zhang
    Leonardo Ferrer
    Susan Xu
    Vikas Bahirwani
    Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), ACM, 208:1-9
    In recent years, live captions have gained significant popularity through their availability in remote video conferences, mobile applications, and the web. Unlike preprocessed subtitles, live captions require real-time responsiveness by showing interim speech-to-text results. As the prediction confidence changes, the captions may update, leading to visual instability that interferes with the user’s viewing experience. In this work, we characterize the stability of live captions by proposing a vision-based flickering metric using luminance contrast and Discrete Fourier Transform. Additionally, we assess the effect of unstable captions on the viewer through task load index surveys. Our analysis reveals significant correlations between the viewer's experience and our proposed quantitative metric. To enhance the stability of live captions without compromising responsiveness, we propose the use of tokenized alignment, word updates with semantic similarity, and smooth animation. Results from a crowdsourced study (N=123), comparing four strategies, indicate that our stabilization algorithms lead to a significant reduction in viewer distraction and fatigue, while increasing viewers' reading comfort.
    Experiencing Augmented Communication with Real-time Visuals using Large Language Models in Visual Captions
    Xingyu 'Bruce' Liu
    Vladimir Kirilyuk
    Xiuxiu Yuan
    Xiang ‘Anthony’ Chen
    Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST), ACM (2023) (to appear)
    We demonstrate Visual Captions, a real-time system that integrates with a video conferencing platform to enrich verbal communication. Visual Captions leverages a fine-tuned large language model to proactively suggest visuals that are relevant to the context of the ongoing conversation. We implemented Visual Captions as a user-customizable Chrome plugin with three levels of AI proactivity: Auto-display (AI autonomously adds visuals), Auto-suggest (AI proactively recommends visuals), and On-demand-suggest (AI suggests visuals when prompted). We showcase the usage of Visual Captions in open-vocabulary settings, and how the addition of visuals based on the context of conversations could improve comprehension of complex or unfamiliar concepts. In addition, we demonstrate three ways in which people can interact with the system at different levels of AI proactivity. Visual Captions is open-sourced at https://github.com/google/archat.
    Experiencing Rapid Prototyping of Machine Learning Based Multimedia Applications in Rapsai
    Na Li
    Jing Jin
    Michelle Carney
    Xiuxiu Yuan
    Ping Yu
    Ram Iyengar
    CHI EA '23: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, ACM, 448:1-4
    We demonstrate Rapsai, a visual programming platform that aims to streamline the rapid and iterative development of end-to-end machine learning (ML)-based multimedia applications. Rapsai features a node-graph editor that enables interactive characterization and visualization of ML model performance, which facilitates the understanding of how the model behaves in different scenarios. Moreover, the platform streamlines end-to-end prototyping by providing interactive data augmentation and model comparison capabilities within a no-coding environment. Our demonstration showcases the versatility of Rapsai through several use cases, including virtual background, visual effects with depth estimation, and audio denoising. The implementation of Rapsai is intended to support ML practitioners in streamlining their workflow, making data-driven decisions, and comprehensively evaluating model behavior with real-world input.
    Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals
    Xingyu Bruce Liu
    Vladimir Kirilyuk
    Xiuxiu Yuan
    Xiang ‘Anthony’ Chen
    Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), ACM, pp. 1-20
    Computer-mediated platforms are increasingly facilitating verbal communication, and capabilities such as live captioning and noise cancellation enable people to understand each other better. We envision that visual augmentations that leverage semantics in the spoken language could also be helpful to illustrate complex or unfamiliar concepts. To advance our understanding of the interest in such capabilities, we conducted formative research through remote interviews (N=10) and crowdsourced a dataset of 1500 sentence-visual pairs across a wide range of contexts. These insights informed Visual Captions, a real-time system that we integrated into a videoconferencing platform to enrich verbal communication. Visual Captions leverages a fine-tuned large language model to proactively suggest relevant visuals in open-vocabulary conversations. We report on our findings from a lab study (N=26) and a two-week deployment study (N=10), which demonstrate how Visual Captions has the potential to help people improve their communication through visual augmentation in various scenarios.
    Experiencing Visual Blocks for ML: Visual Prototyping of AI Pipelines
    Na Li
    Jing Jin
    Michelle Carney
    Jun Jiang
    Xiuxiu Yuan
    Kristen Wright
    Mark Sherwood
    Jason Mayes
    Lin Chen
    Jingtao Zhou
    Zhongyi Zhou
    Ping Yu
    Ram Iyengar
    ACM (2023) (to appear)
    We demonstrate Visual Blocks for ML, a visual programming platform that facilitates rapid prototyping of ML-based multimedia applications. As the public version of Rapsai, we further integrated large language models and custom APIs into the platform. In this demonstration, we will showcase how to build interactive AI pipelines in a few drag-and-drops, how to perform interactive data augmentation, and how to integrate pipelines into Colabs. In addition, we demonstrate a wide range of community-contributed pipelines in Visual Blocks for ML, covering various aspects including interactive graphics, chains of large language models, computer vision, and multi-modal applications. Finally, we encourage students, designers, and ML practitioners to contribute ML pipelines through https://github.com/google/visualblocks/tree/main/pipelines to inspire creative use cases. Visual Blocks for ML is available at http://visualblocks.withgoogle.com.
    ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users
    DJ Jain
    Khoa Huynh Anh Nguyen
    Steven Goodman
    Rachel Grossman-Kahn
    Hung Ngo
    Leah Findlater
    Jon E. Froehlich
    Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), ACM, pp. 24
    Recent advances have enabled automatic sound recognition systems for deaf and hard of hearing (DHH) users on mobile devices. However, these tools use pre-trained, generic sound recognition models, which do not meet the diverse needs of DHH users. We introduce ProtoSound, an interactive system for customizing sound recognition models by recording a few examples, thereby enabling personalized and fine-grained categories. ProtoSound is motivated by prior work examining sound awareness needs of DHH people and by a survey we conducted with 472 DHH participants. To evaluate ProtoSound, we characterized performance on two real-world sound datasets, showing significant improvement over state-of-the-art (e.g., +9.7% accuracy on the first dataset). We then deployed ProtoSound's end-user training and real-time recognition through a mobile application and recruited 19 hearing participants who listened to the real-world sounds and rated the accuracy across 56 locations (e.g., homes, restaurants, parks). Results show that ProtoSound personalized the model on-device in real-time and accurately learned sounds across diverse acoustic contexts. We close by discussing open challenges in personalizable sound recognition, including the need for better recording interfaces and algorithmic improvements.
    Consumer electronics are increasingly using everyday materials to blend into home environments, often using LEDs or symbol displays under textile meshes. Our surveys (n=1499 and n=1501) show interest in interactive graphical displays for hidden interfaces. However, covering such displays significantly limits brightness, material possibilities and legibility. To overcome these limitations, we leverage parallel rendering to enable ultrabright graphics that can pass through everyday materials. We unlock expressive hidden interfaces using rectilinear graphics on low-cost, mass-produced passive-matrix OLED displays. A technical evaluation across materials, shapes and display techniques suggests a 3.6-40X brightness increase compared to more complex active-matrix OLEDs. We present interactive prototypes that blend into wood, textile, plastic and mirrored surfaces. Survey feedback (n=1572) on our prototypes suggests that smart mirrors are particularly desirable. A lab evaluation (n=11) reinforced these findings and allowed us to also characterize performance from hands-on interaction with different content, materials and under varying lighting conditions.
    Opportunistic Interfaces for Augmented Reality: Transforming Everyday Objects into Tangible 6DoF Interfaces Using Ad hoc UI
    Mathieu Le Goc
    Shengzhi Wu
    Danhang "Danny" Tang
    Jun Zhang
    David Joseph New Tan
    Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, ACM
    Real-time environmental tracking has become a fundamental capability in modern mobile phones and AR/VR devices. However, it only allows user interfaces to be anchored at a static location. Although fiducial and natural-feature tracking overlays interfaces with specific visual features, they typically require developers to define the pattern before deployment. In this paper, we introduce opportunistic interfaces to grant users complete freedom to summon virtual interfaces on everyday objects via voice commands or tapping gestures. We present the workflow and technical details of Ad hoc UI (AhUI), a prototyping toolkit to empower users to turn everyday objects into opportunistic interfaces on the fly. We showcase a set of demos with real-time tracking, voice activation, 6DoF interactions, and mid-air gestures, and envision the future of opportunistic interfaces.
    E-Textile Microinteractions: Augmenting Twist with Flick, Slide and Grasp Gestures for Soft Electronics
    Gowa Wu
    Thad Eugene Starner
    Proceedings of CHI 2020 (ACM Conference on Human Factors in Computing Systems), ACM, New York, NY, pp. 1-13
    E-textile microinteractions advance cord-based interfaces by enabling the simultaneous use of precise continuous control and casual discrete gestures. We leverage the recently introduced I/O Braid sensing architecture to enable a series of user studies and experiments which help design suitable interactions and a real-time gesture recognition pipeline. Informed by a gesture elicitation study with 36 participants, we developed a user-dependent classifier for eight discrete gestures with 94% accuracy for 12 participants. In a formal evaluation we show that we can enable precise manipulation with the same architecture. Our quantitative targeting experiment suggests that twisting is faster than existing headphone button controls and is comparable in speed to a capacitive touch surface. Qualitative interview feedback indicates a preference for I/O Braid’s interaction over that of in-line headphone controls. Our applications demonstrate how continuous and discrete gestures can be combined to form new, integrated e-textile microinteraction techniques for real-time continuous control, discrete actions and mode switching.
    Today’s wearable and mobile devices typically use separate hardware components for sensing and actuation. In this work, we introduce new opportunities for the Linear Resonant Actuator (LRA), which is ubiquitous in such devices due to its capability for providing rich haptic feedback. By leveraging strategies to enable active and passive sensing capabilities with LRAs, we demonstrate their benefits and potential as self-contained I/O devices. Specifically, we use the back-EMF voltage to classify whether the LRA is tapped or touched, as well as how much pressure is being applied. The back-EMF sensing is already integrated into many motor and LRA drivers. We developed a passive low-power tap sensing method that uses just 37.7 µA. Furthermore, we developed active touch and pressure sensing, which is low-power, quiet (2 dB), and minimizes vibration. The sensing method works with many types of LRAs. We show applications, such as pressure-sensing side-buttons on a mobile phone. We have also implemented our technique directly on an existing mobile phone’s LRA to detect if the phone is handheld or placed on a soft or hard surface. Finally, we show that this method can be used for haptic devices to determine if the LRA makes good contact with the skin. Our approach can add rich sensing capabilities to the ubiquitous LRA actuators without requiring additional sensors or hardware.
    Wearable Subtitles: Augmenting Spoken Communication with Lightweight Eyewear for All-day Captioning
    Kevin Balke
    Dmitrii Votintcev
    Thad Starner
    Bonnie Chinh
    Benoit Corda
    Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology, Association for Computing Machinery, New York, NY, USA (2020), 1108–1120
    Mobile solutions can help transform speech and sound into visual representations for people who are deaf or hard-of-hearing (DHH). However, where handheld phones present challenges, head-worn displays (HWDs) could further communication through privately transcribed text, hands-free use, improved mobility, and socially acceptable interactions. Wearable Subtitles is a lightweight 3D-printed proof-of-concept HWD that explores augmenting communication through sound transcription for a full workday. Using a low-power microcontroller architecture, we enable up to 15 hours of continuous use. We describe a large survey (n=501) and three user studies with 24 deaf/hard-of-hearing participants which inform our development and help us refine our prototypes. Our studies and prior research identify critical challenges for the adoption of HWDs which we address through extended battery life, lightweight and balanced mechanical design (54 g), fitting options, and form factors that are compatible with current social norms.
    SensorSnaps: Integrating Wireless Sensor Nodes into Fabric Snap Fasteners for Textile Interfaces
    Tomás Alfonso Vega Gálvez
    Proceedings of UIST 2019 (ACM Symposium on User Interface Software and Technology), ACM, New York, NY
    Adding electronics to textiles can be time-consuming and requires technical expertise. We introduce SensorSnaps, low-power wireless sensor nodes that seamlessly integrate into caps of fabric snap fasteners. SensorSnaps provide a new technique to quickly and intuitively augment any location on the clothing with sensing capabilities. SensorSnaps securely attach and detach from ubiquitous commercial snap fasteners. Using inertial measurement units, the SensorSnaps detect tap and rotation gestures, as well as track body motion. We optimized the power consumption for SensorSnaps to work continuously for 45 minutes and up to 4 hours in capacitive touch standby mode. We present applications in which the SensorSnaps are used as gestural interfaces for a music player controller, cursor control, and motion tracking suit. The user study showed that SensorSnaps could be attached in around 71 seconds, similar to attaching off-the-shelf snaps, and participants found the gestures easy to learn and perform. SensorSnaps could allow anyone to effortlessly add sophisticated sensing capacities to ubiquitous snap fasteners.
    I/O Braid: Scalable Touch-Sensitive Lighted Cords Using Spiraling, Repeating Sensing Textiles and Fiber Optics
    Jon Moeller
    Greg Priest-Dorman
    Thad Starner
    Ben Carroll
    Proceedings of UIST '18 (ACM Symposium on User Interface Software and Technology), ACM, New York, NY, USA (2018), pp. 485-497
    We introduce I/O Braid, an interactive textile cord with embedded sensing and visual feedback. I/O Braid senses proximity, touch, and twist through a spiraling, repeating braiding topology of touch matrices. This sensing topology is uniquely scalable, requiring only a few sensing lines to cover the whole length of a cord. The same topology allows us to embed fiber optic strands to integrate co-located visual feedback. We provide an overview of the enabling braiding techniques, design considerations, and approaches to gesture detection. These allow us to derive a set of interaction techniques, which we demonstrate with different form factors and capabilities. Our applications illustrate how I/O Braid can invisibly augment everyday objects, such as touch-sensitive headphones and interactive drawstrings on garments, while enabling discoverability and feedback through embedded light sources.
    Hybrid Watch User Interfaces: Collaboration Between Electro-Mechanical Components and Analog Materials
    UIST 2018 Adjunct Proceedings (ACM Symposium on User Interface Software and Technology), pp. 200-202
    We introduce programmable material and electro-mechanical control to enable a set of hybrid watch user interfaces that symbiotically leverage the joint strengths of electro-mechanical hands and a dynamic watch dial. This approach enables computation and connectivity with existing materials to preserve the inherent physical qualities and abilities of traditional analog watches. We augment the watch's mechanical hands with micro-stepper motors for control, positioning and mechanical expressivity. We extend the traditional watch dial with programmable pigments for non-emissive dynamic patterns. Together, these components enable a unique set of interaction techniques and user interfaces beyond their individual capabilities.
    Advances in the past century have resulted in unprecedented access to empowering technology, with user interfaces that typically provide clear distinction and separation between environments, technology and people. The progress in recent decades indicates, however, inevitable developments where sensing, display, actuation and computation will seek to integrate more intimately with matter, humans and machines. This talk will explore some of the radical new challenges and opportunities that these advancements imply for next-generation interfaces.
    1D Eyewear: Peripheral, Hidden LEDs and Near-Eye Holographic Displays
    Bernard Kress
    Proceedings of ISWC 2018 (International Symposium on Wearable Computers)
    1D Eyewear uses 1D arrays of LEDs and pre-recorded holographic symbols to enable minimal head-worn displays. Our approach uses computer-generated holograms (CGHs) to create diffraction gratings which project a pre-recorded static image when illuminated with coherent light. Specifically, we develop a set of transmissive, reflective, and steerable optical configurations that can be embedded in conventional eyewear designs. This approach enables high resolution symbolic display in discreet digital eyewear.
    Zensei: Embedded, Multi-electrode Bioimpedance Sensing for Implicit, Ubiquitous User Recognition
    Munehiko Sato
    Rohan S. Puri
    Yosuke Ushigome
    Lukas Franciszkiewicz
    Deepak Chandra
    Ivan Poupyrev
    Ramesh Raskar
    Proceedings of CHI 2017 (SIGCHI Conference on Human Factors in Computing Systems), ACM, pp. 3972-3985
    Interactions and connectivity are increasingly expanding to shared objects and environments, such as furniture, vehicles, lighting, and entertainment systems. For transparent personalization in such contexts, we see an opportunity for embedded recognition, to complement traditional, explicit authentication. We introduce Zensei, an implicit sensing system that leverages bio-sensing, signal processing and machine learning to classify uninstrumented users by their body’s electrical properties. Zensei could allow many objects to recognize users, e.g., phones that unlock when held, cars that automatically adjust mirrors and seats, or power tools that restore user settings. We introduce wide-spectrum bioimpedance hardware that measures both amplitude and phase. It extends previous approaches through multi-electrode sensing and high-speed wireless data collection for embedded devices. We implement the sensing in devices and furniture, where unique electrode configurations generate characteristic profiles based on users’ unique electrical properties. Finally, we discuss results from a comprehensive, longitudinal 22-day data collection experiment with 46 subjects. Our analysis shows promising classification accuracy and low false acceptance rate.
    StretchEBand: Enabling Fabric-Based Interactions through Rapid Fabrication of Textile Stretch Sensors
    Anita Vogl
    Patrick Parzer
    Teo Babic
    Joanne Leong
    Michael Haller
    Proceedings of CHI 2017 (SIGCHI Conference on Human Factors in Computing Systems), ACM, pp. 2617-2627
    The increased interest in interactive soft materials, such as smart clothing and responsive furniture, means that there is a need for flexible and deformable electronics. In this paper, we focus on stitch-based elastic sensors, which have the benefit of being manufacturable with textile craft tools that have been used in homes for centuries. We contribute to the understanding of stitch-based stretch sensors through four experiments and one user study that investigate conductive yarns from textile and technical perspectives, and analyze the impact of different stitch types and parameters. The insights informed our design of new stretch-based interaction techniques that emphasize eyes-free or casual interactions. We demonstrate with StretchEBand how soft, continuous sensors can be rapidly fabricated with different parameters and capabilities to support interaction with a wide range of performance requirements across wearables, mobile devices, clothing, furniture, and toys.
    shiftIO: Reconfigurable Tactile Elements for Dynamic Affordances and Mobile Interaction
    Evan Strasnick
    Jackie Yang
    Kesler Tanner
    Sean Follmer
    Proceedings of CHI 2017 (SIGCHI Conference on Human Factors in Computing Systems), ACM, pp. 5075-5086
    Preview abstract Currently, virtual (i.e. touchscreen) controls are dynamic, but lack the advantageous tactile feedback of physical controls. Similarly, devices may also have dedicated physical controls, but they lack the flexibility to adapt for different contexts and applications. On mobile and wearable devices in particular, space constraints further limit our input and output capabilities. We propose utilizing reconfigurable tactile elements around the edge of a mobile device to enable dynamic physical controls and feedback. These tactile elements can be used for physical touch input and output, and can reposition according to the application both around the edge of and hidden within the device. We present shiftIO, two implementations of such a system which actuate physical controls around the edge of a mobile device using magnetic locomotion. One version utilizes PCB-manufactured electromagnetic coils, and the other uses switchable permanent magnets. We perform a technical evaluation of these prototypes and compare their advantages in various applications. Finally, we demonstrate several mobile applications which leverage shiftIO to create novel mobile interactions. View details
    On-Skin Interaction Using Body Landmarks
    Jürgen Steimle
    Joanna Bergstrom-Lehtovirta
    Martin Weigel
    Aditya Shekhar Nittala
    Sebastian Boring
    Kasper Hornbæk
    IEEE Computer, vol. 50 (2017), pp. 19-27
    Preview abstract Recent research in human–computer interaction (HCI) has recognized the human skin as a promising surface for interacting with computing devices. The human skin is large, always available, and sensitive to touch. Leveraging it as an interface helps overcome the limited surface real estate of today’s wearable devices and allows for input to smart watches, smart glasses, mobile phones, and remote displays. Various technologies have been presented that transform the human skin into an interactive surface. For instance, touch input has been captured using cameras, body-worn sensors and slim skin-worn electronics. Output has been provided using projectors, thin displays, and computer-induced muscle movement. Researchers have also developed experimental interaction techniques for the human skin; for instance, allowing a user to activate an interface element by tapping on a specific finger location or by grabbing or squeezing the skin. To keep the design and engineering tractable, most existing work has approached the skin as a more or less planar surface. In that way, principles and models for designing interaction could be transferred from existing touch-based devices to the skin. However, this assumes that the resolution of sensing or visual output on the skin is as uniform and dense as on current touch devices. It is not; current on-skin interaction typically allows only touch gestures or tapping on a few distinct locations with varying performance and, therefore, greatly limits possible interaction styles. It might be acceptable for answering or rejecting a phone call, but it is not powerful enough to allow expressive interaction with a wide range of user interfaces and applications. More importantly, this line of thinking does not consider the fact that the human skin has unique properties that vary across body locations, making it fundamentally different from planar touch surfaces. 
For instance, the skin contains many distinct geometries that users can feel and see during interactions, such as the curvature of a finger or a protruding knuckle. Skin is also stretchable, which allows novel interactions based on stretching and deforming. Additionally, skin provides a multitude of sensory cells for direct tactile feedback, and proprioception guides the user during interaction on the body. View details
    Grabity: A Wearable Haptic Interface for Simulating Weight and Grasping in Virtual Reality
    Inrak Choi
    Heather Culbertson
    Mark R. Miller
    Sean Follmer
    Proceedings of UIST 2017 (ACM Symposium on User Interface Software and Technology), New York, NY, USA
    Preview abstract Ungrounded haptic devices for virtual reality (VR) applications lack the ability to convincingly render the sensations of a grasped virtual object’s rigidity and weight. We present Grabity, a wearable haptic device designed to simulate kinesthetic pad opposition grip forces and weight for grasping virtual objects in VR. The device is mounted on the index finger and thumb and enables precision grasps with a wide range of motion. A unidirectional brake creates rigid grasping force feedback. Two voice coil actuators create virtual force tangential to each finger pad through asymmetric skin deformation. These forces can be perceived as gravitational and inertial forces of virtual objects. The rotational orientation of the voice coil actuators is passively aligned with the real direction of gravity through a revolute joint, causing the virtual forces to always point downward. This paper evaluates the performance of Grabity through two user studies, finding promising ability to simulate different levels of weight with convincing object rigidity. The first user study shows that Grabity can convey various magnitudes of weight and force sensations to users by manipulating the amplitude of the asymmetric vibration. The second user study shows that users can differentiate different weights in a virtual environment using Grabity. View details
    SkinMarks: Enabling Interactions on Body Landmarks Using Conformal Skin Electronics
    Martin Weigel
    Aditya Shekhar Nittala
    Jürgen Steimle
    Proceedings of CHI 2017 (SIGCHI Conference on Human Factors in Computing Systems), ACM, pp. 3095-3105
    Preview abstract The body provides many recognizable landmarks due to the underlying skeletal structure and variations in skin texture, elasticity, and color. The visual and spatial cues of such body landmarks can help in localizing on-body interfaces, guide input on the body, and allow for easy recall of mappings. Our main contribution is SkinMarks, novel skin-worn I/O devices for precisely localized input and output on fine body landmarks. SkinMarks comprise skin electronics on temporary rub-on tattoos. They conform to fine wrinkles and are compatible with strongly curved and elastic body locations. We identify five types of body landmarks and demonstrate novel interaction techniques that leverage SkinMarks’ unique touch, squeeze and bend sensing with integrated visual output. Finally, we detail the conformality and evaluate sub-millimeter electrodes for touch sensing. Taken together, SkinMarks expands the on-body interaction space to more detailed, highly curved and challenging areas on the body. View details
    WatchThru: Expanding Smartwatch Displays with Mid-air Visuals and Wrist-worn Augmented Reality
    Dirk Wenig
    Johannes Schöning
    Mathias Oben
    Rainer Malaka
    Proceedings of CHI 2017 (SIGCHI Conference on Human Factors in Computing Systems), ACM, pp. 716-721
    Preview abstract We introduce WatchThru, an interactive method for extended wrist-worn display on commercially available smartwatches. To address the limited visual and interaction space, WatchThru expands the device into 3D through a transparent display. This enables novel interactions that leverage and extend smartwatch glanceability. We describe three novel interaction techniques, Pop-up Visuals, Second Perspective and Peek-through, and discuss how they can complement interaction on current devices. We also describe two types of prototypes that helped us to explore standalone interactions, as well as proof-of-concept AR interfaces using our platform. View details
    SmartSleeve: Real-time Sensing of Surface and Deformation Gestures on Flexible, Interactive Textiles, using a Hybrid Gesture Detection Pipeline
    Patrick Parzer
    Adwait Sharma
    Anita Vogl
    Jürgen Steimle
    Michael Haller
    Proceedings of UIST 2017 (ACM Symposium on User Interface Software and Technology), ACM, pp. 565-577
    Preview abstract Over the last decades, there have been numerous efforts in wearable computing research to enable interactive textiles. Most work focuses, however, on integrating sensors for planar touch gestures, and thus does not fully take advantage of the flexible, deformable and tangible material properties of textile. In this work, we introduce SmartSleeve, a deformable textile sensor, which can sense both surface and deformation gestures in real-time. It expands the gesture vocabulary with a range of expressive interaction techniques, and we explore new opportunities using advanced deformation gestures, such as Twirl, Twist, Fold, Push and Stretch. We describe our sensor design, hardware implementation and its novel non-rigid connector architecture. We provide a detailed description of our hybrid gesture detection pipeline that uses learning-based algorithms and heuristics to enable real-time gesture detection and tracking. Its modular architecture allows us to derive new gestures through the combination with continuous properties like pressure, location, and direction. Finally, we report on the promising results from our evaluations which demonstrate real-time classification. View details
    proCover: Sensory Augmentation of Prosthetic Limbs Using Smart Textile Covers
    Joanne Leong
    Patrick Parzer
    Florian Perteneder
    Teo Babic
    Christian Rendl
    Anita Vogl
    Hubert Egger
    Michael Haller
    Proceedings of UIST 2016 (ACM Symposium on User Interface Software and Technology), ACM, pp. 335-346
    Preview abstract Today’s commercially available prosthetic limbs lack tactile sensation and feedback. Recent research in this domain focuses on sensor technologies designed to be directly embedded into future prostheses. We present a novel concept and prototype of a prosthetic-sensing wearable that offers a noninvasive, self-applicable and customizable approach for the sensory augmentation of present-day and future low to midrange priced lower-limb prosthetics. From consultation with eight lower-limb amputees, we investigated the design space for prosthetic sensing wearables and developed novel interaction methods for dynamic, user-driven creation and mapping of sensing regions on the foot to wearable haptic feedback actuators. Based on a pilot-study with amputees, we assessed the utility of our design in scenarios brought up by the amputees and we summarize our findings to establish future directions for research into using smart textiles for the sensory enhancement of prosthetic limbs. View details
    SpecTrans: Versatile Material Classification for Interaction with Textureless, Specular and Transparent Surfaces
    Munehiko Sato
    Shigeo Yoshida
    Boxin Shi
    Atsushi Hiyama
    Tomohiro Tanikawa
    Michitaka Hirose
    Ramesh Raskar
    Proceedings of CHI 2015 (SIGCHI Conference on Human Factors in Computing Systems), ACM, pp. 2191-2200
    Preview abstract Surface and object recognition is of significant importance in ubiquitous and wearable computing. While various techniques exist to infer context from material properties and appearance, they are typically neither designed for real-time applications nor for optically complex surfaces that may be specular, textureless, and even transparent. These materials are, however, becoming increasingly relevant in HCI for transparent displays, interactive surfaces, and ubiquitous computing. We present SpecTrans, a new sensing technology for surface classification of exotic materials, such as glass, transparent plastic, and metal. The proposed technique extracts optical features by employing laser and multi-directional, multispectral LED illumination that leverages the material’s optical properties. The sensor hardware is small in size, and the proposed classification method requires significantly lower computational cost than conventional image-based methods, which use texture features or reflectance analysis, thereby providing real-time performance for ubiquitous computing. Our evaluation of the sensing technique for nine different transparent materials, including air, shows a promising recognition rate of 99.0%. We demonstrate a variety of possible applications using SpecTrans’ capabilities. View details
    Shape Displays: Spatial Interaction with Dynamic Physical Form
    Daniel Leithinger
    Sean Follmer
    Hiroshi Ishii
    IEEE Computer Graphics and Applications, vol. 35 (2015)
    Preview abstract Shape displays are a new class of I/O devices that dynamically render physical shape and geometry. They allow multiple users to experience information through touch and deformation of their surface topology. The rendered shapes can react to user input or continuously update their properties based on an underlying simulation. Shape displays can be used by industrial designers to quickly render physical CAD models before 3D printing, urban planners to physically visualize a site, medical experts to tactually explore volumetric data sets, or students to learn and understand parametric equations. Previous work on shape displays has mostly focused on physical rendering of digital content to overcome the limitations of single-point haptic interfaces—examples include the Feelex and Lumen projects. In our research, we emphasize the use of shape displays for designing new interaction techniques that leverage tactile spatial qualities to guide users. For this purpose, we designed, developed, and engineered three shape display systems that integrate physical rendering, synchronized visual display, shape sensing, spatial tracking, and object manipulation. This enabling technology has allowed us to contribute numerous interaction techniques for virtual, physical, and augmented reality, in collocated settings as well as for remote collaboration. Our systems are based on arrays of motorized pins, which extend from a tabletop to form 2.5D shapes: Relief consists of 120 pins in a circular tabletop, a platform later augmented with spatial graphics for the Sublimate system. Our next-generation platform, inFORM renders higher resolution shapes through 900 pins (see Figure 1). The Transform system consists of 1,152 pins embedded into the surface of domestic furniture. To capture objects and gestures and to control visual appearance, we augment the shape displays with overhead depth-sensing cameras and projectors. 
In this article, we wish to introduce readers to some of the exciting interaction possibilities that shape displays enable beyond those found in traditional 3D displays or haptic interfaces. We describe new means for physically displaying 3D graphics, interaction techniques that leverage physical touch, enhanced collaboration through physical telepresence and unique applications of shape displays. Our current shape displays are based on prototype hardware that enabled us to design, develop, and explore a range of novel interaction techniques. Although the general applicability of these prototypes is limited by resolution, mechanical complexity, and cost, we believe that many of the techniques we introduce can be transferred to a range of special-purpose scenarios that have different sensing and actuation needs, potentially even using a completely different technical approach. We thus hope that our work will inspire future researchers to start considering dynamic physical form as an interesting approach to enable new capabilities and expressiveness beyond today’s flat displays. View details
    Physical Telepresence: Shape Capture and Display for Embodied, Computer-mediated Remote Collaboration
    Daniel Leithinger
    Sean Follmer
    Hiroshi Ishii
    Proceedings of UIST 2014 (ACM Symposium on User Interface Software and Technology), ACM, pp. 461-470
    Preview abstract We propose a new approach to Physical Telepresence, based on shared workspaces with the ability to capture and remotely render the shapes of people and objects. In this paper, we describe the concept of shape transmission, and propose interaction techniques to manipulate remote physical objects and physical renderings of shared digital content. We investigate how the representation of users' body parts can be altered to amplify their capabilities for teleoperation. We also describe the details of building and testing prototype Physical Telepresence workspaces based on shape displays. A preliminary evaluation shows how users are able to manipulate remote objects, and we report on our observations of several different manipulation techniques that highlight the expressive nature of our system. View details
    T(ether): Spatially-Aware Handhelds, Gestures and Proprioception for Multi-User 3D Modeling and Animation
    Dávid Lakatos
    Matthew Blackshaw
    Zachary Barryte
    Ken Perlin
    Hiroshi Ishii
    Proceedings of SUI 2014 (ACM Symposium on Spatial User Interaction), ACM, pp. 90-93
    Preview abstract T(ether) is a spatially-aware display system for multi-user, collaborative manipulation and animation of virtual 3D objects. The handheld display acts as a window into virtual reality, providing users with a perspective view of 3D data. T(ether) tracks users' heads, hands, fingers and pinching, in addition to a handheld touch screen, to enable rich interaction with the virtual scene. We introduce gestural interaction techniques that exploit proprioception to adapt the UI based on the hand's position above, behind or on the surface of the display. These spatial interactions use a tangible frame of reference to help users manipulate and animate the model in addition to controlling environment properties. We report on initial user observations from an experiment for 3D modeling, which indicate T(ether)'s potential for embodied viewport control and 3D modeling interactions. View details