Ricardo A Garcia
Ricardo Garcia (rago) is a senior software engineer on the Android Audio Framework team. He holds a BSEE from his native Colombia and two master's degrees: one in Music Engineering (University of Miami) and one in Media Arts & Sciences (MIT). He has published work on audio watermarking, psychoacoustic models, genetic programming, and hearing health. He previously worked at Chaoticom on low-bitrate codecs and founded Base 10 Labs to build custom DSP solutions.
Authored Publications
Ecological Auditory Measures for the Next Billion Users
Brian Kemler
Chet Gnegy
Dimitri Kanevsky
Malcolm Slaney
Ear and Hearing (2020)
A range of new technologies have the potential to help people, whether traditionally considered hearing impaired or not. These technologies include more sophisticated personal sound amplification products, as well as real-time speech enhancement and speech recognition. They can improve users' communication abilities, but these new approaches require new ways to describe their success and allow engineers to optimize their properties. Speech recognition systems are often optimized using the word-error rate, but when the results are presented in real time, user interface issues become more important than conventional measures of auditory performance. For example, there is a tradeoff between minimizing recognition time (latency) by quickly displaying results versus disturbing the user's cognitive flow by rewriting the results on the screen when the recognizer later needs to change its decisions. This article describes current, new, and future directions for helping billions of people with their hearing. These new technologies bring auditory assistance to new users, especially to those in areas of the world without access to professional medical expertise. In the short term, audio enhancement technologies in inexpensive mobile forms, devices that are quickly becoming necessary to navigate all aspects of our lives, can bring better audio signals to many people. Alternatively, current speech recognition technology may obviate the need for audio amplification or enhancement at all and could be useful for listeners with normal hearing or with hearing loss. With new and dramatically better technology based on deep neural networks, speech enhancement improves the signal-to-noise ratio, and audio classifiers can recognize sounds in the user's environment. Both use deep neural networks to improve a user's experience.
Longer term, auditory attention decoding is expected to allow our devices to understand where a user is directing their attention and thus allow our devices to respond better to their needs. In all these cases, the technologies turn the hearing assistance problem on its head, and thus require new ways to measure their performance.
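The word-error rate mentioned in the abstract is conventionally computed as the word-level edit distance between the recognizer output and a reference transcript, divided by the reference length. A minimal Python sketch of that metric (an illustration, not code from the article):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1  # substitution cost
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match/substitution
        prev = curr
    return prev[len(hyp)] / len(ref)

# One substituted word in a four-word reference: WER = 0.25
print(word_error_rate("the cat sat down", "the cat sat up"))
```

As the abstract notes, this score says nothing about latency or how often displayed results are rewritten, which is why real-time captioning needs additional measures.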
The New Dynamics Processing Effect in Android Open Source Project
Audio Engineering Society Convention 145 (2018)
The Android “P” Audio Framework’s new Dynamics Processing Effect (DPE) in the Android Open Source Project (AOSP) provides developers with controls to fine-tune the audio experience using several stages of equalization, multi-band compression, and linked limiters. The API allows developers to configure the DPE’s multichannel architecture and exercise real-time control over thousands of audio parameters. This talk additionally discusses the design and use of DPE in the recently announced Sound Amplifier accessibility service for Android and outlines other uses for acoustic compensation and hearing applications.
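To illustrate the kind of per-band dynamics control such a stage performs, the sketch below computes the static gain curve of a generic hard-knee feed-forward compressor. This is a simplified illustration, not the AOSP implementation, and the threshold and ratio values are arbitrary examples:

```python
def compressor_gain_db(level_db: float, threshold_db: float = -24.0,
                       ratio: float = 4.0) -> float:
    """Static gain (dB) applied by a hard-knee downward compressor.

    Below the threshold the signal passes unchanged; above it, each
    dB of input produces only 1/ratio dB of additional output.
    """
    if level_db <= threshold_db:
        return 0.0
    # Output level grows at 1/ratio above the threshold; the applied
    # gain is the (negative) difference from the unprocessed level.
    output_db = threshold_db + (level_db - threshold_db) / ratio
    return output_db - level_db

# A limiter is the limiting case: as ratio -> infinity, output is
# clamped at the threshold.
print(compressor_gain_db(-12.0))  # 12 dB over threshold at 4:1 -> -9.0 dB gain
```

A multi-band compressor applies a curve like this independently per frequency band, which is what lets an accessibility service such as Sound Amplifier boost quiet content without letting loud bands become uncomfortable.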