3D Human Sensing, Action and Emotion Recognition in Robot Assisted Therapy of Children with Autism
Abstract
We introduce new, fine-grained action and emotion
recognition tasks defined on non-staged videos, recorded
during robot-assisted therapy sessions of children with
autism. The tasks present several challenges: a large
dataset with long videos, a large number of highly variable
actions, children who are only partially visible, are of
different ages and may show unpredictable behaviour, as
well as non-standard camera viewpoints. We investigate
how state-of-the-art 3D human pose reconstruction methods
perform on the newly introduced tasks and propose extensions
that adapt them to these challenges. We also
analyze multiple approaches to action and emotion recognition
from 3D human pose data, establish several baselines,
and discuss results and their implications in the context of
child-robot interaction.