Automatic Instructional Video Creation from a Markdown-Formatted Tutorial

Peggy Chi; Nathan Frey; Katrina Panovich; Irfan Essa

UIST 2021: ACM Symposium on User Interface Software and Technology (2021)

Abstract

We introduce HowToCut, an automatic approach that converts a Markdown-formatted tutorial into an interactive video that presents the visual instructions with a synthesized voiceover for narration. HowToCut extracts instructional content from a multimedia document that describes a step-by-step procedure. Our method selects text instructions and converts them to a voiceover. It makes automatic editing decisions to align the narration with edited visual assets, including step images, videos, and text overlays. We derive our video editing strategies from an analysis of 125 web tutorials and apply computer vision techniques to the assets. To enable viewers to interactively navigate the tutorial, HowToCut's conversational UI presents instructions in multiple formats in response to user commands. We evaluated our automatically generated video tutorials through user studies (N=20) and validated the video quality via an online survey (N=93). The evaluation shows that our method effectively creates informative and useful instructional videos from a web tutorial document, both for reviewing and for following the instructions.
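
The abstract does not spell out how HowToCut reads its input, so the following is only a minimal sketch of the general idea of extracting steps from a Markdown tutorial and pairing each step's text with its visual assets. It assumes steps are written as a numbered list and images use standard Markdown image syntax. The names `Step`, `extract_steps`, and the `WORDS_PER_MINUTE` speaking rate are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import re
from dataclasses import dataclass, field

# Assumed speaking rate, used only to roughly estimate narration length.
WORDS_PER_MINUTE = 150

# Markdown image syntax: ![alt](url "optional title"); captures the URL.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# Numbered list item such as "1. text" or "2) text".
STEP_PATTERN = re.compile(r"^\s*(\d+)[.)]\s+(.*)$")


@dataclass
class Step:
    """One tutorial step: instruction text plus referenced image paths."""
    number: int
    text: str
    assets: list = field(default_factory=list)

    def narration_seconds(self) -> float:
        """Rough voiceover duration from word count (an assumption)."""
        return len(self.text.split()) / WORDS_PER_MINUTE * 60


def extract_steps(markdown: str) -> list:
    """Split a Markdown tutorial into numbered steps with their images."""
    steps = []
    for line in markdown.splitlines():
        match = STEP_PATTERN.match(line)
        if match:
            body = match.group(2)
            assets = IMAGE_PATTERN.findall(body)
            text = IMAGE_PATTERN.sub("", body).strip()
            steps.append(Step(int(match.group(1)), text, assets))
        elif steps:
            # Continuation line: attach its images and text to the current step.
            steps[-1].assets.extend(IMAGE_PATTERN.findall(line))
            extra = IMAGE_PATTERN.sub("", line).strip()
            if extra:
                steps[-1].text = f"{steps[-1].text} {extra}".strip()
    return steps
```

In a sketch like this, the estimated narration length of each step could serve as the on-screen duration for that step's images, which is one simple way to keep narration and visuals aligned.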

Research Areas

Human-computer interaction and visualization
