Using Nucleus and TensorFlow for DNA Sequencing Error Correction

Gunjan Baid
Helen Li
(2019)

Abstract

In this post, we formulate DNA sequencing error correction as a multiclass classification problem and propose two deep learning solutions. Our first approach corrects errors in a single read, whereas the second approach, shown in Figure 1, builds a consensus from several reads to predict the correct DNA sequence. Our Colab notebook tutorial implements the second approach using the Nucleus and TensorFlow libraries. Our goal is to show how Nucleus can be used alongside TensorFlow for solving machine learning problems in genomics.
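To make the multiclass framing concrete, here is a minimal sketch of the consensus idea in plain Python. It is not the notebook's implementation and uses neither Nucleus nor TensorFlow. It assumes the reads are already aligned to the same length, treats each position as a classification over a hypothetical class vocabulary (the four bases plus a gap symbol), and substitutes a simple majority vote for the neural network. All names (`CLASSES`, `count_features`, `majority_vote`, `decode`) are illustrative.

```python
from collections import Counter

# Hypothetical class vocabulary: one class per base plus a gap symbol.
CLASSES = ["A", "C", "G", "T", "-"]


def count_features(aligned_reads):
    """Return, for each column, a vector of per-class counts across reads."""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    features = []
    for column in zip(*aligned_reads):
        counts = Counter(column)
        features.append([counts.get(base, 0) for base in CLASSES])
    return features


def majority_vote(features):
    """Baseline stand-in for a model: pick the most frequent class per column."""
    return [max(range(len(CLASSES)), key=lambda i: column[i]) for column in features]


def decode(class_ids):
    """Map predicted class ids back to a sequence, dropping gap predictions."""
    return "".join(CLASSES[i] for i in class_ids if CLASSES[i] != "-")


# Three toy reads of the same region; the second has an error at position 3.
reads = ["ACGTA", "ACGAA", "ACGTA"]
features = count_features(reads)
predicted = majority_vote(features)
consensus = decode(predicted)
```

In a learned version of this setup, the per-position count vectors (or a richer encoding of the reads) would be the model input, and the class ids would be the training labels for a softmax output layer.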