Bioinformatics Research Group

Computer Science, The University of Hong Kong

What Is IDBA-UD?

IDBA-UD is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively, with the cutoff threshold raised from low to high, to reduce errors in both low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.
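To illustrate why a larger k leaves fewer branches in a de Bruijn graph, here is a minimal Python sketch. It is not IDBA-UD's implementation: the toy sequence, the function names and the simplified graph (k-mers as nodes, edges implied by a k-1 overlap, no reverse complements, no sequencing errors) are all assumptions made for the example.

```python
def kmers(sequence, k):
    """Return the set of all k-mers occurring in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def count_branches(sequence, k):
    """Count k-mers that have more than one successor k-mer.

    Nodes are k-mers; b follows a when the last k-1 bases of a
    equal the first k-1 bases of b.
    """
    nodes = kmers(sequence, k)
    branches = 0
    for node in nodes:
        successors = [node[1:] + base for base in "ACGT" if node[1:] + base in nodes]
        if len(successors) > 1:
            branches += 1
    return branches


# Hypothetical toy sequence: "CGTA" occurs twice, followed by different bases.
TOY_SEQUENCE = "TCGTAGACCGTATG"

for k in (3, 4, 5, 6):
    print(k, count_branches(TOY_SEQUENCE, k))
```

In this toy case the branching disappears at k = 6, once the k-1 overlap is longer than the 4-base repeat. With real reads, a large k also loses k-mers in low-depth regions, which is the gap problem that the iteration and local assembly described above are meant to address.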

Current Release

The latest version is available on GitHub.

IDBA 1.1.1

Some bug fixes. Read length is now stored in 16 bits, so all IDBA assemblers can support read lengths up to 65535 after modifying kMaxShortSequence in src/sequence/short_sequence.h.

Download current release

IDBA, IDBA-UD, IDBA-Hybrid and IDBA-Tran all in one package Released Oct 18, 2012

All IDBA (iterative de Bruijn graph assembler) series assemblers are refined and included in this package. Many errors have been fixed, and scaffolding on multiple levels of paired-end reads is supported in IDBA, IDBA-UD and IDBA-Hybrid.

The basic IDBA is included only for comparison.
If you are assembling genomic data without reference, please use IDBA-UD.
If you are assembling genomic data with a similar reference genome, please use IDBA-Hybrid.
If you are assembling transcriptome data, please use IDBA-Tran.

Download release 1.1.0

IDBA-UD package Released Oct 31, 2011

Download release

Please follow the instructions in the README file to run the software.

Data

The references used in the IDBA-UD paper can be downloaded here. Please follow the README file in the package to generate the simulated data.

Experiment Parameters

Here are the parameters for running IDBA-UD on all five data sets presented in the paper.

1. Simulated 10x lacto: (default parameters are used)

2. E.coli Single Cell: --mink 40 --min_count 4 --min_support 2

3. S.aureus Single Cell: --mink 60 --min_count 8 --min_support 4

4. Simulated 3 lacto species: --pre_correction

5. Real human gut microbes: --pre_correction
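The parameter sets above can be turned into full command lines with a small helper. The following is only a sketch: the binary name `idba_ud`, the `-r` (reads) and `-o` (output directory) options, and the file names `reads.fa` and `out_*` are assumptions for illustration, so check the README in the package for the exact usage. Only the per-data-set options are taken from the list above.

```python
import shlex

# Extra options per data set, copied from the parameter list.
DATASET_OPTIONS = {
    "Simulated 10x lacto": [],  # default parameters
    "E.coli Single Cell": ["--mink", "40", "--min_count", "4", "--min_support", "2"],
    "S.aureus Single Cell": ["--mink", "60", "--min_count", "8", "--min_support", "4"],
    "Simulated 3 lacto species": ["--pre_correction"],
    "Real human gut microbes": ["--pre_correction"],
}


def build_command(reads, out_dir, extra_options, binary="idba_ud"):
    """Return the argument list for one assembler run.

    `binary`, `-r` and `-o` are assumed here, not taken from this page.
    """
    return [binary, "-r", reads, "-o", out_dir, *extra_options]


for index, (name, options) in enumerate(DATASET_OPTIONS.items(), start=1):
    # Hypothetical input and output paths, one output directory per data set.
    command = build_command("reads.fa", f"out_{index}", options)
    print(f"{name}: {shlex.join(command)}")
```

Each argument list can be passed to `subprocess.run(command, check=True)` to launch the run; building the command as a list avoids shell quoting problems with file names.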

References

If you use our assembler in your research, please cite our papers.

Peng, Y., et al. (2010) IDBA - A Practical Iterative de Bruijn Graph De Novo Assembler. RECOMB. Lisbon.

Peng, Y., et al. (2012) IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth, Bioinformatics, 28, 1420-1428.

Contact

E-mail: Peng Yu