Bioinformatics Research Group

Computer Science, The University of Hong Kong

Releases

IDBA 0.20  Released on Jan 14, 2011

IDBA tool kit 0.20 for 64-bit Linux. Adds a script to run T-IDBA (multiple-k).

Usage for T-IDBA (multiple-k)

script/tidba-multiple-k prefix read-file

Please install cd-hit-est before using this script. The output is written to prefix-cluster.fa, where prefix is the first argument given to the script.
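Because the script depends on cd-hit-est, it can help to confirm that the program is reachable before starting a long run. The following is a minimal sketch, not part of the IDBA package: require_tool is a hypothetical helper name, and it only checks that a command of the given name is on PATH.

```shell
# Hypothetical helper (not shipped with IDBA): succeed only if the
# named program can be found on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: $1 not found on PATH" >&2
    return 1
}

# Intended use (not executed here):
#   require_tool cd-hit-est && script/tidba-multiple-k prefix read-file
```

With this guard, the T-IDBA script is only started when cd-hit-est is installed; otherwise an error message is printed and nothing is run.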

IDBA 0.19  Released on Jan 14, 2011

IDBA tool kit 0.19 for 64-bit Linux. Adds Meta-IDBA (Iterative De Bruijn Graph De Novo Short Read Assembler for Metagenomics based on graph partition) to the toolkit. Please see the Meta-IDBA page for more information.

Usage for Meta-IDBA

Meta-IDBA: Iterative De Bruijn graph short read Assembler for Metagenomics based on graph partition.

metaidba --read read-file [--output out] [options]

Allowed Options:

  -h, --help                   produce help message
  -r, --read arg               read file
  -l, --long arg               long read file
  -o, --output arg (=out)      prefix of output
      --mink arg (=25)         minimum k value
      --maxk arg (=50)         maximum k value
      --minCount arg (=2)      filtering threshold for each k-mer
      --cover arg (=0)         the cutting coverage for contigs
      --connect                use paired-end reads to connect components
      --minPairs arg (=5)      minimum number of pair-end connections to join two components
      --prefixLength arg (=3)  length of the prefix of k-mer used to split k-mer table

The output consensus and alignment can be found in out.consensus and out.alignment.

Example
$ bin/metaidba --read meta-reads.fa --output out --connect

IDBA 0.18  Released on Oct 21, 2010

IDBA tool kit 0.18 for 64-bit Linux. Adds T-IDBA (Iterative De Bruijn Graph De Novo Short Read Assembler for Transcriptome) to the toolkit. Please see the T-IDBA page for more information.

Usage for T-IDBA

T-IDBA: Iterative De Bruijn graph short read Assembler for transcriptome

tidba --read read-file [--output out] [options]

Allowed Options:

-h, --help              produce help message
-r, --read arg          read file
-o, --output arg (=out) prefix of output
--mink arg (=25)        minimum k value
--modk arg (=50)        moderate k value
--maxk arg (=90)        maximum k value
--minCount arg (=2)     filtering threshold for each k-mer
--minPairs arg (=5)     minimum number of pair-end connections to join two contigs
--prefixLength arg (=3) length of the prefix of k-mer used to split k-mer table

The output isoforms can be found in out-isoforms.fa.

Example
$ bin/tidba --read mouse-tran-reads.fa --output mouse

IDBA 0.17  Released on Aug 1, 2010

IDBA tool kit 0.17 for 64-bit Linux. Removes the Boost requirement for installing this software.

Usage for IDBA tool kit for 64-bit Linux
idba --read read-file [--output out] [options]

Allowed options:

-h, --help              produce help message
-r, --read arg          read file
-l, --long arg          long read file
-o, --output arg (=out) prefix of output
--scaffold              use pair end information to merge contigs
--mink arg (=25)        minimum k value
--maxk arg (=50)        maximum k value
--minCount arg (=2)     filtering threshold for each k-mer
--cover arg (=0)        the cutting coverage for contigs
--minPairs arg (=5)     minimum number of pair-end connections to join two contigs
--prefixLength arg (=3) length of the prefix of k-mer used to split k-mer table
      

The output contigs are written to out-contig.fa and out-contig-mate.fa. The former is the single-end output; the latter is the paired-end output. To run the paired-end version, please use the --scaffold option.

Example (using the sample input)
$ bin/idba --read Lacto.reads-mate-30-0.01-75 --output lacto

If the data is in FASTQ format, please convert it to FASTA with the fq2fa tool first.

$ bin/fq2fa fq-file fa-file
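To illustrate what such a conversion involves, here is a rough sketch in awk. It is not the implementation of fq2fa; fastq_to_fasta is a hypothetical name, and the sketch assumes every FASTQ record is exactly four lines (header, sequence, "+" line, qualities).

```shell
# Hypothetical stand-in for illustration only; not the fq2fa source.
# Reads FASTQ on stdin and writes FASTA on stdout.
# Assumption: every record is exactly four lines.
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }  # "@name" becomes ">name"
        NR % 4 == 2 { print }                    # sequence line is kept as-is
    '                                            # "+" and quality lines are dropped
}

# Intended use (not executed here):
#   fastq_to_fasta < fq-file > fa-file
```

FASTQ files with sequences wrapped over several lines would need a real parser; for actual data, the bundled fq2fa tool is the safer choice.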

IDBA (paired-end) requires paired-end reads to be stored in a single file, with the two reads of each pair stored consecutively. If they are not, please use mergeReads to merge the two read files into a single file.

$ bin/mergeReads read-file1 read-file2 merge-read-file
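The layout expected for paired reads can be pictured with a small interleaving sketch. This is not how mergeReads is implemented; interleave_fasta is a hypothetical name, and the sketch assumes both inputs are FASTA files with one header line and one sequence line per read, listing the mates in the same order.

```shell
# Hypothetical illustration of the interleaved layout; not the mergeReads source.
# $1: FASTA file with the first read of each pair
# $2: FASTA file with the second read of each pair, in the same order
# Assumption: each read occupies exactly two lines (header, sequence).
interleave_fasta() {
    awk -v mate="$2" '
        NR % 2 == 1 { header = $0; next }   # remember the header of read 1
        {
            print header                    # read 1: header
            print $0                        # read 1: sequence
            # fetch the matching two-line record from the mate file
            if ((getline mh < mate) > 0 && (getline ms < mate) > 0) {
                print mh
                print ms
            }
        }' "$1"
}

# Intended use (not executed here):
#   interleave_fasta read-file1 read-file2 > merge-read-file
```

In the resulting file each read from the first input is immediately followed by its mate from the second input.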

Reads of the same length are preferred. The normReads tool can normalize reads to a common length. If the read file was produced by mergeReads, please pass the --mate flag when normalizing.

$ bin/normReads read-file output-read-file --length l --mate
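Why a mate flag matters can be seen in a small sketch of one possible normalization policy: trim every read to a target length and drop reads that are too short. Whether normReads follows exactly this policy is an assumption, since its behavior is not described here; normalize_reads is a hypothetical name, and the input is assumed to be two-line FASTA records on stdin.

```shell
# Hypothetical sketch of one normalization policy; not the normReads source.
# $1: target length; $2: "mate" to treat consecutive records as a pair.
# Assumption: reads shorter than the target are dropped, longer ones are trimmed.
normalize_reads() {
    awk -v len="$1" -v mode="${2:-single}" '
        NR % 2 == 1 { h = $0; next }             # header line of the record
        {
            ok  = (length($0) >= len)            # long enough to keep?
            rec = h "\n" substr($0, 1, len)      # header plus trimmed sequence
            if (mode != "mate") { if (ok) print rec; next }
            n++
            if (n % 2 == 1) { first = rec; first_ok = ok }        # first mate: hold
            else if (first_ok && ok) { print first; print rec }   # keep whole pair only
        }'
}

# Intended use (not executed here):
#   normalize_reads 75 mate < read-file > output-read-file
```

In mate mode a pair is written only if both reads pass, so the remaining reads stay correctly paired; dropping a single read from a merged file would shift every later pair.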

Installation Guide

Extract the package, then run configure and make to compile the source code.

$ ./configure
$ make 

Sample Input and Output

Before testing, please read the Installation Guide first.

Reference genome

Simulated reads (30x depth, read length 75, 1% error rate, insert distance 100)

Contigs generated by IDBA (single)

Contigs generated by IDBA (pair)