siRNA Design Software Manual

Input

 
mRNA Sequence Name
This is an optional input to identify the input sequence.
 
mRNA Sequence
Select either accession number or nucleotide sequence as the input format. Enter an NCBI accession number if accession number is selected as the input format. For nucleotide sequence, all blanks and non-alphabetic characters will be removed; "U" and "u" are treated as "T".
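
The clean-up of a pasted nucleotide sequence described above can be sketched as follows. This is an illustration only, not the program's actual code: the function name normalize_sequence is hypothetical, and converting the result to upper case is an assumption that the manual does not state.

```python
import re


def normalize_sequence(raw):
    """Sketch of the clean-up rules: keep letters only, then read U/u as T."""
    # Remove blanks, digits, line breaks and any other non-alphabetic characters.
    letters = re.sub(r"[^A-Za-z]", "", raw)
    # Assumption: the sequence is upper-cased; "U" and "u" both become "T".
    return letters.upper().replace("U", "T")
```

For example, normalize_sequence("aug gcc\n123 uAa") gives "ATGGCCTAA". Letters other than A, C, G, T and U are left untouched by this sketch.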
 
Position of start codon
This is the position of the first nucleotide of the translation start codon with respect to the mRNA (see Appendix I). This field is automatically determined if accession number is chosen as the input format.
 
Position of terminating codon
This is the position of the third nucleotide of the translation terminating codon with respect to the mRNA (see Appendix I). This field is automatically determined if accession number is chosen as the input format.
 
Organism
This is the organism from which the input mRNA sequence originates. Currently, only Human and Mouse are supported.
 
Software
There are 12 existing siRNA design tools to choose from. Users can visit each tool via the links provided. The design rules on which each tool is based and its limit on the length of the input sequence are listed for reference.
  • At least one tool must be selected.
  • Some tools invoke BLAST by default and require a longer processing time. A longer default timeout has been set for these tools, but users can change the timeout for each tool individually if needed.
(Remark: Some software mentioned in the paper requires registration and is therefore not included in our software.)
 
 
Filtering of ineffective siRNAs based on secondary structure
Users can apply our filtering program, which is based on the secondary structure of the mRNA, to filter out ineffective siRNAs. The filtering rules are given in our paper. Filtering requires a program that computes the secondary structure. Currently, only mfold is supported.
  • To use mfold, users can optionally provide a URL containing the mfold results (see Appendix II). If no URL is provided, mfold is invoked automatically before filtering, which takes longer (around 30 minutes).
  • Users can also specify a longer timeout if the default timeout is not long enough for their sequence.
 
Requirements of output siRNAs
Only the siRNA candidates satisfying all the user requirements are reported.
  • The number of software tools specified here should not exceed the number of tools selected under Software. This number is also called the popularity of an siRNA candidate.
  • GC content should be within the range 0%-100%.
  • The minimum position from start codon should be within the range (-start codon + 1) to (terminating codon - start codon). For example, if the start codon is at 76 and the terminating codon is at 963, then the minimum position from the start codon should be within the range -75 to 887.
  • The starting nucleotides are the dinucleotide sequence immediately before the target site (such as AA or NA, where "N" refers to any nucleotide). For example, if the target sequence starts at position 76, the two nucleotides at positions 74 and 75 are its starting nucleotides.
  • The default length of the siRNA target site is 21. Other lengths are not currently supported.
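
The position range and the starting nucleotides described in the list above can be sketched as follows. The function names are hypothetical and positions are assumed to be 1-based with respect to the mRNA, as in the examples. The treatment of a target site with fewer than two preceding nucleotides (returning None) is an assumption.

```python
def min_position_range(start_codon, terminating_codon):
    """Allowed range for the minimum position from the start codon."""
    return (-start_codon + 1, terminating_codon - start_codon)


def starting_nucleotides(mrna, target_start):
    """Two nucleotides just before a target site starting at target_start (1-based)."""
    if target_start < 3:
        # Assumption: no starting nucleotides when fewer than two bases precede the site.
        return None
    # Positions target_start-2 and target_start-1 are indices target_start-3 and target_start-2.
    return mrna[target_start - 3:target_start - 1]


def matches_pattern(dinucleotide, pattern):
    """Check a dinucleotide against a pattern such as "AA" or "NA" ("N" = any nucleotide)."""
    return len(dinucleotide) == len(pattern) and all(
        p == "N" or p == d for p, d in zip(pattern, dinucleotide)
    )
```

With the start codon at 76 and the terminating codon at 963, min_position_range(76, 963) gives (-75, 887), which agrees with the example above.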
 
Delivery of output
Users can choose to view the results online or to receive them via email when they are ready.
  • To receive results via email, a valid email address should be provided. The email address is kept confidential and will be discarded once the results have been sent. A link to the result page will be provided in the email. The results will be erased after 48 hours.
  • Results can also be displayed directly after processing if "Online" is selected.

After specifying the user input, click the "Submit" button.

Output

Without filtering

The output consists of 3 parts:



Each table entry consists of a number of attributes:
Target Site
This is the nucleotide sequence of the target site from which sense and antisense siRNAs are derived. For example, if the target site is GGCAACTCCAGTCAGAACA and the preferred overhang is dTdT, then:
Sense siRNA:         5'-GGCAACUCCAGUCAGAAdTdT-3'
Antisense siRNA: 3'-dTdTCCGUUGAGGUCAGUCUU-5'
The last two nucleotides are considered to be the overhang and are italicized for clarity.
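
The example above can be reproduced with the short sketch below. It follows only the convention visible in that example (the last two nucleotides of the target site are replaced by the overhang, and the antisense strand is written 3' to 5'). The function name derive_sirnas is hypothetical, and the design software itself may build the strands differently.

```python
# RNA complement table used for the antisense strand.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def derive_sirnas(target_site, overhang="dTdT"):
    """Return (sense, antisense) strings in the notation used by the example."""
    # Drop the last two nucleotides (the overhang positions) and write T as U.
    core = target_site[:-2].upper().replace("T", "U")
    sense = "5'-" + core + overhang + "-3'"
    # The antisense strand is written 3' to 5', so it lines up base by base with the sense strand.
    antisense = "3'-" + overhang + "".join(COMPLEMENT[base] for base in core) + "-5'"
    return sense, antisense
```

Calling derive_sirnas("GGCAACTCCAGTCAGAACA") returns the two strands shown in the example.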
 
Reported By
This shows the software that reports the siRNA candidate. The entry will be a "Y" if the siRNA candidate is reported by the corresponding software.
 
Popularity
This is the number of software tools that report the siRNA candidate.
 
Weighted Popularity
Some software tools report a large number of candidates while others report only a few, so a higher popularity (in which every tool is weighted equally) does not imply that an siRNA candidate is more likely to be effective. For weighted popularity, each tool is therefore given a weight: 1 if the tool reports fewer candidates than the average over all tools, and 0.5 otherwise. Summing the weights of all tools that report the siRNA candidate gives its weighted popularity. According to the results collected from existing design tools, a candidate with a higher weighted popularity is likely to be more effective.
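
The calculation can be sketched as follows. This is an illustration under stated assumptions: the function name and the input layout (a dictionary mapping each tool name to the set of target sites it reports) are hypothetical, and "average" is taken to mean the mean number of candidates reported per selected tool.

```python
def weighted_popularity(reports):
    """Map each target site to its weighted popularity.

    reports: dict of tool name -> set of target sites reported by that tool.
    """
    counts = {tool: len(sites) for tool, sites in reports.items()}
    # Assumption: the average is the mean number of candidates per tool.
    average = sum(counts.values()) / len(counts)
    # Weight 1 for tools reporting fewer candidates than average, 0.5 otherwise.
    weights = {tool: 1.0 if n < average else 0.5 for tool, n in counts.items()}
    scores = {}
    for tool, sites in reports.items():
        for site in sites:
            scores[site] = scores.get(site, 0.0) + weights[tool]
    return scores
```

For instance, if tool A reports four candidates, tool B one and tool C two, the average is about 2.33, so A has weight 0.5 while B and C have weight 1. A candidate reported by all three has popularity 3 and weighted popularity 2.5.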
 
Position from start codon
This is the position of the first nucleotide of the target site with respect to the start codon.
 
GC content
This is the percentage of G and C nucleotides in the target site, calculated by the following formula:
((# of Gs or Cs on the target site) / length of siRNA target site) * 100
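
The formula translates directly into code. A minimal sketch (the function name gc_content is hypothetical):

```python
def gc_content(target_site):
    """Percentage of G and C nucleotides in the target site."""
    site = target_site.upper()
    gc_count = sum(1 for base in site if base in "GC")
    return gc_count / len(site) * 100
```

For example, a target site of 8 nucleotides containing 4 Gs or Cs, such as GGCCAATT, has a GC content of 50%.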
 
Starting nucleotides
This is the dinucleotide sequence just before the target site.

By default, the candidates are sorted by weighted popularity, then by popularity, starting nucleotides, GC content and, lastly, position from start codon. Users can also sort the siRNA candidates according to their needs by changing the sort order and pressing the "Sort" button.
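
A multi-key sort of this kind can be sketched as follows. The dictionary keys are hypothetical, and the sort directions are assumptions, since the manual does not state them: here the two popularity values are sorted from highest to lowest and the remaining attributes in ascending order.

```python
def default_sort(candidates):
    """Sort candidate records by the default key order (directions are assumed)."""
    return sorted(
        candidates,
        key=lambda c: (
            -c["weighted_popularity"],       # assumed: highest first
            -c["popularity"],                # assumed: highest first
            c["starting_nucleotides"],       # assumed: alphabetical
            c["gc_content"],                 # assumed: ascending
            c["position_from_start_codon"],  # assumed: ascending
        ),
    )
```

Each later key only matters when all earlier keys are equal, which is what "sorted by A, then by B" means.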

With filtering



This is similar to the output without filtering except:

Appendix I

Finding the positions of start and terminating codons from GenBank:

Step 1
Visit the website of GenBank. Select "Nucleotide" and input the accession number or the name of your input mRNA sequence. Then click "Go".
 
Step 2
Click the link to view information about your sequence.
 
Step 3
Scroll down the page to find "CDS". The two numbers (in this example, 119..976) are the positions of the start and terminating codons respectively.
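
If the CDS entry is copied as text, the two positions can be extracted with a short sketch like the one below. The function name parse_cds_range is hypothetical, and only the simple "start..end" form shown in the example is handled. More complex locations are rejected rather than guessed.

```python
import re


def parse_cds_range(location):
    """Split a simple CDS location such as "119..976" into two integers."""
    match = re.fullmatch(r"\s*(\d+)\.\.(\d+)\s*", location)
    if match is None:
        raise ValueError("unsupported CDS location: " + repr(location))
    # First number: position of start codon. Second number: position of terminating codon.
    return int(match.group(1)), int(match.group(2))
```

For the example above, parse_cds_range("119..976") gives (119, 976).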

Appendix II

Getting the URL containing mfold results:

Step 1
Visit the website of RNA mfold. Input the mRNA sequence. Note that the mRNA sequence cannot exceed 6000bp.
 
Step 2
Select the job type. For an immediate job, the mRNA sequence length cannot exceed 800bp. For a batch job, it cannot exceed 6000bp. Provide a valid email address if batch job is selected.
 
Step 3
If batch job is selected, the URL containing mfold results can be found in the email notification.

The folding of myseq is completed.

You may retrieve the results at
http://www.bioinfo.rpi.edu/applications/mfold/old/mfold/5/05Jan11-05-05-43/.
They will be erased in 30 hours.
 
If immediate job is selected, the URL containing mfold results can be found at the bottom of the page.