In most cases, you will have to tune OligoArray 2.0 using the options described below. You can display the following help by typing java -jar OligoArray2.jar:
OligoArray2 - Oligonucleotide design for Microarrays
java -jar OligoArray2.jar
java -jar OligoArray2.jar [-h]
java -jar OligoArray2.jar [-i] [-d] [-orRnlLDtTsxpPmNg]
OligoArray2 is a program to design specific oligonucleotide at the genome
scale in order to perform gene expression profiling using microarrays
Command line options are described below.
-i The input file that contains sequences to process. Expected format is FastA. A file name is expected. This option is required
-d The Blast database that will be used to compute oligo's specificity. A database name is expected. This option is required
-o The output file that will contain oligonucleotide data. A file name is expected. Default is 'oligo.txt'
-r The file that will contain sequences for which the design failed. A file name is expected. Default is 'rejected.fas'
-R The log file that will contain information generated during design. A file name is expected. Default is 'OligoArray.log'
-n The maximum number of oligonucleotides expected per input sequence. A positive integer is expected. Default is '1'
-l The minimum oligonucleotide length. An integer comprised between 15 and 75 is expected. (Default is '45')
-L The maximum oligonucleotide length. An integer comprised between 15 and 75 is expected. (Default is '47')
-D The maximum distance accepted between the 5' end of the oligo and the 3' end of the input sequence. A positive integer is expected. Default is '1500'
-t The minimum oligonucleotide Tm. A positive integer below 100 and below the maximum Tm is expected. (Default is '85')
-T The maximum oligonucleotide Tm. A positive integer below 100 and above the minimum Tm is expected. (Default is '90')
-s A temperature to use during secondary structure prediction. An oligo will be rejected if it can fold into a stable secondary structure at this temperature. A positive real is expected. Default is '65.0'
-x A threshold to start to consider putative cross-hybridizations. All targets hybridizing with this oligo with a Tm above this threshold will be reported. A positive integer is expected. Default is '65'
-p The minimum oligonucleotide GC content. A positive real below 100 and below the maximum GC content is expected. Default is '40'
-P The maximum oligonucleotide GC content. A positive real below 100 and above the minimum GC content is expected. Default is '60'
-m A list of prohibited sequences to mask in the input sequence. These sequences will never appear in the oligo sequence. Items are separated by semi-colon in the list: "CCCCC;GGGGG". Default is '""'
-N The number of sequences to process at the same time. Depending on the number of processors and the memory available, you can process up to 3 sequences in parallel per processor. Default is '1'
-g The minimum distance between the 5' ends of two adjacent oligos. If you want to avoid any overlap between oligos, you should use a value bigger than the maximum oligo length. A positive integer is expected. Default is '1.5 * the average oligo size'
At this time, there is no graphical user interface. I just need time ...
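Since everything is driven from the command line, it can help to assemble the arguments programmatically and check the documented ranges before launching a long design. The following is a minimal Python sketch, not part of OligoArray itself. The function name build_command and the file names in the usage note are hypothetical, and it assumes each option is passed as a flag followed by its value separated by a space, which the help text above does not state explicitly.

```python
# Option letters taken from the help text: [-orRnlLDtTsxpPmNg]
ALLOWED_FLAGS = set("orRnlLDtTsxpPmNg")


def build_command(input_fasta, blast_db, options=None, jar="OligoArray2.jar"):
    """Return an argument list for OligoArray 2.0 after basic range checks.

    Assumption: options are passed as "-flag value" pairs.
    """
    opts = dict(options or {})
    for flag in opts:
        if flag not in ALLOWED_FLAGS:
            raise ValueError(f"unknown option -{flag}")

    # Defaults copied from the help text; used here only for validation.
    min_len, max_len = opts.get("l", 45), opts.get("L", 47)
    min_tm, max_tm = opts.get("t", 85), opts.get("T", 90)
    min_gc, max_gc = opts.get("p", 40), opts.get("P", 60)

    if not (15 <= min_len <= max_len <= 75):
        raise ValueError("oligo lengths must satisfy 15 <= -l <= -L <= 75")
    if not (0 < min_tm < max_tm < 100):
        raise ValueError("Tm values must satisfy 0 < -t < -T < 100")
    if not (0 < min_gc < max_gc < 100):
        raise ValueError("GC contents must satisfy 0 < -p < -P < 100")

    # -i and -d are the two required options.
    argv = ["java", "-jar", jar, "-i", input_fasta, "-d", blast_db]
    for flag, value in opts.items():
        argv += ["-" + flag, str(value)]
    return argv
```

For example, build_command("genes.fas", "genes_db", {"l": 45, "L": 47, "n": 2}) returns a list that could be handed to subprocess.run; the checks only mirror the ranges quoted in the help text and do not replace OligoArray's own validation.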
Input Sequence File

This file should contain all the sequences you want to process. A Fasta format is expected: one comment line starting with ">", followed by the sequence, which starts on the next line. There is an important limitation concerning the comment line: for a given sequence, the comment line must be the same in the input file and in the input Blast database. I strongly suggest using a single word, a unique identifier, for this line.

Local Limited Blast Database

Oligonucleotide specificity is computed by searching for similar sequences in a database using the Blast program (Altschul et al. 1997 NAR 25(17):3389-402), available from NCBI. To reduce search time and increase sensitivity, I strongly suggest creating a new local Blast database limited to the sequences of interest. If you want to design oligonucleotides for gene expression studies in your favorite model, you should create a database containing only transcribed sequences from this organism. This is easy to do for an organism with a fully sequenced and annotated genome. Otherwise, you have to include as many known sequences as possible. For more details on how to create this database, please check here.
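The comment-line restriction on the input file is easy to violate by accident, so a quick check before running a design can save time. The sketch below is an illustration only: the function names are hypothetical, and it assumes you still have the Fasta file from which the Blast database was built, since it compares comment lines between two Fasta sources rather than reading the formatted database.

```python
def fasta_headers(lines):
    """Return the comment lines (without the leading '>') of a Fasta source."""
    return [line[1:].strip() for line in lines if line.startswith(">")]


def check_headers(input_lines, db_lines):
    """List (header, problem) pairs for input sequences likely to cause trouble.

    A header is reported when it is not a single word, when it appears more
    than once in the input, or when no identical header exists in the
    database source.
    """
    db_ids = set(fasta_headers(db_lines))
    seen = set()
    problems = []
    for header in fasta_headers(input_lines):
        if len(header.split()) != 1:
            problems.append((header, "not a single word"))
        if header in seen:
            problems.append((header, "duplicate identifier"))
        seen.add(header)
        if header not in db_ids:
            problems.append((header, "no identical comment line in database"))
    return problems
```

Both arguments can be open file objects or lists of strings, so the check can be run as check_headers(open("input.fas"), open("database.fas")) with your own file names.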
Oligonucleotides

The oligonucleotide file contains all data related to the oligos in a tab-delimited format (one oligo per line), ready to import into any spreadsheet program. The first two oligos of the oligo.txt.ref file included in the OligoArray 2.0 package are presented below. In this example, the first oligo is specific to its target, while the second is non-specific.
YAL069W 49 47 -308.65 -374.59 -1014.59 85.38 YAL069W ACCACATGCCATACTCACCCTCACTTGTATACTGATTTTACGTACGC
YAL069W 126 47 -302.33 -366.4 -985.60 87.54 YAL069W; YJR162C (-16.90 -281.20 -781.58 73.24 acttaccctactctcacattccact-----ccatcacccatctctca); YLL065W (-16.90 -281.20 -781.58 73.24 acttaccctactctcacattccact-----ccatcacccatctctca); YFL063W (-19.58 -269.0 -737.57 77.20 -cttaccctactttcacattccact-----ccatggcccatctctca); YKL225W (-18.77 -281.90 -778.13 75.58 acttaccctactctcacattccact-----ccatggcccag-tctca) ACTTACCCTACTCTCAGATTCCACTTCACTCCATGGCCCATCTCTCA
First, the program reports the name of the input sequence (YAL069W), then the position of the 5' end of the oligo on the input sequence (49 nucleotides) and the length of the oligo (47-mer). The next four numbers are the free energy of formation of the dsDNA at 37 °C (-308.65 kcal/mol), the enthalpy (-374.59 kcal/mol), the entropy (-1014.59 cal/mol.K) and the melting temperature of the dsDNA (Tm; 85.38 °C). The next item is the list of targets for the given oligo. If the oligo has a single target (specific), the program reports only the name of the input sequence (YAL069W); otherwise it reports the specific target (YAL069W; ) followed by a list of non-specific targets separated by semicolons. For each non-specific target, it reports the name, followed by the free energy at the temperature used to start considering non-specific hybridization (see option -x), the enthalpy, the entropy, the Tm of the dsDNA and the sequence of this non-specific target (YJR162C (-16.90 -281.20 -781.58 73.24 acttaccctactctcacattccact-----ccatcacccatctctca);). When two or more non-specific targets have identical sequences, all the sequence names are reported before the thermodynamic parameters and the sequence (name 1, name 2, name 3 (free energy enthalpy entropy Tm sequence)). Finally, the program reports the oligonucleotide's sequence (5' to 3', same strand as the input sequence).
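The field layout described above can be read back with a few lines of Python, for instance to keep only the specific oligos. This is a sketch under stated assumptions, not an official parser: the class and function names are hypothetical, and it assumes nine tab-separated fields per line in the order given above, with the whole target list held in a single field.

```python
import re
from dataclasses import dataclass

# One non-specific target: "name 1, name 2 (free_energy enthalpy entropy Tm sequence)"
CROSS_RE = re.compile(
    r"^(?P<names>[^()]+?)\s*\(\s*(?P<dg>\S+)\s+(?P<dh>\S+)\s+(?P<ds>\S+)"
    r"\s+(?P<tm>\S+)\s+(?P<seq>\S+?)\s*\)$"
)


@dataclass
class CrossHyb:
    names: list
    free_energy: float
    enthalpy: float
    entropy: float
    tm: float
    sequence: str


@dataclass
class Oligo:
    name: str
    position: int
    length: int
    free_energy: float
    enthalpy: float
    entropy: float
    tm: float
    targets: list      # names listed without thermodynamic data
    cross_hybs: list   # non-specific targets
    sequence: str

    @property
    def is_specific(self):
        return not self.cross_hybs


def parse_oligo_line(line):
    """Parse one line of the oligonucleotide file (assumed tab-delimited)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    name, pos, length, dg, dh, ds, tm, target_field, seq = fields
    targets, cross_hybs = [], []
    for item in target_field.split(";"):
        item = item.strip()
        if not item:
            continue
        match = CROSS_RE.match(item)
        if match:
            cross_hybs.append(CrossHyb(
                [n.strip() for n in match["names"].split(",")],
                float(match["dg"]), float(match["dh"]),
                float(match["ds"]), float(match["tm"]), match["seq"],
            ))
        else:
            targets.append(item)
    return Oligo(name, int(pos), int(length), float(dg), float(dh),
                 float(ds), float(tm), targets, cross_hybs, seq)
```

Filtering a file then reduces to keeping the lines for which parse_oligo_line(line).is_specific is true.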
Rejected sequences

When OligoArray cannot find an oligo for an input sequence, this input sequence is copied to a new file (RejectedSeq.fas by default). A sequence may be rejected because no oligonucleotide has a Tm inside the Tm range chosen by the user. It can also be rejected because the sequence is full of secondary structure. I use a Fasta format for this file, so you can use it directly to run a new design (remember to rename it before starting a new design, otherwise you may overwrite it!).

Log file