Tracembler

Tracembler Help

Introduction


Tracembler [1] takes an input sequence (peptide or cDNA), performs iterative BLAST searches against the NCBI Trace Archive, assembles all retrieved sequences using CAP3, and splice-aligns your query sequence to the resulting contig(s) using GeneSeqer. A run can take up to several hours, depending on the parameters and database size, so a link to the results is emailed to you when the job finishes.

Step by Step Help


Step 1: A valid email address is required so that you can be notified when your job completes.

Step 2: Supply the query sequence(s) that will serve as seeds for your assembly. You can either paste the sequence(s) into the text box or upload a local file containing them. Query sequences can be nucleotide or protein and should be in FASTA format. Multiple sequences are allowed. For example, in an inter-species chromosome walk between two genes, two protein sequences are supplied as queries. If multiple sequences are supplied, they must all be of the same type: all protein or all nucleotide.
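The Step 2 input rules (FASTA format, one sequence type per submission) can be checked before submitting. The following is a minimal sketch, not Tracembler's own validation code: the function names are hypothetical, and the type test is a simple heuristic that assumes a sequence made only of the letters A, C, G, T, U and N is nucleotide.

```python
from typing import List, Tuple

# Assumption: a sequence using only these letters is treated as nucleotide.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Return (header, sequence) pairs from FASTA-formatted text."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_type(sequence: str) -> str:
    """Heuristic classification of one sequence as nucleotide or protein."""
    if not sequence:
        raise ValueError("empty sequence")
    return "nucleotide" if set(sequence) <= NUCLEOTIDE_LETTERS else "protein"


def check_queries(text: str) -> str:
    """Return the shared sequence type, or raise if the queries are mixed."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    types = {guess_type(seq) for _, seq in records}
    if len(types) != 1:
        raise ValueError("queries mix protein and nucleotide sequences")
    return types.pop()
```

Calling check_queries on the text you intend to paste returns "protein" or "nucleotide", or raises ValueError when the submission mixes both types. A short peptide composed only of A, C, G, T and N would be misclassified by this heuristic.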

Step 3: Designate an NCBI Trace Archive Database as the target database for your Tracembler job. To learn more about a given species, refer to the NCBI Taxonomy Browser.

Step 4: A BLAST search (blastn for nucleotide queries, tblastn for protein queries) of the sequences supplied in Step 2 is run against the database selected in Step 3. The 'initial BLAST E-value' is the threshold that determines which hits from this search are used to begin the recursive BLAST process.
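The Step 4 logic can be sketched as follows. This is an illustration only, assuming that a hit qualifies when its E-value does not exceed the 'initial BLAST E-value'; the Hit type, the function names and the trace identifiers are hypothetical and are not part of Tracembler.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One BLAST hit: a trace identifier and its E-value (hypothetical type)."""
    trace_id: str
    evalue: float


def choose_program(query_type: str) -> str:
    """Pick the BLAST program for a search against a nucleotide trace database."""
    if query_type == "nucleotide":
        return "blastn"
    if query_type == "protein":
        return "tblastn"
    raise ValueError(f"unknown query type: {query_type!r}")


def select_seed_hits(hits: List[Hit], initial_evalue: float) -> List[Hit]:
    """Keep hits whose E-value does not exceed the initial threshold."""
    return [hit for hit in hits if hit.evalue <= initial_evalue]


# Made-up hits for illustration; lower E-values indicate stronger matches.
example_hits = [Hit("trace_A", 1e-30), Hit("trace_B", 0.5)]
seeds = select_seed_hits(example_hits, initial_evalue=1e-10)
```

With the made-up hits above, only trace_A passes the 1e-10 threshold and would seed the recursive rounds.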

Step 5: Click 'Run Tracembler' to submit your job. You will be sent an email when the Tracembler job is complete. You can also follow the progress of your job at the URL provided on the submission page.

Advanced Parameters


Tracembler

  • Number of Rounds specifies the number of recursive BLAST [1] iterations. One iteration comprises the combined BLAST [2] jobs for all of that round's query sequences. For example, if there are 2 initial query sequences, the first iteration (round 1) includes 2 BLAST jobs. Hits that pass the 'Round BLAST E-value' threshold are used as queries for the next iteration, round 2.
  • Round BLAST E-value specifies the maximum E-value of hits to keep for the next round.
  • Maximum number of BLAST queries specifies the upper limit on the total number of individual BLAST jobs in a Tracembler run. The default value is suitable for most uses. The limit ensures that the Tracembler job completes in a reasonable amount of time and does not overload NCBI's BLAST Service [3].
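How the three parameters above interact can be sketched with a small loop. This is a simplified illustration under stated assumptions, not the Tracembler implementation: the search argument is a hypothetical stand-in for a BLAST job that returns a mapping from hit identifier to E-value, hits already seen are not queried again, and the run stops as soon as the job limit is reached.

```python
from typing import Callable, Dict, List, Set, Tuple


def recursive_blast(
    queries: List[str],
    search: Callable[[str], Dict[str, float]],
    rounds: int,
    round_evalue: float,
    max_queries: int,
) -> Tuple[List[str], int]:
    """Run up to `rounds` iterations; return (retrieved hit ids, jobs run)."""
    seen: Set[str] = set(queries)   # never re-query a known sequence
    retrieved: List[str] = []       # hits kept, in order of discovery
    jobs_run = 0
    current = list(queries)
    for _ in range(rounds):
        next_queries: List[str] = []
        for query in current:
            if jobs_run >= max_queries:
                return retrieved, jobs_run  # job limit reached
            jobs_run += 1
            for hit_id, evalue in search(query).items():
                # Keep a hit only if it passes the round E-value threshold.
                if evalue <= round_evalue and hit_id not in seen:
                    seen.add(hit_id)
                    retrieved.append(hit_id)
                    next_queries.append(hit_id)
        if not next_queries:
            break  # nothing new to extend from
        current = next_queries
    return retrieved, jobs_run


# Toy data standing in for BLAST results (identifiers are made up).
toy_results = {
    "q1": {"t1": 1e-30, "t2": 1e-5},
    "t1": {"t3": 1e-40, "q1": 0.0},
    "t3": {},
}


def toy_search(query: str) -> Dict[str, float]:
    return toy_results.get(query, {})
```

With the toy data, recursive_blast(["q1"], toy_search, rounds=2, round_evalue=1e-20, max_queries=10) runs one job in round 1 and one in round 2. Lowering max_queries to 1 stops the run after the first job, which is the kind of cap the 'Maximum number of BLAST queries' parameter provides.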

BLAST

The following parameters are available for both the Initial BLAST (blastn or tblastn) of your query sequences and the Recursive BLAST (blastn) rounds. You can specify different values for each type of BLAST (Initial or Recursive). Each parameter's possible values are derived from NCBI's Trace Archive BLAST Services [4,5].

  • Filters provide the ability to mask regions of the query sequence. The available options include Low Complexity, Rodents Repeats, and Mask for lookup table only. For further information, please see NCBI's help page.
  • Word-size defines the size of the seed matches used in the BLAST algorithm. For further information, please see NCBI's help page.
  • Percent Identity defines the minimum percent identity required for an alignment to be retained. For further information, please see NCBI's help page.
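To make the Word-size and Percent Identity parameters concrete, here is a toy illustration. It is not NCBI's BLAST code: it assumes that seeds are exact words of the given length shared by query and subject, and that percent identity is the number of identical aligned positions divided by the alignment length, gaps included. Both function names are hypothetical.

```python
from typing import List


def shared_words(query: str, subject: str, word_size: int) -> List[str]:
    """Return the exact words of length `word_size` found in both sequences."""
    query_words = {
        query[i:i + word_size] for i in range(len(query) - word_size + 1)
    }
    subject_words = {
        subject[i:i + word_size] for i in range(len(subject) - word_size + 1)
    }
    return sorted(query_words & subject_words)


def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent of alignment columns with identical residues ('-' marks a gap)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("empty alignment")
    matches = sum(
        a == b and a != "-" for a, b in zip(aligned_query, aligned_subject)
    )
    return 100.0 * matches / len(aligned_query)


def passes_identity(aligned_query: str, aligned_subject: str, minimum: float) -> bool:
    """True if the alignment meets the minimum percent identity."""
    return percent_identity(aligned_query, aligned_subject) >= minimum
```

A larger word size yields fewer seed matches, which makes a search faster but less sensitive. A higher minimum percent identity discards more divergent alignments.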

Spliced-Alignment

  • GenomeThreader [6] species-specific splice site model selects the species model used when constructing the spliced alignment of your query sequence(s) against your new contig(s).

Tracembler Software Availability


A web server is available for general use at PlantGDB. The Tracembler software itself is freely available. It is written mainly in Perl and is designed to run on a standard Linux machine. Instructions for obtaining the required external programs, such as CAP3, are provided with the software.

References


  1. Dong, Q., Wilkerson, M.D. & Brendel, V. (2007) Tracembler - software for in silico chromosome walking in unassembled genomes. BMC Bioinformatics 8, 151. [PubMed ID: 17490482]
  2. McGinnis, S. & Madden, T.L. (2004) BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 32, W20-5.
  3. NCBI QBLAST URL API. http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
  4. http://www.ncbi.nlm.nih.gov/blast/tracemb.shtml
  5. http://www.ncbi.nlm.nih.gov/blast/mmtrace.shtml
  6. Gremme, G., Brendel, V., Sparks, M.E. & Kurtz, S. (2005) Engineering a software tool for gene structure prediction in higher organisms. Information Software Technol. 47, 965-978.

For further questions/assistance, please email Tracembler Help.
