For each species, the sequence data consist of two components: the FASTA-formatted sequences, and tab-delimited text files containing the GenBank feature annotations. All data are provided in bzip2-compressed (.bz2) format for faster download. To uncompress them on Linux/Unix, use the bunzip2 command.
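The compressed FASTA files can also be read directly, without uncompressing them to disk first. The following is a minimal sketch using Python's standard bz2 module; the function name read_fasta_bz2 is an illustrative assumption and is not part of any PlantGDB tool.

```python
import bz2


def read_fasta_bz2(path):
    """Yield (header, sequence) pairs from a bzip2-compressed FASTA file.

    The header is returned without its leading '>' and the sequence
    lines of each record are joined into one string.
    """
    header, chunks = None, []
    # "rt" opens the compressed file in text mode, decompressing on the fly.
    with bz2.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    # Emit the final record after the end of the file.
    if header is not None:
        yield header, "".join(chunks)
```

For example, with a hypothetical file name, `for header, seq in read_fasta_bz2("species.fasta.bz2"): print(header, len(seq))` would print each record's header and sequence length.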

Plant sequences made available through public repositories comprise the core PlantGDB sequence set. Our sequence-processing pipeline extracts all plant nucleotide sequences from the EST, GSS, STS, TSA, HTG, HTC, and other genomic DNA sequence categories at GenBank. The extracted sequences and their associated annotations are sorted by organism to provide fast, easy access to individual species or to phylogenetically related subsets of organisms. Similarly, plant protein sequences are extracted from UniProt.

Overview of PlantGDB's Version Release Schedule

