Developmental Transcriptomes of S. purpuratus

These RNA-seq data were generated from a comprehensive transcriptome survey of 22 samples, including 10 embryonic stages, 6 feeding larval and metamorphosed juvenile stages, and 6 adult tissues.

In total, 784 million 76 bp paired-end reads were obtained. The reads were mapped onto the S. purpuratus genome v3.0, and new gene models were constructed. These RNA-seq models are identified by the WHL22 prefix. Further details of the analysis are provided on this page.

The analysis was based on S. purpuratus genome v3.0, which differs from the current v3.1 in only a few places because contaminating microbial sequences were removed. The assembled transcript sequences remain the same, although the exon coordinates may differ.

For more detailed information, please see the Publication section.

  • The Search box accepts multiple IDs or names. WHL and SPU IDs can be embedded in other text; the program recognizes them by their pattern. Names must be entered one per line.
  • The expression cluster IDs can be found in this plot. They are [three-digit numbers] in the plot titles. Use the complete IDs like '020', not just '20'.
  • To use the IGV link, make sure you have downloaded the IGV program and it is running on your computer. The link will locate the corresponding gene model in the genome context.
  • mRNA sequences are sequences assembled directly from the RNA-seq data. CDS/protein sequences are predicted.
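
The input rules above (IDs picked out of free text by pattern, names one per line, three-digit cluster IDs) can be illustrated with a minimal Python sketch. It is not the site's actual search code. The ID shapes used here ("WHL22." followed by digits with an optional numeric suffix, and "SPU_" followed by digits) are assumptions, as are the function names and the sample IDs.

```python
import re

# Assumed ID shapes (not specified on this page):
#   WHL22.<digits> with an optional .<digits> suffix, and SPU_<digits>.
ID_PATTERN = re.compile(r"\b(WHL22\.\d+(?:\.\d+)?|SPU_\d+)\b")


def extract_ids(text):
    """Return WHL/SPU IDs found anywhere in free text, in order, without repeats."""
    found = []
    for match in ID_PATTERN.findall(text):
        if match not in found:
            found.append(match)
    return found


def split_names(text):
    """Treat each non-empty line as one gene name."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def normalize_cluster_id(value):
    """Pad a cluster number to the complete three-digit form, e.g. 20 -> '020'."""
    return f"{int(value):03d}"


if __name__ == "__main__":
    # Hypothetical IDs embedded in ordinary text.
    print(extract_ids("compare WHL22.123456 with SPU_000123 in the plot"))
    print(split_names("first name\nsecond name\n"))
    print(normalize_cluster_id("20"))
```

Keeping ID extraction separate from name splitting mirrors the rule that IDs may appear inside other text while names need a line of their own.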

Gene Structure and Raw Reads, Using IGV Genome Browser




IGV (Integrative Genomics Viewer), developed by the Broad Institute, is a high-performance desktop genome browser for interactive exploration of large, integrated datasets. The RNA-seq models assembled in this study, the reads, and related data can be viewed in their genomic context through IGV using the data server described below.

Quick Start:

Download IGV:

  • Go to the IGV homepage, download the program with the maximum-memory option compatible with your computer, and launch IGV. In practice, about 1 GB of memory is needed to load gene models, and about 1.5-2 GB or more to load reads.

Load the S. purpuratus genome:

  • In IGV, open the genomes drop-down menu and select "S.purpuratus (3.0)" to load the scaffolds and GLEAN3 models.

Load datasets:

  • First, change the data server setting: select menu: View -> Preferences -> Advanced, select "Edit server properties", and change the Data Registry URL to this. (Don't change the Genome Server URL.)
  • Then, select menu: File -> Load from server, select the datasets.


  • The locus of interest can be reached by selecting a scaffold from the drop-down menu. You can also type the coordinates, the exact official gene name, or the RNA-seq model ID in the search box.
  • The view can be panned and zoomed by the mouse or keyboard shortcuts.
  • The gene name is the "official" name used in SpBase. Make sure the exact same text is typed in the search box.
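
Navigation to a locus can also be scripted. The sketch below assumes that IGV is running with its command port enabled on the default port 60151, where it accepts `goto` requests over HTTP. The helper name, the scaffold name, and the model ID are hypothetical; the code only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

IGV_PORT = 60151  # IGV's default command port (assumed to be enabled)


def igv_goto_url(locus, host="localhost", port=IGV_PORT):
    """Build a URL that asks a running IGV to jump to a locus.

    `locus` is whatever the IGV search box accepts: coordinates,
    an exact gene name, or an RNA-seq model ID.
    """
    # Keep ':' unescaped so coordinate ranges stay readable in the link.
    query = urlencode({"locus": locus}, safe=":")
    return f"http://{host}:{port}/goto?{query}"


if __name__ == "__main__":
    # Hypothetical scaffold coordinates and model ID.
    print(igv_goto_url("Scaffold1:1000-2000"))
    print(igv_goto_url("WHL22.123456"))
```

Opening such a URL in a browser while IGV is running should move the view to the requested locus. A gene name passed this way still has to match the official name exactly.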