Developmental Transcriptomes of S. purpuratus


These RNA-seq data were generated from a comprehensive transcriptome survey of 22 samples, including 10 embryonic stages, 6 feeding larval and metamorphosed juvenile stages, and 6 adult tissues.

In total, 784 million 76 bp paired-end reads were obtained. The reads were mapped onto the S. purpuratus genome v3.0, and new gene models were constructed. These RNA-seq models are identified by the WHL22 prefix. Further details of the analysis are provided on this page.

The analysis was based on S. purpuratus genome v3.0, which differs from the current v3.1 in only a few places because contaminating microbial sequences were removed. The assembled transcript sequences remain the same, although exon coordinates may differ.

For more detailed information, please see the Publication section.


1. Sequence and Expression, Using the Query Tool

The Query Tool provides an interface to search the data by:
  • SpBase official gene name (e.g. Tgif),
  • SpBase official gene ID (e.g. SPU_018126),
  • RNA-seq model ID (e.g. WHL22.614286),
  • Function class (2 levels),
  • Expression profile cluster ID (e.g. 020)
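
The identifier examples above follow recognizable patterns, so a search term can be sorted into one of these categories before it is submitted. The following is a minimal Python sketch, not part of the Query Tool itself. The regular expressions are assumptions inferred only from the examples on this page (SPU_018126, WHL22.614286, 020), and the function name `classify_query` is hypothetical.

```python
import re

# Patterns inferred from the examples on this page only (assumptions):
#   SPU_018126   -> "SPU_" followed by six digits
#   WHL22.614286 -> "WHL22." followed by digits
#   020          -> a three-digit cluster ID
PATTERNS = [
    ("spbase_gene_id", re.compile(r"SPU_\d{6}")),
    ("rnaseq_model_id", re.compile(r"WHL22\.\d+")),
    ("cluster_id", re.compile(r"\d{3}")),
]


def classify_query(text: str) -> str:
    """Guess which kind of identifier a search term is.

    Anything that matches none of the ID patterns is treated as a
    gene name (e.g. "Tgif").
    """
    term = text.strip()
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(term):
            return kind
    return "gene_name"


if __name__ == "__main__":
    for term in ["Tgif", "SPU_018126", "WHL22.614286", "020"]:
        print(term, "->", classify_query(term))
```

Function classes are not covered here because they are selected by category rather than typed as an ID.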

It returns a table of the gene models found, with:

  • A brief information table of names, IDs, function classes, expression clusters, and links to SpBase and the IGV genome browser;
  • A table of quantitative data;
  • Expression dynamics in embryonic stages by line plots or heat maps;
  • mRNA sequences, predicted CDS and protein sequences;
  • A downloadable data table.

2. Gene Structure and Raw Reads, Using the IGV Genome Browser

IGV (Integrative Genomics Viewer), developed by the Broad Institute of MIT and Harvard, is a high-performance desktop genome browser for interactive exploration of large, integrated datasets. The RNA-seq models assembled in this study, the mapped reads, and related tracks can be viewed in IGV through the data server described below.

Quick Start:

  • Download IGV:
    • Go to the IGV homepage, download the program with the maximum-memory option compatible with your computer, and launch IGV. In practice, about 1 GB of memory is needed to load gene models, and 1.5-2 GB or more to load reads.
  • Load the S. purpuratus genome:
    • In IGV, open the genomes drop-down menu and select "S.purpuratus (3.0)" to load the scaffolds and GLEAN3 models.
  • Load datasets:
    • First, change the data server setting: select View -> Preferences -> Advanced, select "Edit server properties", and change the Data Registry URL to the URL of this study's data server. Don't change the Genome Server URL.
    • Then, select File -> Load from server and choose the datasets.
  • Navigation:
    • A locus of interest can be reached by selecting its scaffold from the drop-down menu. You can also type coordinates, the exact official gene name, or an RNA-seq model ID in the search box.
    • The view can be panned and zoomed with the mouse or keyboard shortcuts.
    • Gene names are the "official" names used in SpBase. Make sure exactly the same text is typed in the search box.
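
For visiting many loci in turn, the search-box navigation above can also be scripted with IGV's batch commands (`snapshotDirectory`, `goto`, `snapshot`). The following is a minimal Python sketch that only builds the text of such a script. It assumes the genome and datasets have already been loaded as described above, and the helper name `igv_batch_commands` and the output directory are hypothetical.

```python
def igv_batch_commands(loci, snapshot_dir):
    """Build IGV batch-script lines that visit each locus and save an image.

    `loci` may hold anything the IGV search box accepts: coordinates,
    an exact official gene name, or an RNA-seq model ID.
    """
    lines = [f"snapshotDirectory {snapshot_dir}"]
    for locus in loci:
        # Colons in coordinates are awkward in file names, so replace them.
        file_name = locus.replace(":", "_") + ".png"
        lines.append(f"goto {locus}")
        lines.append(f"snapshot {file_name}")
    return lines


if __name__ == "__main__":
    # Example identifiers taken from this page; the directory is a placeholder.
    script = igv_batch_commands(["Tgif", "WHL22.614286"], "igv_snapshots")
    print("\n".join(script))
```

The printed lines can be saved to a text file and run from IGV's Tools -> Run Batch Script menu.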

3. Alternative Data Access

The assembled transcriptome sequences have been submitted to the NCBI Transcriptome Shotgun Assembly Sequence Database under accession numbers JT094275 - JT123346. The complete set can also be retrieved through the NCBI BioProject Database under accession number PRJNA81157.
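
When individual records are needed rather than the whole BioProject, the accession range can be expanded into explicit IDs. The following is a minimal Python sketch, assuming the range JT094275 - JT123346 is contiguous and inclusive (the page does not state this explicitly). The function name `accession_range` is hypothetical.

```python
import re


def accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as JT094275 - JT123346.

    Assumes both ends share the same letter prefix and digit width, and
    that every number in between is a valid accession.
    """
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first, m_last = pattern.fullmatch(first), pattern.fullmatch(last)
    if not m_first or not m_last:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accession prefixes differ")
    if len(m_first.group(2)) != len(m_last.group(2)):
        raise ValueError("accession digit widths differ")

    prefix = m_first.group(1)
    width = len(m_first.group(2))
    start, end = int(m_first.group(2)), int(m_last.group(2))
    # Zero-pad so that e.g. 94275 is written as 094275.
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


if __name__ == "__main__":
    ids = accession_range("JT094275", "JT123346")
    print(len(ids), ids[0], ids[-1])
```

Under the contiguity assumption, the range spans 29,072 accession numbers.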

The sequences and other data are being integrated into SpBase on individual gene pages. A BLAST service to search the assembled mRNA sequences is also provided.

20th Feb 2024
