MuscleDBs

Revision as of 19:31, 18 July 2019 by Akberdinir@gmail.com (talk | contribs) (Summarized table of the databases)

Introduction

Skeletal muscles have indispensable functions in the human body and possess a prominent regenerative ability. The rapid emergence of Next Generation Sequencing (NGS) data in recent years offers an unprecedented perspective on the gene regulatory networks governing skeletal muscle development and regeneration. However, the data in public NGS databases are often in raw format or have been processed with different procedures, which makes it difficult to use them fully (Yuan et al., 2019) [1]. Here we have collected information about the current databases developed to represent disparate and heterogeneous omics data (with a focus on transcriptomics data) generated for skeletal muscle in different species.


Databases

MuscleDB

The Hughes (UMSL) and Esser (Univ. of Kentucky School of Medicine) labs are assembling a database of muscle tissue gene expression in mice and rats. They profiled global gene expression by RNA sequencing in different muscle tissues, including 8 unique skeletal muscle tissues. In this repository, the authors are developing a web-based platform, built on a Shiny dashboard, to explore, visualize, and share these data. The data set, MuscleDB, reveals extensive transcriptional diversity, with more than 50% of transcripts differentially expressed among skeletal muscle tissues. The developers detected mRNA expression of hundreds of putative myokines that may underlie the endocrine functions of skeletal muscle. They also identified candidate genes that may drive tissue specialization, including Smarca4, Vegfa, and Myostatin (Terry et al., 2018 [2]). This resource allows investigators to pursue applications such as generating muscle-specific Cre-recombinase mouse strains for genetically manipulating specific muscle groups. Most importantly, these data provide the foundation for computational modeling of transcription factor networks, a method the authors believe will uncover the genetic mechanisms that establish and maintain muscle specialization.

GeneXX

GeneXX has been developed as a new web-based resource to facilitate exploration of skeletal muscle gene responses to exercise (Reibe et al., 2018 [3]). Users can enter any human gene of interest (e.g., PPARGC1A) and immediately observe log-fold change values, adjusted P values (q values), and the time point post exercise at which the transcript was measured, with color- and shape-coded symbols indicating statistical significance and sex of participants, respectively. Also included are PubMed scores and a short summary of the gene of interest from the NCBI Gene site. The main feature of GeneXX is that it provides accessible and instant insight into the response of a particular gene of interest to exercise in human skeletal muscle. To demonstrate its utility, the authors carried out a meta-analysis of the included data sets and showed transcript changes in skeletal muscle that persist regardless of sex, exercise mode, and duration, some of which have received minimal attention in the context of exercise. To enable visualization of all data embedded in the new results tables on a single-gene basis, a Shiny web app was created.
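The two quantities GeneXX reports per gene, a log-fold change and an adjusted P value, can be illustrated with a minimal sketch. It assumes a log2 ratio with a pseudocount and the Benjamini-Hochberg procedure for the adjustment. Both are common conventions chosen here for illustration, and the function names are hypothetical; the text above does not say which procedures GeneXX itself applies.

```python
import math


def log2_fold_change(post, pre, pseudocount=1.0):
    """log2 ratio of post-exercise to pre-exercise expression.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((post + pseudocount) / (pre + pseudocount))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Made-up numbers for illustration only.
print(log2_fold_change(post=3.0, pre=1.0))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A gene would then typically be flagged as significant when its q-value falls below a chosen threshold such as 0.05.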

SKmDB

SKmDB is an integrated skeletal muscle NGS database that allows users to explore the overall data organization, to obtain information including gene expression, co-expression subnetworks, the lincRNA catalog, enhancer profiles, and hotspot regions by querying a gene name or a specific region, and to visualize the data of interest. To compile all the available datasets for the field of skeletal muscle, the authors searched for keywords related to skeletal muscle and myogenesis in the Roadmap, ENCODE, and GEO databases and collected 11 types of NGS data (CLIP-seq, miRNA-seq, small RNA-seq, single cell RNA-seq, RNA-seq, ChIP-seq, AIMS-seq, DNase-seq, ATAC-seq, MNase-seq and Bisulfite-seq) corresponding to 16 mouse and 13 human cell types as well as 9 mouse and 20 human tissues (Figure 1). To search for typical enhancers or super enhancers, SKmDB enables querying for gene-associated enhancer regions as well as tissue- or cell type-specific enhancers that fall into the queried genome region.


Figure 1 from (Yuan et al., 2019) [1]. Data overview page of SKmDB.
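The region-based enhancer query described above comes down to an interval-overlap test. The sketch below is a generic illustration and not SKmDB's implementation. The `Region` type, the `overlapping` function, and all coordinates are hypothetical, and intervals are assumed to be 0-based and half-open.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A genomic interval with 0-based start (inclusive) and end (exclusive)."""
    chrom: str
    start: int
    end: int
    label: str


def overlapping(regions: List[Region], chrom: str, start: int, end: int) -> List[Region]:
    """Return the regions on `chrom` that share at least one base with [start, end)."""
    return [r for r in regions
            if r.chrom == chrom and r.start < end and start < r.end]


# Hypothetical enhancer annotations; coordinates are made up for illustration.
enhancers = [
    Region("chr1", 1_000, 2_000, "enhancer_A"),
    Region("chr1", 5_000, 9_000, "enhancer_B"),
    Region("chr2", 1_500, 2_500, "enhancer_C"),
]

hits = overlapping(enhancers, "chr1", 1_800, 5_500)
print([r.label for r in hits])  # ['enhancer_A', 'enhancer_B']
```

A linear scan is enough for a handful of regions; a genome-scale catalog would normally use an interval tree or a sorted index instead.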

MGS resource

The Muscle Gene Sets (MGS) resource is available for download (www.sys-myo.com/muscle_gene_sets) and is accessible through three commonly used functional genomics platforms (GSEA, EnrichR, and WebGestalt). The MGS is a collection of gene sets extracted from expression studies of skeletal muscle cells and tissues, and a smaller number of cardiac studies. These relate to various aspects of muscle molecular physiology and pathology, including myopathies, cardiomyopathies, metabolism, exercise, ageing, development, regeneration, and others. The MGS resource can be used to investigate the behavior of any list of genes across previous comparisons of muscle conditions, to compare previous studies to one another, and to explore the functional relationship of muscle dysregulation to the Gene Ontology. Its major intended use is in enrichment testing for functional genomics analysis (Malatras et al., 2019 [4]).
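Enrichment testing of a gene list against a gene set is often done as an over-representation analysis with a hypergeometric test. The sketch below shows that calculation with the Python standard library only. It illustrates the general idea and is not the statistic used by any of the platforms named above, each of which implements its own method. The function names and gene identifiers are hypothetical placeholders.

```python
from math import comb


def enrichment_p_value(universe_size, set_size, query_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe_size, set_size, query_size)."""
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, min(set_size, query_size) + 1)
    )
    return tail / comb(universe_size, query_size)


def score_gene_set(query, gene_set, universe):
    """Return the shared genes and the over-representation p-value."""
    universe = set(universe)
    query = set(query) & universe        # ignore genes outside the universe
    gene_set = set(gene_set) & universe
    shared = query & gene_set
    p = enrichment_p_value(len(universe), len(gene_set), len(query), len(shared))
    return sorted(shared), p


# Placeholder identifiers, not real gene symbols.
universe = [f"gene_{i}" for i in range(1, 21)]
gene_set = ["gene_1", "gene_2", "gene_3", "gene_4", "gene_5"]
query = ["gene_1", "gene_2", "gene_3", "gene_18"]

shared, p = score_gene_set(query, gene_set, universe)
print(shared, p)
```

When many gene sets are tested at once, as with a collection of this size, the resulting p-values would normally be corrected for multiple testing.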

NeuroMuscleDB

Human Skeletal Muscle Proteome Project

SkeletalVis

Summarized table of the databases

Columns: Database; Short description; Data type; Functionality; Statistics; Current status; Reference

MuscleDB

Short description: MuscleDB is a project that uses unbiased RNA sequencing (RNA-seq) to profile global mRNA expression in a wide array of smooth, cardiac, and skeletal muscle tissues from mice and rats.

Data type: Expression profiling by high-throughput sequencing.

Functionality: Users can filter the database search by:
1. gene symbol (e.g., ‘Per1’);
2. gene ontology (e.g., ‘GTPase activity’);
3. muscle tissue type;
4. expression level;
5. p-value (statistically significant difference between tissues, based on a two-way ANOVA);
6. change in expression, relative to another tissue type.

Users can also select which muscle tissues are of interest. By default, all tissues are checked. At the bottom of the plot options, just below ‘advanced filtering’, are the different ways to display the data. Users can choose to show:
1. plot (default): a bar graph of the expression levels in the tissues (in FPKM, Fragments Per Kilobase per Million reads) for each transcript, with options to save the plots;
2. table: a numeric table with the gene symbols, transcript names, expression levels in the tissues (in FPKM), and the q-value (difference between tissues from a two-way ANOVA);
3. volcano plot: a volcano plot comparing two muscles, showing the logarithm of the q-value versus the logarithm of the fold-change in expression;
4. heat map: a dynamic heat map comparing the expression level of each transcript for each tissue;
5. compare genes: a series of scatter plots comparing the expression levels to a particular reference tissue.

Statistics: 126 samples from 17 mouse tissues (all from males), 2 female rat tissues, and 2 male rat tissues. There are six replicates for each tissue; each replicate is 3 individual samples pooled. Of the mouse tissues, 3 are smooth muscle, 3 are cardiac muscle, and 11 are skeletal muscle. For the male and female rat samples, both tissues are skeletal.

Current status: The beta version is online. The last update was 2 years ago.

Reference: Terry et al., 2018 [2]

GeneXX

Short description: GeneXX is an online tool for the exploration of transcript changes in skeletal muscle associated with exercise.

Data type: Expression profiling by microarray (Illumina, Affymetrix, or Agilent) and high-throughput sequencing (Illumina HiSeq 2000).

Functionality: Users can enter any human gene of interest (e.g., PPARGC1A) and immediately observe log-fold change values, adjusted P values (q values), and the time point post exercise at which the transcript was measured, with color- and shape-coded symbols indicating statistical significance and sex of participants, respectively. Also included are PubMed scores and a short summary of the gene of interest from the NCBI Gene site.

Statistics: In total, the database includes 19 data sets from GEO, which are summarized in Table 1 of the article [3]. Criteria for inclusion were that the data were collected on healthy participants completing an acute bout of endurance or resistance exercise, before or at the end of a chronic training regime, as defined by the conductors of each study. Included are both cross-sectional (exercise vs. sedentary) and within-subject (pre- vs. post-exercise) comparisons analyzing biopsies of the vastus lateralis or biceps brachii (GSE24235 only), in both male and female participants of all ages.

Current status: The current version of the database is online, but the tool is still in its trial phase. The last update was 1 year ago.

Reference: Reibe et al., 2018 [3]

SKmDB

Short description: SKmDB is an integrated database of NGS information in skeletal muscle. SKmDB not only includes all NGS datasets available for human and mouse skeletal muscle tissues and cells, but also provides preliminary data analyses including gene/isoform expression levels, gene co-expression subnetworks, and the assembly of putative lincRNAs, typical and super enhancers, and transcription factor hotspots.

Data type: SKmDB gathers all NGS datasets available for human and mouse skeletal muscle cells and tissues, including CLIP-seq, miRNA-seq, small RNA-seq, single cell RNA-seq, RNA-seq, ChIP-seq, AIMS-seq, DNase-seq, ATAC-seq, MNase-seq and Bisulfite-seq.

Functionality: Users can efficiently search, browse, and visualize the information through the user interface and server side. Track visualization of all the RNA-seq, small RNA-seq, miRNA-seq, ChIP-seq, DNase-seq, ATAC-seq and MNase-seq data on the mm9, mm10, hg19, and hg38 reference genomes is provided through a genome visualization tool called Biodalliance.

Statistics: To compile all the available datasets for the field of skeletal muscle, the authors searched for keywords related to skeletal muscle and myogenesis in the Roadmap, ENCODE, and GEO databases and collected 11 types of NGS data corresponding to 16 mouse and 13 human cell types as well as 9 mouse and 20 human tissues.

Current status: The current version of the database is online, but the server is occasionally unavailable.

Reference: Yuan et al., 2019 [1]

MGS resource

Short description: The MGS resource is a collection of phenotypic-level gene sets in which genes share actual connections in the form of differential expression in the same transcriptomic comparison. The MGS resource can be used to investigate the behaviour of any list of genes across more than 1,100 previous comparisons of muscle conditions, to compare previous studies to one another, and to explore the functional relationship of muscle dysregulation to the Gene Ontology.

Data type: Expression profiling by microarray (Affymetrix).

Functionality: Its major intended use is in enrichment testing for functional genomics analysis, for which purpose it has been made accessible through three commonly used analytical tools (GSEA, EnrichR, and WebGestalt).

Statistics: The current download comprises 1,517 gene sets. Of these, 1,156 were derived from the authors' recent analysis of 302 studies of muscle physiology and disease published from 2005 to the present, including 4,305 separate samples. A further 122 were derived from published in vitro muscle microarray studies carried out from 2005 to the present, as used in their previous work, and 185 were derived from a previous meta-analysis carried out by Jelier et al., 2008 [5]. The remaining 54 are from muscle-related gene ontology terms and from several other muscle-relevant entries in the MSigDB database (mostly comprising muscle-related pathways from the Reactome or Biocarta databases).

Current status: The current version (version 3) was released in March 2019.

Reference: Malatras et al., 2019 [4]

Table 1. Summary of the databases with transcriptomics data generated for skeletal muscle in different species.
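The FPKM unit used in the MuscleDB plots and tables normalizes a fragment count by transcript length (in kilobases) and by sequencing depth (in millions of mapped fragments). A minimal sketch of the standard formula follows. The function name and the example numbers are hypothetical, and the exact quantification pipeline used for MuscleDB is not described here.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    fragments / (length in kb) / (library size in millions)
    = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2,000 bp transcript in a library
# of 25 million mapped fragments.
print(fpkm(500, 2_000, 25_000_000))  # 10.0
```

Because the value is scaled by library size, it supports comparing one transcript across tissues sequenced to different depths, which is how the bar graphs described in the table use it.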

References

  1. Yuan J, Zhou J, Wang H, and Sun H. SKmDB: an integrated database of next generation sequencing information in skeletal muscle. Bioinformatics. 2019 Mar 1;35(5):847-855. DOI:10.1093/bioinformatics/bty705 | PubMed ID:30165538
  2. Terry EE, Zhang X, Hoffmann C, Hughes LD, Lewis SA, Li J, Wallace MJ, Riley LA, Douglas CM, Gutierrez-Monreal MA, Lahens NF, Gong MC, Andrade F, Esser KA, and Hughes ME. Transcriptional profiling reveals extraordinary diversity among skeletal muscle tissues. Elife. 2018 May 29;7. DOI:10.7554/eLife.34613 | PubMed ID:29809149
  3. Reibe S, Hjorth M, Febbraio MA, and Whitham M. GeneXX: an online tool for the exploration of transcript changes in skeletal muscle associated with exercise. Physiol Genomics. 2018 May 1;50(5):376-384. DOI:10.1152/physiolgenomics.00127.2017 | PubMed ID:29547064
  4. Malatras A, Duguez S, and Duddy W. Muscle Gene Sets: a versatile methodological aid to functional genomics in the neuromuscular field. Skelet Muscle. 2019 May 3;9(1):10. DOI:10.1186/s13395-019-0196-z | PubMed ID:31053169