The following table shows important quality control metrics for each sample.
Gene Length (bp): The total number of base pairs (bp) in the gene annotation. The longest gene isoform is used for counting; however, the longest isoform is not always the one expressed in the sample.
Total reads: The total number of reads mapping to the gene of interest in the database across all 6 replicates.
Mean Coverage: The mean read depth computed across all gene bases.
Coverage Coefficient of Variation (CV): The ratio of the standard deviation of read depth to the mean read depth, a measure of relative variability of coverage across all gene bases.
Coverage uniformity: The percentage of gene bases with coverage greater than 20% of the mean coverage.
% Bases with coverage > XXXx: The percentage of gene bases with coverage greater than 1000x, 500x, 300x, or 160x. Note that parts of the longest isoform may not be expressed in the sample, thereby falsely lowering these scores. Also check the gene coverage plot.
FPKM: The number of fragments mapping to the gene of interest, per kilobase of gene length, per million mapped fragments.
Gene Length (bp)
% Bases with coverage > 1000x
% Bases with coverage > 500x
% Bases with coverage > 300x
% Bases with coverage > 160x
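The metric definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline that produced this report: the function names are hypothetical, the per-base depth is assumed to be available as a simple array, the standard deviation is assumed to be the population estimator, and the FPKM denominator is assumed to be the total number of mapped fragments in the library.

```python
import numpy as np

def coverage_metrics(depth, thresholds=(1000, 500, 300, 160)):
    """Summarise per-base read depth for one gene (hypothetical helper)."""
    depth = np.asarray(depth, dtype=float)
    mean = depth.mean()
    # Population standard deviation; the estimator used by the report is an assumption.
    cv = depth.std() / mean if mean > 0 else float("nan")
    # Percentage of bases whose depth exceeds 20% of the mean depth.
    uniformity = 100.0 * np.mean(depth > 0.2 * mean)
    # Percentage of bases whose depth exceeds each fixed threshold.
    pct_above = {t: 100.0 * np.mean(depth > t) for t in thresholds}
    return {"mean": mean, "cv": cv, "uniformity": uniformity, "pct_above": pct_above}

def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of gene per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)

# Toy depth profile for a five-base "gene", for illustration only.
metrics = coverage_metrics([0, 100, 200, 400, 1300])
```

Because every metric is computed over all bases of the longest isoform, unexpressed exons contribute zeros to the depth array and pull down the mean, the uniformity, and the threshold percentages.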
Coverage Plot: in vitro GENEX
The following plots show coverage in the SHAPE db-K562 for the gene of interest. A sudden drop in coverage, especially at the 3’ and 5’ UTRs, might indicate expression of a shorter isoform in the sample. Most dips in coverage occur at exon edges.
Mutation Rate Plots: in vitro GENEX
The following plots show mutation rates as line plots across the longest isoform of the gene of interest. Spikes in mutation rate that appear at the same position in both the NAI and DMSO samples might indicate a SNP in the sample relative to the reference genome.
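The idea of looking for spikes shared by the treated and untreated samples can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the two mutation-rate profiles are assumed to be equal-length arrays indexed by gene position, and the 0.1 cutoff is an arbitrary example value, not a threshold used in this report.

```python
import numpy as np

def shared_spikes(dmso_rate, nai_rate, min_rate=0.1):
    """Return 1-based positions whose mutation rate is high in both samples.

    min_rate is an illustrative cutoff, not a value taken from the report.
    """
    dmso = np.asarray(dmso_rate, dtype=float)
    nai = np.asarray(nai_rate, dtype=float)
    if dmso.shape != nai.shape:
        raise ValueError("profiles must cover the same positions")
    # A position is flagged only if the spike is present in both profiles.
    hits = (dmso >= min_rate) & (nai >= min_rate)
    return (np.flatnonzero(hits) + 1).tolist()

# Position 2 is high in both samples; position 3 is high only in NAI.
candidates = shared_spikes([0.001, 0.5, 0.002, 0.003],
                           [0.01, 0.6, 0.2, 0.004])
```

A position that is elevated only in the NAI sample is more consistent with chemical modification than with a sequence variant, which is why the sketch requires the spike in both profiles.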
The following box plot shows mutation rate distributions for DMSO and NAI samples for the gene of interest.
Reactivity Plot: in vitro GENEX
The following plot shows SHAPE reactivity scores across the gene of interest. Higher reactivity scores correlate with a higher likelihood that a nucleotide is unpaired.
Data Files Table: in vitro GENEX
The following table outlines the data files delivered with this SHAPE db gene package.
Reactivity score at each position across the gene with >300x coverage. *
Reactivity values formatted for use as input into the RNAStructure algorithm to guide the fold prediction.
Gene annotation coordinate file used to compute gene metrics. Uses longest collapsed isoform (refflat). *
Gene annotation sequence fasta file to be used as input into the RNAStructure algorithm with the .shape file above for fold prediction. Uses longest collapsed isoform (refflat). *
SHAPE db-K562 in vitro DMSO (untreated control) reads aligned on the gene. *
SHAPE db-K562 in vitro NAI (treated) reads aligned on the gene. *
Each .bam file also has a corresponding .bai index file to facilitate visualization in a genome viewer.
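For readers who want to inspect the reactivity values before folding, a reader for the .shape file might look like the sketch below. It assumes the file follows the common two-column convention for SHAPE constraint files (1-based nucleotide position, then reactivity, with values below -500 such as -999 meaning "no data"); the layout of the delivered file should be checked before relying on this. The function name and the inline example data are hypothetical.

```python
import io
import math

def read_shape(handle, missing_below=-500.0):
    """Parse a two-column .shape file into {position: reactivity}.

    Values below missing_below are treated as missing and stored as NaN.
    """
    reactivities = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pos, value = int(fields[0]), float(fields[1])
        reactivities[pos] = math.nan if value < missing_below else value
    return reactivities

# In-memory stand-in for a real file; the values are made up.
example = io.StringIO("1\t-999\n2\t0.12\n3\t1.85\n")
profile = read_shape(example)
```

With a real file, the same function can be called on an open file handle, for example `with open(path) as fh: profile = read_shape(fh)`.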