eCLIP: enhanced Crosslinking and ImmunoPrecipitation

Robust and reproducible framework to identify RNA binding protein targets

Understanding the importance of RNA binding proteins (RBPs)

RNA molecules serve a variety of essential roles in the cell. These include not only the well-characterized role as an information carrier between the DNA genome and translation at ribosomes, but also non-coding roles in regulating gene expression, telomere maintenance, and a variety of other aspects of cellular physiology (1, 2) (Figure 1).

The processing of and regulation through RNA molecules is tightly controlled by RNA binding proteins (RBPs), which bind to RNAs through recognition of sequence and structural motifs and regulate RNA processing in cell-type, condition-specific, or temporal manners (3).

Recent studies estimate that the human genome encodes over 1,500 RBPs, which play roles throughout the RNA life cycle (3). Disruption of normal RBP activity has been linked to cancer, Amyotrophic Lateral Sclerosis (ALS), and numerous other diseases, and the emergence of genome sequencing techniques will continue to rapidly expand our knowledge of RBPs causally mutated in disease (2).

Figure 1. Central dogma of molecular biology

1. Bandziulis RJ, Swanson MS, Dreyfuss G. RNA-binding proteins as developmental regulators. Genes & Development. 1989;3(4):431-7. PubMed PMID: 2470643.

2. Lukong KE, Chang KW, Khandjian EW, Richard S. RNA-binding proteins in human genetic disease. Trends in Genetics: TIG. 2008;24(8):416-25. doi: 10.1016/j.tig.2008.05.004. PubMed PMID: 18597886.

3. Gerstberger S, Hafner M, Tuschl T. A census of human RNA-binding proteins. Nature Reviews Genetics. 2014;15(12):829-45. doi: 10.1038/nrg3813. PubMed PMID: 25365966.

4. Nussbacher JK, Batra R, Lagier-Tourenne C, Yeo GW. RNA-binding proteins in neurodegeneration: Seq and you shall receive. Trends in Neurosciences. 2015;38(4):226-36. doi: 10.1016/j.tins.2015.02.003. PubMed PMID: 25765321; PubMed Central PMCID: PMC4403644.

eCLIP has revolutionized the ability to identify RBP binding sites 

Non-CLIP methods have generally been limited to identifying targets at the transcript (or large-region) level. Previous CLIP methods suffered from high experimental failure rates and technical challenges that hindered widespread adoption. In particular, inefficient adapter ligation to RNA in previous CLIP approaches required extensive PCR amplification, which produced high PCR duplication rates and a low fraction of ‘usable reads’ (uniquely mapped reads that are not PCR duplicates).
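The 'usable reads' definition above can be illustrated with a minimal Python sketch. It assumes that PCR duplicates are recognized as reads sharing the same mapped position, strand, and UMI; the record fields are hypothetical toy names, not a real BAM schema.

```python
from collections import namedtuple

# Toy alignment record; field names are hypothetical, not a real BAM schema.
Read = namedtuple("Read", "chrom start strand umi uniquely_mapped")

def usable_fraction(reads):
    """Fraction of reads that are uniquely mapped and not PCR duplicates."""
    seen = set()
    usable = 0
    for r in reads:
        if not r.uniquely_mapped:
            continue  # multi-mapped reads are never usable
        key = (r.chrom, r.start, r.strand, r.umi)
        if key in seen:
            continue  # same position, strand, and UMI: treated as a PCR duplicate
        seen.add(key)
        usable += 1
    return usable / len(reads) if reads else 0.0

reads = [
    Read("chr1", 100, "+", "ACGTACGTAC", True),
    Read("chr1", 100, "+", "ACGTACGTAC", True),   # PCR duplicate of the first read
    Read("chr1", 100, "+", "TTGCAAGGCT", True),   # same site, different UMI: kept
    Read("chr2", 500, "-", "ACGTACGTAC", False),  # not uniquely mapped
]
print(usable_fraction(reads))
```

In this toy set, two of the four reads count as usable. A library that needs more PCR cycles tends to accumulate more reads of the duplicate kind, which lowers this fraction.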

To address these limitations, Eclipse Bioinnovations has optimized the eCLIP technology, based on the Van Nostrand, Yeo et al. paper (Nature Methods, 2016), to improve the efficiency of converting immunoprecipitated RNA into high-throughput sequencing libraries.

The eCLIP-seq method

Enhanced crosslinking and immunoprecipitation followed by high-throughput sequencing (eCLIP-seq) was developed to provide a robust and reproducible framework to identify RNA binding protein targets (Figure 2).

Figure 2. Schematic of the eCLIP methodology.

By altering the enzymatic steps performed to convert RNA into double-stranded DNA libraries suitable for high-throughput sequencing, our eCLIP technology achieved a thousand-fold improvement in preparation efficiency. This innovation increased robustness in multiple ways:

    • 3x decrease in unneeded sequencing costs.
    • Increased signal-to-noise in identifying biologically relevant binding sites.
    • Reduced experimental failure rate, enabling rapid scaling to large-scale profiling.

eCLIP Highlights

Figure 3. 102 eCLIP experiments show lower percentage of PCR duplications, relative to published iCLIP and other CLIP datasets

Increased efficiency, decreased PCR duplication

eCLIP-seq builds upon previous CLIP-seq methods to improve library preparation efficiency by nearly one thousand-fold, increasing experimental success rates and decreasing wasted sequencing due to PCR duplication (Figure 3).

Transcriptome-wide identification of RNA targets

eCLIP-seq identifies binding sites throughout RNAs, including intronic and exonic positions, coding sequences, 3′ and 5′ untranslated regions, non-coding RNAs such as lincRNAs and microRNAs, retrotransposons, and other RNA transcripts (Figure 4).

Figure 4. Example binding profiles for 5 RBPs. 

Figure 5. Crosslink-termination analysis on RBFOX2

Single-nucleotide resolution

Reverse transcription often terminates at the protein-RNA crosslink site. By performing adapter ligation at the cDNA step, eCLIP can be used to identify binding sites and binding motifs with single-nucleotide resolution, depending on the target protein (Figure 5).
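As a rough illustration of how read starts can be converted into candidate crosslink positions, the Python sketch below makes two assumptions that are not stated above: reads are in the same orientation as the RNA, and the candidate site is taken as the nucleotide immediately upstream of the read's 5′ end. Coordinates are 0-based and half-open, as in BED files, and the function names are hypothetical.

```python
from collections import Counter

def crosslink_position(start, end, strand):
    """Return the 0-based position one nucleotide upstream of the read's 5' end.

    `start` and `end` are 0-based, half-open alignment coordinates.
    """
    if strand == "+":
        return start - 1  # 5' end is at `start`; upstream is one base to the left
    if strand == "-":
        return end        # 5' end is at `end - 1`; upstream is one base to the right
    raise ValueError(f"unknown strand: {strand!r}")

def count_sites(reads):
    """Count how many reads support each candidate crosslink position."""
    sites = Counter()
    for chrom, start, end, strand in reads:
        sites[(chrom, crosslink_position(start, end, strand), strand)] += 1
    return sites

reads = [
    ("chr1", 100, 150, "+"),
    ("chr1", 100, 140, "+"),  # different 3' end, same 5' end
    ("chr1", 200, 250, "-"),
]
for site, n in sorted(count_sites(reads).items()):
    print(site, n)
```

Positions supported by many independent read starts are the natural candidates for single-nucleotide binding site calls. Whether such a pileup is informative depends on the target protein, as noted above.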

eCLIP Workflow

In eCLIP, RBP-RNA interactions are covalently linked using UV crosslinking of live cells. Cells are then lysed, and RNA is fragmented using limited RNase treatment. A specific RBP (and its bound RNA) is then immunoprecipitated using an antibody that specifically recognizes the targeted RBP. After ligation of a 3′ RNA adapter, the immunoprecipitated material and a paired input sample are run on denaturing protein gels and transferred to nitrocellulose membranes.

A region from the protein size to 75 kDa above is cut from the membrane and treated with Proteinase K to release RNA. After cleanup, RNA is reverse transcribed to ssDNA, after which a second adapter is ligated. PCR amplification is then used to obtain sufficient material for high-throughput sequencing (Figures 6, 7, 8).

Figure 6. eCLIP protocol overview.

eCLIP Data Analysis

Included in standard eCLIP services and eCLIP + Data Analysis kit

  • HTML reports with figures related to significantly enriched peaks
  • FASTQ files of raw sequencing reads
  • Adapter-trimmed FASTQ files of reads with UMI and adapter sequences removed
  • BAM files containing PCR-deduplicated read alignments to the genome
  • bigWig files which can be uploaded to a genome browser to view read density of sample
  • BED file of input-normalized peaks containing the location of each peak in the genome, along with the log2 fold change vs. input and p-value
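As an example of working with the peak file listed above, the following Python sketch keeps only strongly enriched peaks. The column layout (chromosome, start, end, log2 fold change, p-value) and the cutoffs (8-fold enrichment, p ≤ 0.001) are assumptions for illustration only; check the layout of the delivered file and choose thresholds appropriate to your experiment.

```python
import csv
import io

def filter_peaks(handle, min_log2fc=3.0, max_pvalue=1e-3):
    """Return peaks passing both cutoffs.

    Assumed (hypothetical) tab-separated columns:
    chrom, start, end, log2 fold change vs. input, p-value.
    """
    kept = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom, start, end = row[0], int(row[1]), int(row[2])
        log2fc, pvalue = float(row[3]), float(row[4])
        if log2fc >= min_log2fc and pvalue <= max_pvalue:
            kept.append((chrom, start, end, log2fc, pvalue))
    return kept

# Toy data standing in for a real peak file.
example = io.StringIO(
    "chr1\t1000\t1050\t4.2\t1e-6\n"
    "chr1\t2000\t2040\t1.1\t0.2\n"
    "chr2\t500\t560\t3.5\t5e-4\n"
)
peaks = filter_peaks(example)
for peak in peaks:
    print(peak)
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` object.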

Sample HTML Data Analysis Report

Contact us for details about additional analysis options.

Analyzing your own data

Standard analysis of eCLIP requires read adapter trimming, mapping to the appropriate genome, peak identification, and comparison against a paired input, knockout, or other suitable control sample (Figure 9). An example eCLIP analysis pipeline has been described by the ENCODE consortium.
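The comparison against a paired input can be sketched as a library-size-normalized log2 fold enrichment. This is a simplified Python illustration, not the pipeline's actual implementation; the function name and the pseudocount are assumptions, and a real analysis would also attach a significance test.

```python
import math

def log2_fold_enrichment(ip_peak_reads, ip_total_reads,
                         input_peak_reads, input_total_reads,
                         pseudocount=1.0):
    """log2 ratio of the normalized read fraction in a peak, IP vs. input.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when the input has no reads in the peak.
    """
    if ip_total_reads <= 0 or input_total_reads <= 0:
        raise ValueError("total read counts must be positive")
    ip_fraction = (ip_peak_reads + pseudocount) / ip_total_reads
    input_fraction = (input_peak_reads + pseudocount) / input_total_reads
    return math.log2(ip_fraction / input_fraction)

# Toy numbers: 79 IP reads and 9 input reads in the peak, equal library sizes.
value = log2_fold_enrichment(79, 1_000_000, 9, 1_000_000)
print(round(value, 2))
```

Normalizing by total usable reads matters because the IP and input libraries are rarely sequenced to the same depth.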

To analyze your own data, see the Yeo lab pipeline on GitHub: github.com/YeoLab/eclip

Figure 9. eCLIP Data Analysis Pipeline. Eclipse Bioinnovations

eCLIP Experiment

What types of biological samples can be used with eCLIP?

Most biological samples are amenable to eCLIP (cell lines, tissues and model organisms), with optimization required depending on endogenous RNase levels and other biological properties. Please contact us if you have a particular sample type of interest.

What do I need before starting an eCLIP experiment?

Antibody to immunoprecipitate the RNA binding protein of interest

Each eCLIP experiment uses an antibody to immunoprecipitate the RNA binding protein of interest from a biological sample.

Pre-validated antibodies are available for 150 RNA binding proteins

Alternatively, we recommend antibody validation be performed prior to eCLIP using the eCLIP immunoprecipitation validation kit or anti-tag antibodies.

UV Cross-linking

Biosamples should be UV cross-linked according to protocols available for:

Suspension Cells

Adherent Cells

Tissues

Contact us for UV crosslinking protocols for Model Organisms

What are the sample requirements for an eCLIP experiment?

The quantity of starting material required for an eCLIP experiment depends significantly on the abundance of the protein you are studying. We recommend, however, starting with 200 µg of RNA, quantified using the “RNA Fragmentation” guide in the protocol.

What tags can I use for eCLIP?

We have in-house pre-validated antibodies for FLAG, V5, HA and MYC.

Do you usually recommend biological replicates, and if so, how many?

We recommend running at least duplicates for publication purposes; however, triplicates produce more reliable data.

Sequencing

What are the sequencing parameters?

Eclipse Bio’s kit is based on the single-end eCLIP variant described in:
Van Nostrand EL, Nguyen TB, et al. Robust, Cost-Effective Profiling of RNA Binding Protein Targets with Single-end Enhanced Crosslinking and Immunoprecipitation (seCLIP). Methods Mol Biol. 2017;1648:177-200. PMID: 28766298.

Libraries generated using the eCLIP-seq method are typically sequenced using standard SE50 or SE75 conditions on the Illumina HiSeq, NovaSeq, or NextSeq platforms. eCLIP-seq libraries are compatible with paired-end sequencing if desired by the user; however, because typical eCLIP RNA fragments are small (~200 bp), most fragments are fully sequenced in standard single-end formats.

What is the recommended sequencing depth per sample?

Eclipse Bio’s target is 25 million reads per eCLIP-seq dataset.
How deeply to sequence an eCLIP-seq dataset is a challenging balance between cost and sufficient read depth to detect true binding events. An analysis of eCLIP-seq datasets for 150 RNA binding proteins addressed this question experimentally and suggested that for 90% of datasets, saturation of peak information occurred at or below 8.5 million reads (see Supplementary Fig. 11 of Van Nostrand EL, et al. A Large-Scale Binding and Functional Map of Human RNA Binding Proteins. Nature (accepted, in press)). However, we have found that targeting 25 million reads provides better coverage for abundant, broadly binding RNA binding proteins (such as HNRNPs) while still allowing pooling of ~14 eCLIP libraries per standard Illumina HiSeq 4000 lane.
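The pooling arithmetic can be sketched in a few lines of Python. The lane yield of 350 million reads is an assumption inferred from the "~14 libraries at 25 million reads" figure above, not a vendor specification; substitute the yield of your own sequencer and run type.

```python
def libraries_per_lane(lane_reads, target_reads_per_library):
    """Number of libraries that fit in one lane at the target depth."""
    if target_reads_per_library <= 0:
        raise ValueError("target depth must be positive")
    return lane_reads // target_reads_per_library

TARGET_READS = 25_000_000          # target depth stated above
SATURATION_READS = 8_500_000       # depth at which 90% of datasets saturated
ASSUMED_LANE_YIELD = 350_000_000   # assumption: inferred from ~14 x 25 million

print(libraries_per_lane(ASSUMED_LANE_YIELD, TARGET_READS))
print(libraries_per_lane(ASSUMED_LANE_YIELD, SATURATION_READS))
```

The second figure shows how many libraries would fit if each were sequenced only to the 8.5 million read saturation point, which is the cost side of the trade-off described above.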

What are the indices for sequencing Sample-Sheet?

i7 index name    i7 bases on Sample Sheet    i5 index name    i5 bases on Sample Sheet
705              ATTCAGAA                    505              AGGCGAAG
706              GAATTCGT                    506              TAATCTTA
707              CTGAAGCT                    507              CAGGACGT
708              TAATGCGC                    508              GTACTGAC

If I have a project that needs additional index primers what primers are compatible?

We recommend using NEBNext HT (cat. # E7600S) index primers for larger projects. Use 5 µL of both the forward and reverse primer (10 µM each), then scale the total master mix volume up to 50 µL (5 µL Forward primer, 5 µL Reverse primer, 15 µL cDNA, and 25 µL Primer Mix).

What are the adapter sequences?

RNA adapter: /5Phos/rArGrArUrCrGrGrArArGrArGrCrArCrArCrGrUrCrUrG/3SpC3/

ssDNA adapter: /5Phos/NNNNNNNNNNAGATCGGAAGAGCGTCGTGT/3SpC3/
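To show how these adapter sequences relate to read processing, here is a minimal Python sketch. It assumes a single-end read layout in which the 10 random bases (the UMI) are the first 10 bases of each read and the RNA adapter, written as DNA, appears at the 3′ end when the insert is shorter than the read. Only exact adapter matches are handled, and the function names are hypothetical. Dedicated tools such as UMI-tools and cutadapt handle mismatches and quality values and are the better choice for real data.

```python
UMI_LENGTH = 10  # number of N positions in the ssDNA adapter
RNA_ADAPTER_AS_DNA = "AGATCGGAAGAGCACACGTCTG"  # RNA adapter with U written as T

def split_umi(read_seq, umi_length=UMI_LENGTH):
    """Split a read into (UMI, remaining sequence); assumes the UMI comes first."""
    if len(read_seq) < umi_length:
        raise ValueError("read is shorter than the UMI")
    return read_seq[:umi_length], read_seq[umi_length:]

def trim_3p_adapter(seq, adapter=RNA_ADAPTER_AS_DNA, min_match=8):
    """Remove an exact adapter match and everything after it.

    Also removes a partial adapter (at least `min_match` bases) at the read end.
    """
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx]
    longest = min(len(adapter) - 1, len(seq))
    for k in range(longest, min_match - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:-k]
    return seq

# Toy read: UMI + insert + full adapter + trailing bases.
read = "ACGTACGTAC" + "TGCATGCATGGA" + RNA_ADAPTER_AS_DNA + "AAAA"
umi, rest = split_umi(read)
insert = trim_3p_adapter(rest)
print(umi, insert)
```

Keeping the UMI alongside each read (for example, in the read name) is what later allows PCR duplicates to be distinguished from independent fragments that map to the same position.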

What are the index primer sequences?

Index primer sequences: Illumina dual index primers (provided)
505: AGGCGAAG
506: TAATCTTA
507: CAGGACGT
508: GTACTGAC
705: TTCTGAAT
706: ACGAATTC
707: AGCTTCAG
708: GCGCATTA
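Comparing this list with the Sample Sheet table earlier in this section, the i7 bases entered on the Sample Sheet are the reverse complement of the i7 primer sequences, while the i5 bases are identical in both places. The short Python check below confirms that relationship using only the sequences given in this document.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

# i7 sequences as listed for the index primers and for the Sample Sheet.
I7_PRIMER = {"705": "TTCTGAAT", "706": "ACGAATTC", "707": "AGCTTCAG", "708": "GCGCATTA"}
I7_SAMPLE_SHEET = {"705": "ATTCAGAA", "706": "GAATTCGT", "707": "CTGAAGCT", "708": "TAATGCGC"}

for name, primer_seq in I7_PRIMER.items():
    matches = reverse_complement(primer_seq) == I7_SAMPLE_SHEET[name]
    print(name, primer_seq, I7_SAMPLE_SHEET[name], matches)
```

If demultiplexing returns almost no reads for a sample, checking whether an index was entered in the wrong orientation is a reasonable first step.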

Data Analysis

What is included in the m6A Data Analysis report?

The Data Analysis includes:

  • FASTQ files of raw sequencing reads
  • Adapter-trimmed FASTQ files of reads with UMI and adapter sequences removed
  • BAM files containing PCR-deduplicated read alignments to the genome
  • bigWig files which can be uploaded to a genome browser to view read density of sample
  • BED file of input-normalized peaks containing the location of each peak in the genome, along with the log2 fold change vs. input and p-value
  • BED file of PureCLIP crosslink sites
  • HTML reports with figures related to PureCLIP crosslink sites

What is included in the eCLIP Data Analysis report?

The Data Analysis includes:

  • FASTQ files of raw sequencing reads
  • Adapter-trimmed FASTQ files of reads with UMI and adapter sequences removed
  • BAM files containing PCR-deduplicated read alignments to the genome
  • bigWig files which can be uploaded to a genome browser to view read density of sample
  • BED file of input-normalized peaks containing the location of each peak in the genome, along with the log2 fold change vs. input and p-value
  • HTML reports with figures related to significantly enriched peaks

Sample HTML Data Analysis Report

How do I analyze an eCLIP experiment?

Standard analysis of eCLIP requires read adapter trimming, mapping to the appropriate genome, peak identification, and comparison against a paired input, knockout, or other suitable control sample (Figure 9). An example eCLIP analysis pipeline has been described by the ENCODE consortium.

To analyze your own data, see the Yeo lab pipeline on GitHub: github.com/YeoLab/eclip

Services

What is the turnaround time for your services?

Our turnaround time is 6-8 weeks from receiving the samples in-house to sending the final customer report.


CONTACT US

Please contact us to receive a quote for custom services or to ask any questions you may have. A brief description of your desired experiment (number of samples, custom versus commercial antibody) will enable us to better respond to your queries.