Welcome to the homepage for curated loci prime editing (cliPE) method resources!

If you want to learn more about cliPE, here are some links:

Jeff’s MSS24 talk archived on YouTube –> click here
cliPE preprint available now on Arxiv (please note that the supplemental tables are available in the GitHub repo) –> click here
cliPE GitHub repo –> click here
cliPE protocols.io resources –> hopefully coming soon

cliPE companion Shiny apps:

Click here to access the cliPE epegRNA Designer shiny app for designing your cliPE libraries and epegRNA architectures for screening (HUGE shoutout to Nico Bodkin for all his hard work building this Shiny app!)

We now have a development version of the epegRNA Designer app as well; please click here to access it. Nico is continuing to add functionality, such as automatically pulling from the ClinVar database. Please note that this version of the app is in active development and may have bugs yet to be ironed out, so use it at your own discretion!

Click here to access the cliPEr_app1_fasta2csv shiny app for converting the FASTA output from Jellyfish to CSV

Click here to access the cliPEr_app2_kmers2variants shiny app, which uses a dictionary file to annotate the Jellyfish k-mer count file with variant names

Click here to access the cliPEr_app3_random_effects_modeling shiny app for performing final data analysis using random effects modeling of replicate experiments. Note: You can download ‘Book2_e17_3xreps.csv’ or ‘Book2_e17_4xreps.csv’ from the GitHub repo here as example files with the correct formatting for the shiny app.

Some of the shiny apps above require specific input files. Instructions for preparing them follow:

ClinVar missense tsv file

Navigate to https://www.ncbi.nlm.nih.gov/clinvar/ with Chrome or other web browser
Type gene name into search like ‘TSC2’
On the left side of the page, click the box next to ‘Missense’ in the Molecular consequence section
Just beneath ‘Search results’, click ‘Download’ to bring up the dropdown menu and click the ‘Create File’ button
(Optional) Rename the file to something like clinvar_mis_GeneName_DATE.tsv
Upload the tsv to shiny app CliPEpy_1 to design epegRNA libraries

gnomAD missense csv file

Navigate to https://gnomad.broadinstitute.org/ with Chrome or other web browser
Type gene name into search like ‘TSC2’
Scroll down; just above the ‘configure table’ button is a checkbox for ‘Missense/inframe indel’. Click the ‘only’ button to its right
Click the ‘Export variants to CSV’ button
(Optional) Rename the file to something like gnomADmis_GeneName_DATE.csv
Upload the csv to shiny app CliPEpy_1 to design epegRNA libraries

Due to limitations in manuscript formatting, there are a few topics we were unable to cover in the above preprint. Please see below for additional information on cliPE which may be helpful as you design your experiment:

Designing initial set of epegRNA architectures to screen, epegRNA libraries, and nicking gRNAs (cliPE Module 1)

We have provided a Shiny app to streamline prime editing design. In one step, the Shiny app designs epegRNA libraries based on user input and outputs files including candidate epegRNA libraries, archetypal epegRNAs, and nicking gRNAs. At this stage, it is important to consider how many epegRNA libraries will be needed for the eventual cliPE experiment. Each epegRNA library typically targets one 42-45 bp region, which allows editing of up to 15 codons. We aim for 15-30% overall editing efficiency for epegRNA libraries; initially, we observed this in about 50% of archetypal epegRNAs screened for TSC2.6 Ongoing work in our lab suggests that highly efficient epegRNAs may comprise 20-50% of designs. Our recommendation is to screen a minimum of 12 archetypal epegRNAs, which will yield on average 3-6 epegRNA libraries usable for cliPE. It may be desirable to screen more than 12 archetypal epegRNAs upfront to increase the probability of obtaining enough epegRNA designs to proceed with library cloning.
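The screening numbers above can be sanity-checked with a short calculation. This is only a sketch: it assumes each archetypal epegRNA succeeds independently with the same probability, which is a simplification and not something the cliPE method states. The function names are illustrative.

```python
from math import comb


def expected_usable(n_screened: int, success_rate: float) -> float:
    """Expected number of usable designs when each design succeeds independently."""
    return n_screened * success_rate


def prob_at_least(k: int, n_screened: int, success_rate: float) -> float:
    """P(at least k usable designs) under a Binomial(n_screened, success_rate) model."""
    return sum(
        comb(n_screened, i) * success_rate**i * (1 - success_rate) ** (n_screened - i)
        for i in range(k, n_screened + 1)
    )


if __name__ == "__main__":
    # Success rates of 25% and 50% reproduce the 3-6 usable libraries quoted above.
    for rate in (0.20, 0.25, 0.50):
        print(
            f"rate={rate:.2f}: expected={expected_usable(12, rate):.1f}, "
            f"P(>=3 usable)={prob_at_least(3, 12, rate):.2f}"
        )
```

Raising `n_screened` shows how much screening more than 12 designs improves the chance of reaching a target number of libraries when the success rate sits at the low end of the range.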

It is important that the targeted regions also include additional classes of variants that form a truth set, which will be key for validating the MAVE during data analysis. For most MAVEs, two truth sets composed of positive and negative controls are used to assess assay validity; these will be referred to as the assay validation truth set and the clinical truth set (see Table 1). The assay validation truth set consists of (1) synonymous and missense variants found in the general population in databases such as gnomAD (negative controls) and (2) premature termination codon (PTC) variants (positive controls). It is critical to set a threshold for allele count or allele frequency for gnomAD variants to filter out variants with variable expressivity or incomplete penetrance which might confound later analysis; this is context-dependent and will vary somewhat gene-to-gene. The clinical truth set similarly consists of negative and positive controls present in the ClinVar database.1 The negative controls in the clinical truth set are benign or likely benign (BLB) variants and the positive controls are pathogenic or likely pathogenic (PLP) missense variants. Brnich et al. provide guidance on minimal truth set sizes for MAVEs: a minimum of 11 clinical truth set variants divided between BLB and PLP is necessary to achieve moderate evidence strength of benignity or pathogenicity in an ACMG variant classification framework.10 The exact number of variants needed to achieve moderate or greater evidence strength will vary by gene and depend largely, but not entirely, on the dynamic range of the individual MAVE. Our recommendation is to include at least 25-30 clinical truth set variants in a cliPE experiment. We note that some genes lack sufficient clinical truth set variants, and MAVEs targeting these genes may need to rely more heavily on assay validation truth set variants.
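As a rough illustration of the allele count threshold described above, the sketch below keeps only gnomAD variants at or above a minimum allele count as candidate negative controls. The column names (`Protein Consequence`, `VEP Annotation`, `Allele Count`) are assumptions about the exported CSV and should be checked against your own file; the rows are made-up placeholders, and the threshold of 10 is arbitrary, since the appropriate value is gene-dependent.

```python
import csv
import io

# Placeholder rows for illustration only; these are not real gnomAD records.
TOY_EXPORT = """Protein Consequence,VEP Annotation,Allele Count
p.Ala10Val,missense_variant,1
p.Gly20Ser,missense_variant,57
p.Leu30Leu,synonymous_variant,412
"""


def candidate_negative_controls(handle, min_allele_count, count_col="Allele Count"):
    """Return rows whose allele count is at least min_allele_count.

    count_col is an assumed column name; adjust it to match the actual export.
    """
    reader = csv.DictReader(handle)
    return [row for row in reader if int(row[count_col]) >= min_allele_count]


kept = candidate_negative_controls(io.StringIO(TOY_EXPORT), min_allele_count=10)
for row in kept:
    print(row["Protein Consequence"], row["Allele Count"])
```

For a real export, replace the `io.StringIO` object with an open file handle, for example `open("gnomADmis_GeneName_DATE.csv", newline="")`.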

The Shiny app takes basic information as input, such as gene name and RefSeq transcript ID. The user will receive as output a set of documents containing: (1) archetypal epegRNA architectures for screening, (2) oligo libraries to generate epegRNA libraries, and (3) nicking gRNAs. The script will also output sequences for the primers needed to amplify the single-stranded oligo pool into double-stranded DNA for cloning into the destination vector. epegRNA designs are based on DeepPrime predictions of optimal prime editing designs.11 Alternative resources for designing epegRNAs are listed in Table 2. Missense variants from ClinVar and gnomAD are used to generate truth set and assay validation variants. Synonymous variants in gnomAD can optionally be used as additional negative control variants in the assay validation truth set. Custom code produces a TGA at each codon to produce PTC variants for assay validation. We provide examples based on TSC2 for all input and output files. Once the designs are complete, primers and oligo pools can be ordered from a preferred vendor such as Twist Bioscience, Agilent, or IDT. We typically order small IDT oPools for cliPE as the turnaround time is relatively fast and does not require waiting on a quote.
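The idea of placing a TGA at each codon can be sketched in a few lines. This is not the code used by the Shiny app; it is a minimal illustration that assumes the input is an in-frame coding sequence, and the function name and toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ptc_variants(cds: str, stop: str = "TGA"):
    """Yield (codon_number, original_codon, mutated_sequence) for each codon.

    Assumes cds is an in-frame coding sequence; codons that are already
    stop codons are skipped because replacing them creates no new PTC.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue
        yield start // 3 + 1, codon, cds[:start] + stop + cds[start + 3:]


# Toy 3-codon sequence (Met-Ala-stop), for illustration only.
for number, codon, mutated in ptc_variants("ATGGCTTGA"):
    print(number, codon, mutated)
```

In practice the same loop would run over only the 42-45 bp window covered by a given epegRNA library rather than a whole coding sequence.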

Primer design: Sanger sequencing or low-depth long-read (LR) sequencing

It is necessary to design primers using Primer3 (https://primer3.ut.ee/) or an equivalent primer design tool to amplify a relatively broad region containing genome editing targets. These primers can be used to amplify these regions for subsequent sequencing to estimate editing rate. It is important to ensure that the primers do not bind too close to the site of editing, particularly for primer designs for Sanger sequencing, due to the extra noise in the first 25-35 bases of sequencing data. This is critical for the archetypal epegRNA screen (Module 2) and for validation of editing with subsequent epegRNAs. An amplicon size of 500-700 bp is optimal for Sanger sequencing, while amplicons of 800-1200 bp are optimal for LR sequencing. We routinely use Plasmidsaurus sequencing services for cost-effective LR sequencing with quick turnaround time, though other preferred vendors may offer similar services. Sanger sequencing reactions are typically less expensive ($4-5 per reaction), while LR sequencing ($15) provides a better estimate of prime editing efficiency.
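The placement rules above can be checked programmatically once primer positions are known. The sketch below is an assumption-laden helper, not part of Primer3 or the cliPE apps: coordinates are 0-based and half-open on the reference, the 20 nt primer length is a placeholder, and the 35-base margin is taken from the upper end of the noisy window mentioned above.

```python
def check_amplicon(amplicon_start, amplicon_end, edit_start, edit_end,
                   primer_len=20, noisy_bases=35, size_range=(500, 700)):
    """Return a list of problems with a sequencing amplicon design.

    amplicon_* and edit_* are 0-based, half-open reference coordinates.
    size_range defaults to the Sanger range; pass (800, 1200) for LR sequencing.
    """
    issues = []
    size = amplicon_end - amplicon_start
    if not size_range[0] <= size <= size_range[1]:
        issues.append(f"amplicon is {size} bp, outside {size_range[0]}-{size_range[1]} bp")
    # Distance from the 3' end of each primer to the nearest edge of the edit window.
    if edit_start - (amplicon_start + primer_len) < noisy_bases:
        issues.append("edit window too close to forward primer")
    if (amplicon_end - primer_len) - edit_end < noisy_bases:
        issues.append("edit window too close to reverse primer")
    return issues


print(check_amplicon(0, 600, 300, 345))   # well-placed 45 bp edit window
print(check_amplicon(0, 600, 40, 85))     # edit window inside the noisy start of the read
```

An empty list means the design passes all three checks under these assumptions.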

Primer design: high-depth amplicon sequencing

It is also necessary to design primers that specifically amplify the region of interest (the region overlapping the RT template of the epegRNA pool). It is critical to constrain the total amplicon size to less than 250 bp to maximize read depth of the target region. Also, as above, it is important to ensure the primers do not overlap the region of interest, so that small insertions and deletions (indels) can be detected. If sequencing with a vendor such as MGH CCIB DNA Core’s Complete Amplicon Sequencing service or GENEWIZ from Azenta Amplicon-EZ, it is important to review the sample submission guidelines specific to the respective service. If sequencing in multiplex at a core facility or outside vendor (Module 6 option B), it is necessary when ordering these primers to append the appropriate Illumina adapters to enable the barcoding in step 6B.5:

Primer 1: adapter + forward target primer (5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG -forward_primer-3’)
Primer 2: adapter + reverse target primer (5’- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG -reverse_primer-3’)
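Prepending the adapters to many primer pairs is easy to script. The sketch below uses the two adapter sequences given above; the function name and the example target primers are placeholders, not real cliPE primers.

```python
FWD_ADAPTER = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_ADAPTER = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"


def add_adapters(forward_primer: str, reverse_primer: str):
    """Return (primer_1, primer_2), each written 5'->3' with its adapter prepended."""
    primers = []
    for adapter, target in ((FWD_ADAPTER, forward_primer), (REV_ADAPTER, reverse_primer)):
        target = target.strip().upper()
        if not target or set(target) - set("ACGT"):
            raise ValueError(f"primer must contain only A, C, G, T: {target!r}")
        primers.append(adapter + target)
    return tuple(primers)


# Placeholder target-specific primers, for illustration only.
primer_1, primer_2 = add_adapters("acgtacgtacgtacgtacgt", "TTGCATTGCATTGCATTGCA")
print(primer_1)
print(primer_2)
```

Both target primers should be entered 5'->3' as designed, so the reverse primer is already the reverse complement of the reference strand.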

A visual guide with additional information about ordering primers and oligo pools is available here via Google Slides: link
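To see why short amplicons matter for paired-end sequencing, the overlap between the two reads of a pair can be estimated from the amplicon length. This sketch assumes 150 bp paired-end reads, the run structure used for the amplicon sequencing protocol on this page, and ignores adapter and quality trimming.

```python
def read_pair_overlap(amplicon_len: int, read_len: int = 150) -> int:
    """Number of amplicon bases covered by both reads of a pair.

    Assumes both reads start at opposite ends of the amplicon and that
    no bases are lost to trimming.
    """
    if amplicon_len <= 0 or read_len <= 0:
        raise ValueError("lengths must be positive")
    return max(0, min(amplicon_len, 2 * read_len - amplicon_len))


for length in (200, 250, 300, 350):
    print(length, read_pair_overlap(length))
```

Amplicons longer than twice the read length leave a gap in the middle that neither read covers, so the target region should sit inside the overlap wherever possible.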

Amplicon sequencing of human genomic DNA (cliPE Module 6)

If there are a small number of libraries to sequence, it can be economically advantageous to submit each library separately to an amplicon sequencing service such as MGH CCIB DNA Core’s Complete Amplicon Sequencing service or GENEWIZ from Azenta Amplicon-EZ rather than pooling indexed libraries for sequencing at a core facility or external sequencing vendor such as Novogene, BGI, etc. It is worth estimating the cost for each option to determine which method will be best. For example, our in-house Illumina MiniSeq runs cost approximately $1200 for 8M reads, sufficient to run up to ~32 libraries with a coverage target of 200,000 reads per sample. Based on current prices, it is more economical to multiplex libraries on the MiniSeq only when we have more than ~24 total libraries. Using an amplicon sequencing service has particular utility for QC of epegRNA plasmid libraries and initial MAVE optimization on a small number of epegRNA libraries. We recommend multiplexing when there are sufficient libraries, which will typically be a final pool of all of the biological replicates of selected and control conditions for multiple epegRNA libraries. As each library only requires a minimum of 200,000 reads, it is cost-effective to run up to hundreds of multiplexed libraries in a single sequencing run.
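The cost comparison above comes down to two small calculations. In this sketch the run cost, read count, and read target are the figures quoted above, while the $50 per-library service price is a hypothetical value chosen only because it makes the break-even point match the ~24 libraries mentioned; check current vendor pricing before relying on it.

```python
import math


def max_libraries(run_reads: int, reads_per_library: int) -> int:
    """Upper bound on libraries per run if reads were split perfectly evenly."""
    return run_reads // reads_per_library


def break_even_libraries(run_cost: float, service_cost_per_library: float) -> int:
    """Smallest number of libraries at which one in-house run costs no more than the service."""
    return math.ceil(run_cost / service_cost_per_library)


RUN_COST = 1200          # in-house MiniSeq run, from the text above
RUN_READS = 8_000_000    # reads per run, from the text above
READ_TARGET = 200_000    # reads per library, from the text above
SERVICE_COST = 50        # hypothetical per-library price, not a vendor quote

print(max_libraries(RUN_READS, READ_TARGET))
print(break_even_libraries(RUN_COST, SERVICE_COST))
```

The even-split ceiling is 40 libraries per run; planning for fewer, such as the ~32 suggested above, leaves headroom for uneven pooling and low-quality reads.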

Amplicon sequencing protocol for library QC or pilot experiments (single PCR step)

Prepare amplicon sequencing library (or libraries) with a single PCR step without barcodes.
NOTE: The protocols here and in step 11 are based around paired-end 150 bp short read sequencing with Illumina chemistry. We provide these protocols as they have been optimized throughout the cliPE method design and testing. It is important to note that other short read technologies exist and are becoming widely available, and that other run structures (e.g., a single-end 200 bp run at higher read depth) may yield similar or identical results. Users are welcome to adapt the amplicon sequencing protocols provided herein to other sequencing platforms such as those produced by Element Biosciences, Ultima Genomics, PacBio, Complete Genomics, and Oxford Nanopore. Many of these platforms have targeted amplicon sequencing workflows which should in most cases be compatible with the cliPE method. However, while we find Oxford Nanopore LR technology to have high utility for streamlined QC of cloned plasmids and libraries and for estimating editing of archetypal epegRNAs, we would not recommend this platform for high-depth amplicon sequencing for enrichment analysis.
CRITICAL: Use step 10 for smaller batch preparation when sequencing one or a small number of libraries. For libraries generated this way, sequencing will need to be performed by certain external vendors such as MGH CCIB DNA Core’s Complete Amplicon Sequencing service or GENEWIZ from Azenta Amplicon-EZ. These vendors offer sufficient read-depth and quick turnaround time, which can be helpful for QC as well as small pilot experiments to test editing efficiency of a single epegRNA or to optimize a selection methodology, for example.

Design primers to amplify the region spanning the reverse transcription template of each epegRNA architecture. See above for advice on designing these primers. NOTE: It is necessary to design primers to amplify each locus targeted for genome editing. Constraining the total amplicon size to be less than 250 bp, though not necessary, is recommended as it maximizes read depth of the target region. This design gives the largest overlapping region for read pairs. Also, as above, it is important to ensure the primers flank but do not directly overlap the region of interest to detect any small insertions and deletions (indels). If sequencing with a vendor such as MGH CCIB DNA Core’s Complete Amplicon Sequencing service or GENEWIZ from Azenta Amplicon-EZ, it is important to review the sample submission guidelines specific to the respective service.
Extract gDNA from pellets of cells collected under selection conditions. Make sure to also extract gDNA from cell pellets without selection as a control.
Set up PCR reaction to generate amplicons for targeted amplicon sequencing. After aliquoting 22 uL of mastermix in each tube of PCR strips, add 20-60 ng template gDNA (3 uL of 6.67-20 ng/uL).

PCR Reaction Mastermix
Reagent                     Amount
2X iProof Mastermix         12.5 uL
ampSeq_FwdPrimer (10 uM)    1.25 uL
ampSeq_RevPrimer (10 uM)    1.25 uL
ddH2O                       To 22 uL total volume
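When preparing many reactions at once, the mastermix table can be scaled with a short helper. The per-reaction volumes below come from the table; the 10% overage for pipetting loss is a common habit and an assumption here, not part of the protocol.

```python
PER_REACTION_UL = {
    "2X iProof Mastermix": 12.5,
    "ampSeq_FwdPrimer (10 uM)": 1.25,
    "ampSeq_RevPrimer (10 uM)": 1.25,
}
MIX_VOLUME_UL = 22.0  # mastermix per tube before adding 3 uL of template gDNA


def scale_mastermix(n_reactions: int, overage: float = 0.10) -> dict:
    """Return reagent volumes (uL) for n_reactions plus a fractional overage."""
    if n_reactions < 1 or overage < 0:
        raise ValueError("need at least one reaction and a non-negative overage")
    scale = n_reactions * (1 + overage)
    water_per_reaction = MIX_VOLUME_UL - sum(PER_REACTION_UL.values())
    recipe = {name: round(volume * scale, 2) for name, volume in PER_REACTION_UL.items()}
    recipe["ddH2O"] = round(water_per_reaction * scale, 2)
    return recipe


for reagent, volume in scale_mastermix(8).items():
    print(f"{reagent}: {volume} uL")
```

Each tube still receives 22 uL of the scaled mix followed by 3 uL of template, giving the 25 uL reaction volume used in the bead cleanup.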

PCR Cycling Conditions
Step                    Temperature    Time
Initial denaturation    98 °C          3 min
Denaturation            98 °C          20 sec
Annealing               60 °C          20 sec
Extension               72 °C          30 sec
(Repeat the denaturation, annealing, and extension steps for a total of 30-35 cycles)
Final extension         72 °C          7 min
Hold                    4 °C           Indefinitely

Use size selection beads to remove primer dimer and prepare libraries for sequencing. We routinely use AMPure XP beads for this, whether the cleaned libraries go into a subsequent PCR or directly to Illumina sequencing.
Allow aliquots of AMPure XP beads to sit at RT for at least 30 min prior to use.
Add 25 uL of DNA grade water to each 25 uL PCR reaction to bring the total volume to 50 uL. Add an equal volume (50 uL) of AMPure XP beads to bring the total volume to 100 uL and mix well by pipetting.
After a 10 min incubation at RT, place on magnet for 5 min.
Remove and discard supernatant, followed by two washes with 100-200 uL of 70% ethanol.
Allow beads to air dry for about 5 min.
Add 21 uL of 1X TE and incubate for 5 min prior to placing tubes back on magnet.
After 5 min of separation, pipette 20 uL of eluted DNA into a fresh tube.
Use Qubit to quantify DNA concentration for each pool of amplicons, dilute to appropriate concentration, and submit for Illumina short-read sequencing.
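Sequencing providers often specify input in molar rather than mass units. The conversion below uses the standard approximation of 660 g/mol per base pair of double-stranded DNA; the 250 bp amplicon length and 4 nM target in the example are placeholders, since the appropriate concentration depends on the vendor's submission guidelines.

```python
def ng_per_ul_to_nm(conc_ng_per_ul: float, amplicon_bp: int, g_per_mol_per_bp: float = 660.0) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to a molar concentration (nM)."""
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * amplicon_bp)


def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float):
    """Return (stock_ul, diluent_ul) needed to reach target_nm in final_ul."""
    if target_nm > stock_nm:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = final_ul * target_nm / stock_nm
    return stock_ul, final_ul - stock_ul


stock = ng_per_ul_to_nm(10.0, 250)          # e.g. a Qubit reading of 10 ng/uL
print(round(stock, 1), "nM")
print(dilution_volumes(stock, 4.0, 30.0))   # uL of library, uL of diluent
```

Remember that the amplicon length should include any adapter sequence appended to the primers.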

A note on resources to learn sufficient command line working knowledge for cliPE:

Our goal in designing the Shiny apps was to keep the barrier to entry for cliPE as low as possible. Still, some basic knowledge of the Unix command line and of running software on Linux operating systems is required. There are many primers that can be completed in 1-2 hours and cover what is needed for Module 7; several are linked here: https://github.com/nuitrcs/bash_hpc_workshops.

If you have any questions or run into issues with any of the shiny apps, please contact jeffrey [dot] calhoun [at] northwestern [dot] edu. Alternatively, reach out to @calhoujd on Twitter/X or @calhoujd.bsky.social on Bluesky.

Shout out section to thank folks:

Carina Biar: Carina worked with me on developing cliPE as an NU undergrad and then as a gap year technician. Much of what you see on this page and in the preprint linked above is due to her hard work. Thank you, Carina!

Nico Bodkin: As mentioned above, Nico did an awesome job designing and implementing the epegRNA Designer companion Shiny app. Check it out via the link above. Also click here to check out Nico’s webpage.

xinkblot: Valerie at xinkblot really knocked the cliPE logo design out of the park! If you want your very own awesome logo, check out the xinkblot Etsy store: here

Funding: A huge thank you to the American Epilepsy Society for the Junior Investigator Award, which funded our TSC2 MAVE as well as cliPE development.
