Tn-Seq Analysis using DESeq2

I have adapted Keith Turner’s Tn-Seq analysis pipeline (https://github.com/khturner/Tn-seq) to run locally on any Mac using DESeq2 and 2023 versions of samtools (v1.16.1) and flexbar (v3.5.0). This document gives a brief overview of the pipeline and explains how to set up and run the scripts.

The DESeq2 analysis of Tn-Seq data is split into two parts:

  • Map reads to genome and count reads per Tn insertion site

    • Find reads with the Tn sequence

    • Clip Tn sequence from the beginning of the reads

    • Map clipped reads to reference genome with Bowtie2

    • Generate a .bam file with samtools for read visualization in IGV

    • Generate a table of reads per Tn insertion site

  • Determine genes with significantly decreased Tn insertions in output relative to input

    • Apply LOESS smoothing to reduce genomic-position bias in sequencing read counts

    • Normalize reads per insertion site across replicates

    • Determine number of reads per gene

    • Calculate read count variance per gene

    • Run a negative binomial test for decreased reads per gene in output vs. input
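
As a rough illustration of the read-processing steps in the first part (find reads with the Tn sequence, clip it, and tally reads per insertion site), here is a minimal Python sketch. It is not the pipeline's code: the pipeline uses flexbar, Bowtie2, and samtools for these steps. The transposon end sequence `TN_END`, the `max_offset` tolerance, and the function names are hypothetical, and the aligned positions are toy values standing in for Bowtie2 output.

```python
from collections import Counter

# Hypothetical transposon end sequence (assumption); substitute the
# sequence that matches your own library.
TN_END = "TGTTA"


def clip_transposon(read, tn_end=TN_END, max_offset=3):
    """Return the genomic part of a read, or None if no Tn end is found.

    The Tn end must start within the first `max_offset` bases of the read,
    which mimics trimming the Tn sequence from the beginning of the read.
    """
    # str.find only reports matches that lie fully inside [0, end)
    pos = read.find(tn_end, 0, max_offset + len(tn_end))
    if pos == -1:
        return None  # read does not contain the Tn sequence; discard it
    return read[pos + len(tn_end):]


def count_sites(alignments):
    """Tally reads per insertion site.

    `alignments` is an iterable of (reference_name, insertion_coordinate)
    tuples, one per mapped read. Deriving the insertion coordinate from a
    real alignment (including strand handling) is left out of this sketch.
    """
    return Counter(alignments)


# Toy example: two reads carry the Tn end, one does not
reads = ["ATGTTAGGCC", "TGTTACCGGA", "GGGGGGGGTGTTAAC"]
clipped = [c for c in (clip_transposon(r) for r in reads) if c is not None]

# Toy aligned positions standing in for the mapping step
site_counts = count_sites([("chr", 100), ("chr", 100), ("chr", 250)])
```

In this toy run, `clipped` holds the two genomic fragments that followed the Tn end, and `site_counts` maps each insertion site to its read count, which is the shape of the reads-per-insertion-site table.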

Input files

1.     Fastq files for Tn-Seq reads

2.     Reference genome (PAO1 is available in Box/Jorth Lab/ref_genome/PAO1)

Output

1.     Table of genes with fold-change (output vs. input), including p-values and FDRs for each gene calculated by the negative binomial test

2.     Table of insertions per gene

3.     Table of reads per insertion

4.     Graph showing fit of dispersion (variance estimates per gene)

5.     Graph showing log2 fold-change (output/input) vs. mean reads/gene

6.     PCA plot showing relatedness of samples

7.     R history (in case you want to make publication-quality graphs in R later)
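
To make the normalization step and the fold-change column of the first output table more concrete, here is a minimal Python sketch. It assumes median-of-ratios size factors, the approach DESeq2 uses by default, followed by a simple log2 ratio with a pseudocount. DESeq2 itself estimates fold-changes from a negative binomial model, so the `log2_fold_change` function is a simplified stand-in, and the function names, sample names, and counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    `counts` maps each sample name to a list of read counts, with features
    (genes or insertion sites) in the same order for every sample.
    """
    samples = list(counts)
    n_features = len(counts[samples[0]])
    # Only features with nonzero counts in every sample have a defined log
    usable = [g for g in range(n_features)
              if all(counts[s][g] > 0 for s in samples)]
    # Log of the geometric mean across samples, per usable feature
    log_geo_mean = {
        g: sum(math.log(counts[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_geo_mean[g]
                           for g in usable))
        for s in samples
    }


def log2_fold_change(counts, factors, inputs, outputs, pseudocount=1.0):
    """Per-feature log2(output / input) on size-factor-normalized means."""
    n_features = len(counts[inputs[0]])

    def group_mean(group, g):
        return sum(counts[s][g] / factors[s] for s in group) / len(group)

    return [
        math.log2((group_mean(outputs, g) + pseudocount)
                  / (group_mean(inputs, g) + pseudocount))
        for g in range(n_features)
    ]


# Toy counts: the output sample was sequenced twice as deeply, and the
# last gene is depleted in the output
counts = {"in1": [100, 200, 50, 80], "out1": [200, 400, 100, 20]}
sf = size_factors(counts)
lfc = log2_fold_change(counts, sf, inputs=["in1"], outputs=["out1"])
```

In this toy data the first three genes come out with a log2 fold-change near zero once sequencing depth is accounted for, while the last gene is strongly negative, which is the pattern expected for a gene with decreased Tn insertions in the output.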