Trimming reports

Trim Galore generates trimming reports in a Cutadapt-compatible format so that existing MultiQC parsers continue to work unchanged. Users migrating from v0.6.x will recognise the structure. v2.x does the work in a single process rather than interleaving output from a Cutadapt subprocess, but the emitted text still carries a "This is cutadapt ... (compatible; for MultiQC backwards compatibility)" banner so that downstream tools parse it correctly.

Reports consist of three sections:

  1. Parameter summary
  2. Cutadapt-compatible trimming summary
  3. Run statistics summary

The parameter summary is written out right at the start of the run. Example (paired-end, default options):

SUMMARISING RUN PARAMETERS
==========================
Input filename: SLX_R1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 2.1.0 (Oxidized Edition)
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 18772). Second best hit was smallRNA (count: 0)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Output file will be GZIP compressed

The parameter summary in v2.x is slightly slimmer than the v0.6.x equivalent: entries like Cutadapt version:, Python version:, the per-core annotation, and the FastQC/clip status lines are not emitted. v2.x does not run Cutadapt or Python as subprocesses, so there are no versions to report, and clipping is applied silently during trimming rather than logged here.
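If you need the parameter summary as key/value pairs, a minimal Python sketch along the following lines may be enough. It assumes every entry has the "Label: value" shape shown in the example above. The function name parse_parameter_summary is hypothetical and not part of Trim Galore. Lines without a value, such as the GZIP note, and the adapter auto-detection line are skipped rather than interpreted.

```python
def parse_parameter_summary(text):
    """Collect 'Label: value' lines from a parameter summary into a dict."""
    params = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if not sep:
            continue  # headings, rulers, and lines that carry no value
        # Skip lines whose first ': ' sits inside an open parenthesis,
        # such as the auto-detection line '... (count: 18772). ...'.
        if label.count("(") != label.count(")"):
            continue
        params[label.strip()] = value.strip()
    return params


# Sample text copied from the paired-end example in this section.
SAMPLE_PARAMS = """\
SUMMARISING RUN PARAMETERS
==========================
Input filename: SLX_R1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 2.1.0 (Oxidized Edition)
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 18772). Second best hit was smallRNA (count: 0)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1
Minimum required adapter overlap (stringency): 1 bp
Output file will be GZIP compressed
"""

params = parse_parameter_summary(SAMPLE_PARAMS)
for label, value in params.items():
    print(f"{label} -> {value}")
```

Values are kept as strings on purpose, because units such as "bp" and annotations such as "(Oxidized Edition)" are part of the text.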

After the parameter block, Trim Galore emits a MultiQC-compatible summary that downstream parsers familiar with the Perl/Cutadapt output recognise unchanged. The block starts with two header lines that identify the edition:

Trim Galore 2.1.0 (Oxidized Edition) — adapter trimming built in
This is cutadapt 4.0 (compatible; for MultiQC backwards compatibility)
Command line parameters: -j 1 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC SLX_R1.fastq.gz
Processing reads on 1 core in single-end mode ...
=== Summary ===
Total reads processed: 1,166,076,593
Reads with adapters: 467,751,243 (40.1%)
Reads written (passing filters): 1,166,076,593 (100.0%)
Total basepairs processed: 174,911,488,950 bp
Quality-trimmed: 674,749,563 bp (0.4%)
Total written (filtered): 172,939,327,763 bp (98.9%)
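As an illustration of what a downstream parser has to do with this block, the sketch below pulls the integer counts out of the === Summary === section and strips the thousands separators. It is not MultiQC's implementation. The function name parse_cutadapt_summary is hypothetical, and the code assumes the line layout shown in the example above.

```python
import re

# Matches 'Label: 1,234,567' and captures the label and the first number.
COUNT_LINE = re.compile(r"^(.+?):\s+([\d,]+)")


def parse_cutadapt_summary(text):
    """Return {label: int} for the lines inside the '=== Summary ===' block."""
    stats = {}
    in_summary = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "=== Summary ===":
            in_summary = True
            continue
        if in_summary and stripped.startswith("==="):
            break  # the next section has started, e.g. '=== Adapter 1 ==='
        if not in_summary:
            continue
        match = COUNT_LINE.match(stripped)
        if match:
            stats[match.group(1)] = int(match.group(2).replace(",", ""))
    return stats


# Sample text copied from the example in this section.
SAMPLE_SUMMARY = """\
This is cutadapt 4.0 (compatible; for MultiQC backwards compatibility)
Processing reads on 1 core in single-end mode ...
=== Summary ===
Total reads processed: 1,166,076,593
Reads with adapters: 467,751,243 (40.1%)
Reads written (passing filters): 1,166,076,593 (100.0%)
Total basepairs processed: 174,911,488,950 bp
Quality-trimmed: 674,749,563 bp (0.4%)
Total written (filtered): 172,939,327,763 bp (98.9%)
"""

stats = parse_cutadapt_summary(SAMPLE_SUMMARY)
adapter_pct = 100 * stats["Reads with adapters"] / stats["Total reads processed"]
print(f"Reads with adapters: {adapter_pct:.1f}%")
```

Recomputing the percentage from the raw counts, as in the last two lines, is a cheap consistency check against the value printed in the report.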

The per-adapter detail block follows:

=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 467751243 times.
No. of allowed errors:
1-9 bp: 0; 10-13 bp: 1
Overview of removed sequences
length count expect max.err error counts
1 316729650 291519148.2 0 316729650
2 81494061 72879787.1 0 81494061
3 27821736 18219946.8 0 27821736
...
150 3937 17.4 1 102 3835

If multiple adapters were configured (via repeated -a, inline -a " SEQ -a SEQ", or a FASTA file), each gets its own === Adapter N === block.
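The per-adapter tables can be read back with a small amount of Python. The sketch below groups the "Overview of removed sequences" rows by adapter number. It assumes the column order shown above (length, count, expect, max.err, then one or more error counts) and that a blank line ends each table. The name parse_adapter_blocks is hypothetical.

```python
import re

ADAPTER_HEADER = re.compile(r"^=== Adapter (\d+) ===$")


def parse_adapter_blocks(text):
    """Map adapter number -> list of rows from its removed-sequences table."""
    blocks = {}
    current = None    # adapter number of the block being read
    in_table = False  # True between the column header and the next blank line
    for line in text.splitlines():
        header = ADAPTER_HEADER.match(line.strip())
        if header:
            current = int(header.group(1))
            blocks[current] = []
            in_table = False
            continue
        fields = line.split()
        if fields[:2] == ["length", "count"]:
            in_table = current is not None
            continue
        if not fields:
            in_table = False  # assumption: a blank line ends the table
            continue
        if not in_table or len(fields) < 5:
            continue
        try:
            row = {
                "length": int(fields[0]),
                "count": int(fields[1]),
                "expect": float(fields[2]),
                "max_err": int(fields[3]),
                "error_counts": [int(x) for x in fields[4:]],
            }
        except ValueError:
            continue  # not a data row
        blocks[current].append(row)
    return blocks


# Rows copied from the example in this section (the elided rows are left out).
SAMPLE_ADAPTER = """\
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 467751243 times.
No. of allowed errors:
1-9 bp: 0; 10-13 bp: 1
Overview of removed sequences
length count expect max.err error counts
1 316729650 291519148.2 0 316729650
2 81494061 72879787.1 0 81494061
3 27821736 18219946.8 0 27821736
150 3937 17.4 1 102 3835
"""

blocks = parse_adapter_blocks(SAMPLE_ADAPTER)
for number, rows in blocks.items():
    print(f"Adapter {number}: {len(rows)} rows parsed")
```

In the sample rows, the error counts of each row add up to its count column (for example 102 + 3835 = 3937), which makes a handy sanity check after parsing.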

The run statistics summary is written at the end of the report. For single-end input it is a short block after the adapter details:

RUN STATISTICS FOR INPUT FILE: SE.fastq.gz
=============================================
1000000 sequences processed in total
Sequences removed because they became shorter than the length cutoff of 20 bp: 552 (0.1%)

For paired-end input, the validation step that discards under-length pairs runs after both per-file reports are emitted, so:

  • The Read 1 report shows only N sequences processed in total (no post-validation stats).
  • The Read 2 report carries the final paired-end validation counts at the very end:
RUN STATISTICS FOR INPUT FILE: SLX_R2.fastq.gz
=============================================
1166076593 sequences processed in total
Total number of sequences analysed for the sequence pair length validation: 1166076593
Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 3357967 (0.29%)

It is this number, 3,357,967 (0.29%) at the end of the Read 2 trimming report, that represents the total number of read pairs removed from both Read 1 and Read 2 files because of filtering (min length, max length, or max N).
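To pick that figure up programmatically from the Read 2 text report, something like the following sketch could work. It assumes the two validation lines are worded exactly as in the example above. The function name parse_pair_validation is hypothetical.

```python
import re

ANALYSED = re.compile(
    r"^Total number of sequences analysed for the sequence pair length "
    r"validation: (\d+)$"
)
REMOVED = re.compile(
    r"^Number of sequence pairs removed because at least one read was shorter "
    r"than the length cutoff \((\d+) bp\): (\d+) \(([\d.]+)%\)$"
)


def parse_pair_validation(text):
    """Extract the paired-end validation counts from a Read 2 report."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        match = ANALYSED.match(line)
        if match:
            result["analysed"] = int(match.group(1))
            continue
        match = REMOVED.match(line)
        if match:
            result["cutoff_bp"] = int(match.group(1))
            result["removed"] = int(match.group(2))
            result["reported_pct"] = float(match.group(3))
    return result


# Sample text copied from the Read 2 example in this section.
SAMPLE_R2 = """\
RUN STATISTICS FOR INPUT FILE: SLX_R2.fastq.gz
=============================================
1166076593 sequences processed in total
Total number of sequences analysed for the sequence pair length validation: 1166076593
Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 3357967 (0.29%)
"""

validation = parse_pair_validation(SAMPLE_R2)
pairs_kept = validation["analysed"] - validation["removed"]
print(f"Pairs removed: {validation['removed']}, pairs kept: {pairs_kept}")
```

The Read 1 report carries no such lines, so on Read 1 input the function returns an empty dict instead of raising an error.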

Trim Galore also writes a JSON report (*_trimming_report.json) alongside the text report. It contains the same statistics as the text report in a structured format (schema v1), designed for native parsing by MultiQC. If you're building a custom dashboard or consuming Trim Galore output programmatically, prefer the JSON file.
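The field names of the JSON schema are not listed in this section, so the sketch below assumes none of them. It only loads every *_trimming_report.json file in a directory and reports the top-level keys with their value types, which is a reasonable first step before writing a real consumer. Both function names are hypothetical, and treating the top level as a JSON object is an assumption.

```python
import json
from pathlib import Path


def describe_top_level(report):
    """Map each top-level key of a parsed report to the type of its value."""
    if not isinstance(report, dict):
        return {}  # assumption: a report is a JSON object at the top level
    return {key: type(value).__name__ for key, value in report.items()}


def load_json_reports(directory="."):
    """Load every *_trimming_report.json file found in a directory."""
    reports = {}
    for path in sorted(Path(directory).glob("*_trimming_report.json")):
        with path.open(encoding="utf-8") as handle:
            reports[path.name] = json.load(handle)
    return reports


if __name__ == "__main__":
    # Inspect whatever reports sit in the current working directory.
    for name, report in load_json_reports().items():
        print(name, describe_top_level(report))
```

Once the actual keys are known from a real report, the text-scraping approach can be replaced by direct lookups on the parsed dict.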