[WARNING] No absolute output path provided, using current directory as prefix
[INFO] CLAIR3 VERSION: v0.1-r6
[INFO] BAM FILE PATH: /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6.bam
[INFO] REFERENCE FILE PATH: /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/../../config/ref/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta
[INFO] MODEL PATH: /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/ont_r104_e81_sup_g5015
[INFO] OUTPUT FOLDER: /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6
[INFO] PLATFORM: ont
[INFO] THREADS: 48
[INFO] BED FILE PATH: /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6.bed
[INFO] VCF FILE PATH: EMPTY
[INFO] CONTIGS: EMPTY
[INFO] CONDA PREFIX: /home/OXFORDNANOLABS/cwright/miniconda3/envs/clair3
[INFO] SAMTOOLS PATH: samtools
[INFO] PYTHON PATH: python3
[INFO] PYPY PATH: pypy3
[INFO] PARALLEL PATH: parallel
[INFO] WHATSHAP PATH: whatshap
[INFO] CHUNK SIZE: 5000000
[INFO] FULL ALIGN PROPORTION: 0.3
[INFO] FULL ALIGN REFERENCE PROPORTION: 0.1
[INFO] ENABLE FILEUP ONLY CALLING: False
[INFO] ENABLE FAST MODE CALLING: False
[INFO] ENABLE CALLING SNP CANDIDATES ONLY: False
[INFO] ENABLE PRINTING REFERENCE CALLS: False
[INFO] ENABLE OUTPUT GVCF: False
[INFO] ENABLE HAPLOID PRECISE MODE: False
[INFO] ENABLE HAPLOID SENSITIVE MODE: False
[INFO] ENABLE INCLUDE ALL CTGS CALLING: False
[INFO] ENABLE NO PHASING FOR FULL ALIGNMENT: False
+ /home/OXFORDNANOLABS/cwright/miniconda3/envs/clair3/bin/scripts/clair3.sh --bam_fn /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6.bam --ref_fn /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/../../config/ref/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta --threads 48 --model_path /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/ont_r104_e81_sup_g5015 --platform ont --output /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6 --bed_fn=/media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6.bed --vcf_fn=EMPTY --ctg_name=EMPTY --sample_name=SAMPLE --chunk_num=0 --chunk_size=5000000 --samtools=samtools --python=python3 --pypy=pypy3 --parallel=parallel --whatshap=whatshap --qual=2 --var_pct_full=0.3 --ref_pct_full=0.1 --snp_min_af=0 --indel_min_af=0 --pileup_only=False --gvcf=False --fast_mode=False --call_snp_only=False --print_ref_calls=False --haploid_precise=False --haploid_sensitive=False --include_all_ctgs=False --no_phasing_for_fa=False --pileup_model_prefix=pileup --fa_model_prefix=full_alignment
[INFO] Check environment variables
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/log
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/split_beds
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/pileup_output
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/merge_output
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/phase_output
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/gvcf_tmp_output
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/full_alignment_output
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/phase_output/phase_vcf
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/phase_output/phase_bam
[INFO] Create folder /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/tmp/full_alignment_output/candidate_bed
[INFO] --include_all_ctgs not enabled, use chr{1..22,X,Y} and {1..22,X,Y} by default
[INFO] Call variant in contigs: chr6
[INFO] Chunk number for each contig: 35
[INFO] 1/7 Call variants using pileup model
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 19/35) : 13591
Total time elapsed: 194.83 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 10/35) : 14808
Total time elapsed: 197.73 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 29/35) : 12779
Total time elapsed: 197.40 s
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 34/35) : 15600
Total time elapsed: 205.93 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 20/35) : 12458
Total time elapsed: 206.47 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 28/35) : 12097
Total time elapsed: 206.55 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 9/35) : 15132
Total time elapsed: 206.86 s
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 27/35) : 12934
Total time elapsed: 206.97 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 18/35) : 13629
Total time elapsed: 208.22 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 12/35) : 12612
Total time elapsed: 208.10 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 26/35) : 12071
Total time elapsed: 207.66 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 16/35) : 14568
Total time elapsed: 208.77 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 4/35) : 15995
Total time elapsed: 207.96 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 30/35) : 10807
Total time elapsed: 208.14 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 25/35) : 13144
Total time elapsed: 210.15 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 11/35) : 14349
Total time elapsed: 211.05 s
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 15/35) : 13182
Total time elapsed: 212.17 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 17/35) : 13371
Total time elapsed: 212.99 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 5/35) : 15571
Total time elapsed: 213.47 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 23/35) : 13575
Total time elapsed: 213.50 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 24/35) : 13010
Total time elapsed: 212.23 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 22/35) : 14340
Total time elapsed: 214.06 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 21/35) : 12413
Total time elapsed: 213.71 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 31/35) : 14610
Total time elapsed: 214.29 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 6/35) : 18237
Total time elapsed: 216.71 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 3/35) : 14603
Total time elapsed: 218.69 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 33/35) : 17987
Total time elapsed: 221.15 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 8/35) : 15418
Total time elapsed: 223.15 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 1/35) : 17631
Total time elapsed: 224.79 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 14/35) : 14615
Total time elapsed: 225.05 s
[INFO] Delay 3 seconds before starting variant calling ...
[faidx] Truncated sequence: chr6:164925849-171806021
[mpileup] 1 samples in 1 input files
Calling variants ...
Processed 20000 tensors
Total processed positions in chr6 (chunk 35/35) : 24916
Total time elapsed: 228.90 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 2/35) : 17101
Total time elapsed: 229.85 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 32/35) : 14757
Total time elapsed: 230.92 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Processed 20000 tensors
Processed 40000 tensors
Total processed positions in chr6 (chunk 7/35) : 48252
Total time elapsed: 253.98 s
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Processed 20000 tensors
Processed 40000 tensors
Processed 60000 tensors
Processed 80000 tensors
Total processed positions in chr6 (chunk 13/35) : 80612
Total time elapsed: 299.14 s
real 6m13.093s
user 118m44.594s
sys 2m14.765s
[INFO] 2/7 Select heterozygous SNP variants for Whatshap phasing and haplotagging
[INFO] Select heterozygous pileup variants exceeding phasing quality cutoff 22
[INFO] Total heterozygous SNP positions selected: chr6: 111456
real 0m2.061s
user 0m1.483s
sys 0m0.184s
[INFO] 3/7 Phase VCF file using Whatshap
This is WhatsHap 1.0 running under Python 3.6.10
Working on 1 samples from 1 family
======== Working on chromosome 'chr6'
---- Processing individual SAMPLE
Using maximum coverage per sample of 15X
Number of variants skipped due to missing genotypes: 0
Number of remaining heterozygous variants: 111456
Reading alignments and detecting alleles ...
Found 698324 reads covering 111418 variants
Kept 557487 reads that cover at least two variants each
Reducing coverage to at most 15X by selecting most informative reads ...
Selected 72202 reads covering 111352 variants
Variants covered by at least one phase-informative read in at least one individual after read selection: 111352
Phasing 1 sample by solving the MEC problem ...
MEC cost: 269220
No. of phased blocks: 352
Largest component contains 2628 variants (2.4% of accessible variants) between position 375341 and 3504839
======== Writing VCF
Done writing VCF
Changed 16 genotypes while writing VCF
== SUMMARY ==
Maximum memory usage: 1.273 GB
Time spent reading BAM/CRAM: 206.9 s
Time spent parsing VCF: 2.3 s
Time spent selecting reads: 93.8 s
Time spent phasing: 811.9 s
Time spent writing VCF: 2.7 s
Time spent finding components: 84.7 s
Time spent on rest: 4.3 s
Total elapsed time: 1206.7 s
real 20m11.270s
user 19m59.584s
sys 0m6.595s
[INFO] 4/7 Haplotag input BAM file using Whatshap
Found 1 samples in input VCF
Keeping 1 samples for haplo-tagging
Found 0 samples in BAM file
Reading alignments and detecting alleles ...
Found 693682 reads covering 111298 variants
== SUMMARY ==
Total alignments processed: 1543459
Alignments that could be tagged: 692374
Alignments spanning multiple phase sets: 0
haplotag - total processing time: 859.563937664032
real 14m23.638s
user 14m1.966s
sys 0m16.687s
[INFO] 5/7 Select candidates for full-alignment calling
[INFO] Set variants quality cutoff 21.0
[INFO] Set reference calls quality cutoff 10.0
[INFO] Low quality reference calls to be processed in chr6: 29344
[INFO] Low quality variants to be processed in chr6: 94819
real 0m1.676s
user 0m1.285s
sys 0m0.236s
[INFO] 6/7 Call low-quality variants using full-alignment model
[INFO] Delay 3 seconds before starting variant calling ...
[faidx] Truncated sequence: chr6:166386783-171745418
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 13/13) : 4163
Total time elapsed: 107.95 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 6/13) : 10000
Total time elapsed: 153.37 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 3/13) : 10000
Total time elapsed: 171.86 s
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 5/13) : 10000
Total time elapsed: 189.74 s
[INFO] Delay 0 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 2/13) : 10000
Total time elapsed: 208.55 s
[INFO] Delay 4 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 1/13) : 10000
Total time elapsed: 209.56 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 7/13) : 10000
Total time elapsed: 210.31 s
[INFO] Delay 1 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 8/13) : 10000
Total time elapsed: 235.95 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 4/13) : 10000
Total time elapsed: 250.71 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 12/13) : 10000
Total time elapsed: 353.18 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 11/13) : 10000
Total time elapsed: 397.82 s
[INFO] Delay 3 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 9/13) : 10000
Total time elapsed: 400.47 s
[INFO] Delay 2 seconds before starting variant calling ...
[mpileup] 1 samples in 1 input files
Calling variants ...
Total processed positions in chr6 (chunk 10/13) : 10000
Total time elapsed: 419.11 s
real 7m36.616s
user 72m48.421s
sys 1m44.224s
[INFO] 7/7 Merge pileup VCF and full-alignment VCF
[INFO] Pileup variants processed in chr6: 221616
[INFO] Full-alignment variants processed in chr6: 99333
real 0m3.533s
user 0m3.102s
sys 0m0.266s
[INFO] Finish calling, output file: /media/groups/custflow/active/datasets/gm24385_q20_2021.10/extra_analysis/small_variants/chr6/merge_output.vcf.gz
real 48m58.583s
user 227m39.466s
sys 4m30.038s