3
我有一個文件,如下面的小例子。每4行都與一個ID相關。每個ID的第二行以N開頭。我想在行首開始刪除N,其他所有內容都保持不變。 我想在python中做到這一點。你知道怎麼做嗎?如何在Python中編輯文本(.fastq)文件
例如:
@SRR2163140.1 HISEQ:148:C670LANXX:3:1101:1302:1947 length=50
NGCGACCTCAGATCAGACGTGGCGACC
+SRR2163140.1 HISEQ:148:C670LANXX:3:1101:1302:1947 length=50
#<<ABGGGGGGGGGGGGGGGGGGGGGG
@SRR2163140.3 HISEQ:148:C670LANXX:3:1101:1381:1997 length=50
NGCCGACATCGAAGGATCAA
+SRR2163140.3 HISEQ:148:C670LANXX:3:1101:1381:1997 length=50
#<<ABFGGGGGGGGGGGGGG
@SRR2163140.4 HISEQ:148:C670LANXX:3:1101:1705:1940 length=50
NACAAACCCTTGTGTCGAGGGC
+SRR2163140.4 HISEQ:148:C670LANXX:3:1101:1705:1940 length=50
#=ABBGGGGGGGGGGGGGGGGG
@SRR2163140.7 HISEQ:148:C670LANXX:3:1101:1704:1965 length=50
NGGGACATGACAGCCTGGACCATCG
+SRR2163140.7 HISEQ:148:C670LANXX:3:1101:1704:1965 length=50
#=ABBGGGGGGGGGGGGGGGGGGGG
輸出:
@SRR2163140.1 HISEQ:148:C670LANXX:3:1101:1302:1947 length=50
GCGACCTCAGATCAGACGTGGCGACC
+SRR2163140.1 HISEQ:148:C670LANXX:3:1101:1302:1947 length=50
#<<ABGGGGGGGGGGGGGGGGGGGGGG
@SRR2163140.3 HISEQ:148:C670LANXX:3:1101:1381:1997 length=50
GCCGACATCGAAGGATCAA
+SRR2163140.3 HISEQ:148:C670LANXX:3:1101:1381:1997 length=50
#<<ABFGGGGGGGGGGGGGG
@SRR2163140.4 HISEQ:148:C670LANXX:3:1101:1705:1940 length=50
ACAAACCCTTGTGTCGAGGGC
+SRR2163140.4 HISEQ:148:C670LANXX:3:1101:1705:1940 length=50
#=ABBGGGGGGGGGGGGGGGGG
@SRR2163140.7 HISEQ:148:C670LANXX:3:1101:1704:1965 length=50
GGGACATGACAGCCTGGACCATCG
+SRR2163140.7 HISEQ:148:C670LANXX:3:1101:1704:1965 length=50
#=ABBGGGGGGGGGGGGGGGGGGGG
請注意,要獲得有效的fastq格式,您還需要刪除質量行的第一個字符。你想要的不會保留基礎和品質之間的匹配。 – bli