Python Bio / Tools
2019-05-12 (Sun) 16:31:26
Here we try out read mapping using experimental data that have already been published. If you have run your own experiment and have output data from a sequencer, you would of course use those data instead.
A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae (an example that uses Saccharomyces)
In this example, the data are identified in SRA (Sequence Read Archive), NCBI's public database of experimental data, by an SRA number, which is the accession number for an entire submission (SRA, submission accession).
Accessing SRA (p. 72)
Suppose the data of interest have submission_accession=SRA000299 (a value given in a paper or similar source).
With this, we search the NCBI SRA database for SRA000299.
To do this by hand on the Web, simply access the NCBI SRA database directly.
https://www.ncbi.nlm.nih.gov/sra/?term=SRA000299
Below, we consider doing the work through a Web API (that is, from a Python program). The NCBI SRA database does not appear to publish a Web API of its own and instead directs users to Entrez (see "Download SRA sequences from Entrez search results"). Following the example given there, we build an Entrez query.
Notes on the Entrez API
Download SRA sequences from Entrez search results
SRA Toolkit download (provides fastq-dump and sam-dump)
The article "PythonでNCBIのAPIから文献情報を取得してみた" (fetching literature information from the NCBI API with Python) calls the Web API directly without Biopython, but we do not use that approach here.
Accessing Entrez SRA from Python
from Bio import Entrez

Entrez.email = "yamanouc@hyperresearch.com"
handle = Entrez.esearch(db="sra", term="SRA000299")
result = Entrez.read(handle)  # store the search result in result
print('SRA SRA000299, IDList', result['IdList'])
for id in result['IdList'][:1]:  # take IdList from the result; only the first ID is fetched here
    handle = Entrez.efetch(db="sra", id=id, retmode="xml")  # fetch one record at a time
    print(handle.read())
The result is available only as XML. To parse the XML and obtain the RUN accession IDs we want:
import xml.etree.ElementTree as ET
from Bio import Entrez

Entrez.email = "yamanouc@hyperresearch.com"
handle = Entrez.esearch(db="sra", term="SRA000299")
result = Entrez.read(handle)
print('SRA SRA000299, IDList', result['IdList'])
for id in result['IdList']:
    handle = Entrez.efetch(db="sra", id=id, retmode="xml")
    root = ET.fromstring(handle.read())
    # find every RUN element at any depth
    items = root.findall('.//RUN')
    for i, u in enumerate(items):
        #print(i, 'tag', u.tag, 'attr', u.attrib)
        print(u.get('accession'))
The output is:
SRR002324
SRR002320
SRR002325
SRR002322
SRR002321
SRR002323
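The XML parsing step can be tried without network access. The following is a minimal sketch, not the author's code: SAMPLE_XML is a hypothetical, heavily abridged stand-in for an efetch response, and extract_run_accessions is a helper name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged stand-in for an efetch(db="sra", retmode="xml") response.
SAMPLE_XML = """
<EXPERIMENT_PACKAGE_SET>
  <EXPERIMENT_PACKAGE>
    <SUBMISSION accession="SRA000299"/>
    <RUN_SET>
      <RUN accession="SRR002320"/>
      <RUN accession="SRR002321"/>
    </RUN_SET>
  </EXPERIMENT_PACKAGE>
</EXPERIMENT_PACKAGE_SET>
"""


def extract_run_accessions(xml_text):
    """Return the accession attribute of every RUN element, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [run.get("accession") for run in root.iter("RUN") if run.get("accession")]


if __name__ == "__main__":
    print(extract_run_accessions(SAMPLE_XML))
```

Separating the parsing from the network call makes it possible to check the element and attribute names against a saved response before querying Entrez repeatedly.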
Using these run accession numbers, SRR002320 to SRR002325, we access the FASTQ data.
From here on, we download the SRA data (SRA data, and ultimately FASTQ data).
First, go to the SRA page ⇒ NCBI SRA Toolkit.
Download and unpack the NCBI SRA Toolkit from there, and use it to access the data. With R, much of this appears to be handled automatically behind the scenes.
fasterq-dump SRR002320
The output is:
spots read : 39,266,713
reads read : 39,266,713
reads written : 39,266,713
A file named SRR002320.fastq, 8340468900 bytes in size, was created. The beginning of the file looks like this:
@SRR002320.1 080226_CMLIVERKIDNEY_0007:1:1:112:735 length=36
GTGGTGGGGTTGGTATTTGGTTTCTCGTTTTAATTA
+SRR002320.1 080226_CMLIVERKIDNEY_0007:1:1:112:735 length=36
IIIIIIII"IIIII)I$I1%HII"I#./(#/'$#*#
@SRR002320.2 080226_CMLIVERKIDNEY_0007:1:1:114:564 length=36
GGATACTCAGGCTGGCCCAATTTCTGGGCGTGGGAA
+SRR002320.2 080226_CMLIVERKIDNEY_0007:1:1:114:564 length=36
IIII:>&<I;I%I88II1&+I:IF>II,&D:I-'),
@SRR002320.3 080226_CMLIVERKIDNEY_0007:1:1:109:558 length=36
GTAGAATTAGAATTGTGAAGATGATAAGTGTAGAGG
+SRR002320.3 080226_CMLIVERKIDNEY_0007:1:1:109:558 length=36
IIIIIIIIIIIIIIIIIIIIIII<IIAIIII6I?I:
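Each FASTQ record shown above spans four lines: a header starting with @, the bases, a separator starting with +, and one quality character per base. The sketch below reads such records and decodes the quality string. It assumes the Phred+33 encoding, which matches the -phred33 setting used with Trimmomatic later on this page. The names read_fastq and phred33 are hypothetical helpers, not part of any library.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (read_id, sequence, quality) from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))  # one record = 4 consecutive lines
        if len(record) < 4:
            return
        header, seq, plus, qual = (s.rstrip("\n") for s in record)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:].split()[0], seq, qual


def phred33(qual):
    """Decode a quality string into integer scores, assuming Phred+33."""
    return [ord(c) - 33 for c in qual]


# First record of SRR002320.fastq as listed above.
SAMPLE = [
    "@SRR002320.1 080226_CMLIVERKIDNEY_0007:1:1:112:735 length=36",
    "GTGGTGGGGTTGGTATTTGGTTTCTCGTTTTAATTA",
    "+SRR002320.1 080226_CMLIVERKIDNEY_0007:1:1:112:735 length=36",
    'IIIIIIII"IIIII)I$I1%HII"I#./(#/\'$#*#',
]

if __name__ == "__main__":
    for read_id, seq, qual in read_fastq(SAMPLE):
        scores = phred33(qual)
        print(read_id, len(seq), min(scores), max(scores))
```

For a real file, passing an open file object (for example `open("SRR002320.fastq")`) to read_fastq streams the records without loading the whole 8 GB file into memory.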
For the reference sequence, search for the Homo sapiens genome in NCBI Genome
 ⇒ Search
  ⇒ GRCh38.p12 (December 2017) Download ⇒ saved as the file GRCh38.p12.tar (938 MB) (p. 90)
Announcement of the "先進ゲノム支援" (advanced genome support) data analysis workshops
⇒ Workshop videos <FY2018 data analysis workshop (intermediate level)>
⇒ Materials (GitHub)
The subject of 1-1 is the budding yeast Saccharomyces cerevisiae cultured under two different conditions.
Intawat Nookaew et al
"A comprehensive comparison of RNA-Seq-based transcriptome analysis
from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae"
Nucleic Acids Research, 2012, Vol. 40, No. 20, September 2012
doi: 10.1093/nar/gks804
full text
The paper (p. 10095) lists the following under ACCESSION NUMBERS for the reads:
GSE37599, SRS307298, SRR453566, SRR453567, SRR453568, SRR453569, SRR453570, SRR453571 and SRR453578.
With these as the starting point, the SRA data (SRA data, and ultimately FASTQ data) can be downloaded in the same way as in the example above. First go to the SRA page ⇒ NCBI SRA Toolkit, download and unpack the NCBI SRA Toolkit, and use it to access the data. To download the data for SRR453566, run:
fasterq-dump SRR453566
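When several runs are needed, the same command can be issued in a loop. This is a sketch only: it builds exactly the command shown above for each SRR accession in the paper's list and, by default, just prints it. The function names and the dry_run switch are assumptions introduced here for illustration.

```python
import shutil
import subprocess

# Run accessions (SRR numbers) quoted from the paper above.
RUNS = ["SRR453566", "SRR453567", "SRR453568", "SRR453569",
        "SRR453570", "SRR453571", "SRR453578"]


def fasterq_dump_command(accession):
    """Build the command line that the text runs by hand for one accession."""
    return ["fasterq-dump", accession]


def download_all(accessions, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for acc in accessions:
        cmd = fasterq_dump_command(acc)
        print(" ".join(cmd))
        if not dry_run:
            if shutil.which(cmd[0]) is None:
                raise FileNotFoundError("fasterq-dump not found on PATH")
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    download_all(RUNS)  # dry run: prints the commands without downloading
```

Each FASTQ file can be several gigabytes, so a dry run first is a cheap way to confirm the accession list before committing disk space and time.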
The paper also states that the S288c genome is used as the reference, so these data must be obtained as well. The relevant passage is:
Transcriptome analysis using reference genome-based reads mapping
The genome sequence of S. cerevisiae strain S288c and its annotations were retrieved from the SGD databases and used for all analysis.
Based on this, searching SGD (Saccharomyces Genome Database) for S288C leads to Strain: S288C. From that page, we use the entry listed as GenBank GCF_000146045.2.
Clicking GCF_000146045.2 jumps to the GenBank page for R64, Organism name: Saccharomyces cerevisiae S288C (baker's yeast), Strain: S288C. On that page, clicking "Download the RefSeq assembly" under "Access the data" on the right-hand side displays a file list. From this list, download GCF_000146045.2_R64_genomic.fna.gz as the reference sequence and GCF_000146045.2_R64_genomic.gff.gz as the GFF annotation file.
As an alternative route, search for saccharomyces cerevisiae s288c[orgn] from the NCBI genome page.
Clicking the link there to download the genome in FASTA format yields the file GCF_000146045.2_R64_genomic.fna.gz.
Likewise, clicking the link to download the annotation in GFF format yields the file GCF_000146045.2_R64_genomic.gff.gz.
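After downloading, it is worth checking that the reference FASTA is intact before mapping. The sketch below lists each sequence ID and its length directly from the gzip-compressed file using only the standard library. The function name fasta_lengths is a hypothetical helper, and the file is assumed to sit in the current directory.

```python
import gzip
import os


def fasta_lengths(path):
    """Return {sequence_id: length} for a gzip-compressed FASTA file."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # the ID is the first whitespace-delimited token of the header
                current = line[1:].split()[0]
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


if __name__ == "__main__":
    path = "GCF_000146045.2_R64_genomic.fna.gz"  # file name from the text
    if os.path.exists(path):
        for name, n in fasta_lengths(path).items():
            print(name, n)
```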
RNA-seq exercise (2018-03), 高橋弘喜
To run FastQC in batch mode:
fastqc --nogroup SRR453566_1.fastq
fastqc --nogroup SRR453566_2.fastq
This generates the quality graphs. For usage details, the page "FASTQ クオリティコントロール" (FASTQ quality control) describes the parameter settings.
The original page is fastqc.
For Trimmomatic, see here.
Download (Version 0.38 as of 2019-03-02)
Installed in /usr/local/Trimmomatic
Usage:
java -jar trimmomatic-0.38.jar PE -phred33 input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
This will perform the following:
Remove adapters (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10)
Remove leading low quality or N bases (below quality 3) (LEADING:3)
Remove trailing low quality or N bases (below quality 3) (TRAILING:3)
Scan the read with a 4-base wide sliding window, cutting when the average quality per base drops below 15 (SLIDINGWINDOW:4:15)
Drop reads below the 36 bases long (MINLEN:36)
According to the slides from 遺伝研 (National Institute of Genetics):
java -jar -Xmx512m trimmomatic-0.38.jar \
  PE \
  -threads ${NSLOTS} \
  -phred33 \
  -trimlog log_SRR${NUM}.txt \
  SRR${NUM}_1.fastq.gz \
  SRR${NUM}_2.fastq.gz \
  paired_SRR${NUM}_1.trim.fastq.gz \
  unpaired_SRR${NUM}_1.trim.fastq.gz \
  paired_SRR${NUM}_2.trim.fastq.gz \
  unpaired_SRR${NUM}_2.trim.fastq.gz \
  ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 \
  LEADING:20 \
  TRAILING:20 \
  SLIDINGWINDOW:4:15 \
  MINLEN:36
The slides state that, in general, adapter removal should be done before other processing, because other processing makes adapter matching harder.
The meaning of each detailed parameter is described on the Trimmomatic page and in the manual. The example above is explained below (paired-end case only).
-Xmx512m | An option for the java command, not a Trimmomatic parameter. -Xmx512m sets the maximum memory (heap) allocation to 512 MB. The default when unspecified depends on the JVM (64 MB on old JVMs) |
PE | Operating mode: SE (single end) or PE (paired end) |
-threads 16 | Number of parallel threads used for processing |
-phred33 | Encoding of the base (read) quality scores, either -phred33 or -phred64; detected automatically when unspecified (since v0.32) |
-trimlog log_SRR453566.txt | Name of the file to which the run log is written |
SRR453566_1.fastq | Input forward file for paired end |
SRR453566_2.fastq | Input reverse file for paired end |
paired_SRR453566_1.trim.fastq | Output paired forward file for paired end |
unpaired_SRR453566_1.trim.fastq | Output unpaired forward file for paired end |
paired_SRR453566_2.trim.fastq | Output paired reverse file for paired end |
unpaired_SRR453566_2.trim.fastq | Output unpaired reverse file for paired end |
The remaining arguments specify the individual trimming steps.
ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 | Step 1 removes Illumina adapters. TruSeq*** is the FASTA file describing the adapters; 2 is the maximum number of mismatches; 30 specifies how accurately the two adapter-ligated reads must match in palindrome alignment; 10 is the required accuracy of the alignment match between an adapter and a read |
LEADING:20 | Remove low-quality bases from the start of the read; 20 is the minimum quality required to keep a base |
TRAILING:20 | Remove low-quality bases from the end of the read; 20 is the minimum quality required to keep a base |
CROP:? (not used in the example) | Regardless of quality, keep only the specified number of bases from the start and remove the rest |
HEADCROP:? (not used in the example) | Regardless of quality, remove the specified number of bases from the start and keep the rest |
SLIDINGWINDOW:4:15 | Scan with a sliding window 4 bases wide and cut the read once the average quality within the window drops below 15 |
MINLEN:36 | (Normally done last) Of the remaining reads, keep only those at least 36 bases long |
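The quality-based steps in the table can be illustrated on a single read. This is a deliberately simplified sketch of the ideas behind LEADING, TRAILING, SLIDINGWINDOW and MINLEN, applied in that order. It is not Trimmomatic's implementation, whose handling of window edges may differ, and all function names here are hypothetical.

```python
def trim_leading(quals, threshold):
    """LEADING: index of the first base whose quality is >= threshold."""
    start = 0
    while start < len(quals) and quals[start] < threshold:
        start += 1
    return start


def trim_trailing(quals, threshold):
    """TRAILING: length left after dropping end bases with quality < threshold."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return end


def sliding_window_cut(quals, window, min_avg):
    """SLIDINGWINDOW (simplified): cut at the first window whose mean < min_avg."""
    for i in range(0, len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < min_avg:
            return i
    return len(quals)


def trim_read(seq, quals, leading=20, trailing=20, window=4, min_avg=15, minlen=36):
    """Apply the four steps in order; return None if the read ends up too short."""
    start = trim_leading(quals, leading)
    seq, quals = seq[start:], quals[start:]
    end = trim_trailing(quals, trailing)
    seq, quals = seq[:end], quals[:end]
    keep = sliding_window_cut(quals, window, min_avg)
    seq, quals = seq[:keep], quals[:keep]
    if len(seq) < minlen:  # MINLEN: drop reads that became too short
        return None
    return seq, quals


if __name__ == "__main__":
    quals = [10, 30, 30, 30, 30, 30, 30, 5, 5, 5, 5, 30, 10]
    print(trim_read("ACGTACGTACGTA", quals, minlen=3))
```

In the toy read, the first and last bases fall to LEADING and TRAILING, and the run of quality-5 bases pulls a window average below 15, so only the first stretch of high-quality bases survives. With the default minlen=36 the same read would be dropped entirely.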
The actual command was:
java -jar -Xmx512m /usr/local/Trimmomatic/trimmomatic-0.38.jar PE \
  -threads 32 \
  -phred33 \
  -trimlog log_SRR453566.txt \
  SRR453566_1.fastq \
  SRR453566_2.fastq \
  paired_SRR453566_1.trim.fastq \
  unpaired_SRR453566_1.trim.fastq \
  paired_SRR453566_2.trim.fastq \
  unpaired_SRR453566_2.trim.fastq \
  ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 \
  LEADING:20 \
  TRAILING:20 \
  SLIDINGWINDOW:4:15 \
  MINLEN:36
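Because the command repeats the accession in seven file names, typing it by hand invites typos. One option is to generate the argument list from the accession. The sketch below reproduces the command above for a given run; the function name and its keyword parameters are assumptions made for illustration, while the option values are those used in the text.

```python
def trimmomatic_pe_command(accession,
                           jar="/usr/local/Trimmomatic/trimmomatic-0.38.jar",
                           threads=32,
                           adapters="TruSeq3-PE-2.fa"):
    """Build the paired-end Trimmomatic argument list for one run accession."""
    cmd = ["java", "-jar", "-Xmx512m", jar, "PE",
           "-threads", str(threads),
           "-phred33",
           "-trimlog", f"log_{accession}.txt",
           f"{accession}_1.fastq", f"{accession}_2.fastq"]
    # Output order expected by Trimmomatic PE:
    # forward paired, forward unpaired, reverse paired, reverse unpaired.
    for mate in (1, 2):
        cmd += [f"paired_{accession}_{mate}.trim.fastq",
                f"unpaired_{accession}_{mate}.trim.fastq"]
    cmd += [f"ILLUMINACLIP:{adapters}:2:30:10",
            "LEADING:20", "TRAILING:20", "SLIDINGWINDOW:4:15", "MINLEN:36"]
    return cmd


if __name__ == "__main__":
    print(" ".join(trimmomatic_pe_command("SRR453566")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once Java and the jar file are in place.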
The output is:
TrimmomaticPE: Started with arguments:
 -threads 32 -phred33 -trimlog log_SRR453566.txt SRR453566_1.fastq SRR453566_2.fastq paired_SRR453566}_1.trim.fastq unpaired_SRR453566_1.trim.fastq paired_SRR453566_2.trim.fastq unpaired_SRR53566_2.trim.fastq ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:15 MINLEN:36
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
Using Long Clipping Sequence: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA'
Using Long Clipping Sequence: 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
Using Long Clipping Sequence: 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
Using Long Clipping Sequence: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 5725730 Both Surviving: 5115482 (89.34%) Forward Only Surviving: 514793 (8.99%) Reverse Only Surviving: 46123 (0.81%) Dropped: 49332 (0.86%)
TrimmomaticPE: Completed successfully
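The summary line of the log can be checked arithmetically: the four outcome counts should add up to the number of input read pairs, and each percentage is the count divided by that total. A small sketch follows, using the summary line quoted above; parse_trimmomatic_summary is a hypothetical helper name.

```python
import re

# Summary line copied from the Trimmomatic log above.
SUMMARY = ("Input Read Pairs: 5725730 Both Surviving: 5115482 (89.34%) "
           "Forward Only Surviving: 514793 (8.99%) "
           "Reverse Only Surviving: 46123 (0.81%) Dropped: 49332 (0.86%)")


def parse_trimmomatic_summary(line):
    """Return {label: count} for each 'Label: N' field in the summary line."""
    pattern = (r"(Input Read Pairs|Both Surviving|Forward Only Surviving|"
               r"Reverse Only Surviving|Dropped): (\d+)")
    return {label: int(n) for label, n in re.findall(pattern, line)}


if __name__ == "__main__":
    counts = parse_trimmomatic_summary(SUMMARY)
    total = counts["Input Read Pairs"]
    for label, n in counts.items():
        if label != "Input Read Pairs":
            print(f"{label}: {n} ({100 * n / total:.2f}%)")
```

In this run, 5115482 of 5725730 pairs (89.34%) kept both mates, and those pairs are the ones written to the two paired_ output files.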
Note that for the adapter sequences specified in ILLUMINACLIP, the file adapters/TruSeq3-PE-2.fa from the Trimmomatic package on GitHub could be used.