[[Python¥Ð¥¤¥ª]]¡¡[[Python¥Ð¥¤¥ª/¥Ä¡¼¥ë]]~
&counter();¡¡¡¡¡¡&lastmod();~

*¥ê¡¼¥É¤Î»²¾ÈÇÛÎó¤Ø¤Î¥Þ¥Ã¥Ô¥ó¥° [#mbaaa438]

Bowtie2ºÇ¿·ÈǤò¥¤¥ó¥¹¥È¡¼¥ë¡Ê¥¢¥Ã¥×¥Ç¡¼¥È¡Ë~
¥À¥¦¥ó¥í¡¼¥É¤·¤Æunzip¤·¤Æmake

»È¤¤Êý¤ÎÎã¡¡[[biopapyrus bowtie2:https://bi.biopapyrus.jp/rnaseq/mapping/bowtie2/]]¡¢¥Þ¥Ë¥å¥¢¥ë¡¡[[http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#getting-started-with-bowtie-2-lambda-phage-example]]

 bowtie2-build --threads 16 GCF_000001405.38_GRCh38.p12_genomic.fna HG38.ix

 Settings:
   Output files: "HG38.ix.*.bt2"
   Line rate: 6 (line is 64 bytes)
   Lines per side: 1 (side is 64 bytes)
   Offset rate: 4 (one in 16)
   FTable chars: 10
   Strings: unpacked
   Max bucket size: default
   Max bucket size, sqrt multiplier: default
   Max bucket size, len divisor: 4
   Difference-cover sample period: 1024
   Endianness: little
   Actual local endianness: little
   Sanity checking: disabled
   Assertions: disabled
   Random seed: 0
   Sizeofs: void*:8, int:4, long:8, size_t:8
 Input files DNA, FASTA:
   GCF_000001405.38_GRCh38.p12_genomic.fna
 Building a SMALL index
 Reading reference sizes
   Time reading reference sizes: 00:00:20
 Calculating joined length
 Writing header
 Reserving space for joined string
 Joining reference sequences
   Time to join reference sequences: 00:00:20
 bmax according to bmaxDivN setting: 773987710
 Using parameters --bmax 580490783 --dcv 1024
   Doing ahead-of-time memory usage test
   Passed!  Constructing with these parameters: --bmax 580490783 --dcv 1024
 Constructing suffix-array element generator
 Building DifferenceCoverSample
   Building sPrime
   Building sPrimeOrder
   V-Sorting samples
   V-Sorting samples time: 00:01:24
   Allocating rank array
   Ranking v-sort output
   Ranking v-sort output time: 00:00:17
   Invoking Larsson-Sadakane on ranks
   Invoking Larsson-Sadakane on ranks time: 00:00:30
   Sanity-checking and returning
 Building samples
 Reserving space for 12 sample suffixes
 Generating random suffixes
 QSorting 12 sample offsets, eliminating duplicates
 QSorting sample offsets, eliminating duplicates time: 00:00:00
 Multikey QSorting 12 samples
   (Using difference cover)
   Multikey QSorting samples time: 00:00:00
 Calculating bucket sizes
 Splitting and merging
   Splitting and merging time: 00:00:00
 Avg bucket size: 3.09595e+09 (target: 580490782)
 Converting suffix-array elements to index image
 Allocating ftab, absorbFtab
 Entering Ebwt loop
 Getting block 1 of 1
   No samples; assembling all-inclusive block
   Sorting block of length 3095950843 for bucket 1
   (Using difference cover)

¼¡¤Ë¡¢samtools¤Ç¤Þ¤È¤á¤ë¡£ÊÂÎó»ØÄê¤Ï-@ ¥¹¥ì¥Ã¥É¿ô
 samtools view -@ 32 -bS SRR002322.sam > SRR002322.bam

³¤¤¤Æ¡¢¥½¡¼¥È¡£ÊÂÎó»ØÄê¤Ï-@ ¥¹¥ì¥Ã¥É
 samtools sort SRR002322.bam -o SRR002322.sorted.bam -@ 32

ºÇ¸å¤Ëmpileup¡¡¤³¤ì¤ÏÊÂÎó»ØÄê¤Ï¤Ç¤­¤Ê¤¤¤é¤·¤¤
 samtools mpileup -uf GCF_000001405.38_GRCh38.p12_genomic.fna SRR002322.sorted.bam | bcftools view -Ov - > SR002322.raw.bcf



¤Ê¤ª¡¢mpileup¤âÊÂÎó¼Â¹Ô¤Ç¤­¤ë¥Ñ¥Ã¥±¡¼¥¸¤¬¤¢¤Ã¤¿¤Î¤Ç¡¢¤¤¤º¤ì¥Æ¥¹¥È¡£~
[[Sambamba 0.6.8:https://github.com/biod/sambamba/releases]]

*°äÅÁ¸¦¹Ö½¬²ñ¤Ç¤ÎÎã [#c85616f7]
[[¡ÖÀè¿Ê¥²¥Î¥à»Ù±ç¡×¾ðÊó²òÀϹֽ¬²ñ¤Î¤´°ÆÆâ:https://www.genome-sci.jp/whatsnew/event/news20180920.html]]~
¢Í[[¾ðÊó²òÀϹֽ¬²ñ¥Ó¥Ç¥ª¡ã2018ǯÅÙ¡¡¾ðÊó²òÀϹֽ¬²ñ¡ÊÃæµé¼Ô¸þ¤±¡Ë¡ä:https://www.genome-sci.jp/lecture20181st]]~
¢Í[[»ñÎÁ¡ÊGitHub¡Ë:https://github.com/genome-sci/python_bioinfo_2018]]

°äÅÁ¸¦¹Ö½¬²ñ¤ÎÎã¤Ç¤Ï¡¢¡ÖTopHat2¤Î¸å·Ñ¤Ç¤¢¤ë¡×HISAT2¤ò»È¤Ã¤Æ¤¤¤ë¡£<-- [[¥¢¥á¥ê¥¨¥Õ¤Î¥Ö¥í¥° HISAT2:http://staffblog.amelieff.jp/entry/2016/07/01/144419]]

[[HISAT2:https://ccb.jhu.edu/software/hisat2/index.shtml]]¡¡¤ËÛ©¤¯

>HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome).  Based on an extension of BWT for graphs ...

 http://ccb.jhu.edu/software/hisat2/dl/hisat2-2.1.0-source.zip
 unzip
 make

¥Ð¥¤¥Ê¥ê¤Ïmake¤·¤¿¥Ç¥£¥ì¥¯¥È¥êÃæ¤Ë¤Ç¤­¤ë¤Î¤Ç¡¢Å¬µ¹¥ê¥ó¥¯¤òÄ¥¤ë¤³¤È¡£

hisat2¤Î½èÍý¤Ï¡¢¥¤¥ó¥Ç¥Ã¥¯¥¹¤òºî¤ë(hisat2-build) ¢Í hisat2¤Ç¥Þ¥Ã¥×¡¡¤Î½ç¡£

 hisat2-build s288c.fna s288c-build.fna
 Settings:
   Output files: "s288c-build.fna.*.ht2"
   Line rate: 6 (line is 64 bytes)
   Lines per side: 1 (side is 64 bytes)
   Offset rate: 4 (one in 16)
   FTable chars: 10
   Strings: unpacked
   Local offset rate: 3 (one in 8)
   Local fTable chars: 6
   Local sequence length: 57344
   Local sequence overlap between two consecutive indexes: 1024
   Endianness: little
   Actual local endianness: little
   Sanity checking: disabled
   Assertions: disabled
   Random seed: 0
   Sizeofs: void*:8, int:4, long:8, size_t:8
 Input files DNA, FASTA:
   s288c.fna
 Reading reference sizes
   Time reading reference sizes: 00:00:00
 Calculating joined length
 Writing header
 Reserving space for joined string
 Joining reference sequences
   Time to join reference sequences: 00:00:00
   Time to read SNPs and splice sites: 00:00:00
 Using parameters --bmax 2279457 --dcv 1024
   Doing ahead-of-time memory usage test
   Passed!  Constructing with these parameters: --bmax 2279457 --dcv 1024
 Constructing suffix-array element generator
 Building DifferenceCoverSample
   Building sPrime
   Building sPrimeOrder
   V-Sorting samples
   V-Sorting samples time: 00:00:01
   Allocating rank array
   Ranking v-sort output
   Ranking v-sort output time: 00:00:00
   Invoking Larsson-Sadakane on ranks
   Invoking Larsson-Sadakane on ranks time: 00:00:00
   Sanity-checking and returning
 Building samples
 Reserving space for 12 sample suffixes
 Generating random suffixes
 QSorting 12 sample offsets, eliminating duplicates
 QSorting sample offsets, eliminating duplicates time: 00:00:00
 Multikey QSorting 12 samples
   (Using difference cover)
   Multikey QSorting samples time: 00:00:00
 Calculating bucket sizes
 Splitting and merging
   Splitting and merging time: 00:00:00
 Avg bucket size: 1.73673e+06 (target: 2279456)
 Converting suffix-array elements to index image
 Allocating ftab, absorbFtab
 Entering GFM loop
 Getting block 1 of 7
   Reserving size (2279457) for bucket 1
   Calculating Z arrays for bucket 1
   Entering block accumulator loop for bucket 1:
   bucket 1: 10%
   bucket 1: 20%
   bucket 1: 30%
   bucket 1: 40%
   bucket 1: 50%
   bucket 1: 60%
   bucket 1: 70%
   bucket 1: 80%
   bucket 1: 90%
   bucket 1: 100%
   Sorting block of length 1917483 for bucket 1
   (Using difference cover)
   Sorting block time: 00:00:01
 Returning block of 1917484 for bucket 1
 Getting block 2 of 7
   Reserving size (2279457) for bucket 2
   Calculating Z arrays for bucket 2
   Entering block accumulator loop for bucket 2:
   bucket 2: 10%
   bucket 2: 20%
   bucket 2: 30%
   bucket 2: 40%
   bucket 2: 50%
   bucket 2: 60%
   bucket 2: 70%
   bucket 2: 80%
   bucket 2: 90%
   bucket 2: 100%
   Sorting block of length 1565107 for bucket 2
   (Using difference cover)
   Sorting block time: 00:00:00
 Returning block of 1565108 for bucket 2
 Getting block 3 of 7
   Reserving size (2279457) for bucket 3
   Calculating Z arrays for bucket 3
   Entering block accumulator loop for bucket 3:
   bucket 3: 10%
   bucket 3: 20%
   bucket 3: 30%
   bucket 3: 40%
   bucket 3: 50%
   bucket 3: 60%
   bucket 3: 70%
   bucket 3: 80%
   bucket 3: 90%
   bucket 3: 100%
   Sorting block of length 2077071 for bucket 3
   (Using difference cover)
   Sorting block time: 00:00:01
 Returning block of 2077072 for bucket 3
 Getting block 4 of 7
   Reserving size (2279457) for bucket 4
   Calculating Z arrays for bucket 4
   Entering block accumulator loop for bucket 4:
   bucket 4: 10%
   bucket 4: 20%
   bucket 4: 30%
   bucket 4: 40%
   bucket 4: 50%
   bucket 4: 60%
   bucket 4: 70%
   bucket 4: 80%
   bucket 4: 90%
   bucket 4: 100%
   Sorting block of length 1959963 for bucket 4
   (Using difference cover)
   Sorting block time: 00:00:00
 Returning block of 1959964 for bucket 4
 Getting block 5 of 7
   Reserving size (2279457) for bucket 5
   Calculating Z arrays for bucket 5
   Entering block accumulator loop for bucket 5:
   bucket 5: 10%
   bucket 5: 20%
   bucket 5: 30%
   bucket 5: 40%
   bucket 5: 50%
   bucket 5: 60%
   bucket 5: 70%
   bucket 5: 80%
   bucket 5: 90%
   bucket 5: 100%
   Sorting block of length 1858740 for bucket 5
   (Using difference cover)
   Sorting block time: 00:00:01
 Returning block of 1858741 for bucket 5
 Getting block 6 of 7
   Reserving size (2279457) for bucket 6
   Calculating Z arrays for bucket 6
   Entering block accumulator loop for bucket 6:
   bucket 6: 10%
   bucket 6: 20%
   bucket 6: 30%
   bucket 6: 40%
   bucket 6: 50%
   bucket 6: 60%
   bucket 6: 70%
   bucket 6: 80%
   bucket 6: 90%
   bucket 6: 100%
   Sorting block of length 630502 for bucket 6
   (Using difference cover)
   Sorting block time: 00:00:00
 Returning block of 630503 for bucket 6
 Getting block 7 of 7
   Reserving size (2279457) for bucket 7
   Calculating Z arrays for bucket 7
   Entering block accumulator loop for bucket 7:
   bucket 7: 10%
   bucket 7: 20%
   bucket 7: 30%
   bucket 7: 40%
   bucket 7: 50%
   bucket 7: 60%
   bucket 7: 70%
   bucket 7: 80%
  bucket 7: 90%
  bucket 7: 100%
  Sorting block of length 2148233 for bucket 7
  (Using difference cover)
  Sorting block time: 00:00:01
 Returning block of 2148234 for bucket 7
 Exited GFM loop
 fchr[A]: 0
 fchr[C]: 3766349
 fchr[G]: 6086925
 fchr[T]: 8404025
 fchr[$]: 12157105
 Exiting GFM::buildToDisk()
 Returning from initFromVector
 Wrote 8248454 bytes to primary GFM file: s288c-build.fna.1.ht2
 Wrote 3039284 bytes to secondary GFM file: s288c-build.fna.2.ht2
 Re-opening _in1 and _in2 as input streams
 Returning from GFM constructor
 Returning from initFromVector
 Wrote 5399069 bytes to primary GFM file: s288c-build.fna.5.ht2
 Wrote 3092708 bytes to secondary GFM file: s288c-build.fna.6.ht2
 Re-opening _in5 and _in5 as input streams
 Returning from HierEbwt constructor
 Headers:
     len: 12157105
     gbwtLen: 12157106
     nodes: 12157106
     sz: 3039277
     gbwtSz: 3039277
     lineRate: 6
     offRate: 4
     offMask: 0xfffffff0
     ftabChars: 10
     eftabLen: 0
     eftabSz: 0
     ftabLen: 1048577
     ftabSz: 4194308
     offsLen: 759820
     offsSz: 3039280
     lineSz: 64
     sideSz: 64
     sideGbwtSz: 48
     sideGbwtLen: 192
     numSides: 63319
     numLines: 63319
     gbwtTotLen: 4052416
     gbwtTotSz: 4052416
     reverse: 0
     linearFM: Yes
 Total time for call to driver() for forward index: 00:00:06
½àÈ÷½ª¤ï¤ê¡£

¥Þ¥Ã¥Ô¥ó¥°¤Ï
 hisat2 -p 32 --dta-cufflinks -x s288c-build.fna -1 paired_SRR453566_1.trim.fastq -2 paired_SRR453566_2.trim.fastq -S SRR453566.sam
 5115482 reads; of these:
   5115482 (100.00%) were paired; of these:
     323268 (6.32%) aligned concordantly 0 times
     4524989 (88.46%) aligned concordantly exactly 1 time
     267225 (5.22%) aligned concordantly >1 times
     ----
     323268 pairs aligned concordantly 0 times; of these:
       77115 (23.85%) aligned discordantly 1 time
     ----
     246153 pairs aligned 0 times concordantly or discordantly; of these:
       492306 mates make up the pairs; of these:
         341629 (69.39%) aligned 0 times
         135729 (27.57%) aligned exactly 1 time
         14948 (3.04%) aligned >1 times
 96.66% overall alignment rate
¤Ç½ª¤ï¤ê¡£

¤³¤³¤Ç¡¢samtools¤ò»È¤Ã¤Æsam¥Õ¥¡¥¤¥ë¤òbam¥Õ¥¡¥¤¥ë¤ËÊÑ´¹¤·¡¢¤«¤Ä¡¢position¤Çsort¤¹¤ë¡£

samtools/bcftools¤Î¥¤¥ó¥¹¥È¡¼¥ë¤Ï¡¢[[Samtools¤Î¥µ¥¤¥È]http://www.htslib.org/download/]]¤«¤é¥À¥¦¥ó¥í¡¼¥É¤·¡¢
 ./configure
 make
 sudo make install
¤Ç´°Î»¡£

samtools view¤ò»È¤Ã¤Æsam¥Õ¥¡¥¤¥ëSRR453566.sam ¤òbam¥Õ¥¡¥¤¥ë SR453566.bam¤ËÊÑ´¹¡£@32¤Ï32¥¹¥ì¥Ã¥É»ÈÍÑ¡£
 samtools view -@ 32 -b SRR453566.sam > SRR453566.bam
 Ëô¤Ï samtools view -@ 32 -b SRR453566.sam -o SRR453566.bam

samtools sort¤ò»È¤Ã¤Æbam¥Õ¥¡¥¤¥ëSRR453566.bam ¤ò¥½¡¼¥È¤·¤ÆSRR453566.sorted.bam¤Ë½ÐÎÏ
 samtools sort -@ 32 SRR453566.bam > SRR453566.sorted.bam 
    [bam_sort_core] merging from 0 files and 32 in-memory blocks...

***hisat2¤Î¥É¥­¥å¥á¥ó¥È¤Ç¤Ï¸å½èÍý¤È¤·¤Æ¡¡samtools mpileup¤·¤Æbcftools view¤Çɽ¼¨ [#ie39cac8]

 samtools mpileup -uvf s288c.fna SRR453566.sorted.bam > SRR453566.vcf
 [warning] samtools mpileup option `u` is functional, but deprecated. Please switch to using bcftools mpileup in future.
 [warning] samtools mpileup option `v` is functional, but deprecated. Please switch to using bcftools mpileup in future.
 [mpileup] 1 samples in 1 input files

Ʊ¤¸¤³¤È¤òbcftools¤Ç¤Ï
 bcftools mpileup -Ou -f s288c.fna SRR453566.sorted.bam > SRR453566.vcf
¤Ê¤ª¡¢¥Ñ¥é¥á¥¿ --threads 16 ¤Ê¤É¤Ï¼ÂºÝ¤Ë¤ÏÊÂÎ󲽤θú²Ì¤¬Ìµ¤«¤Ã¤¿¡£

[[samtools¤Çvariant call:https://qiita.com/motthy/items/38951f127dce26b22ce3]]¤Ç¤Ï
 bcftools mpileup -Ou -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam> | bcftools call -vmO z -o <study.vcf.gz>
¤È½ñ¤¤¤Æ¤¤¤ë¡£

[[bcftools¤Î¥Þ¥Ë¥å¥¢¥ë:https://samtools.github.io/bcftools/bcftools.html#call]]¤Ç¤Ïview¤ÎÂå¤ï¤ê¤Ëcall¤ò»È¤¨¤È¤·¤Æ¤ª¤ê¡¢¶ñÂÎŪ¤ÊÃê½Ð½èÍý¤ò¤É¤¦¤¹¤ë¤«Î㤬¤¤¤¯¤Ä¤«¤¢¤ë¡£

¹¹¤Ë¡¢ HISAT2¤Ç¤Ï¡¡
 bcftools view -bvcg SRR453566.vcf > SRR453566.raw.bcf¡¡
¤È¤¹¤ë¤È¤·¤Æ¤¤¤ë¤¬¡¢¥Ñ¥é¥á¥¿-bvgf¤Ï¥¨¥é¡¼¡£

²òÀâ¤ò¸«¤ë¤È¡¢ºÇ¶á¤Î¥Ð¡¼¥¸¥ç¥ó¤Ç¤Ïbcftools call¤Ë¤è¤Ã¤Æ¥³¡¼¥ë¤ò¼è½Ð¤¹¤é¤·¤¤¡£
 bcftools call -O v -v -c SRR453566.vcf > SRR453566.raw.bcf
¤¬Àµ²ò¤ÎÌÏÍÍ¡£ÆÀ¤é¤ì¤¿·ë²Ì¤Ï¡ÊÀèÆ¬¥³¥á¥ó¥È¥é¥¤¥ó¤ò½ü¤¯¤È¡Ë
 #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  SRR453566.sorted.bam
 NC_001133.9     137     .       C       CT      4.4191  .       INDEL;IDV=1;IMF=1;DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,1,0;MQ=60;FQ=-37.5262   GT:PL   0/1:40,3,0
 NC_001133.9     349     .       C       T       8.64911 .       DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,0,1;MQ=60;FQ=-29.9908     GT:PL   1/1:38,3,0
 NC_001133.9     11969   .       C       G       10.4247 .       DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,1,0;MQ=60;FQ=-29.9905     GT:PL   1/1:40,3,0
 NC_001133.9     11997   .       T       A       27.0206 .       DP=2;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,1,0;MQ=60;FQ=-29.9901     GT:PL   1/1:57,3,0
 NC_001133.9     12361   .       G       A       4.13164 .       DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,1,0;MQ=60;FQ=-29.9928     GT:PL   0/1:32,3,0
 NC_001133.9     12373   .       G       A       4.13164 .       DP=1;SGB=-0.379885;MQ0F=0;AF1=1;AC1=2;DP4=0,0,1,0;MQ=60;FQ=-29.9928     GT:PL   0/1:32,3,0
¤È¤¤¤Ã¤¿´¶¤¸¡£INFOÉôʬ¤Î²òÀâ¤Ï¥Ø¥Ã¥ÀÉôʬ¤Ë¤¢¤ëÄ̤ꡢ
 ##INFO=<ID=BQB,Number=1,Type=Float,Description="Mann-Whitney U test of Base Quality Bias (bigger is better)">
 ##INFO=<ID=MQSB,Number=1,Type=Float,Description="Mann-Whitney U test of Mapping Quality vs Strand Bias (bigger is better)">
 ##INFO=<ID=SGB,Number=1,Type=Float,Description="Segregation based metric.">
 ##INFO=<ID=MQ0F,Number=1,Type=Float,Description="Fraction of MQ0 reads (smaller is better)">
 ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 ##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
 ##INFO=<ID=AF2,Number=1,Type=Float,Description="Max-likelihood estimate of the first and second group ALT allele frequency (assuming HWE)">
 ##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
 ##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
 ##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
 ##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
 ##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
 ##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
 ##INFO=<ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases">

[[¥Ð¥ê¥¢¥ó¥È¥³¡¼¥ë·ë²Ì¤ÎVCF¥Õ¥©¡¼¥Þ¥Ã¥È - mac¤Ç¥¤¥ó¥Õ¥©¥Þ¥Æ¥£¥¯¥¹:http://kazumaxneo.hatenablog.com/entry/2017/06/02/140208]]


bcftools mpileup¤Ç¡¢Ê£¿ô¥µ¥ó¥×¥ë¤òʤ٤Ƥߤ롣[[¤³¤Î¥Ú¡¼¥¸:https://pepper.is.sci.toho-u.ac.jp/pepper/index.php?cmd=read&page=Python%A5%D0%A5%A4%A5%AA%2F%A5%C4%A1%BC%A5%EB%2FVariant-Calling&word=pyvcf]],  [[bcftools¤Î¥Ú¡¼¥¸:https://samtools.github.io/bcftools/howtos/variant-calling.html]]

 bcftools mpileup -Ou -f s288c.fna SRR453566.sorted.bam  SRR453567.sorted.bam  SRR453568.sorted.bam  SRR453569.sorted.bam  SRR453570.sorted.bam  SRR453571.sorted.bam | bcftools call -Ou -mv -o SRR453566_mpileup.call.vcf

¹¹¤Ë·ë²Ì¤òÆÉ¤á¤ë¤è¤¦¤Ë¤¹¤ë¡£
 bcftools view SRR453566_mpileup.call.vcf | vcf-annotate -f + > SRR453566_mpileup.call.annotate.vcf

¼¡¤Î¤è¤¦¤Ê·ë²Ì¤Ë¤Ê¤Ã¤¿¡£

 #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  SRR453566.sorted.bam    SRR453567.sorted.bam    SRR453568.sorted.bam    SRR453569.sorted.bam    SRR453570.sorted.bam    SRR453571.sorted.bam
 NC_001133.9     136     .       G       A       68      SnpGap  DP=4;VDB=0.00992398;SGB=-0.896883;MQ0F=0;AC=8;AN=8;DP4=0,0,4,0;MQ=60    GT:PL   1/1:23,3,0      1/1:24,3,0      ./.:0,0,0       1/1:23,3,0      1/1:23,3,0      ./.:0,0,0
 NC_001133.9     137     .       C       CT      135     PASS    INDEL;IDV=1;IMF=1;DP=4;VDB=0.00992398;SGB=-0.896883;MQ0F=0;AC=8;AN=8;DP4=0,0,4,0;MQ=60  GT:PL   1/1:40,3,0      1/1:40,3,0      ./.:0,0,0       1/1:40,3,0      1/1:40,3,0      ./.:0,0,0
 NC_001133.9     286     .       A       T       13.4171 MinAB;MinDP     DP=1;SGB=-0.045183;MQ0F=0;AC=2;AN=2;DP4=0,0,1,0;MQ=60   GT:PL   ./.:0,0,0       ./.:0,0,0       1/1:38,3,0      ./.:0,0,0       ./.:0,0,0       ./.:0,0,0
 NC_001133.9     305     .       C       G       10.6101 MinAB;MinDP     DP=1;SGB=-0.045183;MQ0F=0;AC=2;AN=2;DP4=0,0,1,0;MQ=60   GT:PL   ./.:0,0,0       ./.:0,0,0       1/1:35,3,0      ./.:0,0,0       ./.:0,0,0       ./.:0,0,0
 NC_001133.9     349     .       C       T       88      PASS    DP=3;VDB=0.102722;SGB=-0.471632;MQ0F=0;AC=6;AN=6;DP4=0,0,0,3;MQ=60      GT:PL   1/1:38,3,0      ./.:0,0,0       1/1:35,3,0      ./.:0,0,0       ./.:0,0,0       1/1:40,3,0
 NC_001133.9     373     .       T       C       9.69791 Qual;MinAB      DP=2;SGB=-0.045183;RPB=1;MQB=1;BQB=1;MQ0F=0;ICB=0.5;HOB=0.5;AC=2;AN=4;DP4=0,1,0,1;MQ=60 GT:PL   0/1:0,3,41      ./.:0,0,0       0/1:40,3,0      ./.:0,0,0       ./.:0,0,0       ./.:0,0,0
 NC_001133.9     457     .       CAAA    CAA     7.96917 Qual;MinAB;MinDP        INDEL;IDV=1;IMF=1;DP=1;SGB=-0.045183;MQ0F=0;AC=2;AN=2;DP4=0,0,0,1;MQ=60 GT:PL   ./.:0,0,0       ./.:0,0,0       ./.:0,0,0       ./.:0,0,0       1/1:32,3,0      ./.:0,0,0



[[¥µ¥ó¥×¥ë´Ö¤Ç¶¦Ä̤¹¤ëÊѰۤȸÇÍ­¤ÎÊѰۤòÃê½Ð¤¹¤ë - mac¤Ç¥¤¥ó¥Õ¥©¥Þ¥Æ¥£¥¯¥¹:http://kazumaxneo.hatenablog.com/entry/2017/06/06/165329]]

¤½¤ì¤Ç¡¢bam¥Õ¥¡¥¤¥ë¤ä¡¢callÃê½Ð¤·¤¿¸å¤Îvcf¥Õ¥¡¥¤¥ë¤Ë¡¢¥¤¥ó¥Ç¥Ã¥¯¥¹ÉÕ¤±¤¬É¬Íפˤʤ뤳¤È¤¬¤¢¤ë¡£¤¿¤È¤¨¤ÐIGV¤Çɽ¼¨¤¹¤ë¤È¤­¤Ê¤É¡£

¥¤¥ó¥Ç¥Ã¥¯¥¹ÉÕ¤±¤Ïbcftools¤Ç²Äǽ¤À¤¬¡¢ÆþÎϤ¿¤ëvcf¥Õ¥¡¥¤¥ë¤¬bgzip¤µ¤ì¤Æ¤¤¤Ê¤±¤ì¤Ð¤Ê¤é¤Ê¤¤¡£gdzip¼«ÂΤÏsamtools¤Îhtslib¤ÎÃæ¤ËÆþ¤Ã¤Æ¤¤¤Æ¡¢samtools¤òmake; make install¤·¤¿¤À¤±¤Ç¤ÏÆþ¤é¤Ê¤¤¡£make all all-htslib; make install install-htslib¡¡¤Î¤è¤¦¤ËÄɲÃmake¤¹¤ëɬÍפ¬¤¢¤ë¡£
 bgzip -c SRR453566.raw.bcf > SRR453566.raw.bcf.gz
 bcftools index SRR453566.raw.bcf.gz
¤³¤Î¾õÂÖ¤ÇIGV¤Ë¿©¤ï¤»¤ë¤³¤È¤¬¤Ç¤­¤ë¡£
 import igv
 # tracks¤ÇVCF¤ò»ØÄꤷ¤Æ¤ß¤ë
 ref = { 
     "id": "Saccha", 
     "name": "Saccha",
     "fastaURL": "files/s288c.fna",
     "indexURL": "files/s288c.fna.fai",
     "tracks": [ 
       { 
         "name": "My GFF", 
         "type": "annotation", 
         "format": "gff3", 
         #"displayMode": "EXPANDED", 
         "url": "files/s288c_e.gff", 
         "indexed": False, 
       },
       { 
         "name": "VCF", 
         "type": "variant", 
         "format": "vcf", 
         "displayMode": "EXPANDED", 
         "url": "files/SRR453566.raw.bcf.gz", 
         "indexed": True, 
       } 
     ] 
 }
 b = igv.Browser({"reference": ref})
 b.show()

***°äÅÁ¸¦¹Ö½¬²ñ»ñÎÁ¤Ç¤Ï¸å½èÍý¤È¤·¤ÆFeature Count¤Ç¿ô¤¨¤ë [#qaa4366c]

Feature Count¤ÎÁ°½èÍý¤È¤·¤Æ¡¢GeneID¤òÉÕÍ¿¤¹¤ë¡£

°äÅÁ¸¦¤ÎÄ󶡤¹¤ëpython¥×¥í¥°¥é¥à¡¡add_gene_id¡¡¤Ç½èÍý¡£¤½¤Î¤¿¤á¤Ë¡¢¥Ñ¥Ã¥±¡¼¥¸bcbio-gff¤¬É¬ÍפʤΤǡ¢pip¤Ç¥¤¥ó¥¹¥È¡¼¥ë
 pip install bcbio-gff

¤³¤ì¤Ç¡¢add_gene_id¤ò²ÔƯ¤Ç¤­¤ë¡£
 python add_gene_id s288c.gff s288c_e.gff

ÆÀ¤é¤ì¤¿s288c.e.gff¤Ë¤Ïgene_id¤¬Æþ¤Ã¤Æ¤¤¤ë¤Ï¤º¡£

¤³¤ì¤òFeature Count¤¹¤ë¡£[[biopapyrus¤ÎÀâÌÀ:https://bi.biopapyrus.jp/rnaseq/expression/featurecounts.html]]¡¡¤ò°úÍѤ¹¤ë¤È¡§
>featureCounts ¤Ï¥Þ¥Ã¥Ô¥ó¥°·ë²Ì¤Ç¤¢¤ë BAM ¤Þ¤¿¤Ï SAM ¥Õ¥¡¥¤¥ë¤òÆþÎϤȤ·¡¢³Æ feature Îΰè¡Êgene¡¢CDS¡¢exon¡Ë¤´¤È¤Ë¥ê¡¼¥É¤ò¥«¥¦¥ó¥È¤¹¤ë¥×¥í¥°¥é¥à¤Ç¤¢¤ë¡£featureCounts ¤Ï Subread ¥Ñ¥Ã¥±¡¼¥¸¤Ë´Þ¤Þ¤ì¤ë¥×¥í¥°¥é¥à¤Î°ì¤Ä¤Ç¤¢¤ë¡£featureCounts ¤òÍøÍѤ¹¤ë¤Ë¤Ï subread ¤ò¥¤¥ó¥¹¥È¡¼¥ë¤¹¤ëɬÍפ¬¤¢¤ë¡£
subread ¤Î¥½¡¼¥¹¥³¡¼¥É¤È¥³¥ó¥Ñ¥¤¥ë¤µ¤ì¤¿¥×¥í¥°¥é¥à¤Ï sourceforge ¤Ç¸ø³«¤µ¤ì¤Æ¤¤¤ë¡£É¬Íפ˱þ¤¸¤Æ¤É¤Á¤é¤«¤ò¥À¥¦¥ó¥í¡¼¥É¤·¤Æ»ÈÍѤ¹¤ë¡£

¤È¤¤¤¦¤³¤È¤Ç¡¢[[subread:http://subread.sourceforge.net/]]¤Î¥Ú¡¼¥¸¤«¤é¥À¥¦¥ó¥í¡¼¥É¡£

 featureCounts -p -T 8 -t exon -g gene_id -a $GFF \ 
  -o ../featurecount/counts.txt \ 
  SRR453566.sorted.bam \
  SRR453567.sorted.bam \ 
  SRR453568.sorted.bam \ 
  SRR453569.sorted.bam \ 
  SRR453570.sorted.bam \ 
  SRR453571.sorted.bam

$GFF¤Ï»²¾È¥·¡¼¥±¥ó¥¹¤ÎGene¾ðÊó¤ò´Þ¤à¥Õ¥¡¥¤¥ë¡£[[featureCount (Subread) ¤Î¥Þ¥Ë¥å¥¢¥ë:http://bioinf.wehi.edu.au/featureCounts/]]¤Ç¤Ï
 The annotation file should be in either GTF format or a simplified annotation format (SAF) 
 as shown below (columns are tab-delimited):
 GeneID	Chr	Start	End	Strand
 497097	chr1	3204563	3207049	-
 497097	chr1	3411783	3411982	-
 497097	chr1	3660633	3661579	-
 ...
¤È¤·¤Æ¤¢¤ê¡¢GTF¥Õ¥©¡¼¥Þ¥Ã¥È¤Ï[[Frequently Asked Questions: Data File Formats:https://genome.ucsc.edu/FAQ/FAQformat.html#format4]]¤ª¤è¤Ó[[GTF2.2: A Gene Annotation Format (Revised Ensembl GTF):http://mblab.wustl.edu/GTF22.html]]¤Ë¤è¤ë¤È
 GTF stands for Gene transfer format. It borrows from GFF, but has additional 
 structure that warrants a separate definition and format name.
 Structure is as GFF, so the fields are: 
 <seqname> <source> <feature> <start> <end> <score> <strand> <frame> [attributes] [comments]
 
 Here is a simple example with 3 translated exons. Order of rows is not important.
 
 381 Twinscan  CDS          380   401   .   +   0  gene_id "001"; transcript_id "001.1";
 381 Twinscan  CDS          501   650   .   +   2  gene_id "001"; transcript_id "001.1";
 381 Twinscan  CDS          700   707   .   +   2  gene_id "001"; transcript_id "001.1";
 381 Twinscan  start_codon  380   382   .   +   0  gene_id "001"; transcript_id "001.1";
 381 Twinscan  stop_codon   708   710   .   +   0  gene_id "001"; transcript_id "001.1";
 The whitespace in this example is provided only for readability. In GTF, fields must be separated by a single TAB and no white space.

¹¹¤Ë¡¢GFF¥Õ¥©¡¼¥Þ¥Ã¥È¤Ï[[Frequently Asked Questions: Data File Formats:https://genome.ucsc.edu/FAQ/FAQformat.html#format3]]¤Ë¤è¤ë¤È
 GFF (General Feature Format) lines are based on the Sanger GFF2 specification. 
 GFF lines have nine required fields that must be tab-separated. 
 If the fields are separated by spaces instead of tabs, the track will not display correctly. 
 For more information on GFF format, refer to Sanger's GFF page.
¤Þ¤¿¡¢¾åµ­¤Ç»²¾È¤·¤Æ¤¤¤ëSanger¤ÎGFF2/3¤ÎÀâÌÀ¤Ï¡¢[[GFF2:http://gmod.org/wiki/GFF2]]¤È[[GFF3:http://gmod.org/wiki/GFF3]]¤Ë¤¢¤ë¡£GFF2¤Ïobsolete¤È¤µ¤ì¡¢GFF3¤¬recommend¤µ¤ì¤Æ¤¤¤ë¡£¹¹¤Ë¡¢GFF3¤Ç¤Ï¥Õ¥¡¥¤¥ë¤ÎºÇ¸å¤Ë¥ª¥×¥·¥ç¥ó¤Çsequence¤¬½ñ¤±¤ë¡£

ưºî¤¹¤ë¤³¤È¤Î¥Æ¥¹¥È¤È¤·¤Æ¡¢SRR453566.sorted.bam¤Î£±¤Ä¤À¤±¤ò½èÍý¤·¤Æ¤ß¤ë¡£
 featureCounts -p -T 32 -t exon -g gene_id -a s288c_e.gff -o SRR435366.counts.txt SRR453566.sorted.bam

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
	  v1.6.3

 //========================== featureCounts setting ===========================\\
 ||                                                                            ||
 ||             Input files : 1 BAM file                                       ||
 ||                           P SRR453566.sorted.bam                           ||
 ||                                                                            ||
 ||             Output file : SRR435366.counts.txt                             ||
 ||                 Summary : SRR435366.counts.txt.summary                     ||
 ||              Annotation : s288c_e.gff (GTF)                                ||
 ||      Dir for temp files : ./                                               ||
 ||                                                                            ||
 ||                 Threads : 32                                               ||
 ||                   Level : meta-feature level                               ||
 ||              Paired-end : yes                                              ||
 ||      Multimapping reads : not counted                                      ||
 || Multi-overlapping reads : not counted                                      ||
 ||   Min overlapping bases : 1                                                ||
 ||                                                                            ||
 ||          Chimeric reads : counted                                          ||
 ||        Both ends mapped : not required                                     ||
 ||                                                                            ||
 \\===================== http://subread.sourceforge.net/ ======================//
 
 //================================= Running ==================================\\
 ||                                                                            ||
 || Load annotation file s288c_e.gff ...                                       ||
 ||    Features : 6761                                                         ||
 ||    Meta-features : 6420                                                    ||
 ||    Chromosomes/contigs : 17                                                ||
 ||                                                                            ||
 || Process BAM file SRR453566.sorted.bam...                                   ||
 ||    Paired-end reads are included.                                          ||
 ||    Assign alignments (paired-end) to features...                           ||
 ||                                                                            ||
 ||    WARNING: reads from the same pair were found not adjacent to each       ||
 ||             other in the input (due to read sorting by location or         ||
 ||             reporting of multi-mapping read pairs).                        ||
 ||                                                                            ||
 ||    Pairing up the read pairs.                                              ||
 ||                                                                            ||
 ||    Total alignments : 5650830                                              ||
 ||    Successfully assigned alignments : 4568783 (80.9%)                      ||
 ||    Running time : 0.03 minutes                                             ||
 ||                                                                            ||
 ||                                                                            ||
 || Summary of counting results can be found in file "SRR435366.counts.txt.su  ||
 || mmary"                                                                     ||
 ||                                                                            ||
 \\===================== http://subread.sourceforge.net/ ======================//

½ÐÎϤϡÊÀèÆ¬¡Ë
 # Program:featureCounts v1.6.3; Command:"featureCounts" "-p" "-T" "32" "-t" "exon" "-g" "gene_id" "-a" "s288c_e.gff" "-o" "SRR435366.counts.txt" "SRR453566.sorted.bam"
 Geneid  Chr     Start   End     Strand  Length  SRR453566.sorted.bam
 gene_0001       NC_001133.9     1807    2169    -       363     0
 gene_0002       NC_001133.9     2480    2707    +       228     0
 gene_0003       NC_001133.9     7235    9016    -       1782    0
 gene_0004       NC_001133.9     11565   11951   -       387     0
 gene_0005       NC_001133.9     12046   12426   +       381     2
 gene_0006       NC_001133.9     13363   13743   -       381     0
 gene_0007       NC_001133.9     21566   21850   +       285     0
 gene_0008       NC_001133.9     22395   22685   -       291     0
 gene_0009       NC_001133.9     24000   27968   -       3969    32
 gene_0010       NC_001133.9     31567   32940   +       1374    65
 gene_0011       NC_001133.9     33448   34701   +       1254    119
 gene_0012       NC_001133.9     35155   36303   +       1149    1155

¡ä¡ä°äÅÁ¸¦¥»¥ß¥Ê¡¼¤Î¤³¤ÎÀè¤Î½èÍý¤Ï[[RNAseq­£-0¡Ê°äÅÁ¸¦¥»¥ß¥Ê¡¼¤Ç¤Î¡Ëȯ¸½²òÀÏ¡Á¥«¥¦¥ó¥È¥Ç¡¼¥¿¤Î²òÀÏ>Python¥Ð¥¤¥ª/¥Ä¡¼¥ë/RNAseq­£-0¡Ê°äÅÁ¸¦¥»¥ß¥Ê¡¼¤Ç¤Î¡Ëȯ¸½²òÀÏ¡Á¥«¥¦¥ó¥È¥Ç¡¼¥¿¤Î²òÀÏ]]

//=================================

***Ê̤λñÎÁ¤Ë¤è¤ë¤È¡¢¥Ñ¥¤¥×¥é¥¤¥ó¤È¤·¤Æ¡¢HISAT2¢ÍStringTie¢Í [#e7486864]
***Ê̤λñÎÁ¤Ë¤è¤ë¤È¡¢¥Ñ¥¤¥×¥é¥¤¥ó¤È¤·¤Æ¡¢HISAT2¢ÍStringTie¢Í Ballgawn [#e7486864]

Johns Hopkins¤«¤éÏÀʸ
[[Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown.  Nature Protocols volume 11, pages 1650&#8211;1667 (2016):https://www.nature.com/articles/nprot.2016.095]]

- [[StringTie:https://ccb.jhu.edu/software/stringtie/index.shtml]]  is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the alignments of raw reads used by other transcript assemblers, but also alignments longer sequences that have been assembled from those reads.In order to identify differentially expressed genes between experiments, StringTie's output can be processed by specialized software like Ballgown, Cuffdiff or other programs (DESeq2, edgeR, etc.).
-¥Þ¥Ë¥å¥¢¥ë¤Ï[[manual:https://ccb.jhu.edu/software/stringtie/index.shtml?t=manual]]
HISAT2¤Î½ÐÎϤò»È¤¦¾ì¹ç¡¢be sure to run HISAT2 with the --dta option for alignmen~

-StringTie¤ÎHP¤Ç¾È²ñ¤µ¤ì¤Æ¤¤¤ë[[gffcompare:https://ccb.jhu.edu/software/stringtie/gff.shtml#gffcompare]]¤Ï¡¢~
The program gffcompare can be used to compare, merge, annotate and estimate accuracy of one or more GFF files (the "query" files), when compared with a reference annotation (also provided as GFF/GTF).

- [[NGS¥Ç¡¼¥¿¤«¤é¿·¤¿¤ÊÃ챤òƳ½Ð¤¹¤ë¤¿¤á¤Î¥Ç¡¼¥¿²òÀÏ¥ê¥Æ¥é¥·¡¼ Åý¹çTV(togotv)¡ÃÀ¸Ì¿²Ê³Ø·ÏDB¡¦¥Ä¡¼¥ë»È¤¤Åݤ··Ï¥Á¥ã¥ó¥Í¥ë:https://togotv.dbcls.jp/ajacs2018008.html]]¤Î¡Ö¥Ç¡¼¥¿²òÀϡפιब»²¹Í¤Ë¤Ê¤ê¤½¤¦¡£

-Cufflinks²ó¤ê ¡Á °ÊÁ°¤Ïcufflinks¤ò»È¤Ã¤Æ¤¤¤¿¤é¤·¤¤
--[[¼¡À¤Â奷¡¼¥¯¥¨¥ó¥µ¡¼DRY²òÀ϶µËܤˤè¤ë°äÅÁ»Òȯ¸½²òÀÏ&#12316;Cufflinks¡¢cummeRbund¡¢DAVID&#12316;@AJACSa»°Åç2 Åý¹çTV(togotv)¡ÃÀ¸Ì¿²Ê³Ø·ÏDB¡¦¥Ä¡¼¥ë»È¤¤Åݤ··Ï¥Á¥ã¥ó¥Í¥ë:https://togotv.dbcls.jp/ajacs2016010.html]]¤Ïcufflinks¤ò»È¤Ã¤¿²òÀâ
--[[Cufflinks | RNA-seq ¤òÍøÍѤ·¤¿°äÅÁ»Òȯ¸½²òÀÏ:https://bi.biopapyrus.jp/rnaseq/expression/cufflinks.html]]
--¤½¤Î¼êÁ°¤Î³µÏÀ¡¡[[ȯ¸½Î̲òÀÏ | RNA-Seq ¤òÍøÍѤ·¤¿È¯¸½ÊÑÆ°°äÅÁ»Ò¤Î¸¡½Ð:https://bi.biopapyrus.jp/rnaseq/analysis/]]

-¡Ê[[¥³¥ó¥»¥ó¥µ¥¹ÇÛÎó¤ÎºîÀ® | VCF ¤ÎÊѰ۾ðÊó¤Ë´ð¤Å¤¤¤Æ¥ê¥Õ¥¡¥ì¥ó¥¹¤Î±ö´ðÇÛÎó¤òÃÖ´¹¤·¥³¥ó¥»¥ó¥µ¥¹ÇÛÎó¤òºîÀ®ÊýË¡:https://bi.biopapyrus.jp/gwas/vcf-consensus-fasta.html]]¡Ë

¤½¤ì¤Ç¡¢HISAT2¤Ç¥ê¡¼¥É¤ò¥¢¥»¥ó¥Ö¥ë¤·¤¿·ë²Ì¤Îbam¥Õ¥¡¥¤¥ë¤òStringTie¤Ç¥¢¥»¥ó¥Ö¥ë¤¹¤ë¡£

StringTie¤Î¥¤¥ó¥¹¥È¡¼¥ë¤Ï¡¢[[¥Þ¥Ë¥å¥¢¥ë:https://ccb.jhu.edu/software/stringtie/index.shtml?t=manual]]Ä̤ꡣ¥½¡¼¥¹¤ò¥À¥¦¥ó¥í¡¼¥É¤·¡¢make¡£¥¤¥ó¥¹¥È¡¼¥ë¤Ï¼«Ê¬¤Ç/usr/local/bin/stringtie¤«¤é¥ê¥ó¥¯¤òÄ¥¤ì¤Ð½ª¤ï¤ê¡£

½èÍý¤Ï¡¢¤¢¤é¤«¤¸¤áHISAT2¤Çsorted.bam¤¬¤Ç¤­¤Æ¤¤¤ë¤È¤·¤Æ¡¢¡ÊHISAT2¤Ç--dta¤ò»ØÄꤹ¤ë¤³¤È¡Ë
 stringtie -B -p 16 -o SRR453566.gtf -G s288c_e.gff SRR453566.sorted.bam
¤³¤³¤Ç¡¢-B ¤Ï½ÐÎϤòBallgown¤Ç»È¤¦¤¿¤á¤ÎÀßÄê¡¢-G s288c_e.gff ¤Ïreference annotation file (in GTF or GFF3 format)¡¢-p 16¤Ï¥¹¥ì¥Ã¥É¿ô¤Ç¡¢½ÐÎϤÏ-o SRR453566.gtf ¤Ç»ØÄê¡£

-B»ØÄê¤ò¤·¤¿¤È¤­¤Î½ÐÎϤϡ¢SRR453566.gtf¤Ë²Ã¤¨¤Æ¡¢
        e2t.ctab
        e_data.ctab
        i2t.ctab
        i_data.ctab
        t_data.ctab
¤Ç¤¢¤ê¡¢¤³¤ì¤òBallgawn¤Ë¿©¤ï¤»¤ëɬÍפ¬¤¢¤ë¡£

¤³¤Îstringtie¤Î½ÐÎϤò¡¢Ballgawn¤ÇÀ°Íý¤·¤Æ²Ä»ë²½¤¹¤ë¡£~
A program for computing differentially expressed genes in two or more RNA-seq experiments, using the output of StringTie or Cufflinks. The Ballgown package provides functions to organize, visualize, and analyze expression measurements. Ballgown is written in R and is part of Bioconductor.

[[GitHub - alyssafrazee/ballgown: Bioconductor package "ballgown":https://github.com/alyssafrazee/ballgown]]


[[HISAT-StringTie-Ballgown ¤ò»î¤·¤Æ¤ß¤è¤¦ - Palmsonntagmorgen:http://rnakato.hatenablog.jp/entry/2018/11/09/165200]]¡¡¤³¤ì¤Ç¤ÏÏÀʸ¤Ç¤Ïoptional¤È¤Ê¤Ã¤Æ¤¤¤Þ¤¹¤¬¡¢gffcompare¤ò»È¤Ã¤Æ¡¢À¸À®¤µ¤ì¤¿°äÅÁ»Ò¥ê¥¹¥È¤Èreference¤Î°äÅÁ»Ò¥ê¥¹¥È¤òÈæ³Ó¤¹¤ë¤³¤È¤¬¤Ç¤­¤Þ¤¹¡£

³¤­¤È¤·¤Æ[[HISAT-StringTie-Ballgown ¤ò»î¤·¤Æ¤ß¤è¤¦ ¤½¤Î2 - Palmsonntagmorgen:http://rnakato.hatenablog.jp/entry/2018/11/26/145847]]

¥È¥Ã¥×   ÊÔ½¸ º¹Ê¬ ¥Ð¥Ã¥¯¥¢¥Ã¥× źÉÕ Ê£À½ ̾Á°Êѹ¹ ¥ê¥í¡¼¥É   ¿·µ¬ °ìÍ÷ ñ¸ì¸¡º÷ ºÇ½ª¹¹¿·   ¥Ø¥ë¥×   ºÇ½ª¹¹¿·¤ÎRSS