Notes

## dN/dS (also known as Kn/Ks)

Goldman and Yang 1994: A Codon-based Model of Nucleotide Substitution for Protein-coding DNA Sequences

### Based on the National Genomics Data Center (Beijing) page

#### Methods for Calculating Ka and Ks

Calculating Ka and Ks normally involves three steps. Assume that the two DNA sequences being compared have length n and differ by m substitutions. To calculate Ka and Ks, we count the numbers of synonymous (S) and nonsynonymous (N) sites (S + N = n) and the numbers of synonymous (Sd) and nonsynonymous (Nd) substitutions (Sd + Nd = m). Because the observed number of substitutions increasingly underestimates the true number as sequences diverge over time, Nd/N and Sd/S represent Ka and Ks, respectively, only after correction for multiple substitutions at the same site. The three steps are therefore: counting S and N, counting Sd and Nd, and correcting for multiple substitutions.
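The third step can be illustrated with a minimal sketch. It assumes the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), which is the correction used by the NG method; other methods apply different corrections. The function names and the example counts are hypothetical and are not taken from KaKs_Calculator.

```python
import math


def jukes_cantor_distance(p):
    """Correct an observed proportion of differences p for multiple substitutions.

    Assumption: Jukes-Cantor model, d = -3/4 * ln(1 - 4p/3).
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks_from_counts(S, N, Sd, Nd):
    """Turn site counts (S, N) and substitution counts (Sd, Nd) into (Ka, Ks)."""
    ps = Sd / S  # observed proportion of synonymous differences
    pn = Nd / N  # observed proportion of nonsynonymous differences
    return jukes_cantor_distance(pn), jukes_cantor_distance(ps)


# Hypothetical counts for a 100-codon alignment (S + N = 300 sites).
ka, ks = ka_ks_from_counts(S=70.0, N=230.0, Sd=14.0, Nd=11.5)
print(f"Ka = {ka:.4f}, Ks = {ks:.4f}, Ka/Ks = {ka / ks:.4f}")
```

The corrected Ks is larger than the observed proportion Sd/S, which reflects the underestimation described above.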

Methods for calculating Ka and Ks adopt different substitution models, with subtle yet significant differences among them. They can be classified as approximate methods and maximum-likelihood methods. Unlike approximate methods, maximum-likelihood methods use probability theory to carry out all three steps mentioned above simultaneously.

#### Approximate Methods

Several approximate methods are incorporated into KaKs_Calculator. Their abbreviations in the program and the corresponding references are listed below.

• NG: Nei, M. and Gojobori, T. (1986)
• LWL: Li, W.H., et al. (1985)
• LPB: Li, W.H. (1993) and Pamilo, P. and Bianchi, N.O. (1993)
• MLWL (Modified LWL), MLPB (Modified LPB): Tzeng, Y.H., et al. (2004)
• YN: Yang, Z. and Nielsen, R. (2000)
• MYN (Modified YN): Zhang, Z., et al. (2006)
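As an illustration of the first step (counting S and N), here is a minimal sketch in the spirit of the NG method: for each codon position, the number of synonymous sites is the fraction of the three possible single-nucleotide changes that leave the amino acid unchanged. It assumes the standard genetic code and an in-frame sequence without stop codons, and it counts changes to stop codons as nonsynonymous; implementations differ on that last point. The function names are hypothetical.

```python
from fractions import Fraction

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_sites(codon):
    """Number of synonymous sites (between 0 and 3) in one sense codon."""
    amino_acid = CODON_TABLE[codon]
    s = Fraction(0)
    for pos in range(3):
        synonymous = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == amino_acid:
                synonymous += 1
        s += Fraction(synonymous, 3)  # fraction of the 3 possible changes
    return s


def count_sites(seq):
    """Return (S, N) for an in-frame coding sequence; S + N = 3 * number of codons."""
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    S = sum((synonymous_sites(c) for c in codons), Fraction(0))
    N = 3 * len(codons) - S
    return S, N


S, N = count_sites("ATGTTAGGG")
print(f"S = {S}, N = {N}")
```

For a pair of sequences, an approximate method would typically average S and N over the two sequences before going on to count Sd and Nd.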

#### Maximum-Likelihood Methods

The GY method accounts for features of sequence evolution, such as the transition/transversion rate ratio and nucleotide frequencies (as reflected in the HKY model), and incorporates them into a codon-based model. We extend this method to a set of candidate models in a maximum-likelihood framework and use the corrected Akaike information criterion (AICc) for model selection and model averaging.

• GY: Goldman, N. and Yang, Z. (1994)
• MS (Model Selection), MA (Model Averaging): based on a set of candidate models defined by Posada, D. (2003) as follows.

| Model | Substitution Rates | Nucleotide Frequency |
| --- | --- | --- |
| JC / F81 | rTC=rAG=rTA=rCG=rTG=rCA | Equal/Unequal |
| K2P / HKY | rTC=rAG≠rTA=rCG=rTG=rCA | Equal/Unequal |
| TrNEF / TrN | rTC≠rAG≠rTA=rCG=rTG=rCA | Equal/Unequal |
| K3P / K3PUF | rTC=rAG≠rTA=rCG≠rTG=rCA | Equal/Unequal |
| TIMEF / TIM | rTC≠rAG≠rTA=rCG≠rTG=rCA | Equal/Unequal |
| TVMEF / TVM | rTC=rAG≠rTA≠rCG≠rTG≠rCA | Equal/Unequal |
| SYM / GTR | rTC≠rAG≠rTA≠rCG≠rTG≠rCA | Equal/Unequal |

rij: substitution rate between nucleotides i and j, where i ≠ j and i, j ∈ {A, C, G, T}
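Model selection and model averaging with the AICc can be sketched as follows. This is a generic illustration and not the KaKs_Calculator implementation: it uses the standard definitions AICc = -2 lnL + 2k + 2k(k + 1)/(n - k - 1) and Akaike weights proportional to exp(-Δ/2). The model labels, log-likelihoods, parameter counts, sample size, and Ka/Ks estimates below are all made up for the example.

```python
import math


def aicc(log_likelihood, k, n):
    """Corrected AIC for a model with k free parameters fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


def akaike_weights(scores):
    """Map {model: AICc} to {model: weight}; the weights sum to 1."""
    best = min(scores.values())
    relative = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


def model_average(estimates, weights):
    """Weighted mean of a per-model estimate, using Akaike weights."""
    return sum(weights[m] * estimates[m] for m in estimates)


# Hypothetical fits: model -> (log-likelihood, number of parameters, Ka/Ks estimate).
fits = {
    "model_A": (-1520.4, 2, 0.21),
    "model_B": (-1512.9, 3, 0.24),
    "model_C": (-1511.8, 7, 0.25),
}
n_sites = 300  # assumed sample size for the AICc correction

scores = {m: aicc(lnl, k, n_sites) for m, (lnl, k, _) in fits.items()}
weights = akaike_weights(scores)
selected = min(scores, key=scores.get)  # model selection: lowest AICc
averaged = model_average({m: est for m, (_, _, est) in fits.items()}, weights)
print(selected, round(averaged, 4))
```

Model selection keeps only the lowest-AICc model, whereas model averaging combines the estimates from all candidate models in proportion to their weights.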

### Calculation programs

Last-modified: 2020-02-16 (Sun) 12:15:04