Codon usage bias. Ref: Chapter 9. Xuhua Xia, http://dambe.bio.uottawa.ca



  • Objectives
    Understand how codon usage bias affects translation efficiency and gene expression.
    Biomedical relevance:
      Protein drugs in the pharmaceutical industry
      Transgenic experiments in agriculture
    Factors affecting codon usage bias.
    Indices measuring codon usage bias.
    Develop bioinformatic skills to study genomic codon usage.

  • Codon Usage Bias
    Observation: strongly biased codon usage in a wide variety of genomes, including those of viruses, mitochondria, plastids, prokaryotes and eukaryotes.
    Hypotheses:
      Differential mutation hypothesis, e.g., the transcriptional hypothesis of codon usage (Xia 1996 Genetics 144:1309-1320).
      Differential selection hypothesis, e.g., Xia 1998 Genetics 149:37-44.
    Predictions:
      From the mutation hypothesis: concordance between codon usage and mutation pressure.
      From the selection hypothesis: concordance between differential availability of tRNA and differential codon usage. The concordance is stronger in highly expressed genes than in lowly expressed genes (CAI is positively correlated with gene expression).

  • Table 9-2, yeast. Xia 2007. Bioinformatics and the cell.

    AA(1)  Codon(2)  T(3)  w(4)   F(5)
    Arg    AGA       11    1      314
    Arg    AGG       1     0.091  1
    Asn    AAC       10    1      208
    Asn    AAU       0     0      11
    Asp    GAC       16    1      202
    Asp    GAU       0     0      112
    Cys    UGC       4     1      3
    Cys    UGU       0     0      39
    Gln    CAA       9     1      153
    Gln    CAG       1     0.111  1
    Glu    GAA       14    1      305
    Glu    GAG       2     0.143  5
    His    CAC       7     1      102
    His    CAU       0     0      25
    Leu    UUA       7     0.7    42
    Leu    UUG       10    1      359
    Lys    AAA       7     0.5    65
    Lys    AAG       14    1      483
    Phe    UUC       10    1      168
    Phe    UUU       0     0      19
    Ser    AGC       2     1      6
    Ser    AGU       0     0      4
    Tyr    UAC       8     1      141
    Tyr    UAU       0     0      10
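In Table 9-2 the w column matches T divided by the largest T within the same amino acid family (for example 1/11 = 0.091 for AGG). The sketch below recomputes w that way for a few families. Reading T as a tRNA-related count is an assumption, since the slide does not define its columns, and the function name is hypothetical.

```python
# Column T of Table 9-2 for a few two-codon families (values copied from the table).
T = {
    "Arg": {"AGA": 11, "AGG": 1},
    "Gln": {"CAA": 9, "CAG": 1},
    "Glu": {"GAA": 14, "GAG": 2},
    "Leu": {"UUA": 7, "UUG": 10},
    "Lys": {"AAA": 7, "AAG": 14},
}


def relative_weights(family):
    """Scale each T value by the largest T in its family (assumed definition of w)."""
    top = max(family.values())
    return {codon: round(t / top, 3) for codon, t in family.items()}


w = {aa: relative_weights(family) for aa, family in T.items()}
for aa, weights in w.items():
    print(aa, weights)
```

The rounded results can be compared line by line with column w(4) of the table.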

  • Conflict: Initiation and Elongation
    Met codon usage from the 12 CDSs: AUA 214, AUG 37.
    Possible tRNAs: tRNAMet/CAU and tRNAMet/UAU.
    The vertebrate mitochondrial genome has only one tRNAMet. Which one should it have?
      tRNAMet/CAU: good for initiation, but not efficient for AUA codons, even with the C modified to 5-formylcytidine.
      tRNAMet/UAU: good for AUA codons, but not good for initiation.
    Anticodon CAU favours the AUG codon.
    Nature has chosen CAU: all mitochondrial genomes with a single tRNAMet have a CAU anticodon.
    Is there a problem with AUA codons in translation?
    Xia et al. 2007. PLoS One

  • Hypothesis and Predictions
    AUA codons (Met family): favoured by mutation, but not by tRNA-mediated selection, because the first (wobble) position in the tRNA anticodon is C.
    A-ending codons in other R-ending families: favoured by mutation and also by tRNA-mediated selection, because the first (wobble) position of the tRNA anticodon is U.
    Predictions:
      1. The proportion of A-ending codons (or their RSCU) should be smaller in the Met codon family than in other R-ending codon families, where P_NNA = N_NNA / (N_NNA + N_NNG).
      2. Availability of tRNAMet/UAU should increase P_AUA.
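A minimal sketch of the proportion used in prediction 1, assuming it is read as the count of the A-ending codon divided by the combined count of the A-ending and G-ending codons of the same family. The codon counts are taken from the codon table on the RSCU slide of this transcript; the function name is hypothetical.

```python
def proportion_a_ending(n_nna, n_nng):
    """Proportion of A-ending codons within an NNA/NNG codon pair."""
    total = n_nna + n_nng
    if total == 0:
        raise ValueError("codon pair has no observations")
    return n_nna / total


# (count of NNA, count of NNG), copied from the table on the RSCU slide.
counts = {
    "Met (AUA/AUG)": (218, 44),
    "Leu (UUA/UUG)": (110, 16),
    "Lys (AAA/AAG)": (90, 11),
    "Gln (CAA/CAG)": (79, 8),
    "Trp (UGA/UGG)": (92, 12),
}

p = {family: proportion_a_ending(a, g) for family, (a, g) in counts.items()}
for family, value in p.items():
    print(f"{family}: {value:.3f}")
```

Comparing the Met value with the other families is the kind of check prediction 1 calls for; a single genome is only an illustration, not a test of the hypothesis.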

  • Selection against AUA codons. Carullo, M. and Xia, X. 2008 J Mol Evol 66:484-493.

                       Met    Leu    Glu    Lys    Gln    Arg    Trp
    Species            AUA    UUA    GAA    AAA    CAA    AGA    UGA
    A. gossypii        1.473  1.993  1.826  1.852  1.917  2      2
    C. glabrata        1.043  1.995  2.000  1.938  1.889  2      2
    K. thermotolerans  0.556  1.973  1.910  1.948  1.945  2      1.967
    S. cerevisiae      1.140  1.969  1.800  1.883  1.794  1.947  1.908
    S. castelli        1.299  1.994  1.891  1.981  1.969  2      1.918
    S. servazzii       1.321  1.931  1.702  1.824  1.841  1.959  2
    Y. lipolytica      1.440  1.968  1.536  1.859  1.963  1.922  1.882

  • Xia, X. 2012. In: RS Singh et al. Evolution in the fast lane: Rapidly evolving genes and genetic systems. Oxford University Press.
    Fig. 5. Relationship between P_AUA and P_UUA, highlighting the observation that P_AUA is greater when both a tRNAMet/CAU and a tRNAMet/UAU are present than when only tRNAMet/CAU is present in the mtDNA, for bivalve species (a) and chordate species (b). The filled squares are for mtDNA containing both tRNAMet/CAU and tRNAMet/UAU genes, and the open triangles are for mtDNA without a tRNAMet/UAU gene.

  • [Figure: P_AUA plotted against P_UUA. Panel (a): bivalves. Panel (b): chordates (lancelets and tunicates).]

    Data for panel (a), bivalve mtDNA (values on a 0-100 scale):

    Species                    Accession  tRNAMet anticodons  P_UUA  P_AUA
    Acanthocardia tuberculata  NC_008452  CAU/CAU             65.65  56.28
    Hiatella arctica           NC_008451  CAU/CAU             73.96  60.34
    Crassostrea virginica      NC_007175  CAU/CAU             63.18  47.34
    C. gigas                   NC_001276  CAU/CAU             64.83  46.89
    V. philippinarum           NC_003354  CAU/CAU             80.18  74.07
    Placopecten magellanicus   NC_007234  CAU/CAU             37.5   33.99
    Mytilus trossulus          NC_007687  CAU/UAU             58.66  60.61
    M. galloprovincialis       NC_006886  CAU/UAU             63.83  65.02
    M. edulis                  NC_006161  CAU/UAU             63.12  65.3

    Regression of P_AUA on P_UUA, Group and GX for the bivalve data (Group = 1 for the three CAU/UAU genomes, 0 otherwise; GX = Group × P_UUA):

    Multiple R 0.9375, R Square 0.8790, Adjusted R Square 0.8064, Standard Error 5.3288, Observations 9

    ANOVA       df  SS         MS        F        Significance F
    Regression  3   1031.1996  343.7332  12.1050  0.0099
    Residual    5   141.9801   28.3960
    Total       8   1173.1797

    Term       Coefficient  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
    Intercept  -2.3701      10.6992         -0.2215  0.8335   -29.8734   25.1332
    P_UUA      0.8646       0.1631          5.3000   0.0032   0.4453     1.2839
    Group      8.8782       83.9268         0.1058   0.9199   -206.8625  224.6188
    GX         0.0589       1.3544          0.0435   0.9670   -3.4227    3.5404

    Regression of P_AUA on P_UUA and Group (interaction term dropped):

    Multiple R 0.9375, R Square 0.8789, Adjusted R Square 0.8386, Standard Error 4.8654, Observations 9

    ANOVA       df  SS         MS        F        Significance F
    Regression  2   1031.1460  515.5730  21.7796  0.0018
    Residual    6   142.0337   23.6723
    Total       8   1173.1797

    Term       Coefficient  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
    Intercept  -2.4249      9.7007          -0.2500  0.8109   -26.1617   21.3118
    P_UUA      0.8655       0.1479          5.8531   0.0011   0.5037     1.2273
    Group      12.5226      3.4578          3.6215   0.0111   4.0616     20.9836

    Data for panel (b), chordate mtDNA (values on a 0-1 scale):

    Species                    Accession  tRNAMet anticodons  P_UUA   P_AUA
    Branchiostoma belcheri     NC_004537  CAU                 0.7198  0.2919
    Branchiostoma floridae     NC_000834  CAU                 0.7207  0.3134
    Branchiostoma lanceolatum  NC_001912  CAU                 0.7268  0.3119
    Epigonichthys maldivensis  NC_006465  CAU                 0.7360  0.3260
    Aplidium conicum           NC_013584  CAU/UAU             0.7741  0.7385
    Ciona intestinalis         NC_004447  CAU/UAU             0.9256  0.8741
    Ciona savignyi             NC_004570  CAU/UAU             0.7511  0.8068
    Clavelina lepadiformis     NC_012887  CAU/UAU             0.8652  0.8125
    Diplosoma listerianum      NC_013556  CAU/UAU             0.9126  0.9320
    Doliolum nationalis        NC_006627  CAU/UAU             0.7104  0.7065
    Halocynthia roretzi        NC_002177  CAU/UAU             0.6071  0.6568
    Herdmania momus            NC_013561  CAU/UAU             0.3926  0.3617
    Microcosmus sulcatus       NC_013752  CAU/UAU             0.7129  0.5861
    Phallusia fumigata         NC_009834  CAU/UAU             0.4045  0.4571
    Phallusia mammillata       NC_009833  CAU/UAU             0.4908  0.4072
    Styela plicata             NC_013565  CAU/UAU             0.6445  0.5273

    Chart3

    0.29186602870.7198177677

    0.31343283580.7206982544

    0.31188118810.7268170426

    0.32596685080.736

    0.7741228070.7385321101

    0.92558139530.8740740741

    0.75112107620.8068181818

    0.86522911050.8125

    0.91262135920.932

    0.71038251370.7064676617

    0.60706860710.6568047337

    0.39256198350.3617021277

    0.71285140560.5860805861

    0.4044943820.4571428571

    0.49082568810.4071856287

    0.64450127880.5272727273

    PAUA

    PAUA

    PUUA

    PAUA

    Bivalve

    PUUAPAUAPAUA

    Acanthocardia tuberculataNC_008452CAU/CAU65.6556.28

    Hiatella arcticaNC_008451CAU/CAU73.9660.34

    Crassostrea virginicaNC_007175CAU/CAU63.1847.34

    C. gigasNC_001276CAU/CAU64.8346.89

    V.philippinarumNC_003354CAU/CAU80.1874.07

    Placopecten magellanicusNC_007234CAU/CAU37.533.99

    Mytilus trossulusNC_007687CAU/UAU58.6660.61

    M. galloprovincialisNC_006886CAU/UAU63.8365.02

    M. edulisNC_006161CAU/UAU63.1265.3

    PAUAPUUAGroupGXSUMMARY OUTPUT

    56.2865.6500

    60.3473.9600Regression Statistics

    47.3463.1800Multiple R0.9375384846

    46.8964.8300R Square0.8789784101

    74.0780.1800Adjusted R Square0.8063654561

    33.9937.500Standard Error5.3287910681

    60.6158.66158.66Observations9

    65.0263.83163.83

    65.363.12163.12ANOVA

    dfSSMSFSignificance F

    Regression31031.1996176508343.733205883612.10498075140.0099200083

    Residual5141.980071238128.3960142476

    Total81173.1796888889

    CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%

    Intercept-2.37010223810.6992462076-0.22152048770.8334500435-29.873390184225.1331857082-29.873390184225.1331857082

    PUUA0.86460060580.16313122395.30003138070.0031919730.4452584451.28394276670.4452584451.2839427667

    Group8.878155462983.92676456220.10578455530.9198662154-206.8624609657224.6187718914-206.8624609657224.6187718914

    GX0.05887248471.35437475060.04346838620.9670106854-3.42265864663.5404036159-3.42265864663.5404036159

    SUMMARY OUTPUT

    Regression Statistics

    Multiple R0.9375140938

    R Square0.878932676

    Adjusted R Square0.8385769013

    Standard Error4.8654175142

    Observations9

    ANOVA

    dfSSMSFSignificance F

    Regression21031.145963365515.572981682521.77960113830.0017745197

    Residual6142.033725523923.6722875873

    Total81173.1796888889

    CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%

    Intercept-2.424949719.7007062734-0.24997661420.8109442465-26.161722809321.3118233892-26.161722809321.3118233892

    PUUA0.86545470610.1478615145.85314381290.00109802510.50365061581.22725879640.50365061581.2272587964

    Group12.5226003773.45782302673.62152726730.0110768964.0616122520.9835885044.0616122520.983588504

    Bivalve

    00

    00

    00

    00

    00

    00

    00

    00

    00

    PAUA

    PAUA

    PUUA

    PAUA

    Tunicates

    CodonAAAplidiumConicumNC_013584RSCUCionaIntestinalisNC_004447CionaSavignyiNC_004570ClavelinaLepadiformisNC_012887DiplosomaListerianumNC_013556DoliolumNationalisNC_006627Halocynthia roretziNC_002177HerdmaniaMomusNC_013561MicrocosmusSulcatusNC_013752PhallusiaFumigataNC_009834PhallusiaMammillataNC_009833StyelaPlicataNC_013565

    UAG*30.66710.240.88910.2510.28671.55640.66781.610.182101.6676191.5

    UAA*61.33391.851.11171.7561.71420.44481.33320.4101.81820.3336130.5

    GCUA762.841702.642712.812873.164762.951771.532772.184682.45561.898781.625791.456522

    GCGA50.18720.07570.27710.03630.117320.637150.426190.685270.915430.896390.719180.692

    GCCA70.262100.37730.11950.18230.117531.05590.25580.28830.102340.708661.217140.538

    GCAA190.71240.906200.792170.618210.816390.776401.135160.577321.085370.771330.608200.769

    UGUC311.676281.806232321.939391.857220.917731.947341.789361.946331.404190.792421.826

    UGCC60.32430.1940010.06130.143261.08320.05340.21110.054140.596291.20840.174

    GAUD621.851631.909601.935651.733481.655401.039601.765591.71641.882371.104411.242591.735

    GACD50.14930.09120.065100.267100.345370.96180.235100.2940.118300.896250.75890.265

    GAGE220.579180.474250.61260.712140.406410.891611.386701.573641.524751.875611.627491.273

    GAAE541.421581.526571.39471.288551.594511.109270.614190.427200.47650.125140.373280.727

    UUUF4951.8754321.8194601.9095221.8355621.9151671.1254011.9373351.6923611.9361941.3432161.4074041.804

    UUCF330.125430.181220.091470.165250.0851300.875130.063610.308120.064950.657910.593440.196

    AGAG1331.4382211.7131511.589761.0131641.7921301.088680.872250.685620.681220.463200.519270.551

    AGGG520.562370.287390.411740.987190.2081090.912881.128481.3151201.319731.537571.481711.449

    GGAG411.763272.038481.811190.644110.978381.246460.722380.425571.06660.595600.57370.841

    GGCG40.1720010.03890.30510.08980.26240.063190.21260.112260.234780.741150.341

    GGGG160.68880.604140.528180.6140.356351.148610.9571401.564751.3952512.2611871.777410.932

    GGUG321.376181.358431.623722.441292.578411.3441442.2591611.799771.4331010.91960.912831.886

    CACH70.21260.18540.12370.222110.297310.88660.188160.48550.167290.906321.032120.369

    CAUH591.788591.815611.877561.778631.703391.114581.813501.515551.833351.094300.968531.631

    AUUI2601.8252961.8852811.9122461.8222811.8861251.2951691.8671241.592331.9581091.4531161.3182261.745

    AUCI250.175180.115130.088240.178170.114680.705120.133320.4150.042410.547600.682330.255

    AAAK681.4471021.672971.552711.4791281.778501.087280.8260.598420.92350.169250.794310.765

    AAGK260.553200.328280.448250.521160.222420.913421.2611.402491.077541.831381.206501.235

    CUAL371.345381.505191.07321.113301.5191431.563140.789441.257211.355731.0321021.115481.289

    CUCL50.18240.15810.05680.27860.304620.67830.169180.51420.129610.862680.743110.295

    CUGL50.18230.11950.28260.20940.203530.579130.732411.17170.452731.032820.896160.43

    CUUL632.291562.218462.592692.4391.9751081.18412.31371.057322.065761.0741141.246741.987

    UUAL3531.5483981.8513351.5023211.733761.8251301.4212921.2141900.7853551.4261080.8091070.9822521.289

    UUGL1030.452320.1491110.498500.27360.175530.5791890.7862941.2151430.5741591.1911111.0181390.711

    AUAM1611.4772361.7482131.6141821.6252331.8641421.4131111.314680.7231601.172800.914680.8141161.055

    AUGM570.523340.252510.386420.375170.136590.587580.6861201.2771130.828951.086991.1861040.945

    AACN120.173160.20140.051150.229290.331601.27750.12190.3880.133280.824250.769190.297

    AAUN1271.8271431.7991541.9491161.7711461.669340.723781.88811.621121.867401.176401.2311091.703

    CCUP773.02672.577652.708642.783853.119612.103592.408532.232471.899431.737391.405522

    CCGP20.07850.192100.41730.1340.147100.345120.4990.379210.848140.566120.432140.538

    CCCP70.27540.15440.16740.17430.11180.62140.163160.67490.364170.687461.658120.462

    CCAP160.627281.077170.708210.913170.624270.931230.939170.716220.889251.01140.505261

    CAGQ100.41750.2120.53380.3410.048281.12211311.216261.13271.459261.333261.156

    CAAQ381.583451.8331.467391.66411.952220.88211200.784200.87100.541130.667190.844

    CGCR10.08500000020.18230.17450.290000110.815151.09130.267

    CGGR90.76640.30850.36430.26100191.101120.696130.945171.388191.407141.01880.711

    CGUR262.213251.923221.6282.435292.636191.101291.681292.109141.143161.185201.455171.511

    CGAR110.936231.769282.036151.304131.182281.623231.333130.945181.46980.59360.436171.511

    AGCS130.27470.175120.235110.17980.186470.90440.063230.33360.1410.774430.945190.304

    AGUS821.726731.825901.7651121.821781.814571.0961241.9381151.6671141.9651.226481.0551061.696

    UCAS390.746641.169541.009320.634491.077480.97240.78331.008371.035441.006400.842380.826

    UCCS140.268150.27460.112120.23880.176501.0150.163130.39750.14581.326551.158280.609

    UCGS140.26870.128120.22450.09950.11190.384100.325120.366230.643280.64280.589160.348

    UCUS1422.7181332.4291422.6541533.031202.637811.636842.732732.229782.182451.029671.4111022.217

    ACAT250.752321381.16310.912431.293330.746331.483250.962351.333150.561210.694240.787

    ACCT80.24140.12520.061120.35330.09410.92730.13540.15490.343180.673361.19160.525

    ACGT40.1280.25100.30540.11830.09190.42950.225160.615130.495301.121250.826140.459

    ACUT962.887842.625812.473892.618842.526841.898482.157592.269481.829441.645391.289682.23

    GUGV380.589180.291300.432250.377100.226770.856890.8621331.3071421.3992582.0311631.52960.98

    GUUV1161.7981161.8791462.1011422.143902.034911.0111841.7821931.8971551.5271170.9211261.1751841.878

    GUCV100.155120.19450.072150.22660.136590.656130.126210.20640.039410.323580.541280.286

    GUAV941.4571011.636971.396831.253711.6051331.4781271.23600.591051.034920.724820.765840.857

    UGAW601.463701.556691.516481.247581.731581.126440.759230.469390.857310.496200.357200.526

    UGGW220.537200.444220.484290.75390.269450.874721.241751.531521.143941.504921.643561.474

    UACY180.218160.195110.148250.318270.297901.132150.181330.395100.118630.851731.014420.398

    UAUY1471.7821481.8051381.8521321.6821551.703690.8681511.8191341.6051601.882851.149710.9861691.602

    PAUA0.73853211010.87407407410.80681818180.81250.9320.70646766170.65680473370.36170212770.58608058610.45714285710.40718562870.5272727273

    PUUA0.7741228070.92558139530.75112107620.86522911050.91262135920.71038251370.60706860710.39256198350.71285140560.4044943820.49082568810.6445012788

    Lancelet

    BranchiostomaBelcheriNC_004537BranchiostomaFloridaeNC_000834BranchiostomaLanceolatumNC_001912EpigonichthysMaldivensisNC_006465

    UAG*61.09161.71481.23120.667AplidiumConicumNC_013584.gbAplidiumConicumNC_013584

    UAA*50.90910.28650.76941.333BranchiostomaBelcheriNC_004537.gbBranchiostomaBelcheriNC_004537

    GCUA1251.7991261.8811211.813981.508BranchiostomaFloridaeNC_000834.gbBranchiostomaFloridaeNC_000834

    GCGA440.633340.507350.524490.754BranchiostomaLanceolatumNC_001912.gbBranchiostomaLanceolatumNC_001912

    GCCA170.245220.328260.39320.492CionaIntestinalisNC_004447.gbCionaIntestinalisNC_004447

    GCAA921.324861.284851.273811.246CionaSavignyiNC_004570.gbCionaSavignyiNC_004570

    UGUC331.737231.211231.211281.273ClavelinaLepadiformisNC_012887.gbClavelinaLepadiformisNC_012887

    UGCC50.263150.789150.789160.727DiplosomaListerianumNC_013556.gbDiplosomaListerianumNC_013556

    GAUD611.564571.425571.425541.521DoliolumNationalisNC_006627.gbDoliolumNationalisNC_006627

    GACD170.436230.575230.575170.479EpigonichthysMaldivensisNC_006465.gbEpigonichthysMaldivensisNC_006465

    GAGE480.96450.874460.893541.029Halocynthia roretziNC_002177.gbHalocynthia roretziNC_002177

    GAAE521.04581.126571.107510.971HerdmaniaMomusNC_013561.gbHerdmaniaMomusNC_013561

    UUUF2271.7391701.3491741.3651811.42MicrocosmusSulcatusNC_013752.gbMicrocosmusSulcatusNC_013752

    UUCF340.261820.651810.635740.58PhallusiaFumigataNC_009834.gbPhallusiaFumigataNC_009834

    GGUG811.095921.26941.288831.118PhallusiaMammillataNC_009833.gbPhallusiaMammillataNC_009833

    GGGG1321.7841131.5481111.5211181.589StyelaPlicataNC_013565.gbStyelaPlicataNC_013565

    GGCG160.216200.274190.26410.552

    GGAG670.905670.918680.932550.741PUUAPAUAPAUA

    CACH110.242260.571280.622290.674BranchiostomaBelcheriNC_0045370.71981776770.2918660287CAU

    CAUH801.758651.429621.378571.326BranchiostomaFloridaeNC_0008340.72069825440.3134328358CAU

    AUUI1991.6381871.6051861.5971591.389BranchiostomaLanceolatumNC_0019120.72681704260.3118811881CAU

    AUCI440.362460.395470.403700.611EpigonichthysMaldivensisNC_0064650.7360.3259668508CAU

    AAAK431.178401.096401.081371.028AplidiumConicumNC_0135840.7741228070.7385321101CAU/UAU

    AAGK300.822330.904340.919350.972CionaIntestinalisNC_0044470.92558139530.8740740741CAU/UAU

    CUAL611.488931.683941.686971.644CionaSavignyiNC_0045700.75112107620.8068181818CAU/UAU

    CUCL80.195150.271150.269310.525ClavelinaLepadiformisNC_0128870.86522911050.8125CAU/UAU

    CUGL190.463360.652410.735510.864DiplosomaListerianumNC_0135560.91262135920.932CAU/UAU

    CUUL761.854771.394731.309570.966DoliolumNationalisNC_0066270.71038251370.7064676617CAU/UAU

    UUAL3161.442891.4412901.4542761.472Halocynthia roretziNC_0021770.60706860710.6568047337CAU/UAU

    UUGL1230.561120.5591090.546990.528HerdmaniaMomusNC_0135610.39256198350.3617021277CAU/UAU

    AUGM610.584630.627630.624590.652MicrocosmusSulcatusNC_0137520.71285140560.5860805861CAU/UAU

    AUAM1481.4161381.3731391.3761221.348PhallusiaFumigataNC_0098340.4044943820.4571428571CAU/UAU

    AACN180.303240.432260.464400.727PhallusiaMammillataNC_0098330.49082568810.4071856287CAU/UAU

    AAUN1011.697871.568861.536701.273StyelaPlicataNC_0135650.64450127880.5272727273CAU/UAU

    CCUP681.789761.924731.848731.896

    CCGP200.526220.557220.557210.545

    CCCP190.5200.506240.608180.468

    CCAP451.184401.013390.987421.091

    CAGQ340.8471.106461.082290.667

    CAAQ511.2380.894390.918581.333

    CGAR181271.421251.316301.622

    CGCR20.111120.632120.63260.324

    CGGR301.667211.105221.158191.027

    CGUR221.222160.842170.895191.027

    AGCS200.578330.895300.808350.959

    AGAS290.838120.325120.323310.849

    UCAS481.386641.736631.697381.041

    UCCS110.318190.515200.539250.685

    UCGS220.635170.461190.512190.521

    UCUS862.484772.088752.02932.548

    AGGS10.029000020.055

    AGUS601.733731.98782.101491.342

    ACAT691.5651.469651.469651.307

    ACCT90.196150.339140.316310.623

    ACGT310.674320.723320.723320.643

    ACUT751.63651.469661.492711.427

    GUCV240.291230.267260.301540.598

    GUGV720.873670.779690.8820.909

    GUUV1151.3941301.5121271.4721161.285

    GUAV1191.4421241.4421231.4261091.208

    UGAW621.127581.094601.132520.981

    UGGW480.873480.906460.868541.019

    UACY340.442530.716540.725600.774

    UAUY1201.558951.284951.275951.226

    PAUA0.29186602870.31343283580.31188118810.3259668508

    PUUA0.71981776770.72069825440.72681704260.736

    [Chart: Lancelet; series and axis labels: PAUA, PUUA]

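  • Calculation of RSCU. RSCU is codon-specific: for each codon, it is the observed count divided by the mean count of all synonymous codons in the same family. RSCU and proportion differ only in scaling: within an n-codon family, proportions sum to 1 whereas RSCU values sum to n, so RSCU = 1 indicates no bias.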

    Codon  AA   N    RSCU    Codon  AA   N    RSCU    Codon  AA   N    RSCU
    GCU    Ala  52   0.84    CCU    Pro  42   0.87    UAA    *    8    3.2
    GCC    Ala  91   1.47    CCC    Pro  63   1.31    UAG    *    1    0.4
    GCA    Ala  103  1.66    CCA    Pro  85   1.76    AGA    *    1    0.4
    GCG    Ala  2    0.03    CCG    Pro  3    0.06    AGG    *    0    0
    GAA    Glu  78   1.64    CAA    Gln  79   1.82    AAA    Lys  90   1.78
    GAG    Glu  17   0.36    CAG    Gln  8    0.18    AAG    Lys  11   0.22
    GGU    Gly  29   0.53    CGU    Arg  7    0.44    ACU    Thr  44   0.57
    GGC    Gly  62   1.13    CGC    Arg  11   0.7     ACC    Thr  96   1.25
    GGA    Gly  97   1.77    CGA    Arg  42   2.67    ACA    Thr  153  1.99
    GGG    Gly  31   0.57    CGG    Arg  3    0.19    ACG    Thr  15   0.19
    UUA    Leu  110  1.11    AUA    Met  218  1.66    UGA    Trp  92   1.77
    UUG    Leu  16   0.16    AUG    Met  44   0.34    UGG    Trp  12   0.23
    CUU    Leu  62   0.62    UCU    Ser  51   1.11    GUU    Val  40   0.84
    CUC    Leu  95   0.95    UCC    Ser  65   1.42    GUC    Val  48   1.01
    CUA    Leu  285  2.86    UCA    Ser  99   2.16    GUA    Val  87   1.83
    CUG    Leu  29   0.29    UCG    Ser  5    0.11    GUG    Val  15   0.32
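The RSCU values in the table can be reproduced with a few lines of code. The sketch below is illustrative only: the function name `rscu` and the dictionary layout are assumptions, not part of the slides. It uses the alanine counts from the table (GCU 52, GCC 91, GCA 103, GCG 2) and also prints the proportions, to show that the two measures differ only in scaling.

```python
def rscu(family_counts):
    """Return RSCU for each codon in one synonymous codon family.

    RSCU = observed count / mean count over the codons in the family.
    (Hypothetical helper; the name and layout are assumptions.)
    """
    n_codons = len(family_counts)
    total = sum(family_counts.values())
    if total == 0:
        # Amino acid absent: RSCU is undefined; report 0 for every codon.
        return {codon: 0.0 for codon in family_counts}
    mean_count = total / n_codons
    return {codon: count / mean_count for codon, count in family_counts.items()}


# Alanine counts taken from the table on this slide.
ala = {"GCU": 52, "GCC": 91, "GCA": 103, "GCG": 2}
ala_rscu = rscu(ala)
ala_total = sum(ala.values())
ala_prop = {codon: count / ala_total for codon, count in ala.items()}

for codon in ala:
    print(codon, round(ala_rscu[codon], 2), round(ala_prop[codon], 3))
```

Within this 4-codon family the RSCU values sum to 4 and the proportions sum to 1, so each RSCU is simply four times the corresponding proportion.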

  • Calculation of CAI. Compound 6- or 8-fold codon families should be broken into two codon families. CAI is gene-specific, with 0 ≤ CAI ≤ 1. CAI values computed with different reference sets are not comparable. Problem with computing w as F_i/F_i.max: suppose an amino acid is rarely used in highly expressed genes; then there is little selection on it, and its codon usage might be close to even, with w_i ≈ 1. If a lowly expressed gene happens to be made up entirely of this amino acid, its CAI would be close to 1, which is misleading. There has been no good alternative; further research is needed. N2,3,4: number of 2-, 3-, 4-fold codon families.
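The problem described above can be made concrete with a small sketch. It assumes the usual geometric-mean form of CAI, CAI = exp(Σ F_i ln w_i / Σ F_i), where F_i is the count of codon i in the gene. The function name `cai` and the Cys-only gene are hypothetical; the w values for Cys and Ala and the Ala counts (GCU 15, GCC 8, GCA 1, GCG 0) are the ones listed in this slide's table. Every codon that occurs in the gene is assumed to have w > 0.

```python
import math


def cai(codon_counts, w):
    """Codon adaptation index of one gene (illustrative sketch).

    Geometric mean of w over the gene's codons, weighted by codon counts.
    Assumes w[codon] > 0 for every codon that occurs in the gene.
    """
    total = sum(codon_counts.values())
    log_sum = sum(n * math.log(w[codon])
                  for codon, n in codon_counts.items() if n > 0)
    return math.exp(log_sum / total)


# w values as listed in the table on this slide.
w = {"UGC": 1.0, "UGU": 0.909,
     "GCU": 1.0, "GCC": 0.75, "GCA": 0.606, "GCG": 0.253}

# Hypothetical gene made up entirely of Cys, whose two codons have w close to 1.
cys_only_gene = {"UGC": 10, "UGU": 10}
# Ala codon counts of the gene in the table (ObsFreq column).
ala_part = {"GCU": 15, "GCC": 8, "GCA": 1, "GCG": 0}

cai_cys = cai(cys_only_gene, w)
cai_ala = cai(ala_part, w)
print(round(cai_cys, 3), round(cai_ala, 3))
```

The Cys-only gene scores about 0.95 although it uses its two codons evenly, which illustrates why a high CAI can be misleading when the underlying w values are all close to 1.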

    Sheet1

    AAAK1855UGA*00.388

    AACN1434UAG*00.243

    AAGK2464UAA*01

    AATN1646GCAA10.606

    ACAT1067GCUA151

    ACCT1056GCGA00.253

    ACGT478GCCA80.75

    ACTT1122UGCC31

    AGAR950UGUC30.909

    AGCS463GAUD91

    AGGR187GACD110.581

    AGTS623GAGE110.862

    ATAI340GAAE151

    ATCI1574UUUF40.552

    ATGM1739UUCF91

    ATTI1666GGAG111

    CAAQ2106GGUG140.248

    CACH751GGGG00.06

    CAGQ966GGCG50.136

    CATH910CACH50.825

    CCAP2617CAUH21

    CCCP222AUCI120.945

    CCGP521AUUI111

    CCTP461AUAI00.204

    CGAR698AAGK271

    CGCR525AAAK50.753

    CGGR216UUAL00.265

    CGTR1115UUGL120.74

    CTAL274CUAL00.181

    CTCL1244CUCL20.822

    CTGL591CUGL40.39

    CTTL1514CUUL41

    GAAE2523AUGM101

    GACD1416AAUN21

    GAGE2175AACN140.871

    GATD2437CCGP20.199

    GCAA1107CCUP70.176

    GCCA1371CCAP71

    GCGA463CCCP90.085

    GCTA1827CAGQ50.459

    GGAG3372CAAQ21

    GGCG460CGGR10.194

    GGGG204CGCR10.471

    GGTG837CGAR10.626

    GTAV497AGAR40.852

    GTCV1239AGGR20.168

    GTGV725CGUR61

    GTTV1645AGUS20.529

    TAA*103AGCS30.393

    TACY1129UCUS30.856

    TAG*25UCGS30.621

    TATY949UCAS01

    TCAS1177UCCS50.764

    TCCS899ACAT20.951

    TCGS731ACCT140.941

    TCTS1007ACGT00.426

    TGA*40ACUT121

    TGCC738GUGV60.441

    TGGW686GUCV150.753

    TGTC671GUAV30.302

    TTAL401GUUV71

    TTCF1640UGGW41

    TTGL1120UAUY10.841

    TTTF906UACY81

    Sheet2

    Codon  AA  ObsFreq  RefCodFreq  w
    UGA    *   0        6           0.375
    UAG    *   0        4           0.250
    UAA    *   0        16          1.000
    GCA    A   1        195         0.606
    GCU    A   15       322         1.000
    GCG    A   0        81          0.252
    GCC    A   8        242         0.752
    UGC    C   3        123         1.000
    UGU    C   3        112         0.911
    GAU    D   9        69          1.000
    GAC    D   11       40          0.580
    GAG    E   11       289         0.863
    GAA    E   14       335         1.000
    UUU    F   3        118         0.554
    UUC    F   9        213         1.000

    Ala (GCU, GCC, GCA, GCG): N = 52, 91, 103, 2 (total 248); RSCU = 0.8387096774, 1.4677419355, 1.6612903226, 0.0322580645


  • Weak mRNA predictive power: FRS2, ENO1

  • Effect of Codon Usage Bias: FRS2, ENO1

  • Problems with CAI: Formulation; Reference set; w = 0; Implementation; AUG, UGG; Multiple codon families for one amino acid; Dependence on AT%. Solutions: Xia, X. 2007. Evolutionary Bioinformatics.

  • RSCU (HIV-1 vs Human). Fig. 1. Relative synonymous codon usage (RSCU) of HIV-1 compared to RSCU of highly expressed human genes. Data points for codons ending with A, C, G or U are annotated with different combinations of colors and symbols. A-ending codons exhibit strong discordance in their usage between HIV-1 and human and are annotated with their coded amino acids. van Weringh et al. 2011. MBE.

  • Research. Observations on HIV-1: a strong surplus of A-ending codons and a high mutation rate. Hypothesis: strong A-biased mutation disrupts codon adaptation. Predictions: (1) mutation is strongly A-biased (confirmed); (2) if the mutation rate were lower, codon adaptation would be better. The related HTLV-1 parasitizes the same cells as HIV-1 but has a lower mutation rate, so HTLV-1 genes should exhibit better codon adaptation.

  • RSCU (HTLV-1 vs Human). Relative synonymous codon usage (RSCU) of HTLV-1 compared to RSCU of highly expressed human genes. Data points for codons ending with A, C, G or U are annotated with different combinations of colors and symbols. A-ending codons exhibit strong discordance in their usage between HIV-1 and human and are annotated with their coded amino acids.

    [Chart1: series A-ending, C-ending, G-ending, U-ending; axis titles: RSCU (Human), RSCU (HTLV-1)]

    Fig2A

    --AverageError--AverageErrorArg(AGA)

    --HIV WTHIV WT--HIV MUTHIV MUTArg(AGG)

    Ile-UAU1.339670.70913Ile-UAU0.061390.022161.88917405842.7703068592Ile(AUA)

    Lys1,21.008130.29927Lys1,20.060080.017493.36863033383.4351057747Ile(AUY)

    Lys30.640520.25086Lys30.033952.02E-042.5532966595167.9196755367Leu(UUA)

    Asn-GUU0.458820.12913Asn-GUU0.041770.00469Leu(UUG)

    Sec-UCA10.269970.19873Sec-UCA10.02978.58E-04Lys(AAA)

    Ile-IAU/GAU0.267240.08196Ile-IAU/GAU0.137110.050753.26061493412.7016748768Lys(AAG)

    His-GUG0.209870.0246His-GUG0.102210.04699Gly(GGA)

    Gly-GCC/CCC0.201640.08479Gly-GCC/CCC0.048590.010462.37811062634.6453154876Gly(GGB)

    Pro-IGG/CGG/UGG0.194840.11125Pro-IGG/CGG/UGG0.055520.03084Val(GUA)

    Thr-IGU/CGU0.190040.09363Thr-IGU/CGU0.015490.00784Val(GUB)

    Met-CAU0.184240.03201Met-CAU0.041640.00719Thr(ACA)

    Arg-CCG/UCG0.171890.00734Arg-CCG/UCG0.042140.00163Thr(ACB)

    Glu-CUC/UUC0.157570.01142Glu-CUC/UUC0.048960.01635

    Arg-ICG0.13350.02488Arg-ICG0.048020.00623

    Asp-GUC0.130290.03841Asp-GUC0.057370.01104

    Ala-IGC/CGC/UGC0.113210.01105Ala-IGC/CGC/UGC0.073770.03882

    Ala-CGC0.099250.01093Ala-CGC0.06940.0283

    Leu-UAA0.090.03616Leu-UAA0.037380.008112.48893805314.6091245376

    Cys-GCA0.082810.02926Cys-GCA0.026610.00898

    Leu-IAG/UAG0.079650.04071Leu-IAG/UAG0.041780.01059

    Sec-UCA20.078830.01611Sec-UCA20.060110.02047

    Val-IAC/CAC0.073870.0285Val-IAC/CAC0.048530.021182.59192982462.291312559

    Gly-UCC0.070750.05595Gly-UCC0.028950.006171.26452189454.6920583468

    Tyr-GUA0.068330.01475Tyr-GUA0.03620.01522

    Val-UAC0.066170.02318Val-UAC0.028230.01322.85461604832.1386363636

    Arg-CCU0.0660.03084Arg-CCU0.054380.022582.1400778212.4083259522

    Thr-CGU0.062150.0152Thr-CGU0.019210.006794.08881578952.8291605302

    Ser-GCU0.060460.03637Ser-GCU0.019630.00788

    Gln-CUG/UUG0.058650.00177Gln-CUG/UUG0.021390.00217

    Glu-UUC0.054240.01134Glu-UUC0.019180.00184

    Ser-CGA0.05150.02118Ser-CGA0.022440.01514

    Arg-UCU0.049420.00253Arg-UCU0.027670.0021819.533596837912.6926605505

    Thr-UGU0.048140.00106Thr-UGU0.021480.007245.41509433962.9833333333

    Phe-GAA0.046840.00865Phe-GAA0.017510.00531

    Leu-CAA0.045040.015Leu-CAA0.049560.025033.00266666671.9800239712

    Trp-CCA0.035750.00952Trp-CCA0.018460.0082

    Leu-CAG0.031960.01431Leu-CAG0.021570.01228

    Met-i0.026230.00355Met-i0.01610.00704

    Ser-IGA/UGA0.016610.00137Ser-IGA/UGA0.010610.00118

    Leu-UAA20.00780.0069Leu-UAA20.015430.00306

    ----------

    mGln0.030420.01646mGln0.024780.01961

    mGlu0.027030.00706mGlu0.022080.00217

    mAsp0.026590.01066mAsp0.027750.01513

    mPro0.021320.01909mPro0.017370.01656

    mPhe0.01940.00432mPhe0.019640.01056

    mVal0.018480.01059mVal0.013240.01119

    mMet0.0180.00455mMet0.014660.00276

    mAsn0.017520.01484mAsn0.013560.01279

    mHis0.015730.0025mHis0.009070.00417

    mArg0.015480.00258mArg0.018810.00967

    mLeu-UAA0.014480.00168mLeu-UAA0.011340.00185

    mSer-GCU0.013010.00997mSer-GCU0.032180.01353

    mLys0.012971.82E-04mLys0.009640.00171

    mIle0.011240.01005mIle0.009550.00536

    mTyr0.00990.00729mTyr0.010460.00579

    mSer-UGA0.008550.00753mSer-UGA0.007530.00519

    mAla0.008450.00634mAla0.00890.0041

    mThr0.008290.0069mThr0.007680.00573

    mLeu-UAG0.007640.0068mLeu-UAG0.00610.00542

    mTrp0.006070.00397mTrp0.004850.00343

    mGly0.005640.00434mGly0.004560.00349

    mCys0.004510.00403mCys0.003710.00317

    ----

    Fig2B

    --AverageError

    --HIV MUTHIV MUT

    Ile-UAU0.061390.02216

    Lys1,20.060080.01749

    Lys30.033952.02E-04

    Asn-GUU0.041770.00469

    Sec-UCA10.02978.58E-04

    Ile-IAU/GAU0.137110.05075

    His-GUG0.102210.04699

    Gly-GCC/CCC0.048590.01046

    Pro-IGG/CGG/UGG0.055520.03084

    Thr-IGU/CGU0.015490.00784

    Met-CAU0.041640.00719

    Arg-CCG/UCG0.042140.00163

    Glu-CUC/UUC0.048960.01635

    Arg-ICG0.048020.00623

    Asp-GUC0.057370.01104

    Ala-IGC/CGC/UGC0.073770.03882

    Ala-CGC0.06940.0283

    Leu-UAA0.037380.00811

    Cys-GCA0.026610.00898

    Leu-IAG/UAG0.041780.01059

    Sec-UCA20.060110.02047

    Val-IAC/CAC0.048530.02118

    Gly-UCC0.028950.00617

    Tyr-GUA0.03620.01522

    Val-UAC0.028230.0132

    Arg-CCU0.054380.02258

    Thr-CGU0.019210.00679

    Ser-GCU0.019630.00788

    Gln-CUG/UUG0.021390.00217

    Glu-UUC0.019180.00184

    Ser-CGA0.022440.01514

    Arg-UCU0.027670.00218

    Thr-UGU0.021480.0072

    Phe-GAA0.017510.00531

    Leu-CAA0.049560.02503

    Trp-CCA0.018460.0082

    Leu-CAG0.021570.01228

    Met-i0.01610.00704

    Ser-IGA/UGA0.010610.00118

    Leu-UAA20.015430.00306

    ------

    mGln0.024780.01961

    mGlu0.022080.00217

    mAsp0.027750.01513

    mPro0.017370.01656

    mPhe0.019640.01056

    mVal0.013240.01119

    mMet0.014660.00276

    mAsn0.013560.01279

    mHis0.009070.00417

    mArg0.018810.00967

    mLeu-UAA0.011340.00185

    mSer-GCU0.032180.01353

    mLys0.009640.00171

    mIle0.009550.00536

    mTyr0.010460.00579

    mSer-UGA0.007530.00519

    mAla0.00890.0041

    mThr0.007680.00573

    mLeu-UAG0.00610.00542

    mTrp0.004850.00343

    mGly0.004560.00349

    mCys0.003710.00317

    --

    Fig3A (the Fig3B sheet repeats the HIV WT average and error columns)

    tRNA              293T average  293T error  HIV WT average  HIV WT error  logRatio
    Lys3              2.72737       0.8615      17.73291        0.2225        2.701
    Lys1,2            1.77237       0.37794     16.89167        1.51726       3.253
    Asn-GUU           3.37936       0.93941     10.62475        1.43541       1.653
    Glu-CUC/UUC       4.98035       0.15425     4.20887         0.1755        -0.243
    Pro-IGG/CGG/UGG   6.22119       2.60861     4.16457         0.61297       -0.579
    Ile-UAU           0.62549       0.09381     4.1152          0.09009       2.718
    Thr-IGU/CGU       4.25817       0.42906     3.97436         0.66775       -0.100
    Gly-GCC/CCC       2.11907       0.10655     2.69934         0.10931       0.349
    Asp-GUC           4.00234       0.3041      2.61554         0.97891       -0.614
    Arg-CCG/UCG       2.66622       0.72315     2.54468         0.0461        -0.067
    Met-CAU           1.50109       0.07289     2.11814         0.67272       0.497
    Gly-UCC           3.4204        0.35474     1.8626          0.87535       -0.877
    His-GUG           0.87494       0.08705     1.81262         0.85064       1.051
    Cys-GCA           2.77599       0.02942     1.75337         0.67296       -0.663
    Arg-ICG           2.67902       0.69593     1.61564         0.14948       -0.730
    Glu-UUC           5.99095       0.96979     1.49113         0.22708       -2.006
    Tyr-GUA           4.60001       0.38883     1.43304         0.22857       -1.683
    Gln-CUG/UUG       4.28          0.02043     1.39362         0.06732       -1.619
    Ile-IAU/GAU       0.58102       0.04673     1.37128         0.43786       1.239
    Val-IAC/CAC       2.16947       0.28691     1.2955          0.35448       -0.744
    Ser-GCU           3.02239       0.8737      1.11201         0.08361       -1.443
    Arg-CCU           2.61402       0.63573     1.10909         0.12204       -1.237
    Sec-UCA1          0.73288       0.17903     0.98276         0.45589       0.423
    Thr-UGU           2.79138       0.00485     0.95874         0.14684       -1.542
    Met-i             5.14586       0.28685     0.94525         0.02846       -2.445
    Trp-CCA           3.7146        0.27633     0.93516         0.1088        -1.990
    Phe-GAA           2.11609       0.34376     0.91224         0.31935       -1.214
    Arg-UCU           2.41417       0.65236     0.82409         0.15629       -1.551
    Ala-CGC           0.75996       0.0375      0.73038         0.29535       -0.057
    Ser-CGA           2.15704       0.71434     0.72113         0.08719       -1.581
    Leu-UAA           1.43061       0.11505     0.67052         0.10534       -1.093
    Ser-IGA/UGA       2.63897       0.52646     0.65834         0.01926       -2.003
    Leu-CAG           2.05101       0.22762     0.65074         0.13045       -1.656
    Leu-CAA           1.83463       0.14229     0.65065         0.06785       -1.496
    Leu-IAG/UAG       0.99745       0.119       0.58431         0.03237       -0.772
    Val-UAC           1.05327       0.13639     0.54115         0.12489       -0.961
    Thr-CGU           1.37145       0.01421     0.48158         0.01917       -1.510
    Ala-IGC/CGC/UGC   0.36335       0.04114     0.40902         0.18143       0.171
    Sec-UCA2          0.33776       0.0778      0.22378         0.05677       -0.594
    Leu-UAA2          0.82832       0.17707     0.18024         0.07853       -2.200
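The logRatio column of the Fig3A sheet appears to be the base-2 logarithm of the HIV WT average over the 293T average. That reading is inferred from the numbers and is not stated in the sheet. A minimal check, with the hypothetical helper `log2_ratio` and three rows copied from the sheet:

```python
import math


def log2_ratio(hiv_wt_average, cell_average):
    """log2 of the HIV WT average relative to the 293T average (assumed definition)."""
    return math.log2(hiv_wt_average / cell_average)


# (tRNA, 293T average, HIV WT average, logRatio as listed in the sheet)
rows = [
    ("Lys3", 2.72737, 17.73291, 2.701),
    ("Ile-UAU", 0.62549, 4.1152, 2.718),
    ("Glu-UUC", 5.99095, 1.49113, -2.006),
]

computed = {name: log2_ratio(wt, cell) for name, cell, wt, _ in rows}
for name, cell, wt, listed in rows:
    print(name, round(computed[name], 3), listed)
```

Positive values mark tRNAs that are over-represented in the HIV WT sample relative to 293T cells, and negative values mark under-represented ones.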

    Table

    Codon family  RSCU Human  RSCU HIV-1  log2(RSCU HIV1/Human)  RankRSCU  tRNA HIV1  tRNA GagVLP  log2(tRNA HIV1/GagVLP)  RanktRNA
    Arg(AGA)      0.97        1.44        0.57                   8         0.0494     0.0277       0.8368                  4
    Arg(AGG)      1.03        0.56        -0.88                  4         0.0660     0.0544       0.2794                  2
    Ile(AUA)      0.24        1.59        2.73                   14        1.3397     0.0614       4.4477                  14
    Ile(AUY)      2.64        1.41        -0.90                  3         0.2672     0.1371       0.9628                  5
    Leu(UUA)      0.68        1.38        1.02                   11        0.0900     0.0374       1.2677                  8
    Leu(UUG)      1.32        0.62        -1.09                  1         0.0450     0.0496       -0.1380                 1
    Lys(AAA)      0.76        1.27        0.74                   9         0.6405     0.0340       4.2378                  13
    Lys(AAG)      1.24        0.73        -0.76                  5         1.0081     0.0601       4.0687                  12
    Gly(GGA)      0.93        2.08        1.16                   12        0.0708     0.0290       1.2892                  9
    Gly(GGB)      3.07        1.92        -0.68                  6         0.2016     0.0486       2.0531                  10
    Val(GUA)      0.39        2.08        2.42                   13        0.0662     0.0282       1.2289                  7
    Val(GUB)      3.61        1.92        -0.91                  2         0.0739     0.0485       0.6061                  3
    Thr(ACA)      0.97        1.94        1.00                   10        0.0481     0.0215       1.1642                  6
    Thr(ACB)      3.03        2.06        -0.56                  7         0.2522     0.0347       2.8615                  11

    Unlabeled cells beneath the table: 0.3224636216, 0.578021978

    SUMMARY OUTPUT (regression of RanktRNA on RankRSCU)

    Regression Statistics
    Multiple R         0.578021978
    R Square           0.3341094071
    Adjusted R Square  0.2786185243
    Standard Error     3.5530516214
    Observations       14

    ANOVA
                df  SS              MS             F             Significance F
    Regression  1   76.0098901099   76.0098901099  6.0209784123  0.0303830139
    Residual    12  151.4901098901  12.6241758242
    Total       13  227.5

               Coefficients  Standard Error  t Stat        P-value       Lower 95%      Upper 95%
    Intercept  3.1648351648  2.005763944     1.5778702047  0.1405793728  -1.2053490445  7.5350193742
    RankRSCU   0.578021978   0.2355650287    2.4537682067  0.0303830139  0.0647698719   1.0912740842

    tRNA              MeanWT   MeanMut
    Ala-CGC           0.09925  0.0694
    Ala-IGC/CGC/UGC   0.11321  0.07377
    Arg-CCG/UCG       0.17189  0.04214
    Arg-CCU           0.066    0.05438
    Arg-ICG           0.1335   0.04802
    Arg-UCU           0.04942  0.02767
    Asn-GUU           0.45882  0.04177
    Asp-GUC           0.13029  0.05737
    Cys-GCA           0.08281  0.02661
    Gln-CUG/UUG       0.05865  0.02139
    Glu-CUC/UUC       0.15757  0.04896
    Glu-UUC           0.05424  0.01918
    Gly-GCC/CCC       0.20164  0.04859
    Gly-UCC           0.07075  0.02895
    His-GUG           0.20987  0.10221
    Ile-IAU/GAU       0.26724  0.13711
    Ile-UAU           1.33967  0.06139
    Leu-CAA           0.04504  0.04956
    Leu-CAG           0.03196  0.02157
    Leu-IAG/UAG       0.07965  0.04178
    Leu-UAA           0.09     0.03738
    Leu-UAA2          0.0078   0.01543
    Lys1,2            1.00813  0.06008
    Lys3              0.64052  0.03395
    mAla              0.00845  0.0089
    mArg              0.01548  0.01881
    mAsn              0.01752  0.01356
    mAsp              0.02659  0.02775
    mCys              0.00451  0.00371
    Met-CAU           0.18424  0.04164
    Met-i             0.02623  0.0161
    mGln              0.03042  0.02478
    mGlu              0.02703  0.02208
    mGly              0.00564  0.00456
    mHis              0.01573  0.00907
    mIle              0.01124  0.00955
    mLeu-UAA          0.01448  0.01134
    mLeu-UAG          0.00764  0.0061
    mLys              0.01297  0.00964
    mMet              0.018    0.01466
    mPhe              0.0194   0.01964
    mPro              0.02132  0.01737
    mSer-GCU          0.01301  0.03218
    mSer-UGA          0.00855  0.00753
    mThr              0.00829  0.00768
    mTrp              0.00607  0.00485
    mTyr              0.0099   0.01046
    mVal              0.01848  0.01324
    Phe-GAA           0.04684  0.01751
    Pro-IGG/CGG/UGG   0.19484  0.05552
    Sec-UCA1          0.26997  0.0297
    Sec-UCA2          0.07883  0.06011
    Ser-CGA           0.0515   0.02244
    Ser-GCU           0.06046  0.01963
    Ser-IGA/UGA       0.01661  0.01061
    Thr-CGU           0.06215  0.01921
    Thr-IGU/CGU       0.19004  0.01549
    Thr-UGU           0.04814  0.02148
    Trp-CCA           0.03575  0.01846
    Tyr-GUA           0.06833  0.0362
    Val-IAC/CAC       0.07387  0.04853
    Val-UAC           0.06617  0.02823

    [Chart: axis titles RankRSCU, RanktRNA]
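The Multiple R of 0.578021978 reported in the regression output of the Table sheet equals the Spearman rank correlation between the RankRSCU and RanktRNA columns. The sketch below recomputes it from the 14 rank pairs with the standard no-ties formula, rho = 1 - 6 Σd² / (n(n² - 1)). The function name is hypothetical, and the formula assumes no tied ranks, which holds here because each column is a permutation of 1 to 14.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman rank correlation for two rank vectors without ties."""
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Rank columns from the Table sheet, in row order Arg(AGA) ... Thr(ACB).
rank_rscu = [8, 4, 14, 3, 11, 1, 9, 5, 12, 6, 13, 2, 10, 7]
rank_trna = [4, 2, 14, 5, 8, 1, 13, 12, 9, 10, 7, 3, 6, 11]

rho = spearman_from_ranks(rank_rscu, rank_trna)
print(round(rho, 9))
```

Because both variables are ranks with the same variance, the regression slope for RankRSCU in the sheet (0.578021978) coincides with this correlation.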

    FIg1

    HIV-1HTLV-1

    AACodonRSCUHumA-endingC-endingG-endingU-endingA-endingC-endingG-endingU-endingEnding

    AGCA0.74202898552.02666666670.51A

    EGAA0.78346028291.30708661421.224A

    GGGA0.92817679562.07751937981.18A

    IAUA0.35839160841.59414225940.797A

    KAAA0.76137339061.26970954361.5A

    LCUA0.29585087191.15384615381.073A

    LUUA0.67714285711.38235294121.505A

    PCCA0.93228655541.95789473680.986A

    QCAA0.51993262211.09090909090.945A

    RAGA0.97315436241.44329896910.788A

    RCGA0.75247524752.26666666670.932A

    SUCA0.79188900752.11494252870.815A

    TACA0.97052541651.88018433180.821A

    VGUA0.38819875782.08144796380.583A

    AGCC1.74057971010.82.438C

    CUGC1.20276953510.41176470591.435C

    DGAC1.17019230770.91176470591.234C

    FUUC1.19561454130.74418604651.073C

    GGGC1.5453827940.52713178291.28C

    HCAC1.18640576730.71264367821.228C

    IAUC1.69055944060.5397489541.219C

    LCUC1.06674684310.69230769231.404C

    NAAC1.16544117650.61728395061.108C

    PCCC1.46418056920.69473684211.729C

    RCGC1.60936093610.53333333331.233C

    SAGC1.404761904811.281C

    SUCC1.60939167560.82758620692.222C

    TACC1.7291755660.92165898621.932C

    VGUC1.01397515530.41628959281.75C

    YUAC1.19484702090.50549450551.307C

    AGCG0.34347826090.21333333330.303G

    EGAG1.21653971710.69291338580.776G

    GGGG0.87134964481.0232558141.06G

    KAAG1.23862660940.73029045640.5G

    LCUG2.11906193631.28205128210.581G

    LUUG1.32285714290.61764705880.495G

    PCCG0.47301275760.16842105260.425G

    QCAG1.48006737790.90909090911.055G

    RAGG1.02684563760.55670103091.212G

    RCGG1.11611161121.06666666671.053G

    SUCG0.36926360730.22988505750.148G

    TACG0.37761640320.18433179720.256G

    VGUG1.93478260870.88687782810.861G

    AGCU1.17391304350.960.749U

    CUGU0.79723046491.58823529410.565U

    DGAU0.82980769231.08823529410.766U

    FUUU0.80438545871.25581395350.927U

    GGGU0.65509076560.37209302330.48U

    HCAU0.81359423271.28735632180.772U

    IAUU0.9510489510.86610878660.984U

    LCUU0.51834034880.87179487180.942U

    NAAU0.83455882351.38271604940.892U

    PCCU1.13052011781.17894736840.86U

    RCGU0.52205220520.13333333330.782U

    SAGU0.595238095210.719U

    SUCU1.22945570970.82758620690.815U

    TACU0.92268261431.01382488480.991U

    VGUU0.66304347830.61538461540.806U

    YUAU0.80515297911.49450549450.693U

    [Chart FIg1: series A-ending, C-ending, G-ending, U-ending; axis titles: RSCU (Human), RSCU (HIV-1); data labels: V, T, S, R, R, Q, P, L, L, K, I, G, E, A]

    [Chart Fig.2: series A-ending, C-ending, G-ending, U-ending; axis titles: RSCU (Human), RSCU (HTLV-1)]

    EarlyLate

    AACodonEndingRSCUHumRSCUHIV1LateRSCUHIV1EarlyRSCUHTL1LateRSCUHTL1EarlySUMMARY OUTPUT: LateRSCUHumRSCUHIV1LateSUMMARY OUTPUTAACodonRSCUHumRSCUHIV1Late

    AGCAA-A0.74202898552.1221.6670.5430.6150.74202898552.122AGCA0.74202898552.122

    EGAAE-A0.78346028291.4140.7271.2241.231Regression Statistics1.74057971010.805Regression StatisticsAGCC1.74057971010.805

    GGGAG-A0.92817679562.1261.7331.3621.28Multiple R0.47718215290.34347826090.171Multiple R0.054264191AGCG0.34347826090.171

    IAUAI-A0.35839160841.6430.6430.7650.75R Square0.22770280711.17391304350.902R Square0.0029446024AGCU1.17391304350.902

    KAAAK-A0.76137339061.3590.7621.4471.417Adjusted R Square0.20790031491.17019230770.887Adjusted R Square-0.0326645189DGAC1.17019230770.887

    LCUAL-A0.29585087191.1581.121.1190.909Standard Error0.38156973990.82980769231.113Standard Error0.5464828602DGAU0.82980769231.113

    PCCAP-A0.93228655542.0811.4120.9550.988Observations410.78346028291.414Observations30EGAA0.78346028291.414

    QCAAQ-A0.51993262211.1170.9521.0150.6291.21653971710.586EGAG1.21653971710.586

    RAGAR-A0.97315436241.4611.53811.143ANOVA1.18640576730.73ANOVAGGGA0.92817679562.126

    TACAT-A0.97052541651.9511.50.7870.733dfSSMSFSignificance F0.81359423271.27dfSSMSFSignificance FGGGC1.5453827940.421

    VGUAV-A0.38819875782.1591.20.6340.667Regression11.67415778681.674157786811.49869448830.00160720860.29585087191.158Regression10.02469553830.02469553830.08269236390.7757990677GGGG0.87134964481.011

    AGCC1.74057971010.8050.8332.3432.769Residual395.67822318880.14559546641.06674684310.737Residual288.36201846170.2986435165GGGU0.65509076560.442

    DGAC1.17019230770.8870.751.2571.273Total407.35238097562.11906193631.333Total298.386714HCAC1.18640576730.73

    GGGC1.5453827940.4210.81.2771.60.51834034880.772HCAU0.81359423271.27

    HCAC1.18640576730.730.7141.1651CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%0.93228655542.081CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%IAUA0.35839160841.643

    IAUCI-C1.69055944060.4521.7141.2241.2Intercept0.55138258120.1451056823.7998689890.0004956790.25787863850.8448865240.25787863850.8448865241.46418056920.748Intercept0.93557266010.24525795383.81464758020.00068916520.43318452251.43796079760.43318452251.4379607976IAUC1.69055944060.452

    LCUC1.06674684310.7370.481.2661.545RSCUHum0.4486418090.13230476193.39097249890.00160720860.1810301710.71625344710.1810301710.71625344710.47301275760.065RSCUHum0.06442733990.2240461560.28756279990.7757990677-0.39451040010.5233650799-0.39451040010.5233650799IAUU0.9510489510.905

    PCCC1.46418056920.7480.5881.6522.3531.13052011781.106KAAA0.76137339061.359

    SAGC1.40476190480.8721.21.3850SUMMARY OUTPUT: early0.51993262211.117KAAG1.23862660940.641

    TACC1.7291755660.86411.772.7331.48006737790.883LCUA0.29585087191.158

    VGUC1.01397515530.4090.61.8051.667Regression Statistics1.40476190480.872LCUC1.06674684310.737

    AGCG0.34347826090.1710.1670.2860Multiple R0.36769703540.59523809521.128LCUG2.11906193631.333

    EGAG1.21653971710.5861.2730.7760.769R Square0.13520110990.92817679562.126LCUU0.51834034880.772

    GGGG0.87134964481.0111.20.8511.12Adjusted R Square0.11302677931.5453827940.421PCCA0.93228655542.081

    KAAG1.23862660940.6411.2380.5530.583Standard Error0.61577178190.87134964481.011PCCC1.46418056920.748

    LCUG2.11906193631.3331.120.6610.455Observations410.65509076560.442PCCG0.47301275760.065

    PCCG0.47301275760.0650.5880.3780.1880.97052541651.951PCCU1.13052011781.106

    QCAG1.48006737790.8831.0480.9851.371ANOVA1.7291755660.864QCAA0.51993262211.117

    RAGG1.02684563760.5390.46210.857dfSSMSFSignificance F0.37761640320.222QCAG1.48006737790.883

    TACG0.37761640320.2220.1670.1970.267Regression12.31190139282.31190139286.09719016160.01802421050.92268261430.963RAGA0.97315436241.461

    VGUG1.93478260870.7731.80.8290.667Residual3914.78782060720.3791748874-0.14360111660.054264191RAGG1.02684563760.539

    AGCU1.17391304350.9021.3330.8290.615Total4017.099722SAGC1.40476190480.872

    DGAU0.82980769231.1131.250.7430.727SAGU0.59523809521.128

    GGGU0.65509076560.4420.2670.5110CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%TACA0.97052541651.951

    HCAU0.81359423271.271.2860.8351Intercept0.47278688790.23416947162.0189945540.0503998903-0.00086557170.9464393476-0.00086557170.9464393476TACC1.7291755660.864

    IAUU0.9510489510.9050.6431.011.05RSCUHum0.52721311210.213511532.46924890640.01802421050.09534528290.95908094120.09534528290.9590809412TACG0.37761640320.222

    LCUU0.51834034880.7721.280.9541.091TACU0.92268261430.963

    PCCU1.13052011781.1061.4121.0150.471VGUA0.38819875782.159

    SAGU0.59523809521.1280.80.6152VGUC1.01397515530.409

    TACU0.92268261430.9631.3331.2460.267VGUG1.93478260870.773

    VGUU0.66304347830.6590.40.7321VGUU0.66304347830.659

    0.54440046870.7330651607

    PCCAP-A0.93228655542.0811.4120.9550.988

    PCCC1.46418056920.7480.5881.6522.353

    PCCG0.47301275760.0650.5880.3780.188

    PCCU1.13052011781.1061.4121.0150.471

    xxia: minimum number of codons/family: 14

    [Chart EarlyLate: series RSCUHIV1Late, RSCUHIV1Early; axis titles: RSCU (Human), RSCU (HIV-1 early and late genes); data labels: A-A, E-A, G-A, I-A, K-A, L-A, P-A, Q-A, R-A, T-A, V-A, I-C; trendline: y = 0.3084x + 0.6916, R2 = 0.1024]

    [Chart SlopeDiff: series RSCUHTL1Late, RSCUHTL1Early; axis titles: RSCU (Human), RSCU (HTLV-1 early and late genes); trendlines: y = 0.4486x + 0.5514, R2 = 0.2277, p = 0.0016; y = 0.5272x + 0.4728, R2 = 0.1352, p = 0.0180]

    [Chart "EarlyLateCAI": axis titles RSCU (Human) and RSCU (HIV1-late); series RSCUHIV1Late]

    Fig. 2Final

    AACodonHumanCFMusCFHIVCFRSCUHumRSCUMusRSCUHIVObsFreqHTL1RSCUHTL1ObsFreqHTL1EarlyRSCUHTL1EarlyObsFreqHTL1LateRSCUHTL1LateObsFreqHIV1EarlyRSCUHIV1EarlyObsFreqHIV1LateRSCUHIV1LateRSCUAACodonRSCUHIV1EarlyRSCUHIV1Late

    AGCA51234611140.74200.85672.0267320.5140.615190.543101.667872.122Codon familyRank(Isupply)HIV1EarlyHIV1LateGGGA1.7332.126

    AGCC12016299451.74061.55920.80001532.438182.769822.34350.833330.805Arg(AGA)41.5381.461GGGC0.80.421

    AGCG2371499120.34350.37100.2133190.30300100.28610.16770.171Arg(AGG)20.4620.539GGGG1.21.011

    AGCU8104901541.17391.21310.9600470.74940.615290.82981.333370.902Ile(AUA)140.6431.643GGGU0.2670.442

    CUGC6083055141.20281.05640.4118611.43581331.46750.83380.348Ile(AUY)52.3571.357HCAC0.7140.73

    CUGU4032729540.79720.94361.5882240.56581120.53371.167381.652Leu(UUA)81.2501.358HCAU1.2861.27

    DGAC12176493621.17021.13820.9118871.234141.273441.25760.75470.887Leu(UUG)10.7500.642IAUA0.6431.643

    DGAU8634916740.82980.86181.0882540.76680.727260.743101.25591.113Lys(AAA)130.7621.359IAUC1.7140.452

    EGAA108064651660.78350.81391.3071711.224161.231301.224120.7271281.414Lys(AAG)121.2380.641IAUU0.6430.905

    EGAG16779421881.21651.18610.6929450.776100.769190.776211.273530.586Gly(GGA)91.7332.126KAAA0.7621.359

    FUUC10365193321.19561.17560.7442661.073200.909331.15830.667240.727Gly(GGB)102.2671.874KAAG1.2380.641

    FUUU6973642540.80440.82441.2558570.927241.091240.84261.333421.273Val(GUA)71.2002.159LCUA1.121.158

    GGGA58846521340.92821.08972.0775591.18161.28321.362131.7331012.126Val(GUB)32.8001.841LCUC0.480.737

    GGGC9795667341.54541.32740.5271641.28201.6301.27760.8200.421Thr(ACA)61.5001.951LCUG1.121.333

    GGGG5523722660.87130.87181.0233531.06141.12200.85191.2481.011Thr(ACB)112.5002.049LCUU1.280.772

    GGGU4153036240.65510.71110.3721240.4800120.51120.267210.442LUUA1.251.358

    HCAC5763671311.18641.23560.7126891.228161461.16550.714230.73LUUG0.750.642

    HCAU3952271560.81360.76441.2874560.772161330.83591.286401.27RAGA1.5381.461

    IAUA20514441270.35840.40041.5941510.797100.75250.76530.6431091.643RAGG0.4620.539

    IAUC9675728431.69061.58820.5397781.219161.2401.22481.714300.452RCGA31.714

    IAUU5443648690.95101.01150.8661630.984141.05331.0130.643600.905RCGC0.3330.857

    KAAA88753381530.76140.74501.26971111.5171.417551.44780.7621251.359RCGG0.6671.143

    KAAG14438993881.23861.25500.7303370.570.583210.553131.238590.641RCGU00.286

    LCUA2461713450.29590.36951.15381071.073200.909611.11971.12331.158SAGC1.20.872

    LCUC8874765271.06671.02780.69231401.404341.545691.26630.48210.737SAGU0.81.128

    LCUG17629179502.11911.97991.2821580.581100.455360.66171.12381.333SUCA12.169

    LCUU4312887340.51830.62270.8718940.942241.091520.95481.28220.772SUCC10.814

    LUUA2371153940.67710.57051.3824701.50541381.43451.25741.358SUCG0.3330.203

    LUUG4632889421.32291.42950.6176230.49541150.56630.75350.642SUCU1.6670.814

    MAUG1012539777391017655TACA1.51.951

    NAAC9515391501.16541.16900.6173721.108101.111381.01371.167360.571TACC10.864

    NAAU68138321120.83460.83101.3827580.89280.889370.98750.833901.429TACG0.1670.222

    PCCA4754201930.93231.13721.95791090.986210.988480.955121.412642.081TACU1.3330.963

    PCCC7464621331.46421.25090.69471911.729502.353831.65250.588230.748VGUA1.22.159

    PCCG241152780.47300.41330.1684470.42540.188190.37850.58820.065VGUC0.60.409

    PCCU5764428561.13051.19861.1789950.86100.471511.015121.412341.106VGUG1.80.773

    QCAA46330641140.51990.56131.09091200.945110.629681.015100.952861.117VGUU0.40.659

    QCAG13187854951.48011.43870.90911341.055241.371660.985111.048680.883

    RAGA43529901400.97321.01721.4433130.78841.14391201.5381031.461

    RAGG4592889541.02680.98280.5567201.21230.8579160.462380.539

    RCGA2091507170.75250.82322.2667310.93271.273160.979361.714

    RCGC447247141.60941.34970.5333411.23371.273231.39410.33330.857

    RCGG310215081.11611.17441.0667351.05330.545160.9720.66741.143

    RCGU145119510.52210.65270.1333260.78250.909110.6670010.286

    SAGC8264679531.40481.28631.0000411.28100271.38591.2340.872

    SAGU3502596530.59520.71371.0000230.71942120.61560.8441.128

    SUCA3712798460.79190.93512.1149440.81580.681300.98431322.169

    SUCC7544382181.60941.46440.82761202.222252.128632.06631120.814

    SUCG173106450.36930.35560.229980.14820.1750.16410.33330.203

    SUCU5763725181.22951.24490.8276440.815121.021240.78751.667120.814

    TACA56837511020.97051.09521.8802480.821110.733240.78791.5791.951

    TACC10125099501.72921.48880.92171131.932412.733541.7761350.864

    TACG2211446100.37760.42220.1843150.25640.26760.19710.16790.222

    TACU5403404550.92270.99391.0138580.99140.267381.24681.333390.963

    VGUA25015511150.38820.42182.0814210.58340.667130.63461.2952.159

    VGUC6533822231.01401.03930.4163631.75101.667371.80530.6180.409

    VGUG12466962491.93481.89310.8869310.86140.667170.82991.8340.773

    VGUU4272375340.66300.64580.6154290.80661150.73220.4290.659

    WUGG572339298721340881

    YUAC7424116231.19481.17850.5055661.307221.833291.05561.091150.429

    YUAU5002869680.80520.82151.4945350.69320.167260.94550.909551.571

    *UAA44284240201

    *UAG20197700034

    *UGA68669122010

    401672403783568

    DiffMutationEarlyLate

    AA  Codon  RSCUHIV1Late  RSCUHum       HumG   Group
    A   GCA    2.122         0.7420289855  0      0
    A   GCC    0.805         1.7405797101  0      0
    A   GCG    0.171         0.3434782609  0      0
    A   GCU    0.902         1.1739130435  0      0
    D   GAC    0.887         1.1701923077  0      0
    D   GAU    1.113         0.8298076923  0      0
    E   GAA    1.414         0.7834602829  0      0
    E   GAG    0.586         1.2165397171  0      0
    G   GGA    2.126         0.9281767956  0      0
    G   GGC    0.421         1.545382794   0      0
    G   GGG    1.011         0.8713496448  0      0
    G   GGU    0.442         0.6550907656  0      0
    H   CAC    0.73          1.1864057673  0      0
    H   CAU    1.27          0.8135942327  0      0
    I   AUA    1.643         0.3583916084  1.643  1
    I   AUC    0.452         1.6905594406  0.452  1
    I   AUU    0.905         0.951048951   0.905  1
    K   AAA    1.359         0.7613733906  1.359  1
    K   AAG    0.641         1.2386266094  0.641  1
    L   CUA    1.158         0.2958508719  0      0
    L   CUC    0.737         1.0667468431  0      0
    L   CUG    1.333         2.1190619363  0      0
    L   CUU    0.772         0.5183403488  0      0
    P   CCA    2.081         0.9322865554  0      0
    P   CCC    0.748         1.4641805692  0      0
    P   CCG    0.065         0.4730127576  0      0
    P   CCU    1.106         1.1305201178  0      0
    Q   CAA    1.117         0.5199326221  0      0
    Q   CAG    0.883         1.4800673779  0      0
    R   AGA    1.461         0.9731543624  1.461  1
    R   AGG    0.539         1.0268456376  0.539  1
    S   AGC    0.872         1.4047619048  0      0
    S   AGU    1.128         0.5952380952  0      0
    T   ACA    1.951         0.9705254165  0      0
    T   ACC    0.864         1.729175566   0      0
    T   ACG    0.222         0.3776164032  0      0
    T   ACU    0.963         0.9226826143  0      0
    V   GUA    2.159         0.3881987578  2.159  1
    V   GUC    0.409         1.0139751553  0.409  1
    V   GUG    0.773         1.9347826087  0.773  1
    V   GUU    0.659         0.6630434783  0.659  1

    xxia: 0: Codon families without documented selective packaging; 1: with selective packaging, i.e., those in Table 1 other than Thr and Gly.

    SUMMARY OUTPUT (predictors: RSCUHum and HumG)

    Regression Statistics
    Multiple R          0.2990972868
    R Square            0.089459187
    Adjusted R Square   0.0415359863
    Standard Error      0.5284094753
    Observations        41

    ANOVA
                df   SS             MS            F             Significance F
    Regression  2    1.0424382052   0.5212191026  1.8667197866  0.1685345406
    Residual    38   10.6102297948  0.2792165735
    Total       40   11.652668

               Coefficients   Standard Error  t Stat        P-value       Lower 95%      Upper 95%
    Intercept  1.0387369799   0.21534717      4.8235459967  0.0000230037  0.6027894294   1.4746845304
    RSCUHum    -0.1114304325  0.1864465213    -0.597653588  0.5536162996  -0.488871679   0.266010814
    HumG       0.2709483233   0.1598565926    1.6949461942  0.0982666541  -0.0526644271  0.5945610738

    SUMMARY OUTPUT (predictor: HumG)

    Regression Statistics
    Multiple R          0.2844298639
    R Square            0.0809003475
    Adjusted R Square   0.0573336897
    Standard Error      0.5240366769
    Observations        41

    ANOVA
                df   SS             MS            F             Significance F
    Regression  1    0.9427048899   0.9427048899  3.4328307512  0.0714906748
    Residual    39   10.7099631101  0.2746144387
    Total       40   11.652668

               Coefficients  Standard Error  t Stat        P-value       Lower 95%      Upper 95%
    Intercept  0.922558363   0.0918962906    10.039125154  0             0.736680572    1.1084361539
    HumG       0.2886461016  0.1557899724    1.8527899911  0.0714906748  -0.0264688581  0.6037610612

    Seq      SeqLen  CAI      CAI2
    tat      261     0.66875  0.71957
    rev      351     0.66211  0.72057
    nef      621     0.67523  0.72625
    gag-pol  4308    0.59163  0.6524
    vif      579     0.61941  0.68549
    vpr      291     0.64272  0.69085
    vpu      249     0.49068  0.56748
    env      2571    0.61924  0.68272

    t-Test: Two-Sample Assuming Unequal Variances
                                  Variable 1    Variable 2
    Mean                          0.6686966667  0.592736
    Variance                      0.0000430357  0.0035822537
    Observations                  3             5
    Hypothesized Mean Difference  0
    df                            4
    t Stat                        2.8098988703
    P(T

  • Table 2. Frequency of A residues, length and codon adaptation index (CAI) for the three HIV-1 early (tat, rev and nef) and five late (gag-pol, vif, vpu, vpr, and env) coding sequences (CDS). Any problem with the mutation hypothesis?

    Gene  CDS (bp)  CAI
    tat   261       0.66875
    rev   351       0.66211
    nef   621       0.67523
    gag   1503      0.62784
    pol   3012      0.58139
    vif   579       0.61941
    vpr   291       0.64272
    vpu   249       0.49068
    env   2571      0.61924
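    The early-versus-late CAI comparison can be checked with a short script. The sketch below is only an illustration, not the author's own procedure: it assumes the worksheet's "t-Test: Two-Sample Assuming Unequal Variances" is a Welch test, and it uses the worksheet's single gag-pol entry (CAI 0.59163) for the late genes rather than the separate gag and pol rows of Table 2. The function name `welch_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# CAI values from the worksheet: three early and five late HIV-1 CDSs.
early_cai = [0.66875, 0.66211, 0.67523]                   # tat, rev, nef
late_cai = [0.59163, 0.61941, 0.64272, 0.49068, 0.61924]  # gag-pol, vif, vpr, vpu, env


def welch_t(a, b):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, df = welch_t(early_cai, late_cai)
print(f"t = {t_stat:.4f}, df = {df:.2f}")
```

    Under these assumptions the t statistic agrees with the worksheet value (t Stat 2.8099), and the fractional degrees of freedom round to the worksheet's df of 4.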

    You may be wondering about the Cys codon family, which has 4 tRNAs matching UGC but none matching UGU. We would have predicted that UGC should be preferred, but the opposite is true. Why? One might think that, because Cys is rarely used, the codon family is not under selection, so codon usage is at the mercy of mutation bias. Because the yeast genome is AT-biased, we would then expect U-ending codons to be more frequent than C-ending codons. Are you happy with this explanation? Unfortunately, the explanation is wrong, and the correct answer is still elusive.
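    The Cys discrepancy can be made concrete with the Cys rows of Table 9-2. The sketch below assumes that column T is the number of matching tRNAs and column F is the codon frequency, and it uses the usual definition of RSCU (observed count divided by the mean count over the synonymous codons of the family). The function name `rscu` and the variable names are hypothetical.

```python
def rscu(counts):
    """Relative synonymous codon usage for one codon family.

    counts maps each synonymous codon to its observed frequency.
    """
    mean_count = sum(counts.values()) / len(counts)
    return {codon: n / mean_count for codon, n in counts.items()}


# Cys rows of Table 9-2 (yeast): F is read as codon frequency, T as tRNA count.
cys_counts = {"UGC": 3, "UGU": 39}
cys_trna = {"UGC": 4, "UGU": 0}

cys_rscu = rscu(cys_counts)
preferred_by_usage = max(cys_rscu, key=cys_rscu.get)
preferred_by_trna = max(cys_trna, key=cys_trna.get)

for codon in sorted(cys_counts):
    print(codon, f"RSCU = {cys_rscu[codon]:.3f}", f"tRNA = {cys_trna[codon]}")
print("preferred by usage:", preferred_by_usage)
print("preferred by tRNA: ", preferred_by_trna)
```

    With these inputs the tRNA counts favor UGC while the observed usage favors UGU, which is the mismatch the paragraph above asks about.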