nguyen thi duy nhat, stacy todd, erwin de bruin, tran …10.1038...nguyen thi duy nhat, stacy todd,...
TRANSCRIPT
Appendix for Nhat, Todd, de Bruin et al (2017)
1 / 30
SupplementaryOnlineAppendixStructureofgeneral‐populationantibodytiterdistributionstoinfluenzaAvirusNguyenThiDuyNhat,StacyTodd,ErwindeBruin,TranThiNhuThao,NguyenHaThaoVy,TranMinhQuan,
DaoNguyenVinh,JankovanBeek,PhamHongAnh,HaMinhLam,NguyenThanhHung,NguyenThiLeThanh,
HuynhLeAnhHuy,VoThiHongHa,StephenBaker,GuyEThwaites,NguyenThiNamLien,TranThiKimHong,
JeremyFarrar,CameronPSimmons,NguyenVanVinhChau,MarionKoopmans,MaciejFBoni
TableofContents
1. Mappingluminescencevaluestotitersusingluminescencevaluesatfourdilutions1.1 Four‐parameterlog‐logisticcurve1.2 StandardizationagainsttheinternationalInfluenzaAH1N1Pdm2009Positive
Control2. Mappingluminescencevaluestotitersusingluminescencevaluesattwodilutions3. Titerreproducibility4. Representativenessofsamplecollectionversuscensusdata5. Choosingbinwidthfordiscretizingthenormaldistribution6. Comparisonsofgamma,Weibull,andnormalfits7. Confidenceintervalsformixturecomponentestimates8. Confidenceintervalsforpandemicattackrate9. Additionalfiguresandtables
Appendix for Nhat, Todd, de Bruin et al (2017)
2 / 30
1. MappingluminescencevaluestotitersusingluminescencevaluesatfourdilutionsSamples were tested for presence of a range of seasonal and pandemic influenza
antibodiesusingtheprotein‐microarray(PA)methodatserialfour‐folddilutionsfrom20to1280 except for the international standard (IS) positive control against H1N1 Pandemic2009 that was tested from 40 to 2560. Commercially available HA1 proteins of the 16influenzaviruses(TableS1)werespottedinduplicatesoneachofthe64‐padnitrocellulosecoated slide as described by Koopmans et al (ClinicalMicrobiology and Infection, 2012,18(8):797‐807). Allof theslidesused inthisstudywereprintedat theErasmusMedicalCenter(Rotterdam,NL)andtheDutchNationalInstituteforPublicHealthandEnvironment(RIVM;Bilthoven,NL)byErwindeBruin. Oneachslide,15humanseraandonepositivecontrol were tested in BSL‐2 conditions at the Oxford University Clinical Research Unit(OUCRU)inHoChiMinhCity,inthesamemannerdescribedbyKoopmansetal(2012).TheprocessedslideswerethendriedandsentbacktoRIVMforscanning.TheantigensignalswerequantifiedbyaScanarrayGxPlusmicroarrayscanner(PerkinElmer)withanadaptivecirclesizeof80−200μm.DigitalcopiesofthescanimageswereanalyzedusingScanArrayExpress (version 4.0) Software (PerkinElmer) to determine the median fluorescenceintensity. ThePAtiterwasdefinedasthebestfittiteraccordingtoafour‐parameterlog‐logistic functiondescribed in thenext section. Fitted titerswere then standardized to apositivecontrolinordertocorrectforvariationacrossslides;valuesfallingoutsidethe[20,1280]rangewerescoredas10and1810.AlldataanalysiswasdoneinMatlabR2015a.TheinitialscriptwaswrittenbyJankovanBeek.
Appendix for Nhat, Todd, de Bruin et al (2017)
3 / 30
TableS1:Listofallinfluenzastrainstestedonthemicroarrays
HumanAntigensAbbreviation AvianAntigens
Abbreviation
H1N1
A/SouthCarolina/1/18 H1‐18
H5N1
A/Vietnam/1194/2005 H5‐05
A/USSR/92/1977 H1‐77 A/Cambodia/R0405050/2007 H5‐07
A/NewCaledonia/20/1999 H1‐99 A/Hubei/1/2010 H5‐10
A/Brisbane/59/2007 H1‐07 H7N7 A/Chicken/Netherlands/2003 H7‐03
A/California/6/2009 H1‐09 H9N2 A/GuineaFowl/HK/1999 H9‐99
H3N2
A/Aichi/2/68 H3‐68
A/Wyoming/3/2003 H3‐03
A/Wisconsin/67/2005 H3‐05
A/Brisbane/10/2007 H3‐07
A/Victoria/210/2009 H3‐09
A/Victoria/361/2011 H3‐11
1.1 Fourparameterlog‐logisticcurve
Foreachhuman‐antigencombination,theeightluminescencescoresobtainedfromserial four‐folddilutionswere fitted,usingnon‐linear least‐squares regression, toa four‐parameter log‐logistic curve (Figure S1); errors were defined as absolute luminescencedifferencesbetweenthemodelandthedatapoint. Thelog‐logisticluminescencecurveissigmoidalinshape. Themaximumreadoutgivenbythescannerwas65355;thus,65356wasusedasthemaximumluminescenceinthelog‐logisticfit.Theminimumreadoutwasdefinedas3000,belowwhichitwasdifficulttodifferentiatebetweentheantigensignalsandthebackgroundluminescence.Mathematically,thetitervaluewasdeterminedasthex‐valueoftheinflectionpointofthefittedlog‐logisticcurve,whichisequivalenttothepointwherethefittedcurveachievesaluminescenceof34268. SampleswereassignedtitervaluesofNaNifthewidthofthe95%confidenceintervalresultingfromthefitwasgreaterthanonelog2‐unit. The variables titer and bwere fit with 8 luminescence data points using theequationbelow:
Lum1 exp ∙ log
dilutiontiter
inwhich: Lumi:luminescencescoreatdilutionic=3000:minimumluminescenced=65535:maximumluminescetiter:dilutionoftheinflectionpoint
Appendix for Nhat, Todd, de Bruin et al (2017)
4 / 30
Figure S1: Luminescence vs. protein micro‐array (PA) titers being fitted to four‐
parameterlog‐logisticcurveinwhichc=3000,d=65536,titer=dilutionattheinflectionpoint,andb=slopeattheinflectionpoint1.2 Standardizationagainsttheinternationalpositivecontrol
Asampleof theWHO2nd InternationalStandard(ISH2) forantibody to influenzaH1N1pdmvirus(H1‐09)(NationalInstituteforBiologicalStandardsandControls(NIBSC),NIBSC code: 10/202) was included on every slide to correct for variation among slides,technicians,slidebatches(differentprintingruns),andslideseries(differentcommercialantigen purchases). In order to take into account the titer change across series, thestandardizationprocedurewasdividedintoatwo‐stepprocedure.First,allsampleswithinthe same slide serieswere corrected to thegeometricmean titers for ISH2 samples of areferenceantigenwithintherespectiveseries.Second,dependingonthedifferencebetweenthegeometricmeanofthereferenceantigenforseriesscomparedtoitsgeometricmeaninseries1,thetitersweredeflated/inflatedoverthegeometric‐meantitersofseries1.Inouranalysis,thepandemic2009H1N1(H1‐09)antigenwaschosenasthereferenceantigenforcorrectingvarianceacrossslidesduetoitsstableperformanceovertime(TableS2).
Appendix for Nhat, Todd, de Bruin et al (2017)
5 / 30
TableS2:SummaryofISH2performanceagainstallantigens
Antigen
Series1 Series2 Series3
Mean Std Mean StdInflationoverSeries1
P‐value Mean StdInflationoverSeries1
P‐value
H1‐18 144.15 59.04 250.23 91.47 1.74 0 149.35 48.56 1.04 0.76675
H1‐77 762.14 388.88 1091 514.66 1.43 0 905.99 312.2 1.19 0.00031
H1‐99 287 156.21 307.74 121.12 1.07 0.01314 190.88 39.08 0.67 0.00491
H1‐07 247 110.69 325.42 119.93 1.32 0 483.18 179.65 1.96 0
H1‐09 158.91 44.76 154.15 89.66 0.97 0.48132 201.72 54.78 1.27 0
H3‐68 348.95 145.17 542.08 103.93 1.55 0 458.56 87.03 1.31 0.00064
H3‐03 392.81 119.55 550.42 275.15 1.40 0 448.03 94.08 1.14 0
H3‐05 679.28 336.05 631.39 277.52 0.93 0.00109 777.45 209.43 1.14 0.01081
H3‐07 126.14 37.55 189.85 75.41 1.51 0 158.85 93.98 1.26 0
H3‐09 108.86 40.07 159.54 56.99 1.47 0 171 65.97 1.57 0
H3‐11 146.08 61.06 214.03 82.15 1.47 0 174.75 64.58 1.20 0
H5‐04 40.01 0.23 40.27 12.36 1.01 0.21794 NaN NaN NaN NaN
H5‐07 40.03 0.52 41.14 19.01 1.03 0.02606 NaN NaN NaN NaN
H5‐10 40.01 0.18 40.34 5.34 1.01 0.03549 40.33 24.04 1.01 0.21348
H7‐03 40.23 14.71 40.96 48.39 1.02 0.36462 40.22 2.85 1.00 0.83947
H9‐99 47.28 27.6 89.58 35.03 1.89 0 69.7 25.59 1.47 0
Mathematically, the standardization procedure looks as follows. The within‐seriescorrectionfactorornormalizationfactorforsamplesonslideiandserieskiscalculatedas:
, ,
, , ,
inwhich , , istheH1‐09titerofthepositivecontrolonslidei(assumingitisinseries
k);and , , isthegeometricmeanofISH2titersfortheH1‐09antigeninseriesk.
Hence,thenormalizationfactoristhesameforallantigens andallsamplessonthesameslide; and it serves as the rejection criterion to discard slideswhoseH1‐09 titers of thepositivecontroldifferbymorethan1.2dilutions(log2‐units)awayfrom , , .
To correct for the inter‐series variation, themean titer of antigen in seriesk isinflated/deflatedoveritscorrespondingmeanfromseries1:
′ , ,
, ,
Theindividualtiterofantigen forallsamplessonslideiinserieskisthencorrectedas:
Appendix for Nhat, Todd, de Bruin et al (2017)
6 / 30
∙ ′ ∙ _
2. Mappingluminescencevaluestotitersusingluminescencevaluesattwodilutions
Afteranalyzingabout19,000humanseraatfour‐folddilutions(~300,000sample‐antigen combinations), we began a process of mapping luminescence series to titers todetermineiffewerluminescencepointscouldbeusedfortiterinference.Thiswasdonetoreducethenumberofdilutionsdonepersampleinordertoreducecost.
FigureS2showstherelationshipbetweenluminescencescoresatsingledilutionsandthetiterobtainedfromafour‐folddilution.Clearly,asingledilutionisnotsufficienttobeusedasaproxyforafour‐foldtiter,asthemajorityofmappingsfromsingledilutionstotiterswouldresultinmappedtitersthatareoneortwologunitsdistantfromthetruetiter.Thesingle dilution at 20 is especially uninformative, as 80% of the luminescence scores atdilution1:20aresaturatedat65535(FigureS2,top‐leftpanel).Thus,weconsideredthreedilutionpairsasthebestcandidatesformappingluminescencepairstotruetiters.Thethreepairsconsideredwere1:80and1:320,1:80and1:1280,and1:320and1:1280.Theobjectivewastodefineamappingfunctionf(L1,L2)=T,whereTisthetruetiterobtainedfromfour‐folddilutionsandL1andL2areluminescencescoresattwodifferentdilutions.Fourdifferentfunctionalformsoffwereinvestigated:
1.asimplegridapproachwheretheL1×L2spacewasdividedintogridsquaresofsizen×nluminescencepointsoneachside.Valuesofn=100,200,500,and1000weretried.TheexactL1×L2spaceusedwas[3000,65536]×[3000,65536].Themediantiterwasusedasthemappedtiterpointineachgridsquare.Thisfunctiongavepoormappingpoweratthetwoends,titersbelow200andabove1000.Thisqualityofthemappingdidnotimprovewithsmallergridsize.
2. a piecewise linear function f (L1,L2 ) = 65355
65355 with the breaks
occurringattheluminescencepointswherethetitersachievetheirboundaryvalues;i.e.titerof1810wasassignedforL2>34268.3.alinearsplinesurfaceplotoff(L1,L2)usingalinearinterpolationmethodwiththeMATLABfamilyoffunctionsfitobject.
4.acubicsplinesurfaceplotoff(L1,L2)usingcubicsplineinterpolationmethodwiththeMATLABfamilyoffunctionsfitobject.
Functionalform4waschosenasithadthebestaccuracyacrossalltiterrangesaboveatiterof150,althoughfunctionalform3hadsimilaraccuracy.Noneofthemethodswereable to accurately map to titers below 150 when the smallest dilution used was 1:320.
Appendix for Nhat, Todd, de Bruin et al (2017)
7 / 30
However, the overall performance of the mapping was considerably poorer using the1:80/1:320and1:80/1:1280pairswhichiswhy1:320and1:1280werechosenasthetwomostinformativedilutionstouseforinferringtiters(seebelow).
FigureS2:Four‐foldtitervs.luminescencescoresateachofthefourdilutionsfrom1:20to1:1280
FigureS3showsthemapped2‐foldtitersusingfunctionalform4againstthetrue4‐foldtiters.Threedifferentdilutionpairsforthe2‐foldtitersareshowninthethreepanels.The1:80and1:320dilutionpairlostdiscriminatorypowerfortitervaluesabove700.The1:80and1:1280dilutionpair,ontheotherhand, failedtomaptiters in the intermediaterange[400,1000].Eventhoughthe1:320and1:1280dilutionpairdidnotmaptitersbelow200accurately,thisshouldnotaffecttheseroprevalenceestimatesbecausethemeanofthehighest titercomponent inthemaintextarehigherthan200forallage‐groups(Table1,main text). In otherwords, losing discriminatory power among the seronegative groupsshould not affect seroprevalence estimates. After the initial 19,000 serawere processedusing4‐foldtiters,subsequentserawereprocessedatdilutions1:320and1:1280only(withtheexceptionofHoChiMinhCitysamples,whichwerealwaysprocessed4‐fold).Toensureaccurate standardization across slides, the positive control samples (ISH2) were alwaysprocessedatfourdilutions;andthestandardizationprocedurepresentedinsection1.2wasunchanged.
Appendix for Nhat, Todd, de Bruin et al (2017)
8 / 30
FigureS3:2‐foldtiterobtainedfromcubicsplinesurfacemapvs.4‐foldtiterfordifferentpairsofdilutions1:80and1:320,1:80and1:1280,and1:320and1:1280(topdown)
Appendix for Nhat, Todd, de Bruin et al (2017)
9 / 30
3. TiterReproducibilityThirty‐twosamplescollectedfromKhanhHoaweretested11timesbydifferentlab
techniciansondifferentdaystoassesstiterreproducibility.Acrossallsixteenantigensonthearray,98%ofthesample‐antigencombinationshadastandarddeviationoflog‐titerlessthanonewhenperformingreplicateexperiments(FigureS4,toppanel),makingPAtitersmorereproduciblecomparedtostandardHIassays. Wealso lookedatthereproducibilityof thepositivecontrol(ISH2);seeFigureS4,bottompanel. ThemediandifferencebetweenanindividualISH2titerandthegeometricmeanofall(n=1457)ISH2titerswasbelow0.5.
FigureS4:Top:Boxplotsofstandarddeviationsoflog2‐titersforelevenhumanantigens(x‐axis)forelevenreplicatesof32differentsamplesfromKhanhHoa.Inotherwords,eachboxplot summarizes 32 standard deviations. Bottom: Differences between individual ISH2titersandthegeometricmeanoftheISH2for1457replicateson11humanantigens(logtiterscale).Inbothplots,theredlineisthemediandifference;andtopandbottomedgesare75thand25thpercentileforeachbox,respectively.Thewhiskersextendtoq3+1.5*(q3–q1)andtoq1–1.5(q3–q1),whereq1andq3arethe25thand75thpercentiles.Outliersareplottedinreddots.
Appendix for Nhat, Todd, de Bruin et al (2017)
10 / 30
4. Representativenessofsamplecollectionversuscensusdata
FigureS5:Collectedsamplepopulation(blackdots)versusVietnamnationalhousingcensusin2009(graybars)fordifferentsites.They‐axisshowsdifferentagegroupsandx‐axisisthepercentageofthepopulationorsampleinthatagegroup.Leftcolumnsaredataformalesandrightcolumnsforfemales.
The 0‐4 yrs‐old age group was clearly oversampled for Ho Chi Minh City, and
undersampledforDakLak,whichmighthavebeencausedbydifferenthealthcaresystemsinthesetworegionscausingpediatricadmissionstobeover/under‐representedincertainhospitals.HoChiMinhCityisadenselypopulatedcitywitheasyaccesstomanylevelsofhealthcare,whereasDakLakProvincialHospital is inasmallcity inaruralregionof theVietnamesehighlands,withalowersocioeconomiclevelandlessaccesstohealthcare.Forsomeoftheagegroups,thesamplewasveryclosetothatofthedata(olderthan40,bothgenders,Hue;olderthan50,bothgenders,HoChiMinhCityandKhanhHoa;olderthan60,bothgenders,Dak‐Lak).Forthe5‐20agegroup,allofthesitesareundersampledforbothgenders,whichistobeexpectedastheseindividualsaretheleastlikelytoreporttohospitals.Overall,theserumsamplesethadanagestructuresimilartothetruedemography;thus,thesampleweightsincludedinthelikelihoodfunctionintroducedonlyminoradjustments.
Thesampleweightssi inthelikelihoodequationswerecalculatedfromthesedata.Forexample,thefractionofmalesinthe0‐5year‐oldagerangeinthecensusdataforHCMCis0.0364whilethisfractionis0.1241forthesampleset.Thus,everysamplethatwasmale
Appendix for Nhat, Todd, de Bruin et al (2017)
11 / 30
andbelongedtothe0‐5agecategoryinthesampleshouldbedown‐weightedbyafactorofsi=0.0364/0.1241=0.2933.
Appendix for Nhat, Todd, de Bruin et al (2017)
12 / 30
5. ChoosingabinwidthfordiscretizingthenormaldistributionWhen computing the likelihood function, the normal distribution needed to be
discretizedsothata‘zero‐inflated’normaldistributioncouldbeconstructed(log‐titer=1wastheinflatedprobabilitymass).Aprobabilitymassandcontinuousprobabilityfunctioncannotbecombinedintoasinglelikelihoodfunction.
Thenormalmixturewasdefinedonthelog‐titerrange[1,7],andsevendifferentbinwidthsof1.000,0.500,0.100,0.050,0.010.0.005,and0.001were tested in this range toinvestigatethebehaviorofthesevendifferentprobabilitymassfunctionsinthelikelihoodoptimizationalgorithm(TableS3).Forasinglenormaldistributionfit,theoptimumwasthesameforallbinwidths.Inthecaseoftwoandthreecomponentsinamixturedistribution,therewerefewchangesinthefittingresults,unlessthebinwidthwaslessthanorequalto0.100.Italsowasimportanttoavoidusingtoosmallofabinwidthtoavoidtheproblemofhaving“aspike”inthefitteddistributionthathadnoepidemiologicalmeaning.Abinwidthof0.100wasusedintheanalysis.
TableS3:LLHoptimizationresultsfordifferentbinwidths
Interval
Width w0 w1 μ1 σ1 w2 μ2 σ2 w3 μ3 σ3 SumNegLLH
1.000 0.228 0.772 3.945 1.551 38357.07
0.500 0.228 0.772 3.946 1.551 48240.31
0.100 0.228 0.772 3.950 1.551 71504.04
0.050 0.228 0.772 3.950 1.551 81535.65
0.010 0.228 0.772 3.949 1.551 104838.69
0.005 0.228 0.772 3.949 1.551 114876.59
0.001 0.228 0.772 3.949 1.551 138181.93
1.000 0.228 0.150 2.063 0.234 0.622 4.387 1.386 37870.75
0.500 0.228 0.211 2.443 0.686 0.561 4.509 1.398 47830.31
0.100 0.228 0.218 2.462 0.698 0.555 4.532 1.393 71076.04
0.050 0.228 0.218 2.462 0.697 0.555 4.532 1.392 81105.93
0.010 0.228 0.218 2.462 0.697 0.555 4.531 1.393 104410.41
0.005 0.228 0.218 2.462 0.697 0.555 4.532 1.393 114448.07
0.001 0.228 0.218 2.462 0.697 0.555 4.532 1.393 137753.27
1.000 0.228 0.205 2.017 0.059 0.377 4.026 0.794 0.190 5.765 1.207 37634.10
0.500 0.228 0.379 2.735 0.812 0.151 4.640 0.426 0.242 5.422 1.352 47639.60
0.100 0.228 0.396 2.771 0.830 0.159 4.684 0.434 0.217 5.571 1.298 70856.20
0.050 0.228 0.398 2.774 0.830 0.161 4.685 0.433 0.214 5.594 1.288 80881.55
0.010 0.228 0.398 2.774 0.830 0.161 4.684 0.434 0.214 5.591 1.290 104186.77
0.005 0.228 0.398 2.774 0.830 0.161 4.684 0.435 0.213 5.597 1.287 114224.69
0.001 0.228 0.398 2.774 0.830 0.161 4.684 0.435 0.213 5.596 1.288 137529.83
Appendix for Nhat, Todd, de Bruin et al (2017)
13 / 30
6. Comparisonsofgamma,Weibull,andnormalfits Normaldistributionswerechosenfortheformofthemixturecomponents,asthisisthe traditional distributional form chosen for mixture distribution fits of continuousquantities when no other information is known or assumed about the data points.ComparisonsweredonewithmixturesofWeibullandgammadistributions,althoughnotethatbothWeibullandgammadistributionsallowforsymmetryandrightskew,butnotforleft skew. In choosing a distributional form, when all other considerations are equal, anormaldistribution formshouldbechosenoverWeibullorgamma forms toprevent fitswhere rights skews are “discovered” by the fitting process but left skews remainundiscovered. In the figuresbelow, the four‐component fits forH1N1have the lowestBICvaluewhennormaldistributionsareusedascomponents.Forthree‐componentfitsforH3N2,thegamma‐fitshaveasomewhatlowerBICvaluethanthenormalfits(ΔBIC≈19),butitisalsoclear that the component structure in this fit does not distinguish titer groups. NormalcomponentshadbetterBICthanWeibullcomponentsforbothH1N1andH3N2. Basedon(1)ourinitialconsiderationsofskewnessinthefits,(2)theBICvalueswhencomparingthefitsforbothsubtypes,and(3)ourvisualinspectionofthefits,weconcludedthat we did not have any evidential support for using gamma distributions or Weibulldistributionsinthemixtureanalysis.
Figure S6: H1N1 titers, all sites all ages. Mixture fits were performed with three differentdistributionalforms,showninthethreecolumns.Thefirstrowoffiguresshowsfitswithasinglemixturecomponent,thesecondrowof figuresshowsfitswithtwomixturecomponents,etc. Thenumberinthetop‐rightcornerofeachpanelistheBICvalueforthatmodelfit.
Appendix for Nhat, Todd, de Bruin et al (2017)
14 / 30
Figure S7: H3N2 titers, all sites all ages. Mixture fits were performed with three differentdistributionalforms,showninthethreecolumns.Thefirstrowoffiguresshowsfitswithasinglemixturecomponent,thesecondrowof figuresshowsfitswithtwomixturecomponents,etc. Thenumberinthetop‐rightcornerofeachpanelistheBICvalueforthatmodelfit.
Appendix for Nhat, Todd, de Bruin et al (2017)
15 / 30
7. Confidenceintervalsformixturecomponentestimates Here,wepresentconfidence intervals forall themixturecomponentsandweightsdisplayedinFigure3ofthemaintext(H1N1).
TableS4:Estimateswith95%confidenceintervalsinparentheses,formean,variance,andweightparameters,for1‐componentto4‐componentmixturemodelsofH1N1titers.
Appendix for Nhat, Todd, de Bruin et al (2017)
16 / 30
Andbelow,weshowthesameconfidenceintervalsforH3N2.
TableS5:Estimateswith95%confidenceintervalsinparentheses,formean,variance,andweightparameters,for1‐componentto4‐componentmixturemodelsofH3N2titers.
Appendix for Nhat, Todd, de Bruin et al (2017)
17 / 30
8. Confidenceintervalsforpandemicattackrate InthefourthparagraphoftheResultssectionofthemainpaper,forsubtypeH1N1,wepresentevidenceshowingthattheinterpretationofw4asthefractionofthepopulationthatwasrecentlyinfected,andtheinterpretationofw3asthefractionofthepopulationthatwasnon‐recently infected,areconsistentwithpost‐pandemicobservations inVietnamin2010.TableS6:Estimatesandconfidenceintervals forweightparameters inpost‐pandemic(JanuarytoJune2010)analysis.
AgeGroup NumberofSamples w4(95%CI) w3(95%CI)
0.5–9 161 0.138(0.117–0.169) 0.160(0.146–0.174)
10–19 355 0.226(0.200–0.268) 0.216(0.182–0.234)
20–44 871 0.085(0.070–0.108) 0.230(0.227–0.393)
≥45 439 0.157(0.135–0.205) 0.196(0.169–0.221)
The confidence intervals for these ‘recent attack rates’ and ‘non‐recent attack rates’ arepresentedabove,andthew4columnisconsistentwithwhathasbeenmeasuredforthe2009pandemicinVietnamandotherpartsofAsia. Notethatsomeoftheconfidenceintervalsarequitenon‐symmetric.Allconfidenceintervals were checked manually to ensure that the appropriate weight parameter wasassociated with the appropriate mean (category) during the optimization and profilingprocess.
Appendix for Nhat, Todd, de Bruin et al (2017)
18 / 30
9. AdditionalFiguresandTables
Appendix for Nhat, Todd, de Bruin et al (2017)
19 / 30
FigureS8:Top:Quantileplotshowingthe25th,50th,75th,80th,85th,90th,95th,and99thpercentilesoflog‐titerfor2009H1N1(log‐titer onx‐axis) for the aggregateddata and the individual collection sites (y‐axis).Bottom: quantile‐quantile plots forpairwisecomparisonamongthefourcollectionsites(allagegroupsincluded;notre‐weightedbyprovincepopulation).Whenthetwodatasetscontaindifferentnumbersofsamples,theplottingquantileswerecalculatedbasedonthesmallerdataset.Bluelinesaretheextrapolatedstraightlineconnectingthefirstandthirdquartilesofthesmallerdataset.
Appendix for Nhat, Todd, de Bruin et al (2017)
20 / 30
FigureS9:Top:Quantileplotshowingthe25th,50th,75th,80th,85th,90th,95th,and99thpercentilesoflog‐titerofH3N2(geometricmeantiterof2009and2011strains).Log‐titerisshownonthex‐axisandcollectionsitesareshownonthey‐axis.Bottom:quantile‐quantileplotsforpairwisecomparisonamongthefourcollectionsites(allagegroupsincluded;notre‐weightedbyprovincepopulation).Whenthetwodatasetscontaindifferentnumbersofsamples,theplottingquantileswerecalculatedbasedonthesmallerdataset.Bluelinesaretheextrapolatedstraightlineconnectingthefirstandthirdquartilesofthesmallerdataset.
Appendix for Nhat, Todd, de Bruin et al (2017)
21 / 30
FigureS10:HistogramsofantibodytiterstoH3N2(GMTof2009and2011strains),showingfitresultsformixturemodelswithdifferentnumberofnormalcomponents(toptobottom,thelabeltotheleftofthey‐axisisthenumberofmixturecomponents)andgroupedbycollectionsites(lefttoright).HistogramsareweightedtoadjustforageandgenderaccordingtotheVietnamnationalhousingcensusin2009foreachofthefourcollectionsites.Thebluelinesineachpanelarethenormalizedprobabilitydensityfunctionsofthecomponentdistributionswhoseintensitywasrankedinorderofincreasingμ.Theblacklinesshowthefullmixturedistributiondensity,andtheblackdotsaretheestimatedcumulativedistributionofthemixturemodelsat7.0(titerof1280).ThenumbersintheupperrightcornerofeachpanelaretheBICscoresofthemodelfits.
Appendix for Nhat, Todd, de Bruin et al (2017)
22 / 30
FigureS11:VisualizationofmodelselectionprocessforH3N2titerdistributionsfromFigureS8. They‐axesshowthefittedvaluesofwi (mixtureweights), μi(means), andσi (standarddeviations).Components’ shadeswere ranked from lightest todarkestinincreasingorderofμ.Inthetoppanel,the“0thcomponent”representsthepointmassw0placedat20fortitersbelowthelowerdetectionlimitof20.Notethatforfiveorsixcomponents,inmanycases,theweightsorstandarddeviationparametersareclosetozero;forsomecases,twooftheinferredmeanparametersareveryclosetoeachother.
Appendix for Nhat, Todd, de Bruin et al (2017)
23 / 30
FigureS12:TiterhistogramsforH3N2andfitresultsformixturemodelswithdifferentnumberofcomponents(labelontheleftis thenumberofmixturecomponents)andgroupedbydifferentagegroups.Histogramsareweightedtoadjust forageandgenderaccordingtotheVietnamnationalhousingcensusin2009.ThenumbersintheupperrightcornerofeachpanelarethefittedBICscoresoftherespectivemodel.Foreachpanel,thebluelinesarethenormalizedprobabilitydensityofthecomponentdistributionswhose intensity fromlightest todarkestwasrankedinorderof increasingμ.Black linesarethetotalmixturedistributiondensity;andtheblackdotsareestimatedprobabilityweightofthemixturemodelfortiters≥7.0.
Appendix for Nhat, Todd, de Bruin et al (2017)
24 / 30
FigureS13:Antibodytiterhistogramsforn=20,152individuals,plottedforH1N1strains(toppanels)andH3N2strains(bottompanels).Thefractionsofindividualswithtitersbelowthedetectionlimitof20andabove1280thatwereoutoftheplottingrangesaregivennexttotherespectivebar.HistogramswereweightedtoadjustforageandgenderaccordingtotheVietnamnationalhousingcensusin2009forthefourcollectionsites.
Appendix for Nhat, Todd, de Bruin et al (2017)
25 / 30
FigureS14:SimilartoFigure5inthemaintext,butagegroupingsaredifferent.Individualsinthe0.5‐49agegroupwerebornin1960or laterandwouldhavebeenexposed to two lineagesofH1N1: the1977 lineageand the2009pandemic lineage.Individualsinthe60+agegroupwerebornin1953orearlierandarelikelytohavebeenexposedtothreeH1N1lineages:the1918‐1957lineage,the1977lineage,andthe2009pandemiclineage.TheBICcomparisons(seealsoTableS6)showsthat3componentsisthebestfit for60+andthat4componentsisthebestfitfor0.5‐49. However,thedifferencebetweenthe3‐componentfitsand4‐componentfitshere–as inallothergroupingsthathavebeenlookedat– isthatthe4‐componentfitincludesanextrasmallpeakintheseronegativetiterrange,coveringtitersbetween20and30.Theredoesnotappeartobeareasonwhythedifferentlineage‐exposurehistorywouldproducethisextrapeakforseronegativeindividuals.
Appendix for Nhat, Todd, de Bruin et al (2017)
26 / 30
FigureS15:Anadditionalvisualizationofagestructureinthetiterdistribution.Here,thefour‐componentmodelisshownforH1N1 and the three‐componentmodel is shown for H3N2. The individual histogram bins (0.1width on a log2‐scale) aredisplayedasastackedbargraphwithdifferentagegroupscorrespondingtodifferentcolors.NotethatforH1N1,theleft‐handbaratalog‐titerofoneisshownonaseparatescale.Therearenoclearagepatternsinthesetiterdistributionssuggestingthatcertainmixturecomponentscorrespondtocertainagegroups.
Appendix for Nhat, Todd, de Bruin et al (2017)
27 / 30
FigureS16:AsFigure2inthemaintext,butherefourrandomsub‐samplesweredrawn(withreplacement)accordingtotheage‐groupproportionsandagebandsinthegeneralpopulation,asshowninFigureS5. Fourorfivecomponentsisthebestexplanatorymodel,accordingtoBIC. However,thecomponentmeans,variances,andweightsarenotthesameacrosssub‐sampleswhenfivecomponentsareused.
Appendix for Nhat, Todd, de Bruin et al (2017)
28 / 30
TableS7:ChangeinBICscoresasthenumberofnormaldistributionsinthemixtureincreasefromonetosixforH3N2,fortheaggregateddataaswellastheindividualcollectionsites.Thevaluesofthenegativelog‐likelihoodwereweightedtoadjustforageandgenderaccordingtotheVietnamnationalhousingcensusin2009.Thefirstrowshowstheexactvaluesofthenegativelog‐likelihood.BoldnumbersrepresentthemixtureswiththebestBIC.NB:EventhoughtheBICisdesignedtotakesamplesizeintoaccountinselectingthebestfittedmodels,thelargenumberofdatapointsintheaggregatedata(~20,000)stillcausedtheBICtoselectsixcomponentsasthebestmodelforH1N1(Figure2)andfourcomponentsasthebestmodelforH3N2(FigureS8).Thus,BICisimperfectandcannotbeusedasthesolecriterionformodelselection.
ALL SITES HUE KHANH HOA HCMC DAK LAK
( N = 19785) (N = 4519) (N = 5384) (N = 5784) (N = 4134)
SumNegLLH (C =1) 78498.68 17790.08 21515.90 22757.09 16264.15
Δ in
num componen
ts
1 to 2 ‐533.59 ‐110.90 ‐77.42 ‐119.56 ‐153.85
2 to 3 ‐222.45 ‐103.79 ‐39.32 ‐33.62 ‐54.56
3 to 4 ‐29.26 5.47 1.52 ‐0.60 7.74
4 to 5 15.05 15.82 18.17 12.95 ‐2.89
5 to 6 14.88 16.15 5.95 21.13 23.64
Appendix for Nhat, Todd, de Bruin et al (2017)
29 / 30
TableS8:ChangeinBICscoresasthenumberofnormaldistributionsinthemixtureincreasefromonetosixforH3N2,fordifferentagegroupsasrecommendedbyCONCISE.Thevaluesofthenegativelog‐likelihoodwereweightedtoadjustforageandgenderaccordingtotheVietnamnationalhousingcensusin2009.Thefirstrowshowstheexactvaluesofthenegativelog‐likelihood.BoldnumbersrepresentthemixtureswiththebestBIC.
ALL AGES 0‐9 Yrs‐Old 10‐19 Yrs‐Old 20‐44 Yrs‐Old 45+ Yrs‐Old
( N = 19785) ( N = 3031) ( N = 2922) ( N = 8353) ( N = 5479)
SumNegLLH (C=1) 78498.68 11518.25 15750.62 33377.53 16966.21
Δ in
num componen
ts
1 to 2 ‐533.59 ‐26.27 ‐81.21 ‐222.69 ‐145.63
2 to 3 ‐222.45 6.62 ‐22.60 ‐83.06 ‐82.27
3 to 4 ‐29.26 16.77 ‐10.09 8.94 ‐5.55
4 to 5 15.05 ‐6.69 10.78 21.10 22.92
5 to 6 14.88 ‐1.14 23.64 22.79 25.97
Appendix for Nhat, Todd, de Bruin et al (2017)
30 / 30
TableS9:ChangeinBICscoresasthenumberofnormaldistributionsinthemixtureincreasefromonetosixforH1N1,fordifferentagegroupsasrecommendedbyCONCISE.Thevaluesofthenegativelog‐likelihoodwereweightedtoadjustforageandgenderaccordingtotheVietnamnationalhousingcensusin2009.Thefirstrowshowstheexactvaluesofthenegativelog‐likelihood.BoldnumbersrepresentthemixtureswiththebestBIC.
ALL AGES 0‐9 Yrs‐Old 10‐19 Yrs‐Old 20‐44 Yrs‐Old 45+ Yrs‐Old 0‐49 Yrs‐Old 60+ Yrs‐Old
( N = 19335) ( N = 2988 ) ( N = 2876 ) ( N = 8116 ) ( N = 5355 ) ( N = 15168 ) ( N = 2152 )
SumNegLLH (C=1) 71504.04 9951.65 14564.33 30125.89 16100.72 49844.40 5757.17
Δ in
num componen
ts
1 to 2 ‐1670.41 ‐86.24 ‐77.86 ‐345.89 ‐238.01 ‐547.01 ‐101.08
2 to 3 ‐837.75 ‐46.85 ‐58.34 ‐206.63 ‐90.59 ‐276.79 ‐37.61
3 to 4 ‐653.85 ‐24.88 ‐50.71 ‐134.61 ‐58.24 ‐208.20 ‐3.98
4 to 5 ‐57.35 ‐7.33 4.76 ‐7.33 13.75 ‐19.98 18.09
5 to 6 ‐72.45 11.20 ‐0.88 30.17 17.70 ‐8.14 20.63