The Standard Biplot

NOTE TO READERS OF THE The Standard Biplot ELECTRONIC PDF VERSION: This article contains embedded dynamic graphics in Figures 3, 6 MichaelGreenacre and 9, which can be seen by simply clicking on the figure to see it in Department of Economics and Business motion. To enable this feature you Universitat Pompeu Fabra should use the latest version of the Ramon Trias Fargas, 25-27 free Acrobat Reader, version 9, to 08005 Barcelona open the PDF file. SPAIN E-mail: [email protected] Abstract: Biplotsaregraphicaldisplaysofdatamatricesbasedonthedecompositionofamatrixasthe productoftwomatrices.Elementsofthesetwomatricesareusedascoordinatesfortherowsand columnsofthedatamatrix,withaninterpretationofthejointpresentationthatreliesonthepropertiesof thescalarproduct.Becausethedecompositionisnotunique,thereareseveralalternativewaystoscale therowandcolumnpointsofthebiplot,whichcancauseconfusionamongstusers,especiallywhen softwarepackagesarenotunitedintheirapproachtothisissue.Inanattempttounifythescalingofthe biplotweproposeanewscalingofthesolution,calledthestandardbiplot,whichcanbeappliedtoawide varietyofanalysessuchascorrespondenceanalysis,principalcomponentanalysis,logratioanalysisand thegraphicalresultsofadiscriminantanalysis/MANOVA,infacttoanymethodbasedonthesingular valuedecomposition.Thestandardbiplotalsohandlesdatamatriceswithwidelydifferentlevelsof inherentvariance.Twoconceptstakenfromcorrespondenceanalysisareimportanttothisidea:the weightingofrowandcolumnpoints,andthecontributionsmadebythepointstothesolution.Inthe standardbiplotonesetofpoints,usuallytherowsofthedatamatrix,optimallyrepresentthepositionsof thecasesorsampleunits,whichareweightedandusuallystandardizedinsomewayunlessthematrix containsvaluesthatarecomparableintheirrawform.Theothersetofpoints,usuallythecolumns,is representedinaccordancewiththeircontributionstothelowdimensionalsolution.Asforanybiplot,the projectionsoftherowpointsontovectorsdefinedbythecolumnpointsapproximatethecentredand (optionally)standardizeddata.Themethodisillustratedwithseveralexamplestodemonstratehowthe standardbiplotcopesindifferentsituationstogiveajointmapwhichneedsonlyonecommonscaleon theprincipalaxes,thusavoidingtheproblemofenlargingorcontractingthescaleofonesetofpointsto makethebiplotreadable.Theproposalalsosolvestheproblemincorrespondenceanalysisoflow frequencycategoriesthatarelocatedontheperipheryofthemap,givingthefalseimpressionthattheyare important. Keywords: Biplot,contributions,correspondenceanalysis,discriminantanalysis,MANOVA,principal componentanalysis,scaling,singularvaluedecomposition,weighting.

1 1. Introduction

The biplot ,atermintroducedbyGabriel(1971)inthecontextofprincipalcomponentanalysis, isagraphicaldisplayoftherowsandcolumnsofadatamatrixaspointsinalowdimensional

Euclideanspace,usuallyofdimensionalitytwoorthree.Anymatrix T(I×J)canbedecomposed inaninfinitenumberofwaysastheproductoftwomatrices A(I×K)and BT(K×J):

T= AB T. (1)

Wecall Tthe target matrix ,and Aand Bthe left and right matrices ,respectively,ofthe decomposition.

Aconvenientwayofobtaining(1)istousethesingularvaluedecomposition(SVD)of T:

ΓΓΓ T T T ΓΓΓ γ ≥γ ≥ ≥γ T= U V ,where U U=V V =I , =diag( 1 2 ··· K>0). (2)

Then Aand Bcanbedefinedas:

α α A= U ΓΓΓ B= V ΓΓΓ1 . (3) foranyvalueof α.TheadvantageofformingthedecompositionusingtheSVDisthewell knownEckartYoungtheoremwhichstatesthattheleastsquaresmatrixapproximationofthe targetmatrix Tcanbeobtainedforanygivenrank K* oftheapproximatingmatrix,usingthe first(i.e.,largest) K* singularvaluesinthediagonalof ΓΓΓ,andthecorrespondingfirst K* leftand rightsingularvectorsin Uand V (EckartandYoung1936).

Thusatargetmatrix Tcanbeapproximatedbyits‘closest’(leastsquares)approximationof rank2,forexample:

2 γ 0  [][]u u 1 v v T (4) 1 2  γ  1 2  0 2  andthen Aand Bcouldbedefined,forexamplewith α=1in(3),as

γ 0  A= []u u 1 B= [v v ] (5) 1 2  γ  1 2  0 2 

Therowsof Aandof Bprovidetwodimensionalcoordinatesforpointsrepresentingtherows andcolumnsofthetargetmatrixinaplanardisplay,sothatthescalarproductbetweenthe ith

rowpointandthe jthcolumnpointapproximatesthe(i,j)thelement sij ofthetargetmatrix.

Sincescalarproductsbetweentwovectors aand bareequaltothelengthoftheperpendicular projectionof aonto bmultipliedbythelengthof b(orviceversa),itfollowsthatthedirection vectorsrepresentingthecolumns(forexample)canbecalibratedsothattheapproximatedvalues ofthetargetmatrixcanbereadoffbyprojectingtherowpointsperpendicularlyontoeach columnvector.Inthiscasewerefertothecalibratedcolumnvectorsas biplot axes.Formore detailsaboutbiplots,seeGabrielandOdoroff(1990),Greenacre(1993)andGowerandHand

(1996).

Inspiteofthesimplicityofthebiplot,itsuseoftencausesconfusionbecauseofthemanyways inwhich Aand BcanbeformedfromtheSVD.Whilethescalarproductruleandthe calibrationholdsforallbiplots,theuseofdifferentvaluesof αproducesbiplotswithdifferent withinsetinterpretationsofthecloudsofrowandcolumnpoints.Theusualchoices α=1and

α=0givebiplotsinwhichdistancesareapproximatedbetweenrowsandbetweencolumns respectively.Oftenthetwosetsofplottedpointsareonsuchdifferentscalesthattheyare plottedontwodifferentscales(see,forexample,the biplot functionintheRpackage).To alleviatethisthechoice α=½issometimesrecommendedsothatrowandcolumnpointshave

3 similardispersionsinthedisplay,butthischoiceoptimizesneitherthedistancesbetweenrows northedistancesbetweencolumns.Thislastchoice,calledthe symmetric biplot byGreenacre

(2007)hastheleastbenefitfortheuser,sincedatamatricesareinvariablyinterpreted asymmetrically,withatleastoneofthesetsofdistancesbeingimportanttovisualize(see

Greenacre(2007:Epilogue)forfurtherdiscussionaboutthisaspect).Soweareinterestedin improvingthebiplotdisplayforthechoices α=1or α=0,whereeithertheroworcolumn distancesareoptimallydisplayed.Ourobjectiveistofindabiplotwhichhasthefollowing characteristics:

(a)therowandcolumnconfigurationsshouldbeabletobeplottedtothesamescale,

independentlyofthelevelofvarianceinthedatamatrix;

(b)supposingthatrowdistancesarevisualized,thenthelengthsofthevectorstothe

columnpointsshouldhavesomeadvantageforinterpretation;

(c)thebiplotshouldbeapplicableacrossdifferenttypesofmultivariatemethodsthat

visualizedatatablesusingtheSVD,forexamplecorrespondenceanalysis,principal

componentanalysis,logratioanalysisanddiscriminantanalysis.

Theabovethreepropertiesaresatisfiedbytheproposed standard biplot ,whichwefirstdefinein thecontextofcorrespondenceanalysis.

2. The standard correspondence analysis biplot

Inthestandardbiplottheweightingofthepointsandthecontributionsmadebyeachpointtothe solutionwillplaycrucialroles.Sincetheseconceptsareinherentincorrespondenceanalysis

(CA),wechoosetouseCAtoillustratethegeneralidea.Givenacontingencytable N,and associatedcorrespondencetable P=(1/ n) Nofrelativefrequencies,where nisthegrandtotalof

4 N,let rand cbetherowandcolumnmarginaltotalsof P(calledrowandcolumn masses inCA terminology).Let Drand Dcbethediagonalmatricesofthesemasses.CAhasastargetmatrix thestandardizedresidualsoftherelativefrequencies:

= − 2/1 − T − 2/1 T Dr (P rc )Dc (6) whichcanbewrittenequivalentlyas:

= 2/1 −1 − T − 2/1 T Dr (Dr P 1c )Dc (7)

Itisthislatterform(7)thatisusefulhere,becauseitshowsthatCAistheanalysisoftherow

−1 profiles(rowsoffrequenciesrelativetotheirtotals,inthematrix Dr P ),centredwithrespectto theaverageprofile cT,withrowsweightedbythemassesin randthedistancesbetweenrow

−1 profilesdefinedbychisquaredistancesusingthemetric Dc .Afterdecomposing Tbythe

SVD,asin(2): T= U ΓΓΓV T,where UTU= VTV= I,therowsandcolumnsareoftenplotted

= − 2/1 accordingtothesocalled principal coordinates oftherows: F Dr UΓ andthe standard

= − /1 2 coordinates ofthecolumns: Y Dc V .Therationaleforthischoiceofcoordinatesisas follows:therowpointsdefinedby Fvisualizeapproximatechisquaredistancesbetweenrow profilesthatareweightedbytherowmasses;thecolumnpointsdefinedby Yaretheprojections oftheunitprofiles,orvertexpointsofthesimplexspaceoftheprofiles;androwpointsare weightedaveragesofthecolumnvertexpoints,usingtheelementsoftherowprofilesas

= −1 weights: F Dr PY .Thejointdisplayof Fand Yisabiplot,whichfromtheirdefinitionsand from(7)isseentobe:

T = − 2/1 T − 2/1 = −1 − T −1 FY Dr UΓV Dc (Dr P 1c )Dc (8)

5 Hencescalarproductsbetweenrowandcolumnpointsinalowdimensionaldisplay(usingthe firstcolumnsof Fand Y)areapproximatingthecentredprofilesrelativetotheiraveragevalues:

(pij /ri – cj)/ cj.Thisjointplotisalsocalledthe(rowprincipal)asymmetricmap–seeGreenacre

(1993,2007).

Thefactthattherowprofilesandtheextremeunitprofiles(vertices)lieinthesame multidimensionalspaceisanattractivepropertyoftheasymmetricmap,butinpracticeitis problematicwhenthevarianceinthedata,called inertia inCA,islow.Forexample,the

‘author’dataset,availableinthe ca packageofNenadićandGreenacre(2007),consistsofthe countsofthe26lettersofthealphabetin12samplesoftextfrombooksbyfamousEnglish authors(thedataarereproducedinGreenacreandLewi(2009)).Theprofilesofthebooks acrossthelettersareverysimilar,asexpected,andtheinertiaisconsequentlyverylow.The asymmetricmapofthistable,reproducedfromGreenacre(2007)asFigure1,showsthe12 booksclusteredsotightlyatthecentreofthemapthattheycannotbelabelled.Thisisclearly notagoodbiplotandthescaleofoneofthesetsofpointsneedstobechangedtomakethejoint plotlegible.Thedirectionsdefinedbythecolumnpointsarethecorrectones,buttheyaretoo long.Ratherthansimplyreducetheiroverallscale,theproposedstandardbiplotreducestheir lengthbymultiplyingeachcolumnpoint(representingaletterintheexample)bythesquareroot

½ ofthemass, cj ,associatedwiththecolumnpoint.Thusalllettersarepulledin,butlow frequencyletterssuchas zand qarepulledinalotmorethanhighfrequencyletterssuchas a

2/1 and e.Thisisapremultiplicationof Yby Dc andconverts(8)intothefollowingbiplot:

2/1 T = −1 − T − 2/1 F(Dc Y) (Dr P 1c )Dc (9)

Theelementsbeingapproximated(showninthetargetmatrixontherightof(9))arenowthe

½ standardizeddifferences(pij /ri – cj)/ cj betweentheprofileanditsaverage,standardizedby

6 2/1 dividingbythesquarerootoftheaverage,asinthechisquaremetric.Therightmatrix Dc Y of thebiplotin(9),representingthecolumns,istheoriginalmatrix Vofrightsingularvectors.The factthatstandardizedvaluesarebeingbiplottedhassuggestedtheterm‘standardbiplot’.The standardCAbiplotforthesamedataisgiveninFigure2,andclearlytheproblemofthe differenceinscaleshasbeenclearedup.But,inaddition,thelengthsofeachvectorjoiningthe letterstotheoriginarerelatedtothecontributionsthatthelettersmaketotheprincipalaxes.

Sinceitistheselettersthataredeterminingthesolution,itiscorrectthattheyhavemore prominentpositionsinthejointdisplay.

Toshowtheconnectionwiththecontributions,thepartsofinertiaduetoeachcolumnalongthe

2 principalaxesare,inscalarformforcolumn jalongaxis k: cj yjk ,relativetothesumofthese

T T quantities,whichisequalto1fromthedefinitionof Y: Y DcY= V V= I.Thusby‘shrinking’

½ each yjk by cj ,thesquareofthenewcoordinateisexactlythecontributionofthe jthpointtothe inertiaalongthe kthaxis.

Thusthestandardbiplotofthe‘author’datainFigure2showsthatthelettersd, h, wand chave highcontributionstothefirstaxis,and yisespeciallydominantonthesecond–thecloserletters aretothecentrethelessimportanttheyaretotheinterpretation.Table1liststhecontributions foreachlettertothetwoaxes.Thus, d,forexample,hascontributionsof0.1704toaxis1and

0.0593toaxis2,andthesquarerootsofthesevaluesare0.1704 ½=0.413and0.0593 ½=0.244, respectively.Thecoordinatesof dinFigure2are(againfromTable1,thesquarerootsofthe massesmultipliedbythestandardCAcoordinates)0.046 ½×(–1.9256)=–0.413and

0.046 ½×1.1354=0.244–thisisanempiricalcheckthatthesquaresofthecoordinatesarethe contributions.Notice,however,thatthesquaredlengthsofthevectors,equaltothesumoftheir

7 squaredlengthsalongtheaxes,arenotequaltotheircontributionstothetwodimensionalmap, sincetheinertiasalongtheaxesarenotthesame.

Figure3isadynamicgraphicwhichshowsthetransitionfromFigure1toFigure2,asthebiplot vectorsforthelettersaregraduallyshrunkproportionaltothesquarerootsoftheirrespective masses.Thesizeofthetriangularsymbolforeachletterisproportionaltothemassoftheletter, soitisthepointswithsmallertrianglesthatarepulledinthemostbythestandardbiplotscaling.

Duringthetransitiontherow(book)pointsareinfixedpositionsandthescaleofthemapis adjustedastheletterpointsshrinkdifferentiallytowardsthecentre–thedashedboxshown throughouttheanimationactuallydelimitstheareaofthefinalframeoftheanimationinwhich thestandardCAbiplotresides.

ThestandardCAbiplotfunctionsequallywellforatablewithverylowinertia,suchasthedata set‘author’,asforatablewithhighinertia.Thedataset‘benthos’(reportedinGreenacreand

Lewi(2009)andalsoanalyzedinGreenacre(2007:chapter10))consistsofcountsof92marine speciesat13differentlocationsontheNorthSeabed,11ofwhichlieclosetoanoildrilling platformthatcausesseabedpollution,whiletheothertwo(R40andR42)arereferencestations farawayfromthepollutionsource.Thesedatahavehighinertia,whichischaracteristicof ecologicalabundancedataofthistype.Thus,thestationpointsaremuchmorespreadoutinthe profilespacewithrespecttotheextremevertexpoints,asshownintheasymmetricCAmapof

Figure4.Figure5showsthestandardCAbiplotofthesamedataandFigure6showsthe dynamictransitionfromtheasymmetricmap(Figure4)tothestandardCAbiplot(Figure5).

Againeachspeciespointisshownwithasymbolproportionaltoitsoverallabundance.Onlythe top10contributingspecieshavebeenlabelled–theremaining82unlabelledspeciesareall closertothecentreofthemapintheCAbiplotandcontributeonly15%tothesolution.

Showingthespecies’vectorsintermsoftheircontributionsassistsandclarifiesthe

8 interpretationbyconcentratingattentiononthemostinfluentialspecies.InFigure5itcanbe clearlyseenthatthereisagroupofspecieschieflyresponsiblefortheseparationtotherightof theunpollutedreferencestationsR40andR42(thesewouldbethespeciesmostaffectedand reducedbythepollution).Thereismainlyonespecies( Chaetezona setosa )thatdominatesthe pollutedstations,especiallystationS15whichisoneoftheclosesttotheoildrillingplatform andmostcontaminated.StationS24isaspecialcasebecausethereisanunusuallyhigh occurrenceof Myriochele oculata ,whichisunrelatedtothepollutiongradient.

NoticethatinFigure2thebookpointsshowlessdispersionthanthestationpointsinFigure5, relativetothespreadofthe‘variables’ineachbiplot(lettersinFigure2,speciesinFigure5, respectively).Thispartlytestifiestothelowerinertiainthe‘author’dataset,comparedtothe

‘benthos’dataset,althoughtheimpressionofthespreadoftherowsrelativetothecolumnsis alsogovernedbyhowmanyvariablesthereare.Themorevariablesthereare,themorethe numericalvaluesofthecontributionsarereducedasagroup(becausethecontributionssumto1 foreachaxis).

3. The standard principal component biplot

Inthecaseofprincipalcomponentanalysis(PCA),wheredataareassumedonintervalscales, therearetwodistinctsituations:variablesthatneedtobestandardized,andvariablesthatareall onthesamescaleandthatneednostandardization.Thefirstsituationisthemostcommon, whenvariablesareondifferentmeasurementscales.Table2showsthe‘environ’dataset consistingofJ=10variablesmeasuredatthesameI =13locationsastheprevious‘benthos’ dataset.TheusualPCAapproachistostandardizethevariables,inwhichcasethetotal varianceisequalto10,thenumberofvariables.Incalculatingthevariancesofeachvariable withaviewtostandardization,weuse I, not I–1,toaveragethesquareddeviationsfromthe mean.Inotherwords,weweighteachsamplebyaconstant1/ I,whichmimicstherow

9 weightingideainCA,exceptintheCAcasetherearedifferentweightsforeachrow.Wecould, however,weightthesamplesusingtheCAmasses,whichiswhatisdoneifweusedthe environmentalvariablesasconstraintsinacanonicalcorrespondenceanalysis(CCA).

DistancesbetweensamplesareEuclideanonthestandardizedscalesandthesedistanceswill increaseasthenumberofvariablesintheanalysisincreases.Sincewewishtolimitthe variationinthesamplessothatwecaneventuallyplotsampledistancesandvariable contributionsonthesamebiplot,aconvenientrescalingistocalculateEuclideandistancesusing averagesquareddifferencesbetween(standardized)variables,notsumofsquareddifferences.

Foratable Xofstandardizeddata,thedistancebetweensamples iand i' is:

J 2 /1(J) (x − x ′ ) (10) ∑ j=1 ij jl

ThisagainmimicstheideainCAofweightingthecolumnsofthematrix–inPCAallthe columnsnowreceiveanequalweightof1/ JcomparedtothedifferentialweightinginCA.The totalvariancewillthenbeaveragedaswell,equalto1ratherthan10inthisexample.TheSVD forPCAthenmultipliesthedatabythesquarerootsoftherowandcolumnmasses(assuming X isalreadycolumncentred,sothatallvariableshavemean0andvariance1):

T /1( I) /1( J)X = /1( IJ)X = UΓV (11)

Hencethesumoftheeigenvalues(thatis,thesumofsquaredsingularvalues)isequalto1,and not J(thenumberofvariables)asintheusualdefinitionofPCA.Thebiplotcoordinatesare

F= I UΓ (fortherows,where I istheinverseoftherowmasses,asinCA)and Y= V(for thecolumns,wherethereisnorescalingbythemasses,asinthestandardCAbiplot).Againthe

10 squaredlengthsofthecolumncoordinatesareequaltothecontributionsoftherespective variablestotherespectiveprincipalaxes.

Figure7showsthestandardPCAbiplotofthe‘environ’datasetofTable2.Allthevariables measuringpollution(THC=TotalHydrocarbonContent,andthemetals)pointtotheleftand contributethemosttothehorizontalaxis,whichseparatesthemostpollutedstationS15onthe left(oneoftheclosesttotheoilrig)fromtheunpollutedreferencestationsR40andR42onthe right.Distanceversusdepthcontributestronglytothesecondaxis,withtheTOM(=Total

OrganicMaterial)biplotaxispositive–boththehighlypollutedstationS15andthetwo referencestationsarehighonTOM.

Thereferencestationsareatshallowerdepthsthantheothers,butonlybyafewmetres,andthis differenceisaccentuatedbythestandardization.Alsobecausethevariablesarehighlyskew,a logarithmictransformationmightbeconsideredappropriate–thelogtransformeddataarethen notstandardized,sothatthemultiplicativedifferencesarevisualized.Figure8showsthe standardPCAbiplotofthelogtransformeddata,andtheroleofthevariabledepthis considerablyreduced,withmostofthevariancenowconcentratedonthefirstaxiswhichisthe pollutiongradient.ThisalsoshowsthatthestandardPCAbiplotfunctionswellfor unstandardizedlogtransformeddataaswell.Theonlyslightlyproblematiccasewouldbe whenalltheoriginalvariablesareonthesamescale,andstandardizationisnotrequired,butthe commonscalehasaverysmallorverylargerange(forexample,alldataareratingsfrom0to

100).Thenthescaleoftherowpointswillbequitedifferenttothatofthecolumnpoints,which canberectifiedbysomeappropriateoverallrescalingoftheoriginaldata,forexampletorescale theratingsfrom0to10.

11 ItisimportanttorealizethatthestandardPCAbiplotdescribedaboveisnotthesameasthe biplotinwhichthevariablesaredepictedbytheircorrelationswiththeprincipalaxes,andthus liewithinaunitcircleinthetwodimensionalsolution,forexample.Thislatterbiplotassociates thesingularvalueswiththecolumncoordinates,depictingrows(cases)withcoordinatesin

I U (seetheSVDin(11))andthecolumns(variables)withcoordinatesin VΓΓΓ.Thesquaresof thecorrelationsarethepartsofthe(unit)varianceofthatvariableexplainedbytheaxes,andcan besummedforthetwoaxes,hencethedrawingofaunitcirclewhichrepresentsaperfect explanationofthevariable.Theaveragesumofsquaredcolumncoordinatesonanaxisisthen thepartofthetotalvarianceonthataxis,whilethecasesarerepresentedbycoordinatesthat haveaveragesumofsquaresequalto1oneveryprincipalaxis.

4. Log-ratio analysis

Logratioanalysisisthevisualizationofatableofstrictlypositivedata,logtransformedand doublecentred,againusingtheSVD.Themethodispopularintheanalysisofcompositional data(Aitchison1986).Therearetwoforms,anunweightedform(seeAitchison1983,Aitchison andGreenacre2002)andaweightedform,thelatteralsoknownas‘spectralmapping’(Lewi

1976,GreenacreandLewi2009).Therowandcolumnweightscommonlyusedinweighted logratioanalysisarethesameasthemassesinCA,thatistherelativevaluesofthecolumn marginsoftheoriginaltable.UsingthesamenotationasinCAfortherowandcolumnmasses rand c,andlog( N)todenotethematrixoflogtransformeddata,thetargetmatrixanditsSVD are:

= 2/1 − T − T T 2/1 = T T Dr (I 1r )log(N)(I 1c ) Dc UV (12)

(justsubstitute r=(1/ I)1and c=(1/ J)1fortheunweightedform).Assumingrowswere cases/samplesandthecolumnsvariables/components,thestandardlogratiobiplotwouldthen

12 − 2/1 usetheprincipalcoordinates Dr U fortherowsand Vforthecolumns.Somecareisneeded here,sincetheanalysisisoftenperformedtodetectequilibriuminasubsetofcomponents, whichcanbediagnosedbysomecomponentsliningupinstraightlines(see,forexample,

AitchisonandGreenacre2002).Butthestandardbiplotnormalization,whilehavingthe advantageofshowingwhichcomponentsareimportantfortheinterpretationofthebiplot,has thedisadvantageofdestroyingthisproperty.Adynamicgraphic,whichalternatesbetweenthe usualmapwherethecolumnsareshowninprincipalorstandardcoordinatesandthestandard biplot,canbeusefultodiagnosemodelsinsubcompositionsaswellasseewhichcomponents aredeterminantinthesolution–Figure9showsanexampleofthisforthe‘author’data,starting withaLRAsolutionthatallowsmodeldiagnosisandendingwiththestandardbiplotthat representscomponentcontributions.

5. The standard discriminant analysis / MANOVA biplot

Lineardiscriminantanalysis(LDA)andmultivariateanalysisofvariance(MANOVA)relyona commontheory,whichreducestoaSVDofthematrixofgroupcentroids,weightedbytheir respectivegroupsizes,inaspacestructuredbytheMahalanobismetric–see,forexample,

Mardia,KentandBibby(1979).Theresultantprincipalaxesareoftencalledcanonicalaxesin thiscontext,andtheterminologycanonicalvariateanalysis(CVA)isacommonsynonymas well.Thisisanothercasewherethestandardbiplotcanbeusedtorepresentthecentroidsby theirprojectionsontotheoptimalplanealongwiththepointsforthevariablessuchthattheir squaredlengthsalongacanonicalaxisarethecontributionstothegroupdiscriminationalongthe axis.TheideaofaMANOVAbiplotoriginatesintheworkofGabriel(1972).

Supposethatthereare Ggroupsofcases(rows)inacasesbyvariablesdatamatrix,andthatthe

GvectorsofvariableaveragesarecontainedintheG×Jmatrix G.Theaveragesarecentred withrespecttotheglobalaveragesofthevariablesacrossallthecases–theseglobalaverages

13 areidenticaltothecentroidofthegroupaverages,whereeachgroupisweightedproportionally tothenumberofcasesinit.A J×Jcovariancematrix Sgiscomputedwithineachgroup–asin

PCAtheaveragingofthesquaresandcrossproductsinthecomputationisperformedby dividingbythenumberofcases ngineachgroup,not ng–1.Theestimatedcommoncovariance matrix Sisthencomputedas Σg(ng/n)Sg .Weightsof ng/n areassignedtoeachgroupaverage, andtheseweightsformthediagonalofthediagonalmatrix Dw.

TheweightedPCAofthegroupaverages,intheMahalanobismetric S–1,equivalentto

LDA/MANOVA/CVAisachievedbythefollowingSVD,imitatingtheprocedurefollowedfor

CAandPCA:

= 2/1 − 2/1 = T T D w GS /1 J UΓV (13)

Thusthestandardbiplotwouldvisualizetherowsbytheprincipalcoordinatesofthegroup

− 2/1 centroidsin D w UΓ andthecoordinates Vofthecolumns(variables)thatdepictthe contributionstotheprincipalaxes.Figure10showsthestandardbiplotinthetwodimensional planeofthethreecentroidsofthefamousFisher‘iris’data.

6. Discussion and conclusions

Thesubjectofthescalinginbiplots,andinparticularinCA,hasbeenthesubjectofmuch discussionandcontroversy.ThemostpopularwayofreportingtheresultsofCAhasbeenthe socalled symmetric map ,sometimescalledthe‘Frenchplot’or‘Benzécriplot’.Thissolution, whichallocatesthesingularvaluestoboththeleftandrightmatricesofthesingularvalue decomposition(thatis,bothrowsandcolumnsinprincipalcoordinates)isnotabiplot,but displaysinterrowandintercolumndistancesoptimally,withtheaestheticadvantageof spreadingouttherowandcolumnpointssimilarlyoverthemaparea.Theasymmetricmap,on

14 theotherhand,onlyworkswellinhighinertiacases,withthescalingproblemdemonstratedin

Figure1whentheinherentinertiaislow.

Amongstallbiplotoptionsforatypicalcases ×variablesdataset,whatwearereallyinterested inisthedirectionmadebyeach‘variable’ofthedataset,sotheproposalofthispaperisto maintainthesedirectionswhilemakingthelengthsofthebiplotvectorsbothmeaningfuland alsoonascalewhichallowsjointrepresentationofthe‘case’pointswithouttheneedfortwo setsofscalesfortherowandcolumnpoints.Boththeseobjectivesareattainedbythestandard biplot.Thecasepointsarealwaysintheirprincipalcoordinatepositions,sothatdistances betweencasepointsareinterpretable.Thevariablesareindicatedbylinescoincidingwiththe biplotaxes,whichcouldbecalibratedsothattheprojectionsofthecasepointsonabiplotaxis approximatethevaluesforthatvariable,butthelengthsofthelinesalongeachprincipalaxisof thebiplotarerelatedtothecontributionmadebythatvariabletothesolution.Itiscalledthe standardbiplotbecauseitgenerallyreconstructsdataonastandardizedscaleandalsobecause thesumofsquaredlengthsoftheselinesonanyprincipalaxisisequalto1.

Althoughmotivatedprincipallyincorrespondenceanalysisbecauseoftheproblemof differentialmassesoftherowandcolumnpoints,ithasalsobeenshownthatthestandardbiplot functionswellinstandardizedandunstandardizedprincipalcomponentanalysis,logratio analysisanddiscriminantanalysis.Webelievethatthestandardbiplotcanbecome,asitsname doublyimplies,astandardwayofshowingtheresultsofmethodsbasedonthesingularvaue decomposition,andmayremovemostofthecontroversyanddoubtsurroundingthescalingof thesolutionsofthesemethods.

15 Computing note

Theca package(NenadićandGreenacre2007)in R(Rdevelopmentcoreteam,2008)wasused toperformtheCAcalculationsinthisarticle,whileallotherswerecomputedusingmatrix functionsin Rfunction,notablysvd .ThestandardbiplotoptionforCAalreadyexistsinthe ca package,forexamplemap=”rowgreen” forthestandard(rowprincipal)biplot.Theoption map=”rowgab”istheoneproposedbyGabrielandOdoroff(1990)wherethestandard coordinatesof‘variable’pointsaremultipliedbytheirmasses,notthesquarerootsofthe massesasinthestandardbiplot.

Acknowledgments

ThisresearchhasbeensupportedbytheFundaciónBBVA,Madrid,Spain,andtheauthor expresseshisgratitudetotheFoundation’sdirector,Prof.RafaelPardo.Partialsupportof

SpanishMinistryofEducationandSciencegrantMECSEJ200614098isalsoacknowledged.

ThispaperwasmostlywrittenwhiletheauthorwasonleaveatStanfordUniversity:thanksto

TrevorHastieandcolleaguesoftheDepartmentofStatistics.

16 References

Aitchison,J.(1983).Principalcomponentanalysisofcompositionaldata. Biometrika , 70 ,57–65.

Aitchison,J.(1986), The Statistical Analysis of Compositional Data .London:Chapman&Hall, reprintedin2003byBlackburnPress.

Aitchison,J.andGreenacre,M.J.(2002).Biplotsofcompositionaldata. Applied Statistics , 51 , 375–392.

Eckart,C.andYoung,G.(1936).Theapproximationofonematrixbyanotheroflowerrank. Psychometrika , 1,211–218.

Gabriel,K.R.(1972).Analysisofmeteorologicaldatabymeansofcanonicaldecompositionand biplots. Journal of Applied Meteorology , 11 ,1071–1077.

Gabriel,K.R.andOdoroff,C.L.(1990).Biplotsinbiomedicalresearch. Statistics in Medicine , 9, 469–485.

Gower,J.C.andHand,D.J.(1996). Biplots .London:ChapmanandHall.

Greenacre,M.J.(1993).Biplotsincorrespondenceanalysis. Journal of Applied Statistics , 20 , 251–269.

Greenacre,M.J.(2007). Correspondence Analysis in Practice. Second Edition. London: Chapman&Hall/CRCPress.PublishedinSpanishtranslationbytheFundaciónBBVA, Madrid,2008,freelydownloadablefrom http://www.fbbva.es/TLFU/tlfu/ing/publicaciones/fichalibro/index.jsp?codigo=300

Greenacre, M. J. and Lewi, P. (2009). Distributional equivalence and subcompositional coherence in the analysis of compositional data, contingency tables and ratioscale measurements. Journal of Classification , 26 ,29–54.

Lewi,P.J.(1976).Spectralmapping,atechniqueforclassifyingbiologicalactivityprofilesof chemicalcompounds.Arzneimittel Forschung , 26 ,1295–1300.

17 Mardia,K.V.,Kent,J.T.andBibby,J.M.(1979. Multivariate Analysis .London:Academic Press.

Nenadić,O.andGreenacre,M.J.(2007).Correspondenceanalysisin R,withtwoandthree dimensionalgraphics:The ca package. Journal of Statistical Software , 20 (1). URL http://www.jstatsoft.org/v20/i03/

OksanenJ.,KindtR.,LegendreP.andO'HaraR.B.(2006). vegan :CommunityEcologyPackage version1.83.URL http://cran.r-project.org

RDevelopmentCoreTeam(2008).R:ALanguageandEnvironmentforStatisticalComputing. RFoundationforStatisticalComputing,Vienna,Austria. URL http://www.R-project.org

18 Table 1 :Masses(relativefrequencies),contributions(relativetototal),andstandard coordinates of the 26 letters, for the two principal axes of Figures 1 and 2. The coordinatesofthelettersinthestandardbiplotofFigure2arethesquarerootsofthe massesmultipliedbythestandardcoordinates.Thesquaresofthesecoordinatesare equaltothecontributions.

Standard Masses Contributions coordinates Axis 1 Axis 2 Axis 1 Axis 2 a 0.0798 0.0000 0.0082 0.0176 0.3203 b 0.0157 0.0152 0.0025 0.9845 0.3980 c 0.0228 0.1020 0.0430 2.1150 1.3734 d 0.0460 0.1704 0.0593 -1.9256 1.1354 e 0.1271 0.0010 0.0596 0.0867 0.6848 f 0.0194 0.0317 0.0104 1.2765 0.7330 g 0.0200 0.0209 0.0025 -1.0207 -0.3530 h 0.0649 0.1463 0.0059 -1.5013 0.3024 i 0.0701 0.0050 0.0555 0.2675 -0.8895 j 0.0008 0.0002 0.0007 0.4533 -0.9160 k 0.0092 0.0697 0.0139 -2.7552 -1.2316 l 0.0427 0.0442 0.0012 1.0183 0.1650 m 0.0255 0.0130 0.0500 0.7127 -1.4010 n 0.0690 0.0028 0.0120 -0.2004 0.4173 o 0.0766 0.0009 0.0312 -0.1085 -0.6380 p 0.0152 0.0393 0.0512 1.6108 1.8379 q 0.0007 0.0111 0.0025 4.0798 1.9148 r 0.0519 0.0181 0.0280 0.5914 0.7342 s 0.0607 0.0449 0.0100 0.8602 -0.4056 t 0.0930 0.0009 0.0038 -0.1005 -0.2031 u 0.0298 0.0008 0.0308 0.1633 -1.0171 v 0.0096 0.0500 0.0003 2.2813 -0.1770 w 0.0258 0.1614 0.0021 -2.4992 0.2847 x 0.0012 0.0129 0.0206 3.3405 -4.2154 y 0.0219 0.0000 0.4851 0.0015 -4.7061 z 0.0008 0.0371 0.0099 6.8081 3.5092

19 Table 2 :Dataset‘environ’of10variablesmeasuredat13locationsintheNorthSea. Distance is from the oil drilling platform (the source of the pollution). The two reference stations R40 and R42 are 10 kms away and reflect an unpolluted environment.

DEPTH Ba Cd Cu Fe Pb Zn THC TOM DISTANCE S4 76 1656 0.02 1.3 2022 11.7 16.1 8 0.6 750 S8 74 1373 0.02 1.1 2398 8.9 13.6 9 0.7 1800 S9 76 3680 0.07 3.9 2985 34.4 45.7 103 0.8 800 S12 72 2094 0.04 1.2 2535 21.3 15.1 6 0.7 2500 S13 72 2813 0.04 1.6 2612 17.3 18 27 1 1300 S14 76 4493 0.06 3.3 2515 28.8 26.7 546 1 850 S15 76 6466 0.14 6.2 3421 61.4 72.5 952 1.1 500 S18 72 1661 0.02 1.3 2381 19.8 13.8 10 0.9 2500 S19 72 3580 0.05 2.4 3452 33.7 28.9 32 1 1200 S23 76 2247 0.02 1.5 3457 21.4 14.9 16 0.7 1000 S24 76 2034 0.05 2.7 2311 15.5 16.8 11 0.7 450 R40 67 40 0.01 0 1804 5.9 5.9 3 0.8 10000 R42 67 85 0.01 0.17 1815 6.7 5.9 3 1 10000

20 Figure 1 :Correspondenceanalysisofdataset‘author’,withrowsinprincipal coordinatesandcolumnsinstandardcoordinates(rowprincipalasymmetricmap), showingrowprofilesbunchedupneartheoriginofthebiplot.

4 .00369(19.7%) z

q 2 p c d f e r b w h n a l 0 t v g s .00766(40.9%) o j k ui m -2

-4 x y

-6 -4 -2 0 2 4 6 8 10

21 Figure 2 : Standard correspondence analysis biplot of ‘author’ data, showing the lettersasthesamedirectionsasinFigure1,butmultipliedbythesquarerootsoftheir masses:thesquaredcoordinatesofeachletteronanaxisisequaltoitscontributionto thataxis.

.00369 (19.69 %) c d e p 0.2 r

f n islands asia z 0.1 a h (hemingway) three daughters (michener)q b (buck) w drifters pendorric 2 (holt) (michener) l pendorric 3 (holt)

0 farewell to arms profiles of future (clark) (hemingway) lost world .00766 (40.91 %) east wind (buck) (clark) v j g t sound and fury 6

-0.1 (faulkner) s sound and fury 7 k (faulkner) x o

-0.2 u

m i

-0.3

-0.4

-0.5

-0.6

y -0.7 -0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4

22 Figure 3:DynamictransitionbetweentheasymmetricCAmap(Figure1)andthe standardCAbiplot(Figure2)ofthedataset‘author’.Abbreviationsforthebooks areused–seeFigure2forfulltitles.Click on the figure to see the animation.

23 Figure 4: Asymmetric CA map of ‘benthos’ data set, with stations in principal coordinates, and species in standard coordinates (this is the first frame of the animationinFigure6).Thetop10contributingspeciesarelabelled,accountingfor 85%oftheinertiaofthetwodimensionalsolution.Theseareallhighlyabundant specieswhiletheremaining82lowerabundancespeciescollectivelycontributeonly 15%tothesolution.Noticethatmanyoftheselowcontributingspeciesareoutlying inthismap.Thetriangularsymbolsforthespeciesareproportionaltotheoverall speciesabundance.Inertiasonaxes1and2are0.2457(31.4%)and0.2043(26.1%) respectively.

24 Figure 5:StandardCAbiplotof‘benthos’dataset.Thetop10contributingspecies are now the most outlying while the remaining 82 lower abundance species that collectivelycontributeverylittlealllietowardsthecentreofthebiplot.Thestation pointsareinthesamepositionsasinFigure4andhaveinertias0.2457(31.4%)and 0.2043(26.1%)onaxes1and2respectively.Thisisthelastframeoftheanimation ofFigure6,anditsboundariesaregivenbythedashedboxofFigure4,showinghow muchwehave“zoomedin”ontheconfigurationofstationpoints.

25 Figure 6:DynamictransitionbetweentheasymmetricCAmap (Figure 4) and the standardCAbiplot(Figure5)ofthedataset‘benthos’. Click on the figure to see the animation.

26 Figure 7:StandardPCAbiplotof‘environ’datasetofTable 2. Average sumof squaresontheaxesare0.7008and0.1791respectively(thesearealsoproportionsof varianceaccountedfor,sincetotalvarianceisscaledtobe1.

27 Figure 8:ThestandardPCAbiplotofthelogtransformed(andunstandardized)data ofTable2.Totalvariance(whichisagaintheaverageoftheindividualvariances)is 0.8692,andthepercentagesofvarianceexplainedon axes 1 and 2 are 84.5% and 11.0%respectively.

28 Figure 9:DynamictransitionbetweentheasymmetricLRAmap(weighted)andits standardbiplotversion,forthedataset‘author’.Inthefirstframeonecanseethe liningupofthreeletters, k, yand x,whichindicatesanequilibriummodelforthese letters(giveninGreenacreandLewi,2009),butwhichcannotbediagnosedinthe standardbiplot. Click on the figure to see the animation.

29 Figure 10 :ThestandardMANOVAbiplotofthethreegroupsintheclassic‘iris’data set.TotalvarianceofthethreegroupcentroidsintheMahalanobisspaceis8.119,of which99.1%isaccountedforbythefirstcanonicalaxis.ThevariableSepalLength hasthelowestcontributiontothediscriminantspace.