Introduction | Pivotal Greenplum Database Docs


Table of Contents

Introduction
Key Points for Review
Characteristics of a Supported Pivotal Hardware Platform
Pivotal Approved Recommended Architecture
Pivotal Cluster Examples
Example Rack Layout
Using gpcheckperf to Validate Disk and Network Performance
Pivotal Greenplum Segment Instances per Server
Pivotal Greenplum on Virtualized Systems
Additional Helpful Tools


Introduction

The EMC Data Computing Appliance provides a ready-made platform that strives to accommodate the majority of customer workloads. One of Pivotal Greenplum's strongest value propositions is its ability to run on practically any modern-day hardware platform. More and more, Pivotal Engineering is seeing cases where customers elect to build a cluster that satisfies a specific requirement or purpose.

Pivotal Platform Engineering publishes this framework as a resource for assisting customers in this effort.

Objectives

This guide can be used for:

A clear understanding of what characterizes a recommended platform for running Pivotal Greenplum Database

A review of the two most common topologies with supporting recommended architecture diagrams

Pivotal recommended reference architecture that includes hardware recommendations, configuration, hard disk guidelines, network layout, installation, data loading, and verification

Extra guidance with real-world Greenplum cluster examples (see Pivotal Cluster Examples)

This document does:

provide recommendations for building a well-performing Pivotal cluster using the hardware guidelines presented

provide general concepts without specific tuning suggestions

This document does not:

promise Pivotal support for the use of third-party hardware

assume that the information herein applies to every site; it is subject to modification depending on a customer's specific local requirements

provide all-inclusive procedures for configuring Pivotal Greenplum. A subset of information is included as it pertains to deploying a Pivotal cluster.

Greenplum Terms to Know

master
A server that provides entry to the Greenplum Database system, accepts client connections and SQL queries, and distributes work to the segment instances.

segment instances
Independent PostgreSQL databases that each store a portion of the data and perform the majority of query processing.

segment host
A server that typically executes multiple Greenplum segment instances.

interconnect
Networking layer of the Greenplum Database architecture that facilitates inter-process communication between segments.

Feedback and Updates

Please send feedback and/or updates to [email protected].


Key Points for Review

What is Pivotal Engineering Recommended Architecture?

This Pivotal Recommended Architecture comprises generic recommendations for third-party hardware for use with Pivotal software products. Pivotal maintains examples of various implementations internally to aid in assisting customers with cluster diagnostics and configuration. Pivotal does not perform hardware replacement, nor is Pivotal a substitute for OEM vendor support for these configurations.

Why Install on an OEM Vendor Platform?

The EMC DCA strives to achieve the best balance between performance and cost while meeting a broad range of customer needs. There are, however, some very valid reasons customers may opt to design their own clusters.

Some possibilities are:

Varying workload profiles that may require more memory or higher processor capacity

Specific functional needs like public/private clouds, increased density, or disaster recovery (DR)

Support for radically different network topologies

Deeper, more direct access for hardware and OS management

Existing relationships with OEM hardware partners

If customers opt out of using the appliance, Pivotal Engineering highly recommends following the Pivotal architecture guidelines and discussing the implementation with a Pivotal engineer. Customers achieve much greater reliability when following these recommendations.


Characteristics of a Supported Pivotal Hardware Platform

Commodity Hardware

Pivotal believes that customers should take advantage of the inexpensive yet powerful commodity hardware that includes x86_64 platform commodity servers, storage, and Ethernet switches.

Pivotal recommends:

Chipsets or hardware used across many platforms:

NIC chipsets (like some of the Intel series)

RAID controllers (like LSI or StorageWorks)

Reference motherboards/designs

Machines that use reference motherboard implementations are preferred. Although DIMM count is important, if a manufacturer integrates more DIMM slots than the CPU manufacturer specifies, more risk is placed on the platform.

Ethernet-based interconnects (10Gb) are:

Highly preferred to proprietary interconnects.

Highly preferred to storage fabrics.

Manageability

Pivotal recommends:

Remote, out-of-band management capability with support for ssh connectivity as well as web-based console access and virtual media.

Diagnostic LEDs that convey failure information. Amber lights are a minimum, but an LED that displays the exact failure is more useful.

Tool-free maintenance (the cover can be opened without tools, parts are hot-swappable without tools, etc.).

Labeling: components such as DIMMs are labeled so it's easy to determine which part needs to be replaced.

Command-line, script-based interfaces for configuring the server BIOS and options like RAID cards and NICs.

Redundancy

Pivotal recommends:

Redundant hot-swappable power supplies

Redundant hot-swappable fans

Redundant network connectivity

Hot-swappable drives

Hot-spare drives when immediate replacement of failed hardware is unavailable

Determining the Best Topology

Traditional Topology

This configuration requires the least specific networking skills and is the simplest possible configuration. In a traditional network topology, every server in the cluster is directly connected to every switch in the cluster. This is typically implemented over 10Gb Ethernet. This topology limits the cluster size to the number of ports on the selected interconnect switches. 10Gb ports on the servers are bonded into an active/active pair and routed directly to a set of switches configured using MLAG (or comparable technology) to provide a redundant high-speed network fabric.


Figure: Recommended Architecture Example 1 (Typical Topology)

Scalable Topology

Scalable networks implement a network core that allows the cluster to grow beyond the number of ports in the interconnect switches. Care must be taken to ensure that the number of links from the in-rack switches is adequate to service the core.

How to Determine the Maximum Number of Servers

For example, suppose each rack can hold 16 servers and you determine that the core switches each have 48 ports. Of these ports, 4 are used to create the MLAG between the two core switches. Of the remaining 44 ports, networking from a single set of interconnect switches in a rack uses 4 links per core switch, 2 from each interconnect switch to each of the core switches. The maximum number of servers is determined by the following formula:

max-nodes = nodes-per-rack * ((core-switch-port-count - MLAG-port-utilization) / rack-to-rack-link-port-count)

176 = 16 * ((48 - 4) / 4)
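
The same worked example can be expressed as shell arithmetic (the variable names are illustrative):

    # Worked example from the formula above
    nodes_per_rack=16
    core_ports=48        # ports per core switch
    mlag_ports=4         # ports used for the MLAG between the two core switches
    links_per_rack=4     # uplinks per core switch from each rack's interconnect switches
    echo $(( nodes_per_rack * ( (core_ports - mlag_ports) / links_per_rack ) ))   # prints 176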

Figure: Recommended Architecture Example 2 (Scalable Topology)


Pivotal Approved Recommended Architecture

Minimum Server Guidelines

Table 1 lists minimum requirements for a good cluster. Use gpcheckperf to generate these metrics.

See Appendix C: Using gpcheckperf to Validate Disk and Network Performance for example gpcheckperf output.

Table 1. Baseline Numbers for a Pivotal Cluster

Master Nodes (mdw & smdw)
Users and applications connect to masters to submit queries and return results. Typically, monitoring and managing the cluster and the database is performed through the master nodes.
CPU: 8+ physical cores at greater than 2GHz clock speed
RAM: >256GB
Disk read: >600MB/s
Disk write: >500MB/s
Interconnect: 2 x 10Gb NICs
Additional connectivity: Multiple NICs
Form factor: 1U

Segment Nodes (sdw)
Segment nodes store data and execute queries. They are generally not public facing. Multiple segment instances run on one segment node.
CPU: 8+ physical cores at greater than 2GHz clock speed
RAM: >256GB
Disk read: >2000MB/s
Disk write: >2000MB/s
Interconnect: 2 x 10Gb NICs
Additional connectivity: Multiple NICs
Form factor: 2U

ETL/Backup Nodes (etl)
Generally identical to segment nodes. These are used as staging areas for loading data or as destinations for backup data.
CPU: 8+ physical cores at greater than 2GHz clock speed
RAM: >64GB
Disk read: >2000MB/s
Disk write: >2000MB/s
Interconnect: 2 x 10Gb NICs
Additional connectivity: Multiple NICs
Form factor: 2U

Network Guidelines

Table 2. Administration and Interconnect Switches

Administration Network: 48-port, 1Gb. One layer-2/layer-3 managed switch per rack, with no specific bandwidth or blocking requirements. Administration networks are used to tie together the lights-out management interfaces in the cluster and provide a management route into the servers and OS.

Interconnect Network: 48-port, 10Gb. Two layer-2/layer-3 managed switches per rack. All ports must have full bandwidth, be able to operate at line rate, and be non-blocking.

Table 3. Racking, Power, and Density

Racking
Generally, a 40U or larger rack that is 1200mm deep is required. Built-in cable management is preferred. ESM protective doors are also preferred.

Power
The typical input power for a Pivotal Greenplum rack is 4 x 208/220V, 30 amp, single-phase circuits in the US. Internationally, 4 x 230V, 32 amp, single-phase circuits are generally used. This affords a power budget of ~9600VA of fully redundant power.

Other power configurations are absolutely fine so long as there is enough energy delivered to the rack to accommodate the contents of the rack in a fully redundant manner.

Node Guidelines

OS Levels

At a minimum, the following operating systems (OS) are supported:

Red Hat/CentOS Linux 5 *

Red Hat/CentOS Linux 6

Red Hat/CentOS Linux 7 **

SUSE Enterprise Linux 10.2 or 10.3

SUSE Enterprise Linux 11

* RHEL/CentOS 5 will be unsupported in the next major release

** Support for RHEL/CentOS 7 is near completion, pending kernel bug fixes

For the latest information on supported OS versions, refer to the Greenplum Database Installation Guide.

Setting OS Parameters for Greenplum Database

Careful consideration must be given when setting OS parameters for Greenplum Database hosts. Refer to the latest version of the Greenplum Database Installation Guide for these settings.
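
For orientation, parameters of this kind are typically applied through /etc/sysctl.conf. The fragment below is a minimal sketch with example values drawn from installation guides of this era; it is illustrative only, and the Installation Guide for your release remains the authoritative source:

    # /etc/sysctl.conf fragment -- example values only, not recommendations
    kernel.shmmax = 500000000
    kernel.shmall = 4000000000
    kernel.sem = 250 512000 100 2048
    vm.overcommit_memory = 2
    net.ipv4.ip_local_port_range = 1025 65535

    # Apply without rebooting:
    # sysctl -p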

Greenplum Database Server Guidelines

Greenplum Database integrates three kinds of servers: master servers, segment hosts, and ETL servers. Greenplum Database servers must meet the following criteria.

Master Servers

1U or 2U server. With less of a need for drives, rack space can be saved by going with a 1U form factor. However, a 2U form factor consistent with segment hosts may increase supportability.


Same processors, RAM, RAID card, and interconnect NICs as the segment hosts.

Six to ten disks (eight is most common) organized into a single RAID5 group with one hot spare configured.

SAS 15k or SSD disks are preferred, with 10k disks a close second.

SATA drives are acceptable in solutions oriented towards archival space over query performance.

All disks must be the same size and type.

Should be capable of read rates in gpcheckperf of 500MB/s or higher. (The faster the master scans, the faster it can generate query plans, which improves overall performance.)

Should be capable of write rates in gpcheckperf of 500MB/s or higher.

Should have sufficient additional network interfaces to connect to the customer network directly in the manner desired by the customer.

Segment Hosts

Typically a 2U server.

The fastest available processors.

256GB RAM or more.

One or two RAID cards with maximum cache and cache protection (flash or capacitors preferred over battery). RAID cards should be able to support full read/write capacity of the drives.

2 x 10Gb NICs.

12 to 24 disks organized into two or four RAID5 groups. Hot spares should be configured unless there are disks on hand for quick replacement.

SAS 15k disks are preferred, with 10k disks a close second. SATA disks are preferred over nearline SAS if SAS 15k or SAS 10k cannot be used. All disks must be the same size and type.

A minimum read rate in gpcheckperf of 300MB/s per segment or higher. (2000MB/s per server is typical.)

A minimum write rate in gpcheckperf of 300MB/s or higher. (2000MB/s per server is typical.)

Additional Tips for Segment Host Configuration

The number of segment instances that are run per segment host is configurable, and each segment instance is itself a database running on the server. A baseline recommendation on current hardware, such as the hardware described in Appendix A, is 8 primary segment instances per physical server.

A set of memory parameters that depends upon the amount of RAM selected for each segment instance will be determined when installing the database software. While these are not platform parameters, it is the platform that determines how much memory is available and how the memory parameters should be set in the software. Refer to the online calculator (http://greenplum.org/calc/) to determine these settings.
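
As a rough sketch of the kind of arithmetic the calculator performs, a commonly published rule of thumb from Greenplum documentation of this era derives a per-segment memory limit from RAM, swap, and the number of acting primary segments. All values below are placeholders; use the calculator and the Installation Guide for real settings:

    # Illustrative only -- verify against http://greenplum.org/calc/
    RAM_GB=256     # physical memory per segment host
    SWAP_GB=64     # configured swap (placeholder)
    SEGMENTS=8     # acting primary segments per host, worst case
    gp_vmem=$(echo "((${SWAP_GB} + ${RAM_GB}) - (7.5 + 0.05 * ${RAM_GB})) / 1.7" | bc -l)
    echo "gp_vmem_protect_limit (MB): $(echo "${gp_vmem} * 1024 / ${SEGMENTS}" | bc)"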

Refer to Appendix D for further reading on segment instance configuration.

ETL Servers

Typically a 2U server.

The same processors, RAM, and interconnect NICs as the segment servers.

One or two RAID cards with maximum cache and cache protection (flash or capacitors preferred over battery).

12 to 24 disks organized into RAID5 groups of six to eight disks, with no hot spares configured (unless there are available disks after the RAID groups are constructed).

SATA disks are a good choice for ETL, as performance is typically less of a concern than storage for these systems.

Should be capable of read rates in gpcheckperf of 100MB/s or higher. (The faster the ETL servers scan, the faster query data can be loaded.)

Should be capable of write rates in gpcheckperf of 500MB/s or higher. (The faster ETL servers write, the faster data can be staged for loading.)

Additional Tips for Selecting ETL Servers

ETL nodes can be any server that offers enough storage and performance to accomplish the tasks required. Typically, between 4 and 8 ETL servers are required per cluster. The maximum number is dependent on the desired load performance and the size of the Greenplum Database cluster.

For example, the larger the Greenplum Database cluster, the faster the loads can be. The more ETL servers, the faster data can be served. Having more ETL bandwidth than the cluster can receive is pointless. Having much less ETL bandwidth than the cluster can receive makes for slower loading than the maximum possible.

Hard Disk Configuration Guidelines

A generic server with 24 hot-swappable disks can have several potential disk configurations. Testing by Pivotal Platform and Systems Engineering shows that the best performing storage for Pivotal software is:

four RAID5 groups of six disks each (used as four file systems), or

combined into one or two file systems using logical volume manager.

The following instructions describe how to build the recommended RAID groups and virtual disks for both master and segment nodes. How these ultimately translate into file systems is covered in the relevant operating system's installation guide.

LUN Configuration

The RAID controller settings and disk configuration are based on synthetic load testing performed on several RAID configurations. Unfortunately, the settings that resulted in the best read rates did not have the highest write rates, and the settings with the best write rates did not have the highest read rates.

The prescribed settings offer a compromise. In other words, these settings result in write rates lower than the best measured write rate but higher than the write rates associated with the settings for the highest read rate. The same is true for read rates. This is intended to ensure that both input and output are the best they can be while affecting the other the least amount possible.

LUNs for the system should be partitioned and mounted as /data1 for the first LUN, and additional LUNs should follow the same naming convention while incrementally increasing the number (/data1, /data2, /data3 ... /dataN). All file systems should be formatted as xfs and follow the recommendations set forth in the Pivotal Greenplum Database Installation Guide.
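
A minimal sketch of formatting and mounting one such LUN, assuming it appears as /dev/sdb1 (the device name and the mount options shown are illustrative; the Installation Guide is authoritative):

    # Format the first data LUN as xfs and mount it as /data1
    mkfs.xfs /dev/sdb1
    mkdir -p /data1
    mount -t xfs -o rw,nodev,noatime,nobarrier,inode64 /dev/sdb1 /data1

    # A matching /etc/fstab entry keeps the mount across reboots:
    # /dev/sdb1  /data1  xfs  rw,nodev,noatime,nobarrier,inode64  0 0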

Master Server

Master servers (primary and secondary) have eight hot-swappable disks. Configure all eight disks into a single RAID5 stripe set. Each of the virtual disks that are carved from this disk group should have the following properties:

256k stripe width

No read-ahead

Disk cache disabled

Direct I/O

Virtual disks are configured in the RAID card's option ROM. Each virtual disk defined in the RAID card will appear to be a disk in the operating system with a /dev/sd? device file name.

Segment and ETL Servers

Segment servers have 24 hot-swappable disks. These can be configured in a number of ways, but Pivotal recommends four RAID5 groups of six disks each (RAID5, 5+1). Each of the virtual disks that will be carved from these disk groups should have the following properties:

256k stripe width

No read-ahead

Disk cache disabled

Direct I/O

Virtual disks are configured in the RAID card's option ROM. Each virtual disk defined in the RAID card will appear to be a disk in the operating system with a /dev/sd? device file name.

SSD Storage

Flash storage has been gaining in popularity. Pivotal has not had the opportunity to do enough testing with SSD drives to make a recommendation. It is important when considering SSD drives to validate the sustained sequential read and write rates for the drive. Many drives have impressive burst rates, but are unable to sustain those rates for long periods of time. Additionally, the choice of RAID card needs to be evaluated to ensure it can handle the bandwidth of the SSD drives.

SAN/JBOD Storage

In some configurations it may be a requirement to use an external storage array due to the database size or server type being used by the customer. With this in mind, it is important to understand that, based on testing by Pivotal Platform and Systems Engineering, SAN and JBOD storage will not perform as well as local, internal server storage.

Some considerations to take into account when installing or sizing such a configuration are the following (independent of the vendor of choice):

Know the database size and the estimated growth over time

Know the customer's read/write ratio

Large block I/O is the predominant workload (512KB)

Disk type and preferred RAID type based on the vendor of choice

Expected disk throughput based on read and write

Response time of the disks/JBOD controller

The preferred option is to have BBU capability on either the RAID card or controller

Redundancy in switch zoning, preferably with a fan-in:fan-out ratio of 2:1

At least 8Gb Fibre Channel (FC) connectivity

Ensure that the server supports the use of FC, FCoE, or external RAID cards

In all instances where an external storage source is being utilized, the vendor of the disk array/JBOD should be consulted to obtain specific recommendations based on a sequential workload. This may also require the customer to obtain additional licenses from the pertinent vendors.

Network Layout Guidelines

All the systems in the Greenplum cluster need to be tied together in some form of dedicated, high-speed data interconnect. This network is used for loading data and for passing data between systems during query processing. It should be as high-speed and low-latency as possible, and it should not be used for any other purpose (i.e., it should not be part of the general LAN).

A rule of thumb for network utilization in a Greenplum cluster is to plan for up to twenty percent of each server's maximum I/O read bandwidth as network traffic. This means a server with a 2000MB/s read bandwidth (as measured by gpcheckperf) might be expected to transmit 400MB/s. Greenplum also compresses some data on disk but uncompresses it before transmitting to other systems in the cluster, so a 2000MB/s read rate with a 4x compression ratio results in an 8000MB/s effective read rate. Twenty percent of 8000MB/s is 1600MB/s, which is more than a single gigabit interface's bandwidth.
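
The same rule of thumb as shell arithmetic, using the numbers from the example above:

    read_bw=2000                              # MB/s, as measured by gpcheckperf
    compression=4                             # 4x on-disk compression ratio
    effective=$(( read_bw * compression ))    # 8000 MB/s effective read rate
    echo $(( effective * 20 / 100 ))          # 1600 MB/s of expected network traffic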

To accommodate this traffic, 10Gb networking is recommended for the interconnect. Current best practice suggests two 10Gb interfaces for the cluster interconnect. This ensures that there is bandwidth to grow into, and reduces cabling in the racks. It is recommended to configure the two 10Gb interfaces with NIC bonding to create a load-balanced, fault-tolerant interconnect.
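
A minimal sketch of such a bond on a RHEL-style system, assuming the two 10Gb ports are eth2 and eth3 and that the switch pair supports MLAG. The interface names, address, and LACP mode are assumptions; align the bonding mode with your switch configuration:

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- illustrative only
    DEVICE=bond0
    IPADDR=172.1.1.1
    NETMASK=255.255.0.0
    ONBOOT=yes
    BOOTPROTO=none
    BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer3+4"

    # /etc/sysconfig/network-scripts/ifcfg-eth2 (repeat for eth3)
    DEVICE=eth2
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none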

Cisco, Brocade, and Arista switches are good choices, as these brands include the ability to tie switches together in fabrics. Together with NIC bonding on the servers, this approach eliminates single points of failure in the interconnect network. Intel, QLogic, or Emulex network interfaces tend to work best. Layer 3 capability is recommended since it integrates many features that are useful in a Greenplum Database environment.

Note: The vendor hardware referenced above is strictly mentioned as an example. Pivotal Platform and Systems Engineering does not specify which products to use in the network.

FCoE switch support is also required if SAN storage is used, as well as support for FIP snooping.

A Greenplum Database cluster uses three kinds of network connections:

Admin networks

Interconnect networks

External networks

Admin Networks

An Admin network ties together all the management interfaces for the devices in a configuration. It is generally used to provide monitoring and out-of-band console access for each connected device. The Admin network is typically a 1Gb network physically and logically distinct from other networks in the cluster.

Servers are typically configured such that the out-of-band or lights-out management interfaces share the first network interface on each server. In this way, the same physical network provides access to lights-out management and an operating-system-level connection useful for network OS installation, patch distribution, monitoring, and emergency access.

Switch Types

Typically one 24- or 48-port, 1Gb switch per rack and one additional 48-port switch per cluster as a core.

Any 1Gb switch can be used for the Admin network. Careful planning is required to ensure that a network topology is designed to provide enough connections and the features desired by the site to provide the kinds of access required.

Cables

Use either cat5e or cat6 cabling for the Admin network. Cable the lights-out or management interface from each cluster device to the Admin network. Place an Admin switch in each rack and cross-connect the switches rather than attempting to run cables from a central switch to all racks.

Note: Pivotal recommends using a different color cable for the Admin network.

Interconnect Networks

The interconnect network ties the servers in the cluster together and forms a high-speed, low-contention data connection between the servers. This should not be implemented on the general data center network, as Greenplum Database interconnect traffic tends to overwhelm networks from time to time. Low latency is needed to ensure proper functioning of the Greenplum Database cluster. Sharing the interconnect with a general network tends to introduce instability into the cluster.

Typically two switches are required per rack, and two more to act as a core. Use two 10Gb cables per server and eight per rack to connect the rack to the core.

Interconnect networks are often connected to general networks in limited ways to facilitate data loading. In these cases, it is important to shield both the interconnect network and the general network from the Greenplum Database traffic and vice versa. Use a router or an appropriate VLAN configuration to accomplish this.

External Network Connections

The master nodes are connected to the general customer network to allow users and applications to submit queries. Typically, this is done with a small number of 1Gb connections attached to the master nodes. Any method that affords network connectivity from the users and applications needing access to the master nodes is acceptable.

Installation Guidelines

Each configuration requires a specific rack plan. There are single and multi-rack configurations, determined by the number of servers present in the configuration. A single-rack configuration is one where all the planned equipment fits into one rack. Multi-rack configurations require two or more racks to accommodate all the planned equipment.

Racking Guidelines for a 42U Rack

Consider the following if installing the cluster in a 42U rack.

Prior to racking any hardware, perform a site survey to determine what power option is desired, whether power cables will be top or bottom of the rack, and whether network switches and patch panels will be top or bottom of the rack.

Install the KMM tray into rack unit 19.

Install the interconnect switches into rack units 21 and 22, leaving a one-unit gap above the KMM tray.

Rack segment nodes up from the first available rack unit at the bottom of the rack (see multi-rack rules for variations using low rack units).

Install no more than sixteen 2U servers (excludes masters but includes segment and ETL nodes).

Install the master node into rack unit 17. Install the stand-by master into rack unit 18.

Admin switches can be racked anywhere in the rack, though the top is typically the best and simplest location.


All computers, switches, arrays, and racks should be labeled on both the front and back, as described in the section on labels later in this document.

All installed devices should be connected to two or more power distribution units (PDUs) in the rack where the device is installed.

When installing a multi-rack cluster:

Install the interconnect core switches in the top two rack units if the cables come in from the top, or in the bottom two rack units if the cables come in from the bottom.

Do not install core switches in the master rack.

Cabling

The number of cables required varies according to the options selected. In general, each server and switch installed will use one cable for the Admin network. Run cables according to established cabling standards. Eliminate tight bends or crimps. Clearly label all cables at each end. The label on each end of the cable must trace the path the cable follows between server and switch. This includes:

Switch name and port

Patch panel name and port, if applicable

Server name and port

Switch Configuration Guidelines

Typically, the factory default configuration is sufficient.

IP Addressing Guidelines

IP Addressing Scheme for the Admin Network

An Admin network should be created so that system maintenance and access work can be done on a network that is separate from the cluster traffic between the nodes.

Note: Pivotal's recommended IP addressing for servers on the Admin network uses a standard internal address space and is extensible to include over 1,000 nodes.

All Admin network switches present should be cross-connected, and all NICs attached to these switches participate in the 172.254.0.0/16 network.

Table 4. IP Addresses for Servers and CIMC

Host Type | Network Interface | IP Address
Primary Master Node | CIMC | 172.254.1.252/16
Primary Master Node | Eth0 | 172.254.1.250/16
Secondary Master Node | CIMC | 172.254.1.253/16
Secondary Master Node | Eth0 | 172.254.1.251/16
Non-master Segment Nodes in rack 1 (master rack) | CIMC | 172.254.1.101/16 through 172.254.1.116/16
Non-master Segment Nodes in rack 1 (master rack) | Eth0 | 172.254.1.1/16 through 172.254.1.16/16
Non-master Segment Nodes in rack 2 | CIMC | 172.254.2.101/16 through 172.254.2.116/16
Non-master Segment Nodes in rack 2 | Eth0 | 172.254.2.1/16 through 172.254.2.16/16
Non-master Segment Nodes in rack # | CIMC | 172.254.#.101/16 through 172.254.#.116/16
Non-master Segment Nodes in rack # | Eth0 | 172.254.#.1/16 through 172.254.#.16/16

Note: Where # is the rack number.

The fourth octet is counted from the bottom up. For example, the bottom server in the first rack is 172.254.1.1 and the top, excluding masters, is 172.254.1.16.

The bottom server in the second rack is 172.254.2.1 and the top is 172.254.2.16. This continues for each rack in the cluster regardless of individual server purpose.

IP Addressing for Non-server Devices

The following table lists the correct IP addressing for each non-server device.

Table 5. Non-server IP Addresses

Device | IP Address
First Interconnect Switch in Rack | 172.254.#.201/16 *
Second Interconnect Switch in Rack | 172.254.#.202/16 *

* Where # is the rack number

IP Addressing for Interconnects Using 10Gb NICs

The interconnect is where data is routed at high speed between the nodes.

Table 6. Interconnect IP Addressing for 10Gb NICs

Host Type | Physical RJ-45 Port | IP Address
Primary Master | 1st port on PCIe card | 172.1.1.250/16
Primary Master | 2nd port on PCIe card | 172.2.1.250/16
Secondary Master | 1st port on PCIe card | 172.1.1.251/16
Secondary Master | 2nd port on PCIe card | 172.2.1.251/16
Non-Master Nodes | 1st port on PCIe card | 172.1.#.1/16 through 172.1.#.16/16
Non-Master Nodes | 2nd port on PCIe card | 172.2.#.1/16 through 172.2.#.16/16

Note: Where # is the rack number:

The fourth octet is counted from the bottom up. For example, the bottom server in the first rack uses 172.1.1.1 and 172.2.1.1.

The top server in the first rack, excluding masters, uses 172.1.1.16 and 172.2.1.16.

Each NIC on the interconnect uses a different subnet, and each server has a NIC on each subnet.

IP Addressing for Fault Tolerant Interconnects

The following table lists correct IP addresses for fault-tolerant interconnects regardless of bandwidth.

Table 7. Fault Tolerant (Bonded) Interconnects

Host Type | IP Address
Primary Master | 172.1.1.250/16
Secondary Master | 172.1.1.251/16
Non-Master Nodes | 172.1.#.1/16 through 172.1.#.16/16

Note: Where # is the rack number:

The fourth octet is counted from the bottom up. For example, the bottom server in the first rack uses 172.1.1.1.

The top server in the first rack, excluding masters, uses 172.1.1.16.


Data Loading Connectivity Guidelines

High-speed data loading requires direct access to the segment nodes, bypassing the masters. There are three ways to connect a Pivotal cluster to external data sources or backup targets:

VLAN Overlay: The first and recommended best practice is to use virtual LANs (VLANs) to open up specific hosts in the customer network and the Greenplum Database cluster to each other.

Direct Connect to Customer Network: Only use if there is a specific customer requirement.

Routing: Only use if there is a specific customer requirement.

VLAN Overlay

VLAN overlay is the most commonly used method to provide access to external data without introducing network problems. The VLAN overlay imposes an additional VLAN on the connections of a subset of the cluster servers.

How the VLAN Overlay Method Works

Using the VLAN overlay method, traffic passes between the cluster servers on the internal VLAN, but cannot pass out of the internal switch fabric because the external-facing ports are assigned only to the overlay VLAN. Traffic on the overlay VLAN (traffic to or from IP addresses assigned to the relevant servers' virtual network interfaces) can pass in and out of the cluster.

This VLAN configuration allows multiple clusters to co-exist without requiring any change to their internal IP addresses. This gives customers more control over what elements of the clusters are exposed to the general customer network. The overlay VLAN can be a dedicated VLAN and include only those servers that need to talk to each other, or the overlay VLAN can be the customer's full network.

Figure: Basic VLAN Overlay Example

This figure shows a cluster with 3 segment hosts, a master, a standby master, and an ETL host. In this case, only the ETL host is part of the overlay. It is not a requirement to have the ETL node use the overlay, though this is common in many configurations to allow data to be staged within a cluster. Any of the servers in this rack, or any rack of any other configuration, may participate in the overlay if desired. The type of configuration will depend upon security requirements and whether functions within the cluster need to reach any outside data sources.

Configuring the Overlay VLAN: An Overview

Configuring the VLAN involves three steps:

1. Virtual interface tags packets with the overlay VLAN

2. Configure the switch in the cluster with the overlay VLAN

3. Configure the ports on the switch connecting to the customer network


Step 1: Virtual interface tags packets with the overlay VLAN

Each server that is both in the base VLAN and the overlay VLAN has a virtual interface created that tags packets sent from the interface with the overlay VLAN. For example, suppose eth2 is the physical interface on an ETL server that is connected to the first interconnect network. To include this server in an overlay VLAN, the interface eth2.1000 is created, using the same physical port but defining a second interface for the port. The physical port does not tag its packets, but any packet sent using the virtual port is tagged with a VLAN.
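
A minimal sketch of creating such a tagged interface on a RHEL-style system, following the eth2/VLAN 1000 example above (the IP address is a placeholder for an address on the overlay network):

    # /etc/sysconfig/network-scripts/ifcfg-eth2.1000 -- illustrative only
    DEVICE=eth2.1000
    VLAN=yes
    IPADDR=10.1.1.15        # placeholder overlay-network address
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # Equivalent one-off creation with iproute2:
    # ip link add link eth2 name eth2.1000 type vlan id 1000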

Step 2: Configure the switch in the cluster with the overlay VLAN

The switch in the cluster that connects to the servers and the customer network is configured with the overlay VLAN. All of the ports connected to servers that will participate in the overlay are changed to switchport mode converged and added to both the internal VLAN (199) and the overlay VLAN (1000).

Step 3: Configure the switch ports connected to the customer network

The ports on the switch connecting to the customer network are configured as either access or trunk mode switchports (depending on customer preference) and added only to the overlay VLAN.
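
For illustration, on a switch with an IOS-like CLI the customer-facing port might be configured along these lines. The syntax and port numbering vary by vendor and are assumptions here, not a prescribed configuration:

    ! Customer-facing uplink carries only the overlay VLAN (1000)
    interface Ethernet48
      description uplink-to-customer-network
      switchport mode trunk
      switchport trunk allowed vlan 1000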

Direct Connect to the Customer's Network

Each node in the Greenplum Database cluster can simply be cabled directly to the network where the data sources exist, or to a network that can communicate with the source network. This is a brute-force approach that works very well. Depending on what network features are desired (redundancy, high bandwidth, etc.), this method can be very expensive in terms of cabling and switch gear, as well as space for running large numbers of cables.

Figure: Data Loading: Direct Connect to Customer Network

Routing

One option is to use any of the standard networking methods for linking two different networks together. These can be deployed to tie the interconnect network(s) to the data source network(s). Which of these methods is used will depend on the circumstances and the goals for the connection.

In this approach, a router is installed that advertises the external networks to the servers in the Greenplum cluster. This method could potentially have performance and configuration implications on the customer's network.

Validation Guidelines

Most of the validation effort is performed after the OS is installed, when a variety of OS-level tools are available. A checklist covering the issues raised in this section is included in the relevant OS installation guide; it should be separately printed and signed for delivery.

Examine and verify the following items:

All cables labeled according to the standards in this document


All racks labeled according to the standards in this document

All devices power on

All hot-swappable devices are properly seated

No devices show any warning or fault lights

All network management ports are accessible via the administration LAN

All cables are neatly dressed into the racks and have no sharp bends or crimps

All rack doors and covers are installed and close properly

All servers extend and retract without pinching or stretching cables

Labels

Racks

Each rack in a Recommended Architecture is labeled at the top of the rack and on both the front and back. Racks are named Master Rack or Segment Rack #, where # is a sequential number starting at 1.

Servers

Each server is labeled on both the front and back of the server. The label should be the hostname of the server. In other words, if a segment node is known as sdw15, the label on that server would be sdw15.

Switches

Switches are labeled according to their purpose. Interconnect switches are i-sw, administration switches are a-sw, and ETL switches are e-sw. Each switch is assigned a number starting at 1. Switches are labeled on the front of the switch only, since the back is generally not visible when racked.

Certification Guidelines

Network Performance Test: gpcheckperf

Verify the line rate on both 10Gb NICs. Run gpcheckperf on the disks and network connections within the cluster. As each certification will vary due to the number of disks, nodes, and network bandwidth available, the commands to run the tests will differ.

See Using gpcheckperf to Validate Disk and Network Performance for more information on the gpcheckperf command.

Hardware Monitoring and Failure Analysis Guidelines

To support monitoring of a running cluster, the following items should be in place and capable of being monitored, with the gathered information available via interfaces such as SNMP or IPMI.


Fans/Temp

Fan status/presence

Fan speed

Chassis temp

CPU temp

IOH temp

Memory

DIMM temp

DIMM status (populated, online)

DIMM single-bit errors

DIMM double-bit errors

ECC warnings (corrections exceeding threshold)

ECC correctable errors

ECC uncorrectable errors

Memory CRC errors

System Errors

POST errors

PCIe fatal errors

PCIe non-fatal errors

CPU machine check exceptions

Intrusion detection

Chipset errors

Power

Power supply presence

Power supply failures

Power supply input voltage

Power supply amperage

Motherboard voltage sensors

System power consumption


Pivotal Cluster Examples

The following table lists good choices for cluster hardware based on Intel Sandy Bridge processor-based servers and Cisco switches.

Table 1. Hardware Components

Master Node

Two of these nodes per cluster.

1U server (similar to the Dell R630):

2 x E5-2680 v3 processors (2.5GHz, 12 cores, 120W)

256GB RAM (8 x 16GB)

1 x RAID card with 1GB protected cache

8 x SAS, 10k, 6G disks (typically 8 x 600GB, 2.5"), organized into a single RAID5 disk group with a hot spare. Logical devices are defined per the OS needs (boot, root, swap, etc.), with the remainder in a single, large file system for data.

2 x 10Gb Intel, QLogic, or Emulex based NICs

Lights-out management (IPMI-based BMC)

2 x 650W or higher, high-efficiency power supplies

Segment Node & ETL Node

Up to 16 per rack. No maximum total count.

2U server (similar to the Dell R730xd):

2 x E5-2680 v3 processors (2.5GHz, 12 cores, 120W)

256GB RAM (8 x 16GB)

1 x RAID card with 1GB protected cache

12 to 24 x SAS, 10k, 6G disks (typically 12 x 600GB, 3.5" or 24 x 1.8TB, 2.5"), organized into two to four RAID5 groups. Used either as two to four data file systems (with logical devices skimmed off for boot, root, swap, etc.) or as one large device bound with Logical Volume Manager.

2 x 10Gb Intel, QLogic, or Emulex based NICs

Lights-out management (IPMI-based BMC)

2 x 650W or higher, high-efficiency power supplies

Admin Switch

Cisco Catalyst 2960 Series

A simple, 48-port, 1Gb switch with features that allow it to be easily combined with other switches to expand the network. The least expensive managed switch with good reliability is appropriate for this role. There will be at least one per rack.

Interconnect

Arista 7050-52

The Arista switch line allows for multi-switch link aggregation groups (called MLAG), easy expansion, and a reliable body of hardware and operating system.


Example Rack Layout

The following figure is an example rack layout with proper switch and server placement.

Figure: 42U Rack Diagram


Using gpcheckperf to Validate Disk and Network Performance

The following examples illustrate how gpcheckperf is used to validate disk and network performance in a cluster.

Checking Disk Performance: gpcheckperf Output

[gpadmin@mdw ~]$ gpcheckperf -f hosts -r d -D -d /data1/primary -d /data2/primary -S 80G
/usr/local/greenplum-db/./bin/gpcheckperf -f hosts -r d -D -d /data1/primary -d /data2/primary -S 80G

--------------------
 DISK WRITE TEST
--------------------

--------------------
 DISK READ TEST
--------------------

====================
==  RESULT
====================

disk write avg time (sec): 71.33
disk write tot bytes: 343597383680
disk write tot bandwidth (MB/s): 4608.23
disk write min bandwidth (MB/s): 1047.17 [sdw2]
disk write max bandwidth (MB/s): 1201.70 [sdw1]
-- per host bandwidth --
disk write bandwidth (MB/s): 1200.82 [sdw4]
disk write bandwidth (MB/s): 1201.70 [sdw1]
disk write bandwidth (MB/s): 1047.17 [sdw2]
disk write bandwidth (MB/s): 1158.53 [sdw3]

disk read avg time (sec): 103.17
disk read tot bytes: 343597383680
disk read tot bandwidth (MB/s): 5053.03
disk read min bandwidth (MB/s): 318.88 [sdw2]
disk read max bandwidth (MB/s): 1611.01 [sdw1]
-- per host bandwidth --
disk read bandwidth (MB/s): 1611.01 [sdw1]
disk read bandwidth (MB/s): 318.88 [sdw2]
disk read bandwidth (MB/s): 1560.38 [sdw3]

Checking Network Performance: gpcheckperf Output


[gpadmin@mdw ~]$ gpcheckperf -f network1 -r N -d /tmp
/usr/local/greenplum-db/./bin/gpcheckperf -f network1 -r N -d /tmp

-------------------
--  NETPERF TEST
-------------------

====================
==  RESULT
====================

Netperf bisection bandwidth test
sdw1 -> sdw2 = 1074.010000
sdw3 -> sdw4 = 1076.250000
sdw2 -> sdw1 = 1094.880000
sdw4 -> sdw3 = 1104.080000

Summary:
sum = 4349.22 MB/sec
min = 1074.01 MB/sec
max = 1104.08 MB/sec
avg = 1087.31 MB/sec
median = 1094.88 MB/sec


Pivotal Greenplum Segment Instances per Server

Understanding Greenplum Segments

Greenplum segment instances are essentially individual databases. In a Greenplum cluster, a Greenplum master server dispatches work to multiple segment instances, each of which resides on a segment host. Data for a table is distributed across all of the segment instances, and when a query requesting data is executed, it is dispatched to all of them to execute in parallel. The instances that actively process queries are referred to as the primary instances. In addition, a Greenplum cluster runs mirror instances, one paired to each primary. The mirrors do not participate in answering queries; they only perform data replication, so that if a primary should fail, its mirror can take over processing in its place.

When planning a cluster, it is important to understand that all of these instances are going to accept a query in parallel and act upon it. Therefore, there must be enough resources on a server to facilitate all of these processes running and communicating with each other at once.

Segment Resources Rule of Thumb

A general rule of thumb is that for every segment instance (primary or mirror) you will want to provide at least:

1 core

200MB/s IO read

200MB/s IO write

8GB RAM

1GB network throughput

A segment host with 8 primary and 8 mirror instances would have:

16 cores

3200MB/s IO read

3200MB/s IO write

128GB RAM

20GB network throughput

These numbers have proven to provide a reliable platform for a variety of use cases and give a good baseline for the number of instances to run on a single server. Pivotal recommends a maximum of 8 primary and 8 mirror instances on a server, even if the resources provided are sufficient for more.

Pivotal has found that allocating a ratio of 1 to 2 physical CPUs per primary segment works well for most use cases; it is not recommended to drop below 1 CPU per primary segment. Ideal architectures will additionally align the NUMA architecture with the number of segments.

Reasons to Reduce the Number of Segment Instances per Server

A database schema that uses partitioned columnar tables has the potential to generate a large number of files. For example, a table that is partitioned daily for a year will have over 300 files, one for each day. If that table additionally has columnar orientation with 300 columns, it will have well over 90,000 files representing the data in that table on one segment instance. A server that is running 8 primary instances with this table would have to open 720,000 files if a full table scan query were issued against that table. Systems that make use of partitioned columnar tables may benefit from a smaller number of segment instances per server if data is being used in a way that requires many open files (the arithmetic is sketched below).
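
The file-count arithmetic from this example, as shell arithmetic:

    partitions=300    # roughly one file per daily partition over a year
    columns=300       # column orientation: one file per column per partition
    instances=8       # primary segment instances on the host
    echo $(( partitions * columns ))              # 90000 files per segment instance
    echo $(( partitions * columns * instances ))  # 720000 files for one full table scan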

Systems that span large numbers of nodes create more work for the master to plan queries and coordinate all of the segments. In systems spanning two or more racks, consider reducing the number of segment instances per server.

When queries require large amounts of memory, reducing the number of segments per server increases the amount of memory available to any one segment.

If the amount of concurrent query processing causes resources to run low on the system, reducing the amount of parallelism on the platform itself will allow for more parallelism in query execution.

Reasons to Increase the Number of Segment Instances per Server


In low-concurrency systems where system utilization is low, increasing the segment instance count will allow each query to utilize more resources in parallel.

Systems with large amounts of free RAM that can be used by the OS for file buffers may benefit from increasing the number of segment instances per server.


Pivotal Greenplum on Virtualized Systems

General Understanding of Pivotal Greenplum and Virtualization

Greenplum Database is parallel processing software: the Pivotal Greenplum software often performs the same process at the same time across a cluster of nodes. Virtualization is frequently used to centralize systems so that they can share resources, taking advantage of the fact that software often utilizes resources sporadically, which allows those resources to be over-subscribed. Greenplum Database will not function well in an oversubscribed environment, because all segments become active at once during query processing. In that type of environment, the system is prone to bottlenecks and unpredictable behavior that can result from being unable to access resources the system believes it has been allocated.

With this in mind, as long as the system meets the requirements set forth in the installation guide, Greenplum is supported on virtual infrastructure.

Choosing the Number of Segment Instances to Run per VM

The recommended hardware specifications are quite large and may be hard to achieve in a virtual environment. In these cases, each VM should have no more than 1 primary and 1 mirror segment for every 2 CPUs, 32GB of RAM, and 300MB/s of sequential read and write bandwidth. Thus a VM with 4 CPUs, 64GB RAM, and 1GB/s sequential read and write would be able to host 2 primary segment instances and 2 mirror segment instances.

While it is possible to create segment host VMs that host only a single primary segment instance, it is preferred to have at least two primary segment instances per VM. Certain queries that perform tasks such as looking for uniqueness can cause some segment instances to perform more work, and require more resources, than other instances. Grouping multiple segment instances together on one server can mitigate some of these increased resource needs by allowing a segment instance to utilize the resources allocated to the other segment instances.

VM Environment Settings

VMs hosting Greenplum Database should not have any auto-migration features turned on. The segment instances are expected to run in parallel, and if one of them is paused to coalesce memory or state for migration, the system can see it as a failure or outage. It is better to take the system down, remove it from the active cluster, and then introduce it back into the cluster once it has been moved.

Special care should be given to understanding the topology of primary and mirror segment instances. No set of VMs that contains a primary and its mirror should run on the same host system. If a host containing both the primary and mirror for a segment fails, the Greenplum cluster will be offline until at least one of them is restored to complete the database content.


Additional Helpful Tools

Yum Repository

Configuring a YUM repository on the master servers can make management of the software across the cluster more efficient, particularly in cases where the segment nodes do not have external internet access. More than one repository can make management easier, for example one repository for OS files and another for all other packages. Configure the repositories on both the master and standby master servers.
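
A minimal sketch of such a repository definition on a cluster node, assuming the packages are served over HTTP from the master (the file name, repository ids, and URLs are placeholders):

    # /etc/yum.repos.d/cluster.repo -- illustrative only
    [cluster-os]
    name=Cluster OS packages
    baseurl=http://mdw/repos/os
    enabled=1
    gpgcheck=0

    [cluster-extras]
    name=All other cluster packages
    baseurl=http://mdw/repos/extras
    enabled=1
    gpgcheck=0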

Kickstart Images

Kickstart images for the master servers and segment hosts can speed up implementation of new servers and recovery of failed nodes. In most cases where there is a node failure but the disks are good, reimaging is not necessary because the disks in the failed server can be transferred to the new replacement node.
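
A minimal kickstart fragment for a segment host might look like the following. Every value shown is a placeholder, and the OS-specific installation guide should drive the real contents:

    # segment-host.ks -- illustrative kickstart fragment, placeholder values only
    install
    url --url=http://mdw/repos/os
    lang en_US.UTF-8
    keyboard us
    network --bootproto=static --ip=172.254.1.1 --netmask=255.255.0.0
    rootpw --iscrypted <password-hash>
    clearpart --all --initlabel
    autopart
    reboot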

© Copyright Pivotal Software Inc., 2013-2016