Cyberinfrastructure User Support

Andrew Sherman, Yale University
Senior Research Scientist, Yale Center for Research Computing
Senior Research Scientist, Department of Computer Science

ACI-REF Virtual Residency 2016, Thu August 11, 2016


Goals for this session

• What is CI, and how does it differ from conventional IT?
• CI user categories, and how to support them
• Some of the human aspects of CI support (i.e., politics, conflicts)
• Policies, education, outreach, collaborations, and networking

These slides are based on material from Mehmet (Memo) Belgin (GATech), modified by Henry Neeman, and are used with permission. Numerous edits have been made.

Yale Center for Research Computing

• Free-standing center reporting to the Deputy Provost for Research (dotted lines to the medical school and ITS); created in July 2015
• Who we are (~15 FTEs)
    • 2 Faculty Directors (Arts & Sciences; Medical School)
    • Executive Director
    • ACI-REFs (6+): 2 research faculty; 5+ others; aligned to specific clusters
    • HPC Engineering/System Administration Team (6)
    • Director of Research Services (education, communications)
• Who we aren’t (ITS)
    • Desktop or Lab Support
    • Campus Network Operations (Science Network & DMZ is shared)
    • Data Center Operations (power, etc.)
    • Security & Authentication Services

YCRC Responsibilities

Cyberinfrastructure
• 5 HPC clusters (~17K cores??)
• HPC data storage (~8 PB)
• Research data management
    • Integration with campus-wide “Storage@Yale” active & archive tiers
    • Some integration with lab and instrumentation storage
• Science Network & DMZ

Research & Teaching Support
• Dedicated support (YCGA, G&G)
• HPC software & algorithm installations, tuning & consultation
• Support for science & engineering software applications
• National infrastructure assistance
• Grant preparation
• Faculty recruitment (startup pkgs)
• HPC support for classes

Education & Training
• Parallel Computing (credit class)
• Research Computing Workshops
    • Getting Started Bootcamps
    • Python, Parallel R, GIS
    • Group/Dept. Bootcamps
    • XSEDE & vendor workshops
• User groups

Outside Community
• CASC (http://www.casc.org)
    • Working groups on “beyond hardware” and regulated data
• XSEDE Campus Champions (2)
• ACI-REF (CaRC); ACI-REF-VR
• Northeast Big Data Hub
• LCI

What the Heck is CyberInfrastructure (CI), Anyway?

• Components
    • Computing systems
    • Data storage systems
    • Advanced instruments and data repositories
    • Visualization environments
    • High Speed Networks
    • People
• Purpose
    • Enable scholarly innovation and discoveries not otherwise possible

Based on Indiana University’s definition

Differences between CI and Conventional IT

• Primary target is performance
• Usually relies on conventional IT services (by a separate team)
• More focus on supporting end-users than services
• Uses common IT technologies in uncommon ways
• May mix shared and dedicated resources in one entity
• Requires specific middleware and software layers
• Requires code compilations using complicated mechanisms
• May require specific knowledge about the application/science
• Has irregular usage patterns, which may become obvious and troubling to users

Outline

• Part I: CI user expectations, categorization and commonalities
• Part II: Policies, Politics, Conflicts and Personality Management
• Part III: Education, Outreach, and Networking

Faculty (a/k/a Principal Investigator) Expectations

• Typical Roles
    • Research entrepreneur & teacher
    • Manager and funder of CI users
• Often knowledgeable about CI but doesn’t use it directly (that pleasure is reserved for students & postdocs!)
• May own or pay for resources and services (but shared resources may be free at some institutions)
• Expectations:
    • CI resources are reliably up and running on 7x24 basis
    • Students and collaborators have fair (?) access to CI resources required to carry out research or classroom assignments on time
    • Assistance available as and when needed
    • Regular usage and expense reports (especially for storage)

“Actual CI User” Expectations

• Typical Roles
    • Some “hands on” faculty
    • Usually students, postdocs, or others who are not permanent
    • Permanent research staff or research faculty
    • External collaborators
• Expectations
    • 7x24 access to CI resources (and short job wait times, of course)
    • “Insider” relationship to CI staff for advanced users
    • Ultra-fast learning curve
    • Simple and instant solutions to complex problems
    • Applications running much faster than on their desktops (not always possible!)
    • Help diagnosing/fixing problems that may be externally controlled
    • Answers that match their level of knowledge

CI User Categories

• Three broad categories:
    • Novice
    • Intermediate
    • Advanced
• Difficult to identify a user's category without any prior interaction
    • The language used in requests is a good indicator
    • Replies to follow-up questions also reveal the level of proficiency
    • If uncertain, assume “novice” (but don’t make it obvious!)

Category 1: Novice Users

• Characteristics
    • Little experience with Linux or command-line environments
    • May use Matlab, Mathematica, and sometimes R (or even Excel)
    • May have limited knowledge of a scripting language like Python
    • Rarely any inkling about parallelism
• Generate up to 40-50% of support requests. Common examples:
    • Desktop setup (especially for Windows)
    • Login procedures (ssh keys, two-factor authentication, etc.)
    • Finding software on the cluster(s)
    • Finding help and documentation
• Most requests are straightforward, but some “simple-sounding” ones may take a lot of work (or be impossible)

Support Activities for Novice Users

• Up-to-date website with reasonable documentation for novices
• Getting-started presentation or on-line tutorial (possibly customized for the user’s desktop OS)
• Linux 101 workshop with software suggestions (e.g., easy editor)
• Friendly ticket system for requests, questions, and assistance
• Walk-in office hours
• Make it easy to find software, manage environment & run jobs
    • Tools like Lmod
    • Cross-cluster standardization of environment, job scheduler, etc.
    • Provide annotated template submission scripts
• Software installation assistance
• Help with tools to move data to/from clusters
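
As an illustration of the "annotated template submission scripts" item above, here is a minimal sketch of how a support team might generate one. The `#SBATCH` directives assume a Slurm-style scheduler, which the slides do not name; the function name `render_job_script`, the default values, and the module name in the usage line are hypothetical. The `module purge` / `module load` lines follow the Lmod suggestion above.

```python
from string import Template

# Hypothetical annotated template. The #SBATCH directives assume a Slurm-style
# scheduler (an assumption; the slides do not say which scheduler is in use).
JOB_TEMPLATE = Template("""\
#!/bin/bash
# Name shown in the queue listing
#SBATCH --job-name=$job_name
# One task using this many CPU cores; leave at 1 for serial programs
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=$cpus
# Wall-clock limit (HH:MM:SS); the job is stopped when it runs out
#SBATCH --time=$walltime
# Memory requested per node
#SBATCH --mem=$memory

# Start from a clean environment, then load the software you need (Lmod)
module purge
module load $module

# The actual work
$command
""")


def render_job_script(job_name, command, module, cpus=1,
                      walltime="01:00:00", memory="4G"):
    """Fill in the annotated template and return the script text."""
    return JOB_TEMPLATE.substitute(
        job_name=job_name, command=command, module=module,
        cpus=cpus, walltime=walltime, memory=memory,
    )


if __name__ == "__main__":
    # Example values only; real module names differ from site to site.
    print(render_job_script("hello", "python hello.py", "Python"))
```

Keeping the comments inside the generated script means a novice sees the explanation next to each directive, without having to look up separate documentation.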

Category 2: Intermediate Users

• Characteristics
    • Have prior Linux cluster experience; can create job scripts, but may not understand system-wide impact of their actions
    • Varying degrees of proficiency in Python, C, Fortran, R, etc.
    • Use workflows involving multiple domain-specific packages
    • Often notice and report HW or system problems
    • May use web search to try to overcome difficulties
• Generate up to 30-40% of support requests. Common examples:
    • Assistance with complex software installations
    • Assistance with performance issues
    • Help with complex job scripts, job arrays, or parameter studies
    • Special requests (“bending the rules”), such as job priority or quota

Effective Support for Intermediate Users

• “Teach them to fish”: Offer advanced, possibly domain-specific, workshops; take advantage of XSEDE or vendor offerings; Software Carpentry or Data Carpentry may be valuable for some users
• Build strong individual working relationships since these users often serve as local trainers & “experts” for their groups.
• Be transparent in discussions, since they can distinguish fact from speculation (and will probably put your advice to the test).
• Admit when you don’t know something. You aren’t expected to know everything! But then try to find out and follow up! (Network!)
• Help them find solid, high-quality on-line information (vendor sites, user forums, etc.) pitched at the proper level.
• Assist or do complex software installations, especially those involving parallel codes or significant optimizations. Help with code development/debugging/tuning may pay big dividends later.

Category 3: Advanced Users

• Characteristics
    • May be hands-on faculty, research staff, or advanced students
    • Experience with and access to multiple clusters (including XSEDE, etc.)
    • Technically proficient in scripting or programming languages
    • Develop and/or use parallel applications
    • Develop complex workflows and job scripts
    • Always trying new things; willing to experiment with new software
• Generate up to 10-15% of support requests. Common examples:
    • Installation of complex software & tools (“It’s just 1 Python module!”)
    • Requests bordering on R&D
    • Special requests/treatment (often outside of normal channels)
    • Help with special hardware (e.g., GPUs)
    • Bugs found in hardware, 3rd party applications, or libraries

Effective Support for Advanced Users

• Apply all support techniques for intermediate users here, too.
• Communicate and meet regularly with them. Happy advanced users and their faculty advisors/PIs may often be your strongest advocates at your institution.
• Treat advanced users as peers; they may know as much or more than you do about research computing.
• As appropriate, involve them in hardware acquisitions and ACI grant proposals.
• Collaborate! Resolving many of the complex problems they encounter may require close cooperation among ACI-REFs, system administrators, and others.
• Be flexible. Make small rules exceptions when they won’t impact others. However, watch out for slippery slopes.

Outline

• Part I: CI user expectations, categorization and commonalities
• Part II: Policies, Politics, Conflicts and Personality Management
• Part III: Education, Outreach, and Networking

Policies

• Have well-defined written policies. These set everyone’s expectations and avoid misunderstandings.
• Publish policies in places easy to find (online). Require PIs to accept your policies and make PIs responsible for the behavior of their students, postdocs, and staff.
• Be prepared to explain the reasoning behind each policy item.
• Make policies strict (conservative), but consider exceptions as needed (but avoid slippery slopes!)
• Encourage users to openly discuss and criticize the policies.
• Don’t hesitate to update policies to stay relevant.
• Build trust and effective communication with decision makers.
• Seek delegation privileges to speed things up.
• Influence, but don’t make, policies for resources you don’t own.

Scheduled Maintenance

• Set regular schedule, with multiple advance announcements.
• Unscheduled downtimes are no excuse for skipping maintenance
• Provide a summary of completed tasks after maintenance.
• Have clear goals; plan ahead in great detail:
    • Work with your vendors
    • Team member/task associations
    • Estimated task duration
    • Critical paths and fallback plans
• Prepare for potential problems during/after maintenance days
• Show best effort for minimal impact
    • Configure the scheduler to have no running jobs
    • Disable user access to resources during the maintenance activities
    • Assist users in moving work to alternative clusters when possible
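
The "configure the scheduler to have no running jobs" item above comes down to a simple rule: a job may start only if its requested wall-clock limit ends before the maintenance window opens. The following is a minimal sketch of that rule, not any particular scheduler's implementation; the function names and the dates in the usage example are hypothetical.

```python
from datetime import datetime, timedelta


def can_start_before_maintenance(now, walltime, maintenance_start):
    """True if a job started now, with this walltime limit, ends before maintenance begins."""
    return now + walltime <= maintenance_start


def latest_safe_walltime(now, maintenance_start):
    """Longest walltime a job started now could request; zero once the window has begun."""
    return max(maintenance_start - now, timedelta(0))


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    now = datetime(2016, 8, 11, 9, 0)
    window = datetime(2016, 8, 12, 8, 0)
    print(can_start_before_maintenance(now, timedelta(hours=12), window))
    print(latest_safe_walltime(now, window))
```

Telling users the latest safe walltime in the advance announcements lets them shorten their requests so that jobs keep running right up to the window.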

Politics and Conflicts

• Tricky but inevitable
• No magic formula, need case-specific creative solutions
• Biggest challenge: conflicts due to limited resources
    • Configure systems to match your policies.
    • Collect and store data for past and present usage.
    • Provide users with tools to browse data/statistics for their accounts.
    • Run regular audits to defuse problems before they explode.
    • Consider a scavenge queue for pre-emptible jobs
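
The usage-data items above can be as simple as summing core-hours per group from job accounting records. Below is a minimal sketch under assumed inputs: the `JobRecord` fields and the function names are hypothetical, and a real site would read its scheduler's accounting logs rather than build records by hand.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One finished job; the field names are hypothetical."""
    user: str
    group: str
    cores: int
    hours: float  # elapsed wall-clock hours


def core_hours_by_group(records):
    """Total core-hours (cores x elapsed hours) consumed by each group."""
    totals = defaultdict(float)
    for record in records:
        totals[record.group] += record.cores * record.hours
    return dict(totals)


def usage_shares(records):
    """Each group's fraction of all core-hours consumed."""
    totals = core_hours_by_group(records)
    grand_total = sum(totals.values())
    return {group: (total / grand_total if grand_total else 0.0)
            for group, total in totals.items()}
```

Numbers like these give everyone the same evidence to look at when a group believes it is not getting its fair share.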

Tiers of Conflict

• Internal to a group/department: Usually easier to solve with communication and informal agreements. Sometimes a good job scheduler can help (e.g., multi-level fairshare). Provide advice, but get the PI or chair to take the lead and own the resolution.
• Between groups/departments: Can get messy, but may be avoidable if you stick to your policies. Be even-handed; don’t show favoritism. Get all agreements in writing!
• Between users and CI support staff: Have clear policies handy as a basis for declining unreasonable or impossible requests, and keep solid statistics/data as evidence. As above, be even-handed; don’t show favoritism. Get all agreements in writing!
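
To make the "multi-level fairshare" mention above concrete, here is a simplified sketch of the idea: priority falls as a group uses more than its share of the cluster, and falls again as a user takes more than their share within the group. The `2 ** (-usage / share)` form and the choice to multiply the two levels are assumptions made for illustration; real schedulers use their own, more elaborate formulas, and both function names are hypothetical.

```python
def fairshare_factor(usage_fraction, share_fraction):
    """Return 1.0 for no usage, 0.5 when usage equals the share, and less when overused."""
    if share_fraction <= 0:
        return 0.0  # no allocated share means lowest priority
    return 2.0 ** (-usage_fraction / share_fraction)


def multilevel_priority(group_usage, group_share, user_usage, user_share):
    """Combine the group-level factor with the within-group user factor.

    group_usage and group_share are fractions of the whole cluster;
    user_usage and user_share are fractions within the user's own group.
    """
    return (fairshare_factor(group_usage, group_share)
            * fairshare_factor(user_usage, user_share))


if __name__ == "__main__":
    # Two users in the same group: the lighter user gets the higher priority.
    print(multilevel_priority(0.5, 0.5, 0.2, 0.5))
    print(multilevel_priority(0.5, 0.5, 0.8, 0.5))
```

Because the within-group factor separates heavy and light users of the same group, the scheduler can settle some internal disputes without involving the support staff.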

Personality Management

• Some users are more difficult than others. That’s life!
• Don’t take things personally; report harassment; never retaliate
• Users don’t mean to be difficult; but may be under great pressure and extremely frustrated
• If you make a mistake, take responsibility and offer an apology.
• Show empathy and sincerity
• Acknowledge that:
    • you understand the user’s concerns;
    • you are aware of its particular impact on the user.
• Be sensitive to cultural differences and language difficulties.
• Use humor appropriately, and avoid being awkward or insulting.
• Communicate frequently while working on any issue

Outline

• Part I: CI user expectations, categorization and commonalities
• Part II: Policies, Politics, Conflicts and Personality Management
• Part III: Education, Outreach, and Networking

Trainings and Tutorials

• Research Computing Workshops
    • Getting Started Bootcamps
    • Python, Parallel R, GIS
    • Group/Dept. Bootcamps
    • XSEDE & vendor workshops
    • Software Carpentry; Data Carpentry; SC Tutorials & Workshops
• Special Topics
    • Parallel Computing
    • Debugging/optimization of codes (including parallel)
    • System architecture specific details
    • Advanced use of common tools (Scientific Python, Parallel MATLAB)

Group Consultations

• Mini-orientations for new groups (“On-Boarding”)
• Use group meetings for feedback & to resolve internal conflicts
• Resolution of technical problems that are specific to a group
• Technical feedback to assist in policy making and system purchases
• Introduce services to new groups interested in getting resources

Collaborations with Researchers and Vendors

• Researchers helping researchers
    • Crucial for staying relevant: What is your faculty planning?
    • Collaborative grant writing
    • Collaborative projects/papers (acknowledgements or co-authors)
    • Support for classes and workshops
• Developer/vendor collaborations
    • Bug tracking and fixes
    • HW/SW information, evaluation of new systems and technology
    • Pilot studies & benchmarks

Some External Groups for Staff Training & Networking

• ACI-REF; ACI-REF-VR; CaRC
• XSEDE Campus Champions (national & regional)
• CASC (http://www.casc.org)
    • Working groups on “beyond hardware” and regulated data
• Educause
• LCI (aimed at HPC system administration)

THANKS FOR YOUR ATTENTION!

[email protected]
