

Turning Practice into Perfect: Implementing Fathom 2.0

Adam Backman

White Star Software

[email protected]

Notes

All of the information covered in this presentation is covered in a new portion of the Progress documentation called: OpenEdge Revealed
  Mastering the Progress Database with Fathom
A system works as a whole and not as a sum of its parts, so the presentation is written the same way. With this in mind, please hold your questions until the end.

Presentation Goal

Explain how to use Fathom
Implementing best practices
Not here to teach in-depth System Administration
Point you to other presentations for more in-depth information

The Operation’s Challenge

Constant reactive mode
Manual processes
Poor reporting
Unplanned downtime
Cannot plan for growth
Poor System Performance

Resulting In:

Unpredictable Operations

Exposure to Errors

Incomplete Information

Frustrated End Users

Frustrated Administrators

Goals of a Well Maintained System

Resiliency – The ability to recover
Availability – Provide maximum uptime
Performance – Consistency despite system load

Fathom can help achieve these goals

Roadmap

What are best practices?
What is Fathom?
Providing a resilient system
Making your system highly available
Providing consistent performance

What are Best Practices?

Defined processes to follow
Consistent verifiable outcome
End result – well maintained system

Defined Process to Follow

Must have clear goals
  Functional
  Business
Document where you are now and how you are going to achieve your goals

[Figure: map with city labels Munich, Paris, London, Amsterdam, and Prague]

Defined Verifiable Outcome

Know what you expect
Know what you are getting
Test completely prior to implementation
  Unit testing
  End-to-end testing

Well Maintained System

Ability to support 24 hour operations with only scheduled outages for upgrades and maintenance
Ability to recover from disaster with little or no data loss and minimal interruption to operations
Ability to support the changing needs of the business with little or no performance degradation during times of heavy processing

Roadmap

What are best practices?
What is Fathom?
Providing a resilient system
Making your system highly available
Providing consistent performance

What is Fathom?

Java-based management console and agent
Management console
  Provides interface to the agent
  Provides an interface to the Fathom Database
  Allows for definition of alerts
Fathom agent
  Collector of operating system resource information
  Collector of Progress database management information

Dictionary

Resource – Anything Fathom can monitor or trend
Schedule – A defined timeframe when a resource is available for monitoring, alerting, and trending
Poll – The process of gathering information about a resource
Rule – A performance requirement that can be evaluated
Alert – A response to a rule being broken
Action – A process to be performed in response to an alert
Trending – The process of storing performance and audit data in the Fathom Trend Database
Monitoring – Performing polling, evaluating rules, generating alerts, executing actions, and trending of a resource within a scheduled timeframe
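
The dictionary terms fit together as a small loop: poll a resource, store the sample, evaluate rules, raise alerts, and run actions. The following is only a minimal sketch of that relationship in Python. The names `Rule` and `monitor` are hypothetical and are not part of Fathom, and a plain list stands in for the Fathom Trend Database.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A performance requirement that can be evaluated (hypothetical model)."""
    name: str
    is_broken: Callable[[float], bool]  # True when the polled value breaks the rule
    action: Callable[[str], None]       # process performed in response to an alert


def monitor(poll: Callable[[], float], rules: List[Rule], trend: List[float]) -> List[str]:
    """One monitoring pass: poll, trend, evaluate rules, alert, act."""
    value = poll()          # Poll: gather information about the resource
    trend.append(value)     # Trending: a list stands in for the trend database
    alerts = []
    for rule in rules:
        if rule.is_broken(value):
            alert = f"ALERT {rule.name}: value={value}"  # Alert: a rule was broken
            alerts.append(alert)
            rule.action(alert)                           # Action: respond to the alert
    return alerts


if __name__ == "__main__":
    history: List[float] = []
    disk_rule = Rule("disk used above 90%", lambda pct: pct > 90.0, print)
    # A fixed sample value stands in for a real poll of the resource.
    monitor(lambda: 93.5, [disk_rule], history)
```

A schedule would simply decide when `monitor` is called for a given resource.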

Progress Fathom Architecture

[Diagram labels: Fathom, Fathom DB, Production DB, Memory, Disk, CPU, Net, Log, FileSystem]

Fathom Architecture: Multiple Sites

[Diagram labels: Fathom, Fathom DB]

Fathom Architecture: Monitor Locally / Trend Remotely

[Diagram labels: Fathom, Fathom DB]

Fathom Architecture 2.0: Monitor/Trend Database Remotely

[Diagram labels: Fathom, Fathom DB, DB Agent]

Fathom Architecture 2.0: Monitor & Trend Anywhere

[Diagram labels: Fathom, Fathom DB]

Fathom Architecture: Manage from One Browser

[Diagram labels: Fathom, Fathom DB, DB Agent]

Roadmap

What are best practices?
What is Fathom and how does it work?
Providing a resilient system
Making your system highly available
Providing consistent performance

Resiliency

Redundancy
Developing an effective recovery plan
Monitoring for problem avoidance

Redundancy

Disk
  RAID
    RAID Levels
    Dos and Don’ts
  After imaging
Memory

RAID

Redundant Array of Inexpensive Disks
Patterson, Gibson and Katz at the University of California Berkeley (1987)

Common RAID Levels
  RAID 0 – striping
  RAID 1 – mirroring
  RAID 10 or 0+1 – Striped with mirrors
  RAID 5 – Striped with calculated parity

RAID: Dos and Don’ts

Do:
  Use RAID 10 for randomized storage
  Use RAID 1 for sequential storage
  Use RAID 5 for READ-ONLY data
Don’t:
  Use RAID 5 for OLTP
  Use RAID 0 for data storage
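
The reasoning behind these recommendations can be seen in the usual textbook figures for each RAID level. The sketch below is illustrative only: the function name `raid_profile` is hypothetical, and the numbers ignore controller caches and hot spares. It shows that RAID 5 turns each small random write into four back-end I/Os (read data, read parity, write data, write parity), which is why it suits read-only data better than OLTP, and that RAID 0 cannot survive a single disk failure.

```python
def raid_profile(level: str, disks: int, disk_gb: float) -> dict:
    """Usable capacity and back-end I/Os per small random write.

    Textbook figures only; controller caches and hot spares are ignored.
    """
    if level == "0":
        # Striping only: full capacity, but one failed disk loses the array.
        usable, ios, survives = disks * disk_gb, 1, False
    elif level == "1":
        # Mirroring: every disk holds a full copy, so every disk is written.
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        usable, ios, survives = disk_gb, disks, True
    elif level == "10":
        # Striped mirrored pairs: half the raw capacity, two writes per write.
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        usable, ios, survives = disks / 2 * disk_gb, 2, True
    elif level == "5":
        # Striping with parity: read data, read parity, write data, write parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        usable, ios, survives = (disks - 1) * disk_gb, 4, True
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return {
        "usable_gb": usable,
        "ios_per_write": ios,
        "survives_one_disk_failure": survives,
    }


if __name__ == "__main__":
    for level, disks in (("0", 4), ("1", 2), ("10", 4), ("5", 4)):
        print(level, raid_profile(level, disks, 100.0))
```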

Memory Interleaving

Memory interleaving works like RAID 0 for memory. While there are significant potential performance gains from interleaving memory, you run the risk of having one faulty memory chip bring down your application.

Resiliency: Recovery Planning

Who is involved in the process?
What gets backed up?
Where do we back up our data?
Where do we store the physical backup?
When do we do a backup?
Why do a backup at all?
How can Fathom help?

Who is Involved in Recovery Planning?

Technical people
  They understand what is possible
Business people
  They understand what is needed and the cost of downtime
Management
  They understand where the business is headed and what can be afforded

What is Included on the Backup?

More than just a database backup
  Database
  Application
  Other Files
Physical backup
  Secondary machine room
  Additional Hardware
  Infrastructure

Where Do We Back Up To?

Capacity – How much do you need to store?
Removable – To allow off-site archival
Reliable – It must work every time
Compatible – Keeps your options open

Where to Store your Backup?

Formal service
  24 hour access
  Secure
  Highly disaster resistant
Separate location (different building)
  Inexpensive
  Greater need for planning (access, security, disaster, etc.)

When to do a Backup?

As often as practical
  A once-a-day backup will cause you to lose up to 24 hours of processing in the worst case
Fill in with after imaging
  Store AI on a different disk
  Archive AI files throughout the day
  Keep a warm standby to reduce downtime
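
The arithmetic behind this slide is simple enough to sketch. The function below is a hypothetical illustration, not a Progress or Fathom utility. It assumes that every archived AI file survives the failure and can be rolled forward against the last backup, so the work at risk shrinks from the backup interval to the AI archive interval.

```python
from datetime import timedelta
from typing import Optional


def worst_case_loss(
    backup_interval: timedelta,
    ai_archive_interval: Optional[timedelta] = None,
) -> timedelta:
    """Worst-case window of lost processing after losing the primary disks.

    Without after imaging, everything since the last backup is lost.
    With AI files archived elsewhere every ai_archive_interval (assumed to
    survive and be usable), only the work since the last archive is at risk.
    """
    if ai_archive_interval is None:
        return backup_interval
    return min(backup_interval, ai_archive_interval)


if __name__ == "__main__":
    daily = timedelta(hours=24)
    print("Backup only:", worst_case_loss(daily))
    print("Backup plus AI archived every 15 minutes:",
          worst_case_loss(daily, timedelta(minutes=15)))
```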

Why do a Backup?

Reduce data loss
Build user confidence
Keep your job

How Can Fathom Help? Scheduling

Consistent schedule that is not forgotten
Pro-active notification if there is a problem
Fathom 2.0 Job Templates

How Can Fathom Help? Reporting

Processing time is captured
Historical trend report of backup
Audit trail

Resiliency: Problem Avoidance

Common problem areas:
  Disk full problems
  Database extents filling fast

Fathom: Disk Monitoring

Disk view
Monitoring disks other than database
Graphical view of what disks look like

Roadmap

What are best practices?
What is Fathom and how does it work?
Providing a resilient system
Making your system highly available
Providing consistent performance

Availability

Reducing the impact of unplanned events
Planning for system growth
Reducing impact of change to the user
Scheduling Online Utilities

Planning for System Growth

Trending allows for patterns to be viewed and acted upon
Trending allows for operational thresholds to be established
Trending allows for advanced planning so maintenance can be scheduled when convenient for the business

Fathom: Disk Trending

Correlating database and disk trends
Month by month, week by week, or day by day; it is your choice
Fill rates and activity of each disk
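
A fill rate becomes useful for planning once it is projected forward. The sketch below is not how Fathom computes its trend reports. It is a minimal, hypothetical example that fits a least-squares line to (day, used GB) samples and estimates how many days remain until the disk reaches capacity, assuming growth stays roughly linear.

```python
from typing import List, Optional, Tuple


def days_until_full(samples: List[Tuple[float, float]], capacity_gb: float) -> Optional[float]:
    """Project days from the last sample until usage reaches capacity.

    samples holds (day, used_gb) pairs. Returns None when usage is flat
    or shrinking, because no fill date can be projected.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover at least two distinct days")
    # Least-squares slope is the fill rate in GB per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity_gb - intercept) / slope
    last_day = max(x for x, _ in samples)
    return max(0.0, day_full - last_day)


if __name__ == "__main__":
    # Made-up daily samples for illustration: used space in GB.
    usage = [(0, 100.0), (1, 102.0), (2, 104.0), (3, 106.0)]
    print(days_until_full(usage, capacity_gb=126.0))
```

The same projection applies to storage areas or individual tables if their sizes are sampled the same way.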

Fathom: Storage Area Trending

Fill rate
Activity by area
This information can show a need to spread data even further

Fathom: Table and Index Trending – database analysis

Predicting table growth
Predicting index growth
Index compaction rates can be monitored and actions can be taken if the compaction drops below a certain level
Utilization of each table and index can also be tracked and viewed in other areas of Fathom

Fathom: Memory Trending

Focus on paging and swapping rather than utilization
This is currently a weak area within the Fathom product

Fathom: CPU Trending

Look at Idle
Look at the ratio between User and System
High system time can indicate an incorrect value for -spin or high paging or swapping
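
The user-to-system comparison can be expressed as a simple check on sampled CPU percentages. This is a hypothetical sketch: the function name `cpu_summary` and the default threshold of 30% system share are assumptions chosen for illustration, not Fathom or Progress recommendations. Pick a threshold from your own baseline.

```python
def cpu_summary(user_pct: float, system_pct: float, idle_pct: float,
                max_system_share: float = 0.30) -> dict:
    """Summarize one CPU sample.

    system_share is system time as a fraction of busy (user + system) time.
    max_system_share is an assumed, illustrative threshold.
    """
    busy = user_pct + system_pct
    system_share = system_pct / busy if busy > 0 else 0.0
    return {
        "idle_pct": idle_pct,
        "system_share": system_share,
        # A high share suggests checking -spin and paging or swapping activity.
        "investigate": system_share > max_system_share,
    }


if __name__ == "__main__":
    print(cpu_summary(user_pct=30.0, system_pct=30.0, idle_pct=40.0))
    print(cpu_summary(user_pct=60.0, system_pct=10.0, idle_pct=30.0))
```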

Roadmap

What are best practices?
What is Fathom and how does it work?
Providing a resilient system
Making your system highly available
Providing consistent performance

Performance

Performance is relative
Fast is overrated
Fathom can help find tough problems

Performance is Relative

What is a baseline?
Determining your baselines
  How Fathom can help
  Important indicators
  Who is your canary?

Determining your baseline

Good baseline guidelines
  Often accessed portions of the application
  High customer impact
  End to End (Time to enter an order)
Bad baseline
  Year-end process
  Management reporting (in most cases)
  Little used portions of the application
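
One way to make a baseline concrete is to time a frequently used, end-to-end operation and compare later timings against the recorded median. The sketch below is a hypothetical illustration: the names `measure` and `is_slower_than_baseline` and the 1.5x tolerance are assumptions, and the timed operation is a stand-in for something like entering an order.

```python
import statistics
import time
from typing import Callable, List


def measure(operation: Callable[[], None], runs: int = 5) -> List[float]:
    """Time an operation several times and return the durations in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def is_slower_than_baseline(baseline: List[float], current: List[float],
                            tolerance: float = 1.5) -> bool:
    """True when the current median exceeds the baseline median by the tolerance factor."""
    return statistics.median(current) > tolerance * statistics.median(baseline)


if __name__ == "__main__":
    # Stand-in workload; replace with a real end-to-end operation.
    baseline = measure(lambda: sum(range(100_000)))
    current = measure(lambda: sum(range(100_000)))
    print("Slower than baseline:", is_slower_than_baseline(baseline, current))
```

The median is used rather than the mean so that one unusually slow run does not distort the baseline.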

Components of Performance

Network
Disk
Memory
CPU

Issues: Network

Check your network capacity BEFORE adding any additional applications
Baseline response times with Fathom
Routed vs. switched networks
Location of Progress files
Program Libraries

Issues: Disk

Storage capacity vs. throughput capacity
Remember your RAID levels
Location of data

Issues: Memory

Memory acts as a buffer between the user processes and disk
Use memory for the common good
  Increase broker memory first
  Increase client memory (-Bt, …)
  Then get creative

Issues: CPU

Good CPU usage vs. Bad CPU Usage
The -spin parameter
Have a CPU problem? Look at your disks

Monitoring Performance

Spot checks
  My Fathom Views
Trend reporting
  Getting out of the Forest

Conclusion

Start slow
Remember your goals
  Resiliency
  Availability
  Performance
Consider the cost/benefit before adding monitoring or trending to a resource

Questions