How To Document Campus Infrastructure Offices, Hospitals, Universities, Airports, Etc.
29th November 2012
David Cuthbertson, Director Square Mile Systems Ltd
www.squaremilesystems.com
Some Background
Based in Cirencester, Glos, UK
Customers – worldwide.
Key Skill Areas
Documenting IT infrastructure
Configuration management processes
MS Visio automation
Industry bodies & roles
• BCS-Config Mgmt Specialist Group
• BCS-ITIL Specialist Group – ex Chairman
• LinkedIn – Data Center Engineering
• LinkedIn – Data Center Operations Mgmt
• BICSI, ITSMF, Microsoft guest speaker
*BCS – British Computer Society
Presentation Scope
• Fixed Infrastructure: cabling, power, cabinets, rooms, buildings
• Hardware Infrastructure: PCs, network, servers, UPS, storage, other
• Virtual Infrastructure: PCs, network, servers, storage, DBMS
• Applications: PC, server, mainframe, SOA
• Services: end user, infrastructure, supplier
• Business Processes: departmental, company
Infrastructure Management Maturity
1. Reactive: individual approach
2. Repeatable: some process, often informal
3. Defined: process documented and explained
4. Managed: process checked and reviewed for gaps
5. Optimised: process open to external review and updated regularly
To move to the right (towards Optimised) we typically need to:
1. Embed infrastructure knowledge in team systems
2. Enable separation of roles – design, implement, operate, risk
3. Plan and allocate resource against demand
4. Feedback on metrics and changes – billing, compliance
The Current State For Many
1. Lots of existing documentation of varying accuracy, formats and purpose which isn’t trusted, updated or generally known about
2. Project teams use individual MS Office tools for communicating changes, whereas operations teams need multi-user systems
3. Task or projects often involve reverse-engineering, site surveys, workshops and audits of the same infrastructure
4. Location, business and system dependencies exist only in people’s heads
5. IT teams are targeted to deliver projects faster and minimise disruption, not to maintain systems documentation
6. There is no identified budget for improving process or management techniques for the infrastructure
7. Multiple, repeated audits to get management data on infrastructure capacity and usage
8. Lack of standards – naming, data maintained, Visio shapes, etc
Overlays of Technologies (1) to (3)
[Three diagrams showing the technology domains that overlay one another on the same campus: WAN, Video Conferencing, Voice, Wireless, External Video, KVM / Control, Unix Servers, Wintel Servers, Printers, Copiers, Phones, Fax, Power, Cabling, LAN, SAN, CCTV, Desktop, Laptops, Network, Storage]
Document Overload!
After a project change, what should be updated?
1. Update asset/inventory list
2. Update rack diagrams
3. Update network diagrams/patching records
4. Update switch port usage and capacity
5. Update floor plan rack capacity
6. Update power usage spreadsheet(s)
7. Update storage / backup system documentation
8. Update systems architecture documentation
9. Update DR lists and documents
10. Update maintenance records
11. Update billing and charging data
12. Update project documentation with the “as built” details
[Figure: rack diagram with 24-port patch panel PP01-03-01, two HP ProLiant DL380 G5 servers, and devices labelled SVR-BHAM-010301, UK_BIRM_UX01, SERVERWIN0001, SERVERWIN0099 and SERVERWIN00078]
www.assetgen.com
Document Overkill!
Who, what, where, how, when?
Infrastructure Knowledge
Lifecycle: Plan, Build, Operate, Risk, Dispose
Project and task
• Examples: project documentation, equipment lists, Visio/CAD diagrams, test results
• Needs: ease and speed of creation, ease of distribution, flexible to meet task needs, limited training
Manage and Coordinate
• Examples: asset and inventory management, business / service dependencies, monitoring of performance and status, risk and recovery
• Needs: ease of use by many, structured for integration & reporting, support for multiple processes, wide scope – the big picture!
Task Resourcing
[Chart: time spent on the Plan, Request, Finish, Build and Admin stages of desktop and server changes, compared for Single Building, Campus, and Campus + 3rd party]
Why does it take longer?
Change Tasks
Plan
• Surveys, meetings
• Project docs
• Scheduling
• Ordering
• Access
• Change forms
• Costing
Build
• Travel
• Unpack
• Install
• Configure
• Label
• Test
• Dispose
Admin
• Project docs
• Operations docs
• Billing / time sheets
• Project reports
• Service reports
• Payments
Document - Start With Quick Wins
Ordered from few elements with a low rate of change to lots of elements with a high rate of change:
1. Site
2. Rooms / locations
3. Computer racks, enclosures
4. Fixed infrastructure
5. Core network devices
6. Hosts and servers
7. User infrastructure
8. User devices – desktops, printers, voice
Prepare - Reduce The Workload
1. Establish policies, standards and clarify ownership: make it easy for engineers
2. Have project / operations use common terms & formats: supply templates, naming system, labels, etc.
3. Reduce the numbers of documents / files to maintain: consolidate into centralised systems and make easy to find
– Link or create Visio diagrams, reports, Excel from databases
4. Update operational systems as part of planning processes
Prepare – Define the Scope and Priority
• Technical space(s) rooms, racks, ERs, TRs, user areas
• Inventory – bigger than “assets”
• Backbone and horizontal cabling, power, voice, etc.
• Pathways, cable routes, capacity
• Attributes / data on inventory elements
• Relationships – location, chassis, links, end to end
• Diagrams – Formats, symbols and shapes
• Change processes and maintenance
Prepare – Define Standards/Formats
• Assess existing conventions and formats to minimise variations
• Align with appropriate external standards / guidelines
– Data centers: TIA942, ANSI/BICSI-002
– Cabling administration: TIA606-B, EN50174-1, ISO14763-2
– Manufacturers: Cisco, HP, IBM, installers
• Examples
– Rack U position starts from bottom (TIA942)
– Rack tile identifier is the front right corner tile, e.g. AH06
– Rack name is the tile or row/number, e.g. AH06 or B06
Prepare - TIA606-B Administration
• Classes of administration
– Class 1: single ER (equipment room)
– Class 2: single ER with multiple TRs (telecom rooms)
– Class 3: campus, multiple ERs and TRs
– Class 4: multi-site, multiple campuses
• Identifier formats for interchange of information
• Labeling formats
– Example: cable label within 300mm (12in) of the end of the cable
– All letters uppercase, machine created, without serifs
• Definitions of terms
• Patch panel identifier
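As a rough illustration of how the classes of administration scale with the estate, the sketch below picks a class from simple counts. The function name `administration_class` and its thresholds are assumptions that paraphrase the one-line class descriptions on this slide; they are not taken from the text of the standard.

```python
def administration_class(sites: int, equipment_rooms: int, telecom_rooms: int) -> int:
    """Return the class (1-4) suggested by the slide's one-line descriptions.

    Assumption: this mapping paraphrases the slide, not the standard itself.
    """
    if sites < 1 or equipment_rooms < 1 or telecom_rooms < 0:
        raise ValueError("need at least one site and one equipment room")
    if sites > 1:
        return 4  # multi-site, multiple campus
    if equipment_rooms > 1:
        return 3  # campus: multiple ERs and TRs
    if telecom_rooms > 0:
        return 2  # single ER with one or more TRs
    return 1      # single ER


single_room = administration_class(sites=1, equipment_rooms=1, telecom_rooms=0)
campus = administration_class(sites=1, equipment_rooms=3, telecom_rooms=12)
print(single_room, campus)
```

Choosing the class first helps decide how long identifiers need to be: a single-room site rarely needs the site and building prefixes that a multi-site estate does.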
Prepare - Example Identifiers
• Space: LON-F1-1A (London, Floor 1, Room 1A)
• Cabinet identifier
– XY coordinate: 1A.AH06 (front right corner tile reference)
– Row/rack: 1A.A6 (Row A, Rack 06)
– Rack number: R07
• Always prefix with a letter and add a leading zero (Excel sorting issue): A01 is good, A1 is bad
• Label rack top / bottom / front / rear
– Full (LON-F1-1A-A06), short (1A-A06), simple (A06)
• Patch panel naming – choose one system!
– 1A-AH06-A (a panel)
– 1A-AH06-30 (U position)
– 1A-AH06-F30 (front U position, or NSEW, ABCD)
• Ports
– 1A-AH06-A-001 (include leading zeros)
– 1A-AH06-F30:015 or 1A-AH06-F30:010-020
• Backbone panels
– 1A-AH06-30:001-010 / 2C-AH06-30:001-010
– 2C-AH06-30:001-010 / 1A-AH06-30:001-010
• Outlets
– EO / TO / CP /SP
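The identifier examples above can be generated and checked mechanically. The following is a minimal sketch, assuming the "room-rack-panel-port" style is the one chosen; the helper names (`rack_id`, `port_id`, `parse_port_id`) and the regular expression are illustrative assumptions, not part of any standard. It also shows why the leading zero matters when identifiers are sorted as text.

```python
import re


def rack_id(row: str, number: int, width: int = 2) -> str:
    """Row letter plus zero-padded rack number, e.g. A06."""
    return f"{row.upper()}{number:0{width}d}"


def port_id(room: str, rack: str, panel: str, port: int) -> str:
    """Room-rack-panel-port with a three-digit port, e.g. 1A-AH06-A-001."""
    return f"{room}-{rack}-{panel}-{port:03d}"


# Assumed pattern for the 'room-rack-panel-port' style only.
PORT_PATTERN = re.compile(
    r"^(?P<room>[0-9A-Z]+)-(?P<rack>[A-Z]+[0-9]+)-(?P<panel>[A-Z0-9]+)-(?P<port>[0-9]{3})$"
)


def parse_port_id(identifier: str) -> dict:
    """Split a port identifier into its parts, or raise if it breaks the convention."""
    match = PORT_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised port identifier: {identifier}")
    return match.groupdict()


# Text sorting is what makes the leading zero matter.
unpadded = sorted(["A1", "A10", "A2"])
padded = sorted(rack_id("A", n) for n in (1, 10, 2))
print(unpadded)  # ['A1', 'A10', 'A2']
print(padded)    # ['A01', 'A02', 'A10']

example_port = port_id("1A", "AH06", "A", 1)
print(example_port)  # 1A-AH06-A-001
print(parse_port_id(example_port))
```

Rejecting identifiers that do not match the chosen pattern at upload time is one way to enforce "choose one system".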
Prepare - Patch Panels
Prepare - Patch Cable Labeling
1. No label
2. Local port: 23
3. Local port to port: SW1:23 to PPA06:12
4. Remote port to port: SW1:23 to ServerB:Eth0
5. Path/work order ref: WO33432
6. Unique cable label: C000232, A201201/322
All have their benefits, though we recommend 6:
• Easy to create/print
• Doesn’t change with device name
• Easy to read
• Good reference for work instructions
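Option 6, the unique cable label, is simple to generate in bulk. This is a minimal sketch assuming a "C" prefix and six digits, as in the C000232 example; the function name `cable_labels` and the record layout are hypothetical.

```python
from itertools import count


def cable_labels(start: int = 1, prefix: str = "C", digits: int = 6):
    """Yield sequential labels such as C000232 that never encode a device name."""
    for n in count(start):
        if n >= 10 ** digits:
            raise OverflowError("label sequence exhausted; widen 'digits'")
        yield f"{prefix}{n:0{digits}d}"


labels = cable_labels(start=232)
first, second = next(labels), next(labels)
print(first, second)  # C000232 C000233

# The label stays fixed; only the record of what the cable connects changes.
patch_records = {first: ("SW1:23", "PPA06:12")}
patch_records[first] = ("SW1:23", "ServerB:Eth0")
```

Because the label carries no port or device name, re-patching or renaming a device updates the record only and the printed label stays valid.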
Prepare – Exception Handling
• What about active equipment?
– Use logical name as reference, or equipment type and location ID
– SW-BHAM-01 or Cisco 3750 LON-F1-A1-AH06-U23
• What about passive hardware?
– Cable management, blanking plates, trays: use location ID
– CM LON-F1-A1-AH06-U2
• What about cards, plug-in modules, blade systems, etc.?
– Parent device and then slot/card number
– SW-BHAM-01.Slot05
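The exception rules above reduce to a small fallback chain: use the logical name where one exists, otherwise equipment type plus location ID, and for modules the parent name plus slot. A minimal sketch, with hypothetical function names and the two-digit slot width assumed from the Slot05 example:

```python
from typing import Optional


def location_id(space: str, rack: str, u_position: int) -> str:
    """Location ID in the style LON-F1-A1-AH06-U23."""
    return f"{space}-{rack}-U{u_position}"


def equipment_reference(location: str, logical_name: Optional[str] = None,
                        equipment_type: Optional[str] = None) -> str:
    """Prefer the logical name; otherwise fall back to equipment type plus location ID."""
    if logical_name:
        return logical_name
    if equipment_type:
        return f"{equipment_type} {location}"
    raise ValueError("need a logical name or an equipment type")


def module_reference(parent: str, slot: int) -> str:
    """Cards and plug-in modules are named after the parent device and slot."""
    return f"{parent}.Slot{slot:02d}"


switch_location = location_id("LON-F1-A1", "AH06", 23)
print(equipment_reference(switch_location, logical_name="SW-BHAM-01"))
print(equipment_reference(switch_location, equipment_type="Cisco 3750"))
print(equipment_reference(location_id("LON-F1-A1", "AH06", 2), equipment_type="CM"))
print(module_reference("SW-BHAM-01", 5))
```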
Preparation Summary
• For physical infrastructure, develop a convention as TIA606-B and ISO14763-2 recommend.
• Have a short name for convenience, as the unique administration identifier may be unwieldy.
• For active components it is often best to use the logical name
– makes it easier to understand
– easy to import data from other sources
– other identifiers such as asset tag numbers will result in a lot of cross-referencing work if not handled carefully.
Capture - Audit Process
Planning
• Scope, depth, schedule of visits
• Develop data capture tools
Prototype
• Check the process works on a trial building
• Refine data capture tools and process
Bulk Capture
• Data capture using workbooks / teams
• Upload as soon as possible in case of data or process errors
Reconciliation
• Check for gaps and inaccuracies across teams and cultures
• Combine with other data sources
Presentation
• Project and site reports
• Produce diagrams, SharePoint portals
Capture - Inventory
• Defined scope – infrastructure, desktops, printers
• Naming – use existing naming or new?
• Labelling/tagging – required? If so, how do you create the labels?
• Categorising equipment
• Exception handling
– Equipment on top of rack?
– Still in boxes
– Surplus bits – disks, panels, cables
– No name, partial name, wrong name, multiple names
Capture - Connectivity
• Defined scope – network, voice, power, video?
• Must have completed fixed infrastructure and inventory!
• Labelling/tagging – required? How do you create?
• Exception handling
– Can’t find ends
– Damaged
– One end connected – the other not!
– Multi-cable joins
– And so on
Capture - Reconciliation
• It is likely that the quality of data capture will vary
– Expertise increases with each audit
– Rescheduling and coping with local site/user/team issues
– Local decision making and exception handling
– Descriptions will be based on what is seen
– Comparing inputs before upload often highlights differences
• Other data sources may fill in missing audit data as well as add additional device data.
– Choose what is best for ongoing use, rather than sticking with the previous format
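Comparing inputs before upload can be as simple as diffing two captures keyed by device name. The sketch below is one way to do it; the field names (`rack`, `u`) and the sample values are made up for illustration.

```python
def reconcile(audit_a: dict, audit_b: dict) -> dict:
    """Compare two captures keyed by device name and report where they disagree."""
    names_a, names_b = set(audit_a), set(audit_b)
    report = {
        "only_in_a": sorted(names_a - names_b),
        "only_in_b": sorted(names_b - names_a),
        "mismatched": {},
    }
    for name in sorted(names_a & names_b):
        fields = set(audit_a[name]) | set(audit_b[name])
        diffs = {
            field: (audit_a[name].get(field), audit_b[name].get(field))
            for field in sorted(fields)
            if audit_a[name].get(field) != audit_b[name].get(field)
        }
        if diffs:
            report["mismatched"][name] = diffs
    return report


# Hypothetical captures of the same room by two audit teams.
team_1 = {
    "SW-BHAM-01": {"rack": "AH06", "u": 23},
    "UK_BIRM_UX01": {"rack": "AH07", "u": 10},
}
team_2 = {
    "SW-BHAM-01": {"rack": "AH06", "u": 24},
    "SERVERWIN0001": {"rack": "AH07", "u": 12},
}
result = reconcile(team_1, team_2)
print(result)
```

The same function works when the second input is another data source, such as a discovery tool export, rather than a second audit team.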
Presentation - Desired Outputs?
1. Asset/inventory list
2. Rack diagrams
3. Network diagrams/patching records
4. Switch port usage and capacity
5. Floor plan rack capacity
6. Power usage spreadsheet(s)
7. Storage / backup system documentation
8. Systems architecture documentation
9. DR lists and documents
10. Maintenance records
11. Billing and charging data
12. Project documentation with the “as built” details
Are we just recreating the same problem we started with?
Support Different Needs
[Figure, © AssetGen Limited: the same infrastructure data presented as different views: Floor Plan, Rack Position, Application/Service Impact, Power Supply, Network Connections and H/W Build. Example elements include switches SW-BHAM-CORE1 and SW-BHAM-01, servers UK_BIRM_UX01 to UK_BIRM_UX10, power feeds PWR01-03-A and PWR01-03-B, and blade chassis BLADE_BIRM01]
Reduce The Workload!
Excel               Visio
Floor box list      Floor plan
Cabinet list        Equipment room floor plan
Patch panel list    Backbone cabling diagram
Inventory           Network diagram
Inventory           Rack diagram
Inventory           Server connectivity diagram
Capture – Our Data Capture Approach
1. Document / survey buildings and spaces and put into an infrastructure database.
2. Capture racks and enclosures using paper and then into a spreadsheet format. Enables production of Visio floor plans and supports audit packs
3. Capture inventory into an upload spreadsheet. Creates rack diagrams, floor box layouts, architecture maps
4. Capture connectivity into an upload spreadsheet. Create network, path and other topology diagrams
Capture – Difference In Approach
• Data capture focusses on delivering 3 Excel files: Rack, Device and Cable
• Visualisation is either created automatically, or by combining data with existing backdrops such as floor plans
• No need to check across multiple documents for consistency and format
A faster, less complex and less costly audit, which doesn’t require high skill levels within the audit team
And delivers an operational system that can be maintained easily!
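The three capture files depend on each other in order: devices refer to racks, and cables refer to devices. The sketch below checks those references before upload. It uses CSV text as a stand-in for the Excel workbooks, and the column names are assumptions, not the actual upload format.

```python
import csv
import io

# Hypothetical contents of the rack, device and cable capture files.
RACKS = """rack,room
AH06,1A
AH07,1A
"""
DEVICES = """device,rack,u_position
SW-BHAM-01,AH06,23
UK_BIRM_UX01,AH07,10
SERVERWIN0001,AH09,12
"""
CABLES = """cable,from_device,to_device
C000232,SW-BHAM-01,UK_BIRM_UX01
C000233,SW-BHAM-01,SERVERWIN0099
"""


def read_rows(text: str) -> list:
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def check_uploads(racks: list, devices: list, cables: list) -> list:
    """Return a list of problems: devices in unknown racks, cables to unknown devices."""
    problems = []
    known_racks = {row["rack"] for row in racks}
    known_devices = set()
    for row in devices:
        if row["rack"] not in known_racks:
            problems.append(f"device {row['device']}: unknown rack {row['rack']}")
        known_devices.add(row["device"])
    for row in cables:
        for end in ("from_device", "to_device"):
            if row[end] not in known_devices:
                problems.append(f"cable {row['cable']}: unknown device {row[end]}")
    return problems


problems = check_uploads(read_rows(RACKS), read_rows(DEVICES), read_rows(CABLES))
for problem in problems:
    print(problem)
```

Running the check at upload time catches naming mismatches while the audit team is still on site, which is cheaper than a return visit.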
Maintain - Infrastructure Knowledge
Lifecycle: Plan, Build, Operate, Risk, Dispose
Project and task
• Ease and speed of creation, ease of distribution, flexible to meet task needs, limited training
Manage and Coordinate
• Ease of use by many, structured for integration & reporting, support for multiple processes, wide scope – the big picture!
Linking the two
• Record planning decisions in the operational system
• Produce project docs for/from the operational system
Maintain – Keeping Data Up to Date
• Project teams can assess current state and capacity without the need to survey for every request.
• Design teams can allocate and manage existing infrastructure resource capacity.
• Projects go faster, with fewer change conflicts and a reduced cost of meeting infrastructure change requests.
• Operations teams do not have to maintain detailed data; they feed off project updates.
– Overnight updates of inventory / diagrams
– Ad hoc query / checking to help resolve service problems
• Management and capacity data is always available
– Space, connectivity, power, changes, audit trails
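Once inventory lives in one system, capacity questions become queries rather than surveys. A minimal sketch of a free rack space report, assuming 42U racks and made-up device heights; the field names are hypothetical.

```python
from collections import defaultdict


def free_rack_units(racks: dict, devices: list) -> dict:
    """Free U per rack = rack height minus the U consumed by installed devices."""
    used = defaultdict(int)
    for device in devices:
        used[device["rack"]] += device["height_u"]
    return {rack: height - used[rack] for rack, height in racks.items()}


# Assumed rack heights and device sizes, for illustration only.
racks = {"AH06": 42, "AH07": 42}
devices = [
    {"name": "SW-BHAM-01", "rack": "AH06", "height_u": 1},
    {"name": "SERVERWIN0001", "rack": "AH06", "height_u": 2},
    {"name": "UK_BIRM_UX01", "rack": "AH07", "height_u": 4},
]
capacity = free_rack_units(racks, devices)
print(capacity)  # {'AH06': 39, 'AH07': 38}
```

The same pattern applies to switch port usage and power: sum what is allocated per parent element and subtract it from the rated capacity.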
Summary
• Campus infrastructure is physically dispersed and supports multiple services; its scale and complexity limit the suitability of MS Office tools.
• Reducing multiple spreadsheets into a database helps, even more so when that same system creates a variety of Visio outputs automatically.
• A systems approach to documentation directly reduces change costs and project delivery times – with the starting point being a baseline!
Additional Material
www.squaremilesystems.com
• Training/workshops: technical / management aspects of data centers
• Webinars: Visio automation, documenting cabling, etc.
• Videos: free SMS Visio utilities
www.assetgen.com
• Evaluation software: free “DCIM” evaluation version
• Webinars: data center practices, Visio integration
• Videos: Visio automation, change impact analysis