ISOM Midterm

Capacity Management
● Refers to the wide variety of planning actions used to ensure that a business infrastructure has
adequate resources to maximize its potential activities and production output under any
condition.
● Act of ensuring a business maximizes its potential activities and production output
● Measures how much companies can achieve, produce or sell within a given time period
● Maximum throughput that a configuration item or IT service can deliver
Changing conditions or External influences

● Seasonal demand
● Industry changes
● Unexpected macroeconomic events
Capacity Management
● Working overtime
● Outsourcing business operations
● Purchasing additional equipment
● Leasing or selling commercial property
Space Management
● Calculating the proportion of spatial capacity that is actually being used over a certain time
period
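As a rough illustration of the calculation above, the sketch below averages occupancy readings taken over a period and divides by total capacity. The function name and the sample figures are hypothetical, chosen only for the example.

```python
def space_utilization(occupied_samples, total_capacity):
    """Return the average share of capacity in use across a period (0.0 to 1.0)."""
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    if not occupied_samples:
        raise ValueError("at least one sample is required")
    average_occupied = sum(occupied_samples) / len(occupied_samples)
    return average_occupied / total_capacity


# Hypothetical data: desks occupied at five daily checks, out of 200 desks.
daily_occupied = [120, 150, 90, 160, 130]
print(f"{space_utilization(daily_occupied, 200):.0%}")  # prints 65%
```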
Goal: Ensure that cost-justifiable IT capacity in all areas of IT always exists and is matched to the current
and future agreed need of the business, in a timely manner.
Purpose: Provide a point of focus and management for all capacity and performance related issues.
Capacity Management Information System (CMIS)

- a collection of IT infrastructure usage, capacity and performance information gathered in a consistent
manner and stored in one or more databases.
- single book of record for all usage, capacity and performance data, complete with associated business,
application and service statistics
- used by any IT staffer needing access to capacity management data
- all data is synchronized from a collection period perspective
- it is scrubbed to ensure it is consistent and accurate.
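One way to picture the last two points is the sketch below, which snaps raw samples to a common collection period and drops invalid readings before storing one value per metric per period. The five-minute period, the tuple layout and the scrubbing rule (discard missing or negative values) are all assumptions made for illustration, not part of any CMIS product.

```python
from collections import defaultdict


def align_and_scrub(samples, period_seconds=300):
    """Group (timestamp, metric, value) samples into fixed collection periods.

    Scrubbing here means dropping missing or negative values (assumed rule).
    """
    buckets = defaultdict(list)
    for timestamp, metric, value in samples:
        if value is None or value < 0:
            continue  # scrub: discard obviously invalid readings
        period_start = timestamp - (timestamp % period_seconds)
        buckets[(period_start, metric)].append(value)
    # Keep one averaged value per metric per collection period.
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Hypothetical raw samples with timestamps in seconds.
raw = [
    (1000, "cpu_pct", 40.0),
    (1100, "cpu_pct", 60.0),
    (1150, "cpu_pct", None),  # missing reading, scrubbed
    (1250, "cpu_pct", 80.0),
]
print(align_and_scrub(raw))
```

With the sample data, the first two readings fall into the period starting at 900 and the last into the period starting at 1200, so every metric collected this way shares the same period boundaries.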
IT Service Management Processes frequently accessing CMIS Data:

● Capacity Planning
● Performance Management
● Service Level Management
● Help/Service Desk
● Incident Management
● Problem Management
● Configuration Management
Capacity Management Database (CDB)

- was the central data store but ITIL proponents realized that it fell short of what was needed to take
capacity management to the next level
- a collection of data but there were no standards regarding the collection and archival nor integration
between the different technologies
CMIS Contents
● Business performance data
● Financial data
● Business transaction metrics
● Infrastructure upgrade costs
● Application transaction counts
● Power and cooling cost
● Invoices generated
● IT budget information
● IT service performance data
● Component utilization data
● Transaction response times
● Server performance metrics
● Transaction rate
● Network performance
● Workload volumes
● Data storage measurements
● Memory usage
Characteristics of CMIS
1. Openness
- goal of CMIS is to become the central hub for all performance-related data
- a good CMIS needs to make it easy to get information in and out
- this means comprehensive performance data about the infrastructure going in, and efficient access to
that data for analysis and reporting purposes.
● Data collectors use the CMIS to store information
● Performance and other systems management tools use it to access data and share analysis
results
- should be possible to effectively instrument all the critical applications (custom application)
- should be able to implement custom analysis and reports
- should facilitate information sharing with Configuration Management Database (CMDBs),
chargeback application and other tools
- should be able to alert event consoles and service desk tools when adverse events are detected
2. Business-relevant Views
- CMIS should let tools analyze and report on enterprise IT infrastructure from:
○ a component view: useful for problem-solving and technology-specific detailed planning
○ an IT service-based view and a business process view: help facilitate business-aligned analysis
and reporting
- these views allow you to relate operational and planning results at many different levels
3. Real-time Data
- ability to detect and respond to performance bottlenecks will be hampered if your CMIS is
unable to collect and deliver performance data in real-time
4. Heterogeneous Coverage
- one advantage of CMIS is being able to manage the performance and capacity of all those
platforms from a single repository
- CMIS can handle data from the key platforms within your data centers
5. Automation
- a good CMIS has built-in automation to handle most of the repetitive tasks and provides interfaces
where you can automate other related tasks that are specific to your organization's needs.
6. Scalability
- CMIS must have the ability to scale up or down to meet the changing needs of the organization
7. Efficiency
- the best CMIS tools minimize their use of computing resources and networking bandwidth and require
less data storage to perform their work
8. Security
- prevents unauthorized changes or deletion of historical data
- permits you to restrict access to proprietary data stored in the CMIS (e.g. business plans, to
preserve competitive advantage)
9. Support
- CMIS must have a capable support team available to assist you with the implementation and
ongoing maintenance of CMIS
How Capacity Management Works

Capacity Management Tools
- measure the volume, speed, latency, and efficiency of the movement of data as it is processed by an
organization's applications
- able to examine the operations of all the hardware and software in an environment and capture
critical information about data flow
- must be able to observe the individual performance of IT assets, as well as how these assets interact.
- should be able to monitor and measure the following IT elements:
● Servers
● End-user devices
● Networks and related communications devices
● Storage systems and storage network devices
● Cloud services
- rely on the interception of data movement metrics and the internal processes of individual
components
E.g.
1. IOmeter - free, open-source utility originally developed by Intel that provides details about
processing by servers, clusters of servers, or individual end-user computers
IOPS (input/output operations per second) - basic measure of the transfer rate of data during
processing.
2. Emulation Programs - mimic application programs such as database management systems
(DBMSes) to determine how a system is likely to perform under similar loads in production
environments
3. Application Emulators - include their own sets of test data to help ensure accurate and
consistent results across disparate equipment.
4. Hardware-based monitoring devices - focus on network performance and can provide
comprehensive information on most aspects of data movement.
Components
1. Control devices (servers with specialized software)
2. Network TAPs (network test access points) - devices that physically hook into particular
elements of a network to capture information about data traffic as it occurs.
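Because IOPS counts operations, turning an IOPS reading into a volume of data moved per second also needs the size of each operation. The sketch below shows that arithmetic (throughput = IOPS x block size). The function name and the readings are hypothetical, and a fixed block size is assumed for simplicity.

```python
def throughput_mib_per_s(iops, block_size_kib):
    """Approximate throughput implied by an IOPS figure at a fixed block size."""
    return iops * block_size_kib / 1024  # KiB per second converted to MiB per second


# Hypothetical readings: the same 5,000 IOPS at two block sizes.
print(throughput_mib_per_s(5000, 4))   # 19.53125 (4 KiB blocks)
print(throughput_mib_per_s(5000, 64))  # 312.5 (64 KiB blocks)
```

The same IOPS figure can therefore describe very different data volumes, which is why block size is usually reported alongside it.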
Components of Capacity Management

- some tools provide high-level information on a variety of infrastructure components
- others have a fairly narrow scope, providing detailed metrics related to one segment of the
computing environment
- a common approach is to gather as much information as possible and then attempt to correlate those
measurements into an application-centric picture that focuses on the performance and requirements of
mission-critical applications across the environment rather than on how individual components are
performing.
Performance (throughput)
- key metric in capacity management as it may point to processing bottlenecks that affect overall
application processing performance
- CPUs, routers, storage and controllers should be monitored to ensure that their processing
capabilities are not frequently pinned at or near 100%
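A minimal sketch of the check described above: it measures how often utilization samples sit at or near 100%. The 95% threshold and the 10% tolerance are assumed values for illustration, and the function names are hypothetical.

```python
def saturation_ratio(utilization_pct, threshold=95.0):
    """Return the fraction of samples at or above the threshold."""
    if not utilization_pct:
        return 0.0
    pinned = sum(1 for u in utilization_pct if u >= threshold)
    return pinned / len(utilization_pct)


def is_frequently_saturated(utilization_pct, threshold=95.0, max_ratio=0.10):
    """Flag a component whose share of near-100% samples exceeds max_ratio (assumed)."""
    return saturation_ratio(utilization_pct, threshold) > max_ratio


# Hypothetical CPU utilization samples, in percent.
cpu = [55, 97, 99, 60, 100, 72, 98, 64, 58, 61]
print(saturation_ratio(cpu), is_frequently_saturated(cpu))  # 0.4 True
```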
Memory
- a factor in capacity management
- servers and other devices use their installed memory to run applications and process data
Physical Space
- most commonly associated with capacity management
- focus is generally on storage space for applications and data
● Storage systems
- systems that are near capacity will have longer response times, as it takes longer to locate
specific data when drives (hard disk/solid-state) are full or nearly full
● Other devices
- as with processor and memory measurements, it's important to monitor space usage in devices other
than servers and end-user PCs that may have installed storage that's used for caching data.
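To act before a storage system gets near capacity, usage history can be turned into a rough estimate of the time left. The sketch below assumes linear growth between the first and last daily sample, which is a simplification. The function name and the figures are hypothetical.

```python
def days_until_full(used_gb_by_day, capacity_gb):
    """Estimate days until capacity is reached, assuming linear growth.

    Uses the average daily change between the first and last sample.
    Returns None when usage is flat or shrinking.
    """
    if len(used_gb_by_day) < 2:
        raise ValueError("need at least two daily samples")
    daily_growth = (used_gb_by_day[-1] - used_gb_by_day[0]) / (len(used_gb_by_day) - 1)
    if daily_growth <= 0:
        return None
    remaining = capacity_gb - used_gb_by_day[-1]
    return max(remaining, 0) / daily_growth


# Hypothetical daily usage (GB) on a 1,000 GB volume.
usage = [700, 710, 720, 730, 740]
print(days_until_full(usage, 1000))  # 26.0
```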
What Is a Disaster Recovery Plan (DRP)?

● A documented, structured approach that describes how an organization can quickly resume work
after an unplanned incident
● An essential part of a business continuity plan (BCP)
● Is applied to the aspects of an organization that depend on a functioning IT infrastructure
● Aims to help an organization resolve data loss and recover system functionality
● Step-by-step plan consists of the precautions to minimize the effect of a disaster so the
organization can continue to operate or quickly resume mission-critical functions
● Typically involves an analysis of business processes and continuity needs
● Before generating a detailed plan, an organization performs a business impact analysis (BIA) and
risk analysis (RA) and establishes recovery objectives
● Define data recovery and protection strategies
● Ability to quickly handle incidents can reduce downtime and minimize both financial and
reputational damage
● Ensure that organizations meet all compliance requirements while also providing a clear
roadmap to recovery.
Some types of disasters:

● Application failure
● Communication failure
● Data center disaster
● Building disaster
● Campus disaster
● Citywide disaster
● Regional disaster
● National disaster
● Multinational disaster.
Recovery Plan Considerations

● Recovery time objective (RTO)
○ Describes the target amount of time a business application can be down
● Recovery point objective (RPO)
○ Describes the age of files that must be recovered from backup storage for normal
operations to resume
● Recovery strategies
○ Define an organization's plan for responding to an incident
● Disaster recovery plans
○ Describe how the organization should respond
○ Derived from recovery strategies
● Budget
● Insurance coverage
● Resources (people and physical facilities)
● Management's position on risk
● Technology
● Data
● Suppliers
● Compliance requirements
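To make the RTO and RPO definitions above concrete, the sketch below compares one incident against both objectives: downtime is checked against the RTO, and the age of the last good backup at the time of the outage is checked against the RPO. The function name, timestamps and objective values are hypothetical.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, outage_start, service_restored, rto, rpo):
    """Compare one incident against its recovery objectives.

    downtime = outage start to restored service (compared with the RTO)
    data_age = last good backup to outage start (compared with the RPO)
    """
    downtime = service_restored - outage_start
    data_age = outage_start - last_backup
    return {"rto_met": downtime <= rto, "rpo_met": data_age <= rpo}


# Hypothetical incident: backup at 02:00, outage at 09:30, restored at 12:30.
result = meets_objectives(
    last_backup=datetime(2024, 1, 10, 2, 0),
    outage_start=datetime(2024, 1, 10, 9, 30),
    service_restored=datetime(2024, 1, 10, 12, 30),
    rto=timedelta(hours=4),
    rpo=timedelta(hours=6),
)
print(result)  # {'rto_met': True, 'rpo_met': False}
```

Here the service came back within the four-hour RTO, but the newest backup was 7.5 hours old, so the six-hour RPO was missed. More frequent backups would be needed to meet it.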
Types of Disaster Recovery Plans

Environment-Specific Plans
● Virtualized DRP
○ Provides opportunities to implement disaster recovery in a more efficient and simpler
way
○ Can spin up new virtual machine (VM) instances within minutes and provide application
recovery through high availability
○ Testing is easier to achieve
○ Plan must include the ability to validate that applications can be run in disaster recovery
mode and returned to normal operations within the RPO and RTO
● Network DRP
○ Recovering a network gets more complicated as the complexity of the network increases
○ It is important to detail the step-by-step recovery procedure, test it properly and keep it
updated
○ Data in the plan will be specific to the network, such as its performance and networking staff
● Cloud DRP
○ Can range from a file backup in the cloud to a complete replication
○ Can be space, time and cost-efficient but requires proper management for maintenance
○ The manager must know the location of the physical and virtual servers
○ The plan must address security which is a common issue that can be alleviated through
testing
● Data Center DRP
○ Focuses exclusively on the data center facility and infrastructure
○ Operational risk assessment is a key element because it analyzes key components such as
building location, power systems and protection, security, and office space
○ Plan must address a broad range of possible scenarios
Scope and Objectives

● Business Continuity Institute and Disaster Recovery Institute International provide free
information and online how-to articles
● DRP checklist
○ Identifying critical IT systems and networks
○ Prioritizing the RTO
○ Outlining the steps needed to restart, reconfigure and recover systems and networks
How to Build a Disaster Recovery Plan

Business Impact Analysis (BIA)
● Identifies the impacts of disruptive events and is the starting point for identifying risk within the
context of disaster recovery
Risk Analysis (RA)
● Identifies threats and vulnerabilities that could disrupt the operation of systems and processes
highlighted in the BIA
● Assesses the likelihood of a disruptive event and outlines its potential severity
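One common way to combine the two assessments above is a simple score of likelihood multiplied by severity, used to rank threats. The sketch below assumes a 1 (low) to 5 (high) scale for both. The scoring scheme, the function name and the ratings are illustrative assumptions, not a prescribed RA method.

```python
def rank_risks(risks):
    """Sort (name, likelihood, severity) tuples by likelihood * severity, highest first."""
    scored = [(name, likelihood * severity) for name, likelihood, severity in risks]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical ratings on a 1 (low) to 5 (high) scale.
risks = [
    ("Application failure", 4, 2),
    ("Data center disaster", 1, 5),
    ("Communication failure", 3, 3),
]
for name, score in rank_risks(risks):
    print(f"{score:>2}  {name}")
```

With these ratings the communication failure ranks first (score 9), ahead of the application failure (8) and the data center disaster (5).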
DRP Checklist/Steps
● Establishing the range or extent of the necessary treatment and activity, and the scope of recovery
● Gathering relevant network infrastructure documents
● Identifying the most serious threats and vulnerabilities and most critical assets
● Reviewing the history of unplanned incidents and outages and how they were handled
● Identifying current disaster recovery strategies
● Identifying the incident response team
● Having management review and approve the DRP
● Testing the plan
● Updating the plan
● Implementing a DRP audit
Elements of DRP
● A statement of intent and disaster recovery policy statement
● Plan goals
● Authentication tools (passwords)
● Geographical risks and factors
● Tips for dealing with the media
● Financial and legal information and action steps
● Plan history
Communication Plan
● Another component of DRP
● Details how both internal and external crisis communication will be handled
○ Internal Communications
■ Alerts that can be sent using email, overhead building paging systems, voice
messages or text messages to mobile devices
■ Examples (instructions to evacuate the building, updates on the progress of the
situation)
○ External Communications
■ Include instructions on how to notify family members in the case of injury or
death; how to inform and update key clients and stakeholders
Disaster Recovery Plan Template

● Begin with summary of vital action steps and a list of important contacts
● Define roles and responsibilities of disaster recovery team
● Outline the criteria to launch the plan into action
● Specify in detail the incident response and recovery activities
List of Disaster Recovery Tests

1. LOSS OF KEY STAFF SCENARIOS
■ Plane crash with critical personnel on board
■ Major transit incident prevents staff from getting to the office
■ Major flu epidemic strikes
■ Employees go on strike
2. LOSS OF KEY TECHNICAL INFRASTRUCTURE SCENARIOS
■ Building fire at office
■ Trucker plows through power cable supporting the office building
■ An employee at a branch office smells a "gas-like smell" coming from the locked
server room
■ Office hardware, computers and telephony have been stolen overnight
■ Fans and cooling systems to the data center have lost power
■ Disgruntled employee takes anger out on
■ Loss or corruption of critical application
3. LOSS OR CORRUPTION OF KEY DATA SCENARIOS
■ Faulty backup tapes
■ HR's office is completely cleaned out by burglars in the middle of the night
■ Network has been hacked
■ Your servers have been hacked
■ A rogue file-sharing instance has led to a data breach
4. ENVIRONMENTAL SCENARIOS
■ Storm
■ Earthquake
■ Incidents
■ Riots
■ Tsunami
