IT support sucks but we made our metrics

SLAs should only include those items that can be effectively monitored and measured at a commonly agreed point. Inclusion of items that can’t be effectively monitored almost always results in disputes and eventual loss of faith in the SLM process.

It is essential that monitoring matches the customer’s true perception of the service. A service that is available only to the edge of the data center and not to the end-user is not a complete service and provides little value to the business. Monitoring of services must show the service from an end-to-end or value perspective. Monitoring must also detect when failure of a component recorded at the service desk results in failure of a service and potentially of an SLA. Further, it should indicate how many end-users are potentially impacted by the failure. This will determine the impact and possibly the urgency of a given incident.

This capability requires a well-functioning CMDB and the ability to connect incidents with both components and services. SLA breaches are first identified at the Service Desk, so it is very important that appropriate processes and procedures are in place and that they are followed. If this is not done, the reporting may indicate SLA breaches where none actually occurred, or it may indicate no breaches when they did occur. Either way, the result is bad for IT. It is also critical that SLA information, such as triggers and escalations, matches between the SLM and incident/problem recording systems.
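As a rough illustration of why the link between incidents, components, and services matters, the sketch below maps a failed component to the services that depend on it and totals the end-users potentially affected. The component names, service names, and user counts are all hypothetical, and a real CMDB would supply this data through its own interface.

```python
# Hypothetical CMDB extract: which components each service depends on,
# and how many end-users each service has. All names and numbers are invented.
SERVICE_COMPONENTS = {
    "email": {"mail-server-01", "core-switch-01"},
    "payroll": {"db-server-02", "core-switch-01"},
    "intranet": {"web-server-03"},
}
SERVICE_USERS = {"email": 1200, "payroll": 150, "intranet": 900}


def impacted_services(failed_component):
    """Return the services that depend on the failed component, sorted by name."""
    return sorted(
        service
        for service, components in SERVICE_COMPONENTS.items()
        if failed_component in components
    )


def users_potentially_impacted(failed_component):
    """Sum the end-users of every service that depends on the failed component.

    This is an upper bound: a user of two affected services is counted twice.
    """
    return sum(SERVICE_USERS[s] for s in impacted_services(failed_component))


if __name__ == "__main__":
    component = "core-switch-01"
    print(component, "affects", impacted_services(component))
    print("End-users potentially impacted:", users_potentially_impacted(component))
```

The user count feeds directly into the impact, and possibly the urgency, assigned to the incident at the service desk.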

There are a number of important ‘soft’ issues, such as customer satisfaction, that can’t be monitored by technical or procedural means and that may not match ‘hard’ monitoring. For instance, even when there have been a number of reported service failures, the customers/end-users may still have positive feelings about IT performance. The opposite may also happen: the service is performing normally, but customers/end-users still feel dissatisfied. These types of disconnects primarily occur because of human interaction between end-users and the service desk or between customers and process/IT managers. In these situations, perception often outweighs the facts, and human errors taint the perception of IT.

Given the importance of soft issues, the question arises: how do we measure them? The simple answer is that you ask the customer or end-user what their perception is. Measuring soft issues, however, is as much an art as it is a science. Political pollsters, the masters of this art, are famous for asking the same question in slightly different ways and getting completely different results. They are also famous for selecting the wrong group of people to ask.

Measuring soft issues will take continuous effort over time, but it is vital for the long-term success of IT. One way to increase the objectivity of these measures over time is to set targets for soft issues that can be quantified and improved.
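One possible way to quantify a soft issue is to turn survey responses into a small set of numbers and compare them with agreed targets. The sketch below assumes a 1 to 5 satisfaction scale and two invented targets (an average score and a share of satisfied respondents); the scale and both thresholds are assumptions for illustration, not values taken from ITIL.

```python
from statistics import mean

# Assumed targets, to be agreed with the customer. Both values are invented.
TARGET_AVERAGE = 4.0           # minimum average score on a 1-5 scale
TARGET_SATISFIED_SHARE = 0.80  # minimum share of responses scoring 4 or 5


def summarize(scores):
    """Return the average score and the share of responses that are 4 or higher."""
    if not scores:
        raise ValueError("no survey responses to summarize")
    satisfied = sum(1 for s in scores if s >= 4)
    return mean(scores), satisfied / len(scores)


def meets_targets(scores):
    """Check a set of survey scores against both assumed targets."""
    average, satisfied_share = summarize(scores)
    return average >= TARGET_AVERAGE and satisfied_share >= TARGET_SATISFIED_SHARE


if __name__ == "__main__":
    this_quarter = [5, 4, 4, 3, 5]  # hypothetical responses
    print(summarize(this_quarter), meets_targets(this_quarter))
```

Tracking the same two numbers every period makes it possible to see whether perception is improving, even though each individual response remains subjective.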

Posted in IT Management, IT Service Management, Service Level Management

The End-user’s Perception is Your Reality

In order for processes and services to be managed and optimized, there must be a feedback loop that provides management information regarding the functioning of the process or service. This requires monitoring the process or service in appropriate ways to ensure that it is meeting the stated goal in the most effective and efficient manner. From an ITIL or quality perspective, the success of the process or service is measured at the point where the output of the process or service is consumed. Within the ITIL framework, this means that services are measured from the end-user and customer perspective or on an end-to-end basis. Processes should be measured at the point at which their output is consumed and on their contribution to the end-user and customer experience.

This requires creating the ability to monitor services on a significant number of workstations, both when the end-user is utilizing the service and also when they are not. Why would we want to monitor the service, when the end-user isn’t using it?

We learned that the Service Desk is directly responsible for end-user perception of IT, and indirectly responsible for customer perception. Now, we learn that Service Level Management is directly responsible for customer perception of IT, which is influenced by end-user perceptions, and will be very interested in monitoring Service Desk performance.

Service level management should make every effort to develop a baseline of understanding for the initial perception of the services. This will enable them to paint a before and after picture, once SLM is firmly established. Part of the picture should include the lack of adequate capability to measure perception prior to implementation. It should also include a description of new capabilities to measure and to report customer perception after implementation.

Plans need to be made for evaluating and implementing or improving underpinning contracts (UCs) and operational level agreements (OLAs) in support of new SLA targets. Service level agreements do not stand alone. Often, multiple IT groups work together to provide services to end-users. Operational level agreements document this capability and establish clear expectations in support of SLA targets.

Often, services are dependent on outside vendors for parts of their functionality. For an SLA to be enforceable, the underpinning contracts with those vendors must support the SLA targets.

Posted in IT Management, IT Service Management, Service Level Management

IT as Strategic Partner

The quantity and levels of service that IT is able to provide are always bounded by the associated costs. IT Service Level Management (SLM) is tasked with helping the customer understand the tradeoffs between cost and benefit. This should be accomplished in a way that allows the business to make decisions about which services it requires and what levels of service are justifiable, given the current business environment.

In organizations without strong SLM, IT is often left making these business decisions even though IT does not have clear understanding of the business environment and its drivers. This results in frustration on the part of both the business and IT as the business feels powerless to make service decisions and IT feels that the business is ungrateful for its efforts.

Implementing SLM has additional up-front costs. However, those costs can easily be offset by the benefits derived from IT and the business improving their working relationship. If it is important to IT to be seen as a strategic partner to the business, then it is incumbent upon IT to demonstrate that fact by allocating resources and management commitment to both an SLM process and a service level manager.

Posted in IT Management, IT Service Management, Service Level Management

Configuration Management Database Introduction

The configuration management database (CMDB) is a virtual concept and is made up of many physical databases and physical stores of information. It becomes a CMDB when the information is brought together with a common interface that makes the information accessible and relevant for decision makers; most importantly, the configuration items (CIs) are related (linked) to one another.

Too many organizations take a technical view of the CMDB and begin by trying to develop a database schema so they can build one. It is important to emphasize again that the CMDB is not one physical database. It is a logical concept for organizing information, whose closest tangible representation may be a common interface with many collections of information. It is made up of many physical databases, at least one of which will contain the relationships or links between CIs.

One thing that makes a CMDB different from most data stores is that it focuses on relationships between CIs. That is, it allows decision makers to draw visuals of systems and their interconnections so that they can model the potential impacts of a change to any component or collection of components.
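To make the idea of modeling impact through relationships concrete, here is a minimal sketch that stores CI relationships as a dependency graph and walks it to find everything downstream of a changed CI. The CI names and the `DEPENDENTS` structure are hypothetical; an actual CMDB would expose its relationship data in its own way.

```python
from collections import deque

# Hypothetical CI relationships: each key is a CI, and each value lists the CIs
# that depend directly on it. All names are invented for illustration.
DEPENDENTS = {
    "san-01": ["db-server-02"],
    "db-server-02": ["payroll-app", "reporting-app"],
    "payroll-app": ["payroll-service"],
    "reporting-app": [],
    "payroll-service": [],
}


def downstream_impact(changed_ci):
    """Return every CI reachable from changed_ci by following dependency links."""
    seen = set()
    queue = deque([changed_ci])
    while queue:
        ci = queue.popleft()
        for dependent in DEPENDENTS.get(ci, []):
            if dependent not in seen:  # guards against revisiting shared or cyclic links
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    print(sorted(downstream_impact("san-01")))
```

A change to the storage array in this example reaches the payroll service through two intermediate CIs, which is the kind of indirect impact that a flat inventory list would not reveal.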

Every discipline that is required to understand the workings of complex systems ends up building models of those systems. These models serve to increase basic understanding and also to predict outcomes of changes to the system. As IT environments continue to grow in size and complexity, successful managers will be those who are able to model their environments and accurately predict the effects of change.

Posted in CMDB, Configuration Management, IT Management, IT Service Management

Operations and Development Tear Down That Wall

Many IT organizations suffer from less than optimal communication between development and operations groups. This happens so often that a special term has been created to describe what happens during the transition process: IT people say that developed software is “thrown over the wall” from development to operations. The result is that in-house development projects often fail to meet expectations because design and operations conflicts are addressed too late in the project lifecycle, if at all. Too often, this leads to the perception that IT is unable to deliver services that meet business requirements.

ITIL describes an environment where operational groups have ongoing relationships with the business at all levels of the organization and where change management is a comprehensive process at the IT organization level. In this scenario, requests for change (RFCs) to the production environment are reviewed and approved by both development and operations, then managed through a common change management process.

Operations facilitates requirements gathering through its business relationships. Once the RFC is approved, development becomes the primary party in the project. However, operations remains involved throughout the development lifecycle, primarily in the release manager role and at major milestones. As the project gets closer to final release, operations involvement increases while development involvement decreases.

This scenario highlights the benefits of following a structured development/project management methodology. Within each methodology there are clear milestones at each stage of the project. These milestones have formal meetings associated with them. These meetings provide an opportunity for all stakeholders to maintain control over the project and to ensure that the impact of project changes is understood and accounted for by all parties. This provides operations with many opportunities to ensure that the production environment will be prepared to receive the new release on schedule.

In this scenario, once development has completed its testing, the release moves smoothly into the production test environment, where both groups work to ensure that production testing proceeds appropriately. Having a well-defined and functioning production test process makes the transition period much easier for development and ensures that operations groups have adequate opportunity to prepare for supporting the release in production.

The operations group conducts pre-release production testing, with help as needed from development, and moves towards final roll-out. Development remains involved throughout this process primarily in an advise-and-assist role.

At a predetermined point following final roll-out, development and operations will jointly conduct a post-implementation review to determine how effectively the specific roll-out and the entire process worked. Lessons learned will be incorporated at an organizational level for all groups involved. Successes and failures will be jointly owned by both groups. It is critical in modern, complex, and interrelated IT organizations that development and operations work effectively together. ITIL provides a blueprint for accomplishing this objective.

Posted in Change Management, IT Management, IT Service Management, Process, Release Management

Definitive Stores Reduce Costs

What happens in a typical organization when there is catastrophic damage to a hardware component that requires rebuilding from scratch? In many organizations, the existing production configuration would not be definitively known, and the exact version of the installed software is often unknown or cannot be located. This results in a rebuilt component that differs from the previous production component. Sometimes few or no differences are noticed. Often, critical differences in performance or functionality are found. Sometimes they are found immediately, and sometimes they are found long after the fact, with significant impact to the business.

The more common scenario is a server room with common hardware and no two servers running the same software configuration. Often, desktops will be imaged with different images by different technicians because definitive policies and definitive software stores are nonexistent. Although this seems to be a trivial issue, it is the cause of many incidents and increases the costs of maintaining services.

The definitive software store (the DSL, or Definitive Software Library) is a dedicated location or group of locations where a definitive copy of every version of software in the production environment is stored. It also stores software versions that have been retired. This provides the capability to restore previous environments for any number of situations, some of which are as critical and as potentially costly as regulatory compliance issues. The DSL does not stand alone. It requires definitive understanding of current and past environmental configurations. Maintaining this information is part of configuration management and the configuration management database.

A similar concept is that of the definitive hardware store. This is a location that has backup hardware components that allow for immediate replacement when critical components fail.

Posted in IT Management, IT Service Management, Release Management

Great Release Management Requires Project Management

Release management impacts every process area in the production environment. In addition, its effectiveness and professionalism contribute significantly to the general perception of the development group. Even the most finely developed product is considered useless by the business if it cannot be effectively deployed. As such, it is critical that release management invest time and resources in effective planning for all of its critical activities.

Release management is the area in operations that is most dependent on structured project management. Each release is a project of limited duration. It begins with an approved RFC and ends with a post-implementation review (PIR). Release management conducts many projects simultaneously, each with its own schedule and deadlines. ITIL strongly recommends the use of a structured project management framework or methodology to ensure maximum success for individual projects and release management as a whole.

PRINCE2 is the project management methodology recognized by ITIL and adopted by the British government. In the United States the Project Management Institute (PMI) is the primary source of project management guidance.

Given that many development groups are adopting structured project management disciplines for their development activities, it would be beneficial for release management to adopt a similar discipline. The coordination requirements between release management and development become much easier if both groups recognize the same project definitions, milestones, documentation, and tools.

Posted in IT Management, IT Service Management, Process, Release Management

Define Your Roles for Best Communication Flows

Effective communication flows are important for any organization. They become increasingly important as organizations grow in size or are required to operate in a more rapid fashion. Communication flows are also one of the most difficult aspects of an organization to manage. Much of the difficulty involves the lack of a good model with which to define the communication flows. Often, communication is defined informally on an individual-to-individual basis. As individuals change roles, critical communication often does not occur, resulting in increased cost and unplanned downtime.

ITIL, by introducing a framework and the concept of roles, provides powerful tools to define and model communication flows that do not break down as individuals change roles. In this example, we see communication flows that are defined between process areas. These defined flows remain consistent regardless of who is responsible for any given process area.

These flows can be broken down more granularly by role. For instance, the problem manager has a responsibility to communicate trigger and threshold information for escalation of incidents to the incident manager. If the communication flows are mapped by role (here, problem manager and incident manager), they remain consistent regardless of which individual may be assigned to a given role or how often responsibility for that role changes.
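The benefit of mapping flows by role can be sketched in a few lines: the flows are defined once between roles, and only the role assignments change when people move. The first flow below comes from the example above; the second flow and the people's names are hypothetical additions for illustration.

```python
# Flows between roles: (sender role, receiver role, what is communicated).
# The second entry is an assumed example, not taken from the text.
FLOWS = [
    ("problem manager", "incident manager", "escalation triggers and thresholds"),
    ("incident manager", "problem manager", "incident records for trend analysis"),
]

# Hypothetical role assignments. Only this mapping changes when people move.
ASSIGNMENTS = {"problem manager": "Alice", "incident manager": "Bob"}


def resolve_flows(flows, assignments):
    """Translate role-to-role flows into person-to-person flows for current staffing."""
    return [
        (assignments[sender], assignments[receiver], topic)
        for sender, receiver, topic in flows
    ]


if __name__ == "__main__":
    for sender, receiver, topic in resolve_flows(FLOWS, ASSIGNMENTS):
        print(f"{sender} -> {receiver}: {topic}")

    # A new incident manager takes over; the flow definitions are untouched.
    updated = {**ASSIGNMENTS, "incident manager": "Carol"}
    for sender, receiver, topic in resolve_flows(FLOWS, updated):
        print(f"{sender} -> {receiver}: {topic}")
```

Because `FLOWS` never mentions an individual, a staffing change cannot silently drop a required communication.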

Within the operations space, there is continuous communication flow between incident management, problem management and change management. Slide 7.9 demonstrates some of the most common communication flows.

Posted in IT Management, IT Service Management, Roles

Create a Loop to Measure Change Management

How do you know if an individual change or the change process as a whole was successful?

This question can be answered by creating a feedback loop. The components of this loop are incident records, problem records, and RFCs (change records). By linking these three components together, we can begin to determine the effectiveness of change management.

Changes often introduce new errors into the infrastructure. These errors are identified and recorded through incident records. Incidents that have significant impact become problems. We know from previous sections that we always link incident and problem records together. Now we close the feedback loop by linking incidents and problems to RFCs.

This allows us to determine the impact of any given change. If a change introduces a few incidents but no problems, then it can be considered to have met the test of causing minimum impact to the production environment. If a change can be linked to many incidents and some low category problems, then it can be considered to have a significant impact on the production environment. If a change can be linked to many incidents and many problems or even one high category problem, then it can be considered to have had a serious impact on the production environment.
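The three impact levels described above can be expressed as a small classification function. This is only a sketch: the text does not define "few" or "many", so the thresholds below are assumed values, and the handling of combinations the text does not mention (for example, many incidents with no problems) is also an assumption.

```python
# Assumed thresholds; the text does not say how many counts as "many".
MANY_INCIDENTS = 10
MANY_PROBLEMS = 3


def classify_change(incident_count, problem_categories):
    """Classify the impact of a change from its linked records.

    incident_count: number of incidents linked to the change.
    problem_categories: one entry per linked problem, either "low" or "high".
    """
    # Even one high category problem makes the impact serious.
    if "high" in problem_categories:
        return "serious"
    # Many incidents together with many problems is also serious.
    if incident_count >= MANY_INCIDENTS and len(problem_categories) >= MANY_PROBLEMS:
        return "serious"
    # Assumption: any linked problem, or many incidents alone, counts as significant.
    if problem_categories or incident_count >= MANY_INCIDENTS:
        return "significant"
    # A few incidents and no problems: minimal impact.
    return "minimal"


if __name__ == "__main__":
    print(classify_change(2, []))
    print(classify_change(15, ["low"]))
    print(classify_change(1, ["high"]))
```

Each organization would need to agree on its own thresholds, but writing them down in this form makes the classification repeatable from one change to the next.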

By aggregating this information for all changes during a given period, we begin to understand the impact that change management has on the production environment.

This concept can be taken even further by identifying incidents that are linked to changes that were not approved through the change management process. This identifies how well the change management process has been implemented and adopted. If the number or category of incidents and problems related to unmanaged changes is high, then the change management process as a whole may be failing.
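One simple way to track this is to compute the share of change-related incidents that trace back to unapproved changes. The record layout, identifiers, and the `approved` flag below are hypothetical; a real incident system would store the link to the change record in its own format.

```python
# Hypothetical incident records, each linked to the change believed to have caused it.
# approved=False marks a change that bypassed the change management process.
INCIDENTS = [
    {"id": "INC-1", "change": "CHG-100", "approved": True},
    {"id": "INC-2", "change": "CHG-101", "approved": False},
    {"id": "INC-3", "change": "CHG-101", "approved": False},
    {"id": "INC-4", "change": "CHG-102", "approved": True},
]


def unmanaged_share(incidents):
    """Return the fraction of change-related incidents caused by unapproved changes."""
    if not incidents:
        return 0.0
    unmanaged = sum(1 for incident in incidents if not incident["approved"])
    return unmanaged / len(incidents)


if __name__ == "__main__":
    print(f"{unmanaged_share(INCIDENTS):.0%} of change-related incidents "
          "came from unmanaged changes")
```

A share that stays high from period to period suggests that changes are routinely bypassing the process, regardless of how well the approved changes perform.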

Recognizing how these processes interact is critical in determining how to structure your organization and what kinds of systems to deploy. It is critical to ensure that your organization is not sub-divided to the point where the benefits of this kind of feedback loop are lost. Likewise, it is critical to have all of these processes managed in a single integrated system. The major software vendors are very much aware of these interactions. Their products are designed to seamlessly bring out these benefits when used properly.

This loop is especially likely to be broken in companies that outsource their IT operations in a piecemeal fashion, with each vendor using its own systems. Companies that do this often save visible pennies at the cost of hard-to-see dollars.

Posted in IT Management, IT Service Management, Process

One Change Management for all of IT

Many IT organizations have change processes that are specific to different parts of the organization. For instance, development may have its own change process, and each operational technical silo may have a change process of its own. While this may have worked in the past, the current state of technology interdependence makes this type of organization problematic.

ITIL suggests that the change management process be implemented across all of IT as a whole. This means that all changes are recorded in a common tracking system. The system should recognize the specific needs of each group. For instance, the change management tools needed for writing software are vastly different from the change management tools needed for operations. The key idea, though, is that operations should have visibility into the projects that development is working on, so that it can prepare the operations environment for new releases. By the same token, development should have visibility into operations changes, so that it can be aware of any operational changes that potentially impact the projects it is working on.

Posted in Change Management, IT Management, IT Service Management, Process