Thank you for asking about the "unknown consequences" that RCM can identify. Your inquisitiveness in this regard is indicative of the level of understanding that is just beginning to take center stage in the world of RCM and reliability in general. Typical examples are prevalent throughout industry. That includes industries and facilities as diverse as a manufacturing plant, a chemical plant, a refinery, a commercial aircraft, the space shuttle, a nuclear power plant, an off-shore oil drilling platform, or even a shoe factory. The "unknown consequences" most often manifest themselves due to the lack of understanding of how to really analyze "hidden failures" and the inaccurate categorization of run-to-failure as a choice of strategy for a component that is NOT really run-to-failure. It is the "unknown consequences" that have the greatest potential to cause a catastrophic event.
As I mentioned in my previous post, the late John Moubray, who was a colleague of mine for over 20 years, was also in agreement with me on this issue.
A very typical example, which I mention in my book, occurred not in a third world country but right here at a U.S. nuclear power plant. One of the major safety systems of the main 1150 megawatt turbine was considered so important that the designer included "triple redundancy" so that a runaway turbine due to an uncontrolled overspeed would never occur. The bottom line is that a runaway turbine, due to an uncontrolled overspeed, is exactly what happened! The turbine blew apart and, only by luck, no one was killed. It caused several hundred million dollars in damage and shut the plant down for almost one full year, resulting in an additional $600 million in lost revenue. That catastrophe, which was totally avoidable with an understanding of some of the concepts I write about in my book, resulted in approximately $1,000,000,000 (yes, that's one BILLION dollars) in total losses!
This total disaster could have been avoided with the addition of a few simple but strategic PM's that would not have cost more than a few hundred dollars to implement. The incorrect categorization of run-to-failure and the "unknown consequences" surrounding this event are all too common. Let me also set the stage here for something else to consider. A nuclear power plant probably has more technical expertise on site than any other entity. There are several hundred engineers, scientists, reliability experts, quality assurance inspectors, component experts, maintenance specialists, and a whole host of many other technical types that reside on site. The "unknown consequences" and the pre-existing conditions for the aforementioned disaster went under everyone's radar.
To provide you with another real-life recent example, consider the fire and explosion at the BP oil refinery in Texas City, Texas. Fifteen people were killed and many others injured. This incident was so significant that BP convened a special investigation panel led by the former Secretary of State, James Baker. The Baker panel's final conclusion was that the explosion was PRIMARILY caused by an inadequate preventive maintenance program whereby certain component functions and failure consequences were apparently not well understood.
I could go on and on with a litany of other such typical examples.
The theme I am bringing to industry in my book and in my speaking engagements throughout the country is that it is not the "usual suspects" that cause the major disasters. It is not the components that everyone knows are a problem that have the greatest potential to wreak havoc on your facility; it is the "unknown failure consequences" and the incorrectly invoked categorization of run-to-failure that pose the greatest threat to safety and reliability.
Something you might find of interest, which is pertinent to this discussion, is an article I recently wrote for the October issue of Terry's Uptime Magazine published by Jeff Shuler. If you did not see the article, I have attached a copy of it below. It is titled "What is RCM Anyway?"
Author, "Reliability Centered Maintenance – Implementation Made Simple" published by McGraw-Hill
FROM THE OCTOBER ISSUE OF UPTIME MAGAZINE
WHAT EXACTLY IS RCM ANYWAY?
Reliability Centered Maintenance, or RCM as it is called, is a term used in the reliability community by different folks to mean different things. Reliability people in various industries around the world truly want to improve their preventive maintenance programs, and there are innumerable ways to accomplish that goal.
Unfortunately, and all too often, when the term RCM is used, it unknowingly becomes a convenient "handle" to add the aura of credibility or the image of some kind of technical authority to one's more simplistic approach to improving a preventive maintenance program. The thinking goes like this: "After all, RCM was founded in the commercial aviation industry and if it is good enough to base the safety and reliability of aircraft on, it must be good enough for my plant or facility." While that's a true statement, many of us, unwittingly, I might add, try to fit RCM into our vernacular when, in reality, RCM goes way beyond the simplistic vision of many preventive maintenance goals.
Through no fault of their own, most people do not know what RCM really is.
As a case in point, let's look at some common phrases that can be greatly misunderstood. For example, the phrase "A strategic organizational realignment" might really mean "Let's fire the incompetents at the top and replace them with people who truly know how to run an organization." See how much better the former sounds? Or take another typical example: "Notwithstanding a few minor technical impediments and cost challenges, we are very close to completing the project on a given timetable." In real parlance, this means "We are way behind schedule and over budget." In the majority of instances, the same misunderstood language is at work in the statement that "we are going to implement an RCM program to make our plant safer and more reliable".
From my experience with RCM and associated preventive maintenance program initiatives, approximately 80% of the people touting an RCM plan to improve their plant reliability do not have a grasp of real-life RCM. In reality, what they actually want to implement is a PM Optimization program. A PM Optimization program is NOT an RCM program. What these folks actually want is to convert time directed overhauls into condition monitoring predictive maintenance tasks, or to use cookie-cutter PM templates, or to find better ways to schedule their existing PM's, or to review the 20% of their known problem components that they believe cause 80% of all of their problems and reduce their costs correspondingly. Don't get me wrong here. These are all wonderful things to do and I wholeheartedly endorse all of them, as well as a whole host of other such peripheral betterment issues. But let me be very clear: these betterment issues are NOT RCM.
WHAT RCM IS NOT
RCM is NOT:
• A remake of overhauls into condition monitoring.
• Reviewing known problems.
• A process that selectively picks and chooses a few given systems or certain components to analyze, that everyone, including the janitor, knows is a problem and that has a major effect on the operation of the plant or facility when it fails.
• Performing an analysis on a piece part such as a bearing or a shaft, for example.
• Implementing a set of standardized task templates for PM activities.
The primary reason for implementing an RCM program is to identify components whose functional failures can cause unwanted consequences to one's plant or facility. If you already know what those components are, you don't need an RCM program. You need a PM Optimization program, which is a far cry from an RCM effort.
So why do people call these peripheral programs an RCM program? Usually, it is because it sounds better. It has a cachet of professionalism, authenticity, and technical credentials associated with it. It is like buying a "NASA inspired space developed mattress", which sounds much more technically state of the art than describing it as a "foam mattress that does not move when you jump on it"!
IT'S THE UNEXPECTED
It has been proven over and over and over that the vast majority of major disasters that were not due to nature, human error, or sheer negligence were caused by equipment failures whose consequences were unexpected and never analyzed, or by components that were incorrectly analyzed to be run-to-failure components. The disasters arising from these two causes are "surprises" because they were totally unknown, unexpected and unanalyzed. Many of these disasters were caused by failures of rather innocuous or non-obvious components. A PM Optimization program would have little or no chance to ferret out those component failure consequences.
The "unexpected" disaster can also take place with the misguided conception that redundant components "automatically qualify for a run-to-failure" status. In the absence of identifying what I have termed "potentially critical" components, a hidden failure can go undetected if there is no indication of the failure and if there is no immediate consequence of the failure until another failure takes place in combination with the first failure.
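The hidden-failure argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the article: it assumes each redundant channel fails independently at a constant rate (an exponential model), and the failure rate, time spans, and function names are all hypothetical, chosen only for illustration. It compares the chance that every redundant channel has already failed, unnoticed, when a demand arrives, for channels that are never tested versus channels that are tested on a regular interval.

```python
import math


def prob_channel_failed(failure_rate_per_hour: float, hours_since_last_check: float) -> float:
    """Probability one channel has failed, undetected, since it was last known good.

    Assumes a constant failure rate (exponential model); this is an
    illustrative assumption, not a figure from the article.
    """
    return 1.0 - math.exp(-failure_rate_per_hour * hours_since_last_check)


def prob_all_channels_failed(failure_rate_per_hour: float,
                             hours_since_last_check: float,
                             channels: int) -> float:
    """Probability that every redundant channel is in a failed state at once.

    Assumes channels fail independently, which real common-cause
    failures can violate.
    """
    return prob_channel_failed(failure_rate_per_hour, hours_since_last_check) ** channels


# Hypothetical numbers for illustration only.
RATE = 1e-5                # failures per hour, per channel
TEN_YEARS = 10 * 8760      # hours with no test at all (treated as run-to-failure)
ONE_MONTH = 730            # hours between periodic functional tests

untested = prob_all_channels_failed(RATE, TEN_YEARS, channels=3)
tested = prob_all_channels_failed(RATE, ONE_MONTH, channels=3)

print(f"Triple redundancy, never tested for 10 years: {untested:.3f}")
print(f"Triple redundancy, tested monthly (worst case, just before a test): {tested:.2e}")
```

Under these assumptions, redundancy by itself does not keep the risk low. The periodic check is what exposes a hidden failure before a second and third failure can line up behind it.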
After all, don't we all have maintenance budgets that encompass the known maintenance work that needs to be done? One rather 'small' unexpected disaster can exceed that budget 2, 4, or 10 times over. Worse yet, one unexpected disaster can shut your facility down for good. The real-life examples of major disaster occurrences are bountiful with the latest BP explosion in Texas being just one of them.
WHY TAKE SHORTCUTS?
Even when we believe that we really do need an RCM program and not a PM Optimization effort, why do we take shortcuts that are commonly called streamlined or truncated RCM? We take them only because RCM has been made so difficult and costly to implement that it is mostly shied away from except for mega corporations with megabucks to spend. RCM usually ends up as an unsuccessful venture, even for the mega corporations. To put it in proper perspective, over 90% of all attempted RCM programs result in failure! This does not have to be the case.
Most people believe a comprehensive RCM program takes a team of 6 or more people, three or four years to complete. That could take at least 18 man-years! That's a scary thought and one that undoubtedly puts the kibosh on implementing a comprehensive RCM program. It does not have to be that way.
I have developed and managed what is perhaps, even today, one of the most comprehensive classical RCM programs ever implemented, with over 125,000 components analyzed at a dual unit nuclear power plant. The "lessons I learned" from that experience are that for an average size facility, which describes probably 95% of all facilities (nuclear power plants and jet aircraft being the likely exceptions), a comprehensive RCM program for all plant components can be completed by a team of 3 to 4 technically qualified people in only four to six months!
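The two effort figures quoted above can be compared directly. This is simple arithmetic on the numbers stated in the text; the function name is hypothetical, and the conversion assumes a 12-month person-year.

```python
def person_years(team_size: int, duration_months: float) -> float:
    """Total effort in person-years, assuming 12 months per person-year."""
    return team_size * duration_months / 12


# Commonly assumed effort: 6 people for 3 years (the low end quoted in the text).
conventional = person_years(team_size=6, duration_months=36)

# Range claimed from the author's experience: 3 to 4 people for 4 to 6 months.
claimed_low = person_years(team_size=3, duration_months=4)
claimed_high = person_years(team_size=4, duration_months=6)

print(f"Conventional estimate: {conventional:.0f} person-years")
print(f"Claimed range: {claimed_low:.0f} to {claimed_high:.0f} person-years")
```

On these figures, the claimed effort is roughly one ninth to one eighteenth of the conventional estimate.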
PM BETTERMENT PROGRAMS
I totally concur with the late John Moubray, who was an acquaintance of mine for over 20 years, that there is nothing wrong with PM betterment programs; however, they should be distinguished from RCM. I am not bothered by the false name pretense of calling a PM betterment program an RCM program. I am concerned, however, about senior plant management being lulled into naively and falsely believing that their facility will now become more reliable and less prone to an unwanted disaster because the maintenance and engineering folks have "implemented an RCM program" born directly from jumbo jets and nuclear plants, when in reality all that is planned is a simple PM betterment program! In fact, it is not fair to have senior management believe such a myth.
If you and your folks, as the responsible reliability liaison at your facility, do not know what happens with the unwanted failure of each and every FUNCTIONAL component in your plant, you do not have an RCM based reliability program. Note that I underlined, italicized, and bolded the word FUNCTIONAL. I did this because obviously there are components in every plant or facility that have no real function... they are there for convenience or for very minor importance. Think about it, if a truly functional component was designed into your plant, it is there for a reason. If its immediate failure has no unwanted consequence, and if a multitude of associated component failures in addition to the originally failed component have no unwanted consequences, and if it doesn't matter when the failed component is restored to an operable condition, then why is it in the plant to begin with?
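The three questions in the paragraph above can be read as a simple screen. The sketch below is one possible reading of that paragraph, not the decision logic from the book or from SAE JA1011; the class, field, and function names are hypothetical, and the example components and their answers are invented.

```python
from dataclasses import dataclass


@dataclass
class FailureAssessment:
    """Answers to the three questions in the text for one component (hypothetical fields)."""
    name: str
    immediate_failure_has_consequence: bool
    combined_failures_have_consequence: bool
    restoration_time_matters: bool


def is_run_to_failure_candidate(a: FailureAssessment) -> bool:
    """A component qualifies only if all three answers are 'no'."""
    return not (a.immediate_failure_has_consequence
                or a.combined_failures_have_consequence
                or a.restoration_time_matters)


# Illustrative components; the answers are invented for the example.
components = [
    FailureAssessment("redundant overspeed trip channel", False, True, True),
    FailureAssessment("convenience light in a storeroom", False, False, False),
]

for c in components:
    verdict = ("run-to-failure candidate" if is_run_to_failure_candidate(c)
               else "needs analysis for PM")
    print(f"{c.name}: {verdict}")
```

The redundant channel fails the screen even though its own failure has no immediate effect. It is the second question, about failures in combination, that catches it.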
SAE Document JA1011 has its place. It was not developed to be used on a pick and choose basis. In fact, nowhere in the document does it say it can be used selectively. Remember, an RCM analysis is employed to "identify" components whose functional failures can result in unwanted plant consequences. If you are already clairvoyant and know what those components are, there is really no need to pursue an RCM program. In such a case, RCM would be a waste of time and money.
Of course there are those components whose criticality is well known. That is a given. That is what maintenance budgets are made for. What about the other 80% of the plant that is not looked at? There are myriad critical components in that population just waiting to cause a disaster. In fact, it is within that "unanalyzed" population of components, that your disaster is most likely to occur! Keep in mind; it is the non-obvious and the rather innocuous components whose failure consequences have the greatest potential for wreaking havoc upon your facility.
BE AWARE OF THE RISKS
Let's look at the 80-20 program I mentioned earlier where only 20% of a plant is analyzed leaving the remaining 80% unanalyzed and probably in the run-to-failure category. As delineated in my book, I liken this to buying car insurance that insures you only while you are driving 65 mph on a freeway or while you are driving in heavy traffic, when you believe an accident is most likely to occur. You would not be insured driving on country roads, or driving slowly through your neighborhood to and from work, or driving on any nonbusy roads because it is assumed you would not have an accident under those conditions. You would assume the risk of having no insurance coverage during these times. Does that sound comforting? Statistically, when do most accidents occur? They occur within a few miles of home.
Many astute reliability professionals have begun to understand this logic. They understand that one unanticipated functional failure consequence can totally wipe out any routinely generated maintenance budget that was put together by including maintenance expenses only for the well known problem components that I refer to as the "usual suspects." They understand the risk of disaster that they assume by eliminating 80% of their plant from being analyzed for unwanted functional failures.
To put all of the aforementioned into a clearer and more focused picture, look at RCM as having three phases associated with it. The 1st phase is the heart and soul of the process. It is where the population of equipment requiring preventive maintenance is identified. This is the phase where RCM decision logic comes into play. Phase 1 is the "engine" of the process. The 2nd phase is to specify the tasks that will be scheduled on the population identified in phase 1. This second phase is where condition monitoring, predictive maintenance techniques and the use of cookie-cutter PM task templates are specified. The 3rd phase is the actual implementation of the specified tasks. This is where EAM and CMMS systems come into play.
Look at the 3 phases of RCM like you would look at a car. For example, the heart and soul of the car is the engine. Even though a car has many different facets to it such as tires, brakes, windshield wipers, seats, windows, and so on, it is the engine that singularly defines the car. All other facets of the car are peripheral to the engine.
In summary, don't unintentionally place your senior management into the false belief that you are going to make the plant safer and more reliable. If it is really a PM betterment program you are after, tell your management that you are embarking on a program to reduce known costs. There is a major difference between a program such as RCM which is designed for truly enhancing safety and reliability and its concomitant cost avoidance benefits (such as avoiding potential disasters) and a PM betterment program which is strictly an economic exercise implemented solely to reduce known costs.
In virtually every case, any type of true RCM effort will result in certain INCREASED costs because certain previously unknown failure consequences will be addressed and hence, those components will need to be continuously maintained within the preventive maintenance program. The real benefit of RCM is its ability to ferret out those components whose failure consequences were previously unknown so that they can be appropriately addressed to avoid an unwanted disaster. Obviously, avoiding unwanted disasters has enormous cost avoidance benefits. However, if your facility is such that the worst thing that can happen is within the realm of acceptance by your senior management, then RCM is not really needed. A PM betterment program is what should be pursued.
Neil Bloom is the author of "Reliability Centered Maintenance- Implementation Made Simple" published by McGraw-Hill. He is a mechanical engineer with over 35 years of both hands-on and senior level managerial engineering and maintenance experience in RCM and Preventive Maintenance Programs in the commercial aviation and commercial nuclear power industries. He is an international guest speaker on RCM and an instructor of RCM in the Continuing Education Division at the University of California – Irvine (UCI). Neil provides 3-day RCM Training Seminars and can be reached at neilbloom@RCMauthor.com or (949) 218-1286. His website is www.RCMauthor.com