Prognostics & Health Management (PHM)
DAU GLOSSARY DEFINITION
Prognostics and health management (PHM) is an advanced approach to minimize maintenance costs while maximizing operational availability and utilization of critical systems. PHM seeks to develop sensor hardware and algorithms to detect anomalies, diagnose problems that cause the anomalies, and compute a probability distribution of time to failure. Given this distribution along with operational constraints and objectives for the system, maintenance activities can be scheduled to achieve the optimal cost and utilization.
This article describes how Prognostics and Health Management (PHM) supports DoD Condition Based Maintenance Plus (CBM+) initiatives:
Prognostics and Health Management (PHM)
You may have heard the term; most people would say it means “the ability to troubleshoot” or “to measure the remaining life of a component.” While PHM does perform these functions, it is much more. The goal is to anticipate a component failure and schedule maintenance in support of Sense & Respond, as described in DoD Condition Based Maintenance “Plus” (CBM+) guidance. CBM+ uses the impending-failure information to trigger supply support and other logistics processes and resources so the system can be quickly restored to full capability. CBM+ also informs depot repair-line capacity planning and can flag the need for design changes. PHM relies on hardware (sensors and associated wiring) and software (with thresholds for detecting and reporting impending faults).
An example of PHM in our daily lives...
You pack the family car in preparation to depart for Wally World in the morning. You start the Suburban Family Truckster (the salesman had said, “If you think you hate it now – wait’ll you drive it!”) and the check engine light is on. Clark might simply put a piece of black electrical tape over it and press on. The transmission would fail in the middle of Death Valley, of course.
The garage charges Clark for a tow truck to pick up the car. Garage technicians must troubleshoot why the transmission failed and produced bits of metal in the fluid. It will take one day to remove the transmission, one day to tear it down and find the failed component, and another two days awaiting parts. Then come two more days to reassemble, including repair of secondary damage caused by metal shards, and one more day to reinstall the transmission and test drive. “Whew.” That’s seven long days at the motel with nothing to do. With vacation time now over, Wally World will have to wait until next year...
Now suppose that, upon discovering the check engine light, you plug in your On-Board Diagnostics reader and find the code: “Trans Temp Hi.” You check the transmission dipstick and see metal bits. You decide to unpack the car, switch to the spare (your cousin Eddy’s minivan that smells like wet dog), and head for Wally World.
The mission was accomplished and the roadside failure averted. One day was lost to unpack the car, pick up the minivan, and reload. Upon your return you still must schedule an appointment with the dealership and tow the car in. It will still take seven days of downtime, with the additional expense of fixing the secondary damage caused by the pump failure and the released metal.
But What If...
Suppose that, during the previous month, the car’s PHM system had been monitoring transmission pump performance and noticed the pressure fluctuating and both the pump temperature and the transmission fluid temperature rising above threshold. The PHM system would report an impending pump failure. CBM+ would take that sensor information further and report it to the dealership technicians. Knowing these parameter exceedances enables a technician to quickly confirm the fault isolation to the pump. CBM+ would also offer to schedule a repair time with you and search inventory for the required repair parts. It would also report the fault to the engineers for trend analysis. If many of the same impending-failure codes are reported, the engineers would investigate the failures and develop a design fix for the pump to avert future failures.
Going from an unscheduled maintenance action to a predicted, scheduled maintenance action means total downtime is reduced to three days, because troubleshooting is already complete and the parts have already been ordered and delivered. The system has given you enough advance notice to repair the car before departing for the vacation. You even avoided the cost of the tow job and the secondary damage repair because you could drive to the dealership before failure.
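The threshold monitoring in the transmission pump example can be sketched in a few lines of Python. This is only an illustration: the `Reading` fields, the limit values, and the rule that all three indications must agree before a report is raised are assumptions made for the sketch, not values from any real vehicle or PHM specification.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    pressure_psi: float
    pump_temp_c: float
    fluid_temp_c: float


# Hypothetical limits, chosen only for illustration.
PUMP_TEMP_LIMIT_C = 120.0
FLUID_TEMP_LIMIT_C = 110.0
PRESSURE_SPREAD_LIMIT_PSI = 15.0


def exceedances(window):
    """Return the names of parameters that are out of limits in a window of readings."""
    flags = []
    pressures = [r.pressure_psi for r in window]
    # Pressure "fluctuation" is approximated here as max-minus-min over the window.
    if max(pressures) - min(pressures) > PRESSURE_SPREAD_LIMIT_PSI:
        flags.append("pressure_fluctuation")
    if any(r.pump_temp_c > PUMP_TEMP_LIMIT_C for r in window):
        flags.append("pump_temp_high")
    if any(r.fluid_temp_c > FLUID_TEMP_LIMIT_C for r in window):
        flags.append("fluid_temp_high")
    return flags


def impending_pump_failure(window):
    """Assumed rule: report only when all three indications agree."""
    return len(exceedances(window)) == 3


healthy = [Reading(150, 90, 85), Reading(152, 91, 86), Reading(149, 92, 85)]
degraded = [Reading(150, 118, 105), Reading(130, 124, 112), Reading(155, 126, 115)]

print(exceedances(healthy), impending_pump_failure(healthy))
print(exceedances(degraded), impending_pump_failure(degraded))
```

The list of exceedances, not just the yes/no result, is what a CBM+ process would pass along so that a technician can confirm fault isolation to the pump.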
So how can we apply it to our defense systems?
It begins with a strong requirement reflected in the Initial Capabilities Document (ICD) and Capability Development Document (CDD).
PHM must be intentionally designed in, beginning at the System Functional Review (SFR) and continuing through the Preliminary Design Review (PDR) and the Critical Design Review (CDR). Allocations must be made for weight (for the sensors and wiring), and PHM must be included in the software development workload estimate.
A business case analysis is required for each sensor to “buy its way on board.” The total cost of development and sustainment of the sensor suite cannot exceed the cost avoidance benefit that PHM could bring. Fortunately, many of the sensors that are required for system performance monitoring can also support PHM. We can add PHM to an existing platform or a Non-Developmental Item (NDI) or Commercial Off The Shelf (COTS) product, but it is significantly more difficult. The cost to implement PHM in the design and the potential impact on maintainability for a system not designed for access to these sensors and wiring cannot be ignored. Regardless, we can always initiate a CBM+ Fleet Performance Monitoring Program to gather data which will guide fleet sustainment improvement opportunities. This capability would at least detect and track high volume “Could Not Duplicate” failures indicating a need to redesign a component or update fault isolation procedures.
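The “buy its way on board” test described above can be sketched as a simple comparison. This is a minimal illustration, not a business case analysis method: the `SensorCase` fields, the dollar figures, and the 20-year service life are hypothetical, and the sketch ignores discounting and uncertainty, which a real analysis would address.

```python
from dataclasses import dataclass


@dataclass
class SensorCase:
    name: str
    development_cost: int
    annual_sustainment_cost: int
    annual_cost_avoidance: int


def net_benefit(case, service_years):
    """Cost avoidance minus development and sustainment cost over the service life."""
    total_cost = case.development_cost + case.annual_sustainment_cost * service_years
    total_avoidance = case.annual_cost_avoidance * service_years
    return total_avoidance - total_cost


def buys_its_way_on_board(case, service_years):
    """A sensor qualifies only if its cost does not exceed the cost avoidance."""
    return net_benefit(case, service_years) >= 0


# Hypothetical candidates, figures invented for illustration.
pump_temp = SensorCase("pump temperature", 200_000, 10_000, 40_000)
vibration = SensorCase("gearbox vibration", 500_000, 30_000, 40_000)

for case in (pump_temp, vibration):
    print(case.name, net_benefit(case, 20), buys_its_way_on_board(case, 20))
```

A sensor that already exists for performance monitoring would enter this comparison with little or no added development cost, which is why such sensors are the easiest to justify.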
What Are the Disadvantages of Having PHM?
Well, if the system False Alarm Rate exceeds requirements, maintainers end up replacing perfectly good components. This reduces system availability, rapidly depletes the forecasted pool of spare components, and increases depot workload. The associated cost increases will exceed the budget. At some point, the PHM capability becomes more expensive than simply operating the component to failure. Maintainers and operators will also quickly learn to ignore spurious PHM alarms and may miss a real impending failure.
PHM has the potential to actually reduce total system reliability. It is, after all, one more complex system added to a weapon system. Even so, the reduction in unscheduled maintenance is generally worth the slightly reduced system-level reliability.
Also, the data system to report and trigger maintenance can bring its own cost and complexity. If the platform must be deployable, the portable data system may rely on connectivity to the main database. The data system also introduces one more cyber vulnerability to the platform, putting system availability at risk.
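As a rough illustration of how a demonstrated false alarm rate might be checked against a requirement, consider the sketch below. It assumes one particular definition (the fraction of PHM alerts for which no fault was confirmed on teardown); programs define False Alarm Rate in different ways, and the 10 percent limit and the alert history are invented for the example.

```python
def false_alarm_rate(confirmed_flags):
    """Fraction of alerts where no fault was confirmed (assumed definition)."""
    if not confirmed_flags:
        return 0.0
    false_alarms = sum(1 for confirmed in confirmed_flags if not confirmed)
    return false_alarms / len(confirmed_flags)


def exceeds_requirement(confirmed_flags, max_rate):
    """True when the demonstrated rate is above the required limit."""
    return false_alarm_rate(confirmed_flags) > max_rate


# Hypothetical history: 20 alerts, 15 confirmed faults, 5 good components removed.
ALERTS = [True] * 15 + [False] * 5

rate = false_alarm_rate(ALERTS)
print(f"False alarm rate: {rate:.0%}")
print("Exceeds 10% requirement:", exceeds_requirement(ALERTS, 0.10))
```

Each unconfirmed alert in this history represents a serviceable component pulled from the platform and sent through the supply and depot pipeline.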
- Read, understand, and apply the principles and processes outlined in the DoD CBM+ Guidebook and DoD Instruction 4151.22 Condition Based Maintenance Plus for Materiel Maintenance policy. Then read the DAU CBM+ ACQuipedia article and sign up for the LOG 0290 Condition-Based Maintenance Plus (CBM+) online training course.
- Ensure the Joint Capabilities Integration and Development System (JCIDS) requirements (ICD and Draft CDD) clearly define the PHM parameters and include boundaries (such as False Alarm Rate). These system attributes will ensure PHM is included in design allocations, addressed at the Critical Design Review, and not easily traded away in the design process. If the PHM design is deferred to the Low Rate Initial Production contract, then it becomes vastly more difficult to justify its inclusion in the final design.
- Ensure your Test and Evaluation Master Plan (TEMP) includes PHM “seeded fault testing” (sure, it is expensive, but much cheaper than testing in the field with actual failures or false alarms). Seeded fault testing uses software routines that simulate sensor outputs to exercise the PHM suite and then measure its responses. The alternative is to field the system, test it in the operating environment, and hope for good results. (“Hope is not a strategy, sergeant!”)
- Test the PHM system performance early and often, starting in Technology Maturation and Risk Reduction (TMRR) and continuing throughout Engineering and Manufacturing Development (EMD).
- Ensure your program Follow-On Test and Evaluation (FOT&E) planning includes schedule and budget for “tuning” PHM sensor software thresholds to correct for the fault detection and false alarm performance actually demonstrated. Rarely (never?) does a PHM system operate flawlessly in Developmental Testing. The sooner the software is updated, the sooner the system stops sending false signals, which drive up maintenance and drive down demonstrated system availability.
- The Prognostics and Health Management Consortium (PHMC) is an industry and government supported research and education consortium within the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland. The mission of the PHM Consortium is to provide basic research and technology on prognostics and health management to members. CALCE has been performing cooperative, industrially focused research on prognostics. Find out if your Service Reliability, Availability and Maintainability (RAM) engineering department is part of the PHM Consortium. If so, they can present you with options from the “state of the art” technology and ensure your program leverages the full range of sensors and software available for possible integration into your design. This will guide your system specification and request for proposals.
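The seeded fault testing and threshold “tuning” recommended above can be sketched together in a few lines. This is a toy illustration under stated assumptions: the detector (`detect`), the temperature traces in `CASES`, and the candidate thresholds are all hypothetical, and a real PHM suite would use far richer simulated sensor outputs.

```python
def detect(samples, threshold):
    """Detector under test: alert if any simulated sample exceeds the threshold."""
    return any(s > threshold for s in samples)


def run_seeded_tests(cases, threshold):
    """Return (detection rate on seeded faults, false alarm count on healthy cases).

    cases is a list of (samples, fault_seeded) pairs.
    """
    seeded = [samples for samples, fault in cases if fault]
    healthy = [samples for samples, fault in cases if not fault]
    if not seeded:
        raise ValueError("at least one seeded-fault case is required")
    detected = sum(1 for samples in seeded if detect(samples, threshold))
    false_alarms = sum(1 for samples in healthy if detect(samples, threshold))
    return detected / len(seeded), false_alarms


# Hypothetical simulated temperature traces: (samples, fault_seeded).
CASES = [
    ([88, 90, 93], False),   # healthy
    ([91, 95, 97], False),   # healthy, but running hot
    ([90, 104, 112], True),  # seeded fault
    ([92, 99, 106], True),   # seeded fault, milder signature
]

# Sweep candidate thresholds to see the detection / false alarm trade-off.
for threshold in (95, 100, 110):
    print(threshold, run_seeded_tests(CASES, threshold))
```

In this invented data set the lowest threshold catches every seeded fault but raises a false alarm on the hot-running healthy case, while the highest threshold misses the milder fault. Finding the setting in between is the kind of adjustment that tuning budget and schedule are meant to cover.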