Enterprise asset management expert Mike Sondalini chats about how to define system reliability and find the right KPIs for your shop.
Overall equipment effectiveness (OEE) is one of the most important metrics for understanding and improving manufacturing productivity. However, it’s not always well understood how to define it. In a complex business, all parts are interconnected, from the raw materials at the very start of the supply chain, to equipment on the factory floor, until the product is out the door. In such a complex environment, making meaningful improvements is a daunting and elusive goal.
Mike Sondalini, author and founder of Plant Wellness Way, an enterprise asset management consulting firm based in Perth, Australia, is a unique and trusted voice in operational effectiveness.
The following Q&A, which contains some of Sondalini’s best advice for approaching OEE, is edited for clarity.
Engineering.com (ENG): What industries do you focus on?
Mike Sondalini (MS): My passion is processing operations, that came from my time with Swan Brewery and Coogee Chemicals. The other industry I have been heavily involved with is ore mining, which is the biggest industry in Western Australia and is vital to its economy. Following those two is small manufacturing enterprises (SME), which in Western Australia often exist to support the needs of the mining industry. Other SME’s I’ve worked with were involved with construction materials-related process and batch operations.
ENG: What do you see as the most common low-hanging fruit for improving OEE?
MS: The lack of a commonly held understanding throughout a company, from boardroom to shop floor, that all persons work in one holistic system. People need to know and see they—and everyone else in the company and its supply chains—are colleagues working together as one system. What happens in the boardroom directly impacts the shop floor. What happens on the shop floor directly impacts the boardroom. What happens at your suppliers directly impacts your operation’s performance. What happens upstream of you directly impacts your work. What you do in your tasks directly affects all those people downstream of you. Once you realize everything you do, and everything anyone else in your company does, either helps or harms the system, then you can start making worthwhile improvements throughout the system that are sure to better its performance and positively benefit your company, its people, and its customers.
ENG: Does that mean focusing on company culture?
MS: Culture is a set of behaviors and actions you can see. If you want a particular culture, then what is the full set of required expectations, behaviors and actions? Put them into every process, job procedure, work task and training. Embed them in every organizational document so the correct cultural behaviors in a situation are clear and specific and people know what the right thing to do is. Culture is a design choice. Put cultural requirements into your system’s design and structure so cultural norms are clear, cultural randomness is removed, and the intended behaviors can be reproduced by everyone.
ENG: In your experience, what are the most common bottlenecks in production?
MS: Those materials, parts, equipment, skilled expertise and other resources that are vitally necessary are not available and ready-for-use when needed. Then everyone involved must stop and wait for what’s missing before anything else can progress.
You get ahead of the situations and circumstances that lead to bottlenecks by managing and eliminating the root causes of those situations. Instead of fixing bottlenecks once they happen, you focus on stopping ‘the cause of the cause’ so bottlenecks can’t start because the initiating root causes have been prevented.
A practical approach is to proactively use risk identification and risk assessment to identify all upstream causes of a bottleneck and all associated consequences. Then proactively plan and prepare the materials, parts, equipment, skilled expertise and other resources needed to eliminate the causes of the risks to throughput. Follow that with proactive scheduling to ensure the materials, parts, equipment, skilled expertise, and other resources are onsite at the right time and in the right place, so bottleneck risks are minimized and production uptime is thereby maximized. Instead of letting a bottleneck choke and then addressing the issues, your proactive risk management and pre-emptive planning let you know what issues will cause bottlenecks so you can monitor if the risk of bottlenecking is rising.
I call the proactive elimination of the causes of the causes of business system and process failure ‘Enterprise Wellness’. Wellness is the holistic mindset and life-long practice of proactive health where your mind and body are fit, are working properly, and are disease-free. A healthy, fit, properly functioning, and failure-free organization and operations are also what every enterprise wants. Enterprise Wellness is a methodology with a collection of techniques to focus you on getting lifelong organizational health and wellness.
ENG: Is there a set strategy for identifying obstacles or bottlenecks in production, or is it case by case?
MS: What I would start with is Value Stream Mapping (VSM) because it identifies bottlenecks. Following that, I would track all wasted money and all lost operating profits throughout the whole process using Instantaneous Cost of Failure. The lifeblood of business is money, and you must know where your money went, how much of it was lost, and why loss events happened.
It’s a lot easier and quicker for a business to save money than it is to make more money. If you can save money, while at the same time improving and optimizing system-wide performance, the operating profits will rocket up. The Plant Wellness Way (PWW) methodology is used to help companies reach operational excellence as fast as possible by looking at where the money goes in their systems and processes and then identifying how to optimize system performance while minimizing costs.
ENG: Some manufacturers may identify a problem with unplanned downtime and think replacing equipment is the solution. How do you see it?
MS: W. Edwards Deming told us what to do in this situation back in the 1980s. First, optimize the system you have before changing it and potentially introducing new problems and variation. A new machine might be the right answer. But if the existing system is not yet optimized to its maximum sustainable performance, there is no need yet for a new machine.
Lifting OEE requires a holistic, system-wide approach. It might be best for the system that you don’t buy a new machine, but instead your supplier buys a new machine so they can give you better raw materials. If you can see that your suppliers’ equipment and their raw materials are all part of your holistic system, then you have the correct insight to look in the right places and pick good solutions that end the causes of the problems that destroy your OEE.
ENG: When purchasing new equipment, ROI is considered. Should manufacturers use their historical OEE numbers as part of ROI calculations, or focus on improving OEE first?
MS: When getting new equipment, it’s critical that the total life cycle cost is used in the ROI calculations. I would use a range of OEE from the very worst possible to the very best possible. Since they are all possible scenarios, I would want to know how to ensure I get the very best OEE results all the time. You would get the very best results if there was no possible way that the causes of bad OEE could ever arise. Maximizing process success requires finding everything that stops or prevents successful results and eliminating the causes of bad results from the process. If nothing goes wrong in the process because there are no causes present that could lead to failure, then the process naturally, and always, delivers its best success.
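As an editorial illustration of running ROI across a range of OEE values, here is a minimal sketch. It is not Sondalini's model: the function name, the simple ROI formula (lifetime margin minus life cycle cost, divided by life cycle cost) and every figure are assumptions chosen only to show the shape of the calculation.

```python
def lifetime_roi(oee, capacity_units_per_year, margin_per_unit, years, life_cycle_cost):
    """Return ROI as a fraction of total life cycle cost (assumed formula)."""
    # Good units actually produced per year at this OEE level.
    good_units_per_year = capacity_units_per_year * oee
    lifetime_margin = good_units_per_year * margin_per_unit * years
    return (lifetime_margin - life_cycle_cost) / life_cycle_cost


# Hypothetical OEE scenarios, from the worst plausible to the best plausible.
scenarios = {"worst": 0.40, "typical": 0.60, "best": 0.85}

# Hypothetical machine: 100,000 units/year at OEE = 1, $2 margin per unit,
# 10-year life, $800,000 total life cycle cost (purchase, operation, maintenance, disposal).
roi_by_scenario = {
    name: lifetime_roi(oee, 100_000, 2.0, 10, 800_000.0)
    for name, oee in scenarios.items()
}

for name, roi in roi_by_scenario.items():
    print(f"{name}: OEE {scenarios[name]:.2f} -> ROI {roi:.3f}")
```

The spread between the worst and best rows is the point of the exercise: it shows how much of the investment case depends on removing the causes of poor OEE.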
I use a company’s quality system process flowcharts and job procedures to investigate and analyze poor operational performance and poor process outcomes. From there I build a spreadsheet model of the process, procedural step by procedural step. At each step, every risk to its successful accomplishment is listed. I then look at the existing process and its procedures and work instructions to ensure the causes of each risk are either eliminated or controlled so causes cannot become problems. All the unmonitored and/or uncontrolled risks remaining go into my report as opportunities for the company to improve its performance and profits.
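The step-by-step model described above can be sketched in code as well as in a spreadsheet. The following is an editorial illustration under stated assumptions: the class names, the step names and the risks are all hypothetical, and a risk with no recorded control is treated as "uncontrolled".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Risk:
    description: str
    control: Optional[str] = None  # None means no control is documented


@dataclass
class Step:
    name: str
    risks: List[Risk] = field(default_factory=list)


def open_risks(steps):
    """List (step name, risk description) for every risk with no control."""
    return [
        (step.name, risk.description)
        for step in steps
        for risk in step.risks
        if risk.control is None
    ]


# Hypothetical process, listed procedural step by procedural step.
process = [
    Step("Receive raw material", [
        Risk("Wrong grade delivered", control="Certificate check on receipt"),
        Risk("Late delivery"),
    ]),
    Step("Mix batch", [
        Risk("Incorrect dosing", control="Weigh-scale interlock"),
    ]),
    Step("Fill and pack", [
        Risk("Label misprint"),
    ]),
]

# These would go into the report as improvement opportunities.
for step_name, description in open_risks(process):
    print(f"{step_name}: {description}")
```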
ENG: Often, products like ERP and cloud-based monitoring are considered solutions to improving OEE. Do you agree or is there more to it?
MS: Such vendors sell data storage, data display, and data analysis solutions. Using data management software code and algorithms is no guarantee of OEE improvement.
OEE = Availability x Performance x Quality, where:
Availability = Run Time / Planned Production Time
Performance = (Ideal Cycle Time x Total Count) / Run Time
Quality = Good Count / Total Count
The perfect OEE result is 1 x 1 x 1 = 1. The further your operation is from ‘1’, the bigger the problems.
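To make the formula concrete, here is a short editorial sketch that computes the three factors and their product for one shift. The class name, field names and shift figures are assumptions for illustration; only the formula itself comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class ShiftData:
    planned_production_time_min: float
    run_time_min: float
    ideal_cycle_time_min: float  # minutes per part at ideal speed
    total_count: int
    good_count: int


def oee_factors(d):
    """Return (availability, performance, quality) as fractions."""
    availability = d.run_time_min / d.planned_production_time_min
    performance = (d.ideal_cycle_time_min * d.total_count) / d.run_time_min
    quality = d.good_count / d.total_count
    return availability, performance, quality


def oee(d):
    """OEE is the product of the three factors."""
    availability, performance, quality = oee_factors(d)
    return availability * performance * quality


# Hypothetical shift: 480 planned minutes, 400 minutes actually running,
# 1.0 minute ideal cycle time, 360 parts made, 342 of them good.
shift = ShiftData(480, 400, 1.0, 360, 342)
a, p, q = oee_factors(shift)
print(f"Availability {a:.3f}, Performance {p:.3f}, Quality {q:.3f}, OEE {oee(shift):.3f}")
```

Keeping the three factors separate, rather than reporting only the product, shows which of availability, performance or quality is dragging the result down.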
First, collect all the right data. With the complete set of correct data, the data management and analysis software is a tool to give you insights on why problems occurred. If availability was poor, your collected data must tell you why it was poor. If performance was low, your collected data must tell you why it was low. When quality is bad, the collected data must tell you why that happened. All data management software works with the data given to it. When you give it incomplete data or wrong data, the software will still give its output. But that output will be incomplete and/or wrong, and you won’t realize it.
ENG: Are there situations where OEE isn’t a good metric for performance?
MS: OEE is an extremely high-level measure. It’s measuring the holistic performance of the entire operation. It’s interesting to know the ‘30,000-foot’ perspective but you can’t see the details. It’s most vital to know what is happening on the ground that is causing the outcomes you see from so far up the KPI hierarchy. With respect to OEE, you need measures covering the causes and losses of availability, measures covering the causes and losses of performance, and measures covering the causes and losses of quality. Graphic, visual representations of measures and trends are most useful.
ENG: How can a manufacturer know if they have chosen the right KPIs to measure? Is lack of OEE improvement the only way to tell or are there other signs?
MS: System-level performance like OEE shows the accumulated effect of ground-level results. I want to see ground-level KPIs directly from every procedure done in each process.
When I investigate KPI selection, I look at the process flowcharts and list their inputs and outputs step-by-step. I want to determine what factors should be monitored or measured that tell you how well a step is performing. The necessary KPIs are then selected and the appropriate way to measure them is determined. From then on, each KPI is plotted on both a run chart and a frequency distribution graph, looking for wide variations in performance. Bad performance is investigated for its causes upstream, and those causes are eliminated or tightly controlled. Brilliant performance is investigated to capture how it happened and ensure you bring those actions into the process steps so brilliant results become the standard.
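The run chart and frequency distribution can be prototyped without plotting software. This editorial sketch prepares the data behind both views and flags readings that sit far from the median. The KPI (changeover time in minutes, lower is better), the readings, the bin width and the tolerance are all hypothetical.

```python
from collections import Counter
from statistics import median


def frequency_distribution(values, bin_width):
    """Count readings per bin; each key is the lower edge of a bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def run_chart_flags(values, tolerance):
    """Return (sequence number, value, label) for readings far from the median."""
    centre = median(values)
    flagged = []
    for i, v in enumerate(values, start=1):
        if v > centre + tolerance:
            flagged.append((i, v, "worse"))   # investigate upstream causes
        elif v < centre - tolerance:
            flagged.append((i, v, "better"))  # capture what went right
    return flagged


# Hypothetical changeover times in minutes, in the order they occurred.
changeover_min = [22, 24, 23, 25, 21, 24, 41, 23, 22, 15, 24, 23]

histogram = frequency_distribution(changeover_min, bin_width=5)
flags = run_chart_flags(changeover_min, tolerance=5)

for lower_edge, count in histogram.items():
    print(f"{lower_edge:>3}-{lower_edge + 4:<3} {'#' * count}")
for seq, value, label in flags:
    print(f"reading {seq}: {value} min ({label} than usual)")
```

A fixed tolerance around the median is only one simple way to spot wide variation; control limits from a proper control chart would be a more rigorous alternative.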
ENG: When improvements are required in an operation to maximize OEE, what approach would you recommend?
MS: Every change made in an organization, or its supply chains, alters the holistic system behavior and the downstream outcomes. You need a methodology to separate suggestions that will do harm from those that will make your enterprise healthier, stronger, and more successful. Every change must guarantee your enterprise gains more wellness and profit.
I am a great believer in the scientific method of testing a suggestion or new idea against existing practices to be sure it’s a positive improvement and is worthwhile implementing before agreeing to do the change in the real world. The scientific method technique requires you to confirm a proposal is clearly and unquestionably better than the current practice, and by how much.
The scientific method is universally applicable to any ‘before and after’ situation because you are required to measure the size of the effect caused by a proposed change. Experiments quantify the benefits of possible changes. Pilot trials confirm the proposal delivers better results than the current state. It ensures you confirm a change will do no harm and will truly work for the good and wellbeing of the whole enterprise and its future.
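As an editorial illustration of measuring the size of an effect, the sketch below compares a pilot trial against current practice. The daily output figures are invented, and Welch's t statistic is just one common way to judge whether a difference stands out from normal variation; the interview does not prescribe a particular test.

```python
from math import sqrt
from statistics import mean, variance


def compare(baseline, pilot):
    """Return (difference in means, relative change, Welch t statistic)."""
    diff = mean(pilot) - mean(baseline)
    relative_change = diff / mean(baseline)
    # Standard error of the difference, without assuming equal variances.
    std_error = sqrt(variance(baseline) / len(baseline) + variance(pilot) / len(pilot))
    return diff, relative_change, diff / std_error


# Hypothetical good units per hour over five days, before and during a pilot trial.
baseline = [50.0, 52.0, 48.0, 51.0, 49.0]
pilot = [54.0, 55.0, 53.0, 56.0, 52.0]

diff, relative_change, t_stat = compare(baseline, pilot)
print(f"Mean change: {diff:+.1f} units/hour ({relative_change:+.1%}), Welch t = {t_stat:.2f}")
```

A large t statistic suggests the improvement is bigger than day-to-day noise; with samples this small, a longer trial would be needed before committing to the change.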
Of course, an experiment that requires changes to a whole process is a huge risk and would require massive amounts of investigation, planning, engineering and proof testing to ensure minimal risk. Making big changes to an operating business would not be welcomed by its management. What would be heartily welcomed by management is small, simple changes and improvements that cost little, are easy to make and are certain to deliver a profit. That is why Toyota promotes kaizen—they do thousands of small, incremental, low cost, low risk improvements to their processes every year, year after year, and they do not wait to do one monumental change that is highly likely to be less successful than is intended.
To learn more about Mike Sondalini and Plant Wellness Way, visit the website.