When we moved into our house, it came with the appliances from the previous owner, including the washing machine. It was similar to the one I had grown up with, and was probably about the same age: a top-loading, agitator-style machine. One day, in the middle of the wash cycle, it just stopped. There was no warning, just a tub full of water and half-washed clothes. It wasn’t agitating, but I certainly was!
Our new washing machine is a totally different proposition. As well as being front-loading, it contains a small computer. It has multiple sensors and collects diagnostic information about itself. If it’s not happy, it will tell us what to do, or at least give us an error code.
If a broken washing machine is enough to disrupt your domestic world, imagine running a large industrial process or managing a fleet of expensive vehicles. Machine fault detection and diagnosis are key to keeping your business going. Another term for this is predictive maintenance, and outlier detection (also known as anomaly detection) can help you solve the predictive maintenance problem.
In this post, we apply outlier detection to finding machine defects, using a dataset from the University of California, Irvine (UCI) Machine Learning Repository. The data comes from Scania trucks in Sweden.
“The dataset consists of data collected from heavy Scania trucks in everyday usage. The system in focus is the Air Pressure system (APS) which generates pressurised air that are utilized in various functions in a truck, such as braking and gear changes. The dataset’s positive class consists of component failures for a specific component of the APS system.”
The original dataset contains 60,000 records and 170 columns. Of the 60,000 records, 1,000 have a failed APS component, indicated in the first column, so the overall failure rate is one in 60, or 1.67%. Before running outlier detection, we removed this first column, since it is what we are trying to predict. All the remaining columns are diagnostic readings whose meaning has been masked; they have names like “ab_000”.
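As a minimal sketch of this preparation step, assuming pandas and the file and column names of the UCI distribution (both assumptions; adjust to your copy of the data):

```python
import pandas as pd

# Hypothetical file name; the UCI distribution codes missing values as "na"
# and labels each record "pos" (failed APS component) or "neg" in a column
# called "class" -- adjust to match your copy of the data.
df = pd.read_csv("aps_failure_training_set.csv", na_values="na")

# Set the label aside for evaluating the results later, then drop it
# before submitting the file for outlier detection.
labels = df["class"]
df.drop(columns=["class"]).to_csv("aps_failure_nolabel.csv", index=False)
```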
We uploaded the file to the Penny Analytics website and downloaded our free data profiling report. This report gives us a quick tour of the data and enables us to find issues before submitting the file for outlier detection. Here is the data profiling report:
aps_failure_nolabel_profiling_report.html
(To enable all features of the data profiling report, including toggle details, you will need to download it and open it locally.)
In the outlier results, the outliers are ranked from 1 to 1800, with 1 being the biggest outlier; some ranks are tied. In all, 2,386 records, or 4%, were given an outlier score and reason code. The first thing we did with this results file was to reattach the class column that tells us whether or not each record represents a defect. We then sorted the results by outlier rank.
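As a rough sketch of the reattach-and-sort step (the file name and the rank and class column names are assumptions here, not the actual Penny Analytics output schema, and we assume the results file preserves the original row order):

```python
import pandas as pd

results = pd.read_csv("outlier_results.csv")   # hypothetical results file
labels = pd.read_csv("aps_failure_training_set.csv",
                     na_values="na", usecols=["class"])

# Reattach the label by row position, then sort so rank 1 comes first.
results["class"] = labels["class"].values
results = results.sort_values("rank").reset_index(drop=True)
print(results.head())
```

Here is a peek at the first few records: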
Next, we took the top-ranked outliers and put them into groups of 100. In the first 100 records we found 70 defects, i.e. 7% of all 1,000 defects. We then counted the defects in each successive group of 100 in the same way.
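Here is a sketch of that grouping, continuing the hypothetical column names from above (“pos” marks a failed APS component in the UCI labels):

```python
# Bucket the sorted results into intervals of 100:
# rows 0-99 -> interval 1, rows 100-199 -> interval 2, and so on.
results["interval"] = results.index // 100 + 1
defects_per_interval = (
    results.assign(defect=results["class"].eq("pos"))
           .groupby("interval")["defect"]
           .sum()
)
print(defects_per_interval.head(10))
```

The following table shows the results for the next 100 outliers, and so on: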
The results degrade as we go deeper into the outliers. But how far down the list should we go? The original dataset defines two costs:
• Cost 1 ($10) is the cost of an unnecessary check performed by an engineer at a workshop, while
• Cost 2 ($500) is the cost of missing a faulty truck, which may cause a breakdown.
This means we can calculate a breakeven point. If an interval contains 100 records, and x is the number of faulty trucks, then 100 − x is the number of trucks that are not faulty. The breakeven point is where 500x = 10 × (100 − x), i.e. 510x = 1000, which gives x ≈ 1.96, or roughly two faulty trucks.
So, any interval where we are likely to find at least two genuinely faulty trucks is worth investigating. It turns out this holds until we reach the interval 2401–2500, by which point we have detected 83% of the faulty trucks. Furthermore, the engineers would have access to the outlier ranks and reason codes, which would speed up their work.
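To make the stopping rule concrete, here is a minimal sketch of the cost comparison; the defect counts in the example are placeholders, not our actual per-interval results:

```python
COST_CHECK = 10    # Cost 1: an unnecessary workshop check
COST_MISS = 500    # Cost 2: a missed faulty truck

def worth_inspecting(defects: int, size: int = 100) -> bool:
    """Inspect an interval when the cost of the breakdowns we would miss
    outweighs the cost of the unnecessary checks: 500x >= 10 * (size - x)."""
    return COST_MISS * defects >= COST_CHECK * (size - defects)

# Breakeven: 500x = 10 * (100 - x)  =>  510x = 1000  =>  x ≈ 1.96, about 2.
for defects in (70, 10, 2, 1):    # placeholder counts per interval of 100
    print(defects, worth_inspecting(defects))
```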
Is this the best possible solution? If you are running a large industrial operation, you could probably do better by building custom predictive models, but you would need data scientists and a manageable number of fault types to model. If you want to get results quickly, though, outlier detection could be the answer. At Penny Analytics, you can learn how our outlier detection service works by taking advantage of our free trial.
The Penny Analytics machine defect outlier file is here: