Complete Playlist of Unsupervised Machine Learning https://www.youtube.com/playlist?list=PLfQLfkzgFi7azUjaXuU0jTqg03kD-ZbUz

Let's look at our second unsupervised learning algorithm. Anomaly detection algorithms look at an unlabeled dataset of normal events and thereby learn to detect, or to raise a red flag, if there is an unusual or anomalous event. Let's look at an example. Some of my friends were working on using anomaly detection to detect possible problems with aircraft engines that were being manufactured. When a company makes an aircraft engine, you really want that aircraft engine to be reliable and function well, because an aircraft engine failure has very negative consequences. So some of my friends were using anomaly detection to check if an aircraft engine, after it was manufactured, seemed anomalous or if there seemed to be anything wrong with it.

Here's the idea. After an aircraft engine rolls off the assembly line, you can compute a number of different features of the aircraft engine. So, say feature x1 measures the heat generated by the engine, feature x2 measures the vibration intensity, and so on for additional features as well. But to simplify the slide a bit, I'm going to use just two features, x1 and x2, corresponding to the heat and the vibrations of the engine. Now, it turns out that aircraft engine manufacturers don't make that many bad engines. And so the easier type of data to collect would be, if you have manufactured m aircraft engines, to collect the features x1 and x2 about how these m engines behave, and probably most of them are just fine: they are normal engines rather than ones with a defect or flaw in them.

And the anomaly detection problem is this: after the learning algorithm has seen these m examples of how aircraft engines typically behave in terms of how much heat they generate and how much they vibrate, if a brand new aircraft engine were to roll off the assembly line and it had a new feature vector given by Xtest, we'd like to know, does this engine look similar to ones that have been manufactured before? So is this probably okay?
Or is there something really weird about this engine which might cause its performance to be suspect, meaning that maybe we should inspect it even more carefully before we let it get shipped out and be installed in an airplane, and then hopefully nothing will go wrong with it.

Here's how an anomaly detection algorithm works. Let me plot the m training examples, x(1) through x(m), over here via these crosses, where each cross, each data point in this plot, corresponds to a specific engine with a specific amount of heat and a specific amount of vibration. If this new aircraft engine Xtest rolls off the assembly line, and if you were to plot its values of x1 and x2 and it were here, you'd say, okay, that looks probably okay. Looks very similar to other aircraft engines. Maybe I don't need to worry about this one. But if this new aircraft engine has a heat and vibration signature that is, say, all the way down here, then this data point down here looks very different from the ones we saw up on top. And so we will probably say, boy, this looks like an anomaly. This doesn't look like the examples I've seen before, so we'd better inspect this more carefully before we let this engine get installed on an airplane.

How can you have an algorithm address this problem? The most common way to carry out anomaly detection is through a technique called density estimation. And what that means is, when you're given your training set of these m examples, the first thing you do is build a model for the probability of x. In other words, the learning algorithm will try to figure out what are the values of the features x1 and x2 that have high probability, and what are the values that are less likely, or have a lower chance or lower probability of being seen in the dataset. In this example that we have here, I think it is quite likely to see examples in that little ellipse in the middle, so that region in the middle would have high probability. Maybe things in this ellipse have a little bit lower probability. Things in this ellipse, or this oval, have even lower probability, and things outside have even lower probability.
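
To make the density estimation idea a bit more concrete, here is a minimal sketch in Python. It rests on several assumptions that are not in the lecture so far: the density p(x) is modeled as a product of independent Gaussians, one per feature (just one possible choice of model), the training data is synthetic, and the threshold `epsilon` and all function names are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics


def fit_gaussian_per_feature(examples):
    """Estimate a mean and a variance for each feature column."""
    columns = list(zip(*examples))
    means = [statistics.fmean(col) for col in columns]
    variances = [statistics.pvariance(col, mu) for col, mu in zip(columns, means)]
    return means, variances


def density(x, means, variances):
    """p(x) as a product of one-dimensional Gaussian densities (assumed model)."""
    p = 1.0
    for value, mu, var in zip(x, means, variances):
        p *= math.exp(-((value - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return p


# Synthetic stand-in for m = 500 engines; each example is (heat, vibration).
# These numbers are made up and do not come from real engine data.
rng = random.Random(0)
train = [(rng.gauss(5.0, 1.0), rng.gauss(3.0, 0.5)) for _ in range(500)]

means, variances = fit_gaussian_per_feature(train)

x_test_typical = (5.2, 3.1)  # close to the bulk of the training examples
x_test_unusual = (9.5, 0.5)  # far away from anything seen in training

p_typical = density(x_test_typical, means, variances)
p_unusual = density(x_test_unusual, means, variances)

epsilon = 1e-3  # hypothetical threshold; in practice it would need tuning


def is_anomaly(x):
    """Flag x when its estimated density falls below the threshold."""
    return density(x, means, variances) < epsilon


print("typical engine flagged:", is_anomaly(x_test_typical))
print("unusual engine flagged:", is_anomaly(x_test_unusual))
```

The engine near the middle of the crosses gets a much higher p(x) than the one far away, which is the "high probability in the inner ellipse, lower probability further out" picture expressed as numbers.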
