Complete Playlist of Unsupervised Machine Learning https://www.youtube.com/playlist?list=PLfQLfkzgFi7azUjaXuU0jTqg03kD-ZbUz
What is the Collaborative Filtering Algorithm? How Does the Collaborative Filtering Algorithm Work in Unsupervised Machine Learning?

In the last video, you saw how, if you have features for each movie, such as features x_1 and x_2 that tell you how much this is a romance movie and how much this is an action movie, then you can use basically linear regression to learn to predict movie ratings. But what if you don't have those features, x_1 and x_2? Let's take a look at how you can learn or come up with those features x_1 and x_2 from the data.

Here's the data that we had before. But what if, instead of having these numbers for x_1 and x_2, we didn't know in advance what the values of the features x_1 and x_2 were? I'm going to replace them with question marks over here. Now, just for the purposes of illustration, let's say we had somehow already learned parameters for the four users. Let's say that we learned parameters w^(1) = [5, 0] and b^(1) = 0 for user one, w^(2) = [5, 0] and b^(2) = 0 for user two, w^(3) = [0, 5] and b^(3) = 0 for user three, and w^(4) = [0, 5] and b^(4) = 0 for user four. We'll worry later about how we might have come up with these parameters w and b, but let's say we have them already. As a reminder, to predict user j's rating on movie i, we're going to use w^(j) dot product with the features x^(i), plus b^(j). To simplify this example, all the values of b are equal to 0, so just to reduce a little bit of writing, I'm going to ignore b for the rest of this example.

Let's take a look at how we can try to guess what might be reasonable features for movie one. If these are the parameters you have on the left, then given that Alice rated movie one 5, we should have w^(1) · x^(1) about equal to 5, and w^(2) · x^(1) should also be about equal to 5, because Bob rated it 5. W^(3) · x^(1) should be close to 0, and w^(4) · x^(1) should be close to 0 as well. The question is: given these values for w that we have up here, what choice for x^(1) will cause these values to be right? Well, one possible choice would be if the features for that first movie were [1, 0], in which case w^(1) · x^(1) equals 5, w^(2) · x^(1) equals 5, and similarly w^(3) or w^(4) dotted with this feature vector x^(1) equals 0.

What we have is that if you have the parameters for all four users here, and if you have four ratings in this example that you want to try to match, you can take a reasonable guess at a feature vector x^(1) for movie one that would make good predictions for these four ratings up on top. Similarly, if you have these parameter vectors, you can also try to come up with a feature vector x^(2) for the second movie, a feature vector x^(3) for the third movie, and so on, to try to make the algorithm's predictions on these additional movies close to the ratings actually given by the users.

Let's come up with a cost function for actually learning the values of x_1 and x_2. That cost function sums, over every rating that exists in the data, the squared error between the prediction w^(j) · x^(i) + b^(j) and the rating the user actually gave, with regularization terms added for both w and x. By the way, notice that this works only because we have parameters for four users. That's what allows us to try to guess appropriate features x^(1). I'm using the notation here a little bit informally and not keeping very careful track of the superscripts and subscripts, but the key takeaway I hope you have from this is that the parameters to this model are w and b, and x now is also a parameter, which is why we minimize the cost function as a function of all three of these sets of parameters: w, b, and x.
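As a rough illustration of the reasoning above (this is my own sketch, not code from the lecture; the names W, y_movie1, x1, and cost_for_x1 are made up for the example), here is how you could check the guess x^(1) = [1, 0] in NumPy: with the four users' parameters held fixed, the candidate feature vector reproduces all four ratings, and a simple squared-error cost over x^(1) drops to zero.

```python
import numpy as np

# Parameters already learned for the four users (from the example above).
# Each row is w^(j); all b^(j) are 0, so b is ignored here.
W = np.array([
    [5.0, 0.0],   # user 1 (Alice)
    [5.0, 0.0],   # user 2 (Bob)
    [0.0, 5.0],   # user 3
    [0.0, 5.0],   # user 4
])

# Ratings the four users gave movie one.
y_movie1 = np.array([5.0, 5.0, 0.0, 0.0])

# Candidate feature vector for movie one: x^(1) = [1, 0].
x1 = np.array([1.0, 0.0])

# Predicted rating for user j is w^(j) . x^(1) (plus b^(j), which is 0 here).
predictions = W @ x1
print(predictions)        # [5. 5. 0. 0.] -- matches the four ratings

# Squared-error cost over movie one's ratings, viewed as a function of x^(1).
def cost_for_x1(x):
    return 0.5 * np.sum((W @ x - y_movie1) ** 2)

print(cost_for_x1(x1))    # 0.0, so x^(1) = [1, 0] is a reasonable guess
```

In the full algorithm, this same squared-error term is summed over every rating in the dataset and minimized jointly over w, b, and x, rather than over a single movie's features with the users' parameters held fixed.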
The algorithm we just derived is called collaborative filtering. The name collaborative filtering refers to the sense that, because multiple users have rated the same movie collaboratively, that gives you a sense of what this movie may be like, which allows you to guess what appropriate features for that movie are, and this in turn allows you to predict how other users who haven't yet rated that same movie may decide to rate it in the future. Collaborative filtering is this gathering of data from multiple users, this collaboration between users, to help you predict ratings even for other users in the future. So far, our problem formulation has used movie ratings from 1 to 5 stars or from 0 to 5 stars. A very common use case of recommender systems is when you have binary labels, such as whether the user favorited, liked, or interacted with an item. In the next video, let's take a look at a generalization of the model that you've seen so far to binary labels. Let's go see that in the next video.
Subscribe to our channel for more computer science related tutorials | https://www.youtube.com/@learnwithcoursera