Recommending classes to users based on an assessment using a Content-Based Recommendation System

In this article, we will build a content-based recommendation system that uses a user’s interests and assessment scores to recommend classes to them.

5 min read · Jan 21, 2021

Recommendation systems are among the most exciting and popular applications of data science, and they are widely credited with driving a large share of engagement and revenue at companies like Netflix and YouTube.

There are many ways to build a recommendation system, but in this article we will look at the content-based approach. As the name indicates, a content-based system uses the attributes of the items to be recommended together with information about the user. These systems are particularly useful when you are starting out, because at that stage you don’t have much information about the users.

One company that uses this type of algorithm is Ernesto.net. The algorithm was developed by Ernesto Lee, a Computer Science professor at Broward College. The company recommends courses based on an individual’s mastery level rather than on the time a user has spent: it uses assessments at each step to judge a user’s mastery and recommends courses accordingly. Ernesto.net combines the assessments and tasks completed by users to tailor future content for personalized learning.

Explore the Data

Let’s start by exploring the data. We have two files: one with data about the users and one with data about the classes.

For the users, we have their interests, the type of test they took, and the score they got on it. For the classes, we know their types and their levels of difficulty.

User’s Info Based Recommendations

We will start by building a simple content-based recommendation system using the users’ interests and the class types. First, we need to convert the Interests and Class Type columns to one-hot encoding.

One-hot encoding creates a new dummy column for each distinct value in a column: a record gets 1 in the dummy column that matches its value and 0 in all the others.

Let’s first write a function to convert these columns to one-hot encoding. I am writing a generic function so that I can reuse it later on.
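The original function is not reproduced here, so the following is only a minimal sketch of what such a generic helper might look like, built on pandas’ `get_dummies`. The function name `one_hot_encode`, its `prefix` parameter, and the toy rows are assumptions made for illustration; only the column names Interests and Class Type come from the article.

```python
import pandas as pd


def one_hot_encode(df, column, prefix=None):
    """Return a copy of df with `column` replaced by 0/1 dummy columns.

    Hypothetical helper: the name and signature are assumptions.
    """
    # One dummy column per distinct value, named "<prefix>_<value>"
    dummies = pd.get_dummies(df[column], prefix=prefix or column, dtype=int)
    # Keep every other column and append the dummy columns
    return pd.concat([df.drop(columns=[column]), dummies], axis=1)


# Toy rows invented for this sketch; they are not the article's data
users_df = pd.DataFrame({"Interests": ["Python", "Design", "Python"]})
classes_df = pd.DataFrame({"Class Type": ["Design", "Python"]})

users_encoded = one_hot_encode(users_df, "Interests")
classes_encoded = one_hot_encode(classes_df, "Class Type")

print(users_encoded)
print(classes_encoded)
```

The sketch assumes each row holds a single value in the encoded column. If a user could list several interests in one cell, the cell would have to be split before encoding.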

The updated data frames would be like this:

Now we will use this information to give recommendations to the user. We do this by calculating the dot product between each class’s one-hot class-type vector and a given user’s one-hot interest vector, i.e., we multiply the matching dummy columns and sum the results. A higher dot product means that the class type matches the user’s interests more closely. We will use the following function:

Let’s test this function with some user IDs:

This shows that the user with ID 0 is matched with the classes with IDs 0, 5, and 10. We can verify this by checking that the user’s interests and the types of these classes are indeed the same.
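The dot-product step described above can be sketched roughly as follows. This is an illustration rather than the article’s own function: the name `recommend_by_interest`, the `top_n` parameter, the use of the DataFrame index as the user and class ID, and the toy rows are all assumptions, so the IDs it returns will not match the ones quoted above.

```python
import pandas as pd


def recommend_by_interest(user_id, users_df, classes_df, top_n=3):
    """Rank classes by the dot product of class type and user interest.

    Hypothetical helper: IDs are assumed to be the DataFrame index.
    """
    user_vecs = pd.get_dummies(users_df["Interests"], dtype=int)
    class_vecs = pd.get_dummies(classes_df["Class Type"], dtype=int)

    # Use one shared list of categories so the dummy columns line up
    categories = sorted(set(user_vecs.columns) | set(class_vecs.columns))
    user_vecs = user_vecs.reindex(columns=categories, fill_value=0)
    class_vecs = class_vecs.reindex(columns=categories, fill_value=0)

    # One dot product per class: each class row times this user's vector
    scores = class_vecs.dot(user_vecs.loc[user_id])

    # Keep only classes that match at all, best first
    scores = scores[scores > 0].sort_values(ascending=False, kind="mergesort")
    return scores.head(top_n).index.tolist()


# Toy rows invented for this sketch; they are not the article's data
users_df = pd.DataFrame({"Interests": ["Python", "Design", "Statistics"]})
classes_df = pd.DataFrame(
    {"Class Type": ["Python", "Design", "Python", "Statistics"]}
)

for uid in users_df.index:
    print(uid, recommend_by_interest(uid, users_df, classes_df))
```

With a single interest per user, every dot product is either 0 or 1, so this step simply filters classes by matching type. The ranking becomes more informative once further signals are added to the score.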

Test Results Based Recommendations

Let’s now use the users’ test results and test types to recommend classes to them.

As the test results are out of 15, we can divide them into three categories: a score of 5 or less is Beginner, 6–10 is Medium, and above 10 is Hard. This lets us map the test results to Class Level.

We will combine the users’ interests and their results to give them better-targeted recommendations. Let’s start by converting the Class Level column to one-hot encoding using our generic function. Our updated classes_df would be:

Now let’s work on the Test Results and Test Type columns in users_df. For Test Type, we will use the same function as above. For Test Results, we first have to convert the scores to difficulty levels and then convert those to one-hot encoding. We will use the following function to achieve that:
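As a rough sketch of this conversion, the scores can be binned with `pd.cut` and the resulting levels one-hot encoded with `pd.get_dummies`. The function name `scores_to_levels`, the `Level` column prefix, and the sample scores are assumptions for illustration; the thresholds are the ones stated above (5 or less, 6–10, above 10).

```python
import pandas as pd

# Difficulty labels in increasing order, as named in the article
LEVELS = ["Beginner", "Medium", "Hard"]


def scores_to_levels(scores):
    """Map test scores out of 15 to Beginner / Medium / Hard.

    Hypothetical helper. Bins are right-inclusive:
    (-1, 5] -> Beginner, (5, 10] -> Medium, (10, 15] -> Hard.
    """
    return pd.cut(scores, bins=[-1, 5, 10, 15], labels=LEVELS)


# Sample scores invented for this sketch, chosen to hit the boundaries
users_df = pd.DataFrame({"Test Results": [3, 5, 6, 10, 11, 15]})

levels = scores_to_levels(users_df["Test Results"])
level_dummies = pd.get_dummies(levels, prefix="Level", dtype=int)

# Replace the raw score column with the three dummy columns
users_encoded = pd.concat(
    [users_df.drop(columns=["Test Results"]), level_dummies], axis=1
)
print(users_encoded)
```

Because `pd.cut` returns a categorical column, all three level columns are created even if no user falls into one of the bins, which keeps the user vectors the same width as the class-level vectors.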

The updated users_df would be:

Now it’s time to make recommendations. We will create the following three separate dot products, which will be used together to make more robust and accurate recommendations:

Interests from users and class types
Class types and test types
Class levels and users’ performance in the test

Let’s create the functions that calculate these and give top recommendations to the user.

The main function calls three separate functions to calculate the dot products and then recommends the classes with the highest combined dot product. Let’s test this out:
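One way the three dot products could be combined is sketched below. It is not the article’s code: the helper names, the equal weighting of the three scores, the use of the DataFrame index as ID, and the toy rows are all assumptions. It also assumes that Test Type and Class Type draw on the same set of topic names, which the original data may or may not do.

```python
import pandas as pd

LEVELS = ["Beginner", "Medium", "Hard"]


def one_hot(series, categories):
    """One-hot encode a Series against a fixed, shared list of categories."""
    dummies = pd.get_dummies(series, dtype=int)
    return dummies.reindex(columns=categories, fill_value=0)


def scores_to_levels(scores):
    """Map scores out of 15 to level names (5 or less, 6-10, above 10)."""
    return pd.cut(scores, bins=[-1, 5, 10, 15], labels=LEVELS).astype(str)


def recommend(user_id, users_df, classes_df, top_n=3):
    """Rank classes by the sum of the three dot products (equal weights)."""
    topics = sorted(
        set(classes_df["Class Type"])
        | set(users_df["Interests"])
        | set(users_df["Test Type"])
    )
    class_types = one_hot(classes_df["Class Type"], topics)
    class_levels = one_hot(classes_df["Class Level"], LEVELS)
    interests = one_hot(users_df["Interests"], topics)
    test_types = one_hot(users_df["Test Type"], topics)
    user_levels = one_hot(scores_to_levels(users_df["Test Results"]), LEVELS)

    combined = (
        class_types.dot(interests.loc[user_id])       # interests vs class types
        + class_types.dot(test_types.loc[user_id])    # test types vs class types
        + class_levels.dot(user_levels.loc[user_id])  # performance vs class levels
    )
    return combined.sort_values(ascending=False, kind="mergesort").head(top_n)


# Toy rows invented for this sketch; they are not the article's data
users_df = pd.DataFrame({
    "Interests": ["Python", "Design"],
    "Test Type": ["Python", "Python"],
    "Test Results": [4, 12],
})
classes_df = pd.DataFrame({
    "Class Type": ["Python", "Python", "Design", "Design"],
    "Class Level": ["Beginner", "Hard", "Hard", "Beginner"],
})

print(recommend(0, users_df, classes_df))
print(recommend(1, users_df, classes_df))
```

Summing the three scores with equal weight is the simplest choice. If one signal should count for more, for example matching the difficulty level, each term could be multiplied by a weight before adding.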

Conclusion

In this article, we looked into content-based recommendation systems:

We saw the importance of one-hot encoding and of using the dot product to measure the similarity between different objects.
We built a recommendation system using the information we had regarding users’ interests and the types of classes available.
We further included the test results and types to build a more robust and accurate recommendation system.