May 23, 2024

Recommender Systems

Written by: Emina Šahinović, Engineering Manager

Connect to the internet and you’re greeted by… something.

Today, whatever website or app you open, you expect the homepage to be filled with something that catches your attention. Unlike ++Mag, organized by an editor, most content on the internet is organized by an algorithm. 

At first glance, deciding what data to display might seem simple, but it quickly turns out to be a complex problem. Implementing a recommendation system usually involves gathering large amounts of data, and different applications call for different approaches. Sometimes the system can rely on explicit rules, such as showing the newest or cheapest items, videos or images from certain users, or movies in certain categories. In other situations, recommendations and connections are made differently: by showing similar items, popular videos, or even random connections.

Data Collection

To implement a recommendation system, data must first be collected. There are two methods of data collection: explicit and implicit.

In the explicit method, users generate the necessary data themselves. This can include rating content on a scale, liking or disliking it, describing it, ranking it, or comparing it with other content. Explicit data collection requires active users and well-thought-out methods for collecting data. Examples of this approach are surveys that appear unexpectedly and requests for users to rate something.

Implicit data collection involves analyzing user behavior. This includes navigation, time spent on certain pages or items, purchases, interactions with ads, and more. This method doesn't require additional features within the app, just the implementation of analytics and metrics. When the news mentions "purchasing user data," it usually refers to this category, because such data is relatively independent of any single app and difficult to collect.
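The difference between the two kinds of data can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the `Event` record, the event names, and the weights in `IMPLICIT_WEIGHTS` are hypothetical and not taken from any particular analytics tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    user: str
    item: str
    kind: str           # "rating" is explicit; "view" and "purchase" are implicit
    value: float = 1.0  # only used for ratings here (for example a 1-5 scale)


# Hypothetical weights: how strongly each implicit action signals interest.
IMPLICIT_WEIGHTS = {"view": 1.0, "purchase": 5.0}


def build_interactions(events):
    """Split raw events into explicit ratings and implicit interest scores."""
    explicit = {}
    implicit = defaultdict(float)
    for event in events:
        key = (event.user, event.item)
        if event.kind == "rating":
            explicit[key] = event.value  # the latest rating wins
        elif event.kind in IMPLICIT_WEIGHTS:
            implicit[key] += IMPLICIT_WEIGHTS[event.kind]
        # unknown event kinds are ignored
    return explicit, dict(implicit)


events = [
    Event("ana", "book", "rating", 4.0),
    Event("ana", "book", "view"),
    Event("ana", "book", "purchase"),
    Event("ben", "lamp", "view"),
    Event("ben", "lamp", "view"),
]
explicit, implicit = build_interactions(events)
print(explicit)  # {('ana', 'book'): 4.0}
print(implicit)  # {('ana', 'book'): 6.0, ('ben', 'lamp'): 2.0}
```

Note that Ben never rated anything, yet the implicit data still says something about his interest in the lamp.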

Collaborative model recommender system

Collaborative Filtering

After a certain amount of data has been collected, the system can start using collaborative filters. Collaborative filters generate groups of similar users and user experiences. The assumption in these systems is that recognized similarities can be used as patterns. The system tracks user paths and actions, and finds the most likely next step.
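One simple way to picture "finding the most likely next step" is to count which action most often follows another in recorded user paths. The sketch below assumes paths are stored as plain lists of page names. The function names and the data are hypothetical, and real systems use far richer models.

```python
from collections import Counter, defaultdict


def count_transitions(paths):
    """Count how often each step is directly followed by each other step."""
    transitions = defaultdict(Counter)
    for path in paths:
        for current, following in zip(path, path[1:]):
            transitions[current][following] += 1
    return transitions


def most_likely_next(transitions, current):
    """Return the most frequent follower of `current`, or None if unseen."""
    followers = transitions.get(current)
    if not followers:
        return None
    # Highest count first; ties are broken alphabetically for determinism.
    return min(followers.items(), key=lambda kv: (-kv[1], kv[0]))[0]


paths = [
    ["home", "search", "product", "cart"],
    ["home", "search", "product", "reviews"],
    ["home", "product", "cart"],
]
transitions = count_transitions(paths)
print(most_likely_next(transitions, "product"))  # cart
print(most_likely_next(transitions, "cart"))     # None
```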

By comparing user interactions, the system finds similar users and calculates recommendations based on what those users did. For example, after you buy two items on a site, the system might find that other users who bought those two items soon bought a third, and recommend it to you. In this case, the system doesn't need to know anything about the items, their descriptions, or the users. A simple comparison of interactions identifies the patterns used for recommendations. Such systems can be used in online stores. Because the content itself is not analyzed, the recommended items might seem unrelated. Users also don't use an application equally, so less active users may end up linked with very active ones. New users and new items, which have no interaction history yet, are a problem as well. Therefore, this filtering method is used less and less.
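The "users who bought these two items also bought a third" idea can be sketched in a few lines. This is a minimal illustration, assuming purchases are stored as one set of item names per user and using Jaccard overlap as the similarity measure. That measure is one common choice, not necessarily what any given store uses, and the function name and data are hypothetical.

```python
from collections import Counter


def recommend_from_similar_users(baskets, user, top_n=3):
    """Recommend items bought by users whose baskets overlap with `user`'s."""
    own = baskets[user]
    scores = Counter()
    for other, items in baskets.items():
        if other == user:
            continue
        overlap = len(own & items)
        if overlap == 0:
            continue  # nothing in common, so this user tells us nothing
        similarity = overlap / len(own | items)  # Jaccard similarity
        for item in items - own:
            scores[item] += similarity
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


baskets = {
    "you": {"tent", "sleeping bag"},
    "u1": {"tent", "sleeping bag", "stove"},
    "u2": {"tent", "sleeping bag", "stove", "lantern"},
    "u3": {"novel"},
    "new": set(),
}
print(recommend_from_similar_users(baskets, "you"))  # ['stove', 'lantern']
print(recommend_from_similar_users(baskets, "new"))  # []
```

The code never looks at what a tent or a stove is, only at who bought what. The empty result for the user with no purchases shows the new-user problem: with no interactions, there is nobody to be similar to.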

Recommender System

The content model takes the user and their characteristics, together with the characteristics of the content, and predicts the interaction.

Content-Based Filtering

Unlike collaborative filters that rely on users and user interactions, content-based filters rely on the content. When the system connects you as a user to a certain group, it recognizes patterns. One user can belong to different groups, such as a certain age group, education level, interaction level with the app, and usage of other services. All these different attributes are scaled and adjusted.

Thus, in online stores, interactions with other stores and ads will be important. If the content is related to music, recommendations may use data from music apps, while news sites may use regional and demographic data. In addition to generating a user profile, content-based filters also use content data. Important information can include the content category, descriptions and tags, and creation date. Content data can be generated by users or through machine learning. The filter then uses content data in combination with user data. When new products or items are added, it's enough to add the necessary content data, and the new items are immediately ready for recognition and recommendation, unlike with collaborative filters, where they would first need to be discovered by users. Similarly, user profiles can be created very quickly, and certain criteria can be applied right away.
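A minimal sketch of this matching, assuming both the user profile and each item are described by weighted tags and compared with cosine similarity. The tag names, weights, and function names are hypothetical, and cosine similarity is just one common way to compare such profiles.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse tag-weight dictionaries."""
    dot = sum(a[tag] * b[tag] for tag in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_items(user_profile, catalog):
    """Order catalog items by similarity to the profile, dropping zero matches."""
    scored = [(cosine(user_profile, tags), item) for item, tags in catalog.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for score, item in scored if score > 0]


user_profile = {"rock": 1.0, "live": 0.5}
catalog = {
    "album_a": {"rock": 1.0, "live": 1.0},
    "album_b": {"jazz": 1.0},
    "album_c": {"rock": 1.0},
}
print(rank_items(user_profile, catalog))  # ['album_a', 'album_c']

# A brand-new item only needs its content data to become recommendable.
catalog["album_d"] = {"rock": 1.0, "live": 0.5}
print(rank_items(user_profile, catalog))  # ['album_d', 'album_a', 'album_c']
```

No interaction history is needed for `album_d`: its tags alone are enough for it to be ranked.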

You’ll probably notice that most sites give you some recommendations and content as soon as you create a profile. If you compare that initial content with someone else's, you'll likely see that it's different. After some time, if multiple users have similar interests, the content might become quite similar, but the initial filter will be very different. 

Recommender system algorithm

Various content is proposed to the user for the purpose of greater interaction and training of the system. The algorithm predicts reactions and provides further recommendations.

Passive Becomes Active

Describing and setting up models and filters is the first step in defining a recommendation system. The next step is implementation, which includes fine-tuning models and parameters. The main purpose of a recommendation system is to maximize interactions and engagement. Recommendations need to be tailored to the application itself.

One of the basic settings is the set of weights: how much the categories matter, how much the profile attributes matter, how much the content matters, and so on. User satisfaction with the recommendations is also crucial. Users expect something familiar that they will like, but also something new, different from what they're used to. It's not a bad idea to sometimes mix in some generally popular things.
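Tuning the weights can be pictured as blending several signals into one score. The sketch below assumes each candidate already has a profile match, a content match, and a popularity value between 0 and 1. The signal names, the numbers, and the linear blend itself are illustrative assumptions rather than a description of any real system.

```python
def blended_score(signals, weights):
    """Weighted sum of the signals; missing signals count as zero."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def rank_candidates(candidates, weights):
    """Order candidate names by blended score, best first."""
    return sorted(
        candidates,
        key=lambda item: (-blended_score(candidates[item], weights), item),
    )


candidates = {
    "familiar": {"profile": 0.9, "content": 0.8, "popularity": 0.2},
    "trending": {"profile": 0.2, "content": 0.3, "popularity": 1.0},
}

personal = {"profile": 0.5, "content": 0.3, "popularity": 0.2}
popular = {"profile": 0.1, "content": 0.1, "popularity": 0.8}

print(rank_candidates(candidates, personal))  # ['familiar', 'trending']
print(rank_candidates(candidates, popular))   # ['trending', 'familiar']
```

The same two candidates swap places when the weights change, which is why these settings need to be tailored to the application.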

Adjusting the metrics is a constant process. YouTube will recommend videos with 100 views, but also a video you watched and liked 10 years ago, sending you down a nostalgia rabbit hole. Spotify might recommend a band in a genre you've never heard of, and you'll discover a new favorite obsession, or you might think "why is this in my recommendations?" and close the app.

Probably the most popular recent example is TikTok, whose algorithm, people say, "you have to train." This is really a new chapter of a bigger story: passive now becomes active, under conscious influence.

So – like, comment, subscribe.