Marappan R*
Senior Assistant Professor, India
*Corresponding author: Marappan R, School of Computing, SASTRA Deemed University, Thanjavur, India
Submission: June 14, 2022; Published: October 17, 2022
ISSN: 2832-4463 Volume 2 Issue 3
Many recommendation systems have been developed for artificial intelligence and machine learning applications. A recommendation system should provide good recommendations with minimal computing time. This research shows how to build a popularity-based recommender system for the MovieLens dataset using Python and analyzes the result.
Keywords: Recommendation systems; Recommender systems; Artificial intelligence; Machine learning; Popularity recommender
Recently, many recommender systems have been developed for information filtering [1]. Learners and other users expect recommendations that match their interests across various applications. This research focuses on developing a popularity-based recommender using Python.
For the implementation, the MovieLens dataset is used, with the files ratings.csv and movies.csv [2-4]. The fields in movies.csv are movieId, title, and genres. The movieId field holds the unique id of each movie, the title field holds the name of the movie, and the genres field lists its genres. The fields in ratings.csv are userId, movieId, rating, and timestamp. The userId field holds the unique id of the user who rated the movie, the rating field holds the rating that user gave, and the timestamp field holds the time at which the rating was made.
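The layout of the two files can be illustrated with a minimal sketch. The sample rows below are hypothetical and stand in for the real files. The genre column is called `genres` here, as in the public MovieLens files, and two points are assumptions to verify against the dataset's own documentation: that timestamps are seconds since the Unix epoch, and that multiple genres are separated by `|`.

```python
import io

import pandas as pd

# Hypothetical sample rows in the layout described above; not real MovieLens data.
MOVIES_CSV = """movieId,title,genres
1,Movie A (1995),Adventure|Comedy
2,Movie B (1999),Drama
"""
RATINGS_CSV = """userId,movieId,rating,timestamp
1,1,4.0,964982703
1,2,3.5,964981247
2,1,5.0,1445714835
"""

movies = pd.read_csv(io.StringIO(MOVIES_CSV))
ratings = pd.read_csv(io.StringIO(RATINGS_CSV))

# Assumption: timestamps are seconds since the Unix epoch (UTC).
ratings["rated_at"] = pd.to_datetime(ratings["timestamp"], unit="s", utc=True)

# Assumption: multiple genres are separated by "|".
movies["genre_list"] = movies["genres"].str.split("|")

print(ratings.dtypes)
print(movies[["movieId", "title", "genre_list"]])
```

Converting the timestamp once, right after loading, keeps later time-based analysis free of repeated unit handling.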
This section presents the Python recommendation model for the MovieLens dataset.
The structures of ratings.csv and movies.csv are shown in Figures 1 and 2, respectively. The combined structure of these files is shown in Figure 3. The complete model is defined as follows:
# Import all necessary libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Note: matplotlib 3.6 and later name this style 'seaborn-v0_8-bright'
plt.style.use('seaborn-bright')
%matplotlib inline
# Change directory to the folder where the data files are present
# This step is not necessary if the data files and the Jupyter notebook are in the same folder
os.chdir(r"E:\MovieLens")
# Import the ratings file into a pandas data frame
ratings_data = pd.read_csv("ratings.csv")
ratings_data.head()
# Import the movies file into a pandas data frame
movie_names = pd.read_csv("movies.csv")
movie_names.head()
# Merge the two data frames on the shared movieId column
movie_data = pd.merge(ratings_data, movie_names, on='movieId')
movie_data.head()
Figure 1: Structure of ratings.csv.
Figure 2: Structure of movies.csv.
Figure 3: Combined structure of ratings.csv & movies.csv with movieId as the join key.
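The merge on movieId can be illustrated on toy data. The frames below are hypothetical and much smaller than the real files. The sketch also shows a left join with `indicator=True`, which is not part of the original model but is a simple way to check whether any rating refers to a movieId that is missing from the movies file.

```python
import pandas as pd

# Hypothetical toy frames; the real files are far larger.
ratings_data = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating": [4.0, 3.0, 5.0, 2.5],
})
movie_names = pd.DataFrame({
    "movieId": [10, 20, 40],
    "title": ["Movie A", "Movie B", "Movie D"],
})

# Inner join (the pandas default): keeps only movieId values present in both frames.
movie_data = pd.merge(ratings_data, movie_names, on="movieId")

# Left join with indicator=True: flags ratings that have no matching title.
checked = pd.merge(ratings_data, movie_names, on="movieId", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(movie_data)
print(unmatched[["userId", "movieId"]])
```

With the default inner join, the rating for movieId 30 is silently dropped, so a check of this kind is worth running once before the analysis.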
This section presents the analysis of the constructed model.
The ratings and the total ratings are shown in Figures 4 and 5. The ratings with the number of movies are shown in Figures 6-8.
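The quantities behind these figures can be computed as in the following minimal sketch. It assumes a merged frame with `title` and `rating` columns. The sample data, the column names `total_ratings` and `movies_per_bucket`, and the choice of one-point rating intervals are all hypothetical.

```python
import pandas as pd

# Hypothetical merged frame standing in for movie_data.
movie_data = pd.DataFrame({
    "title": ["Movie A"] * 3 + ["Movie B"] * 2 + ["Movie C"],
    "rating": [4.0, 5.0, 4.5, 3.0, 2.0, 5.0],
})

# Mean rating and total number of ratings per title.
summary = (
    movie_data.groupby("title")["rating"]
    .agg(rating="mean", total_ratings="count")
)

# Number of movies in each mean-rating interval (the data behind a histogram).
bins = [0, 1, 2, 3, 4, 5]
movies_per_bucket = pd.cut(summary["rating"], bins=bins).value_counts().sort_index()

print(summary)
print(movies_per_bucket)
```

A table such as `summary` can then be passed to a plotting call, for example `summary["total_ratings"].plot(kind="barh")`, to draw a horizontal bar chart.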
The steps in Python model development for popularity-based recommendation are as follows:
1. Import the packages: os, numpy, pandas, and matplotlib.
2. Change the working directory to the dataset folder.
3. Read the information from the ratings file.
4. Read the information from the movies file.
5. Merge ratings_data and movie_names using the pandas built-in merge function.
6. Construct the data frame 'movie_data' and print it.
7. Plot a horizontal bar graph using the matplotlib library to get an overview of the data.
8. Plot a bar graph of the total number of reviews for each movie individually.
9. Arrange the titles in order so that the top-rated movies are recommended.
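The final ranking step can be sketched as a small function. This is one possible reading of the steps above, not necessarily the exact procedure used for the figures. The function name `recommend_popular` and its parameters `top_n` and `min_ratings` are hypothetical, and the minimum-ratings threshold is an added assumption that keeps a movie with a single high rating from topping the list.

```python
import pandas as pd


def recommend_popular(movie_data, top_n=5, min_ratings=1):
    """Rank titles by mean rating, breaking ties by number of ratings.

    min_ratings is a hypothetical threshold, not part of the listed steps.
    """
    stats = (
        movie_data.groupby("title")["rating"]
        .agg(mean_rating="mean", total_ratings="count")
        .reset_index()
    )
    # Drop titles with too few ratings to be considered popular.
    stats = stats[stats["total_ratings"] >= min_ratings]
    ranked = stats.sort_values(["mean_rating", "total_ratings"], ascending=False)
    return ranked.head(top_n).reset_index(drop=True)


# Hypothetical merged data standing in for movie_data.
movie_data = pd.DataFrame({
    "userId": [1, 2, 3, 1, 2, 3, 1],
    "title": ["Movie A", "Movie A", "Movie A", "Movie B", "Movie B",
              "Movie C", "Movie D"],
    "rating": [4.0, 5.0, 4.5, 3.0, 4.0, 5.0, 2.0],
})

top = recommend_popular(movie_data, top_n=2, min_ratings=2)
print(top)
```

Because the ranking does not depend on who is asking, every user receives the same list. This makes a popularity-based recommender cheap to compute and a reasonable default for users with no rating history.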
Figure 4: Rating and total ratings.
Figure 5:
Figure 6: Ratings with number of movies.
Figure 7:
Figure 8:
Some movies are reviewed and rated by many more users than others. For better recommendations, new rules should be added to improve the popularity prediction of a movie. In addition, newer movies may be better than existing ones; in such cases, more weight will be given to the ratings of newer movies to bring them into the recommendation list. In the future, new recommenders will be developed using evolutionary computing strategies [5-10].
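One way to give more weight to newer movies is sketched below. This is an illustrative assumption, not a method taken from this work. The function name `recency_weighted_scores`, the linear boost formula, the weight `alpha`, and the `year` column are all hypothetical, and the sample data is invented.

```python
import pandas as pd


def recency_weighted_scores(stats, alpha=0.1):
    """Boost the mean rating of newer movies.

    stats needs the columns title, mean_rating, and year.
    alpha is a hypothetical tuning weight; alpha=0 reproduces the plain ranking.
    """
    span = stats["year"].max() - stats["year"].min()
    # Recency in [0, 1]: 0 for the oldest movie, 1 for the newest.
    recency = (stats["year"] - stats["year"].min()) / span if span else 0.0
    scored = stats.assign(score=stats["mean_rating"] * (1 + alpha * recency))
    return scored.sort_values("score", ascending=False).reset_index(drop=True)


# Hypothetical per-title statistics.
stats = pd.DataFrame({
    "title": ["Old Classic", "Recent Hit", "Mid Movie"],
    "mean_rating": [4.2, 4.0, 3.0],
    "year": [1990, 2020, 2005],
})

plain = recency_weighted_scores(stats, alpha=0.0)
boosted = recency_weighted_scores(stats, alpha=0.1)
print(boosted[["title", "score"]])
```

With `alpha=0.1`, the newest movie receives at most a 10% boost, which is enough here to move it ahead of an older movie with a slightly higher mean rating. A suitable value would have to be tuned on real data.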
© 2022 Samad Shibghatullah A. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.