# Programming Assignment 4 Instructions

## Due by March 25, 2019 11:55 pm

**Goal:** In this assignment, we learn how to factorize the utility matrix to build recommender systems. We will use the MovieLens 100k Dataset. This dataset contains about 100k ratings from n = 943 users on m = 1682 movies. We will factorize the utility matrix into two matrices U and V of dimensions n × d and d × m, respectively, where d = 20.

**Input File:** Download the file ml-100k.zip and look for the file named u.data. We use only the data in this file for the factorization. DO NOT assume that users and movies are indexed consecutively from 0 to n − 1 and 0 to m − 1, respectively.

**Input Format:** Each row has four tab-separated columns of the form:

`UserId MovieId Rating Timestamp`

For example, the first line is:

`196 242 3 881250949`

which means that user 196 gave a rating of 3 to movie 242 at timestamp 881250949. For the matrix factorization approach, we will ignore the timestamp feature. It may be helpful to look at the toy dataset.
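Because the ids in the file need not be consecutive, one way to read the data is to assign each distinct UserId and MovieId its own matrix index as it is first seen. The following is a minimal sketch, not a required approach; the function name `load_ratings` and the two extra sample rows are made up for illustration (only the first row comes from u.data).

```python
import csv
import io


def load_ratings(lines):
    """Parse tab-separated rating rows into index-based triples.

    Returns (triples, user_ids, movie_ids). Each triple is
    (user_index, movie_index, rating). user_ids[i] is the original
    UserId that was assigned index i; movie_ids works the same way.
    """
    user_index, movie_index = {}, {}
    triples = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        user_id, movie_id, rating = int(row[0]), int(row[1]), float(row[2])
        # row[3] is the timestamp, which is ignored.
        # setdefault assigns the next free index the first time an id appears.
        u = user_index.setdefault(user_id, len(user_index))
        m = movie_index.setdefault(movie_id, len(movie_index))
        triples.append((u, m, rating))
    # Dicts keep insertion order, so the keys are already sorted by index.
    return triples, list(user_index), list(movie_index)


# First row is the example from u.data; the other two rows are made up.
sample = io.StringIO("196\t242\t3\t881250949\n7\t242\t5\t0\n196\t50\t4\t0\n")
triples, user_ids, movie_ids = load_ratings(sample)
```

For the real file, the same function can be called on an open file handle, for example `load_ratings(open("u.data", newline=""))`, assuming u.data sits in the working directory.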

**Output Format:** Two files, named UT.tsv and VT.tsv, corresponding to the two matrices U and V:
- UT.tsv: Each row of the file corresponds to one row of the matrix U: the first column is the UserId, and the following d columns (20 in this assignment) hold that user's row of U.
- VT.tsv: Each row of the file corresponds to one column of the matrix V: the first column is the MovieId, and the following d columns (20 in this assignment) hold that movie's column of V.

See UT.tsv and VT.tsv for sample outputs of the toy dataset with d = 2.
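The output layout described above can be sketched as follows. This is only an illustration: the helper names `factor_lines` and `write_tsv`, the six-decimal formatting, and the small matrices are assumptions, so check the sample files for the exact number format expected. Note that V is transposed before writing so that each movie gets one row.

```python
import numpy as np


def factor_lines(ids, rows):
    """Build one tab-separated line per id: the id, then its d values."""
    return [
        "\t".join([str(ident)] + [f"{x:.6f}" for x in vec])
        for ident, vec in zip(ids, rows)
    ]


def write_tsv(path, ids, rows):
    """Write the lines produced by factor_lines to a file."""
    with open(path, "w") as f:
        f.write("\n".join(factor_lines(ids, rows)) + "\n")


# Tiny made-up example with n = 2 users, m = 3 movies, d = 2.
user_ids = [196, 7]
movie_ids = [242, 50, 9]
U = np.array([[0.5, 1.0], [0.25, 2.0]])            # shape (n, d)
V = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   # shape (d, m)

ut_lines = factor_lines(user_ids, U)      # one line per row of U
vt_lines = factor_lines(movie_ids, V.T)   # one line per column of V
```

Calling `write_tsv("UT.tsv", user_ids, U)` and `write_tsv("VT.tsv", movie_ids, V.T)` would then produce the two files.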

There is only one question worth 50 points.

**Question (50 points):** Factorize the utility matrix into two matrices U and V. You should run your algorithm with T = 20 iterations. For full credit, your algorithm must run in **less than 5 minutes** with RMSE less than 0.62.
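To check the RMSE target while developing, it may help to compute the error yourself. The sketch below assumes that RMSE is measured over the observed ratings only, and that the ratings are stored as (user_index, movie_index, rating) triples with 0-based matrix indices; both are assumptions, and the function name `rmse` and the toy matrices are illustrative.

```python
import numpy as np


def rmse(triples, U, V):
    """Root-mean-square error of U @ V over the observed ratings only."""
    users = np.array([u for u, _, _ in triples])
    movies = np.array([m for _, m, _ in triples])
    ratings = np.array([r for _, _, r in triples], dtype=float)
    # The prediction for (u, m) is the dot product of row u of U
    # with column m of V; einsum computes all of them row by row.
    preds = np.einsum("ij,ij->i", U[users], V[:, movies].T)
    return float(np.sqrt(np.mean((ratings - preds) ** 2)))


# Toy check: U @ V = [[3, 4], [5, 1]], and one rating is off by 1.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[3.0, 4.0], [5.0, 1.0]])
observed = [(0, 0, 3.0), (0, 1, 4.0), (1, 0, 4.0)]
error = rmse(observed, U, V)
```

Only the three observed entries contribute here, so the squared errors are 0, 0, and 1, and the result is the square root of 1/3.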

**Note 1:** Submit your code and output data to Connex.

# FAQ

**Q1:** How do I initialize matrices U and V?

**Answer:** I initialize entries of U and V by randomly selecting numbers from [0, 1) using numpy.random.random_sample().
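As a short sketch of that initialization with the dimensions from this assignment (the fixed seed is an assumption added here only to make runs repeatable):

```python
import numpy as np

n, m, d = 943, 1682, 20

np.random.seed(0)  # assumption: fixed seed so repeated runs match

# random_sample draws uniformly from the half-open interval [0.0, 1.0).
U = np.random.random_sample((n, d))  # one row per user
V = np.random.random_sample((d, m))  # one column per movie
```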
