You will continue to add functionality to your existing project from the previous task. There is no new repository to clone.
In this task, you will work with real data related to songs and movies. You will create and store data files that you'll use in your tests in a directory named "data" in your top level project directory. We'll use three different resources to obtain data. You should use data from these files in your tests, creating smaller files from this data.
Songs + Ratings: The songs.csv file contains songs and ratings [from students in Fall 2023] as a CSV file in the format "songID,artist,title,reviewerID,rating".
Raw data downloaded from https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset?resource=download (Copyright: CC0, Public Domain).
Movies: The movies.csv file contains data for 4575 movies. The format of the file is "movieTitle,castMember0,castMember1,castMember2,etc." where there can be any number of cast members (There must be at least 1 cast member for every movie in the file).
Movie Ratings: The movie_ratings.csv file contains ratings for movies by title in the format "title,reviewerId,rating".
In this task, you will implement the following specs related to reading files matching the formats from the resources section to populate data structures of Songs and Movies.
Create a FileReader class in the ratings package. The following static methods will be implemented and tested:
A static method named readSongs that takes a String as a parameter and returns an ArrayList of Songs.
The input String is the name of a file to be read. This file will be a CSV file where each line matches the format from the songs.csv file: "songID,artist,title,reviewerID,rating".
The method returns an ArrayList of Songs containing all of the information from the input file. If a song appears in the file more than once, this means that it has been rated by multiple reviewers. In this case, only one Song object should be returned for that song, and it should contain all the ratings for this song from the file.
The Songs in the returned ArrayList may appear in any order. Your tests must accept any order of the returned Songs (this is much of the challenge in writing these tests).
When testing the linked list of Ratings for a song, you can assume that they are in the same order in which they appear in the file (this will make it simpler to test the linked list).
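The grouping rule for readSongs can be sketched as follows. This is a minimal illustration, not the required implementation: the Song and Rating classes here are hypothetical stand-ins for the classes from the previous task (a plain List replaces the linked list of ratings), the helper parseSongs is an invented name, ratings are assumed to be integers, and no field is assumed to contain a comma.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReadSongsSketch {

    // Hypothetical stand-in for the Rating class from the previous task.
    static class Rating {
        final String reviewerID;
        final int rating; // assumption: ratings are integers
        Rating(String reviewerID, int rating) {
            this.reviewerID = reviewerID;
            this.rating = rating;
        }
    }

    // Hypothetical stand-in for the Song class; ratings are kept in file order.
    static class Song {
        final String songID;
        final String artist;
        final String title;
        final List<Rating> ratings = new ArrayList<>();
        Song(String songID, String artist, String title) {
            this.songID = songID;
            this.artist = artist;
            this.title = title;
        }
    }

    // Groups lines of the form "songID,artist,title,reviewerID,rating" by songID,
    // so each song appears once and collects every rating found for it.
    static ArrayList<Song> parseSongs(List<String> lines) {
        Map<String, Song> byId = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(","); // assumes no field contains a comma
            if (parts.length < 5) {
                continue; // skip blank or malformed lines
            }
            Song song = byId.get(parts[0]);
            if (song == null) {
                song = new Song(parts[0], parts[1], parts[2]);
                byId.put(parts[0], song);
            }
            song.ratings.add(new Rating(parts[3], Integer.parseInt(parts[4].trim())));
        }
        return new ArrayList<>(byId.values());
    }

    // Reads the file and delegates to parseSongs.
    // Assumption: an unreadable file yields an empty list.
    static ArrayList<Song> readSongs(String filename) {
        try {
            return parseSongs(Files.readAllLines(Paths.get(filename)));
        } catch (IOException e) {
            return new ArrayList<>();
        }
    }
}
```

Separating the parsing from the file access is a design choice of this sketch only; it lets the grouping logic be checked with in-memory lines before any data files exist.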
A static method named readMovies that takes a String as a parameter and returns an ArrayList of Movies.
The input String is the name of a file to be read. This will be a CSV file where each line matches the format from the movies.csv file that you can download above: "movieTitle,castMember0,castMember1,castMember2,etc". There can be any number of cast members, but there must be at least one cast member per movie.
The method returns an ArrayList of Movies containing all of the information from the input file.
The Movies in the returned ArrayList may appear in any order. Your tests must accept any order of the returned movies (this is much of the challenge in writing these tests).
The cast members of each movie must be stored in its cast ArrayList in the same order in which they appear in the line for that movie. You should assume this order when testing the cast ArrayLists. (The capitalization of the cast names will be exactly as they appear in the file, so there's no need to ignore case when comparing names.)
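The line format for movies can be sketched as below: the first comma-separated field is the title and every remaining field is a cast member, kept in line order. The Movie class is a hypothetical stand-in for the class from the previous task, parseMovies is an invented helper name, and the sketch assumes that neither titles nor names contain commas.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadMoviesSketch {

    // Hypothetical stand-in for the Movie class from the previous task.
    static class Movie {
        final String title;
        final ArrayList<String> cast;
        Movie(String title, ArrayList<String> cast) {
            this.title = title;
            this.cast = cast;
        }
    }

    // Parses lines of the form "movieTitle,castMember0,castMember1,...".
    static ArrayList<Movie> parseMovies(List<String> lines) {
        ArrayList<Movie> movies = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(","); // assumes no field contains a comma
            if (parts.length < 2) {
                continue; // skip blank lines or lines without a cast member
            }
            // Everything after the title is the cast, in the order it appears.
            ArrayList<String> cast =
                new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
            movies.add(new Movie(parts[0], cast));
        }
        return movies;
    }

    // Reads the file and delegates to parseMovies.
    // Assumption: an unreadable file yields an empty list.
    static ArrayList<Movie> readMovies(String filename) {
        try {
            return parseMovies(Files.readAllLines(Paths.get(filename)));
        } catch (IOException e) {
            return new ArrayList<>();
        }
    }
}
```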
A static method named readMovieRatings that takes an ArrayList of Movies and a String as parameters. The method returns an ArrayList of Movies.
The input ArrayList of Movies will contain Movie objects with their title and cast populated. These movies should not have any ratings added to them. When your program runs, it is intended for this ArrayList to be provided by your readMovies method.
The input String is the name of a file to be read. This will be a CSV file where each line matches the format from the movie_ratings.csv file that you can download above: "title,reviewerId,rating".
The method returns an ArrayList of Movies containing the movies from the input ArrayList along with their ratings read from the input file. If a Movie from the input ArrayList does not have any ratings in the ratings file, it should not be included in the output ArrayList. If a movie has been rated that is not included in the input ArrayList, the rating should be ignored (e.g., if you find a rating for a movie that is not in the input, do not create a Movie object for the rating).
... ArrayList contains multiple movies with the same title. Any ratings for movies with the same title should be treated as multiple ratings for the same movie.
The Movies in the returned ArrayList may appear in any order. Your tests should accept any order of the returned movies.
The ratings for each Movie must be in the order in which they appear in the file. When testing the linked list of Ratings for a Movie, you can assume that they are in the same order in which they appear in the file (this will make it simpler to test the linked list).
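The matching rules for readMovieRatings can be sketched as below. This is one possible reading of the requirements, not the reference solution: Movie and Rating are hypothetical stand-ins for the classes from the previous task, addRatings is an invented helper name that works on in-memory lines, titles are assumed to match exactly and to contain no commas, and ratings are assumed to be integers. The sketch copies each rated movie rather than modifying the input objects, which is an assumption about how the input should be treated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReadMovieRatingsSketch {

    // Hypothetical stand-in for the Rating class from the previous task.
    static class Rating {
        final String reviewerID;
        final int rating; // assumption: ratings are integers
        Rating(String reviewerID, int rating) {
            this.reviewerID = reviewerID;
            this.rating = rating;
        }
    }

    // Hypothetical stand-in for the Movie class; ratings are kept in file order.
    static class Movie {
        final String title;
        final ArrayList<String> cast;
        final List<Rating> ratings = new ArrayList<>();
        Movie(String title, ArrayList<String> cast) {
            this.title = title;
            this.cast = cast;
        }
    }

    // Attaches ratings from lines of the form "title,reviewerId,rating" to the
    // movies they name. Unrated movies are left out of the result, and ratings
    // for titles that are not in the input list are ignored.
    static ArrayList<Movie> addRatings(ArrayList<Movie> movies, List<String> lines) {
        Map<String, Movie> inputByTitle = new HashMap<>();
        for (Movie m : movies) {
            inputByTitle.putIfAbsent(m.title, m);
        }
        Map<String, Movie> rated = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(","); // assumes titles contain no commas
            if (parts.length < 3) {
                continue; // skip blank or malformed lines
            }
            Movie source = inputByTitle.get(parts[0]);
            if (source == null) {
                continue; // rating for a movie that is not in the input list
            }
            Movie copy = rated.get(parts[0]);
            if (copy == null) {
                // Copy title and cast so the input Movie objects stay unrated.
                copy = new Movie(source.title, new ArrayList<>(source.cast));
                rated.put(parts[0], copy);
            }
            copy.ratings.add(new Rating(parts[1], Integer.parseInt(parts[2].trim())));
        }
        return new ArrayList<>(rated.values());
    }
}
```

A file-based readMovieRatings could read all lines of the named file and pass them to this helper together with the list returned by readMovies.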
TestFiles: Create a class named TestFiles in the tests package and write the following testing utility method in this class.
compareMovieArrayLists - Write a method named compareMovieArrayLists in the tests.TestFiles class that:
Takes two ArrayList<Movie> objects as parameters.
Checks whether the two ArrayLists contain all the same movies (same title, cast list, and ratings). The cast list and ratings must be in exactly the same order in their respective data structures, but the movies can appear in any order in their ArrayLists.
Either returns false or fails a JUnit assert if the ArrayLists do not contain exactly the same movies.
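An order-insensitive comparison of this kind can be sketched as below, using the "returns false" option rather than JUnit asserts so that the example stays self-contained. Movie and Rating are hypothetical stand-ins for the classes from the previous task, and sameMovie and sameRatings are invented helper names. Each movie in the first list is paired with a not-yet-used equal movie in the second list, so duplicates are counted correctly.

```java
import java.util.ArrayList;
import java.util.List;

class CompareMoviesSketch {

    // Hypothetical stand-in for the Rating class from the previous task.
    static class Rating {
        final String reviewerID;
        final int rating;
        Rating(String reviewerID, int rating) {
            this.reviewerID = reviewerID;
            this.rating = rating;
        }
    }

    // Hypothetical stand-in for the Movie class from the previous task.
    static class Movie {
        final String title;
        final ArrayList<String> cast;
        final List<Rating> ratings = new ArrayList<>();
        Movie(String title, ArrayList<String> cast) {
            this.title = title;
            this.cast = cast;
        }
    }

    // Ratings must match element by element, in the same order.
    static boolean sameRatings(List<Rating> a, List<Rating> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).reviewerID.equals(b.get(i).reviewerID)
                    || a.get(i).rating != b.get(i).rating) {
                return false;
            }
        }
        return true;
    }

    // Same title, same cast in the same order, same ratings in the same order.
    static boolean sameMovie(Movie a, Movie b) {
        return a.title.equals(b.title)
            && a.cast.equals(b.cast)
            && sameRatings(a.ratings, b.ratings);
    }

    // Returns true only if both lists hold the same movies, in any order.
    static boolean compareMovieArrayLists(ArrayList<Movie> expected, ArrayList<Movie> actual) {
        if (expected.size() != actual.size()) {
            return false;
        }
        boolean[] used = new boolean[actual.size()]; // marks already matched movies
        for (Movie e : expected) {
            boolean found = false;
            for (int i = 0; i < actual.size() && !found; i++) {
                if (!used[i] && sameMovie(e, actual.get(i))) {
                    used[i] = true;
                    found = true;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }
}
```

The nested loop is quadratic in the list size, which should be acceptable for the small hand-made files used in tests.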
Note: You may want to write a similar method to compare lists of songs, even though it's not required, as it may prove useful when writing tests for reading songs.
Write tests for the three methods (readSongs, readMovies, and readMovieRatings) in the tests.TestFiles class.
For the testing in this task, you will create and use data files for your tests. All of your data files must be in a directory named "data" in the root of your project. For example, if you have a test file named "movies_test_1.csv", you should write a test with the filename "data/movies_test_1.csv" and it will open the correct file from your data directory.
Note: Do not use spaces in your test filenames. Use underscores, as in the example above, if you want filenames with multiple words.
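One way to produce a smaller test file from a downloaded resource is to copy its first few lines into the data directory. The sketch below is only an illustration: the class and method names are invented, and it assumes the full movies.csv has been placed at "data/movies.csv". Hand-written files are equally valid and often make the expected results easier to reason about.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class SampleFileSketch {

    // Returns a copy of at most the first n lines.
    static List<String> firstLines(List<String> lines, int n) {
        int count = Math.min(Math.max(n, 0), lines.size());
        return new ArrayList<>(lines.subList(0, count));
    }

    // Copies the first n lines of source into dest, creating dest's directory
    // if needed. Returns false if either file cannot be read or written.
    static boolean writeSample(String source, String dest, int n) {
        try {
            List<String> sample = firstLines(Files.readAllLines(Paths.get(source)), n);
            Path out = Paths.get(dest);
            if (out.getParent() != null) {
                Files.createDirectories(out.getParent());
            }
            Files.write(out, sample);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed location of the downloaded file; adjust to your project layout.
        boolean ok = writeSample("data/movies.csv", "data/movies_test_1.csv", 5);
        System.out.println(ok ? "sample written" : "could not write sample");
    }
}
```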
Implement the readSongs, readMovies, and readMovieRatings static methods.
The feedback in Autolab will be given in 4 phases. If you don't complete a phase, then feedback for the following phase(s) will not be provided.
Once you complete all 4 phases, you will have completed this Task and Autolab will confirm this with a score of 1.0 for complete.