Using JSON
[from https://www.json.org/json-en.html] JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.
JSON is built on two structures:
- A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
Here is an example of a JSON object with three key/value pairs. The first pair has the key `"name"` and the value `"Zoe"`. Notice the use of braces (`{` and `}`), commas, and colons (the newlines are optional).
```json
{
  "name": "Zoe",
  "salary": 56000,
  "married": true
}
```
- An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence. Here is an example of an array:
```json
["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
```
Here is a JSON array of two objects:
```json
[
  {"name":"Ram", "email":"Ram@gmail.com"},
  {"name":"Bob", "email":"bob32@gmail.com"}
]
```
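To connect the object syntax above to C++, here is a minimal sketch that turns a plain struct into the same JSON text by hand. The names `Person` and `toJson` are hypothetical and chosen only for illustration, and the sketch assumes that the name contains no characters that would need escaping (such as quotes or backslashes).

```cpp
#include <string>

// Hypothetical record mirroring the three key/value pairs of the example object.
struct Person {
    std::string name;
    int salary;
    bool married;
};

// Builds the JSON text for one Person.
// Assumption: p.name contains no characters that require escaping.
std::string toJson(const Person& p) {
    return "{\"name\":\"" + p.name + "\"," +
           "\"salary\":" + std::to_string(p.salary) + "," +
           "\"married\":" + (p.married ? "true" : "false") + "}";
}
```

Calling `toJson(Person{"Zoe", 56000, true})` produces the object shown earlier, written on a single line without optional whitespace. In practice a JSON library takes care of escaping and formatting, so this is only meant to show where the braces, commas, and colons go.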
We will be using a file of movie information that is organized as a list of JSON objects, one per line (I got it from https://oracleofbacon.org/how.php). Each line holds the information for one movie, formatted as a JSON object such as:
{"title":"Return of the Jedi","cast":["Mark Hamill","Harrison Ford","Carrie Fisher","Billy Dee Williams","Anthony Daniels","David Prowse","Kenny Baker","Peter Mayhew","Frank Oz","Ian McDiarmid","Alec Guinness"],"directors":["Richard Marquand"],"producers":["Howard Kazanjian"],"companies":["Lucasfilm Ltd.","20th Century Fox"],"year":1983}
Typically, a JSON file is read into a program and converted to a dictionary, i.e., an abstract data type that allows efficient storage of and access to key/value pairs. For instance, if we stored the previous JSON object in a dictionary called `D`, we could get the value of the key `"year"` by using `D["year"]`.
```c++
// ... after some magic instructions to read the JSON object from
// ... the file into an object called D in your program
cout << D["year"];    // would print 1983
cout << D["title"];   // would print "Return of the Jedi"
cout << D["cast"][0]; // would print "Mark Hamill"
```
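The dictionary idea can also be tried out with the standard library alone, before the JSON library enters the picture. The following is a minimal sketch, assuming we only store the list-valued keys of the example movie; the names `ListDict`, `makeMovieLists`, and `firstOf` are hypothetical, and the cast list is shortened.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical alias: a dictionary from a key to a list of strings.
using ListDict = std::unordered_map<std::string, std::vector<std::string>>;

// Builds a dictionary with some list-valued keys of the example movie.
ListDict makeMovieLists() {
    ListDict D;
    D["cast"] = {"Mark Hamill", "Harrison Ford", "Carrie Fisher"};  // shortened
    D["directors"] = {"Richard Marquand"};
    D["producers"] = {"Howard Kazanjian"};
    return D;
}

// Returns the first value stored under key, or an empty string
// when the key is missing or its list is empty.
std::string firstOf(const ListDict& D, const std::string& key) {
    auto it = D.find(key);
    if (it == D.end() || it->second.empty()) {
        return "";
    }
    return it->second.front();
}
```

With this sketch, `firstOf(makeMovieLists(), "cast")` returns the first cast member. Unlike a JSON object, a `std::unordered_map` needs all of its values to have the same type, which is one reason a dedicated JSON library is more convenient for mixed data such as strings, numbers, and lists.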
Download the file at: https://oracleofbacon.org/data.txt.bz2 and decompress it.
As far as I know, the C++ STL does not provide functions for reading and manipulating JSON files. Therefore, we will be using an external library called "JSON for Modern C++" (https://github.com/nlohmann/json). To install it, you must first install the following programs on your computer: `git` and `cmake`. If you are using Windows, I strongly advise that you do your software development for this course in a VM running some flavor of Linux (I prefer Ubuntu).
Once you have `git` and `cmake`, here is how to install the "JSON for Modern C++" library:
```sh
git clone https://github.com/nlohmann/json
cd json
mkdir build
cd build
cmake ..
make
```
To test if the JSON library is working properly you can try compiling this program:
```c++
#include <nlohmann/json.hpp>
#include <fstream>
#include <iostream>
using nlohmann::json;
using namespace std;

// This program reads the JSON file and prints the titles of all the movies.
int main(int argc, char **argv) {
    if (argc < 2) {
        cout << "Usage: " << argv[0] << " JSONFileName" << endl;
        return 1;
    }
    ifstream f(argv[1]);
    if (!f) {
        cerr << "Could not open " << argv[1] << endl;
        return 1;
    }
    json jsonObj;
    try {
        // Each extraction parses the next JSON object from the file.
        while (f >> jsonObj) {
            cout << jsonObj["title"] << endl;
        }
    }
    catch (...) {
        // Just a catch-all to skip the error raised at the end of the file.
    }
    return 0;
}
```
To compile it, use:
```sh
g++ -o nameofexec nameofyourprogram.cpp -I[path of the json dir/include] -std=c++11
```
For example, if your program is named `test.cpp` and the JSON library is in `/Users/rarce/code/json`, then:
```sh
g++ -o test test.cpp -I/Users/rarce/code/json/include -std=c++11
```
When you run `./test data.txt`, you will see a long list of movie titles (400K+):
```
"Actrius"
"Army of Darkness"
"The Birth of a Nation"
"Blade Runner"
....
```
Exercises:
- Modify the C++ program to print the years of the movies. You can check whether a key has a value by comparing the result of accessing it against `nullptr`, for example (where `e` is a JSON object): `if (e["year"] != nullptr) cout << e["year"];`
- Modify the C++ program to print the number of movies per year. Use a Direct Address Table to perform the counting. You can safely assume that the years of the movies are in the range [1800-2025] (yes, 2025).
- Create a C++ program that reads each movie's title and year into an array of objects of a class such as this:
```c++
class titleYear {
public:
  string title;
  unsigned short year;
};
```
Then, sort the array by year. You do not need to implement a sorting algorithm; read up on how to use STL's `sort` algorithm on objects. Print the results.
- Which movie(s) has (have) the largest cast?
- Find which actor(s) is (are) cast in the most movies.
- Which director has directed the most (unique) actors? "Unique" means that even though Quentin Tarantino has worked many times with Uma Thurman, she only counts once.
- Find which actor(s) spent the longest time between two movies. For example, Carrie Fisher spent 5 years between "White Lightnin'" (2009) and "Maps to the Stars" (2014), but she is not the actor who has spent the most years between movies.
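
Two of the techniques named in the exercises, sorting objects with STL's `sort` and counting with a Direct Address Table, can be sketched without the JSON library. The following is only a starting point under stated assumptions: the struct `TitleYear`, the constants, and the functions `sortByYear` and `countPerYear` are hypothetical names, and the movies are assumed to be already loaded into a `std::vector`.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Hypothetical record, modeled on the class suggested in the exercises.
struct TitleYear {
    std::string title;
    unsigned short year;
};

// Year range assumed by the exercises: [1800-2025].
constexpr int kFirstYear = 1800;
constexpr int kLastYear = 2025;
constexpr int kNumYears = kLastYear - kFirstYear + 1;

// Sorts the movies by ascending year using std::sort with a comparison lambda.
void sortByYear(std::vector<TitleYear>& movies) {
    std::sort(movies.begin(), movies.end(),
              [](const TitleYear& a, const TitleYear& b) {
                  return a.year < b.year;
              });
}

// Direct Address Table: slot (year - kFirstYear) holds the count for that year.
// Years outside the assumed range are ignored.
std::array<int, kNumYears> countPerYear(const std::vector<TitleYear>& movies) {
    std::array<int, kNumYears> counts{};  // all slots start at zero
    for (const TitleYear& m : movies) {
        if (m.year >= kFirstYear && m.year <= kLastYear) {
            ++counts[m.year - kFirstYear];
        }
    }
    return counts;
}
```

The table works because the year itself, shifted by `kFirstYear`, serves directly as the array index, so each update takes constant time and no searching or hashing is needed. Filling the vector from the JSON file is left to the exercises.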