For this worksheet, we need the following packages along with a connection to the Chinook database.

library(RSQLite)
library(sqldf)
library(DBI)
chinook_db <- dbConnect(SQLite(), "../Chinook_Sqlite.sqlite")

Problem 1

  1. What is the total length in minutes of all tracks with GenreId = 2? Answer with a query returning the following:
dbGetQuery(chinook_db, "select GenreId, sum(milliseconds) / 60000 as minutes from track t
                          where t.genreid = 2")
##   GenreId minutes
## 1       2     632
  1. Using a join, add the name of the genre to the above query.
dbGetQuery(chinook_db, "select g.Name, g.GenreId, sum(milliseconds) / 60000 as minutes from track t
                          inner join genre g on t.genreid = g.genreid
                          where t.genreid = 2")
##   Name GenreId minutes
## 1 Jazz       2     632
  1. Using a group by, find this information for all genres. I’ve provided just the first 5 records of the output.
dbGetQuery(chinook_db, "select g.Name, g.GenreId, sum(milliseconds) / 60000 as minutes from track t
                          inner join genre g on t.genreid = g.genreid
                          group by t.genreid
                          limit 5")
##                 Name GenreId minutes
## 1               Rock       1    6137
## 2               Jazz       2     632
## 3              Metal       3    1930
## 4 Alternative & Punk       4    1296
## 5      Rock And Roll       5      26

Problem 2

Who is the support rep of the customer named Heather Leacock? Answer this by returning the following relation using a single query. Make sure your field names match mine by using aliases.

dbGetQuery(chinook_db, "select e.EmployeeId as RepId,
                               e.FirstName as RepFirstName,
                               e.LastName as RepLastName,
                               c.CustomerId,
                               c.FirstName as CustFirstName, 
                               c.LastName as CustLastName
                        from customer c inner join employee e 
                            on c.supportrepid = e.employeeid
                        where customerid = 22")
##   RepId RepFirstName RepLastName CustomerId CustFirstName CustLastName
## 1     4     Margaret        Park         22       Heather      Leacock

Problem 3: Distinct

The following command is often useful: distinct selects records with unique field values from a table.

For example, here is how we can get the unique prices of each track.

dbGetQuery(chinook_db, "select distinct UnitPrice from Track")
##   UnitPrice
## 1      0.99
## 2      1.99

We can see tracks cost each 0.99 or 1.99, two different values. To get the count directly we can use the following

dbGetQuery(chinook_db, "select count(distinct UnitPrice) from Track")
##   count(distinct UnitPrice)
## 1                         2
  1. How many difference countries are the customers from?
dbGetQuery(chinook_db, "select count(distinct country) from customer")
##   count(distinct country)
## 1                      24
  1. Retrieve the unique titles of the employees.
dbGetQuery(chinook_db, "select distinct title from employee")
##                 Title
## 1     General Manager
## 2       Sales Manager
## 3 Sales Support Agent
## 4          IT Manager
## 5            IT Staff
  1. The following determines how many unique genres there are. Was distinct necessary here? Explain.
dbGetQuery(chinook_db, "select count(distinct GenreId) from Genre")
##   count(distinct GenreId)
## 1                      25

distinct was not necessary since GenreId is the primary key of the Genre table and primary keys are guaranteed to be unique.