clientsdb - A Docker image with client comments
Have you ever looked for a ready-to-use database while running a training session? Search no more: this Docker image ships a client review database loaded into PostgreSQL, ready to be used for teaching.
About the dataset
The titles and comments are extracted from this Google Drive link that contains “amazon_review_full_csv.tar.gz”, which I discovered on the Amazon review database Kaggle page. The two columns date and name were then randomly generated in R.
Here is the code used to generate the full table:
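library(data.table)
# Read the raw reviews (the file has no header row)
dataset <- fread("data/train.csv", header = FALSE, sep = ",")
names(dataset) <- c("score", "title", "comment")
# One random "first last" name per row, drawn from charlatan's internal name lists
nms <- paste(
  sample(charlatan:::person_en_us$first_names, nrow(dataset), TRUE),
  sample(charlatan:::person_en_us$last_names, nrow(dataset), TRUE)
)
# One random timestamp per row, between 1970-01-01 and 2010-01-01
date <- sample(0:as.numeric(as.POSIXct("2010-01-01")), nrow(dataset), TRUE)
date <- as.POSIXct(date, origin = "1970-01-01")
# Remove score, then add the generated name and date columns by reference
dataset[
  , `:=`(
    score = NULL,
    name = nms,
    date = date
  )
]
data.table::fwrite(dataset, "datasetwithusers.csv")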
Launch and use
The main purpose of this image is to provide a “real life” tool for teaching how to use databases.
Info:
- the POSTGRES_DB used is clients
- the POSTGRES_PASSWORD is verysecretwow
- the POSTGRES_USER is superduperuser
To launch the db, do:
# Might take some time to warm up
docker run --rm -d -p 5432:5432 --name clientsdb colinfay/clientsdb:latest
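Because the database might take some time to warm up, a connection attempt made right after docker run can fail. Here is a minimal sketch of a retry loop in R, assuming the credentials listed above. wait_until and db_is_up are hypothetical helper names, not part of the image or of DBI; DBI::dbCanConnect() is the DBI function used to probe the connection.

```r
# Hypothetical helper: call `check()` until it returns TRUE
# or the number of tries is exhausted. Errors count as "not ready yet".
wait_until <- function(check, tries = 30, delay = 2) {
  for (i in seq_len(tries)) {
    ok <- tryCatch(isTRUE(check()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (i < tries) Sys.sleep(delay)
  }
  FALSE
}

# Hypothetical readiness check for the clientsdb container,
# using the database name, user and password listed above.
db_is_up <- function() {
  DBI::dbCanConnect(
    RPostgres::Postgres(),
    dbname = "clients",
    host = "localhost",
    port = 5432,
    user = "superduperuser",
    password = "verysecretwow"
  )
}

# Usage (needs the container to be running):
# wait_until(db_is_up)
```

The defaults (30 tries, 2 seconds apart) are arbitrary; adjust them to how long the container takes to start on your machine.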
Then, for example from R:
library(DBI)
con <- dbConnect(
RPostgres::Postgres(),
dbname = 'clients',
host = 'localhost',
port = 5432,
user = 'superduperuser',
password = 'verysecretwow'
)
dbListTables(con)
[1] "clients"
res <- dbSendQuery(con, "SELECT score, title, name, date FROM clients LIMIT 5")
dbFetch(res)
score title
1 4 bastante bueno...,,,OK
2 4 underrated !!!!
3 5 HENRY MANCINI'S MUSICAL SCORING IS A HIT, ALONG WITH SOLID SCRIPTS
4 3 Jury Still Out
5 1 Complete release info
name date
1 Yareli Koch 1992-01-21
2 Lyle Turcotte 1984-10-25
3 Luvenia Vandervort 1975-04-18
4 Diego Walter 2001-12-25
5 Jaron Jakubowski 1998-05-12
dbClearResult(res)
res <- dbSendQuery(con, "SELECT title, name, date FROM clients WHERE date = '1998-05-12' LIMIT 10")
dbFetch(res)
title name date
1 Complete release info Jaron Jakubowski 1998-05-12
2 Fun but extremely poor made!! Velvet Hand 1998-05-12
3 Boring if work in the industry.... Sol Gerlach 1998-05-12
4 our state magazine Zebulon Reichel 1998-05-12
5 Loin cloth? Linwood Beier 1998-05-12
6 The Best Ceola Heaney 1998-05-12
7 My favorite book Eura Jacobs 1998-05-12
8 ok story from the hartnell era Raphael Moore 1998-05-12
9 Couldn't put the book down! Capitola Huel 1998-05-12
10 car essential oil diffuser - great gift Elwyn Von 1998-05-12
dbClearResult(res)
dbDisconnect(con)
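The dbSendQuery() / dbFetch() / dbClearResult() sequence used above can also be written as a single dbGetQuery() call, and the date can be passed as a query parameter instead of being pasted into the SQL string. The sketch below assumes the RPostgres `$1` placeholder syntax and the clients table shown above; reviews_on is a hypothetical helper name introduced for illustration.

```r
# Hypothetical helper: fetch up to 10 reviews written on a given day.
# `$1` is a placeholder; its value is supplied through `params`,
# so the day never has to be interpolated into the SQL text.
reviews_on <- function(con, day) {
  DBI::dbGetQuery(
    con,
    "SELECT title, name, date FROM clients WHERE date = $1 LIMIT 10",
    params = list(day)
  )
}

# Usage (needs the container to be running):
# con <- DBI::dbConnect(
#   RPostgres::Postgres(),
#   dbname = "clients", host = "localhost", port = 5432,
#   user = "superduperuser", password = "verysecretwow"
# )
# reviews_on(con, "1998-05-12")
# DBI::dbDisconnect(con)
```

Parameterized queries are also a handy teaching point, since they avoid building SQL by string concatenation.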
And then stop the db.
docker stop clientsdb