
Tidy Tuesday Horror

Load the Data and Check Duplicates

library(tidyverse)
library(lubridate)
library(kableExtra)
library(ggridges)

# the raw file contains fully duplicated rows
dat <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-10-22/horror_movies.csv") %>%
  distinct(.) # removes complete duplicates

# check for duplicated titles (keep every copy, not just the repeats)
dup_title <- dat %>%
  filter(duplicated(title) | duplicated(title, fromLast = TRUE)) %>%
  arrange(title)

# on inspection, these appear to be different movies that share a title:
# no two of them have the same plot
dup_title %>%
  filter(duplicated(plot))

## # A tibble: 0 x 12
## # … with 12 variables: title <chr>, genres <chr>, release_date <chr>,
## #   release_country <chr>, movie_rating <chr>, review_rating <dbl>,
## #   movie_run_time <chr>, plot <chr>, cast <chr>, language <chr>,
## #   filming_locations <chr>, budget <chr>

# same-title movies that also share a release date
dup_title %>%
  filter(duplicated(release_date) | duplicated(release_date, fromLast = TRUE))

## # A tibble: 2 x 12
##   title genres release_date release_country movie_rating review_rating
##   <chr> <chr>  <chr>        <chr>           <chr>                <dbl>
## 1 The … Comed… 21-Jul-15    USA             <NA>                    5.