Solving #AdventOfCode day 5 and 6 with R

5 minute(s) read

Solving the puzzles of Advent of Code with R.

[Disclaimer] Obviously, this post contains a big spoiler about Advent of Code, as it gives solutions for solving days quoted in the title.

Advent of Code

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. About Advent of Code

Day five, part one

The first part of the challenge for day 3 starts well: we have to read in R a file with A LOT of letters :)

vec <- readLines("input5.txt")
substr(vec, 1, 30)
## [1] "MmMmMxcRrCmMAaEeiIMmopZzPHxEeH"

This character vector is a polymer, and it describes chemical reactions.

The polymer is formed by smaller units which, when triggered, react with each other such that two adjacent units of the same type and opposite polarity are destroyed. Units’ types are represented by letters; units’ polarity is represented by capitalization. For instance, r and R are units with the same type but opposite polarity, whereas r and s are entirely different types and do not react.

So the first question was:

How many units remain after fully reacting the polymer you scanned?

Let’s use a regex to do that :) First, we’ll create a regex that describes any possible combination of upper/lower letters:

regex <- paste0(paste0(letters, LETTERS), "|", paste0(LETTERS, letters), collapse = "|")
regex
## [1] "aA|Aa|bB|Bb|cC|Cc|dD|Dd|eE|Ee|fF|Ff|gG|Gg|hH|Hh|iI|Ii|jJ|Jj|kK|Kk|lL|Ll|mM|Mm|nN|Nn|oO|Oo|pP|Pp|qQ|Qq|rR|Rr|sS|Ss|tT|Tt|uU|Uu|vV|Vv|wW|Ww|xX|Xx|yY|Yy|zZ|Zz"
# Check that the regex is there
grepl(regex, vec)
## [1] TRUE

So, we’ll need to run a loop that removes this regex until there is nothing to remove anymore. The idea is to get the number of characters before the gsub, perform the sub, and when the removal is done, count if the number of characters from before the subtraction is different from the number of characters after (i.e, if it is the same, there is nothing to gsub anymore).

continue <- TRUE
while (continue) {
  old_size <- nchar(vec)
  vec <- gsub(regex, "", vec)
  continue <- nchar(vec) != old_size
}

# Check that the regex is not there anymore
grepl(regex, vec)
## [1] FALSE
nchar(vec)
## [1] 9288

Day five, part two

With part 2, we need to try to first remove, one by one, the couples of units (for example “aA”), and react the polymer without each couple. Then, we need to find:

What is the length of the shortest polymer you can produce by removing all units of exactly one type and fully reacting the result?

Let’s start by putting our last code in a function:

react <- function(vec,
                   regex =  paste0(paste0(letters, LETTERS), "|",
                                  paste0(LETTERS, letters), collapse = "|")){
  continue <- TRUE
  while (continue) {
    old_size <- nchar(vec)
    vec <- gsub(regex, "", vec)
    continue <- nchar(vec) != old_size
  }
  vec
}

Let’s try with the examples from the website:

nchar(react("dbcCCBcCcD"))
## [1] 6
nchar(react("daAcCaCAcCcaDA"))
## [1] 8
nchar(react("dabAaBAaDA"))
## [1] 4
nchar(react("abAcCaCBAcCcaA"))
## [1] 6

Now we’ll combine pattern removal and reaction:

clean_and_react <- function(vec, pattern){
  react( gsub(pattern, "", vec) )
}

clean_and_react("dabAcCaCBAcCcaDA", "a|A")
## [1] "dbCBcD"

Then, a function to get a tibble with: extracted pattern, and number of characters:

library(tidyverse)
clean_and_react_and_count <- function(vec, pattern){
  tibble(
    pattern = pattern,
    nchars = nchar(clean_and_react(vec, pattern))
  )
}
clean_and_react_and_count("dabAcCaCBAcCcaDA", "a|A")
## # A tibble: 1 x 2
##   pattern nchars
##   <chr>    <int>
## 1 a|A          6

As it should take some time, let’s use {furrr} to do our calculation:

library(furrr)
## Loading required package: future
plan(multiprocess)
res <- future_map_dfr(paste0(LETTERS,"|", letters), ~ clean_and_react_and_count(vec, .x))

What’s the best solution?

top_n(res, -1)
## Selecting by nchars

## # A tibble: 1 x 2
##   pattern nchars
##   <chr>    <int>
## 1 F|f       5844

Day six, part one

Now we’re working with distance calculation. We’ve been provided a bunch of coordinates. Once we have put these on a grid, we need to fill the “empty cells” with the reference to the closest coordinate from our input, by calculating the shortest manhattan distance.

day6 <- read_csv("input6.txt", col_names = c("V1", "V2"))
## Parsed with column specification:
## cols(
##   V1 = col_integer(),
##   V2 = col_integer()
## )
day6$id <- as.character(1:50)
day6
## # A tibble: 50 x 3
##       V1    V2 id
##    <int> <int> <chr>
##  1   315   342 1
##  2    59   106 2
##  3    44   207 3
##  4    52    81 4
##  5   139   207 5
##  6    93   135 6
##  7   152   187 7
##  8   271    47 8
##  9   223   342 9
## 10    50   255 10
## # ... with 40 more rows

First of all, let’s get a list of all the “empty cells” we mentioned before:

all_comb <- expand.grid(
  min(day6$V1):max(day6$V1),
  min(day6$V2):max(day6$V2)
) %>% as_tibble()

The Manhattan distance function:

manat_dist <- function(x, y){
  abs(x - y)
}

A function to find the closer ID, given an x and a y:

closest_id <- function(x, y, df = day6){
  df %>%
    mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) ) %>%
    top_n(-1, dist) %>%
    pull(id)
}

# Apply it on all our combination
cl <- future_pmap_chr(all_comb, function(...){
  x <- closest_id(..1, ..2)
  if (length(x) > 1) {
    NA
  } else {
    x
  }
})

And now, time to answer the puzzle:

What is the size of the largest area that isn’t infinite?

An infinite area is defined by the fact that at least one of its element is on the edge of the grid (hence equal to the min or max of V1 or V2).

all_comb <- all_comb %>%
  mutate(
    closest = cl,
    max1 = max(Var1),
    max2 = max(Var2),
    min1 = min(Var1),
    min2 = min(Var2),
  )

# Get if each row are on the border
is_border <- pmap_lgl(all_comb, function(...){
  ..1 == ..4 | ..1 == ..6 | ..2 == ..5 | ..2 == ..7
})

all_comb %>%
  mutate(is_border = is_border) %>%
  group_by(closest) %>%
  mutate(is_bord = any(is_border)) %>%
  filter(!is_bord) %>%
  count(closest, sort = TRUE) %>%
  ungroup() %>%
  top_n(1)
## Selecting by n

## # A tibble: 1 x 2
##   closest     n
##   <chr>   <int>
## 1 13       4290

Day six, part two

Now, we need to compute the distance of each point on the grid to each coordinates, and to answer this question:

What is the size of the region containing all locations which have a total distance to all given coordinates of less than 10000?

In other word, we get the distance to each coordinate on each cell, and keep only the one with a total Manhattan distance which is less than 1000:

# Create a function to compute all distances
all_dist <- function(x, y, df = day6){
  df %>%
    mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) )
}

# And use it on the all_comb table
future_pmap(all_comb, function(...){
  all_dist(..1, ..2)
}) %>%
  map_int(~sum(.x$dist)) %>%
  keep(~ .x < 10000) %>%
  length()
## [1] 37318

What do you think?