
Compute string distance the tidy way. Built on top of the stringdist package.

Install tidystringdist

You can install the development version from GitHub with:

devtools::install_github("ColinFay/tidystringdist")

The stable version is available with:
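A minimal sketch, assuming the stable release is the one published on CRAN under the same package name:

```r
# Assumption: the stable release is distributed through CRAN
install.packages("tidystringdist")
```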

tidystringdist basic workflow

tidycomb

First, you need to create a tibble with the combinations of words you want to compare. You can do this with the tidy_comb and tidy_comb_all functions. The first takes a base word and combines it with each element of a list or a column of a data.frame. The second builds all the possible pairs from a list or a column.
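As a rough illustration of the two helpers, the sketch below builds both kinds of tibble from character vectors. The sample words are made up, and the argument order of tidy_comb (the strings first, then the base word) is an assumption, not something stated above.

```r
library(tidystringdist)

# Hypothetical sample data
names_to_compare <- c("Albertine", "Gilberte", "Odette", "Charles")

# Every possible pair drawn from the vector
all_pairs <- tidy_comb_all(names_to_compare)

# One base word combined with each element of the vector
# (assumed argument order: strings first, base word second)
against_base <- tidy_comb(names_to_compare, "Paris")

print(all_pairs)
print(against_base)
```

Both calls return a two-column tibble, one row per pair, ready to be passed on to the distance step.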

If you already have a data.frame with two columns containing the strings to compare, you can skip this part.

tidy_stringdist

Once you’ve got this data.frame, you can use tidy_stringdist() to compute string distances. This function takes a data.frame, the two columns containing the strings to compare, and one or more stringdist methods.

Note that if you’ve used tidy_comb or tidy_comb_all to create your data.frame, you won’t need to set the column names.
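The sketch below covers both situations: a tibble coming from tidy_comb_all, where the columns are picked up automatically, and a hand-made data.frame, where they are passed explicitly. It assumes the exported function is spelled tidy_stringdist and that the column arguments are named v1 and v2 and take unquoted column names. The data and the column names left and right are hypothetical.

```r
library(tidystringdist)

# Case 1: columns created by tidy_comb_all are found by default
pairs <- tidy_comb_all(c("Albertine", "Gilberte", "Odette", "Charles"))
default_cols <- tidy_stringdist(pairs)

# Case 2: an existing data.frame with its own column names
# (hypothetical columns `left` and `right`)
my_pairs <- data.frame(
  left = c("kitten", "flaw"),
  right = c("sitting", "lawn"),
  stringsAsFactors = FALSE
)
custom_cols <- tidy_stringdist(my_pairs, v1 = left, v2 = right)

print(default_cols)
print(custom_cols)
```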

By default, the call computes all the methods. To compute only specific ones, use the method argument:
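A minimal sketch, assuming the function is spelled tidy_stringdist, that v1 and v2 name the string columns, and that method accepts a character vector of stringdist method names ("lv" for Levenshtein and "jw" for Jaro-Winkler are used here). The sample data is hypothetical.

```r
library(tidystringdist)

# Hypothetical pairs of strings to compare
my_pairs <- data.frame(
  left = c("kitten", "flaw"),
  right = c("sitting", "lawn"),
  stringsAsFactors = FALSE
)

# Restrict the computation to two stringdist methods
res <- tidy_stringdist(my_pairs, v1 = left, v2 = right, method = c("lv", "jw"))

print(res)
```

Limiting the methods keeps the result narrow and avoids computing distances you do not plan to use.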