New R Package — proustr
proustr is now on CRAN.
An R Package for Marcel Proust’s A La Recherche Du Temps Perdu
This package gives you access to all the books in Marcel Proust’s “À la recherche du temps perdu”. The collection is divided into books, which are in turn divided into volumes. It was inspired by the janeaustenr package by Julia Silge.
All books have been downloaded from BEQ.
Here is a list of all the books contained in this package:
- Du côté de chez Swann (1913): 2 volumes, ducotedechezswann1 and ducotedechezswann2.
- À l’ombre des jeunes filles en fleurs (1919): 3 volumes, alombredesjeunesfillesenfleurs1, alombredesjeunesfillesenfleurs2, and alombredesjeunesfillesenfleurs3.
- Le Côté de Guermantes (1920-1921): 3 volumes, lecotedeguermantes1, lecotedeguermantes2, and lecotedeguermantes3.
- Sodome et Gomorrhe I et II (1921-1922): 2 volumes, sodomeetgomorrhe1 and sodomeetgomorrhe2.
- La Prisonnière (1923): 2 volumes, laprisonniere1 and laprisonniere2.
- Albertine disparue (1925, also known as La Fugitive): albertinedisparue.
- Le Temps retrouvé (1927): 2 volumes, letempretrouve1 and letempretrouve2.
Install proustr
Install this package directly from R:
From CRAN:
install.packages("proustr")
From GitHub:
devtools::install_github("ColinFay/proustr")
Examples
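# Install the stopwords package from GitHub (loaded below)
devtools::install_github("ThinkRstat/stopwords")
library(proustr)
library(tidytext)
library(tidyverse)
library(stopwords)
proust_books() %>%
  # Replace typographic apostrophes with spaces, so "l’ombre" becomes "l ombre"
  mutate(text = stringr::str_replace_all(text, "’", " ")) %>%
  # One row per word
  unnest_tokens(word, text) %>%
  # Drop French stop words
  filter(!word %in% stopwords_iso$fr) %>%
  # Keep the 10 most frequent words
  count(word, sort = TRUE) %>%
  head(10)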
# A tibble: 10 x 2
   word           n
   <chr>      <int>
 1 mme         3106
 2 faire       2869
 3 albertine   2389
 4 grand       1833
 5 guermantes  1807
 6 vie         1732
 7 temps       1715
 8 swann       1682
 9 jamais      1639
10 voir        1568
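To make the steps of the pipeline above easier to follow, here is a minimal base R sketch of the same idea on a few toy phrases. It is only an illustration: `toy_text`, `toy_stopwords`, and `count_words()` are hypothetical names invented for this example, the stop word list is a tiny hand-made one rather than `stopwords_iso$fr`, and the tokenizer is a simple split on non-letters rather than what `unnest_tokens()` does.

```r
# Toy input (hypothetical, ASCII only apart from the typographic apostrophe,
# written as the escape \u2019 so the script does not depend on file encoding)
toy_text <- c(
  "l\u2019ombre des jeunes filles en fleurs",
  "la recherche du temps perdu",
  "le temps"
)

# Tiny hand-made stop word list (assumption: stands in for stopwords_iso$fr)
toy_stopwords <- c("l", "la", "le", "des", "du", "en")

count_words <- function(text, stopwords) {
  # Step 1: replace the typographic apostrophe with a space,
  # so that "l’ombre" becomes "l ombre"
  text <- gsub("\u2019", " ", text, fixed = TRUE)
  # Step 2: lowercase, then split on anything that is not a letter a-z
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  tokens <- tokens[nzchar(tokens)]
  # Step 3: drop stop words
  tokens <- tokens[!tokens %in% stopwords]
  # Step 4: count occurrences, most frequent first
  sort(table(tokens), decreasing = TRUE)
}

counts <- count_words(toy_text, toy_stopwords)
print(counts)
```

In this toy run "temps" comes out on top with a count of 2, and the elided article "l" is removed only because the apostrophe was replaced before splitting.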
Contact
Questions and feedback are welcome!