WikiLeaks Twitter DMs


On 29 July 2018, Emma Best published a copy of more than 11,000 WikiLeaks Twitter DMs on her website.

This page presents an extraction and wrangling of that corpus, intended to make it easy to search, extract and share.

How to use this page

  • Every “link.csv” is a downloadable CSV file.
  • You can search and sort every table. Search results can be downloaded as CSV or copied to the clipboard.
  • You can zoom in on a time series by selecting a date range on the plot, or by using the range selector beside it. Double-click to reset the view.
  • Under each dynamic plot, you can find a static plot by clicking on “Static plot”.

This page may not work as expected on Internet Explorer / Edge. Please switch to another browser if you have trouble reading this page.

Data format

  • Every CSV is encoded in UTF-8.
  • The same datasets are available in JSON format on the GitHub repo.
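Since the files are UTF-8, it can help to declare the encoding when reading them rather than relying on the system locale. The following is a minimal sketch in base R, not the author's own code. The file written here is a tiny stand-in so the example runs on its own, and its column names (`text`, `date`, `user`) are taken from the dataset descriptions on this page.

```r
# Write a tiny stand-in CSV so the example is self-contained;
# in practice, set `path` to one of the downloaded CSV files.
path <- tempfile(fileext = ".csv")
writeLines(c("text,date,user", "hello,2015-05-01,user_a"), path)

# Declare the encoding explicitly instead of relying on the locale.
dms <- read.csv(path, fileEncoding = "UTF-8", stringsAsFactors = FALSE)

# Dates are read as character strings; convert them for time-based work.
dms$date <- as.Date(dms$date)
```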

Browse through the content

  • Home has the full dataset, to search and download.
  • Timeline has time-related content, notably DMs by year and the daily count of DMs.
  • Users holds the dataset for each user.
  • mentions_urls holds the extracted mentions and URLs.
  • methodo describes the methodology used for the data wrangling.

Count of daily DMs

A dataset with 2 columns:

  • date: the date
  • n: number of DMs
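A table of this shape could be derived from the full dataset roughly as follows. This is an illustrative base R sketch under stated assumptions, not necessarily how the published file was produced (see methodo for that). The sample rows and user names are made up.

```r
# Hypothetical sample rows with the text, date and user columns.
dms <- data.frame(
  text = c("hello", "hi", "see link"),
  date = as.Date(c("2015-05-01", "2015-05-01", "2015-05-02")),
  user = c("user_a", "user_b", "user_a"),
  stringsAsFactors = FALSE
)

# Count how many DMs were sent on each date.
counts <- table(dms$date)

# Reshape into the two documented columns: date and n.
daily <- data.frame(
  date = as.Date(names(counts)),
  n = as.integer(counts)
)
```

Note that dates with no DMs do not appear in `daily` with this approach; they would have to be added separately if a continuous series is needed.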


Static plot

DMs by year

3 datasets (1 per year), each with 3 columns:

  • text: extracted text
  • date: date of the DM
  • user: user who sent the DM
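One way to obtain per-year datasets like these from a single table is to split on the year part of the date. The sketch below uses base R with made-up rows and years. It illustrates the idea only and is not taken from the author's code.

```r
# Hypothetical sample rows; years and users are illustrative only.
dms <- data.frame(
  text = c("hello", "hi", "see link"),
  date = as.Date(c("2015-05-01", "2016-03-10", "2016-07-22")),
  user = c("user_a", "user_b", "user_a"),
  stringsAsFactors = FALSE
)

# Split into a named list with one data frame per calendar year.
by_year <- split(dms, format(dms$date, "%Y"))

# Each element keeps the same three columns: text, date, user.
names(by_year)
```

Each element of `by_year` could then be written out with `write.csv()` to produce one file per year.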



Static plot





All processing was done in R.

The methodology is described in methodo.