Wikileaks Twitter DM

About

On 29 July 2018, Emma Best published a copy of more than 11,000 WikiLeaks Twitter DMs on her website: https://emma.best/2018/07/29/11000-messages-from-private-wikileaks-chat-released/

This page presents an extraction and wrangling of that corpus, to make it easy to search, extract and share.

How to use this page

  • Every “link.csv” is a downloadable csv.
  • You can search and sort every table. Search results can be downloaded as csv or copied to the clipboard.
  • You can zoom in on a time series by selecting a date range on the plot, or by using the selector beside it. Double-click to reset the view.
  • Under each dynamic plot, you can find a static plot by clicking on “Static plot”.

This page may not work as expected on Internet Explorer / Edge. Please switch to another browser if you have trouble reading this page.

Data format

  • Every csv is encoded in UTF-8.
  • The same datasets are available in JSON format on the GitHub repo.
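
As a rough illustration of working with these files outside the page, here is a minimal Python sketch (the page itself was built in R, so this is not the author's code). It decodes UTF-8 bytes, parses them as csv and re-serialises them as JSON. The column names follow the per-user datasets described further down; the sample rows, the date format and the function names `read_dm_csv` and `to_json` are assumptions made up for the example.

```python
import csv
import io
import json

# Stand-in for the bytes of a downloaded file; these rows are invented
# placeholders, not messages from the corpus.
SAMPLE_CSV = (
    "text,date,user\n"
    '"hello, world",2015-05-01,user_a\n'
    '"caf\u00e9 talk",2015-05-02,user_a\n'
)


def read_dm_csv(raw_bytes):
    """Decode UTF-8 bytes and parse them into a list of dicts, one per DM."""
    text = raw_bytes.decode("utf-8")
    return list(csv.DictReader(io.StringIO(text, newline="")))


def to_json(rows):
    """Serialise the rows as JSON, keeping non-ASCII characters readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


rows = read_dm_csv(SAMPLE_CSV.encode("utf-8"))
print(to_json(rows))
```

To read a real download, the bytes would come from opening the file in binary mode instead of from `SAMPLE_CSV`.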

Browse through the content

  • Home has the full dataset, to search and download.
  • Timeline has a series of time-related content: notably DMs by years, and daily count of DMs.
  • Users holds the dataset for each user.
  • mentions_urls holds the extracted mentions and urls
  • methodo contains the methodology used for the data wrangling

Note: As documented in methodo, the DMConversationEntry records have no date in the dataset. Their date is inferred from the directly preceding date, so the dates shown for these entries may be inaccurate.
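
The inference described in this note amounts to carrying the last known date forward. The Python sketch below shows that idea only; it is not the R code from methodo, and the function name `infer_missing_dates`, the `date_inferred` flag and the sample entries are hypothetical.

```python
def infer_missing_dates(entries):
    """Return copies of the entries, filling a missing date with the last date seen.

    An entry that comes before any dated entry keeps date=None.
    """
    filled = []
    last_date = None
    for entry in entries:
        has_date = bool(entry.get("date"))
        if has_date:
            last_date = entry["date"]
        # Flag inferred dates so they can be filtered out or treated with care.
        filled.append({**entry, "date": last_date, "date_inferred": not has_date})
    return filled


# Invented placeholder entries, in corpus order.
entries = [
    {"user": "user_a", "date": "2015-05-01", "text": "first"},
    {"user": "DMConversationEntry", "date": None, "text": "undated"},
    {"user": "user_b", "date": "2015-05-03", "text": "third"},
]
filled = infer_missing_dates(entries)
```

Keeping a flag such as `date_inferred` makes it possible to exclude the inferred dates from any time-based analysis.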

Monthly participation by user

Static plot
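
A count like the one behind this plot can be approximated from the full dataset by grouping on month and user. This is a hedged Python sketch, not the code used to draw the plot: it assumes dates are strings that begin with `YYYY-MM`, and `monthly_participation` and the sample rows are invented for illustration.

```python
from collections import Counter


def monthly_participation(rows):
    """Count DMs per (month, user) pair; rows without a date are skipped."""
    return Counter(
        (row["date"][:7], row["user"]) for row in rows if row.get("date")
    )


# Invented placeholder rows.
sample_rows = [
    {"text": "a", "date": "2015-05-01", "user": "user_a"},
    {"text": "b", "date": "2015-05-20", "user": "user_a"},
    {"text": "c", "date": "2015-06-02", "user": "user_b"},
    {"text": "d", "date": None, "user": "user_b"},
]
monthly_counts = monthly_participation(sample_rows)
```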

Count of user participation

A dataset with 2 columns

  • user: the user
  • n: number of DMs in the corpus for that user

user_count.csv
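
For readers who want to recompute this table from the full dataset, a minimal Python sketch follows. It assumes each row carries a `user` field, as in the per-user datasets; the function name `count_by_user` and the sample rows are placeholders, and this is not the original R code.

```python
from collections import Counter


def count_by_user(rows):
    """Return one {"user", "n"} record per user, most active first."""
    counts = Counter(row["user"] for row in rows)
    return [{"user": user, "n": n} for user, n in counts.most_common()]


# Invented placeholder rows.
dm_rows = [
    {"user": "user_a", "text": "one"},
    {"user": "user_b", "text": "two"},
    {"user": "user_a", "text": "three"},
]
user_count = count_by_user(dm_rows)
```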

DMs by users

15 datasets (one per user), each with 3 columns:

  • text: extracted text
  • date: date of the DM
  • user: user who sent the DM
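
One way to produce such per-user files from the full dataset is sketched below in Python. The file-naming rule (prefix `user_`, spaces replaced by dots) is inferred from the file names listed on this page and is an assumption, as are the function names `file_name_for` and `split_by_user` and the sample rows; the original files were generated in R.

```python
import csv
import io
from collections import defaultdict

COLUMNS = ["text", "date", "user"]


def file_name_for(user):
    """Build a file name such as user_WISE.Up.Action.csv (assumed naming rule)."""
    return "user_" + user.replace(" ", ".") + ".csv"


def split_by_user(rows):
    """Group rows by user and return {file name: csv content as a string}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["user"]].append(row)

    files = {}
    for user, user_rows in groups.items():
        buffer = io.StringIO(newline="")
        writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(user_rows)
        files[file_name_for(user)] = buffer.getvalue()
    return files


# Invented placeholder rows; only the user names come from this page.
all_rows = [
    {"text": "hi", "date": "2015-05-01", "user": "Emmy B"},
    {"text": "hello", "date": "2015-05-01", "user": "noll"},
    {"text": "again", "date": "2015-05-02", "user": "Emmy B"},
]
files = split_by_user(all_rows)
```

The contents are kept in memory here; writing each value to disk with `encoding="utf-8"` would match the encoding stated under Data format.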

Bean

user_Bean.csv

Static plot

Cabledrum

user_Cabledrum.csv

Static plot

DMConversationEntry

user_DMConversationEntry.csv

Note: As documented in methodo, the DMConversationEntry records have no date in the dataset. Their date is inferred from the directly preceding date, so the dates shown for these entries may be inaccurate.

Static plot

Emmy B

user_Emmy.B.csv

Static plot

LibertarianLibrarian

user_LibertarianLibrarian.csv

Static plot

M

user_M.csv

Static plot

Matt Watt

user_Matt.Watt.csv

Static plot

noll

user_noll.csv

Static plot

SAWC Sydney

user_SAWC.Sydney.csv

Static plot

voidiss

user_voidiss.csv

Static plot

WikiLeaks Press

user_WikiLeaks.Press.csv

Static plot

WikiLeaks Task Force

user_WikiLeaks.Task.Force.csv

Static plot

WikiLeaks

user_WikiLeaks.csv

Static plot

WISE Up Action

user_WISE.Up.Action.csv

Static plot

WISE Up Wales

user_WISE.Up.Wales.csv

Static plot

Methodology

Everything has been done in R.

The methodology is described in methodo.