Chapter 9 Reading Excel spreadsheets

The most common R data import/export question seems to be ‘how do I read an Excel spreadsheet’. This chapter collects together advice and options given earlier. Note that most of the advice is for pre-Excel 2007 spreadsheets and not the later .xlsx format.

The first piece of advice is to avoid doing so if possible! If you have access to Excel, export the data you want from Excel in tab-delimited or comma-separated form, and use read.delim or read.csv to import it into R. (You may need to use read.delim2 or read.csv2 in a locale that uses comma as the decimal point.) Exporting a DIF file and reading it using read.DIF is another possibility.

If you do not have Excel, many other programs are able to read such spreadsheets and export in a text format on both Windows and Unix, for example Gnumeric (http://www.gnome.org/projects/gnumeric/) and OpenOffice (https://www.openoffice.org). You can also cut-and-paste between the display of a spreadsheet in such a program and R: read.table will read from the R console or, under Windows, from the clipboard (via file = “clipboard” or readClipboard). The read.DIF function can also read from the clipboard.

Note that an Excel .xls file is not just a spreadsheet: such files can contain many sheets, and the sheets can contain formulae, macros and so on. Not all readers can read other than the first sheet, and may be confused by other contents of the file.

Windows users (of 32-bit R) can use odbcConnectExcel in package RODBC. This can select rows and columns from any of the sheets in an Excel spreadsheet file (at least from Excel 97–2003, depending on your ODBC drivers: by calling odbcConnect directly versions back to Excel 3.0 can be read). The version odbcConnectExcel2007 will read the Excel 2007 formats as well as earlier ones (provided the drivers are installed, including with 64-bit Windows R: see RODBC). macOS users can also use RODBC if they have a suitable driver (e.g. that from Actual Technologies).

Perl users have contributed a module OLE::SpreadSheet::ParseExcel and a program xls2csv.pl to convert Excel 95–2003 spreadsheets to CSV files. Package gdata provides a basic wrapper in its read.xls function. With suitable Perl modules installed this function can also read Excel 2007 spreadsheets.

Packages dataframes2xls and WriteXLS each contain a function to write one or more data frames to an .xls file, using Python and Perl respectively.

Packages xlsx can read and and manipulate Excel 2007 and later spreadsheets: it requires Java.

Package XLConnect can read, write and manipulate both Excel 97–2003 and Excel 2007/10 spreadsheets, using Java.

Package readxl can read both Excel 97–2003 and Excel 2007/10 spreadsheets, using an included C library.