Chapter 9 Reading Excel spreadsheets
The most common R data import/export question seems to be ‘how do I read an Excel spreadsheet’. This chapter collects together advice and options given earlier. Note that most of the advice is for pre-Excel 2007 spreadsheets and not the later .xlsx format.
The first piece of advice is to avoid doing so if possible! If you have access to Excel, export the data you want from Excel in tab-delimited or comma-separated form, and use read.delim
or read.csv
to import it into R. (You may need to use read.delim2
or read.csv2
in a locale that uses comma as the decimal point.) Exporting a DIF file and reading it using read.DIF
is another possibility.
If you do not have Excel, many other programs are able to read such spreadsheets and export in a text format on both Windows and Unix, for example Gnumeric (http://www.gnome.org/projects/gnumeric/) and OpenOffice (https://www.openoffice.org). You can also cut-and-paste between the display of a spreadsheet in such a program and R: read.table
will read from the R console or, under Windows, from the clipboard (via file = “clipboard”
or readClipboard
). The read.DIF
function can also read from the clipboard.
Note that an Excel .xls file is not just a spreadsheet: such files can contain many sheets, and the sheets can contain formulae, macros and so on. Not all readers can read other than the first sheet, and may be confused by other contents of the file.
Windows users (of 32-bit R) can use odbcConnectExcel
in package RODBC. This can select rows and columns from any of the sheets in an Excel spreadsheet file (at least from Excel 97–2003, depending on your ODBC drivers: by calling odbcConnect
directly versions back to Excel 3.0 can be read). The version odbcConnectExcel2007
will read the Excel 2007 formats as well as earlier ones (provided the drivers are installed, including with 64-bit Windows R: see RODBC). macOS users can also use RODBC if they have a suitable driver (e.g. that from Actual Technologies).
Perl
users have contributed a module OLE::SpreadSheet::ParseExcel
and a program xls2csv.pl
to convert Excel 95–2003 spreadsheets to CSV files. Package gdata provides a basic wrapper in its read.xls
function. With suitable Perl
modules installed this function can also read Excel 2007 spreadsheets.
Packages dataframes2xls and WriteXLS each contain a function to write one or more data frames to an .xls file, using Python and Perl respectively.
Packages xlsx can read and and manipulate Excel 2007 and later spreadsheets: it requires Java.
Package XLConnect can read, write and manipulate both Excel 97–2003 and Excel 2007/10 spreadsheets, using Java.
Package readxl can read both Excel 97–2003 and Excel 2007/10 spreadsheets, using an included C library.