ExcelReaders.jl

ExcelReaders is a package that provides functionality to read Excel files.

WARNING: Version v0.9.0 removed all support for DataFrames.jl from this package. The ExcelFiles.jl package now provides functionality to read data from an Excel file into a DataFrame (or any other table type), and users are encouraged to use that package for tabular data going forward. Version v0.9.0 also no longer uses DataArrays.jl, but instead is based on DataValues.jl.

Installation

Use Pkg.add("ExcelReaders") in Julia to install ExcelReaders and its dependencies.

The package uses the Python xlrd library. If either Python or the xlrd package is not installed on your Mac or Windows system, the package will use the Conda.jl package to install all necessary dependencies automatically. On any other system you can either install Python and xlrd yourself or instruct PyCall to use Conda.jl to manage its own Python installation (run ENV["PYTHON"]=""; Pkg.build("PyCall"), then restart Julia).
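
The Conda.jl route described above might look like the following sketch. The `using Pkg` line is an assumption for Julia 0.7 and later, where Pkg has to be loaded explicitly; the remaining calls are the ones named in this section.

```julia
using Pkg  # assumption: Julia 0.7 or later, where Pkg is loaded explicitly

# Install ExcelReaders together with its dependencies (including PyCall).
Pkg.add("ExcelReaders")

# An empty PYTHON setting asks PyCall to use a Python managed by Conda.jl.
ENV["PYTHON"] = ""

# Rebuild PyCall so that the new setting is picked up.
Pkg.build("PyCall")

# Restart Julia before running `using ExcelReaders`.
```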

Alternatives

The Taro package also provides Excel file reading functionality. The main difference between the two packages (in terms of Excel functionality) is that ExcelReaders uses the Python package xlrd for its processing, whereas Taro uses the Java packages Apache Tika and Apache POI.

Basic usage

The most basic usage is this:

using ExcelReaders

data = readxl("Filename.xlsx", "Sheet1!A1:C4")

This will return an array with all the data in the cell range A1 to C4 on Sheet1 in the Excel file Filename.xlsx.
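
As a small sketch of working with the result, assuming it behaves like an ordinary two-dimensional Julia array indexed as [row, column] and that Filename.xlsx exists with a sheet named Sheet1:

```julia
using ExcelReaders

data = readxl("Filename.xlsx", "Sheet1!A1:C4")

# The range A1:C4 spans 4 rows (1 to 4) and 3 columns (A to C).
nrows, ncols = size(data)

# First row of the range, i.e. cells A1, B1 and C1.
first_row = data[1, :]

# Second column of the range, i.e. cells B1 through B4.
second_col = data[:, 2]
```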

If you expect to read multiple ranges from the same Excel file, you can get much better performance by opening the file only once:

using ExcelReaders

f = openxl("Filename.xlsx")

data1 = readxl(f, "Sheet1!A1:C4")
data2 = readxl(f, "Sheet2!B4:F10")
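
The same pattern extends to a list of ranges. The sketch below is illustrative only: the `ranges` and `results` names are hypothetical, and the file is assumed to contain both sheets.

```julia
using ExcelReaders

# Open the workbook once and reuse the handle for every range.
f = openxl("Filename.xlsx")

# Hypothetical list of ranges to read from the same file.
ranges = ["Sheet1!A1:C4", "Sheet2!B4:F10"]

# Map each range string to the array that readxl returns for it.
results = Dict(r => readxl(f, r) for r in ranges)
```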

Reading a whole sheet

The readxlsheet function reads complete Excel sheets without requiring an explicit cell range. The most basic usage is:

using ExcelReaders

data = readxlsheet("Filename.xlsx", "Sheet1")

This will read all content on Sheet1 in the file Filename.xlsx. Any blank rows and columns at the top and left are skipped. readxlsheet takes a number of optional keyword arguments:

  • skipstartrows accepts either :blanks (default) or a positive integer. With :blanks, any empty initial rows are skipped. An integer skips as many rows as specified.
  • skipstartcols accepts either :blanks (default) or a positive integer. With :blanks, any empty initial columns are skipped. An integer skips as many columns as specified.
  • nrows accepts either :all (default) or a positive integer. With :all, all rows (except skipped ones) are read. An integer specifies the exact number of rows to be read.
  • ncols accepts either :all (default) or a positive integer. With :all, all columns (except skipped ones) are read. An integer specifies the exact number of columns to be read.
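
The keyword arguments can be combined. The following is a sketch with illustrative values, assuming that Sheet1 in Filename.xlsx starts with a two-row title block and holds at least ten further rows and three columns of data:

```julia
using ExcelReaders

# Skip the first two rows, then read exactly 10 rows and 3 columns.
# skipstartcols is left at its default (:blanks), so empty leading
# columns are still skipped automatically.
data = readxlsheet("Filename.xlsx", "Sheet1", skipstartrows=2, nrows=10, ncols=3)
```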

readxlsheet also accepts an ExcelFile (as obtained from openxl) as its first argument.
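
A minimal sketch of that form, assuming the file contains sheets named Sheet1 and Sheet2 and that the keyword arguments are accepted in the same way when an ExcelFile is passed:

```julia
using ExcelReaders

# Open the workbook once, then read several whole sheets from it.
f = openxl("Filename.xlsx")

sheet1 = readxlsheet(f, "Sheet1")

# Assumption: keyword arguments work with an ExcelFile as they do with a filename.
sheet2 = readxlsheet(f, "Sheet2", skipstartrows=1)
```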