NAME

Spreadsheet::ParseExcel::Stream - Simple interface to Excel data with less memory overhead

SYNOPSIS

use Spreadsheet::ParseExcel::Stream;

my $xls = Spreadsheet::ParseExcel::Stream->new($xls_file, \%options);
while ( my $sheet = $xls->sheet() ) {
  while ( my $row = $sheet->row ) {
    my @data = @$row;   # formatted cell values of this row
  }
}

DESCRIPTION

A simple iterative interface to Spreadsheet::ParseExcel, similar to Spreadsheet::ParseExcel::Simple, that does not load the entire document into memory. It uses the hints provided in the Spreadsheet::ParseExcel docs to reduce memory usage, and returns the data row by row and sheet by sheet.

It can also parse XLSX files via Spreadsheet::XLSX, but there is no memory saving for that format.

METHODS

new

my $xls = Spreadsheet::ParseExcel::Stream->new($xls_or_xlsx_file, \%options);

Opens the spreadsheet and returns an object to iterate through the data.

Accepts an optional hashref with the following keys:

Type

Specifies the type (XLSX or XLS) of the document so that the appropriate library is used to parse it. If this option is omitted, the module tries to determine the spreadsheet type itself and uses Spreadsheet::ParseExcel::Stream::XLS or Spreadsheet::ParseExcel::Stream::XLSX accordingly. You may also use either of those libraries directly instead of specifying this option.

Password

Password used to decrypt XLS documents. This option is passed on to Spreadsheet::ParseExcel.

TrimEmpty

If true, trims leading empty columns. The number of columns trimmed is the smallest number of leading empty columns found in any row, so every row shifts left by the same amount. E.g. if row 1 has data in columns B, C, and D, and row 2 has data in C, D, and E, then row 1 will shift to A, B, and C, and row 2 will shift to B, C, and D.

Not implemented for XLSX files.
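
The trimming rule can be illustrated in plain Perl, independent of this module. The sketch below is not the module's implementation; trim_leading_empty is a hypothetical helper, and treating undef or the empty string as an empty cell is an assumption made for the illustration.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper, not part of Spreadsheet::ParseExcel::Stream.
# Each row is an array ref; undef or '' is treated as an empty cell.
sub trim_leading_empty {
    my @rows = @_;

    # Count the leading empty cells of every row.
    my @lead = map {
        my $row = $_;
        my $n   = 0;
        $n++ while $n < @$row && !( defined $row->[$n] && length $row->[$n] );
        $n;
    } @rows;

    # Shift every row left by the smallest of those counts.
    my $shift = @lead ? min(@lead) : 0;
    return map { [ @{$_}[ $shift .. $#{$_} ] ] } @rows;
}

my @trimmed = trim_leading_empty(
    [ undef, 'b1', 'c1', 'd1' ],           # data in B, C, D
    [ undef, undef, 'c2', 'd2', 'e2' ],    # data in C, D, E
);

# $trimmed[0] now holds ( 'b1', 'c1', 'd1' )          i.e. columns A, B, C
# $trimmed[1] now holds ( undef, 'c2', 'd2', 'e2' )   i.e. columns B, C, D
```

Both rows move left by one column, because one is the smallest number of leading empty columns across the two rows.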

BindColumns

Accepts a reference to an array of references to scalars, and calls bind_columns on that list.

sheet

Returns the next worksheet of the workbook.

row

Returns the next row of data from the current worksheet. The data is the formatted contents of each cell as returned by the $cell->value() method of Spreadsheet::ParseExcel.

If a true argument is passed in, returns the current row of data without advancing to the next row.

unformatted

Returns the next row of data from the current worksheet as returned by the $cell->unformatted() method of Spreadsheet::ParseExcel.

If a true argument is passed in, returns the current row of data without advancing to the next row.

next_row

Returns the next row of cells from the current worksheet as Spreadsheet::ParseExcel cell objects.

If a true argument is passed in, returns the current row without advancing to the next row.
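
As a rough sketch of how next_row might be used, the loop below prints both the formatted and the unformatted value of each cell. The file name report.xls is hypothetical, and the sketch assumes that next_row returns an array reference (as row does in the SYNOPSIS) and that empty cells may come back as undef.

```perl
use strict;
use warnings;
use Spreadsheet::ParseExcel::Stream;

# 'report.xls' is a hypothetical file name.
my $xls = Spreadsheet::ParseExcel::Stream->new('report.xls');

while ( my $sheet = $xls->sheet ) {
    print "Sheet: ", $sheet->name, "\n";

    # Assumption: next_row returns an array ref of cell objects.
    while ( my $cells = $sheet->next_row ) {
        for my $cell (@$cells) {
            next unless defined $cell;    # assumption: empty cells may be undef

            # value() and unformatted() are Spreadsheet::ParseExcel cell methods.
            printf "  formatted=%s unformatted=%s\n",
                $cell->value       // '',
                $cell->unformatted // '';
        }
    }
}
```

Working with cell objects is useful when both representations of a value are needed, for example a date as displayed and its underlying serial number.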

name

Returns the name of the current worksheet.

bind_columns

Accepts a list of references to scalars. Binds the output of the row, unformatted, and next_row methods to those scalars, unless the 'current row' argument to those methods is true.

If output is bound, those methods return a simple true value, instead of a reference to an array, when there is a next row.
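
A minimal sketch of column binding follows. The file name inventory.xls and the three-column layout are hypothetical, and the sketch assumes bind_columns is called on the sheet object, like row.

```perl
use strict;
use warnings;
use Spreadsheet::ParseExcel::Stream;

# 'inventory.xls' and its three-column layout are assumptions for this sketch.
my $xls = Spreadsheet::ParseExcel::Stream->new('inventory.xls');

while ( my $sheet = $xls->sheet ) {
    # \my (...) produces a list of references to the newly declared scalars.
    $sheet->bind_columns( \my ( $item, $qty, $price ) );

    # With columns bound, row() returns a plain true value for each row
    # and fills the bound scalars instead of returning an array ref.
    while ( $sheet->row ) {
        printf "%s: %s at %s\n",
            map { defined $_ ? $_ : '' } $item, $qty, $price;
    }

    $sheet->unbind_columns;
}
```

Binding avoids dereferencing and indexing an array for every row, which can make loops over wide sheets easier to read.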

unbind_columns

Unbinds any scalars bound with bind_columns().

worksheet

Returns the current worksheet as a Spreadsheet::ParseExcel object.

AUTHOR

Douglas Wilson, <dougw@cpan.org>

BUGS AND LIMITATIONS

For spreadsheets created with Spreadsheet::WriteExcel without using $wb->compatibility_mode(), this module will read rows of a spreadsheet out of order if the rows were written out of order, and the TrimEmpty option of this module will not work correctly.
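
On the writing side, a file that avoids this limitation could be produced roughly as follows. The file name and cell contents are hypothetical; note that Spreadsheet::WriteExcel requires compatibility_mode() to be called before add_worksheet().

```perl
use strict;
use warnings;
use Spreadsheet::WriteExcel;

# 'out_of_order.xls' is a hypothetical file name.
my $workbook = Spreadsheet::WriteExcel->new('out_of_order.xls')
    or die "Cannot create workbook: $!";

# Must be called before add_worksheet() to take effect.
$workbook->compatibility_mode();

my $worksheet = $workbook->add_worksheet();

# Rows are deliberately written out of order.
$worksheet->write( 2, 0, 'third row' );
$worksheet->write( 0, 0, 'first row' );
$worksheet->write( 1, 0, 'second row' );

$workbook->close();
```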

COPYRIGHT AND LICENSE

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.

SEE ALSO

Spreadsheet::ParseExcel, Spreadsheet::ParseExcel::Simple