NAME
Spreadsheet::Reader::ExcelXML::XMLReader::NamedSharedStrings - Name based sharedStrings Reader
SYNOPSIS
#!/usr/bin/env perl
use Data::Dumper;
use MooseX::ShortCut::BuildInstance qw( build_instance );
use Spreadsheet::Reader::ExcelXML::Workbook;# Needed for the workbook_inst below
use Spreadsheet::Reader::ExcelXML::XMLReader;
use Spreadsheet::Reader::ExcelXML::XMLReader::NamedSharedStrings;
use Spreadsheet::Reader::ExcelXML::SharedStrings;
# Build an XMLReader subclass with the NamedSharedStrings role applied
my $file_instance = build_instance(
    package       => 'SharedStringsInstance',
    workbook_inst => Spreadsheet::Reader::ExcelXML::Workbook->new,
    superclasses  => [
        'Spreadsheet::Reader::ExcelXML::XMLReader'
    ],
    add_roles_in_sequence => [
        'Spreadsheet::Reader::ExcelXML::XMLReader::NamedSharedStrings',
    ],
);
DESCRIPTION
This documentation is written to explain ways to use this module when writing your own Excel parser or extending this package. To use the general package for Excel parsing out of the box please review the documentation for Workbooks, Worksheets, and Cells.
This role is written to extend Spreadsheet::Reader::ExcelXML::XMLReader. It adds functionality to read name based sharedStrings files and presents that functionality in compliance with the top level interface. This POD only describes the functionality incrementally provided by this module. For an overview of sharedStrings.xml reading see Spreadsheet::Reader::ExcelXML::SharedStrings.
WARNING
If your Excel 2003 xml based file does not include a SharedStrings portion then ignore this warning, since it will not matter. I don't have an example of an Excel 2003 xml file that has SharedStrings content. I'm not even sure that any generators build flat SpreadsheetML files with a SharedStrings subsection. As a consequence, this role is just a placeholder to allow the rest of the package to work on Excel 2003 xml files. If you are actually parsing an xml file that contains a SharedStrings portion, your parse will die with a request to submit an issue on the github repo. Please include the file that is failing. I will need an example in order to complete this section of the parser.
Requires
These are the methods required by this role and their default provider. All methods are imported straight across with no re-naming.
"set_error" in Spreadsheet::Reader::ExcelXML::Error
"good_load" in Spreadsheet::Reader::ExcelXML::XMLReader
"close_the_file" in Spreadsheet::Reader::ExcelXML::XMLReader
"advance_element_position" in Spreadsheet::Reader::ExcelXML::XMLReader
"start_the_file_over" in Spreadsheet::Reader::ExcelXML::XMLReader
"parse_element" in Spreadsheet::Reader::ExcelXML::XMLReader
"squash_node" in Spreadsheet::Reader::ExcelXML::XMLReader
"current_named_node" in Spreadsheet::Reader::ExcelXML::XMLReader
"get_group_return_type" in Spreadsheet::Reader::ExcelXML::Workbook
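As a rough illustration of the list above, a consumer could confirm up front that a candidate class supplies every required method. The sketch below uses only core Perl. My::StubReader is a hypothetical stand-in, not part of this distribution, and it deliberately omits one method so the check has something to report.

```perl
use strict;
use warnings;

# Method names copied from the Requires list above.
my @required = qw(
    set_error good_load close_the_file advance_element_position
    start_the_file_over parse_element squash_node current_named_node
    get_group_return_type
);

# Hypothetical stand-in class (an assumption for illustration only).
# It defines every required method except get_group_return_type.
{
    package My::StubReader;
    no strict 'refs';
    for my $name ( grep { $_ ne 'get_group_return_type' } @required ) {
        *{"My::StubReader::$name"} = sub { return };
    }
}

# Return the required methods that $class cannot resolve.
sub missing_methods {
    my ( $class ) = @_;
    return grep { !$class->can( $_ ) } @required;
}

my @missing = missing_methods( 'My::StubReader' );
print "Missing: @missing\n";
```

Running a check like this before applying the role gives a readable list of gaps rather than a single composition failure.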
Methods
These are the primary ways to use this class. For additional SharedStrings options see the Attributes section.
get_shared_string( $name )
Definition: This is the primary method that needs an example for completion.
Accepts: $name = the node name of the shared string to be returned
Returns: dies with a message to submit the file to my github repo
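Because get_shared_string currently dies, calling code may want to trap the failure instead of letting the whole parse stop. The sketch below shows one way to do that with a block eval. My::StubSharedStrings is a hypothetical stand-in for an instance that consumes this role, and its error text is invented for the example; the real message will differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an instance built with this role.
# The message text here is made up, not the module's actual wording.
{
    package My::StubSharedStrings;
    sub new { return bless {}, shift }
    sub get_shared_string {
        my ( $self, $name ) = @_;
        die "Named shared strings are not supported yet (requested '$name')\n";
    }
}

# Call get_shared_string and turn a fatal error into a
# ( undef, $message ) return so the caller can decide what to do.
sub try_shared_string {
    my ( $instance, $name ) = @_;
    my $result;
    my $ok = eval { $result = $instance->get_shared_string( $name ); 1 };
    return $ok ? ( $result, undef ) : ( undef, $@ || 'unknown error' );
}

my ( $value, $error ) =
    try_shared_string( My::StubSharedStrings->new, 'si1' );
print defined $error ? "Trapped: $error" : "Got a value\n";
```

The `eval { ...; 1 }` form is used so that success is detected from the return value of the eval rather than from inspecting `$@` alone.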
load_unique_bits
Definition: When the xml file first loads this is available to pull customized data. It mostly pulls metadata and stores it in hidden attributes for use later. If all goes according to plan it sets "good_load" in Spreadsheet::Reader::ExcelXML::XMLReader to 1.
Accepts: Nothing
Returns: Nothing
Attributes
Data passed to new when creating an instance of this class. For modification of this attribute see the listed 'attribute methods'. For more information on attributes see Moose::Manual::Attributes. The easiest way to modify this attribute is when a class instance is created and before it is passed to the workbook or parser.
cache_positions
Definition: Especially for sheets with lots of stored text, the parser can slow down considerably when accessing each position. This is because the text is not always stored sequentially and the reader is a JIT linear parser: to go back, it must restart and index through each position until it reaches the right place. This is especially true for Excel sheets that have experienced any significant level of manual intervention prior to being read. This attribute turns on caching for shared strings (the default) so the parser only has to read through the shared strings once. When the read has gone all the way to the end, it also releases the shared strings file to free up some space (a small win in exchange for the space taken by the cache). The trade-off is that all intermediate shared strings are fully read before the target string is read, so early reads will be slower. For sheets that store only numbers, or at least very few strings, this will likely not be a noticeable initial hit (or speed improvement). To minimize the physical size of the cache, if only a text string is stored in a shared strings position then only the string is cached (not as the value of a raw_text hash key). It is then reconstituted into a hashref when requested.
Default: 1 = caching is on
Range: 1|0
Attribute required: yes
attribute methods: Methods provided to adjust this attribute
none - (will be autoset by "cache_positions" in Spreadsheet::Reader::ExcelXML)
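The compact storage described in the cache_positions definition can be sketched in plain Perl. This is a minimal illustration under stated assumptions, not the module's internal code: the names %cache, cache_store, and cache_fetch are hypothetical, and rich_text is used only as an example of a second hash key.

```perl
use strict;
use warnings;

my %cache;    # position => plain string or full hashref

# Store a parsed shared string. When raw_text is the only key,
# keep just the string to reduce the size of the cache.
sub cache_store {
    my ( $position, $node ) = @_;
    my @keys = keys %$node;
    $cache{$position} =
        ( @keys == 1 && $keys[0] eq 'raw_text' )
        ? $node->{raw_text}
        : $node;
    return;
}

# Fetch a cached entry, rebuilding the hashref for compact entries.
sub cache_fetch {
    my ( $position ) = @_;
    return if !exists $cache{$position};
    my $stored = $cache{$position};
    return ref( $stored ) ? $stored : { raw_text => $stored };
}

cache_store( 0, { raw_text => 'Hello' } );
cache_store( 1, { raw_text => 'World', rich_text => [ 2, 'bold' ] } );

print ref( $cache{0} ) ? "hashref\n" : "plain string\n";
print cache_fetch( 0 )->{raw_text}, "\n";
```

Callers always receive a hashref from cache_fetch, so the compact form stays an internal detail of the cache.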
SUPPORT
TODO
1. Nothing yet
AUTHOR
Jed Lund
jandrew@cpan.org
COPYRIGHT
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.
This software is copyrighted (c) 2016 by Jed Lund
DEPENDENCIES
Spreadsheet::Reader::ExcelXML - the package
SEE ALSO
Spreadsheet::Read - generic Spreadsheet reader
Spreadsheet::ParseExcel - Excel binary version 2003 and earlier (.xls files)
Spreadsheet::XLSX - Excel version 2007 and later
Spreadsheet::ParseXLSX - Excel version 2007 and later
All lines in this package that use Log::Shiras are commented out