NAME
Meta::Xml::Parsers::Dbdata - parser which imports into a database.
COPYRIGHT
Copyright (C) 2001, 2002 Mark Veltzer; All rights reserved.
LICENSE
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
DETAILS
MANIFEST: Dbdata.pm
PROJECT: meta
VERSION: 0.20
SYNOPSIS
package foo;
use Meta::Xml::Parsers::Dbdata qw();
my($def_parser)=Meta::Xml::Parsers::Dbdata->new();
# $file holds the path of the xml file to import.
$def_parser->parsefile($file);
DESCRIPTION
This parser helps you import xml files into a database. We use a streaming parser rather than a DOM type object because, in theory, xml files can be very large, and we do not want to follow the naive algorithm of reading all the data into RAM and only then importing it, since the RAM requirements may be heavy. Currently the import is done record by record: the parser recognizes the end of each record and then issues the insert statement.
Options that still need to be added are:
0. insertion only at the end of the entire read (back to the DOM module) - a user could use this if the database is known to be small enough to fit into RAM.
1. insertion after each field - a user could use this for tables in which each field is very large (I wonder if that is possible..:).
In addition to all of the above, this parser prepares the statements to be executed, because each statement is executed many times and preparing it once is much more efficient.
A weird thing here is that the Expat parser will not call the 'Char' handler if an element contains no data. Maybe I should use another handler?
* An important feature is that this parser imports into a list of databases handled by a connection object.
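The record-by-record approach described above can be sketched with plain XML::Parser handlers. This is only a minimal illustration and not this module's actual code: the element names (table, record, field), the name attribute and the import_records function are assumptions made for the example, and a callback stands in for the prepared insert statement. Setting each field to an empty string in the Start handler works around the 'Char' handler not being called for empty elements, and appending in the Char handler copes with text that arrives in several chunks.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical layout assumed for this sketch:
# <table name="t"><record><field name="f">text</field>...</record></table>
sub import_records {
    my ($xml, $on_record) = @_;
    my ($table, $record, $field);    # current parse state
    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $element, %attr) = @_;
                if ($element eq 'table') {
                    $table = $attr{name};
                } elsif ($element eq 'record') {
                    $record = {};
                } elsif ($element eq 'field') {
                    $field = $attr{name};
                    # Start empty so a field with no text still gets a value.
                    $record->{$field} = '';
                }
            },
            Char => sub {
                my ($expat, $text) = @_;
                # Text may arrive in several calls, so append it.
                $record->{$field} .= $text if defined $field;
            },
            End => sub {
                my ($expat, $element) = @_;
                if ($element eq 'field') {
                    undef $field;
                } elsif ($element eq 'record') {
                    # One complete record: hand it over, then forget it.
                    $on_record->($table, $record);
                    undef $record;
                }
            },
        },
    );
    $parser->parse($xml);
    return;
}

# Usage: collect the records instead of inserting them into a database.
my @rows;
import_records(
    '<table name="movie"><record><field name="title">Brazil</field>'
        . '<field name="note"/></record></table>',
    sub { my ($table, $record) = @_; push @rows, [ $table, $record ]; },
);
```

Only one record is held in memory at a time, which is the point of choosing a streaming parser over a DOM for large files.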
FUNCTIONS
new($)
handle_start($$)
handle_end($$)
handle_char($$)
TEST($)
FUNCTION DOCUMENTATION
- new($)
This gives you a new parser object.
- handle_start($$)
This handles start tags. It creates new objects according to the context.
- handle_end($$)
This handles end tags. It currently does nothing.
- handle_char($$)
This handles the actual text. Currently it sets attributes on the various objects according to the context.
- TEST($)
Test suite for this module.
SUPER CLASSES
Meta::Xml::Parsers::Base(3)
BUGS
None.
AUTHOR
Name: Mark Veltzer
Email: mailto:veltzer@cpan.org
WWW: http://www.veltzer.org
CPAN id: VELTZER
HISTORY
0.00 MV data sets
0.01 MV PDMT
0.02 MV pictures database
0.03 MV tree type organization in databases
0.04 MV more movies
0.05 MV md5 project
0.06 MV database
0.07 MV perl module versions in files
0.08 MV movies and small fixes
0.09 MV movie stuff
0.10 MV graph visualization
0.11 MV more thumbnail stuff
0.12 MV thumbnail user interface
0.13 MV more thumbnail issues
0.14 MV website construction
0.15 MV web site automation
0.16 MV SEE ALSO section fix
0.17 MV download scripts
0.18 MV weblog issues
0.19 MV teachers project
0.20 MV md5 issues
SEE ALSO
Meta::Db::Connections(3), Meta::Db::Info(3), Meta::Development::Module(3), Meta::Ds::Array(3), Meta::Sql::Stats(3), Meta::Utils::File::File(3), Meta::Utils::Output(3), Meta::Utils::System(3), Meta::Utils::Time(3), Meta::Xml::Parsers::Base(3), strict(3)
TODO
-move the parser to the new style (stop using context).
-remember that the characters callback does not give you all the data in one call, since xml parsers are supposed to stream their input (inherit from a parser that does ?!?).
-start actually using the DEF object I got to do sanity testing (toggleable, of course).
-use bind with param type on all parameters (remove the version with no binding type and make sure all types are mapped right).
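The last item, together with the statement preparation mentioned in the description, can be illustrated with the standard DBI calls. This is a hedged sketch rather than this module's code: it assumes DBD::SQLite is installed so that an in-memory database can be used, and the movie table and its columns are made up for the example.

```perl
use strict;
use warnings;
use DBI qw(:sql_types);

# Assumption: DBD::SQLite is available; any DBI driver would do.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE movie (id INTEGER, title TEXT)');

# Prepare once, execute once per record.
my $sth = $dbh->prepare('INSERT INTO movie (id, title) VALUES (?, ?)');
for my $row ([ 1, 'Brazil' ], [ 2, 'Alien' ]) {
    # Bind every parameter with an explicit SQL type.
    $sth->bind_param(1, $row->[0], SQL_INTEGER);
    $sth->bind_param(2, $row->[1], SQL_VARCHAR);
    $sth->execute();
}

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM movie');
$dbh->disconnect();
```

Binding with an explicit type tells the driver how to pass each value instead of letting it guess from the Perl scalar.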