NAME

Tie::Gzip - read and write gzip compressed files

SYNOPSIS

require Tie::Gzip;

tie filehandle, 'Tie::Gzip'
tie filehandle, 'Tie::Gzip', mode, filename
tie filehandle, 'Tie::Gzip', filename

tie filehandle, 'Tie::Gzip', \%options
tie filehandle, 'Tie::Gzip', mode, filename, \%options 
tie filehandle, 'Tie::Gzip', filename, \%options

tie filehandle, 'Tie::Gzip', \@options
tie filehandle, 'Tie::Gzip', mode, filename, \@options 
tie filehandle, 'Tie::Gzip', filename, \@options

DESCRIPTION

The 'Tie::Gzip' module provides a file handle Tie for compressing and uncompressing files using the gzip compression format.

By tieing a filehandle to 'Tie::Gzip' subsequent uses of the file subroutines with the tied filehandle will compress data written to an opened file using gzip compression and decompress data read from an opened file using gzip compression.

If the 'Tie::Gzip' tie receives a filename or mode filename after completing the tie, 'Tie::Gzip' will open filename.

During the tie, Tie::Gzip will first try to load the 'Compress::Zlib' module and package. If successful, 'Tie::Gzip' uses the 'Compress::Zlib' for compressing and decompressing the file data.

If unsuccessful, 'Tie::Gzip' setups up the following pipes to an anticipated GNU 'gzip' site command for compressing and decompressing the file data:

gzip --decompress --stdout {} | # read file data
| gzip --stdout > {} # write file data

where the string '{}' is a placeholder for the filename.

Many sites, especially UNIX Internet Service Providers, will not provide the 'Compress::Zlib' module. Instead they expect the users to make use of a site Unix gzip command.

If neither of these gzip resources are available for a site, 'Tie::Gzip' provides the 'read_pipe' and 'write_pipe' options, to tie to a suitable local site gzip command.

For example, to specify the GNU gzip, provide the following options as either a hash or array reference:

[ read_pipe => 'gzip --decompress --stdout {}',
  write_pipe => ' gzip --stdout > {}' ]

The pipe symbol '|' is optional. The 'Tie::Gzip' uses the 'binmode' for all data to and from the read and write pipes. This is equivalent to 'raw' (as oppose to 'cooked') for Unix file drivers and the binary (as oppose to 'text') for Windows file drivers.

The hash reference to the 'Tie::Gzip' data may be obtained as follows:

my $self = tied filehandle;

The 'Tie::Gzip' data hash keys and contents are subject to change without notice expect for

$self->{options}->{read_pipe}
$self->{options}->{write_pipe}

as described above.

Because of the nature of the gzip compression software, the file subroutines have at least the following restrictions:

open

The open command will accept only the '>' and the '<' modes. All other modes are invalid. The 'Tie::Gzip' tie does provide greatly limited piping capabilities with the 'read_pipe' and 'write_pipe' options. Feature creep of reading and writing a compress file is coming.

seek

The seek is only valid for mode 1, positive seeks when reading a compress files. Feature creep of seek is comming.

fileno

The file no when using "Compress::Zlib" is undefined.

binmode

This subroutine does nothing since the tied 'Tie::Gzip' file handle is always in the binmode.

REQUIREMENTS

For these requirements the pharse 'Tie Gzip file handle' will mean a file handle successfully tied to 'Tie::Gzip' that uses either the 'Compress::Zlib' module or the a site system GNU gzip executable to compress and decompress the file data. Thus, the data written to a file using a 'Tie::Gzip file handle' should be in accordance with RFC 1951 and RFC 1952.

The 'Tie::Gzip' requirements are as follows:

data integrity [1]

The data read back from a file using a 'Tie::Gzip file handle' shall[1] be the same as the data written to the file using a 'Tie::Gzip file handle'.

interoperability [1]

The data read back from a file using a software unit or executable program in accordance with RFC 1951 and RFC 1952 shall[1] be the same as the data written to the same file using a 'Tie::Gzip file handle'.

interoperability [2]

The data read back from a file using 'Tie::Gzip file handle shall[2] be the same as the data written to the same file using a software unit or executable program in accordance with RFC 1951 and RFC 1952.

DEMONSTRATION

#########
# perl Gzip.d
###

~~~~~~ Demonstration overview ~~~~~

Perl code begins with the prompt

=>

The selected results from executing the Perl Code follow on the next lines. For example,

=> 2 + 2
4

~~~~~~ The demonstration follows ~~~~~

=>     use File::Package;
=>     use File::Copy;
=>     use File::SmartNL;

=>     my $uut = 'Tie::Gzip'; # Unit Under Test
=>     my $fp = 'File::Package';
=>     my $snl = 'File::SmartNL';
=>     my $loaded;

=> ##################
=> # Load UUT
=> # 
=> ###

=> my $errors = $fp->load_package($uut)
=> $errors
''

=> ##################
=> # Tie::Gzip Version $Tie::Gzip::VERSION loaded
=> # 
=> ###

=> $loaded = $fp->is_package_loaded($uut)
1

=> ##################
=> # Copy gzip0.htm to gzip1.htm.
=> # 
=> ###

=> unlink 'gzip1.htm'
=> copy('gzip0.htm', 'gzip1.htm')
'1'

=>       sub gz_decompress
=>      {
=>          my ($gzip) = shift @_;
=>          my $file = 'gzip1.htm';
=>  
=>          return undef unless open($gzip, "< $file.gz");

=>          if( open (FILE, "> $file" ) ) {
=>              while( my $line = <$gzip> ) {
=>                   print FILE $line;
=>              }
=>              close FILE;
=>              close $gzip;
=>              unlink 'gzip1.htm.gz';
=>              return 1;
=>          }

=>          1 

=>      }

=>      sub gz_compress
=>      {
=>          my ($gzip) = shift @_;
=>          my $file = 'gzip1.htm';
=>          return undef unless open($gzip, "> $file.gz");
=>         
=>          if( open(FILE, "< $file") ) {
=>              while( my $line = <FILE> ) {
=>                     print $gzip $line;
=>              }
=>              close FILE;
=>              unlink $file;
=>          }
=>          close $gzip;
=>     }

=>     #####
=>     # Compress gzip1.htm with gzip software unit of opportunity
=>     # Decompress gzip1.htm,gz with gzip software unit of opportunity
=>     #
=>     tie *GZIP, 'Tie::Gzip';
=>     my $tie_obj = tied *GZIP;
=>     my $gz_package = $tie_obj->{gz_package};
=>     my $gzip = \*GZIP;
=>     
=>     #####
=>     # Do not skip tests next compress and decompress tests if this expression fails.
=>     # Passing the next compress and decompress tests is mandatory to ensure at 
=>     # least one gzip is available and works
=>     # 
=>     my $gzip_opportunity= gz_compress( $gzip );

=> ##################
=> # Compress gzip1.htm with gzip of opportunity. Validate gzip1.htm.gz exists
=> # 
=> ###

=> -f 'gzip1.htm.gz'
'1'

=> ##################
=> # Decompress gzip1.htm.gz with gzip of opportunity. Validate gzip1.htm same as gzip0.htm
=> # 
=> ###

=> gz_decompress( $gzip )
=> $snl->fin('gzip1.htm') eq $snl->fin('gzip0.htm')
'1'

=> unlink 'gzip1.htm'

QUALITY ASSURANCE

Test Script Design

The Tie:Gzip test script performs multiple duties. The Tie::Gzip program module finds a gzip software unit of opportunity looking for both Perl Compress::Zlib program module and a site operating system gzip with the following GNU syntax:

read_pipe => 'gzip --decompress --stdout {}',
write_pipe => 'gzip --stdout > {}',

If a particular site does not support both gzips, those tests, such as the interoperatability between different gzip software units, are skipped.

For quality assurance, the Tie::Gzip test is performed on a site that supports both. For installation test, only one is needed for a pass. However if an installation supports both, both should pass in order to meet the interoperatability requirement for the Tie::Gzip module. This of course does not test that files produced from gzip software units outside the site are interoperatable. However, since the site gzip used for the quality assurance test meets the RFC 1951 and RFC 1952, the chances are that the gzip outside the site is broken if Tie::Gzip cannot decompress it.

Test Report

=> perl Gzip.t

1..13 # Running under perl version 5.006001 for MSWin32 # Win32::BuildNumber 635 # Current time local: Fri Apr 16 15:59:27 2004 # Current time GMT: Fri Apr 16 19:59:27 2004 # Using Test.pm version 1.24 # Test::Tech : 1.19 # Data::Secs2 : 1.17 # Data::SecsPack: 0.02 # =cut ok 1 - UUT not loaded ok 2 - Load UUT ok 3 - Tie::Gzip Version 1.14 loaded ok 4 - Ensure gzip.t can access gzip0.htm ok 5 - Copy gzip0.htm to gzip1.htm. ok 6 - Compress gzip1.htm with gzip of opportunity. Validate gzip1.htm.gz exists ok 7 - Decompress gzip1.htm.gz with gzip of opportunity. Validate gzip1.htm same as gzip0.htm ok 8 - Compress gzip1.htm with site os GNU gzip. Validate gzip1.htm.gz exists ok 9 - Decompress with site os GNU gzip. Validate gzip1.htm same as gzip0.htm ok 10 - Compress gzip1.htm with Compress::Zlib. Validate gzip1.htm.gz exists. ok 11 - Decompress gzip1.htm.gz with site OS GNU gzip. Validate gzip1.htm same as gzip0.htm ok 12 - Compress gzip1.htm with site os GNU gzip. Validate gzip1.htm.gz exists. ok 13 - Decompress gzip1.htm.gz with Compress::Zlib. Validate gzip1.htm same as gzip0.htm. # Passed : 13/13 100%

Test Script Software and Operation

Running the test script 'Gzip.t' found in the "Tie-Gzip-$VERSION.tar.gz" distribution file verifies the requirements for this module.

All testing software and documentation stems from the Software Test Description (STD) program module 't::Tie::Gzip', found in the distribution file "Tie-Gzip-$VERSION.tar.gz".

The 't::Tie::Gzip' STD POD contains a tracebility matix between the requirements established above for this module, and the test steps identified by a 'ok' number from running the 'Gzip.t' test script.

The t::Tie::Gzip' STD program module '__DATA__' section contains the data to perform the following:

  • to generate the test script 'Gzip.t'

  • generate the tailored STD POD in the 't::Tie::Gzip' module,

  • generate the 'Gzip.d' demo script,

  • replace the POD demonstration section herein with the demo script 'Gzip.d' output, and

  • run the test script using Test::Harness with or without the verbose option,

To perform all the above, prepare and run the automation software as follows:

  • Install "Test_STDmaker-$VERSION.tar.gz" from one of the respositories only if it has not been installed:

    • http://www.softwarediamonds/packages/

    • http://www.perl.com/CPAN-local/authors/id/S/SO/SOFTDIA/

  • manually place the script tmake.pl in "Test_STDmaker-$VERSION.tar.gz' in the site operating system executable path only if it is not in the executable path

  • place the 't::Tie::Gzip' at the same level in the directory struture as the directory holding the 'Tie::Gzip' module

  • execute the following in any directory:

    tmake -test_verbose -replace -run -pm=t::Tie::Gzip

NOTES

The package 'CPAN::Tarzip::TIEHANDLE' buried deep in the 'CPAN' module has a bare bones tie to decompress gzip files. A study of this package proved valuable in identifying some of the pitfalls that the author of this package encountered in his similar endeavor. One issue was that 'Compress::Zlib' gzip subroutines/methods will return data entact from a file that is not compress as well as compress gzip file contents without any signaling of the differences in the raw file contents.

This 'Compress::Gzip' module follows the overall direction of 'CPAN::Tarzip::TIEHANDLE' in handling this issue with a different code implementation.

Another related module is the 'PerlIO::gzip' module that implements the gzip file disciplines. Gzip file disciplines are available in the newer version of Perls. Altough the C code was not examined for this module, there appears in the POD a somewhat different approach to processing the file content that is not gzip compressed. There is a lot of gzip header checking and whatever.

Many of the older Perls in wide spread use do not support file disciplines.

head2 FEEDBACK

From: Mark.Scarton@FranklinCovey.com Date: Thu, 19 Feb 2004 17:23:37 -0700

In the 'lib/Tie/Gzip.pm' module of the Tie-Gzip-0.01 package, the open of the pipe ("gzip --decompress --stdout |") is failing due to the reference to $! in the conditional. As a test, I cleared $! before issuing the open call as follows:

Line 124:

###############
# Some perls will return a glob and a warning
# for certain pipe errors such as the command
# not a recognized command
#
$! = 0;    ### MAS ###
my $success = open PIPE, $pipe;
if($! || !$success) {
    warn "Could not pipe $pipe: $!\n";
    $self->CLOSE;
    return undef;
}

Line 167:

###############
# Some perls will return a glob and a warning
# for certain pipe errors such as the command
# not a recognized command
#
$! = 0;    ### MAS ###
my $success = open PIPE, $pipe;
if($! || !$success) {
    warn "Could not pipe $pipe: $!\n";
    $self->CLOSE;
    return undef;
}

This works. Prior to making this change, test 6 of Gzip.t would fail.

According to the Learning Perl O'Reilly book,

"But if you use die to indicate an error that is not the failure of a system request, don't include $!, since it will generally hold an unrelated message left over from something Perl did internally. It will hold a useful value only immediately after a failed system request. A successful request won't leave anything useful there."

So $! is only sourced when a system error occurs and it is not cleared prior to the call. If no error occurs, the value is indeterminate.

head2 FILES

The installation of the "Tie-Gzip-$VERSION.tar.gz" distribution file installs the 'Docs::Site_SVD::Tie_Gzip' SVD program module.

The __DATA__ data section of the 'Docs::Site_SVD::Tie_Gzip' contains all the necessary data to generate the POD section of 'Docs::Site_SVD::Tie_Gzip' and the "Tie-Gzip-$VERSION.tar.gz" distribution file.

To make use of the 'Docs::Site_SVD::Tie_Gzip' SVD program module, perform the following:

  • install "ExtUtils-SVDmaker-$VERSION.tar.gz" from one of the respositories only if it has not been installed:

    • http://www.softwarediamonds/packages/

    • http://www.perl.com/CPAN-local/authors/id/S/SO/SOFTDIA/

  • manually place the script vmake.pl in "ExtUtils-SVDmaker-$VERSION.tar.gz' in the site operating system executable path only if it is not in the executable path

  • Make any appropriate changes to the __DATA__ section of the 'Docs::Site_SVD::Tie_Gzip' module. For example, any changes to 'Tie::Gzip' will impact the at least 'Changes' field.

  • Execute the following:

    vmake readme_html all -pm=Docs::Site_SVD::Tie_Gzip -verbose

AUTHOR

The holder of the copyright and maintainer is

<support@SoftwareDiamonds.com>

Copyrighted (c) 2002 Software Diamonds

All Rights Reserved

BINDING REQUIREMENTS NOTICE

Binding requirements are indexed with the pharse 'shall[dd]' where dd is an unique number for each header section. This conforms to standard federal government practices, 490A ("3.2.3.6" in STD490A). In accordance with the License for 'Tie::Gzip', Software Diamonds is not liable for meeting any requirement, binding or otherwise.

LICENSE

Software Diamonds permits the redistribution and use in source and binary forms, with or without modification, provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

SOFTWARE DIAMONDS, http::www.softwarediamonds.com, PROVIDES THIS SOFTWARE 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SOFTWARE DIAMONDS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING USE OF THIS SOFTWARE, EVEN IF ADVISED OF NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE POSSIBILITY OF SUCH DAMAGE.







SEE ALSO

CSPAN, PerlIO::gzip, Test::STDmaker, Docs::US_DOD::STD, ExtUtils::SVDmaker, Docs::US_DOD::SVD, gzip, rfc 1952 (the gzip file format specification), rfc 1951 (DEFLATE compressed data format specification)