NAME

HTTP::Server::Singlethreaded - a framework for standalone web applications

SYNOPSIS

# configuration first:
#
BEGIN { # so the configuration happens before import() is called
# static directories are mapped to file paths in %Static
$HTTP::Server::Singlethreaded::Static{'/images/'} = '/var/www/images';
$HTTP::Server::Singlethreaded::Static{'/'} = '/var/www/htdocs';
#
# configuration for serving static files (defaults are shown)
$HTTP::Server::Singlethreaded::DefaultMimeType = 'text/plain';
@HTTP::Server::Singlethreaded::MimeType{qw/txt htm html jpg gif png/} =
qw{text/plain text/html text/html image/jpeg image/gif image/png};
#
# internal web services are declared in %Function
$HTTP::Server::Singlethreaded::Function{'/AIS/'} = \&HandleAIS;
#
# external CGI-BIN directories are declared in %CgiBin
# NOT IMPLEMENTED YET
$HTTP::Server::Singlethreaded::CgiBin{'/cgi/'} = '/var/www/cgi-bin';
#
# @Port where we try to listen
@HTTP::Server::Singlethreaded::Port = (80,8000);
#
# Timeout for the select
$HTTP::Server::Singlethreaded::Timeout = 5;
#
# overload protection
$HTTP::Server::Singlethreaded::MaxClients = 10;
#
}; # end BEGIN
# merge path config and open listening sockets
# configuration can also be provided in the use line.
use HTTP::Server::Singlethreaded
   timeout => \$NotSetToAnythingForFullBlocking,
   function => { # must be a hash ref
                  '/time/' => sub {
                     "Content-type: text/plain\n\n".localtime
                  }
   },
   path => \%ChangeConfigurationWhileServingBySettingThis;
#
# "top level select loop" is invoked explicitly
for(;;){
  #
  # manage keepalives on database handles
  if ((time - $lasttime) > 40){
     ...
     $lasttime = time;
  };
  # Auto restart on editing this file
  BEGIN{$OriginalM = -M $0}
  exec "perl -w $0" if -M $0 != $OriginalM;
  #
  # do pending IO, invoke functions, read statics
  # HTTP::Server::Singlethreaded::Serve()
  Serve(); # this gets exported
}

DESCRIPTION

HTTP::Server::Singlethreaded is a framework for providing web applications without using a web server (apache, boa, etc.) to handle HTTP.

CONFIGURATION

One of %Static, %Function, or %CgiBin should contain a '/' key. That entry handles requests for just the domain name, or a GET request for /.

%Static

The %Static hash maps URL paths to the directories from which static files are served.

$StaticBufferSize

How much of a large file is read in at once. Without memory mapping, files have to be read in and then written out. For a file larger than this size, another chunk of this many bytes is read whenever the output buffer holds less than this size. Defaults to 50000 bytes, so the output buffer for a request should fluctuate between zero and 100000 bytes while serving a large file.
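The buffering rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual code: refill() is a hypothetical helper, a chunk size of 10 stands in for $StaticBufferSize, and an in-memory filehandle stands in for a large file.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: top up an output buffer
# from $fh, but only when the buffer holds less than $size bytes.
sub refill {
    my ($fh, $bufref, $size) = @_;
    return 0 if length($$bufref) >= $size;    # enough queued already
    # Append up to $size bytes at the end of the existing buffer.
    my $got = read($fh, $$bufref, $size, length($$bufref));
    return $got || 0;                         # 0 at end of file or on error
}

my $file = 'x' x 25;                          # stand-in for a large file
open my $fh, '<', \$file or die "cannot open in-memory file: $!";
my $buffer = '';
refill($fh, \$buffer, 10);                    # buffer was empty, so a chunk is read
substr($buffer, 0, 4, '');                    # pretend 4 bytes went to the client
refill($fh, \$buffer, 10);                    # 6 bytes queued, so another chunk is read
```

Because a new chunk is only read when fewer than $size bytes are queued, the buffer never grows beyond twice the chunk size.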

%Function

Maps paths to the functions to run for them. The entire server request is available in $_ and several variables are available in %_. $_{PATH_INFO} and $_{QUERY_STRING} are of particular interest. The whole standard CGI environment will eventually appear in %_ for use by functions, but it does not yet; see HISTORY for the variables available so far.
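As a rough sketch of what such a function might look like, the handler below echoes the two variables named above. The name ShowRequest and the path '/show/' are made up for illustration; only %_ and the %Function hash come from this documentation.

```perl
use strict;
use warnings;

# Hypothetical handler: echo the path info and query string as plain text.
# The response starts with headers, as a simple CGI program's output would.
sub ShowRequest {
    my $path  = defined $_{PATH_INFO}    ? $_{PATH_INFO}    : '';
    my $query = defined $_{QUERY_STRING} ? $_{QUERY_STRING} : '';
    return "Content-type: text/plain\n\n"
         . "path info: $path\n"
         . "query string: $query\n";
}

# Registration would follow the SYNOPSIS (the path is an example only):
# $HTTP::Server::Singlethreaded::Function{'/show/'} = \&ShowRequest;
```

The handler can be exercised without the server by filling in %_ by hand before calling it.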

%CgiBin

CgiBin is a functional wrapper that forks and executes a named executable program, after setting the common gateway interface environment variables and changing directory to the listed directory. NOT IMPLEMENTED YET

@Port

The @Port array lists the ports the server tries to listen on.

name-based virtual hosts

Not implemented yet. A few configuration interfaces are possible; the most likely is a hash of host names that map to strings that will be prepended to the key looked up in %Path, something like

use HTTP::Server::Singlethreaded 
   vhost => {
      'perl.org' => perl =>
      'www.perl.org' => perl =>
      'web.perl.org' => perl =>
      'example.org' => exmpl =>
      'example.com' => exmpl =>
      'example.net' => exmpl =>
      'www.example.org' => exmpl =>
      'www.example.com' => exmpl =>
      'www.example.net' => exmpl =>
   },
   static => {
      '/' => '/var/web/htdocs/',
      'perl/' => '/var/vhosts/perl/htdocs',
      'exmpl/' => '/var/vhosts/example/htdocs'
   }
;

Please submit comments via rt.cpan.org.

$Timeout

The timeout for the select. 0 will cause Serve to simply poll. undef, which causes Serve to block until there is a connection, can only be passed on the use line.

$MaxClients

If we have more active clients than this, we won't accept more. Since we're not respecting keepalive at this time, this number indicates how long a backlog Singlethreaded will maintain at any moment. Depending on how long your functions take, it should be orders of magnitude lower than the number of simultaneous web page viewers possible.

$WebEmail

an e-mail address for whoever is responsible for this server, for use in error messages.

$forkwidth

Set $forkwidth to a number greater than 1 to have Singlethreaded fork after binding, for instance when running on a multiprocessor machine, or if you want to verify that the elevator algorithm works. After import(), $forkwidth is altered to indicate which process we are in, such as "2 of 3". The original gets an array of the process IDs of all the children in @kids, as well as a $forkwidth variable that matches /(\d+) of \1/. Also, all children are sent a TERM signal from the parent process's END block. Uncomment the relevant lines in the module source if you need this. Forking after initializing the module should work too. This might get removed as an example of featuritis.

$uid and $gid

When starting as root on a *nix system, specify these numerically. The process credentials will be changed after the listening sockets are bound.

Dynamic Reconfiguration

Dynamic reconfiguration is possible, either by directly altering the configuration variables or by passing references to import().

Action Selection Method

The request is split on slashes, then matched against the configuration hash until there is a match. Longer matching pieces trump shorter ones.
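One way to picture this matching is the sketch below. It is an illustration of longest-prefix selection under assumptions, not the module's own lookup code: longest_match() is a hypothetical function, and the keys are assumed to look like the ones in the SYNOPSIS.

```perl
use strict;
use warnings;

# Illustrative only: return the longest configured key that is a prefix
# of the request path, trying the key with and without a trailing slash.
sub longest_match {
    my ($path, $config) = @_;
    # '/images/logo.png' becomes ('', 'images', 'logo.png')
    my @pieces = split m{/}, $path, -1;
    while (@pieces) {
        my $candidate = join '/', @pieces;
        for my $key ($candidate, "$candidate/") {
            return $key if exists $config->{$key};
        }
        pop @pieces;    # drop the last piece and try a shorter prefix
    }
    return;             # nothing matched
}

my %config = ('/' => '/var/www/htdocs', '/images/' => '/var/www/images');
my $for_image = longest_match('/images/logo.png', \%config);   # '/images/'
my $for_other = longest_match('/about.html',      \%config);   # '/'
```

Starting from the full path and shortening it one piece at a time means the first hit is always the longest one.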

Having the same path listed in more than one of %Static, %Function, or %CgiBin is an error and the server will not start in that case. It will die while constructing %Path.

Writing Functions For Use With HTTP::Server::Singlethreaded

This framework uses the %_ hash for passing data between elements which are in different packages.

Data you get

the whole enchilada

The full RFC2616-sec5 HTTP Request is available for inspection in $_. Certain parts have been parsed out and are available in %_. These include

Method

Your function can access all the HTTP methods. You are not restricted to GET or POST as with the CGI environment.

URI

Whatever the client asked for.

HTTPver

such as 1.1

QUERY_STRING, PATH_INFO

as in CGI

Data you give

The HandleRequest() function looks at only two pieces of data:

ResultCode

$_{ResultCode} defaults to 200 on success and gets set to 500 when your function dies. $@ will be included in the output. Singlethreaded knows all the result code strings defined in RFC2616.

As of late 2004, Mozilla Firefox will show you error messages while Microsoft Internet Explorer hides error messages from its users, at least with the default configuration.

Data

Store your complete web page output into $_{Data}, just as you would write output starting with server headers when writing a simple CGI program. Or leave $_{Data} alone and return a valid page, beginning with headers.
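A minimal sketch of a function that uses both items follows. It assumes that a function may set $_{ResultCode} itself before returning its page, which the text above implies but does not spell out. LookupItem and the %items data are made up for illustration.

```perl
use strict;
use warnings;

my %items = (apple => 'red', banana => 'yellow');   # made-up data

# Hypothetical handler: look up the last piece of the path and answer
# with 404 when it is unknown. The page is returned, not stored in $_{Data}.
sub LookupItem {
    my ($name) = ($_{PATH_INFO} || '') =~ m{([^/]+)$};
    if (defined $name and exists $items{$name}) {
        return "Content-type: text/plain\n\n$name is $items{$name}\n";
    }
    $_{ResultCode} = 404;    # assumption: honoured by HandleRequest()
    return "Content-type: text/plain\n\nNo such item\n";
}
```

Dying inside the function instead would produce the 500 response described above.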

AVOIDING DEADLOCKS

The server blocks while reading files and executing functions. You may use a closure to describe a callback. %_ is restored between callbacks while handling a request.

CALLBACK FUNCTIONS (and poll functions)

Instead of a string to send to the client, the function returns a coderef to indicate that Singlethreaded needs to check back later to see if the page is ready, by running the coderef the next time around. Data for the client, if any, must be stored in $_{Data} when you want the callback to be called again (indicated by continuing to return the callback function).

When the callback function returns a non-reference, that string is considered the end of the response.

Instead of a coderef, a hashref or an arrayref is acceptable. The hashref needs to have 'continue' defined within it as a coderef, and may have 'poll' defined in it when it makes sense to have separate poll and continue coderefs.

poll

a reference to code that will return a boolean indicating true when it is time to run the continue piece and get some data, or false when we should wait some more before running the continuation.

continue

a coderef that, when run, will set $_{Data} with an empty or non-empty string, and return a (continue, [poll]) list.

an arrayref instead of a hashref

in the order [$continue, $poll], so the latter one can be left out if there is no poll code.
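The hashref form might be used along the following lines. This is a sketch based on one reading of the descriptions above; the exact calling convention should be checked against the module source. SlowReport and its counter are hypothetical stand-ins for work that finishes elsewhere.

```perl
use strict;
use warnings;

# Hypothetical handler: the result is "ready" on the third poll.
sub SlowReport {
    my $ticks = 0;    # stand-in for progress made elsewhere
    $_{Data} = "Content-type: text/plain\n\nworking...\n";
    return {
        # poll: true when it is time to run the continue piece
        poll     => sub { ++$ticks >= 3 },
        # continue: a non-reference return value ends the response
        continue => sub { "result ready after $ticks polls\n" },
    };
}

# The equivalent arrayref would list the same coderefs as [$continue, $poll].
```

Keeping the readiness check in poll means the continue piece only runs when it has something to deliver.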

example

Let's say we have two functions called Start() and More($) that we are wrapping as a web service with Singlethreaded. Start returns a handle that is passed as an argument to More to prevent instance confusion. More returns either some data or the empty string, or undef when it is done. Here's how to wrap them:

sub StartMoreWrapper{
   my $handle = Start() or die "Start() failed";
   my $con;
   $_{Data} = <<HEAD;
Content-type: text/html

<html><body bgcolor="FFFFFF">
Here are the results from More:
<pre>
HEAD

   $con = sub{
      my $rv = More($handle);
      if(defined $rv){
           $_{Data} = $rv;
           return ($con);
      };
      <<TAIL;
</pre> thanks for playing </body></html>
TAIL
   }
}

And be sure to put '/startresults' => \&StartMoreWrapper into the %Function hash.

What Singlethreaded is good for

Singlethreaded is designed to provide a web interface to a database, leveraging a single persistent DBI handle into an unlimited number of simultaneous HTTP requests.

It will work to serve a mini-cpan repository.

HISTORY

0.01

August 18-22, 2004. %CgiBin is not yet implemented.

0.02

August 22, 2004. Nonblocking sockets apparently just plain don't exist on Microsoft Windows, so on that platform we can only add one new client from each listener on each call to serve. Which should make no difference at all. At least not noticeable. The connection time will be longer for some of the clients in a burst of simultaneous connections. Writing around this would not be hard: another select loop that only cares about the Listeners would do it.

0.03

The listen queue will now be drained until empty on platforms without nonblocking listen sockets thanks to a second select call.

Large files are now read in pieces instead of being slurped whole.

0.04

Support for continuations for page generating functions is in place.

0.05

Support for POST data is in place. POST data appears in $_{POST_DATA}. Other CGI variables now available in %_ include PATH_INFO, QUERY_STRING, REMOTE_ADDR, REQUEST_METHOD, REQUEST_URI and SCRIPT_NAME.

0.06

Fixed a bug with serving files larger than the chunksize, that inserted a gratuitous newline. Singlethreaded will now work to serve a minicpan mirror.

0.08 March, 2008

address of this end of the connection now available

0.10 June, 2008

improved handling of callbacks

improved association logic WRT trailing slashes

repeated selects inside Serve() while outputting

only writing one byte at a time on Windows, where Cygwin's syswrite does not do partial writes. (patch welcome to improve this situation)

less debugging output by default, and some informational prints changed to warnings (to get line number info)

EXPORTS

Serve() is exported, and must be called in a loop.

AUTHOR

David Nicol <davidnico@cpan.org>

This module is released AL/GPL, the same terms as Perl.

References

Paul Tchistopolskii's public domain phttpd

HTTP::Daemon

the University of Missouri - Kansas City Task Definition Interface

perlmonks