NAME

Finance::Shares::Overview - Outline of Finance::Shares modules

DESCRIPTION

Overview

This suite aims to provide Perl programmable support for analysing prices of shares quoted on the world's stock exchanges. It gets the quotes from http://finance.yahoo.com, storing them in a mysql database. Calculations and tests may then be applied to the data, in an attempt to derive some meaning from the semi-chaos.

The process is controlled from a file holding a model specification. Finance::Shares::Model interprets this and uses Finance::Shares::Chart to produce graphs showing the results.

Stock quotes are fetched using Finance::Shares::MySQL and held in a Finance::Shares::data object. Functions like averages or trend identifiers are applied to these data and the results used in tests. Each model can apply several tests to several samples. When the tests are run signals may be invoked highlighting interesting situations. The intention is to use these signals to drive a simulated portfolio which can be used in analysing risk.

Preparation

You will need mysql working on your system. This package's tests and tutorials use the 'test' database that comes with every mysql installation. They also expect a user called 'test' with 'test' as the password.

To set this up, as root, fire up the mysql client and declare the user:

# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3 to server version: 3.23.52-Max-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> grant all privileges on test.* to 'test'
    -> identified by 'test';

Query OK, 0 rows affected (0.15 sec)

NOTE: There is always a database called 'test', so you don't need to create it. However, if you've chosen to use a different name you will need to give the command create database db_name; as well.

Running Models

Configuration file

There are a few examples in the models directory of the distribution. Before running them, you might find it useful to set up a configuration file. By default, this is expected at

~/.fs/default.conf

Alternatively, it can be specified from the command line.

Configuration files have just the same elements as model specification files, but usually only specify commonly used settings. models/default.conf is an example.

Using the fsmodel Script

A script, fsmodel should be available on your system. Help is available with either

fsmodel --help
man fsmodel

If the preconditions are met, the following command lines should produce PostScript files (*.ps) which may be printed or viewed using a PostScript viewer such as gv or KGhostView.

Conditions

mysql has been set up as in "Preparation" above.
There is an open internet connection.
models/default.conf has been copied to ~/.fs/default.conf.
The command is given from the distribution top level directory.

Command lines

fsmodel --model=models/greater MSFT

fsmodel --model=models/less --file=less GSK.L AZN.L

fsmodel -m models/compare -s stocks/FTSE-media -f media

fsmodel -m models/convergence -s stocks/FTSE-mining -v 3

The system has successfully handled around 100 pages, so it should be OK with useable numbers of stocks.

Handling PostScript

PostScript::File is used to output all charts in PostScript format. Originally this was because I couldn't find any software that printed A4 sized charts with enough accuracy and detail to scribble lines on. In practice, I don't print them out nearly as often as they are viewed on screen, but it is good to be able to do both. This doesn't sit particularly well with the web interface, but programs like pstill can convert the output to PDF easily enough. To convert chart.ps to Adobe Portable Document Format:

pstill -o chart.pdf chart.ps

Don't forget that most browsers can be configured to do something useful with files in application/postscript MIME format, possibly by nominating pstill as a helper program.

It is also worth investigating GhostScript which is available for a wide variety of platforms. For example, the following command (on a single line) will convert the file chart.ps to Portable Network Graphics file chart.png.

    gs -q -dBATCH -sDEVICE=png16m \
	-sOutputFile=chart.png chart.ps

Ghostscript is freely available from Artifex Software, at http://ghostscript.com or from the Free Software Foundation at ftp://mirror.cs.wisc.edu/pub/mirrors/ghost/gnu/current/.

Note that gs will also work as a filter. PostScript might take a bit more work than producing graphs directly, but all the advantages of vector graphics are maintained. As well as producing the generic printer format, PostScript::File can also output PNG files (for those applications that require bitmapped graphics) and EPS files (for embedding directly in other documents).

Support Modules

Each graph line is provided by its own module. These are some of the functions available.

Finance::Shares::Function has instructions and examples showing how to extend the suite by writing your own function modules.

Miscellaneous Support

These are generally used in more complex functions.

mark

Used to specify the style of chart lines, points or bars under program control.

value

Place a visible horizontal line identifying a particular Y axis value.

gradient

Smoothed rate of change.

momentum

Shows how a measure changes between now and N days ago.

rate_of_change

Where 'momentum' is a difference, this is a ratio.

Averages

Calculating the mean of a series of values within one line or the same value across a series of lines.

moving_average

The normal workhorse.

exponential_average

A variation which takes all previously known data into some account.

weighted_average

A moving average which has most recent values more heavily weighted.

multiline_mean

Calculates the average of a number of lines.

Comparisons

Relate one line to another.

compare

Express one or more lines relative to some base line used for comparison.

greater_than, greater_equal, less_than and less_equal are a leftover from version 0 and are included because they were an early part of the test suite. They are depreciated in favour of writing the tests directly.

Bands and Ranges

Functions identifying boundaries around the distributed values.

highest

A trace of N-day highs.

lowest

A trace of N-day lows.

percent_band

Produces two lines, N percent above and below a source line.

bollinger_band

Produces two lines bounding typically 2 standard deviations above and below a source line.

Compound Functions

These typically use some other (hidden) function in their calculations.

on_balance_volume

Give some indication of buying and selling pressure.

oversold

Identify when the rate of change is unusually high.

undersold

Identify when the rate of change is unusually low.

historical_highs

Show how long since some line was as high as the current value.

historical_lows

Show how long since some line was as low as the current value.

is_falling

Shows 'high' when the source line is decreasing.

is_rising

Shows 'high' when the source line is increasing.

Changes Since Version 0

Version 1 is more or less a complete re-write; very little of the original code remains. The aim has changed. Version 0 was attempting to become a toolkit of modules that could be used to build your own stock analysis system. It seems that this general aim is not possible as the modules have to make assumptions about the running environment. [However, see http://geniustraders.org for a well developed (and more complex) trading system written in Perl which is well worth a look.]

This solution provides an engine running a simulation from a specification file. This file usually includes user perl code to be executed before, during and/or after the run. The code typically makes use of the graph functions and may write to the graphs as well as invoke callbacks.

Declarative specifications

This suite has been developed to be more declarative than procedural. Version 0 used a model specification, but the lines and tests had to be given in the right order.

Now the model specification describes the results wanted, rather than the processes to be carried out. For example, Only top level lines or tests need to be specified in a sample. The model engine infers what is needed and the order of calculation from the specification and code fragments.

Different resources

sources, files, charts, groups and samples are much the same. But two others have been added, supporting a variety of named dates and stocks.

functions has been renamed lines because that's what they mostly produce. However, it is strictly inaccurate and may be confusing.

The tests have either disappeared or become lines. signals - the distinguishing feature of the old tests - have disappeared altogether. Instead the new tests are code fragments, giving much more flexibility and power. Builtin functions now support things like chart marks, files and messages.

More scope for defaults

It is now possible to have the specification split over several files. One of which is a configuration file providing defaults or a complete model - as you choose.

The resource names are more flexible, resource blocks can appear many times and earlier default values can be overridden.

One of the configuration features is the introduction of user-defined aliases - currently only used for function names, which are often rather long.

Programmable tests

Probably the most useful development is the use of code fragments or imported callbacks which can be invoked before, during or after a model is run. The 'during' fragment is visited at every data point, when a variety of data is made available, including the value of all other lines.

Code fragments can be as large as you wish - complete files using additional modules. Callbacks or internal functions can be invoked conditionally within your code, replacing the old signals.

The line functions supporting this paradigm are rather different (See "Support Modules"). All of the lines producing logical output (e.g. greater_than, and) have gone. Functions providing statistical measures (e.g. maximum, standard_deviation) are provided, but are more often used to access the single value they calculate.

Multiple pages

The previous version could produce PostScript files with several charts, but they were completely seperate models.

It is now possible to produce several (possible related) charts and refer to one stock model from another. For example, it was previously impossible to compare stock prices with a group average. This can now be done.

Fully Qualified Line Names

If charts can interact, there must be some way to specify the same line on different charts. A chart is specified as a unique combination of sample, date and stock code. The chart (or page) name qualifies the line name, in the same way that a spreadsheet sheet name may qualify a cell range.

In addition, wildcards and regular expressions can be used to specify the same line on particular pages, every line on a page, or even a line on 'every page except this one'.

Graph layout

The graph types have been renamed. price and volume types are the same, but cycles are now called analysis and tests have been renamed levels. This is because the analysis graphs are more flexible than before and there are no more tests to present output for.

In version 0, the graphs that could appear on a chart were fixed. Now any number of any graph type can appear in any order, with the date axis presented on any or all of them. As before, charts don't need to be specified as they will be created automatically; but now the defaults are tailored to colour output rather than defaulting to the lowest common denominator.

One major limitation of the old layout was that there was never enough space for the Key panel. To overcome this, there is now only one Key panel per page. All lines from any chart appear there.

It is possible to control the order of the lines explicitly, if you wish, bringing some to the front and others to the back.

There are a few refinements to the graphs themselves. As well as OHLC and close marks, candles are now supported, both in monochrome and two colour versions. The default styles for lines have been improved to the extent that style specifications can be ignored if you wish.

In addition, the drawing order of lines is under complete user control. It is even possible to place them <behind> the data (by giving a negative value).

Alternative output

As well as the original PostScript, it is now possible to have graphs output in PNG format. These are (of course) always inferior quality, but often more useful.

It is also possible to output the graphs in a CGI-compatible format. One of the motivations for driving the model from a single data structure was to simplify CGI input. No work has been done on this as yet, however.

Test driven development

There are many more tests in this package. This reflects my increased use of test driven development (well, a modified version I find useable). By including a batch of significant regression tests, it should be easier for others to extend the code.

A couple of less helpful side effects of TDD, though. First, the development is a bit ad hoc. This means that code is less regular than I would normally like it - with more potential for errors.

Secondly, by the nature of things, the tests define what might be expected to work. Stray far from the tests and all hell breaks loose. It will be interesting to see if this improves.

In the meantime, I recommend taking a close look at the t and models directories. Most tests beyond t/100.t have output files and give some idea of the features available, but look at the example models for format.

AUTHOR

Chris Willmot, chris@willmot.org.uk

SEE ALSO

Have a look at the tests in the t directory of the distribution. The charts produced will give you a good idea of the kind of thing this suite does.

fsmodel is the main script, using Finance::Shares::Model - see that man page for details.