NAME
Data::Frame::Examples - Example data sets
VERSION
version 0.006003
SYNOPSIS
use Data::Frame::Examples qw(:datasets dataset_names);
my $datasets = dataset_names(); # names of all example datasets
my $mtcars = mtcars();
DESCRIPTION
Example datasets as Data::Frame objects.
Check out Data::Frame::Examples::dataset_names()
for an array of the names of the example datasets provided by this module.
FUNCTIONS
dataset_names
Returns an array of names of the datasets in this module.
DATASETS
airquality
A dataset with 153 observations on 6 variables, giving daily readings of the following air quality values from May 1, 1973 to September 30, 1973.
The variables are,
Ozone
numeric Ozone (ppb)
Solar_R
numeric Solar radiation (langleys)
Wind
numeric Wind (mph)
Temp
numeric Temperature (degrees F)
Month
numeric Month (1-12)
Day
numeric Day of month (1-31)
diamonds
A dataset containing the prices and other attributes of 53,940 diamonds, with 10 variables.
The variables are,
price
price in US dollars
carat
weight of the diamond
cut
quality of the cut (Fair, Good, Very Good, Premium, Ideal)
color
diamond colour, from J (worst) to D (best)
clarity
a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x
length in mm
y
width in mm
z
depth in mm
depth
total depth percentage = z / mean(x, y) = 2 * z / (x + y), with values ranging from 43 to 79
table
width of top of diamond relative to widest point
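The depth formula above can be sketched in plain Perl. This is only an illustration: the function name depth_percentage and the sample measurements are hypothetical and not part of this module, and the factor of 100 is an assumption made so that the result lands on the percentage scale implied by the 43–79 range.

```perl
use strict;
use warnings;

# Total depth percentage from the three dimensions (all in mm).
# Follows 2 * z / (x + y) from the text; the factor of 100 is an
# assumption that converts the ratio to a percentage.
sub depth_percentage {
    my ($x, $y, $z) = @_;
    die "x + y must be positive\n" unless $x + $y > 0;
    return 100 * 2 * $z / ($x + $y);
}

# Hypothetical measurements, not taken from the dataset.
my ($x, $y, $z) = (4.00, 4.00, 2.48);
printf "depth = %.1f%%\n", depth_percentage($x, $y, $z);
```

Values stored in the depth column may differ slightly from a recomputed figure, since x, y and z are themselves rounded.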
economics
A dataset with 574 rows and 6 variables, produced from US economic time series data available from http://research.stlouisfed.org/fred2.
The variables are,
date
Month of data collection
psavert
personal saving rate
pce
personal consumption expenditures, in billions of dollars
unemploy
number of unemployed in thousands
uempmed
median duration of unemployment, in weeks
pop
total population, in thousands
economics_long
A dataset with 2870 rows and 4 variables.
It comes from the same data source as economics, except that economics is in "wide" format, whereas economics_long is in "long" format.
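The difference between the two layouts can be sketched in plain Perl, without Data::Frame. The helper wide_to_long, the output column names variable and value, and the sample numbers are all hypothetical. Note that economics has 574 rows and 5 measured variables besides date, and 574 × 5 = 2870, which matches the row count of economics_long.

```perl
use strict;
use warnings;

# Reshape "wide" rows (one column per measure) into "long" rows
# (one row per id/measure pair). Column names are illustrative only.
sub wide_to_long {
    my ($rows, $id, @measures) = @_;
    my @long;
    for my $row (@$rows) {
        for my $name (@measures) {
            push @long, {
                $id      => $row->{$id},
                variable => $name,
                value    => $row->{$name},
            };
        }
    }
    return \@long;
}

# Hypothetical values, not taken from the dataset.
my @wide = (
    { date => '2000-01-01', pce => 100, pop => 200 },
    { date => '2000-02-01', pce => 110, pop => 201 },
);

my $long = wide_to_long(\@wide, 'date', qw(pce pop));
printf "%s %s %s\n", @{$_}{qw(date variable value)} for @$long;
```

Two wide rows with two measures each yield four long rows, one per date and measure.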
faithfuld
A 2D density estimate of the waiting and eruptions variables of the faithful data. 5,625 observations and 3 variables.
iris
A dataset with 150 cases and 5 variables, for 50 flowers from each of 3 species of iris.
The variables are,
Sepal_Length
Sepal_Width
Petal_Length
Petal_Width
Species
The species are setosa, versicolor, and virginica.
mpg
A subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. 234 rows and 11 variables.
The variables are,
manufacturer
model
model name
displ
Engine displacement, in litres
year
year of manufacture
cyl
number of cylinders
trans
type of transmission
drv
f = front-wheel drive, r = rear-wheel drive, 4 = four-wheel drive
cty
city miles per gallon
hwy
highway miles per gallon
fl
fuel type
class
"type" of car
mtcars
Data extracted from the 1974 Motor Trend US magazine, for 32 automobiles (1973-74 models). 32 observations on 11 variables.
The variables are,
mpg
Miles/(US) gallon
cyl
Number of cylinders
disp
Displacement (cu.in.)
hp
Gross horsepower
drat
Rear axle ratio
wt
Weight (1000 lbs)
qsec
1/4 mile time
vs
Engine shape (0 = V-shaped, 1 = straight)
am
Transmission (0 = automatic, 1 = manual)
gear
Number of forward gears
carb
Number of carburetors
txhousing
Information about the housing market in Texas provided by the TAMU real estate center, http://recenter.tamu.edu/. 8602 observations and 9 variables.
The variables are,
city
Name of MLS area
year, month, date
Date of the observation
sales
Number of sales
volume
Total value of sales
median
Median sale price
listings
Total active listings
inventory
"Months inventory": amount of time it would take to sell all current listings at current pace of sales.
SEE ALSO
AUTHORS
Zakariyya Mughal <zmughal@cpan.org>
Stephan Loyd <sloyd@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014, 2019-2022 by Zakariyya Mughal, Stephan Loyd.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.