Saturday, 7 June 2014

More Plotting

Introduction

As I mentioned in my last post, I have decided to extend my Excel interop code in a new project on Bitbucket. The main aims are to add further charting capabilities and to make the library usable more generally, in particular from F# Interactive.

I recently came across an interesting book on forecasting, "Forecasting: principles and practice" by Rob J Hyndman and George Athanasopoulos. It is available in hard copy and is also published online.
The book contains a number of interesting plots produced with R and is supported by a CRAN package containing the code and data for its examples. In this post, I will replicate the plots shown in section 2.1, "Graphics", of the chapter "The forecaster's toolbox".

Setting up the data

I first needed to make the data available to F#. I opened RStudio, loaded the fpp package and checked that I could generate the first example chart. I then saved the underlying data in CSV format. This is the simple R code:

## load the package that provides the book's data sets
library(fpp)

## do plot
plot(melsyd[, "Economy.Class"], main = "Economy class passengers: Melbourne-Sydney", xlab = "Year", ylab = "Thousands")

## write csv for F#
write.zoo(as.zoo(melsyd), file = "c:/vs/vs13/FSharp.XlInt/fpp/ch2/melsyd.txt", sep = ",")


This generated a text file with these contents:

"Index","First.Class","Business.Class","Economy.Class"
1987.48076923077,1.912,NA,20.167
1987.5,1.848,NA,20.161
1987.51923076923,1.856,NA,19.993
1987.53846153846,2.142,NA,20.986
1987.55769230769,2.118,NA,20.497
....


The following chart was displayed:

In F# Interactive, I then used Deedle to load this data and called my library to display the chart in Excel.

#load @"..\..\packages\Deedle.0.9.12\Deedle.fsx"
#r @"..\..\FSharp.XlIntLib\bin\debug\Office.dll"
#r @"..\..\FSharp.XlIntLib\bin\debug\Microsoft.Office.Interop.Excel.dll"
#r @"..\..\FSharp.XlIntLib\bin\debug\FSharp.XlIntLib.dll"

open System
open Deedle
open FSharp.XlInt

// start Excel; skip this call if Excel is already open and that instance is OK to use
Xl.start()

//do melsyd
let melsyd0 = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/melsyd.txt")
let melsyd1 : Frame<float, string> = melsyd0 |> Frame.indexRows "Index"
let plt1 = 
    Plt.line (melsyd1?``Economy.Class``
              |> Series.observations, Title = "Economy Melbourne-Sydney", XTitle = "Year", YTitle = "Thousands", XSpacing = 48, XNumberFormat = "0")

In lines 1 to 8, I set up the references needed to Deedle, Excel Interop and my library. In line 11, I call a utility function in my library that opens Excel and makes this instance the one to work with. In line 14, I use Deedle to load the CSV file into a dataframe. In line 15, I create an amended version of the dataframe that uses the Index field as the index of the dataframe. Finally, in lines 16 to 18, I call my library to plot the Economy.Class series. This is the result:
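The loading and indexing steps described above can be pictured with a small library-free sketch. It is not how Deedle is implemented, and `parseCell` and `observations` are hypothetical helpers written only for this illustration. It assumes that R's `NA` marker should be treated as a missing value and that rows with a missing value are skipped when the observations are extracted. The rows are the ones shown in the melsyd.txt listing earlier.

```fsharp
open System
open System.Globalization

// Rows copied from the melsyd.txt listing shown earlier in the post.
let lines =
    [ "\"Index\",\"First.Class\",\"Business.Class\",\"Economy.Class\""
      "1987.48076923077,1.912,NA,20.167"
      "1987.5,1.848,NA,20.161"
      "1987.51923076923,1.856,NA,19.993" ]

// Parse one cell, treating R's "NA" marker (or an empty cell) as missing.
let parseCell (cell: string) : float option =
    match cell.Trim() with
    | "NA" | "" -> None
    | s -> Some (Double.Parse(s, CultureInfo.InvariantCulture))

// Hypothetical helper: the (index, value) pairs of one named column,
// skipping rows where the index or the value is missing.
let observations (column: string) (lines: string list) : (float * float) list =
    match lines with
    | [] -> []
    | header :: rows ->
        let names = header.Split(',') |> Array.map (fun n -> n.Trim('"'))
        let col = Array.IndexOf(names, column)
        if col < 0 then failwithf "Unknown column: %s" column
        rows
        |> List.choose (fun row ->
            let cells = row.Split(',')
            match parseCell cells.[0], parseCell cells.[col] with
            | Some index, Some value -> Some (index, value)
            | _ -> None)

let economy = observations "Economy.Class" lines
let business = observations "Business.Class" lines
```

With these rows, `economy` holds three (year, thousands) pairs, while `business` is empty because every Business.Class value in the sample is `NA`.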


Further Plots

Similarly, I loaded other data from R into CSV files and then loaded them for use by F#. This is the remainder of the F# code used to generate further plots:

//do a10
let a10 = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/a10.txt")
let a101 : Frame<float, string> = a10 |> Frame.indexRows "Index"
let sales = a101?x
            |> Series.observations
let plt2 = Plt.line (sales, Title = "Antidiabetic drug sales", XTitle = "Year", YTitle = "$ million", XSpacing = 60, XNumberFormat = "0")

//do seasonplot
Plt.seasonplot (sales, Title = "Seasonal drug sales", YTitle = "$ million")
//do monthplot
Plt.monthplot (sales, Title = "Seasonal deviation drug sales", YTitle = "$ million")

//do fuel
let fuel = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/fuel.txt")
let city = fuel?City
           |> Series.values
let carbon = fuel?Carbon
             |> Series.values
let xy = Seq.zip city carbon

//jitter
Plt.scatterplot (Plt.jitter (xy), Title = "Jitter", XTitle = "City mpg", YTitle = "Carbon footprint") |> ignore

//scatterplot matrix
let litres = fuel?Litres
             |> Series.values
let highway = fuel?Highway
              |> Series.values
let data = [| litres; city; highway; carbon |]
let labels = [| "Litres"; "City"; "Highway"; "Carbon" |]

Plt.scatterplotmatrix (data, labels, Title = "ScatterPlot Matrix") |> ignore
//call to tidy up excel
Xl.close()

In lines 1 to 5, the sales data is loaded. In line 6, a simple line plot of this data is created:

In line 9, the same data is drawn as a season plot, which shows the seasonal pattern across the months for each year:


In line 11, a variation on this plot shows, for each month, how the sales change over the years:
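The two seasonal views differ mainly in how the observations are grouped: by year for the season plot and by month for the month plot. The sketch below illustrates that grouping only. It is not the code inside `Plt.seasonplot` or `Plt.monthplot`; `toYearMonth`, `byYear` and `byMonth` are hypothetical names. It assumes a monthly series whose index is a decimal year (for example, 1991.5 for July 1991), and the mean per month is included on the assumption that a month plot shows each month's average, as R's `monthplot` does. The sample values are made up.

```fsharp
open System

// Assumption: a monthly series is indexed by decimal year, e.g. 1991.5 = July 1991.
let toYearMonth (t: float) : int * int =
    let months = int (Math.Round(t * 12.0)) // whole months since year 0
    months / 12, months % 12 + 1

// Season plot grouping: one line per year, each holding (month, value) points.
let byYear (obs: seq<float * float>) =
    obs
    |> Seq.map (fun (t, v) ->
        let year, month = toYearMonth t
        year, (month, v))
    |> Seq.groupBy fst
    |> Seq.map (fun (year, items) -> year, items |> Seq.map snd |> List.ofSeq)
    |> List.ofSeq

// Month plot grouping: one sub-series per month, each holding (year, value)
// points, together with that month's mean across the years.
let byMonth (obs: seq<float * float>) =
    obs
    |> Seq.map (fun (t, v) ->
        let year, month = toYearMonth t
        month, (year, v))
    |> Seq.groupBy fst
    |> Seq.sortBy fst
    |> Seq.map (fun (month, items) ->
        let points = items |> Seq.map snd |> List.ofSeq
        month, points, points |> List.averageBy snd)
    |> List.ofSeq

// Made-up values for illustration only (not the a10 data).
let sample =
    [ (2000.0, 10.0)
      (2000.0 + 1.0 / 12.0, 12.0)
      (2001.0, 14.0)
      (2001.0 + 1.0 / 12.0, 16.0) ]

let seasonLines = byYear sample
let monthLines = byMonth sample
```

For the sample, `seasonLines` contains one entry per year (2000 and 2001), each with points for months 1 and 2, while `monthLines` contains one entry per month with means of 12.0 and 14.0.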

In lines 14 to 19, some car fuel data is loaded, which is then plotted as a standard scatter plot in line 22. The more interesting element of this plot is that the data is "jittered" (a small random offset is added to each point) before being plotted, which helps to reveal where several data items overlap at the same point.
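As a rough illustration of the idea, jittering can be sketched as adding a small uniform random offset to each coordinate. This is an assumption about the general technique, not the implementation of `Plt.jitter`; the `amount` and `seed` parameters are hypothetical and the points are made up.

```fsharp
open System

// Hypothetical stand-in for a jitter function: shift each coordinate by
// uniform noise in [-amount, +amount]. A fixed seed makes runs repeatable.
let jitter (amount: float) (seed: int) (points: seq<float * float>) : (float * float) list =
    let rng = Random(seed)
    let noise () = (rng.NextDouble() * 2.0 - 1.0) * amount
    points
    |> Seq.map (fun (x, y) -> x + noise (), y + noise ())
    |> List.ofSeq

// Made-up points: the first two coincide and would overlap on a plain scatter plot.
let points = [ (18.0, 6.6); (18.0, 6.6); (25.0, 5.0) ]
let jittered = jitter 0.2 42 points
```

After jittering, the two coincident points are drawn at slightly different positions, so both are visible. The offset should be small relative to the axis scale so that the overall pattern is not distorted.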

In lines 25 to 30, four of the series in the fuel dataframe are set up to be plotted against each other in a scatterplot matrix. In line 32, this data is plotted:
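A scatterplot matrix is a grid with one scatter plot for every ordered pair of distinct series. The sketch below only enumerates those cells. It is not the code behind `Plt.scatterplotmatrix`, `matrixCells` is a hypothetical name, and the numbers are made up. It follows the common convention (used by R's `pairs`) of putting the column's variable on the x axis and the row's variable on the y axis.

```fsharp
// Hypothetical helper: list the off-diagonal cells of a scatterplot matrix as
// (x label, y label, points). The diagonal is skipped because it would plot
// a series against itself.
let matrixCells (labels: string[]) (data: float[][]) =
    [ for row in 0 .. data.Length - 1 do
        for col in 0 .. data.Length - 1 do
            if row <> col then
                // column variable on the x axis, row variable on the y axis
                yield labels.[col], labels.[row], Array.zip data.[col] data.[row] ]

// Made-up values for illustration only (not the fuel data).
let labels = [| "Litres"; "City"; "Highway"; "Carbon" |]
let data =
    [| [| 1.0; 2.0 |]
       [| 3.0; 4.0 |]
       [| 5.0; 6.0 |]
       [| 7.0; 8.0 |] |]

let cells = matrixCells labels data
```

Four series give 4 × 3 = 12 cells, and the first cell plots City on the x axis against Litres on the y axis.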


Finally, in line 34, a utility function is called to close this instance of Excel.

