Friday, 7 February 2014

Higher Order Functions in Excel

Making Interop Easier

As discussed in the last post, it seemed difficult to write clean code using Excel Interop. We mentioned that using NetOffice helped, but still left some challenges in writing simple clean code.

It would be very helpful if we could use Higher Order functions similar to those you get with Lists, such as map and iter. There was some useful code on the F# Snippets site, in particular some code created by Kit Eason - Higher-Order Functions for Excel. This used Excel Interop directly rather than NetOffice.

This worked well for smaller ranges in Excel, but I found problems using it for large ranges.

My Version

I decided to modify this code to make it work better with large ranges and also to use NetOffice.

To verify performance, I created 4 corresponding scripts:
  1. The original version
  2. Version 1 but using NetOffice
  3. Version 1 but with changes I made
  4. Version 3 but using NetOffice
The changes I made were as follows:

  • The map functions (map and maprc) do not apply to each cell as a range within Excel, but rather apply to the objects stored in each cell.
  • The iter functions (iter and iterrc) apply a function that reads the object in the cell and then generates a function option. This generated function does work on the cell as an Excel range. The use of the option allows you to restrict the actions to a subset of the cells within a range.
  • The filter function was removed, as the iter functions effectively include it through use of the option type.  
The most complex code is within the iterrc function, which is shown below for variant 4:

/// Takes a function and an Excel range, and applies the function to each individual cell,
/// providing 1-based row-index and column-index for each cell as arguments to the function.
let iterrc (f : int -> int -> obj -> (Range -> unit) option) (range : Range) = 
    let fc = range |> maprc f
    for r = 1 to range.Rows.Count do
        for c = 1 to range.Columns.Count do
            let fcrw = fc.[r, c]
            let cell = range.[r, c]
            if fcrw.IsSome then 
                fcrw.Value cell

This function takes as parameters a rather complex function and a range. The first line of the function maps the function supplied to generate a function option that takes a range as a parameter and generates a unit result. This function option is then used within the for loops to run against each cell in the overall range. Note the use of the option type for efficiency - the function is only called if it has a value. Also note the use of the Dispose method - this is very important  - without this the function will rapidly create many Excel COM objects and use up memory.

It is much easier to understand how this works with an example:

// iterrc example
//colours a chequer of reds
let chequerRange = 
    |> iterrc (fun r c cell -> 
                        if (r % 2 = 0) && (c % 2 <> 0) || (r % 2 <> 0) && (c % 2 = 0) then None
                        else Some(fun cell -> cell.Interior.Color <- 255))

The function provided uses the row and column parameters to determine whether it is to be excluded and then returns None or if included creates a function option that sets the cell colour to red.

Variant 4 seemed to perform well in my tests and given these promising results, I decided to put my version on the F# Snippets site - see High-Order Functions for Excel using NetOffice.

Using in Mortality Manager

I decided to adopt this code and have added some other Excel utilities. To make this more understandable, I created two modules for Sheets (module Sht) and Ranges (module Rng). I have only just started to use these, but here is an early example:

let VwInd(e) = 
    let sh = Sht.add "LocalIndex"
    let hdr = [ "Name"; "Nation"; "LocalFile"; "Uri" ]
    let vls = Ind.rows mortinds
    let rws = hdr :: vls
                |> array2D
    let rng = Rng.getSE sh (1, 1) (Array2D.length1 rws, 4)
    rng |> Rng.set rws
    rng |> Rng.iterrc (fun r _ _ -> 
                if r = 1 then Some(Rng.bold true)
                else None)
    sh |> Sht.autoFilter
    sh |> Sht.autoFit

This hopefully should be relatively easy to follow. It does the following:
  • Adds a sheet called "LocalIndex" to the active workbook
  • Creates a 2D array containing a header row and a number of rows of values
  • Creates a range in the sheet of the right size to hold this data
  • Sets the value of this range to this 2D array of data
  • Use the iterrc function to set the first row to bold
  • Applies autofilter to the sheet
  • Autofits the columns of the sheet to the cell contents
The code can be found on bitbucket.

No comments:

Post a Comment