Sergio and the sigil

Excel is the business format, automate it

Posted by Sergio on 2008-06-09

One thing I realized when working in large companies is that Excel is the true data exchange file format. That what the business types exchange among themselves and how they like to persist any kind of lists or anything that needs to be formatted like a table (be it tabular data or not.)

All too often one of this documents is forwarded to me containing some form of business data that will become all sorts of things: lookup data, configuration values, updates to an existing table, etc.

I dread these conversion tasks when I have to do them manually so I tend to automate the task as soon as it smells like a candidate for recurrence. That way I'll take a hit on the first time and reap the benefits from then on.

For some reason I like writing these automation scripts in Ruby. I could probably almost as easily write them in VBScript, C#, PowerShell or even VBA but my experience with Ruby is that it tends to be shorter (as in more concise) and easier to develop. Of course that varies with your level of familiarity with the Ruby libraries, but the learning curve isn't bad at all.

Consider the following hypothetical layout for an Excel file dropped in your inbox for import.

DescriptionPriceCategory
1DVD Player76.49Electronics
2Rain Coat35.10Men's apparel
3Code Complete49.99Books
............
12024" Monitor499.99Computer Accessories

My typical script to process such file would be the obvious line-by-line read, maybe processing each cell before ultimately converting the row to its final format (or skipping that row.) The skeleton of that kind of script is shown below.

require 'win32ole'

begin 
  excel_app = WIN32OLE.new('Excel.Application')
  book = excel_app.Workbooks.Open("c:\\project\\data\\datafile.xls")
  sheet = book.Worksheets(1)
  #first row contains only the headers
  row = 2 #start at the 2nd line (excel is 1-based)

  #stop when we find an empty row
  while sheet.Cells(row, 1).Value
    data = []
    (1..4).each {|c| data[c] = sheet.Cells(row, c).Value }
    #at this point the 'data' array contains the entire row
    #use it as desired *********
    
    row += 1
  end
  
rescue 
  #oops...
  puts "There was an error: #{$!}"
ensure
  #cleanup
  excel_app.DisplayAlerts = 0
  book.Close if book
  excel_app.Quit
end

But, no matter if you use Ruby or any other language, the value in automating these tasks can be tremendous. Taking the time to learn the basics of Excel automation is, in my opinion, very important and will let your users continue to use the tool that they like.

In a kind of reverse example, I have had occasions when one of the business users was repeatedly requesting for some of the data in Excel format during the application development (while we didn't have a web page to show that data yet) so, after he asked for a updated report for the second time, I created a script to read the data from the database, create an Excel file, and send the file attached in an email message every Monday morning. Easy way to gain a happy user.

Update: As noted in the comments, one nice reference for automating with Ruby is Ruby on Windows