Last year I entered the PowerBI video demo contest. Whilst I didn’t win any prizes, I did learn a fair bit from going through the exercise of putting together a screencast demo (more on that another time). In this post I’m going to walk through the web scraping part of my demo.
The website that I chose to use for my demo was the National UFO Reporting Center (more for novelty’s sake than any serious interest).
In the past I have written extensively about how to build custom components for SQL Server Integration Services. Those posts have always focused on the ‘happy path’; if you’re not familiar with this phrase, it refers to the path through your application that works exactly as expected. Often in development we have to deal with the sad path: when things aren’t working as we would like or expect.
This post is an extension of Tutorial 12 from Hortonworks (original here), which shows how to use Apache Flume to consume entries from a log file and put them into HDFS.
One of the problems that I see with the Hortonworks sandbox tutorials (and don’t get me wrong, I think they are great) is that they either assume you already have data loaded into your cluster, or they demonstrate an unrealistic way of loading data into your cluster: uploading a CSV file through your web browser.
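To make the idea concrete, here is a minimal sketch of the kind of Flume agent configuration the Hortonworks tutorial builds on: an exec source tailing a log file, a memory channel, and an HDFS sink. The agent name, file paths, and HDFS directory below are illustrative assumptions, not values from the tutorial.

```properties
# Name the components of this agent (agent name 'a1' is illustrative)
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# Source: tail a log file continuously (path is an assumption)
a1.sources.r1.type    = exec
a1.sources.r1.command = tail -F /var/log/sample/app.log

# Channel: buffer events in memory between source and sink
a1.channels.c1.type     = memory
a1.channels.c1.capacity = 1000

# Sink: write events into HDFS as plain text (path is an assumption)
a1.sinks.k1.type          = hdfs
a1.sinks.k1.hdfs.path     = hdfs://sandbox:8020/flume/events
a1.sinks.k1.hdfs.fileType = DataStream

# Wire source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel    = c1
```

An agent with a config like this can then be started with `flume-ng agent --name a1 --conf-file <file>`, which is a far more realistic ingestion pattern than a browser upload.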
A couple of weeks ago I was doing some work on an internal reporting cube. One of the measures required represents an ‘order backlog’: that is, orders that have been received but haven’t yet been provisioned in our systems.
The Problem

The fact table looks something like this:
A row appears in the fact table once the order has been closed, with the provisioned date set to NULL until the order has been provisioned.
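The backlog can then be read straight off that NULL marker. The sketch below shows a hypothetical shape for such a fact table and a query for the backlog; all table and column names are illustrative assumptions, not taken from the post.

```sql
-- Hypothetical fact table shape (names are illustrative)
CREATE TABLE dbo.FactOrder (
    OrderKey        INT   NOT NULL,
    OrderClosedDate DATE  NOT NULL,
    ProvisionedDate DATE  NULL,      -- NULL until the order is provisioned
    OrderAmount     MONEY NOT NULL
);

-- Orders in the backlog: closed but not yet provisioned
SELECT COUNT(*)         AS BacklogOrders,
       SUM(OrderAmount) AS BacklogValue
FROM   dbo.FactOrder
WHERE  ProvisionedDate IS NULL;
</imports>
```

The interesting part for a cube is that this set changes over time as orders are provisioned, which is what makes the measure non-trivial to model.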
The Microsoft PowerBI Competition is now in full swing with the voting open to the public for the next week. (Check out my entry).
As you can see below I just made my submission in time.
I like to cut it fine!
When I came to building my demo (check it out) I had a few different data sets in mind, but there were two main points that I wanted to highlight from my entry -
Previously I’ve written about the database unit testing framework tSQLt (you can read about it here); there is also an excellent Pluralsight course by Dave Green (blog | twitter), which you can find here.
In this post I’m going to show you a method of version controlling your database and unit tests with SQL Server Database Projects in SQL Server Data Tools (SSDT).
Setting up the Solution

In my solution I’ve created two SQL Server Database Projects
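For readers new to tSQLt, a minimal sketch of the kind of test that would live in one of these projects is shown below; the test class, table, and procedure names are illustrative assumptions, not taken from the post.

```sql
-- Create a test class (a schema that groups related tests)
EXEC tSQLt.NewTestClass 'OrderTests';
GO

-- A minimal tSQLt test: fake the table, insert known rows, assert the result
CREATE PROCEDURE OrderTests.[test only unprovisioned orders are counted]
AS
BEGIN
    -- FakeTable swaps dbo.Orders for an empty, constraint-free copy
    EXEC tSQLt.FakeTable 'dbo.Orders';

    INSERT INTO dbo.Orders (OrderKey, ProvisionedDate) VALUES (1, NULL);
    INSERT INTO dbo.Orders (OrderKey, ProvisionedDate) VALUES (2, '2015-01-01');

    DECLARE @actual INT =
        (SELECT COUNT(*) FROM dbo.Orders WHERE ProvisionedDate IS NULL);

    EXEC tSQLt.AssertEquals @Expected = 1, @Actual = @actual;
END;
GO

-- Run all tests in the class:
-- EXEC tSQLt.Run 'OrderTests';
```

Keeping tests like this in their own database project, separate from the production schema project, is what allows them to be version controlled alongside the code without ever being deployed to production.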