Archive for September, 2012

Downloading Vancouver Weather Data Using PowerShell

I’m looking for some data to play around with in Tableau. Fortunately, Canadian climate data is available in CSV and XML from http://www.climate.weatheroffice.gc.ca/. I thought I’d fetch the Vancouver daily weather data since 1953 using PowerShell.

Here are a few PowerShell snippets for anyone interested. I haven’t cleaned these up yet, but they should be pretty functional.

Script to Fetch Data

$yearfrom = 1953
$yearto = 2012
#destination folder; it must already exist
$folder = "C:\Temp\Weather Files"
#one web client can be reused for every download
$webclient = New-Object System.Net.WebClient
$webclient.UseDefaultCredentials = $true
for ($i = $yearfrom; $i -le $yearto; $i++)
{
   #one csv file per year
   $filename = "weather_$($i).csv"
   $weatherfile = Join-Path $folder $filename
   $url = "http://www.climate.weatheroffice.gc.ca/climateData/bulkdata_e.html?Prov=BC&StationID=889&Year=$($i)&Month=1&Day=1&timeframe=2&format=csv"
   $webclient.DownloadFile($url, $weatherfile)
}
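
One thing the scripts in this post take for granted is that the target folders (Weather Files, Clean and Merge) already exist. A minimal sketch of a helper that creates a folder when it is missing follows; the name Confirm-Folder is hypothetical and not part of the original scripts.

```powershell
# Hypothetical helper (not in the original scripts):
# create the folder if it is missing and return its path.
function Confirm-Folder {
    param([string]$Path)

    if (-not (Test-Path -LiteralPath $Path -PathType Container)) {
        # -Force also creates any missing parent folders
        New-Item -ItemType Directory -Path $Path -Force | Out-Null
    }
    return $Path
}
```

With this in place, a line such as $folder = Confirm-Folder "C:\Temp\Weather Files" could stand in for the plain assignment at the top of each script.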

Script to Remove Extra Lines

$path = "C:\Temp\Weather Files\*.*"
$newfolder = "C:\Temp\Weather Files\Clean"
 
Get-ChildItem -Path $path -Include "*.csv" |
Foreach {
   #need to delete lines 1-25, leave only data
   $file = $_
   $newfilename =  "clean_$($file.Name)"   
   $newfile = Join-Path $newfolder $newfilename
   $filecontents = Get-Content $file.FullName   
   $filecontents[25..($filecontents.length-1)] | Out-File $newfile -Encoding ascii
}
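
The script above hard-codes the number of preamble lines to skip. If that count ever differs between files, one alternative is to look for the column header row instead. The sketch below assumes the header row starts with "Date/Time" (as in the header used in the merge script further down); the function name Get-WeatherDataLines is made up for illustration.

```powershell
# Hypothetical helper: return only the data rows that follow the column header row.
# Assumes the header row begins with "Date/Time" (including the quotes).
function Get-WeatherDataLines {
    param([string[]]$Lines)

    # Locate the header row
    $headerIndex = -1
    for ($i = 0; $i -lt $Lines.Count; $i++) {
        if ($Lines[$i].StartsWith('"Date/Time"')) {
            $headerIndex = $i
            break
        }
    }

    # No header found: emit nothing rather than guess
    if ($headerIndex -lt 0) { return }

    # Skip the preamble and the header row itself
    $Lines | Select-Object -Skip ($headerIndex + 1)
}
```

Inside the Foreach block, the output of Get-WeatherDataLines -Lines (Get-Content $file.FullName) could be piped to Out-File in place of the fixed 25.. slice.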

Script to Merge Files

$csvfolder = "C:\Temp\Weather Files\Clean"
$mergefolder = "C:\Temp\Weather Files\Merge"
$mergedfilename = "merged_all_weather.csv"
$header = @"
"Date/Time","Year","Month","Day","Data Quality","Max Temp","Max Temp Flag","Min Temp","Min Temp Flag","Mean Temp","Mean Temp Flag","Heat Deg Days","Heat Deg Days Flag","Cool Deg Days","Cool Deg Days Flag","Total Rain (mm)","Total Rain Flag","Total Snow (cm)","Total Snow Flag","Total Precip (mm)","Total Precip Flag","Snow on Grnd (cm)","Snow on Grnd Flag","Dir of Max Gust (10s deg)","Dir of Max Gust Flag","Spd of Max Gust (km/h)","Spd of Max Gust Flag"
"@
 
#build the full path first, then put headers in final merged file
$fullmergedfilename = Join-Path $mergefolder $mergedfilename
$header | Out-File $fullmergedfilename
 
Get-ChildItem -Path $csvfolder | 
Get-Content |
Out-File $fullmergedfilename -Append
 
#open up windows explorer, let's check
explorer $mergefolder
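
Besides eyeballing the folder in Explorer, a quick sanity check on the merged file is to count rows per year. This is only a sketch: it relies on the Year column from the header above, and the function name Get-RowCountByYear is hypothetical.

```powershell
# Hypothetical helper: count the data rows for each value of the Year column.
function Get-RowCountByYear {
    param([string]$CsvPath)

    Import-Csv -LiteralPath $CsvPath |
        Group-Object -Property Year |
        Sort-Object -Property Name |
        Select-Object @{ Name = 'Year'; Expression = { $_.Name } }, Count
}
```

Running Get-RowCountByYear -CsvPath $fullmergedfilename should show roughly 365 or 366 rows for each complete year if the daily files merged cleanly; a year that is missing or far short of that is worth a second look.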

The company I work for is evaluating tools for BI/Visualization, and Tableau is one of the front runners that we are considering.

I’ve been fortunate to be able to attend this week’s Advanced Tableau training in Vancouver. Our trainer is one of the Tableau Jedis, Interworks’ Director of Business Intelligence (BI) Dan Murray, and boy does he know how to impress.

At first, because of budget constraints, I was contemplating whether I should just watch the videos, since Tableau has graciously published a number of them that allow anyone to learn the product. But I’ve always thought that there’s something to learn from in-class training, especially if you get an awesome instructor. And I am fortunate, and very thankful, that I did get an awesome instructor. I cannot believe how much he packed into two days of training. Don’t get me wrong, my brain was full after two days, but even after the class ended I just wanted to keep going and try out more Tableau stuff.

I won’t go through each topic that we covered, but for anyone interested, the curriculum for the Advanced Tableau training is posted on the Tableau site. Upcoming training sessions from Interworks are posted on the Interworks site.

Anyway, just wanted to share some of the tidbits I have learned:

Tableau Specific

  • Extracts allow you to work with your data really fast.
    • True story – in v6 an 80M record Tableau extract took > 30 mins to load. In v7, it took ~ 1 second.
  • The color of the pill matters. BLUE means discrete. GREEN means continuous.

  • Tableau can do inner, left, and right joins, as well as unions, although Tableau usually doesn’t recommend unions because they don’t guarantee performance.
  • Tableau works great with relational data, even flat files. Within the same connection, you can do JOINs on your tables, but within different connections you can BLEND. JOINs happen at the server (or source), BLENDs happen locally.

    Tableau JOIN

    Tableau BLEND

  • Ctrl + dragging a pill allows you to copy the pill and repurpose it
  • Actions are almost always faster than parameters
  • Quick filters are great, but use sparingly. When dashboarding, real estate is expensive (sounds like the Vancouver real estate market)
  • When adding a reference line, invoke it from the axis you want to reference
  • Annotations can be risky – might fly around when your data changes
  • Tableau tooltips look awesome by default.

Data Exploration and Visualization

  • Easy is hard. There’s a lot of work in making something easy.
  • Don’t be afraid to explore your data. Sometimes, you know which questions to ask, but sometimes you don’t. In either case, don’t be afraid to experiment, explore. You might be surprised at what you can discover.
  • Dan differentiates data into 3 Types:
    • Type 1 – Data you know (normal BI) – e.g. sales, profit, etc.
    • Type 2 – Data that comes from Type 1, is the blip, is the explanation to the question.
    • Type 3 – Data you needed to know that you didn’t know you needed to know; Real data discovery; Can usually be explored with scatter plots; This is where jaw dropping starts
  • Scatter plots are a great way to explore data.
  • If you’re serious about visualization, you’ll read Stephen Few’s books and watch Hans Rosling’s TED talks. Also check out Dieter Rams and Iain McGilchrist’s The Divided Brain on YouTube
  • When dashboarding – always ask yourself the question – what story am I trying to tell?

I am pretty excited to start working with Tableau. I plan to do more Tableau-related posts and tutorials in the future (of course incorporating all the tricks I’ve learned from this training session).

I work with SSRS a lot, and I can see how SSRS and Tableau can be great complements.

As a last bit, here’s my super awesome dashboard, done in 40 minutes and uploaded to Tableau Public! (With lots of help from Tableau Jedi Dan Murray.)

Go ahead, click on the image to play with it. Go ahead, I know you want to. :)
