PASS Summit Day 1 – Keynote, MCM, SQLPASSTV, ServiceBroker, BI Power Hour, MDX, Who Dunnit?

Written by belle
Posted November 10, 2010 at 4:09 am

Just a heads up. Today I’ve seen the most impressive BI demos – ever! *jaw drop*

Key Note

My takeaways:

Atlanta, Denali, Crescent

The next version of SQL Server (code names Project Atlanta, “Denali”, and “Crescent”) – mostly about teradatas, column stores, and reporting and diagnosing from the cloud – will change the landscape of SQL Server.

And man, PowerPivot is everywhere!

You can now download the Denali CTP1 version from here:
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=6a04f16f-f6be-4f92-9c92-f7e5677d91f9

My iPad cannot compete with trusty laptops and MacBooks, and the wireless during the keynote was essentially nonexistent. So I have to refer you to – ta da – Brent Ozar’s posts – for the lowdown of events:
Read the rest of this entry »

Rating: 6.0/10 (4 votes cast)

Rating: 0 (from 0 votes)

1 Comment

Filed under: PASS Summit 2010, SQLPASS

PASS Summit 2010 Day 0 – PreCon – ETL with SSIS

Written by belle
Posted November 9, 2010 at 1:36 am

PASS has officially started. For the second year in a row, I am attending Brian Knight’s (blog | twitter) Pre Conference. And as usual, Brian Knight – this time with Patrick LeBlanc (blog | twitter) – have delivered. Hats off to them, for making the pre-con very very informative, yet very entertaining.

Notables

Here are some of my notables from the session:
Read the rest of this entry »

Rating: 0.0/10 (0 votes cast)

Rating: 0 (from 0 votes)

No Comments

Filed under: PASS Summit 2010, SQLPASS

PASS Summit 2010 Day -1 – NoSQL, DBA Myths and Twitter

Written by belle
Posted November 8, 2010 at 12:57 am

Yeah I know, my counting system is off. This is my weird counting plan, you just have to bear with me:

Day -1	One day before official PASS week. Sunday.
Day 0	Pre Conference. Monday.
Day 1	Real Day 1. Tuesday. After all, everyone is going to refer to this as Day 1 anyway. Best to be consistent.
Day 2	Real Day 2. Wednesday.
Day 3	Real Day 3. Thursday.
Day 4	Post Conference. Friday.

Day -1 Summary (read Day minus 1)

PASS has not officially started, but my adventure already started.

Got my official PASS Summit 2010 backpack swag. Woo hoo! What can I say, swags make me happy.

By this time I’ve already devoured through the PASS Summit 2010 Program Guide, the schedule, and Im halfway through the free SQLServer Magazine!
Read the rest of this entry »

Rating: 8.3/10 (4 votes cast)

Rating: 0 (from 0 votes)

1 Comment

Filed under: PASS Summit 2010, SQLPASS

SQLPASS Pre-Conference Lessons Learned – Building a Microsoft Data Warehousing Platform (Brian Knight)

Written by belle
Posted November 2, 2009 at 6:12 pm

I just finished the SQLPASS pre-conference from Brian Knight – Building a Microsoft Data Warehousing Platform, and I’m so pumped to do more BI! Reporting, Data Warehousing, Data Mining – all fun stuff.

In a gist, this is what we learned today (and it’s a lot of acronyms):
– SSIS, ETL
– SSAS
– Cubes, Data Warehouse
– Data Mining
– MDX and DMX (on a very very high level)

It was a lot of details and demos jammed in one day, but I won’t have it any other way, and overall the session was just great! Loved it!

Here are some bits and pieces of information / tips / useful sites / resources (probably mostly for my benefit, so I remember. But if you find it useful, then all the better):

65% of data warehousing projects fail, and one of the reasons for this is lack of communication. Business users want something, but devs deliver something else.
Waterfall model for DW projects – hard to react. Agile is better – and Brian recommends 1 month iterations (ie complete deliverable cycle per month)
Why data warehouse? Consolidate data, aggregates, reporting, analytics, archived data (ex if you need to keep data for years and years)
Business Key = Alternate Key = PK from source
Surrogate Keys – insulates data warehouse from source changes
In one of Pragmatic Works’ projects, they lowered SSIS loading time from 4.5 hours to 3 mins – all about choosing “better” way. In SSIS most components are in memory and are synchronous.
- Most SSIS components are synchronous – good for performance
- Some are asynchronous and partially blocking, not so good – for example UNION ALL and MERGE – slight overhead
- Some are asynchronous and fully blocking – not good at all – for example sort or aggregate transforms – but sometimes there are reasons to use these
In BIDS, devenv.exe -NOSPLASH to prevent splash screen
In SSIS – Protection Levels for Packages – EncryptSensitiveWithUserKey generally a bad idea because it associates the package with your profile; if you transfer package, might say package is corrupt. Best to Encrypt All With Password; ServerStorage relies on msdb
Different types of dimensions
- Type 0 – Changing Attribute – updates; no history kept
- Type 1 – Fixed Attribute – cannot change under any circumstance; ignores change
- Type 2 – helps track changes; biggest disadvantage is additional storage; it creates a new record per update
Good practice – smaller packages; easier to change and maintain
Slowly Changing Dimension Wizard (SCD) – data types need to be lined up, need to be very careful; easy to break; also when you need to customize later on and relaunch the wizard, previous customizations will be dropped. You should keep documentation on how to recreate your customizations
Another problem with SCD – similar to a cursor, goes through row by row; alternative way is to create staging table first. Use T-SQL for what it’s good at, SSIS for what it’s good at. If you can do sorts in T-SQL, do the sorts in T-SQL
Inferred members – data that not in source, but is in Fact; for example, if it takes 2 weeks to add a salesperson in HR, but salesperson already has sales from Day 1
Better load – “Fast Load”; also configure “Maximum Insert Commit Size”
Every member should has an unknown member
ALWAYS alias your columns
Alternatives to detecting updates – instead of specifying all columns in a multi OR expression, use CHECKSUM or HASHBYTES. Problem with CHECKSUM is it is not guaranteed to be unique. Ex.
```
HASHBYTES('sha', ISNULL(col1, '') + ISNULL(col2, ''))
```
So in your expression column can just have something similar to: (DT_WSTR, 64)yourhashcol_src != (DT_WSTR,64)yourhashcol_dw
SQL Cache Transform – creates cache file, can be reused multiple times; 80% faster in SQL Server 2008; SQL Server 2005 was bad with threading
Terminologies – data warehouse (SSAS 2008)
- Measure Group = Fact Table
- Attributes = Columns in a Dimension
- Member = Row in an Attribute
Hierarchies are shortcuts for end users; you can ultimately hide actual attributes (AttributeHierarchyVisible = True), and just expose the hierarchies
Always reconnect to cube after dimension or attribute changes so changes are apparent
Set up attribute relationships ex:
```
Date SK -- Date -- Month -- Quarter -- Year 
```
Note that uniqueness of quarter is really Year + Quarter.
Set up constraints/uniqueness through the “Key Column”. Set up what’s being displayed to user using “Name Column”. You will probably need to explore your dimensions and see which attributes need this.
For any time dimensions, add “Time Intelligence”. As a side effect, this automatically adds additional filters like “Yesterday”, “This Week”, “Last Week” etc in Excel
“Mushroom Cloud” – Invalid queries, but you are getting invalid results. For example if you drag Product then EmployeeCount. This gives you the EmployeeCount value for Product.
In Cube Editor, MeasureGroup, set IgnoreUnrelatedDimension = False helps prevent “mushroom clouds”; result will display blank instead of giving you invalid value
Discretization Method – allows you to create dynamic buckets of data
- Automatic
- EqualAddress – your just specify number of groups
- Clusters – groups based on your data
HideMemberIf (trying to remember why this option was highlighted in the session)
MDX is language for querying the cube/SSAS; used for creating KPIs, calculations
Tuple in MDX points to a cell; you can work with six (6) different axes. Good book is MS Press MDX Step by Step
When you want to show data, put in SELECT; if you don’t but you want to filter by it, put in WHERE
Parenthesis in an MDX helps you represent a tuple
SSRS and Excel can generate some MDX
Calculations < Calculated Member does not store anything in the cube, just the metadata; actual calculations done on the fly
- PRO: Only formula is stored in the cube; don’t need extra storage
- CON: Only formula is stored in the cube; performance will be worse
Usage Based Optimization : Go to SSMS < Connect to SSAS < Set CreateQueryLogTable = true; you can also choose to “Review Aggregation Usage”
How to lock data in your cube : Go to your dimension then go to Role, and add roles/users
Option: Enable Visual Totals – like an implicit WHERE; Off by default, which means users will still see all data even if data is locked down; for example, if user can only see US data, his/her totals should be just for US, but s/he will see US Totals, then a different Grand Total. Enable this to lock down these totals to only the data they can see
Deployment wizard in SSAS – creates .asdatabase file that contains metadata required to deploy the cube
You dont need to always process the whole cube; you can do incremental processing
Partitioning in SSAS – in about 50K records, you will start to feel the slowness
SQL Server Standard allows 3 partitions; SQL Server Enterprise allows any number
New feature in SQL Server 2008 Enterprise – Proactive Caching
In SQL Server 2008, you can use profiler on SSAS to deconstruct MDX queries. In Excel, you can also right click on some columns then check MDX
There is an option to do offline OLAP : Excel < Options < OLAP Tools < Offline OLAP. You can get to your data, but actions and drillthrough are off
Double hop issue in SSRS and SSAS when you build reports on top of the cube : check Kerberos, SPN
You don’t need a cube to do data mining! You can also allow your users to do predictions based on flat tables or views.
Sample data mining scenario: used to auto approve insurance claims; can be used for fraud detection; can be integrated with SSIS in conditional tasks
For data mining structures, Decision Trees is the best way to start; as far as alogrithms go, you can even create your own, or purchase 3rd party algorithms
Users can get/do data mining through Excel
Users can do drillthroughs with data mining

Useful Resources / Sites

pragmaticworks.com
sqlshare.com
ssas-info.com
dtsxchange.com
sqlserverdatamining.com
Marco Russo’s Expert Cube Development with Microsoft SQL Server 2008 Analysis Services
Brian Knight’s 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services (Wrox Programmer to Programmer)
MS Press Microsoft SQL Server 2008 MDX Step by Step (Step By Step (Microsoft))

Rating: 9.0/10 (2 votes cast)

Rating: 0 (from 0 votes)

2 Comments

Filed under: SQLPASS

belle’s sql musings

Archive for the ‘ SQLPASS ’ Category

PASS Summit Day 1 – Keynote, MCM, SQLPASSTV, ServiceBroker, BI Power Hour, MDX, Who Dunnit?

Key Note

Atlanta, Denali, Crescent

PASS Summit 2010 Day 0 – PreCon – ETL with SSIS

Notables

PASS Summit 2010 Day -1 – NoSQL, DBA Myths and Twitter

Day -1 Summary (read Day minus 1)

SQLPASS Pre-Conference Lessons Learned – Building a Microsoft Data Warehousing Platform (Brian Knight)

Useful Resources / Sites

sqlbelle’s new blog

SQL Server with PowerShell Cookbook

belle. sql server. mvp. mcitp. mct. bcit.

belle @ DevTeach Vancouver 2009

Popular Posts

Categories

Tags

Blogroll

Blogroll - SharePoint

Blogroll - SQL Server

Recent Posts