SSAS -- to Partition or not?

by Rob 20. January 2010 01:16

Recently I had a question from a client whether there was really any performance improvement in SSAS cube processing when using partitioning strategies vs. just using a single partition.

While intuitively I "knew" that partitioning cube fact tables results in better parallelism and thus increased performance (when adequate CPU, memory and I/O resources can support it), I realized I really didn't have any empirical "proof" to backup my professional opinion.  So--I setup a simple test.

The following is a very simple comparison of processing the same data set with a cube design that's identical--except for the introduction of multiple partitions. For this test I used Analysis Services 2008R2 x64 running on a 4-core Xeon server

The data set processed includes 37 million fact rows, and several dimensions--the largest including around 4,000 members.  In the first test, a single partition is used.

In the second, the cube is partitioned along months, resulting in about 1.3 million rows per partition.  This is below the recommended threshold for partition sizing, but provides a decent (if basic) evaluation of the effect of parallelism on cube processing.

Results:

  • Time to process the cube using a single partition: 21 minutes
  • Time to process the cube using 23 partitions: 12.7 minutes
  • Bottom line: by partitioning the processing of the cube, there's a time savings of around 40% in this case
Observation: The server I used is in my lab and has 4 Xeon cores, 4GB RAM, and 1.3TB RAID 1E storage. During the single partition processing, memory demand didn't exceed 3GB, and CPU utilization didn't exceed 60% or so.  With partitioned processing, both CPU and memory were completely saturated.  When I get the chance (and the required loose change) to upgrade the server to 8 cores and more memory, I'd like to re-run this to see whether increased resources would provide an even greater benefit.  However, having at least this much empirical evidence is a good start.



Single partition processing

23 Partitions processing

Tags:

Analysis Services | SQL Server 2008

TechEd 2009: Microsoft BI - Gemini

by Rob 11. May 2009 23:57

Today I had the pleasure to attend Donald Farmer and Kamal Hathi's session on the Gemini project.  I've seen several of Mr. Farmer's presentations on BI and Data Mining, and I'm never disappointed at his skills and depth of knowledge!

Gemini is an in-memory, Excel-based, analysis services orientated technology that brings the power of dimensional modeling to a primarily Excel skilled audience--with the promise that the output of Gemini can be forward-engineered into managed Analysis Services solutions. 

Donald's quite entertaining slide deck was a major departure from the typical corporate spin that we're accustomed to seeing (and that I'm personally accustomed to producing) when we educate potential technology users.   

The slide deck casts the analyst, boss and IT admin as silent movie characters, where the analyst and IT professional struggle to keep the boss happy within the limitations of their skills and the existing content in the corporate DW.  It's the all-too-familiar "I need that analysis by tomorrow" versus "it takes time to put that data together".  Terribly apt and consistent with what we see in the "BI trenches" every day.

The silent movie theme stresses that this situation hasn't changed that much in many decades (and in any event in the two decades I've been working on BI solutions).  Of course, the slides were endlessly entertaining while driving home these points!

As for the "Gemini" product, it's quite exciting.  BlueGranite has been in the business of helping our clients' analysts pull info together quickly and make it meaningful since the ProClarity days.  One of my lingering concerns has been that--since ProClarity was acquired and the future of its core technology became uncertain--what tools within the Microsoft product suite would serve these "data diggers"? 

Excel pivot tables against SSAS cubes are fantastic--but how to make sure analysts can get data pulled together by the cube designer fast enough to meet business needs?  If the corporate governed DW/Cubes evolve slowly, how can analysts fill in the gaps? 

Today I still recommend ProClarity for the deep data analysts (as a complement to Excel), as it still provides more flexibility and richer visualization than Excel pivot tables for advanced analysts.  But I think Gemini may one day (hopefully soon) deliver even more, and really become the data digger's tool of choice.

Gemini is clearly intended to fill the analyst gap.  And in so many ways I think it will. It's at once a more approachable way for non-OLAP users to build rich dimensional models, a way to make more dynamic data integrations using the familiar Excel environment, and a rich OLAP query tool to be used against these analyst-generated models.

In truth, I can easily see even seasoned SSAS pros (including myself) using Gemini during prototype and early development, in addition its intended less sophisticated audience.

The things I love about Gemini:

1. Users import data from all types of sources:
  
a. Structured relational
  
b.
Existing OLAP Cubes
  
c.
Subscribe to "Service Documents" (a form of RSS feed that contains tabular data)
  
d. Paste in any tabular data (e.g. copied to clipboard from a web page)

2. Fast processing of large data volumes (100M rows demonstrated, 20M demo'd on a netbook)

3.  Ability to add calculated measures at the Excel pivot-table layer (sweet!)

4.  Ability to connect to Gemini models as SSAS data sources (rocks!)

5.  Tight integration with SharePoint as a basic architectural construct

6.  Translation of most OLAP concepts to Excel terminology more familiar to analysts

Some things I would like to see improved, or clarified. 

I would have asked these questions, but Q&A was cut short today due to time constraints...

1. As simple as they seem to MDX people (like me), my sense is the DAX expressions are going to be too complex for many of the analysts I train.  It reminds me of PPS-P PEL, in that the expressions are intended to be simplified from MDX, but they're still complex and require multidimensional thinking--which isn't a gift many of us are born with.   PEL generated lots of push-back from analysts when I demonstrated it to "real customers", and I fear DAX may as well.

2. To address #1, I hope the product team will consider following the model ProClarity set with it's KPI Designer, which allows users to use wizards to build calculated measures such as ranking and bubble-up exceptions.  KPI designer users build complex MDX without knowing that's what they're doing.  I still train new users on these tools, and it's a positive for them to use wizards (rather than purely language constructs, as with DAX).

3. I'd like to see some MDM tie-in to ensure that already accepted calculations, data sources, etc., can be drawn upon and re-used in a Gemini solution.  I can see a BI governance issue (and IT objection!) if many analysts are building silo BI solutions without some centralized baseline to start from.  Gemini doesn't prohibit a DW/MDM baseline, but it doesn't appear to promote one either.

4. Security really wasn't addressed in the session.  If analysts will be pulling 100M rows of fact data into a desktop solution…how does a corporate IT policy ensure that that such huge volumes of data isn't lost in the back of a taxi?  BitLocker would be a convenient answer to this question, but I hope it isn't the only one.

5. Data mining--not sure if it's possible to incorporate DM models into the Gemini models, but the combination of capabilities on the desktop would be really fantastic!

In-memory OLAP is a hot technology, and as illustrated by Donald's slides, the world has been waiting far too long to put such powerful tools in the hands of typical analysts.  I can't wait to see how this possibly disruptive technology impacts the wide swath of users its intended to benefit!

Tags: , , ,

Analysis Services | Analytics

SSAS 2008 Deployment: The connection either timed out or was lost

by keruibo 16. April 2009 01:32

The following issue is a problem in SSAS that you might run into either in test or deployment environments (see references). 

In a nutshell, when a client session running Vista or Windows 2008 talks over the network to an SSAS server running on Windows 2008, and they use Kerberos for authentication, there likely will at some point be connectivity problems that resemble timeout or firewall blocking problems. 

Errors might look like one of the following:

The connection either timed out or was lost.
Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
An existing connection was forcibly closed by the remote host

The two places we’ve observed these problems are:

1.       MDX queries submitted to the server that are reasonably large, such as those PPS submits, or perhaps those coded by hand that become large.  “Large” might be something like 1000 bytes or so.

2.       Deploying databases over the network from a Vista/2008 client to a 2008 server.

This problem still exists as of SQL 2008 SP1, so applying that update isn’t a resolution. This problem definitely exists when client & server are on different machines; may not be a problem when client/server are on the same computer (e.g. VPC or BIDS on a remote desktop session). 

From what I understand, MS is aware of this and we should expect a post-SP1 CU soon.  Whether this is to be resolved as a Windows patch or a SSAS patch is unclear to me. 

The following are work-arounds in the mean-time:

1.       If either the server or client is running XP or Windows 2003, the problem should not occur in any case.

2.       If you can edit the connect string (e.g. SSRS/Excel), specify ;SSPI=NTLM as a parameter.    Be advised that user security delegation will not succeed if this workaround is used.

3.       Connect with the IP address instead of machine name or FQDN.  Since Kerberos is unsupported with IP address connections, this also forces a fallback to NTLM.  Kerberos delegation also will not succeed when this workaround is used.

 Of particular note is that if SSAS is to be deployed on a Windows 2008 server, all available workarounds currently imply kerberos delegation will be impossible (a short-term driver for SSAS deployment on Server 2003?). 

 References:

http://blogs.msdn.com/psssql/archive/2009/04/03/errors-may-occur-after-configuring-analysis-services-to-use-kerberos-authentication-on-advanced-encryption-standard-aware-operating-systems.aspx

http://denglishbi.spaces.live.com/Blog/cns!CD3E77E793DF6178!1214.entry

http://sqlblogcasts.com/blogs/drjohn/archive/2009/03/28/kerberos-kills-large-mdx-queries-on-windows-server-2008-ssas.aspx

Tags: , , ,

Analysis Services | Configuration | Security

Scripting Analysis Services Database Backup

by keruibo 30. August 2008 20:19

There are a number of ways to automate backups of Analysis Services databases.  Some examples I've seen: SSIS Packages, PowerShell, SQL Agent jobs with XMLA. 

Here is a "quick and dirty" way using a .vbs cscript that can be called form the operating system, or scheduled in SQL Agent as a cmdexec step. This script will backup all Analysis Services databases programatically (i.e. if someone adds a new database, it will automatically be backed-up without further operational changes).

I'm not saying this is the best way to do this, and the script below completely lacks error trapping.  But it is effective, simple to implement, and may form a good base for a robust script in your environment.

 

 '-------------  Get the target folder from the command-line  -----------
 strBackupPath = WScript.Arguments.Named.Item("folder")
 WScript.echo "Destination folder: " & strBackupPath 

 if strBackupPath = "" then     wScript.Echo "'folder' parameter resulted in a blank string"     WScript.Echo "Usage: cscript BackupSSAS.vbs /folder:<full path to backup folder>"     WScript.Quit 1  end if

 strBackupPath = strBackupPath + "\"

 '-------------- Get TimeStamp in form YYYYMMDDhhmmss  -------------  MyTimestamp = CStr(Year(Now()) * 10000 + Month(Now()) * 100 + Day(Now())) _    + CStr(Hour(Now()) * 10000 + Minute(Now()) * 100 + Second(Now()))     '-------------  Connect to SSAS Instance  --------------------  Set ssas = CreateObject("Microsoft.AnalysisServices.Server")  ssas.Connect("localhost")

 '-----------  Backup Databases   -------------------------  For Each MyDatabase In ssas.Databases      MyDatabase.Backup strBackupPath + MyDatabase.Name + "_" + MyTimestamp + ".abf"  Next    '------------  Disconnect from SSAS Instance  -----------------  ssas.Disconnect()

Tags:

Analysis Services

Microsoft BI with Constrained Kerberos Delegation

by keruibo 17. May 2008 22:40

In a Microsoft BI environment, we very often want to grant data visibility permissions at the datebase level.  The most common way to accomplish this is to use Kerberos delegation. 

In Active Directory, delegation comes in two flavors: Constrained and Unconstrained. Constrained delegation provides an enhanced level of security for deployments where Kerberos delegation is used to pass end-user credentials to back-end services.

In an unconstrained delegation configuration, servers and service accounts are trusted to send Kerberos tickets to any service on any destination computer. This typically isn’t a problem, since administrators know what their service accounts are used for, and what software is installed on their servers.

However, using constrained delegation provides an additional level of security by restricting which back-end services on which destination server a computer account or service account may pass Kerberos tickets to. Constrained delegation satisfies the “principle of least privilege”, where even trusted principals are granted only the minimum permissions needed to get their job done.

Promoting delegation from unconstrained to constrained delegation is relatively simple. This additional level of security isn't without cost, however.  Following a constrained model will result in additional long-term administration (the addition of a new back-end web server or database will require additional Active Directory configuration). However, if the least privilege principle is the best practice in your company, constrained delegation is for you.

The following process goes through constrained configuration of a typical distributed Microsoft BI environment. This process assumes you’ve already configured an unconstrained Kerberos delegation environment. The following only covers the upgrade to constrained delegation.

In the following scenario, there are:

  1. A single database server running SQL Server and Analysis Services (BI-DB).
  2. An application server hosting PPS Monitoring Server, ProClarity Analytics Server and Reporting Services (BI-APP)
  3. Finally there is a SharePoint server (BI-WEB).

Services on each server are run using the service accounts svc_bi_db, svc_bi_app, and svc_bi_web, respectively.

First the documentation. Don’t skip this step! The key to completing this process successfully is organization, because any mistake may cause integrated authentication to fail and the troubleshooting will be difficult and even more time consuming! You need to know what service accounts are running each service on every server, and use this information to make Active Directory configuration changes.

Documentation is a two-step process: document delegation trusts needed for each server, then the trusts needed for each service account. If you’ve already setup your basic Kerberos delegation using SetSPN or ADSIEdit, you should have this information handy.

The following are the trust requirements for our scenario:

 
Table #1 - Computer Delegation Trusts
 Delegator
(Computer)
 Delegatee
(Service Account)
Allowed Destination
(service/server:instance)
 
 BI-APP  svc_bi_db MSOLAPSvc.3/BI-DB:PROD
MSSQLSvc/BI-DB:1433
 BI-WEB

svc_bi_db

MSOLAPSvc.3/BI-DB:PROD
MSSQLSvc/BI-DB:1433

 BI-WEB  svc_bi_app HTTP/BI-APP
 
Table #2 - Service Account Delegation Trusts
 Delegator
(Service Account)
 Delegatee
(Service Account)
Allowed Destination
(service/server:instance)
 
svc_bi_app  svc_bi_db MSOLAPSvc.3/BI-DB:PROD
MSSQLSvc/BI-DB:1433
svc_bi_web

svc_bi_db

MSOLAPSvc.3/BI-DB:PROD
MSSQLSvc/BI-DB:1433

svc_bi_web svc_bi_app HTTP/BI-APP

 

 

With this information, we’re ready to make the Active Directory changes. These changes are mostly made in the Active Directory Users and Computers snap-in, however we’ll see that the named instance used for the OLAP server isn’t supported by this snap-in (I don't know whether this is a bug or the intended behavior, but you can read about it in Microsoft knowledge base article 936628). We'll work around this limitation by making the final configurations in the ADSIEdit snap-in.

First let’s take care of the machine account configurations:

  1. Launch Active Directory Users and Computers.  Find the PPS server (BI-APP for us) in the list, double-click it.
  2. Click on the delegation tab. 
  3. Select the third option, Trust this computer for delegation to specified services only
  4. Click the Use Kerberos Only radio button
  5. Click the Add button
  6. In the Add Services dialog, click the Users or Computers… button, then enter the service account used to run the database services on the database server (the second column in table #1). 
  7. In the Add Services dialog, we need to select all the services in the third column of table #1 (MSSQLSvc/BI-DB:1433 and MSOLAPSvc.3/BI-DB:PROD in this example), then click OK.  Note that if your OLAP database is a named instance (as ours is), then only MSSQLSvc is available…this is because this snap-in doesn’t work for named OLAP instances.  We’ll get around this problem later using the ADSIEdit snap-in.
  8. With MSSQLSvc added to the Delegation tab of the BI-APP machine account, press OK on this dialog to save the delegation settings.
  9. Repeat the same sequence for the BI-WEB server, but this time in addition to adding services for the account svc_bi_db, also add services for the svc_bi_app account to allow delegation of security for HTTP services to that machine.  When you’re done, the computer account delegation tab for BI-WEB should look like this:
  10. Repeat the same sequence for BI-APP’s service account (svc_bi_app) using the values in columns 2 & 3 of table #2, yielding the following configuration when complete:
  11. Again, the same sequence for BI-WEB’s service account (svc_bi_web), yielding the following configuration:
  12. If your OLAP database has no instance name (you're using the "default instance"), you’re done!  If not, you’re almost finished, except that the Active Directory Computers and Users snap-in doesn’t support the named instance of the OLAP database.  So, open the ADSIEdit snap-in (adsiedit.msc) instead.
  13. Navigate within ADSIEdit to find the computer account for the APP server (BI-APP for us). 
  14. Right-click on the computer, choose Properties.
  15. In the Attribute Editor, locate the string msDS-AllowedToDelegateTo, and click the Edit button.
  16. Add two values for the OLAP database used in your environment (MSOLAPSvc.3/BI-DB.terrafirma.kerr.cc:PROD and MSOLAPSvc.3/BI-DB:PROD for our example).  Note that the two are the same except one has the FQDN and the other has only the NetBIOS name of the server.  Both are required.  When finished, the string editor should look like this:
  17. Click OK, on the string editor, then OK on the machine properties to save changes.  If you return to this editor in ADSIEdit, you can review and update these chnages.  However, beware that if you review changes in the Active Directory Users and Computers snap-in, you won't see these named instance entries.
  18. Repeat this change for the web server as well (BI-WEB in this example)
  19. Repeat this change for the user accounts used on the BI-WEB and BI-APP servers (svc_bi_web and svc_bi_app in this example).

OK, that’s all.  Now the environment is configured for constrained delegation.  From this point on, AD will still allow Kerberos delegation as before, except now it will carefully check not only that service accounts are running on the machine they should, but also that each service account is only passing tickets to servers/services that are pre-authorized.

Tags: ,

Analysis Services | PerformancePoint | Security | Windows Server

Analysis Services 2008 Improvements & BIDS Helper

by Rob 27. April 2008 23:33

Analysis Services cubes for OLAP analysis have improved substantially between SQL Server 2000 and 2005.  Still, there's always room for improvement.  The recent CTP (Community Technology Preview) editions of the upcoming SQL Server 2008 product shows that the product continues to improve, and that improvements align very well with what customers and developers want and need.

Some of my favorite new capabilities and features with SSAS 2008:

The dimension design improvements will be especially helpful for those new to AS cube development.  The current tools are excellent, and the available wizards do a great job getting new developers going in the right general direction.  Yet, as with many tools, the initial feeling of confidence can be misleading as a generic, wizard-driven design begins to crack under the strain of production data volumes and usage patterns.

The new dimension designer enhancements are meant to address this situation, with the goal to help designers make performant cubes from the very beginning.  Of course, there are things that can be done even in the 2005 release to improve the quality of designs.  The easiest to use tool is BIDS Helper, a freeware set of add-ins for BIDS.

Where BIDS Helper will give good advice, and should be part of the tool-box for most OLAP developers using SSAS, only a thorough understanding of what's happening under the hood will help achieve the best design and performance.  One of the best sources for this info is the Microsoft SQL Server 2005 Analysis Services Performance Guide provided by Microsoft.  A must-read for any serious SSAS OLAP developer.

Tags: ,

Analysis Services

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

© Copyright 2010 Rob Kerr's BI Blog