tirsdag den 26. februar 2019

extract only numbers from a string in sql

DECLARE @textval NVARCHAR(30)
SET @textval = 'AB ABCDE # 123'

SELECT LEFT(SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000),
           PATINDEX('%[^0-9.-]%', SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000) + 'X') -1)

torsdag den 14. februar 2019

Datetime in a measure

if you have a column that is a datetime type, it needs to be converted to a numeric value before it can be used in a meassure in SSAS. To do so we can do the following to the column:

StartTime and EndTime are both DateTime:

,CONVERT(FLOAT,ljl.StartTime) + 2 AS StartSecond
,CONVERT(FLOAT,ljl.EndTime) + 2 AS EndSecond

This is done on the table/view that SSAS is looking at.

In Visual Studio, when you drag the StartSecond from the table into the Measures (at the left side of the screen) in the Cube Structure.


For start use minimum, and for end time date use maximum for the Usage.
In excel when you view the cube, you can format the field with custom format: dd/mm/yyyy hh:mm:ss to get datetime value
or
you can do it in the cube, by clicking on the measure, and change its properties:
formatString: yyyy-mm-dd hh:mm:ss

onsdag den 10. maj 2017

Precedence Constraint Success vs Completion

Use success if the next needs to execute if the current flow succeed. If it's okay that current flow fails, then use on completion.

http://www.techbrothersit.com/2014/03/what-is-difference-between-success-and.html

tirsdag den 27. december 2016

A duplicate attribute key has been found when processing

A duplicate attribute key has been found when processing…

SQL Server 2005 | SQL Server 2008
This post is about a common error message during dimension processing I’ve been asked about quite a few times so I thought it would be worth posting about it. The error message says that a duplicate attribute key has been found when processing as shown in the following screenshot for a test cube (I just processed one dimension here):
image_thumb3
Here’s the full error message:
Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: 'dbo_Product', Column: 'ProductGroup', Value: ''. The attribute is 'Product Group'.
When you got to this article because you just ran into this problem you probably don’t want to read much about the background but only want a solution. Unfortunately I found at least three possible reasons for this error message:
Reason 1 (likely): The most likely reason for that error is that you are having NULL values in your attribute key column.If you simply created the attribute by dragging it from the source view, BIDS only sets the key column (name and value column default to the key column in this case), so for example if you have a column ‘Product Group’ in your source table and drag it to your dimension, the product group (Text field) will automatically become the key for this attribute. The attribute is listed in the error message (in the example above it is ‘Product Group’).
Solution: Try avoiding those NULL values in your data source (for example by using a DSV query and the T-SQL coalesce-function). When your source data is a data warehouse it’s also a good practice to avoid null values as they complicate the queries to the data warehouse.
Reason 2 (likely): You defined an attribute relationship between two attributes of the dimension but the data in your source tables violates the relationship. The error message gives you the name of the conflicting attribute (text part ‘The attribute is…’). The attributes has a relationship to another attribute but for the value stated in the error message (‘Value: …’) there are at least two different values in the attribute that the relationship refers to. If you have BIDS Helper installed, you can also see the error details and all violating references when using the ‘Dimension Health Check’ function.
Solution: You may solve the error by making the key of the attribute unique. For example: Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: 'DimDate_x0024_', Column: 'Month', Value: 'April'. The attribute is 'Month'. In this example, the Month attribute violates an attribute relationship (maybe Month->Year) for the month April meaning that April appears for more than one year. By adding the year to the key of the month attribute you would make the relationsship unique again.
Reason 3 (not that likely): You have an attribute with separate key and name source fields. When you check the data, you see that keys are appearing more than once with different entries in their name columns (note that it’s not a problem if the key appears more than once if only the name column is the same). In this case you will usually also see the key value in the error message, for example:
Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: 'dbo_Product2', Column: 'ProductCode', Value: '1'. The attribute is 'Product Name'.
This means that the attribute ‘Product Name’ uses the source column ‘ProductCode’ as the key and for the product code 1 there is more than one name.
Solution: Use a unique key column (unique with respect to the name column)
Long explanation Reason 1:
In this case our attribute is only defined by one single source column (acting as key, name and value information) from the data source view. When processing a dimension, SSAS run select distinct queries on the underlying source table, so a duplicated key should be impossible even if the key appears multiple times. Just think of a date dimension like the following one (just for years and months):
image_thumb15
In this case the year (2009) appears in multiple rows. However, defining an attribute year (using the the year column as the key) does not give a conflict as it is queried using a distinct query (so 2009 only appears once). So again, how could we get a duplicate result when using a select distinct query? Here is how my product table looked like:
image_thumb16
As you can see the ProductGroup column has one row with an empty string and another row with a NULL value. When SSAS queries this attribute during processing it runs the following SQL query (that can be captured using the profiler):
SELECT DISTINCT [dbo_Product].[ProductGroup] AS [dbo_ProductProductGroup0_0]
FROM [dbo].[Product] AS [dbo_Product]
The result of the query looks like this:
image_thumb
Now, with the default NULL processing for our dimension attribute being set to ‘Automatic’ meaning Zero (for numerical values) or Blank (for texts) the NULL value above is converted to an empty string. So the result set has two lines with an empty string and that causes the error.
image_thumb17
So the problem can be avoided if you don’t have null values in your column. This explains the first reason described above.
Long explanation Reason 2:
I blogged about attribute relationship before and you may want to read this post about defining the key for attributes in an attribute relationship.
Long explanation Reason 3:
Let’s take a look at the following modified product table.
image_thumb2
The ProductID column is unique while the ProductCode is not. If we now define the ProductName attribute as follows we will also get a duplicate key error:
image_thumb4
The reason here is that for the ProductCode 1 two names are found (and therefore the select distinct returns two lines with ProductCode 1), so ProductCode is not a good key here. The problem would disappear if the ProductName for the third line would also be ‘A’ (like for the first line) or the ProductCode for the third line would be other than 1 or 2.
However, this reason occurs rather seldom because usually if we have a key and a name in our dimension, the source comes from some kind of master data table and therefore it should be unique. But for type 2 slowly changing dimensions you must not use the business key as key column (as there may be many rows with the same business key).
Another way to “solve” duplicate key errors (although not recommended) is to set the “KeyDuplicate” property for the error processing of the dimension to “IgnoreError” as shown below:
image_thumb18
However, this is definitely not recommended except for prototyping scenarios. This is clearly explained here http://technet.microsoft.com/en-us/library/bb630297.aspx.


source: http://ms-olap.blogspot.com/2009/11/duplicate-attribute-key-has-been-found.html

tirsdag den 25. oktober 2016

Handling long text in cube - need to be tested

Handling Annotation and comments in Analysis Services

By Thierry D’hers, Mosha Pasumansky
April 2005

This write-up summarizes several techniques for handling text information with the cube.  It describes the reason for these techniques, how it is applied, their pros and cons as well as limitations.

Text in an OLAP cube? Why?

While OLAP cube in general are very good and efficient at handling data, data alone isn’t self explanatory and is rarely sufficient to explain behaviors, patterns, exception such as pick or sudden drops… Many applications that require data entry from users (as opposed to loading of data from data factory) also require these same user to justify the data by association a comment to it. These applications are usually in the Finanical analytical space (Statutory consolidation, Budget, Forecasting…) or HR (Time tracking, Project tracking…).

One could say, why don’t we let OLAP handle the data and store the comments and annotation in a relational store. While this is perfectly feasible and currently is one of the main techniques used, it has some serious limitations. For one it requires a Front-end application to handle the user entry and appropriately store the Data and text in their respective storage. This is not by itself a show stopper since most of these applications require specific business logic and processes (Entry, validation, submission…) and thus need to expose a dedicated front end to the user for data entry anyway. Second and more critical is that it also requires a dedicated reporting tool that understands the application schema and is capable at query time to generate a query t the OLAP engine to retrieve the Data and another query to the relational storage to retrieve the associated Comments and join both data sets at run time. This creates a serious limitation which prevents the user community from using any third party tool to report on this data. Since this data is usually very sensitive it is necessary that comment always follows the data. Companies have then only two choices: Impose the Proprietary front-end to every single user or run the risk to have data being reported without its associated context.

The following document describes how to setup the cube so that it will enable the storage and reporting of text through MDX thus enabling both Data and Text to be queried through a unique query language and with a unique dataset.

Associating Comments with a member

Problem

Associating a comment with a member suggest that this comment never varies by any other dimensions.  While this may not be the most common type of annotation it is definitely possible.
One could say: “Wait a minute, how is that different from a member property?” Well technically it is not. While a member property is often seen as this characteristic of the member it also doesn’t vary over other dimensions.
Thus the technique to handle Dimension Member Comment is similar than the one about handling  Dimension member property, the only difference is the semantic meaning of this property.

Solution

  • Create a new member property for this dimension attribute
  • Name it comment or Annotation
  • Populate it using Dimension writeback “UpdateCube”
  • Query it like any other member property using the Dimension.Currentmember.Properties(“PropertyName”) MDX construct.

 <<Add example and screenshot here>>

Associating Comments with any cell

Problem

In most cases indeed, the Comment or Annotation is really associated with a cell or a set of cell.
For now let’s treat the single Cell case. This is indeed probably the most often used or desired one.
In this case we want to be able to associate a Comment with any cell in the cube (or measure group to be more precise) whether it is at leaf or non leaf level.

Solution

The solution revolve around the fact that Analysis Services Storage engine doesn’t know how to handle String for Measure data but does know for Dimension data and also the fact that the Formula engine does know how to handle text data.
So in a nutshell, Comment and Annotation are handled as Dimension members, entered by the user using Dimension writeback, then are retrieved as part of the measure group using a calculated member and thus queried from any front-end using regular MDX.

Handling Comment or Annotation as dimension member can seem scary at first, but since Yukon doesn’t handle the entire Dimension in memory anymore, its scalability capability in term of size of dimension members has greatly increased and can easily go above 50 millions members. For the very large majority of applications it seems reasonable to assume that this type of volume will never be reached. Indeed if we assume that an application has a lifespan of 5 years, then it gives us 10 millions members or Comment per year. Now let’s say that this application has 1000 active users, then every user has a workspace of 10 000 comments per year or 833 comments per month. It is a lot of text to enter every month for a single user…

Step by Step implementation of the solution

Following this step by step procedure we will add Comment support to any existing cube.
  1. Create a new Table, called “Annotation” (or anything that your wish) in the Relational backend with two columns (more can be added later, see Support for translation section): CommentKey (LongInt), Annotation String (VarChar). Do not populate these columns.
  2. In the fact table used by the measure group for which you want to add Comment, add a new column: CommentIndex (LongInt). Do not populate this column.
  3. Repeat Step 2 for as many Measure group to which you wish to Add Comments.
  4. Now, anywhere in the DSV diagram surface, click the Refresh context menu. The new Fact table columns are refreshed.
  5. Still in the DSV, select the Add/Remove Table context menu and add the Annotation table to the DSV.
  6. Now, using the Dimension wizard, create a new dimension for the Annotation table:
    • Dimension Key attribute key: CommentKey
    • Dimension Key attribute Name: Annotation String
    • Accept all the other defaults
    • Dimension Name: Annotation
  7. Now, open the Cube designer and go to the Cube Builder tab. Form the fact table create a new Measure from the newly added column CommentIndex, set it to Non Visible, set its aggregation function to NONE (This is important, we’ll see later why?).
  8. Go to the Dimension Usage tab, add the Annotation dimension to the cube using the Add Cube dimension context menu. You do not need to add any relationship with any Measure group.
  9. Go to the Calculation tab, add the Following calculated member:
    • Calculated member name: Comment
    • Calculated member MDX expression: IiF (Measures.[Comment Index] = 0,"-" ,StrToMember("Annotation.&[" + VBA!CStr(measures.[Comment Index]) + "]").Properties("Caption"))
    • This is the calculated member that you will add to your query to retrieve the comments

Now that we have the mechanics in place for retrieving the Comment for each cell, how do we get the CommentIndex in the Fact table and the Annotation String in the Dimension table.

Creating and inputting comments

Well the first easy solution is to have them loaded from the dimension table and Fact table when creating the dimension and the cube. That could mean that the Application front-end generate new records in these table for each comment that is created.
The challenge here is that the Front-end need to track the coordinate of the Cell where the comment is entered in order to appropriately create the new record.

The second solution is to use Analysis Services native Dimension and Data writeback capability to create the new Comment. This way the Application doesn’t have to know the coordinate of the cell, this is automatically generated by AS data writeback capability.
This is what the application must create in this case:
  1. User enters a Comment in a given Cell in the Comment measures
  2. Application capture this data entry when user submits
  3. Application use Dimension Writeback to create a new member in the Annotation dimension with the following values:
    1. Annotation String name: User entered text
    2. Annotation string Key: a new unique index generated by the application
  1. Application use Data Writeback to create a new data in the CommentIndex measure for the same coordinate where the text was entered:
    1. Value: = Unique MemberKey index generated for the new Annotation dimension member.
    2. Also application needs to make sure that if the index is associated with a non leaf level that in that case the writeback is done for the Datamember instead of the real parent level (otherwise the writeback will automatically allocate that value and spread it to the children which would defeat the purpose here). In this case you need to make sure that for this index measure, the measure is aggregated by replacing any parent value with any data member value if it is non null. Richard Tkachuk wrote a WP on how to do this calculation: http://www.sqlserveranalysisservices.com/OLAPPapers/Loading%20Aggregate%20Data%20in%20AS2005v2.htm

  1. Refresh the query to get the Newly entered comment back into the Comment field.


Enhancing the solution to support Comment translation

Since Annotation support relies on Dimension members, the Translation feature of Yukon can be leveraged if needed to enter or retrieve localized comments. Now it may not be realistic that every user will enter a comment as well as it localized value in many languages, but web services like Google offer online translation capabilities.
It may be a very nice service to offer comments to be automatically translated in many languages. The changes required to support it are minimal.
  1. Add a new column in the Annotation dimension for every language that need to be supported.
  2. Refresh the DSV so that these columns are seen by the DSV
  3. In the Annotation dimension designer translation tab, add the new languages then set the Annotation String attribute property as follow:
  1. Repeat for as many languages as needed.
  2. Now query with a different language. That’s it. It just works:
  3. In English:
  1. And in French…

Associating Comments with a set of Cells or Subcube

Associating Comments with a set of Cell or subcube can be slightly more problematic. There are many solution that each have different pros and cons
First solution is for the Application front end to generate as many identical values in the commentIndex as there are cells in the Subcube. This doesn’t require any changes in the solution described above, but it could lead to scalability issue, especially if the user is not aware of the concept of OLAP sparcity.
The second solution is to create a calculation that store the Commentindex value for every cell in the subcube. This can be easily done with the following expression:
Scope (<Subcube expression>)
This = <CommentIndex value>
End Scope
This doesn’t present any scalability challenges but it requires unique administration right to update the MDX script and save the cube.

Associating Comments with a grain of a few less dimensions than the data

Now let’s say that the data in our measure group varies by Account, Time, Scenarios, Organization, but we want to Comment field to not vary by the Scenario dimension. This is especially helpful if you actually want to Comment to apply to Actual, Budget and Variance altogether:

The solution to this problem is just a slight variation from the Cell level Comment solution.
  1. Instead of adding the CommentIndex column to an existing fact table, create a new fact table with only CommentIndex and the Dimension Id column that are needed (in our Case, TimeId, AccoutId, OrganizationId)
  2. Add this table to the DSV, add all the appropriate relationship with the dimension table
  3. Add CommentIndex as a new measure with its new measure group
  4. The rest of the solution works the same way…
  5. You now have a Comment field that varies by Account, Time, Organization byt not Scenarios

What about doing this in Analysis Services 2000

Well Most of it can actually work in Analysis Services 2000, a few feature though are not available in 2000 that slightly impact the solution

Solutions differences and limitations

First the NONE aggregation function doesn’t exist. The impact is that the CommentIndex values get aggregated and as a result some comment may appear in aggregated places where they shouldn’t.
For example if you have the following:
Comment 1, for January 1997 with an CommentIndex of 1
Comment 2, for February 1997 with an CommentIndex of 2
Comment 3, for March January 1997 with an CommentIndex of 3
Comment 6, for June 1997 with an CommentIndex of 6

Then when browsing Quarter1 1997 you may actually see Comment 6.
Why? Well the CommentIndex value for Q1 1997 is the SUM of its children so 6 in our case. The calculated member tries to retrieve the Comment with the CommentKey = 6.

To prevent this you can change the Calculated member to only work at leaf level. This at the same time fixes the issue but also adds a limitation to our solution. Comment can now only be entered at leaf level of the cube.

Conclusion

Managing Data and Text in the same database in Analysis Services is now very easy thanks to Yukon new NONE aggregation method, NONE Allocation writeback method and multiple measure group granularity.
All it requires is a front end that know how to writeback a dimension and a data cell at the same time.
But the benefit is immense because now any 3rd party analytic and reporting tool including Excel is now capable of query both type of data.



http://www.sqlserverdatamining.com/OLAPPapers/Handling%20Comments%20in%20Yukon.htm

torsdag den 14. juli 2016

derived column if else

Excel Syntax for IF else statement:

=IF(A=B,C,D)

SSIS Derived Column Syntax
A == B ? C : D


http://www.josefrichberg.com/journal/2009/8/24/if-then-else-statement-in-ssis-derived-columns.html