<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Submit Force</title>
	<atom:link href="http://www.submitforce.info/feed" rel="self" type="application/rss+xml" />
	<link>http://www.submitforce.info</link>
	<description>www.submitforce.info</description>
	<lastBuildDate>Wed, 24 Aug 2011 18:02:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>About Time</title>
		<link>http://www.submitforce.info/structuring-time/about-time.php5</link>
		<comments>http://www.submitforce.info/structuring-time/about-time.php5#comments</comments>
		<pubDate>Wed, 24 Aug 2011 18:02:03 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Structuring Time]]></category>

		<guid isPermaLink="false">http://www.submitforce.info/?p=9</guid>
		<description><![CDATA[If we think about the importance and omnipresence of time in our daily lives, it is not surprising that time-related dimensions play such an important role in OLAP. After Measures, Time is by far the most common dimension to be found in OLAP cubes. My intention is to examine some common issues and problems related [...]]]></description>
			<content:encoded><![CDATA[<p align="left">If we think about the importance and omnipresence of time in our daily lives, it is not surprising that time-related dimensions play such an important role in OLAP. After Measures, Time is by far the most common dimension to be found in OLAP cubes. My intention is to examine some common issues and problems related to Time dimensions. In a few cases I may propose solutions, while in others I would just like to point out the possible problems.</p>
<p align="left"><span style="color: #800000; font-size: medium;"><strong>Structuring Time</strong></span></p>
<p align="left">At first sight, Time is a deceptively simple dimension to build. How hard can it be? If you are thinking about economic or financial analysis, there seems to be a natural, perfectly regular hierarchy:</p>
<p>Years<br />
Quarters<br />
Months</p>
<p>If this structure solves all your problems then you may skip to the following section.</p>
<p>Of course, if you are thinking about sales or production, you might want to add days. That seems OK, too. At the most, there is this little hiccup in February caused by leap years (Why can&#8217;t this planet adjust its movements, huh?).</p>
<p align="left">Even if your application needs hours, or even seconds, that will mean more levels and more members, but not more conceptual complexity. Of course, if you intend to keep data to the second, you have to take into account that there are about 31.5 million seconds in a year (365 &#215; 24 &#215; 3,600 = 31,536,000). Careful planning and lots of resources may be needed.</p>
<p align="left">At this stage, two classical complications may appear.</p>
<p align="left">One of them is the need to roll up weeks. If you need the first week of the month, the second week and so on, this is solved by adding yet another level. However, you have to consider the &#8220;short&#8221; weeks at the start and end of each month whenever you create formulas, or if you are going to compare weeks side by side.</p>
<p align="left">Things get a little more complicated if you want to roll up weeks within a year. Since weeks do not fit neatly into months, you will need to create a multiple hierarchy, and how easily you can do this will depend on your MDDB (multidimensional database).</p>
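<p align="left">As a rough illustration of the &#8220;short&#8221; weeks, the following Python sketch counts how many days of each Monday-to-Sunday week fall inside a given month. The function name <code>week_lengths</code> and the choice of Monday as the first day of the week are assumptions made for this example only.</p>

```python
import calendar


def week_lengths(year, month):
    """Return the number of in-month days for each Monday-to-Sunday week."""
    cal = calendar.Calendar(firstweekday=calendar.MONDAY)
    # monthdayscalendar() pads the days that belong to other months with 0.
    weeks = cal.monthdayscalendar(year, month)
    return [sum(1 for day in week if day != 0) for week in weeks]


print(week_lengths(2011, 8))  # [7, 7, 7, 7, 3]
print(week_lengths(2011, 9))  # [4, 7, 7, 7, 5]
```

<p align="left">Any week shorter than seven days is one that a week-over-week comparison or a per-week average would have to treat specially.</p>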
<p align="left">The second complication is the need to roll up your months into calendar years and fiscal years. Again this issue is usually solved by defining a multiple hierarchy.</p>
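<p align="left">To make the two roll-ups concrete, here is a minimal Python sketch that assigns the same date to a calendar year and to a fiscal year. The July start month, the convention of naming a fiscal year after the calendar year in which it ends, and the function names are all assumptions; real fiscal calendars vary.</p>

```python
from datetime import date

FISCAL_START_MONTH = 7  # assumption: the fiscal year starts in July


def fiscal_year(d, start_month=FISCAL_START_MONTH):
    """Label a fiscal year by the calendar year in which it ends."""
    if start_month == 1:
        return d.year  # fiscal and calendar years coincide
    return d.year + 1 if d.month >= start_month else d.year


def fiscal_quarter(d, start_month=FISCAL_START_MONTH):
    """Return the fiscal quarter (1 to 4) that contains the date."""
    return ((d.month - start_month) % 12) // 3 + 1


d = date(2011, 8, 24)
print(d.year, fiscal_year(d), fiscal_quarter(d))  # 2011 2012 1
```

<p align="left">The same month therefore has two parents, one in each hierarchy, which is exactly what a multiple hierarchy has to express.</p>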
<p align="left">If days are important to your application, then it will be just a matter of time before you need to distinguish working days from weekends and holidays. In this case your MDDB had better have some way of defining properties or attributes for dimension members, since this is the simplest solution, barring specific time-related functionality.</p>
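<p align="left">A member attribute of this kind could be derived along the following lines. This is only a sketch: the holiday list is hypothetical, Saturday and Sunday are assumed to be the weekend, and <code>day_type</code> is an illustrative name rather than a feature of any particular MDDB.</p>

```python
from datetime import date

# Hypothetical holiday calendar; a real one would come from the business.
HOLIDAYS = {date(2011, 12, 26)}


def day_type(d, holidays=HOLIDAYS):
    """Classify a day as 'holiday', 'weekend' or 'working day'."""
    if d in holidays:
        return "holiday"
    if d.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        return "weekend"
    return "working day"


for d in (date(2011, 8, 24), date(2011, 8, 27), date(2011, 12, 26)):
    print(d.isoformat(), day_type(d))
```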
<p align="left"><span style="color: #800000; font-size: medium;"><strong>Granularity</strong></span></p>
<p align="left">This issue may arise in many situations, but it is very common to find it related to the Time dimension. Granularity is truly a multidimensional problem, since it can only appear at the intersection of two or more dimensions. Moreover, this is a problem that applies to specific members of the related dimensions.</p>
<p align="left">It is very simple to define: for a given member of a dimension, its granularity in terms of another dimension is the lowest level you have source data for. For example, you may have a sales analysis cube with three dimensions:</p>
<p align="left">Time (Year, Quarter, Month)</p>
<p align="left">Scenario (Actual, Budget)</p>
<p align="left">Products (Line, Family, Product)</p>
<p align="left">We are assuming an implicit measure: sales in dollars.</p>
<p align="left">In an ideal (from a technical viewpoint) situation you would have Actual and Budget data for each Product for each Month.</p>
<p align="left">However, it would not be unusual that sales were budgeted for each Product for each Quarter. In such a case we say the Time granularity of the Scenario member Budget is Quarter, while the Actual member Time granularity is Month.</p>
<p align="left">When this happens, you are left with a number of null cells. You have three options. First, you can leave the null cells in your cube and deal with the missing data when queries are made. Second, you can fill them using formulas according to an allocation algorithm. Since the subject of allocation methods warrants a separate column, I will just say that there are quite a few of them, and none is perfect. It depends on the application. (Heard that one before, haven&#8217;t you?) The third option, for a serious granularity problem, is to split your cube in two.</p>
<p align="left">As I said earlier, different granularities appear at the intersection of two dimensions, but there may be more than two dimensions involved. In our previous example, some product Lines might be budgeted for each Product, while others might be budgeted only down to the Family level. The conceptual solutions are the same, but the specific implementation becomes considerably more difficult. You will need double allocation algorithms, or your formulas will have to take into account missing data with more complex patterns.</p>
<p align="left">From the preceding paragraphs you can easily see why it is desirable to know the granularity of available data before you start designing your solution.</p>
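<p align="left">As one example of the allocation option, the sketch below spreads a quarterly Budget figure over its three months in proportion to a set of weights. Equal weights give an even split, and last year&#8217;s monthly Actuals could serve as seasonal weights. The function name and the figures are assumptions, and this is only one of many possible allocation methods.</p>

```python
def allocate(total, weights):
    """Split a total across periods in proportion to the given weights."""
    weight_sum = sum(weights)
    if weight_sum == 0:
        raise ValueError("weights must not sum to zero")
    return [total * w / weight_sum for w in weights]


# Even split of a quarterly budget of 900 over three months.
print(allocate(900.0, [1, 1, 1]))

# Seasonal split of a budget of 1000 using hypothetical prior-year actuals.
print(allocate(1000.0, [200, 300, 500]))
```

<p align="left">Whatever the weights, the allocated months add back up to the quarterly figure, so the Quarter level still shows the number that was actually budgeted.</p>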
<p align="left"><span style="color: #800000; font-size: medium;"><strong>Rolling Up Time</strong></span></p>
<p align="left">Another issue that may occur in any dimension, but is often associated with time, is the need for custom rollups.</p>
<p align="left">Two classic examples are balance or stock related measures. When you have to roll up monthly figures into quarterly totals, adding them, as any self-respecting MDDB engine will do by default, will not yield the correct result. Instead, you have to take the number for the last period; that is to say, your Stock for Quarter 1 will be your March figure, your 2002 stock will be the same as that of the 4th Quarter of 2002, and so on.</p>
<p align="left">Why does this happen? Members of the Time dimension, no matter at what level, can represent two different things: a time span or an instant in time. In the first case, data for December 2002 would measure the variation of whatever we are measuring aggregated during the 31 days of December, while in the second it would mean the value of that measure at midnight of December 31st.</p>
<p align="left">This tends to be so obvious to users that they will rarely make it explicit. For instance, if you talk about sales or expenses, of course you mean time spans. On the other hand, balance sheet accounts will usually be taken as closing values. There are MDDB engines that provide functionality that addresses this particular case; in other products you are on your own.</p>
<p align="left">A second case of non-additive time rollups occurs when you define formula-based members. For instance, you might have sales and cost and define a cost-to-sales ratio or percentage. In such a case the value for a quarter cannot be calculated by adding the values for each month. Instead, you have to calculate the ratio using the aggregated values for sales and costs. What this means is that the result depends on the calculation order (I may write a column on calculation order one of these days) of the product you are using. In some cases the engine may apply the right formula by default, but you should always make sure.</p>
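<p align="left">The difference between the two kinds of measure can be sketched in a few lines of Python. The labels &#8220;flow&#8221; (a time span, such as sales) and &#8220;stock&#8221; (an instant, such as a closing balance) and the function name are assumptions for the example; they are not the vocabulary of any specific engine.</p>

```python
def roll_up(monthly_values, kind):
    """Roll monthly values up to their parent period.

    'flow' measures (sales, expenses) are summed.
    'stock' measures (inventory, balances) take the closing value.
    """
    if kind == "flow":
        return sum(monthly_values)
    if kind == "stock":
        return monthly_values[-1]
    raise ValueError("kind must be 'flow' or 'stock'")


january_to_march = [100, 120, 90]
print(roll_up(january_to_march, "flow"))   # 310
print(roll_up(january_to_march, "stock"))  # 90
```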
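<p align="left">A small worked example, with made-up monthly figures, shows how far apart the two calculation orders can be:</p>

```python
def cost_to_sales(cost, sales):
    """Return the cost-to-sales ratio for a single period."""
    return cost / sales


# Hypothetical monthly figures for one quarter.
monthly_sales = [100.0, 200.0, 100.0]
monthly_costs = [50.0, 160.0, 30.0]

# Right order: aggregate first, then apply the formula.
quarter_ratio = cost_to_sales(sum(monthly_costs), sum(monthly_sales))

# Wrong order: apply the formula per month, then add the results.
summed_ratios = sum(
    cost_to_sales(c, s) for c, s in zip(monthly_costs, monthly_sales)
)

print(round(quarter_ratio, 2))  # 0.6
print(round(summed_ratios, 2))  # 1.6
```

<p align="left">The monthly ratios are 0.5, 0.8 and 0.3. Adding them gives 1.6, which is meaningless, while the ratio of the quarterly totals (240 to 400) is 0.6.</p>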
<p align="left">There are too many possibilities to explore them in-depth in this column but by now you may have realized this is another aspect of cube design that is worth taking into account at design-time.</p>
<p align="left"><span style="color: #800000; font-size: medium;"><strong>Question Time</strong></span></p>
<p align="left">When we include a Time dimension in a cube, we must know as much as we can about the questions that cube is supposed to answer. Of course such a list will never be complete, but it will give us valuable information to guide our design decisions. Time-related queries can range from the trivial to the hair-raising.</p>
<p align="left">These are just a few common patterns:</p>
<p align="left">How many times: Time is the one dimension most likely to be viewed on more than one axis when slicing-and-dicing. The classic example: the study of seasonal trends. Reports are usually of the form:</p>
<table width="439" border="3">
<tbody>
<tr>
<td width="110">Sales Report</td>
<td width="97">January</td>
<td width="112">February</td>
<td width="94">March</td>
</tr>
<tr>
<td width="110">2001</td>
<td width="97"></td>
<td width="112"></td>
<td width="94"></td>
</tr>
<tr>
<td width="110">2002</td>
<td width="97"></td>
<td width="112"></td>
<td width="94"></td>
</tr>
</tbody>
</table>
<p align="left">To create these reports, from a purely design standpoint you have the choice of creating more than one dimension, or using multiple hierarchies. However, the functionality your MDDB engine provides (or lacks) may force your choice.</p>
<p align="left">Moving target: There are many applications that require aggregations and calculations based on the &#8220;current period&#8221;. While the meaning of this expression varies, it usually refers to the last period for which data has been loaded into the cube. There are, however, two important considerations: first, in cases of partial loads, there may be a requirement of completeness before a period is considered the current period; second, where there is a Scenario dimension, current period will usually mean actual data, so that periods for which only planned or budget data has been loaded should not be considered. While this particular problem has been explored by Ralph Kimball (see Related articles) in terms of Data Warehouse design, when thinking about cubes different rules apply.</p>
<p align="left">Previous period: Many applications require calculated members to compare a given period with a previous period. This could be the last month, or the same month from last year, and from here it usually gets worse. The nice thing about this kind of member is that it may build upon your existing roll-up problems and also define the &#8220;current previous period&#8221;. While you may try to &#8220;forget&#8221; to ask users whether they need this functionality, it is advisable to know as soon as possible, even if you decide to leave implementation for version 2.</p>
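<p align="left">Outside of any MDDB, the shape of such a report can be sketched by splitting a single (year, month) key into a row key and a column key. The function name and the sales figures below are invented for illustration.</p>

```python
from collections import defaultdict


def pivot_by_year_and_month(facts):
    """Turn {(year, month): value} into {year: {month: value}}."""
    report = defaultdict(dict)
    for (year, month), value in facts.items():
        report[year][month] = value
    # Return plain dicts with the years in ascending order.
    return {year: report[year] for year in sorted(report)}


# Hypothetical monthly sales keyed by (year, month).
sales = {(2002, 1): 110, (2001, 1): 100, (2001, 2): 90, (2002, 2): 95}

for year, months in pivot_by_year_and_month(sales).items():
    print(year, [months[m] for m in sorted(months)])
```

<p align="left">Years go down the rows and months across the columns, so Time effectively appears on both axes.</p>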
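<p align="left">The two considerations above can be expressed as a simple filter. In this sketch each load is described by a hypothetical (period, scenario, complete) record, and the current period is taken to be the latest period with a complete Actual load. How completeness is determined is left open, since it depends on the application.</p>

```python
def current_period(loads):
    """Return the latest period with a complete Actual load, or None.

    Each load is a (period, scenario, complete) tuple, where period is a
    (year, month) pair so that periods compare chronologically.
    """
    candidates = [
        period
        for period, scenario, complete in loads
        if scenario == "Actual" and complete
    ]
    return max(candidates) if candidates else None


loads = [
    ((2002, 1), "Actual", True),
    ((2002, 2), "Actual", True),
    ((2002, 3), "Actual", False),  # partial load, not yet current
    ((2002, 4), "Budget", True),   # budget only, never current
]
print(current_period(loads))  # (2002, 2)
```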
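<p align="left">The two simplest variants, last month and the same month last year, might look like this in Python. The function names are illustrative, and periods are assumed to be (year, month) pairs.</p>

```python
def previous_month(year, month):
    """Return the (year, month) immediately before the given one."""
    if month == 1:
        return (year - 1, 12)  # January rolls back to December
    return (year, month - 1)


def same_month_last_year(year, month):
    """Return the same month in the preceding year."""
    return (year - 1, month)


print(previous_month(2002, 1))        # (2001, 12)
print(same_month_last_year(2002, 1))  # (2001, 1)
```

<p align="left">Even here the year boundary needs special handling, and fiscal calendars, weeks or a &#8220;current period&#8221; that is itself calculated only add more cases.</p>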
]]></content:encoded>
			<wfw:commentRss>http://www.submitforce.info/structuring-time/about-time.php5/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

