Sunday, March 6, 2016

How to dis-aggregate summarized data in SQL Server ?

Now that is just weird to me.  Why would you want to 'dis-aggregate' data? Who wants to look at the details?  If you're not an accountant, I mean... For example, instead of this:

You want to see this:

Turns out it's fairly common to need to see the details behind a summarized report. This is a quick example for dis-aggregating data. 

-- load your summarized data
CREATE TABLE #aggregated (
  [filename] VARCHAR(25),downloads INT,downloaded date,data1 VARCHAR(15),
  data2 VARCHAR(15),data3 VARCHAR(15),processed bit
INSERT #aggregated ([filename],downloads,downloaded,data1,data2,data3,processed)

-- table for the disaggregated details
CREATE TABLE #disaggregated (
  [filename] VARCHAR(25),[action] CHAR(8),downloaded date,data1 VARCHAR(15),
  data2 VARCHAR(15), data3 VARCHAR(15)

-- set based approach (rather than cursor)
WHILE EXISTS(SELECT 1 FROM #aggregated WHERE processed = 0)
      @next INT = (SELECT TOP 1 downloads FROM #aggregated WHERE processed = 0),
      @count INT = 0
      WHILE @count < @next
         INSERT #disaggregated ([filename],[action],downloaded,data1,data2,data3)
         SELECT [filename],'download',downloaded,data1,data2,data3
         FROM #aggregated
         WHERE downloads = @next

         SELECT @count = @count + 1

     UPDATE #aggregated SET processed = 1 WHERE downloads = @next

Take a look at the details in your your #disaggregated table:

Or just run this to see that the count of what we just disaggregated corresponds to the numeric 'downloads' value in the aggregated data set.

--  query to show the counts of what was inserted
SELECT a.[filename],a.downloads,count(b.[action]) [details]
FROM #aggregated a INNER JOIN #disaggregated b
  ON a.[filename] = b.[filename]
  AND a.downloaded = b.downloaded
  AND a.data1 = b.data1
  AND a.data2 = b.data2
  AND a.data3 = b.data3

Should return this:

Hope it helps!  Let me know what you think.

No comments:

Post a Comment