Now that is just weird to me. Why would you want to 'dis-aggregate' data? Who wants to look at the details? If you're not an accountant, I mean... For example, instead of this:
You want to see this:
Turns out it's fairly common to need to see the details behind a summarized report. This is a quick example for dis-aggregating data.
-- load your summarized data
CREATE TABLE #aggregated (
[filename]
VARCHAR(25),downloads INT,downloaded date,data1 VARCHAR(15),
data2 VARCHAR(15),data3 VARCHAR(15),processed bit
)
INSERT #aggregated ([filename],downloads,downloaded,data1,data2,data3,processed)
VALUES
('filename1.txt',5,'11/15/2015','md1','md2','md3',0),
('filename1.txt',79,'12/31/2015','md1','md2','md3',0),
('filename1.txt',24,'1/15/2016','md1','md2','md3',0),
('filename2.xlsx',49,'10/15/2015','md5','string','string2',0),
('filename2.xlsx',25,'2/15/2016','md6','string','string3',0),
('filename2.xlsx',39,'12/15/2015','md7','string','string4',0),
('filename2.xlsx',8,'1/15/2016','md','string','string5',0);
-- table for the disaggregated details
CREATE TABLE
#disaggregated (
[filename] VARCHAR(25),[action] CHAR(8),downloaded date,data1 VARCHAR(15),
data2 VARCHAR(15), data3 VARCHAR(15)
)
-- set based approach (rather than
cursor)
WHILE EXISTS(SELECT 1 FROM
#aggregated WHERE processed = 0)
BEGIN
DECLARE
@next
INT = (SELECT TOP 1
downloads FROM #aggregated WHERE processed = 0),
@count
INT = 0
WHILE @count <
@next
BEGIN
INSERT #disaggregated ([filename],[action],downloaded,data1,data2,data3)
SELECT [filename],'download',downloaded,data1,data2,data3
FROM #aggregated
WHERE downloads =
@next
SELECT @count =
@count + 1
END
UPDATE #aggregated SET
processed = 1 WHERE
downloads = @next
END
Take a look at the details in your your #disaggregated table:
Or just run this to see that the count of what we just disaggregated corresponds to the numeric 'downloads' value in the aggregated data set.
--
query to show the counts of what was inserted
SELECT a.[filename],a.downloads,count(b.[action]) [details]
FROM #aggregated a INNER JOIN #disaggregated b
ON a.[filename] = b.[filename]
AND a.downloaded = b.downloaded
AND a.data1 = b.data1
AND a.data2 = b.data2
AND a.data3 = b.data3
GROUP BY
a.[filename],
a.downloads
Should return this:
Hope it helps! Let me know what you think.
No comments:
Post a Comment