Monday, October 13, 2014

More SQL Server Aggregate functions

I mentioned that I would try to get back in here with some more examples using SQL Server's Aggregate functions. I think this is a good one. One of the requests I used to get again and again, was the current net position for all symbols traded on the Equity exchanges. The Ops desk (the Team supporting the traders) needed to know if positions were open for any symbol across these exchanges:    ISLD, BRUT, ARCA, CHX, NSDQ, NYSE.

To handle that, I just created this procedure which was run on the back-end, and would alert them if an open position was found.

  CREATE PROCEDURE dbo.usp_NetPosBySymbol
  AS
  SET NOCOUNT ON
  /*
   Returns the net position by symbol, across the equity exchanges, if exists.
   EXEC dbo.usp_NetPosBySymbol 

  Auth:  Me
  Date:  10/13/2014
  */
  BEGIN
        -- use this method to avoid an IN list in the WHERE clause
DECLARE @Exchanges TABLE (ExchangeName VARCHAR(15))
INSERT @Exchanges (ExchangeName)
VALUES 
('ARCA'),
('BRUT'),
('CHX'),
('DNET'),
('ISLD'),
('NSDQ'),
('NYSE');

IF EXISTS(
SELECT buys-sells AS Difference
FROM (
SELECT Symbol,SUM(Quantity) [buys] 
FROM dbo.Orders o
WHERE Side = 'B' 
         AND EXISTS(
                    SELECT 1 FROM @Exchanges e WHERE e.ExchangeName = o.Exchange)
GROUP BY Symbol) buys JOIN
(
SELECT Symbol,SUM(Quantity) [sells]
FROM dbo.Orders o 
WHERE Side = 'S' 
AND EXISTS(
                     SELECT 1 FROM @Exchanges e WHERE e.ExchangeName = o.Exchange)
GROUP BY Symbol) sells
         ON buys.Symbol = sells.Symbol
WHERE 
  buys <> sells)

BEGIN
SELECT buys.Symbol,Buys,Sells,buys-sells AS Difference
FROM(
SELECT Symbol,SUM(Quantity) [buys] 
FROM dbo.Orders o
WHERE Side = 'B' 
           AND EXISTS(
                      SELECT 1 FROM @Exchanges e WHERE e.ExchangeName = o.Exchange)
GROUP BY Symbol) buys JOIN
(
SELECT Symbol,SUM(Quantity) [sells]
FROM dbo.Orders o 
WHERE Side = 'S' 
             AND EXISTS(
                       SELECT 1 FROM @Exchanges e WHERE e.ExchangeName = o.Exchange)
GROUP BY Symbol) sells
              ON buys.Symbol = sells.Symbol
WHERE 
                   buys <> sells
ORDER BY 
                   buys.Symbol
       END
    END

  SET NOCOUNT OFF
  GO

We're still using the Orders data that we wrote in this post, SQL Aggregate functions, and these are our results:

   Symbol   Buys  Sells  Difference
   BEAS      55    100     -45
   YHOO      5   105     -55

That is just a simple example using SUM.  We first do an existence check, and then we return the net position back to the front end, if exists.

See this for more information, and for a couple more examples using the SUM function:
     http://msdn.microsoft.com/en-us/library/ms187810.aspx

SQL Aggregate functions for count & quantity traded

You're managing a lot of trading data, and you've been asked to provide some statistics back to the trading desks. There are numerous different types of statistics that you could provide, all using SQL Server aggregate functions such as COUNT, SUM, AVG, etc. Today I am going to show a very simple example for reporting how many buys and sells they are trading on each symbol.

First, we need an Orders table that all of trades are written to.  Very simple design, like this:

CREATE TABLE dbo.Orders (
OrderID INT IDENTITY(1,1),
TraderID CHAR (12) NOT NULL,
Exchange CHAR (8) NOT NULL,
Symbol CHAR (32) NOT NULL,
Side CHAR (1) NOT NULL,
Quantity BIGINT NOT NULL ,
Price DECIMAL(18, 8) NOT NULL ,
OrderDate DATETIME NOT NULL 
) ON YourDB_Data
GO

I should note, MUCH better practices would be a numeric identifier for Exchange, Symbol and even Side. But... then I'd have to do all kinds of JOINs to get the data back to you meaningfully. For this example, we can ignore the performance impediments of character strings vs integers.  ;-)

For the insert into Orders, I am only going to use a small chunk of orders by only one trader, John Doe.

-- load some orders
SET IDENTITY_INSERT Orders ON
INSERT dbo.Orders (OrderID,TraderID,Exchange,Symbol,Side,Quantity,Price,OrderDate)
VALUES
(1,10,'NYSE','AAPL','B',55,12.45,'10/13/2012 09:30'),
(2,10,'NSDQ','YHOO','B',50,61.75,'10/13/2012 10:30'),
(3,10,'CME','AAPL','B',5,17.50,'10/13/2012 09:00'),
(4,10,'NYSE','DDM9','S',5,1.75,'10/13/2012 08:30'),
(5,10,'NSDQ','ABS','B',3,66.502,'10/13/2012 09:35'),
(61,10,'NYSE','BEAS','B',55,1.75,'10/13/2012 12:20'),
(7,10,'NSDQ','BEAS','S',100,1.025,'10/13/2012 12:55'),
(8,10,'CME','CAG','B',10,11.53,'10/13/2012 13:30'),
(9,10,'CSX','YHOO','S',10,17.705,'10/13/2012 16:55'), 
(10,10,'NYSE','YHOO','S',50,10.45,'10/13/2012 10:30'),
(11,10,'NSDQ','YHOO','S',55,60.75,'10/13/2012 11:30'),
(12,10,'CME','YHOO','S',5,19.50,'10/13/2012 10:00');
SET IDENTITY_INSERT Orders OFF

Ok. We've got some order data for trader John Doe. With the query below, you will return a count for how many orders he's traded, and how much quantity was for buys vs sells. Basically, counts and quantities for each trader, by symbol and side ('side' is trading speak for buy or sell).

  DECLARE 
      @OrderDate DATE = '10/13/2012',
@TraderID INT =  10

  -- collect the trade count (#) and Qty for each trader, by symbol
  SELECT 
CONVERT(VARCHAR(10),o.OrderDate,101)  [Date],
t.TraderName,
o.Symbol,
COUNT(*) [#Trades],
LTRIM(RTRIM(LEFT(CONVERT(CHAR(16),CAST(SUM(CASE WHEN o.Side = 'b'
          THEN o.Quantity ELSE 0 END) AS MONEY),1),13))) [BuyQty],
LTRIM(RTRIM(LEFT(CONVERT(CHAR(16),CAST(SUM(CASE WHEN o.Side = 's'
          THEN o.Quantity ELSE 0 END) AS MONEY),1),13))) [SellQty]
  FROM 
dbo.Orders o INNER JOIN dbo.Traders t
         ON o.TraderID = t.TraderID
  WHERE 
CONVERT(VARCHAR(10),o.OrderDate,101) = @OrderDate
AND o.TraderID = @TraderID
  GROUP BY 
CONVERT(VARCHAR(10),o.OrderDate,101),
t.TraderName,
o.Symbol
  ORDER BY
CONVERT(VARCHAR(10),o.OrderDate,101),
o.Symbol

The results:

     OrderDate   Trader   Symbol   #Trades  BuyQty   SellQty
     10/13/2012  John Doe   AAPL     2      60        0
     10/13/2012  John Doe    ABS     1       3        0
     10/13/2012  John Doe   BEAS     2      55      100
     10/13/2012  John Doe    CAG     1      10        0
     10/13/2012  John Doe   DDM9     1       0        5
     10/13/2012  John Doe   YHOO     5      50      120

I added these two statements after the one above just to help me confirm that it's accurate. Looking at the raw and the agg data side by side makes it easier to validate:

    SELECT COUNT(*) FROM dbo.Orders WHERE TraderID = 10 AND Symbol = 'AAPL'
    SELECT * FROM dbo.Orders WHERE TraderID = 10 AND Symbol = 'ABS' AND Side = 'b'

Like I said, there are many different ways to report the data using SQL Server's aggregate functions. This is just one example. I'll try to come back soon with more. Until then, take a look at this, for more detail:


Calculate running totals in SQL Server

You want to show details for the individual orders traded (or transactions), but you also need to show the total orders traded by each trader, in the same result set. This means you need to calculate running totals. There are numerous ways to calculate running totals with tSQL, but you've got to be cautious, or your performance could be less than acceptable. In this post, I'm giving you just a quick example for calculating the running total by using a correlated sub query... and we're going to return it for each trader, by date.

-- first let's create a table for our traders
IF OBJECT_ID('tempdb..#traders') <> 0
DROP TABLE #traders;
CREATE TABLE #Traders (
    TraderID INT IDENTITY(1,1),
    TraderName VARCHAR(45)
 )
SET IDENTITY_INSERT #Traders ON
INSERT #Traders (TraderID,TraderName)
VALUES (10, 'John Doe'),
    (11, 'Jane Smith');
SET IDENTITY_INSERT #Traders OFF;

-- create our orders table 
CREATE TABLE #orders (
OrderID INT IDENTITY(1,1), 
OrderDate DATETIME
TraderID INT,
Quantity INT,
RunningTotal INT NULL )

-- populate #orders
SET IDENTITY_INSERT #orders ON
INSERT #orders (OrderID,OrderDate,TraderID,Quantity)
VALUES (1,'10/12/2014 09:00', 10, 100),
            (2, '10/12/2014 11:00', 10, 200),
           (3, '10/12/2014 12:00', 10, 300),
      (4, '10/12/2014 14:00', 11, 400),
           (5, '10/12/2014 14:30', 11, 500),
          (6,'10/13/2014 09:00', 10, 100),
          (7, '10/13/2014 11:00', 10, 90),
          (8, '10/13/2014 12:00', 11, 300),
          (9, '10/13/2014 14:00', 11, 4500),
        (10, '10/13/2014 14:30', 11, 500);

-- take a look at the orders data that you just loaded
SELECT * FROM #orders

   OrderID   OrderDate             TraderID Quantity RunningTotal
     1    2014-10-12 09:00:00.000   10       100       NULL
     2    2014-10-12 11:00:00.000   10       200       NULL
     3    2014-10-12 12:00:00.000   10       300       NULL
     4    2014-10-12 14:00:00.000   11       400       NULL 
     5    2014-10-12 14:30:00.000   11       500       NULL
     6    2014-10-13 09:00:00.000   10       100       NULL
   .............................
   ..................

Now we'll use a correlated subquery to bring it back out for each trader, and each order date.

   -- bring back the running total
   SELECT
          CONVERT(CHAR(10),o.OrderDate,101) [OrderDate],
          t.TraderName,
          o.Quantity,
         (
              SELECT SUM(Quantity) FROM #orders 
              WHERE OrderID <= O.OrderID 
              AND CONVERT(CHAR(10),OrderDate,101) = CONVERT(CHAR(10),o.OrderDate,101)
              AND TraderID = o.TraderID
         ) [Running Total] 
    FROM
          #orders o INNER JOIN #Traders t
          ON o.TraderID = t.TraderID
    ORDER BY
       OrderDate,
       t.TraderName


Here's your full result-set :

    OrderDate    TraderName   Quantity  RunningTotal
    10/12/2014 Jane Smith 400 400
    10/12/2014 Jane Smith 500 900
    10/12/2014 John Doe 100 100
    10/12/2014 John Doe 200 300
    10/12/2014 John Doe 300 600
    10/13/2014 Jane Smith 300 300
    10/13/2014 Jane Smith 4500 4800
    10/13/2014 Jane Smith 500 5300
    10/13/2014 John Doe 100 100
    10/13/2014 John Doe 90 190


That's it!  As I said, there are numerous ways to calculate running totals. There are new functions with the newer releases, such as the OVER clause that became available in v2005.  Here is a quick example using the OVER clause:

   -- using the OVER clause
   SELECT 
      OrderDate,
      Quantity,
      SUM(Quantity) OVER (ORDER BY o.OrderID) [RunningTotal]
   FROM 
      #orders o
   ORDER BY 
      o.OrderID;


Here is a little more information on the correlated subquery:
 http://technet.microsoft.com/en-us/library/ms187638(v=sql.105).aspx

And take a look at this for more detail on the OVER clause:
  http://msdn.microsoft.com/en-us/library/ms189461(v=sql.105).aspx



Sunday, October 5, 2014

SQL query for the total PnL for each trader, by symbol

I am cleaning up my laptop tonight, preparing to buy a new one... hopefully before this one dies. Today I found something that I had intended to include the other day when I posted this example for calculating sums across two tables: http://www.sqlfingers.com/2014/09/calculate-sum-across-two-tables.html

Just like that example, this was something the same customer needed very quickly after the close, to report the PnL for each trader account, by symbol.  I've trimmed it down a bit, including only the objects necessary to collect the PnL, but here it is: 

   -- first let's create a table for our traders
   CREATE TABLE dbo.Traders (
      TraderID INT IDENTITY(1,1),
      TraderName VARCHAR(45)
   )
   SET IDENTITY_INSERT dbo.Traders ON
   INSERT dbo.Traders (TraderID,TraderName)
   VALUES (10,'John Doe'),
              (11,'Jane Smith');
   SET IDENTITY_INSERT dbo.Traders OFF;

-- now let's create our daily trades table
CREATE TABLE dbo.Trades (
    TraderID INT,
    TradeID INT,
    Symbol VARCHAR(35),
    Volume BIGINT,
    Side CHAR(1)
)
   -- write some trades
   INSERT dbo.Trades (TraderID,TradeID,Symbol,Volume,Side)
   VALUES ( 10,1, 'AAPL', 10000,1),
                 ( 10,2, 'AAPL', 10000,2),
( 10,3, 'AAPL', -10000,2);

   -- check your data
   SELECT * FROM dbo.Trades

Ok.  Let's say closing price is 1.5.  To return the eod(end-of-day) PnL for each trader by symbol, we'd multiply each symbol's net volume by the closing price.  In this case, it's just AAPL:

  -- declarations
   DECLARE 
@price MONEY = 1.5,
@pnl MONEY

   -- collect the @pnl for AAPL
   SELECT @pnl = @price * SUM(Volume) 
   FROM dbo.Trades 
   WHERE Symbol = 'AAPL'

   -- bring it back out for each trader  - in this example we only have one trader... :-).   
   SELECT t1.TraderName, t2.Symbol, @pnl [PnL]
   FROM dbo.Traders t1 INNER JOIN dbo.Trades t2
          ON t1.TraderID = t2.TraderID
   GROUP BY t1.TraderName, t2.Symbol

Your results:

   TraderName   Symbol    PnL     
   John Doe     AAPL   15000.00

There you go.  Nothing too elaborate... just a SQL query for the closing profit and loss for each trader, by symbol.

I've mentioned before that I have virtually lived with the stock market data for almost 20 years. I have miles of tSQL; current, daily and month-to-date statistics, message ratios, PnL, open order reports, clearing... and more.  If any of you are managing trader volumes and pnl, you know what a pain it can be!  I have decided to share some of this, so you may begin seeing more logic like the one above. Let me know if you're looking for something that you don't see. I may have it already, or I can help you put it together.


List all empty tables in your SQL Server database

Here’s a simple query to find all empty tables in your SQL Server database.  We are using the dm_db_partition_stats DMV, which returns page and row-count information for every partition in the current database.  

   /*   Use the dm_db_partition_stats DMV to list all empty tables in your database.   */
   ;WITH Empty AS     
   (
    SELECT 
OBJECT_NAME(OBJECT_ID) [Table],
SUM(row_count) [Records]
    FROM 
sys.dm_db_partition_stats      
    WHERE 
index_id = 0 OR index_id = 1      
    GROUP BY 
OBJECT_ID      
   )      

   -- pull it back out
   SELECT [Table],Records 
   FROM [Empty]      
   WHERE [Records] = 0

   -- your results
   Table           Records
   sysclones          0
   sysseobjvalues     0
   syspriorities      0
   sysdbfrag       0
   sysfgfrag       0
   .........
   ......
   ..

This will provide you a little more information on the sys.dm_db_partition_stats DMV:

Please remember, this is not guaranteed to be an accurate record count.  These counts are based on what is cached in memory, or stored on disk in varied system tables. Something I often refer people to is this post from Kalen Delaney. She shows different methods to get the row counts, and makes us ask ourselves what 'accurate' really means:

Wait. What about the tables that aren't used anymore, but are not empty?  We often have unused tables in the database that still contain data. In my book, if they're not used, get rid of them. Thus, preserving server resources for what is required, rather than old/unused objects that are no longer needed.