0

lately we have been learning about database managind through microsoft sql server management studio and were learning of how to use the index function to optimize a consult and data management, and a part of this is that we should keep hash matches and Table Scans to a minimum. While doing so I came in contanct with a consult which I was able to get rid of various Table Scans but was unable to figure out how to get rid of these hash matches.

For context we were working with 4 tables from an example database:

[Sales].[OrderDetails]
[Sales].[Orders]
[Production].[Products]
[Production].[Categories]

And had to show The category name, the amount of products sold and the amount of money exchanged.

In the image you can find the current state of the consult

Slightly optimized execution plan

For this I did the next consult:

SELECT C.categoryname
       ,COUNT(*) AS QTY
       ,SUM(OD.unitprice * OD.qty) AS 'TotalAmountUSD'
FROM Sales.OrderDetails AS OD
    JOIN Sales.Orders O 
        ON (OD.orderid = O.orderid)
    JOIN Production.Products P 
        ON (OD.productid = P.productid)
    JOIN Production.Categories C 
        ON (P.categoryid = C.categoryid)
WHERE O.orderdate BETWEEN CAST('2015-01-01' AS DATE) AND CAST('2015-12-31' AS DATE)
GROUP BY C.CategoryName;

Which worked fine. And I was able to optimize it so Table Scans didn't appear with the next Indexes:

CREATE INDEX IDX_OrderDetails_AllColumns ON Sales.OrderDetails (orderid, productid, unitprice, qty);
CREATE INDEX IDX_Categories_AllColumns ON Production.Categories (categoryid, categoryname);
CREATE INDEX IDX_Products_AllColumns ON Production.Products (productid, categoryid);
CREATE INDEX IDX_Orders_OrderDate_OrderId ON Sales.Orders (orderdate) INCLUDE (orderid);

Then with the hash tables I tried by using indexes of the individual values:

CREATE INDEX IDX_Orders_OrderDate ON Sales.Orders(orderdate);
CREATE INDEX IDX_OrderDetails_OrderId ON Sales.OrderDetails(orderid);
CREATE INDEX IDX_Products_CategoryId ON Production.Products(categoryid);
CREATE INDEX IDX_Categories_CategoryName ON Production.Categories(categoryname);

And indexes of links between the joins and their uses:

CREATE INDEX IDX_OrderDetails_OrderID_UnitPrice_QTY ON Sales.OrderDetails(orderid, unitprice, qty);
CREATE INDEX IDX_OrdersDetails_OrderID_ProductID_UnitPrice_QTY ON Sales.OrderDetails(orderid, productid, unitprice, qty);

I expected to find the point where you apply an index so these hash matches would go away or be replaced for something like a Nested Loop.

New contributor
Iker-JK is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
4
  • This can help, please read: sqlshack.com/sql-server-execution-plan-operators-part-3
    – sa-es-ir
    Commented Jul 5 at 3:55
  • Rather than images, please share query plans via brentozar.com/pastetheplan. Please also show full tables and index definitions. Commented Jul 5 at 4:19
  • GROUP BY C.categoryid, C.CategoryName might be a good idea, means the compiler can infer uniqueness over the Categories table. Commented Jul 5 at 4:21
  • I don't have any specific advice for you here. But I would like to introduce you to one of my favorite games. It's called "what's that knob for?". It's played when some so-called "best practice" says that something's bad. But! if it were always bad, it wouldn't be in the product, right? So the game is determining under what circumstances that thing isn't bad.
    – Ben Thul
    Commented 2 days ago

1 Answer 1

0

In your case the best index will be on a view...

CREATE VIEW V_STATS
WITH SCHEMABINDING
AS
SELECT C.categoryname
       ,COUNT_BIG(*) AS QTY
       ,SUM(OD.unitprice * OD.qty) AS TotalAmountUSD
FROM Sales.OrderDetails AS OD
    JOIN Sales.Orders O 
        ON (OD.orderid = O.orderid)
    JOIN Production.Products P 
        ON (OD.productid = P.productid)
    JOIN Production.Categories C 
        ON (P.categoryid = C.categoryid)
WHERE O.orderdate BETWEEN CAST('2015-01-01' AS DATE) AND CAST('2015-12-31' AS DATE)
GROUP BY C.CategoryName;
GO

CREATE UNIQUE CLUSTERED INDEX XCV_STATS ON V_STATS (categoryname);
GO

test it !

Not the answer you're looking for? Browse other questions tagged or ask your own question.