dev certification, sql_server

This the Part 2 of my Study notes for 70-461 exam. You can find the Part 1 here

04. Aggregate Functions

  1. GROUPING SETS Using GROUPING SETS operator returns the same result set generated by using a GROUP BY, ROLLUP and CUBE operator. For example, this query generates the following result set:

     SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SUM(SubTotal) AS Total
     FROM Sales.SalesOrderHeader o
     GROUP BY GROUPING SETS ((), o.TerritoryID, YEAR(OrderDate))
     ORDER BY SaleYear ASC, o.TerritoryID ASC, Total ASC
    

    Output: Query Output

    Source: http://technet.microsoft.com/en-us/library/bb522495(v=sql.105).aspx

  2. RANK / DENSE_RANK / NTILE / ROW_NUMBER These functions may be quite helpful for reporting queries.

    RANK – DENSE_RANK: These functions help to order rows by assigning an incrementing number based on the criteria used in ORDER BY clause. The difference between them is rank skips the next value when there are two of the same one and dense_rank doesn’t.

    Example:

     SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SUM(SubTotal) AS Total,
     RANK() OVER(ORDER BY SUM(SubTotal) DESC) AS Rank
     FROM Sales.SalesOrderHeader o
     GROUP BY o.TerritoryID, YEAR(OrderDate)
     ORDER BY Rank, SaleYear ASC, o.TerritoryID ASC, Total ASC
    

    Output:

    Query Output

  3. NTILE Works like rank but it takes an integer parameter (n) and partitions the results to n-pieces. For example, the query below with NTILE returns 40 rows. As n is 4 in this example, the first 10 rows got rank 1 and 11 – 20 got rank 2 and so forth. Example:

     SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SUM(SubTotal) AS Total,
     NTILE(4) OVER(ORDER BY SUM(SubTotal) DESC) AS Rank
     FROM Sales.SalesOrderHeader o
     GROUP BY o.TerritoryID, YEAR(OrderDate)
     ORDER BY Rank, SaleYear ASC, o.TerritoryID ASC, Total ASC
    

    Output:

    Query Output

    ROW_NUMBER: Just increments the rank no matter what. Example:

     SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SUM(SubTotal) AS Total,
     ROW_NUMBER() OVER(ORDER BY SUM(SubTotal) DESC) AS Rank
     FROM Sales.SalesOrderHeader o
     GROUP BY o.TerritoryID, YEAR(OrderDate)
     ORDER BY Rank, SaleYear ASC, o.TerritoryID ASC, Total ASC
    

    Output:

    Query Output

Source: http://msdn.microsoft.com/en-us/library/ms189798.aspx

05. Analytic Functions

LAG / LEAD: When I first saw these I instantly loved them! They prevent an unnecessary costly self-join to access previous and next rows in a result set.

So for example check out the following query and its output:

SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SUM(SubTotal) AS Total,
LAG(SUM(SubTotal)) OVER(ORDER BY YEAR(OrderDate)) AS PrevYear,
LEAD(SUM(SubTotal)) OVER(ORDER BY YEAR(OrderDate))  AS NextYear
FROM Sales.SalesOrderHeader o
WHERE o.TerritoryID = 1
GROUP BY o.TerritoryID, YEAR(OrderDate)
ORDER BY SaleYear ASC, o.TerritoryID ASC, Total ASC

Output:

Query Output

Notice that every row has the previous row’s current value in its PrevYear column. Except the first one obviously as it doesn’t have a previous year. And the same logic applies to the NextYear value with the difference that LEAD function is used to obtain those values.

Source: http://msdn.microsoft.com/en-us/library/hh213234.aspx

06. TRANSACTION TIPS

One important topic to grasp is transaction isolation levels. I had a few questions related to this topic. Also it is crucial to create high-performance queries. The isolation levels you need to pay attention are:

READ UNCOMMITTED: This one is most loose one as it allows dirty reads. But much faster than the others as it doesn’t lock the rows.

READ COMMITTED: It doesn’t allow reading uncommitted data. This is the default level.

REPEATABLE READ: Prevents dirty reads. Concurrent operations cannot modify or delete data.

SERIALIZABLE: This level takes it one step further: In  addition to REPEATABLE READ it prevents concurrent operations from inserting data too.

SNAPSHOT: Specifies that data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction.

Source: http://msdn.microsoft.com/en-us/library/ms173763.aspx

dev certification, sql_server

As I mentioned before I like certification because it gives me a chance to cover all topics related to a subject no matter how trivial. During my preparations for this exam I learned some useful stuff. I believe they can be very useful in real-life scenarios. A compiled a list of items that I learned while studying for this exam:

01. Common Table Expressions

Not a new feature in SQL 2012 actually. I have used it a few times but I didn’t know its name back then. A Common Table Expression (CTE) is basically a temporary result set that you can use in the scope statement. For example

WITH DepartmentCTE (Name, GroupName)
AS
(
    SELECT Name, GroupName
    FROM HumanResources.Department
    WHERE GroupName = 'Executive General and Administration'
)

SELECT NationalIDNumber
FROM DepartmentCTE cte
INNER JOIN HumanResources.Employee e ON cte.Name = e.JobTitle

This is not a very good example because a simple join could be used instead. But for the demonstration purposes it should be enough to layout the syntax and usage.

Source: http://msdn.microsoft.com/en-us/library/ms190766(v=sql.105).aspx

02. Pivot / Unpivot

I had one question about Pivot and that was just the definition of it so it was easy. Basically, pivot converts row values to columns. For example, take this query:

SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SUM(SubTotal) AS Total
FROM Sales.SalesOrderHeader o
WHERE o.TerritoryID = 1
GROUP BY o.TerritoryID, YEAR(OrderDate)
ORDER BY SaleYear ASC, o.TerritoryID ASC

which yields this output:

Query Output

Say we want to get all years as columns and the totals as the values in the row. Using pivots we can rewrite the query as follows:

SELECT 'Total Sales' AS TotalSales, [2005], [2006], [2007], [2008]
FROM
(
    SELECT o.TerritoryID, YEAR(OrderDate) AS SaleYear, SubTotal
    FROM Sales.SalesOrderHeader o
    WHERE o.TerritoryID = 1
) AS SourceTable
PIVOT
(
    SUM(SubTotal)
    FOR SaleYear IN ([2005], [2006], [2007], [2008])
) AS PivotTable;

and we get this resultset:

Query Output

Source: http://msdn.microsoft.com/en-us/library/ms177410(v=sql.105).aspx

03. FOR XML

I think XML lost a lot of ground after the rise of JSON over the last few years but still it is used very commonly. I never used FOR XML in a real-life project but I was aware the Microsoft cares about it and not surprisingly I got 4 questions. It’s is very simple and intuitive actually. The keywords you need to know are: RAW, AUTO, EXPLICIT, PATH and ELEMENTS

Raw: Returns a single element for row.

Query:

SELECT TOP 1 *
FROM HumanResources.Department
FOR XML RAW

Output:

<row DepartmentID="1" Name="Engineering" GroupName="Research and Development" ModifiedDate="2002-06-01T00:00:00"/>

Auto: Generates the name of the elements from the query

Query:

SELECT TOP 1 *
FROM HumanResources.Department
FOR XML AUTO

Output:

<HumanResources.Department DepartmentID="1" Name="Engineering" GroupName="Research and Development" ModifiedDate="2002-06-01T00:00:00">

Elements: Returns the values in elements rather than attributes.

Query:

SELECT TOP 1 *
FROM HumanResources.Department
FOR XML AUTO, ELEMENTS

Output:

<HumanResources.Department>
<DepartmentID>1</DepartmentID>
<Name>Engineering</Name>
<GroupName>Research and Development</GroupName>
<ModifiedDate>2002-06-01T00:00:00</ModifiedDate>
</HumanResources.Department>

Path: Gives more control over the path of the nodes

Query:

SELECT TOP 1 *
FROM HumanResources.Department
FOR XML PATH('Department'), ROOT('HumanResources')

Output:

<HumanResources>
<Department>
<DepartmentID>1</DepartmentID>
<Name>Engineering</Name>
<GroupName>Research and Development</GroupName>
<ModifiedDate>2002-06-01T00:00:00</ModifiedDate>
</Department>
</HumanResources>

Source: http://msdn.microsoft.com/en-us/library/ms178107.aspx

dev certification, sql_server

I believe I had a considerable amount of experience with SQL Server throughout my career. Recently I decided to get SQL Server MSCA title which requires taking 3 exams (70-461, 70-462 and 70-463 namely). I think it’s a good way of polishing existing knowledge and learning new stuff. Hands on experience is great but generally you only learn enough to get things done, which is the desired result of course but when studying for certification you have to cover all details. Some of them may be quite trivial and don’t have much practical usage but I think they all may come in handy sometime so it doesn’t hurt to learn more.

On to 70-461

The number of questions vary between 40 – 60 and test duration is between 2 – 3 hours. In my exam, I had 44 questions and the time given was 2 hours.

Total score is 1000 and the passing score is 700. I’ve got a whopping 906. Actually at the end of the day doesn’t really matter if you get a full 1000 but it feels better to pass with a high mark.

Study Notes

I decided to release my notes on a separate blog post as it became very lengthy very quickly.

Study Materials

70-461 Training Kit

I used several different sources preparing for the exam. One of the most important ones is the official Training Kit. Also, I found the CBT Nuggets video series quite useful. You can move at a much faster pace to cover the topics with a video. So I recommend those videos (despite the steep price of CBT Nuggets subscription)

There is another series in TrainSignal but I haven’t tried it.Finally I’ve found a short introductory video (around 1h 15m) to the exam. You can find the find link in the resorces section. It’s a very fast-paced introduction mainly underlining the exam topics that are not commonly used in real life.

Resources