Subscribe To

Subscribe to SQL Server and XML by Email

Saturday, June 13, 2009

Converting VARBINARY to VARCHAR using FOR XML

http://beyondrelational.com/blogs/jacob/archive/2009/06/13/converting-varbinary-to-varchar-using-for-xml.aspx

Wednesday, June 10, 2009

Tech-ED Ahmedabad 2009 Announced – 20 June 2009 (Saturday), Hotel Rock Regency

Read the post http://beyondrelational.com/blogs/jacob/archive/2009/06/10/beyond-relational-news-tech-ed-ahmedabad-2009-announced-20-june-2009-saturday-hotel-rock-regency.aspx

Sunday, April 26, 2009

New home for my blog

I have been trying to setup a new home for my blog and I am almost done with it. I am moving my blog to http://beyondrelational.com/blogs/jacob/default.aspx

Those of you who are subscribed to the RSS feeds of this blog, please take note of the new RSS feed URL: http://feeds2.feedburner.com/ExploringBeyondRelational

Hope to meet you at http://beyondrelational.com/blogs/jacob/default.aspx

Friday, April 24, 2009

TSQL Challenges are moving to a new website

I am moving the TSQL challenges to a new ‘exclusive’ blog that is dedicated only for TSQL challenges. The new site will function in a little more organized and structured manner. We will come up with a new challenge every week(friday) and make sure that the we will announce the winners on the monday after the closing date of the challenge.

The new home page of ‘TSQL Challenges’ is http://beyondrelational.com/blogs/tc/default.aspx

TSQL Challenge 5 is live and can be found at http://beyondrelational.com/blogs/tc/archive/2009/04/24/tsql-challenge-5.aspx

Sunday, April 19, 2009

TSQL Challenge 4

Welcome to TSQL Challenge 4!

This is a simple challenge to write a piece of code that validates an SSN. The key here is to write “shortest” TSQL code that removes invalid SSN values and returns a result set with only valid SSN values.

Validation Rules

  1. The value should be in the format of XXX-XX-XXXX where each X represents a digit
  2. The first three digits (Area number) cannot be between 734 and 749.
  3. The first three digits (Area number) cannot be higher than  772
  4. The first three digits (Area number) cannot be 666
  5. It is not allowed to have all zeros in any of the digit group. For example 000-12-1234, 123-00-1234 or 123-12-0000 etc are invalid
  6. Numbers from 987-65-4320 to 987-65-4329 cannot be used

Here is some sample data.

SET NOCOUNT ON;
DECLARE @t TABLE (SSN VARCHAR(15))
INSERT INTO @t (SSN) SELECT '123-45-6789'
INSERT INTO @t (SSN) SELECT '123-45-67.89'
INSERT INTO @t (SSN) SELECT 'ABC-12-3455'
INSERT INTO @t (SSN) SELECT '123-45-67890'
INSERT INTO @t (SSN) SELECT '123-456789'
INSERT INTO @t (SSN) SELECT ' 123-45-6789'
INSERT INTO @t (SSN) SELECT '12345-6789'
INSERT INTO @t (SSN) SELECT '123456789'
INSERT INTO @t (SSN) SELECT '123-12-1234'

/*
Expected Result:
SSN
---------------
123-45-6789
123-12-1234
*/

Notes:

  1. Remember the key here is to write the shortest code
  2. CLR Functions not allowed
  3. Make sure that the code runs on the above table. (don’t create your own table. That will give me a tough time testing the code. You can add more rows to the above table to test your code with more data. In such cases, include those insert statements when you send me the code)
  4. Send your entries to jacob@beyondrelational.com
  5. Do not paste the code in the email body. Include it as an attachment (.sql file)
  6. Mention ‘TSQL Challenge 4’ in the email subject
  7. Last Date: 24 April 2009

Friday, April 17, 2009

Ahmedabad SQL Server User Group – April Meeting (18 April 2009)

I would like to invite everyone to the April Meeting of Ahmedabad SQL Server UG. We will meet tomorrow (Saturday, 18 April 2009) at our new office, 302 OLIVE ARCADE, OPP Samudra Anexe, OFF CG Road, Ahmedabad at 6 PM.

This month, we will discuss the XML features of SQL Server 2008 and as usual will have a QA session to discuss general SQL Server Topics. We will also discuss the community plans for the upcoming Tech-ED India 2009 and the discount options for the UG members. There will be some give-away items in addition to the usual ice-cream and snacks :-)

Hope to see many of you tomorrow!

Monday, April 13, 2009

TSQL Challenge 3 – Winners

It took me much more time to identify and announce the winners of TSQL Challenge 3, than I expected. I received a large number of entries it was not an easy task to review each entry and identify the top 3.

The toughest part with this challenge was that almost all entries produced the correct result. However, several dozens of them used cursors and WHILE LOOPS to generate the result. Since the challenge is to generate the result with a single query, I discarded the entries that used cursors and loops.

After all the scanning and filtering, I ended up with 70+ shortlisted entries and identifying the top 3 from them was really difficult. I used the following logic to select the top 3 from the shortlisted entries.

  1. I ordered the entries by the length and complexity of the query. Some of the entries used only one CTE, but many used two or three CTEs. I discarded the entries with more than one CTE and ended with 13 finalists.
  2. I looked at the IO used by each query and ordered them based on that. After ordering them, I took the top 3 entries that used least IO to produce the results.

I would like to thank everyone who participated in this contest. I would like to congratulate those 70+ winners who sent me ‘correct’ entries that solves the challenge.

Here are the (TOP 3) winners from the list of 70+ SQL Server experts who can translate any business logic into a TSQL query.

 

IMG_8835

Rui Carvalho

Rui Carvalho is a senior developper on Sql Server and .Net mainly experienced in web applications. He work as consultant for a Microsoft experts company called Winwise in france. Rui worked in the past as a full time Sql developper specialised in sql optimisations and reporting for CRM applications and now  mainly focus his time on .Net core technologies and Asp.net MVC. Architecture and software design are also part of his job.

He runs two blogs, one in French and one in English.

Rui was one of the winners in TSQL Challenge 2 as well.

 

Leonid Koyfman

Leonid Koyfman

Leonid Koyfman is a Senior Developer with Razorfish ( San Francisco). He is in software development over 10 years. His focus is data visualization and delivering reporting solutions from various data sources, but spending most of the time with SQL Server.

Leonid was one of the winners in TSQL Challenge 1 as well.

 

Fred2

Friedrich Paul

Friedrich Paul live in Bangalore, India. He completed my MCA from Christ College , Bangalore on April 2008. Currently he is working as database developer at First Indian Corporation from past 8 months. He maintains a blog at http://datawarehousejourney.blogspot.com/

Solving the Problem

Two of the winners, Rui and Friedrich used a recursive CTE to generate the required output while Leonid used a combination of a number table and FOR XML path to produce the ‘reversed’ string.

Here is the solution of Rui.

DECLARE @t TABLE( ID INT IDENTITY, data NVARCHAR(20))
INSERT INTO @t(data) SELECT 'Jacob'
INSERT INTO @t(data) SELECT 'Sebastian'

;WITH REVERSE_CTE AS (
    SELECT
        ID,
        data,
        cast('' as NVARCHAR(20)) AS inverted,
        CASE WHEN LEN(ISNULL(data,'')) = 0 THEN 1 ELSE 0 END AS pos
    FROM @t 
    UNION ALL
    SELECT    
        ID,
        data, 
        CAST(inverted + SUBSTRING(data,LEN(data)-LEN(inverted),1) AS NVARCHAR(20)),
        pos = cast(LEN(data)-LEN(inverted) as int)
    FROM REVERSE_CTE
    WHERE LEN(data)-LEN(inverted) > 0 
)
SELECT ID,data=inverted 
FROM REVERSE_CTE WHERE pos = 1
/*
ID          data
----------- --------------------
2           naitsabeS
1           bocaJ
*/

Here is the solution of Leonid.

DECLARE @t TABLE( ID INT IDENTITY, data VARCHAR(20))
INSERT INTO @t(data) SELECT 'Jacob'
INSERT INTO @t(data) SELECT 'Sebastian'
 
;WITH Numbers(N) AS (
    SELECT    DISTINCT Number 
    FROM      master..spt_values
    WHERE     Number BETWEEN 1 AND (SELECT MAX(LEN(data))FROM @t)
)
SELECT 
    id,
    data_reversed
FROM @t T1
CROSS APPLY(
     SELECT 
        SUBSTRING(data,N,1)
     FROM @t T2 JOIN Numbers ON N<=LEN(data)
     WHERE T1.id=T2.id
     ORDER BY N DESC
     FOR XML PATH('')
)x(data_reversed)
ORDER BY id DESC
/*
id          data_reversed
----------- ---------------
2           naitsabeS
1           bocaJ
*/

Finally, here is the solution of Friedrich.

DECLARE @t TABLE( ID INT IDENTITY, data VARCHAR(20))
INSERT INTO @t(data) SELECT 'Jacob'
INSERT INTO @t(data) SELECT 'Sebastian'

;WITH revtab (id,data,position) AS
(
    select 
        id, 
        SUBSTRING(data,len(data),1),
        len(data)-1 as position 
    from @t
    UNION ALL
    select 
        t.id, 
        SUBSTRING(t.data,position,1),
        position - 1 as position
    from @t as t INNER JOIN revtab as r ON t.id = r.id 
    where position <> 0 
)
SELECT 
    t1.id,
    ( 
        SELECT 
            data + '' 
        FROM revtab t2
        WHERE t2.id= t1.id
        ORDER BY position desc
        FOR XML PATH('') 
    ) AS Names
FROM revtab t1
GROUP BY id
/*
id          Names
----------- ------------------
1           bocaJ
2           naitsabeS
*/

I hope you enjoyed the challenge. Thank you every one who participated in the challenge. I will post the next challenge later today.

Saturday, April 04, 2009

TSQL Challenge 3 – Results by next week

Hi All,

I could not review and announce the winners of TSQL Challenge 3 yet. I am currently travelling and I will do this early next week.

Get ready for the next challenge!

cheers!

Thursday, March 26, 2009

TSQL Challenge 3

Congratulations to the winners of TSQL Challenge 2 and thanks to all the participants and readers who welcomed it with great enthusiasm. Here is the next challenge. This challenge is not for solving any business/application problem, but just to refresh your TSQL skills on set based operations.

So, the task is to reverse a string without using the REVERSE() function. What is wrong with the REVERSE() function? Nothing Really! As I mentioned earlier, this is to refresh your TSQL skills on set based operations. In real life, you should always use the REVERSE() function, if ever you need to reverse a string.

Again, we are not going to reverse a single string. We need to reverse all the values in the column of a table using a single query.

Here is the sample Data

ID          data
----------- --------------------
1           Jacob
2           Sebastian

Here is the expected result

id          data
----------- --------------------
2           naitsabeS
1           bocaJ

Use the script below to create the sample table.

DECLARE @t TABLE( ID INT IDENTITY, data VARCHAR(20))
INSERT INTO @t(data) SELECT 'Jacob'
INSERT INTO @t(data) SELECT 'Sebastian'

Notes:

  1. Write a single query that produces the expected result. No User Defined Functions allowed.
  2. Make sure that your code works with the sample script given above. Use the same column names, table variable name etc. This makes my life easier while testing the code.
  3. No restriction on SQL Server version. You can write the query for SQL Server 2000, 2005 or 2008
  4. Make sure that the subject of your email is ‘TSQL Challenge 3’
  5. Last date to submit your entries: 31 March 2009

Wednesday, March 25, 2009

TSQL Challenge 2 – Winners

I am glad to announce the winners of TSQL Challenge 2. Though I wanted to announce the winners on Sunday, I could not do that as it took me a lot of time scanning all the entries. Thanks everyone for participating and get ready for the next challenge.

Here are the winners of TSQL Challenge 2.

antoine

Antoine Gémis

Antoine Gémis lives in Toulouse, France. He is involved in software development over 15 years, mostly in networking, accounting and kiosk development.
Since 2008,  he is an employee of Navocap  as full time SQL developer. Antoine is working on "real-time" geolocation databases and you can find his blog here.

 

IMG_8835

Rui Carvalho

Rui Carvalho is a senior developper on Sql Server and .Net mainly experienced in web applications. He work as consultant for a Microsoft experts company called Winwise in france. Rui worked in the past as a full time Sql developper specialised in sql optimisations and reporting for CRM applications and now  mainly focus his time on .Net core technologies and Asp.net MVC. Architecture and software design are also part of his job.

He runs two blogs, one in French and one in English.

Tejas

Tejas Shah

Tejas Shah, is very strong in .NET and MS SQL. He has 4+ years of Experience as a Web Developer. He is M.sc. (C.A. & I.T.) (Master Of Science in Computer Application and Information Technology). He is currently working as Team Leader at Ahmedabad, India.

You can find his blog here.

 

SOLVING THE CHALLENGE

Solving this challenge involves solving two logical hurdles. The first part of the problem is to generate a sequence of dates that falls within the given date values so that we can calculate the hours of each date. For example, to calculate the working hours between 6th March 2 PM and 9th March 11 AM, we need to look at each date that falls between this period and calculate the hours of each day. The following example illustrates this.

/*
From Date            Day        To Date              WorkHrs
-----------------    ---        -----------------    -------
2009-Mar-06 14:00    Fri        2009-Mar-06 17:00    03:00                
2009-Mar-07 08:00    Sat        2009-Mar-07 17:00    00:00
2009-Mar-08 08:00    Sun        2009-Mar-08 17:00    00:00
2009-Mar-09 08:00    Mon        2009-Mar-09 11:00    03:00
                                                     -----
                                                     06:00
                                                     -----
*/

So, the period between 2009-Mar-06 14:00 and 2009-03-09 11:00 gives us 6 working hours. We need to do such a calculation for each row we have in the table. So the key here is identifying the dates between given two values and then looking at the day (sunday, monday etc) and calculate the work hours accordingly.

The question now is, “how do we generate a sequence of dates that falls between a given date range. Well, there are a few options.

USING A CALENDAR TABLE

If you search on internet for “Calendar Table”, you will find many articles that explain the advantages of using a calendar table. Almost all databases that deal with business applications keep a calendar table to help perform various calculations. If your database has a calendar table, it can be used to solve the above problem.

Let us see, how we can make use of a calendar table to achieve this result. Let us build a calendar table for the purpose of this demonstration. I am inserting only 10 rows to the calendar table. But in a real life scenario, you might have a few years of data in your calendar table.

The following example builds a calendar table needed for this example

IF OBJECT_ID('Calendar','U') IS NOT NULL DROP TABLE Calendar
CREATE TABLE Calendar (dt DATETIME)
GO

DECLARE @FromDate DATETIME, @ToDate DATETIME
SELECT @FromDate = '2009-03-01', @ToDate = '2009-03-10'
WHILE @FromDate <= @ToDate BEGIN
    INSERT INTO Calendar (dt) SELECT @FromDate
    SELECT @FromDate = @FromDate + 1
END

SELECT * FROM Calendar
/*
dt
-----------------------
2009-03-01 00:00:00.000
2009-03-02 00:00:00.000
2009-03-03 00:00:00.000
2009-03-04 00:00:00.000
2009-03-05 00:00:00.000
2009-03-06 00:00:00.000
2009-03-07 00:00:00.000
2009-03-08 00:00:00.000
2009-03-09 00:00:00.000
2009-03-10 00:00:00.000
*/

Now, let us write a query that uses this calendar table to achieve the result that we needed.

DECLARE @t Table(FromDate DATETIME, ToDate DATETIME)
INSERT INTO @t (FromDate, ToDate) SELECT '2009-03-06 14:00', '2009-03-09 11:00'

SELECT 
    dt AS Date,
    DATENAME(weekday, dt) AS wkday
FROM @t t
CROSS JOIN Calendar c
WHERE 
    dt BETWEEN DATEADD(d,0,DATEDIFF(d,0,FromDate)) 
    AND 
    DATEADD(d,0,DATEDIFF(d,0,ToDate)) 
/*
Date                    wkday
----------------------- ------------------------------
2009-03-06 00:00:00.000 Friday
2009-03-07 00:00:00.000 Saturday
2009-03-08 00:00:00.000 Sunday
2009-03-09 00:00:00.000 Monday
*/

The above example demonstrated how to generate a sequence of dates between the given date values. Now let us apply a tiny logic and calculate the actual number of hours for each day. Let us generate the time period (business hours) on each day (between 8 am and 5 PM).

DECLARE @t Table(FromDate DATETIME, ToDate DATETIME)
INSERT INTO @t (FromDate, ToDate) SELECT '2009-03-06 14:00', '2009-03-09 11:00'

SELECT 
    dt AS Date,
    LEFT(DATENAME(weekday, dt),3) AS wkday,
    CASE 
        WHEN c.dt = DATEADD(d,0,DATEDIFF(d,0,FromDate)) THEN FromDate 
        ELSE DATEADD(hour,8,dt)
    END AS StartTime,
    CASE 
        WHEN c.dt = DATEADD(d,0,DATEDIFF(d,0,ToDate))  THEN ToDate 
        ELSE DATEADD(hour,17,dt) 
    END AS EndTime
FROM @t t
CROSS JOIN Calendar c
WHERE 
    dt BETWEEN DATEADD(d,0,DATEDIFF(d,0,FromDate)) 
    AND 
    DATEADD(d,0,DATEDIFF(d,0,ToDate))
/*
Date                    wkday StartTime               EndTime
----------------------- ----- ----------------------- -----------------------
2009-03-06 00:00:00.000 Fri   2009-03-06 14:00:00.000 2009-03-06 17:00:00.000
2009-03-07 00:00:00.000 Sat   2009-03-07 08:00:00.000 2009-03-07 17:00:00.000
2009-03-08 00:00:00.000 Sun   2009-03-08 08:00:00.000 2009-03-08 17:00:00.000
2009-03-09 00:00:00.000 Mon   2009-03-09 08:00:00.000 2009-03-09 11:00:00.000
*/ 

Look at the StartTime and EndTime. It shows the actual business hours on each day. Now, we can do DATEDIFF() function to get the differences between the two date values and ignore Sat and Sun. Then apply a SUM() over the DATEDIFF() function and you will get the total minutes. You can then convert this minutes to hours.

Note: The code given above is a simplified version of the final code, presented to explain one of the logical approaches to solve this problem. The above code is not tested against all the different combination of values.

USING A NUMBER TABLE

We saw, how to use a calendar table to solve the problem presented in this challenge. Now let us see how a number table can help solve this.

While a calendar table can help solve a number of date-time related problems, a number table can help solve date-time related problems as well as a wide range of other problems. I have used a number table to solve a number of string parsing problems.

Many of our databases have large number tables. If the range of numbers needed is small, I usually use the ‘undocumented’ table master..spt_values. You should use this only at your own risk :-). If you need a number table, it is always a good idea to create a number table instead of using spt_values table. However, I use master..spt_values for demonstrating code samples as it is pretty easy to use. (no setup needed).

The following code snippet shows a different version of the previous code that uses a number table.

DECLARE @t Table(FromDate DATETIME, ToDate DATETIME)
INSERT INTO @t (FromDate, ToDate) SELECT '2009-03-06 14:00', '2009-03-09 11:00'

SELECT 
    DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number AS Date,
    LEFT(DATENAME(weekday, DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number),3) AS wkday,
    CASE 
        WHEN DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number = DATEADD(d,0,DATEDIFF(d,0,FromDate)) 
            THEN FromDate 
        ELSE DATEADD(hour,8,DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number)
    END AS StartTime,
    CASE 
        WHEN DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number = DATEADD(d,0,DATEDIFF(d,0,ToDate))  
            THEN ToDate 
        ELSE DATEADD(hour,17,DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number) 
    END AS EndTime
FROM @t t
CROSS JOIN master..spt_values c
WHERE 
    [type] = 'P' 
    AND
    DATEADD(d,0,DATEDIFF(d,0,FromDate)) + Number BETWEEN DATEADD(d,0,DATEDIFF(d,0,FromDate)) 
    AND 
    DATEADD(d,0,DATEDIFF(d,0,ToDate))
/*
Date                    wkday StartTime               EndTime
----------------------- ----- ----------------------- -----------------------
2009-03-06 00:00:00.000 Fri   2009-03-06 14:00:00.000 2009-03-06 17:00:00.000
2009-03-07 00:00:00.000 Sat   2009-03-07 08:00:00.000 2009-03-07 17:00:00.000
2009-03-08 00:00:00.000 Sun   2009-03-08 08:00:00.000 2009-03-08 17:00:00.000
2009-03-09 00:00:00.000 Mon   2009-03-09 08:00:00.000 2009-03-09 11:00:00.000
*/ 

USING A RECURSIVE CTE

Another way of generating the sequence of dates is by using a recursive CTE. All the three winners of this challenge submitted their queries using a recursive CTE. This is a classic example that demonstrates the possibility of solving many more problems using recursive CTEs, against the common belief that recursive CTEs are only for generating hierarchical stuff.

Let us write a version of our previous queries using a recursive CTE.

DECLARE @t Table(FromDate DATETIME, ToDate DATETIME)
INSERT INTO @t (FromDate, ToDate) SELECT '2009-03-06 14:00', '2009-03-09 11:00'

;with cte as(
    SELECT  
        DATENAME(weekday,Fromdate) AS Wkday,
        FromDate,
        ToDate
    FROM @T
    UNION ALL
    SELECT 
        DATENAME(weekday,DATEADD(d, 1, DATEADD(dd,0, DATEDIFF(dd,0,c.Fromdate)))),
        DATEADD(d, 1, DATEADD(dd,0, DATEDIFF(dd,0,c.Fromdate))) AS Fromdate,
        c.ToDate
    FROM    @t t
    INNER JOIN cte c ON 
        DATEADD(d, 1, DATEADD(dd,0, DATEDIFF(dd,0,c.FromDate))) <= c.ToDate
)
SELECT * FROM cte
/*
Wkday                          FromDate                ToDate
------------------------------ ----------------------- -----------------------
Friday                         2009-03-06 14:00:00.000 2009-03-09 11:00:00.000
Saturday                       2009-03-07 00:00:00.000 2009-03-09 11:00:00.000
Sunday                         2009-03-08 00:00:00.000 2009-03-09 11:00:00.000
Monday                         2009-03-09 00:00:00.000 2009-03-09 11:00:00.000
*/

The above example demonstrates how to generate a range of dates between two given date values using a RECURSIVE CTE. Once the date values within the ranges are generated, we could apply a similar logic we discussed previously, to calculate the actual work hours between the values.

COMPLETE CODE LISTING

Now, let us see the listing of the complete code.  Here is the entry submitted by Tejas.

DECLARE @t TABLE (ID INT IDENTITY, StartDate DATETIME, EndDate DATETIME) 
INSERT INTO @t (StartDate, EndDate) SELECT '3/2/2009 8:00AM', '3/2/2009 3:00PM' --07:00 
INSERT INTO @t (StartDate, EndDate) SELECT '3/1/2009 4:00PM', '3/3/2009 2:00 PM' --15:00 
INSERT INTO @t (StartDate, EndDate) SELECT '2/26/2009 7:00AM', '2/26/2009 10:00PM' --09:00 
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 AM', '1/27/2009 5:15 PM' --07:45 
INSERT INTO @t (StartDate, EndDate) SELECT '1/17/2009 1:45 PM', '1/19/2009 7:45 AM' --00:00 
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 PM', '1/28/2009 9:15 AM' --01:15 

;with cte as(
SELECT    ID,
        StartDate AS OrgStartDate,
        EndDate AS OrgEndDate,
        StartDate,
        EndDate
FROM @T
UNION ALL
SELECT    t.ID,
        c.OrgStartDate AS OrgStartDate,
        c.OrgEndDate AS OrgEndDate,
        DATEADD(d, 1, DATEADD(dd,0, DATEDIFF(dd,0,c.StartDate))) AS StartDate,
        c.EndDate
FROM    @t t
INNER JOIN cte c ON t.ID = c.ID
AND DATEADD(d, 1, DATEADD(dd,0, DATEDIFF(dd,0,c.StartDate))) <= c.EndDate
)
, cte2 AS(
SELECT    ID,
        OrgStartDate AS OrgStartDate,
        OrgEndDate AS OrgEndDate,
        DATENAME(dw,StartDate) AS DayName,
        CASE 
            WHEN StartDate <= DATEADD(hh,8,DATEADD(dd,0, DATEDIFF(dd,0,c.StartDate)))
                THEN DATEADD(hh,8,DATEADD(dd,0, DATEDIFF(dd,0,c.StartDate)))
                ELSE StartDate
        END AS StartDate,
        CASE 
            WHEN EndDate <= DATEADD(hh,17,DATEADD(dd,0, DATEDIFF(dd,0,c.StartDate)))
                THEN EndDate
                ELSE DATEADD(hh,17,DATEADD(dd,0, DATEDIFF(dd,0,c.StartDate)))
        END AS EndDate
from cte c
)
SELECT    ID,
        OrgStartDate AS StartDate,
        OrgEndDate AS EndDate,
        CAST(SUM(CASE 
                    WHEN StartDate > EndDate THEN 0 
                    ELSE DATEDIFF(mi,StartDate,EndDate) 
                END) / 60 AS INT) AS Hours,
        CAST(SUM(CASE 
                    WHEN StartDate > EndDate THEN 0 
                    ELSE DATEDIFF(mi,StartDate,EndDate) 
                END) % 60 AS INT) AS Minutes
        
from cte2
WHERE DAYNAME NOT IN('Saturday', 'SUNDAY')
GROUP BY ID, OrgStartDate, OrgEndDate

Given below is the entry submitted by antoine

-- antoine.gemis@gmail.com --
DECLARE @t TABLE (ID INT IDENTITY, StartDate DATETIME, EndDate DATETIME) 
SET DATEFORMAT MDY
SET DATEFIRST 1
INSERT INTO @t (StartDate, EndDate) SELECT '3/5/2009 18:00PM', '3/10/2009 7:00AM' --18:00 
INSERT INTO @t (StartDate, EndDate) SELECT '3/2/2009 8:00AM', '3/2/2009 3:00PM' --07:00 
INSERT INTO @t (StartDate, EndDate) SELECT '3/1/2009 4:00PM', '3/3/2009 2:00 PM' --15:00 
INSERT INTO @t (StartDate, EndDate) SELECT '2/26/2009 7:00AM', '2/26/2009 10:00PM' --09:00 
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 AM', '1/27/2009 5:15 PM' --07:45 
INSERT INTO @t (StartDate, EndDate) SELECT '1/17/2009 1:45 PM', '1/19/2009 7:45 AM' --00:00 
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 PM', '1/28/2009 9:15 AM' --01:15 

-- CTE pour fabriquer la sequence de jours concernés
;WITH CTE_EXTRACTWORKDAY (ID, StartDate, EndDate) AS
(
    SELECT 
        ID, StartDate, EndDate 
    FROM @t 
    group by ID, startdate, enddate
    UNION ALL
    SELECT ID, DATEADD(day, 1, StartDate), EndDate
    FROM CTE_EXTRACTWORKDAY
    WHERE StartDate < EndDate
),
-- On extrait les jours de travail potentiels
CTE_WORKDAY (ID, WorkDay) AS
(
    SELECT ID, DATEADD(d, 0, DATEDIFF(DAY, 0, startdate)) AS WorkDay  
    FROM CTE_EXTRACTWORKDAY
    WHERE StartDate < EndDate
    UNION 
    SELECT ID, DATEADD(d, 0, DATEDIFF(DAY, 0, enddate)) AS WorkDay
    FROM CTE_EXTRACTWORKDAY
    WHERE StartDate < EndDate
),
-- Calcul de la différenc en minute entre les bornes de début de fin de travail au bureau.
CTE_MORECLEAR (WorkDay, ID, StartDate, EndDate,  WorkedMinutes) AS
(
    SELECT  CTE_WORKDAY.WorkDay, 
        A.ID,
        A.StartDate,
        A.EndDate, 
        CASE DATEPART(WEEKDAY, WORKDAY) -- On n'ouvre pas le week-end. 
            WHEN 6 THEN 0
            WHEN 7 THEN 0
        ELSE    
            CASE WHEN 
                DATEDIFF(MINUTE, 
                    CASE WHEN StartDate < DATEADD(HOUR, 8, CTE_WORKDAY.WorkDay) THEN 
                        DATEADD(HOUR, 8, CTE_WORKDAY.WorkDay)
                    ELSE
                        StartDate
                    END,
                    CASE WHEN EndDate > DATEADD(HOUR, 17, CTE_WORKDAY.WorkDay) THEN 
                        DATEADD(HOUR, 17, CTE_WORKDAY.WorkDay)
                    ELSE
                        EndDate
                    END)
                < 0 THEN 0
            ELSE
                DATEDIFF(MINUTE, 
                    CASE WHEN StartDate < DATEADD(HOUR, 8, CTE_WORKDAY.WorkDay) THEN 
                        DATEADD(HOUR, 8, CTE_WORKDAY.WorkDay)
                    ELSE
                        StartDate
                    END,
                    CASE WHEN EndDate > DATEADD(HOUR, 17, CTE_WORKDAY.WorkDay) THEN 
                        DATEADD(HOUR, 17, CTE_WORKDAY.WorkDay)
                    ELSE
                        EndDate
                    END)
            END
        END AS WorkedMinutes
        FROM CTE_WORKDAY
    LEFT JOIN @t A ON A.ID = CTE_WORKDAY.ID
)
SELECT     StartDate, EndDate, 
        SUM(WorkedMinutes) / 60 as Hours, 
        SUM(WorkedMinutes) -(SUM(WorkedMinutes) / 60 * 60) AS Minutes  
FROM CTE_MORECLEAR 
GROUP BY ID, StartDate, EndDate    
ORDER BY ID 

Finally, here is the code of Rui

DECLARE @t TABLE (ID INT IDENTITY, StartDate DATETIME, EndDate DATETIME)
SET DATEFORMAT MDY

INSERT INTO @t (StartDate, EndDate) SELECT '3/2/2009 8:00AM', '3/2/2009 3:00PM' --07:00
INSERT INTO @t (StartDate, EndDate) SELECT '3/1/2009 4:00PM', '3/3/2009 2:00 PM' --15:00
INSERT INTO @t (StartDate, EndDate) SELECT '2/26/2009 7:00AM', '2/26/2009 10:00PM' --09:00
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 AM', '1/27/2009 5:15 PM' --07:45
INSERT INTO @t (StartDate, EndDate) SELECT '1/17/2009 1:45 PM', '1/19/2009 7:45 AM' --00:00
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 PM', '1/28/2009 9:15 AM' --01:15

-- Build the range of days between start and end with additional helper calcultations
;WITH BUILD_RANGES AS
(
    -- select the exact start date based on business hours
    SELECT    
        T.ID
        ,StartDate = 
                CASE 
                WHEN DATEPART(HOUR,T.StartDate) < 8 
                    THEN DATEADD(HOUR,8,DATEADD(DAY,0,DATEDIFF(DAY,0,T.StartDate))) 
                WHEN DATEPART(HOUR,T.StartDate) > 17 
                    THEN DATEADD(HOUR,17,DATEADD(DAY,0,DATEDIFF(DAY,0,T.StartDate))) 
                ELSE T.StartDate 
                END
        ,T.EndDate 
        ,BusinessDay = CASE 
                        WHEN DATEPART(DW,T.StartDate) IN (6,7) OR T.EndDate < T.StartDate 
                        THEN 0 ELSE 1 
                       END 
    FROM @t T
    UNION ALL
    -- Select the other dates til the end
    SELECT    A.ID
            ,StartDate = DATEADD(HOUR,8,DateAdd(day,1,DATEADD(DAY,0,DATEDIFF(DAY,0,A.StartDate))))
            ,A.EndDate
            ,BusinessDay = CASE 
                            WHEN DATEPART(DW,DateAdd(day,1,A.StartDate)) IN (6,7) THEN 0 
                            ELSE 1 
                           END 
    FROM BUILD_RANGES A
        INNER JOIN @t T ON A.ID = T.id
    WHERE A.StartDate < T.EndDate
)
-- do final computation
SELECT    ID,StartDate,EndDate,
        HOURS = SUM(DATEDIFF(MINUTE,ComputedStart,ComputedEnd))/60,
        MINUTES = SUM(DATEDIFF(MINUTE,ComputedStart,ComputedEnd))%60
FROM    (
    -- select final values with enddate based on business hours
    SELECT  O.Id
            ,ComputedStart = C.StartDate
            ,ComputedEnd = 
                CASE WHEN DATEADD(DAY,0,DATEDIFF(DAY,0,C.StartDate)) 
                        = DATEADD(DAY,0,DATEDIFF(DAY,0,O.EndDate))
                THEN 
                    CASE 
                    WHEN DATEPART(HOUR,C.EndDate) < 8 
                        THEN DATEADD(HOUR,8,DATEADD(DAY,0,DATEDIFF(DAY,0,C.EndDate))) 
                    WHEN DATEPART(HOUR,C.EndDate) > 16 
                        THEN DATEADD(HOUR,17,DATEADD(DAY,0,DATEDIFF(DAY,0,C.EndDate))) 
                    ELSE CASE WHEN O.StartDate > O.EndDate 
                        THEN C.StartDate ELSE C.EndDate END
                    END
                ELSE
                    DATEADD(HOUR,17,DATEADD(DAY,0,DATEDIFF(DAY,0,C.StartDate)))
                END
            ,BusinessDay
            ,O.StartDate,O.EndDate
    FROM    BUILD_RANGES C
        FULL JOIN @t O ON C.ID = O.ID
) FINAL_DATA
WHERE 1=1
    AND    (BusinessDay = 1 OR StartDate > EndDate)
    AND DATEADD(DAY,0,DATEDIFF(DAY,0,ComputedStart)) 
        <= DATEADD(DAY,0,DATEDIFF(DAY,0,EndDate))
GROUP BY ID,StartDate,EndDate
ORDER BY ID

I would like to thank all of you who participated as well as those who attempted to solve this challenge. I will post the next challenge soon.

SQL Server 2008 – New Whitepapers Published

There are a couple of new SQL Server 2008 white papers published this week. The first one is on “Service Broker Performance Scalability Techniques” and the second is on “Partitioned Tables and Index Strategies”

SQL Server 2008 White Paper - Service Broker: Performance and Scalability Techniques
SQL Server 2008 White Paper - Partitioned Table and Index Strategies Using SQL Server 2008

Tuesday, March 17, 2009

Practical SQL Server XML: Part Three

This is the third part of the XML blog series which intends to demonstrate some practical usages of the XML support extended by SQL Server 2005 and 2008.

SQL Server MVP Pinal Dave wrote the first post in this series at http://blog.sqlauthority.com/2009/03/17/sql-server-practical-sql-server-xml-part-one-query-plan-cache-and-cost-of-operations-in-the-cache/, describing how to shred the XML query plans in the SQL Server query plan cache. SQL Server MVP and author of "SQL Server 2008 XML", Michael Coles wrote the second part of the series at http://sqlblog.com/blogs/michael_coles/archive/2009/03/03/practical-sql-server-xml-part-2.aspx. In his post, Michael talks about sending multiple parameters, and even parameter sets, to SQL Server via XML.

When information is exchanged in XML format, there needs to be an agreement between the sender and receiver (caller and callee) on the structure of the XML document being exchanged. Michael's article demonstrates how to pass variable number of arguments to a stored procedure using an XML parameter. In this example, sender is the client application and receiver is the stored procedure that lives within a SQL Server database. The client application should send the XML document with the correct structure and values, else the stored procedure may not be able to process it and read the correct information from it.

This raises the need for a way to describe an XML document (structure, data type and format of values etc) and a way to validate an XML instance against the given XML description. Sometimes the position of elements in the XML document is significant and the code might break if the XML document does not contain the elements in the given order. Some elements may be mandatory and some optional. The code might break if a mandatory element is missing. For example, the Employee Name in the example posted by Michael is a mandatory element and the employee information does not make sense without an employee name. Data types of values are important as well. Age of an employee may be better described as an integer value (20, 34 etc) instead of a text value (twenty four, thirty nine etc). There may be also restrictions on the ranges, for example the age of an employee cannot be a negative number or a big positive number such as 500. Phone numbers should follow certain formats for example, (999) 999-9999.

In a real-world data exchange scenario, there needs to be a number of validations like the examples given above. Now the question is, "How do we perform those validations?" Well, this is where the XSD (XML Schema Definition) support extended by SQL Server 2005 (and 2008) can help you. You can create an XML Schema to describe the given XML document (structure, validations etc) and then create a SQL Server XML Schema Collection with your Schema definition. Once the schema collection is created, you can validate an XML document against the Schema Collection.

In this post, we will create an XML Schema Collection to validate the XML document Michael used in his post. I will only describe the XSD components needed for this example. If you find XML Schemas interesting, I would suggest you to read my book "The Art of XSD: SQL Server XML Schema Collections". It covers SQL Server XML Schema collections in detail starting from the basics. I have presented a large number of examples and several hands-on labs to make sure that even SQL Server developers who are not familiar with XSD can learn and write powerful real-life XML Schema Collections in SQL Server.

Let us start writing a schema for the XML document presented by Michael in his post. Here is the structure of the XML document.

<Person>
  <PersonType>SC</PersonType>
  <NameStyle>0</NameStyle>
  <Title>Mr.</Title>
  <FirstName>George</FirstName>
  <MiddleName/>
  <LastName>Jetson</LastName>
  <Suffix/>
  <EmailPromotion>0</EmailPromotion>
  <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress>
</Person>

Let us identify the validations needed for this XML document.

1. Elements under the Person element should appear exactly in the same order.

2. XML is case sensitive, hence the name of elements should match exactly as given above

3. PersonType - should be one of "GC", "SP", "EM", "IN","VC" and "SC"

4. NameStyle- should be either "1" or "0"

5. Title - Empty string allowed. Maximum length 10 characters

6. FirstName - Should not be more than 50 characters long. Empty string not allowed

7. MiddleName - Should not be more than 50 characters long. Empty string is allowed

8. LastName - Should not be more than 50 characters long. Empty string is not allowed

9. Suffix - Empty string allowed. Maximum length 20 characters

10. EmailPromotion - Should be one of "0", "1" and "2"

11. Email - The value should be a valid email address

For the purpose of this example, let us go ahead with the validations listed above. In a real-life scenario, you might need a different set of validations.

Let us start writing the schema. I assume that you have a basic understanding of XML Schema Collections. If you are new to XML Schema Collections, I would suggest you read the following articles from my XML Workshop Series.

1. XML Workshop VI - Typed XML and SCHEMA Collection

2. XML Workshop VII - Validating values with SCHEMA

The articles listed above will give you an introduction to SQL Server XML Schemas. If you find schemas interesting, you may go ahead and read these too.

3. XML Workshop VIII - Custom Types and Inheritance

4. XML Workshop IX - Mixed Types

5. XML Workshop XIII - XSD And Variable Content Containers

Let us come back to our example. We defined the rules needed to validate the XML document. It is time to translate them to an XML Schema. Let us start with an empty schema declaration.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
</xsd:schema>

Let us now add the schema declaration for all the rules we defined earlier.

Rule: Elements under the Person element should appear exactly in the same order. XML is case sensitive hence the name of elements should match exactly as given above.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="PersonType"/>
        <xsd:element name="NameStyle"/>
        <xsd:element name="Title"/>
        <xsd:element name="FirstName"/>
        <xsd:element name="MiddleName"/>
        <xsd:element name="LastName"/>
        <xsd:element name="Suffix"/>
        <xsd:element name="EmailPromotion"/>
        <xsd:element name="EmailAddress"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Rule: PersonType - should be one of "GC", "SP", "EM", "IN","VC" and "SC".

Let us define an enumeration to perform this validation.

  <xsd:simpleType name="PersonTypeEnumeration">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="GC"/>
      <xsd:enumeration value="SP"/>
      <xsd:enumeration value="EM"/>
      <xsd:enumeration value="IN"/>
      <xsd:enumeration value="VC"/>
      <xsd:enumeration value="SC"/>
     </xsd:restriction>
  </xsd:simpleType>

Now, let us set the type of the PersonType element to use this enumeration.

<xsd:element name="PersonType" type="PersonTypeEnumeration"/>

Rule: NameStyle- should be either "1" or "0"

<xsd:element name="NameStyle">
  <xsd:simpleType>
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="1"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Rule: Title - Empty string allowed. Maximum length 10 characters

<xsd:element name="Title">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="10"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Rule: FirstName - Should not be more than 50 characters long. Empty string not allowed. LastName - Should not be more than 50 characters long. Empty string is not allowed.

Since both FirstName and LastName follow the same set of validation rules, let us create a simple type to perform this validation.

<xsd:simpleType name="NameType">
  <xsd:restriction base="xsd:string">
    <xsd:minLength value="1"/>
    <xsd:maxLength value="50"/>
  </xsd:restriction>
</xsd:simpleType>

Now let us set the type of FirstName and LastName to NameType.

<xsd:element name="FirstName" type="NameType"/>
<xsd:element name="LastName" type="NameType"/>

Rule: MiddleName - Should not be more than 50 characters long. Empty string is allowed.

<xsd:element name="MiddleName">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="50"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Rule: Suffix - Empty string allowed. Maximum length 20 characters

<xsd:element name="Suffix">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="20"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Rule: EmailPromotion - Should be one of "0", "1" and "2".

<xsd:element name="EmailPromotion">
  <xsd:simpleType>
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Rule: Email - The value should be a valid email address.

Let us use a regular expression to validate email address. I will use a simple regular expression to validate the email address for the purpose of this example. You could come up with your own complex regular expressions and enhance this validation.

<xsd:element name="EmailAddress">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Za-z0-9_]+([-+.'][A-Za-z0-9_]+)*
                          @[A-Za-z0-9_]+([-.][A-Za-z0-9_]+)*\.
                          [A-Za-z0-9_]+([-.][A-Za-z0-9_]+)*"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Note: I have used line breaks in the pattern to make it more readable. The line breaks should be removed before making the schema collection.

Let us now go ahead and create an XML Schema Collection with the schema definition we created so far. Here is the code that creates an XML schema collection named "PersonSchema"

IF EXISTS(
    SELECT * FROM sys.xml_schema_collections 
    WHERE name = 'PersonSchema' )
DROP XML SCHEMA COLLECTION PersonSchema

CREATE XML SCHEMA COLLECTION PersonSchema AS '
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Person">
    <xsd:complexType>
      <xsd:sequence>
        <!-- "PersonType" - uses simpletype "PersonTypeEnumeration" -->
        <xsd:element name="PersonType" type="PersonTypeEnumeration"/>
        <!-- "NameType" - Only 0 or 1 accepted-->
        <xsd:element name="NameStyle">
          <xsd:simpleType>
            <xsd:restriction base="xsd:integer">
              <xsd:minInclusive value="0"/>
              <xsd:maxInclusive value="1"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
        <!-- "Title" - Length cannot be more than 10-->
        <xsd:element name="Title">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:maxLength value="10"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
        <!-- "First Name" - Uses simpletype "NameType" -->
        <xsd:element name="FirstName" type="NameType"/>
        <!-- "Middle Name" - Length cannot be more than 50 -->
        <xsd:element name="MiddleName">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:maxLength value="50"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
        <!-- "First Name" - Uses simpletype "NameType" -->
        <xsd:element name="LastName" type="NameType"/>
        <!-- "Suffix" - Length cannot be more than 20 -->
        <xsd:element name="Suffix">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:maxLength value="20"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
        <!-- "Email Promotion" - Can be one of 0, 1 or 2 -->
        <xsd:element name="EmailPromotion">
          <xsd:simpleType>
            <xsd:restriction base="xsd:integer">
              <xsd:minInclusive value="0"/>
              <xsd:maxInclusive value="2"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
        <!-- "Email Address" - The pattern below validates an email addr-->
        <xsd:element name="EmailAddress">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:pattern value="[A-Za-z0-9_]+([-+.''][A-Za-z0-9_]+)*@
              [A-Za-z0-9_]+([-.][A-Za-z0-9_]+)*\.[A-Za-z0-9_]
              +([-.][A-Za-z0-9_]+)*"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <!-- "PersonType" enumeration -->
  <xsd:simpleType name="PersonTypeEnumeration">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="GC"/>
      <xsd:enumeration value="SP"/>
      <xsd:enumeration value="EM"/>
      <xsd:enumeration value="IN"/>
      <xsd:enumeration value="VC"/>
      <xsd:enumeration value="SC"/>
     </xsd:restriction>
  </xsd:simpleType>
  <!-- "NameType" declaration-->
  <xsd:simpleType name="NameType">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="50"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>'

Note: Make sure to remove the line breaks from the pattern we used for validating email address. I added line breaks just to make the schema more readable.

Let us now make sure the schema collection actually works in the way we expect. You can bind an XML variable or column to an XML schema collection as shown in the following example.

DECLARE @x XML(PersonSchema)

The above declaration indicates that the XML variable @x is bound to the XML schema collection "PersonSchema". When you assign a value to @x, SQL Server will validate the XML value and will raise an error if the validation fails.

Let us try to see the validation in action. The following is a correct XML document that follows all the rules we defined earlier. Run the following and it should complete without an error.

DECLARE @x XML(PersonSchema)
SELECT @x = '
<Person>
  <PersonType>SC</PersonType>
  <NameStyle>0</NameStyle>
  <Title>Mr.</Title>
  <FirstName>George</FirstName>
  <MiddleName/>
  <LastName>Jetson</LastName>
  <Suffix/>
  <EmailPromotion>0</EmailPromotion>
  <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress>
</Person>'

Now, let us try with a few 'bad' xml documents. The following fails because "FirstName" cannot be empty.

DECLARE @x XML(PersonSchema)
SELECT @x = '
<Person>
  <PersonType>SC</PersonType>
  <NameStyle>0</NameStyle>
  <Title>Mr.</Title>
  <FirstName></FirstName>
  <MiddleName/>
  <LastName>Jetson</LastName>
  <Suffix/>
  <EmailPromotion>0</EmailPromotion>
  <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress>
</Person>'

SQL Server will raise the following error when you run the above.

Msg 6926, Level 16, State 1, Line 2
XML Validation: Invalid simple type value: ''. Location: /*:Person[1]/*:FirstName[1]

Try to assign different variations of the above XML to the variable and you will see that SQL Server strictly validates the XML document against the Schema Collection.

Now, let us come back to Michael's example and modify it slightly to integrate the schema validation. The only change you need is to change the XML parameter to a TYPED xml parameter bound to the schema collection.

CREATE PROCEDURE Person.AddPersonXML
    @People xml(PersonSchema) = NULL
AS
...........

Let us try to pass an incorrect XML document to the procedure and see what happens.

EXEC Person.AddPersonXML N'
<Person>
    <PersonType>SC</PersonType>
    <NameStyle>3</NameStyle>
    <Title>Mr.</Title>
    <FirstName>George</FirstName>
    <MiddleName/>
    <LastName>Jetson</LastName>
    <Suffix/>
    <EmailPromotion>0</EmailPromotion>
    <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress>
</Person>'

The above XML instance contains "3" in the element NameStyle and this will generate an error. If you run the above code, SQL Server will raise the following error.

Msg 6926, Level 16, State 1, Procedure AddPersonXML, Line 0
XML Validation: Invalid simple type value: '3'. Location: /*:Person[1]/*:NameStyle[1]

Now let us try with a correct XML value.

EXEC Person.AddPersonXML N'
<Person>
    <PersonType>SC</PersonType>
    <NameStyle>0</NameStyle>
    <Title>Mr.</Title>
    <FirstName>George</FirstName>
    <MiddleName/>
    <LastName>Jetson</LastName>
    <Suffix/>
    <EmailPromotion>0</EmailPromotion>
    <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress>
</Person>'    

You will notice that this will execute successfully.

The examples presented in this article demonstrates how well you can make use of XML Schema collections to apply validations on XML parameters. This is one of the several use cases where XML Schema collections can help. Feel free to send your questions to jacob@beyondrelational.com or post a comment.

Friday, March 13, 2009

TSQL Challenge 2

I would like to thank all the participants of TSQL Challenge 1 and congratulate the winners. Let us move ahead to the next challenge.

Here is the sample data for TSQL Challenge 2

ID          StartDate               EndDate
----------- ----------------------- -----------------------
1           2009-03-02 08:00:00.000 2009-03-02 15:00:00.000
2           2009-03-01 16:00:00.000 2009-03-03 14:00:00.000
3           2009-02-26 07:00:00.000 2009-02-26 22:00:00.000
4           2009-01-27 09:15:00.000 2009-01-27 17:15:00.000
5           2009-01-17 13:45:00.000 2009-01-19 07:45:00.000
6           2009-01-27 21:15:00.000 2009-01-28 09:15:00.000

The challenge is to calculate the Business Hours between StartDate and EndDate. Let us define Business Hours as the time between 8 AM and 5 PM, Monday to Friday.

If StartDate is Friday 12 Noon and EndDate is Monday 10 AM, you should count only the duration between 12 Noon and 5 PM on friday and 8AM to 10 AM on monday.

The query should return the following result.

StartDate               EndDate                 Hours       Minutes 
----------------------- ----------------------- ----------- ----------- 
2009-03-02 08:00:00.000 2009-03-02 15:00:00.000 7           0 
2009-03-01 16:00:00.000 2009-03-03 14:00:00.000 15          0 
2009-02-26 07:00:00.000 2009-02-26 22:00:00.000 9           0 
2009-01-27 09:15:00.000 2009-01-27 17:15:00.000 7           45 
2009-01-17 13:45:00.000 2009-01-19 07:45:00.000 0           0 
2009-01-27 21:15:00.000 2009-01-28 09:15:00.000 1           15

Here is the script to generate the sample data.

SET DATEFORMAT MDY
DECLARE @t TABLE (ID INT IDENTITY, StartDate DATETIME, EndDate DATETIME) 
INSERT INTO @t (StartDate, EndDate) SELECT '3/2/2009 8:00AM', '3/2/2009 3:00PM' --07:00 
INSERT INTO @t (StartDate, EndDate) SELECT '3/1/2009 4:00PM', '3/3/2009 2:00 PM' --15:00 
INSERT INTO @t (StartDate, EndDate) SELECT '2/26/2009 7:00AM', '2/26/2009 10:00PM' --09:00 
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 AM', '1/27/2009 5:15 PM' --07:45 
INSERT INTO @t (StartDate, EndDate) SELECT '1/17/2009 1:45 PM', '1/19/2009 7:45 AM' --00:00 
INSERT INTO @t (StartDate, EndDate) SELECT '1/27/2009 9:15 PM', '1/28/2009 9:15 AM' --01:15 

SELECT * FROM @t

Note: Please write your query using the sample data provided. This will make it easier for me to test your code.

  1. Please make your submissions before 20th March 2009
  2. I will announce the winners on 23rd March
  3. Send your queries by email and write "TSQL Challenge 2" in the subject
  4. Send your entries to jacob at beyondrelational dot com.

Happy Querying!

Wednesday, March 11, 2009

TSQL Challenge 1 Answer and Winners!

I would like to thank everyone who participated in TSQL challenge1. I received much more entries than I expected and it took me some time to scan all the entries and test them. I promised you that I would announce the winners on Monday, but this process kept me busy for another day.

..... and the Winners are:

Adam Machanic

adam machanic

Adam Machanic is a Boston-based independent database consultant, writer, and speaker. He has been involved in dozens of SQL Server implementations for both high-availability OLTP and large-scale data warehouse applications, and has optimized data access layer performance for several data-intensive applications. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "Expert SQL Server 2005 Development" (Apress, 2007) and "Inside SQL Server 2005: Query Tuning and Optimization" (Microsoft Press, 2007). Adam regularly speaks at user groups, community events, and conferences on a variety of SQL Server and .NET-related topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, Microsoft Certified IT Professional (MCITP), and a member of the INETA North American Speakers

Ashish Gilhotra

ashish

Ashish Gilhotra a Web Developer from Jaipur, Rajasthan. He is an MCAD in .net with Sql Server.

Leonid Koyfman

Leonid Koyfman

Leonid Koyfman is a Senior Developer with Razorfish ( San Francisco). He is in software development over 10 years. His focus is data visualization and delivering reporting solutions from various data sources, but spending most of the time with SQL Server.

 

I would like to congratulate the winners. In the coming days, I will publish more details about these people so that all of you get to know more about them.

Solving the problem

Let us now have a look at how this challenge can be solved. The solution lies in the way we join @b and @c. This is the tricky part that we need to solve.

A join between @b and @c will produce duplicate rows. The "code" column in @b contains 2 records having value "1". Similarly the "code" column in @c contains 3 records having value "1". If you try to join them with a FULL JOIN you will receive 12 records. But our query should produce only 3 rows. Similarly, our query should produce only 2 records for rows having "2" in the column "code". The total number of rows we are looking for is 5 where as a FULL JOIN between @b and @c currently produces 16 records. See this example:

SELECT * FROM @b b
FULL JOIN @c c ON b.code = b.code
/*
code        bname      code        cname
----------- ---------- ----------- ----------
1           aaa        1           xxx
1           bbb        1           xxx
2           ccc        1           xxx
2           ddd        1           xxx
1           aaa        1           yyy
1           bbb        1           yyy
2           ccc        1           yyy
2           ddd        1           yyy
1           aaa        1           zzz
1           bbb        1           zzz
2           ccc        1           zzz
2           ddd        1           zzz
1           aaa        2           www
1           bbb        2           www
2           ccc        2           www
2           ddd        2           www
*/ 

How do we remove those duplicate rows? Well, the join that we apply should do the following. It should take the first row in @b with value "1" and join it with the first row in @c having "1". Then it should take the second row in @b having "1" and join it with the second row in @c having value "1". It should then take the third row in @c having value "1". This does not have a matching row in @b, so it will return the information from @c and will return NULL from @b. If we can apply such a join, we can receive back exactly 3 records.

To join the rows in the manner described above, we need to generate a sequence number for each row. The sequence number should reset for each distinct value in the "code" column. So the trick here is to generate this sequence number.

SQL Server 2005 introduced ROW_NUMBER() that can be used to generate a sequence number. However it is not available in SQL Server 2000. So, how do we generate a sequence number (and reset it for each group) in SQL Server 2000?

In SQL Server 2000, this can be achieved by doing a self join and counting the number of records smaller than equal to the current row. This trick is demonstrated in the below example.

SELECT
    b1.bname,
    b1.code,
    COUNT(*) AS row
FROM @b b1
INNER JOIN @b b2 ON 
    b1.code = b2.code
    AND b2.bname <= b1.bname
GROUP BY 
    b1.bname,
    b1.code
/*
bname      code        row
---------- ----------- -----------
aaa        1           1
bbb        1           2
ccc        2           1
ddd        2           2
*/  

This trick is going to help us solve the problem. Using this trick, we can join @b and @c on Code + row and it will make each row unique and will eliminate the duplicate values from the results. A join of @b and @c using this method will produce the following result.

SELECT * FROM (
    SELECT
        b1.bname,
        b1.code,
        COUNT(*) AS row
    FROM @b b1
    INNER JOIN @b b2 ON 
        b1.code = b2.code
        AND b2.bname <= b1.bname
    GROUP BY 
        b1.bname,
        b1.code
) b FULL JOIN (    
    SELECT
        c1.cname,
        c1.code,
        COUNT(*) AS row
    FROM @c c1
    INNER JOIN @c c2 ON 
        c1.code = c2.code
        AND c2.cname <= c1.cname
    GROUP BY 
        c1.cname,
        c1.code
) c ON b.Code = c.Code AND b.row = c.row
/*
bname      code        row         cname      code        row
---------- ----------- ----------- ---------- ----------- -----------
ccc        2           1           www        2           1
aaa        1           1           xxx        1           1
bbb        1           2           yyy        1           2
NULL       NULL        NULL        zzz        1           3
ddd        2           2           NULL       NULL        NULL
*/   

This leads us to the final solution. The final result can be achieved by simply joining this result with @a. Here is the complete code listing.

DECLARE @a TABLE (code INT, aname VARCHAR(10))
INSERT INTO @a(code, aname) SELECT 1,'Cat'
INSERT INTO @a(code, aname) SELECT 2,'Dog'
INSERT INTO @a(code, aname) SELECT 3,'Bird'

DECLARE @b TABLE (code INT, bname VARCHAR(10))
INSERT INTO @b(code, bname) SELECT 1,'aaa'
INSERT INTO @b(code, bname) SELECT 1,'bbb'
INSERT INTO @b(code, bname) SELECT 2,'ccc'
INSERT INTO @b(code, bname) SELECT 2,'ddd'

DECLARE @c TABLE (code INT, cname VARCHAR(10))
INSERT INTO @c(code, cname) SELECT 1,'xxx'
INSERT INTO @c(code, cname) SELECT 1,'yyy'
INSERT INTO @c(code, cname) SELECT 1,'zzz'
INSERT INTO @c(code, cname) SELECT 2,'www'

SELECT
    a.*,
    b.bname,
    c.cname
FROM @a a
LEFT JOIN (
    (
        SELECT
            b1.bname,
            b1.code,
            COUNT(*) AS row
        FROM @b b1
        JOIN @b b2 ON 
            b1.code = b2.code
            AND b2.bname <= b1.bname
        GROUP BY 
            b1.bname,
            b1.code
    ) b
    FULL JOIN
    (
        SELECT
            c1.cname,
            c1.code,
            COUNT(*) AS row
        FROM @c c1
        JOIN @c c2 ON 
            c1.code = c2.code
            AND c2.cname <= c1.cname
        GROUP BY 
            c1.cname,
            c1.code
    ) c ON 
        b.code = c.code
        AND b.row = c.row
) ON a.code IN (b.code, c.code)               
/*
code        aname      bname      cname
----------- ---------- ---------- ----------
1           Cat        aaa        xxx
1           Cat        bbb        yyy
1           Cat        NULL       zzz
2           Dog        ccc        www
2           Dog        ddd        NULL
3           Bird       NULL       NULL
*/            

I would like to thank everyone who participated in the challenge. Are you ready for the next challenge?

Monday, March 09, 2009

SQL Server IDENTITY Columns

I have written over a dozen articles covering almost all aspects of SQL Server IDENTITY columns. The intention of this page is to serve as an index, so that people can find the right post based on the specific piece of information they are looking for.

How do I create an Auto-number column in SQL Server?

This is the first article in the series. It provides some basic information about IDENTITY columns in SQL Server. It explains the basic syntax and demonstrates how to create IDENTITY columns with different SEED and INCREMENT values.

What are the data types supported in IDENTITY columns?

This post performs a deeper investigation into the different data types supported by IDENTITY columns. It also explains how to query the system catalog views to find out the identity column and their data type of each table.

How do I create an Auto-number (IDENTITY) column that counts backwards?

This post demonstrates how to create an IDENTITY column that counts backwards. This can be achieved by creating an IDENTITY column with a negative INCREMENT value. The post shows a few examples demonstrating this.

Restrictions on the SEED and INCREMENT values of an IDENTITY column

This post explains the various restrictions SQL Server applies on the ROOT and INCREMENT attributes of an IDENTITY column.

My IDENTITY values are not sequential! Why do I have missing numbers in the IDENTITY column?

This post intends to answer a very common question I hear from people around. Identity values are not expected to be sequential. The articles explores the different scenarios that can create missing IDENTITY values.

IDENTITY - How to change/reset the IDENTITY values?

This post explains how to reset the SEED value of an IDENTITY column.

How do I Insert an explicit value into an IDENTITY column? How do I Update the value of an IDENTITY Column?

This post explains how to insert explicit values to an IDENTITY column. Further, it demonstrates a few other points related to explicit IDENTITY insert.

Why do I have duplicate identity values?

IDENTITY values are not guaranteed to be unique, unless you have a unique index on the column. This post demonstrates a few cases that can result in duplicate IDENTITY values.

How to find out the SEED and INCREMENT values of an IDENTITY column?

This post explains how to retrieve the SEED and INCREMENT attributes of an IDENTITY column. It also explains how to retrieve the CURRENT identity value of the column as well.

@@IDENTITY, SCOPE_IDENTITY() and IDENT_CURRENT()

This post compares @@IDENTITY variable with SCOPE_IDENTITY() and IDENT_CURRENT() functions. It explains the characteristics as well as differences of each one of them.

How to retrieve the identity values generated by a multi-row insert?

@@IDENTITY and SCOPE_IDENTITY() retrieves the last IDENTITY value generated in the current session and scope. If the operation inserts more than one row, they will give you only the last value. This article explains how to retrieve the IDENTITY values inserted by a multi-row insert.

IDENTITY property and IDENTITY() function

IDENTITY property and IDENTITY() function are different. This post explains the differences.

IDENTITY related functions, commands, system variables, catalog views

This post summarizes all the functions, commands, variables and system catalog views related to IDENTITY columns.

A few more interesting facts

I made this post to add a few interesting points that I missed in the previous posts. It points out that you cannot change the INCREMENT value of an IDENTITY column. It also looks into the differences between NUMERIC and DECIMAL data types.

Sunday, March 08, 2009

SQL Server IDENTITY Columns - A few more interesting facts

This is my 14th article on SQL Server IDENTITY values. In each post we discussed a couple of interesting points related to the IDENTITY columns. I saw a number of IDENITY related questions in various online SQL Server forums and saw a lot of cases where IDENTITY columns were completely misunderstood. That is what inspired me to write over a dozen posts on IDENTITY values.

I thought I would stop after writing the 13th article in this series. I presented an IDENTITY Question/Answer session during the last UG meeting at Ahmedabad SQL Server User Group. I prepared 70+ questions on IDENTITY columns and related stuff and we all tried to answer them during my session. When we did that, I came across a few cases where I realized that my previous articles either missed those points or did not explain them in enough detail. That is the reason for writing this post; to clarify those missing points.

How do you change IDENTITY INCREMENT value?

One of the points we missed is that the INCREMENT value of an identity column cannot be changed after the column is created. I did not emphasis it in any of the previous articles. This topic came up while we were discussing about the scenarios where the IDENTITY value grows beyond the storage capacity of the associated data type.

When the IDENTITY value grows beyond the storage capacity of the associated data type, the first option you have is to upgrade the column (and all the references) to a bigger data type. But if you are already using the biggest data type and if the IDENTITY value still grows beyond that, you don't have many choices. If your IDENTITY value started with a positive number (for example 1) and if the data type supports negative values (for example BIGINT) you can RESEED the identity SEED to utilize the negative numbers. Though 1 to N is already used, you can still generate IDENTITY values between -N and 0. This seems to be the only option available if you have started with the smallest positive number. If you have started with the highest negative value, then you are screwed! You might need to do some re-design to get the problem fixed.

While we were at this point, I threw the following question to the audience. "If your IDENTITY column has reached the maximum positive value the data type can store and you want to utilize the negative value space, how would you reseed the IDENTITY column? Would you reseed it so as to start from -N and move to -1 or would you prefer to start from -1 and go backward to -N?"

Option 1: If you move from -N to -1, there are chances that you end up with duplicate IDENTITY values (If there is no UNIQUE index on the column). There are all chances that the value will reach -1 one day and will grow into positive values. These positive values will conflict with existing positive values in the table. If there is no UNIQUE index on the column, you might not notice this problem and you might end up with incorrect data.

Option 2: If you start with -1 and move backward to -N, you will not have the problem mentioned above. When we reach -N and if the value grows further, SQL Server will throw an error, as the data type is not capable of storing the new IDENTITY column. But..

While option 2 seems to be a better choice in this scenario, there is not way to do this with any of the SQL Server versions available today. For option 2, you need to change the IDENTITY SEED and INCREMENT to -1. For option 1, you only need to change the IDENTITY SEED to -N. Since SQL Server does not permit us to modify the IDENTITY INCREMENT value, option 2 cannot be used.

DECIMAL v/s NUMERIC

Another point that caused little bit of confusion is the difference between DECIMAL and NUMERIC. Books-online says that numeric is functionally equivalent to decimal. In SQL Server, there is no difference between DECIMAL and NUMERIC. However, ANSI standard allows DECIMAL to return a higher precision than we ask for. Special thanks to SQL Server MVPs: Razvan Socol, Tibor Kararzi, Aaron Bertrand and Steve Kass for explaining this.

Tuesday, March 03, 2009

How to convert a number to a comma-formatted string?

Every time some one asked this question in the past, I used to suggest that formatting should be done at the presentation layer. While some of you might agree with me on this, I just wanted to point out that you can do comma formatting of numeric values in TSQL.

You can convert MONEY values (values of data type MONEY) to comma-formatted strings using the CONVERT() function with style flag 1. The following example demonstrates it.

DECLARE @m MONEY
SELECT @m = '23456789.25'

SELECT
    CONVERT(VARCHAR, @m, 1) AS FormattedValue
    
/*
FormattedValue
------------------------------
23,456,789.25
*/    

The default style flag is 0, hence if you have not specified a style flag, SQL Server assumes zero and no formatting is done. Style flag 2 adds 4 decimal places to the output. Here is an example:

DECLARE @m MONEY
SELECT @m = '23456789.25'

SELECT
    CONVERT(VARCHAR, @m, 0) AS Style0,
    CONVERT(VARCHAR, @m, 1) AS Style1,
    CONVERT(VARCHAR, @m, 2) AS Style2
    
/*
Style0                         Style1                         Style2
------------------------------ ------------------------------ ------------------------------
23456789.25                    23,456,789.25                  23456789.2500
*/    

Comma-formatting is available only for MONEY data types. If you want to comma-format INT, DECIMAL etc, you should cast it to MONEY and do a convert with style flag 1.

DECLARE @m DECIMAL(10,2)
SELECT @m = '23456789.25'

SELECT
    CONVERT(VARCHAR, @m, 1) AS DecimalValue,
    CONVERT(VARCHAR, CAST(@m AS MONEY), 1) AS MoneyValue    
/*
DecimalValue                   MoneyValue
------------------------------ ------------------------------
23456789.25                    23,456,789.25
*/

Sunday, March 01, 2009

XQuery Lab 43 - Deleting multiple elements

Moved to http://beyondrelational.com/blogs/jacob/archive/2009/03/01/xquery-lab-43-deleting-multiple-elements.aspx

Friday, February 27, 2009

TSQL Challenge 1

I would like to invite my readers to participate in a TSQL Challenge. And if it works well, I will come up with more such challenges and we will have more fun solving TSQL problems. I will be sharing some of the interesting TSQL challenges that I see around and we will see different ways to solve them.

Here is the first TSQL Challenge:

I found the following question in one of the SQL Server forums, a few weeks back.

There are three tables with the following data.

Table 1

code        aname
----------- ----------
1           Cat
2           Dog
3           Bird

Table 2

code        bname
----------- ----------
1           aaa
1           bbb
2           ccc
2           ddd

Table 3

code        cname
----------- ----------
1           xxx
1           yyy
1           zzz
2           www

Required Output

code        aname      bname      cname
----------- ---------- ---------- ----------
1           Cat        aaa        xxx
1           Cat        bbb        yyy
1           Cat        NULL       zzz
2           Dog        ccc        www
2           Dog        ddd        NULL
3           Bird       NULL       NULL

The query should work in SQL Server 2000 as well as SQL Server 2005/2008. However, two separate version of the query is acceptable for SQL Server 2000 and 2005/2008 (SQL Server 2005/8 has some new functions that makes writing this query easier and you can make use of them)

Please send your answers latest by 5th March 2009. I will announce the winner on 9th March 2009. I look forward to see your queries :-)

Here are the insert scripts for the sample data.

DECLARE @a TABLE (code INT, aname VARCHAR(10))
INSERT INTO @a(code, aname) SELECT 1,'Cat'
INSERT INTO @a(code, aname) SELECT 2,'Dog'
INSERT INTO @a(code, aname) SELECT 3,'Bird'

DECLARE @b TABLE (code INT, bname VARCHAR(10))
INSERT INTO @b(code, bname) SELECT 1,'aaa'
INSERT INTO @b(code, bname) SELECT 1,'bbb'
INSERT INTO @b(code, bname) SELECT 2,'ccc'
INSERT INTO @b(code, bname) SELECT 2,'ddd'

DECLARE @c TABLE (code INT, cname VARCHAR(10))
INSERT INTO @c(code, cname) SELECT 1,'xxx'
INSERT INTO @c(code, cname) SELECT 1,'yyy'
INSERT INTO @c(code, cname) SELECT 1,'zzz'
INSERT INTO @c(code, cname) SELECT 2,'www'

Thursday, February 26, 2009

Ahmedabad SQL Server User Group: February 2009 Meeting

I am a little late to post details of the UG meeting held on 21st February 2009, and I feel bad for keeping you waiting for so long. I was busy finishing some assignments at work. I am flying to Seattle tomorrow to attend the MVP Global Summit. I hope I will have a great time meeting other SQL Server MVPs and many of the SQL Server Team members.

Coming back to the UG meeting, we really had a wonderful time discussing many of the basic SQL Server questions. Pinal Dave has already posted a very detailed blog post covering the activities. He did a wonderful session where he presented some very interesting points related to the way SQL Server Query Optimizer uses indexes.

My session was more of an interactive discussion. While writing my series of posts on SQL Server IDENTITY columns, I came across several interesting questions and I then compiled a long list of 65 questions. We discussed those 65 questions and their answers and the 'why-how' part of each of them. I will create a PDF document with those questions and will make them available for download.

Here are some photographs:

pinal

Pinal Dave

jacob

Jacob Sebastian

group

Group Photo

About Me
Jacob Sebastian
Tutorials
* XQuery Tutorials
My Links
SQL Server White Papers
SQL Server 2008
My Articles
XML Workshop RSS Feed
Contact me
Readers
Free Hit Counter
Web Site Hit Counters
SQL Server Bloggers
Blog Directories
blogarama - the blog directory Programming Blogs - BlogCatalog Blog Directory
 
Copyright Jacob Sebastian