This is the third part of the XML blog series which intends to demonstrate some practical usages of the XML support extended by SQL Server 2005 and 2008.
SQL Server MVP Pinal Dave wrote the first post in this series at http://blog.sqlauthority.com/2009/03/17/sql-server-practical-sql-server-xml-part-one-query-plan-cache-and-cost-of-operations-in-the-cache/, describing how to shred the XML query plans in the SQL Server query plan cache. SQL Server MVP and author of "SQL Server 2008 XML", Michael Coles wrote the second part of the series at http://sqlblog.com/blogs/michael_coles/archive/2009/03/03/practical-sql-server-xml-part-2.aspx. In his post, Michael talks about sending multiple parameters, and even parameter sets, to SQL Server via XML.
When information is exchanged in XML format, there needs to be an agreement between the sender and receiver (caller and callee) on the structure of the XML document being exchanged. Michael's article demonstrates how to pass variable number of arguments to a stored procedure using an XML parameter. In this example, sender is the client application and receiver is the stored procedure that lives within a SQL Server database. The client application should send the XML document with the correct structure and values, else the stored procedure may not be able to process it and read the correct information from it.
This raises the need for a way to describe an XML document (structure, data type and format of values etc) and a way to validate an XML instance against the given XML description. Sometimes the position of elements in the XML document is significant and the code might break if the XML document does not contain the elements in the given order. Some elements may be mandatory and some optional. The code might break if a mandatory element is missing. For example, the Employee Name in the example posted by Michael is a mandatory element and the employee information does not make sense without an employee name. Data types of values are important as well. Age of an employee may be better described as an integer value (20, 34 etc) instead of a text value (twenty four, thirty nine etc). There may be also restrictions on the ranges, for example the age of an employee cannot be a negative number or a big positive number such as 500. Phone numbers should follow certain formats for example, (999) 999-9999.
In a real-world data exchange scenario, there needs to be a number of validations like the examples given above. Now the question is, "How do we perform those validations?" Well, this is where the XSD (XML Schema Definition) support extended by SQL Server 2005 (and 2008) can help you. You can create an XML Schema to describe the given XML document (structure, validations etc) and then create a SQL Server XML Schema Collection with your Schema definition. Once the schema collection is created, you can validate an XML document against the Schema Collection.
In this post, we will create an XML Schema Collection to validate the XML document Michael used in his post. I will only describe the XSD components needed for this example. If you find XML Schemas interesting, I would suggest you to read my book "The Art of XSD: SQL Server XML Schema Collections". It covers SQL Server XML Schema collections in detail starting from the basics. I have presented a large number of examples and several hands-on labs to make sure that even SQL Server developers who are not familiar with XSD can learn and write powerful real-life XML Schema Collections in SQL Server.
Let us start writing a schema for the XML document presented by Michael in his post. Here is the structure of the XML document.
<Person> <PersonType>SC</PersonType> <NameStyle>0</NameStyle> <Title>Mr.</Title> <FirstName>George</FirstName> <MiddleName/> <LastName>Jetson</LastName> <Suffix/> <EmailPromotion>0</EmailPromotion> <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress> </Person>
Let us identify the validations needed for this XML document.
1. Elements under the Person element should appear exactly in the same order.
2. XML is case sensitive, hence the name of elements should match exactly as given above
3. PersonType - should be one of "GC", "SP", "EM", "IN","VC" and "SC"
4. NameStyle- should be either "1" or "0"
5. Title - Empty string allowed. Maximum length 10 characters
6. FirstName - Should not be more than 50 characters long. Empty string not allowed
7. MiddleName - Should not be more than 50 characters long. Empty string is allowed
8. LastName - Should not be more than 50 characters long. Empty string is not allowed
9. Suffix - Empty string allowed. Maximum length 20 characters
10. EmailPromotion - Should be one of "0", "1" and "2"
11. Email - The value should be a valid email address
For the purpose of this example, let us go ahead with the validations listed above. In a real-life scenario, you might need a different set of validations.
Let us start writing the schema. I assume that you have a basic understanding of XML Schema Collections. If you are new to XML Schema Collections, I would suggest you read the following articles from my XML Workshop Series.
1. XML Workshop VI - Typed XML and SCHEMA Collection
2. XML Workshop VII - Validating values with SCHEMA
The articles listed above will give you an introduction to SQL Server XML Schemas. If you find schemas interesting, you may go ahead and read these too.
3. XML Workshop VIII - Custom Types and Inheritance
4. XML Workshop IX - Mixed Types
5. XML Workshop XIII - XSD And Variable Content Containers
Let us come back to our example. We defined the rules needed to validate the XML document. It is time to translate them to an XML Schema. Let us start with an empty schema declaration.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> </xsd:schema>
Let us now add the schema declaration for all the rules we defined earlier.
Rule: Elements under the Person element should appear exactly in the same order. XML is case sensitive hence the name of elements should match exactly as given above.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="Person"> <xsd:complexType> <xsd:sequence> <xsd:element name="PersonType"/> <xsd:element name="NameStyle"/> <xsd:element name="Title"/> <xsd:element name="FirstName"/> <xsd:element name="MiddleName"/> <xsd:element name="LastName"/> <xsd:element name="Suffix"/> <xsd:element name="EmailPromotion"/> <xsd:element name="EmailAddress"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:schema>
Rule: PersonType - should be one of "GC", "SP", "EM", "IN","VC" and "SC".
Let us define an enumeration to perform this validation.
<xsd:simpleType name="PersonTypeEnumeration"> <xsd:restriction base="xsd:string"> <xsd:enumeration value="GC"/> <xsd:enumeration value="SP"/> <xsd:enumeration value="EM"/> <xsd:enumeration value="IN"/> <xsd:enumeration value="VC"/> <xsd:enumeration value="SC"/> </xsd:restriction> </xsd:simpleType>
Now, let us set the type of the PersonType element to use this enumeration.
<xsd:element name="PersonType" type="PersonTypeEnumeration"/>
Rule: NameStyle- should be either "1" or "0"
<xsd:element name="NameStyle"> <xsd:simpleType> <xsd:restriction base="xsd:integer"> <xsd:minInclusive value="0"/> <xsd:maxInclusive value="1"/> </xsd:restriction> </xsd:simpleType> </xsd:element>
Rule: Title - Empty string allowed. Maximum length 10 characters
<xsd:element name="Title"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:maxLength value="10"/> </xsd:restriction> </xsd:simpleType> </xsd:element>
Rule: FirstName - Should not be more than 50 characters long. Empty string not allowed. LastName - Should not be more than 50 characters long. Empty string is not allowed.
Since both FirstName and LastName follow the same set of validation rules, let us create a simple type to perform this validation.
<xsd:simpleType name="NameType"> <xsd:restriction base="xsd:string"> <xsd:minLength value="1"/> <xsd:maxLength value="50"/> </xsd:restriction> </xsd:simpleType>
Now let us set the type of FirstName and LastName to NameType.
<xsd:element name="FirstName" type="NameType"/> <xsd:element name="LastName" type="NameType"/>
Rule: MiddleName - Should not be more than 50 characters long. Empty string is allowed.
<xsd:element name="MiddleName"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:maxLength value="50"/> </xsd:restriction> </xsd:simpleType> </xsd:element>
Rule: Suffix - Empty string allowed. Maximum length 20 characters
<xsd:element name="Suffix"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:maxLength value="20"/> </xsd:restriction> </xsd:simpleType> </xsd:element>
Rule: EmailPromotion - Should be one of "0", "1" and "2".
<xsd:element name="EmailPromotion"> <xsd:simpleType> <xsd:restriction base="xsd:integer"> <xsd:minInclusive value="0"/> <xsd:maxInclusive value="2"/> </xsd:restriction> </xsd:simpleType> </xsd:element>
Rule: Email - The value should be a valid email address.
Let us use a regular expression to validate email address. I will use a simple regular expression to validate the email address for the purpose of this example. You could come up with your own complex regular expressions and enhance this validation.
<xsd:element name="EmailAddress"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:pattern value="[A-Za-z0-9_]+([-+.'][A-Za-z0-9_]+)* @[A-Za-z0-9_]+([-.][A-Za-z0-9_]+)*\. [A-Za-z0-9_]+([-.][A-Za-z0-9_]+)*"/> </xsd:restriction> </xsd:simpleType> </xsd:element>
Note: I have used line breaks in the pattern to make it more readable. The line breaks should be removed before making the schema collection.
Let us now go ahead and create an XML Schema Collection with the schema definition we created so far. Here is the code that creates an XML schema collection named "PersonSchema"
IF EXISTS( SELECT * FROM sys.xml_schema_collections WHERE name = 'PersonSchema' ) DROP XML SCHEMA COLLECTION PersonSchema CREATE XML SCHEMA COLLECTION PersonSchema AS ' <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="Person"> <xsd:complexType> <xsd:sequence> <!-- "PersonType" - uses simpletype "PersonTypeEnumeration" --> <xsd:element name="PersonType" type="PersonTypeEnumeration"/> <!-- "NameType" - Only 0 or 1 accepted--> <xsd:element name="NameStyle"> <xsd:simpleType> <xsd:restriction base="xsd:integer"> <xsd:minInclusive value="0"/> <xsd:maxInclusive value="1"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <!-- "Title" - Length cannot be more than 10--> <xsd:element name="Title"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:maxLength value="10"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <!-- "First Name" - Uses simpletype "NameType" --> <xsd:element name="FirstName" type="NameType"/> <!-- "Middle Name" - Length cannot be more than 50 --> <xsd:element name="MiddleName"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:maxLength value="50"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <!-- "First Name" - Uses simpletype "NameType" --> <xsd:element name="LastName" type="NameType"/> <!-- "Suffix" - Length cannot be more than 20 --> <xsd:element name="Suffix"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:maxLength value="20"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <!-- "Email Promotion" - Can be one of 0, 1 or 2 --> <xsd:element name="EmailPromotion"> <xsd:simpleType> <xsd:restriction base="xsd:integer"> <xsd:minInclusive value="0"/> <xsd:maxInclusive value="2"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <!-- "Email Address" - The pattern below validates an email addr--> <xsd:element name="EmailAddress"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:pattern value="[A-Za-z0-9_]+([-+.''][A-Za-z0-9_]+)*@ [A-Za-z0-9_]+([-.][A-Za-z0-9_]+)*\.[A-Za-z0-9_] +([-.][A-Za-z0-9_]+)*"/> </xsd:restriction> </xsd:simpleType> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element> <!-- "PersonType" enumeration --> <xsd:simpleType name="PersonTypeEnumeration"> <xsd:restriction base="xsd:string"> <xsd:enumeration value="GC"/> <xsd:enumeration value="SP"/> <xsd:enumeration value="EM"/> <xsd:enumeration value="IN"/> <xsd:enumeration value="VC"/> <xsd:enumeration value="SC"/> </xsd:restriction> </xsd:simpleType> <!-- "NameType" declaration--> <xsd:simpleType name="NameType"> <xsd:restriction base="xsd:string"> <xsd:minLength value="1"/> <xsd:maxLength value="50"/> </xsd:restriction> </xsd:simpleType> </xsd:schema>'
Note: Make sure to remove the line breaks from the pattern we used for validating email address. I added line breaks just to make the schema more readable.
Let us now make sure the schema collection actually works in the way we expect. You can bind an XML variable or column to an XML schema collection as shown in the following example.
DECLARE @x XML(PersonSchema)
The above declaration indicates that the XML variable @x is bound to the XML schema collection "PersonSchema". When you assign a value to @x, SQL Server will validate the XML value and will raise an error if the validation fails.
Let us try to see the validation in action. The following is a correct XML document that follows all the rules we defined earlier. Run the following and it should complete without an error.
DECLARE @x XML(PersonSchema) SELECT @x = ' <Person> <PersonType>SC</PersonType> <NameStyle>0</NameStyle> <Title>Mr.</Title> <FirstName>George</FirstName> <MiddleName/> <LastName>Jetson</LastName> <Suffix/> <EmailPromotion>0</EmailPromotion> <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress> </Person>'
Now, let us try with a few 'bad' xml documents. The following fails because "FirstName" cannot be empty.
DECLARE @x XML(PersonSchema) SELECT @x = ' <Person> <PersonType>SC</PersonType> <NameStyle>0</NameStyle> <Title>Mr.</Title> <FirstName></FirstName> <MiddleName/> <LastName>Jetson</LastName> <Suffix/> <EmailPromotion>0</EmailPromotion> <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress> </Person>'
SQL Server will raise the following error when you run the above.
Msg 6926, Level 16, State 1, Line 2 XML Validation: Invalid simple type value: ''. Location: /*:Person[1]/*:FirstName[1]
Try to assign different variations of the above XML to the variable and you will see that SQL Server strictly validates the XML document against the Schema Collection.
Now, let us come back to Michael's example and modify it slightly to integrate the schema validation. The only change you need is to change the XML parameter to a TYPED xml parameter bound to the schema collection.
CREATE PROCEDURE Person.AddPersonXML @People xml(PersonSchema) = NULL AS ...........
Let us try to pass an incorrect XML document to the procedure and see what happens.
EXEC Person.AddPersonXML N' <Person> <PersonType>SC</PersonType> <NameStyle>3</NameStyle> <Title>Mr.</Title> <FirstName>George</FirstName> <MiddleName/> <LastName>Jetson</LastName> <Suffix/> <EmailPromotion>0</EmailPromotion> <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress> </Person>'
The above XML instance contains "3" in the element NameStyle and this will generate an error. If you run the above code, SQL Server will raise the following error.
Msg 6926, Level 16, State 1, Procedure AddPersonXML, Line 0 XML Validation: Invalid simple type value: '3'. Location: /*:Person[1]/*:NameStyle[1]
Now let us try with a correct XML value.
EXEC Person.AddPersonXML N' <Person> <PersonType>SC</PersonType> <NameStyle>0</NameStyle> <Title>Mr.</Title> <FirstName>George</FirstName> <MiddleName/> <LastName>Jetson</LastName> <Suffix/> <EmailPromotion>0</EmailPromotion> <EmailAddress>George.Jetson@spacelysprockets.com</EmailAddress> </Person>'
You will notice that this will execute successfully.
The examples presented in this article demonstrates how well you can make use of XML Schema collections to apply validations on XML parameters. This is one of the several use cases where XML Schema collections can help. Feel free to send your questions to jacob@beyondrelational.com or post a comment.




2 comments:
Great article! What if there were multiple errors when validating the XML document and I wanted to store the error code, attribute and value in a table so I can create an error report?
Hi Suhail,
You can put your code into a TRY CATCH block and capture the ERROR_MESSAGE() which can then be inserted to a log table.
For example:
BEGIN TRY
DECLARE @x XML(PersonSchema)
SELECT @x = '
[Person]
[PersonType]SC[/PersonType]
[/Person]'
END TRY
BEGIN CATCH
SELECT ERROR_MESSAGE()
END CATCH
Note: Please replace [] with XML tags
Post a Comment