Sunday, January 31, 2010

Notes on SQL 9

Previous related posts:

XML

FOR XML _____:

  • RAW: returns each row as an XML element and each column as an XML attribute (<row ClientID="13" Name="Fake client name" />). Aliases can be used for columns, using RAW('element_type') causes the rows to be "named" element_type (instead of "row" in the example), adding ROOT('root_name') wraps everything in <root_name> </root_name> tags, and adding ELEMENTS causes each column to be listed as its own element instead of an attribute. By default, NULL values are not included as attributes in the XML element (or as XML elements if the ELEMENTS directive is used), but adding the XSINIL directve to the ELEMENTS directive adds an "xsi:nil="true"" attribute for elements whose corresponding columns have NULL. As a side-effect, a namespace is added to the root element in the XML, since that's where the nil attribute is declared.

  • AUTO: supports basic ("single-child") hierarchies. The hierarchy is created based on the order of the columns in the SELECT statement (not the order in which the tables are referenced). Each table referenced, though, gets its object in the hierarchy, named after the alias used for the table. The order of the rows in the result determines how the hierarchy is built, so ORDER BY clauses are important when using this mode.

  • EXPLICIT: the ELEMENTS and XSINIL directives do not exist for this mode. ORDER BY is important to construct the result, just as in AUTO mode. Any XML structure (even some not supported by the PATH mode) can be created, by using specific aliases for the columns (e.g. "Parent", "Customers!1!!element", "Customer!2!Id", "Order!3!OrderDate"). A tag and a parent column are necessary; the first one uniquely identifies an XML element in the result, and the second specifies the tag of the parent element (or NULL).

  • PATH: the ELEMENTS and XSINIL directives do not exist for this mode. Preferred mode. Using certain syntax in the column aliases, you can generate XML structures:


SELECT
c.ClientID as "@Id"
,c.Name as "@ClientName"
,c.RowGuid as "comment()"
,CAST('<Test/>' as XML) as "node()"
,c.ClientType as "AdditionalInfo/@Type"
,c.ModifiedDate as "AdditionalInfo/text()"
FROM dbo.Customer AS c
FOR XML PATH('Customer'), ROOT('Customers')


Results in:

<Customers>
<Customer Id="1" ClientName="Fake client 1">
<!-- 3ACB432D-B87D-4AED-95B4-C3019DA348B1-->
<Test />
<AdditionalInfo Type="S">2007-10-10T10:00:00.123</AdditionalInfo>
</Customer>
</Customers>


When nesting FOR XML queries, the TYPE declarative must be added at the end of each subquery, so the XML is not interpreted as a string.

A custom namespace can be included by preceding the query with a WITH XMLNAMESPACES clause.

XML Data Type

Typed XML attaches a schema collection to a column to verify its contents. You create a schema collection with the CREATE XML SCHEMA COLLECTION statement. When declaring XML columns for a table, just "XML" means untyped, or a schema collection can be indicated in parenthesis. Also inside those parenthesis, the DOCUMENT option forces the column to support only XML documents (not fragments).

A disadvantage of typed XML columns is that if a modification must be done to the schema, all columns that use it must be changed to use untyped XML (costly due to differences in internal representation) and updated so they comply with the new schema, the schema dropped and recreated, and then the columns converted back to it (costly again).

The XML data types provide methods to use them:

  • EXIST: obvious.

  • VALUE: retrieves a single, scalar value from somewhere in the XML.

  • NODES: converts XML into a tabular form.

  • QUERY: used to return an XML fragment.


CLR

Usage of CLR within the database requires that the SQL Server instance allows CLR code to run (EXEC sp_configure 'clr enabled', 1; RECONFIGURE;). Once the (VB.NET/C#) code is compiled into an assembly, the assembly must be loaded into SQL Server and a database object must be created and pointed to the loaded assembly using DDL.

The SqlContext and SqlPipe classes (in the Microsoft.SqlServer.Server namespace) must be used for command execution, since they are used to communicate to the SQL Server session within which the code will execute.

To load a compiled assembly (dll) into the database, the CREATE ASSEMBLY statement is used (CREATE ASSEMBLY AssemblyNameInDB FROM 'assembly_path_in_the_filesystem'). The dll file is no longer needed since the assembly is copied into the DB (the sys.assemblies catalog view contains information about existing assemblies).

An example of a SP that calls a CLR assembly is the following:

CREATE PROCEDURE ExampleSP @Parameter1 INT AS EXTERNAL NAME AssemblyName.[ClassNameWithNamespace].MethodName

ClassNameWithNamespace is enclosed in brackets (it could have been quotes) because it probably contains periods, which are not part of a valid identifier in SQL Server.

Source code can also be stored in the DB, like this:

ALTER ASSEMBLY AssemblyNameInDB ADD FILE FROM 'source_code_path_in_the_filesystem'

CLR Scalar UDFs

The SqlFunction attribute is required on the method. You can specify IsDeterministic, and IsPrecise (if it uses any floating point arithmetic, then this is false).

CLR Table-valued UDFs

2 methods are needed: the one that is called from SQL Server (decorated with an attribute like [SqlFunction(FillRowMethodName = "OtherMethodName", TableDefinition = "Column1 INT, Column2 NVARCHAR(100)" )]), which uses yield return to "generate" each row in the result, and the one that fills each row that gets returned ("OtherMethodName"). The second method always receives one object parameter (which is whatever the "yield return" statement was called for in the first method), and the same amount of OUT parameters as columns in the output table definition (these will be the actual column values).

CLR triggers

Similar to CLR SPs, but you can use the TriggerContext class in SqlContext. to perform trigger-specific checks.

CLR User-Defined aggregates

An entire CLR type (class/struct) must be created, so the aggregate can accomplish this:

  • Initialize: by using the Init method of the defined CLR type.

  • Add another value to the calculation: with the Accumulate method.

  • Combine itself with another instance of the same aggregate: with the Merge method.

  • Return its result: with the Terminate method.


The decorating attribute SqlUserDefinedAggregate can specify if serialization should be handled automatically or by the user (that creates the code), if an empty input returns NULL, if duplicates matter, if NULLs matter, if order matters, and the maximum size of the serialized structure.

CLR User-Defined Types

The attribute SqlUserDefinedType can define who handles serialization (see CLR User-Defined Aggregates), a method to Validate integrity when a binary type is casted to the UDT, if the type is byte-ordered (SQL Server only supports byte sorting, so you should be VERY CAREFUL when implementing this), if it is of fixed length, and its max size. It must implement the INullable interface. Parse and ToString methods are required. More details here.

The following permission sets can be used for CLR assemblies:

  • SAFE: not allowed access to any resources outside the DB to which it is deployed.

  • EXTERNAL ACCESS: allowed to access resources outside the local SQL Server instance, such as another instance, the file system, or even a network resource.

  • UNSAFE: allowed to execute non-CLR code (e.g. Win32 API or a COM component).


Filestream

Option for columns of the varbinary(max) data type. Allows SQL Server to store data for those columns in separate files in the file system. To enable filestream usage, use EXEC sp_configure 'sp_filestream_access', (1|2|3). 1 allows only T-SQL access to the filestream data; 2 allows direct file-system access; 3 allows access through a network share. A filegroup that contains a Filestream database is also required.

Usage requires the table to have a UNIQUEIDENTIFIER column with the ROWGUIDOL property and a Unique constraint on it.

Deleting of filestream data is done with a garbage collector, not necessarily at the moment when the UPDATE/DELETE statement is executed.

No comments: