Managed code in Yukon: Big deal?
Managed code can be used inside SQL Server Yukon is two principal ways—managed types can act as the data types for columns inside a regular SQL Server table, and managed code can be used as or by stored procedures, triggers, and user-defined functions to query and manipulate data or perform other custom tasks. SQL Server contains many auxiliary components like Data Transformation Services (DTS) that currently provide access to COM objects, and the use of managed code inside these components is a sensible evolutionary step. Indeed, using managed code with these auxiliary components is possible today using COM interoperability and COM Callable Wrappers (CCWs). In contrast, managed types have no equivalent in SQL Server 2000, and while it is currently possible to use the .NET Framework inside SQL Server with Managed C++ and extended stored procedures, Yukon managed stored procedures represent an order-of-magnitude increase in ease and safety for writing stored procedures in traditional procedural languages.
Before delving into the functional arguments for and against the use of managed code in the database, it is worth covering some of the non-functional aspects, particularly database server process stability.
For many organisations, the database server is at the heart of information management workflows, and the prospect of poorly-written application code which is being executed inside the confines of the server process causing database crashes is a genuinely terrify scenario.
Figure 1: SQL Server Yukon Project in Visual Studio.NET
Managed code is run using a different application domain per database, which isolates any problems that occur in one database from those in other databases, and CLR permissions are used to control the functionality and resources that .NET code can access when running inside the database. A significant amount of work has gone into ensuring that the performance impact of managed code execution within the database is acceptable, and in some cases, the execution of managed code can actually improve performance by taking advantage of the compiled nature of .NET methods. .NET languages like C# are also more efficient at expressing some programming techniques compared to TSQL, and this can be used to speed up data manipulation before data is sent over the wire to database clients.
Managed Stored Procedures
Despite some of the more outlandish claims that managed stored
procedure will bring about a fundamental change in the way
database-centric applications are written, the number of occasions
where managed code will be a better choice than TSQL for data
extraction and manipulation will be quite limited.
Figure 2: Items that can be added to a SQL Server Project
Once a SQL Server Project has been created inside Visual Studio and a data source has been nominated, various database objects can be added to the solution. SQL Server Yukon ships with a new sample database called AdventureWorks, but for the sake of this demonstration the reliable and familiar Northwind database will be used.
Accessing data is accomplished using a subset of the ADO.NET libraries. For those familiar with ADO.NET, writing managed stored procedures will be a simple exercise. The complete syntax for querying the Northwind Orders table and returning the result to the stored procedure caller is shown in Listing A:
Visual Studio Whidbey allows a SQL Server project to be deployed directly to Yukon, avoiding the need to learn the new DDL statements related to .NET assemblies.
Two objects are worth calling out from the stored procedure—the SqlContext object, which is used to retrieve a SqlConnection object in this example, has various properties that can be used to retrieve various ambient properties related to the context the code is executing in. The SqlPipe object, which is return by the GetPipe method of SqlContext, is the communications mechanism used to return results, messages, and error status codes to the stored procedure client.
While the managed code is not overly cumbersome to write, for simple data access queries, TSQL will prove a much better choice. The compiled nature of .NET code means that it can have a speed advantage over interpreted TSQL for heavy numerical calculations, but TSQL has direct access to SQL Server’s internal buffers and has an optimised data access query plan associated with it. TSQL has also been markedly improved in the Yukon release, and contains improvements in error handling, looping, and recursive expressions. Given the database tier is the hardest (and generally most expensive) tier in an n-tier application to scale out, conducting processor intensive calculations within the database server, which is the main performance argument for in-process managed code, would not be advisable in most SQL Server deployments.
Managed stored procedures makes the most sense when the database server is also being used in the role of the application server. SQL Server Broker, which ships with Yukon, is described by SQL Server Books Online as “a new technology for building database-intensive distributed applications that are secure, reliable, and scalable”. The basic idea of SQL Server Broker is that all non-UI elements of a distributed application run within the confines of SQL Server, and various applets, which can be stored procedures, DTS packages or a traditional executable application, are executed by SQL Broker is response to data arrival. SQL Server Broker is an obvious example of the database-as-application-server architecture, but using SQL Server in this way is not confined to Broker systems. Regardless of the implementation, using managed code as SQL Server-hosted applets certainly makes sense if this architectural pattern is employed.
Managed Types
Managed types are an interesting addition to SQL Server Yukon. SQL
Server is primarily a relational database and the XML support that was
added SQL Server 2000 and built upon in SQL Server Yukon gives it
strong hierarchical storage capabilities. Managed types complete the
storage trifecta by adding object storage capabilities to the database.
The constraints on the type definition are reasonably severe—the types
must be a reference type, but the use of derived types inside the
database is not supported, the type’s binary serialised form must give
the same comparison results as the types equality operators if the type
wishes to be used in any automatic comparison operations (for example,
use as a primary key or use as a column in an order by clause), and the
type must implement custom serialisation if it exposes properties that
are not fixed-length value types. These restraints mean that SQL Server
Yukon cannot be used as a generic object database, and some mapping
from rich application object hierarchies to SQL Server-compatible types
would be required.
The essential characteristics of managed types can be illustrated with
a simple example—Listing B shows a C# Person type compliant with SQL Server Yukon’s type rules:
The sample shows a managed type with two properties based on member variables (Name and DOB) and a calculated property (Age). As is evident from the code, managed types can be quite complex, even if the data they are storing is simple. There are a number of rules that a type must follow, and these rules cannot be compile-time enforced, which makes it quite easy to create data storage bugs. In addition, the TSQL support for dealing with managed types is still quite primitive, and the complexities that are likely to arise when combining managed types and advanced database tasks like replication are concerning. In a software environment where the general trend for long-lived distributed systems is to move away from object technologies, managed types seem like a regressive step for most database designs.
Conclusion
Software engineers face a tough choice when dealing with new
technologies. We need to “reset our defaults” to avoid the ugly habit
of using new tools in exactly the same way that the pervious toolset
was used, but caution also needs to be shown to avoid jumping onto new
technologies simply because their shiny new exterior initially appears
more attractive than the technologies currently available.
When deciding where and how managed code should be used inside the database, err on the side of caution, and wait for a compelling reason before throwing away TSQL and relational database design. These technologies have served the software industry quite well, and are adequate for a wide range of tasks. Managed code in the database is yet to attain these characteristics.