Managed code in Yukon: Big deal?

Yukon, the next release of Microsoft’s SQL Server database, lets managed code execute within the database server’s process. Just how useful is this?
Written by Nick Wienholt, Contributor
The next release of Microsoft’s SQL Server database, which is currently known by its codename of Yukon, allows managed code to be executed within the confines of the database server’s process. There has been mixed reaction to Yukon’s managed code features, ranging from unbridled jubilation and proclamations about the death of application servers and the middle tier, through to deep pessimism and pledges of managed code disablement. This article attempts to answer the question: when does managed code make sense inside the database?

Managed code can be used inside SQL Server Yukon is two principal ways—managed types can act as the data types for columns inside a regular SQL Server table, and managed code can be used as or by stored procedures, triggers, and user-defined functions to query and manipulate data or perform other custom tasks. SQL Server contains many auxiliary components like Data Transformation Services (DTS) that currently provide access to COM objects, and the use of managed code inside these components is a sensible evolutionary step. Indeed, using managed code with these auxiliary components is possible today using COM interoperability and COM Callable Wrappers (CCWs). In contrast, managed types have no equivalent in SQL Server 2000, and while it is currently possible to use the .NET Framework inside SQL Server with Managed C++ and extended stored procedures, Yukon managed stored procedures represent an order-of-magnitude increase in ease and safety for writing stored procedures in traditional procedural languages.

Before delving into the functional arguments for and against the use of managed code in the database, it is worth covering some of the non-functional aspects, particularly database server process stability.

For many organisations, the database server is at the heart of information management workflows, and the prospect of poorly-written application code which is being executed inside the confines of the server process causing database crashes is a genuinely terrify scenario.

Figure 1: SQL Server Yukon Project in Visual Studio.NET

To address these fears, a great deal of work has occurred to ensure that the database engine can maintain tight control over managed code running within its process, and the Whidbey version of the .NET Framework that Yukon uses includes a new hosting interface that allow the database server to keep a tight leash on the managed code that it executes.

Managed code is run using a different application domain per database, which isolates any problems that occur in one database from those in other databases, and CLR permissions are used to control the functionality and resources that .NET code can access when running inside the database. A significant amount of work has gone into ensuring that the performance impact of managed code execution within the database is acceptable, and in some cases, the execution of managed code can actually improve performance by taking advantage of the compiled nature of .NET methods. .NET languages like C# are also more efficient at expressing some programming techniques compared to TSQL, and this can be used to speed up data manipulation before data is sent over the wire to database clients.

Managed Stored Procedures
Despite some of the more outlandish claims that managed stored procedure will bring about a fundamental change in the way database-centric applications are written, the number of occasions where managed code will be a better choice than TSQL for data extraction and manipulation will be quite limited.

Figure 2: Items that can be added to a SQL Server Project

To put the managed stored procedures in a historical context, Oracle has supported Java stored procedures for more than five years (since version 8i, released in March 1999), and this has failed to cause a stampede of developers leaving the SQL Server platform. As mentioned earlier, extended stored procedures in SQL Server allows Managed C++ to use .NET code within the database, and native extended stored procedures have been available since 1998 with the release of SQL Server 7. Managed stored procedures in SQL Server Yukon are undoubtedly simpler and safer than extended stored procedures in previous SQL Server versions. Yukon adds a Visual Studio.NET project template that makes creating managed stored procedures, triggers, and user-defined functions relatively easy. New Data Definition Language (DDL) statements allow .NET assemblies to be imported into the database server and exposed as stored procedures, but the use of Visual Studio.NET or the new SQL Server Workbench (which replaces Enterprise Manager) alleviates developers from using these commands.

Once a SQL Server Project has been created inside Visual Studio and a data source has been nominated, various database objects can be added to the solution. SQL Server Yukon ships with a new sample database called AdventureWorks, but for the sake of this demonstration the reliable and familiar Northwind database will be used.

Accessing data is accomplished using a subset of the ADO.NET libraries. For those familiar with ADO.NET, writing managed stored procedures will be a simple exercise. The complete syntax for querying the Northwind Orders table and returning the result to the stored procedure caller is shown in Listing A:

Visual Studio Whidbey allows a SQL Server project to be deployed directly to Yukon, avoiding the need to learn the new DDL statements related to .NET assemblies.

Two objects are worth calling out from the stored procedure—the SqlContext object, which is used to retrieve a SqlConnection object in this example, has various properties that can be used to retrieve various ambient properties related to the context the code is executing in. The SqlPipe object, which is return by the GetPipe method of SqlContext, is the communications mechanism used to return results, messages, and error status codes to the stored procedure client.

While the managed code is not overly cumbersome to write, for simple data access queries, TSQL will prove a much better choice. The compiled nature of .NET code means that it can have a speed advantage over interpreted TSQL for heavy numerical calculations, but TSQL has direct access to SQL Server’s internal buffers and has an optimised data access query plan associated with it. TSQL has also been markedly improved in the Yukon release, and contains improvements in error handling, looping, and recursive expressions. Given the database tier is the hardest (and generally most expensive) tier in an n-tier application to scale out, conducting processor intensive calculations within the database server, which is the main performance argument for in-process managed code, would not be advisable in most SQL Server deployments.

Managed stored procedures makes the most sense when the database server is also being used in the role of the application server. SQL Server Broker, which ships with Yukon, is described by SQL Server Books Online as “a new technology for building database-intensive distributed applications that are secure, reliable, and scalable”. The basic idea of SQL Server Broker is that all non-UI elements of a distributed application run within the confines of SQL Server, and various applets, which can be stored procedures, DTS packages or a traditional executable application, are executed by SQL Broker is response to data arrival. SQL Server Broker is an obvious example of the database-as-application-server architecture, but using SQL Server in this way is not confined to Broker systems. Regardless of the implementation, using managed code as SQL Server-hosted applets certainly makes sense if this architectural pattern is employed.

Managed Types
Managed types are an interesting addition to SQL Server Yukon. SQL Server is primarily a relational database and the XML support that was added SQL Server 2000 and built upon in SQL Server Yukon gives it strong hierarchical storage capabilities. Managed types complete the storage trifecta by adding object storage capabilities to the database. The constraints on the type definition are reasonably severe—the types must be a reference type, but the use of derived types inside the database is not supported, the type’s binary serialised form must give the same comparison results as the types equality operators if the type wishes to be used in any automatic comparison operations (for example, use as a primary key or use as a column in an order by clause), and the type must implement custom serialisation if it exposes properties that are not fixed-length value types. These restraints mean that SQL Server Yukon cannot be used as a generic object database, and some mapping from rich application object hierarchies to SQL Server-compatible types would be required. The essential characteristics of managed types can be illustrated with a simple example—Listing B shows a C# Person type compliant with SQL Server Yukon’s type rules:

The sample shows a managed type with two properties based on member variables (Name and DOB) and a calculated property (Age). As is evident from the code, managed types can be quite complex, even if the data they are storing is simple. There are a number of rules that a type must follow, and these rules cannot be compile-time enforced, which makes it quite easy to create data storage bugs. In addition, the TSQL support for dealing with managed types is still quite primitive, and the complexities that are likely to arise when combining managed types and advanced database tasks like replication are concerning. In a software environment where the general trend for long-lived distributed systems is to move away from object technologies, managed types seem like a regressive step for most database designs.

Software engineers face a tough choice when dealing with new technologies. We need to “reset our defaults” to avoid the ugly habit of using new tools in exactly the same way that the pervious toolset was used, but caution also needs to be shown to avoid jumping onto new technologies simply because their shiny new exterior initially appears more attractive than the technologies currently available.

When deciding where and how managed code should be used inside the database, err on the side of caution, and wait for a compelling reason before throwing away TSQL and relational database design. These technologies have served the software industry quite well, and are adequate for a wide range of tasks. Managed code in the database is yet to attain these characteristics.

Editorial standards