The Best of Simple Talk

Jit ‘n’ Run


The Best of
Simple Talk - .NET
Vol.2

In association with

ISBN: 978-1-906434-12-0 www.simpletalkpublishing.com


Shelving: Development/Computer Science

JIT N' Run v2

The Best of Simple-Talk .NET

by Amirthalingam Prasanna, Ben Hall, Brian Donahue, Damon Armstrong,
Francis Norton, James Moore, Jeff Hewitt, Jesse Liberty, John Papa, Mike
Bloise and Tilman Bregler
First published 2008 by Simple-Talk Publishing

Copyright Amirthalingam Prasanna, Ben Hall, Brian Donahue, Damon Armstrong, Francis Norton, James Moore, Jeff Hewitt, Jesse
Liberty, John Papa, Mike Bloise and Tilman Bregler 2008

ISBN 978-1-906434-12-0

The right of Amirthalingam Prasanna, Ben Hall, Brian Donahue, Damon Armstrong, Francis Norton, James Moore, Jeff Hewitt, Jesse
Liberty, John Papa, Mike Bloise and Tilman Bregler to be identified as the author of this work has been asserted by him in accordance with
the Copyright, Designs and Patents Act 1988

All rights reserved. No part of this publication may be reproduced, stored or introduced into a retrieval system, or transmitted, in any form,
or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written consent of the publisher. Any
person who does any unauthorised act in relation to this publication may be liable to criminal prosecution and civil claims for damages.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated
without the publisher’s prior consent in any form other than which it is published and without a similar condition including this condition
being imposed on the subsequent publisher.

Typeset by Andrew Clarke and Chris Massey



CONTENTS
Contents
About the Authors
Acknowledgements
Introduction
.NET Application Architecture: the Data Access Layer
  Designing and building a robust data access layer
    Layered design and the data access layer
    Design principles in the data access layer
    Exchanging Data with the DAL
    Putting theory into practice: the demo application
.NET Application Architecture: Logging SQL Exceptions in the Data Access Layer
  Managing SQL exceptions in .NET applications
    The real-world scenario
    Solution architecture: objectives and overview
    Building the SQL statement logger
    The code in action: demo application
  Conclusion
ADO.NET 2.0 Factory Classes
  Achieve database independence by developing a pluggable data layer
    Introduction
    Supporting many database products
    System.Data.Common namespace
  Conclusion
Implementing Real-World Data Input Validation using Regular Expressions
    Some real validation requirements
    Basics: Implementing NUM using "^"…"$", "["…"]", "?" and "+"
    Using "{" … "}", "(" … ")", "\", and "\d" to implement Repetition
    Using "|" to implement a logical OR
    Using "(?=" … ")" to implement a logical AND
    Using "(?!" … ")" to implement AND NOT
    Conclusion
    References
.NET 3.5 Language Enhancements
    Automatic Properties
    Object Initializers
    Collection Initializers
    Extension Methods
    Anonymous Types and Implicitly Typed Variables
    Wrapping Up
.NET Collection Management with C# 3.0
  Using C# 3.0 to manage a collection of objects
  Introduction
  Sorting a List
  Searching a List
  List Conversion
  Linq
  Conclusion
Exceptionally expensive
Need to loosen my bindings
Extending MSBuild
    Project Files are MSBuild Files
    The Default Build Process
    Property, Item and Target Evaluation
    Side Effects of Static Evaluation
    Discovering Predefined Properties and Items
    Referencing Different Dlls for Release and Debug Builds
    Executing Custom Targets
    Extending All Builds
    MSBuild Constrictions
    Conclusion
Controls Based Security in a Windows Forms Application
  Users and roles
  Roles and controls
  The database
  Application and control-security forms
  Creating the application
    Creating users and roles
    The Manage Permissions page
  Implementing permissions – step-by-step
    ManagePermisssions constructor
    ShowControls
    Populate Permission Tree
    Adding a new restriction to a control
    Clean up on exit
  The Debugger is your friend
  Wrap-up
Building Active Directory Wrappers in .NET
    The problem with values in Active Directory
    IADsLargeInteger wrapper
    ObjectGuid wrapper
    ObjectSid wrapper
    UserAccountControl wrapper
    Using the wrapper classes
    Updating the Active Directory entry
    Things to keep in mind
    Conclusion – possible improvements or additions
Integrating with WMI
  Creating and installing the service
  Starting and Stopping the Service
  Adding WMI Support
  Testing the WMI Provider
  Summary
Make sure your .NET applications perform
  Beware of optimising needlessly
  A big Oh
  Get real
  What to look for
    Loops
    Do less work
    Divide and conquer
    Modelling
    Dynamic programming
    Translate the problem
    If all else fails, fake it
    Blocking functions
    Threads can end up in knots
    Batch it up
    It's good to share
    Lock free
  Conclusion
Tracing memory leaks in .NET applications with ANTS Profiler
    Summary
Testing Times Ahead: Extending NUnit
  Why Extend?
  NUnit Extension Points
  How to Extend NUnit?
    Hello World
    Suite Builders
    Test Case Builders
    Test Decorators
    Event Listeners
    Deployment and execution
  Summary
Using SQL Data Generator with your Unit Tests
  SQL Data Generator
  Automating Data Generation
    The Attribute Based approach
    The POCO Wrapper approach
  Maintainability and coping with schema changes
  Testing your Business Layer
  Summary

ABOUT THE AUTHORS


Ben Hall
Ben Hall is a UK C# developer/tester who enjoys all aspects of the development lifecycle and
technology in general. During the day Ben works for Red Gate Software as a Test Engineer. At night,
Ben is a MbUnit Core Commit Member, helping out on the project whenever possible. Ben has also
gained his Microsoft Certified Technology Specialist (MCTS) in Web Applications and is a Microsoft
Certified Professional (MCP) in Administrating Windows XP. He blogs at
http://blog.benhall.me.uk/index.html
Amirthalingam Prasanna
Prasanna is a software engineer, technical author and trainer with over 7 years' experience in the
software development industry. He is a Microsoft MVP in the Visual developer category, an MCT and
an MCPD on enterprise application development. You can read his blog at www.prasanna.ws and
e-mail him at feedback@prasanna.ws
Damon Armstrong
Damon Armstrong is a senior architect with Cogent Company in Dallas, Texas, and author of Pro
ASP.NET 2.0 Website Programming. He specializes in Microsoft technologies with a focus on web
application design using ASP.NET. When not staying up all night coding, he can be found playing
disc golf, softball, working on something for Carrollton Young Life, or recovering from staying up all
night coding.
Brian Donahue
Have debugger, will travel! Brian has been the Product Support Team Leader for Red Gate Software
Ltd since 2002, and has ten years’ experience in troubleshooting and maintaining computer software.
He enjoys the challenge of maintaining his hard-core technical expertise in parallel with his people
skills, and is exhilarated by working with people on the cutting edge of computer science.
Jesse Liberty
The president of Liberty Associates, Inc and a Microsoft MVP, Jesse Liberty is the author of the
international best-selling Programming C#, Programming VB 2005, Programming ASP.NET and
numerous other books, including the forthcoming Programming .NET 3. He has written dozens of
articles for leading industry publications and has been a featured or keynote speaker at international
industry events. Jesse’s biography is listed on Wikipedia and he maintains a political blog, a technical
blog, and a free private forum on which he provides support for all his writing.
Jeff Hewitt
Jeff Hewitt is a senior consultant with Credera, a full service business and technology consulting firm
in Dallas, Texas. He specializes in Microsoft technologies, specifically Windows Forms and ASP.NET
application design and development. He is also a capable Java developer and enjoys tinkering with
open source projects.
Mike Bloise
Mike Bloise is the lead developer at Recognin Technologies in Raleigh, North Carolina. His focus is
on building flexible, extensible application frameworks, and he currently uses C# 2005 for nearly
everything. His daily bread is Rz3, an enterprise management system for electronic component
distributors. When he's not programming, he enjoys thinking, talking, and writing about it.
James Moore
James Moore is a developer and runs the .NET Tools division at Red Gate Software. Previously, he
developed the UI for SQL Compare 5 and SQL Data Compare 5, and was technical lead on SQL
Backup 5.
John Papa
John Papa, Senior .NET Consultant at ASPSOFT, is a Microsoft MVP [C#], MCSD.NET, and an
author of several data access technology books. John has over 10 years' experience in developing
Microsoft technologies as an architect, consultant, and a trainer. He is the author of the Data Points
column in MSDN Magazine and can often be found speaking at user groups, MSDN Web Casts, and
industry conferences such as VSLive and DevConnections. John is a baseball fanatic who spends
most of his summer nights rooting for the Yankees with his family and his faithful dog, Kadi. You
can contact John at johnpapa.net.
Tilman Bregler
Tilman started work at Red Gate as a tester and is now a build engineer. He has a BSc Maths and
Philosophy (comb.) from York and Part III Maths, Cambridge.
Francis Norton
Francis Norton works at iE (http://www.ie.com), developing online retail finance applications in C#,
ASP.NET and SQL Server 2005, with special interests in XML technologies, PowerShell and mainframe
integration. In an earlier life he contributed chapters to three Wrox books on XML. He's still trying to
work out if he gets more satisfaction from solving problems or from explaining the solutions.

ACKNOWLEDGEMENTS
Grateful thanks are here given to all those who helped in the production of this Ebook, including
Claire Brooking, Anna Larjomaa, Simon Galbraith, Laila Lotfi and Tony Davis at Red-Gate. We’d also
like to thank the authors, who all submitted cheerfully to the editing process that had their
contributions altered to conform with Simple-Talk’s ‘house style’.

INTRODUCTION
This Ebook is a republication of the best of Simple-Talk’s .NET articles. These articles are written to
appeal to a wide readership. Simple-Talk has readers who work professionally as developers,
database administrators, testers, technical authors, architects and IT managers. The articles are
designed to inform without patronising or ‘talking down’ to the less technical readers. Simple-
Talk always likes to take a slightly different approach to ‘Internet-based’ IT publishing and we are not
afraid of introducing humour or of going out of our way to explain highly esoteric subjects.
Simple-Talk is an online IT magazine that is designed for those who are professionally involved in
software development and who work with Microsoft Technologies, the most important to us being
SQL Server and .NET. Many of our readers are those who have bought, or shown an interest in, Red-
Gate tools, and the magazine is used, not only to provide general technical content, but also to
provide technical content about Red-Gate tools, and keep our readers abreast of any important
changes or developments at Red-Gate. We are independent of Microsoft, but we are, of course, Red-
Gate’s ‘house’ magazine, so are unapologetic for occasionally featuring Red-Gate products amongst
Microsoft’s and others, where it is relevant.

.NET APPLICATION ARCHITECTURE: THE DATA ACCESS LAYER


11 July 2006
by Damon Armstrong

Designing and building a robust data access layer


Building an understanding of architectural concepts is an essential aspect of managing your career.
Technical interviews normally contain a battery of questions to gauge your architectural knowledge
during the hiring process, and your architectural ability only becomes more important as you ascend
through the ranks. So it's always a good idea to make sure you have a good grasp on the
fundamentals. In this article you will explore a key component of application architecture known as
the Data Access Layer (DAL), which helps separate data-access logic from your business objects. The
article discusses the concepts behind the DAL, and the associated PDF file takes a look at a full-
blown DAL implementation. This is the first in a series of articles discussing some of the cool things
you can do with a DAL, so the code and concepts in this article form the base for future discussions.

Layered design and the data access layer


Layered application designs are extremely popular because they increase application performance,
scalability, flexibility, and code reuse, and have a myriad of other benefits that I could rattle off if I had all
of the architectural buzzwords memorized. In the classic three-tier design, applications break down
into three major areas of functionality:
• The data layer manages the physical storage and retrieval of data.
• The business layer maintains business rules and logic.
• The presentation layer houses the user interface and related presentation code.
Inside each of these tiers there may also exist a series of sub-layers that provide an even more
granular breakdown of the functional areas of the application. Figure 1 outlines a basic three-tiered
architecture in ASP.NET along with some of the sub-tiers that you may encounter:

Figure 1 – Three-tiered ASP.NET application with sub-tiers

The presentation tier


In the presentation layer, the code-behind mechanism for ASP.NET pages and user controls is a
prominent example of a layered design. The markup file defines the look and layout of the web form
and the code-behind file contains the presentation logic. It's a clean separation because both the
markup and the code-behind layers house specific sets of functionality that benefit from being apart.
Designers don't have to worry about messing up code to make user interface changes, and developers
don't have to worry about sifting through the user-interface to update code.
The data tier
You also see sub-layers in the data tier with database systems. Tables define the physical storage of
data in a database, but stored procedures and views allow you to manipulate data as it goes into and
out of those tables. Say, for example, you need to denormalize a table and therefore have to change its
physical storage structure. If you access tables directly in the business layer, then you are forced to
update your business tier to account for the changes to the table. If you use a layer of stored
procedures and views to access the data, then you can expose the same logical structure by updating a
view or stored procedure to account for the physical change without having to touch any code in your
business layer. When used appropriately, a layered design can lessen the overall impact of changes to
the application.
The business tier
And of course, this brings us to the topic of business objects and the Data Access Layer (also known
as the DAL), two sub-layers within the business tier. A business object is a component that
encapsulates the data and business processing logic for a particular business entity. It is not, however,
a persistent storage mechanism. Since business objects cannot store data indefinitely, the business tier
relies on the data tier for long-term data storage and retrieval. Thus, your business tier contains logic
for retrieving persistent data from the data tier and placing it into business objects and, conversely,
logic that persists data from business objects into the data tier. This is called data access logic.
Some developers choose to put the data access logic for their business objects directly in the business
objects themselves, tightly binding the two together. This may seem like a logical choice at first
because from the business object perspective it seems to keep everything nicely packaged. You will
begin noticing problems, however, if you ever need to support multiple databases, change databases,
or even overhaul your current database significantly. Let's say, for example, that your boss comes to
you and says that you will be moving your application's database from Oracle to SQL Server and that
you have four months to do it. In the meantime, however, you have to continue supporting whatever
business logic changes come up. Your only real option is to make a complete copy of the business
object code so you can update the data access logic in it to support SQL Server. As business object
changes arise, you have to make those changes to both the SQL Server code base and the Oracle code
base. Not fun. Figure 2 depicts this scenario:

Figure 2 – Business objects with embedded data access logic

A more flexible option involves removing the data access logic from the business objects and placing
it all in a separate assembly known as the DAL. This gives you a clean separation between your
business objects and the data access logic used to populate those business objects. Presented with the
same challenge of making the switch from Oracle to SQL Server, you can just make a copy of the
Oracle DAL and then convert it to work with SQL Server. As new business requirements come in,
you no longer need to make changes in multiple locations because you only maintain a single set of
business objects. And when you are done writing the SQL Server DAL, your application has two
functional data access layers. In other words, your application has the means to support two
databases. Figure 3 depicts separating data access logic out into a separate DAL:

Figure 3 – Business objects with separate data access layer



Design principles in the data access layer


The objective of the DAL is to provide data to your business objects without exposing them to
database-specific code. You accomplish this by exposing a series of data access methods from the DAL that operate on
data in the data tier using database-specific code but do not expose any database-specific method
parameters or return types to the business tier. Any time a business object needs to access the data
tier, you use the method calls in the DAL instead of calling directly down to the data tier. This pushes
database-specific code into the DAL and makes your business objects database independent.
Now wait, you say, all you've accomplished is making the business objects dependent on the DAL.
And since the DAL uses database-specific code, what's the benefit? The benefit is that the DAL
resides in its own assembly and exposes database-independent method signatures. You can easily
create another DAL with the same assembly name and an identical set of method signatures that
supports a different database. Since the method signatures are the same, your code can interface with
either one, effectively giving you two interchangeable assemblies. And since the assembly is a physical
file referenced by your application and the assembly names are the same, interchanging the two is
simply a matter of placing one or the other into your application's bin folder.
NOTE:
You can also implement a DAL without placing it in a separate assembly if you build it
against a DAL interface definition, but we will leave that to another article.

Exchanging Data with the DAL


Now the question is: how do you exchange data between your business objects and the DAL, in both
directions? All interaction between your business objects and the DAL occurs by calling data access
methods in the DAL from code in your business objects. As mentioned previously, the method
parameters and return values in the DAL are all database independent to ensure your business objects
are not bound to a particular database. This means that you need to exchange data between the two
using non-database-specific .NET types and classes. At first glance it may seem like a good idea to
pass your business objects directly into the DAL so they can be populated, but it's just not possible.
The business object assembly references the DAL assembly, so the DAL assembly cannot reference
the business object assembly or else you would get a circular reference error. As such, you cannot pass
business objects down into the DAL because the DAL has no concept of your business objects.
Figure 4 diagrams the situation:

Figure 4 – Business objects assembly references the DAL, so the DAL has no concept of business
objects

The custom class option


One option is to pass information in custom classes, as long as those custom classes are defined in an
assembly that both the business object and DAL assemblies can reference. From an academic
standpoint, this approach is probably the truest form of a data abstraction for a DAL because you
can make the shared classes completely data-source independent and not just database independent.
Figure 5 depicts how the business object assembly and the DAL assembly can both reference a shared
assembly:

Figure 5 – The business object assembly and the DAL assembly both reference a shared assembly, so
they can exchange information using classes and data structures from the shared assembly.

In practice, I find that building out custom classes solely to exchange data doesn't give you much
return for your effort, especially when there are other acceptable options already built into .NET.
The XML approach
You could opt to use XML since it's the poster child of flexibility and data-source independence and
can easily represent any data imaginable. Of course, it also means that you will be doing a lot of XML
parsing work to accommodate the data exchange, and I'm not a fan of extra work.
The database interface approach
You could also use the database interfaces from the System.Data namespace to exchange data
between business objects and the DAL. Database-specific objects such as SqlDataReader,
SqlCommand, and SqlParameter are tied to SQL Server, and exposing them from the DAL would
defeat the purpose. However, by exposing an IDataReader, IDbCommand, or IDataParameter
object you do not tie yourself to a particular database, so they are an acceptable option, though not my
first choice.
From an academic standpoint, the database interface objects do tie you to using a "database
management system" even though they do not tie you to a specific database. Pure academics will tell
you that the DAL should be "data-source independent" and not just "database independent" so be
prepared for that fight if you have a Harvard or Oxford grad on your development team who
majored in theoretical application design. Nobody else on the planet cares because the chances of
your application moving away from a database system are fairly slim.
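To make the database interface approach concrete, here is a minimal sketch of business-side code that consumes an IDataReader without knowing which provider produced it. The names ReaderDemo, Person_GetAllReader, and ReadFirstNames are hypothetical and are not part of the demo application. To keep the sketch runnable without a database, an in-memory DataTable stands in for the query; a real DAL would return the reader from a provider-specific command instead.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderDemo
{
    // Hypothetical stand-in for a DAL method. A real DAL would execute a
    // provider-specific command; an in-memory DataTable supplies the rows
    // here so the sketch runs without a database.
    public static IDataReader Person_GetAllReader()
    {
        DataTable table = new DataTable("Person");
        table.Columns.Add("PersonID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Rows.Add(1, "Bob");
        table.Rows.Add(2, "Alice");
        return table.CreateDataReader(); // DataTableReader implements IDataReader
    }

    // Business-side code depends only on the IDataReader interface.
    public static List<string> ReadFirstNames(IDataReader reader)
    {
        List<string> names = new List<string>();
        using (reader)
        {
            int ordinal = reader.GetOrdinal("FirstName");
            while (reader.Read())
            {
                names.Add(reader.GetString(ordinal));
            }
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in ReadFirstNames(Person_GetAllReader()))
        {
            Console.WriteLine(name);
        }
    }
}
```

Note that the reader is forward-only: once the rows have been read they cannot be revisited or rearranged, which is one reason the DataSet discussed next is more flexible.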
My preferred approach: DataSets
Another option for passing information, and the one that I gravitate towards because of its flexibility,
is the DataSet. Microsoft created the DataSet class specifically for storing relational information in
a non-database specific data structure, so the DataSet comes highly recommended for returning query
information containing multiple records and/or tables of data. Your workload shouldn't suffer too
significantly from using the DataSet because DataAdapters, which fill DataSets with information,
already exist for most database systems. Furthermore, getting data out of the DataSet is fairly easy
because it contains methods for extracting your data as tables, rows, and columns.
Also note that a DataSet is technically data-source independent, not just database independent. You
can write custom code to load XML files, CSV files, or any other data source into a DataSet object.
Additionally, you can even manipulate and move information around inside the DataSet, something
that is not possible with the database interfaces from the System.Data namespace.
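As a rough illustration of the data-source independence described above, the following sketch fills a DataSet from CSV text rather than from a database, then reads it back as tables, rows, and columns. The class name DataSetDemo, the method LoadPeopleFromCsv, and the two-column layout are assumptions made for the example; they do not come from the demo application.

```csharp
using System;
using System.Data;

public static class DataSetDemo
{
    // Hypothetical loader: builds a DataSet from "id,lastName" lines of CSV
    // text. No database or DataAdapter is involved.
    public static DataSet LoadPeopleFromCsv(string csv)
    {
        DataSet ds = new DataSet();
        DataTable people = ds.Tables.Add("Person");
        people.Columns.Add("PersonID", typeof(int));
        people.Columns.Add("LastName", typeof(string));

        string[] lines = csv.Split(new char[] { '\n' },
            StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            string trimmed = line.Trim();
            if (trimmed.Length == 0)
            {
                continue; // skip lines that contain only whitespace
            }
            string[] parts = trimmed.Split(',');
            people.Rows.Add(int.Parse(parts[0]), parts[1]);
        }
        return ds;
    }

    public static void Main()
    {
        DataSet ds = LoadPeopleFromCsv("1,Smith\n2,Jones\n");
        foreach (DataRow row in ds.Tables["Person"].Rows)
        {
            Console.WriteLine("{0}: {1}", row["PersonID"], row["LastName"]);
        }
    }
}
```

A business object that consumes this DataSet cannot tell whether the rows came from SQL Server, Oracle, or a text file, which is the point of the abstraction.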

Exchanging non-relational data


Of course, you also deal with non-relational information when you pass data back and forth between
your business objects and the DAL. For example, if you want to save a single business object to the
data-tier, you have to pass that business object's properties into the DAL. To do so, simply pass
business object properties into the DAL via native .NET type method parameters. So a string
property on your business object is passed into the DAL as a string parameter, and an int property
on your business object is passed into the DAL as an int parameter. If the DAL updates the
business object property, then you should mark the parameter with the ref modifier so the new value
can be passed back to the business object. You can also use return values to return information as the
result of a function when the need arises. Listing 1 contains examples of method signatures that you
may need in the DAL if you have a Person business object in your application:

DataSet Person_GetAll()
{
    // Returns a DataSet containing all people records in the database.
}

DataSet Person_GetByPersonID(int personID)
{
    // Queries the database for the particular user identified by
    // personID. If the user is located then the DataSet contains a
    // single record corresponding to the requested user. If the user
    // is not found then the DataSet does not contain any records.
}

bool Person_Save(ref int personID, string fname, string lname, DateTime dob)
{
    // Locates the record for the given personID. If the record exists,
    // the method updates the record. If the record does not exist, the
    // method adds the record and sets the personID variable equal to
    // the identity value assigned to the new record. Then the method
    // returns the value to the business layer because personID is
    // marked with the ref modifier.
}

int Person_DeleteInactive()
{
    // Deletes all inactive people in the database and returns a value
    // indicating how many records were deleted.
}

Listing 1 – Data access layer method signature examples
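The following sketch shows how a business object might call a method shaped like Person_Save and receive the new identity value back through the ref parameter. It is an illustration only: InMemoryPersonDataService is a hypothetical stand-in that keeps records in a Dictionary instead of a database, and the simplified Person class is not the one from the demo application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a DAL data service.
public class InMemoryPersonDataService
{
    private readonly Dictionary<int, string> _rows = new Dictionary<int, string>();
    private int _nextId = 1;

    // Same signature as Person_Save in Listing 1.
    public bool Person_Save(ref int personID, string fname, string lname, DateTime dob)
    {
        if (!_rows.ContainsKey(personID))
        {
            // New record: assign an identity and hand it back via ref.
            personID = _nextId++;
        }
        _rows[personID] = fname + " " + lname + " (" + dob.ToString("yyyy-MM-dd") + ")";
        return true;
    }
}

// Simplified business object that delegates persistence to the DAL.
public class Person
{
    public int PersonID { get; set; }   // 0 means "not saved yet"
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime DateOfBirth { get; set; }

    public bool Save(InMemoryPersonDataService dal)
    {
        // A property cannot be passed by ref, so copy it to a local first.
        int id = PersonID;
        bool saved = dal.Person_Save(ref id, FirstName, LastName, DateOfBirth);
        PersonID = id;
        return saved;
    }
}

public static class SaveDemo
{
    public static void Main()
    {
        InMemoryPersonDataService dal = new InMemoryPersonDataService();
        Person bob = new Person();
        bob.FirstName = "Bob";
        bob.LastName = "Smith";
        bob.DateOfBirth = new DateTime(1980, 1, 15);

        bob.Save(dal);
        Console.WriteLine(bob.PersonID); // identity assigned by the stub DAL
    }
}
```

Only native .NET types cross the boundary between the business object and the DAL, so the Person class never needs to know how or where the record is stored.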

Data service classes


Normally you have one data access method in your DAL for each scenario in which you need to
exchange data between a business object and the database. If, for example, you have a Person class
then you may need data access methods like Person_GetAll, Person_GetByPersonID,
Person_GetByLoginCredentials, Person_Update, Person_Delete, and so on, so you can do
everything you need to do with a Person object via the DAL. Since the total number of data access
methods in your DAL can get fairly large fairly quickly, it helps to separate those methods out into
smaller, more manageable Data Service Classes (or partial classes in .NET 2.0) inside your DAL.
Aside from being more manageable from a sheer number standpoint, breaking down the DAL into
multiple data service classes helps reduce check-out bottlenecks with your source control if you have
multiple developers needing to work on the DAL at the same time. Figure 6 depicts a DAL broken
down into three individual data service classes:

Figure 6 – Breaking down the DAL into multiple data service classes

Notice that all of the data service classes depicted in Figure 6 derive from a single base class named
DataServiceBase. The DataServiceBase class provides common data access functionality like
opening a database connection, managing a transaction, setting up stored procedure parameters,
executing commands, and so forth. In other words, the DataServiceBase class contains the general
database code and provides you with a set of helper methods for use in the individual data service
classes. The derived data service classes use the helper methods in the DataServiceBase for
specific purposes, like executing a specific command or running a specific query.

Putting theory into practice: the demo application


At this point you should have a decent understanding of what the data access layer is and how it fits
into an application from an architectural point of view. Theory is great, but at some point you have to
quit talking and start coding. Of course, going from theory to practice is no trivial step, so I wanted
to make sure you had a solid example to use as a foundation both in terms of code and
understanding.
On the Simple-Talk site there is a file containing two items: a demo application containing a DAL
implementation and a Building a Data Access Layer PDF that explains the code in detail. The
application is fairly simple: a two-page web app that allows you to view / delete a list of people on
one page and to add / edit those people on another. However, it does implement all of the design
principles that we've covered here. Enjoy!

.NET APPLICATION ARCHITECTURE: LOGGING SQL EXCEPTIONS IN THE DATA ACCESS LAYER
11 September 2006
by Damon Armstrong

Managing SQL exceptions in .NET applications


You can develop for obvious exceptions. Code reviews may catch some of the ones you missed. And
your QA team can chip away at your application to uncover a lot more. But nothing exposes errors
like deploying your application into production and letting real users give it a savage beating. As much
as we'd like to think that we can release error-less software, the fact is that it's next to impossible.
Don't get me wrong, releasing error-free software is a valiant pursuit and one that every developer
should strive towards, but the reality is that code is complex, end-users tend to find ways to do things
that defy both logic and reason, and you can't account for the nearly infinite number of scenarios that
can occur when the two mix. The question is, how can you effectively respond to those unforeseen
issues?
Obviously, you should have a logging solution in place to keep a record of production exceptions and
important supplemental information to help you pinpoint the error back in your development
environment. The exception type, exception message, inner-exception chain, stack trace, page on
which the exception occurred, query string values, form values, and server variables are invaluable
pieces of information that can help you locate an issue and identify why an error occurs in a
particular scenario. But one of the things I've always wanted, in addition to that information, is a field
containing the SQL statements executing when a data-access exception occurs. Having a record of
the offending SQL gives you the ability to check input values for validity and also to re-execute the
script in SQL Server Management Studio, both of which provide a great deal of insight when you are
tracking a bug in production. In this article, you will learn how to do just that, by building SQL
exception logging into your Data Access Layer.
Please note that this article builds on the concepts introduced in my article .NET Application
Architecture: the Data Access Layer and the code in the associated demo application. Please take a
look at this original article if you have any questions about what a Data Access Layer is, or how it is
implemented. Also note that there is a PDF file accompanying the demo application that covers the
original Data Access Layer demo code in great detail. That PDF is included in the demo application
for this article, available from the Code Download link in the box to the right of the article title, in
case you need to reference it.

The real-world scenario


Not too long ago, I was working with a client to help resolve a few maintenance issues on their
website. They had a decent logging solution in place to capture detailed exception information, but
there was one exception that was fairly elusive. It only happened once every ten or fifteen days, the
error message was fairly obscure, and the QA team could not find a way to reproduce it in the test
environment (it only happened in production). To make things even more confusing, users who
experienced the issue only seemed to experience it once. We had logs showing that they hit the page,
received the error, then hit the page again (probably on a Refresh) and everything ran just fine. We
knew it was a data-access issue because of the message, but aside from saying that there had been an
error executing a command, the exception message was basically useless. It was quite an enigma. So
we hatched a plan to capture the SQL command name and the input values for that command any
time that particular exception came up, and it helped us track down the issue. This article is an
extension of that concept.

Solution architecture: objectives and overview


If you read .NET Application Architecture: the Data Access Layer, then Figure 1 should look familiar:
it shows a high-level overview of the various tiers and layers in a web application. Our focus is the
Data Access Layer (DAL), which resides in the Business Tier and handles all communication between
the various Business Objects and the Data Tier.

Figure 1 – Three-tiered ASP.NET application with sub-tiers (layers)

The objective of the DAL is to expose information from the Data Tier to the Business Objects
without tying the Business Objects to any particular data source. It does this by using data-source-specific
calls to retrieve data from a particular data store, placing the data in a non-data-source-specific
variable or object, then passing the information to a Business Object for use in some operation.
Changing from one data source to another (say Oracle to SQL Server) is simply a matter of making a
new DAL to access the new data store and putting the data into the format the Business Objects
expect. You never need to touch the business objects when making the switch.

SQL error logging in the DAL


Our objective in this endeavor is to log any SQL statements that result in a data-access exception. So
the first question we need to answer is, where do we put that logging code? Since we're talking about
data access exceptions, and since the DAL is highlighted in Figure 1, you may have surmised, through
inductive reasoning, that the logging code resides in the DAL. And you would be correct. Which leads
to the next, more important, question: why? Take another look at Figure 1. Notice that all
communication from the Data Tier goes through the DAL. This means that the DAL is the only
place in your application where a data-access exception is even feasible. Inside the DAL there are a
number of data service classes (or partial classes, if you so choose), each of which contains data-
access methods.
Figure 2 depicts this architecture. Although there may be a large number of these classes and data
access methods, all the classes inherit their basic data-access functionality from a single base class. By
placing the SQL logging code in the data service base class, we can give SQL logging capabilities to
the entire DAL with relative ease.

Figure 2 – Breaking down the Data Access Layer (DAL) into multiple Data Service classes

Logging the error in the application


Finally, we have to consider how to expose the logged SQL to the application, so the application can
store it to an appropriate location. Logging the SQL is really a three-part process:
• Capture the SQL that was executing when the exception occurred
• Pass the captured information back to the application
• Get the application to log the information somewhere
Whenever a SQL statement fails, your application throws an exception. That exception travels back
up the call stack until it finds an appropriate catch statement that handles the exception or the
exception reaches the .NET runtime. We want to pass our SQL statement along with that exception
so your application can log them both. To do this, we'll create a custom exception wrapper with a
field to store the SQL statement. When a data-access exception occurs, we'll store the SQL statement
in the custom exception wrapper, wrap the actual data-access exception with our exception, and let
our exception, along with the SQL statement, propagate back up through the call stack. Then all you
have to do is set up your application to catch the exception and log it along with the SQL statement
that accompanies it.
A note on transactional support and configuration
One issue that came to mind while I was putting the code for this article together was the fact that
some SQL statements occur within a transaction. If that is the case, and you just capture the SQL
statement that caused the error, without providing the rest of the SQL statements that preceded it in
the transaction, then you may not be getting everything that you need to effectively track down an
issue. However, if you want to capture all of the SQL in a transaction, it means you have to log all the
"good" statements that lead up to the bad statement, and chances are that a "bad" statement won't
happen very often. So you'll be doing a lot of logging to ensure the information is available when you
need it, but most of it will simply be discarded because you won't need it that often. And that is not
good for performance.
But there may be a time when the benefit of tracking down an error outweighs the performance cost,
so we'll make full transactional logging a configurable setting in the web.config in case you ever really
need it. And while we're at it, we'll also put in a configuration setting that allows you to turn off SQL
statement logging altogether. The configuration settings will be stored in the <appSettings> section
of the web.config, as shown in Listing 1.

<?xml version="1.0"?>
<configuration>
    <appSettings>
        <add key="SqlLogging" value="true" />
        <add key="SqlFullTxnLogging" value="true" />
    </appSettings>
    ...
</configuration>

Listing 1 – Configuration settings for SQL Logging in the appSettings section of the web.config

At this point you should have a good high-level overview of the solution, so we’ll shift gears and
begin looking at the actual implementation details next.

Building the SQL statement logger


As mentioned at the start, the concepts and code in this article build off the DAL that I built and
described in .NET Application Architecture: the Data Access Layer. In the sections that follow I
discuss how to modify that original code base to add SQL exception logging support to your DAL.
We begin by creating the SqlWrapperException class, which is the only new class in the solution.
Then we will focus on adding SQL logging support to the DataServiceBase class by doing the
following:
1. Adding new private fields to the DataServiceBase class to help with SQL logging
2. Adding the SqlLogging and SqlFullTxnLogging properties to manage SQL logging
configuration values from the web.config
3. Adding the ErrorSQL property and helper methods to help manage the SQL statement log
4. Adding static methods to help manage logging during database transactions
5. Updating the constructor to support logging during database transactions
6. Adding the BuildSQL method that actually logs the SQL statement
7. Updating the ExecuteNonQuery and ExecuteDataSet methods to use the BuildSQL method
to log SQL when an exception occurs
So, let’s jump into the code.

Capturing SQL statements using the SqlWrapperException class


We'll begin by building the SqlWrapperException class, our custom exception wrapper that stores
the SQL statement which caused the exception. I put the class in the Demo.Common assembly
because it is referenced by the business objects, the DAL, and the application. Listing 2 contains the
code for the SqlWrapperException class.

using System;

namespace Demo.Common
{
    [Serializable]
    public class SqlWrapperException : Exception
    {
        /////////////////////////////////////////////////////////
        private string _sql = string.Empty;

        /////////////////////////////////////////////////////////
        public string SQL
        {
            get { return _sql; }
            set { _sql = value; }
        }

        /////////////////////////////////////////////////////////
        public SqlWrapperException(string sql, Exception inner)
            : base(inner.Message, inner)
        {
            this.SQL = sql;
        }
    }
}

Listing 2 – Demo.Common.SqlWrapperException class

The SqlWrapperException class inherits its base exception functionality from the
System.Exception class. It's marked as Serializable because Microsoft says it's a best practice
(presumably because there are a number of scenarios where exceptions need to be serialized). Other
than that, this class is pretty simple. It has a private field named _sql to store the SQL statement
value, and a public property named SQL that exposes that field. There is a single constructor that
accepts two parameters: the SQL statement that you want to log and the data-access exception
thrown while trying to execute that statement. The call to base(inner.Message, inner) sets the
Message property of the SqlWrapperException equal to the Message property on the original
exception, and assigns the InnerException property of the SqlWrapperException to the
original data-access exception. This effectively wraps the data-access exception with our custom
exception wrapper. The constructor then sets the SQL property of the exception wrapper so the SQL
statement can travel along with the exception to wherever it ultimately needs to go.
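To show the wrapper in action, here is a small sketch of the round trip: a data-access method catches a failure, wraps it together with the SQL text, and application code further up the call stack catches the wrapper and reads both. The class from Listing 2 is repeated (without its namespace) so the sketch compiles on its own. WrapperDemo, RunFailingCommand, DescribeFailure, and the SQL text are hypothetical, and the failure is simulated with an InvalidOperationException rather than a real SqlException.

```csharp
using System;

// Copy of the class from Listing 2, repeated so this sketch is self-contained.
[Serializable]
public class SqlWrapperException : Exception
{
    private string _sql = string.Empty;

    public string SQL
    {
        get { return _sql; }
        set { _sql = value; }
    }

    public SqlWrapperException(string sql, Exception inner)
        : base(inner.Message, inner)
    {
        this.SQL = sql;
    }
}

public static class WrapperDemo
{
    // Hypothetical stand-in for a DAL method whose command fails.
    public static void RunFailingCommand()
    {
        string sql = "EXEC Person_Save @personID = 42";
        try
        {
            // Simulated failure; a real DAL would execute the command here.
            throw new InvalidOperationException("Simulated data-access failure");
        }
        catch (Exception ex)
        {
            // Wrap the original exception so the SQL travels with it.
            throw new SqlWrapperException(sql, ex);
        }
    }

    // Application-level handler: reads the message, the original exception
    // type, and the SQL from the wrapper.
    public static string DescribeFailure()
    {
        try
        {
            RunFailingCommand();
            return "no error";
        }
        catch (SqlWrapperException ex)
        {
            return ex.Message + " | " + ex.InnerException.GetType().Name + " | " + ex.SQL;
        }
    }

    public static void Main()
    {
        Console.WriteLine(DescribeFailure());
    }
}
```

Because the wrapper reuses the inner exception's message, existing logging code that records only the Message property keeps working, while logging code that knows about the wrapper can also record the SQL property.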
New SQL exception logging fields in the DataServiceBase class
Since this article builds off the demo application from my previous article, I'm just going to cover
what you need to add to the DataServiceBase class to add support for SQL exception logging
instead of going back over everything that it does. Listing 3 shows the four new fields the class uses
for SQL statement logging.

////////////////////////////////////////////////////////////////////////
// Fields for SQL Error Logging
////////////////////////////////////////////////////////////////////////
private StringBuilder _errorSQL = null;       //Stores SQL Text
private static int _sqlLogging = -1;          //Toggle SQL Logging
private static int _sqlFullTxnLogging = -1;   //Toggle Full Trans. Logging
private static Dictionary<int, StringBuilder> _sqlTxnSBMapping =
    new Dictionary<int, StringBuilder>();     //Stores StringBuilders

Listing 3 – New fields in the DataServiceBase class

The first field, _errorSQL, is a StringBuilder that stores the logged SQL statement. I opted to
use a StringBuilder because the code needs to do a lot of string concatenation to build the SQL
statement, a situation in which a StringBuilder performs far better than repeatedly concatenating
regular immutable strings. After that, we have two static configuration fields, _sqlLogging and
_sqlFullTxnLogging. These fields store values that determine whether or not SQL logging is
enabled and whether or not to use full transactional logging, respectively. Although both of these
fields are integer variables, they actually represent Boolean data, but we'll discuss that in more detail
when we take a look at the properties that expose these values. Last, we have a static Dictionary
field named _sqlTxnSBMapping. A Dictionary object allows you to reference values in the
Dictionary based on a key. In this case, our key is an integer, and our value is a StringBuilder
object. What the _sqlTxnSBMapping field allows us to do is associate a StringBuilder object
(the Dictionary value) with a particular database transaction (the Dictionary key). Why isn't the
key a SqlTransaction then? Because we store the hash value (an int) of the SqlTransaction as
the key and not the actual SqlTransaction object itself.
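The mapping idea can be sketched in isolation as follows. This is not the demo application's code: TxnLogMapDemo, GetLog, and Release are hypothetical names, a plain object stands in for the SqlTransaction, and the logged command text is made up. Two caveats are worth keeping in mind with this design: hash codes are not guaranteed to be unique, so two live transactions could in principle map to the same key, and Dictionary is not safe for concurrent writes, so a multi-threaded web application would need locking around it.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TxnLogMapDemo
{
    // Maps the hash code of a transaction object to its SQL log.
    private static readonly Dictionary<int, StringBuilder> _map =
        new Dictionary<int, StringBuilder>();

    // Returns the StringBuilder registered for the given transaction,
    // creating and registering one on first use.
    public static StringBuilder GetLog(object txn)
    {
        int key = txn.GetHashCode();
        StringBuilder sb;
        if (!_map.TryGetValue(key, out sb))
        {
            sb = new StringBuilder();
            _map[key] = sb;
        }
        return sb;
    }

    // Removes the entry once the transaction is committed or rolled back.
    public static void Release(object txn)
    {
        _map.Remove(txn.GetHashCode());
    }

    public static void Main()
    {
        object txn = new object(); // stand-in for a SqlTransaction

        // Two separate callers append to the same log via the shared key.
        GetLog(txn).AppendLine("EXEC Person_Save @personID = 1");
        GetLog(txn).AppendLine("EXEC Car_Save @carID = 7");

        Console.Write(GetLog(txn).ToString());
        Release(txn);
    }
}
```

Every caller that holds the same transaction object gets the same StringBuilder back, which is what lets separate data service instances contribute to one log.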
Configuring logging with the SqlLogging and SqlFullTxnLogging properties
We want to make the SQL statement logging capabilities of this demo configurable so they can be
turned on and off. We already have the fields that we need to store the values, _sqlLogging and
_sqlFullTxnLogging, so now we have to create a way to get the settings from the web.config. I
opted to put that code directly in the properties that expose those fields. The only problem is that
we're dealing with configurable Boolean values. Configurable Boolean values have three states: un-
initialized, true (on), and false (off). But a Boolean variable only has two states: true and false. That's
why the _sqlLogging and _sqlFullTxnLogging fields are integers. We use -1 to represent un-
initialized, 0 to represent false, and 1 to represent true. Listing 4 shows how this all plays out inside
the SqlLogging property.

////////////////////////////////////////////////////////////////////////
public static bool SqlLogging
{
    get
    {
        if (_sqlLogging == -1)
        {
            bool value = Convert.ToBoolean(
                ConfigurationManager.AppSettings["SqlLogging"]);
            _sqlLogging = value ? 1 : 0;
        }
        return _sqlLogging == 1 ? true : false;
    }
    set
    {
        _sqlLogging = value ? 1 : 0;
    }
}

Listing 4 – SqlLogging property (new property in the DataServiceBase class)

Inside the Get portion of the property, the code begins by checking to see if the _sqlLogging field
is equal to -1. If so, it indicates that it is in its un-initialized state and we need to get the appropriate
configuration value from the web.config. Inside the if block, we acquire the SqlLogging value
from the appSettings section of the web.config, convert that string value to a bool value, then
store it in the value variable. What happens if you don't have a SqlLogging setting defined in the
appSettings section? Then the ConfigurationManager returns a null value for
AppSettings["SqlLogging"] and Convert.ToBoolean interprets the null value as false. So if
you don't specify the settings, then it's the same as setting them to false. Then the code determines if
_sqlLogging is equal to 1 and, if so, returns true. Otherwise it returns false. The Set portion of the
property is fairly simple. It takes the value assigned to the property and sets _sqlLogging to 1 if the
value is true, or 0 if the value is false.
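The behavior of Convert.ToBoolean with a missing setting is easy to check in isolation. The sketch below passes string values directly instead of reading them from the web.config; BoolSettingDemo and ParseSetting are hypothetical names used only for this illustration. It also shows a case the text does not cover: a value that is present but is not a recognizable Boolean string causes a FormatException rather than being treated as false.

```csharp
using System;

public static class BoolSettingDemo
{
    // Mimics converting an appSettings value, which arrives as a string,
    // or as null when the key is missing.
    public static bool ParseSetting(string raw)
    {
        return Convert.ToBoolean(raw);
    }

    public static void Main()
    {
        Console.WriteLine(ParseSetting(null));    // missing key is treated as false
        Console.WriteLine(ParseSetting("true"));  // comparison is case-insensitive
        Console.WriteLine(ParseSetting("False"));

        try
        {
            ParseSetting("yes");
        }
        catch (FormatException)
        {
            Console.WriteLine("\"yes\" is not a valid Boolean string");
        }
    }
}
```

If you would rather have a malformed value fall back to false instead of throwing, bool.TryParse is one alternative to consider.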
Listing 5 contains the code for the SqlFullTxnLogging property. It's basically the same code that
you saw for the SqlLogging property, but it checks to see if SqlLogging is enabled before running
any of the code that initializes and returns the SqlFullTxnLogging value. If SqlLogging is not
enabled, then SqlFullTxnLogging returns false because disabling SqlLogging disables all logging.
If SqlLogging is enabled, then it runs through the same logic that we discussed for the
SqlLogging property to determine if SqlFullTxnLogging is enabled.

////////////////////////////////////////////////////////////////////////
public static bool SqlFullTxnLogging
{
    get
    {
        if (SqlLogging)
        {
            if (_sqlFullTxnLogging == -1)
            {
                bool value = Convert.ToBoolean(
                    ConfigurationManager.AppSettings["SqlFullTxnLogging"]);
                _sqlFullTxnLogging = value ? 1 : 0;
            }
            return _sqlFullTxnLogging == 1 ? true : false;
        }
        else
        {
            return false;
        }
    }
    set
    {
        _sqlFullTxnLogging = value ? 1 : 0;
    }
}

Listing 5 – SqlFullTxnLogging property (new property in the DataServiceBase class)

Managing the SQL statement log


One other property and two minor methods relating to the _errorSQL StringBuilder need to be
discussed. First, we'll talk about the ErrorSQL property shown in Listing 6. This is a fairly simple
property that exposes the _errorSQL field and ensures that the ErrorSQL property always returns a
valid reference to a StringBuilder object. If _errorSQL is null, it simply creates a new
StringBuilder object and assigns it to the _errorSQL field before returning _errorSQL as the
result of the property.

///////////////////////////////////////////////////////////
public StringBuilder ErrorSQL
{
    get
    {
        if (_errorSQL == null)
        {
            _errorSQL = new StringBuilder();
        }
        return _errorSQL;
    }
    set
    {
        _errorSQL = value;
    }
}

Listing 6 – ErrorSQL property (new property in the DataServiceBase class)

Next we have the GetSqlStatementForException method. When a data-access exception occurs
in the DAL, you need to take the SQL statement (or statements) stored in the ErrorSQL
StringBuilder and place it into the SqlWrapperException. You also need to clear the
ErrorSQL StringBuilder when you do this, because the possibility exists that you could reuse the
same DataService class for another call after an exception occurred (assuming the first exception
was handled gracefully). The code in Listing 7 begins by checking to see if _errorSQL is null. If not,
it stores the value of _errorSQL in a temp variable, clears the StringBuilder, then returns the
temp value as the result of the function. If _errorSQL is null, the method simply returns an empty
string. Also note that I chose to make this a private method because it's fairly specific to how we're
using it inside the class. If you want to expose it publicly, feel free to do so.

///////////////////////////////////////////////////////////
private string GetSqlStatementForException()
{
    if (_errorSQL != null)
    {
        string value = _errorSQL.ToString();
        _errorSQL.Length = 0;
        return value;
    }
    else
    {
        return string.Empty;
    }
}

Listing 7 – GetSqlStatementForException function (new method in the DataServiceBase class)

And last, we have the ClearSqlStatementLog method shown in Listing 8. This method just gives
you an efficient way to clear any SQL statements in the _errorSQL StringBuilder. You could
accomplish the same thing by setting ErrorSQL.Length to 0 (or by calling ErrorSQL.Clear() on
.NET Framework 4 and later), but remember that ErrorSQL will create a new StringBuilder if
_errorSQL is null. ClearSqlStatementLog allows you to avoid inadvertently creating a
StringBuilder. We never really use this in the demo app, but it's there in case you ever need it.

////////////////////////////////////////////////////////////
public void ClearSqlStatementLog()
{
    if (_errorSQL != null)
    {
        _errorSQL.Length = 0;
    }
}

Listing 8 – ClearSqlStatementLog (new method in the DataServiceBase class)

Now, on to more important things.


Beginning, committing, and rolling back a transaction via the DataServiceBase class
There are two constructors for the DataServiceBase class: a parameterless constructor, and a
constructor that accepts a transaction. One of the features of this SQL logging demo is that it can
log all of the SQL statements that run in a given transaction. It is possible, and fairly likely, that you
will have different Data Service classes participating in a given transaction. Listing 9 shows one
possible scenario:

//Create some objects
Person person = new Person("Bob");
Car car = new Car("Corvette");

//Save those objects using a transaction
SqlTransaction txn = cnx.BeginTransaction();

// Passed txn to the PersonDataService
new PersonDataService(txn).SavePerson(person);

// Passed txn to the CarDataService
new CarDataService(txn).SavePersonCar(person, car);

txn.Commit();

Listing 9 – Multiple Data Service classes participating in a transaction

Remember, SQL statements are stored in the StringBuilder object associated with the ErrorSQL
property in a DataService class. So the question is, how do you share a StringBuilder between
different DataService classes? One option is to pass the StringBuilder object around all over
the place, but that would make for some pretty nasty code. Instead, we're going to store
StringBuilder objects in a static property and associate those StringBuilder objects with a
particular transaction. And this is the entire reason the _sqlTxnSBMapping Dictionary field exists.
Whenever we instantiate a DataService that uses a transaction, we can use the
_sqlTxnSBMapping Dictionary to look up the appropriate StringBuilder object for that
transaction. But that means that you need a structured way of adding the StringBuilder to the
Dictionary when you begin a transaction, and a structured way of removing the StringBuilder
when you commit or roll back the transaction.
So, there are three new static methods on the DataServiceBase class that assist you in that
endeavor: BeginTransaction, CommitTransaction, and RollbackTransaction. Listing 10
shows the code for all three of these methods.

////////////////////////////////////////////////////////////
public static IDbTransaction BeginTransaction()
{
    SqlConnection txnConnection =
        new SqlConnection(GetConnectionString());
    txnConnection.Open();

    SqlTransaction txn = txnConnection.BeginTransaction();

    if (SqlFullTxnLogging)
    {
        StringBuilder sbTemp = new StringBuilder();
        _sqlTxnSBMapping.Add(txn.GetHashCode(), sbTemp);
        sbTemp.AppendLine("BEGIN TRANSACTION");
    }
    return txn;
}

////////////////////////////////////////////////////////////
public static void RollbackTransaction(IDbTransaction txn)
{
    if (txn != null)
    {
        if (SqlFullTxnLogging)
        {
            StringBuilder sbTemp;
            if (_sqlTxnSBMapping.TryGetValue(
                txn.GetHashCode(), out sbTemp))
            {
                sbTemp.Append("ROLLBACK TRANSACTION");
                _sqlTxnSBMapping.Remove(txn.GetHashCode());
            }
        }
        txn.Rollback();
    }
}

////////////////////////////////////////////////////////////
public static void CommitTransaction(IDbTransaction txn)
{
    if (txn != null)
    {
        if (SqlFullTxnLogging)
        {
            StringBuilder sbTemp;
            if (_sqlTxnSBMapping.TryGetValue(
                txn.GetHashCode(), out sbTemp))
            {
                sbTemp.Append("COMMIT TRANSACTION");
                _sqlTxnSBMapping.Remove(txn.GetHashCode());
            }
        }
        txn.Commit();
    }
}

Listing 10 – BeginTransaction, CommitTransaction, and RollbackTransaction (new static methods in the DataServiceBase class)

When you call BeginTransaction, the method needs to create the StringBuilder object that's
going to capture the SQL for the duration of that entire transaction. It also needs to associate that
StringBuilder object to the transaction in the _sqlTxnSBMapping dictionary. In the bolded
BeginTransaction code, you can see the if statement that checks to see if full transaction logging
is enabled. If so, the method creates the new StringBuilder then adds it to the dictionary. Notice
that it uses the txn.GetHashCode() method to generate the int value used as the dictionary key.
Then it outputs BEGIN TRANSACTION to the StringBuilder so your SQL will run in a transaction
when you copy and paste it into a query window. Once the StringBuilder is in the static
_sqlTxnSBMapping dictionary, the individual DataService instances can easily share that
StringBuilder. We'll see how, when we get to the constructor changes in a moment.
Next we have the RollbackTransaction and CommitTransaction code. These are identical
methods except that one commits the transaction, and one rolls the transaction back. In the bolded
code for each method, you can see the if statement that runs when full transaction logging is
enabled. It starts by declaring a StringBuilder variable, then calls
_sqlTxnSBMapping.TryGetValue in an attempt to find the StringBuilder associated with the
transaction passed into the method. If it finds the StringBuilder in the dictionary, it loads that
object into the sbTemp variable and appends ROLLBACK TRANSACTION or COMMIT TRANSACTION
depending on which method you called. Finally, it removes the StringBuilder object from the
dictionary because you are, in theory, done with the transaction.
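One thing worth keeping in mind about Listing 10 is that GetHashCode() values are not guaranteed to be unique, and a static Dictionary is not safe for concurrent writes from multiple requests. The following is only a sketch of one possible alternative, not part of the article's code: TransactionLogRegistry and its members are hypothetical names, and it assumes the transaction type does not override Equals, so the dictionary compares keys by reference.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not in the article's DataServiceBase class).
// Keys the log on the transaction object itself rather than on its
// hash code, and guards the shared dictionary with a lock.
public static class TransactionLogRegistry
{
    private static readonly object _sync = new object();
    private static readonly Dictionary<object, StringBuilder> _logs =
        new Dictionary<object, StringBuilder>();

    // Creates the log for a transaction and seeds it with BEGIN TRANSACTION.
    public static StringBuilder Register(object txn)
    {
        if (txn == null) throw new ArgumentNullException("txn");
        StringBuilder log = new StringBuilder();
        log.AppendLine("BEGIN TRANSACTION");
        lock (_sync)
        {
            _logs[txn] = log;
        }
        return log;
    }

    // Looks up the log associated with a transaction, if any.
    public static bool TryGet(object txn, out StringBuilder log)
    {
        lock (_sync)
        {
            return _logs.TryGetValue(txn, out log);
        }
    }

    // Removes the log once the transaction has been committed or rolled back.
    public static bool Remove(object txn)
    {
        lock (_sync)
        {
            return _logs.Remove(txn);
        }
    }
}
```

In the article's design, BeginTransaction would call something like Register, the DataService constructor would call TryGet, and CommitTransaction and RollbackTransaction would call Remove.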
So, here's the deal. If you want the full transaction logging to work appropriately, you need to use
these methods to manage your transactions. Failing to call BeginTransaction means that you may
not get all of the SQL statements from the transaction in your SQL log, and failing to call
CommitTransaction or RollbackTransaction means that the _sqlTxnSBMapping will have
orphaned StringBuilder objects hogging up memory. And if you are doing transactions within
transactions, then you will need to write some custom code that stores parent-to-child relationships
between transactions because, if I account for that here, then this article is just going to get that much
longer. This should suffice for most needs. Listing 11 shows an updated example of how you use
these new methods in your code.

//Create some objects
Person person = new Person("Bob");
Car car = new Car("Corvette");

//Save those objects using a transaction
IDbTransaction txn = DataServiceBase.BeginTransaction();
try
{
    // Passed txn to the PersonDataService
    new PersonDataService(txn).SavePerson(person);

    // Passed txn to the CarDataService
    new CarDataService(txn).SavePersonCar(person, car);
    DataServiceBase.CommitTransaction(txn);
}
catch (Exception)
{
    DataServiceBase.RollbackTransaction(txn);
    throw; //Rethrow so the caller can handle or log the failure
}

Listing 11 – Multiple Data Service classes participating in a transaction with transaction logging

Now let's take a look at the updates to the constructor.


Acquiring the StringBuilder for the transaction in the DataService constructor
One of the DataService constructors accepts a transaction and associates that transaction with the
DataService. Any commands that execute through the DataService occur within the context of
that transaction. And when full transaction logging is enabled, that means we need to acquire the
StringBuilder associated with that transaction so we can log the SQL in the appropriate place.
Listing 12 shows the updates to the constructor, which allow it to acquire the appropriate
StringBuilder object.

////////////////////////////////////////////////////////////
public DataServiceBase(IDbTransaction txn)
{
    if (txn == null)
    {
        _isOwner = true;
    }
    else
    {
        _txn = (SqlTransaction)txn;
        _isOwner = false;
        if (SqlFullTxnLogging)
        {
            _sqlTxnSBMapping.TryGetValue(
                _txn.GetHashCode(), out _errorSQL);
        }
    }
}

Listing 12 – Locating the StringBuilder associated with a transaction

The code in bold shows the updates made to the transaction constructor. As you can see, the if
statement checks to see if full transaction logging is enabled and, if so, the code looks for the
StringBuilder associated with the transaction in the dictionary. If there is no StringBuilder in
the dictionary it means that you did not call BeginTransaction to start the transaction. When your
DataService goes to log a SQL statement, it will call the ErrorSQL property, which automatically
creates a StringBuilder object and assigns it to _errorSQL. So, the SQL statements the
DataService executes will still be logged, but you may not get to see a complete log of the
statements in the transaction if an exception occurs.


Logging SQL statements with the BuildSQL method
The BuildSQL method is responsible for taking a SqlCommand object and producing the SQL
statement that gets attached to the SqlWrapperException. Although there's a lot of code in this
method, the majority of it has to do with formatting the SQL statement appropriately. So it's lengthy,
but not overly complex. This particular method focuses on creating SQL statements to run stored
procedures. You can modify it to write both stored procedure statements as well as ad hoc SQL
statements if you so choose (all you have to do is switch the logic based on the cmd.CommandType
property). There are four sections of code in the BuildSQL method:
1. Validating the CMD object
2. Declaring output parameters
3. Writing the EXEC statement
4. Writing out the stored procedure parameters and values
In Listing 13 you will find all of the code for the BuildSQL method. Bolded comments indicate the
starting point for each of the four sections outlined above. We'll discuss each section of code in more
detail after the listing.

///////////////////////////////////////////////////////////////////////
protected void BuildSQL(SqlCommand cmd)
{
    //SECTION 1 – Validating the CMD Object
    if (cmd == null)
    {
        ErrorSQL.AppendLine("/* Command was null -- cannot display " +
            "SQL for a null command */");
        return;
    }

    //SECTION 2 – Declaring Output Parameters
    if (ErrorSQL.Length > 0) ErrorSQL.AppendLine();

    for (int index = 0; index < cmd.Parameters.Count; index++)
    {
        if (cmd.Parameters[index].Direction != ParameterDirection.Input)
        {
            ErrorSQL.Append("DECLARE ");
            ErrorSQL.Append(cmd.Parameters[index].ParameterName);
            ErrorSQL.Append(" ");
            ErrorSQL.Append(cmd.Parameters[index].SqlDbType.
                ToString().ToLower());

            //Check to see if the size and precision need to be included
            if (cmd.Parameters[index].Size != 0)
            {
                if (cmd.Parameters[index].Precision != 0)
                {
                    ErrorSQL.Append("(");
                    ErrorSQL.Append(cmd.Parameters[index].
                        Size.ToString());
                    ErrorSQL.Append(",");
                    ErrorSQL.Append(cmd.Parameters[index].
                        Precision.ToString());
                    ErrorSQL.Append(")");
                }
                else
                {
                    ErrorSQL.Append("(");
                    ErrorSQL.Append(cmd.Parameters[index].
                        Size.ToString());
                    ErrorSQL.Append(")");
                }
            }

            //Output the direction just for kicks
            ErrorSQL.Append("; --");
            ErrorSQL.Append(cmd.Parameters[index].Direction.ToString());
            ErrorSQL.AppendLine();

            // Set the Default Value for the Parameter
            // if it's an InputOutput
            if (cmd.Parameters[index].Direction ==
                ParameterDirection.InputOutput)
            {
                ErrorSQL.Append("SET ");
                ErrorSQL.Append(cmd.Parameters[index].ParameterName);
                ErrorSQL.Append(" = ");
                if (cmd.Parameters[index].Value == DBNull.Value
                    || cmd.Parameters[index].Value == null)
                {
                    //Append (not AppendLine) keeps the semicolon on this line
                    ErrorSQL.Append("NULL");
                }
                else
                {
                    //Double any embedded single quotes so the literal stays valid
                    ErrorSQL.Append("'");
                    ErrorSQL.Append(cmd.Parameters[index].
                        Value.ToString().Replace("'", "''"));
                    ErrorSQL.Append("'");
                }
                ErrorSQL.AppendLine(";");
            }
        }
    }

    //Section 3 – Writing the EXEC Statement
    ErrorSQL.AppendLine();

    //Output the exec statement
    ErrorSQL.Append("EXEC ");

    //See if you need to capture the return value
    for (int index = 0; index < cmd.Parameters.Count; index++)
    {
        if (cmd.Parameters[index].Direction ==
            ParameterDirection.ReturnValue)
        {
            ErrorSQL.Append(cmd.Parameters[index].ParameterName);
            ErrorSQL.Append(" = ");
            break;
        }
    }

    //Output the name of the command
    ErrorSQL.Append("[");
    ErrorSQL.Append(cmd.CommandText);
    ErrorSQL.AppendLine("] ");

    //Section 4 – Writing Out the Stored Procedure Parameters and Values
    //Track the first written parameter with a flag; index 0 may be the
    //return value parameter, which is skipped
    bool isFirstParameter = true;
    for (int index = 0; index < cmd.Parameters.Count; index++)
    {
        if (cmd.Parameters[index].Direction !=
            ParameterDirection.ReturnValue)
        {
            //Append comma separator (or space if it's the first item)
            if (isFirstParameter)
            {
                ErrorSQL.Append(" ");
                isFirstParameter = false;
            }
            else
            {
                ErrorSQL.Append(", ");
                ErrorSQL.AppendLine();
                ErrorSQL.Append("\t\t");
            }

            ErrorSQL.Append(cmd.Parameters[index].ParameterName);
            switch (cmd.Parameters[index].Direction)
            {
                case ParameterDirection.Input:
                    ErrorSQL.Append(" = ");
                    if (cmd.Parameters[index].Value == DBNull.Value
                        || cmd.Parameters[index].Value == null)
                    {
                        ErrorSQL.Append("NULL");
                    }
                    else
                    {
                        //Double any embedded single quotes
                        ErrorSQL.Append("'");
                        ErrorSQL.Append(cmd.Parameters[index].
                            Value.ToString().Replace("'", "''"));
                        ErrorSQL.Append("'");
                    }
                    break;

                case ParameterDirection.InputOutput:
                case ParameterDirection.Output:
                    //Bind the parameter to the variable declared in Section 2
                    ErrorSQL.Append(" = ");
                    ErrorSQL.Append(cmd.Parameters[index].
                        ParameterName);
                    ErrorSQL.Append(" OUTPUT");
                    break;
            }
        }
    }

    ErrorSQL.AppendLine(";");
    ErrorSQL.AppendLine("GO");
}

Listing 13 – BuildSQL method (new method in the DataServiceBase class)

Section 1 deals with validating the SqlCommand object. All this code does is check to see if cmd is
null. If it is null, then the method writes a SQL comment to the ErrorSQL StringBuilder
indicating that it could not write out a SQL statement for the command. Then it calls return to exit
the method because the method can't do much with a null command.
Section 2 declares output parameter variables to help in debugging the stored procedure. Many
stored procedures return information back from the stored procedure in the form of a return value
or an output parameter. If you want to check the values of these outputs, you need a way to reference
them after the stored procedure executes, so you need to declare SQL variables to store those
outputs. To help out in this endeavor, the BuildSQL method automatically creates variable
declarations for all return value and output parameters in your stored procedure. Listing 14 shows an
example SQL statement that includes variables to store the return value (@RV) and an output
parameter (@PersonID).

DECLARE @RV int; -- Return Value
DECLARE @PersonID int; -- InputOutput Parameter
SET @PersonID = 5;

EXEC @RV = [dbo].[Person_Save]
    @PersonID = @PersonID OUTPUT,
    @NameFirst = 'Dave',
    @NameLast = 'Smith',
    @DOB = '3/22/1975'

SELECT @RV, @PersonID; -- Checking the Outputs

Listing 14 – Example of checking output parameters

To write the output parameters, Section 2 begins by checking to see if the ErrorSQL
StringBuilder contains text. If you have enabled SqlFullTxnLogging then it is possible for
ErrorSQL to contain previous SQL statements from the transaction, and you do not want multiple
statements to run together because this would be hard to read. So the code adds a line break between
the statements to keep them apart. Next, the
code iterates through all of the parameters in the command and runs an if statement that checks to
see if the Direction property is anything other than ParameterDirection.Input. If so, it
means that the parameter handles some form of output from the stored procedure and that we need
to write out a variable to store that output. The first four lines inside the if block output the
DECLARE keyword, parameter name, a space, and the variable type. Then the code checks to see if the
parameter has a size specified. If so, the method checks to see if the parameter also has a precision.
If both the size and precision are specified, then the method outputs them both in the form
(<size>,<precision>). If only the size is specified, then the method outputs (<size>). Finally, the
method appends the ending semicolon, writes a comment indicating the parameter direction, and
appends a line break.
Next, the code determines whether or not the parameter's Direction is set to
ParameterDirection.InputOutput. If it is, it means that the stored procedure is expecting a
value to come in via the parameter and that the stored procedure can pass a value out via the
parameter as well. For simple input parameters, we can declare the parameter values in the EXEC
statement (see the @NameFirst parameter in Listing 14). But SQL does not allow you to pass in a
value and specify the OUTPUT keyword. So you have to set the parameter value before you call the
EXEC statement (see 3rd line of Listing 14). So, if the parameter is an InputOutput parameter, the
code outputs the SET statement, the parameter name, and an equals sign. It then checks to see if the
parameter value is set to DBNull.Value. If so, it outputs SQL that sets the parameter variable to
null. If not, it outputs SQL that sets the parameter variable to the appropriate value. And finally, it
appends a semicolon.
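The value-formatting step described above (NULL for missing values, a quoted literal otherwise) can be sketched as a small standalone helper. This is an illustrative assumption rather than code from the article: SqlLogFormatter and ToSqlLiteral are hypothetical names, and the sketch also doubles embedded single quotes so that a value such as O'Brien does not break the logged statement.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the article's DataServiceBase class).
public static class SqlLogFormatter
{
    // Returns the text to place after "=" in a logged SET or EXEC statement.
    public static string ToSqlLiteral(object value)
    {
        // Both a missing value and DBNull.Value are logged as NULL
        if (value == null || value == DBNull.Value)
        {
            return "NULL";
        }

        // Format with the invariant culture so the log does not depend
        // on the server's regional settings
        string text = Convert.ToString(value, CultureInfo.InvariantCulture)
            ?? string.Empty;

        // Double embedded single quotes, then wrap the value in quotes
        return "'" + text.Replace("'", "''") + "'";
    }
}
```

Like the article's listing, this quotes every value, numbers included, and relies on SQL Server converting the literal to the parameter's type when the logged statement is run.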
Section 3 is responsible for writing out the EXEC statement, and begins by writing out the EXEC
keyword. Then it iterates through all of the parameters, checking to determine which parameter, if
any, handles the return value. If it finds a parameter whose Direction is
ParameterDirection.ReturnValue, it outputs the parameter name and an equals sign. This sets
up the parameter to receive the return value of the procedure (refer to Listing 14). Then the code
outputs the name of the stored procedure.
Section 4 writes out the stored procedure parameters and their values. The code in this section uses a
for loop to iterate through all of the parameters in the command. The if statement directly inside
the for loop checks the Direction property of the parameter to make sure we don't add the return
value parameter to the list of stored procedure parameters, because we've already accounted for the
return value in Section 3. Once the code has determined it's dealing with a stored procedure
parameter, it checks whether this is the first parameter in the list. If it is the
first parameter, the code appends a space to separate the stored procedure name from the parameter
list. If it is not the first parameter, the code appends a comma, a line break, and two tabs to
separate the parameters. Then it outputs the parameter name. After that, the code checks the
Direction property of the parameter. If the parameter is an Input parameter, then the code writes
an equals sign and the parameter value to ErrorSQL using the same logic discussed in Section 2. If
the parameter is an Output or an InputOutput parameter, the code binds it to the variable declared
for it in Section 2 and writes the OUTPUT keyword to indicate that the stored procedure passes a
value back out using the parameter. And at the very end, the code writes
out a semicolon, a line break, and the GO keyword, to finish off the SQL statement.
Whew! That was the final and biggest addition to the DataServiceBase class. Now we just need to
see exactly how to use all of these new properties and methods in the existing code.


Catching exceptions and logging the SQL statements
When you execute a SQL statement and it fails, your code throws an exception. All you have to do is
catch the exception, pass the failed command to the BuildSQL method to log the SQL, create a
SqlWrapperException to wrap the actual exception, assign the contents of ErrorSQL to the
SqlWrapperException, then throw the wrapped exception. This allows the exception to propagate
back up to the caller along with the SQL that caused the exception to occur. There are only two
places in the DataServiceBase class where we execute SQL commands: the ExecuteDataSet
method, and the ExecuteNonQuery method. Listing 15 contains the updated code for the
ExecuteNonQuery method.

//////////////////////////////////////////////////////////////////////
protected void ExecuteNonQuery(out SqlCommand cmd, string procName,
    params IDataParameter[] procParams)
{
    //Method variables
    SqlConnection cnx = null;
    bool succeeded = false; //Set once the command completes without error
    cmd = null; //Avoids "Use of unassigned variable" compiler error

    try
    {
        //Setup command object
        cmd = new SqlCommand(procName);
        cmd.CommandType = CommandType.StoredProcedure;
        for (int index = 0; index < procParams.Length; index++)
        {
            cmd.Parameters.Add(procParams[index]);
        }

        //Determine the transaction owner and process accordingly
        if (_isOwner)
        {
            cnx = new SqlConnection(GetConnectionString());
            cmd.Connection = cnx;
            cnx.Open();
        }
        else
        {
            cmd.Connection = _txn.Connection;
            cmd.Transaction = _txn;
        }

        //Execute the command
        cmd.ExecuteNonQuery();
        succeeded = true;
    }
    catch (Exception ex)
    {
        if (SqlLogging)
        {
            BuildSQL(cmd);
            ErrorSQL.AppendLine(
                "-- A SQL ERROR OCCURRED RUNNING THE LAST COMMAND");
            if (_txn != null && SqlFullTxnLogging)
            {
                ErrorSQL.AppendLine();
                ErrorSQL.AppendLine("ROLLBACK TRANSACTION");
            }
            throw new SqlWrapperException(
                GetSqlStatementForException(), ex);
        }
        else
        {
            throw;
        }
    }
    finally
    {
        //cnx is still null if the failure happened before it was created
        if (_isOwner && cnx != null)
        {
            cnx.Dispose(); //Implicitly calls cnx.Close()
        }
        //Log successful commands only; a failed command was already
        //logged in the catch block
        if (succeeded && SqlFullTxnLogging) BuildSQL(cmd);
        if (cmd != null) cmd.Dispose();
    }
}

Listing 15 – Updated ExecuteNonQuery method

Inside the catch statement, the code begins by checking to see if SQL logging is enabled. If not, the
code simply re-throws the original exception, without worrying about logging anything. If SQL
logging is enabled, the code passes the cmd variable to the BuildSQL method. The cmd variable
contains the SqlCommand that was executing when the exception occurred, and contains all of the
information that needs to be logged. When the BuildSQL method finishes, the StringBuilder in
the _errorSQL field contains the SQL log information. It then appends a SQL comment to the log
indicating that the last command in the log caused the exception. After that, the code checks to see if
there is an active transaction in the _txn variable and if full transaction logging is enabled. If that is
the case, then the code appends a ROLLBACK TRANSACTION statement to the SQL log to avoid
accidentally committing the SQL during debugging. Then, regardless of whether or not full
transaction logging is enabled, it creates and throws a new SqlWrapperException. Notice that it
uses GetSqlStatementForException to pass the SQL statement and to clear the SQL log. You
will find similar code in the catch block of the ExecuteDataSet method.
And the last bit of code we need to discuss in the ExecuteNonQuery method is the line in the
finally block that calls BuildSQL. When full transaction logging is enabled, you have to log the
good statements as well as the bad ones, and this line is what records a command that completed
successfully. Bear in mind that a finally block also runs after an exception, so this call should
only log the command when execution actually succeeded; a failing command has already been
logged in the catch block.

The code in action: demo application


I updated the Person class, PersonDataService, and demo database to have a few more methods
and stored procedures that help demo the SQL exception logging capabilities that we've been
discussing: RandomProcA, RandomProcB, RandomProcC, and RandomProcThrowError. The first
three methods work fine. The stored procedure for the fourth one attempts to divide by zero any
time it runs, so it will cause your code to throw an exception. Here's the code from the
PersonCreateError.aspx page in the Website project:

protected void Page_Load(object sender, EventArgs e)
{
    try
    {
        Person myPerson = new Person();
        myPerson.RandomProcThrowError(2, 3, 5, 7, 11, 13);
    }
    catch (SqlWrapperException ex)
    {
        lblSqlErrorInfo.Text = "<b>An error occurred</b>: " +
            ex.Message +
            "<br/><br/><hr/>" +
            Server.HtmlEncode(ex.SQL).Replace(
                "\r\n", "<br/>") +
            "<hr/>";
    }
    catch (Exception ex)
    {
        lblSqlErrorInfo.Text =
            "An error occurred but it was not a SQL Wrapper Exception: " +
            ex.Message;
    }
}

Listing 16 – PersonCreateError.aspx Page_Load code

Notice that the first catch statement explicitly looks for the SqlWrapperException. If the
RandomProcThrowError method throws a SqlWrapperException (I say "if" because you can
turn off SQL logging altogether), the code in the first catch block executes and displays the
statements stored in the SQL property of the SqlWrapperException. If the
RandomProcThrowError does not throw a SqlWrapperException, then the second catch block
catches the exception. This demonstrates a key point. Somewhere in your exception logging code you
will need to check and see if the exception is a SqlWrapperException, and then process the SQL
property accordingly, to ensure it gets stored for future reference. Most exception logging tools allow
you to store extended exception properties, so it's just a matter of using the exception logging tool
appropriately. You can also check out the PersonCreateErrorInTXN.aspx page to see the output
from an exception that occurs within a transaction. The demo application has SQL logging and full
transaction logging enabled by default. Feel free to change the settings and rerun the pages to see the
various outputs.
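As a rough illustration of that check in exception logging code, the sketch below walks the InnerException chain looking for the wrapper and returns its SQL text. The SqlWrapperException class here is only a minimal stand-in so the example compiles on its own (the real class is defined earlier in the article), and ExceptionLogHelper and FindLoggedSql are hypothetical names.

```csharp
using System;

// Minimal stand-in for the article's SqlWrapperException, included only
// so this sketch is self-contained.
public class SqlWrapperException : Exception
{
    private readonly string _sql;

    public SqlWrapperException(string sql, Exception inner)
        : base(inner != null ? inner.Message : "A SQL error occurred", inner)
    {
        _sql = sql;
    }

    public string SQL
    {
        get { return _sql; }
    }
}

// Hypothetical helper for an exception logging routine.
public static class ExceptionLogHelper
{
    // Returns the logged SQL from the first SqlWrapperException found in
    // the exception chain, or null if the chain does not contain one.
    public static string FindLoggedSql(Exception ex)
    {
        for (Exception current = ex; current != null;
            current = current.InnerException)
        {
            SqlWrapperException wrapper = current as SqlWrapperException;
            if (wrapper != null)
            {
                return wrapper.SQL;
            }
        }
        return null;
    }
}
```

A logging routine could call FindLoggedSql on any exception it receives and, when the result is not null, store it as an extended property alongside the message and stack trace.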

Conclusion
You've seen the basics for logging exception-causing stored procedures, and you've seen how to pass
a log of SQL statements back to your application using an exception wrapper. So you've got one
more tool in your arsenal for tracking down nasty bugs. Remember, if you are worried about
performance, then shy away from the full transaction logging because it has to build out SQL
statements for every command that executes. SQL logging alone should not affect performance too
badly, since it only runs when an exception occurs. If you want, you can always expand on the
solution and add support for ad hoc SQL statements or transaction-inside-transaction support. You
could even go as far as to add configuration options to turn on full transaction logging for a specific
DataService, instead of all of them, to help out in the performance area. At any rate, good luck with
it!

ADO.NET 2.0 FACTORY CLASSES


25 October 2005
by Amirthalingam Prasanna

Achieve database independence by developing a pluggable data layer

This article explains how to use .NET 2.0’s data provider factory classes to develop a pluggable
data layer that is independent of database type and ADO.NET data provider.

Introduction
If you want to develop a data layer that supports many types of database products, it should not be
tightly coupled with a particular database product or ADO.NET data provider. The fact that
ADO.NET has data providers that are enhanced for specific databases makes that independence
more difficult and cumbersome to achieve.
You should have a good understanding of the .NET framework and familiarity with the ADO.NET
library before using .NET 2.0’s data provider factory classes to create a pluggable data layer.

Supporting many database products


If you plan to market your application to many potential clients, it should support more than one
database product. Since some clients may have already invested in a particular database, the ability to
easily configure your application to work with different products is a strong feature.
When developing a data-centric application, I generally use a particular ADO.NET data provider and
develop the data layer targeting a particular database product. One benefit of isolating the data layer is
that it makes it easy to change the database product without affecting the application too much.
If the business and user interface layers in your application use the data layer for database-related
operations and do not directly access the database, then you can have multiple data layers for the
database products you want to support. Although this approach sounds reasonable, maintaining
multiple data layers for every database product you intend to support is not feasible. The classes in
System.Data.Common namespace enable you to build a data layer independent of the database
product, and easily change the database product on which it works.

System.Data.Common namespace
Take the following ADO.NET code that connects to a SQL Server database and executes an arbitrary
SQL statement:
C# Code

System.Data.SqlClient.SqlConnection con =
    new System.Data.SqlClient.SqlConnection();
con.ConnectionString =
    "Data Source=.;initial catalog=Northwind;Integrated security=true";
System.Data.SqlClient.SqlCommand cmd =
    new System.Data.SqlClient.SqlCommand();
cmd.CommandText = "Update Products set UnitsInStock=UnitsInStock+10";
cmd.Connection = con;
con.Open();
cmd.ExecuteNonQuery();
con.Close();

There are quite a few problems in the code above that would make it difficult to modify to work with
a different database product. One obvious change that is needed is to move the hard-coded
connection string information out to a configuration file. Another problem is that we are tying our
code to a particular ADO.NET data provider, in this case the SQL Client data provider. This
increases the changes that are needed if we want to support another database product.
Now let us see how the code is changed after we move the connection string out to the application
configuration file and use the classes in the System.Data.Common namespace instead of the SQL
Client ADO.NET data provider:
C# Code

System.Configuration.AppSettingsReader appReader =
    new System.Configuration.AppSettingsReader();
string provider = appReader.GetValue("provider", typeof(string)).ToString();
string connectionString = appReader.GetValue("connectionString",
    typeof(string)).ToString();
System.Data.Common.DbProviderFactory factory =
    System.Data.Common.DbProviderFactories.GetFactory(provider);
System.Data.Common.DbConnection con = factory.CreateConnection();
con.ConnectionString = connectionString;
System.Data.Common.DbCommand cmd = factory.CreateCommand();
cmd.CommandText = "Update Products set UnitsInStock=UnitsInStock+10";
cmd.Connection = con;
con.Open();
cmd.ExecuteNonQuery();
con.Close();

App.config file

<configuration>
  <appSettings>
    <add key="provider" value="System.Data.SqlClient"/>
    <add key="connectionString"
         value="Data Source=.;initial catalog=Northwind;Integrated security=true"/>
  </appSettings>
</configuration>

In the code above, other than isolating the connection string, we have used the common ADO.NET
data provider in the System.Data.Common namespace. This is a simple implementation of the
abstract factory pattern. Each ADO.NET data provider has a factory class that enables us to create
ADO.NET objects of its provider type.
The SQL Client ADO.NET data provider, for example, has a SqlClientFactory that can be used to
create SqlConnection, SqlCommand, and other SQL Client ADO.NET data provider-specific objects.
Based on the string value that is passed to the GetFactory method of the DbProviderFactories class,
a concrete instance of a particular ADO.NET data provider factory will be created. Instead of
creating the connection and the command objects directly, we use this factory instance to create the
necessary ADO.NET objects for us.
The code above shows that we are passing the string value System.Data.SqlClient from the
application configuration file, indicating that we want a SqlClientFactory object to be created and
assigned to the factory variable. From that point on, all the create methods of the DbProviderFactory
object will create ADO.NET objects of the SQL Client ADO.NET data provider.
The classes in ADO.NET have been altered from .NET 1.1 to inherit common base classes from the
System.Data.Common namespace. ADO.NET connection classes such as SqlConnection and
OleDbConnection inherit from the DbConnection class, for example. The following diagram shows
the inheritance hierarchy of the factory classes and the ADO.NET classes:

Provided we have used standard SQL statements, we can easily make our product work with a
different ADO.NET data provider by changing the provider in the application configuration file. If
we set it to System.Data.OleDb, an OleDbFactory class will be created, which will create OleDb data
provider-specific ADO.NET objects such as OleDbConnection and so on.
You might also want to list all of the available ADO.NET data providers. You can do so using the
GetFactoryClasses method of the DbProviderFactories class:
C# Code

DataTable tbl =
    System.Data.Common.DbProviderFactories.GetFactoryClasses();
dataGridView1.DataSource = tbl;
foreach (DataRow row in tbl.Rows)
{
    Console.WriteLine(row["InvariantName"].ToString());
}

The GetFactoryClasses method returns a data table containing information about the available
ADO.NET data providers. The InvariantName column provides the necessary string value needed to
pass to the GetFactory method in order to create a factory for a particular ADO.NET data provider.
One disadvantage of using the factory classes and developing a common data layer is that it limits us
to standard SQL statements. This means we cannot take advantage of the full functionality of a
particular database product.
One way to overcome this is to make a check on the type of ADO.NET object created by a factory
and execute some statements based on it. Though it’s not an elegant approach, it is useful when we
need to execute database product-specific SQL statements. For example:
C# Code

DbProviderFactory factory =
    DbProviderFactories.GetFactory("System.Data.SqlClient");
DbCommand cmd = factory.CreateCommand();
if (cmd is System.Data.SqlClient.SqlCommand)
{
    // set command text to SQL Server specific statement
}
else if (cmd is System.Data.OleDb.OleDbCommand)
{
    // set command text to OleDb specific statement
}

Conclusion
The ADO.NET data providers in .NET 2.0 provide factory and common ADO.NET classes that
make it easy to keep your code independent from a particular ADO.NET data provider or database
product.

IMPLEMENTING REAL-WORLD DATA INPUT VALIDATION USING REGULAR EXPRESSIONS
14 May 2007
by Francis Norton
This article explains how to use .NET regular expressions to enforce the kind of logically complex
input validation requirements that we sometimes confront in real specifications. This will allow us to
start with basics and go on to exploit some fairly advanced features.
Because regular expressions are powerful and complex enough to be the subject of entire books, I'm
going to stick strictly to their use in validation. I will entirely ignore otherwise interesting and valid
topics like performance, comparison with non-.NET implementations, token extraction and
replacement, in order to take you somewhere new on this topic while keeping some clarity and focus.
I will test the regexes using the PowerShell command line, which you can download for free at
http://www.microsoft.com/technet/scriptcenter/topics/msh/download.mspx. Microsoft's
architectural plan is that you can access the same .NET regex library whatever you're writing, from
ASP.NET (dead easy) to SQL Server 2005 (slightly harder; I include a reference at the end
of the article that gives further details on this), so the regular expression skills you learn in one
context are directly transferable to another.

Some real validation requirements


These all come from real specs; I've simply selected some examples and arranged them in order of
increasing logical complexity.
1. Num: Numbers only. Can be negative or positive, for example 1234 or -1234.
2. Dec: May be fixed length. A numeric amount (positive or negative) including a maximum of 2
decimal places unless stated otherwise, for example 12345.5, 12345, 12345.50 or -12345.50 are all
valid Dec 7 inputs.
3. UK Bank Sort Code: Six digits, either xx-xx-xx or xxxxxx input format allowed.
4. House: Alphanumeric. Must not include the strings 'PO Box', 'P.O. Box', 'P.O.Box', 'P.O Box' or
'POBox' (any case).

Basics: Implementing NUM using "^"…"$", "["…"]", "?" and "+"


This section will illustrate some core regex concepts and syntax, so if you're familiar with the use of
the above symbols in patterns, feel free to skip forwards.
Let's take another look at the Num requirement:
Num: Numbers only. Can be negative or positive, for example 1234 or -1234.
I take this to mean that we'll accept anything consisting of an optional minus sign followed by one or
more digits.
We can specify the "one or more digits" part by using square brackets and a dash for character ranges,
and the plus sign ("+") for repetition. Let's start with character ranges, in this case the range of
characters from "0" to "9":

PS C:\Notes> [Regex]::IsMatch("1", "[0-9]")
True
PS C:\Notes> [Regex]::IsMatch("i", "[0-9]")
False

NOTE:
If you're new to PowerShell, you can read "[Regex]::IsMatch" as "use the static method
'IsMatch' of the .NET class 'Regex'". In fact we could use PowerShell's "-cmatch"
operator, which gives the same True/False answer as a [Regex]::IsMatch() expression, but I
like the clarity of using the .NET class directly.
The square bracket expression is a character class. In effect, it gives us a concise way of doing a
character-level OR expression, so "[0-9]" can be understood as "does the input character equal 0, 1,
2…or 9?" The dash ("-") acts as a range operator in this context so "[0-9]" is exactly equivalent to
"[0123456789]".
At the moment we're simply testing whether the test string contains a match for the regex, which
would be fine for searches, but when we're doing validation we want to ensure that the test string
doesn't also contain non-matching text. For example:

PS C:\Notes> [Regex]::IsMatch("1", "[0-9]")
True
PS C:\Notes> [Regex]::IsMatch("ninety 9 point nine", "[0-9]")
True

We can stop that behaviour using the special characters "^" and "$" to specify that the regex pattern
must match from the start to the end of the test string:

PS C:\Notes> [Regex]::IsMatch("1", "^[0-9]$")
True
PS C:\Notes> [Regex]::IsMatch("ninety 9 point nine", "^[0-9]$")
False

Now we'll make the regex accept one or more digits by using the "+" modifier on the "[0-9]"
character class. The "+" means, in general, "give me one or more matches for whatever I've been
attached to", so in this case means "give me one or more digits".

PS C:\Notes> [Regex]::IsMatch("123", "^[0-9]$")
False
PS C:\Notes> [Regex]::IsMatch("123", "^[0-9]+$")
True

That just leaves the optional minus sign. The good news and the bad news is that outside a character
class (like "[0-9]") the dash is just a literal character (good news because it means we won't have to
escape it; bad news because treating the same character as a literal in some parts of a pattern and a
special character in others is a triumph of terseness over readability). We'll make it optional with the
"?" modifier, which can be read as "give me zero or one matches".

PS C:\Notes> [Regex]::IsMatch("-123", "^[0-9]+$")
False
PS C:\Notes> [Regex]::IsMatch("-123", "^-?[0-9]+$")
True
PS C:\Notes> [Regex]::IsMatch("123", "^-?[0-9]+$")
True
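Since the same regex library is available from any .NET language, the finished Num pattern can be dropped into C# unchanged. The following is a minimal sketch; the class name NumValidator and the method IsNum are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumValidator
{
    // Optional minus sign, then one or more digits 0-9, anchored at both ends.
    private static readonly Regex NumPattern = new Regex(@"^-?[0-9]+$");

    public static bool IsNum(string input)
    {
        return input != null && NumPattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsNum("1234"));    // True
        Console.WriteLine(IsNum("-1234"));   // True
        Console.WriteLine(IsNum("12-34"));   // False: dash is only allowed in front
        Console.WriteLine(IsNum(""));        // False: "+" needs at least one digit
        Console.WriteLine(IsNum("1234\n"));  // True: "$" also matches before a final newline
    }
}
```

The last line shows a detail worth knowing for validation: in .NET, "$" matches either at the very end of the input or just before a trailing newline. If that matters for your input source, the "\z" anchor matches only at the very end.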

Using "{" … "}", "(" … ")", "\", and "d" to implement Repetition
These "?" and "+" modifiers are very nice and convenient, but suppose we have a counting system
that can express more than None, One, and Many?
Let's take another look at the DECIMAL format requirement:

Dec: May be fixed length. A numeric amount (positive or negative) including a maximum of 2
decimal places unless stated otherwise, for example 12345.5, 12345, 12345.50 or -12345.50
are all valid Dec 7 inputs.
Ignoring the fixed length option for now, let's look at the decimal section. It seems that we're
expected to accept numbers with a decimal point followed by one or two decimals, or with no
decimal point and no decimals at all.
Our first challenge is the decimal point. We want to use the "." sign, but this gives us some strange
behaviour:

PS C:\Notes> [Regex]::IsMatch(".", "^.$")
True
PS C:\Notes> [Regex]::IsMatch(",", "^.$")
True

We've discovered that "." is a special character in regular expressions – in fact it matches any
character. We need to escape it with the "\" prefix to make it a literal:

PS C:\Notes> [Regex]::IsMatch(".", "^\.$")
True
PS C:\Notes> [Regex]::IsMatch(",", "^\.$")
False

The next step is to use the braces modifier to specify that we want one to two digits following the
decimal point – we can put the minimum and maximum number of matches (in our case 1 and 2,
which we'll test with zero to three) inside the "{" and "}" curly brackets:

PS C:\Notes> [Regex]::IsMatch(".", "^\.[0-9]{1,2}$")
False
PS C:\Notes> [Regex]::IsMatch(".0", "^\.[0-9]{1,2}$")
True
PS C:\Notes> [Regex]::IsMatch(".01", "^\.[0-9]{1,2}$")
True
PS C:\Notes> [Regex]::IsMatch(".012", "^\.[0-9]{1,2}$")
False

Now we can add the entire decimal suffix pattern, "\.[0-9]{1,2}", to our existing number pattern, and
test it:

PS C:\Notes> [Regex]::IsMatch("123.45", "^-?[0-9]+\.[0-9]{1,2}$")
True
PS C:\Notes> [Regex]::IsMatch("123", "^-?[0-9]+\.[0-9]{1,2}$")
False

Aha, we should still be accepting numbers with no decimal places, but we're not. We know how to
make a single character optional using the "?" modifier, but how can we do this to larger sub-
patterns? The pleasantly obvious answer is to use parentheses to wrap the decimal suffix sub-pattern
in "(" and ")", and then apply the "?".

PS C:\Notes> [Regex]::IsMatch("123.45", "^-?[0-9]+(\.[0-9]{1,2})?$")
True
PS C:\Notes> [Regex]::IsMatch("123", "^-?[0-9]+(\.[0-9]{1,2})?$")
True

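And before we leave this pattern, one more trick to make regular expressions more readable: we can
replace "[0-9]" with "\d" (escape + d), which is pre-defined to mean "any digit". Be aware that this is
case-sensitive and "\D" means the opposite! Be aware, too, that in .NET "\d" matches any Unicode
decimal digit, not just "0" to "9", unless the ECMAScript regex option is set, so keep "[0-9]" where
only ASCII digits are acceptable.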

PS C:\Notes> [Regex]::IsMatch("123.45", "^-?\d+(\.\d{1,2})?$")
True
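For readers working in C#, the decimal pattern might be wrapped up as follows. This is a sketch under stated assumptions: DecValidator, IsDec and IsAsciiDec are hypothetical names, and the second method is included only to show how the .NET meaning of "\d" can be narrowed to the digits 0 to 9 with RegexOptions.ECMAScript.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DecValidator
{
    // Optional minus, digits, then an optional group of "." plus one or two digits.
    private const string DecPattern = @"^-?\d+(\.\d{1,2})?$";

    public static bool IsDec(string input)
    {
        return Regex.IsMatch(input, DecPattern);
    }

    // Same pattern, with \d restricted to 0-9 by the ECMAScript option.
    public static bool IsAsciiDec(string input)
    {
        return Regex.IsMatch(input, DecPattern, RegexOptions.ECMAScript);
    }

    public static void Main()
    {
        Console.WriteLine(IsDec("12345.50"));   // True
        Console.WriteLine(IsDec("12345."));     // False: the point needs digits after it
        Console.WriteLine(IsDec("12345.505"));  // False: three decimal places

        // The digits 1, 2, 3 written in Arabic-Indic script.
        string arabicIndic = "\u0661\u0662\u0663";
        Console.WriteLine(IsDec(arabicIndic));       // True: \d accepts any Unicode digit
        Console.WriteLine(IsAsciiDec(arabicIndic));  // False
    }
}
```

Whether non-ASCII digits should be accepted is a question for the spec; the point of the sketch is only that the choice between "\d" and "[0-9]" is not purely cosmetic in .NET.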

Using "|" to implement a logical OR


We know how to use character classes, i.e. the "[" … "]" expressions, to accept alternative single
characters, but the requirement for UK Bank Sort Codes requires us to accept input strings that fall
into one of two different patterns.
Let's take another look at the requirement:
UK Bank Sort Code: Six digits, either xx-xx-xx or xxxxxx input format allowed.
Accepting either one of these on its own is straightforward (remembering that "-" is just a literal
character outside character classes):

PS C:\Notes> [Regex]::IsMatch("123456", "^\d\d\d\d\d\d$")
True
PS C:\Notes> [Regex]::IsMatch("12-34-56", "^\d\d-\d\d-\d\d$")
True

We can match one pattern or the other using the "|" (or) operator. We're going to have to use
parentheses too, as we'll discover when we start testing.

PS C:\Notes> [Regex]::IsMatch("123456",
"^\d\d\d\d\d\d|\d\d-\d\d-\d\d$")
True
PS C:\Notes> [Regex]::IsMatch("123456 la la la",
"^\d\d\d\d\d\d|\d\d-\d\d-\d\d$")
True

What happened when we matched that second value? The "$" sign at the end of the pattern was
intended to reject input with text following the sort code itself, but the "|" meant that it was only
applied to the right-hand sub-pattern, just as the "^" was only applied to the left-hand one. (Try
working out how to get a sort code with leading junk accepted by the pattern above.)
We can fix this by using parentheses again:

PS C:\Notes> [Regex]::IsMatch("123456",
"^(\d\d\d\d\d\d|\d\d-\d\d-\d\d)$")
True
PS C:\Notes> [Regex]::IsMatch("123456 la la la",
"^(\d\d\d\d\d\d|\d\d-\d\d-\d\d)$")
False
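The same check can be sketched in C#, this time using the "{n}" exact-count form of the braces modifier to shorten the runs of "\d". SortCodeValidator and IsSortCode are hypothetical names; the unparenthesised pattern is kept alongside purely to show the precedence problem.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SortCodeValidator
{
    // Without parentheses, "^" binds only to the left alternative
    // and "$" only to the right one.
    public const string Unanchored = @"^\d{6}|\d{2}-\d{2}-\d{2}$";

    // Parentheses make both anchors apply to whichever alternative matches.
    public const string Anchored = @"^(\d{6}|\d{2}-\d{2}-\d{2})$";

    public static bool IsSortCode(string input)
    {
        return Regex.IsMatch(input, Anchored);
    }

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("123456 la la la", Unanchored));   // True
        Console.WriteLine(Regex.IsMatch("la la la 12-34-56", Unanchored)); // True

        Console.WriteLine(IsSortCode("123456"));             // True
        Console.WriteLine(IsSortCode("12-34-56"));           // True
        Console.WriteLine(IsSortCode("1234-56"));            // False: mixed format
        Console.WriteLine(IsSortCode("la la la 12-34-56"));  // False
    }
}
```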

Using "(?=" … ")" to implement a logical AND


You may have noticed that we have some unfinished business with the Decimal requirement,
specifically that sentence "May be fixed length". It's clear from the examples that the fixed length
refers to the number of digits, not the number of characters (which could include minus signs and
decimal points).
We could adapt our existing decimal pattern, with its optional minus sign and decimal point, to
restrict input to just seven digits, but this is inadvisable. It would be better to keep our existing
pattern, which is relatively simple and well-tested, and apply a second regular expression to count the
number of digits, each optionally preceded by a non-digit character.
Remembering that "\d" means "any digit" and "\D" means "any non-digit", we can do this to restrict
the input to, say, no more than seven digits:

PS C:\Notes> [Regex]::IsMatch("-123456.7", "^(\D?\d){1,7}$")
True
PS C:\Notes> [Regex]::IsMatch("-123456.78", "^(\D?\d){1,7}$")
False

This is fine if we're in a position to validate a single input with multiple regular expressions, but
sometimes we're going to need to do it all in one regex. This raises a problem – both of our
expressions necessarily start at the beginning of the input string and work their way, character by
character, to the end. If we are going to do "logical and" patterns as opposed to simply "and then"
patterns, we need a way of applying multiple sub-patterns to the same input.
Fortunately .NET regular expressions support the obscurely named, but very powerful, "lookahead"
feature which allows us to do just that. Using this feature we can, from our current position in the
input string, test a pattern over the rest of the string (all the way to the end if necessary), then resume
testing from where we were.
A lookahead sub-pattern is wrapped in "(?=" … ")" and here's how we can use it to implement the
requirement "up to seven digits AND a valid decimal number" by combining our two existing
patterns:

PS C:\Notes> [Regex]::IsMatch("-123456.7",
"^(?=(\D*\d){1,7}$)-?\d+(\.\d{1,2})?$")
True
PS C:\Notes> [Regex]::IsMatch("-123456.78",
"^(?=(\D*\d){1,7}$)-?\d+(\.\d{1,2})?$")
False
PS C:\Notes> [Regex]::IsMatch("-12345.6.7",
"^(?=(\D*\d){1,7}$)-?\d+(\.\d{1,2})?$")
False

And this completes our implementation of the decimal requirement.
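Because the spec says "May be fixed length", the digit limit will probably vary from field to field. One way to handle that, sketched here in C#, is to build the lookahead from a parameter. The names Dec7Validator, IsDec and maxDigits are assumptions made for this illustration, not part of the spec.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Dec7Validator
{
    public static bool IsDec(string input, int maxDigits)
    {
        // Lookahead: the whole input holds between 1 and maxDigits digits,
        // each optionally preceded by non-digits. The rest of the pattern
        // then checks the shape of the decimal number from the same start.
        string pattern = @"^(?=(\D*\d){1," + maxDigits + @"}$)-?\d+(\.\d{1,2})?$";
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(IsDec("-123456.7", 7));   // True
        Console.WriteLine(IsDec("-123456.78", 7));  // False: eight digits
        Console.WriteLine(IsDec("-12345.6.7", 7));  // False: not a valid decimal
        Console.WriteLine(IsDec("123.45", 5));      // True
        Console.WriteLine(IsDec("123.45", 4));      // False: five digits
    }
}
```

Both conditions have to hold for a match: the lookahead rejects inputs with too many digits, and the main pattern rejects inputs that are not well-formed decimals.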

Using "(?!" … ")" to implement AND NOT


Our final input validation requirement was for address lines, to exclude any that used a PO Box
instead of a real (residential) address.
As usual, let's revisit the friendly spec:
House: Alphanumeric. Must not include the strings 'PO Box', 'P.O. Box', 'P.O.Box', 'P.O Box' or
'POBox' (any case).
Let's first implement the rule that the string must be alphanumeric. This means that the string can
contain alphabetic and numeric characters, spaces, dashes, full stops (period), commas or slashes. We
can implement this rule quite easily, remembering that the space character is a literal, not a separator:

PS C:\Notes> [Regex]::IsMatch("Platform 9 1/2,", "^[-a-zA-Z\d .,/]*$")
True

Now let's write a pattern that will find any obvious variation of "P O Box" anywhere after the start
of the input, which is where we test it. Remember from earlier that the space character is a literal, and
that the "." is a special character unless we escape it as "\.":

PS C:\Notes> [Regex]::IsMatch("No PO Box here", "^.*P\.? ?O\.? ?Box")
True

Next, we'll reverse the result by asking for the pattern not to be found, using the "(?!" … ")"
notation (a negative lookahead), and combine it with our alphanumeric pattern:

PS C:\Notes> [Regex]::IsMatch("Platform 9 1/2,",
"^(?!.*P\.? ?O\.? ?Box)[-a-zA-Z\d .,/]*$")
True
PS C:\Notes> [Regex]::IsMatch("Platform 9 1/2, PO Box 64",
"^(?!.*P\.? ?O\.? ?Box)[-a-zA-Z\d .,/]*$")
False

Finally, we'll make the PO Box rule case-insensitive. This can be done by setting a mode at the start
of the expression that will apply to everything that follows it. We can specify "case insensitive mode"
with the notation "(?i)" – notice that since we're going to be case-insensitive anyway, I've also
simplified the alpha bit of the alphanumeric pattern:

PS C:\Notes> [Regex]::IsMatch("Platform 9 1/2,",
"^(?i)(?!.*P\.? ?O\.? ?Box)[-a-z\d .,/]*$")
True
PS C:\Notes> [Regex]::IsMatch("Platform 9 1/2, po box 64",
"^(?i)(?!.*P\.? ?O\.? ?Box)[-a-z\d .,/]*$")
False
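In C# the case-insensitive mode can also be requested through RegexOptions.IgnoreCase rather than the inline "(?i)" notation. The sketch below takes that route; HouseValidator and IsHouse are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HouseValidator
{
    // Negative lookahead rejects any PO Box variant; the character class
    // then restricts the whole line to the permitted characters.
    private static readonly Regex HousePattern = new Regex(
        @"^(?!.*P\.? ?O\.? ?Box)[-a-z\d .,/]*$",
        RegexOptions.IgnoreCase);

    public static bool IsHouse(string input)
    {
        return HousePattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsHouse("Platform 9 1/2,"));            // True
        Console.WriteLine(IsHouse("Platform 9 1/2, po box 64"));  // False
        Console.WriteLine(IsHouse("P.O.Box 12"));                 // False
        Console.WriteLine(IsHouse("Flat #2"));                    // False: "#" is not permitted
    }
}
```

Constructing the Regex once in a static field keeps the pattern in one place, which makes it easier to test and reuse across input fields.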

Conclusion
Like any good tool, regular expressions can be used or abused. The purpose of this article is to help
you write regular expressions that are fit for the purpose of validating inputs against typical business
validation rules.
In order to do this we've covered writing straightforward patterns using literals, special characters
and character classes, and applying them to the whole input using "^" … "$". We've also seen how to
combine simple patterns to implement logical OR, AND and NOT rules.

References
Using Regular expressions to validate input in ASP.NET:
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=46
Using Regular expressions in SQL Server 2005:
http://msdn.microsoft.com/msdnmag/issues/07/02/SQLRegex/default.aspx
Regular expression options in the .NET library:
http://msdn2.microsoft.com/en-us/library/yd1hzczs(VS.80).aspx
A concise summary of all special characters recognised by .NET regular expressions:
http://regexlib.com/CheatSheet.aspx

.NET 3.5 LANGUAGE ENHANCEMENTS


25 July 2007
by John Papa
While Visual Studio 2008, the several variations of LINQ, and the ADO.NET Entity Framework are
getting a lot of attention in the upcoming .NET Framework 3.5, there are also several key language
enhancements on the near horizon. Many of the language enhancements (which will be found in VB
9 and C# 3.0) are the foundation of these more prominent new technologies. This article is a primer
for some of the key enhancements that will be introduced with the .NET Framework 3.5 and how
they relate to each other.
There are several .NET language enhancements to be introduced with Visual Studio 2008 including
implicitly typed variables, extension methods, anonymous types, object initializers, collection
initializers and automatic properties. These language enhancements, along with features like generics,
are critical to the use of some of the new features, such as LINQ with the ADO.NET Entity
Framework. What can be confusing is that these features are often referred to in the same
conversation as LINQ. Because of this relation by association, you may be led to believe that these
features are part of LINQ. They are not; they are part of the .NET Framework 3.5 and the VB 9 and
C# 3.0 languages. They are very valuable in their own right, as well as playing a huge role for LINQ.
This article will demonstrate and discuss several key language features including:
• Automatic Property setters/getters
• Object Initializers
• Collection Initializers
• Extension Methods
• Implicitly Typed Variables
• Anonymous Types

Figure 1 – Many of the New Language Enhancements

Automatic Properties
Since creating classes by hand can be monotonous at times, developers use either code generation
programs or IDE Add-Ins to assist in creating classes and their properties. Creating properties can
be a very repetitive process, especially when there is no logic in the getters and setters other than
getting and setting the value of the private field. Using public fields would reduce the code required;
however, public fields have some drawbacks, as they are not supported by some other features such
as built-in data binding.
One way to get around having to type the code for a private field and its public property getter and
setter is to use a refactoring tool. However, there is a new language feature called Automatic
Properties that allows you to type less code and still get a private field and its public getter and setter.
You declare the automatic property using a shortcut syntax and the compiler will generate the private
field and the public setter and getter for you. For example, Figure 2 shows a Customer class that has
several private fields that are exposed through a series of corresponding public properties. This class
has 4 properties including one that is of the class type Address.

public class Customer
{
    private int _customerID;
    private string _companyName;
    private Address _businessAddress;
    private string _phone;

    public int CustomerID
    {
        get { return _customerID; }
        set { _customerID = value; }
    }
    public string CompanyName
    {
        get { return _companyName; }
        set { _companyName = value; }
    }
    public Address BusinessAddress
    {
        get { return _businessAddress; }
        set { _businessAddress = value; }
    }
    public string Phone
    {
        get { return _phone; }
        set { _phone = value; }
    }
}

Figure 2 – Customer Class using Explicit Fields, Getters and Setters

Figure 3 shows how the same result can be achieved through automatic properties with less code than
Figure 2. The Customer class in Figure 3 uses automatic properties to create the class’ properties
without writing all of the code to declare a field and its property getter and setter.

public class Customer
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
    public Address BusinessAddress { get; set; }
    public string Phone { get; set; }
}

Figure 3 – Customer Class using Automatic Properties

Object Initializers
It is often helpful to have a constructor that accepts the key information that can be used to initialize
an object. Many code refactoring tools help create constructors like this with .NET 2. However
another new feature coming with .NET 3.5, C# 3 and VB 9 is object initialization. Object Initializers
allow you to pass in named values for each of the public properties that will then be used to initialize
the object.
For example, initializing an instance of the Customer class could be accomplished using the following
code:

Customer customer = new Customer();
customer.CustomerID = 101;
customer.CompanyName = "Foo Company";
customer.BusinessAddress = new Address();
customer.Phone = "555-555-1212";

However, by taking advantage of Object Initializers an instance of the Customer class can be created
using the following syntax:

Customer customer = new Customer {
    CustomerID = 101,
    CompanyName = "Foo Company",
    BusinessAddress = new Address(),
    Phone = "555-555-1212" };

The syntax is to wrap the named properties and their values in curly braces. An object initializer
does not pass these values to the constructor; the compiler calls the constructor first and then assigns
each named public property in turn. This is a great feature, as it removes the need to create multiple
overloaded constructors to handle the various combinations of how you might want to initialize the
object. To make matters easier, when you type the property names the intellisense feature of the IDE
will display a list of the available properties for you. You do not have to set all of the properties and, in
fact, you can even use a nested object initializer for the BusinessAddress property, as shown below.

Customer customer = new Customer
{
    CustomerID = 101,
    CompanyName = "Foo Company",
    BusinessAddress = new Address { City="Somewhere", State="FL" },
    Phone = "555-555-1212"
};
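To make the order of events concrete, here is a small self-contained sketch. It assumes a trimmed-down Customer and Address (the City and State properties follow the example above), and it adds a hypothetical constructor that sets a default Phone, purely to show that the constructor runs before the initializer's assignments.

```csharp
using System;

public class Address
{
    public string City { get; set; }
    public string State { get; set; }
}

public class Customer
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
    public Address BusinessAddress { get; set; }
    public string Phone { get; set; }

    public Customer()
    {
        // Runs first; the initializer's property assignments come afterwards.
        Phone = "unknown";
    }
}

public static class InitializerDemo
{
    public static Customer Build()
    {
        // Phone is deliberately left out of the initializer.
        return new Customer
        {
            CustomerID = 101,
            CompanyName = "Foo Company",
            BusinessAddress = new Address { City = "Somewhere", State = "FL" }
        };
    }

    public static void Main()
    {
        Customer c = Build();
        Console.WriteLine(c.Phone);                  // unknown
        Console.WriteLine(c.BusinessAddress.State);  // FL
    }
}
```

Any property omitted from the initializer simply keeps whatever value the constructor (or the type's default) gave it.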

Collection Initializers
Initializing collections has always been a bother to me. I never enjoy having to create the collection
first and then add the items one by one to the collection in separate statements. (What can I say, I like
tidy code.) Like Object Initializers, the new Collection Initializers allow you to create a collection and
initialize it with a series of objects in a single statement. The following statement demonstrates how
the syntax is very similar to that of the Object Initializers. Initializing a List<Customer> is
accomplished by passing the instances of the Customer objects wrapped inside of curly braces.

List<Customer> custList = new List<Customer>
    { customer1, customer2, customer3 };

Collection Initializers can also be combined with Object Initializers. The result is a slick piece of code
that initializes both the objects and the collection in a single statement.

List<Customer> custList = new List<Customer>
{
    new Customer {CustomerID = 101, CompanyName = "Foo Company"},
    new Customer {CustomerID = 102, CompanyName = "Goo Company"},
    new Customer {CustomerID = 103, CompanyName = "Hoo Company"}
};

The List<Customer> and its 3 Customers from this example could also be written without either
Object Initializers or Collection Initializers, in several more lines of code. Without these new
features, the code could look something like this:

Customer customerFoo = new Customer();
customerFoo.CustomerID = 101;
customerFoo.CompanyName = "Foo Company";
Customer customerGoo = new Customer();
customerGoo.CustomerID = 102;
customerGoo.CompanyName = "Goo Company";
Customer customerHoo = new Customer();
customerHoo.CustomerID = 103;
customerHoo.CompanyName = "Hoo Company";
List<Customer> customerList3 = new List<Customer>();
customerList3.Add(customerFoo);
customerList3.Add(customerGoo);
customerList3.Add(customerHoo);

Extension Methods
Have you ever looked through the list of intellisense for an object hoping to find a method that
handles your specific need only to find that it did not exist? One way you can handle this is to use a
new feature called Extension Methods. Extension methods are a new feature that allows you to
enhance an existing class by adding a new method to it without modifying the actual code for the
class. This is especially useful when using LINQ because several extension methods are available in
writing LINQ query expressions.
For example, imagine that you want to cube a number. You might have the length of one side of a
cube and you want to know its volume. Since all the sides are the same length, it would be nice to
simply have a method that calculates the cube of an integer. You might start by looking at the
System.Int32 class to see if it exposes a Cube method, only to find that it does not. One solution for
this is to create an extension method for the int class that calculates the Cube of an integer.
Extension Methods must be created in a static class and the Extension Method itself must be defined
as static. The syntax is pretty straightforward and familiar, except for the this keyword that is passed as
the first parameter to the Extension Method. Notice in the code below that I create a static method
named Cube that accepts a single parameter. In a static method, preceding the first parameter with
the this keyword creates an extension method that applies to the type of that parameter. So in this
case, I added an Extension Method called Cube to the int type.

public static class MyExtensions
{
    public static int Cube(this int someNumber)
    {
        // Multiply explicitly: in C#, ^ is bitwise XOR, not exponentiation.
        return someNumber * someNumber * someNumber;
    }
}

When you create an Extension Method, the method shows up in the intellisense in the IDE as well.
With this new code I can calculate the cube of an integer using the following code sample:

int oneSide = 3;
int theCube = oneSide.Cube(); // Returns 27

As nice as this feature is I do not recommend creating Extension Methods on classes if instead you
can create a method for the class yourself. For example, if you wanted to create a method to operate
on a Customer class to calculate their credit limit, best practices would be to add this method to the
Customer class itself. Creating an Extension method in this case would violate the encapsulation
principle by placing the code for the Customer’s credit limit calculation outside of the Customer class.
However, Extension Methods are very useful when you cannot add a method to the class itself, as in
the case of creating a Cube method on the int class. Just because you can use a tool, does not mean
you should use a tool.
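It may help to see that an extension method is only a static method with friendlier call syntax. The sketch below is self-contained and computes the cube by multiplication, since C# has no exponent operator; ExtensionDemo is a hypothetical class name added to host the example.

```csharp
using System;

public static class MyExtensions
{
    // "this int" lets Cube be called as if it were an instance method of int.
    public static int Cube(this int someNumber)
    {
        return someNumber * someNumber * someNumber;
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        int oneSide = 3;
        Console.WriteLine(oneSide.Cube());              // 27
        Console.WriteLine(MyExtensions.Cube(oneSide));  // 27: the same call in static form
        Console.WriteLine(3 ^ 3);                       // 0: ^ is bitwise XOR in C#
    }
}
```

Because the call is resolved at compile time to the static method, an extension method can only use the public members of the type it extends, which is part of why it is no substitute for a method on the class itself.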

Anonymous Types and Implicitly Typed Variables


When using LINQ to write query expressions, you might want to return information from several
classes. It is very likely that you'd only want to return a small set of properties from these classes.
However, when you retrieve information from different class sources in this manner, you cannot
retrieve a generic list of your class type because you are not retrieving a specific class type. This is
where Anonymous Types step in and make things easier because Anonymous Types allow you to
create a class structure on the fly.

var dog = new { Breed = "Cocker Spaniel",
                Coat = "black", FerocityLevel = 1 };

Notice that the code above creates a new instance of a class that describes a dog. The dog variable
will now represent the instance of the class and it will expose the Breed, Coat and FerocityLevel properties.
Using this code I was able to create a structure for my data without having to create a Dog class
explicitly. While I would rarely create a class using this feature to represent a Dog, this feature does
come in handy when used with LINQ.
When you create an Anonymous Type you need to declare a variable to refer to the object. Since you
do not know what type you will be getting (since it is a new and anonymous type), you can declare the
variable with the var keyword. This technique is called using an Implicitly Typed Variable.
When writing a LINQ query expression, you may return various pieces of information. You could
return all of these data bits and create an Anonymous Type to store them. For example, let’s assume
you have a List<Customer> and each Customer has a BusinessAddress property of type Address. In
this situation you want to return the CompanyName and the State where the company is located. One
way to accomplish this using an Anonymous Type is shown in Figure 4.

List<Customer> customerList = new List<Customer>
{
    new Customer {CustomerID = 101,
        CompanyName = "Foo Co",
        BusinessAddress = new Address {State="FL"}},
    new Customer {CustomerID = 102,
        CompanyName = "Goo Co",
        BusinessAddress = new Address {State="NY"}},
    new Customer {CustomerID = 103,
        CompanyName = "Hoo Co",
        BusinessAddress = new Address {State="NY"}},
    new Customer {CustomerID = 104,
        CompanyName = "Koo Co",
        BusinessAddress = new Address {State="NY"}}
};

var query = from c in customerList
            where c.BusinessAddress.State.Equals("FL")
            select new { Name = c.CompanyName,
                         c.BusinessAddress.State };

foreach (var co in query)
    Console.WriteLine(co.Name + " - " + co.State);

Figure 4 – Using Anonymous Types with LINQ

Pay particular attention to the select clause in the LINQ query expression. The select clause is
creating an instance of an Anonymous Type that will have a Name and a State property. These values
come from 2 different objects, the Customer and the Address. Also notice that the properties can be
explicitly renamed (CompanyName is renamed to Name) or they can implicitly take on the name as
happens with the State property. Anonymous Types are very useful when retrieving data with LINQ.
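For anyone who wants to run the query, here is a self-contained sketch with minimal Customer and Address classes. The class AnonymousTypeDemo and the method NamesInState are hypothetical names; the method returns strings because an anonymous type has no name that could be written as a return type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Address
{
    public string State { get; set; }
}

public class Customer
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
    public Address BusinessAddress { get; set; }
}

public static class AnonymousTypeDemo
{
    public static List<string> NamesInState(List<Customer> customers, string state)
    {
        // Each result is an anonymous type with Name and State properties.
        var query = from c in customers
                    where c.BusinessAddress.State == state
                    select new { Name = c.CompanyName, c.BusinessAddress.State };

        var lines = new List<string>();
        foreach (var co in query)
            lines.Add(co.Name + " - " + co.State);
        return lines;
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { CustomerID = 101, CompanyName = "Foo Co",
                           BusinessAddress = new Address { State = "FL" } },
            new Customer { CustomerID = 102, CompanyName = "Goo Co",
                           BusinessAddress = new Address { State = "NY" } }
        };

        foreach (string line in NamesInState(customers, "FL"))
            Console.WriteLine(line);  // Foo Co - FL
    }
}
```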

Wrapping Up
There are a lot of new language features coming with .NET 3.5 that both add new functionality and
make existing technologies easier to use. As we have seen in the past, when new technologies
have been introduced, such as with generics, they often are the precursors to other technologies. The
introduction of Generics allowed us to create strongly typed lists. Now because of those strongly
typed lists of objects we will be able to write LINQ query expressions against the strongly typed
objects and access their properties explicitly even using intellisense. These new features such as
Object Initializers and Anonymous Types are the building blocks of LINQ and other future .NET
technologies.

.NET COLLECTION MANAGEMENT WITH C# 3.0


25 February 2008
by Amirthalingam Prasanna

Using C# 3.0 to manage a collection of objects

Generics in C# enable you to define classes, interfaces, delegates or methods with placeholders
for the parameterized types used within. This allows you to define classes that use a generic type, and
to specify the type at the time of instantiation or method call. This keeps your code strongly typed
and makes maintenance easier. Prasanna describes the improvements in .NET v3.5.

This article looks into some of the new features in C# 3.0 and introduces LINQ in managing a
collection of objects within a generic List.

Introduction
A couple of years ago, I wrote an article entitled “.NET Collection Management” for Simple-Talk. The
purpose of that article was to introduce generics and to show how generics can be used to manage a
collection of strongly typed objects within a generic List using C# 2.0. Generics allow us to write
code without binding the code to a particular type, and at the same time ensure we can use strongly
typed objects. I thought I’d revisit the article to see how much my code would be simplified and
improved in C# 3.0, and introduce some of the new features in C# 3.0.
Let us define an Employee class that we will be using throughout the examples in this article. The
Employee class has the properties Name and Salary.

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

We have omitted the implementation of the properties because their implementation is very simple.
You set a value to a private field and get the value from a private field. So we are going to let the
compiler implement the properties for us. This is a feature called “Automatic Properties” that saves a
few lines of code and improves the readability of the code when the property implementation is very
simple.
Next we will define and use a collection of Employee objects using a List<T>.

List<Employee> col = new List<Employee>();
col.Add(new Employee() { Name = "John", Salary = 25500 });
col.Add(new Employee() { Name = "Smith", Salary = 32000 });

In this code, we have used a special syntax to initialize the property values at the time of creating
Employee objects. This feature, called “Object Initializers”, provides a very easy way of
initializing one or more properties when creating an object.

Sorting a List
We can use the Comparison delegate and pass it into the Sort method of List<Employee>. In the
following code we will use an anonymous method to pass the instance of a Comparison delegate to
do the sorting operation. The anonymous method simplifies the call to the Sort method since we do
not need to define a separate method.

col.Sort(delegate(Employee emp1, Employee emp2)
{
    return emp1.Salary.CompareTo(emp2.Salary);
});

We could have written this code in C# 2.0. But in C# 3.0 we can further simplify the implementation
by using Lambda expressions for method implementations. A Lambda expression is an inline method
implementation that is translated to an instance of a delegate by the compiler. These expressions use
the syntax “(parameter 1, parameter 2 …) => method implementation”. Lambda expressions allow
us to define methods on the fly with a simpler syntax compared to anonymous methods. So the above
code can be simplified by using the Lambda expression syntax as follows:

col.Sort((emp1, emp2) => emp1.Salary.CompareTo(emp2.Salary));

By using the Lambda expression, I have omitted defining the type for emp1 and emp2. Since the Sort
method accepts an instance of a Comparison delegate for Employee objects, the compiler is
intelligent enough to understand that emp1 and emp2 have to refer to Employee objects. The
expression “(emp1, emp2) => emp1.Salary.CompareTo(emp2.Salary)” will be translated to an
instance of the Comparison delegate.
Another way of sorting the generic List is by using the static method Enumerable.OrderBy. This
method will return an ordered collection of Employee objects:

IEnumerable<Employee> orderedEmp = Enumerable.OrderBy<Employee, double>(col,
    (emp) => emp.Salary);

The OrderBy method is an extension method. An “Extension method” is a new feature in C# 3.0
that allows you to call a static method belonging to a class as if it were an instance method
belonging to an object. This also allows us to extend types which normally we might not be able to
extend. So the OrderBy method can be called as if it were an instance method on the List. The
compiler translates the call below into a call to the static Enumerable.OrderBy extension method:

IEnumerable<Employee> orderedEmp = col.OrderBy<Employee, double>((emp) => emp.Salary);

Searching a List
The generic List has the methods Find or FindAll to search for one or more objects within the List.
Both these methods accept an instance of the Predicate delegate as a parameter. The Predicate
delegate instance can be defined by creating a method or an anonymous method or a lambda
expression.
We can also use the Enumerable.Where extension method to search within the generic List. The
following code segment returns a collection of Employee objects where the Salary property value is
greater than 1000.

IEnumerable<Employee> empsWithBigSalary = col.Where((emp) => emp.Salary > 1000);
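The Find and FindAll methods mentioned at the start of this section can be sketched in the same way. This is only a minimal illustration: the SearchExamples class and its method names are made up for the example, and the Employee class is repeated so that the listing stands on its own.

```csharp
using System;
using System.Collections.Generic;

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class SearchExamples
{
    // Builds the same two-employee list used earlier in the article.
    public static List<Employee> BuildSample()
    {
        List<Employee> col = new List<Employee>();
        col.Add(new Employee() { Name = "John", Salary = 25500 });
        col.Add(new Employee() { Name = "Smith", Salary = 32000 });
        return col;
    }

    // Find returns the first match, or null when no Employee matches.
    public static Employee FirstOver(List<Employee> col, double threshold)
    {
        return col.Find(emp => emp.Salary > threshold);
    }

    // FindAll returns a new List<Employee> holding every match.
    public static List<Employee> AllOver(List<Employee> col, double threshold)
    {
        return col.FindAll(emp => emp.Salary > threshold);
    }
}
```

In both methods the lambda expression is converted to a Predicate delegate, so it could equally be written as an anonymous method or a named method.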


Operations on objects within List
There are many operations, available through extension methods, that can be performed on
objects within a List. Most of these operations require looping through the objects within the
collection. With extension methods the loop is hidden inside the method, so we can perform these
operations without writing the loop ourselves.
For example, let us assume we want to retrieve the maximum Salary amount within the collection of
Employee objects within List<Employee>. We can use the Max extension method as shown in the
code below:

double max = col.Max((emp) => emp.Salary);

Similarly we can use many other operations such as Min, Sum, and Count that are available.
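As a rough sketch of those other operations, the listing below wraps Min, Sum and Count in small methods. The AggregateExamples class and its method names are assumptions made for the illustration; the Employee class is repeated so the listing is self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class AggregateExamples
{
    // Smallest Salary in the collection.
    public static double LowestSalary(List<Employee> col)
    {
        return col.Min(emp => emp.Salary);
    }

    // Sum of all Salary values.
    public static double TotalSalary(List<Employee> col)
    {
        return col.Sum(emp => emp.Salary);
    }

    // Number of employees whose Salary is above the threshold.
    public static int CountOver(List<Employee> col, double threshold)
    {
        return col.Count(emp => emp.Salary > threshold);
    }
}
```

Each call still visits every element; the loop has simply moved into the extension method.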

List Conversion
Converting a List of one type to a List of another type is very simple. We can still use the Converter
delegate that was available in C# 2.0. Another way of converting the type of a List is to use the
Enumerable.Select extension method.

IEnumerable<string> empNames = col.Select<Employee, string>((emp) => emp.Name);

This method call would return a collection of employee names. However, let's assume that we want
to convert the collection of Employee objects into a collection of objects that has the name of the
employee and a boolean value indicating whether the employee has a salary over 1000. We would need
to create a new type as a class or a structure that has a string property and a boolean property. C# 3.0
supports a new feature called “Anonymous Types” that allows us to define types on the fly.

var emps = col.Select((emp) => new { Name = emp.Name, BigSalary = emp.Salary > 1000 });

We’ve defined a new type that has the properties Name and BigSalary. Another thing that you might
have noticed here is the use of the new keyword “var”. This is a new feature called “Type Inference”.
Type inference lets the compiler work out the type of a local variable from the expression used to
initialize it. It is required with anonymous types, since the compiler generates the type and there
is no type name for us to write.
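To put the two conversion routes from this section side by side, here is a minimal sketch. The ConversionExamples class and its method names are hypothetical; ConvertAll and Select are the real List<T> and Enumerable methods. Note that the anonymous type is consumed inside the method that creates it, because its name cannot be written in a return type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class ConversionExamples
{
    // C# 2.0 route: List<T>.ConvertAll takes a Converter delegate,
    // supplied here as a lambda expression.
    public static List<string> NamesViaConvertAll(List<Employee> col)
    {
        return col.ConvertAll<string>(emp => emp.Name);
    }

    // C# 3.0 route: Select projects each Employee into an anonymous type,
    // and var lets us declare variables of that unnamed type.
    public static List<string> DescribeSalaries(List<Employee> col)
    {
        var emps = col.Select(emp => new { Name = emp.Name, BigSalary = emp.Salary > 1000 });
        List<string> lines = new List<string>();
        foreach (var e in emps)
        {
            lines.Add(e.Name + ": " + e.BigSalary);
        }
        return lines;
    }
}
```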

Linq
We went through sorting, searching, performing operations on and converting a collection of Employee
objects in a generic List. The extension methods OrderBy and Where return an IEnumerable of the
type that we use within the generic List, in this instance the Employee type. The extension method
Select returns an IEnumerable of the type we want to convert the Employee objects to. We can
combine these extension methods to search, sort and convert the objects within the generic List:

IEnumerable<string> emps = col.
    Where((emp) => emp.Salary > 1000).
    OrderBy((emp) => emp.Salary).
    Select((emp) => emp.Name);

This is where Linq comes in. Linq stands for Language INtegrated Query and provides a SQL like
syntax for accomplishing what we did in the above code.

IEnumerable<string> emps = from emp in col
                           where emp.Salary > 1000
                           orderby emp.Salary
                           select emp.Name;

This code uses Linq to query the collection of Employee objects. The syntax is very similar to SQL
except that the select clause is at the end of the query expression. That makes sense because
converting the objects within the generic List is the last step. The expressions that follow the
where, orderby and select keywords are translated by the compiler into lambda expressions, which
are passed to the extension methods shown in the previous examples.
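One way to convince yourself of this equivalence is to run both forms over the same list and compare the results. The sketch below does that; the QueryExamples class and its method names are assumptions made for the illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Name { get; set; }
    public double Salary { get; set; }
}

// Hypothetical helper class, used only for this illustration.
public static class QueryExamples
{
    // Query expression syntax.
    public static List<string> WithQuerySyntax(List<Employee> col)
    {
        IEnumerable<string> emps = from emp in col
                                   where emp.Salary > 1000
                                   orderby emp.Salary
                                   select emp.Name;
        return emps.ToList();
    }

    // The same query written as direct calls to the extension methods.
    public static List<string> WithMethodSyntax(List<Employee> col)
    {
        return col.Where(emp => emp.Salary > 1000)
                  .OrderBy(emp => emp.Salary)
                  .Select(emp => emp.Name)
                  .ToList();
    }
}
```

Both methods return the names of employees earning more than 1000, ordered by ascending salary.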

Conclusion
This article looked at the new features in C# 3.0 that help to handle collections of objects
better. It also introduced language integrated query and showed how it helps in managing
collections.

EXCEPTIONALLY EXPENSIVE
26 March, 2008 1:09
By Brian Donahue
Many years ago, when switching from programming in plain old C to the managed environment of the
.NET Framework, I discovered exceptions. The idea was not completely new to me because I'd
already seen try/catch blocks in JavaScript, and I liked that method of error handling a lot,
especially when compared to the On Error handling that VBScript uses. That is why I prefer using
JScript whenever I can, although sometimes in the scripting environment you begrudgingly have to
use VBS.
.NET significantly enhances the try/catch block by allowing the programmer to extend exceptions
with their own properties and methods. In addition, exceptions are typed, so if you are interested
in one type of exception, say a file that can't be opened, but not in another, for instance an
out-of-memory condition, you can catch with that sort of granularity.
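To make that concrete, here is a small sketch of a custom exception carrying its own property, together with a catch block that handles only one exception type. ConfigLoadException, AttemptedPath and TypedCatchExample are hypothetical names invented for the illustration.

```csharp
using System;
using System.IO;

// Hypothetical custom exception: it extends Exception with its own property.
public class ConfigLoadException : Exception
{
    public string AttemptedPath { get; private set; }

    public ConfigLoadException(string attemptedPath, Exception inner)
        : base("Could not load configuration from " + attemptedPath, inner)
    {
        AttemptedPath = attemptedPath;
    }
}

public static class TypedCatchExample
{
    public static string LoadConfig(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException ex)
        {
            // Only the missing-file case is handled here. Any other exception,
            // for example OutOfMemoryException, propagates to the caller.
            throw new ConfigLoadException(path, ex);
        }
    }
}
```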
So I thought, great!, I will use exceptions everywhere. I will use them all over the place, not only to
handle catastrophic and unusual errors, but also anywhere an object could not be created or a value
exceeded a certain threshold or any one of 1001 completely common situations where something
happened that needed some conditional branching to happen. This, as I found out, could have some
particularly nasty performance implications!
Here is a simple example to demonstrate just how much slower throwing exceptions can make your
.NET program:

using System;

namespace ExceptionTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Call both versions a million times so that a profiler can compare them.
            for (int i = 0; i < 1000000; i++)
            {
                HandleViaException();
                HandleViaCondition();
            }
        }

        // Lets the null dereference throw, then swallows the exception.
        static void HandleViaException()
        {
            string s = null;
            try
            {
                string ss = s.Substring(0, 2);
            }
            catch (System.NullReferenceException)
            {
            }
        }

        // Checks for null up front, so no exception is ever thrown.
        static void HandleViaCondition()
        {
            string s = null;
            if (s != null)
            {
                string ss = s.Substring(0, 2);
            }
        }
    }
}

In the HandleViaException method, I attempt to execute a method on a string object set to a null
value, which will cause a NullReferenceException and make the exception handling code run. In the
HandleViaCondition method, I simply check the value of the string, and if it is null, I do not run the
Substring method on the string. Although these methods perform the same function, there should be
a noticeable performance difference when the methods are both run a million times. I tested this
using the ANTS Profiler code profiling tool with the following results:

HandleViaCondition -- Hit Count: 1000000 Total Time: 0.571 seconds
HandleViaException -- Hit Count: 1000000 Total Time: 50.4 seconds

Using exceptions to trap a null condition is nearly ninety times slower (50.4 seconds against
0.571) than simply checking to see if the string is null. I knew that using exceptions would incur
a performance penalty but mamma mia, that is slow! I've taken a program that should return in less
than a second and turned it into an excuse to hang out at the water cooler and gossip for a while.
Could I make this any worse? Oh, yes I can, by attaching a debugger (cdb.exe) to the program as well!

HandleViaCondition -- Hit Count: 1000000 Total Time: 0.554 seconds
HandleViaException -- Hit Count: 1000000 Total Time: 54.3 seconds

Well, that's not too much worse, but then again, cdb is pretty lightweight. Let's attach to it using
Visual Studio 2005's debugger:

HandleViaCondition -- Hit Count: 1000000 Total Time: 0.678 seconds
HandleViaException -- Hit Count: 1000000 Total Time: 1936 seconds

See, now I have a convenient excuse to go down to the canteen and get a donut. Mmmmmm, donuts.
The Visual Studio debugger is particularly invasive when it encounters an exception in the code that
you're debugging. You expect a debugger to pause your code when an exception is encountered, grab
information about the stack and heap, and allow your code to continue on. With CDB.exe, the
overhead is pretty minimal, but Visual Studio 2005 seems to pause my running code for much longer.
The end result is that if this code were part of a real-world application, I would probably spend
all day trying to debug it, and I frankly have better things to do, like eat bacon sandwiches. On
toast. With some of that nice Brown Sauce they have over here.
From now on, I use exceptions in my code very sparingly, and try to avoid using them in code loops
altogether because the cumulative effect of wasting a few milliseconds in a tight code loop can turn
an application into sludge if you're not careful!
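A common place where this advice applies is parsing input inside a loop. The sketch below contrasts an exception-driven version with a condition-driven one built on int.TryParse; the ParsingExamples class and its method names are made up for the illustration, and the inputs are assumed to be non-null and within the range of int.

```csharp
using System;

// Hypothetical helper class, used only for this illustration.
public static class ParsingExamples
{
    // Exception-driven: every non-numeric input costs a throw and a catch.
    public static int CountValidViaException(string[] inputs)
    {
        int valid = 0;
        foreach (string input in inputs)
        {
            try
            {
                int.Parse(input);
                valid++;
            }
            catch (FormatException)
            {
                // Expected for non-numeric input; nothing to do.
            }
        }
        return valid;
    }

    // Condition-driven: TryParse reports failure through its return value,
    // so no exception is raised for bad input.
    public static int CountValidViaCondition(string[] inputs)
    {
        int valid = 0;
        foreach (string input in inputs)
        {
            int parsed;
            if (int.TryParse(input, out parsed))
            {
                valid++;
            }
        }
        return valid;
    }
}
```

Both methods return the same count; the difference only shows up in the time taken when many of the inputs are invalid.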

NEED TO LOOSEN MY BINDINGS


26 March, 2008 1:09
By Brian Donahue
Microsoft .NET's runtime provides an execution engine for Just-In-Time compiled code, but it also
has the clandestine capability to pre-compile code and cache it on disk. This at first seems a
little odd, since environments like the .NET Framework and Java are supposedly designed to offer
machine-independent, 'virtual' code. I suppose that .NET's native image support was introduced to
solve some performance pitfalls of using Intermediate Language code, which needs to be compiled
dynamically as it runs. Native images would not suffer from this performance loss, since the
compilation has been done in advance.
Native images can become invalid, however, and cause some very strange errors. In one case, one of
our programs was crashing at random points during use, and in most cases the program would
actually fail to even start, throwing a scary-looking invalid program exception and offering the
chance to debug. That presents users with an obtuse error message and, if a debugger is installed,
a yellow arrow pointing at a machine instruction that nobody who started programming a
computer after 1987 really understands.
Theoretically, of course, invalid native images should not occur. But if we understand how these
native images are created, it's possible to see some holes in Microsoft's design.
For starters, native images are typically created when a program is installed, and the process appears
to be automatic -- Windows Installer knows that .NET assemblies are contained in an MSI by some
hocus-pocus and ngen.exe is invoked to create a cache of native images for all assemblies in the MSI.
If the native images are never modified, there is a possibility that they could become invalid if,
say, the .NET Framework environment had changed. If the Framework is patched or re-installed, there
could be an outdated reference in the native image that would cause the program to crash where it
would not if it were properly loaded and JITted in near-real time. Microsoft have thought about this and
designed the Framework to examine the .NET Runtime for changes that would cause a problem. If,
for instance, mscorwks.dll, mscoree.dll, or other runtime libraries have changed, this will force a new
native image generation when a cached assembly tries to load. Likewise, any changes to dependent
assemblies, even if they have cached native images, will force a 'cascading' recompilation of native
images for all dependent assemblies.
It sounds like Microsoft have thought of everything, but, alas, no, there are situations where an
invalid native image is sitting on the hard disk, waiting to strike. Microsoft know that when they
apply changes to the .NET Framework, such as the updates that may be distributed by Windows Update,
the runtime's native image cache needs to be updated. So these updates run NGEN.exe /update
to refresh the native images so that they are no longer invalid. Sounds all well and good -- but
there are some situations where multiple automatic updates have resulted in invalid native images
that the runtime thinks are still valid, probably because the system really requires a reboot after
running the first .NET update, but Automatic Updates silently suppresses it and applies a second
.NET update.
If you come across strange behaviour in a managed .NET program, it may be useful to rule out the
native image as the cause before accusing your program's vendor of releasing buggy code! It is
possible to determine if the program you are running is the JITted version, or a native image, by
using our old friend, the Fusion Log Viewer. Setting this tool up as described in the previous link will
create a log entry any time a .NET assembly is loaded by your program. Because Fusion Log Viewer
can discriminate between bindings to native assemblies and ones that are purely Intermediate
Language, it's plain to see whether the assemblies are being loaded from cache or compiled
dynamically by mscorjit.dll.
If the errant program is loading lots of native-image assemblies, running
%systemroot%\Microsoft.net\framework\v2.0.50727\NGEN.exe /update may just put your program, and
possibly many others, back in working order.

EXTENDING MSBUILD
06 December 2007
by Tilman Bregler

Because MSBuild underpins the Visual Studio 'build' process, you can use MSBuild to explore and
extend the build process to suit your needs. If you are finding the job of building Microsoft .NET
applications tedious, the chances are that using 'extended MSBuild' for your automated builds will
save you time, effort and code.

MSBuild is the build platform for Microsoft and Visual Studio. Unlike other build systems, for
example NAnt, MSBuild not only provides you with a scripting language for controlling builds, but
also with a default build process that can be extended and altered to make it suit your needs.
The benefits of MSBuild, over other build tools, are time savings, code base reduction, harnessing of
tried and tested Microsoft code (don't laugh), and even one or two unique features (assuming that you
are using .NET, of course!).
In this article I will first explore the default MSBuild process and then show how it can be altered by
modifying and overwriting predefined properties and items, and by inserting your own custom targets.
The overall aim is to promote an understanding of the default build process that will enable you to
discover your own ways of extending MSBuild.
NOTE:
This article will refer to .NET 2.0, but as far as I know everything applies equally to
.NET 3.5. The article requires some prior knowledge of MSBuild. See here for an
overview from the horse's mouth: MSDN: Visual Studio MSBuild Concepts.

Project Files are MSBuild Files


There are two important points to notice when working with MSBuild.
1. Visual Studio project files, i.e. .csproj and .vbproj files, are MSBuild scripts. When Visual Studio
2005 builds a project it actually calls MSBuild.exe and passes it the project file. The project files
are the starting point of your builds, and they provide the main entry point for extending the
build process.
2. Everything that happens when a project builds is defined in some MSBuild build script. It
follows that every step can be altered, overwritten or removed. In other words, you could go
ahead and write the whole default build process yourself, using MSBuild and the provided tasks.
An important consequence of the first point is that not only can you customise MSBuild command
line builds, but also builds that are started from within Visual Studio. In .NET there is no difference
between the two. You can distinguish between command line and Visual Studio builds by querying
the BuildingInsideVisualStudio property.
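As a rough sketch, a target can be limited to command line builds by putting a condition on that property. The fragment below overrides the predefined, empty AfterBuild target and is assumed to sit in the project file below the Microsoft.CSharp.targets import; the message text is purely illustrative.

```xml
<!-- Skipped when Visual Studio runs the build, because Visual Studio
     sets BuildingInsideVisualStudio to 'true'. -->
<Target Name="AfterBuild" Condition=" '$(BuildingInsideVisualStudio)' != 'true' ">
  <Message Text="Command line build of $(MSBuildProjectName) finished." />
</Target>
```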

The Default Build Process


Let's start from the top. You can build any Visual Studio project by executing a command line like so:

msbuild app.csproj

Notice that we didn't specify a target, so when MSBuild runs it will call the default target of
app.csproj.
Now, let's look at a project file generated by Visual Studio. Below is the project file of a freshly
created C# class library (with some omissions).

<Project DefaultTargets="Build"
         xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion>8.0.50727</ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{5B9EEF3B-7CF4-4D38-B80D-E07F4B1E3CD0}</ProjectGuid>
    <OutputType>Library</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>ClassLibrary1</RootNamespace>
    <AssemblyName>ClassLibrary1</AssemblyName>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">

  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">

  </PropertyGroup>
  <ItemGroup>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Xml" />
  </ItemGroup>
  <ItemGroup>
    <Compile Include="Class1.cs" />
    <Compile Include="Properties\AssemblyInfo.cs" />
  </ItemGroup>
  <Import Project="$(MSBuildBinPath)\Microsoft.CSharp.targets" />
</Project>

As you can see, there is not a lot in it. There are some properties which you might recognise from the
Properties pages of your projects. Furthermore, there are two item groups, one called 'Reference',
containing the references, and another called 'Compile', containing the source files.
So where is the action? In particular, where is the default 'Build' target that's called when Visual
Studio builds the project?
The answer is of course this line:

<Import Project="$(MSBuildBinPath)\Microsoft.CSharp.targets" />

'MSBuildBinPath' is a reserved property and evaluates to the path where MSBuild.exe resides, i.e.
usually something like C:\WINNT\Microsoft.NET\Framework\v2.0.50727. (See here for a list of
reserved properties: http://msdn2.microsoft.com/en-us/library/0k6kkbsd(VS.80).aspx )
However, perusing Microsoft.CSharp.targets reveals that there isn't much in there either, apart
from a target called 'CoreCompile' which calls the Csc task. In particular, there is no 'Build' target
here either.
However, at the bottom you'll find another import,

<Import Project="Microsoft.Common.targets" />

This is where all the action is. Microsoft.Common.targets contains the 'Build' target we were
looking for, and most of the targets, properties and items executed during a build are defined here.

Property, Item and Target Evaluation


When hooking into the default build process you need to know how MSBuild evaluates properties,
items and targets. In particular, note that:
a) Properties, items and targets are evaluated in order from top to bottom. Furthermore, properties,
items and targets that appear later in the build script always overwrite those of the same name that
appear earlier in the build script.
b) Properties and items defined in property and item groups are evaluated statically. What that means
is that all properties and items, no matter where they appear in the build script, are evaluated at the
start, before any targets have executed.
This implies that, most of the time, you will want to make your additions after the default build
process has been declared, that is in your project file below the Microsoft.CSharp.targets import.
This will allow you to use properties and items defined by Microsoft, to overwrite and amend those
properties and items, and to overwrite and extend predefined targets. Moreover, even though you
modify properties and items at the bottom of your build script, these modifications will have been
evaluated by the time the Microsoft targets run. You can thus provide custom values to default
targets. (See the section below, 'Referencing Different Dlls for Release and Debug Builds', for an
example.)
This also means that you must be very careful not to modify or overwrite existing properties, items
and targets when you don't intend to. To avoid conflicts, I suggest always adding a unique pre-fix to
your custom property, item and target names, for example your company name followed by an
underscore.
Other points to note about property evaluation are that (a) property values passed in via the
command line always overwrite property values set in build scripts, and (b) all environment variables
are automatically available as properties in your build scripts. For example, '$(PATH)' evaluates to the
value of the PATH environment variable. Properties defined in scripts overwrite properties defined
by environment variables, however.
It is possible to not overwrite earlier properties by using the following idiom:

<Property Condition=" '$(Property)' == '' ">Value</Property>

This will only set the value of Property to Value if property hasn't been assigned a value previously.
Exercise care though, as this does not work for properties for which the empty string is a valid value.
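Putting the naming advice and the conditional idiom together, a custom property with a default value might be declared as in the sketch below. Acme_DeployDir and both folder paths are hypothetical and stand in for whatever your own build needs.

```xml
<PropertyGroup>
  <!-- Hypothetical custom property. The 'Acme_' prefix avoids clashes with
       predefined names. The default applies only when no value has been set
       earlier, for example by an environment variable of the same name. -->
  <Acme_DeployDir Condition=" '$(Acme_DeployDir)' == '' ">C:\Deploy</Acme_DeployDir>
</PropertyGroup>
<!-- A value passed on the command line still wins over the default:
     msbuild app.csproj /property:Acme_DeployDir=D:\Staging -->
```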

Side Effects of Static Evaluation


One unfortunate side-effect of static item evaluation manifests itself when specifying items with
wild-cards. Imagine writing the following script,

<ItemGroup>
  <OutputFile Include="$(OutputPath)\**\*" />
</ItemGroup>

<Target Name="CopyOutputFiles">
  <Copy SourceFiles="@(OutputFile)" DestinationFolder="$(DestFolder)" />
</Target>

On the face of it, we are defining an item which contains all output files. We then use this item to
copy the output files to another folder. If you run the script, however, you will find that not a single
file was copied. The reason is that, due to static evaluation, the item got evaluated before any targets
ran. In other words, before any output was generated. The item is thus empty.
Dynamic items get around this problem. The example would then look like this,

<Target Name="CopyOutputFiles">
  <CreateItem Include="$(OutputPath)\**\*">
    <Output TaskParameter="Include" ItemName="OutputFile"/>
  </CreateItem>
  <Copy SourceFiles="@(OutputFile)" DestinationFolder="$(DestFolder)" />
</Target>

Now the item is created just before the Copy task is called and after all the output creating targets
have run. Sadly, dynamic items and properties are not very nice, in that they require an excessive
amount of difficult-to-read code.

Discovering Predefined Properties and Items


The default build process defines a whole raft of properties and items that you can use for your own
purposes. The three I found most useful were:
• $(Configuration) – The configuration you are building, i.e. either 'Debug' or 'Release'
• $(OutputPath) – The output path as defined by your project
• @(MainAssembly) – The main assembly generated by your project
There are two ways to discover what other properties and items are available. One is by inspecting the
Microsoft build scripts. This can be tedious if you don't know exactly what you are looking for, but it
is useful if you know the target whose properties and items you want to modify. In the latter case,
looking at the target should tell you exactly what properties and items are of interest.
The second way is to set verbosity to diagnostic on a command line build:

msbuild app.csproj /verbosity:diagnostic

This will result in all defined properties and items, together with their values, being listed at the top of
the output. From there it's easy to pick the ones you need. Bear in mind, though, that dynamic
properties and items will not be listed. For those it's back to scanning the Microsoft build scripts, I'm
afraid.
Diagnostic verbosity is also useful for debugging your scripts, in general. It gives very detailed output,
as well as the aggregate timings of each task and target.
When running from the command line, I recommend redirecting output to a file, as printing to the
screen can slow down execution considerably. That way you also get the complete output, and you
can search it in a text editor.

Referencing Different Dlls for Release and Debug Builds


Let's use all the above in an example. Assume we want to reference different dlls, depending on
whether we are building a debug or a release build. First, we need two folders that hold the different
dlls, 'Debug' and 'Release', say. And let's assume they reside in the same folder as the .csproj file.
Next, from reading the documentation and inspecting previous build output, I know what I want to
hook into is the call to the ResolveAssemblyReference task. This task can be found in
Microsoft.Common.targets in the ResolveAssemblyReferences target, and looks like this (when
you squint a little),

<ResolveAssemblyReference

SearchPaths="$(AssemblySearchPaths)"

</ResolveAssemblyReference>

So, all we need to do is extend the AssemblySearchPaths property. To do so, add the following to the
bottom of your project file,

<PropertyGroup>
  <AssemblySearchPaths>
    $(Configuration);
    $(AssemblySearchPaths)
  </AssemblySearchPaths>
</PropertyGroup>

What we're doing here is defining a new property, also called AssemblySearchPaths. Since our definition
occurs below the original definition we overwrite the original definition. The new definition states
that AssemblySearchPaths consists of the value of Configuration followed by whatever the old value of
AssemblySearchPaths was. In effect, we prepend the value of Configuration to the value of the
AssemblySearchPaths property.
Now, when the ResolveAssemblyReference task runs, it will use our new definition of AssemblySearchPaths,
thanks to static evaluation. It will look for referenced dlls in a folder, called the value of the
Configuration property, before it looks anywhere else. In the case where you are building a Debug build
it would look first in a subfolder of the current folder called 'Debug'. Since the current folder is
always the folder in which your 'start-up project' resides, i.e. the project file, we are done.
The changed value of the AssemblySearchPaths property can be verified by looking at the build output
with verbosity set to diagnostic.
The cool thing is that this change takes effect even when building inside Visual Studio. In other
words, when you set the configuration drop down to 'Release' you are referencing a release build and
when you set it to 'Debug' a debug build.

Executing Custom Targets


Most of the time, when extending MSBuild, you will want to insert your own custom targets. There
are two ways to go about this. The first is to overwrite predefined targets with your own. These
targets are defined in Microsoft.Common.targets, like so,

<Target Name="BeforeBuild"/>

As you can see, the above target does nothing. However, it gets called during each build at the
appropriate time, i.e. before the build starts. So, if you now define a target of the same name, your
target will overwrite the one in Microsoft.Common.targets, and your target will be called instead.
There is a list of available targets here,
http://msdn2.microsoft.com/en-us/library/ms366724(VS.80).aspx.
The second method for inserting your own targets into the build process is to modify the
'DependsOn' properties. Most targets in Microsoft.Common.targets have the DependsOnTargets
attribute set to the value of a property whose name is of the form 'xxxDependsOn', where 'xxx'
is the name of the target. For example, the 'Build' target depends on whatever the value of the
BuildDependsOn property is.

<Target
Name="Build"
DependsOnTargets="$(BuildDependsOn)"/>

To insert your own target, you have to modify the value of the BuildDependsOn property, like so,

<PropertyGroup>
  <BuildDependsOn>
    CustomTargetBefore;
    $(BuildDependsOn);
    CustomTargetAfter
  </BuildDependsOn>
</PropertyGroup>

The outcome of this will be that CustomTargetBefore will run before all previously defined targets in
BuildDependsOn, and CustomTargetAfter will run after all previously defined targets. The advantage
of using the 'DependsOn' properties is that you don't inadvertently overwrite somebody else's targets,
as is possible when overwriting predefined targets.
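The two custom targets named in BuildDependsOn still have to be defined somewhere in the project file. A minimal sketch, assuming they only need to log a message, might look like this; the message texts are illustrative.

```xml
<!-- Runs before the targets originally listed in BuildDependsOn. -->
<Target Name="CustomTargetBefore">
  <Message Text="About to build $(MSBuildProjectName) ($(Configuration))" />
</Target>

<!-- Runs after the targets originally listed in BuildDependsOn. -->
<Target Name="CustomTargetAfter">
  <Message Text="Finished building into $(OutputPath)" />
</Target>
```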

Extending All Builds


Occasionally, you will want to make extensions to the build process that apply to all builds, for
example if you have a dedicated build server and want to obfuscate each build automatically.
One way to do this would be to import a common file into all your project files. There is, however,
the danger that someday someone forgets to import it and, without any warning, you would end up
with unobfuscated builds. As an alternative, you could modify Microsoft.Common.targets et al. This
is also not desirable: if a future Framework installation modifies those files and removes
your additions, you would again end up with unobfuscated builds, and no warning.
Luckily Microsoft has anticipated this scenario and allows you to specify custom build files. If these
custom files exist, they are imported by Microsoft.Common.targets on every build. From those files
you can make the same modifications to the default build process as discussed previously, but they
will apply to every build.
There are two files available:
1. Custom.Before.Microsoft.Common.targets which is imported at the top of
Microsoft.Common.targets
2. Custom.After.Microsoft.Common.targets which is imported at the bottom.
Both files are expected to reside in %program files%\MSBuild\v2.0. However, since these files are
defined via the properties CustomBeforeMicrosoftCommonTargets and
CustomAfterMicrosoftCommonTargets, you can supply your own file names via the command line, like so:

msbuild.exe app.proj /property:CustomAfterMicrosoftCommonTargets=custom.proj

Most of the time you will want to use Custom.After.Microsoft.Common.targets, so that you can
override and extend existing properties, items and targets, as you would in your project file.
Coming back to the obfuscation example, you would have to create a project file called
'Custom.After.Microsoft.Common.targets' and put it in %program files%\MSBuild\v2.0. In
that file you would have to define a target and hook it in like so:

<PropertyGroup>
  <BuildDependsOn>
    $(BuildDependsOn);
    Obfuscate
  </BuildDependsOn>
</PropertyGroup>

<Target Name="Obfuscate">
...
</Target>

MSBuild Constrictions
Having evangelised, at some length, the wonderful ways in which one can extend MSBuild, I can't
help but mention two serious flaws. The first problem is that Visual Studio solution files are not
MSBuild files. MSBuild can, and does, execute solution files, but these files are not in the MSBuild
format and hence cannot be extended in the same way as project files. What actually happens when
MSBuild is called with a solution file is that it converts the solution file to an MSBuild file in memory
and then runs that file. You can even save that file to disk and then extend it when running command
line builds. You cannot, however, make extensions at the solution level that take effect when building
inside Visual Studio. Also, when you change your solution file you will have to merge the changes into
any generated 'MSBuild Solution'.
The second problem concerns managed C++ projects. And yes, you guessed it, they are not MSBuild
projects either. They are VCBuild projects. When MSBuild builds a managed C++ project it simply
calls VCBuild and passes it the project. From there, it gets very tricky and labour intensive to integrate
managed C++ projects into your build system. So from a build manager's point of view it's good
advice to steer clear of managed C++.

Conclusion
If you're using Visual Studio, you are already using MSBuild, because Visual Studio calls MSBuild
whenever you click the 'Build' button. The Visual Studio integration of MSBuild allows you to extend
your local builds in lots of wonderful ways. In addition, if you're using .NET, using 'extended
MSBuild' for your automated builds is a good idea, given the savings in time, effort and code. I hope
to have shown how you can explore and extend the default build process to make it suit your needs.
Controls Based Security in a Windows Forms Application 69

CONTROLS BASED SECURITY IN A WINDOWS FORMS APPLICATION
22 January 2007
by Jesse Liberty
One of my clients wanted to be able to restrict any given control, on any form, so that it is either
invisible or disabled based on who is using the form. We decided to make the restrictions "roles-
based" – that is "managers can click this button, users can see it, but to guests it is invisible."
We wanted to build an architecture that would allow us to add forms and controls to the application
without deciding in advance which roles we would use, and without having to modify the forms or
controls to meet the needs of the security architecture any more than absolutely necessary. The ideal
security architecture would be independent of the participating forms and controls.
In this article, I will review the approach I took, focusing on the nitty-gritty code used to make this
work, and the challenges faced in creating such an application quickly (their budget for this was 4
days). This article takes you as far as saving the users, roles and the permissions those roles have for
the various controls. It does not implement login, and so it does not implement any of the checks to
see if the logged-in user should be restricted in access to the controls on any given page. All of that is
left as a (dare I say?) fairly straightforward exercise for the reader.
The full source code for this article is available for download. Simply click on the CODE
DOWNLOAD link in the box to the right of the article title. It is also available on the author's
website.
NOTE:
This article is targeted at .NET 2 programmers already familiar with C# and .NET
Windows Forms. I do not explain how to create forms, or how event handling works, nor
do I explain how to interact with a SQL database.

Users and roles


There are many security schemes that have evolved over time, but the one which has proven most
successful, at least in the Windows world, is that of Access Control Lists (ACLs) now most
commonly referred to as Users and Roles. We see this most cleanly and starkly implemented in
ASP.NET, though it presents a bit more of a challenge with Windows Forms applications that will be
used by a very large number of users. (One approach is to use the ASP.NET authentication support,
accessing it through a web service.)
NOTE:
See, for example, my articles on creating users and roles on O'Reilly Windows DevCenter
(though today I would re-write these articles to use the Web Site Administration Tool
accessed from Visual Studio under Website -> ASP.NET -> Configuration).
My client had their own authentication system (as part of their larger in-house system) and to keep
this article simple I'll follow their lead and simply create a Users table and a Roles table and finesse
the authentication.

Roles and controls


To decide if a user has "access" to a control (which will be defined as meaning the right to see a
control or to invoke the control) we'll create two additional objects:
• ControlPermission which will represent a given control on a given form, and
• PermissionToRole which encapsulates the relationship between a given
ControlPermission and a given Role (the one-to-many relationship) and the permissions
for that control by users in that role.
NOTE:
In a "real" application, I'd add a middle tier of business objects to represent the user,
role, control and the relationships, but again, to keep this paper stripped down to the
essence, the sample application will have the presentation layer talk directly to the
persistence layer (a practice I generally recommend against! Please see this article for
more on this topic).

The database
The database (for this part of the application) is very straightforward. We'll create five tables, as
shown in Figure 1:
NOTE:
To make this article as accessible as possible, I created the database using SQLExpress,
though I accessed and manipulated the database using the SQL Server Management
Studio from SQL Server 2005.

Figure 1 - Database diagram

There are three data tables (Users, Roles and Controls) and two tables that create many-to-many
relationships: UsersToRoles and ControlsToRoles. Users and Roles are more or less
self-explanatory. The Controls table represents a control on a form, and ControlsToRoles is the heart of
the control-based security approach; it represents the permissions of a given role for a given control
on a given form, as is explained in detail below.

Application and control-security forms


The application may consist of any number of forms. To keep the explanation clear, we'll draw a
distinction between the two forms used for control-security, and all the other forms used for the
application (which we'll call "application forms").
There are only two requirements for an application form to participate in control-based security:
1. Each application form must include two ToolTip controls (explained in detail below) which
must be named toolTip1 and toolTip2.
2. Each application form must provide some means (typically a menu choice) of invoking the two
control-security forms: ManageRoles.cs and ManagePermissions.cs.
The application forms are free to use toolTip1 in any way they choose (including ignoring it), but they
must not use toolTip2 at all, as its contents will be controlled by the control-security forms.
That's it; otherwise all application forms and their controls (including user controls and custom
controls) remain unchanged.

Creating the application


Begin by creating a new Windows Forms application in Visual Studio 2005 (or your favorite
alternative tool).
Let's assume, for the sake of this article, that the application you are building will be used by a Sales
Representative to enter data about a request to Liberty Associates, Inc. for contract programming,
training or writing. For illustration purposes I'll create two application forms (PotentialClient and
NewContract), but you can imagine an application with 20 (or 200) application forms. I'll also create
the two security forms: ManageRoles and ManagePermissions.
PotentialClient asks for basic demographic information as shown in figure 2:

Figure 2 - Potential Client Form

And NewContract asks for details about the work requested, as shown in figure 3.

Figure 3 - Second form: New Contract

I've intentionally made these application forms crude and simple to allow us to focus on the control-
based security rather than on the form design. In any case, none of the data retrieved in these forms
will be persisted, and I encourage you to create your own forms that more closely represent your own
business needs.

Creating users and roles


As noted earlier, there are numerous ways to approach creating users and roles in a Windows
application. The three that I personally find most appealing are:
1. Use the Windows built-in users and roles if the application needs and network needs are 100% isomorphic
(a rare but not impossible scenario).
2. Use a Web Service to leverage the ASP.NET forms-based security infrastructure created by Microsoft.
3. Create your own simple database, and use a proprietary (existing) authenticating system.
Again, to keep this paper focused, we'll assume situation #3. Thus, we need only create a Windows
form for adding users, adding roles and adding users to roles, and saving all of that to the database.
This is easily accomplished by creating the form shown in Figure 4:

Figure 4 - Manage Roles Form

You will want to bind the list boxes to data sources tied to your data tables. When the user clicks on
AddNew you'll get the name from the text box and create the new record for the database and for
the list box:

private void AddNewRole_Click( object sender, EventArgs e )
{
    string newName = string.Empty;
    newName = NewRoleName.Text;
    NewRoleName.Text = string.Empty; // clear the control

    ControlSecurityDataSet.RolesRow newRolesRow;
    newRolesRow = controlSecurityDataSet.Roles.NewRolesRow();
    newRolesRow.RoleName = newName;
    this.controlSecurityDataSet.Roles.Rows.Add( newRolesRow );

    try
    {
        // Push the new row to the database.
        this.rolesTableAdapter.Update( this.controlSecurityDataSet.Roles );
    }
    catch ( Exception ex )
    {
        // Undo the in-memory change if the database update failed.
        this.controlSecurityDataSet.Roles.Rows.Remove( newRolesRow );
        MessageBox.Show( "Unable to add role " + newName + ": " + ex.Message,
            "Unable to add role!", MessageBoxButtons.OK,
            MessageBoxIcon.Error );
    }

    RolesListBox.SelectedIndex = -1;
}

When you add a user to a role (by clicking on the arrow key), you'll make an entry in the
UsersToRoles table (adding the UserID and RoleID) and then update the TreeView (far right). To
update the TreeView, you'll retrieve the entries from the database and iterate through the rows of the
table, creating a new parent node each time you come across a new user name (if the "Name" radio
button is selected) or a new RoleName (if the Role radio button is selected):

foreach ( DataRow row in dt.Rows )
{
    if ( rbName.Checked )
    {
        subNode = new TreeNode( row["roleName"].ToString() );
        if ( currentName != row["Name"].ToString() )
        {
            parentNode = new TreeNode( row["Name"].ToString() );
            currentName = row["Name"].ToString();
            UsersInRoles.Nodes.Add( parentNode );
        }
    }
    // ... (excerpt; the branch for the Role radio button is shown later)
}

Since our focus is on the controls-based security, I won't go into more detail here, though as
mentioned earlier, the complete source is available for you to download and try out.

The Manage Permissions page


The second, and perhaps more important, page involved in managing the control-based security will
display all the controls for a given page, and will display all the roles known to the application. Since
the default is full access, the administrator need only indicate the restrictions to apply for any given
control for any given role. The page will have two multi-select list boxes: one displaying all the
controls from the page and one displaying all the roles. In addition, much as in the ManageRoles
form, there will be a Tree control displaying the saved restrictions, either by role or by control, as
shown in Figure 5:

Figure 5 - Manage Permissions

To accomplish this, you'll want to extend your data source to include the Controls and
ControlsToRoles tables as shown in Figure 6:

Figure 6 - Control Security Data Set

Once again you'll populate the Roles list box by dragging the Roles table from the data sources view
onto the list box, thus creating both a BindingSource and a TableAdapter.
Populating the Controls permission list box is a bit trickier. To do this you need access to the list of
controls for the form.
The ManagePermissions constructor will take three arguments: a reference to a Form, and two
references to ToolTip objects. The constructor will stash these away in private member variables.

public partial class ManagePermissions : Form
{
    private Form workingForm;
    private ToolTip formToolTip1 = null;
    private ToolTip formToolTip2 = null;

    public ManagePermissions( Form f, ToolTip toolTip1,
        ToolTip toolTip2 )
    {
        InitializeComponent();
        workingForm = f;

        formToolTip1 = toolTip1;
        formToolTip2 = toolTip2;
        formToolTip1.Active = false;
        formToolTip2.Active = true;

        this.Text += " for page " + f.Name;

        ShowControls( f.Controls );
        PopulatePermissionTree();
    }
    // ... (remaining members omitted from this excerpt)
}

Note that after it stashes away the references it sets the title to indicate which form (page) it is setting
permissions for. We pass in the two tool tips so that we can switch off whatever tool tips the
programmer was using (toolTip1) and use the second set (toolTip2) to show the name of every
control on the page. This allows the administrator to identify the exact control to be restricted, as
shown in figure 7:

Figure 7 – Tool Tips used to name Controls



In Figure 7 you can see that when the permissions dialog is open and the user hovers over any control
on the page for which permissions are being set, the tool tip indicates the name of the
control. This corresponds directly to the name listed in the Control Permissions list box (which is
sorted alphabetically).
Setting permissions
The administrator selects (or multi-selects) one or more controls in the Controls Permission list box,
then selects (or multi-selects) one or more roles in the roles list box and then checks either or both of
invisible and disabled. The handler for the Save button loops through each indicated control and
each indicated role and calls a stored procedure to make a record in ControlsToRoles (first ensuring
there is a record for that control in Controls). All of this is done under transaction support, as
explained in detail below.
Cleaning up
When the Control Permissions page closes, we reset the original tool tips (which we stashed away).
The nitty gritty
There are a number of juicy technical details, and the best way to see them is to walk through them
step by step.

Implementing permissions – step-by-step


The following is an annotated walk through of adding restrictions to a control for a given role, as you
might see it stepping through the debugger.
<Rant>
Programmers do not spend nearly enough time becoming proficient with their editor nor with their
debugger. This seems crazy to me; on a par with race car drivers who do not know how the engine to
their car works, or infantry who can't break down their weapons. They put others at risk.
Code should be self-commenting. Excessive comments (more than a few per page) are an admission
of failure and each comment you add to your code greatly decreases the life expectancy of your code
(comments rust). This second point is highly controversial and worth an opinion piece all its own
(forthcoming).
The best way to learn how to program is (a) to buy a really good book with lots of exercises, (b) to
expand those exercises and (c) [most important] to step through your exercises in the debugger to see
what is really happening. The best way to learn new techniques is to step through code. Visual Studio
has a great debugger; pay special attention to the watch and quick watch features.
</Rant>
We'll pick up the program where the user clicks on ManagePermissions from the PotentialClient
Application Form, as shown in figure 8:

Figure 8 - Manage Permissions

The code handler follows:


private void managePermissionsToolStripMenuItem_Click(
    object sender, EventArgs e )
{
    ManagePermissions dlg = new ManagePermissions(
        this,
        this.toolTip1,
        this.toolTip2 );
    dlg.Show();
}

There are two things to note about this handler:


1. As required, it passes in a reference to itself (the form) and to its two ToolTip objects to the
constructor for the ManagePermissions form.
2. Equally important, it calls Show, and not ShowDialog, when invoking the dialog. Show opens the
dialog modelessly, which allows the user to return to the invoking form, give it focus and hover
over controls to see the name of each control in the tooltips.

ManagePermissions constructor
Control now switches to the constructor of the ManagePermissions form, which was shown,
slightly excerpted, before. The omission was that I hid the Dictionary named oldMenuToolTips
that I was forced to create despite the fact that I hold on to toolTip1. This dictionary is needed
because tool tips work differently for menus than they do for other controls. This is a detail we'll
come to in a bit.
Stepping into the ManagePermissions constructor, the class members are initialized and then the
body of the constructor itself is called:

public partial class ManagePermissions : Form
{
    // Original menu-item tool tips, keyed by menu item name.
    private Dictionary<string, string> oldMenuToolTips =
        new Dictionary<string, string>();
    private Form workingForm;
    private ToolTip formToolTip1 = null;
    private ToolTip formToolTip2 = null;

    public ManagePermissions( Form f, ToolTip toolTip1,
        ToolTip toolTip2 )
    {
        InitializeComponent();
        workingForm = f;

        formToolTip1 = toolTip1;
        formToolTip2 = toolTip2;
        formToolTip1.Active = false;
        formToolTip2.Active = true;

        this.Text += " for page " + f.Name;

        ShowControls( f.Controls );
        PopulatePermissionTree();
    }
    // ... (remaining members omitted from this excerpt)
}

NOTE:
If you are stepping through, you'll find yourself in the designer. Put a breakpoint on
ShowControls in the constructor and hit F5 to jump there, as it is the next interesting
piece to examine.

ShowControls
After setting the title bar, ShowControls is called, passing in the controls collection of the form.
Stepping in, you see the controls collection defined as a Control.ControlCollection. The trick with
this method is that Controls themselves can contain other controls (e.g., a panel can contain controls)
and so the method must be made recursive:

foreach ( Control c in controlCollection )
{
    string displayName = string.Empty;
    if ( c.Controls.Count > 0 )
    {
        ShowControls( c.Controls );
    }

A bit nastier, menu strips handle their members differently, so if the control is a menu strip, you'll
need to call a different method:

if ( c is MenuStrip)
{
MenuStrip menuStrip = c as MenuStrip;
ShowToolStipItems( menuStrip.Items );
}

The ShowToolStipItems method itself must be recursive as well:

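private void ShowToolStipItems( ToolStripItemCollection toolStripItems )
{
    foreach ( ToolStripItem item in toolStripItems )
    {
        // Skip separators and other non-menu items; casting them to
        // ToolStripMenuItem in the foreach itself would throw.
        ToolStripMenuItem mi = item as ToolStripMenuItem;
        if ( mi == null )
        {
            continue;
        }

        // Remember the original tool tip so it can be restored on close.
        oldMenuToolTips.Add( mi.Name, mi.ToolTipText );
        mi.ToolTipText = mi.Name;

        if ( mi.DropDownItems.Count > 0 )
        {
            ShowToolStipItems( mi.DropDownItems );
        }

        PageControls.Items.Add( mi.Name );
    }
}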

Notice the last line; it is here that we add the menu items to the list of controls that might be
restricted – that is we treat the menu items just like any other control for purposes of control-based
security.
Returning to the ShowControls method, we are now ready to see if the control is of a type that we
might want to restrict (you are free to expand the list of types). If so, we'll set its second tool tip to its
name and we'll add it to the list box of controls:

if ( c is Button || c is ComboBox || c is TextBox ||
     c is ListBox || c is DataGridView || c is RadioButton ||
     c is RichTextBox || c is TabPage )
{
    formToolTip2.SetToolTip( c, c.Name );
    PageControls.Items.Add( c.Name );
}
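Because the walk over the control tree is shown in pieces, it may help to see it in one place. The following is only a sketch: ControlNameCollector is a hypothetical helper, not part of the article's download, and it returns the names in a list instead of filling the PageControls list box or touching any tool tips. It assumes the same set of restrictable control types as the excerpt above.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper: gathers the names of every control and menu item
// that could be restricted, recursing through containers and menus.
public static class ControlNameCollector
{
    public static List<string> Collect( Control.ControlCollection controls )
    {
        List<string> names = new List<string>();
        CollectControls( controls, names );
        return names;
    }

    private static void CollectControls(
        Control.ControlCollection controls, List<string> names )
    {
        foreach ( Control c in controls )
        {
            // Containers (panels, tab pages, ...) hold further controls.
            if ( c.Controls.Count > 0 )
            {
                CollectControls( c.Controls, names );
            }

            MenuStrip menuStrip = c as MenuStrip;
            if ( menuStrip != null )
            {
                // Menu items are not Controls, so they need their own walk.
                CollectMenuItems( menuStrip.Items, names );
            }
            else if ( IsRestrictable( c ) )
            {
                names.Add( c.Name );
            }
        }
    }

    private static void CollectMenuItems(
        ToolStripItemCollection items, List<string> names )
    {
        foreach ( ToolStripItem item in items )
        {
            ToolStripMenuItem mi = item as ToolStripMenuItem;
            if ( mi == null )
            {
                continue; // separators and other non-menu items
            }
            names.Add( mi.Name );
            CollectMenuItems( mi.DropDownItems, names );
        }
    }

    private static bool IsRestrictable( Control c )
    {
        return c is Button || c is ComboBox || c is TextBox ||
               c is ListBox || c is DataGridView || c is RadioButton ||
               c is RichTextBox || c is TabPage;
    }
}
```

Calling ControlNameCollector.Collect( someForm.Controls ) would yield the same names that the article's ShowControls adds to the list box, which makes the recursion easy to check in isolation.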

Populate Permission Tree


Having populated the controls, and remembering that we bound the roles control box to the roles
table (by dragging the roles table from the data source to the control in design view, thus setting up a
rolesBindingSource and a rolesTableAdapter and letting them do the work), we are up to the last
line in the constructor in which we invoke PopulatePermissionTree.
This method is factored out because it is invoked from a number of places (which the debugger is
happy to point out to you: just go to the method, right-click and then click on "Find all references").
NOTE:
Here, as in many places in the code, I would normally remove all calls to the database to
a business object (or at least to a helper object with static member methods). Once again,
to focus on the task at hand, and to keep the code straightforward, I've put the data
access code directly into the presentation layer code, which gives me the willies but does
make for an easier to follow example.
The method begins by retrieving the connection string from the application's configuration file
(app.config), using the ConfigurationManager object (the new and preferred way to do so in 2.0
applications). Once this is obtained, a SQL connection is created and opened:

ConnectionStringSettingsCollection connectionStrings =
    ConfigurationManager.ConnectionStrings;

// The key is split with "+" only to fit the page; it is a single name.
string connString = connectionStrings[
    "ControlBasedSecurity.Properties.Settings." +
    "ControlSecurityConnectionString"].ToString();

SqlConnection conn = new SqlConnection( connString );
conn.Open();
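For orientation, the lookup above expects an entry of roughly the following shape in app.config. This is a sketch only: the name attribute is the key used in the code, but the connection string value (server instance and database name) is an assumption and will differ on your machine.

```xml
<configuration>
  <connectionStrings>
    <!-- connectionString value is hypothetical; adjust server and database. -->
    <add name="ControlBasedSecurity.Properties.Settings.ControlSecurityConnectionString"
         connectionString="Data Source=.\SQLEXPRESS;Initial Catalog=ControlSecurity;Integrated Security=True"
         providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
```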

The queryString to obtain the controls we want is hard wired into the code (typically, we'd use a
stored procedure) and the order clause is set by which radio button is chosen by the user. A hack, but
an effective one:

string queryString =
    "select ControlID, Invisible, Disabled, RoleName " +
    "from ControlsToRoles ctr " +
    " join controls c on c.ControlID = ctr.FKControlID and c.Page = ctr.FKPage " +
    " join roles r on r.RoleID = ctr.FKRole ";

// The sort order decides which column becomes the parent node in the tree.
if ( ByControlRB.Checked )
{
    queryString += " order by ControlID";
}
else
{
    queryString += " order by RoleName";
}

In the parallel method, in ManageRoles, I had an if statement to check which radio button was
checked (by user or by role) and set the variables subNode and parentNode accordingly:

if ( rbName.Checked )
{
    subNode = new TreeNode( row["roleName"].ToString() );
    if ( currentName != row["Name"].ToString() )
    {
        parentNode = new TreeNode( row["Name"].ToString() );
        currentName = row["Name"].ToString();
        UsersInRoles.Nodes.Add( parentNode );
    }
}
else
{
    subNode = new TreeNode( row["Name"].ToString() );
    if ( currentName != row["RoleName"].ToString() )
    {
        parentNode = new TreeNode( row["RoleName"].ToString() );
        currentName = row["RoleName"].ToString();
        UsersInRoles.Nodes.Add( parentNode );
    }
}

In this method, I'll use the C# ternary operator to consolidate this code:

subNode = new TreeNode( subNodeText );

string dataName = ByControlRB.Checked ?
    row["ControlID"].ToString() : row["RoleName"].ToString();

if ( currentName != dataName )
{
    parentNode = new TreeNode( dataName );
    currentName = dataName;
    PermissionTree.Nodes.Add( parentNode );
}

NOTE:
You read the ternary operator statement as follows: "Is the radio button ByControlRB
checked? If so, assign what is in the column ControlID to the string dataName,
otherwise assign what is in the column RoleName to that string."
This avoids duplicating the code in an else statement, and thus makes the code more robust.
NOTE:
The opportunity to factor the two PopulateTree methods (from ManagePermissions and
ManageRoles) into a single helper method is left as an exercise for the reader.
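One possible shape for such a helper is sketched below. TreeGrouping is a hypothetical name, not something from the article's download, and the sketch deliberately stops short of touching a TreeView: it only performs the "start a new parent whenever the sorted parent column changes" step that both methods share, so it works for any pair of column names. It assumes, as both queries do, that the rows arrive already sorted by the parent column.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper shared by both forms: groups child values under
// their parent value, preserving the order of the (pre-sorted) rows.
public static class TreeGrouping
{
    public static List<KeyValuePair<string, List<string>>> Group(
        DataTable sortedRows, string parentColumn, string childColumn )
    {
        List<KeyValuePair<string, List<string>>> groups =
            new List<KeyValuePair<string, List<string>>>();
        string currentName = null;
        List<string> children = null;

        foreach ( DataRow row in sortedRows.Rows )
        {
            string parentName = row[parentColumn].ToString();

            // A change in the parent column starts a new parent node.
            if ( children == null || parentName != currentName )
            {
                children = new List<string>();
                groups.Add( new KeyValuePair<string, List<string>>(
                    parentName, children ) );
                currentName = parentName;
            }

            children.Add( row[childColumn].ToString() );
        }

        return groups;
    }
}
```

Each form would then loop over the result, creating one parent TreeNode per key and one sub-node per child, and would pick the two column names from whichever radio button is checked.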
With the query string created, we can retrieve the permissions from the database into a dataset, and
from that extract the DataTable whose Rows collection we'll iterate through to get all the existing
ControlsToRoles relations.

DataSet ds = new DataSet();
SqlDataAdapter dataAdapter = null;
DataTable dt = null;
try
{
    dataAdapter = new SqlDataAdapter( queryString, conn );
    dataAdapter.Fill( ds, "controlsToRoles" );
    dt = ds.Tables[0];
}
catch ( Exception e )
{
    MessageBox.Show( "Unable to retrieve permissions: " + e.Message,
        "Error retrieving permissions",
        MessageBoxButtons.OK,
        MessageBoxIcon.Error );
}
finally
{
    conn.Close();
}

With the DataTable in hand, the next step is to prep the TreeView by calling BeginUpdate (which
stops it from updating until we call EndUpdate) and by clearing all its existing nodes, so that we can
add all the nodes from the database and not worry about duplication. We then iterate through each
row, creating sub-nodes for each parent, and adding the parents to the TreeView as we find new
parents. The parents are defined as a new control (when sorting by controls) or a new role (when
sorting by roles).

PermissionTree.BeginUpdate();
PermissionTree.Nodes.Clear();
TreeNode parentNode = null;
TreeNode subNode = null;

string currentName = string.Empty;

foreach ( DataRow row in dt.Rows )
{
    // Sub-node text: the "other" column plus the visible/enabled state.
    string subNodeText = ByControlRB.Checked ?
        row["RoleName"].ToString() : row["ControlID"].ToString();
    subNodeText += ":";
    subNodeText += Convert.ToInt32(
        row["Invisible"] ) == 0 ? " visible " : " not visible ";
    subNodeText += " and ";
    subNodeText += Convert.ToInt32(
        row["Disabled"] ) == 0 ? " enabled " : " disabled ";

    subNode = new TreeNode( subNodeText );

    string dataName = ByControlRB.Checked ?
        row["ControlID"].ToString() : row["RoleName"].ToString();
    if ( currentName != dataName )
    {
        parentNode = new TreeNode( dataName );
        currentName = dataName;
        PermissionTree.Nodes.Add( parentNode );
    }

    if ( parentNode != null )
    {
        parentNode.Nodes.Add( subNode );
    }
}
PermissionTree.EndUpdate();

We've chosen to have the sub-nodes tell whether the control is invisible or disabled no matter what
the view, allowing for displays as shown in figures 9 and 10:

Figure 9 - Current Status by Control Figure 10 - Current Status by Role

Adding a new restriction to a control


At this point, all we've done (hard to believe!) is finish the constructor. The dialog is displayed with
the name of the page we're setting permissions for, the controls for that page are displayed, the roles
are displayed, and any previously recorded restrictions are displayed. In addition, the application
page's ToolTips have been adjusted to display the name of each control when you hover over the
control (all of this was shown in figure 7).
Assume the administrator selects a control (e.g., InternationalRB) and then selects two roles (e.g.,
User and Technician) and checks Disable and Save. The administrator's intent is to prevent all
members of the User role and the Technician role from having the International radio button
enabled when they view the Potential Client application page.
When the administrator clicks Save, control will jump to the Save_Click event handler. Once again
we'll retrieve the connection settings and open a connection. The code then iterates through each of
the selected items in the PageControls list box extracting the string representing the controlID of
the control that was selected (InternationalRB) and then within that loop iterates through each of
the selected items in the PermissionRoles list box, retrieving the DataRowViews corresponding to
the selected items (remember that the PermissionRoles list box was populated through data
binding).
With these in hand, you are ready to create records in ControlsToRoles which you will do by calling
the stored procedure spInsertNewControlToRole, shown here:

CREATE PROCEDURE spInsertNewControlToRole
    @RoleID int,
    @PageName varchar(50),
    @ControlID varchar(50),
    @invisible int,
    @disabled int
AS
BEGIN
    Begin Transaction

    if not exists (select * from Controls
        where Page = @PageName and ControlID = @ControlID)
        Insert into Controls (Page, ControlID)
            values (@PageName, @ControlID)
    if @@Error <> 0 goto ErrorHandler

    insert into ControlsToRoles (FKRole, FKPage, FKControlID,
        invisible, disabled)
        values (@RoleID, @PageName, @ControlID,
            @invisible, @disabled)

    if @@Error <> 0 goto ErrorHandler

    commit transaction
    return

ErrorHandler:
    rollback transaction
    return

END

Note first that this stored procedure uses transactions to ensure that either the Control row is added
(if needed) and the ControlsToRoles row is added, or neither is added. Second, the stored procedure
checks whether the table already has an entry for this control/page combination and only attempts to
insert one if it does not already exist.
The ControlsToRoles row does double duty; it manages the relation between a control and a role
and it manages the state for that relationship (is invisible set? Is disabled set?). While this may be in
some ways counterintuitive, it ensures that (1) a role can have only one relationship with any given
control and (2) when you set a control invisible for two roles, and then set it visible for one of the
roles you do not inadvertently set it visible for the other. That is, the relationship between a role and a
control (and the state of that control) is atomic.
The code to call this stored procedure is shown here:

private void Save_Click( object sender, EventArgs e )
{
    ConnectionStringSettingsCollection connectionStrings =
        ConfigurationManager.ConnectionStrings;

    // Same lookup as in PopulatePermissionTree.
    string connString = connectionStrings[
        "ControlBasedSecurity.Properties.Settings." +
        "ControlSecurityConnectionString"].ToString();

    SqlConnection conn = new SqlConnection( connString );
    conn.Open();
    SqlParameter param;

    foreach ( String controlID in PageControls.SelectedItems )
    {
        foreach ( DataRowView roleRow in PermissionRoles.SelectedItems )
        {
            int roleID = Convert.ToInt32( roleRow["RoleID"] );

            try
            {
                SqlCommand cmd = new SqlCommand();
                cmd.Connection = conn;
                cmd.CommandText = "spInsertNewControlToRole";
                cmd.CommandType = CommandType.StoredProcedure;

                param = cmd.Parameters.Add( "@RoleID", SqlDbType.Int );
                param.Value = roleID;
                param.Direction = ParameterDirection.Input;

                param = cmd.Parameters.Add( "@PageName",
                    SqlDbType.VarChar, 50 );
                param.Value = workingForm.Name.ToString();
                param.Direction = ParameterDirection.Input;

                param = cmd.Parameters.Add( "@ControlID",
                    SqlDbType.VarChar, 50 );
                param.Value = controlID;
                param.Direction = ParameterDirection.Input;

                param = cmd.Parameters.Add( "@invisible", SqlDbType.Int );
                param.Value = InVisible.Checked ? 1 : 0;
                param.Direction = ParameterDirection.Input;

                param = cmd.Parameters.Add( "@disabled", SqlDbType.Int );
                param.Value = Disabled.Checked ? 1 : 0;
                param.Direction = ParameterDirection.Input;

                // One row (ControlsToRoles) or two (plus Controls) are expected.
                int rowsInserted = cmd.ExecuteNonQuery();
                if ( rowsInserted < 1 || rowsInserted > 2 )
                {
                    DisplayError(
                        controlID, roleID,
                        "Rows inserted = " +
                        rowsInserted.ToString() );
                }
            }
            catch ( Exception ex )
            {
                DisplayError( controlID, roleID, ex.Message );
            }
        }
    }
    conn.Close();
    PopulatePermissionTree();
}

Once the new rows are inserted, we call PopulatePermissionTree to repopulate the permission tree,
which reflects the change and gives positive feedback to the user.

Clean up on exit
When the administrator is finished setting restrictions, the ManagePermissions page is closed. An
event is fired as the page is closed (FormClosing) which we trap, providing us an opportunity to reset
the Tooltips for the form that we were setting permissions for:

private void ManagePermissions_FormClosing(
    object sender, FormClosingEventArgs e )
{
    foreach ( Control c in workingForm.Controls )
    {
        if ( c is MenuStrip )
        {
            MenuStrip ms = c as MenuStrip;
            RestoreMenuStripToolTips( ms.Items );
        }
    }

    formToolTip1.Active = true;
    formToolTip2.Active = false;
}

The menu item tool tips are restored through the recursive method RestoreMenuStripToolTips:

private void RestoreMenuStripToolTips(
    ToolStripItemCollection toolStripItems )
{
    foreach ( ToolStripMenuItem mi in toolStripItems )
    {
        if ( mi.DropDownItems.Count > 0 )
        {
            RestoreMenuStripToolTips( mi.DropDownItems );
        }

        if ( oldMenuToolTips.ContainsKey( mi.Name ) )
        {
            mi.ToolTipText = oldMenuToolTips[mi.Name];
        }
        else
        {
            mi.ToolTipText = string.Empty;
        } // end else
    } // end foreach
} // end RestoreMenuStripToolTips

RestoreMenuStripToolTips recurses down to leaf menu items and then retrieves their values from
the dictionary into which we stashed them in ShowToolStripItems, which we called from
ShowControls, which was called from the constructor.
Form closing then makes our special ToolTips object inactive and reactivates the normal ToolTips
object and the form is back to normal. The database is fully updated and it is up to the form designer
to check the ControlsToRoles table to ensure that the current user's role does not prohibit displaying
or enabling any given control.
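To make that last step a little more concrete, here is a minimal C# sketch of how a form might look up
the restriction for the current user's role once the ControlsToRoles rows have been read into memory.
The ControlRestriction type, the RestrictionLookup class and the sample values are hypothetical and are
not part of the article's download; loading the rows from the database is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory copy of one ControlsToRoles row.
public class ControlRestriction
{
    public int RoleID;
    public string PageName;
    public string ControlID;
    public bool Invisible;
    public bool Disabled;
}

public static class RestrictionLookup
{
    // Returns the row stored for this role/page/control combination, or null
    // when no row exists (the control is then left visible and enabled).
    public static ControlRestriction Find(
        IEnumerable<ControlRestriction> rows,
        int roleID, string pageName, string controlID)
    {
        foreach (ControlRestriction row in rows)
        {
            if (row.RoleID == roleID &&
                row.PageName == pageName &&
                row.ControlID == controlID)
            {
                return row;
            }
        }
        return null;
    }

    public static void Main()
    {
        List<ControlRestriction> rows = new List<ControlRestriction>();
        rows.Add(new ControlRestriction
        {
            RoleID = 2, PageName = "Welcome", ControlID = "btnDelete",
            Invisible = false, Disabled = true
        });

        ControlRestriction found = Find(rows, 2, "Welcome", "btnDelete");

        // In a real form these two values would be assigned to the
        // control's Visible and Enabled properties.
        bool visible = found == null || !found.Invisible;
        bool enabled = found == null || !found.Disabled;
        Console.WriteLine("visible={0}, enabled={1}", visible, enabled);
    }
}
```

Because a role has at most one row per control, a single lookup is enough to decide both the visible
and the enabled state.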

The Debugger is your friend


Rather than creating the forms from scratch, an effective way to understand this project is to
download the source and put it in the debugger. The focus is not on the two main forms (which are
implemented only enough to provide a context for the control-based security) but rather on
ManagePermissions.cs and ManageRoles.cs.
The interaction between the two Management forms and the underlying database (ControlSecurity)
is where all the action is, and understanding the code-behind for these pages is critical. (The database
and its stored procedure can be created by running ControlSecurity.sql.)
Within ManagePermissions.cs, pay particular attention to the manipulation of the ToolTips and
also to ShowControls and the invocation of the stored procedure spInsertNewControlToRole.
Similarly, in ManageRoles.cs, pay particular attention to the invocation of spInsertNewUserInRole
and make sure you are comfortable with how these relationships are created and what they do.

Wrap-up
The goal of this article was not to provide a complete solution but rather to demonstrate an
approach that finds all the controls on a page, takes over the tool tips to help identify each control
by name at run time, and stores the permissions in a database.
This is an approach I've used with some success for clients who need control-by-control security in
their applications.
Since I've left you with work to do to create a full, working application, let me compensate by saying
that I also leave you with support. Please post any questions you may have, or difficulties that you
encounter, in the comments at the end of this article and/or on the support forum on my web site
(clicking on books, and then clicking on "Free Support Forum").

BUILDING ACTIVE DIRECTORY WRAPPERS IN .NET


09 February 2007
by Jeff Hewitt
From authenticating application users against Active Directory, to programmatically adding users to
Active Directory groups, it seems that a developer in a Microsoft supported environment is never too
far away from Active Directory. At least this has been true in my experience. In recent years, there
have been many times when I've needed to read or modify values in an Active Directory repository.
Each time, I've found myself going back to previous projects or resorting to internet searching, in
order to re-learn the steps and components involved.
Eventually, I decided to build a .NET library containing wrappers and components that would allow
me to interact with Active Directory without having to remember each time how everything in the
System.DirectoryServices namespace works.
This article will explain pieces of this library by walking through how to convert four commonly used
Active Directory data types into data types that can be used in a .NET application. It will also explain
how to:
• Retrieve a user from Active Directory using the System.DirectoryServices namespace
• Read the user's properties
• Commit any changes back to the Active Directory repository.
The source code for this article (see the Code Download link in the box to the right of the article
title) contains the full .NET 2.0 solution, written in Visual Basic.NET. This solution contains two
projects: A class library called SimpleTalk_ADDataWrappers which contains the four wrapper
classes mentioned above and a console application called TestConsole which retrieves a user from an
Active Directory repository and demonstrates how each of the four wrappers can be used. In this
article, the wrapper objects and the console application are explained in detail.

The problem with values in Active Directory


At first glance, building this library seemed pretty easy, but I quickly hit my first road block. When
you retrieve an object from Active Directory there are no strongly typed properties or IntelliSense to
help you get to the information you need. For example, you can't type user.displayName to get the
string representing the user's display name. Once you get the user object from the repository, you
would access the user's display name as user.Properties("displayName") which returns an array of
objects. So, not only do you have to know that there is a property on the user object called
displayName but also what data type you're expecting it to return, and whether you should expect
multiple values or just need to look at the first position in the array.
Having finally mastered displayName, what if you wanted to know when the user had last logged
onto the network? You would start by getting the property named lastLogon, which will also return
an array of objects. However, the returned data type is a COM object of type
Interop.ActiveDs.IADsLargeInteger, which is a large integer object with a high and low part
representing a date and time. To make easy use of this data it would need to be converted into a Date
object, and if you wanted to save it back to the repository it would need to be converted back again.
Aside from IADsLargeInteger, there are three more values that are returned from Active Directory
that require some sort of conversion if you want to leverage them in a .NET application:
1. The object GUID
2. The object SID
3. The user account control value.
First, let's look at how to convert the IADsLargeInteger into something that can be easily used.

IADsLargeInteger wrapper
The ADDateTime class in our .NET 2 library wraps the IADsLargeInteger object. Several Active
Directory schema properties including accountExpires, badPasswordTime, lastLogon,
lastLogoff and passwordExpirationDate return IADsLargeInteger values, all of which can be
wrapped using the ADDateTime object.
The wrapper exposes the following three members.

Sub New(ByVal ADsLargeInteger As IADsLargeInteger)

The constructor for the wrapper is straightforward and accepts the IADsLargeInteger value from
the Active Directory repository.

Public ReadOnly Property ADsLargeInteger() As IADsLargeInteger

This read-only property is also fairly straightforward. It exposes the IADsLargeInteger and is used
for getting the value back from the wrapper when saving it to the repository.

Public Property StandardDateTime() As DateTime

This property exposes the ADsLargeInteger value as a standard .NET DateTime object. The
conversions to the underlying IADsLargeInteger are done when the property is invoked (both
getting and setting), ensuring that the latest version of the IADsLargeInteger will be returned when
reading this property, as well as when invoking the read-only ADsLargeInteger property.
The code for the StandardDateTime() property is listed below:

Public Property StandardDateTime() As DateTime
    Get
        Return Me.IADsLargeIntegerToDateTime(Me._ADLI)
    End Get
    Set(ByVal Value As DateTime)
        Me._ADLI = Me.DateTimeToIADsLargeInteger(Value)
    End Set
End Property

The private member _ADLI is the underlying IADsLargeInteger value. The two methods that this
property uses are private methods of the ADDateTime class. These methods convert between
IADsLargeInteger values and standard DateTime objects using calls to unmanaged code, and can
be found in the source code region IADsLargeInteger CONVERSION METHODS.
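
The arithmetic behind such a conversion is short once the two halves are combined. The following C#
sketch (the article's library itself is written in Visual Basic) assumes that the high and low parts have
already been read from the COM object as two 32-bit integers and that the combined value is a Windows
file time, and it uses the framework's DateTime.FromFileTimeUtc and ToFileTimeUtc methods rather than
the unmanaged calls used in the download. The class and method names are made up for illustration.

```csharp
using System;

public static class LargeIntegerSketch
{
    // Combine the two 32-bit halves into one 64-bit value. The low part is
    // treated as unsigned so that a set top bit does not corrupt the high part.
    public static long ToInt64(int highPart, int lowPart)
    {
        return ((long)highPart << 32) | unchecked((uint)lowPart);
    }

    // Interpret the combined value as a file time (100-nanosecond intervals
    // since 1 January 1601, UTC).
    public static DateTime ToDateTimeUtc(int highPart, int lowPart)
    {
        return DateTime.FromFileTimeUtc(ToInt64(highPart, lowPart));
    }

    // Split a DateTime back into the two halves expected by the large integer.
    public static void Split(DateTime value, out int highPart, out int lowPart)
    {
        long fileTime = value.ToFileTimeUtc();
        highPart = (int)(fileTime >> 32);
        lowPart = unchecked((int)fileTime);
    }

    public static void Main()
    {
        DateTime original = new DateTime(2007, 2, 9, 12, 30, 0, DateTimeKind.Utc);
        int high, low;
        Split(original, out high, out low);
        DateTime roundTripped = ToDateTimeUtc(high, low);
        Console.WriteLine(roundTripped == original);
    }
}
```

Some attributes use sentinel values. For accountExpires, for instance, 0 and 0x7FFFFFFFFFFFFFFF both
mean that the account never expires, and the latter is outside the range that FromFileTimeUtc accepts,
so a real wrapper would need to check for it before converting.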

ObjectGuid wrapper
The ADObjectGuid object wraps identifier byte arrays returned from Active Directory. Every
schema object in Active Directory, including users and groups, is uniquely identified using a string of
bytes. This value is returned by the Active Directory schema property objectGuid. Because the
identifier values should not be modified, only read-only properties are exposed on this class. The
ADObjectGuid object exposes four members. Firstly, the constructor accepts the 128-bit (16-byte)
array returned from the Active Directory repository.

Sub New(ByVal bytes As Byte())
    Me._bytes = bytes
End Sub

The read only property, bytes, returns the byte array as it was passed into the constructor.

Public ReadOnly Property bytes() As Byte()
    Get
        Return Me._bytes
    End Get
End Property

The read only property, guid, returns the byte array in the form of a Guid.

Public ReadOnly Property guid() As Guid
    Get
        Return New Guid(Me._bytes)
    End Get
End Property

The read only property, splitOctetString, returns the identifier byte array as an octet string with each
byte displayed as a hexadecimal representation and delimited by a '\' character. This format is required
when using the System.DirectoryServices.DirectorySearcher to search for Active Directory
objects by the objectGUID schema property.

Public ReadOnly Property splitOctetString() As String
    Get
        Dim iterator As Integer
        Dim builder As StringBuilder
        Dim values() As Byte = Me._bytes

        builder = _
            New StringBuilder((values.GetUpperBound(0) + 1) * 2)
        For iterator = 0 To values.GetUpperBound(0)
            builder.Append("\" & values(iterator).ToString("x2"))
        Next

        Return builder.ToString()
    End Get
End Property
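
As an illustration of how the split octet string ends up in a search filter, here is a small C# sketch
(the library itself is Visual Basic). The class and method names are hypothetical; only the
(objectGUID=...) filter shape and the backslash-prefixed hexadecimal bytes come from the description
above. Note that Guid.ToByteArray returns the first three groups of the Guid in little-endian order, so
the bytes in the filter do not appear in the same order as the digits in the Guid's string form.

```csharp
using System;
using System.Text;

public static class GuidFilterSketch
{
    // Render each byte as a backslash followed by two hexadecimal digits.
    public static string ToSplitOctetString(byte[] bytes)
    {
        StringBuilder builder = new StringBuilder(bytes.Length * 3);
        foreach (byte b in bytes)
        {
            builder.Append('\\').Append(b.ToString("x2"));
        }
        return builder.ToString();
    }

    // Build a filter that matches a single object by its objectGUID.
    public static string BuildObjectGuidFilter(Guid id)
    {
        return "(objectGUID=" + ToSplitOctetString(id.ToByteArray()) + ")";
    }

    public static void Main()
    {
        Guid id = new Guid("00112233-4455-6677-8899-aabbccddeeff");
        // Prints (objectGUID=\33\22\11\00\55\44\77\66\88\99\aa\bb\cc\dd\ee\ff)
        Console.WriteLine(BuildObjectGuidFilter(id));
    }
}
```

The resulting string could be passed as the filter of a DirectorySearcher in the same way as the
username filter used later in this article.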

ObjectSid wrapper
The ADObjectSid object is used to wrap the value of an Active Directory object's objectSid schema
property. It is very similar to the ADObjectGuid. In fact, they both have the same constructor and
all of the same properties, except that ADObjectSid does not have a guid property. That's because a
SID is a variable-length byte array (typically 224 bits for a domain account, rather than 128) and so
cannot be converted to the Guid data type.
When an Active Directory object is created, it is assigned a SID value by the system. This value can
subsequently be changed by the system but once it changes the system will never again reuse the old
value with a different object. Old values are stored in the object's schema property, sidHistory, which
returns an object array of SID byte arrays.
Like the ADObjectGuid, the splitOctetString property of the ADObjectSid can also be used in
search filters when searching for objects by the object's SID value. This value is also often used when
searching for objects by association as the association often references this value.
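
For display purposes a SID is usually shown in its S-1-5-21-... string form rather than as raw bytes.
The C# sketch below decodes that form by hand from the binary layout (a revision byte, a sub-authority
count, a 6-byte big-endian authority, then little-endian 32-bit sub-authorities). It is an illustration
only, with hypothetical names, and is not part of the ADObjectSid class; on Windows the framework's
System.Security.Principal.SecurityIdentifier class can do the same job.

```csharp
using System;
using System.Text;

public static class SidSketch
{
    public static string ToSidString(byte[] sid)
    {
        if (sid == null || sid.Length < 8)
        {
            throw new ArgumentException("A SID needs at least 8 bytes.");
        }

        int revision = sid[0];
        int subAuthorityCount = sid[1];
        if (sid.Length != 8 + 4 * subAuthorityCount)
        {
            throw new ArgumentException("Length does not match the sub-authority count.");
        }

        // Bytes 2..7 hold the identifier authority, most significant byte first.
        long authority = 0;
        for (int i = 2; i < 8; i++)
        {
            authority = (authority << 8) | sid[i];
        }

        StringBuilder builder = new StringBuilder();
        builder.Append("S-").Append(revision).Append('-').Append(authority);

        // Each sub-authority is a 32-bit value, least significant byte first.
        for (int i = 0; i < subAuthorityCount; i++)
        {
            int offset = 8 + 4 * i;
            uint subAuthority = unchecked((uint)(
                sid[offset] |
                (sid[offset + 1] << 8) |
                (sid[offset + 2] << 16) |
                (sid[offset + 3] << 24)));
            builder.Append('-').Append(subAuthority);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        // A made-up 28-byte SID with five sub-authorities.
        byte[] sid = new byte[]
        {
            1, 5, 0, 0, 0, 0, 0, 5,
            21, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 0xF4, 1, 0, 0
        };
        Console.WriteLine(ToSidString(sid)); // S-1-5-21-1-2-3-500
    }
}
```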

UserAccountControl wrapper
The ADUserAccountControl object wraps the value of the Active Directory schema property,
userAccountControl. The value is simply an integer that represents several different common
account control flags. Once you know what the flags are and their values, you only need to perform
bitwise operations on the value to set the flag or see if the flag is set. The following snippet from the
ADUserAccountControl class is the enumeration of the available flags with their values.

Public Enum enumUserAccountControlFlag
    SCRIPT = &H1
    ACCOUNT_DISABLED = &H2
    HOMEDIR_REQUIRED = &H8
    LOCKED_OUT = &H10
    PASSWD_NOT_REQD = &H20
    PASSWD_CANT_CHANGE = &H40
    ENCRYPTED_TEXT_PASSWD_ALLWD = &H80
    TEMP_DUPLICATE_ACCT = &H100
    NORMAL_ACCOUNT = &H200
    INTERDOMAIN_TRUST_ACCT = &H800
    WORKSTATION_TRUST_ACCT = &H1000
    SERVER_TRUST_ACCT = &H2000
    PASSWD_NO_EXPIRE = &H10000
    MNS_LOGON_ACCT = &H20000
    SMART_CARD_REQD = &H40000
    TRUSTED_FOR_DELEGATION = &H80000
    NOT_DELEGATED = &H100000
    USE_DES_KEY_ONLY = &H200000
    PREAUTH_NOT_REQD = &H400000
    PASSWD_EXPIRED = &H800000
    TRUSTED_TO_AUTH_FOR_DELEGATION = &H1000000
End Enum

Most of the rest of the class contains public properties, one for each flag above, to set the flag or see
if the flag is set. As an example, below is the code for the accountLockedOut property, which wraps
the LOCKED_OUT flag.

Public Property accountLockedOut() As Boolean
    Get
        Return Me.isFlagSet(enumUserAccountControlFlag.LOCKED_OUT)
    End Get
    Set(ByVal value As Boolean)
        Me.updateFlag(enumUserAccountControlFlag.LOCKED_OUT, value)
    End Set
End Property

The bitwise operations actually take place in two convenience methods that can be seen used above:
• isFlagSet, which takes an enumUserAccountControlFlag and returns a Boolean value
indicating whether or not the flag is set
• updateFlag which takes an enumUserAccountControlFlag and a Boolean value, true
to set the flag and false to remove it.
Every other flag property of the ADUserAccountControl class is implemented this way as well.
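
The bodies of the two convenience methods are not listed here, but the bitwise operations they rely
on can be sketched as follows. This is a C# illustration with hypothetical names that works on a plain
integer rather than on the wrapper's private state, so it shows the technique rather than the article's
actual implementation. The two constants are taken from the enumeration above.

```csharp
using System;

public static class UserAccountControlSketch
{
    public const int ACCOUNT_DISABLED = 0x2;
    public const int NORMAL_ACCOUNT = 0x200;

    // A flag is set when all of its bits are present in the value.
    public static bool IsFlagSet(int value, int flag)
    {
        return (value & flag) == flag;
    }

    // OR the flag in to set it; AND with its complement to clear it.
    public static int UpdateFlag(int value, int flag, bool set)
    {
        return set ? (value | flag) : (value & ~flag);
    }

    public static void Main()
    {
        int uac = NORMAL_ACCOUNT;                        // 0x200
        uac = UpdateFlag(uac, ACCOUNT_DISABLED, true);   // 0x202
        Console.WriteLine(IsFlagSet(uac, ACCOUNT_DISABLED));
        uac = UpdateFlag(uac, ACCOUNT_DISABLED, false);  // back to 0x200
        Console.WriteLine(IsFlagSet(uac, ACCOUNT_DISABLED));
    }
}
```

Clearing with AND NOT, rather than toggling with XOR, means that removing a flag which is already
clear leaves the value unchanged.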

Using the wrapper classes


The TestConsole project included with this article demonstrates how a user can be retrieved from an
Active Directory repository, and how the wrapper objects examined above can be leveraged to make
useful information out of the data returned from the user's schema properties.
The TestConsole project has the following three references (beyond the default references included
when the project is first created):
1. System.DirectoryServices is used by the application to interact with the Active Directory
repository
2. Interop.ActiveDs contains many of the data types returned from Active Directory
3. SimpleTalk_ADDataWrappers is the class library containing the wrapper objects described
earlier in this article
The Main method of the TestConsole application starts by setting up some configuration variables
that will be used during the execution.

Dim repositoryPath As String = "LDAP://yourRepositoryPathGoesHere"
Dim username As String = "usernameOfUserToQuery"
Dim filter As String = _
    "(&(objectClass=user)(sAMAccountName=" & username & "))"
Dim ADUsername As String = "ADUsername"
Dim ADPassword As String = "ADPassword"

The repositoryPath string will specify the path to your Active Directory repository. The username
string contains the domain username of the user you are going to query. The filter specifies the
"query", if you will, that will be executed to find this user. The ADUsername and ADPassword
strings specify the credentials for the user that will be used when binding to the user entry that you
are searching for in the Active Directory repository.
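
Because the filter is built by concatenating the username into a string, a username that comes from
user input should have LDAP's special characters escaped first. The C# sketch below shows one way
to do that, following the escaping rules for search filters in RFC 4515. The class and method names are
hypothetical and this step is not part of the article's TestConsole code.

```csharp
using System;
using System.Text;

public static class LdapFilterSketch
{
    // Replace the characters that have special meaning in an LDAP search
    // filter with a backslash followed by their two-digit hexadecimal code.
    public static string Escape(string value)
    {
        StringBuilder builder = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': builder.Append("\\5c"); break;
                case '*': builder.Append("\\2a"); break;
                case '(': builder.Append("\\28"); break;
                case ')': builder.Append("\\29"); break;
                case '\0': builder.Append("\\00"); break;
                default: builder.Append(c); break;
            }
        }
        return builder.ToString();
    }

    // Same filter shape as the one used in the article.
    public static string BuildUserFilter(string username)
    {
        return "(&(objectClass=user)(sAMAccountName=" + Escape(username) + "))";
    }

    public static void Main()
    {
        Console.WriteLine(BuildUserFilter("jsmith"));
        Console.WriteLine(BuildUserFilter("*)(objectClass=*"));
    }
}
```

Without the escaping step, the second call in Main would produce a filter that matches far more than
one user.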
In the typical domain environment, your domain credentials will give you access to bind to your own
user object entry in the Active Directory repository. In other words, depending on your domain's
security settings, you may not have access to query any other username in the repository but your
own. Therefore, if you run into any problems running the example code, try setting the value of
username to your domain username and the values for ADUsername and ADPassword to your
domain username and password.
NOTE:
It's worth mentioning at this point that although rare, depending on your domain's
security settings, this application may not work at all for your credentials. If you continue
to experience problems after tweaking the configuration variables a few times, you may
need to contact your system administrator to gain the necessary access.
Once the configuration variables have the correct values, the application can initialize the directory
service objects.

Dim repositoryRootEntry As New _
    DirectoryServices.DirectoryEntry(repositoryPath, _
    ADUsername, ADPassword)

Dim directorySearcher As New _
    DirectoryServices.DirectorySearcher(repositoryRootEntry, filter)

The repositoryRootEntry will be the starting location in the search. In my case, when I instantiated
the repositoryRootEntry, I passed in the path to the root of my domain's Active Directory tree and
the administrator's username and password. The directorySearcher is used in executing searches
against an entry, in our case the repositoryRootEntry using the specified filter.
Next, the application executes the search by invoking the directorySearcher's FindOne method to
return a DirectoryServices.SearchResult object:

Dim result As DirectoryServices.SearchResult = _
    directorySearcher.FindOne()

The directorySearcher also exposes a FindAll method which returns a
DirectoryServices.SearchResultCollection. The FindOne method is used in this case because
there is only one user in the repository that has the specified username. If you expect exactly one
result, use FindOne; otherwise use FindAll.
Also, note that the rest of the code is wrapped in a Try/Catch block, because from this point on if
anything is going to go wrong it will happen once the connection to the repository is made.
If the result is not null, then the directorySearcher succeeded in finding the user, which will be
returned as a DirectoryServices.DirectoryEntry.

Dim user As DirectoryServices.DirectoryEntry = result.GetDirectoryEntry

The remaining four method calls test the four wrapper classes described earlier in the article. Since
they are all similar, let's look at testADObjectGuid in detail.

' object guid
Dim value As Object = getProperty(user, "objectGUID")
If Not value Is Nothing Then
    Dim wrapper As New ADObjectGuid(value)
    Console.WriteLine("User's Guid Identifier:" & _
        ControlChars.Tab & ControlChars.Tab & _
        ControlChars.Tab & wrapper.guid.ToString)
    Console.WriteLine("User's Split Octet Identifier:" & _
        ControlChars.Tab & ControlChars.Tab & wrapper.splitOctetString.ToString)
Else
    Console.WriteLine("Something is wrong - this user has no unique identifier.")
End If

It may look like more code than necessary, but that's because the ControlChars end up bloating the
Console.WriteLine lines. First, the value is retrieved using another method in the module,
getProperty, which takes the user DirectoryEntry and the name of the property to retrieve, in this
case, objectGUID. If getProperty returns nothing then a message is printed. Otherwise, the value
object is loaded into a new ADObjectGuid wrapper object. Finally, the method writes the Guid and
the split octet string representations of the identifier to the console.
The getProperty function is used to retrieve the value because it takes a few lines of code to get just
one value from an entry. That's because, as mentioned earlier, when you request a property from a
DirectoryEntry it returns an array of objects. All of the properties being requested in this module
are only expected to return one value. So, this function extracts that value from the first index of the
array returned. The code for the getProperty function is listed below.

Private Function getProperty _
    (ByVal user As DirectoryServices.DirectoryEntry, _
    ByVal propertyName As String) As Object
    If user.Properties.Contains(propertyName) Then
        Dim properties As DirectoryServices.PropertyValueCollection = _
            user.Properties(propertyName)
        Return properties(0)
    End If
    Return Nothing
End Function

Notice that before the function actually requests the value from the directoryEntry, it first checks to
see if the entry contains the property. Although the schema may support a property on the entry, if
the entry doesn't have a value for that property, it won't exist.
The properties that the entry has values for can be obtained by invoking
DirectoryEntry.Properties.PropertyNames, which returns a collection of strings naming the
properties that have values for the specific entry. There are actually several hundred
properties available for many types of DirectoryEntry objects. A list of the properties can be
obtained programmatically, including information about which properties are mandatory and which
are optional. However, that is outside the scope of this article; to find out more, go to:
http://msdn2.microsoft.com/en-us/library/ms675085.aspx.

Updating the Active Directory entry


If you have downloaded the solution, you may have already noticed that there is one more method
call commented out after the four tests. The updateUserEmailAddress method accepts the user
DirectoryEntry and a new email address string. Before you uncomment this method, it may be a
good idea to contact your network administrator to ensure that updating Active Directory properties
won't have adverse effects on the other systems running on the network. For example, there may be a
system on your network, like a spam filter or scheduled task that is expecting a certain user's email
address to be a certain value. If this value is changed, this system may not be able to send out a
notification to that user.
First, the method retrieves the user's email address, just as for the previous four tests, by calling the
getProperty function and passing in the user and the property name mail, which returns the user's
email address. Once it has the email address, it writes the current address to the console and then
writes the newEmailAddress to the console. Then the user's new email address is saved to the
repository using the following code.

user.Properties("mail")(0) = newEmailAddress
user.CommitChanges()

When using the wrapper classes, the values are saved back to the repository in the same way. Simply
replace the newEmailAddress value with the underlying value of the wrapper object. For example,
updating the accountExpires property of a user would look like this:

user.Properties("accountExpires")(0) = _
ADDateTimeWrapper.ADsLargeInteger
user.CommitChanges()

At this point, an exception may be thrown if the ADUsername and ADPassword do not belong to
a user with sufficient rights to update the entry's properties. Typically, as mentioned earlier, users have
sufficient rights to bind to and read from their own entries but not enough rights to commit changes
to the entry. Depending on your domain security settings you may experience varied results when
invoking this method.
To make sure that the change worked, if you have access to actually view the Active Directory users
and computers on your network, you will see that the email address for the given user has been
updated. If you don't have access, you can run the TestConsole a second time and see what email
address is written to the console before the email address is changed.

Things to keep in mind


While experimenting with this code, keep in mind that the scenarios described here may not work if
the account used to bind to the Active Directory entries does not have sufficient rights. Further,
even if the account has sufficient rights to read, it may not have sufficient rights to commit
changes. In the end, although the examples described in this article are pretty straightforward,
different users may see different results depending on the domain's security settings. For
example, by default, administrators are the only users on the domain with sufficient rights to update a
user entry. If this is the case on your domain and you are not an administrator, the
updateUserEmailAddress method will fail with an UnauthorizedAccessException when it
calls user.CommitChanges. For more information on best practices and how to modify Active
Directory security settings visit:
http://technet2.microsoft.com/WindowsServer/en/library/373a4e2b-89a6-4ccc-9e20-be07c741f47b1033.mspx?mfr=true
Also, not all properties can be directly updated. For example, using the ADUserAccountControl
wrapper to update the PASSWD_CANT_CHANGE flag will not actually change whether or not
the user's password can be modified. In addition, depending on the state of the entry (disabled,
password expired, etc.), some properties may be read only and although it may look like the changes
have been committed, they have not. I've discovered some of these anomalies from my own
experimentation and have tried to document the code where I have encountered these situations.

Conclusion – possible improvements or additions


I believe this code is a great foundation for anyone wanting to gain deeper expertise in leveraging
an Active Directory repository in a .NET application. However, as anyone reading this article can see,
there are several areas for improvement and addition. One glaring issue that I have struggled with is
the fact that, in various scenarios beyond insufficient access rights, this code may not work
as expected, or at all. Every domain is different, and different user states require different
implementations of the code described in this article. As mentioned in the previous section,
although some of these user states may be predictable, writing code that can detect and work through
them is outside the scope of this article.
Also, now that this article describes how to wrap different data types returned from Active Directory
repositories, it may be nice to have a wrapper object for wrapping an entire Active Directory user
entry or even a group entry.

INTEGRATING WITH WMI


17 October 2007
by James Moore

James shows how to add a simple WMI provider to a service so that you can monitor it, and make
changes to it, remotely across the network

If you are writing an application, such as a service, that you want to be able to monitor and configure
remotely then you'll want to ensure that your application integrates smoothly with Windows
Management Instrumentation (WMI). WMI is the Microsoft implementation of an industry standard
called Web-Based Enterprise Management (WBEM). It allows a user to access, and change,
management information from a wide variety of devices across the enterprise. The classes that
support WMI in the .NET Framework reside in the System.Management namespace, within the
framework’s class library.
In this article, we will first create a Windows Service that accepts a TCP connection and echoes back
any text typed into a telnet connection. We will then add some WMI code to our service in order to
enable it to publish information about the number of characters that have been echoed back. In other
words, we will turn our Windows Service into a WMI Provider.
Although this is a fairly simple project, it does highlight the key steps that are required to WMI-enable
your application.

Creating and installing the service


Create a new Windows Service project in Visual Studio and call it TestService. In Solution
Explorer, open up the file called Service1.cs in the code editor.
Add a new member variable called m_engineThread of type Thread:

Thread m_engineThread;

We then want to kick off this new listening thread during service startup:

protected override void OnStart(string[] args)
{
    m_engineThread = new Thread(new ThreadStart(ThreadMain));
    m_engineThread.Start();
}

And make sure we terminate it when the service is stopped:

protected override void OnStop()
{
    try
    {
        m_engineThread.Abort();
    }
    catch (Exception) { ;}
}

The code for ThreadMain is fairly simple; it just sets up a TcpListener and accepts connections. It
then echoes back each line received over the TCP connection, until a single "." is received on a line by
itself:

public void ThreadMain()
{
    // Setup the TCP Listener to bind to 127.0.0.1:50009
    IPAddress localAddr = IPAddress.Parse("127.0.0.1");
    TcpListener tlistener = new TcpListener(localAddr, 50009);
    try
    {
        // Start listening
        tlistener.Start();
        String data = null;
        // Enter processing loop
        while (true)
        {
            // Block until we get a connection
            TcpClient client = tlistener.AcceptTcpClient();

            data = null;

            // Get a stream object and
            // then create a StreamReader for convenience
            NetworkStream stream = client.GetStream();
            StreamReader sr = new StreamReader(stream);

            // Read a line from the client at a time.
            while ((data = sr.ReadLine()) != null)
            {
                if (data == ".")
                {
                    break;
                }

                byte[] msg = System.Text.Encoding.ASCII.GetBytes(data);
                stream.Write(msg, 0, msg.Length);
                stream.WriteByte((byte)'\r');
                stream.WriteByte((byte)'\n');
            }

            // Shutdown and end connection
            client.Close();
        }
    }
    catch (SocketException e)
    {
        ;
    }
    finally
    {
        // Stop listening for new clients.
        tlistener.Stop();
    }
}
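
The echo loop at the heart of ThreadMain can also be exercised without a socket, which is handy when
experimenting. The sketch below is an illustration only and is not part of the article's service: it
restates the loop against a TextReader and TextWriter, and returns the number of characters it echoed.
The class and method names are hypothetical, and counting only the line content (not the line
terminators) is an assumption.

```csharp
using System;
using System.IO;

public static class EchoSketch
{
    // Echo each line from reader to writer until a line containing only "."
    // (or the end of the input) is reached. Returns the characters echoed.
    public static int EchoLines(TextReader reader, TextWriter writer)
    {
        int charsEchoed = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line == ".")
            {
                break;
            }

            writer.Write(line);
            writer.Write("\r\n");
            charsEchoed += line.Length;
        }
        writer.Flush();
        return charsEchoed;
    }

    public static void Main()
    {
        StringReader input = new StringReader("hello\r\nworld\r\n.\r\nignored\r\n");
        StringWriter output = new StringWriter();
        int count = EchoLines(input, output);
        Console.Write(output.ToString());
        Console.WriteLine(count); // 10
    }
}
```

In the service, the reader and writer would wrap the NetworkStream returned by the accepted
TcpClient.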

Finally, we need to get the service to install itself. To do this, add a project reference to
System.Configuration.Install.dll and then add a new class to your project, called MyInstaller.
This class should derive from Installer and be attributed with the RunInstallerAttribute:

[System.ComponentModel.RunInstaller(true)]
public class MyInstaller : Installer
{
…..

In the constructor of the MyInstaller class, we need the following code to install the service:

public MyInstaller()
{
    ServiceProcessInstaller procInstaller =
        new ServiceProcessInstaller();
    ServiceInstaller sInstaller = new ServiceInstaller();
    procInstaller.Account = ServiceAccount.LocalSystem;
    sInstaller.StartType = ServiceStartMode.Automatic;
    sInstaller.ServiceName = "Simple-Talk Test Service";
    Installers.Add(sInstaller);
    Installers.Add(procInstaller);
}

All this does is to ensure that the service installs correctly and appears in the services.msc control
panel.

Starting and Stopping the Service


Let's give it a go. Hit F6 to build the project and then run InstallUtil.exe on the resulting binary. In
my case, this is:
C:\Simple-Talk>InstallUtil.exe TestService.exe
You will see a large amount of text output. Once this has completed, hit Start->Run and type
services.msc. This will bring up the services control panel; scroll down until you find Simple-Talk
Test Service and start it.
Having started the service, we can now try it out. Hit Start->Run again and type: telnet 127.0.0.1
50009. This will open up a telnet window; anything you type will be echoed back to you when you hit
enter.
To close the connection, enter a "." on a line on its own.
We now need to stop the service, which you can do using the services control panel.

Adding WMI Support


We now want to add the WMI support to the service. As an example, we will publish the number of
characters which have been echoed back since the service started.
To WMI-enable our service, include a reference to System.Management.dll and then add a new
class to the project called EchoInfoClass. Attribute this class with the InstrumentationClass
attribute, with its parameter as InstrumentationType.Instance. Then, add a public field called
CharsEchoed of type int:

[InstrumentationClass(InstrumentationType.Instance)]
public class EchoInfoClass
{
    public int CharsEchoed;
}

The InstrumentationClass attribute specifies that the class provides WMI data; this WMI Data
can either be an instance of a class, or a class used during a WMI event notification. In this case, we
want to provide an instance of a class. Next, in order to WMI-enable our project, we need to modify
the installer class we wrote earlier so that it registers our WMI object with the underlying WMI
framework.
For safety, first run InstallUtil.exe /u against the binary we built before to uninstall the service.
Now, we can change the installer class so that it registers our WMI object correctly with the
underlying WMI subsystem. Luckily, the .NET Framework architects made this easy for us. There is a
class called DefaultManagementProjectInstaller in the framework that provides the default
installation code to register classes attributed with InstrumentationClass. To take advantage of this
we simply change the class MyInstaller to derive from DefaultManagementProjectInstaller rather
than Installer.

[System.ComponentModel.RunInstaller(true)]
public class MyInstaller : DefaultManagementProjectInstaller
{

We need to create and register an instance of EchoInfoClass on service startup. To do this, first add
a member variable to the service class:

EchoInfoClass m_informationClass;

Then, add the following code to your OnStart override:

protected override void OnStart(string[] args)
{
    // Create the WMI instance and make it visible to the WMI framework
    m_informationClass = new EchoInfoClass();
    m_informationClass.CharsEchoed = 0;
    Instrumentation.Publish(m_informationClass);

    m_engineThread = new Thread(new ThreadStart(ThreadMain));
    m_engineThread.Start();
}

This creates the class instance and registers it with the WMI framework so that it is accessible via
WMI. Once that is done we just use the class as normal.
We have now told the WMI framework about our class (via the installer) and published an instance
of it (in our OnStart method). Now, we just need to update the information we are publishing via
WMI. We do this by increasing the m_informationClass.CharsEchoed field whenever we echo
characters back to the client. Add the final line of the loop below to ThreadMain:

while ((data = sr.ReadLine()) != null)
{
    if (data == ".")
    {
        break;
    }

    byte[] msg = System.Text.Encoding.ASCII.GetBytes(data);
    stream.Write(msg, 0, msg.Length);
    stream.WriteByte((byte)'\r');
    stream.WriteByte((byte)'\n');

    // Publish the running total of echoed characters via WMI
    m_informationClass.CharsEchoed += msg.Length;
}

Testing the WMI Provider


We are now ready to give it a go and see if it all works! Build your application by hitting F6 and then
run InstallUtil again:

C:\Simple-Talk>InstallUtil.exe TestService.exe

Then just start the service and try it out:

C:\Simple-Talk>net start "Simple-talk test service"


C:\Simple-Talk>telnet 127.0.0.1 50009

The telnet command opens up a blank screen, waiting for you to type something in. I typed in
"simple-talk" and hit enter and the service duly echo'd back "simple-talk" to the screen.
So, the service returned 11 characters and, hopefully, our WMI provider worked correctly and
recorded that. Microsoft provides a WMI information browser called wbemtest – it's fairly ropey,
but it will do for now so open that up:

C:\Simple-Talk>wbemtest

Click connect, leave all the settings at their default value, and click OK:

Next click the Query… button and enter the following:

WQL returns instances of the classes we requested, rather than rows, and presents us with the
following screen:
NOTE:
WQL is very similar to SQL – in fact it is a subset of SQL and allows you to query
management information in a very similar way to an RDBMS. WQL generally returns
“instances” rather than rows. However, these can be thought of as analogous.

Double click on the instance of the class:

The CharsEchoed property shows us that 11 characters have been sent back from the service.

Summary
WMI is a wide ranging infrastructure on Windows (and other platforms) for managing machines and
programs across an enterprise. Although our example is fairly simple it should give you enough
information to be able to include WMI integration next time you are writing a service or website that
requires remote monitoring and management.
It is equally easy to consume WMI information from .NET; however, that topic can wait for
another article.
MAKE SURE YOUR .NET APPLICATIONS PERFORM


30 October 2007
by James Moore

James Moore is a developer and runs the .NET Tools division at Red Gate Software. Previously,
he developed the UI for SQL Compare 5 and SQL Data Compare 5, and was technical lead on
SQL Backup 5.

In today's world of high processor speeds and vast amounts of available memory, many
programmers seem to ignore the limits on these resources in their applications.
This often leads to scalability and performance problems that could be solved with very little
difficulty. This article discusses common pitfalls and problems when creating applications in .NET. It
will cover some nomenclature first and then present several common issues that I have come across
in a wide variety of applications. It also describes some common ways of solving these problems.
Future articles will delve deeper into some of the specific techniques or problems covered here.

Beware of optimising needlessly


You are better off writing readable code, and then focussing your efforts on optimising just those
parts that need it. This is just as true on Just In Time (JIT) compiled platforms, such as .NET, as with
native code.
If you over-optimise your code by hand, you risk limiting the compiler's ability to perform its own
optimisations, such as loop interchange. This can actually decrease performance, because the compiler
may select different optimisations depending on the platform.
NOTE:
See http://en.wikipedia.org/wiki/Compiler_optimization for information about the
wonderful things your compiler can do for you
In my experience, it is best to ensure you have a scalable architecture, and leave optimisation for a
subsequent stage in which you can identify, measure, and fix those performance hotspots. For a great
overview of different architectures and their different behaviours, get a copy of Computer
Architecture: A Quantitative Approach by John L. Hennessy and David A. Patterson.

A big Oh
It took me a long while to fully grasp big-O notation (also known as Big-Oh, Landau notation or
asymptotic notation), and how it could be used to understand the scalability of your code. It is well
worth taking the time to investigate it.
Basically, big-O notation is used to describe how an algorithm uses resources (time and memory)
depending on the size of its input. So, what do I mean if I say that a function has a time complexity
of O(n²)? I mean that the amount of time spent in the function grows roughly in proportion to the
square of the size of the input. When n is small, this is not necessarily significant, but this changes as
n grows.
Consider the following function:

1: static void SomeFunction(int[] array)
2: {
3:     Thread.Sleep(10000000);   // fixed cost: 10,000,000 ms, whatever the input size
4:
5:     for (int i = 0; i < array.Length; i++)
6:     {
7:         for (int j = 0; j < array.Length; j++)
8:         {
9:             Thread.Sleep(10);   // runs n * n times, 10 ms each
10:         }
11:     }
12: }

This function has a time complexity of O(n²). If we pass in an array of 1 element, then the function
will spend longest in the sleep on line 3 of our function. The same is true if we pass a 100 element
array. However, if we pass in a 1000 element array, the same amount of time is spent in line 9 as line
3.
As the number of elements in the array grows beyond 1000, the sleep on line 9 becomes more and
more dominant and the sleep on line 3 loses its influence on the run time of the function.
Below is a table outlining several common function complexities, illustrating how they grow with the
size of the input:
n          O(1)  O(lg n)  O(n)      O(n lg n)  O(n²)      O(n³)
1          1     0        1         0          1          1
10         1     3        10        33         100        1000
50         1     6        50        282        2500       125000
100        1     7        100       664        10000      1000000
1000       1     10       1000      9966       1000000    1000000000
10000      1     13       10000     132877     100000000  10¹²
1000000    1     20       1000000   19931569   10¹²       10¹⁸
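The figures in the table can be reproduced with a few lines of C#. This is only an illustrative sketch: GrowthTable and Row are names invented for this example, logarithms are taken to base 2, and the logarithmic terms are rounded to the nearest whole number. Strictly speaking, lg 1 is 0, so that is what the code returns for an input of size 1.

```csharp
using System;

// Illustrative sketch only: GrowthTable and Row are hypothetical names.
public static class GrowthTable
{
    // Returns { lg n, n, n lg n, n^2, n^3 } for an input of size n,
    // with the logarithmic terms rounded to the nearest whole number.
    public static double[] Row(long n)
    {
        double lg = Math.Log(n, 2);
        return new double[]
        {
            Math.Round(lg),
            n,
            Math.Round(n * lg),
            (double)n * n,
            (double)n * n * n
        };
    }
}
```

Calling GrowthTable.Row for each input size in turn gives one table row per call, which makes it easy to try other sizes.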
We have been talking about time complexities so far but it is also important to remember the same
can be true of memory usage as well.
You will often find that to decrease a function's time-complexity, you need to increase its memory
complexity. Can you come up with a function that has an O(1) time complexity (i.e. the function's
execution time does not vary however large the number given) and a different function with an O(1)
memory complexity, to tell if a number under 10000 is prime or not? What are the respective memory
and time complexities of both of these functions?
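One possible answer to this exercise is sketched here; it is certainly not the only one, and the class and method names are invented for illustration. IsPrimeByLookup pays O(N) memory for a table built once with the Sieve of Eratosthenes, after which every query takes O(1) time. IsPrimeByDivision needs only O(1) memory, but each call takes time proportional to the square root of the number being tested.

```csharp
using System;

// Illustrative sketch: class and method names are hypothetical.
public static class PrimeCheck
{
    private const int Limit = 10000;

    // Lookup table built once with the Sieve of Eratosthenes.
    // Costs O(Limit) memory, but each later query is O(1) time.
    private static readonly bool[] s_isComposite = BuildSieve();

    private static bool[] BuildSieve()
    {
        bool[] composite = new bool[Limit];
        for (int i = 2; i * i < Limit; i++)
        {
            if (composite[i]) continue;
            for (int j = i * i; j < Limit; j += i)
            {
                composite[j] = true;
            }
        }
        return composite;
    }

    // O(1) time, O(Limit) memory. Only valid for n below Limit.
    public static bool IsPrimeByLookup(int n)
    {
        return n >= 2 && !s_isComposite[n];
    }

    // O(1) memory, O(sqrt(n)) time: trial division.
    public static bool IsPrimeByDivision(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }
}
```

Which version is preferable depends on how often the function is called: the table only pays for itself if there are many queries.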

Get real
The problem is, back in the real world, that my function is 5000 lines of code, calls other functions
and has 20 parameters; there is no way I can figure out the time/memory complexity of my function!
Fortunately, you don’t need to figure it out by hand. The way that I do it is to use a profiler and run
the application with one set of inputs, and rank the functions in order of their execution times.
ANTS Profiler is my personal favourite as it supports line level timings. I then repeat the process,
varying the size of the inputs.
Next, I compare the two lists and look at the differences in function rank. Functions that have moved
up relative to the others are the ones that are likely to have scalability problems, and can be
investigated further. A note of caution about this: although you do not need to go to the extremes of
input size differences you do need to make sure they are large enough.
Think about the function introduced above; if I started with a 10 element array input and only tested
up to a 100 element array, then I would conclude that the sleep on line 3 is dominant. However, if I
pass in a 10 element array and a 10,000 element array then I will see the function’s rank change.

What to look for


In addition to the quick test outlined above, I have, over the years, learned the hard way about what to
avoid and what to look out for when solving performance problems.
Loops
The most obvious source of poor performance is loops, whereby the number of iterations around
the loop cannot be determined at compile time. These loops generally scale as a function of some
piece of input.
Watch out for places where you call a function in a loop, and where that function also has a
complexity of O(n) or above, as you can then end up with a function with a complexity of O(n²)
without realising it. This is probably the main cause of poor scalability in .NET applications.
How you fix the problem really depends on your application, but I always find that it's useful to think
about the following "rules" when trying to solve performance issues related to loops.
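As a small illustration of the hidden O(n²) pattern described above, the sketch below counts duplicate values in two ways. The class and method names are invented for this example. The first version looks like a single loop, but List<T>.Contains scans the list on every iteration; the second swaps in a HashSet<T>, whose Add method takes roughly constant time on average.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: class and method names are hypothetical.
public static class DuplicateCounter
{
    // Looks linear, but List<T>.Contains is itself O(n),
    // so the loop as a whole is O(n^2).
    public static int CountDuplicatesSlow(int[] items)
    {
        List<int> seen = new List<int>();
        int duplicates = 0;
        foreach (int item in items)
        {
            if (seen.Contains(item)) duplicates++;
            else seen.Add(item);
        }
        return duplicates;
    }

    // Same result, but HashSet<T>.Add is O(1) on average
    // (it returns false when the item is already present),
    // so the loop is O(n) overall.
    public static int CountDuplicatesFast(int[] items)
    {
        HashSet<int> seen = new HashSet<int>();
        int duplicates = 0;
        foreach (int item in items)
        {
            if (!seen.Add(item)) duplicates++;
        }
        return duplicates;
    }
}
```

For a handful of items the difference is invisible; it only shows up when the input grows, which is exactly why this kind of problem is easy to miss.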

Do less work
This seems fairly obvious, but I always ask myself if there is a way to move some of the processing
out of the function, and pre-compute or cache some of the information I am computing. This often
comes at the expense of using some extra memory, so be careful!

Divide and conquer


Is there some way I can split the input into parts, do the heavy lifting on the relevant part, and then
merge the results back together again? A prime example of this sort of algorithm is merge sort.
If you can rewrite your algorithm so that the “heavy lifting” is O(n), then your algorithm should end
up with a complexity of O(n lg n).
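A minimal merge sort sketch shows the shape of such an algorithm. The names are invented for this example, and the code favours clarity over speed (it allocates new arrays at every level). The input is halved at each level of recursion, giving about lg n levels, and the merge at each level does O(n) work, hence O(n lg n) overall.

```csharp
using System;

// Illustrative sketch of merge sort; names are hypothetical.
public static class MergeSortSketch
{
    public static int[] Sort(int[] input)
    {
        if (input.Length <= 1) return (int[])input.Clone();

        // Divide: split the input into two halves and sort each one.
        int mid = input.Length / 2;
        int[] left = new int[mid];
        int[] right = new int[input.Length - mid];
        Array.Copy(input, 0, left, 0, mid);
        Array.Copy(input, mid, right, 0, right.Length);
        left = Sort(left);
        right = Sort(right);

        // Combine: the O(n) "heavy lifting" is merging two sorted halves.
        int[] merged = new int[input.Length];
        int i = 0, j = 0, k = 0;
        while (i < left.Length && j < right.Length)
        {
            merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.Length) merged[k++] = left[i++];
        while (j < right.Length) merged[k++] = right[j++];
        return merged;
    }
}
```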

Modelling
Can you model your problem in a similar way to another problem? Does your problem resemble a
sort or is it more like a flow optimization? If so, then find an algorithm which performs well and
adapt it to solve your problem. If you have never looked at it then I would suggest getting a copy of
"Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein. It sits on my desk and is well
thumbed. I find it much easier to follow than Knuth.

Dynamic programming
You should consider using dynamic programming when your problem can be split into multiple
sub-problems; however, these sub-problems are not independent. In other words, each sub-problem
shares a sub-sub-problem with other sub-problems, as illustrated below:

A divide and conquer approach will do more work than is necessary in these cases and can be
optimised by storing the solutions to these sub-sub-problems in a lookup table, computing them the
first time they are used, and then reusing the result next time rather than computing it again.
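The lookup-table idea can be sketched with the Fibonacci numbers, used here only as a stand-in for any problem with overlapping sub-problems. The names are invented for this example, and the cache is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: Fibonacci stands in for any problem whose
// sub-problems overlap. Names are hypothetical.
public static class FibonacciSketch
{
    private static readonly Dictionary<int, long> s_cache =
        new Dictionary<int, long>();

    // Plain recursion: Naive(n - 2) is recomputed inside Naive(n - 1),
    // so the running time grows exponentially with n.
    public static long Naive(int n)
    {
        return n < 2 ? n : Naive(n - 1) + Naive(n - 2);
    }

    // Lookup-table version: each sub-problem is solved once, stored,
    // and reused, giving O(n) time for O(n) extra memory.
    public static long Memoised(int n)
    {
        if (n < 2) return n;
        long cached;
        if (s_cache.TryGetValue(n, out cached)) return cached;
        long result = Memoised(n - 1) + Memoised(n - 2);
        s_cache[n] = result;
        return result;
    }
}
```

Both methods return the same values; the memoised one simply refuses to solve the same sub-problem twice.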
Translate the problem


The implementation of a solution to a problem is often dictated by the way the data is presented and
stored – can you change the way that the data is stored? Does this make solving your problem any
easier?

If all else fails, fake it


Sometimes there is nothing you can do about performance. After all, you can only factor a large
number so quickly! In these cases you need to make sure that you tell your user that you are making
progress and allow them to cancel the operation. Users are willing to wait for a while for results as
long as you let them know!
Usability is hugely important. When you cannot do anything about performance, have a look at Jakob
Nielsen’s website http://www.useit.com/ for lots of information about usability.

Blocking functions
If your application is interactive, as is the case with a website or windows forms application, then be
very careful when calling blocking functions, for example when making a web service request, or
executing a query. If the web service runs on a remote machine, or the query has not been properly
tuned, then never run these on your main thread. Use the ThreadPool class or BackgroundWorker
class and show a progress indicator using the main thread.
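A minimal sketch of the ThreadPool approach might look like this. SlowQuery is a made-up stand-in for the web service call or database query, and all names are assumptions for illustration. The sketch covers only the hand-off to a worker thread; progress display and cancellation are left out.

```csharp
using System;
using System.Threading;

// Illustrative sketch: SlowQuery stands in for a web service call or
// database query. All names here are hypothetical.
public static class BackgroundCall
{
    public static int SlowQuery()
    {
        Thread.Sleep(200); // simulate a blocking call
        return 42;
    }

    // Queues the blocking call on a thread-pool thread and returns at once.
    // onDone is invoked on the worker thread when the result is ready; a
    // Windows Forms application would marshal back to the UI thread (for
    // example with Control.Invoke) before touching any controls.
    public static void RunInBackground(Action<int> onDone)
    {
        ThreadPool.QueueUserWorkItem(delegate(object state)
        {
            onDone(SlowQuery());
        });
    }
}
```

In a Windows Forms application, BackgroundWorker is often the more convenient choice, because its completion event is raised back on the UI thread for you.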
Database queries are one of the most common causes of poor performance. They behave in a similar
way to function calls and their performance often depends on the number of records in the tables
accessed. Check that your queries scale properly by using a SQL Profiler and varying the number of
records in the tables. Also, examine your query execution plans and make sure that the indexes you
have on your tables are relevant and that your database’s statistics are up to date, so that the correct
indexes are used. Optimizing queries can be tricky and the exact optimisations to use differ from
query to query. You are generally best advised to talk to an expert. Luckily there are plenty willing
to give their advice for free at sites such as SQLServerCentral.com.
In the past, rewriting a query and ensuring that the correct indexes are available has allowed me to
decrease a query's execution time from over 30 minutes to two or three seconds. It’s well worth
spending a few hours testing the scalability of your queries.

Threads can end up in knots


Another common problem that I have encountered involves optimisations that use multiple threads
to perform some processing. This can work really well, but if your threads access the same data
structures a lot then you need to be careful.
Each access should be mutually exclusive, enforced by mutexes. Otherwise, you may encounter race
conditions. However, the acquisition of these mutexes can, itself, be the cause of performance
problems. If a thread holds on to a mutex for too long then it will stop other threads accessing the
shared resource and cause them to block, possibly leading to performance issues.
Conversely, acquiring and releasing a mutex constantly has its own overhead and can slow program
performance down.
There are several solutions to this sort of problem (normally called a contention problem).

Batch it up
The first solution is to look at how your program accesses the data structure in question. Can you
batch up several accesses to the structure and only acquire the lock once? If your process is:
1. Acquire the lock
2. Do something
3. Release the lock
4. Repeat as necessary
You should investigate whether you can change it to:
1. Acquire the lock
2. Do something 10 times
3. Release the lock
When taking this approach, just be careful not to do too much at once, as this can block other
threads.
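The two sequences above can be sketched as follows, using C#'s lock keyword. SharedLog and its members are invented for this example. Add takes the lock once per item, while AddBatch takes it once for a whole group of items.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: a shared list guarded by a lock. Names are hypothetical.
public class SharedLog
{
    private readonly object m_sync = new object();
    private readonly List<string> m_lines = new List<string>();

    // One lock acquisition per item: simple, but the acquire/release
    // overhead is paid for every single line.
    public void Add(string line)
    {
        lock (m_sync)
        {
            m_lines.Add(line);
        }
    }

    // One lock acquisition per batch. Keep batches small enough that
    // other threads are not blocked for long.
    public void AddBatch(IList<string> batch)
    {
        lock (m_sync)
        {
            m_lines.AddRange(batch);
        }
    }

    public int Count
    {
        get { lock (m_sync) { return m_lines.Count; } }
    }
}
```

A worker thread would collect its lines in a private list, which needs no locking, and hand them over with a single AddBatch call.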

It's good to share


If two or more threads are trying to read from the same data structure at the same time then you
might want to look at using a ReaderWriterLock class rather than C#’s native lock keyword.
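A minimal sketch of the ReaderWriterLock pattern, assuming a read-mostly dictionary; PriceCache and its members are invented for this example. Any number of threads can hold the reader lock together, whereas the writer lock is exclusive.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Illustrative sketch: a read-mostly cache. Names are hypothetical.
public class PriceCache
{
    private readonly ReaderWriterLock m_lock = new ReaderWriterLock();
    private readonly Dictionary<string, decimal> m_prices =
        new Dictionary<string, decimal>();

    // Many threads may hold the reader lock at the same time.
    public bool TryGetPrice(string product, out decimal price)
    {
        m_lock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            return m_prices.TryGetValue(product, out price);
        }
        finally
        {
            m_lock.ReleaseReaderLock();
        }
    }

    // The writer lock is exclusive: it waits for readers to leave
    // and blocks new readers until it is released.
    public void SetPrice(string product, decimal price)
    {
        m_lock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            m_prices[product] = price;
        }
        finally
        {
            m_lock.ReleaseWriterLock();
        }
    }
}
```

The try/finally blocks matter: unlike the lock keyword, ReaderWriterLock does not release itself if an exception is thrown.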

Lock free
If your levels of lock contention are high, but you very rarely get update clashes, then you might want
to look into using a lock-free data structure. These are great in some situations, but if you get a large
number of update conflicts you will see a large drop in performance while using these.
NOTE:
A lock-free data structure is one that is built to support concurrent access and writing to
data without a lock being acquired. This can be achieved through transactional memory
or through atomic actions such as Compare and Swap.
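The Compare and Swap idea can be sketched with Interlocked.CompareExchange, which is the CAS primitive .NET exposes. The example below is deliberately tiny, a single shared maximum rather than a full data structure, and its names are invented for illustration. The retry loop is the part to notice: when updates rarely clash it almost never repeats, but under heavy conflict threads spend their time retrying.

```csharp
using System;
using System.Threading;

// Illustrative sketch of a Compare-and-Swap retry loop; names are hypothetical.
public class LockFreeMax
{
    private int m_max = int.MinValue;

    public int Value
    {
        get { return Volatile.Read(ref m_max); }
    }

    // Records candidate if it is larger than the current maximum.
    // No lock is taken; if another thread changes m_max between our read
    // and our write, CompareExchange fails and we simply try again.
    public void Update(int candidate)
    {
        while (true)
        {
            int current = Volatile.Read(ref m_max);
            if (candidate <= current) return;
            if (Interlocked.CompareExchange(ref m_max, candidate, current) == current)
            {
                return;
            }
        }
    }
}
```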

Conclusion
Improving the performance of your application often boils down to rephrasing the problem you are
trying to solve. There are two approaches to this – do less, or do it differently. Doing less generally
involves storing and caching information, and doing something differently involves rephrasing the
computation in terms of some other problem. Take, for example, the radix sort. Given a
list of numbers, say 12, 15, 24, 2, 108, 1067, we first sort by the least significant digit, giving 12,
2, 24, 15, 1067, 108. We then sort by the tens column (2, 108, 12, 15, 24, 1067), the hundreds column
(2, 12, 15, 24, 1067, 108) and finally the thousands column (2, 12, 15, 24, 108, 1067), keeping the
existing order whenever two digits are equal. This odd sorting
algorithm seems fairly counter-intuitive but, when sorting using a most-significant-digit-based radix
sort, each sub-problem is independent of the others, allowing for effective use in highly
parallel systems.
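The least-significant-digit passes in the worked example can be sketched as follows. This is an illustrative version for non-negative integers only, with invented names. SortByDigit performs one stable pass, and Sort repeats it for the units, tens, hundreds and so on until the largest number has no digits left.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch of a least-significant-digit radix sort for
// non-negative integers. Names are hypothetical.
public static class RadixSortSketch
{
    // One stable pass: groups values by the decimal digit selected by
    // divisor (1 = units, 10 = tens, ...) while keeping their relative order.
    public static int[] SortByDigit(int[] values, int divisor)
    {
        List<int>[] buckets = new List<int>[10];
        for (int d = 0; d < 10; d++) buckets[d] = new List<int>();

        foreach (int v in values) buckets[(v / divisor) % 10].Add(v);

        List<int> result = new List<int>(values.Length);
        foreach (List<int> bucket in buckets) result.AddRange(bucket);
        return result.ToArray();
    }

    public static int[] Sort(int[] values)
    {
        int max = 0;
        foreach (int v in values) if (v > max) max = v;

        int[] current = (int[])values.Clone();
        // divisor is a long so that multiplying by 10 cannot overflow
        // when max is close to int.MaxValue.
        for (long divisor = 1; max / divisor > 0; divisor *= 10)
        {
            current = SortByDigit(current, (int)divisor);
        }
        return current;
    }
}
```

Calling SortByDigit with divisors 1, 10, 100 and 1000 in turn on the list from the text reproduces each intermediate ordering given there.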
This article is admittedly lacking in the hard and fast rules you can apply to your code – unfortunately,
performance improvements are often trade offs. There is the old saying that you don’t get something
for nothing and this is equally true for computer science as for any other area of life. The choices you
make when developing your applications will affect your users' perception of the application’s quality
so it is definitely worth spending a little time ensuring you get the quick performance and scalability
wins – of which there are often many.
TRACING MEMORY LEAKS IN .NET APPLICATIONS WITH ANTS PROFILER

29 September 2006
by Mike Bloise

Mike Bloise is a lead developer at Recognin Technologies. This case study recounts his
experiences on a recent CRM project, built using C#, where he found himself facing severe
memory leak issues and a very tight deadline.

Memory leaks are relatively easy to introduce into your programs when coding in C or C++ (no-one
could have enjoyed having to write destructors for every single one of their C++ classes). However,
when you code in a .NET language, such as C#, you are working in managed code, with automatic
garbage collection, so memory management becomes a bit of a non-issue, right?
That certainly pretty much describes my mindset when I developed the brand new C# 2005 desktop
version of my company's sales and CRM application. I was new to C#, and while I was aware that
there could still be problems if references weren't cleaned up properly, I hadn't given memory
management much thought during development. I certainly wasn't expecting it to be a major issue. As
it turns out, I was wrong.
The problems begin…
I knew I was in trouble the first time my customer called with specific memory use numbers from the
Windows Task Manager. Jack is a salesperson by trade, but by volunteering to beta-test my new
desktop application, he had unknowingly put himself in line for a crash course in memory leak
awareness. "The memory is up to five hundred thousand K since restarting this morning", he said,
"What should I do?"
It was a Friday afternoon and I was at an out of town wedding for the weekend. Jack had noticed this
issue the day before and I had advised the temporary fix of close-out-and-come-back-in. Like all
good beta testers he was happy to accept the temporary solution, but like all excellent beta testers he
wasn't going to accept having to do a temporary fix over and over.
Jack wasn't even the heaviest user of the application and I knew that the installed memory of his
machine was above average, so going live wouldn't be possible until I could trace and fix this leak.
The problem was growing by the minute: the scheduled go-live date was Monday and I'd been on the
road, so I hadn't been able to look through the code since the memory issue had arisen.
Fighting memory leaks armed only with Task Manager and Google
I got back home on Sunday evening and scoured the search engines, trying to learn the basics of C#
memory management. My company's application was massive, though, and all I had was the Task
Manager to tell me how much memory it was using at any given time.
Displaying an invoice seemed to be part of the problem; this was a process that involved a large
number of different elements: one tab page, a usercontrol on the page, and about one hundred other
controls within that usercontrol, including a complicated grid control derived from the .NET ListView
that appeared on just about every screen in the application. Every time an invoice was displayed, the
memory use would jump, but closing the tab wouldn't release the memory. I set up a test process to
show and close 100 invoices in series and measure the average memory change. Oh no. It was losing
at least 300k on each one.
By this point it was about 8pm on Sunday evening and needless to say, I was beginning to sweat. We
HAD to go live the next day. We were already at the tail end of our original time estimate, other
projects were building up, and the customer was already starting to question the wisdom of the entire
re-design process. I was learning a lot about C#'s memory management, but nothing I did seemed to
keep my application from continuing to balloon out of control.
Enter ANTS
At this point, I noticed a banner ad for ANTS Profiler, a memory profiler for .NET. I downloaded and
installed the free trial, mentally composing the apologetic 'please give me a few more days' email I
would need to write the next morning if I didn't find a resolution.
How ANTS worked was pretty clear as soon as I opened it. All it needed was the path to the .exe,
after which it launched the system with memory monitoring turned on. I ran through the login
process in my application, and then used the main feature in ANTS to take a 'snapshot' of the
application's memory profile before any invoices or other screens had been shown.
Browsing that first profile snapshot, I was stunned at the amount of information available. I had been
trying to pinpoint the problem using a single memory use number from the Task Manager, whereas
now I had an instance-by-instance list of every live object my program was using. ANTS allowed me
to sort the items by namespace (the .NET ones as well as my own), by class, by total memory use, by
instance count, and anything else I could possibly want to know.

Armed with this information, I mentally put my apology email on hold, brought up my application,
ran the process that displayed 100 invoices, and then took another snapshot. Back in ANTS, I
checked the list for instances of the main invoice display usercontrol. There they were; 100 instances
of the control along with 100 instances of the tab and 100 instances of everything else, even though
the tabs themselves had all been closed on the screen.
The obvious problem: objects not being removed from an array


In my research I had learned that the .NET memory management model uses an instance's reference
tree to determine whether or not to remove it. With a bit more clicking in ANTS, I found that it
could show me all of the references both to and from every instance in my program.
Using ANTS to navigate forward and backward through the maze of linked references, I was quickly
able to find a static ArrayList to which all displayed tabs were added, but from which they were never
removed.
After adding a few lines of code to remove each tab from this collection as it was closed, I re-ran the
profiler and the 100 invoice process and voilà; the tabs, the main usercontrol, and nearly all of the
sub-controls were gone. It got even better too: the memory increase after each invoice was down to a
fifth of what it had been, which changed the memory leak from a major concern down to a minor
annoyance. The next day we went live, and although issues of all sizes arose, none of them was
caused by the leak.
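The pattern behind this first leak can be sketched in a few lines. TabRegistry and InvoiceTab are invented stand-ins, not the application's real classes, and the 300 KB buffer simply echoes the per-invoice figure measured earlier. The point is that anything referenced from a static collection remains reachable, so the garbage collector must keep it.

```csharp
using System;
using System.Collections;

// Illustrative sketch: TabRegistry and InvoiceTab are hypothetical
// stand-ins for the application's real classes.
public class InvoiceTab
{
    public byte[] Data = new byte[300 * 1024]; // pretend per-invoice state
}

public static class TabRegistry
{
    // A static collection is always reachable, so everything it references
    // stays alive for the life of the process unless explicitly removed.
    private static readonly ArrayList s_openTabs = new ArrayList();

    public static int OpenCount
    {
        get { return s_openTabs.Count; }
    }

    public static InvoiceTab Open()
    {
        InvoiceTab tab = new InvoiceTab();
        s_openTabs.Add(tab);
        return tab;
    }

    // The missing step: without this Remove, closed tabs can never
    // be garbage collected.
    public static void Close(InvoiceTab tab)
    {
        s_openTabs.Remove(tab);
    }
}
```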
The subtler problem: event listeners and the ListView object
Later that week, however, Jack's calls resumed: "The memory is still slowly creeping up; what's going
on?" I didn't know, but at least I knew where to look now. I used ANTS to see if I could locate the
remaining leak. What I found was that one of the sub-controls of the main invoice usercontrol, the
ListView-based one that formed a primary part of the interface, was being held in the reference tree
by what appeared to be standard event handlers like OnClick and MouseMove, hooks that had been
added using the Visual Studio IDE and that would have been, I thought, cleared automatically.
This was really puzzling to me, and I wrote to Red Gate Software, the developers of the ANTS
system, asking for some additional help. Their support staff promptly responded and explained that
in situations with lots of complex references and event handlers, the .NET runtime can leave event
handlers in place when they should be disposed. They suggested manually removing each handler in
the Dispose method of the usercontrol that was causing the problem.
I added the 20 or so minus-equals statements to remove each handler, and for good measure, I added
a System.GC.Collect() statement after the tab closing process.
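The shape of such a minus-equals statement is sketched below. InvoiceGrid and InvoiceView are invented names and the real controls were far larger, so treat this purely as an illustration of the pattern: subscribe with += when the control is created, and detach with -= in Dispose.

```csharp
using System;

// Illustrative sketch: InvoiceGrid and InvoiceView are hypothetical names.
public class InvoiceGrid
{
    public event EventHandler Click;

    public void RaiseClick()
    {
        EventHandler handler = Click;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public class InvoiceView : IDisposable
{
    private readonly InvoiceGrid m_grid;
    public int ClickCount;

    public InvoiceView(InvoiceGrid grid)
    {
        m_grid = grid;
        // While subscribed, the grid's delegate list references this view,
        // so the view stays alive for as long as the grid does.
        m_grid.Click += OnGridClick;
    }

    private void OnGridClick(object sender, EventArgs e)
    {
        ClickCount++;
    }

    // The "minus-equals" statement: detach the handler so the grid no
    // longer holds a reference to this view.
    public void Dispose()
    {
        m_grid.Click -= OnGridClick;
    }
}
```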

Re-running the ANTS profiler and the 100 invoice process, I found that the memory use remained
rock solid. Then, when re-checking the ANTS snapshot, I could see that all of the invoice-related
controls had been released, and the memory use numbers in the task manager never moved.
I re-compiled and uploaded the new version. Now it was my turn to call Jack.
Summary
What did I learn from all this? Firstly, that the "it's managed code so we don't have to worry about
memory leaks" assumption falls short of the mark.
Although automatic memory management in .NET makes our lives as .NET developers a whole lot
easier, it is still easy to introduce memory leaks into your application. Even in managed memory, there
can be issues. The memory manager cannot free memory that is still 'live' – and it will still be
considered live if it is referenced directly or indirectly through the "spider's web" of references that
link the various objects. Also, when complex reference trees and event handlers are involved, the
memory manager doesn't always deregister these event handlers, and so the memory will never be
released unless you forcibly release it.
Secondly, I learned that tracking down these sorts of issues equipped only with Task Manager was
simply not possible – certainly not in the timeframe I had. Tools such as Task Manager (and
Performance Monitor) were able to tell me that my application was using up a lot of memory, but I
needed a dedicated memory profiler like ANTS Profiler to really show me what objects made up that
memory and why they were still there.
TESTING TIMES AHEAD: EXTENDING NUNIT


14 March 2008
by Ben Hall

If you want to get serious with Unit Testing, then you'll need to understand how to extend the
NUnit framework in different ways for your own particular test requirements and how to tailor test
behaviour. Test Expert Ben Hall, of the SQL Data Generator team, shows how it is done, with
special reference to the art of testing SQL code.

This article discusses why you might want to extend NUnit with custom addins and attributes, and
how to go about doing it. NUnit has become the de facto standard unit-testing framework for the
.NET platform. Originally a port of JUnit, it has now grown, and has been completely rewritten to
take advantage of the .NET framework. This article is based on NUnit 2.4.6.
The sample source code for this article can be downloaded from
http://www.simple-talk.com/content/file.ashx?file=777.

Why Extend?
Why would you need to extend NUnit? NUnit provides an excellent harness for unit testing, but you
may well have requirements that the framework cannot meet out of the box, or generic setup code
that you repeat across many tests.
By extending NUnit, you can move this generic, repeatable code from your test code into an attribute
that can then be added to your test methods.
This has a number of advantages:
• Your tests are more readable, because an attribute can describe more about the test than
some code in a setup method.
• You can reuse your extensions instead of them being isolated in test code.
• You can also extend the code to allow additional possibilities not possible out of the box,
such as dynamically creating your tests and test data.

NUnit Extension Points


NUnit 2.4 has a set of extension points that allow you to hook into the framework at various
different levels to solve different problems.
These are the different types of extension points that are available.
1) Suite Builders
This is arguably the most flexible and highest-level hook into NUnit. The Suite Builder
extension point allows you to implement your own version of TestFixture. This means
you can redefine much of the way that NUnit finds and executes the tests. For example, if
you need a different way to identify tests, instead of using the [Test] attribute, you could create
your own Suite Builder and, by using Reflection on the type, you could dynamically add
methods as tests.
Another advantage of this is that you can define the way to execute the test method.
Every test is defined as a Test object. These Test objects define the test's name, how the test
should be executed, the arguments for a test, any expected exceptions and other
information NUnit uses internally. By dynamically creating your test fixture, you can alter
the behaviour of any of the tests.
2) Test Case Builders
Where Suite Builders work by redefining the TestFixture, Test Case Builders redefine how
the Test attribute behaves. By using the Test Case Builder to hook into NUnit, you can
define a custom test within a standard NUnit TestFixture.
These custom tests work by dynamically creating the test objects and adding them to an
existing Test Fixture to execute. This allows you to dynamically build the tests to execute
within your test fixture based on data that might not be known until run time, for example
the contents of a directory or database.
3) Test Decorators
Test decorators are combined with an existing [Test] attribute in order to add additional
behaviour to a test when it is executed, such as modifying the description of the test or
executing a command before the test is started. Unlike the two previous extension points, we still
use NUnit’s basic features but tweak the actual execution.
An example of this is being able to add a RepeatAttribute to a test which causes the test to
execute a certain number of times instead of just once.
4) Event Listeners
The fourth main way to extend NUnit is to listen to the various events NUnit fires during
execution of the test suite. The events you can hook into are:
• Run Started
• Run Finished
• Test Started
• Test Finished
• Suite Started
• Suite Finished
• Unhandled Exception
• Test Output
By listening to these events, you can add a response as required. A common requirement for this
would be to provide additional/different reporting functionality of test execution.
NOTE:
Version 2.4.6 has a known bug with event listeners that causes the loading of the addin to
fail.

How to Extend NUnit?


So how would you actually go about implementing the addins?

Hello World
To start with, I am going to create a Hello World attribute. NUnit addins can be created as any .NET
assembly as long as they reference the correct NUnit assemblies and implement the interfaces
correctly.
To get started, create a class library (I will be using C#) and reference the required NUnit
assemblies, which can be found in the NUnit directory: nunit.framework.dll, nunit.core.dll and
nunit.core.interfaces.dll. With those in place, we can create our first addin.
How the addin is implemented depends on the type of extension point you wish to target.
For this example, I will create a Test Case Builder addin that will take a test object and
add a description before passing it back to NUnit to execute. The Test Case Builder requires an
object to implement the NUnit.Core.Extensibility.ITestCaseBuilder interface.
The interface has two methods. The first, CanBuildFrom, returns true if the current test
method is supported by the addin and false if it isn’t. The second, BuildFrom,
takes the current method as a MethodInfo object and returns a constructed Test object to
execute. Figure 1 shows the code required to implement the interface. CanBuildFrom just
checks that the method has the correct attribute (see below). BuildFrom then simply
creates the built-in NUnitTestMethod object based on the current method, sets the Description
property and returns it to the calling code (NUnit).

#region ITestCaseBuilder Members

public bool CanBuildFrom(System.Reflection.MethodInfo method)
{
    return NUnit.Core.Reflect.HasAttribute(method,
        "NUnitAddInAttributes.HelloWorldAttribute", false);
}

public NUnit.Core.Test BuildFrom(System.Reflection.MethodInfo method)
{
    NUnitTestMethod tmethod = new NUnitTestMethod(method);
    tmethod.Description = "Hello World Test Addin Method";
    return tmethod;
}

#endregion

Figure 1: - ITestCaseBuilder code

The next stage is to create the attribute so it can be used by the test code. The first time I
attempted this, I put the attributes into the same assembly as the addins. However, this meant that
my test code also had to reference nunit.core and nunit.core.interfaces if it wanted to use my
attribute. This is not great for ‘zero friction’ usage, so I moved my attribute code into a separate
assembly that my test code could then reference in order to use my addin. This has the benefit that
test projects don’t need to reference any additional NUnit assemblies.
As the attribute is only used to tell NUnit to use the addin instead of NUnit’s Test attribute, the attribute
itself is simple, as shown in Figure 2.

namespace NUnitAddInAttributes
{
    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
    public class HelloWorldAttribute : Attribute
    {
    }
}

Figure 2: - Hello World Attribute
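
Matching the attribute by name, as CanBuildFrom does in Figure 1, is what allows the attribute to live in its own assembly: the addin never needs a compile-time reference to the attribute type. The following is a minimal sketch of that idea using only System.Reflection. AttributeProbe, HasAttributeNamed and SampleTests are hypothetical names introduced for illustration; this is not how NUnit's Reflect class is implemented, only an approximation of the check it performs.

```csharp
using System;
using System.Reflection;

namespace NUnitAddInAttributes
{
    // Same marker attribute as in Figure 2.
    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
    public class HelloWorldAttribute : Attribute
    {
    }
}

// Hypothetical helper, not part of NUnit.
public static class AttributeProbe
{
    // Returns true if the member carries an attribute whose full type name matches.
    public static bool HasAttributeNamed(MemberInfo member, string fullName, bool inherit)
    {
        foreach (object attribute in member.GetCustomAttributes(inherit))
        {
            if (attribute.GetType().FullName == fullName)
                return true;
        }
        return false;
    }
}

// Hypothetical fixture used only to exercise the helper.
public class SampleTests
{
    [NUnitAddInAttributes.HelloWorld]
    public void Decorated() { }

    public void Plain() { }
}
```

Because the comparison is a plain string match, a typo in the name fails silently: the method is simply never picked up by the addin.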

Now we have a useable addin, we need to tell NUnit about it. The first step is to add a
NUnitAddinAttribute to the addin class so that NUnit can find it.

[NUnitAddin(Description = "Hello World Plugin")]
public class HelloWorld : NUnit.Core.Extensibility.IAddin, ITestCaseBuilder

Figure 3: - NUnitAddinAttribute

The next step is to install it into the correct extension point. The IAddin interface has a single
method called Install, which takes an IExtensionHost parameter that we can use to gain access to the
different extension points. By calling the GetExtensionPoint method and passing in the name of
the extension point, we get an IExtensionPoint object. We can then call the Install method on this object,
passing in the object that implements the correct interface, and finally return true to say it completed
successfully.
Figure 4 is the code for installing a Test Case Builder addin.

#region IAddin Members

public bool Install(NUnit.Core.Extensibility.IExtensionHost host)
{
    IExtensionPoint testCaseBuilders = host.GetExtensionPoint("TestCaseBuilders");
    testCaseBuilders.Install(this); //this implements both interfaces
    return true;
}

#endregion

Figure 4: - Addin Install

If we load NUnit and go to Tools > Addins, a dialog box listing the loaded addins will appear.

You should see the addin successfully listed, together with a RepeatedTestDecorator, which is
included within NUnit.
Figure 5 is an example of how to use the attribute with your test. Remember, a Test Case Builder
addin attribute is used instead of the [Test] attribute which you would normally expect, so if we use
our attribute there is no need to add [Test] as well.

[HelloWorld]
public void Test()
{
Assert.IsTrue(true);
}

Figure 5: - Usage Example


Within the NUnit GUI, in the properties dialog for the test, we can see that the description has
successfully been set via the addin.

This is a very simple example, but shows how to create the Addin. Now for more detail.

Suite Builders
As discussed above, the suite builder allows you to redefine how the tests are found and executed.
Addins that target this extension point must implement the ISuiteBuilder interface. This is similar to the
ITestCaseBuilder interface and has two methods, CanBuildFrom and BuildFrom. However,
instead of BuildFrom returning a single test, it returns a complete test suite with all the tests added.
The NUnit object model is slightly confusing here: a TestSuite is actually a type of Test, which is
why BuildFrom can return either a single test or a suite of tests.

public bool CanBuildFrom(Type type)
{
    return Reflect.HasAttribute(type, "NUnitAddinAttributes.SuiteBuilderAttribute", false);
}

public Test BuildFrom(Type type)
{
    return new SuiteTestBuilder(type);
}

Figure 6:- ISuiteBuilder code

With BuildFrom returning a TestSuite object, we need to implement the object and add any tests
required. Figure 7 is the code for my custom test suite. Within the constructor, I use reflection to
get a list of all the methods within the Type, which is the class containing the test methods. If the
method name starts with MyTest, I add it to the TestSuite as a standard NUnitTestMethod. At this
point I could also have redefined how each test is executed; I will discuss this in more detail in the
next section.

public class SuiteTestBuilder : TestSuite
{
    public SuiteTestBuilder(Type fixtureType) : base(fixtureType)
    {
        this.Fixture = Reflect.Construct(fixtureType);
        foreach (MethodInfo method in fixtureType.GetMethods(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
        {
            if (method.Name.StartsWith("MyTest"))
                this.Add(new NUnitTestMethod(method));
        }
    }
}

Figure 7: - Suite Test Builder code
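
The discovery step in the constructor can be tried in isolation, without any NUnit types. The sketch below is an assumption-laden illustration rather than the addin itself: PrefixDiscovery, FindTests, RunAll and SampleFixture are hypothetical names, and the methods are invoked directly through reflection instead of being wrapped in NUnitTestMethod objects.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper illustrating the discovery step; not part of NUnit.
public static class PrefixDiscovery
{
    // Finds public instance methods declared on the type whose names start with the prefix.
    public static List<MethodInfo> FindTests(Type fixtureType, string prefix)
    {
        List<MethodInfo> tests = new List<MethodInfo>();
        foreach (MethodInfo method in fixtureType.GetMethods(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
        {
            if (method.Name.StartsWith(prefix, StringComparison.Ordinal))
                tests.Add(method);
        }
        return tests;
    }

    // Constructs the fixture once, invokes each discovered method on it
    // and returns the fixture so the caller can inspect its state.
    public static object RunAll(Type fixtureType, string prefix)
    {
        object fixture = Activator.CreateInstance(fixtureType);
        foreach (MethodInfo method in FindTests(fixtureType, prefix))
            method.Invoke(fixture, null);
        return fixture;
    }
}

// Hypothetical fixture: two methods match the prefix, one does not.
public class SampleFixture
{
    public int Calls;
    public void MyTest1() { Calls++; }
    public void MyTest2() { Calls++; }
    public void NotATest() { Calls += 100; }
}
```

BindingFlags.DeclaredOnly matters here: without it, inherited public methods would also be considered when matching the prefix.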


With our TestSuite code completed, we need to install the addin. To install the addin, we need to
access the extension point called “SuiteBuilders” and pass in the ISuiteBuilder object as shown in
figure 8. We also need to add the NUnitAddinAttribute to the class.

public bool Install(IExtensionHost host)
{
    IExtensionPoint builders = host.GetExtensionPoint("SuiteBuilders");
    if (builders == null)
        return false;

    builders.Install(this);
    return true;
}

Figure 8: - Suite Builder Install

Finally, we need to implement an attribute for use within our tests. This identifies which classes
should use the addin.

[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public class SuiteBuilderAttribute : Attribute
{
}

Figure 9: - Suite Builder Attribute

Figure 10 is an example of how to use the addin. Note that there is no reference to the TestFixture or Test
attributes that we would normally expect. All we need to do is add the SuiteBuilderAttribute to the
class, then prefix the name of any test we want to be executed with ‘MyTest’, which is what the addin uses to
identify the methods.

[SuiteBuilder]
public class SampleSuiteExtensionTests
{
    public void MyTest1()
    {
        Console.WriteLine("Hello from test 1");
    }

    public void MyTest2()
    {
        Console.WriteLine("Hello from test 2");
    }

    public void NotATest()
    {
        Console.WriteLine("This is not a test and so shouldn't be called.");
    }
}

Figure 10:- Usage Sample

The NUnit GUI should now show that we have two tests listed for that object which can be executed
as normal.
This is a very powerful extension, but with limited usage. Test Case Builders and Test Decorators have
much more of an impact.

Test Case Builders


Test Case Builders allow you to extend NUnit without having to rewrite large parts of the
fundamental framework. By using Test Case Builders, we can take a single method with the correct
attribute and define one or more different tests that can be executed as part of the test suite. This
allows much more flexibility with NUnit, thereby improving the readability and maintenance of tests.
This is because the process of defining the tests to be executed can be customized via the attribute.
The example I will use to demonstrate this extension is the ability to define a test case for each row
from a SQL statement, the columns of which are then passed into the test as parameters to use as
part of the test. Figure 11 demonstrates the final extension being used.
Instead of defining the [Test] attribute, we define a SqlServerDataSource attribute, and set the
connection string and query parameters. When NUnit loads the test, it will execute the query and
create a separate test for each row returned from the query.

[SqlServerDataSource(
    ConnectionString = "Data Source=.;Initial Catalog=Northwind;Integrated Security=True",
    Query = "SELECT CustomerID FROM Northwind.dbo.Customers")]
public void TestCustomerIDIsValid(string customerID)
{
    Console.WriteLine(customerID);
    Assert.IsFalse(string.IsNullOrEmpty(customerID));
}

Figure 11: - Sample Usage

When the tests are executed, it will pass in the values (one parameter for each column returned) so we
can use them to test against. This is useful for dynamically creating the tests, especially if you do not
know all the possible test values when you are creating your tests. By using this attribute, anyone can
include additional test cases for the method.
We have already created a Test Case Builder in the Hello World example, so I won’t go into the
concept here. Once we have implemented the ITestCaseBuilder interface, we need to determine
if tests can be built from the method.
The major difference is that our attribute now has properties which we can use to pass data into the
addin. Within our attribute code, we simply add the properties we want.

[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
public class SqlServerDataSourceAttribute : Attribute
{
    public string ConnectionString { get; set; }
    public string Query { get; set; }
}

Figure 12: - Sql Server Data Source Attribute

Figure 13 determines whether the test method is supported by the SqlServerDataSource addin we are creating. The
first task is to see if the current method has a SqlServerDataSourceAttribute. This is done using the
Reflect class, found in NUnit.Core, which is a wrapper around System.Reflection and provides an interface that is easy to
understand. If the method does not have the attribute then it is
not supported, so we return false. Otherwise, we access the attribute using Reflect, check
that the conversion works correctly and return true to indicate it is a supported method.

public bool CanBuildFrom(MethodInfo method)
{
    if (method == null)
        throw new ArgumentNullException("method");

    if (Reflect.HasAttribute(method, SqlServerAttribute, false))
    {
        SqlServerDataSourceAttribute sql = Reflect.GetAttribute(method,
            SqlServerAttribute, false) as SqlServerDataSourceAttribute;
        if (sql != null)
            return true;
    }

    return false;
}

Figure 13: - Sql Server Can Build From method

Once NUnit knows that it should use the addin, we need to define how it can actually build a test, or
in this case a collection of tests. Figure 14 shows the code for defining how to build the test suite.
The first task is to gain access to the attribute again, enabling us to access the connection string and
query provided by the user. We then create a standard TestSuite, which is passed to the
CreateRowsFromSQL method along with the method and the attribute (Figure 15), finally returning
the populated suite back to NUnit.

public Test BuildFrom(MethodInfo method)
{
    if (method == null)
        throw new ArgumentNullException("method");

    SqlServerDataSourceAttribute sql = Reflect.GetAttribute(method,
        SqlServerAttribute, false) as SqlServerDataSourceAttribute;

    string parentName = method.DeclaringType.ToString();
    TestSuite suite = new TestSuite(parentName, method.Name);

    CreateRowsFromSQL(method, suite, sql);

    return suite;
}

Figure 14: - Sql Server Build From code

Within CreateRowsFromSQL we create all of the tests to be executed and add them to the suite. I
am just using standard ADO.NET to query the database based on the attribute’s properties. Once I
have queried the database, I iterate over each field in the row and use it to help create a test name, which allows the
test to be identified when executing. Finally, I create a SqlServerTestMethod object that defines
how the test is executed. To set up the object, I pass in the test method, the name of the test and
the column values from the row as an object array for use during execution. I then add the created
SqlServerTestMethod to the test suite for execution.

sqlConn = new SqlConnection(sql.ConnectionString);
sqlConn.Open();

SqlCommand sqlCmd = new SqlCommand(sql.Query, sqlConn);
SqlDataReader sdr = sqlCmd.ExecuteReader();

int fieldCount = sdr.FieldCount;

while (sdr.Read())
{
    try
    {
        object[] args = new object[fieldCount];
        string methodName = "";
        for (int i = 0; i < fieldCount; i++) //Create test name
        {
            object v = sdr.GetValue(i);
            if (string.IsNullOrEmpty(methodName))
                methodName = v.ToString().Replace(" ", "");
            else
                methodName = methodName + "," + v.ToString().Replace(" ", "");

            args[i] = v; //Populate collection
        }

        suite.Add(new SqlServerTestMethod(method, method.Name + "(" + methodName + ")", args));
    }
    catch (Exception ex)
    {
        NUnitTestMethod ts = new NUnitTestMethod(method);
        ts.RunState = RunState.NotRunnable;
        ts.IgnoreReason = string.Format(
            "Exception thrown. Check query and connection strings are correct " +
            "for {0}. Exception {1}", method.Name, ex.Message);
        suite.Add(ts);
    }
}

Figure 15: - Snippet from CreateRowsFromSQL
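
The naming logic in the loop above is easy to check without a database. The following sketch pulls it into a standalone function; TestNaming and BuildTestName are hypothetical names, not part of the addin. It also deviates slightly from Figure 15, as an assumption on my part about what is wanted: the separator is chosen by column index, so an empty first column still produces a comma, and a null value becomes an empty string rather than throwing.

```csharp
using System;
using System.Text;

// Hypothetical helper; mirrors the naming loop in Figure 15 without a database.
public static class TestNaming
{
    // Builds "Method(v1,v2,...)" from one row of values, stripping spaces from each value.
    public static string BuildTestName(string methodName, object[] rowValues)
    {
        StringBuilder name = new StringBuilder(methodName);
        name.Append('(');
        for (int i = 0; i < rowValues.Length; i++)
        {
            if (i > 0)
                name.Append(',');
            // Convert.ToString(object) returns an empty string for null instead of throwing.
            name.Append(Convert.ToString(rowValues[i]).Replace(" ", ""));
        }
        name.Append(')');
        return name.ToString();
    }
}
```

For a single-column row containing ALFKI, this yields TestCustomerIDIsValid(ALFKI), which is the form of name the test runner would then display for that row.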

If an error occurs during the setup, we need to alert the user somehow. One approach is to use the catch
block to create another test, set its run state to NotRunnable and then set a reason, which is
displayed to the user in the GUI. When the tests are executed, they will have a yellow icon and the
message will appear under the Tests Not Run tab.

If it was more of a serious error that affects the entire test suite we could set the same properties
directly on the test suite.

suite.RunState = RunState.NotRunnable;
suite.IgnoreReason = string.Format(
    "SQL Exception thrown. Check query and connection strings are correct for " +
    "{0}. Exception {1}", suite.TestName, ex.Message);

Figure 16: - Snippet from CreateRowsFromSQL error handling

This code will give us a test suite containing a number of tests based on the rows of the query: if our
SQL statement returned 10 rows of data then 10 tests would be created in the test suite. The next
stage is to define how they can be executed. One approach would be to use the standard
NUnitTestMethod, as we did in the suite builder. However, this does not support parameters and, as that is a
major requirement, we need to create our own TestMethod object.
The first step is to inherit from NUnitTestMethod, which will give us the fundamental support for
executing the test. We then need to define our own constructor so we can pass in the arguments to
the object.

public SqlServerTestMethod(MethodInfo method, string testName, object[] arguments)
    : base(method)
{
    _arguments = arguments;
    TestName.Name = testName;
}

Figure 17: - Sql Server Test Method Constructor

Within the constructor, I set the TestName property to be the test name we created based on the row.
The next step is to pass the arguments to the test. To do this, we need to override
RunTestMethod, which is called to execute the test. The first task is to ensure we have arguments; if
not, we need to pass in null. We then gain access to the current TestSuite in order to access
the test fixture. Finally, we use Reflection to invoke the test method within the test fixture, passing in
the arguments as parameters.

public override void RunTestMethod(TestCaseResult testResult)
{
    object[] arguments = _arguments ?? new object[] {null};
    TestSuite testSuite = (TestSuite) this.Parent.Parent;
    Reflect.InvokeMethod(this.Method, testSuite.Fixture, arguments); //Execute Test
}

Figure 18: - Override Run Test Method

Under the covers, the CLR will set the arguments and do any type conversion required, finally
invoking the test just like NUnit does with the [Test] attribute.
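
The invocation step can be sketched with plain System.Reflection. This is only an illustration under stated assumptions: ParameterisedInvoker and CustomerChecks are hypothetical names, and the code is not NUnit's Reflect.InvokeMethod. One detail worth showing is that MethodInfo.Invoke wraps anything thrown by the target in a TargetInvocationException, so a runner normally unwraps it to report the exception the test actually raised.

```csharp
using System;
using System.Reflection;

// Hypothetical fixture with a parameterised test method, similar in shape to Figure 11.
public class CustomerChecks
{
    public string LastSeen;

    public void TestCustomerIDIsValid(string customerID)
    {
        if (string.IsNullOrEmpty(customerID))
            throw new ArgumentException("customerID must not be empty");
        LastSeen = customerID;
    }
}

// Hypothetical helper; not NUnit's Reflect.InvokeMethod.
public static class ParameterisedInvoker
{
    // Invokes the method on the fixture with the given arguments and unwraps
    // the reflection wrapper so the caller sees the test method's own exception.
    public static void Invoke(MethodInfo method, object fixture, object[] arguments)
    {
        try
        {
            method.Invoke(fixture, arguments);
        }
        catch (TargetInvocationException ex)
        {
            if (ex.InnerException != null)
                throw ex.InnerException;
            throw;
        }
    }
}
```

The argument array must match the method's parameter list in length and order, which is why the addin builds one array entry per column returned by the query.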
In summary, by using the Test Case Builder extension, a single test method can be reused in several
different ways to meet your own requirement for defining test values.

Test Decorators
A test decorator allows us to make simple modifications to the test without having a great effect on
the way that NUnit processes the test. Test Decorators allow us to add additional attributes alongside the
standard [Test] attribute that either define the way the test is executed, set properties on the test such as
the description, or run commands before or after the test has been executed. For this extension, I will
create a decorator which can execute a SQL statement before, after, or both before and after the test is
executed, which is great for creating and deleting test data.
Figure 19 is an example of how the test decorator could be used. The TestSequence enumeration
simply indicates when to execute the query.

[Test]
[ExecuteSql(ExecuteWhen = TestSequence.BeforeAndAfter,
    Connection = "Data Source=.;Initial Catalog=Northwind;Integrated Security=True",
    Script = "DELETE FROM [Order Details]")]
public void Test()
{
    Assert.IsTrue(true);
}

Figure 19: - Sample Usage
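
The article does not list the definitions of ExecuteSqlAttribute or TestSequence, so the following is an assumed shape inferred from how they are used in Figure 19, not the original source. The property names come from the usage sample; the AllowMultiple = true setting is my assumption, based on the requirement that several ExecuteSql attributes can be placed on one method. DecoratedSample is a hypothetical class used only to show that.

```csharp
using System;

// Assumed shape: the members are inferred from the values used in the article.
public enum TestSequence
{
    Before,
    After,
    BeforeAndAfter
}

// AllowMultiple = true is what permits several [ExecuteSql] attributes on one method.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = true, Inherited = false)]
public class ExecuteSqlAttribute : Attribute
{
    public TestSequence ExecuteWhen { get; set; }
    public string Connection { get; set; }
    public string Script { get; set; }
}

// Hypothetical class showing two attributes on the same method.
public class DecoratedSample
{
    [ExecuteSql(ExecuteWhen = TestSequence.Before, Connection = "Data Source=.", Script = "select 'before'")]
    [ExecuteSql(ExecuteWhen = TestSequence.After, Connection = "Data Source=.", Script = "select 'after'")]
    public void Test() { }
}
```

With AllowMultiple left at its default of false, the compiler would reject the second attribute on the method.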

All test decorators must implement the ITestDecorator interface and install into the TestDecorators
extension point. ITestDecorator has a single method called Decorate. Its parameters are
the constructed Test object (built via the Test attribute) and the member info for the test method.
This member info can be used to access all the metadata about the method, such as attributes.
Figure 20 is the Decorate method for the ExecuteSql extension. The first task is to check that the
test is actually an NUnitTestMethod, to ensure it is not a test suite (if you want to support other Test
Case Builders then you will need to add checking for this). I then get the attributes for the method
and add each attribute of type ExecuteSqlAttribute to a list. Once I have finished
processing, I create an ExecuteSqlTestMethod so I can configure how the test is executed,
passing in the collection of ExecuteSqlAttributes as a parameter. The purpose of processing all
the attributes as a list is to allow support for multiple ExecuteSql attributes per test method, each of
which could define a different query and test sequence.

public Test Decorate(Test test, System.Reflection.MemberInfo member)
{
    if (test is NUnitTestMethod)
    {
        List<ExecuteSqlAttribute> methodAttributes = new List<ExecuteSqlAttribute>();

        Attribute[] attributes = Reflect.GetAttributes(member, ExecuteSqlAttribute, false);

        if (methodAttributes.Count != attributes.Length) //Already processed
        {
            foreach (Attribute attr in attributes) //Need to convert them all
            {
                ExecuteSqlAttribute sqlAttribute = attr as ExecuteSqlAttribute;
                if (sqlAttribute != null && !methodAttributes.Contains(sqlAttribute))
                {
                    methodAttributes.Add(sqlAttribute);
                }
            }

            test = new ExecuteSqlTestMethod(((NUnitTestMethod) test).Method, methodAttributes);
        }
    }
    return test;
}

Figure 20: - Test Decorator for Execute SQL

The ExecuteSqlTestMethod inherits from NUnitTestMethod in order to customise the Run
method. Our Run method executes any SQL attributes marked to execute Before or
BeforeAndAfter. It then calls the Run method on the base object (NUnitTestMethod), thereby
allowing all of the additional processing, such as error checking, to take place via NUnit. Finally, it
does another pass over the SQL attributes, looking for those marked After or BeforeAndAfter. For each
attribute it finds, it calls the ExecuteSql method (Figures 21 and 22).

public override void Run(TestCaseResult testResult)
{
    try
    {
        foreach (ExecuteSqlAttribute attribute in _sqlAttributes)
        {
            if (attribute.ExecuteWhen == TestSequence.Before ||
                attribute.ExecuteWhen == TestSequence.BeforeAndAfter)
                ExecuteSql(attribute);
        }

        base.Run(testResult);

        foreach (ExecuteSqlAttribute attribute in _sqlAttributes)
        {
            if (attribute.ExecuteWhen == TestSequence.After ||
                attribute.ExecuteWhen == TestSequence.BeforeAndAfter)
                ExecuteSql(attribute);
        }
    }
    catch (System.Data.SqlClient.SqlException ex)
    {
        testResult.Failure("SQL Exception Thrown from ExecuteSqlAttribute: " +
            ex.Message, ex.StackTrace);
    }
}

Figure 21: - Execute Sql Run Method

private void ExecuteSql(ExecuteSqlAttribute attribute)
{
    SqlConnection sqlConn = null;
    try
    {
        sqlConn = new SqlConnection(attribute.Connection);
        sqlConn.Open();

        SqlCommand sqlCmd = new SqlCommand(attribute.Script, sqlConn);
        sqlCmd.ExecuteNonQuery();
    }
    finally
    {
        if (sqlConn != null && sqlConn.State != ConnectionState.Closed)
            sqlConn.Close();
    }
}

Figure 22: - Execute Sql ExecuteSql method
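
The ordering rules in the Run method can be exercised without a database by replacing each SQL script with a delegate. This is a sketch under that assumption: SequencedStep and SequencedRunner are hypothetical names, and the enum is redeclared only so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

public enum TestSequence { Before, After, BeforeAndAfter }

// Hypothetical stand-in for one attribute: when to run, and what to run.
public class SequencedStep
{
    public TestSequence When;
    public Action Work;

    public SequencedStep(TestSequence when, Action work)
    {
        When = when;
        Work = work;
    }
}

// Hypothetical runner that applies the same two passes as the Run method in Figure 21.
public static class SequencedRunner
{
    public static void Run(IList<SequencedStep> steps, Action testBody)
    {
        // First pass: everything marked Before or BeforeAndAfter, in declaration order.
        foreach (SequencedStep step in steps)
        {
            if (step.When == TestSequence.Before || step.When == TestSequence.BeforeAndAfter)
                step.Work();
        }

        testBody();

        // Second pass: everything marked After or BeforeAndAfter, in declaration order.
        foreach (SequencedStep step in steps)
        {
            if (step.When == TestSequence.After || step.When == TestSequence.BeforeAndAfter)
                step.Work();
        }
    }
}
```

As in Figure 21, nothing here guarantees that the second pass runs if the body throws; wrapping the body in try/finally would be one way to change that, at the cost of possibly masking the original failure.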

This decorator can then be used to execute SQL commands during the execution of the test itself.
This is great for defining SQL to be executed on a test-by-test basis. If the query execution fails then
the test will fail, but with some appropriate try...catch blocks you can change this.

[Test]
[ExecuteSql(ExecuteWhen = TestSequence.BeforeAndAfter,
    Connection = "Data Source=.;Initial Catalog=Northwind;Integrated Security=True",
    Script = "DELETE FROM [Order Details]")]
[ExecuteSql(ExecuteWhen = TestSequence.Before,
    Connection = "Data Source=.;Initial Catalog=Northwind;Integrated Security=True",
    Script = "select 'before'")]
[ExecuteSql(ExecuteWhen = TestSequence.After,
    Connection = "Data Source=.;Initial Catalog=Northwind;Integrated Security=True",
    Script = "select 'after'")]
public void Test()
{
    Assert.IsTrue(true);
}

Event Listeners
As mentioned earlier, event listeners are currently broken in 2.4.6; this should be fixed in later
releases.
Event Listeners allow you to hook into various events that NUnit fires at various points in the test
suite execution. This is great for providing additional reporting functionality, which is the example I
will use here: every event will write information out to the Debug listener.
In order to create an event listener addin, the class needs to implement the EventListener interface.
Figure 23 is a list of all the methods on the interface where you can provide implementation hooks.
Note that all the methods must be included in the addin; the actual method body can be empty if you
do not wish to do anything with that event.

#region EventListener Members

public void RunFinished(Exception exception);
public void RunFinished(TestResult result);
public void RunStarted(string name, int testCount);
public void SuiteFinished(TestSuiteResult result);
public void SuiteStarted(TestName testName);
public void TestFinished(TestCaseResult result);
public void TestOutput(TestOutput testOutput);
public void TestStarted(TestName testName);
public void UnhandledException(Exception exception);

#endregion

Figure 23: - Event Listener events

An example of how to implement the code for the different events is shown in figure 24.

public void RunStarted(string name, int testCount)
{
    Debug.Write("Run Started " + name + " " + testCount);
}

public void RunFinished(TestResult result)
{
    Debug.Write("Run Finished " + result.Name + " " + result.ResultState);
}

Figure 24: - Event Listener implementation

Finally, to install the addin we need to use the extension point named “EventListeners”.

public bool Install(IExtensionHost host)
{
    IExtensionPoint listeners = host.GetExtensionPoint("EventListeners");
    if (listeners == null)
        return false;

    listeners.Install(this);
    return true;
}

Figure 25: - Event Listener Installation Code

Once the addin has been deployed, it will automatically take effect for any NUnit tests executed.
There is no need to modify the tests.

Deployment and execution


After you have successfully created your required addin, you will need to actually deploy and install it.
One limitation of addins is that they can only be installed into the same version of NUnit as they
were compiled against; so, if someone compiled an addin against 2.4.5, it wouldn’t work in
2.4.6.
To install the addin, you need to copy the assembly into the addins folder in your NUnit folder, for
example C:\Program Files\NUnit\2.4\addins\, although this could vary between NUnit installations.
You will also need to install the addin on any machines that will execute the test suite, such as build
servers. When NUnit loads, it automatically searches for assemblies in this directory and attempts to
load them.
In terms of executing tests containing your addin, the NUnit GUI and console application will
support your addin correctly. However, support within 3rd party runners depends on the runner.
TestDriven.Net works with all types of NUnit extensions, whereas ReSharper does not support any
type of extensions. This is something to keep in mind when developing and executing.
Once your addin is installed and your tests are executing, you might find that you need to debug into
your assembly to fix problems. The solution is to use the Attach to Process option within Visual
Studio. After setting breakpoints within my code, I always
1. start the NUnit exe,
2. attach to the NUnit process,
3. load my test assembly using the addin
…at which point the debugger should kick in and allow you to step through your code.

Summary
In summary, NUnit 2.4.6 has a number of different ways of extending the framework for your own
requirements and to tailor test behaviour for different scenarios. The article has demonstrated how to
extend NUnit via the built-in extension points which are Suite Builders, Test Case Builders, Test
Decorators and Event Listeners.
USING SQL DATA GENERATOR WITH YOUR UNIT TESTS


15 April 2008
by Ben Hall

Ben Hall, one of the team who wrote SQL Data Generator, has gone on to develop ways of using
it to provide test data for a wide variety of Unit Testing requirements. Whether you use NUnit,
XUnit, MbUnit or MSTest, you can now lay your hands on as much data as you need.

SQL Data Generator


You can create large amounts of test data quickly and easily with Red Gate’s SQL Data Generator
(http://www.red-gate.com/products/sql_data_generator/index2.htm). You can now integrate this
data generation facility into your unit and integration testing.
I’d like to show you how I would go about using SQL Data Generator (SDG) within my unit tests,
not just for testing the Data Access Layer (DAL), but also the Business Layer (BU). SQL Data
Generator can be used for testing a wide variety of code in different languages.
With the code I provide in this article, along with SQL Data Generator, you can test the scalability
and performance of any of your .NET code that processes data.

Automating Data Generation


After installing SQL Data Generator, you have the option to generate data via the UI or the
command line. The UI can create, configure, and generate data with the ability to save all of the
settings as a project file for later use. The project file is XML and contains a list of all the columns for
each table within the database. For each column in the table, we store all of the related settings, such
as which generator was assigned and the properties for the generator, for example the expression
used on a RegEx generator.
Installed alongside the UI is a command line application. The application generates data based on a
saved project. However, the application doesn’t include an API. This could cause problems when you
attempt to include it as part of an automated process.
As a possible solution to this, I have created a C# wrapper around the console application, which you
can download from CodePlex (including source code). This wrapper allows you to
integrate SQL Data Generator into your unit tests, with the added advantage of the wrapper being
framework agnostic, meaning it will work with all versions of NUnit, XUnit, MbUnit or even
MSTest.
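
To give a feel for what wrapping a console application involves, here is a minimal sketch using System.Diagnostics.Process. It is not the CodePlex wrapper's implementation: ConsoleToolRunner, CreateStartInfo and Run are hypothetical names, and the command-line arguments are left entirely to the caller because the switches that the SQL Data Generator console application accepts are not covered here.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper sketch; not the CodePlex implementation.
public static class ConsoleToolRunner
{
    // Prepares a process description that captures the tool's console output.
    public static ProcessStartInfo CreateStartInfo(string exePath, string arguments)
    {
        ProcessStartInfo info = new ProcessStartInfo(exePath, arguments);
        info.UseShellExecute = false;        // required for output redirection
        info.RedirectStandardOutput = true;
        info.CreateNoWindow = true;
        return info;
    }

    // Runs the tool, waits for it to exit and returns its exit code.
    // The captured console output is handed back through the out parameter.
    public static int Run(string exePath, string arguments, out string output)
    {
        using (Process process = Process.Start(CreateStartInfo(exePath, arguments)))
        {
            // Read before waiting so a full output buffer cannot block the child process.
            output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Returning the exit code and the captured output lets a test fail with a useful message when data generation does not succeed, whichever unit testing framework is in use.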
To start using this framework, you will need to download the SDG.UnitTesting.dll assembly from
CodePlex. In my solution, I have two projects, SDG.UnitTesting.SimpleTalk and
SDG.UnitTesting.SimpleTalk.Tests; the first project will be my ‘production’ code which I want to
test, while the second project contains my unit tests. Within SDG.UnitTesting.SimpleTalk.Tests, I
add a reference using Visual Studio to the SDG.UnitTesting.SimpleTalk project so that I can create
the objects to be tested. Within SDG.UnitTesting.SimpleTalk.Tests I also add a reference to the
downloaded SDG.UnitTesting.dll assembly.
At this point, I have two possible approaches for executing SQL Data Generator.
• The attribute based approach – Decorate your test methods with attributes to specify
which SQL Data Generator project to execute before running the test.
• The POCO Wrapper approach – This allows you to create an object within your code and
call the Execute() method to generate the data.
The Attribute Based approach


The first approach is to decorate the TestFixtures and Test methods within your system with our
attributes, SDG and SDGExecute. The advantage of this approach is that the data generation is
expressed as an attribute rather than being mixed into your test method. This makes it much easier to
read and understand the test, and what it is actually testing.
To demonstrate this, I will show how to create and test a DataAccess class. The class uses ADO.NET
but the basic principle is the same for LINQ to SQL, or any other data access framework. We’re just issuing
commands to the database in order to return data.
The first task, following the Test Driven Development approach, is to create a DataAccessTests
class within our test solution. The first test within this class simply ensures that if we delete all the
data from the database, then a DataTable is returned with 0 rows.

[Test]
//ExecuteSql is taken from Testing Times Ahead: Extending NUnit
[ExecuteSql(ExecuteWhen = TestSequence.Before,
    Script = "DELETE FROM [Order Details]; DELETE FROM [Orders]; " +
             "DELETE FROM CustomerCustomerDemo; DELETE FROM Customers",
    Connection = "Server=(local);Database=Northwind;Integrated Security=True")]
public void GetAllCustomers_EmptyDatabase_ZeroRows()
{
    DataTable dt = DataAccess.GetAllCustomers();
    Assert.AreEqual(0, dt.Rows.Count);
}

The test uses the ExecuteSql attribute, which will execute SQL statements against the server.
NOTE:
I showed how to create the ExecuteSql attribute in the previous chapter of this book
(Testing Times Ahead: Extending NUnit), so you can get the
code from there.
This is a great example of how using attributes can improve the readability of code: the logic to setup
the test has been moved into an attribute, leaving the actual test code just as test code.
To pass the test, we need to implement our DataAccess class. The DataAccess.GetAllCustomers()
method is shown below and it simply executes a SELECT * to return all the customers in the table.

public static DataTable GetAllCustomers()
{
    SqlConnection sqlConn = new SqlConnection(
        ConfigurationManager.ConnectionStrings["NorthwindConnection"].ConnectionString);
    SqlCommand sqlCmd = new SqlCommand("SELECT * FROM Customers", sqlConn);
    SqlDataAdapter sqlDA = new SqlDataAdapter(sqlCmd);

    DataTable dt = new DataTable();
    sqlDA.Fill(dt);

    return dt;
}

The test passes successfully. Next, we want to test that data is being returned correctly. This is where
SQL Data Generator can be beneficial.
In this next test, we want to ensure that the DataAccess object does actually return a populated
DataTable with the contents from the database. First, we need to set up our TestFixture class in
order to be able to use the new attributes. All we need to do is add an SDGAttribute to the
TestFixture and inherit from SDGContext.
Using SQL Data Generator with your Unit Tests 133

These attributes are located in the SDG.UnitTesting assembly. The SDGAttribute tells our
extension that the class is one we should be interested in, as it contains test
methods that require SDG to execute. The SDGContext class allows the extension to hook into the
class and intercept the method calls. If the method has the SDGExecute attribute (see below), then we
know to execute SQL Data Generator.

[TestFixture]
[SDG]
// Overload to set the console application location:
//[SDG(@"D:\Red Gate\SQL Data Generator\SQLDataGenerator.exe")]
public class DataAccessTests : SDGContext

On SDGAttribute, we have an overload where you can specify a different installed location for the
command line application.
With this in place, we can execute SDG as part of our tests. The following test ensures that the
DataTable is populated with 1000 rows, and we can be confident that this will always pass because we
are executing SDG before the test. Before the test is executed, the SDGContext intercepts the call to
GetAllCustomers_SDGProject_1000Rows(). As the method has an SDGExecute attribute, the
framework knows to execute SDG. The attribute takes a filename as its argument: the project
file, created using the UI, which we want to use. Under the covers, the command line application is
executed using the project name as a parameter.

[Test]
[SDGExecute("Database\\NorthwindCustomers_Orders.sqlgen")]
public void GetAllCustomers_SDGProject_1000Rows()
{
    DataTable dt = DataAccess.GetAllCustomers();
    Assert.AreEqual(1000, dt.Rows.Count);
}

I find this approach very clean and very effective. We no longer have to worry about what or how the
data is generated.
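
To make the "under the covers" step more concrete, here is a minimal sketch of how a wrapper might launch the console application with a project file. It is not the actual SDG.UnitTesting code: the class name SdgLauncherSketch and its methods are hypothetical, and the /project: switch is an assumption about the command line that you should check against the tool's own help output.

```csharp
using System.Diagnostics;

// Hypothetical illustration only; not the real SDG.UnitTesting implementation.
public static class SdgLauncherSketch
{
    // Builds the process settings for running the generator against one project file.
    // ASSUMPTION: the console application accepts a /project:"<file>" switch.
    public static ProcessStartInfo BuildStartInfo(string exePath, string projectFile)
    {
        ProcessStartInfo info = new ProcessStartInfo();
        info.FileName = exePath;
        info.Arguments = "/project:\"" + projectFile + "\"";
        info.UseShellExecute = false;   // run the executable directly, not via the shell
        info.CreateNoWindow = true;     // keep test runs free of console windows
        return info;
    }

    // Starts the generator, waits for it to finish and returns its exit code.
    public static int Run(string exePath, string projectFile)
    {
        using (Process process = Process.Start(BuildStartInfo(exePath, projectFile)))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Keeping the argument building separate from the process start means the command line can be checked in a unit test without the generator being installed.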
Another advantage of SDG is that it uses seedable data. This means that the same data will be
produced each time, making it good to test against. The next test ensures that the
GetCustomerByID() method works correctly; we pass in an ID for a customer generated by SDG
and verify that the data is being returned as we expect.

[Test]
[SDGExecute("Database\\NorthwindCustomers_Orders.sqlgen")]
public void GetCustomerByID_SDGProject_1000Rows()
{
    DataTable dt = DataAccess.GetCustomerByID("00040");

    Assert.AreEqual("00040", dt.Rows[0]["CustomerID"]);
    Assert.AreEqual("Suppebentor Holdings ", dt.Rows[0]["CompanyName"]);
    Assert.AreEqual("Bobbi Yates", dt.Rows[0]["ContactName"]);
    Assert.AreEqual("Web", dt.Rows[0]["ContactTitle"]);
    Assert.AreEqual("564 White Old Freeway", dt.Rows[0]["Address"]);
}
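
As an aside, the idea behind seedable data can be illustrated with the standard System.Random class. This is only an analogy and says nothing about how SDG implements its generators; the class SeededDataSketch and the five-digit ID format are made up for the example.

```csharp
using System;

// Hypothetical illustration of seedable generation; not SDG's implementation.
public static class SeededDataSketch
{
    // Produces 'count' pseudo-random five-digit IDs from a fixed seed.
    // Two calls with the same seed return the same sequence on the same runtime.
    public static string[] GenerateIds(int seed, int count)
    {
        Random random = new Random(seed);
        string[] ids = new string[count];
        for (int i = 0; i < count; i++)
        {
            // Next(0, 100000) yields 0..99999; "D5" pads to five digits.
            ids[i] = random.Next(0, 100000).ToString("D5");
        }
        return ids;
    }
}
```

Because the sequence depends only on the seed, a test can assert on specific generated values, as the test above does for customer "00040".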

Again, the readability of the test is not harmed by the fact we are generating data, unlike other
techniques such as inserting data manually using ADO.NET.
By using attributes, we have an easy and flexible way to generate different data for each test, simply by
supplying SDG with a different project file to execute.
However, sometimes we might want to generate data once for the entire fixture.

The POCO Wrapper approach


This wrapper can be called in a similar fashion to the attributes. However, it is just a Plain Old CLR
Object (POCO) which you create. You can then call the Execute() method anywhere in your code.
At the moment, my DataAccess tests execute the same project three times, while one test
(GetAllCustomers_EmptyDatabase_ZeroRows()) doesn’t execute it at all. It therefore makes sense to
separate these tests into two fixtures, DataAccessTestsV2 and DataAccessV2_WithoutSDG, and to
execute the project once at the beginning of the fixture that needs it. This is what I have done for
my second set of Data Access tests.
The test fixture DataAccessV2_WithoutSDG holds all the tests which don’t require data
(GetAllEmployees_EmptyDatabase_ZeroRows), while the second fixture, DataAccessTestsV2,
holds all the tests which do require data. Without using any attributes, my TestFixtureSetup
contains this code:

[TestFixtureSetUp]
public void TestFixtureSetup()
{
    SDGConsoleWrapper sdgConsoleWrapper =
        new SDGConsoleWrapper(@"Database\NorthwindEmployees_Orders.sqlgen");
    sdgConsoleWrapper.Execute();
}

This runs the data generation before any tests are executed within DataAccessTestsV2. The result is
that all of my tests have the data required, and I only execute the generation once, speeding up the
tests.
NOTE:
During the trial period for SDG you will have a trial screen displayed for each test, as
each test executes the console application. You will need to purchase a license to be able
to execute without this screen.

Maintainability and coping with schema changes


We now have our data being generated successfully. However, database schemas change
over time; columns are added and removed, as are tables. Ideally, we don’t want to have to go and
modify all of our data generation scripts in order to be able to cope with minor changes. Luckily, SQL
Data Generator is able to cope with changing schemas and doesn’t require any modifications to the
project file!
When a project is opened in the UI or the console application, we check to see if the schema of any
of the tables has changed. If we detect schema changes, such as new or removed columns, data
type changes or changes to the relationships between tables, SDG updates the project as
required, automatically matching generators to columns when possible.
The result of this is that you can make changes to your database without having to worry about
keeping the projects up to date – they will just do it for themselves. Of course, if you want to make
changes, then that is also possible by loading the project file in the UI and saving it back to disk.

Testing your Business Layer


While SQL Data Generator can make a great difference when testing against databases, it also has a
number of uses when testing your business layer.
Testing the business layer has a number of different aspects. Sometimes we just have a simple data
model with a well defined set of test cases, such as the method GetTodaysDate() should return
today’s date and nothing else. Other times, we have complex business rules and data models which
have a large number of different possible combinations and ranges for the data. In order to test this
effectively, we need a very good set of test data to work against. SQL Data Generator can help
generate the data to meet these combinations and rules quickly and effectively. We can then export
this data to use as part of our unit and integration tests.
For the next test, we need a method that will verify that an order is valid. A valid order must have an
OrderID, CustomerID and an OrderDate. In normal test cases, we would have to create a series of
valid and invalid Order objects and give them to a ValidateOrder method to see if it worked
correctly for different combinations. For example, we would have a series of tests like this:

[Test]
public void ValidateOrder_ValidOrder_ReturnsTrue()
{
    Order o = new Order();
    o.CustomerID = 1;
    o.OrderID = 1;
    o.OrderDate = DateTime.Now.Add(new TimeSpan(0, 1, 0));

    Assert.IsTrue(BusinessLayer.ValidateOrder(o));
}

However, after a number of different scenarios this would become difficult to manage and difficult to
understand what is being tested, especially if we have a number of different tests with only slightly
different requirements. Making a change to the code being tested would then break all of our tests,
resulting in increased maintenance of the test suite.
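
For reference, the code under test is not shown in this article. The following is a minimal sketch of what Order and BusinessLayer.ValidateOrder might look like, assuming that "must have" means a positive OrderID, a positive CustomerID and an OrderDate that has actually been set. Those rules are an assumption made for illustration, not the real implementation.

```csharp
using System;

// Hypothetical shape of the order object used by the tests.
public class Order
{
    public int OrderID { get; set; }
    public int CustomerID { get; set; }
    public DateTime OrderDate { get; set; }
}

// Hypothetical business layer; the validation rules below are assumptions.
public static class BusinessLayer
{
    public static bool ValidateOrder(Order order)
    {
        if (order == null)
        {
            return false;
        }

        // ASSUMPTION: IDs are positive integers and an unset date is DateTime.MinValue.
        return order.OrderID > 0
            && order.CustomerID > 0
            && order.OrderDate != DateTime.MinValue;
    }
}
```

With a single method like this, every hand-written test differs only in the property values it assigns, which is why data-driven tests are attractive here.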
One alternative is to use SQL Data Generator and combine it with some of the advanced testing
frameworks already available. Both MbUnit
(http://blog.benhall.me.uk/2007/04/mbunit-datafixture-data-driven-unit.html) and XUnit
(http://blog.benhall.me.uk/2008/01/introduction-to-xunitnet-extensions.html) can use SQL tables,
CSV and XML files as the source of test data. We can then use the data SDG produces as the source
of the test data for the unit tests, allowing for large amounts of real data to be used in a
maintainable fashion.
However, there is still a problem. How do we actually get data into the format we need in order to
use it for testing? Currently, SDG can only insert data into SQL Server 2000 and 2005
databases. However, once the data is in there, it doesn’t have to stay there. SQL Server includes
a tool called Bulk Copy Program (BCP) which can export data to a flat file, for example CSV or
XML.
Executing the following command will export data from the table ValidateOrder in the database
TestData and write it to the CSV file TestData_ValidateOrder.csv, using a comma as the
delimiter. We can then include this file as an item within our solution and have it copied to the output
directory when we build.

"C:\Program Files\Microsoft SQL Server\90\Tools\Binn\bcp.exe" TestData.dbo.ValidateOrder out "C:\TestData_ValidateOrder.csv" -c -T -t, -S (local)
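
Once the CSV file exists, a test can read it back into objects. The sketch below shows one way to do that. The CsvOrderRow and CsvOrderReader names are hypothetical, and the code assumes the columns appear in the order OrderID, CustomerID, OrderDate, with no commas or quotes inside the values.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical row type for one exported line.
public class CsvOrderRow
{
    public int OrderID;
    public int CustomerID;
    public DateTime OrderDate;
}

public static class CsvOrderReader
{
    // Reads comma-delimited rows such as: 1,1,2008-10-10 12:23:45.930
    // ASSUMPTION: column order is OrderID, CustomerID, OrderDate.
    public static List<CsvOrderRow> Parse(TextReader reader)
    {
        List<CsvOrderRow> rows = new List<CsvOrderRow>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0)
            {
                continue; // skip blank lines, such as a trailing newline
            }

            string[] fields = line.Split(',');
            CsvOrderRow row = new CsvOrderRow();
            row.OrderID = int.Parse(fields[0], CultureInfo.InvariantCulture);
            row.CustomerID = int.Parse(fields[1], CultureInfo.InvariantCulture);
            row.OrderDate = DateTime.Parse(fields[2], CultureInfo.InvariantCulture);
            rows.Add(row);
        }
        return rows;
    }
}
```

Passing a TextReader rather than a file path keeps the parser easy to exercise with a StringReader; in a real test you would hand it a StreamReader over the copied CSV file.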

Exporting as XML is a little bit more difficult. Instead of giving a database and table to use as the
source, we provide a query. This query selects all the data, uses the built-in FOR XML AUTO clause
to convert the data into XML elements and then wraps the results in an XML root element.

"C:\Program Files\Microsoft SQL Server\90\Tools\Binn\bcp.exe" "SELECT '<root>' + (select * from TestData.dbo.ValidateOrder FOR XML AUTO) + '</root>' AS Xml" queryout "C:\TestData_ValidateOrder.xml" -c -T -S (local)

After executing this, all of our data (1000 rows) is created as XML elements within the XML file.
There is no limit to the amount of data we can use. The data produced will look like this:

<root><TestData.dbo.ValidateOrder OrderID="1" CustomerID="1" OrderDate="2008-10-10T12:23:45.930"/></root>

We can then use this data together with the MbUnit XmlDataProvider
(http://blog.benhall.me.uk/2007/04/mbunit-datafixture-data-driven-unit.html) feature to introduce a
great deal of flexibility into our unit tests. We can create a series of different XML files with different
data scenarios for tests. We can then store them in source control so that they can be shared. As
the tests are not accessing the database, they will run faster and we don’t have to worry about having
access to a database with the data inserted.
In the following test, we use the DataFixture attribute with the filename for the XML and an XPath
expression defining how to find the data items (we only have one root so we just use that). We
stipulate that the test should be executed for each XML element of type TestData.dbo.ValidateOrder,
passing in the current XmlNode for the test.
We can then use the XmlNode to access the data we want to test against.

[DataFixture]
[XmlDataProvider(@"TestData\TestData_Valid_ValidateOrder.xml", "root")]
public class BusinessLayerTests_MbUnit_Valid
{
    [ForEachTest("TestData.dbo.ValidateOrder")]
    public void x(XmlNode node)
    {
        Order o = new Order();
        //Not nice, but need to get around limitations of Xml and the Xml Serialiser
        o.CustomerID = Convert.ToInt32(node.Attributes["CustomerID"] != null ?
            node.Attributes["CustomerID"].InnerText : "-1");
        o.OrderID = Convert.ToInt32(node.Attributes["OrderID"] != null ?
            node.Attributes["OrderID"].InnerText : "-1");
        o.OrderDate = Convert.ToDateTime(node.Attributes["OrderDate"] != null ?
            node.Attributes["OrderDate"].InnerText : "01/01/1970");

        Assert.IsTrue(BusinessLayer.ValidateOrder(o));
    }
}

As a result, loading this test will produce 1000 different tests, each using data produced using SDG.
This is excellent for integration and boundary tests as it is so quick and easy to produce huge amounts
of data to test against.
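
To show what happens for each element without relying on MbUnit, here is a minimal sketch that walks the exported XML with the standard XmlDocument class. XmlRowEnumerator and its method names are hypothetical. The attribute fallback mirrors the null checks in the test above, which are needed because FOR XML AUTO omits the attribute for a column whose value is NULL.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper; MbUnit performs the equivalent enumeration for you.
public static class XmlRowEnumerator
{
    // Returns every child element of the document root that has the given name.
    public static List<XmlNode> SelectRows(string xml, string elementName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<XmlNode> rows = new List<XmlNode>();
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            if (child.NodeType == XmlNodeType.Element && child.Name == elementName)
            {
                rows.Add(child);
            }
        }
        return rows;
    }

    // Reads an attribute value, or returns the fallback when the attribute is absent.
    public static string AttributeOrDefault(XmlNode node, string name, string fallback)
    {
        XmlAttribute attribute = node.Attributes[name];
        return attribute != null ? attribute.Value : fallback;
    }
}
```

A helper such as AttributeOrDefault would also let the three conversions in the MbUnit test be written without repeating the null check each time.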

Summary
Hopefully this article has demonstrated some of the possibilities when using SQL Data Generator
with your unit testing. Download the wrapper from Codeplex
(http://www.codeplex.com/SDGGenerators) and experiment to see what kind of difference it makes
to your unit tests. Also, keep in mind that the data produced doesn’t have to be used just with
databases; it can be used as any kind of test data. When you combine this with the more powerful
features of the unit testing frameworks, you can test code faster and more effectively.
Finally, we would love to hear your feedback about the approach taken in this article and SQL Data
Generator in general.

The code for this article can be downloaded from here:
http://www.simple-talk.com/iwritefor/articlefiles/493-RedGate.SDG.Code.zip