LINQ to SQL, DataSets and Nullable types

June 14, 2008

I am writing an informational application at the moment. I want to code fast. I want it to be installable by just about anyone. I don’t want the user to have to configure the application, and the application involves graphics, so I decided on a WinForms app. And I want the code to be simple. Because it is informational, whoever is doing the data entry does not necessarily have all the information at the time they create the item.

So my first thought was, “I know DataSets, I know Table Adapters, I’ll use them”. And of course my next thought was how best to cause the errors to be raised at compile time and not at runtime. Earlier detection of bugs makes them cheaper to fix.  So I added some DataSets, dragged some tables onto the designer, and started to code away, using TableAdapter Fill and Update commands.

DataSets are a beautiful thing. Once populated, a DataSet can be disconnected, moved around, modified, passed back, reconnected, and all rows can be updated in a batch. If you’re writing a multi-tier application, passing data back and forth this way can be very practical. By using the designer, and dragging the tables onto the designer, you are in-effect creating typed-DataSets. And typed-DataSets are good if you want to ensure that you are using the database fields consistent with their underlying database types.

So there I was coding away, and I decided that I wanted a number of my fields to be nulls – hey, the person entering may not have the information at the time they enter it. But I also wanted to bind the data to the user interface components, like DataGridView or ComboBox. This should be as easy as possible. Create an instance of a typed-DataSet, pass it into the typed-Table Adapter Fill command, and presto, the DataSet is populated. Bind it directly to the component’s DataSource. Set the DataMember and DataValue properties, and whalla, the component is populated. If I have to do anything more than this, then I lose time doing something that should be simple infrastructure, when I could be coding business logic.

The problem I had, of course, was due to Nullable types. When Microsoft created typed DataSets, there wasn’t a concept of nullable types, and as such the MSDataSetGenerator custom tool that is used to generate typed DataSet code from the designer can’t handle nullable types. Of course, if you drill down into the underlying datarows you can, of course, check a column to find out if the column IsDBNull. But this doesn’t help if you simply want to bind a typed-DataSet to a grid or combobox. It crashes when it reaches the null value, because it can’t parse the value to be returned to the component (int and DateTime .net types can’t be set to null). Microsoft, apparently, has no intention of changing this.

So my first thought here was “someone must have had this problem before, surely someone has written their own code-generator custom tool to provide a workaround” so I went looking. I found a whole stack of references to building custom tools, and specifically related to the nullable issue. But no one had apparently written one and made it freely available, or at least not with the keywords I was using. Damn. I would have to write my own one. Or, I could try LINQ to SQL to see if it could handle nullable types. With LINQ, I do like the way queries are created in code, because if you get the structure of the query wrong in code, it will break the build at compile time, not run time. As it turns out, LINQ does allow nullable types, such as int? and DateTime?

First, I need a call to get all records from my reference table. I added a new item to my project. I selected Add New Item in Solution Explorer and selected LINQ to SQL classes. It opened up in the designer. I connected up to my SQL Server database and dragged then entire table structure into the designer. Saved it. Compiled it. Next, I went into my business layer and created a class for my object.

using System.Linq;
using MyDataLayer;
class MyRefTableClass
{
 internal static
     IQueryable<MyRefTable> SelectAll()
 {
  MyDatabaseDataContext context =
                  new MyDatabaseDataContext();
  return context.MyRefTables.AsQueryable();
 }
}

That retrieved all the MyRefTable rows. Now I could bind it to a combo box, like so:

cboMyCombo.DataSource= MyRefTableClass.SelectAll();
cboMyCombo.DisplayMember = "DisplayNameField";
cboMyCombo.DisplayValue = "ValueField";

Ok, so I did the same for my informational table, named MyInfoTable. One of the fields in MyInfo has to be populated by the key value out of a MyRefTable row, or it can be null. Then I need to update the MyInfo with the new value.

Notice above that I have declared the DataContext object within the scope of the SelectAll method. Because of this, if I want to update a MyRefTable row, information about the original state of the row will be lost, and generally, I will need to retrieve the row from the database again, modify the object, then resubmit it to the database. This is undesireable because you are requerying the database for an object that you are about to change. That’s extra unnecessary traffic.

Also, I want the ability to add a row and delete a row. Creating the objects themselves is easy. My informational class now looks like this:

class MyInfoTableClass
{
 //note the following line was moved from method
 //scope to class scope as it will then retain
 //information about changes in underlying objects.
 static MyDatabaseDataContext context =
    new MyDatabaseDataContext();
 internal static
    IQueryable<MyInfoTable> SelectAll()
 {
  return context.MyInfoTables.AsQueryable();
 }
 internal static
  IQueryable<MyInfoTable> SelectByID(int ID)
 {
  var result = (from p in context.MyInfoTables
                where p.ID == ID
                select p);
  return result;
 }
 internal static void Insert(
    MyInfoTable newInfoTable)
 {
  context.MyInfoTables.
     InsertOnSubmit(newInfoTable);
 } 
 internal static void DeleteByID(int ID)
 {
  var result = (from p in context.MyInfoTables
                where p.ID == ID
                select p);
  context.MyInfoTables.
     DeleteOnSubmit(result.Single());
 }
}

Note that LINQ operates in batches, so after doing all the CRUD work, you can send all the changes in one big hit by calling SubmitChanges() if you like. At the moment I am calling SubmitChanges straight after I modify the data.

Note that the Delete method above is really undesireable.  It will cause extra traffic to the database as it retrieves the record before deleting it. It’s best if you keep the object alive then pass that object back into the data context object’s DeleteOnSubmit method. I think it would be nice if LINQ provided an easier way to do this.

Another problem I see is in multi-tier development. It can be advantageous to completely disconnect from the database. That’s not really practical in LINQ if you want to reduce network/query traffic. If the datacontext object goes out of scope, you have to reattach the data object to a new live instance of the data context. That mean that there is a need to effectively keep the original object around to reattach it. Ok, that might be fine, but if you have a whole lot of referential contraints within your dbml file, reattaching your object to a new DataContext object will simply bomb out with “An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext.  This is not supported.” Unfortunately, the only way around this seems to be to Detach all referenced data, or don’t have referential constraints (which kind of defeats the purpose of LINQ). I’m looking for a clean way to do this at the moment. If you know one, please let me know.