Tuesday, April 27, 2010

AAPL Part 6: Building a DataMapper and DataMapperFactory

I recently decided that I needed to take a fresh look at how to build a persistence architecture that would provide the flexibility of an ORM, that would embrace change instead of resist it (making maintenance code easier), and would use ADO.Net.  I started building from the ground up, threw away any best practices that resulted in friction, and Agile ADO.Net Persistence Layer is the result.  Part 1 in the series can be found here: Agile ADO.Net Persistence Layer Overview

In this post, we’ll get into the DataMapper.  If you’ve been waiting to see some real ADO.Net code, this is the class you’ve been waiting for.

Where the DataMapper classes fit in

Let’s review our solution structure.  Core is our main business layer project (Note: in previous posts and in the sample code Core is actually the BAL project.  I never liked the name BAL so I’ve renamed it Core.  Same code, just a different name.  Sorry for any confusion.).  Core contains Service classes which are our one stop shop for all persistence (see AAPL Part 3 and AAPL Part 4 for more on how the service classes work). The Service classes use the SqlDao for all communication with the database.  The SqlDao uses the DataMapper classes to handle all mapping between the DataReader returned by executing a query, and the desired data shape.  Here’s what our project looks like:

image

Last post we covered the SqlDao class and saw how it’s data access methods take a query wrapped in an SqlCommand, execute that query, then return either a Single<DTO> (where DTO is our desired data shape), List<DTO>, DataPage<DTO>,or a scalar value.  The data access methods in SqlDao handle all the boilerplate logic for getting a connection, executing a query, stuff like that, but the logic that is specific to a data shape, things like get the value from this column in my reader put it in this property on my returned data shape, that stuff get’s delegated to a DataMapper class. This is what that SqlDao code looks like:

SqlDataReader reader = command.ExecuteReader();

if (reader.HasRows)

{

    reader.Read();

    IDataMapper mapper = new DataMapperFactory().GetMapper(typeof(T));

    dto = (T)mapper.GetData(reader);

    reader.Close();

}

What does the DataMapperFactory do?

You’ll notice in the code above that we get our IDataMapper object from a DataMapperFactory.  If we need to get the DataMapper for type User, we just call the DataMapperFactory.GetMapper method, pass type(User) as a parameter, and GetMapper returns the appropriate concrete DataMapper (Note that the code example above uses generics, so we use generic T instead of a specific type like User, but the logic is the same). DataMapperFactory is a simple factory class with one method, GetMapper. The code looks like this: 

class DataMapperFactory

{

    public IDataMapper GetMapper(Type dtoType)

    {

        switch(dtoType.Name)

        {

            case "ListItemDTO":

                return new ListItemDTOMapper();

            case "User":

                return new UserMapper();

            default:

                return new GenericMapper(dtoType);

        }      

    }

}

Whenever we create a new IDataMapper for a specific type, like User, we need to add it to this factory method.  The method is just a switch statement that uses the type.Name to look for a type-specific implementation of IDataMapper.  If it finds one, it returns it.  If it doesn’t find a type specific IDataMapper, it returns an instance of GenericMapper, which is our reflection based DataMapper that can be used to handle mappings for any query / data shape combination.  We’ll cover GenericMapper in the next post.

What does the DataMapper do?

The behavior of a DataMapper class is defined by the IDataMapper interface.  The interface requires DataMappers to implement two methods.

interface IDataMapper

{

    // Main method that populates dto with data

    Object GetData(IDataReader reader);

    // Gets the num results returned. Needed for data paging.

    int GetRecordCount(IDataReader reader);

}

The GetData method is the important one.  That’s the method that our SqlDao calls whenever it needs to map the current record in an SqlDataReader to a specific data shape.  Remember, this method doesn’t do any mapping of an SqlDataReader to a List, or DataPage, or anything like that. That’s handled by the SqlDao data access methods. The purpose of the DataMapper GetData function is to take an SqlDataReader that’s already open and on a record, read that record, put the column data into an object (DTO) of the desired data shape, then return that single DTO.

Note that GetData returns an Object, so the SqlDao has to cast the GetData result to the proper data shape.  I tried all kinds of designs to get around this cast, I used interfaces, generics, base classes, but in the end all of these options added complexity to the code that I just couldn’t accept.  So, I opted to stick with GetData returning an object.  I think it was the right tradeoff, but if anyone has a better solution please let me know.

Here is a diagram of our IDataMapper interface and a couple of concrete DataMapper implementations.   

image

 

Inside a DataMapper class

So let’s take a look inside the UserMapper class and see what an IDataMapper implementation looks like. The user class is a data shape that represents, wait for it… a user.  The class looks like this.

image

So the job of UserDataMapper is to take a SqlDataReader, get the values from it’s columns, and put those values into and instance of the  User class above.  The DataMapper implementation that I use accomplishes this task in a very structured way that is easy to copy and reproduce for new data shapes. 

Before we look at the code, I should mention that I write a fair amount of extra code to get the ordinals for each data field instead of just using the string field names.  The reason for this is that using the ordinals allows me to use strongly typed methods on the SqlDataReader to get each column.  This allows me to avoid the cast to and from Object that occurs when you get a column by name which results in boxing and unboxing for value types.  It also gives me a little type safety.  If the query is returning a column called UserGuid that has a type of Int32, when I call reader.GetGuid() on that column, my code will throw an error and alert me that there is a problem with my query.  For more on the benefits of using ordinals take a look at High Performance Data Access Layer Architecture Part 3

Now, with no further ado, here’s the UserMapper implementation.

class UserMapper : IDataMapper

{

    private bool _isInitialized = false;

    private int _ordinal_UserGuid;

    private int _ordinal_UserRole;

    private int _ordinal_CreatedUtc;

    private int _ordinal_ModifiedUtc;

    private int _ordinal_CompanyGuid;

    private int _ordinal_FirstName;

    private int _ordinal_LastName;

    private int _ordinal_Email;

    private int _ordinal_Password;

    private int _ordinal_PhoneWork;

    private int _ordinal_PhoneMobile;

 

    private void InitializeMapper(IDataReader reader)

    {

        PopulateOrdinals(reader);

        _isInitialized = true;

    }

 

    public void PopulateOrdinals(IDataReader reader)

    {

        _ordinal_UserGuid = reader.GetOrdinal("UserGuid");

        _ordinal_UserRole = reader.GetOrdinal("UserRole");

        _ordinal_CreatedUtc = reader.GetOrdinal("CreatedUtc");

        _ordinal_ModifiedUtc = reader.GetOrdinal("ModifiedUtc");

        _ordinal_CompanyGuid = reader.GetOrdinal("CompanyGuid");

        _ordinal_FirstName = reader.GetOrdinal("FirstName");

        _ordinal_LastName = reader.GetOrdinal("LastName");

        _ordinal_Email = reader.GetOrdinal("Email");

        _ordinal_Password = reader.GetOrdinal("Password");

        _ordinal_PhoneWork = reader.GetOrdinal("PhoneWork");

        _ordinal_PhoneMobile = reader.GetOrdinal("PhoneMobile");

    }

 

    public Object GetData(IDataReader reader)

    {

        // This is where we define the mapping between the object properties and the

        // data columns. The convention that should be used is that the object property

        // names are exactly the same as the column names. However if there is some

        // compelling reason for the names to be different the mapping can be defined here.

 

        // We assume the reader has data and is already on the row that contains the data

        //we need. We don't need to call read. As a general rule, assume that every field must

        //be null  checked. If a field is null then the nullvalue for that  field has already

        //been set by the DTO constructor, we don't have to change it. 

        if (!_isInitialized) { InitializeMapper(reader); }

        User dto = new User();

        // Now we can load the data

        if (!reader.IsDBNull(_ordinal_UserGuid)) { dto.UserGuid = reader.GetGuid(_ordinal_UserGuid); }

        if (!reader.IsDBNull(_ordinal_UserRole)) { dto.UserRole = (UserRole)reader.GetInt32(_ordinal_UserRole); }

        if (!reader.IsDBNull(_ordinal_CreatedUtc)) { dto.CreatedUtc = reader.GetDateTime(_ordinal_CreatedUtc); }

        if (!reader.IsDBNull(_ordinal_ModifiedUtc)) { dto.ModifiedUtc = reader.GetDateTime(_ordinal_ModifiedUtc); }

        if (!reader.IsDBNull(_ordinal_CompanyGuid)) { dto.CompanyGuid = reader.GetGuid(_ordinal_CompanyGuid); }

        if (!reader.IsDBNull(_ordinal_FirstName)) { dto.FirstName = reader.GetString(_ordinal_FirstName); }

        if (!reader.IsDBNull(_ordinal_LastName)) { dto.LastName = reader.GetString(_ordinal_LastName); }

        if (!reader.IsDBNull(_ordinal_Email)) { dto.Email = reader.GetString(_ordinal_Email); }

        if (!reader.IsDBNull(_ordinal_Password)) { dto.Password = reader.GetString(_ordinal_Password); }

        if (!reader.IsDBNull(_ordinal_PhoneWork)) { dto.PhoneWork = reader.GetString(_ordinal_PhoneWork); }

        if (!reader.IsDBNull(_ordinal_PhoneMobile)) { dto.PhoneMobile = reader.GetString(_ordinal_PhoneMobile); }

        return dto;

    }

 

    public int GetRecordCount(IDataReader reader)

    {

        Object count = reader["RecordCount"];

        return count == null ? 0 : Convert.ToInt32(count);

    }

}

You’ll notice that the first thing we do is declare an _ordinal_FieldName member for each property in our User class.  These local members are used to store the ordinal location for the column in the SqlDataReader that maps to that User field.  We also create an _isInitialized flag that let’s us know whether we’ve set our _ordinal_FieldName values yet.  This is important when the UserMapper is being used to get more than one User from a DataReader.  The SqlDao code will get a single IDataMapper instance and will then call that mapper over and over for each record that exists in the SqlDataReader. We only want to find the ordinals once, so we create our _isInitialized flag that can be set to true after we set all the ordinal values.

Next is the InitializeMapper method.  This just calls our PopulateOrdinals method and then sets that _isInitialized flag, that we just discussed, to true.

PopulateOrdinals is the method that takes our SqlDataReader, finds the ordinal location of each field we need, then stores those ordinal values in our local _ordinal_FieldName members.

Finally we arrive at GetData, the heart of our class.  We’ve structured our code in such a way that GetData is actually pretty simple.  We check to make sure the _isInitialized flag has been set. If it hasn’t we call PopulateOrdinals.  We then create a new instance of our User class called dto. After that it’s just a matter of repeating the same line of code for each field.

if (!reader.IsDBNull(_ordinal_UserGuid)) { dto.UserGuid = reader.GetGuid(_ordinal_UserGuid); }  

This one line of code performs a null check on the ordinal, if the column exists it uses a strongly typed Get method to get the value of that column and assign it to the dto, if the column is null then we do nothing because the constructor for User already set each property to it’s null value on creation.  And that’s it, after setting each field value (or not setting it) we just return our dto and we’re done.  Simple maintainable, easy to copy and reproduce code.

The last method is GetRecordCount.  We need this method to support the DataPage<DTO> shape.  Our SqlDao needs to know the total record count if it’s returning a data page.  If it’s not returning a data page this method isn’t used.

Conclusion

That’s it.  Now that I look at the DataMapper code its a little anti-climactic because it’s really very simple stuff.  But then that’s the idea, break a complex operation into smaller pieces so that each one is easy to understand, and if needed, reproduce.  To add a new DataMapper we just copy and paste the existing UserMapper and change the field names.  Easy.  We don’t need to touch any of the code at the SqlDao level that handles getting Single<DTO>, List<DTO>, or DataPage<DTO> return types.  We also don’t touch any code that affects any other data shape.  All of our code is encapsulated in this one easy to handle class. 

Also, keep in mind that the real power of this architecture is that we don’t need to create a type specific DataMapper at all.  For the vast majority of tasks, the GenericMapper is going to work just fine.  In practice I never create any type-specific DataMappers during development because it seems like at that early stage data shapes are always in flux, and when a data shape changes, the last thing I want to do is go back and update a bunch of mapping code.  I look at the type specific DataMappers as something that I never want to use unless an app has gone into production and it’s experiencing performance issues related to the GenericMapper.  If that happens then I have a hook in the framework that I can use to create a simple DataMapper for that data shape that contains what I think is the fastest ADO.Net code possible. 

Next time we’ll go over the GenericMapper.

6 comments:

  1. How long till part 7?

    ReplyDelete
  2. Will there be a blog post for GenericMapper? I like your implementation very much. Many of the examples cover LINQ to SQL so it is refreshing to see your example.

    ReplyDelete
  3. I wonder why people don't just use
    public int Prop { get; set; }
    instead of defining a private variable.
    Can you tell me why plz?

    ReplyDelete
  4. Hi Andrius, I think most people who know about the new property syntax use it. The exception is when you want to do some extra processing on a get or set like when implementing a Lazy Load or an Observable pattern.

    ReplyDelete
  5. Hi

    Does this architecture still hold good for an ASP.Net web forms application?

    I liked both AAPL and the high performance DAL. I think high performance DAL was a cleaner design however I do agree with your reasons for improving to AAPL.

    I am looking for a solution to improve the architecture of an existing ASP.Net web forms application.

    I will highly appreciate it if you can share some information on improving the architecture for the mentioned application and that for a lesser known database that just has an ADO.Net library for data access.

    ReplyDelete
  6. Hi Rudy,

    I've been working on a similar concept using IDataReader and I've been able to retrieve records making use of reflection. Using reflection I juts have to bind records with the custom classes and I don't have to provide the mapping details during this process. However performance is something which is worry me. I've tried to retrieve the same data inside dataset and there was a huge difference in the time taken.

    I would like to try with adapters so that I could bind my custom classes rather than using IDataReader. If you want then I could share my code with you so that you could point out areas where I could improve.

    Another thing which I forgot to mention was there were approx 16,000 rows which my stored procedure was fetching. I'm trying to see if somehow it makes any difference.

    ReplyDelete