
Tuesday, January 19, 2010

Agile ADO.Net Persistence Layer Part 3: Service Class Single<DTO> Data Access Method

When I say data access methods, I’m talking about the methods my UI is going to call whenever it needs to get data.  When my Posts controller needs a list of BlogPosts to display, it’s going to call a BAL method like GetAllBlogPosts() or GetAllBlogPostsForCategory().  Last time I mentioned (over and over) that I like to keep things simple.  When I need to get or save data, I don’t want to have to search through 3 different classes just to find the one with the method I need.  Instead, I’m putting all my persistence logic for a given aggregate in just one place: a service class.  This is not a web service; I’m using “service” in the Domain Driven Design sense here. That means I have a BlogService class that is my one-stop shop for all persistence that has to do with Blogs, BlogPosts, SubmittedBlogUrls, and anything else that falls within the Blog aggregate. Here is what my BlogService class looks like.  You can see that it’s mostly “Get” data access methods.

[Image: the BlogService class, showing its data access methods]
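
A rough sketch of that surface, limited to the methods referenced in this series (bodies elided, and the exact parameter types are my best guess from the calling code):

// A sketch of the BlogService surface, reconstructed from the methods
// referenced in this series (bodies and remaining members elided)
public class BlogService
{
    public BlogPost GetBlogPost(Guid postGuid) { throw new NotImplementedException(); }
    public SubmittedBlogUrl GetSubmittedBlogUrl(int id) { throw new NotImplementedException(); }
    public List<BlogPost> GetAllBlogPosts() { throw new NotImplementedException(); }
    public List<BlogPost> GetAllBlogPostsForCategory(string category) { throw new NotImplementedException(); }
    public DataPage<BlogPost> GetPageOfBlogPosts(int pageSize, int pageIndex, string sortBy) { throw new NotImplementedException(); }
    public void Save(BlogPost dto) { throw new NotImplementedException(); }
    public void Save(SubmittedBlogUrl dto) { throw new NotImplementedException(); }
}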

What’s an Aggregate?

I keep using the word aggregate.  If you’re not familiar with the term, it just means a group of entities that all share the same persistence class (whether that’s a repository, a service, or something else).  This is a key concept in Domain Driven Design; if you want to know more, I recommend picking up Eric Evans’ book or Jimmy Nilsson’s book on DDD.  For now, all you need to know is that a BlogPost can never exist without a Blog, so there’s no point in BlogPost having its own persistence class.  In fact, if we do break BlogPost out into its own persistence class, BlogPost’s dependency on Blog will lead to problems down the road.  What’s the solution?  We put the data access methods for both Blog and BlogPost in the same persistence class and call the group an aggregate.  That is why BlogService has methods for both Blog and BlogPost entities.

What type of data will data access methods return?

We covered this in the last post, but to recap: all data will be returned as Data Transfer Objects (DTOs).  The DTOs are all defined in our Common assembly, in the DataShapes folder.  Our BAL will return data in one of the following 4 formats.

  • a single DTO
  • a List<DTO>
  • a DataPage<DTO>
  • a string value

For more see last week’s post Agile ADO.Net Persistence Layer: Part 2 Use DTOs.
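
If you’re wondering what a DataPage<DTO> is, it’s just a small generic wrapper around one page of results.  Here’s a minimal sketch (the property names are illustrative; the real class lives in the DataShapes folder):

// A minimal DataPage<T> sketch: a page of items plus the paging
// metadata the UI needs to render a pager (property names are
// illustrative, not the exact ones in the Common assembly)
public class DataPage<T>
{
    public List<T> Items { get; set; }
    public int PageIndex { get; set; }
    public int PageSize { get; set; }
    public int TotalRecords { get; set; }

    public int TotalPages
    {
        get { return (int)Math.Ceiling((double)TotalRecords / PageSize); }
    }
}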

A simple Single<DTO> data access method

Let’s look at the simplest possible data access method.  GetBlogPost() takes a postGuid as its parameter, defines the query that finds the BlogPost entity for that postGuid, and returns the result as a single BlogPost DTO.  Here’s the complete method.

public BlogPost GetBlogPost(Guid postGuid)
{
    string query = @"SELECT p.*, s.Score
                    FROM [dbo].[BlogPost] p
                    LEFT JOIN [dbo].[BlogPostReputationScore] s ON s.PostGuid = p.PostGuid
                    WHERE p.PostGuid = @PostGuid";
    SqlDao dao = new SqlDao();
    SqlCommand command = dao.GetSqlCommand(query);
    command.Parameters.Add(dao.CreateParameter("@PostGuid", postGuid));
    return dao.GetSingle<BlogPost>(command);
}

The first thing you’ll notice is that this isn’t a lot of code.  All we’re really doing here is defining a parameterized T-SQL query, wrapping that query up in a SqlCommand, and then passing the command and our desired return type off to a Data Access Object (DAO) that automagically executes the command and maps the results to our desired type.  It may seem counterintuitive to write code like this when we haven’t even written the DAO yet, but that’s exactly how I did it when I wrote this code for the very first time.  I decided that my data access methods should be very simple.  I would start with the query and the DTO type that I wanted it to return, then pass them both to some type of helper class that would handle the details of running the query and figuring out how to map the query results to the properties of my DTO.  This top-down approach gave me a very clear picture of how I needed my DAO to behave.

What’s a DAO (Data Access Object)?

By looking at the query logic above, you can see that I have this thing called a DAO, or Data Access Object.  This is a class that encapsulates the helper logic for working with my database.  The DAO handles things like creating parameters and getting a connection, and most importantly it implements methods that return my four main data formats: GetSingle<DTO>, GetList<DTO>, GetDataPage<DTO>, and GetStringValue(). The DAO and its associated DataMappers are where you’ll find the special sauce that makes this architecture work.  We’ll get into their implementation later on.
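
As a preview, here’s a minimal sketch of what GetSingle<DTO> can look like, with the mapping done inline via reflection.  This is just the idea; the real DAO hands the mapping off to its DataMappers, which we’ll cover later:

// A minimal GetSingle<T> sketch: execute the command, then copy each
// result column into the DTO property with the same name. Simplified;
// the real DAO delegates this mapping to a DataMapper.
public T GetSingle<T>(SqlCommand command) where T : new()
{
    using (SqlConnection connection = command.Connection)
    {
        connection.Open();
        using (SqlDataReader reader = command.ExecuteReader())
        {
            if (!reader.Read())
                return default(T);  // no row found; null for reference types
            T dto = new T();
            for (int i = 0; i < reader.FieldCount; i++)
            {
                PropertyInfo property = typeof(T).GetProperty(reader.GetName(i));
                if (property != null && !reader.IsDBNull(i))
                    property.SetValue(dto, reader.GetValue(i), null);
            }
            return dto;
        }
    }
}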

A BAL that embraces change

It’s easy to look at the simple code above and miss something that I think is very important; in fact, that thing is the whole reason I wrote this framework.  That simple data access method is the blueprint for a flexible persistence layer that makes changing your entities and their persistence code easy and almost painless.  It sets up a simple 3-step process for all data access in your application.

  1. Define a DTO in the exact data shape that you’re looking for.  That means creating a DTO property for each data field that you need out of the database (see the sketch just after this list).
  2. Define a query that gets the data.  It can be as simple or as complex as you like.  You can develop it in SQL Server Management Studio and easily optimize it; use whatever process or tools work for you.  When you’re done, just paste the query into your data access method.
  3. Pass both the query and your DTO type to the DAO, and it will automatically handle the field mappings and pass the results back in the data shape you requested.
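
To make step 1 concrete, here’s a sketch of the BlogPost DTO that GetBlogPost() above returns.  The field list is illustrative, assembled from the fields used elsewhere in this series:

// The BlogPost DTO: one property per field we want back from the
// database (field list assembled from the code in this series)
public class BlogPost
{
    public Guid PostGuid { get; set; }
    public Guid BlogGuid { get; set; }
    public string PostTitle { get; set; }
    public string PostUrl { get; set; }
    public string PostSummary { get; set; }
    public int Score { get; set; }  // populated by the LEFT JOIN in GetBlogPost()
}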

This is a very powerful way to work.  I can’t count the number of times that I’ve worked with an architecture where I dreaded any changes, because I knew that any added data field would require me to modify a sproc, a DAL method, a BAL method, parsing logic, and an entity class; it all adds up to a lot of friction that resists any change.  This BAL design embraces change.  It’s written with the attitude that we know change is going to happen, so we’re going to give you as few things as possible to modify, and make sure we don’t have any cross-cutting dependencies, so that you can make changes easily.

Next time, more on the service classes.

Next Post: Agile ADO.Net Persistence Layer Part 4: Writing data access for new data shapes

Saturday, January 9, 2010

Agile ADO.Net Persistence Layer Part 1: Design Overview

Last year I did a blog post series on how to design a High Performance DAL using ADO.Net.  Judging by the response I’ve gotten from that series, there must be a lot of developers out there who believe that even with the availability of LINQ, Entity Framework, and a host of other ORM technologies, ADO.Net is still your best option when designing a persistence layer.  BTW, I’m one of them.

After that series I started digging into Entity Framework and LINQ, and I was impressed by how effortless those technologies made certain parts of application development.  Once the EF or LINQ mappings were in place, I found myself writing much less code and focusing more on the business logic of my application.  But I also found myself driven right back to ADO.Net whenever I struggled with how to do something that I already knew how to do in T-SQL, or fought errors caused by an attached data context object that I didn’t want in the first place.

So, I found myself back with ADO.Net, but I didn’t want to give up the ease of development and coding efficiencies that I got from the ORMs.  I decided to take a fresh look at how to design an ADO.Net persistence layer.  I started at the top (the application layer), thought about how I want my app code to consume business logic, and then worked down from there. I incorporated many of the best practices that I’ve used over the years, but I also looked critically at each one, and whenever I found that something was slowing me down or leading to duplicate code, I threw it out.  The resulting architecture is quick to develop on, testable, easily maintainable, and can be easily optimized for performance.  This series of posts will detail the entire design, from application code to database.

A peek at the final design

I always find it’s easier to follow along if I have some idea where I’m going, so this is a quick look at where we’re headed.  We’re going to cover the entire architecture for a simple blog aggregator called RelevantAssertions.com.  RelevantAssertions is an Asp.Net MVC application that uses our new Agile ADO.Net Persistence Layer.  We have 4 projects in the RA solution: WebUI, Tests, Common, and BAL. Here’s a quick look.

[Image: the RelevantAssertions solution structure in Visual Studio]

WebUI

WebUI is our Asp.Net MVC application; that’s our application layer.  It contains all UI and presentation logic, but absolutely no business logic.

Common

Common contains classes that we need at all layers of our code. The DataShapes folder is where we define all of our DTO classes.

Tests

This project contains all of our automated tests for both the BAL and the UI.

BAL

I know it’s probably more correct to say BLL, but I like the term BAL; it just sounds better. This is the project where everything interesting happens.  The main workhorses of the BAL are the service classes.  These are not web services; they are service classes in the DDD sense.  The service classes are going to be the one-stop shop where our application code goes to do whatever it needs to do.  The service classes will also contain query logic. That’s right, I said query logic.  Behind the scenes the service classes will use DAOs (Data Access Objects), Data Mappers, and Persisters to do their thing in an efficient object-oriented way, but the only classes our application code will use directly are the services.

What, no DAL??

You’ll notice that there is no DAL.  It seems a little strange for an architecture that focuses on ADO.Net to have no DAL, but there’s a reason.  Usually, the DAL is where I’d put my query logic, mappings to query results, and any other database-specific code.  The DAL would let me keep all of my T-SQL and ADO.Net code separated from the rest of my application, and this separation provided some important benefits like:

1) Separation is its own virtue; that’s just the right way to do it.
2) I wouldn’t have leakage of db or query logic into my business logic.
3) I could easily swap out SQL Server for another database if needed.
4) It encapsulates code that would otherwise be repeated.
5) We need to hide T-SQL from programmers, it scares them.
6) It’s fun to make changes to 3 layers of code every time I add a new data member to an entity class.

At least that’s what I was always taught.  But after working with more ORM oriented architectures and the Domain Driven Design way of doing things, I started to look at things differently. Let’s look at some of these benefits (at least the ones that aren’t sarcastic).

I’ve never met anyone who’s ever switched out their database

YAGNI means You Ain’t Gonna Need It. The idea is that we spend a lot of time building stuff that we don’t really need. We build it because it seems like the architecturally correct way to do it, or we think we’ll need the feature one day, or maybe we’re just used to doing it that way.  Whatever the reason, the result is that we spend a lot of time coding features that are never used, and that’s not good.  After doing this for 14 years or so, I’ve never, ever run into a single project where they’ve decided “hey, let’s trash the years of investment we’ve made in SQL Server and switch over to MySQL” or any other database.  Now I am aware that a db switch is likely if you’re writing a product that clients install onsite and it has to work with whatever their environment is, but for 99% of .Net developers this is just never going to happen.  I call YAGNI on this one.

Query logic IS business logic

One of the biggest gripes I had when I started investigating LINQ, EF, and Hibernate (yes, I was looking at Java code) architectures is that they had query logic in their repository classes.  Now the query logic was written in LINQ, or EntitySQL, or some other abstracted query language, but it was still query logic.  Blasphemy!!  You can’t put query logic in a BAL class!  That stuff has to be abstracted away in the DAL or it will contaminate the rest of the application architecture! Our layered architecture is being violated!  Worlds are colliding! It’ll be chaos!!

Then I started to notice something: it’s really easy to develop business logic when you include queries in the BAL.  In the past I would put my queries in a sproc, then I would write a DAL wrapper for the sproc, and a BAL wrapper for the DAL method.  Then, if the query changed, or if I needed an identical query with a slightly different parameter list, I would write a new sproc, then a new DAL wrapper, then yet another BAL wrapper method for the DAL wrapper method.  By the time all was said and done I would have this crazy duplication of methods across all layers of my application, including my database!  And don’t even get me started on the crazy designs that I implemented to try to pass query criteria structures (basically the stuff that goes in the WHERE clause) between my BAL and my DAL.  I came up with layers of abstraction that basically existed so that I wouldn’t have to create a simple T-SQL WHERE clause in my BAL.  Then there’s the problem of handling sorting and data paging; that required even more DAL methods, and each of these DAL methods had corresponding wrapper methods in the BAL that did nothing but pass the call through to the DAL!  Why?? I was doing the right thing by separating my business logic from my query logic, so why was it so painful?   The answer I finally arrived at is simply that query logic is business logic.  I’d been putting a separation where no separation belonged.

All real programmers know TSQL

I’ve heard the argument that T-SQL is too hard for programmers, so we’re going to create something much easier for programmers to use, like LINQ or EF.  The problem is that these tools require almost exactly the same syntax as T-SQL, but they put an extra layer of stuff in there that can break, plus a data context (or session, for you nHibernate folks) that throws errors whenever you try to save a complex object graph.  How did this attitude that T-SQL is a problem for programmers gain any traction?  Have you ever met a real programmer who can’t write T-SQL?  And if you did meet such a person, would you let them touch your business layer code?  Why would we ever want to abstract T-SQL away?  It’s the perfect DSL for accessing SQL Server data, and every programmer in the world is already familiar with it.

Using good object oriented design and encapsulating data access code is a good thing

I fully believe this one, but once we decide that query logic is business logic and that we don’t need to hide T-SQL from programmers, there’s no reason to put our well-designed, object-oriented data access code in a separate project and call it a DAL.  I decided to just put it in a Persistence folder in my BAL, and now I have one less DLL to worry about.

So, that’s some of what I was thinking when I made the decisions that I did.  It made sense to me.  I’m sure it won’t make sense to everyone, but I do think that it resulted in a very usable architecture.  Before I wrap up for today, I want to look at one more thing.

The target application code experience. What will it be like to use?

When I’m writing code in my application layer, consuming the logic that is provided through my BAL service classes, what does that code look like? Well, I know a couple of examples of what I don’t want it to look like.

I’ve been in a few environments where there were huge libraries of BAL classes, any of which could contain the logic I want.  I would often have to resort to a solution-wide text search looking for sproc names or keywords that might exist in the method that I needed. I don’t want that.  I want everything I need to be in one easy-to-find place.

I’ve also seen a practice that’s common in the DDD (Domain Driven Design) crowd where you go to a factory class to create a new entity, you go to a repository class to get an entity from the database, you go to a service class if you have complex logic that involves more than one entity, and saving entities is a toss-up between the repository and a separate service class. There may be a good reason to use that kind of class design inside the BAL, but when I’m writing code in my application layer, I don’t want to have to worry about which of 4 different classes I’m going to use.  So again, I’m a simple guy: when I need to get, save, or validate a BlogPost entity, I want a single service class that I can go to for everything. My app code should look something like this.

// instantiate service classes
BlogService blogService = new BlogService();
CategoryService categoryService = new CategoryService();

// Get data shaped as lists and pages of our DTOs
DataPage<BlogPost> page = blogService.GetPageOfBlogPosts(pageSize.Value, pageIndex.Value, sortBy);
List<Category> categoryList = categoryService.GetTopCategoryList(30);

// create and save a new BlogPost
BlogPost newPost = new BlogPost();
newPost.BlogGuid = blog.BlogGuid;
newPost.PostTitle = item.Title.Text;
newPost.PostUrl = item.Links[0].Uri.AbsoluteUri;
newPost.PostSummary = item.Summary.Text;
newPost.Score = 0;
blogService.Save(newPost);

Next time we’ll focus less on discussion and more on code.  We’ll look at DTO classes and the 4 main data shapes that go into and come out of our BAL: DTO, List<DTO>, DataPage<DTO>, and string.

Next Post:  Agile ADO.Net Persistence Layer Part 2: Use DTOs

Monday, December 28, 2009

How do you test CRUD methods? Just hit the database and be happy.

It’s time for a little programmer heresy.  For the last few months I’ve been reevaluating all of my “best practices” and throwing out anything that creates friction.  Friction is a term I’ve heard Jeff Atwood use fairly often.  It’s a fuzzy term, but I think I understand what he’s getting at.  When you’re writing code and you’ve got some momentum going, any time you find yourself struggling against the persistence framework, any time you find yourself jumping through hoops to find that bit of code that’s throwing an error (and is probably called through reflection), any code that makes you scream inside but that you have to write to comply with some standard: pretty much anything that slows you down is friction.  Sometimes friction is justified. Most of the time it isn’t.

Background

So I’ve been doing more and more unit testing lately.  One thing I run up against often is how to test persistence (Create, Read, Update, Delete) methods.  Below is a diagram of some persistence classes from an application I’m working on; they handle persistence for my SubmittedBlogUrl entity.  BlogService is the one-stop shop that my application layer uses to do everything related to blogs. In Domain Driven Design parlance, BlogService handles getting and saving data for all classes that are part of the Blog aggregate: getting and saving BlogProfiles, BlogPosts, and SubmittedBlogUrls all takes place through the BlogService class.  That doesn’t mean that BlogService contains all of the persistence logic, though.  I use a strategy pattern that delegates the actual Delete, Insert, Save, and Update methods to separate persister classes.

[Image: the BlogService and persister classes for the SubmittedBlogUrl entity]
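
The service stays the application’s single entry point while the persisters do the work; inside BlogService the save methods are thin delegations, something like this sketch:

// Inside BlogService: Save just delegates to the persister for this
// entity type (a sketch of the delegation, not the exact code)
public void Save(SubmittedBlogUrl dto)
{
    SubmittedBlogUrlPersister persister = new SubmittedBlogUrlPersister();
    persister.Save(dto);
}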

Now the code I want to test is the Delete, Insert, and Update methods in my SubmittedBlogUrlPersister class.  The methods look like this.

public class SubmittedBlogUrlPersister
{
    public void Save(SubmittedBlogUrl dto)
    {
        if (dto.Id == NullValues.NullInt)
        {
            Insert(dto);
        }
        else
        {
            Update(dto);
        }
    }

    public void Insert(SubmittedBlogUrl dto)
    {
        SqlRepository repository = new SqlRepository();
        string sql = @"INSERT INTO [dbo].[SubmittedBlogUrl]
                            ([SubmittedByUserGuid]
                            ,[SubmittedOnUtc]
                            ,[IpAddress]
                            ,[Status]
                            ,[BlogUrl])
                        VALUES
                            (@SubmittedByUserGuid
                            ,@SubmittedOnUtc
                            ,@IpAddress
                            ,@Status
                            ,@BlogUrl)
                        SELECT SCOPE_IDENTITY()";

        SqlCommand command = repository.GetSqlCommand(sql);
        command.Parameters.Add(repository.CreateParameter("@SubmittedByUserGuid", dto.SubmittedByUserGuid));
        command.Parameters.Add(repository.CreateParameter("@SubmittedOnUtc", dto.SubmittedOnUtc));
        command.Parameters.Add(repository.CreateParameter("@IpAddress", dto.IpAddress, 20));
        command.Parameters.Add(repository.CreateParameter("@Status", dto.Status));
        command.Parameters.Add(repository.CreateParameter("@BlogUrl", dto.BlogUrl, 100));
        Object result = repository.ExecuteScalar(command);
        dto.Id = Convert.ToInt32(result);
    }

    public void Update(SubmittedBlogUrl dto)
    {
        SqlRepository repository = new SqlRepository();
        string sql = @"UPDATE [dbo].[SubmittedBlogUrl]
                       SET [SubmittedByUserGuid] = @SubmittedByUserGuid
                          ,[SubmittedOnUtc] = @SubmittedOnUtc
                          ,[IpAddress] = @IpAddress
                          ,[Status] = @Status
                          ,[BlogUrl] = @BlogUrl
                     WHERE Id = @Id";

        SqlCommand command = repository.GetSqlCommand(sql);
        command.Parameters.Add(repository.CreateParameter("@Id", dto.Id));
        command.Parameters.Add(repository.CreateParameter("@SubmittedByUserGuid", dto.SubmittedByUserGuid));
        command.Parameters.Add(repository.CreateParameter("@SubmittedOnUtc", dto.SubmittedOnUtc));
        command.Parameters.Add(repository.CreateParameter("@IpAddress", dto.IpAddress, 20));
        command.Parameters.Add(repository.CreateParameter("@Status", dto.Status));
        command.Parameters.Add(repository.CreateParameter("@BlogUrl", dto.BlogUrl, 100));
        repository.ExecuteNonQuery(command);
    }

    public void Delete(SubmittedBlogUrl dto)
    {
        SqlRepository repository = new SqlRepository();
        string sql = @"DELETE FROM [dbo].[SubmittedBlogUrl]
                       WHERE Id = @Id";

        SqlCommand command = repository.GetSqlCommand(sql);
        command.Parameters.Add(repository.CreateParameter("@Id", dto.Id));
        repository.ExecuteNonQuery(command);
    }
}

As you can see, these methods just create some parameterized T-SQL, package it in a command, and then pass the command off to my repository to be executed. The key thing I need to test here is the query logic: I need to make sure I didn’t make any mistakes when writing the SQL and creating parameters, and I need to make sure the SQL works when run against my database.

Getting to the Tests

So how should I test this?  The TDD purists might say that I should fire up a mocking framework, create some mock objects of my repository, and make sure that all the right things are passed in.  I also might need to look at a Dependency Injection framework to make it easier to switch out my real SqlRepository for the mock SqlRepository.  And at the end of all this coding I’ll know what???  Well, I’ll just know that I passed a command to a mock object without anything blowing up.  I won’t know if my SQL syntax is right, I won’t know if my SQL works with my database; I won’t know anything that I actually need to know. I call this friction: lots of effort that, at the end of the day, doesn’t even get me what I need, a valid test of the SQL in my persister class.

My solution: dumb down the tests and get on with coding.  I’ll create a single CRUD test that instantiates a real SubmittedBlogUrlPersister (not a mock), then creates a new SubmittedBlogUrl object, saves it to the database, updates it, and finally deletes it.  As long as I’m using a dev database (not production) and my test cleans up after itself (deletes the data it creates), this works just fine.  The resulting test looks like this.

[TestClass()]
public class SubmittedBlogUrlPersisterTest
{
    [TestMethod()]
    public void CrudTest()
    {
        // create our test object
        var persister = new SubmittedBlogUrlPersister();
        var service = new BlogService();
        var dto = new SubmittedBlogUrl();
        SubmittedBlogUrl dto2;
        var utcNow = DateTime.UtcNow;
        dto.BlogUrl = "testUrl";
        dto.IpAddress = "testIp";
        dto.SubmittedByUserGuid = Guid.NewGuid();
        dto.SubmittedOnUtc = utcNow;

        // insert it
        persister.Insert(dto);

        // get it
        dto2 = service.GetSubmittedBlogUrl(dto.Id);
        Assert.AreEqual(dto.Id, dto2.Id);

        // update it
        dto2.BlogUrl = "NewUrl";
        persister.Save(dto2);
        dto2 = service.GetSubmittedBlogUrl(dto.Id);
        Assert.AreEqual(dto2.BlogUrl, "NewUrl");

        // delete it
        persister.Delete(dto2);
        dto2 = service.GetSubmittedBlogUrl(dto.Id);
        Assert.IsNull(dto2);
    }
}

Technically, this isn’t a unit test; it’s an integration test, since it hits a real database.  But I think it’s the right approach for this situation.  The principle here is that when you reach the point where you’re writing tests for methods that contain query logic (like CRUD methods), that might be the right place to ditch the unit tests and switch over to some simple integration tests that write to a real database and test what you really need tested: your query logic.

Addendum

A great comment that I wanted to draw attention to, from Steve Freeman, who co-authored the book Growing Object-Oriented Software, Guided by Tests.  Steve said:

As a TDD purist who "wrote the book" on Mock Objects and once wrote a foolish paper on mocking out JDBC, I would say that testing against a real database is the right thing to do here (provided you've set up your environment to do this cleanly). The other right thing is to only need such tests at the boundaries of your application, the persistence code should be nicely contained and not leak into the domain code.

That makes a lot of sense to me; I might be a bit off on what to expect from TDD purists.  I like Steve’s point that you need integration tests like this only at the boundaries of your application.  In my example, if I’m testing app logic in an MVC controller class that uses my BlogService for persistence, then it makes sense to mock my BlogService and write real unit tests, because I really just want to test the logic in my controller.  But at the edge of the application, where I’m touching the database, it’s more appropriate to switch over to integration tests.  I think a good rule of thumb is: if the code you’re testing contains any query logic (SQL, LINQ, etc.), go ahead and hit the db.
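
To make that boundary concrete, here’s a sketch of the controller-side stubbing.  IBlogService is purely illustrative; the code in these posts news up the concrete BlogService directly, so you’d need to introduce an abstraction like this before you could mock it:

// Illustrative only: an abstraction over BlogService so controller
// tests can substitute a stub and never touch the database
public interface IBlogService
{
    BlogPost GetBlogPost(Guid postGuid);
}

public class StubBlogService : IBlogService
{
    public BlogPost GetBlogPost(Guid postGuid)
    {
        // Canned data; we only care about the controller logic here
        BlogPost post = new BlogPost();
        post.PostGuid = postGuid;
        post.PostTitle = "Test Post";
        return post;
    }
}

A controller test would then hand the controller a StubBlogService and assert on the result, while only the persister tests ever hit SQL Server.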

Tuesday, July 28, 2009

Fluent Interface Pattern for Composing Entity Framework Queries

I’ve been doing a fair amount of work with Entity Framework recently.  There are some things about EF that make me want to throw it out the window, but this post is about something that I really like: the ability to eliminate redundant code from my BLL and DAL and create a fluent interface for composing queries.

The problem we want to solve

So here’s a typical scenario.  I have a blog aggregator application that I’m building.  I use Entity Framework to create a BlogPost entity and its data mappings. Great; now I’m ready to create a BlogRepository class that will contain all of my queries for getting blog posts.  So I write the first data access method, and it looks something like this.

public List<BlogPost> BlogPostSetByWeek_GetPage(int pageIndex, int pageSize, DateTime startDate)
{
    startDate = ToSunday(startDate);
    DateTime startUtc = startDate.Date;
    DateTime endUtc = startDate.AddDays(7).Date;
    int skipCount = pageSize * pageIndex;
    var query = from p in Context.BlogPostSet.Include("Categories")
                where p.PostedUtc > startUtc && p.PostedUtc < endUtc
                orderby p.PostedUtc descending
                select p;
    List<BlogPost> postSet = query.Skip(skipCount).Take(pageSize).ToList<BlogPost>();
    return postSet;
}

The above method takes a startDate and some paging parameters and then returns the specified page of results in the shape of a generic list of BlogPost entities.  How easy was that!! 

Now for the next step.  I need a query that’s exactly like the query above, but this time I want the entire list of results instead of just a page.  After that I need another query that sorts the BlogPosts by Category instead of by PostedUtc, then another that sorts by BlogName, and on and on and on.  So how do I handle this??  I could just create a completely new EF query for each one.  Or maybe I could use EntitySQL instead of LINQ to Entities, and then I could use a bunch of conditional blocks to build the EntitySQL text that I need.  Neither of those solutions really appeals to me.  First, I don’t like the idea of rewriting the same query over and over with minor differences in criteria or sort order; that just seems inefficient.  Second, I don’t really want to use EntitySQL, because I like the strong typing that I get with LINQ to Entities, plus I would need a lot of conditionals to handle all of the possible query combinations, and that sounds like a mess.

The Solution

So I was thinking about how much I hate duplicating the same query code over and over when I realized something: Microsoft has made the query an object. I didn’t really appreciate the significance of that before.  The query is no longer just text; it is now an object, an ObjectQuery<> to be precise.  The cool part is that if I write methods that take an ObjectQuery as their parameter and return an ObjectQuery as their return value, I can chain them together and use them to compose queries.

How could this work?  I looked at the queries in my BLL and found that each of them consists of 3 major components:

[Image: the three components of a query: a filter, a sort, and a projection]

Looking at this breakdown, I realized that I could have a Filter Method that creates an ObjectQuery that gets the data I’m looking for, then pass that ObjectQuery to a Sort Method that applies a sort and returns the modified ObjectQuery, then pass that to a Projection Method that applies paging, shapes the data, and executes the ObjectQuery.

So, when all is said and done, I should be able to compose Entity Framework queries by combining a Filter Method, a Sort Method, and a Projection Method.  The end result should be data access code that looks like this:

List<BlogPost> postSet = GetBlogPostSet().SortBy("Date").GetPage(pageIndex, pageSize);
List<BlogPost> postSet = GetBlogPostSet().SortBy("ID").GetPage(pageIndex, pageSize);
List<BlogPost> postSet = GetBlogPostSet().SortBy("ID").GetAll();

Building an Example

So, I coded it up and it works pretty well.  The first step is creating a Filter Method.  This method takes search criteria as parameters and returns an ObjectQuery. Below is my filter method for getting the BlogPost entities for a given week. 

// GetBlogPostSetForWeek
private ObjectQuery<BlogPost> GetBlogPostSetForWeek(DateTime startDate)
{
    startDate = ToSunday(startDate);
    DateTime startUtc = startDate.Date;
    DateTime endUtc = startDate.AddDays(7).Date;
    var query = from p in Context.BlogPostSet.Include("Categories")
                where p.PostedUtc > startUtc && p.PostedUtc < endUtc
                select p;
    return (ObjectQuery<BlogPost>)query;
}

Now I need to create my Sort Method. This method takes the result of my Filter Method as a parameter, along with an enum that tells the method what sort to apply. Note that I’m using strongly typed object queries of type ObjectQuery<BlogPost>.  The strong typing serves two purposes.  First, it lets my Sort Method know that I’m dealing with BlogPost entities, which tells me what fields are available to sort by.  Second, the strong typing provides a distinct method signature, so I can have multiple methods called SortBy which each handle ObjectQueries that return different types of entities: SortBy(ObjectQuery<BlogPost>), SortBy(ObjectQuery<Person>), etc.

One other thing: I want to chain these methods together, fluent-interface style.  For that reason I’m implementing both SortBy and GetPage as extension methods. Here’s the code for the SortBy method.

// SortBy
internal static ObjectQuery<BlogPost> SortBy(this ObjectQuery<BlogPost> query, Enums.BlogPostSortOption sortOption)
{
    switch (sortOption)
    {
        case Enums.BlogPostSortOption.ByDate:
            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
        case Enums.BlogPostSortOption.BySite:
            return (ObjectQuery<BlogPost>)query.OrderBy(p => p.BlogProfile.BlogName);
        case Enums.BlogPostSortOption.ByVote:
            return query;
        default:
            return (ObjectQuery<BlogPost>)query.OrderByDescending(p => p.PostedUtc);
    }
}

Lastly, we need to create a Projection Method.  Below is the GetPage method.  It takes the ObjectQuery<BlogPost> from the SortBy method, applies paging logic to it, executes the query, and returns the results as a List<BlogPost>.

// GetPage
internal static List<BlogPost> GetPage(this ObjectQuery<BlogPost> query, int pageIndex, int pageSize)
{
    int skipCount = pageSize * pageIndex;
    return query.Skip(skipCount).Take(pageSize).ToList<BlogPost>();
}
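
One loose end: the GetAll() projection that appeared in the target code earlier isn’t shown in this post, but it’s the trivial case, something like this sketch:

// GetAll: the no-paging projection (a sketch; GetAll() appears in the
// target code above but its body wasn't shown in the original post)
internal static List<BlogPost> GetAll(this ObjectQuery<BlogPost> query)
{
    return query.ToList<BlogPost>();
}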

So that’s it.  I now have all the pieces needed to create my data access methods without duplicating query logic over and over.  If I want all blog posts ordered by date, I can use the code:

  Enums.BlogPostSortOption sort = Enums.BlogPostSortOption.ByDate;
  return GetBlogPostSetForWeek(startDate).SortBy(sort).GetPage(pageIndex, pageSize);

To sort those same results by BlogName I can use the code:

  Enums.BlogPostSortOption sort = Enums.BlogPostSortOption.BySite;
  return GetBlogPostSetForWeek(startDate).SortBy(sort).GetPage(pageIndex, pageSize);

If I want to get BlogPosts by category instead of by week, I just write a new filter method named GetBlogPostSetForCategory and it plugs right in:

  return GetBlogPostSetForCategory(category).SortBy(sort).GetPage(pageIndex, pageSize);

Conclusion

So that's it.  This technique has significantly reduced the amount of data access code in my repository classes and the time it takes to write it.  I also like that I’m no longer writing the same paging and sorting code over and over in different queries.  If you see any advantages or disadvantages to the technique, please leave a comment and let me know what you think.  Also, if you’re aware of anyone else using a similar method, please send me a link at rlacovara@gmail.com; I’d like to check it out.