Updates to GraphDiff with new scenarios supported

Hi, this post is just a quick update on the progress of GraphDiff.

It’s great that quite a few people are using or looking into GraphDiff. I have finally found some time and am updating the code with numerous bug fixes and new features. I will be able to spend more time on this project now, so if you have any issues please add them on GitHub.

There is a new official package on NuGet called ‘RefactorThis.GraphDiff’. The only breaking change is a namespace change from GraphDiff to RefactorThis.GraphDiff. This package is a newer version and will receive all future updates.

I also want to quickly clarify what AssociatedCollection and AssociatedEntity do, as I’ve had some questions.

Associated/Owned clarification

If a person has a list of friends and you want to update that list for the person, but not update the entities inside the collection (such as a friend’s first name), then the friend collection is an associated collection and can be written like this:

context.UpdateGraph(person1, map => map.AssociatedCollection(person => person.friends));

If an update has been made to a friend entity we don’t want it to be saved; however, we do want to save the fact that the person now has a new friend. This is what the associated collection does. An owned collection, on the other hand, states that all of the entities inside the friends collection are owned by the parent entity and will be updated, so the first name change will be saved to the database.

For more information please read my original post introducing GraphDiff.

New Features/Fixes

  • Supports cyclic navigational properties
  • Now only performs updates of the parent and nested entities when needed (better support for auditing and concurrency scenarios)
  • Bugfix for complex graphs of collections of collections
  • Supports proxy objects, which the code previously did not handle (no lazy loading is done from within GraphDiff; all entities needed are loaded with one query)
  • Supports reloading entities that have been attached, via the GraphDiffConfiguration.ReloadAssociatedEntitiesOnAttach configuration option. This is useful in cases where you may want to return the graph once saved. EF (by design) will return the attached object, not the object as it truly is in the database, because the object exists in its local cache. The easy way to get around this is to simply make the database calls on a new context. If that is not available to you, you can set this configuration option and GraphDiff will ensure that any associated entities are updated from the database, so you always receive the latest copy of all entities.

Introducing GraphDiff for Entity Framework Code First – Allowing automated updates of a graph of detached entities

Looking for a complete solution for automatically updating a graph of entities using the Entity Framework? Read On!

Hi, as usual I have neglected this blog of late. It is getting harder and harder to find time to put some notes up here. But hopefully today I have something very interesting to make up for it.

So today I’m finally going to post something that we have actually been using live on our production code for quite some time, and the good news is that it is working beautifully. I’m introducing GraphDiff – an extension method allowing for the automatic update of a detached graph using Entity Framework code first. (Edit: code has been rewritten to handle multiple new features, as such no guarantee can be given on its production-ready usage, but I’m continuing to work on it to make sure it is relatively bug-free)

Working with detached graphs of entities we quite often found that it was cumbersome to use the Entity Framework to manually map all of the changes from an aggregate root to the database. By aggregate root I mean a bunch of models which are handled as one unit when updating/adding/deleting.

Below I will describe my proposed solution to this problem of automatically updating a detached graph consisting of multiple add/delete/update changes at any point in the graph. This should work for all sorts of graphs, and allows for updating entities with associated collections and single entities.

I find it’s always clearer with some code, so let’s try an example:

Say you have a Company which has many Contacts. A contact is not defined on its own and is a One-To-Many (with required parent) record of a Company, i.e. the company is the Aggregate Root. Assume you have a detached Company graph with its Contacts attached and want to reflect the state of this graph in the database.

At present using the Entity Framework you will need to perform the updates of the contacts manually, check if each contact is new and add, check if updated and edit, check if removed then delete it from the database. Once you have to do this for a few different aggregates in a large system you start to realize there must be a better, more generic way.

Well, the good news is that after a few refactorings I’ve found a nice solution to this problem. The proposed extension method below handles the whole diff for you in a nice convenient package.

using (var context = new TestDbContext())
{
    // Update the company and state that the company 'owns' the collection Contacts.
    context.UpdateGraph(company, map => map
        .OwnedCollection(p => p.Contacts)
    );

    context.SaveChanges();
}

Using the above code a diff will be run between the provided company graph and one retrieved from the database. The bounds of the graph are defined by the mapping, which in the example above covers the company entity itself and its child Contacts. Only entities within these bounds will be included in the database diff.

The retrieval code makes use of the provided mapping configuration to get all data needed in one query at the start of the process, thus making the process quite efficient.

From this diff the algorithm will add/update/delete depending on what action needs to be performed and commit all of these changes in one batch at the end of the algorithm.
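To make the add/update/delete decision concrete, here is a minimal in-memory sketch of diffing one owned collection. It is not GraphDiff's actual implementation: the Contact class, its Id and Name properties, the convention that an unsaved contact has an Id of 0, and the OwnedCollectionDiffer and CollectionDiff names are all assumptions made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical owned child entity; an Id of 0 is assumed to mean "not yet saved".
public class Contact
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The three buckets of work produced by comparing the two collections.
public class CollectionDiff
{
    public List<Contact> ToAdd { get; } = new List<Contact>();
    public List<Contact> ToUpdate { get; } = new List<Contact>();
    public List<Contact> ToDelete { get; } = new List<Contact>();
}

public static class OwnedCollectionDiffer
{
    // Compares the detached collection against the one loaded from the database.
    public static CollectionDiff Diff(IEnumerable<Contact> detached, IEnumerable<Contact> persisted)
    {
        var diff = new CollectionDiff();
        var persistedById = persisted.ToDictionary(c => c.Id);
        var matchedIds = new HashSet<int>();

        foreach (var contact in detached)
        {
            Contact existing;
            if (persistedById.TryGetValue(contact.Id, out existing))
            {
                matchedIds.Add(contact.Id);
                // Only flag an update when a value actually changed.
                if (existing.Name != contact.Name)
                {
                    diff.ToUpdate.Add(contact);
                }
            }
            else
            {
                // Not in the database yet, so it has to be inserted.
                diff.ToAdd.Add(contact);
            }
        }

        // Anything persisted but missing from the detached graph was removed.
        diff.ToDelete.AddRange(persistedById.Values.Where(c => !matchedIds.Contains(c.Id)));
        return diff;
    }
}
```

Calling OwnedCollectionDiffer.Diff(company.Contacts, contactsFromDatabase) would give the three lists that then need to be turned into inserts, updates and deletes before a single SaveChanges().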

There are 2 different scenarios that this extension method must cater for. One is the situation above where you have a One-To-Many or One-To-One where the right hand side of the relationship is owned by the parent. This is defined within the mapping as an OwnedCollection/OwnedEntity.

The other scenario is that you have a Many-To-Many, or a One-To-Many, and the related record exists in its own right as it is not a defined part of the aggregate you are updating. That scenario is defined as an AssociatedCollection/AssociatedEntity.

The mapping configuration can handle many complex scenarios with ease. For example:

I have a company aggregate which is made of address objects and contact objects. On the contact model there is a list of accepted advertising materials (AdvertisementOptions) which are managed elsewhere in the system and are not a part of the company aggregate. However, the list of AdvertisementOptions associated with the contact still needs to be updated when managing the company as an aggregate.

using (var context = new TestDbContext())
{
    // Update the company: it 'owns' Contacts and Addresses, while each contact's AdvertisementOptions are only associated.
    context.UpdateGraph(company, map => map
        .OwnedCollection(p => p.Contacts, with => with
            .AssociatedCollection(p => p.AdvertisementOptions))
        .OwnedCollection(p => p.Addresses)
    );

    context.SaveChanges();
}

The associated collection line tells the update method that the actual entities that are AdvertisementOptions should not be marked as changed (even if the objects provided are different from the database values). We do, however, want to update the relation so that it reflects the fact that the contact has the provided AdvertisementOptions, and this is performed during the update.
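As a rough illustration of "update the relation but not the entities", the sketch below compares only the keys on each side of an associated collection and reports which links to create and which to remove. Again, this is not GraphDiff's real code; the use of int keys and the RelationDiff and AssociatedCollectionDiffer names are assumptions for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type: only link changes, never changes to the entities themselves.
public class RelationDiff
{
    public List<int> ToLink { get; set; }
    public List<int> ToUnlink { get; set; }
}

public static class AssociatedCollectionDiffer
{
    // Works purely on keys, so property changes on the related entities are ignored.
    public static RelationDiff Diff(IEnumerable<int> detachedIds, IEnumerable<int> persistedIds)
    {
        var detached = new HashSet<int>(detachedIds);
        var persisted = new HashSet<int>(persistedIds);

        return new RelationDiff
        {
            // In the detached graph but not in the database: create the relation.
            ToLink = detached.Except(persisted).OrderBy(id => id).ToList(),
            // In the database but no longer in the detached graph: remove the relation.
            ToUnlink = persisted.Except(detached).OrderBy(id => id).ToList()
        };
    }
}
```

Because only keys are compared, a changed description on an advertisement option would never show up in the result, which matches the behaviour described above.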

If you would like to use this code please leave all copyright marks as they are. The code has been put up on https://github.com/refactorthis/GraphDiff.

Hope this helps you as it did our team. Please note that there are a few things that can still be improved and if you have any suggestions please let me know.

Group or Order by sub type using LINQ to Entities

It’s a little hard to see how to group by a sub type in LINQ to Entities, since the discriminating field (when using table per hierarchy inheritance) or table name (when using table per type inheritance) is of course not exposed past the ORM layer. This essentially leaves us with the type of the object to discriminate sub types.

If we try to use GetType() in a LINQ to Entities query we get an exception of the sort:
“LINQ to Entities does not recognize the method ‘System.Type GetType()’ method, and this method cannot be translated into a store expression.”

On the other hand, EF does recognise the ‘is’ keyword. Now let’s see how to group by sub type using this newfound knowledge.

Scenario: A PERSON is subdivided into ADULT, TEENAGER, etc like below:


public class Person
{
    public int Age { get; set; }
}

public class Teenager : Person { /* ... */ }
public class Adult : Person { /* ... */ }

Now how do I find the count of people who are adults and teenagers in one LINQ query (without ToList(), as the grouping would then no longer be performed in SQL)?
The trick is to use nested ternary operators inside your GroupBy keySelector parameter using the ‘is’ keyword.


var results = Context.Persons
    .GroupBy(p => p is Adult ? "Adult" : p is Teenager ? "Teenager" : "", (key, group) => new
    {
        key,
        Count = group.Count()
    }).ToDictionary(p => p.key);

The same can be done for OrderBy queries. Hope this helps you as it did me.
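For the OrderBy case, a minimal sketch might look like the following. It takes an IQueryable so that it can be tried against an in-memory list via AsQueryable(); passing Context.Persons instead should behave the same way, but I have not checked the generated SQL here. The SubTypeOrdering class and the numeric ranks (adults first, then teenagers, then everyone else) are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Age { get; set; }
}

public class Teenager : Person { }
public class Adult : Person { }

public static class SubTypeOrdering
{
    // Sorts adults first, then teenagers, then any other kind of person.
    public static IQueryable<Person> OrderBySubType(IQueryable<Person> people)
    {
        // The nested ternary maps each sub type to a sortable rank using 'is'.
        return people.OrderBy(p => p is Adult ? 0 : p is Teenager ? 1 : 2);
    }
}
```

A ThenBy(p => p.Age) can be chained onto the result to order within each sub type.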

Entity Framework ToList() in nested type (LINQ to Entities does not recognize the method)

Update:
The issue is described here:
http://entityframework.codeplex.com/workitem/808
I’ve made a fork of EF source which solves this issue @
http://entityframework.codeplex.com/SourceControl/network/forks/brentmckendrick/Issue808/changeset/b18e48b3e51f
however more work could be done to enhance the code by allowing for ToArray and ToDictionary.

So today I have a suggestion for an improvement to the Entity Framework.

First let me set the scenario:

Say I have an entity called Person which has many Addresses, as shown below.

public class Person
{
    // .. lots of fields ..
    public ICollection<Address> Addresses { get; set; }
}

public class Address
{
    // .. lots of fields ..
}

Now say I would like to create a projection so that I do not return all of the information contained in the Person and Address models.
I could write this:

Context.Persons
.Select(p => new
{
    Addresses = p.Addresses.Select(m => new
    {
        // address properties
    })
    // other person properties
}).ToList();

This would obviously give me an anonymous type (A1) which contains a property of type List<T> called Addresses, where T is an anonymous type (A2). The key here is that the type instantiated is in fact a List<T>.

Now let’s say I have a set of data transfer objects. One is PersonDTO, which has a property of type List<T> where T is an AddressDTO, as shown below.

public class PersonDTO
{
    // .. a few fields ..
    public List<AddressDTO> Addresses { get; set; }
}

public class AddressDTO
{
    // .. a few fields ..
}

I want to map a person to a PersonDTO. This is shown here:

Context.Persons
.Select(p => new PersonDTO
{
    Addresses = p.Addresses.Select(m => new AddressDTO
    {
        // address properties
    }).ToList() // not supported
    // other person properties
}).ToList();

I get an Exception : System.NotSupportedException – “LINQ to Entities does not recognize the method ‘System.Collections.Generic.List`1[t] ToList[t](System.Collections.Generic.IEnumerable`1[t])’ method, and this method cannot be translated into a store expression.”

Now all of you LINQ to Entities guys are screaming “You can’t call ToList() inside of a nested projection!”. OK, I can see that the ToList() cannot be translated to an SQL query, but if I leave it out then my code will not compile, since PersonDTO.Addresses is a List<T>. If I change the PersonDTO.Addresses property to an IEnumerable<T> and remove the ToList() then it works! (And inspecting the runtime type shows it is in fact populated with a generic list.) So the list IS being instantiated anyway.

Now I know this projection could be done in memory and I would not have any problems calling ToList() because it would not need to be translated into an SQL query, however this defeats the purpose as I am trying to write an SQL query.
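One middle ground, which is a workaround rather than the fix I am asking for, is to split the projection in two: keep the anonymous-type projection as the part that is translated, then switch to LINQ to Objects with AsEnumerable() and build the DTOs (including the nested ToList()) in memory. The sketch below runs against an in-memory IQueryable so it is self-contained; the Name and City properties and the PersonProjection class are assumptions made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Address
{
    public string City { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public ICollection<Address> Addresses { get; set; }
}

public class AddressDTO
{
    public string City { get; set; }
}

public class PersonDTO
{
    public string Name { get; set; }
    public List<AddressDTO> Addresses { get; set; }
}

public static class PersonProjection
{
    public static List<PersonDTO> ToDtos(IQueryable<Person> persons)
    {
        return persons
            // Stage 1: still an expression tree, so a query provider can translate it
            // and only the selected fields are projected.
            .Select(p => new
            {
                p.Name,
                Addresses = p.Addresses.Select(a => new { a.City })
            })
            // Stage 2: everything after this point runs in memory.
            .AsEnumerable()
            .Select(p => new PersonDTO
            {
                Name = p.Name,
                Addresses = p.Addresses.Select(a => new AddressDTO { City = a.City }).ToList()
            })
            .ToList();
    }
}
```

The cost is that the mapping is written twice, once to the anonymous type and once to the DTO, which is exactly the duplication I would like EF to make unnecessary.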

Now to the point:

– An anonymous type can be created with a nested collection property of type List.
– A concrete type can be created with a nested collection of type IEnumerable and a List will be created and populated for this variable
– A concrete type with a nested property collection of a List (or any derivative of type ICollection) can NOT be projected to (because of the ToList() issue)

Now it seems to me that, since a List is created anyway in all of these cases, perhaps the call to the nested ToList() should be ignored when the expression is mapped to an SQL query. Simply by doing this all of the problems would be solved. (Please correct me if this assumption is incorrect.) Better yet, what if it WASN’T ignored but removed from the SQL expression and stored for later use, so that we could also call ToArray(), ToList(), etc. and change the type which is returned from the EF materialization? E.g.:

Context.Persons
.Select(p => new PersonDTO
{
    Addresses = p.Addresses.Select(m => new AddressDTO
    {
        // address properties
    }).ToArray()
    // other person properties
}).ToList();