Updates to GraphDiff with new scenarios supported

Hi, this post is just a quick update on the progress of GraphDiff.

It’s great that quite a few people are using or looking into GraphDiff. I have finally found some time and am updating the code with numerous bug fixes and new features. I will be able to spend more time on this project now, so if you have any issues please add them to GitHub here.

There is a new official package on NuGet called ‘RefactorThis.GraphDiff’. The only breaking change is a namespace change from GraphDiff to RefactorThis.GraphDiff. This package is the newer version and will receive all future updates; you can get it here.

I also want to quickly clarify what AssociatedCollection and AssociatedEntity do, as I’ve had some questions.

Associated/Owned clarification

If a person has a list of friends and you want to update the list of friends for that person, but not update the entities inside that collection (such as a friend’s first name), then that friend collection is an associated collection and can be written like this:

context.UpdateGraph(person1, map => map.AssociatedCollection(person => person.friends));

If an update has been made to the friend entity we don’t want it to be saved; however, we do want to save the fact that the person now has a new friend. This is what the associated collection does. An owned collection, on the other hand, states that all of the entities inside the friends collection are owned by the parent entity and will be updated, so the first name change will be saved to the database.

For more information please read my original post here.

New Features/Fixes

  • Supports cyclic navigational properties
  • Now only performs updates of the parent and nested entities when needed (better support for auditing and concurrency scenarios)
  • Bugfix for complex graphs of collections of collections
  • Supports proxy objects. No lazy loading is done from within GraphDiff (all entities needed are loaded with one query), but the code previously did not support proxy objects.
  • Supports reloading entities that have been attached, via the GraphDiffConfiguration.ReloadAssociatedEntitiesOnAttach configuration option. This is useful in cases where you may want to return the graph once saved. EF (by design) will return the attached object, not the object as it truly is in the database, because the object exists in its local cache. The easy way to get around this is to simply make the database calls on a new context. If this is unavailable to you then you can set this configuration option and GraphDiff will ensure that any associated entities are updated from the database, ensuring you always receive the latest copy of all entities.

Introducing GraphDiff for Entity Framework Code First – Allowing automated updates of a graph of detached entities

Looking for a complete solution for automatically updating a graph of entities using the Entity Framework? Read On!

Hi, As usual I have neglected this blog as of late. It is getting harder and harder to find time to put some notes up here. But hopefully today I have something very interesting to make up for it.

So today I’m finally going to post something that we have actually been using live on our production code for quite some time, and the good news is that it is working beautifully. I’m introducing GraphDiff – an extension method allowing for the automatic update of a detached graph using Entity Framework code first. (Edit: code has been rewritten to handle multiple new features, as such no guarantee can be given on its production-ready usage, but I’m continuing to work on it to make sure it is relatively bug-free)

Working with detached graphs of entities we quite often found that it was cumbersome to use the Entity Framework to manually map all of the changes from an aggregate root to the database. By aggregate root I mean a bunch of models which are handled as one unit when updating/adding/deleting.

Below I will describe my proposed solution to this problem of automatically updating a detached graph consisting of multiple add/delete/update changes at any point in the graph. This should work for all sorts of graphs, and allows for updating entities with associated collections and single entities.

I find it’s always clearer with some code, so let’s try an example:

Say you have a Company which has many Contacts. A contact is not defined on its own and is a One-To-Many (with required parent) record of a Company. i.e. The company is the Aggregate Root. Assume you have a detached Company graph with its Contacts attached and want to reflect the state of this graph in the database.

At present using the Entity Framework you will need to perform the updates of the contacts manually, check if each contact is new and add, check if updated and edit, check if removed then delete it from the database. Once you have to do this for a few different aggregates in a large system you start to realize there must be a better, more generic way.

Well good news is that after a few refactorings I’ve found a nice solution to this problem. The proposed extension method below handles the whole diff for you in a nice convenient package.

using (var context = new TestDbContext())
{
    // Update the company and state that the company 'owns' the collection Contacts.
    context.UpdateGraph(company, map => map
        .OwnedCollection(p => p.Contacts)
    );

    context.SaveChanges();
}

Using the above code a diff will be run between the provided company graph and one retrieved from the database. The bounds of the graph are defined by the mapping which above is the company entity itself and its child Contacts. Only entities within these bounds will be included in the database diff.

The retrieval code makes use of the provided mapping configuration to get all data needed in one query at the start of the process, thus making the process quite efficient.

From this diff the algorithm will add/update/delete depending on what action needs to be performed and commit all of these changes in one batch at the end of the algorithm.
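To make the add/update/delete decision concrete, here is a minimal in-memory sketch of a key-based diff over one owned collection. It is not GraphDiff's actual implementation: the `Contact` class, the `GraphDiffSketch.Diff` helper, and the convention that `Id == 0` means "not yet saved" are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical child entity; Id == 0 is treated as "not yet saved" in this sketch.
public class Contact
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class GraphDiffSketch
{
    // Compares the detached (incoming) children with the persisted ones and
    // lists the operations an owned-collection diff would have to perform.
    public static List<string> Diff(IList<Contact> persisted, IList<Contact> incoming)
    {
        var ops = new List<string>();
        var persistedById = persisted.ToDictionary(c => c.Id);

        foreach (var item in incoming)
        {
            Contact existing;
            if (item.Id == 0 || !persistedById.TryGetValue(item.Id, out existing))
                ops.Add("add " + item.Name);      // unknown key: insert
            else if (existing.Name != item.Name)
                ops.Add("update " + item.Id);     // known key, changed values: update
        }

        // Anything persisted that is missing from the incoming graph is deleted.
        var incomingIds = new HashSet<int>(incoming.Select(c => c.Id));
        foreach (var old in persisted.Where(c => !incomingIds.Contains(c.Id)))
            ops.Add("delete " + old.Id);

        return ops;
    }
}
```

For an associated collection the same comparison would only add or remove the relationship, and would skip the update branch for the related entity itself.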

There are 2 different scenarios that this extension method must cater for. One is the situation above where you have a One-To-Many or One-To-One where the right hand side of the relationship is owned by the parent. This is defined within the mapping as an OwnedCollection/OwnedEntity.

The other scenario is that you have a Many-To-Many, or a One-To-Many, and the related record exists in its own right as it is not a defined part of the aggregate you are updating. That scenario is defined as an AssociatedCollection/AssociatedEntity.

The mapping configuration can handle many complex scenarios with ease. For example:

I have a company aggregate which is made up of address objects and contact objects. On the contact model there is a list of accepted advertising materials (the AdvertisementOptions collection) which are managed elsewhere in the system and are not part of the company aggregate. However, the list of AdvertisementOptions associated with the contact still needs to be updated when managing the company as an aggregate.

using (var context = new TestDbContext())
{
    // Update the company and state that the company 'owns' the collection Contacts.
    context.UpdateGraph(company, map => map
        .OwnedCollection(p => p.Contacts, with => with
            .AssociatedCollection(p => p.AdvertisementOptions))
        .OwnedCollection(p => p.Addresses)
    );

    context.SaveChanges();
}

The associated collection line tells the update method that the actual entities that are AdvertisementOptions should not be marked as changed (even if the objects provided are different from the database values). We do however want to update the relation so that it reflects the fact that the contact has the provided AdvertisementOptions, and this is performed during the update.

If you would like to use this code please leave all copyright marks as they are. The code has been put up on https://github.com/refactorthis/GraphDiff.

Hope this helps you as it did our team. Please note that there are a few things that can still be improved and if you have any suggestions please let me know.

Group or Order by sub type using LINQ to Entities

It’s a little hard to see how to group by a sub type in LINQ to Entities, since the discriminating field (when using table per hierarchy inheritance) or table name (when using table per type inheritance) is of course not exposed past the ORM layer. This essentially leaves us with the type of the object to discriminate sub types.

If we try to use GetType() in a LINQ to Entities query we get an exception of the sort:
“LINQ to Entities does not recognize the method ‘System.Type GetType()’ method, and this method cannot be translated into a store expression.”

On the other hand, EF can recognise the ‘is’ keyword. Now let’s see how to group by using this newfound knowledge.

Scenario: A PERSON is subdivided into ADULT, TEENAGER, etc like below:


public class Person
{
    public int Age { get; set; }
}

public class Teenager : Person { /* ... */ }
public class Adult : Person { /* ... */ }

Now how do I find the count of people who are adults and teenagers in one LINQ query? (Without ToList(), as that would no longer perform the query in SQL.)
The trick is to use nested ternary operators inside of your GroupBy keySelector parameter using the ‘is’ keyword.


var results = Context.Persons
    .GroupBy(p => p is Adult ? "Adult" : p is Teenager ? "Teenager" : "",
        (key, group) => new
        {
            key,
            Count = group.Count()
        })
    .ToDictionary(p => p.key);

The same can be done for OrderBy queries. Hope this helps you as it did me.
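As a rough illustration of both the GroupBy and the OrderBy variant, the sketch below runs the same ‘is’-based expressions against an in-memory `IQueryable<Person>`. The `SubTypeQueries` class and its method names are made up for this example, and it uses LINQ to Objects rather than a real EF context, so it shows the query shape only, not the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person { public int Age { get; set; } }
public class Teenager : Person { }
public class Adult : Person { }

// Hypothetical helper class, used only to host the two example queries.
public static class SubTypeQueries
{
    // Groups by a label derived from the runtime sub type using 'is'.
    public static Dictionary<string, int> CountBySubType(IQueryable<Person> people)
    {
        return people
            .GroupBy(p => p is Adult ? "Adult" : p is Teenager ? "Teenager" : "")
            .Select(g => new { g.Key, Count = g.Count() })
            .ToDictionary(x => x.Key, x => x.Count);
    }

    // Orders by sub type first (adults, then teenagers, then everything else),
    // and by age within each sub type.
    public static List<Person> OrderBySubType(IQueryable<Person> people)
    {
        return people
            .OrderBy(p => p is Adult ? 0 : p is Teenager ? 1 : 2)
            .ThenBy(p => p.Age)
            .ToList();
    }
}
```

Mapping each sub type to an integer rank keeps the sort key a plain conditional expression, the same kind of expression used for the grouping key.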

Entity Framework ToList() in nested type (LINQ to Entities does not recognize the method)

Update:
The issue is described here:
http://entityframework.codeplex.com/workitem/808
I’ve made a fork of EF source which solves this issue @
http://entityframework.codeplex.com/SourceControl/network/forks/brentmckendrick/Issue808/changeset/b18e48b3e51f
however more work could be done to enhance the code by allowing for ToArray and ToDictionary.

So today I have a suggestion for an improvement to the Entity Framework.

First let me set the scenario:

Say I have an entity called Person which has many Addresses, as shown below.

public class Person
{
    // .. lots of fields ..
    public ICollection<Address> Addresses { get; set; }
}

public class Address
{
    // .. lots of fields ..
}

Now say I would like to create a projection so that I do not return all of the information contained in the Person and Address models.
I could write this:

Context.Persons
.Select(p => new
{
    Addresses = p.Addresses.Select(m => new
    {
        // address properties
    })
    // other person properties
}).ToList();

This would obviously give me an anonymous type (A1) which contains a property called Addresses of type List<T>, where T is an anonymous type (A2). The key here is that the type instantiated is in fact a List<T>.

Now let’s say I have a set of data transfer objects. One is PersonDTO, which has a property of type List<T> where T is an AddressDTO, as shown below.

public class PersonDTO
{
    // .. a few fields ..
    public List<AddressDTO> Addresses { get; set; }
}

public class AddressDTO
{
    // .. a few fields ..
}

I want to map a person to a PersonDTO. This is shown here:

Context.Persons
.Select(p => new PersonDTO
{
    Addresses = p.Addresses.Select(m => new AddressDTO
    {
        // address properties
    }).ToList() // not supported
    // other person properties
}).ToList();

I get an Exception : System.NotSupportedException – “LINQ to Entities does not recognize the method ‘System.Collections.Generic.List`1[t] ToList[t](System.Collections.Generic.IEnumerable`1[t])’ method, and this method cannot be translated into a store expression.”

Now all of you LINQ to Entities guys are screaming “You can’t call ToList() inside of a nested projection!”. OK, I can see that the ToList() cannot be translated to an SQL query, but if I leave it out then my code will not compile, since PersonDTO.Addresses is a List<T>. If I change the PersonDTO.Addresses property to an IEnumerable<T> and remove the ToList() then it works! (And by inspecting the runtime type, it is in fact populated with a generic list.) So the list IS being instantiated anyway.

Now I know this projection could be done in memory and I would not have any problems calling ToList() because it would not need to be translated into an SQL query, however this defeats the purpose as I am trying to write an SQL query.
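For comparison, one middle ground is to split the projection in two: select only the needed columns into an anonymous type (the part that would be translated to SQL), then switch to LINQ to Objects with AsEnumerable() and build the DTOs, where ToList() is allowed. The sketch below assumes hypothetical `Name` and `City` properties and runs against an in-memory `IQueryable<Person>` so that it is self-contained; it is a workaround, not the fix this post is asking for.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Address { public string City { get; set; } public string Street { get; set; } }
public class Person
{
    public string Name { get; set; }
    public ICollection<Address> Addresses { get; set; }
}

public class AddressDTO { public string City { get; set; } }
public class PersonDTO
{
    public string Name { get; set; }
    public List<AddressDTO> Addresses { get; set; }
}

public static class PersonProjection
{
    public static List<PersonDTO> ToDtos(IQueryable<Person> persons)
    {
        return persons
            // Step 1: narrow projection into anonymous types (no ToList() here).
            .Select(p => new
            {
                p.Name,
                Addresses = p.Addresses.Select(a => new { a.City })
            })
            // Step 2: everything after AsEnumerable() runs in memory.
            .AsEnumerable()
            .Select(p => new PersonDTO
            {
                Name = p.Name,
                Addresses = p.Addresses
                    .Select(a => new AddressDTO { City = a.City })
                    .ToList()
            })
            .ToList();
    }
}
```

The cost is a second pass over the results in memory, which is the duplication the rest of this post argues EF could avoid.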

Now to the point!:

– An anonymous type can be created with a nested collection property of type List.
– A concrete type can be created with a nested collection of type IEnumerable and a List will be created and populated for this variable
– A concrete type with a nested property collection of a List (or any derivative of type ICollection) can NOT be projected to (because of the ToList() issue)

Now it seems to me that since in all of these cases a List is created anyway that perhaps when the expression is mapped to an SQL function the call to the nested ToList() should be ignored? Simply by doing this all of the problems would be solved. (Please correct me if this assumption is incorrect). Better yet what if it WASN’T ignored but removed from the SQL expression and stored for later use so that we could also call ToArray(), ToList(), etc. and change the type which is being returned from the EF materialization? Eg.

Context.Persons
.Select(p => new PersonDTO
{
    Addresses = p.Addresses.Select(m => new AddressDTO
    {
        // address properties
    }).ToArray()
    // other person properties
}).ToList();

Generic Repository: Fake IDbSet implementation update (Find Method & Identity key)

UPDATE (again) Just a quick one: see https://github.com/refactorthis/GraphDiff/blob/master/EFDetachedUpdate/DetachedUpdate/DbContextExtensions.cs on line 209 for a replacement GetKeyProperties method which allows for convention and fluent API mapped keys (You no longer need to annotate your model with KeyAttribute)

UPDATE: Thanks to Eli Weinstock-Herman for pointing out the fact that Find should return null if no result is found (SingleOrDefault instead of Single). Cheers Eli.

Hey guys,

I’ve been back in the coding seat lately, creating a new generic repository for a system that we are building. I’ve made some improvements to the FakeDbSet that I posted about earlier here.

I want to add some notes to the previous post which are long enough they warrant a new post. Firstly, IT IS MUCH EASIER if you do not use foreign keys in your objects but instead use ‘association’ object references. This means you will not have to co-ordinate two different fields when setting up test data. Of course EF does this for you when connected to the database but in memory you would have to do this yourself.

Secondly, the implementation of Find was quite hard, though I believe I have come up with an elegant generic solution. If you look at the IDbSet documentation on MSDN you will see that Find() expects the keys to be passed in “the same order that they are defined in the model”.

If I use reflection to find my key properties I can then iterate through the keys and ensure that each object given in the find method equals the value of that key, as shown below.


        private List<PropertyInfo> _keyProperties;

        public virtual T Find(params object[] keyValues)
        {
            if (keyValues.Length != _keyProperties.Count)
                throw new ArgumentException("Incorrect number of keys passed to find method");

            IQueryable<T> keyQuery = this.AsQueryable<T>();
            for (int i = 0; i < keyValues.Length; i++)
            {
                var x = i; // nested linq
                keyQuery = keyQuery
                   .Where(entity => _keyProperties[x].GetValue(entity, null).Equals(keyValues[x]));
            }

            return keyQuery.SingleOrDefault();
        }

        private void GetKeyProperties()
        {
            _keyProperties = new List<PropertyInfo>();
            PropertyInfo[] properties = typeof(T).GetProperties();
            foreach (PropertyInfo property in properties)
            {
                foreach (Attribute attribute in property.GetCustomAttributes(true))
                {
                    if (attribute is KeyAttribute)
                    {
                        _keyProperties.Add(property);
                    }
                }
            }
        }

Thirdly, I wanted the FakeDbSet to act like the database and use an identity column for properties that are ints and marked with the [Key] attribute. I made these changes here:

private int _identity = 1;

private void GenerateId(T entity)
{
    // If non-composite integer key
    if (_keyProperties.Count == 1 && _keyProperties[0].PropertyType == typeof(Int32))
        _keyProperties[0].SetValue(entity, _identity++, null);
}

public T Add(T item)
{
    GenerateId(item);
    _data.Add(item);
    return item;
}

Now of course this is being done in the Add method, not in the commit method as the database would do it. For my purposes this makes no difference. If, however, you want the key generation to be done on commit, then you need to keep an uncommitted list inside of the FakeDbSet; when commit is called you would iterate that list, generating IDs for each element and then adding them to the ‘committed’ list.
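A minimal sketch of that commit-time variant is shown below. It is deliberately not the FakeDbSet itself: `Item`, `PendingIdentitySet`, `Commit()` and `CommittedCount` are hypothetical names, and the key is a plain `Id` property rather than one discovered through reflection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity with a single integer key.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Simplified stand-in for a fake set that assigns identities on commit.
public class PendingIdentitySet
{
    private readonly List<Item> _pending = new List<Item>();
    private readonly List<Item> _committed = new List<Item>();
    private int _identity = 1;

    public int CommittedCount { get { return _committed.Count; } }

    // Add only queues the item; no key is generated yet.
    public Item Add(Item item)
    {
        _pending.Add(item);
        return item;
    }

    // Commit generates a key for each queued item, in insertion order,
    // and moves it to the committed list.
    public void Commit()
    {
        foreach (var item in _pending)
        {
            item.Id = _identity++;
            _committed.Add(item);
        }
        _pending.Clear();
    }
}
```

In a fake context, the SaveChanges() method would be the natural place to call Commit() on each set.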

Here is the new FakeDbSet implementation:


    using System;
    using System.Collections.Generic;
    using System.Collections.ObjectModel;
    using System.ComponentModel.DataAnnotations; // KeyAttribute
    using System.Data.Entity;                    // IDbSet<T>
    using System.Linq;
    using System.Linq.Expressions;
    using System.Reflection;

    public class FakeDbSet<T> : IDbSet<T> where T : class
    {
        private readonly HashSet<T> _data;
        private readonly IQueryable _query;
        private int _identity = 1;
        private List<PropertyInfo> _keyProperties;

        // Collects every property of T that is marked with [Key].
        private void GetKeyProperties()
        {
            _keyProperties = new List<PropertyInfo>();
            PropertyInfo[] properties = typeof(T).GetProperties();
            foreach (PropertyInfo property in properties)
            {
                foreach (Attribute attribute in property.GetCustomAttributes(true))
                {
                    if (attribute is KeyAttribute)
                    {
                        _keyProperties.Add(property);
                    }
                }
            }
        }

        private void GenerateId(T entity)
        {
            // If non-composite integer key
            if (_keyProperties.Count == 1 && _keyProperties[0].PropertyType == typeof(Int32))
                _keyProperties[0].SetValue(entity, _identity++, null);
        }

        public FakeDbSet(IEnumerable<T> startData = null)
        {
            GetKeyProperties();
            _data = (startData != null ? new HashSet<T>(startData) : new HashSet<T>());
            _query = _data.AsQueryable();
        }

        public virtual T Find(params object[] keyValues)
        {
            if (keyValues.Length != _keyProperties.Count)
                throw new ArgumentException("Incorrect number of keys passed to find method");

            IQueryable<T> keyQuery = this.AsQueryable<T>();
            for (int i = 0; i < keyValues.Length; i++)
            {
                var x = i; // nested linq
                keyQuery = keyQuery.Where(entity => _keyProperties[x].GetValue(entity, null).Equals(keyValues[x]));
            }

            return keyQuery.SingleOrDefault();
        }

        public T Add(T item)
        {
            GenerateId(item);
            _data.Add(item);
            return item;
        }

        public T Remove(T item)
        {
            _data.Remove(item);
            return item;
        }

        public T Attach(T item)
        {
            _data.Add(item);
            return item;
        }

        public void Detach(T item)
        {
            _data.Remove(item);
        }

        Type IQueryable.ElementType
        {
            get { return _query.ElementType; }
        }

        Expression IQueryable.Expression
        {
            get { return _query.Expression; }
        }

        IQueryProvider IQueryable.Provider
        {
            get { return _query.Provider; }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return _data.GetEnumerator();
        }

        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            return _data.GetEnumerator();
        }

        public T Create()
        {
            return Activator.CreateInstance<T>();
        }

        public ObservableCollection<T> Local
        {
            get
            {
                return new ObservableCollection<T>(_data);
            }
        }

        public TDerivedEntity Create<TDerivedEntity>() where TDerivedEntity : class, T
        {
            return Activator.CreateInstance<TDerivedEntity>();
        }
    }

Hope this code is useful to someone else 🙂

EF 4.1 DbContext Issue : Manually opening and closing the database connection

As I’m sure many of you are aware, the Entity Framework will create and close database connections automatically when needed. This is great most of the time; however, when we want to manually configure the connection for performance, or to perform a list of actions within a transaction, we don’t want the Entity Framework to automatically close our connection.

I’ve found an issue where I’m trying to manually manage my DbContext connection and the DbContext API does not want to let me.
(I’m using SQL Server 2005 and am trying to avoid transaction promotion to the DTC, which means I want to do all of my queries on the same connection.)

In ObjectContext land, when I call ObjectContext.Connection.Open() I am manually opening the connection and the documentation states on MSDN that this connection will NOT be closed until I call the Close() method or dispose of the context.

It seems calling DbContext.Database.Connection.Open() does not give the same results. When it is called, I watch the connection close and reopen for each query. Below is the code I am trying to write that presents the problem.

DbContext version:


            dbContext.Database.Connection.Open();
            using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required))
            {
                // perform a list of queries
                // the connection will close
                scope.Complete();
                dbContext.Database.Connection.Close();
            }

ObjectContext version:


            (dbContext as IObjectContextAdapter).ObjectContext.Connection.Open();
            using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required))
            {
                // perform a list of queries
                // The connection will not close!
                scope.Complete();
                (dbContext as IObjectContextAdapter).ObjectContext.Connection.Close();
            }

So the fix for now is to get the ObjectContext from your DbContext. But can someone explain what the difference is and is this by design?

Faking DbContext in Entity Framework 4.1 with a generic repository

Update 30/11/2011: FakeDbSet implementation update Please see the new and improved FakeDbSet Here

Update 16/06/2011:  Added step (2) description of how to implement Set<>() method in your original DbContext so that it returns IDbSet<>. Also added SaveChanges() to expose the context as a unit of work. + A little reorganisation.

Faking of the new Entity Framework 4.1 DbContext can be done quite simply by following these steps:

1. Create a common interface for your particular DbContext type.

I’m using a generic repository so my interface only needs to implement the Set method. But you could of course expose all your collections through this interface.

    public interface IMainModuleContext
    {
        IDbSet<Person> People { get; set; } // My collections...
        IDbSet<TEntity> Set<TEntity>() where TEntity : class;
        void SaveChanges();
    }

Notice how our collections are of type IDbSet instead of DbSet. This is because we will use an in-memory representation of the DbSet collection called FakeDbSet, which implements IDbSet.

If you are exposing all of your collections and using model-first this could be generated with a T4 Template to save development time. Ensure your real DbContext implements this interface, and that your repository will take a IMainModuleContext instead of the concrete type.

2. Now let’s make sure our original context (mine is called MainModuleContext) implements this interface. An example of the code to do this is below:

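public partial class MainModuleContext : DbContext, IMainModuleContext
{
    public IDbSet<Person> People { get; set; }
    public MainModuleContext() : base() {}

    // 'new' hides DbContext.Set<TEntity>() so this method can return IDbSet<TEntity>.
    public new IDbSet<TEntity> Set<TEntity>() where TEntity : class
    {
        return base.Set<TEntity>();
    }

    // 'new' hides DbContext.SaveChanges(), which returns int, to match the interface.
    public new void SaveChanges()
    {
        base.SaveChanges();
    }
    // Other methods
}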

Notice our properties must return IDbSet instead of DbSet. This is easy since the EF team have included the IDbSet interface for us.

3. Now we will create a fake dbset (an in-memory representation of a dbset)

    public class FakeDbSet<T> : IDbSet<T> where T : class
    {
        private HashSet<T> _data;

        public FakeDbSet()
        {
            _data = new HashSet<T>();
        }

        public virtual T Find(params object[] keyValues)
        {
            throw new NotImplementedException();
        }

        public T Add(T item)
        {
            _data.Add(item);
            return item;
        }

        public T Remove(T item)
        {
            _data.Remove(item);
            return item;
        }

        public T Attach(T item)
        {
            _data.Add(item);
            return item;
        }

        public void Detach(T item)
        {
             _data.Remove(item);
        }

        Type IQueryable.ElementType
        {
            get { return _data.AsQueryable().ElementType; }
        }

        Expression IQueryable.Expression
        {
            get { return _data.AsQueryable().Expression; }
        }

        IQueryProvider IQueryable.Provider
        {
            get { return _data.AsQueryable().Provider; }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return _data.GetEnumerator();
        }

        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            return _data.GetEnumerator();
        }

        public T Create()
        {
            return Activator.CreateInstance<T>();
        }

        public ObservableCollection<T> Local
        {
            get
            {
                return new ObservableCollection<T>(_data);
            }
        }

        public TDerivedEntity Create<TDerivedEntity>() where TDerivedEntity : class, T
        {
            return Activator.CreateInstance<TDerivedEntity>();
        }
    }

4. Now implement your fake context. The only tricky thing here is the Set method needs to use reflection to find the property we are after.


    public partial class FakeMainModuleContext : IMainModuleContext
    {
        public IDbSet<Person> People { get; set; }
        public IDbSet<T> Set<T>() where T : class
        {
            foreach (PropertyInfo property in typeof(FakeMainModuleContext).GetProperties())
            {
                if (property.PropertyType == typeof(IDbSet<T>))
                    return property.GetValue(this, null) as IDbSet<T>;
            }
            throw new Exception("Type collection not found");
        }

        public void SaveChanges()
        {
             // do nothing (probably set a variable as saved for testing)
        }

        public FakeMainModuleContext()
        {
            // Set up your collections
            People = new FakeDbSet<Person>
            {
                new Person() { FirstName = "Brent" }
            };
        }
    }

You can now swap out your DbContext with the fake context (FakeMainModuleContext above) for unit testing.