LINQ Deferred Execution & Lambda Methods for providing Simple Stats (Part II)

This is part 2 in a series of posts on Linq & Lambda capabilities in C# 

Deferred Execution

So let’s take a minute to talk about deferred execution. You may hear this referred to as lazy execution as well. In a nutshell, it means that when you write a LINQ or lambda query against a collection or list, the query doesn’t actually execute until the point where you need to access the results. Let’s look at a simple example.

var ienum = Enumerable.Range(1, 10).ToList();

var query = from i in ienum
            where i%2 == 0
            select i;

ienum.Add(20);
ienum.Add(30);

SuperConsole.WriteLine(query);
//prints 2, 4, 6, 8, 10, 20, 30

So why does it print out 20 and 30? This is deferred execution in practice. At the point where you write your query (var query), the query is not actually executed against your data source (ienum). After the query is set up, more data is added to the data source, and the query is only executed at the point where the results need to be evaluated (SuperConsole.WriteLine).

This holds true in a number of other LINQ scenarios. In LINQ to SQL or LINQ to Entities (Entity Framework), the SQL query is only sent to the database at the point where you need to evaluate the results. It’s important to understand this so that queries don’t go out of scope before being executed, so that un-executed queries aren’t inadvertently passed to other parts or layers of your application, and so that you don’t end up introducing N+1 problems where you think you’re working on data in memory but in actual fact you’re executing the query over and over in a loop. If you do need to make your queries eager (“greedy”) and force them to execute there and then, you can wrap them in parentheses and immediately call .ToList() on them.
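To make the difference concrete, here is a minimal sketch that runs the same query twice against the list from the earlier example, once deferred and once forced with .ToList(). The DeferredDemo class and its Run method are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the deferred and the eager results as comma-separated strings.
    public static (string Deferred, string Eager) Run()
    {
        var source = Enumerable.Range(1, 10).ToList();

        // Deferred: this only describes the query, nothing is evaluated yet.
        IEnumerable<int> deferred = from i in source
                                    where i % 2 == 0
                                    select i;

        // Eager: wrapping the query in parentheses and calling ToList()
        // executes it immediately and stores a snapshot of the results.
        List<int> eager = (from i in source
                           where i % 2 == 0
                           select i).ToList();

        source.Add(20);
        source.Add(30);

        // The deferred query is evaluated here, after the two Add calls.
        return (string.Join(", ", deferred), string.Join(", ", eager));
    }

    public static void Main()
    {
        var (deferred, eager) = Run();
        Console.WriteLine(deferred); // prints 2, 4, 6, 8, 10, 20, 30
        Console.WriteLine(eager);    // prints 2, 4, 6, 8, 10
    }
}
```

The eager list never sees 20 and 30 because it was materialised before they were added.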

Min, Max, Count & Average

LINQ has a number of convenient built-in methods for getting various numeric stats about the data you’re working on. Consider a collection of movies which you want to query.

public class Movie
{
    public string Title { get; set; }
    public double Rating { get; set; }
}

...

var movies = new List<Movie>
    {
        new Movie() {Title = "Die Hard", Rating = 4.0},
        new Movie() {Title = "Commando", Rating = 5.0},
        new Movie() {Title = "Matrix Revolutions", Rating = 2.1}
    };

Console.WriteLine(movies.Min(m => m.Rating));
//prints 2.1

Console.WriteLine(movies.Max(m => m.Rating));
//prints 5

Console.WriteLine(movies.Average(m => m.Rating));
//prints 3.7 (newer .NET runtimes may print 3.6999999999999997 because of floating-point rounding)

Console.WriteLine(movies.Count);
Console.WriteLine(movies.Count());
//prints 3

Min, Max and Average are all fairly straightforward, finding the minimum, maximum and average movie rating values respectively. With regard to Count, it’s worth mentioning that there are different “versions” of it depending on the underlying data structure you are operating on. The Count property is a property of the List class and returns the current number of items in that collection. The Count() method is an extension method on the IEnumerable interface which can be executed on any IEnumerable structure regardless of implementation.

In general LINQ’s Count() will be slower and is an O(N) operation, while List.Count and Array.Length are both guaranteed to be O(1). However, in some cases LINQ will special-case the IEnumerable parameter by casting to certain interface types such as IList or ICollection. It will then use that type’s Count property instead of enumerating the sequence, so it goes back down to O(1), but you still pay the minor overhead of the cast and interface call. Ref: [http://stackoverflow.com/questions/981254/is-the-linq-count-faster-or-slower-than-list-count-or-array-length/981283#981283]

This is important as well if you are testing your collections to see if they are empty. People coming from versions of .NET previous to Generics would use the Count or Length properties of a collection to see if they were empty. i.e.

if(list.Count == 0)
{ 
    //empty
}
if(array.Length == 0)
{
    //empty
}

LINQ however provides another method to test for contents called Any(). It can be used to check whether the collection contains anything at all, or whether it contains any items that satisfy a specific filter.

if(!list.Any()) //equivalent of Count == 0
{ 
    //empty
}
if(list.Any(m => m.Rating == 5.0)) //if it contains any top rated movies.
{
    //contains at least one top rated movie
}

If you are starting with something that has a .Length or .Count (such as ICollection, IList, List, etc) – then this will be the fastest option, since it doesn’t need to go through the GetEnumerator()/MoveNext()/Dispose() sequence required by Any() to check for a non-empty IEnumerable sequence. For just IEnumerable, then Any() will generally be quicker, as it only has to look at one iteration. However, note that the LINQ-to-Objects implementation of Count() does check for ICollection (using .Count as an optimisation) – so if your underlying data-source is directly a list/collection, there won’t be a huge difference. Don’t ask me why it doesn’t use the non-generic ICollection… Of course, if you have used LINQ to filter it etc (Where etc), you will have an iterator-block based sequence, and so this ICollection optimisation is useless. In general with IEnumerable : stick with Any() Ref: [http://stackoverflow.com/questions/305092/which-method-performs-better-any-vs-count-0/305156#305156]
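As a rough illustration of the point about plain IEnumerable sequences, the sketch below counts how many items an iterator hands out when you test for emptiness with Count() versus Any(). The AnyVersusCount class, the Numbers iterator and the Pulled counter are all invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    // Number of items the iterator below has yielded since the last reset.
    public static int Pulled;

    // A lazy sequence with no Count property; it records every item it yields.
    public static IEnumerable<int> Numbers(int howMany)
    {
        for (var i = 0; i < howMany; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Count() has to walk the whole sequence to produce its answer.
    public static int ItemsPulledByCount(int howMany)
    {
        Pulled = 0;
        var isEmpty = Numbers(howMany).Count() == 0;
        return Pulled;
    }

    // Any() stops as soon as the first MoveNext() succeeds.
    public static int ItemsPulledByAny(int howMany)
    {
        Pulled = 0;
        var isEmpty = !Numbers(howMany).Any();
        return Pulled;
    }

    public static void Main()
    {
        Console.WriteLine(ItemsPulledByCount(1000)); // prints 1000
        Console.WriteLine(ItemsPulledByAny(1000));   // prints 1
    }
}
```

For a List or an array the gap disappears, since the item count is already known without enumerating.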

Next post, we’ll look at some different mechanisms for filtering and transforming our queries.

~Eoin C

Handy LINQ & Lambda Methods and Extensions (Part I)

The System.Linq namespace contains a fantastic set of utility extension methods for filtering, ordering & manipulating the contents of your collections and objects. In the following posts I’ll go through some of the most useful ones (in my humble opinion) and how you might use them in your C# solutions.

This is part 1 in a series of posts on Linq & Lambda capabilities in C# 

Before we start, here’s a handy static method to print your resulting collections to the console so you can quickly verify the results.

public class SuperConsole
{
    public static void WriteLine<T>(IEnumerable<T> list, bool includeCarriageReturnBetweenItems = false)
    {
        var separator = includeCarriageReturnBetweenItems ? ",\n" : ", ";
        var result = string.Join(separator, list);
        Console.WriteLine(result);
    }
}

Enumerable

The System.Linq.Enumerable type has two very useful static methods for quickly generating a sequence of items: Enumerable.Range and Enumerable.Repeat. The Range method allows you to quickly generate a sequential list of integers from a given starting point for a given number of items.

IEnumerable<int> range = Enumerable.Range(1, 10);
SuperConsole.WriteLine(range);
//prints "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

So why is this useful? Well, you could use it to quickly generate a pre-initialised list of integers rather than new’ing up a list and then looping to populate it. Or you could use it to replicate for(;;) behavior, e.g.

for (int i = 1; i <= 10; i++) 
{     
    //DoWork(i); 
} 

Enumerable.Range(1, 10).ToList().ForEach(i =>
    {
        //DoWork(i)
    });

Repeat is similar but is not limited to integers. You can generate a sequence of a given length with the same value in every item. Imagine you wanted to create a list of 10 strings all initialised with a default string of “ABC”:

var myList = Enumerable.Repeat("ABC", 10).ToList();
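One thing worth knowing when the repeated value is a reference type: Repeat hands back the same instance every time rather than a fresh copy. The sketch below shows this alongside a Range + Select alternative that creates a new object per item. The RepeatDemo class and its method names are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RepeatDemo
{
    // Repeat copies the reference, so all three slots point at one list.
    public static int SharedCount()
    {
        var shared = Enumerable.Repeat(new List<int>(), 3).ToList();
        shared[0].Add(42);
        return shared[2].Count; // the "other" list sees the added item
    }

    // Select runs the lambda once per item, giving three separate lists.
    public static int SeparateCount()
    {
        var separate = Enumerable.Range(0, 3).Select(_ => new List<int>()).ToList();
        separate[0].Add(42);
        return separate[2].Count; // untouched
    }

    public static void Main()
    {
        Console.WriteLine(SharedCount());   // prints 1
        Console.WriteLine(SeparateCount()); // prints 0
    }
}
```

With strings such as "ABC" this makes no practical difference, because strings are immutable.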

Item Conversion

There are also a few handy ways to convert/cast items built into the System.Linq namespace. The Cast<T> extension method casts every item in a sequence from one type to another, as long as each item can actually be cast to the target type. It only performs reference and unboxing conversions, so casting a sequence of int to long, for example, throws an InvalidCastException. This can be useful for quickly changing a collection of derived types into their base types.

var integers = Enumerable.Range(1, 5);
var objects = integers.Cast<object>().ToList();

Console.WriteLine(objects.GetType());
SuperConsole.WriteLine(objects);

//prints
//System.Collections.Generic.List`1[System.Object]
//1, 2, 3, 4, 5
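Here is a small sketch of where Cast<T> works and where it doesn’t. The Animal and Dog classes and the CastDemo helper are hypothetical, invented purely for the example. Cast<T> happily turns derived items into their base type, but it does not apply numeric conversions, so int to long fails at runtime and needs Select with an explicit cast instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Animal { }
public class Dog : Animal { }

public static class CastDemo
{
    // Derived-to-base is a reference conversion, so Cast<T> handles it.
    public static int UpcastCount()
    {
        var dogs = new List<Dog> { new Dog(), new Dog() };
        List<Animal> animals = dogs.Cast<Animal>().ToList();
        return animals.Count;
    }

    // Cast<long> boxes each int and then tries to unbox it as long, which fails.
    public static bool IntToLongCastThrows()
    {
        try
        {
            Enumerable.Range(1, 3).Cast<long>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(UpcastCount());         // prints 2
        Console.WriteLine(IntToLongCastThrows()); // prints True

        // Select applies a real numeric conversion to each item.
        var longs = Enumerable.Range(1, 3).Select(i => (long)i).ToList();
        Console.WriteLine(string.Join(", ", longs)); // prints 1, 2, 3
    }
}
```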

But what if a valid cast isn’t available? What if we wanted to convert our collection of integers into a collection of strings with a ‘:’ suffix? Thankfully List<T> has us covered with its ConvertAll method (strictly speaking this is a method on List<T> rather than part of LINQ, which is why the example calls ToList() first).

var integers = Enumerable.Range(1, 5);
var converter = new Converter<int, string>(input => string.Format("{0}: ", input));
var results = integers.ToList().ConvertAll(converter);

SuperConsole.WriteLine(results, true);
/*prints (the format string leaves a trailing space and SuperConsole adds the comma)
    1: ,
    2: ,
    3: ,
    4: ,
    5: 
    */
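For comparison, the LINQ-flavoured way to do the same conversion is Select, which works on any IEnumerable and doesn’t need the intermediate ToList() or a Converter delegate. This is only a sketch of an alternative to the ConvertAll example above; the SelectDemo class and Labels method are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectDemo
{
    // Projects each integer to a string with a ':' suffix.
    public static List<string> Labels()
    {
        return Enumerable.Range(1, 5)
                         .Select(i => $"{i}:")
                         .ToList();
    }

    public static void Main()
    {
        // One label per line: 1: through 5:
        Console.WriteLine(string.Join("\n", Labels()));
    }
}
```

Like the other LINQ operators, Select is deferred, so nothing is converted until ToList() (or some other enumeration) runs.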

In the next post, we’ll look at some of the lazy & deferred execution capabilities of LINQ and some useful methods for performing quick calculations and manipulations on our collections.

~Eoin C

Building Lambda Expressions at Runtime

Necessity is the mother of all… reasons to learn something new. So when some project requirements came down to put together a Search UI for an object graph of ~200 different properties in one wide table, we got an opportunity to play with some dynamic LINQ. We needed to come up with a quick way to allow a user to search across all the properties without making the UI unwieldy. What we provided them with was a simple UI allowing the user to apply 0:N conjunctive search filters. For each filter they choose an object property to filter by, the filtering operator (equal, less than, etc…) and the value they were searching for.

By the way, if there’s a nicer way to do this, I’d love to know about it.
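To give a flavour of the idea, here is a minimal sketch of building one filter (property name, operator, value) into a lambda at runtime with System.Linq.Expressions, and chaining several of them conjunctively. This is not the implementation from the full post. FilterOperator, FilterBuilder, Build and FilterDemo are hypothetical names, and the Movie class is borrowed from the earlier stats post just to have something to search.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Linq.Expressions;

public class Movie
{
    public string Title { get; set; }
    public double Rating { get; set; }
}

// Hypothetical set of operators the search UI might offer.
public enum FilterOperator { Equal, LessThan, GreaterThan }

public static class FilterBuilder
{
    // Builds the equivalent of: item => item.<propertyName> <op> value
    public static Expression<Func<T, bool>> Build<T>(string propertyName, FilterOperator op, object value)
    {
        var item = Expression.Parameter(typeof(T), "item");
        var property = Expression.Property(item, propertyName);

        // Convert the supplied value to the property's type so both sides match.
        var typedValue = Convert.ChangeType(value, property.Type, CultureInfo.InvariantCulture);
        var constant = Expression.Constant(typedValue, property.Type);

        Expression body;
        switch (op)
        {
            case FilterOperator.Equal: body = Expression.Equal(property, constant); break;
            case FilterOperator.LessThan: body = Expression.LessThan(property, constant); break;
            case FilterOperator.GreaterThan: body = Expression.GreaterThan(property, constant); break;
            default: throw new ArgumentOutOfRangeException(nameof(op));
        }

        return Expression.Lambda<Func<T, bool>>(body, item);
    }
}

public static class FilterDemo
{
    public static List<string> Search()
    {
        var movies = new List<Movie>
        {
            new Movie { Title = "Die Hard", Rating = 4.0 },
            new Movie { Title = "Commando", Rating = 5.0 },
            new Movie { Title = "Matrix Revolutions", Rating = 2.1 }
        };

        // Two user-chosen filters; chaining Where calls ANDs them together.
        var filters = new[]
        {
            FilterBuilder.Build<Movie>("Rating", FilterOperator.GreaterThan, 3.0),
            FilterBuilder.Build<Movie>("Title", FilterOperator.Equal, "Commando")
        };

        IEnumerable<Movie> results = movies;
        foreach (var filter in filters)
        {
            results = results.Where(filter.Compile());
        }

        return results.Select(m => m.Title).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Search())); // prints Commando
    }
}
```

Against an IQueryable source you would pass the expression itself to Where rather than compiling it, so the provider can translate it. Note that the ordering operators in this sketch only suit types that define them, so LessThan on a string property would throw when the expression is built.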