Andy's observations as he continues to attempt to know all that is .NET...

Tuesday, December 01, 2009

Devweek 2010

Rock Solid Knowledge are pleased to announce that we have had a host of talks accepted for Devweek 2010, including the latest advances in Silverlight, Designing for Testing, Enterprise and OO Design Patterns, Workflow and a full day of what's new in .NET 4.  Hope to see you there…

For a full list of talks click here

Monday, October 19, 2009

VS2010 Beta 2 Download

VS2010 Beta 2 is now available for download for MSDN subscribers.

Monday, October 05, 2009

Geometric Decomposition Screencast

I’ve posted a new screencast on the Rock Solid Knowledge screencasts site that provides an example of how to partition a set of data for parallel processing.

The screen cast covers a simple way using Parallel.ForEach, and a more efficient version combining the use of Barrier.

You can check out this screencast and others here
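The first technique in miniature (this is just a sketch of the idea, not the screencast code): split the data into contiguous chunks, one per core, and let Parallel.ForEach hand each iteration a whole chunk, combining the per-chunk results under a single lock per chunk rather than per element.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class GeometricDecompositionSketch
{
    // Split [0..length) into contiguous chunks, one per worker,
    // so each Parallel.ForEach iteration owns a whole sub-range.
    public static IEnumerable<Tuple<int, int>> Chunk(int length, int chunks)
    {
        int size = length / chunks;
        for (int i = 0; i < chunks; i++)
        {
            int start = i * size;
            int end = (i == chunks - 1) ? length : start + size;
            yield return Tuple.Create(start, end);
        }
    }

    public static long SumInParallel(int[] data)
    {
        long total = 0;
        object gate = new object();
        Parallel.ForEach(Chunk(data.Length, Environment.ProcessorCount), range =>
        {
            // Each chunk is summed privately...
            long localSum = 0;
            for (int i = range.Item1; i < range.Item2; i++) localSum += data[i];
            // ...then combined once per chunk, not once per element.
            lock (gate) total += localSum;
        });
        return total;
    }
}
```

The more efficient Barrier-based version in the screencast coordinates phases between workers; the sketch above shows only the partitioning idea.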

Parallel utilities

Whilst working with the .NET 4 parallel extensions I often find the need to extend the framework to assist me in various day-to-day tasks.  I’ve finally got around to compiling a library of such extensions.

The extensions include the following:

  • Set process affinity, so I can see how my algorithm scales on different numbers of cores.  Non-hyperthreaded cores are selected first.
  • Determine the number of real cores, not including hyperthreaded ones.
  • A Range type that supports the partitioning of a range, so that I can farm sub-ranges out to different tasks.
  • A Sequence class for building non-integer parallel loops, or loops with steps other than 1.
  • Finally, a simplified way to handle aggregated exceptions.
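As a flavour of the library's shape, the Range type might look something along these lines (a simplified sketch, not the shipped code; the real library may well differ):

```csharp
using System;
using System.Collections.Generic;

// Sketch of a range [Start, End) that can partition itself into
// sub-ranges, so each sub-range can be handed to a separate task.
public class Range
{
    public int Start { get; private set; }
    public int End { get; private set; }    // exclusive

    public Range(int start, int end)
    {
        if (end < start) throw new ArgumentException("end must be >= start");
        Start = start;
        End = end;
    }

    public IEnumerable<Range> Partition(int parts)
    {
        int size = (End - Start) / parts;
        for (int i = 0; i < parts; i++)
        {
            int subStart = Start + i * size;
            // The last sub-range absorbs any remainder.
            int subEnd = (i == parts - 1) ? End : subStart + size;
            yield return new Range(subStart, subEnd);
        }
    }
}
```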

You can download the library from here

It’s a Visual Studio 2010 project, containing the library and a unit test project, which should provide enough insight into how the library works.

I’m very keen to know what people think of the aggregate exception handling.

Friday, October 02, 2009

Are Singletons Evil?

I finally got around to delivering a conference talk on this subject this week at Software Architecture Week; it’s a topic Kevin Jones and I are constantly asked about.  Of course a quick Google reveals the answer: there are numerous rants about this evil pattern.  But like most things in life it’s not as simple as yes or no.

Our goal for this talk was to expose the areas where this pattern causes a developer a whole load of pain.  In fact one member of the audience was experiencing exactly such pain in his attempt to take a legacy code base heavily utilising singletons and start writing unit tests.

So whilst the majority of the time was spent examining the consequences of using the singleton pattern, we also took time to highlight that one or two singletons correctly positioned in your application could in fact enable unit testing and coding to interfaces, without having to refactor large areas of a legacy code base.
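The enabling trick is the standard one: have the singleton implement an interface and let consumers talk to the interface, so a test can substitute a fake without refactoring the rest of the code base. A minimal sketch (the names here are illustrative, not taken from the talk):

```csharp
using System;

// The legacy singleton now implements an interface...
public interface IClock
{
    DateTime Now { get; }
}

public sealed class SystemClock : IClock
{
    public static readonly SystemClock Instance = new SystemClock();
    private SystemClock() { }
    public DateTime Now { get { return DateTime.Now; } }
}

// ...and consumers depend on the interface, defaulting to the singleton,
// so existing callers need not change.
public class InvoiceGenerator
{
    private readonly IClock clock;

    public InvoiceGenerator() : this(SystemClock.Instance) { }
    public InvoiceGenerator(IClock clock) { this.clock = clock; }

    public bool IsOverdue(DateTime due) { return clock.Now > due; }
}

// In a unit test a fake clock replaces the singleton.
public class FakeClock : IClock
{
    public DateTime Now { get; set; }
}
```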

This talk took the format of a short geeky play, featuring two developers trying to wrestle with getting the job done and unit testing.  You can download the script and accompanying code.

So, like all good consultants, our answer to this question is “It Depends”.

Friday, August 14, 2009

Fluent Parallel While

During DevWeek 2009 Oliver introduced me to fluent APIs.  I personally love programs that read naturally; after all, programs are read far more often than they are written.  This week a posting on the Parallel Extensions blog demonstrated how to achieve a parallel while loop, since we only have Parallel.For and Parallel.ForEach:

 

static void Main(string[] args)
{
    ConcurrentQueue<int> queue = new ConcurrentQueue<int>();

    for (int i = 0; i < 10; i++)
    {
        queue.Enqueue(i);
    }

    Action<ParallelLoopState> processQueue = (lps) =>
    {
        int item;
        if (queue.TryDequeue(out item))
        {
            Console.WriteLine(" Thread Id = {0} Task Id = {1} : {2}", Thread.CurrentThread.ManagedThreadId, Task.Current.Id, item);
        }
    };

    Func<bool> queueContainsItems = () => queue.IsEmpty == false;

    ParallelUtil.While(new ParallelOptions(), queueContainsItems, processQueue);
}

I thought I’d combine both of these ideas and produce what I would consider a more fluent version:

public static class ParallelUtil
{
    private static ParallelOptions NoParallelOptions = new ParallelOptions();

    public static void InParallelWhile(this Action<ParallelLoopState> action, Func<bool> condition, ParallelOptions options = null)
    {
        if (options == null) options = NoParallelOptions;

        Parallel.ForEach(IterateForever(), options,
            (ignored, loopState) =>
            {
                if (!condition())
                {
                    loopState.Stop();
                }
                else
                {
                    action(loopState);
                }
            });
    }

    // Infinite sequence used to drive ForEach; the loop ends via loopState.Stop()
    private static IEnumerable<bool> IterateForever()
    {
        while (true)
        {
            yield return true;
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        ConcurrentQueue<int> queue = new ConcurrentQueue<int>();

        for (int i = 0; i < 10; i++)
        {
            queue.Enqueue(i);
        }

        Action<ParallelLoopState> processQueue = (lps) =>
        {
            int item;
            if (queue.TryDequeue(out item))
            {
                Console.WriteLine(" Thread Id = {0} Task Id = {1} : {2}", Thread.CurrentThread.ManagedThreadId, Task.Current.Id, item);
            }
        };

        Func<bool> queueContainsItems = () => queue.IsEmpty == false;

        processQueue.InParallelWhile(queueContainsItems);
    }
}

The key difference is that in the first example you use the following line to process the queue in parallel:



ParallelUtil.While(new ParallelOptions(), queueContainsItems, processQueue);



As opposed to the second version, with a perhaps more fluent implementation:



processQueue.InParallelWhile(queueContainsItems);

Friday, July 31, 2009

Linq for NHibernate

It’s been a while coming, but it is now possible to use LINQ against NHibernate.  I think it’s a shame that MS didn’t get behind the Hibernate family when they first created LINQ, as it’s a marriage made in heaven, combining a mature ORM with language-integrated query, rather than spending all that time (and still counting) trying to create their own ORM.

I’ve yet to try out the integration, but hopefully I will over the next few weeks…

Click Here for the announcement

Thursday, July 30, 2009

WARNING: cancellation support may make things go slow…

.NET 4 Tasks offer much better support for task cancellation; unlike with QueueUserWorkItem, tasks can be cancelled before commencing, and the Task library offers a standard way for running tasks to detect and report cancellation.  I recently recorded a screencast that demonstrated the new Task API including the cancellation support.  This blog post isn’t so much about the cancellation mechanics but some guidance on how best to use it if you don’t want to change the throughput of your task.

Below is some code that calculates pi; it expects to be run inside a task and supports the notion of cancellation.

 

private static double AbortableCalculatePi()
{
    double pi = 1;
    double multiplier = -1;

    const int N_ITERATIONS = 500000000;

    for (int nIter = 3; nIter < N_ITERATIONS; nIter += 2)
    {
        if (Task.Current.IsCancellationRequested)
        {
            Task.Current.AcknowledgeCancellation();
            return 0.0;
        }

        pi += (1.0 / (double)nIter) * multiplier;
        multiplier *= -1;
    }
    return pi * 4.0;
}





So all is well and good, until you benchmark it against the version with no cancellation support and find it runs almost twice as slowly.  The reason is that the cost of detecting cancellation is high in relation to the work being done.  The important aspect of cancellation is being able to respond in a meaningful time for the client; at present we are being way too aggressive in checking.



One option would be to only check every N iterations..




private static double BetterAbortableCalculatePi()
{
    double pi = 1;
    double multiplier = -1;

    const int N_ITERATIONS = 500000000;

    for (int nIter = 3; nIter < N_ITERATIONS; nIter += 2)
    {
        if ((nIter - 3) % 100000 == 0)
        {
            if (Task.Current.IsCancellationRequested)
            {
                Task.Current.AcknowledgeCancellation();
                return 0.0;
            }
        }

        pi += (1.0 / (double)nIter) * multiplier;
        multiplier *= -1;
    }
    return pi * 4.0;
}





This took a third less time than the previous, more aggressive version.  However, the if block’s effect on the pipeline and the additional maths are still an extra cost over the version that had no cancellation support.



So a third approach is called for, this time refactoring the algorithm to use two loops instead of one, where the check for cancellation is done once per iteration of the outer loop; this results in very little additional cost.




private static double OptimisedAbortableCalculatePi()
{
    double pi = 1;
    double multiplier = -1;

    const int N_ITERATIONS = 500000000 / 2;

    const int OUTER_ITERATIONS = 10000;
    const int INNER_ITERATIONS = N_ITERATIONS / OUTER_ITERATIONS;

    int i = 3;
    for (int outerIndex = 0; outerIndex < OUTER_ITERATIONS; outerIndex++)
    {
        for (int nIter = 0; nIter < INNER_ITERATIONS; nIter++)
        {
            pi += (1.0 / i) * multiplier;
            multiplier *= -1;
            i += 2;
        }

        if (Task.Current.IsCancellationRequested)
        {
            Task.Current.AcknowledgeCancellation();
            return 0.0;
        }
    }
    return pi * 4.0;
}





Here are the timings I got from the various approaches:



NoAbortableCalculatePi        = 3.14159264958921 took 00:00:03.7357873

AbortableCalculatePi          = 3.14159264958921 took 00:00:09.6137173

BetterAbortableCalculatePi    = 3.14159264958921 took 00:00:06.3826212

OptimisedAbortableCalculatePi = 3.14159265758921 took 00:00:03.6883268



As the figures show there is virtually no difference between the first and last run, but a considerable difference when cancellation is inserted into the core of the computation.



So to sum up: whilst cancellation support is good, the frequency with which you check for it can have an impact on the overall performance of your algorithm.  Cancellation is something we want to support, but in general users probably won’t need it, so we need to strike the right balance between throughput and responding to cancellation in an appropriate timeframe.

.NET 4 Tasks and UI Programming

Just uploaded a new screencast covering how to marshal results from .NET 4 Tasks back on to the UI thread.  One method is to continue to utilise the same APIs from previous versions of .NET, namely SynchronizationContext.Post; the Task-based API offers an alternative and in some cases more elegant solution using the ContinueWith method.
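The shape of the ContinueWith approach is roughly this (a sketch, not the screencast code; it assumes it is called from a UI thread that has a SynchronizationContext installed, as WinForms and WPF threads do):

```csharp
using System;
using System.Threading.Tasks;

public static class UiMarshalSketch
{
    // Compute on a background task, then marshal the result back to the
    // UI thread by scheduling the continuation on the UI's TaskScheduler.
    public static void RunAndDisplay(Action<string> display)
    {
        // Must be captured while still on the UI thread.
        TaskScheduler uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();

        Task<string>.Factory.StartNew(() => ExpensiveWork())
            .ContinueWith(t => display(t.Result), uiScheduler);
    }

    private static string ExpensiveWork() { return "done"; }
}
```

A typical call from a form would be `RunAndDisplay(s => resultLabel.Text = s);`, with no explicit Invoke or Post in sight.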

Thursday, July 23, 2009

My First Silverlight 3 App

I had a brief rest from patterns and parallel stuff to have a quick play with Silverlight 3.  I mainly wanted to see the out-of-browser aspect, as I think the idea of being able to build RIAs that also run on the desktop is very compelling…

So what to build?  I really like the iPhone weather app, so I thought I’d have a go at reproducing it in Silverlight; below is a screenshot showing it running out of the browser.

image

I’m using isolated storage to store the list of weather centres of interest; a more typical line-of-business app would store app config on the web server/cloud, and potentially locally too, to support true client roaming, but I’ll leave that for another day.

One thing to note is that the method for enabling out-of-browser mode for your application is now different from pre-release versions of Silverlight 3, so there are many old blog posts that are unfortunately wrong now.  The good news is that it is now trivial: view the project properties, and under the Silverlight tab there is an option to enable the app to run out of browser.  This then creates the OutOfBrowserSettings.xml file.

image

You can download the source from here, or to just see the app in action click here.  I’ll let you decide if it’s as cool as the iPhone…

Thursday, July 16, 2009

Testing on varying number of cores

I’ve written many blog articles in the past that show that the performance of a piece of parallel code can vary dramatically based on the number of available cores.  With that in mind, it’s obviously desirable, even when given a machine with 8 cores, to test your code against a machine that could have substantially fewer.  You can resort to Task Manager and set the process affinity to reduce the number of cores available to the process, but this is tedious.  There is a .NET API that allows you to control which cores are made available to a process.  The API requires the use of a bitmask to identify which cores to use; that’s a bit (no pun intended) overkill for what I’m trying to do, so I created a facade that allows me to simply say “use N cores”.

public static class Cores
{
    public static int Max
    {
        get { return Environment.ProcessorCount; }
    }

    public static int CoresInUse
    {
        get
        {
            IntPtr cores = Process.GetCurrentProcess().ProcessorAffinity;

            int nCores = 0;
            while (cores != IntPtr.Zero)
            {
                if (((int)cores & 1) == 1)
                {
                    nCores++;
                }
                cores = (IntPtr)((int)cores >> 1);
            }
            return nCores;
        }
        set
        {
            if ((value < 1) || (value > Environment.ProcessorCount))
            {
                throw new ArgumentException("Illegal number of cores");
            }

            int cores = 1;
            for (int nShift = 0; nShift < value - 1; nShift++)
            {
                cores = 1 | (cores << 1);
            }

            Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)cores;
        }
    }
}





The following code prints out the number of active cores and then reduces the number of cores to 4:




Console.WriteLine("Using {0} out of {1}" , Cores.CoresInUse , Cores.Max);
Cores.CoresInUse = 4;
Console.WriteLine("Using {0} out of {1}", Cores.CoresInUse, Cores.Max);




Saturday, July 11, 2009

Guerilla .NET Demos from 6th July 2009

Had loads of fun as normal teaching Guerilla .NET with Rich and Marcus.  All the demos from the class are here

Friday, June 26, 2009

Parallel.For and Parallel.ForEach…moving past Thread.Sleep

Just released a screencast demoing Parallel.For and Parallel.ForEach, in which for once we are not demoing the virtues of a parallel Thread.Sleep…

Click here to see this and many other Rock Solid Knowledge Screen casts

Wednesday, June 24, 2009

Short and Long running tasks in .NET 4

Just added a screen cast on how to create short and long running tasks in .NET 4.  You can get to all the Rock Solid Knowledge screen casts via www.rocksolidknowledge.com/screencasts

Monday, June 15, 2009

Rock Solid Knowledge at Software Architect 2009

Just to say that we will be delivering talks at Software Architect 2009, here is a full list of our sessions.

Long or Short running tasks

I’ve been playing around with the new version of Pfx for .NET.  I must say I’ve been very impressed with the improvements since the last CTP for .NET 3.5.  So here begins a series of blogs and screencasts on various bits of Pfx for .NET 4 Beta 1.

The one thing that really stands out is that the Pfx types are no longer just for fine-grained parallelism; the library is now a unification of the various threading APIs.  Unifying APIs is nothing new for the .NET Framework; it’s been happening since 1.1.  As developers it’s great that whilst the framework is evolving, a constant effort is being made to refactor and simplify previous complexity.

Previously, if I wanted to create a short-running piece of background work I would favour the thread pool, whereas for a long-running piece of background activity I would have to create my own custom thread.  Creating short-running work on the thread pool meant either QueueUserWorkItem or Delegate.BeginInvoke, and a long-running thread meant new Thread() and calling Start.

Now, for either type of situation, you simply create a task using the new Task type, either via the task factory (not a real GoF Factory, but don’t get me started on that) or via new Task():

Task bgTask = Task.Factory.StartNew( MyShortRunningTask);

For a long-running task we use the same API, but this time give it a hint that this is a long-running task and shouldn’t therefore use a thread pool thread, but instead create a new thread outside the thread pool:

Task bgTask = Task.Factory.StartNew( MyLongRunningTask, TaskCreationOptions.LongRunning);

So, two very similar calls, leaving it up to the framework to decide how best to schedule the work.

Thursday, June 11, 2009

VS2010 Edit window outside the IDE

Just noticed that I can drag source windows outside the IDE in VS2010; this is something I’ve wanted for ages… Now I can really use multiple monitors.

Wednesday, June 10, 2009

Pfx team examples moving into the real world

I’ve written many articles on my blog about how I’m sick of trade-show demos of Pfx (Parallel Framework Extensions).  You know the ones: a simple Parallel.For with Thread.Sleep or Thread.SpinWait as the piece of work.  These examples scale wonderfully, but the moment people take those simple examples and apply them to their own for loops, terrible performance often results.  Thankfully the Pfx team have written a blog article offering some suggestions about what to do when the piece of work inside the for loop is too small (blog article).  Interestingly, this is the first time I’ve seen them utilise the number of processors in the machine to determine the number of tasks, something I’ve advocated many times in the past for tasks of equal cost.

Wednesday, June 03, 2009

Trailing comma in C#

Since C# v3 you have been able to initialise collections leaving a trailing comma…

List<int> vals = new List<int>
{
    1,
    2,
    3,
    4,
    5,
};

I was asking students in a recent class whether they liked it or not… One said that he didn’t like the final comma, as he thought it suggested something was missing… My original reason for liking it was that it meant I could easily add to the list.  However, one student said he liked it because if you ever did a source control diff on two versions, the diff would show only one line changed as opposed to two.  I’d never thought of that…

Wednesday, May 20, 2009

.NET Screencasts

Just to say that Rock Solid Knowledge, of which I’m one of the founders, has now started to produce a series of screencasts on various bits of .NET technology.  With Beta 1 of VS2010 just out, expect a load more in the not-too-distant future covering all the new goodness.  You can check out the screencasts either via

http://www.rocksolidknowledge.com/screencasts or subscribe via RSS

Tuesday, May 12, 2009

Entity Framework and Coarse grain locking

I’ve recently had to come back and visit the Entity Framework (EF); as part of this I needed to provide editing of many entities as a single edit operation.  In effect, what I’m really saying is that my collection of entities is really one entity as far as my user is concerned; e.g. an Order entity is comprised of a series of Order Item entities, but as far as the user request is concerned they are editing an order.

Now, out of the box, Entity Framework provides support for optimistic locking on a per-entity basis.  However, in my case I don’t want to manage optimistic locking at the level of each EF entity; I wish to do it at the collection, or aggregate, level.  There is no direct support for this in EF, so it has to be done by the application developer.  Martin Fowler presents a series of patterns for solving this problem, which he calls the Coarse-Grained Lock.  A coarse-grained lock is in effect a single lock for a collection of items, rather than a lock per item.

Two possible approaches are put forward: one is to nominate a root entity to act as the lock for all other entities; the other is to create a separate entity that represents a shared lock.

The root entity approach works assuming that from any entity in the aggregate you can easily find and fetch the root entity in order to perform the necessary locking.

The shared lock approach requires each entity to have a direct reference to the shared lock.

Neither of these approaches looked especially simple to implement in EF; after playing around with both, I finally opted for the shared optimistic lock pattern.

The use case I opted for was a simple one, where I’m modelling an Order and a series of Order Items.

image

Using simple EF optimistic concurrency I would place a version column in each entity table.  The problem with this is that if two users are editing the same order, simply by modifying different order items or adding items, there is no way to ensure that at the end of both edits we have a valid order, since the result in the store will be a merge of both edits, whilst each edit session will only see the original plus its own changes.  (If the two edits modified the same order item, a concurrency conflict would be detected.)  Now, in some cases having the ability for two users to edit the same order concurrently is great, but what if I want to ensure that the composite resulting from both users’ modifications is in fact valid?  Perhaps you wish to run some validation logic across the entire composite before saving it.

You could start a system transaction per user update, perform the user’s updates, reload the entire composite, validate it, and if it checks out commit the transaction.  However, if most of the time there is no contention, this approach is needlessly expensive.

image

By having a shared version across all the entities that make up the order, we can be sure when we save the parts of the composite that we have changed that the parts we haven’t changed are still the same.

When any entity that references a shared version is modified, it must update the version entity’s version number; this will only be successfully persisted if the version number in the store is the same as it was when the entity was loaded.

Now to see if EF can implement this: when building the mappings we end up with an entity for each table, and a reference from the Orders and OrderItems to a Versions entity.

The norm in EF when doing optimistic locking is to use a database timestamp column for the version number.  In this case that wasn’t possible, since we need the version entity itself to be persisted when a change is made to any entity that references it, so the version is represented as an integer.  When a change is made to any entity that references that version, the version number needs to be updated, and thus the version entity will be persisted by EF when we decide to save our changes.

All of this could be done manually: whenever you load an entity, be sure to include the Version it is associated with, and whenever you wish to perform a modification, increment the associated version’s version number.  This is tedious, though, and in a production environment missing this step is very likely, which could result in corruption if two edits happen at the same time.  I wasn’t happy with that, so I decided to automate this part of the process.

Normally in EF you call SaveChanges on the context to persist any changes.  The changes are stored in the context, and are available so that you can see what has changed.  So all I needed to do was provide my own version of SaveChanges that looks at the change set and, for every entity in that set that references a shared version, increments that version, thus modifying the version entity and placing it into the change set too.

Unfortunately you can’t override SaveChanges (apparently you will be able to in EF 2), so you end up writing a new method on the ObjectContext; in my case I called it Submit.  It performs the necessary checks to see what has changed, and then calls SaveChanges.  Note that the type SharedVersion represents the entities in the Version table.

 

public partial class AcmeWidgets
{
    public void Submit()
    {
        // Establish all the entities that have changed
        var entities = this.ObjectStateManager
            .GetObjectStateEntries(EntityState.Modified | EntityState.Added);

        var toIncrement = new Dictionary<int, SharedVersion>();

        // For each entity that has changed, if it has a shared version
        // then increment it.
        foreach (ObjectStateEntry attachedObject in entities.Where(e => e.IsRelationship == false))
        {
            // If no property of type SharedVersion, do not attempt to increment the version;
            // if many, throw an exception;
            // if just one, increment the version number, assuming it has not
            // previously been incremented.
            PropertyInfo versionProperty = (from prop in attachedObject.Entity.GetType().GetProperties()
                                            where prop.PropertyType == typeof(SharedVersion)
                                            select prop)
                                            .DefaultIfEmpty(null)
                                            .Single();

            if (versionProperty != null)
            {
                SharedVersion versionEntity = (SharedVersion)versionProperty.GetValue(attachedObject.Entity, null);

                if (versionEntity != null)
                {
                    if (!toIncrement.ContainsKey(versionEntity.id))
                    {
                        toIncrement.Add(versionEntity.id, versionEntity);
                        versionEntity.Increment();
                    }
                }
                else
                {
                    throw new SharedVersionException("You must load the version of the entity if you wish to modify it");
                }
            }
        }

        // Now call down to the Entity Framework to persist;
        // any version entities used will now be marked.
        SaveChanges();
    }
}
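The Increment call used by Submit needs to do nothing more than bump the integer, so that EF sees the version entity as modified. In essence it is something along these lines (a self-contained sketch; in the real project SharedVersion is the EF-generated entity, and the VersionNumber property name here is illustrative):

```csharp
// Stub standing in for the EF-generated half of the entity
// (illustrative property name; the generated code differs).
public partial class SharedVersion
{
    public int VersionNumber { get; set; }
}

// Hand-written half: bumping the number marks the entity as modified,
// so EF persists it, and detects conflicts, on SaveChanges.
public partial class SharedVersion
{
    public void Increment()
    {
        VersionNumber++;
    }
}
```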




In addition to my own Submit method, I also needed to write strongly typed Delete methods on the context, so that if the final part of the composite was deleted the version entity associated with the composite would also be deleted.



Finally, it would be nice to automatically assign the version reference to any new object that becomes part of the composite.  To do this I register for changes on the OrderItems collection of the Order, and assign any new item the same version object as the Order it is part of.




public partial class Order
{
    public Order()
    {
        this.OrderItems.AssociationChanged += new System.ComponentModel.CollectionChangeEventHandler(OrderItems_AssociationChanged);
    }

    void OrderItems_AssociationChanged(object sender, System.ComponentModel.CollectionChangeEventArgs e)
    {
        if (e.Action == System.ComponentModel.CollectionChangeAction.Add)
        {
            OrderItem item = (OrderItem)e.Element;
            if (item.Version == null)
            {
                item.Version = this.Version;
            }
        }
        else if (e.Action == System.ComponentModel.CollectionChangeAction.Remove)
        {
            this.Version.Increment();
        }
    }
}




The last part, and probably the bit I like least, is that for every entity that is loaded you must ensure that the associated version entity is loaded at the same time.  The version can’t be lazily loaded, as it’s important that the entity and its associated version stay in step with each other.




Order order = ctx.OrderSet
                 .Include("Version")
                 .Include("OrderItems")
                 .Include("OrderItems.Version")
                 .First();




So whilst this all works, it’s a real shame coarse-grained locking has not been considered by the framework providers.  EF is at best an OK ORM for simple entities, but fails to support moderately complex real-world examples out of the box.  I think it’s having to go to this level of effort that still puts developers off betting on an ORM.  All the code can be downloaded from here.  Note the database creation script is for SQL Server 2008.

Wednesday, April 01, 2009

Microsoft’s Parallel Framework Extensions (Pfx) isn’t always the free lunch it’s cracked up to be.

I’ve seen numerous demos of Microsoft’s Parallel Framework Extensions (Pfx), and as you would expect they focus on showing how easy it is to make sequential code go faster on multiple cores.  Whilst there is no disputing that the code that is run certainly scales as you add new processors, you can’t get away from the fact that a lot of these demos are contrived, with heavy use of Thread.SpinWait to simulate lots of computation for each parallel task.  Take away the SpinWait, do some simple one-line piece of computation instead, and you will almost certainly end up with far worse performance than you had with the sequential approach.  The reason is that creating, scheduling and correlating tasks is not free.  So when deciding to utilise Pfx you need to ensure that each piece of parallel activity is sufficient in size to benefit from the additional cores.  This is where the problem lies: whilst it’s easy to drop in a Thread.SpinWait for demo purposes, it’s often not easy to take the real-world computation you wish to parallelise and make each piece of parallel activity significant in size.

One demo I’ve seen over and over again is summing the nodes of a binary tree.  The sequential version looks like this:

private static int SumTree(Node root)
{
    return root.Value + (root.Left == null ? 0 : SumTree(root.Left))
                      + (root.Right == null ? 0 : SumTree(root.Right));
}

This could then be parallelised like so:

private static int ParallelSumTree(Node root)
{
    Future<int> lhs = Future<int>.Create(() => root.Left == null ? 0 : ParallelSumTree(root.Left));
    Future<int> rhs = Future<int>.Create(() => root.Right == null ? 0 : ParallelSumTree(root.Right));

    return root.Value + lhs.Value + rhs.Value;
}

Future<T> is a Pfx type that is used to create a parallel task that yields a result, in this case an int.  The instance of Future<T> has a Value property that you can use to get the result of the parallel task; if you request the Value and the task has not yet completed, you block waiting for the result.  I personally really like this model, but they have scrapped this type in the .NET 4.0 release of Pfx in favour of the concept that a parallel task can return a value, as opposed to void.

This form of parallelisation can work if the computation of both of the futures takes a reasonable time.  To demonstrate this I have an implementation that allows the amount of computation per node to vary, utilising a for loop.  I varied the length of the loop from 0 to 50,000 iterations, with 0 iterations simulating the real task in hand (summing all the nodes in the binary tree) and 50,000 simulating a more expensive operation performed on each node.  I ran the code over a binary tree of 500k nodes with 2 CPUs.

 

image

What the results show is that you need a workload equivalent to 42k loop iterations for the parallel version to be faster, and note, not twice as fast.  I then ran the same piece of code on an 8-core machine; here the point at which the parallel version became more effective was around the 10k loop-iteration mark.  This shows that the size of workload needed for parallelisation to be effective is also a function of the number of available cores.  Today you may well rule out a parallel variant of an algorithm because it doesn’t offer a speed-up, whilst in the future, when we have say 32+ cores, it may well do so, which is why the MS guys are keen to advocate building for future concurrency, not necessarily for what you have today.  However, we obviously don’t want a slower parallel version today for the possibility of some future benefit.  The reason for the speed-up as we add more cores is that we run more code in parallel, and that is not only our algorithm but also the task creation code.

[Graph: sequential vs. parallel timings against per-node workload, eight-core machine]

The eight processor version is a lot more compelling than the two processor version; however, I'm not comfortable with it, as the speed up I'm getting is nowhere near the ideal of 8x, in fact at best it is about 4x.  In some cases it may be reasonable to accept this, as the effort required to get the 4x speed up was very little, and if we wait a while, until we have say 32 processors, we will continue to get a speed up, just as we did in the past with ever greater clock speeds.

If you do want to squeeze every last drop out of the machine then you will need to think of ways to optimise the algorithm so that the overhead of parallelisation is reduced.  In this case it's about not having to manage lots of separate tasks.  For this algorithm a couple of things stand out that will help here.

  1. The tree is perfectly balanced
  2. The computation for each node is identical

We can take advantage of these two facts, along with the knowledge of how many cores we currently have available, and simply divide the tree up between all the cores.  That way we take a minimal hit for task creation, each core can run flat out on its given piece of work, and since all work items are equal we would expect all cores to be utilised equally.  The two graphs below show this new implementation, which has far better performance.
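One way to sketch this divide-by-cores idea is below; this is my own illustration, not the actual implementation behind the graphs, and the names are mine.  It walks down the balanced tree until it has roughly one subtree per core, sums the few nodes above the cut inline, and then sums each subtree flat out on its own task.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class PartitionedSum
{
    class Node
    {
        public int Value;
        public Node Left, Right;
    }

    // Plain sequential sum; each task runs this over its own subtree.
    static long Sum(Node n)
    {
        return n == null ? 0 : n.Value + Sum(n.Left) + Sum(n.Right);
    }

    // Walk down until we have `parts` subtrees, summing the nodes we pass on the way.
    static long Split(Node n, int parts, List<Node> subtrees)
    {
        if (n == null) return 0;
        if (parts <= 1) { subtrees.Add(n); return 0; }
        return n.Value
             + Split(n.Left, parts / 2, subtrees)
             + Split(n.Right, parts - parts / 2, subtrees);
    }

    static long ParallelSum(Node root)
    {
        var subtrees = new List<Node>();
        long top = Split(root, Environment.ProcessorCount, subtrees);

        // One task per subtree: minimal task creation cost, equal-sized work items.
        Task<long>[] tasks = subtrees
            .Select(s => Task.Factory.StartNew(() => Sum(s)))
            .ToArray();

        return top + tasks.Sum(t => t.Result);
    }

    static void Main()
    {
        Node root = new Node
        {
            Value = 1,
            Left = new Node { Value = 2 },
            Right = new Node { Value = 3 }
        };
        Console.WriteLine(ParallelSum(root)); // prints 6
    }
}
```

Because the tree is balanced and every node costs the same, each subtree represents an equal share of the total work.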

[Graph: timings for the partitioned implementation, two-core machine]

[Graph: timings for the partitioned implementation, eight-core machine]

The two facts above about the tree made producing an efficient parallel implementation reasonably easy.  If you had an unbalanced tree it might be worth balancing it, or producing a series of other containers that hold the nodes spread evenly; for this to be effective the computational work per node has to be greater than the cost of scanning the tree.  Alternatively, a more dynamic approach could be taken whereby the algorithm keeps track of how many cores it is utilising, and once it detects full utilisation it keeps new activities on the same core as part of the same task.  It is often worth considering parallelisation when deciding how to structure your data.

Conclusion

Parallel.For and the use of lots of tasks will not necessarily make your code go faster; it may well go slower.  The piece of work being parallelised must be large enough to outweigh the overhead of the parallel framework, and finding that size is often not as easy as you might expect.  Going forward, as machines gain greater and greater numbers of CPUs, the size of computation necessary to take advantage of the framework will shrink.  However, this will not always result in the most effective use of the cores.  If you want to take full advantage of the cores and get maximum performance, you will need to structure the algorithm so that it understands the environment it is running in, in order to best utilise resources, and it is this part that often requires the hard thinking.  So is the free lunch back?  To some degree, if you want an OK lunch; but for a gut busting lunch you will still have to work for it.

Sunday, March 29, 2009

Ruby Ranges in .NET

Managed to attend some talks whilst I was at DevWeek 2009; one such talk was by Oliver Sturm on the topic of other cool stuff you can do with C# v3 besides Linq.  I really enjoyed the talk, which introduced me to topics like fluent APIs.  If a series of types have a fluent API, they allow you to build up code that reads more like a regular sentence.

E.g. invoices.All.OverDueBy(30.Days).SendReminders();

One of the demos Oliver did was to show a C# implementation of a standard Ruby technique called Ranges.

numbers = 1..10

Here the numbers variable represents all values between 1 and 10.  Some obvious things you would like to do with the range are to consume it via foreach (or in Ruby speak, each), or to determine whether a given value is inside the range.

Oliver based his implementation on Don Box's implementation and modified a few bits.  Don's implementation is based on .NET 2.0 generics, and thus when you build a Range<T> you are required to provide delegate instances that produce the next item in the range and determine whether a given value is inside the range.  So that you don't have to provide delegate instances for the basic primitive types, Don supplied a series of static helper methods that supply the necessary delegate instances.

I started wondering whether this was necessary in C# 3: could we not use the new Expression syntax to build the ranges automatically, without the need to supply delegate instances?  The idea is that Range<T> builds the necessary delegate instances by building expressions.  I blogged about this technique a while back: whilst you can't do numeric operations on T as part of standard generics, you can build expressions on T that attempt to perform an Add operation.

Based on the previous experience of building a generic Sum method using Expressions, I got to work and built an implementation of Range<T> that attempts to build the necessary next and isIn delegate instances for you.

public class Range<T> : IEnumerable<T>
{
    public T Start { get; private set; }
    public T End { get; private set; }

    private Func<T, T> next;
    private Func<T, bool> isIn;

    public Range(T start, T end, T step)
        : this(start, end, step, false)
    {

    }

    public Range(T start, T end, T step, bool highToLow)
    {
        InitStartAndEnd(start, end);

        ParameterExpression current = Expression.Parameter(typeof(T), "current");

        Expression<Func<T, T>> nextExpr = Expression.Lambda<Func<T, T>>(
            Expression.Add(
             current,
             Expression.Constant(step)),
             current);

        Expression<Func<T, bool>> isInExpr = null;

        if (highToLow)
        {
            isInExpr = Expression.Lambda<Func<T, bool>>(
                Expression.And(
                    Expression.GreaterThanOrEqual(current, Expression.Constant(end)),
                    Expression.LessThanOrEqual(current, Expression.Constant(start))),
                    current);
        }
        else
        {
            isInExpr = Expression.Lambda<Func<T, bool>>(
                Expression.And(
                    Expression.GreaterThanOrEqual(current, Expression.Constant(start)),
                    Expression.LessThanOrEqual(current, Expression.Constant(end))),
                    current);
        }

        isIn = isInExpr.Compile();
        next = nextExpr.Compile();
    }

    public Range(T start, T end, Func<T, T> next, Func<T, bool> isIn)
    {
        InitStartAndEnd(start, end);
        this.next = next;
        this.isIn = isIn;
    }

    public bool IsIn(T val)
    {
        return isIn(val);
    }

    public override string ToString()
    {
        return String.Format("{0}..{1}", Start, End);
    }

    private void InitStartAndEnd(T start, T end)
    {
        Start = start;
        End = end;
    }

    public IEnumerator<T> GetEnumerator()
    {
        T current = Start;

        while (isIn(current))
        {
            yield return current;
            current = next(current);
        }
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

}  

With this type defined you can now write code like this:

var digits = new Range<int>(0, 9, 1);

I can now consume all the digits via a simple foreach loop, or perhaps use the range as part of a Linq expression since my Range<T> implements IEnumerable<T>.

digits.Except(new int[] { 0, 2, 4, 6, 8 }).ToList().ForEach(i => Console.WriteLine(i));

This will print out all the odd numbers in the range.

With the addition of another C# v3 feature, extension methods, we can write a "fluent API" style method.

public static class RangeUtil
{
    public static bool In<T>(this T val, Range<T> range)
    {
        return range.IsIn(val);
    }
}

I can now write

bool isSingleDigitValue = 5.In(digits);

I really wanted to support enumerations without the need to define your own next and isIn delegate instances, but couldn't quite get there.  I'm sure it's possible with a bit more thought.  For now, enum ranges need to be defined by creating a new type derived from Range<T>.

public enum Places { First, Second, Third, Fourth };

public class PlacesRange : Range<Places>
{
    public PlacesRange(Places start, Places end)
        : base(start, end, p => p + 1, p => p >= start && p <= end)
    {
    }
}

Note this only works if there are no gaps in the enumeration values.

There are other situations where the normal Add operator is not sufficient:

public class DayRange : Range<DateTime>
{
    public DayRange(DateTime start, DateTime end)
        : base(start.Date, end.Date, d => d.AddDays(1), d => d >= start && d <= end)
    {
    }
}

To use the DayRange type

DayRange days = new DayRange(new DateTime(2009, 1, 1), new DateTime(2009, 12, 30));

if (DateTime.Now.In(days))
{
    Console.WriteLine("You are in Year 2009");
}

In conclusion, ranges do look pretty cool.  However, it should be noted that using a range to represent a simple iteration is far more expensive than a normal for loop; they are, though, great for building fluent APIs.

You can download the complete source here

Thursday, March 26, 2009

Devweek 2009

Just posted all my demos from my sessions at DevWeek; check out all the Rock Solid Knowledge guys' demos via Rock Solid Knowledge Conferences.  I'd also just like to say what a fantastic conference DevWeek is: met loads of interesting people and had a lot of fun delivering the talks.  The Patterns Dartboard worked out really well... Tim tried hard to get me to consider moving to Ruby... and Oliver scared me with F#...

Thursday, February 19, 2009

New email client

I'm sure it's not just me, but Outlook has always seemed very sluggish... and this week I finally had enough when it stopped synchronising with my IMAP inbox folder... So I reverted to using Mozilla Thunderbird... And boy, it's so much sleeker...

The install took seconds... and best of all, with a couple of free add-ins it fully integrates with Google calendar and contacts, which is perfect for me, especially since I have my iPhone sync directly with Google calendar and contacts via a great free service, www.nuevaSync.com.  This removes the need to sync my iPhone except for media content... my contacts and calendar are sync'd all the time...

Tuesday, January 06, 2009

Power Point Merge

I'm currently preparing my slide decks for DevWeek 2009, and for the day of design patterns I'm having to merge five individual decks into a single deck.  Now, I could do this by hand, or I could write a tool to do it, and since I've managed to get the hang of the PowerPoint automation API I did just that.  You can download the tool from here

Blog Archive

About Me

My photo
I'm a freelance consultant for .NET based technology. My last real job was at Cisco Systems, where I was a lead architect for Cisco's identity solutions. I arrived at Cisco via acquisition, and prior to that worked in small startups. The startup culture is what appeals to me, and that's why I finally left Cisco after seven years.....I now fill my time through a combination of consultancy and teaching for Developmentor...and working on insane startups that nobody with an ounce of sense would look twice at...