Wednesday, April 4, 2012

Long-running System.Transactions (TransactionScope)

I've come across a problem when working with long-running System.Transactions (TransactionScope). An InvalidOperationException is thrown with the following message: "The transaction associated with the current connection has completed but has not been disposed. The transaction must be disposed before the connection can be used to execute SQL statements."

No transaction is reused; the usage of the connection and the transaction is perfectly fine:

var transactionOptions = new TransactionOptions()
{
    IsolationLevel = IsolationLevel.Serializable,
    Timeout = TimeSpan.FromMinutes(20),
};
using (var connection = new SqlConnection(ConnectionString))
using (var transaction = new TransactionScope(TransactionScopeOption.Required, transactionOptions))
{
    // Some database manipulations that eventually throws the exception

    transaction.Complete();
}
The problem occurs if the transaction runs longer than the maxTimeout setting. This setting can only be modified in machine.config and its default value is 10 minutes. It doesn't matter that the Timeout property of the TransactionOptions is set to a higher value; TransactionOptions.Timeout cannot exceed maxTimeout.

Your machine.config is located at: %windir%\Microsoft.NET\Framework\[version]\config\machine.config

On my current machine, they exist at (depending on target platform):

C:\Windows\Microsoft.NET\Framework\v4.0.30319\Config

C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config

To modify the machine.config, add the following inside the configuration tag (for example, raising the limit to 30 minutes):

    <system.transactions>
        <machineSettings maxTimeout="00:30:00" />
    </system.transactions>

Be aware that the maxTimeout setting is sometimes used for deadlock detection, so don't set it too high.
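To check which limit is actually in effect on a machine, you can read the configured maximum at runtime; a small sketch (TransactionManager.MaximumTimeout reflects the maxTimeout value from machine.config):

```csharp
using System;
using System.Transactions;

class MaxTimeoutCheck
{
    static void Main()
    {
        // Returns the effective upper bound for transaction timeouts
        // (10 minutes by default, unless machine.config says otherwise).
        Console.WriteLine(TransactionManager.MaximumTimeout);
    }
}
```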

Friday, March 2, 2012

Quartz Candy, Part 1

This blog post shows how Quartz.NET version 1.0.3 can be extended so that failing jobs are retried a configurable number of times, with a pause between attempts.

Quartz already has a feature to retry jobs immediately. The only exception that a job is allowed to throw is JobExecutionException. If the refireImmediately flag is set on the exception, the job will be executed again by the scheduler on the same JobExecutionContext. My first idea was to take advantage of this and implement an abstract base class, e.g. RetryableJobBase, that handles the retry logic by throwing a JobExecutionException with the refireImmediately flag set to true.

This would work, but it wouldn't easily allow for a wait time between retries other than a call to Thread.Sleep, which would block the current thread, a thread-pool thread, from executing other jobs. Nor did I like the idea of using a base class for this functionality, since .NET classes can only derive from one class. If another feature were also implemented with base classes, for example workflow handling as the Quartz FAQ suggests, it would not be possible to combine the two behaviours.

So I had to rethink my design a bit. Quartz.NET offers extensibility; it uses interfaces for most classes. It is possible to plug in a JobListener, TriggerListener or SchedulerListener. A listener is registered with the scheduler, and its methods are then called by the scheduler at the appropriate times. Among other members, IJobListener contains the following:

void JobToBeExecuted(JobExecutionContext context);
void JobWasExecuted(JobExecutionContext context, JobExecutionException jobException);
As the method names suggest, JobToBeExecuted is called before a job is executed and JobWasExecuted after. If the job fails (throws an exception), the jobException argument will be set.

This could be utilized for the retry logic. JobToBeExecuted must increase a try number counter that is saved per job. JobWasExecuted must reschedule the job if it failed and the counter has not reached the maximum number of retries. Scheduling the retry with the scheduler instead of using Thread.Sleep prevents the current thread from blocking; the thread can perform other jobs until it's time to execute the retry. Only the next try is scheduled, since it's impossible to know beforehand which try will succeed or fail.

The hard part is where to store the try number counter. My first thought was to store the counters in a dictionary, but that would require some kind of unique key per job. That solution would work, but it would require some extra work.

Quartz.NET is loosely coupled: a job is started by providing a JobDetail and a Trigger. When the Trigger fires, a new instance of the job is created by an IJobFactory (for more details about the IJobFactory, see this blog post). The JobDetail contains a JobDataMap that can be used for providing data to jobs. Bingo! It sounds like the JobDataMap is the place to store the try number counter. But as the documentation states, any changes made to the contents of the job data map of a non-stateful job (a job based on IJob) during execution will be lost. So this would only work with implementations of IStatefulJob, which would be quite pointless because stateful jobs are not allowed to run concurrently.

But if the retry is scheduled with the same JobDetail as the original job, the JobDataMap stays intact regardless of whether the job is stateful or not! It doesn't even matter that the default IJobFactory creates a new instance of the job when the retry trigger fires, because the JobDataMap is stored on the JobDetail in the JobExecutionContext and not on the job itself.

So if the job retry is scheduled on the same JobDetail as the original job the JobDataMap is a good place to store the try number counter.

The retry is scheduled by creating a new trigger, based on an interface for a retryable job. It's up to the implementer of the IRetryableJob interface to decide how many retries are allowed and when the next retry should be performed; this happens in the JobWasExecuted method:

    var oldTrigger = context.Trigger;

    // Unschedule old trigger
    _scheduler.UnscheduleJob(oldTrigger.Name, oldTrigger.Group);

    // Create and schedule new trigger
    var retryTrigger = new SimpleTrigger(oldTrigger.Name, oldTrigger.Group, retryableJob.StartTimeRetryUtc, retryableJob.EndTimeRetryUtc, 0, TimeSpan.Zero);
    _scheduler.ScheduleJob(context.JobDetail, retryTrigger);

Notice above that the job is scheduled with the current JobDetail.

The try number counter is stored in the JobDataMap of the JobDetail:

    if (!context.JobDetail.JobDataMap.Contains(NumberTriesJobDataMapKey))
        context.JobDetail.JobDataMap[NumberTriesJobDataMapKey] = 0;

    int numberTries = context.JobDetail.JobDataMap.GetIntValue(NumberTriesJobDataMapKey);
    context.JobDetail.JobDataMap[NumberTriesJobDataMapKey] = ++numberTries;
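Putting the pieces together, a job listener along these lines could drive the retry logic. This is a sketch against the Quartz.NET 1.x API; IRetryableJob and its members (MaxRetries, StartTimeRetryUtc, EndTimeRetryUtc) are the assumptions described above, not part of Quartz itself:

```csharp
public class RetryableJobListener : IJobListener
{
    private const string NumberTriesJobDataMapKey = "NumberTries";
    private readonly IScheduler _scheduler;

    public RetryableJobListener(IScheduler scheduler)
    {
        _scheduler = scheduler;
    }

    public string Name
    {
        get { return "RetryableJobListener"; }
    }

    public void JobToBeExecuted(JobExecutionContext context)
    {
        // Increase the try number counter, stored per job in the JobDataMap.
        if (!context.JobDetail.JobDataMap.Contains(NumberTriesJobDataMapKey))
            context.JobDetail.JobDataMap[NumberTriesJobDataMapKey] = 0;

        int numberTries = context.JobDetail.JobDataMap.GetIntValue(NumberTriesJobDataMapKey);
        context.JobDetail.JobDataMap[NumberTriesJobDataMapKey] = ++numberTries;
    }

    public void JobExecutionVetoed(JobExecutionContext context)
    {
    }

    public void JobWasExecuted(JobExecutionContext context, JobExecutionException jobException)
    {
        var retryableJob = context.JobInstance as IRetryableJob;
        if (jobException == null || retryableJob == null)
            return;

        // Stop when the maximum number of retries has been reached.
        int numberTries = context.JobDetail.JobDataMap.GetIntValue(NumberTriesJobDataMapKey);
        if (numberTries > retryableJob.MaxRetries)
            return;

        var oldTrigger = context.Trigger;
        _scheduler.UnscheduleJob(oldTrigger.Name, oldTrigger.Group);

        // Reschedule with the current JobDetail so the JobDataMap is kept.
        var retryTrigger = new SimpleTrigger(oldTrigger.Name, oldTrigger.Group,
            retryableJob.StartTimeRetryUtc, retryableJob.EndTimeRetryUtc, 0, TimeSpan.Zero);
        _scheduler.ScheduleJob(context.JobDetail, retryTrigger);
    }
}
```

The listener is then registered with the scheduler (AddGlobalJobListener) so it sees every executed job.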

All code is available on my GitHub account.

Thursday, February 16, 2012

How I Learned to Stop Worrying about my LINQ queries and Love the integration test environment

In the past I've put lots of effort into making Entity Framework LINQ queries testable by faking ObjectSet/DbSet and ObjectContext/DbContext/DataContext. My code to make this possible can be found on GitHub.

The key principle is to provide interfaces for the context and object sets so that an alternative, faked implementation based on IEnumerable<T> can be used in the tests. A downside is that LINQ to Objects will be used in the tests instead of a LINQ implementation that actually generates SQL (LINQ to Entities or LINQ to SQL).

I've seen many others putting effort into making this possible.

In the past I thought that testing your queries with LINQ to Objects was good enough. But if you really think about why you need the tests, you might start rethinking your approach.

The most important reasons for using automated tests are, in my opinion:

  • To assure the quality of your production code
  • To assert that your code behaves correctly now, but also in the future

When you can trust your tests, you will have more courage to refactor code and know that it will still work. This will lead to better code.

I've finally realized that LINQ to Objects and LINQ to Entities are too different (LINQ to SQL is also too different). I don't think tests utilizing LINQ to Objects can be trusted. I've TDD:ed many LINQ queries and provided LINQ to Objects tests that pass, but when I fire up the complete system the query fails. I've seen queries that work with LINQ to Objects but fail with LINQ to Entities for lots of different reasons: enum handling, string comparison, use of non-entity properties, invalid use of extension methods (for example Distinct()) and several others.

Small and simple LINQ queries might be easy to test with LINQ to Objects; the differences between LINQ to Objects and LINQ to Entities are easier to spot, and it is also much easier to predict what SQL the query will generate. But for large queries it is almost impossible (except for Jon Skeet) to guess or predict what SQL will be generated. It's when you are writing those large and complicated queries that you really need the tests, both now and in the future. Wouldn't you feel more comfortable editing a complicated query if it was covered by tests?
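A typical example of the difference (the Customers model and the FormatName helper are my own illustration, not from any real project): calling an arbitrary .NET method inside a query works fine in LINQ to Objects, but LINQ to Entities cannot translate it to SQL and throws a NotSupportedException at runtime:

```csharp
// Passes against an in-memory fake (LINQ to Objects), but throws
// NotSupportedException under LINQ to Entities, since FormatName
// cannot be translated to SQL.
var customers = context.Customers
    .Where(c => FormatName(c.FirstName, c.LastName).StartsWith("A"))
    .ToList();

static string FormatName(string firstName, string lastName)
{
    return lastName + ", " + firstName;
}
```

A LINQ to Objects test for this query would be green, and the bug would only surface when the query runs against the real provider.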

My suggestion is to stop putting effort into making Entity Framework or LINQ to SQL testable with LINQ to Objects. Spend the time creating a good integration test environment instead: an environment where you can test your queries against a real database. Some advice for your integration environment:

  • Make it resemble the production environment
  • Try to use the same database server product and version
  • Use the same LINQ implementation as in production
  • Use the same ADO.NET provider
  • Separate your integration tests from your unit tests. If the integration tests become slow you can still execute the unit tests really fast and often; the integration tests will probably not run as often as the unit tests
  • Make it easy for the team to execute the integration tests. This can involve scripting the creation of the database and master data and creating an environment that executes the scripts against your locally hosted database before the tests run, or using a shared instance of the integration test database. Choose what suits you best!
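One simple way to keep the two suites separate is a test category; a sketch using NUnit's Category attribute (the fixture and test names are made up):

```csharp
using NUnit.Framework;

[TestFixture]
[Category("Integration")]
public class CustomerQueryIntegrationTests
{
    [Test]
    public void FindsCustomersByNamePrefix()
    {
        // Arrange and act against the real database here. Because the
        // fixture is tagged "Integration", the fast unit test run can
        // skip it, e.g. with /exclude:Integration on the NUnit console
        // runner.
        Assert.Pass();
    }
}
```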

With the above in place, I can trust my tests for the LINQ queries again.

What are your suggestions for a good integration test environment?

Monday, February 13, 2012

Integrate Quartz.NET with your favourite IoC container

I've used a job scheduler library called Quartz.NET in a couple of projects. This blog post describes how to integrate it with your favourite IoC container.

This applies to Quartz.NET version 1.0.3, which is also available on NuGet under the package name "Quartz".

Jobs are implemented by implementing one of the job interfaces IJob, IInterruptableJob or IStatefulJob (they are all based on IJob). The job is then registered with a scheduler together with a trigger.

The initialization code below is also described in Lesson 1 of the Quartz.NET tutorial:
// construct a scheduler factory
ISchedulerFactory schedFact = new StdSchedulerFactory();

// get a scheduler
IScheduler sched = schedFact.GetScheduler();
sched.Start();

// construct job info
JobDetail jobDetail = new JobDetail("myJob", null, typeof(HelloJob));
// fire every 5 seconds
Trigger trigger = TriggerUtils.MakeSecondlyTrigger(5);
trigger.Name = "myTrigger";
sched.ScheduleJob(jobDetail, trigger);
This is fine as long as the job doesn't require a dependency. But what if your Job implementation demands a dependency?
public class HelloJob : IJob
{
    private readonly ILogger _logger;

    public HelloJob(ILogger logger)
    {
        _logger = logger;
    }

    public void Execute(JobExecutionContext context)
    {
        _logger.Log(@"Oh Hai \o/");
    }
}
Then you need to roll your own IJobFactory implementation. My implementation with my favourite IoC container Castle Windsor is described below:
public class WindsorJobFactory : IJobFactory
{
    private readonly IWindsorContainer _container;

    public WindsorJobFactory(IWindsorContainer container)
    {
        _container = container;
    }

    public IJob NewJob(TriggerFiredBundle bundle)
    {
        return (IJob)_container.Resolve(bundle.JobDetail.JobType);
    }
}
The JobFactory instance then needs to be assigned to the scheduler in the initialization code:
var container = new WindsorContainer();
IJobFactory jobFactory = new WindsorJobFactory(container);

ISchedulerFactory schedFact = new StdSchedulerFactory();

IScheduler sched = schedFact.GetScheduler();
sched.JobFactory = jobFactory;
sched.Start();
I'm aware that you should not call your IoC container directly but instead adhere to the Hollywood Principle; I think it's OK in this case, though, because it's the initialization of the infrastructure. The job implementations must also be registered with your IoC container:
public class JobRegistrar
{
    private readonly IWindsorContainer _container;

    public JobRegistrar(IWindsorContainer container)
    {
        _container = container;
    }

    private static IEnumerable<Type> GetJobTypes()
    {
        return AppDomain.CurrentDomain.GetAssemblies().ToList()
            .SelectMany(s => s.GetTypes())
            .Where(p => typeof(IJob).IsAssignableFrom(p) && !p.IsInterface);
    }

    public void RegisterJobs()
    {
        var jobTypes = GetJobTypes();
        foreach (Type jobType in jobTypes)
        {
           _container.Register(Component.For(jobType).ImplementedBy(jobType).LifeStyle.Transient);
        }
    }
}
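Wiring it all together at startup could then look something like this (a sketch; ILogger and ConsoleLogger are just example dependencies, not part of Quartz or Windsor):

```csharp
var container = new WindsorContainer();
container.Register(Component.For<ILogger>().ImplementedBy<ConsoleLogger>());

// Register every IJob implementation found in the loaded assemblies.
new JobRegistrar(container).RegisterJobs();

// Let the scheduler create jobs through the container.
ISchedulerFactory schedFact = new StdSchedulerFactory();
IScheduler sched = schedFact.GetScheduler();
sched.JobFactory = new WindsorJobFactory(container);
sched.Start();
```

From here, jobs are scheduled exactly as in the tutorial example above; the only difference is that their constructor dependencies are resolved by Windsor.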
That's all for tonight, folks! All code can be found on my GitHub account.

Sunday, February 12, 2012

Welcome to my coding blog! I will post stuff that I think is interesting in this blog.
public class HelloWorld
{
    public string GreetVisitor()
    {
        return @"Hai \o/";
    }
}