c#

closures

captured-variable

I met an interesting issue about C#. I have code like below.

List<Func<int>> actions = new List<Func<int>>();

int variable = 0;
while (variable < 5)
{
    actions.Add(() => variable * 2);
    ++ variable;
}

foreach (var act in actions)
{
    Console.WriteLine(act.Invoke());
}

I expect it to output 0, 2, 4, 6, 8. However, it actually outputs five 10s.

It seems that it is due to all actions referring to one captured variable. As a result, when they get invoked, they all have same output.

Is there a way to work round this limit to have each action instance have its own captured variable?

Solution 1

Yes - take a copy of the variable inside the loop:

while (variable < 5)
{
    int copy = variable;
    actions.Add(() => copy * 2);
    ++ variable;
}

You can think of it as if the C# compiler creates a "new" local variable every time it hits the variable declaration. In fact it'll create appropriate new closure objects, and it gets complicated (in terms of implementation) if you refer to variables in multiple scopes, but it works :)

Note that a more common occurrence of this problem is using for or foreach:

for (int i=0; i < 10; i++) // Just one variable
foreach (string x in foo) // And again, despite how it reads out loud

See section 7.14.4.2 of the C# 3.0 spec for more details of this, and my article on closures has more examples too.

Note that as of the C# 5 compiler and beyond (even when specifying an earlier version of C#), the behavior of foreach changed so you no longer need to make local copy. See this answer for more details.

Solution 2

I believe what you are experiencing is something known as Closure http://en.wikipedia.org/wiki/Closure_(computer_science). Your lamba has a reference to a variable which is scoped outside the function itself. Your lamba is not interpreted until you invoke it and once it is it will get the value the variable has at execution time.

Solution 3

Behind the scenes, the compiler is generating a class that represents the closure for your method call. It uses that single instance of the closure class for each iteration of the loop. The code looks something like this, which makes it easier to see why the bug happens:

void Main()
{
    List<Func<int>> actions = new List<Func<int>>();

    int variable = 0;

    var closure = new CompilerGeneratedClosure();

    Func<int> anonymousMethodAction = null;

    while (closure.variable < 5)
    {
        if(anonymousMethodAction == null)
            anonymousMethodAction = new Func<int>(closure.YourAnonymousMethod);

        //we're re-adding the same function 
        actions.Add(anonymousMethodAction);

        ++closure.variable;
    }

    foreach (var act in actions)
    {
        Console.WriteLine(act.Invoke());
    }
}

class CompilerGeneratedClosure
{
    public int variable;

    public int YourAnonymousMethod()
    {
        return this.variable * 2;
    }
}

This isn't actually the compiled code from your sample, but I've examined my own code and this looks very much like what the compiler would actually generate.

Solution 4

The way around this is to store the value you need in a proxy variable, and have that variable get captured.

I.E.

while( variable < 5 )
{
    int copy = variable;
    actions.Add( () => copy * 2 );
    ++variable;
}

Solution 5

This has nothing to do with loops.

This behavior is triggered because you use a lambda expression () => variable * 2 where the outer scoped variable not actually defined in the lambda's inner scope.

Lambda expressions (in C#3+, as well as anonymous methods in C#2) still create actual methods. Passing variables to these methods involve some dilemmas (pass by value? pass by reference? C# goes with by reference - but this opens another problem where the reference can outlive the actual variable). What C# does to resolve all these dilemmas is to create a new helper class ("closure") with fields corresponding to the local variables used in the lambda expressions, and methods corresponding to the actual lambda methods. Any changes to variable in your code is actually translated to change in that ClosureClass.variable

So your while loop keeps updating the ClosureClass.variable until it reaches 10, then you for loops executes the actions, which all operate on the same ClosureClass.variable.

To get your expected result, you need to create a separation between the loop variable, and the variable that is being closured. You can do this by introducing another variable, i.e.:

List<Func<int>> actions = new List<Func<int>>();
int variable = 0;
while (variable < 5)
{
    var t = variable; // now t will be closured (i.e. replaced by a field in the new class)
    actions.Add(() => t * 2);
    ++variable; // changing variable won't affect the closured variable t
}
foreach (var act in actions)
{
    Console.WriteLine(act.Invoke());
}

You could also move the closure to another method to create this separation:

List<Func<int>> actions = new List<Func<int>>();

int variable = 0;
while (variable < 5)
{
    actions.Add(Mult(variable));
    ++variable;
}

foreach (var act in actions)
{
    Console.WriteLine(act.Invoke());
}

You can implement Mult as a lambda expression (implicit closure)

static Func<int> Mult(int i)
{
    return () => i * 2;
}

or with an actual helper class:

public class Helper
{
    public int _i;
    public Helper(int i)
    {
        _i = i;
    }
    public int Method()
    {
        return _i * 2;
    }
}

static Func<int> Mult(int i)
{
    Helper help = new Helper(i);
    return help.Method;
}

In any case, "Closures" are NOT a concept related to loops, but rather to anonymous methods / lambda expressions use of local scoped variables - although some incautious use of loops demonstrate closures traps.

Solution 6

Yes you need to scope variable within the loop and pass it to the lambda that way:

List<Func<int>> actions = new List<Func<int>>();

int variable = 0;
while (variable < 5)
{
    int variable1 = variable;
    actions.Add(() => variable1 * 2);
    ++variable;
}

foreach (var act in actions)
{
    Console.WriteLine(act.Invoke());
}

Console.ReadLine();

Solution 7

The same situation is happening in multi-threading (C#, .NET 4.0].

See the following code:

Purpose is to print 1,2,3,4,5 in order.

for (int counter = 1; counter <= 5; counter++)
{
    new Thread (() => Console.Write (counter)).Start();
}

The output is interesting! (It might be like 21334...)

The only solution is to use local variables.

for (int counter = 1; counter <= 5; counter++)
{
    int localVar= counter;
    new Thread (() => Console.Write (localVar)).Start();
}

Solution 8

for (int n=0; n < 10; n++) //forloop syntax
foreach (string item in foo) foreach syntax

Solution 9

It is called the closure problem, simply use a copy variable, and it's done.

List<Func<int>> actions = new List<Func<int>>();

int variable = 0;
while (variable < 5)
{
    int i = variable;
    actions.Add(() => i * 2);
    ++ variable;
}

foreach (var act in actions)
{
    Console.WriteLine(act.Invoke());
}

Solution 10

Since no one here directly quoted ECMA-334:

10.4.4.10 For statements

Definite assignment checking for a for-statement of the form:

for (for-initializer; for-condition; for-iterator) embedded-statement

is done as if the statement were written:

{
    for-initializer;
    while (for-condition) {
        embedded-statement;
    LLoop: for-iterator;
    }
}

Further on in the spec,

12.16.6.3 Instantiation of local variables

A local variable is considered to be instantiated when execution enters the scope of the variable.

[Example: For example, when the following method is invoked, the local variable x is instantiated and initialized three timesonce for each iteration of the loop.

static void F() {
  for (int i = 0; i < 3; i++) {
    int x = i * 2 + 1;
    ...
  }
}

However, moving the declaration of x outside the loop results in a single instantiation of x:

static void F() {
  int x;
  for (int i = 0; i < 3; i++) {
    x = i * 2 + 1;
    ...
  }
}

end example]

When not captured, there is no way to observe exactly how often a local variable is instantiatedbecause the lifetimes of the instantiations are disjoint, it is possible for each instantation to simply use the same storage location. However, when an anonymous function captures a local variable, the effects of instantiation become apparent.

[Example: The example

using System;

delegate void D();

class Test{
  static D[] F() {
    D[] result = new D[3];
    for (int i = 0; i < 3; i++) {
      int x = i * 2 + 1;
      result[i] = () => { Console.WriteLine(x); };
    }
  return result;
  }
  static void Main() {
    foreach (D d in F()) d();
  }
}

produces the output:

1
3
5

However, when the declaration of x is moved outside the loop:

static D[] F() {
  D[] result = new D[3];
  int x;
  for (int i = 0; i < 3; i++) {
    x = i * 2 + 1;
    result[i] = () => { Console.WriteLine(x); };
  }
  return result;
}

the output is:

5
5
5

Note that the compiler is permitted (but not required) to optimize the three instantiations into a single delegate instance (§11.7.2).

If a for-loop declares an iteration variable, that variable itself is considered to be declared outside of the loop. [Example: Thus, if the example is changed to capture the iteration variable itself:

static D[] F() {
  D[] result = new D[3];
  for (int i = 0; i < 3; i++) {
    result[i] = () => { Console.WriteLine(i); };
  }
  return result;
}

only one instance of the iteration variable is captured, which produces the output:

3
3
3

end example]

Oh yea, I guess it should be mentioned that in C++ this problem doesn't occur because you can choose if the variable is captured by value or by reference (see: Lambda capture).