Friday, October 30, 2009

LINQ and infinite enumerations

People tend to feel they know everything until they stumble upon borders of their knowledge. Today I had a chance to test my understanding of late evaluation of IEnumerable in LINQ expressions.

Scott Allen recently asked what is wrong with following code, and if it is an obvious bug:

   1: public static IEnumerable<int> Values(
   2:     this Random random, 
   3:     int minValue, int maxValue)
   4: {
   5:     while (true)
   6:         yield return random.Next(minValue, maxValue);
   7: }
   1: var values =
   2:     new Random().Values(1, 100)
   3:                 .Distinct()
   4:                 .OrderBy(value => value)
   5:                 .Take(10)
   6:                 .ToArray();

This functional piece of code is supposed to create a sorted array of 10 distinct random numbers between 1 and 100. And I’d say it does it in rather beautiful and self-explanatory way, except that it blocks forever trying to sort infinite sequence.

Let’s look closer into what happens here. Values() call on line 2 creates an iterator of infinite sequence of random numbers, but does not start to enumerate it. Distinct() call creates a new iterator that filters results of Values(). Same story with OrderBy() and Take().

But then ToArray() call starts to actually enumerate the sequence, and now we have a problem. When Take() asks underlying sequence for first element, OrderBy() cannot produce it without enumerating all elements. And Values() sequence is infinite, so it blocks forever.

As people pointed in comments, initial fix is simple (but there is more to it): we just change order of Take and OrderBy, so Take produces exactly 10 elements and OrderBy can enumerate them all.

Is this problem obvious? It is, as long as you know what to look for. You need to look for cases where you are trying to enumerate infinite sequence, i.e. call the method that fully enumerates the sequence without limiting it first. But it takes experience to spot it the same way it takes experience to spot potentially infinite loops in procedural code or potential NullReferenceExceptions in OO code.

How can we help people in spotting it? With better tools. Resharper already warns you about problems with null arguments sometimes using the following logic: if result of some function is marked as potential null with special attribute, you should check it for null before sending into argument that is marked as not accepting null.

I think it is possible to extend Resharper or another static code analysis tool with the notion of infinite sequences. We mark IEnumerables that result from function calls as potentially infinite, definitely finite, or neutral (same as the source enumeration). Then we mark functions that will enumerate whole sequence (such as ToArray, or OrderBy. Tool warns us if we do stupid things. Profit.

It still leaves another problem with initial snippet unaddressed: side effects in Values call. I do not want to go into it here, you can read detailed explanation about this with workarounds in Bart De Smet's blog.

I do not think that functional style is ‘too clever’ or anything, you just need to get used to it.

Monday, October 26, 2009

TortoiseGit cannot find libcurl-4.dll

After recent switch to Windows 7 x64 TortoiseGit stopped working for me. Symptoms are non-descriptive – when you try to “Create a new repository here” it just shows error box with no text.

I had two suspects: some obscure configuration settings of TortoiseGit itself or msysgit, or missing library. So after quick upgrade to latest 1.1.1.0-64bit version it was time to run ProcessMonitor.

ProcessMonitor immediately showed that after some activity git starts looking for libcurl-4.dll in all possible locations starting with C:\msysgit\bin where I have msysgit installed, and fails. It appears that this library is in another folder: C:\msysgit\mingw\bin. I did not try to understand why, just added both folders to system path and voila! It worked. Setting dialog shows git version again, repositories get created.

Hope this helps someone.

If you have a more involved problem, there are some solutions  proposed at http://ys.lancop.nl/site/?p=62, and of course by all means read official instructions once again: http://code.google.com/p/tortoisegit/.

What does compiled .aspx look like?

I always took for granted the fact that in ASP.NET WebForms you have this .aspx markup, and partial class in two other files, and then everything is magically compiled together and you just follow page lifecycle events.

Today I wanted to get a little deeper and sort it out for myself.

So for refresher, if you create an empty ASP.NET project, you get 1 page /Default.aspx divided into 3 files:

Default.aspx contains Page directive and some markup.

   1: <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="Default.aspx.cs" Inherits="WebApplication1._Default" %>
   2:  
   3: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   4:  
   5: <html xmlns="http://www.w3.org/1999/xhtml" >
   6: <head runat="server">
   7:     <title></title>
   8: </head>
   9: <body>
  10:     <form id="form1" runat="server">
  11:     <div>
  12:      
  13:     </div>
  14:     </form>
  15: </body>
  16: </html>

Default.aspx.cs contains the part of _Default partial class  that is supposed to be modified by developer.

   1: namespace WebApplication1
   2: {
   3:     public partial class _Default : System.Web.UI.Page
   4:     {
   5:         protected override void FrameworkInitialize()
   6:         {
   7:             base.FrameworkInitialize();
   8:         }
   9:  
  10:         protected void Page_Load(object sender, EventArgs e)
  11:         {
  12:             form1.Name = "New Name";
  13:         }
  14:     }
  15: }

Default.aspx.designer.cs contains another part of _Default class with server-side controls declarations.

   1: namespace WebApplication1 {
   2:     
   3:     
   4:     public partial class _Default {
   5:         
   6:         /// <summary>
   7:         /// form1 control.
   8:         /// </summary>
   9:         /// <remarks>
  10:         /// Auto-generated field.
  11:         /// To modify move field declaration from designer file to code-behind file.
  12:         /// </remarks>
  13:         protected global::System.Web.UI.HtmlControls.HtmlForm form1;
  14:     }
  15: }

Protected form1 field in partial class allows you to work with form from code-behind code. You just have to remember that form1 is null early in page lifecycle and gets created by the time Page_Load is invoked. But actual initialization code for this field is nowhere to be found.

It turns out that when ASP.NET runtime compiles .aspx page, it creates a new class default_aspx derived from _Default which overrides Page.FrameworkInitialize method. You can open relevant App_Web_{hash}.dll in ‘Temporary ASP.NET Files’ folder with Reflector and have a look. I did:

   1: [CompilerGlobalScope]
   2: public class default_aspx : _Default, IRequiresSessionState, IHttpHandler
   3: {
   4:     // snip
   5:  
   6:     [DebuggerNonUserCode]
   7:     private HtmlHead __BuildControl__control2()
   8:     {
   9:         HtmlHead head = new HtmlHead("head");
  10:         HtmlTitle title = this.__BuildControl__control3();
  11:         IParserAccessor accessor = head;
  12:         accessor.AddParsedSubObject(title);
  13:         return head;
  14:     }
  15:  
  16:     [DebuggerNonUserCode]
  17:     private HtmlTitle __BuildControl__control3()
  18:     {
  19:         return new HtmlTitle();
  20:     }
  21:  
  22:     [DebuggerNonUserCode]
  23:     private HtmlForm __BuildControlform1()
  24:     {
  25:         HtmlForm form = new HtmlForm();
  26:         base.form1 = form;
  27:         form.ID = "form1";
  28:         IParserAccessor accessor = form;
  29:         accessor.AddParsedSubObject(new LiteralControl("\r\n    <div>\r\n     \r\n    </div>\r\n    "));
  30:         return form;
  31:     }
  32:  
  33:     [DebuggerNonUserCode]
  34:     private void __BuildControlTree(default_aspx __ctrl)
  35:     {
  36:         this.InitializeCulture();
  37:         IParserAccessor accessor = __ctrl;
  38:         accessor.AddParsedSubObject(new LiteralControl("\r\n\r\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\r\n\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\" >\r\n"));
  39:         HtmlHead head = this.__BuildControl__control2();
  40:         accessor.AddParsedSubObject(head);
  41:         accessor.AddParsedSubObject(new LiteralControl("\r\n<body>\r\n    "));
  42:         HtmlForm form = this.__BuildControlform1();
  43:         accessor.AddParsedSubObject(form);
  44:         accessor.AddParsedSubObject(new LiteralControl("\r\n</body>\r\n</html>\r\n"));
  45:     }
  46:  
  47:     [DebuggerNonUserCode]
  48:     protected override void FrameworkInitialize()
  49:     {
  50:         base.FrameworkInitialize();
  51:         this.__BuildControlTree(this);
  52:         base.AddWrappedFileDependencies(__fileDependencies);
  53:         base.Request.ValidateInput();
  54:     }
  55: }
  56:  

So what we see here is the actual creation of control tree for this page. The tree consists of some instances of LiteralControl, HtmlHead, HtmlTitle and HtmlForm. At line 26 we can see how form1 field is assigned the new form which is later added to the tree by System.Web.UI.Control.AddParsedSubObject() method.

If you are curious and not afraid of some disassembly you can trace the whole process yourself. Get .NET symbols from Microsoft symbol server and place breakpoint in FrameworkInitialize method. From there you can get to dynamically created class in the call stack and step around.

Saturday, October 3, 2009

Do not use Application Settings as global static variable

Interesting thing came up during refactoring of one WinForms application.

There are two forms in MDI container. One is a map of the game world, another is control form. On control form you can change coordinates of map centre and tell map to redraw.

The way it was achieved is through binding textboxes with X and Y coordinates to Application Setting of type string. Now, I have two problems with that. First one is that of encapsulation. Second is continuous conversions between string and int.

What I’ve done is extracted a view model and bound everything to it. Coordinates are saved from it to settings when the form closes.