std::string strMyBlog = "Chris Taylor";: 2010

Thursday, December 23, 2010

Visualizing Big-O complexity functions

I quickly threw this little Silverlight application together to help visualize some of the common Big-O complexity functions often quoted in any discussion on Data Structures and Algorithms.

This works on Linux Firefox with the Moonlight 3 Preview Release; it is a little slow, but it works.

The following is a brief summary of the functions demonstrated in the application. You can get more detail here.

Function	Description	Example
O(1)	Constant complexity regardless of the domain size.	Array direct indexing, hash table lookup (average case)
O(n)	Linear complexity. As the domain size increases, the time/complexity increases linearly. If you double the items the complexity will double.	Sequential search
O(log n)	Logarithmic complexity. As the domain size increases, the time/complexity increases logarithmically.	Binary search on sorted data, lookup in a balanced binary tree
O(n log n)	Log-linear complexity. As the domain size increases, the time/complexity increases at a log-linear (linearithmic) rate, slightly faster than linear.	Heap Sort, Merge Sort, Quick Sort¹
O(n²)	Quadratic complexity. As the domain size increases, the time/complexity increases quadratically.	Bubble Sort, Insertion sort
O(n³)	Cubic complexity. As the domain size increases, the time/complexity increases at a cubic rate.	Naive multiplication of two nxn matrices.
O(2ⁿ)	Exponential complexity. As the domain size increases, the time/complexity increases exponentially.	Some graph algorithms, like finding the exact solution to the traveling salesman problem.

¹Quick sort has best and average case of O(n log n) but worst case of O(n²)
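To make the difference between O(n) and O(log n) in the table concrete, here is a minimal sketch that counts the steps taken by a sequential search and a binary search over the same sorted array. The class and method names (GrowthDemo, LinearSearchSteps, BinarySearchSteps) are made up for this illustration and are not part of the Silverlight application.

```csharp
using System;

static class GrowthDemo
{
    // Counts the comparisons a sequential search makes until it finds target
    // (or runs off the end of the array): O(n).
    public static int LinearSearchSteps(int[] sorted, int target)
    {
        int steps = 0;
        for (int i = 0; i < sorted.Length; i++)
        {
            steps++;
            if (sorted[i] == target) break;
        }
        return steps;
    }

    // Counts the probes a binary search makes on the same sorted data: O(log n).
    public static int BinarySearchSteps(int[] sorted, int target)
    {
        int steps = 0, lo = 0, hi = sorted.Length - 1;
        while (lo <= hi)
        {
            steps++;
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] == target) break;
            if (sorted[mid] < target) lo = mid + 1; else hi = mid - 1;
        }
        return steps;
    }
}
```

For an array holding the values 0 to 1023, searching for the last element takes 1024 sequential comparisons but only 11 binary search probes. Doubling the array doubles the first number and adds just one to the second.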

Saturday, September 25, 2010

Code Share: Finding controls by control type.

On stackoverflow.com the question was asked, how to ‘Find ContentPlaceHolders in Master Page’.

In this case the OP wanted to work with a Master Page other than the one that was already assigned to the page. The first part of the problem was getting the Master Page loaded in memory so that the Control tree could be interrogated.

Fortunately, loading the Master Page is quite simple: you can use LoadControl to load the Master Page just like you would load any other user control.

For example in the Page_Load handler you could use something like the following to load the Master Page.

var site1Master = LoadControl("Site1.Master");

The next part, finding all the controls of a specific type, requires a simple recursive routine to search the control tree for all the controls of the type that you are interested in. Here is a simple implementation of just such a routine.

using System;
using System.Collections.Generic;
using System.Web.UI;

static class WebHelper
{
  public static IList<T> FindControlsByType<T>(Control root) 
    where T : Control
  {
    if (root == null) throw new ArgumentNullException("root");

    List<T> controls = new List<T>();
    FindControlsByType<T>(root, controls);
    return controls;
  }

  private static void FindControlsByType<T>(Control root, IList<T> controls)
    where T : Control
  {
    foreach (Control control in root.Controls)
    {
      if (control is T)
      {
        controls.Add(control as T);
      }
      if (control.Controls.Count > 0)
      {
        FindControlsByType<T>(control, controls);
      }
    }
  }
}

Using the two pieces of code above, finding all the ContentPlaceHolders on the Master Page can be done like this.

// Load the Master Page
var site1Master = LoadControl("Site1.Master");

// Find the list of ContentPlaceHolder controls
var controls = WebHelper.FindControlsByType<ContentPlaceHolder>(site1Master);

// Do something with each control that was found
foreach (var control in controls)
{
  Response.Write(control.ClientID);
  Response.Write("<br />");
}

Hope someone finds this useful. I thought I would share it since I took the few moments to write it and did not want it to go to waste.

Sunday, June 27, 2010

Crossing the process boundary with .NET

Every so often my post Hacking my way across the process boundary gets some attention, mostly in the form of requests for a .NET version of the technique. Out of laziness more than anything else, I had not taken the time to do the conversion until now. So, for those who need to access ListView or TreeView data from another process, here is a simple example of one possible way to do it. Since I used C# for the example, I could very well have used unsafe code blocks to do some of the work; however, I decided to avoid them so that the example is applicable to VB.NET developers as well. If you would like to see a version using unsafe code blocks, drop me a note and I will get round to it. To keep the sample short I have removed all but the most rudimentary error checking. For an explanation of this code, please refer to the original post cited above.

using System;
using System.Runtime.InteropServices;
using System.Text;
public class CrossProcessMemory
{
  const int LVM_GETITEM = 0x1005;
  const int LVM_SETITEM = 0x1006;
  const int LVIF_TEXT   = 0x0001;
  const uint PROCESS_ALL_ACCESS = (uint)(0x000F0000L | 0x00100000L | 0xFFF);
  const uint MEM_COMMIT         = 0x1000;
  const uint MEM_RELEASE        = 0x8000;
  const uint PAGE_READWRITE     = 0x04;

  [DllImport("user32.dll")]
  static extern bool SendMessage(IntPtr hWnd, Int32 msg, Int32 wParam, IntPtr lParam);
  
  [DllImport("user32")]
  static extern IntPtr GetWindowThreadProcessId( IntPtr hWnd, out int lpwdProcessID );    
  
  [DllImport("kernel32")]
  static extern IntPtr OpenProcess(uint dwDesiredAccess, bool bInheritHandle, 
    int dwProcessId);

  [DllImport("kernel32")]
  static extern IntPtr VirtualAllocEx( IntPtr hProcess, IntPtr lpAddress, 
    int dwSize, uint flAllocationType, uint flProtect);

  [DllImport("kernel32")]
  static extern bool VirtualFreeEx( IntPtr hProcess, IntPtr lpAddress, int dwSize, 
    uint dwFreeType );

  [DllImport("kernel32")]
  static extern bool WriteProcessMemory( IntPtr hProcess, IntPtr lpBaseAddress, 
    ref LV_ITEM buffer, int dwSize, IntPtr lpNumberOfBytesWritten );

  [DllImport("kernel32")]
  static extern bool ReadProcessMemory( IntPtr hProcess, IntPtr lpBaseAddress, 
    IntPtr lpBuffer, int dwSize, IntPtr lpNumberOfBytesRead );

  [DllImport("kernel32")]
  static extern bool CloseHandle( IntPtr hObject );

  [StructLayout(LayoutKind.Sequential)]
  public struct LV_ITEM
  {
    public uint   mask; 
    public int    iItem; 
    public int    iSubItem; 
    public uint   state; 
    public uint   stateMask; 
    public IntPtr pszText; 
    public int    cchTextMax; 
    public int    iImage;
  }

  public static string ReadListViewItem( IntPtr hWnd, int item )
  {
    const int dwBufferSize = 1024;
          
    int         dwProcessID;
    LV_ITEM     lvItem;      
    string      retval;
    bool        bSuccess;
    IntPtr      hProcess        = IntPtr.Zero;
    IntPtr      lpRemoteBuffer  = IntPtr.Zero;
    IntPtr      lpLocalBuffer   = IntPtr.Zero;
    IntPtr      threadId        = IntPtr.Zero;
          
    try
    {
      lvItem = new LV_ITEM();
      lpLocalBuffer = Marshal.AllocHGlobal(dwBufferSize);
      // Get the process id owning the window
      threadId = GetWindowThreadProcessId( hWnd, out dwProcessID );
      if ( (threadId == IntPtr.Zero) || (dwProcessID == 0) )
        throw new ArgumentException( "hWnd" );

      // Open the process with all access
      hProcess = OpenProcess( PROCESS_ALL_ACCESS, false, dwProcessID );
      if ( hProcess == IntPtr.Zero )
        throw new ApplicationException( "Failed to access process" );

      // Allocate a buffer in the remote process
      lpRemoteBuffer = VirtualAllocEx( hProcess, IntPtr.Zero, dwBufferSize, MEM_COMMIT, 
        PAGE_READWRITE );
      if ( lpRemoteBuffer == IntPtr.Zero )
        throw new SystemException( "Failed to allocate memory in remote process" );
      
      // Fill in the LVITEM struct, this is in your own process
      // Set the pszText member to somewhere in the remote buffer,
      // For the example I used the address immediately following the LVITEM struct.
      // ToInt64 keeps the pointer arithmetic valid in a 64-bit process as well.
      lvItem.mask = LVIF_TEXT;
      lvItem.iItem = item;
      lvItem.pszText = new IntPtr(lpRemoteBuffer.ToInt64() + Marshal.SizeOf(typeof(LV_ITEM)));
      lvItem.cchTextMax = 50;

      // Copy the local LVITEM to the remote buffer
      bSuccess = WriteProcessMemory( hProcess, lpRemoteBuffer, ref lvItem, 
        Marshal.SizeOf(typeof(LV_ITEM)), IntPtr.Zero );
      if ( !bSuccess )
        throw new SystemException( "Failed to write to process memory" );

      // Send the message to the remote window with the address of the remote buffer
      SendMessage( hWnd, LVM_GETITEM, 0, lpRemoteBuffer);
      
      // Read the struct back from the remote process into local buffer
      bSuccess = ReadProcessMemory( hProcess, lpRemoteBuffer, lpLocalBuffer, dwBufferSize, 
        IntPtr.Zero );
      if ( !bSuccess )
        throw new SystemException( "Failed to read from process memory" );
      
      // At this point the lpLocalBuffer contains the returned LV_ITEM structure
      // the next line extracts the text from the buffer into a managed string
      retval = Marshal.PtrToStringAnsi(new IntPtr(lpLocalBuffer.ToInt64() + 
        Marshal.SizeOf(typeof(LV_ITEM))));
    }
    finally
    { 
      if ( lpLocalBuffer != IntPtr.Zero )
        Marshal.FreeHGlobal( lpLocalBuffer );
      if ( lpRemoteBuffer != IntPtr.Zero )
        VirtualFreeEx( hProcess, lpRemoteBuffer, 0, MEM_RELEASE ); 
      if ( hProcess != IntPtr.Zero )
        CloseHandle( hProcess );
    }
    return retval;
  }
}

Sunday, June 6, 2010

Visual Studio 2010 – Box Selection

This has been blogged about and promoted everywhere, and it is true: it is a super cool feature that even the pre-.NET IDE supported, but VS2010 breathes new life into the feature with some great enhancements. For a very cool intro to the functionality take a look here where a member of the IDE team demonstrates the enhancements.

So, why on earth would I be blogging about “old” news? Well, one thing that I found “missing” with this feature until now was the ability to use the keyboard to perform a block selection. Now with VS 2010 you can use Shift+Alt+[Up, Down, Left, Right] to perform a block selection. This is a great enhancement, and worth knowing about if you are a keyboard junkie.

Saturday, April 17, 2010

Archive: Binary data from a Structure

For this post, I will be using the binary header structure of the trusty old DBF file format.

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct DBFHeader
{
  public byte Tag;
  public byte Year;
  public byte Month;
  public byte Day; 
  public int RecCount;
  public short HeaderSize;
  public short RecSize;
  public short Reserved1;
  public byte Trans;
  public byte Encrypt;
  public long Reserved2_1;
  public int Reserved2_2;  // with Reserved2_1, covers the 12 reserved bytes; the header totals 32 bytes
  public byte MDX;
  public byte LangId;
  public short Reserved3;
}

Given the above structure, our challenge now is to transfer it from its in-memory binary format to a stream. The stream may be a physical file, a memory stream or even a network stream. The most important aspect of all the techniques that I will investigate here is that the resulting binary data format is consistent with the format expected by the reader.

After looking at the interface provided by the Stream class, from which all streams inherit, I came to the conclusion that I would have to convert the structure to a byte array (byte[]).

Again our first stop is the interop services provided by the .NET framework. The following piece of code will convert a structure to a byte[] that can then be written to a stream.

DBFHeader hdr = new DBFHeader(); 
byte[] data = new byte[Marshal.SizeOf(hdr)];
unsafe
{
  DBFHeader *p = &hdr;
  Marshal.Copy( (IntPtr)p, data, 0, data.Length ); 
}

This code requires that you compile with unsafe code allowed. This can be done either by supplying the /unsafe switch to the command line compiler or in the Visual Studio .NET project properties under Configuration Properties>Build, you can set the Allow Unsafe Code Blocks to True.

After creating an instance of the DBFHeader structure, a byte[] large enough to hold its contents is created. Next we need to transfer the contents of the DBFHeader instance to the byte[]. To accomplish this, I declared an unsafe block and took the address of the structure. Being a ValueType declared as a local variable, the structure lives on the stack, which the garbage collector never moves, so it is implicitly pinned. The Marshal.Copy method is then used to copy the data from that address in memory to the byte[]. Now we are free to write the byte array to a stream using the Write method provided by the stream.

There are a number of things that count against this technique. Using unsafe code blocks means that the assembly cannot be verified and the caller would require the SkipVerification permission. The use of the Marshal methods requires that the immediate caller has SecurityPermissionAttribute.UnmanagedCode. This technique also requires that the data be moved in two stages, from the structure to the byte[] and then from there to the stream, which could be costly for large structures.

The first problem of the code being non-verifiable can be overcome by using a technique very similar to the above technique, without the requirement for unsafe code blocks. The following code demonstrates this technique.

DBFHeader hdr = new DBFHeader(); 
byte[] data = new byte[Marshal.SizeOf(hdr)];
IntPtr p = Marshal.AllocHGlobal( data.Length );
try
{
  Marshal.StructureToPtr( hdr, p, false );
  Marshal.Copy( p, data, 0, data.Length ); 
}
finally
{
  // Free the unmanaged block even if marshaling throws
  Marshal.FreeHGlobal( p );
}

Notice that it still requires that the data be moved in two stages and the use of the Marshal class carries the same security requirements as earlier. The only benefit is that the resulting assembly remains verifiable, not requiring the /unsafe option. However, allocating memory from the unmanaged heap carries with it additional performance overheads.
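The StructureToPtr technique above can be wrapped in a small generic helper so that it works for any structure, not just DBFHeader. The following is only a sketch: the StructBytes class, the ToBytes method and the Sample structure are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 5-byte structure used only to exercise the helper.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Sample
{
    public byte Tag;
    public int Count;
}

static class StructBytes
{
    // Copies a structure into a new byte[] via a temporary unmanaged block.
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] data = new byte[size];
        IntPtr p = Marshal.AllocHGlobal(size);
        try
        {
            // false: the freshly allocated block holds nothing to clean up
            Marshal.StructureToPtr(value, p, false);
            Marshal.Copy(p, data, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(p);
        }
        return data;
    }
}
```

Calling StructBytes.ToBytes(hdr) would then replace the inline code, at the cost of boxing the structure when it is passed to StructureToPtr.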

And finally there is the pure managed code solution. While not as easy as the prior techniques, it benefits from the fact that the code is easily ported to other .NET languages and does not have the same security restrictions as the other two techniques. The solution is to use the BinaryWriter class to write each member of the structure to the stream.

using (BinaryWriter wr = new BinaryWriter( stm ))
{
  wr.Write( hdr.Tag );
  wr.Write( hdr.Year );
  wr.Write( hdr.Month );
  wr.Write( hdr.Day );
  wr.Write( hdr.RecCount );
  wr.Write( hdr.HeaderSize );
  wr.Write( hdr.RecSize );
  wr.Write( hdr.Reserved1 );
  wr.Write( hdr.Trans );
  wr.Write( hdr.Encrypt );
  wr.Write( hdr.Reserved2_1 );
  wr.Write( hdr.Reserved2_2 );
  wr.Write( hdr.MDX );
  wr.Write( hdr.LangId );
  wr.Write( hdr.Reserved3 ); 
}

This code first creates a BinaryWriter that provides binary methods to access the underlying stream. In this case the underlying stream is designated by the stm argument passed to the BinaryWriter constructor. Then, using the binary writer, each member of the structure is written in order to the stream. I have already mentioned the clear advantages of this technique, but it does carry with it a share of disadvantages. The first and most obvious is that it requires you to write out each member individually, and with larger structures this can be a tedious and error-prone task. This brings me to the second disadvantage: if the data is not written out in the exact order expected by the reader, the reader will either fail to read the structure or, even worse, read corrupt data.

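Just as a closing note, the final technique could also be used to get a byte[] by writing to a MemoryStream and then calling the ToArray method provided by MemoryStream to get a copy of the written bytes. (GetBuffer exposes the underlying byte[] without copying, but that buffer can be longer than the data written, so it must be trimmed to the stream's Length.)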
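As a rough sketch of the MemoryStream variation, the following writes three values the same way the DBFHeader members are written and returns the resulting bytes. The WriterDemo class and WriteSample method are hypothetical names for this example. It uses ToArray rather than GetBuffer because the underlying buffer is usually larger than the data written.

```csharp
using System.IO;

static class WriterDemo
{
    // Writes a byte, an int and a short to a MemoryStream and returns
    // exactly the bytes that were written (7 in total).
    public static byte[] WriteSample(byte tag, int recCount, short headerSize)
    {
        using (MemoryStream stm = new MemoryStream())
        using (BinaryWriter wr = new BinaryWriter(stm))
        {
            wr.Write(tag);
            wr.Write(recCount);    // BinaryWriter always writes little-endian
            wr.Write(headerSize);
            wr.Flush();
            return stm.ToArray();  // a copy trimmed to the written length
        }
    }
}
```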

Conclusion

As with every design, it is a series of compromises that determines the ultimate choices we make. The above techniques provide a number of alternatives, ranging from simplicity of code to flexibility of implementation. I believe that the final option is most likely the more purist and, from a reusability standpoint, the more effective solution.

Archive: Structure from Binary Data

Original Post Date: 8 Jan 2004

The question of moving stream data into a structure seems to be addressed every so often on the newsgroups. While I was building a front-end for a small packet sniffing application I was faced with these same questions. While I was aware of the standard approaches, I decided to investigate what the alternative options were. This article is a summary of the primary ways of mapping binary data to a structure that I looked at.

Defining the structure

The first thing that I did was to define the structure; for this example I am going to use the IP header. The correct definition of the structure depends on the approach taken to solve the problem.

Given that I have spent a considerable amount of time familiarizing myself with the intricacies of the P/Invoke or interoperability services, my first solution was to use System.Runtime.InteropServices to marshal the data into a structure. I promptly defined the following structure to handle the marshaled data.

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct IpHeader
{
  public byte VerLen;
  public byte TOS;
  public short TotalLength; 
  public short ID;
  public short Offset;
  public byte TTL;
  public byte Protocol;
  public short Checksum;
  public int SrcAddr;
  public int DestAddr;
}

The two things to note about the definition of the structure are that it is declared with sequential layout, which ensures that the order of the members is not altered by the CLR, and that Pack = 1 ensures that no padding is inserted between the members, so their relative locations match the raw data.
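A quick way to check what the layout attributes do is to ask the marshaler for sizes and offsets. The sketch below is an assumption-laden illustration: IpHeaderLayout is a copy of the structure above under a different name, and Padded, Packed and LayoutCheck are hypothetical types added only to show the effect of Pack.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct IpHeaderLayout
{
    public byte VerLen;
    public byte TOS;
    public short TotalLength;
    public short ID;
    public short Offset;
    public byte TTL;
    public byte Protocol;
    public short Checksum;
    public int SrcAddr;
    public int DestAddr;
}

// Same members, default packing: the int is aligned to a 4-byte boundary.
[StructLayout(LayoutKind.Sequential)]
struct Padded { public byte A; public int B; }

// Pack = 1 removes the three padding bytes after A.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Packed { public byte A; public int B; }

static class LayoutCheck
{
    public static int SizeOf<T>() where T : struct
    {
        return Marshal.SizeOf(typeof(T));
    }

    public static int OffsetOf<T>(string field) where T : struct
    {
        return Marshal.OffsetOf(typeof(T), field).ToInt32();
    }
}
```

LayoutCheck.SizeOf<IpHeaderLayout>() reports 20, matching the size of an IP header without options, while Padded and Packed report 8 and 5 respectively. The IP header members happen to be naturally aligned, so Pack = 1 acts as a safeguard there rather than changing the layout.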

Now that the structure is defined the next step is to populate the structure with data. In this case the data came from a call to the Receive method of the Socket class. To read the IP header, the socket was created using SocketType.Raw. The Receive method fills a byte array with the data, and the first 20 bytes of this data make up the IP header (when no options are present). Because the structure that I have defined and the structure of the data in these 20 bytes are a one to one match, using a language like C++ reading this data would be a relatively simple task requiring only a cast.

IpHeader *pHeader = (IpHeader*)packet;

With the structure of the IpHeader overlaid on the raw bytes, it is simply a matter of accessing the members of the structure. Unfortunately our data is stored in managed memory and using code like the above in C# would require entering an unsafe code block. The following piece of code demonstrates how this could be achieved in C#.

IpHeader iphdr;
unsafe
{
  fixed ( byte *pData = packet)
  {
    iphdr = *(IpHeader*)pData;
  }
}
//Use iphdr...

For the above code to compile successfully you will have to flip the switch to allow unsafe code to be compiled. This can be achieved using either the /unsafe switch for the command line compiler or in the Visual Studio .NET project properties under Configuration Properties>Build you can set the Allow Unsafe Code Blocks to true.

The code in the unsafe block uses the fixed keyword to pin the packet array in memory, ensuring that the garbage collector does not shift the memory around under our feet. Because the memory is fixed, it can be used much like a standard C/C++ pointer, allowing us to cast the block of memory to an IpHeader* and copy the value it points to into the iphdr structure.

Though this is a relatively simple piece of code, there are a few points to be noted about this solution. First, whenever we allow unsafe code blocks in a C# application we inhibit the CLR’s ability to verify the code. In this case, by fixing the data in memory and accessing it like a pointer, the CLR can no longer check the array bounds and our code is open to a buffer overrun. The second point of concern with the fixed memory is that the garbage collector cannot shift this memory, which impacts the garbage collection process and the performance of the garbage collector if a collection is required. And finally this solution, as far as I know, cannot be done in VB.NET, so it would have to be written in C# and the assembly referenced from VB.NET.

That brings me to the solution of using the marshaling provided by interop services. The clear advantage of this solution is that it can be applied in both C# and VB.NET and it does not require the use of unsafe code blocks therefore the code remains verifiable. Before I discuss this solution any further let me present a code snippet that demonstrates the solution.

IntPtr pIP = Marshal.AllocHGlobal( len );
Marshal.Copy( packet, 0, pIP, len );
iphdr = (IpHeader)Marshal.PtrToStructure( pIP, typeof(IpHeader) );
Marshal.FreeHGlobal( pIP );

Now the moment I wrote this piece of code it felt like a knife stabbing into my back. The first thing that happens is that a block of memory is allocated on the unmanaged heap. Since the memory is allocated from the unmanaged heap it does not impact the garbage collector in any way. The problem is that the packet is stored in the managed heap, so we need to first copy the data from the buffer in the managed heap to the buffer in the unmanaged heap using Marshal.Copy. Once the data resides in the unmanaged heap we can use Marshal.PtrToStructure to marshal the data back to the managed heap in the form of our IpHeader structure. And finally we must free the unmanaged memory that we allocated.

This really gave me a sour taste in my mouth. Every packet that I picked up needs to be shifted from the managed heap to the unmanaged heap and then be marshaled back to managed heap. Are there words to describe this? Clearly I was not satisfied with this solution.
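One variation that is not part of the options covered in this article, but which avoids the trip through the unmanaged heap, is to pin the managed buffer with a GCHandle and let Marshal.PtrToStructure read straight from it. The sketch below assumes that approach is acceptable in your environment; PinnedReader, FromBytes and the Pair structure are hypothetical names, and the code still carries the same UnmanagedCode security requirement as the other Marshal calls.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 3-byte structure used only to exercise the helper.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Pair
{
    public byte A;
    public short B;
}

static class PinnedReader
{
    // Marshals a structure directly out of a pinned managed byte[].
    public static T FromBytes<T>(byte[] buffer) where T : struct
    {
        if (buffer == null) throw new ArgumentNullException("buffer");
        if (buffer.Length < Marshal.SizeOf(typeof(T)))
            throw new ArgumentException("Buffer is smaller than the structure.", "buffer");

        // Pin the array so the garbage collector cannot move it while we read
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

The buffer is pinned only for the duration of the call, so the impact on the garbage collector is similar to that of the fixed statement shown earlier, and the length check guards against reading past the end of the array.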

The options investigated so far have required somehow sidestepping the way the CLR would normally go about its business, introducing risks such as buffer overruns or even memory leaks if we fail to free the unmanaged memory we have allocated.

So is there a completely managed solution to the problem? Well, yes there is, and at first this might seem like it is really the long way around in comparison to the options we have explored so far. However, after I present the code I will demonstrate why in this particular case it was actually less code than the previous solutions. This solution makes use of the BinaryReader to read the data directly into the structure's members.

System.IO.MemoryStream stm = new System.IO.MemoryStream( packet, 0, len );
System.IO.BinaryReader rdr = new System.IO.BinaryReader( stm );
iphdr.VerLen = rdr.ReadByte();
iphdr.TOS = rdr.ReadByte();
iphdr.TotalLength = rdr.ReadInt16();
iphdr.ID = rdr.ReadInt16();
iphdr.Offset = rdr.ReadInt16();
iphdr.TTL = rdr.ReadByte();
iphdr.Protocol = rdr.ReadByte();
iphdr.Checksum = rdr.ReadInt16();
iphdr.SrcAddr = rdr.ReadInt32();
iphdr.DestAddr = rdr.ReadInt32();

This solution has its own set of advantages and disadvantages. Firstly, the structure definition can lose the StructLayoutAttribute since the physical layout of the structure is no longer important. The primary advantage of this code, in my opinion, is that we are working with code that is 100% managed and there are no cute tricks that could end up tripping us up, although there are other aspects to this solution that could. Another advantage is particular to the case of reading network data, which I will address shortly.

The code creates a MemoryStream on the existing packet buffer and a BinaryReader is created to read the data from the MemoryStream. Then byte for byte (and now and then an int or short) the data is read into an instance of the structure.

As I said earlier, the next advantage is specific to the requirements of this particular case. As you know, when reading data from a socket the data is in network byte order or big-endian form. This byte ordering is incompatible with Intel processors, which expect multi-byte numeric data types to be represented in little-endian form. Since the packet’s data is big-endian, we are required to convert each Int16 and Int32 (short and int) to little-endian form. I use IPAddress.NetworkToHostOrder to do the byte swapping for me. The swap is required regardless of which of the above solutions was chosen, and with the earlier solutions it means calling the function on each non-byte member of the struct after the structure has been filled. Using the last solution we investigated, it can be accomplished in one step while filling the structure.

iphdr.VerLen = rdr.ReadByte();
iphdr.TOS = rdr.ReadByte();
iphdr.TotalLength = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.ID = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.Offset = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.TTL = rdr.ReadByte();
iphdr.Protocol = rdr.ReadByte();
iphdr.Checksum = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.SrcAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
iphdr.DestAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
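To see what the byte swap does, consider the two bytes 0x00 0x14, which represent the value 20 in network byte order. The sketch below reads them with and without the conversion; EndianDemo and its methods are hypothetical names, and the results described assume a little-endian host such as an Intel machine. BinaryReader always reads little-endian, whereas NetworkToHostOrder only swaps on little-endian hosts.

```csharp
using System;
using System.IO;
using System.Net;

static class EndianDemo
{
    // Reads the first two bytes exactly as BinaryReader sees them (little-endian).
    public static short ReadRawInt16(byte[] data)
    {
        using (BinaryReader rdr = new BinaryReader(new MemoryStream(data)))
        {
            return rdr.ReadInt16();
        }
    }

    // Reads the first two bytes and converts from network to host byte order.
    public static short ReadNetworkInt16(byte[] data)
    {
        using (BinaryReader rdr = new BinaryReader(new MemoryStream(data)))
        {
            return IPAddress.NetworkToHostOrder(rdr.ReadInt16());
        }
    }
}
```

On a little-endian machine ReadRawInt16 returns 5120 (0x1400) for those two bytes, while ReadNetworkInt16 returns the intended 20.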

The most obvious concern using this solution is the possibility of misaligned reads, e.g. reading a Byte where you should have read an Int. One mistake of this kind and all the data after that point is misaligned and makes no sense. This solution also requires that you maintain both the structure and the code that populates the structure; if the structure changes in some way, the code to read the structure needs to be updated. And even with medium-sized structures you end up doing a lot of byte counting and checking that every read is performed in the correct order, which is very error-prone.

I have found that including a constructor and/or a static method provides an elegant means of hiding the actual method used to perform the de-serialization. The following pieces of code demonstrate the two possible ways of creating a de-serialized packet using either a static method or constructor.

IpHeader iphdr = IpHeader.FromPacket( packet, len );

IpHeader iphdr = new IpHeader( packet, len );

Typically I implement both, and the static method just creates an instance of the structure using the appropriate constructor.
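A minimal sketch of that arrangement, using the BinaryReader approach, might look like the following. The type is named ParsedIpHeader here purely to keep the example self-contained, and the argument checks are assumptions about what a reasonable implementation would validate.

```csharp
using System;
using System.IO;
using System.Net;

struct ParsedIpHeader
{
    public byte VerLen;
    public byte TOS;
    public short TotalLength;
    public short ID;
    public short Offset;
    public byte TTL;
    public byte Protocol;
    public short Checksum;
    public int SrcAddr;
    public int DestAddr;

    // The constructor hides how the de-serialization is performed.
    public ParsedIpHeader(byte[] packet, int len)
    {
        if (packet == null) throw new ArgumentNullException("packet");
        if (len < 20 || len > packet.Length)
            throw new ArgumentException("At least 20 bytes of packet data are required.", "len");

        using (BinaryReader rdr = new BinaryReader(new MemoryStream(packet, 0, len)))
        {
            VerLen = rdr.ReadByte();
            TOS = rdr.ReadByte();
            TotalLength = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
            ID = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
            Offset = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
            TTL = rdr.ReadByte();
            Protocol = rdr.ReadByte();
            Checksum = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
            SrcAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
            DestAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
        }
    }

    // The static method simply forwards to the constructor.
    public static ParsedIpHeader FromPacket(byte[] packet, int len)
    {
        return new ParsedIpHeader(packet, len);
    }
}
```

Because callers only ever see FromPacket or the constructor, the body could later be swapped for the unsafe or Marshal based approach without touching any calling code.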

Conclusion

Though the above solutions are not the only solutions, most other solutions are some form of one of these. Alternatively you could use managed C++ and get the best of both worlds at the expense of verifiable code. The presented solutions are applicable whenever you want to read binary data into a structured form, regardless of the source of the data, be it a network byte stream or data from a disk file.

Selecting an appropriate solution is dependent on your requirements and constraints; personally I tend towards using unsafe code blocks especially if the structures are large (call me lazy) or the final solution of reading the data byte for byte. I just can’t feel good about moving perfectly good memory all over the show.