Saturday, April 17, 2010

Archive: Structure from Binary Data

Original Post Date: 8 Jan 2004

The question of moving stream data into a structure seems to be addressed every so often on the newsgroups. While I was building a front-end for a small packet sniffing application I was faced with these same questions. While I was aware of the standard approaches I decided to investigate what the alternative options where. This article is a summary of the primary ways I looked at of mapping binary data to a structure.

Defining the structure

The first thing that I did was to define the structure, for this example I am going to use the IP header. Deciding on the correct definition of the structure is dependent on the solution or approach taken to solve this problem.

Given that I have spent a considerable amount of time familiarizing myself with the intricacies of the P/Invoke or interoperability services, my first solution was to use System.Runtime.InteropServices to marshal the data into a structure. I promptly defined the following structure to handle the marshaled data.

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct IpHeader
{
  public byte VerLen;
  public byte TOS;
  public short TotalLength; 
  public short ID;
  public short Offset;
  public byte TTL;
  public byte Protocol;
  public short Checksum;
  public int SrcAddr;
  public int DestAddr;
}

The two things to note about the definition of the structure is that it is defined as a sequential structure which ensures that the order of the members and relative location of these members are not altered by the CLR.

Now that the structure is defined the next step is to populate the structure with data. In this case the data came from a call to the Receive method of the Socket class. To read IP header the socket was created using SocketType.Raw. The Receive method fills a byte array with the data and it is the first 20 bytes of this data that makes up the IP header. Because the structure that I have defined, and the structure of the data in these 20 bytes are a one to one match, using a language like C++ reading this data would be a relatively simple task requiring only a cast.

IpHeader *pHeader = (IpHeader*)packet;

With the structure of the IpHeader overlayed on the memory stream it is a matter of accessing the members of the structure. Unfortunately our structure is stored in managed memory and using code like the above in C# would require entering an unsafe code block. The following piece of code demonstrates how this could be achieved in C#.

IpHeader iphdr;
unsafe
{
  fixed ( byte *pData = packet)
  {
    iphdr = *(IpHeader*)pData;
  }
}
//Use iphdr...
For the above code to compile successfully you will have to flip the switch to allow unsafe code to be compiled. This can be achieved using either the /unsafe switch for the command line compiler or in the Visual Studio .NET project properties under Configuration Properties>Build you can set the Allow Unsafe Code Blocks to true.

The code in the unsafe block uses the fixed keyword to pin the packet structure in memory ensuring that the garbage collector does not shift the memory around under our feet. Because the memory is fixed it can now be used much like a standard C/C++ pointer allowing us to cast the block of memory to an IpHeader* and assign the pointer to iphdr structure.

Though this is a relatively simple piece of code there are a few points to be noted about this solution. First whenever we allow unsafe code blocks in a C# application we inhibit the CLR’s ability to verify the code. In this case, by fixing the data in memory and accessing it like a pointer the CLR can no longer check the array bounds and our code is open to a buffer overrun. The second point of concern with the fixed memory is that the garbage collector cannot shift this memory impacting the garbage collection process and the performance of the garbage collector if a collection is required. And finally this solution, as far as I know, can not be done in VB.NET, so it would have to be written in C# and the assembly referenced from VB.NET.

That brings me to the solution of using the marshaling provided by interop services. The clear advantage of this solution is that it can be applied in both C# and VB.NET and it does not require the use of unsafe code blocks therefore the code remains verifiable. Before I discuss this solution any further let me present a code snippet that demonstrates the solution.

IntPtr pIP = Marshal.AllocHGlobal( len );
Marshal.Copy( packet, 0, pIP, len );
iphdr = (IpHeader)Marshal.PtrToStructure( pIP, typeof(IpHeader) );
Marshal.FreeHGlobal( pIP );
Now the moment I wrote this piece of code it felt like a knife stabbing into my back. The first thing that happens is that a block of memory is allocated on the unmanaged heap. Since the memory is allocated from the unmanaged heap it does not impact the garbage collector in anyway. The problem is that the packet is stored in the managed heap so we need to first copy the data from the buffer in the managed heap to the buffer in the unmanaged heap using Marshal.Copy. Once the data resides in the unmanaged heap we can use the Marshal.PtrToStructure to marshal the data back to the managed heap in the form of our IpHeader structure. And finally we must free the unmanaged memory that we allocated.

This really gave me a sour taste in my mouth. Every packet that I picked up needs to be shifted from the managed heap to the unmanaged heap and then be marshaled back to managed heap. Are there words to describe this? Clearly I was not satisfied with this solution.

The options investigated so far have required somehow side stepping the way the CLR normally would go about its business. Introducing risks such as buffer overruns or even memory leaks if we fail to free the unmanaged memory we have allocated.

So is there a completely managed solution to the problem? Well yes there is and at first this might seem like it is really the long way around in comparison to the options we have explored so far. However after I present the code I will demonstrate why in this particular case it was actually less code than the previous solutions. This solution makes use of the BinaryReader to read the data directly into the structures members.

System.IO.MemoryStream stm = new System.IO.MemoryStream( packet, 0, len );
System.IO.BinaryReader rdr = new System.IO.BinaryReader( stm );
iphdr.VerLen = rdr.ReadByte();
iphdr.TOS = rdr.ReadByte();
iphdr.TotalLength = rdr.ReadInt16();
iphdr.ID = rdr.ReadInt16();
iphdr.Offset = rdr.ReadInt16();
iphdr.TTL = rdr.ReadByte();
iphdr.Protocol = rdr.ReadByte();
iphdr.Checksum = rdr.ReadInt16();
iphdr.SrcAddr = rdr.ReadInt32();
iphdr.DestAddr = rdr.ReadInt32();

This solution has it’s own set of advantages and disadvantages. Firstly the structure definition can loose the StructLayoutAttribute since the physical layout of the structure is no longer important. The primary advantage in my opinion of this code is that we are working with code that is 100% managed and there are no cute tricks that could end up tripping us up, although there are other aspects to this solution that could. Another advantage is particular to the case of reading network data, which I will address shortly.

The code creates a MemoryStream on the existing packet buffer and a BinaryReader is created to read the data from the MemoryStream. Then byte for byte (and now and then a int or short) the data is read into an instance of the structure.

As I said earlier, the next advantage is specific to the requirements of this particular case. As you know when reading data from a socket the data is in network byte order or big-endian form. This byte ordering is incompatible with Intel processors, which expect multi-byte numeric data types to be represented in little-endian form. Since the packet’s data is in big-endian we are required to convert each Int16 and Int32 (short and int) to little-endian form. I use IPAddress.NetworkToHostOrder to do the byte swapping for me, since this is required regardless of which of the above solutions where chosen, this would mean calling the function on each non-byte member of the struct. Using the last solution we investigated this could be accomplished in one step while filling the structure.

iphdr.VerLen = rdr.ReadByte();
iphdr.TOS = rdr.ReadByte();
iphdr.TotalLength = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.ID = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.Offset = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.TTL = rdr.ReadByte();
iphdr.Protocol = rdr.ReadByte();
iphdr.Checksum = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.SrcAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
iphdr.DestAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());

The most obvious concern using this solution is the possibility of misaligned reads, e.g. reading a Byte where you should have read an Int. One mistake of this kind and all the data after that point are misaligned and make no sense. This solution also requires that you maintain both the structure and the code that populates the structure, if the structure changes in someway the code to read the structure needs to be updated. And even with medium sized structures you end up doing a lot of byte counting and checking that every read is performed in the correct order, which is very error prone.

I have found that including a constructor and/or a static method provides an elegant means of hiding the actual method used to perform the de-serialization. The following pieces of code demonstrate the two possible ways of creating a de-serialized packet using either a static method or constructor.

IpHeader iphdr = IpHeader.FromPacket( packet, len );
Or
IpHeader iphdr = new IpHeader( packet, len );
Typically I implement both, and the static method just creates an instance of the structure using the appropriate constructor.

Conclusion

Though the above solutions are not the only solutions, most other solutions are some form of one of these. Alternatively you could use managed C++ and get the best of both worlds and the expense of verifiable code. The presented solutions are applicable whenever you want to read binary data into a structured form; regardless of the source of the data, be it a network byte stream, or data from a disk file.

Selecting an appropriate solution is dependent on your requirements and constraints; personally I tend towards using unsafe code blocks especially if the structures are large (call me lazy) or the final solution of reading the data byte for byte. I just can’t feel good about moving perfectly good memory all over the show.

No comments:

Post a Comment