Sunday, October 2, 2011

My first attempt at using the HTML 5 Canvas

Today I found myself wondering what kind of performance I can get with JavaScript and the new HTML 5 Canvas. Looking for something fun to implement quickly, I dug up some code for a demo I did of the Silverlight WriteableBitmap and started porting that to JavaScript.

Now I am not much of a JavaScript guy. In fact I can safely say that this is probably the most JavaScript I have written, probably more than all my previous attempts at playing with JavaScript put together.

So why am I writing about this if I lack so much experience? To be honest, I was quite surprised by the results and thought they would be worth sharing.

I did the initial development using IE 9 and was impressed by the performance. I then hit the test page using Firefox 7, Google Chrome 14 and Safari 5.1. Firefox was by far the slowest, while IE 9, Chrome and Safari all performed really well on my Windows 7 box. So if you are viewing this in Firefox, try it with IE 9 and see if your experience is similar.

Tuesday, September 6, 2011

Practical Algorithms and Data Structures : Arrays–Part 1

To understand arrays, you first need to understand some basic concepts of how computer memory is managed. This is going to be a very high-level explanation with many oversimplifications. I hope to share just enough to ensure that you don’t get lost when we get into the details of arrays.

You can think of computer memory as a collection of blocks which are placed sequentially one after the other in a row. Each block is uniquely identified by a number, known as a memory address. For example, the first block would be at memory address 0, the next at memory address 1, and so on until you reach the last block.

Each of these blocks or memory locations can store a single piece of data known as a byte. A byte is made up of 8 bits, where each bit can have a binary state of either ‘1’ or ‘0’, so 256 unique combinations of 1s and 0s can be formed for a byte. The interpretation or meaning associated with each of these patterns depends on what part of the system is looking at the data and how it has chosen to interpret the data. For example, if your application is interpreting the data in the memory area as letters of the alphabet, you might choose to interpret the bit pattern 01000001 as the upper case letter ‘A’. If, on the other hand, your application is treating the data as a numerical quantity representing someone's age, that same bit pattern would be interpreted as the numeric value 65. (See note 1 below)
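To make that concrete, here is a minimal C# sketch of reading one byte two different ways. The BitPatternDemo class and its method names are my own, made up purely for illustration.

```csharp
using System;

static class BitPatternDemo
{
    // Parse a string of eight '0'/'1' characters into a single byte.
    public static byte FromBits(string bits)
    {
        return Convert.ToByte(bits, 2);
    }

    // Interpret the byte as a numeric quantity.
    public static int AsNumber(byte value)
    {
        return value;
    }

    // Interpret the same byte as a character using the ASCII mapping.
    public static char AsAsciiChar(byte value)
    {
        return (char)value;
    }
}
```

Feeding the pattern 01000001 through FromBits and then through each of the other two methods gives the number 65 and the letter ‘A’ respectively. The byte never changes, only the way we choose to read it.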

More complex data can be represented by combining the patterns of multiple memory locations and interpreting those patterns appropriately. Let me give you an example.

Suppose you wanted to store a string of characters in memory. You could choose to store that string with a length prefix which indicates how many characters are in the string, and the subsequent memory locations would contain the bytes that represent each character of the string.

String in memory

The above image represents a string stored in memory with a length prefix of 5. Memory location 0 contains the bit pattern for the number 5, indicating that the next 5 memory locations contain the character data of the string. Memory location 1 contains the bit pattern 01100011, which has the numeric value of 99, but because we know this is a character of a string we will map this to the character ‘c’ (we are using an ASCII mapping here, see note 2 below). The ‘r’ at memory address 3 is encoded as 01110010, which represents the numeric value 114. This mapping is done for the 5 memory locations after the length prefix.
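The layout described above could be sketched in C# along these lines. The LengthPrefixedString class is hypothetical, and I have assumed a single byte for the length prefix, which limits the string to 255 ASCII characters.

```csharp
using System;
using System.Text;

static class LengthPrefixedString
{
    // Lay the string out as: [length][char 1][char 2]...[char n].
    // Assumption: one length byte, so at most 255 ASCII characters.
    public static byte[] Encode(string text)
    {
        byte[] chars = Encoding.ASCII.GetBytes(text);
        if (chars.Length > 255)
            throw new ArgumentException("Text is too long", "text");

        byte[] memory = new byte[chars.Length + 1];
        memory[0] = (byte)chars.Length;                 // the length prefix
        Array.Copy(chars, 0, memory, 1, chars.Length);  // the character data
        return memory;
    }

    // Read the prefix first, then interpret that many bytes as characters.
    public static string Decode(byte[] memory)
    {
        int length = memory[0];
        return Encoding.ASCII.GetString(memory, 1, length);
    }
}
```

Encoding the string "car" this way produces four bytes: the length 3 followed by 99, 97 and 114, which are the ASCII values of ‘c’, ‘a’ and ‘r’.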

Similarly we need encoding mechanisms to store large integer values, real numbers which can represent fractions of a whole number, and so on. As you might have already realized, the data stored in a single memory location can be quite limiting, having only 256 possible values. That is not much to work with. It might be fine for storing your age, but what about the population of a country? As with the example above where we used multiple memory locations to store a string, we can store larger numbers by using multiple memory locations which, when interpreted as a whole, represent a larger numeric value. For example, using 4 memory locations we have a total of 32 bits of data, which gives you 4,294,967,296 unique bit patterns (the values 0 through 4,294,967,295). That covers the population of any country in the world, but we need to go even bigger if we want to store the population of the entire world.
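As a rough illustration, four bytes can be combined into one 32-bit number like this. The MultiByteNumber class is my own invention, and treating the first byte as the least significant one is an assumption. Real systems differ on byte order.

```csharp
static class MultiByteNumber
{
    // Combine four memory locations into one unsigned 32-bit value.
    // Assumption: b0 is the least significant byte, b3 the most significant.
    public static uint Combine(byte b0, byte b1, byte b2, byte b3)
    {
        return (uint)b0
             | ((uint)b1 << 8)
             | ((uint)b2 << 16)
             | ((uint)b3 << 24);
    }
}
```

With all four bytes set to 255 the result is 4,294,967,295, the largest value that fits in 32 bits.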

The important thing here is that you know up front how to interpret the various pieces of information stored in the memory and how many bytes make up a single unit of information.

As you can imagine, these encodings get quite complex. Fortunately you rarely need to deal with these low level details, because for the most part they are nicely taken care of for us by the higher level tools that we use to write our software.

The important takeaway from this post is that more complex data representations can require multiple memory locations to store a single instance of data.


Prev – Introduction


* Notes:

  1. The bit patterns are not random, they follow a numbering scheme known as the binary system. Read more about this here.
  2. Interpreting the bit patterns as characters is done using a look-up table, in this case I have used the ASCII table which you can read more about here.

Saturday, July 9, 2011

Practical Algorithms and Data Structures : Introduction

Welcome to the first of hopefully many posts on Algorithms and Data Structures.

One of the areas of computer science that I have always felt drawn to is the study of algorithms. That is not to say that I have developed any kind of specialized expertise in the field, but I do find it tremendously interesting and continuously strive to expand on my understanding and ability to apply algorithms effectively.

For me at least, the best way to evolve my knowledge of a topic is to attempt to explain some aspect of the particular topic to someone else. On countless occasions I have found that answering questions and having my explanations challenged has enabled me to reach that “Aha” moment, when suddenly a topic that I thought I understood becomes that much clearer.

Now you might be wondering why I would even bother writing something like this, especially given all the material out there covering this specific area. Well, other than my selfish motivation to learn more in the process, I have also realized that so many developers today lack the basics in terms of even the simplest algorithms. They might know the terminology, and sometimes not even that, but as soon as you start probing on the specifics of the algorithms, or how to decide which algorithm to use under which circumstance, things start to go sideways.

Why is it the case that so many practicing developers today find themselves lacking in this area? Well, I guess the truth is that with all the excellent tooling and libraries accompanying most languages and development platforms today, very few people need to actually delve into the depths of the algorithms and data structures they use. Everything is right there. Got a list and want to find something in it? Just call Find, Search, IndexOf or whatever function is documented to find an item in the list and be done with it. There is definitely nothing wrong with this. These libraries are professionally developed, robust and already used by thousands of developers, so they are well QA’d. There is however tremendous value in having a good understanding of the algorithms and data structures you use.

If you understand your data, and you know what it is you need to do with that data, the next step is selecting the most appropriate data structure to store your data. Algorithms and data structures are tightly coupled. Often the data structures you use to store your data will determine which algorithms can be used efficiently on the data. While most data structures can be searched, what will vary is how efficiently that search can be performed, depending on the underlying data structure or even the ordering of the data in the structure. Having an understanding of the various data structures and the corresponding algorithms can help you choose the most efficient way to work with your data and ensure your software does not buckle under the pressure of huge data volumes just because you selected the wrong data structure and/or algorithm to manipulate and manage your data. Selecting the right algorithm for the job is often the key to a successful outcome, but to do that you need to understand the pros and cons of what is available to you.

With this series of blog posts I hope to share some of my learnings and at the same time gain a deeper level of understanding as we explore the algorithms and data structures together. I invite you to participate in this series. If you know a better way to implement something or have a better approach to explaining a specific algorithm, please share with us and help enrich our journey.

What are Data Structures?

Data structures are the containers that you use to store and manage the data for your application.

As we design and develop software, one of the decisions we need to make is what data types to use to store the application specific data. Suppose we were developing a simple contact management system, in which we can enter contact information and later retrieve that information. We would need to define how the contacts would be represented internally within the system, i.e. the data elements that represent a contact, such as first name, last name, address, email address, mobile number and date of birth. We would also need to define the data types used to store each of these elements and how multiple contacts will be maintained and managed within the system. All of this helps define the data structures that will be required to build a functional system.

However, simply looking at the data requirements is not always enough. We also need to look at the algorithms we intend to apply to these data structures: how will we manipulate the data, perform searches, sort the data, etc. As we will discover through this series, the intended algorithms will have a bearing on the data structures we might select to represent the data in the system.

What is an Algorithm?

An algorithm is a recipe or set of instructions that can be followed to solve a specific type of problem.

Assume we have chosen to store our contacts from earlier in a list in which we can access each contact by walking through the list item by item, like paging through a book. We now need to find an algorithm we can use to search through the list to locate a contact by last name. How would you go about that? Given what we know about the data structure, the obvious solution would be to iterate through the list of contacts comparing the last name element to the search key. If you find a match you can stop the iteration and return the instance of the contact that was found, otherwise if you reach the end of the list and no match was found you return some indication that the contact does not exist. These steps describe what is known as a Sequential Search.
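Those steps might be sketched in C# as follows. The Contact class and the FindByLastName method are hypothetical names I have used for illustration, and returning null to signal "not found" is just one of several reasonable choices.

```csharp
using System;
using System.Collections.Generic;

class Contact
{
    public string FirstName;
    public string LastName;
}

static class SequentialSearchDemo
{
    // Walk the list item by item. Return the first contact whose last
    // name matches the search key, or null if the end of the list is
    // reached without finding a match.
    public static Contact FindByLastName(IList<Contact> contacts, string lastName)
    {
        foreach (Contact contact in contacts)
        {
            if (string.Equals(contact.LastName, lastName,
                              StringComparison.OrdinalIgnoreCase))
            {
                return contact; // match found, stop iterating
            }
        }
        return null; // no such contact
    }
}
```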

There are situations where using the simple Sequential Search algorithm might not be the best option and could severely hurt the performance of your system. For example, if the list in question contained a significant number of items and you need to perform frequent searches to determine the existence of an item in the list, this could quickly become a bottleneck. Every time we search for an item that does not exist, we will be iterating through the entire list just to determine that the item does not exist, in the best case the item we are searching for is found quickly within the first few items of the list, while on other occasions the item might only be found towards the end of the list. We will look at this in a little more depth in the section on Big-O notation.

Let’s look at one possible alternative, we could use a Binary Search. This algorithm can be significantly more efficient than a Sequential Search, especially in the worst case scenarios where the item being searched is either not in the list or it exists far from the beginning of the list. However, to be able to use the Binary Search, the collection of items will need to conform to the basic requirements of the Binary Search.

  1. The list of items must be sorted
  2. The list structure must support what is often called random access. i.e. we should be able to access item 83 in a list of 100 items without needing to iterate over the first 82 items.

Assuming the above constraints are met, we can use the Binary Search algorithm to introduce some significant optimization. The cost of ensuring the data is sorted is not insignificant, but for the purposes of this discussion we will ignore it for now.

Here is a quick introductory example of the basics of the Binary Search algorithm.

First you select the item in the middle of the list. In a list of 100 items that will be item 50. Now compare item 50 (the midpoint item) to the search key. If it is a match we can terminate the search and return the item. If the search key is greater than the midpoint item then we know that, if the item exists, it must be in the second half of the list. We know this because the list is sorted, therefore if the search key is greater than the midpoint item, it must be greater than all the items preceding the midpoint item. And vice versa, if the search key is less than the midpoint item, then a potentially matching item would be in the first half of the list. Can you see how with a single comparison we have eliminated half of the items to be searched?

Having determined which half of the list the item might be in, you can repeat the same logic on that subset of the data. Having narrowed the list of items down to a subset of 50, you can again select the midpoint item and compare it to the search key, which will either be a match or indicate that the search key potentially exists in the top or bottom half of the subset. After 2 comparisons we have eliminated roughly 75% of the items to be searched. You can continue until you either find the item or run out of items to search which would indicate that the item does not exist in the list.
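Here is a minimal sketch of that halving logic in C#, assuming a sorted array of integers. The class and method names are my own, and returning -1 for "not found" is simply a convention I picked for the example.

```csharp
static class BinarySearchDemo
{
    // Returns the index of key in the sorted array, or -1 if it is absent.
    public static int Find(int[] sorted, int key)
    {
        int low = 0;
        int high = sorted.Length - 1;

        while (low <= high)
        {
            // Midpoint of the range that is still under consideration.
            int mid = low + (high - low) / 2;

            if (sorted[mid] == key)
                return mid;          // match, we are done
            if (sorted[mid] < key)
                low = mid + 1;       // key can only be in the upper half
            else
                high = mid - 1;      // key can only be in the lower half
        }
        return -1;                   // ran out of items to search
    }
}
```

Note that the array indexing is exactly the random access requirement mentioned earlier. Without it we could not jump straight to the midpoint.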

We will cover both the Sequential and Binary Search in more detail later, for now I just want to use this to demonstrate how selecting the right algorithm for the job can make a difference and how the nature of the data might influence the algorithms you can use. And it also leads us into the next topic and that is Big-O notation.

Big-O notation

I am sure you have at some point seen or read about Big-O notation even if you did not know what it meant. If you have spent any time reading about algorithms you might have seen something like O(n), O(log n) or O(n³). And if you wondered what it all means, I will try to give a very brief non-mathematical description of how you can make some basic sense of this notation.

When selecting an algorithm there are a number of factors that you might have to take into consideration. For desktop or server-based applications your primary criterion might be performance, where you want to select the algorithm that is going to give you the best performance regardless of the amount of memory the algorithm requires. On the other hand, on mobile devices you might be more concerned about the memory requirements of a particular algorithm. In either case, we need some way to represent these characteristics of an algorithm. This representation needs to be simple enough that just by looking at it I can tell if one algorithm will perform better than another, or if it will be more memory efficient, without needing to read or understand the complex mathematical analysis of each algorithm. For our purposes we will focus on the performance aspect and discuss the memory aspect later when we work with actual algorithms.

Big-O notation gives us a concise notation that captures the performance characteristics of an algorithm over a collection of items. Basically we can see at a glance whether the algorithm's performance will degrade rapidly, linearly or gradually as the number of items in the collection increases. Of course, as we saw earlier, algorithms have best case scenarios as well as worst case scenarios. Strictly speaking, Big-O describes an upper bound on growth and can be quoted for any of these cases. Unless stated otherwise, the figures I quote are for the average case. So looking at the Big-O for a Binary Search I can say that on average the Binary Search will outperform a Sequential Search. That does not mean it will always outperform the Sequential Search. Remember, if the matching item is first in the list the Sequential Search will find it immediately, while the Binary Search will need to perform a few iterations before finding that the first item is the matching item. On average, though, we would expect the Binary Search to perform better for real world searches.

Using Big-O notation, the Sequential Search would be described as an O(n) algorithm, where n represents the number of items the algorithm will be working with. If we searched a list of 10 items then n=10 and if we searched a list of 1000 items then n=1000. From this we can conclude that on average the algorithm performs linearly, if we double the number of items the average search time will double, so the relationship between the number of items and the execution time is linear.

How does that compare to our Binary Search algorithm? Well, without getting into the details now, I will tell you that a Binary Search is an O(log n) algorithm. This means that as the number of items increases, the execution time increases logarithmically. Mathematically log(n) < n where n is a positive integer (see the table below), so for large lists a Binary Search will be faster than a Sequential Search.

Let's look at a quick analysis. If the item that I am searching for is the first item in a list of 1,000,000 items then the Sequential Search will clearly outperform the Binary Search, which is going to jump to the middle of the list, see that the item is in the first half of the list, halve that portion and so on, for a total of roughly 20 comparisons before locating the target item at the beginning of the list. However, if the item being searched for was the last item in the list then the Sequential Search would require 1,000,000 comparisons, while the Binary Search worst case would not be more than 20 comparisons. So the worst case of the Binary Search of a collection of sorted items is 20 comparisons, while the Sequential Search will exceed this when searching for any of the 999,980 items that come after the first 20 items in the list.

What we have seen here is that the Sequential Search has a best case execution of O(1), that is, constant time regardless of the number of items in the list. Of course this is the absolute best case, when you are lucky enough to have the item you are looking for be the first item in the list. The worst case, when the item is the last item or does not exist at all, is O(n), which is also the average case.
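If you would like to check these numbers for yourself, a small sketch like the following counts how many items each search examines. The ComparisonCounter class is hypothetical, and the exact Binary Search count depends on how the midpoint is rounded, so treat the counts as approximate rather than definitive.

```csharp
static class ComparisonCounter
{
    // Number of items examined by a sequential search for key.
    public static int Sequential(int[] items, int key)
    {
        int count = 0;
        foreach (int item in items)
        {
            count++;
            if (item == key) break;
        }
        return count;
    }

    // Number of midpoint items examined by a binary search for key.
    // Assumes the array is sorted in ascending order.
    public static int Binary(int[] sorted, int key)
    {
        int count = 0;
        int low = 0;
        int high = sorted.Length - 1;
        while (low <= high)
        {
            int mid = low + (high - low) / 2;
            count++;
            if (sorted[mid] == key) break;
            if (sorted[mid] < key) low = mid + 1; else high = mid - 1;
        }
        return count;
    }
}
```

Running both against an array holding the values 0 to 999,999, the sequential count for the last item is 1,000,000 while the binary count never exceeds 20, whichever item you search for.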

The Binary Search also has a best case scenario of O(1), that is when the item you are searching for happens to be the item in the middle of the list, in which case the item would be found on the first comparison, but the average case is O(log n).

n          O(1)   O(n)      O(log n)   O(n log n)     O(n²)
1          1      1         0          0              1
10         1      10        3.32       33.22          100
100        1      100       6.64       664.39         10000
1000       1      1000      9.97       9965.78        1000000
10000      1      10000     13.29      132877.12      100000000
100000     1      100000    16.61      1660964.05     10000000000
1000000    1      1000000   19.93      19931568.57    1000000000000

Looking at the above table you see a comparison of some Big-O representations for various values of n. This should give you a feel for how one algorithm would perform relative to another based on the Big-O of the algorithm. The best general purpose sorting algorithms today are O(n log n) algorithms.
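The figures in the table can be reproduced with a few lines of C#. This is only a sketch. The GrowthTable class is my own, and it assumes the logarithms are base 2, which is what the values in the table correspond to.

```csharp
using System;

static class GrowthTable
{
    // log n, assuming base 2
    public static double Log2(double n)
    {
        return Math.Log(n, 2);
    }

    // n log n, assuming base 2
    public static double NLog2N(double n)
    {
        return n * Math.Log(n, 2);
    }

    // n squared
    public static double Squared(double n)
    {
        return n * n;
    }
}
```

Rounding the results to two decimal places for n = 1000 gives 9.97 and 9965.78, matching the corresponding row of the table.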

Given the speed of computers today, even the worst performing algorithms will appear to perform efficiently for small values of n. That is why it is very important to understand the volume of data that your system might need to work with and make sure you test with volumes that are representative of what you expect to see in the production environment.

When selecting your algorithms, make sure that you fully grasp the context in which the algorithm will be used and how that scope might change over time as your system hopefully becomes more and more popular.

If you would like to see a visual representation of the table above take a look at my Big-O Visualizer. Note this application requires Silverlight 4.


Next: Arrays – Part 1


Thursday, December 23, 2010

Visualizing Big-O complexity functions

I quickly threw this little Silverlight application together to help visualize some of the common Big-O complexity functions often quoted in any discussion on Data Structures and Algorithms.

This works on Linux Firefox with Moonlight 3 Preview Release. It is a little slow, but it works.

The following is a brief summary of the functions demonstrated in the application. You can get more detail here.
Function | Description | Example
O(1) | Constant complexity regardless of the domain size. | Array direct indexing, Hash table
O(n) | Linear complexity. As the domain size increases, the time/complexity increases linearly. If you double the items the complexity will double. | Sequential search
O(log n) | Logarithmic complexity. As the domain size increases, the time/complexity increases logarithmically. | Binary Search on sorted data, Lookup in balanced binary tree
O(n log n) | Log Linear complexity. As the domain size increases, the time/complexity increases at a log linear rate. | Heap Sort, Merge Sort, Quick Sort (1)
O(n²) | Quadratic complexity. As the domain size increases, the time/complexity increases quadratically. | Bubble Sort, Insertion sort
O(n³) | Cubic complexity. As the domain size increases, the time/complexity increases at a cubic rate. | Naive multiplication of two n×n matrices
O(2ⁿ) | Exponential complexity. As the domain size increases, the time/complexity increases exponentially. | Some graph algorithms, like finding the exact solution to the traveling salesman problem

(1) Quick sort has best and average case of O(n log n) but worst case of O(n²)

Saturday, September 25, 2010

Code Share: Finding controls by control type.

On stackoverflow.com the question was asked, how to ‘Find ContentPlaceHolders in Master Page’

In this case the OP wanted to work with a Master Page other than the one that was already assigned to the page. The first part of the problem was getting the Master Page loaded in memory so that the Control tree could be interrogated.

Fortunately loading the Master Page is quite simple, you can use LoadControl to load the Master Page just like you would load any other user control.

For example in the Page_Load handler you could use something like the following to load the Master Page.

var site1Master = LoadControl("Site1.Master");

The next part, finding all the controls of a specific type, requires a simple recursive routine to search the control tree for all the controls of the type that you are interested in. Here is a simple implementation of just such a routine.



using System;
using System.Collections.Generic;
using System.Web.UI;

static class WebHelper
{
    // Returns every control of type T found anywhere below root.
    public static IList<T> FindControlsByType<T>(Control root)
        where T : Control
    {
        if (root == null) throw new ArgumentNullException("root");

        List<T> controls = new List<T>();
        FindControlsByType<T>(root, controls);
        return controls;
    }

    // Recursively walks the control tree, collecting matches as it goes.
    private static void FindControlsByType<T>(Control root, IList<T> controls)
        where T : Control
    {
        foreach (Control control in root.Controls)
        {
            T match = control as T;
            if (match != null)
            {
                controls.Add(match);
            }
            if (control.Controls.Count > 0)
            {
                FindControlsByType<T>(control, controls);
            }
        }
    }
}

Using the two pieces of code above, finding all the ContentPlaceHolders on the Master Page can be done like this.



// Load the Master Page
var site1Master = LoadControl("Site1.Master");

// Find the list of ContentPlaceHolder controls
var controls = WebHelper.FindControlsByType<ContentPlaceHolder>(site1Master);

// Do something with each control that was found
foreach (var control in controls)
{
    Response.Write(control.ClientID);
    Response.Write("<br />");
}


Hope someone finds this useful. I thought I would share it since I took the few moments to write it and did not want it to go to waste.

Sunday, June 27, 2010

Crossing the process boundary with .NET

 

Every so often my post Hacking my way across the process boundary gets some attention, mostly in the form of requests for a .NET version of this technique. Out of laziness more than anything else, I had not taken the time or effort to do the conversion until now. So for those that need to access ListView or TreeView data from another process, here is a simple example of one possible way to do it. Since I used C# for the example, I could very well have used unsafe code blocks to do some of the work. However, I decided to avoid this, making the example applicable to VB.NET developers as well. If you would like to see a version using unsafe code blocks, drop me a note and I will get round to it. To keep the sample short I have removed anything but the most rudimentary error checking. For an explanation of this code, please refer to the original post cited above.

 

using System;
using System.Runtime.InteropServices;
using System.Text;

public class CrossProcessMemory
{
    const int LVM_GETITEM = 0x1005;
    const int LVM_SETITEM = 0x1006;
    const int LVIF_TEXT = 0x0001;
    const uint PROCESS_ALL_ACCESS = (uint)(0x000F0000L | 0x00100000L | 0xFFF);
    const uint MEM_COMMIT = 0x1000;
    const uint MEM_RELEASE = 0x8000;
    const uint PAGE_READWRITE = 0x04;

    [DllImport("user32.dll")]
    static extern bool SendMessage(IntPtr hWnd, Int32 msg, Int32 wParam, IntPtr lParam);

    [DllImport("user32")]
    static extern IntPtr GetWindowThreadProcessId(IntPtr hWnd, out int lpwdProcessID);

    [DllImport("kernel32")]
    static extern IntPtr OpenProcess(uint dwDesiredAccess, bool bInheritHandle,
        int dwProcessId);

    [DllImport("kernel32")]
    static extern IntPtr VirtualAllocEx(IntPtr hProcess, IntPtr lpAddress,
        int dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32")]
    static extern bool VirtualFreeEx(IntPtr hProcess, IntPtr lpAddress, int dwSize,
        uint dwFreeType);

    [DllImport("kernel32")]
    static extern bool WriteProcessMemory(IntPtr hProcess, IntPtr lpBaseAddress,
        ref LV_ITEM buffer, int dwSize, IntPtr lpNumberOfBytesWritten);

    [DllImport("kernel32")]
    static extern bool ReadProcessMemory(IntPtr hProcess, IntPtr lpBaseAddress,
        IntPtr lpBuffer, int dwSize, IntPtr lpNumberOfBytesRead);

    [DllImport("kernel32")]
    static extern bool CloseHandle(IntPtr hObject);

    [StructLayout(LayoutKind.Sequential)]
    public struct LV_ITEM
    {
        public uint mask;
        public int iItem;
        public int iSubItem;
        public uint state;
        public uint stateMask;
        public IntPtr pszText;
        public int cchTextMax;
        public int iImage;
    }

    public static string ReadListViewItem(IntPtr hWnd, int item)
    {
        const int dwBufferSize = 1024;

        int dwProcessID;
        LV_ITEM lvItem;
        string retval;
        bool bSuccess;
        IntPtr hProcess = IntPtr.Zero;
        IntPtr lpRemoteBuffer = IntPtr.Zero;
        IntPtr lpLocalBuffer = IntPtr.Zero;
        IntPtr threadId = IntPtr.Zero;

        try
        {
            lvItem = new LV_ITEM();
            lpLocalBuffer = Marshal.AllocHGlobal(dwBufferSize);

            // Get the process id owning the window
            threadId = GetWindowThreadProcessId(hWnd, out dwProcessID);
            if ((threadId == IntPtr.Zero) || (dwProcessID == 0))
                throw new ArgumentException("hWnd");

            // Open the process with all access
            hProcess = OpenProcess(PROCESS_ALL_ACCESS, false, dwProcessID);
            if (hProcess == IntPtr.Zero)
                throw new ApplicationException("Failed to access process");

            // Allocate a buffer in the remote process
            lpRemoteBuffer = VirtualAllocEx(hProcess, IntPtr.Zero, dwBufferSize, MEM_COMMIT,
                PAGE_READWRITE);
            if (lpRemoteBuffer == IntPtr.Zero)
                throw new SystemException("Failed to allocate memory in remote process");

            // Fill in the LVITEM struct, this is in your own process.
            // Set the pszText member to somewhere in the remote buffer.
            // For the example I used the address immediately following the LVITEM struct.
            // ToInt64 is used for the pointer arithmetic so that it also works
            // when the address does not fit in 32 bits.
            lvItem.mask = LVIF_TEXT;
            lvItem.iItem = item;
            lvItem.pszText = new IntPtr(lpRemoteBuffer.ToInt64() + Marshal.SizeOf(typeof(LV_ITEM)));
            lvItem.cchTextMax = 50;

            // Copy the local LVITEM to the remote buffer
            bSuccess = WriteProcessMemory(hProcess, lpRemoteBuffer, ref lvItem,
                Marshal.SizeOf(typeof(LV_ITEM)), IntPtr.Zero);
            if (!bSuccess)
                throw new SystemException("Failed to write to process memory");

            // Send the message to the remote window with the address of the remote buffer
            SendMessage(hWnd, LVM_GETITEM, 0, lpRemoteBuffer);

            // Read the struct back from the remote process into the local buffer
            bSuccess = ReadProcessMemory(hProcess, lpRemoteBuffer, lpLocalBuffer, dwBufferSize,
                IntPtr.Zero);
            if (!bSuccess)
                throw new SystemException("Failed to read from process memory");

            // At this point lpLocalBuffer contains the returned LV_ITEM structure,
            // the next line extracts the text from the buffer into a managed string
            retval = Marshal.PtrToStringAnsi(new IntPtr(lpLocalBuffer.ToInt64() +
                Marshal.SizeOf(typeof(LV_ITEM))));
        }
        finally
        {
            if (lpLocalBuffer != IntPtr.Zero)
                Marshal.FreeHGlobal(lpLocalBuffer);
            if (lpRemoteBuffer != IntPtr.Zero)
                VirtualFreeEx(hProcess, lpRemoteBuffer, 0, MEM_RELEASE);
            if (hProcess != IntPtr.Zero)
                CloseHandle(hProcess);
        }
        return retval;
    }
}



Sunday, June 6, 2010

Visual Studio 2010 – Box Selection

This has been blogged about and promoted everywhere, and it is true: it is a super cool feature that even the pre-.NET IDE supported, but VS2010 breathes new life into the feature with some great enhancements. For a very cool intro to the functionality take a look here, where a member of the IDE team demonstrates the enhancements.

So, why on earth would I be blogging about “old” news? Well, one thing that I found “missing” with this feature until now was the ability to use the keyboard to perform a block selection. Now with VS 2010 you can use Shift+Alt+[Up, Down, Left, Right] to perform a block selection. This is a great enhancement, and worth knowing about if you are a keyboard junkie.

Saturday, April 17, 2010

Archive: Binary data from a Structure

For this post, I will be using the binary structure used by the trusty old DBF file structure.

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct DBFHeader
{
  public byte Tag;
  public byte Year;
  public byte Month;
  public byte Day; 
  public int RecCount;
  public short HeaderSize;
  public short RecSize;
  public short Reserved1;
  public byte Trans;
  public byte Encrypt;
  public long Reserved2_1;
  public short Reserved2_2;
  public byte MDX;
  public byte LangId;
  public short Reserved3;
}

Given the above structure, our challenge now is to transfer it from its in-memory binary format to a stream. The stream may be a physical file, a memory stream or even a network stream. The most important aspect of all the techniques that I will investigate here is that the resulting binary data format is consistent with the format expected by the reader.

After looking at the interface provided by the Stream class, from which all streams inherit, I came to the conclusion that I would have to convert the structure to a byte array (byte[]).

Again our first stop is the interop services provided by the .NET framework. The following piece of code will convert a structure to a byte[] that can then be written to a stream.

DBFHeader hdr = new DBFHeader(); 
byte[] data = new byte[Marshal.SizeOf(hdr)];
unsafe
{
  DBFHeader *p = &hdr;
  Marshal.Copy( (IntPtr)p, data, 0, data.Length ); 
} 

This code requires that you compile with unsafe code allowed. This can be done either by supplying the /unsafe switch to the command line compiler or in the Visual Studio .NET project properties under Configuration Properties>Build, you can set the Allow Unsafe Code Blocks to True.

After creating an instance of the structure DBFHeader, a byte[] is created large enough to hold the contents of the structure. At that point we need to transfer the contents of the DBFHeader instance to the byte[]. To accomplish this, I declared an unsafe block and took the address of the structure. Being a ValueType declared as a local variable, the structure is allocated on the stack, which the garbage collector never moves, so it does not need to be pinned. Then the Marshal.Copy method is used to copy the data from that address in memory to the byte[]. Now we are free to write the byte array to a stream using the Write method provided by the stream.

There are a number of things that count against this technique. Using unsafe code blocks means that the assembly cannot be verified and the caller would require the SkipVerification permission. The use of the Marshal methods requires that the immediate caller has SecurityPermissionAttribute.UnmanagedCode. This technique also requires that the data be moved in two stages, from the structure to the byte[] and then from there to the stream, which could be costly for large structures.

The first problem of the code being non-verifiable can be overcome by using a technique very similar to the above technique, without the requirement for unsafe code blocks. The following code demonstrates this technique.

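DBFHeader hdr = new DBFHeader();
int size = Marshal.SizeOf(hdr);
byte[] data = new byte[size];
IntPtr p = Marshal.AllocHGlobal( size );
try
{
  Marshal.StructureToPtr( hdr, p, false );  // marshal the structure into unmanaged memory
  Marshal.Copy( p, data, 0, data.Length );  // copy it back into the managed byte[]
}
finally
{
  Marshal.FreeHGlobal( p );  // release the unmanaged block even if a call above throws
}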

Notice that it still requires that the data be moved in two stages and the use of the Marshal class carries the same security requirements as earlier. The only benefit is that the resulting assembly remains verifiable, not requiring the /unsafe option. However, allocating memory from the unmanaged heap carries with it additional performance overheads.
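
To avoid repeating the allocate, marshal, copy and free sequence for every structure, the same technique can be wrapped in a small generic helper. This is only a sketch: StructBytes, ToBytes and SampleHeader are names made up for the illustration, and it assumes the structure contains only simple value-type members.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical two-member structure, used only for this illustration.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SampleHeader
{
  public byte Tag;
  public int RecCount;
}

public static class StructBytes
{
  // Marshals a structure to a byte[] through a temporary unmanaged buffer.
  public static byte[] ToBytes<T>(T value) where T : struct
  {
    int size = Marshal.SizeOf(typeof(T));
    byte[] data = new byte[size];
    IntPtr p = Marshal.AllocHGlobal(size);
    try
    {
      Marshal.StructureToPtr(value, p, false);  // structure -> unmanaged memory
      Marshal.Copy(p, data, 0, size);           // unmanaged memory -> byte[]
    }
    finally
    {
      Marshal.FreeHGlobal(p);                   // always release the unmanaged block
    }
    return data;
  }
}
```

Calling StructBytes.ToBytes(hdr) then yields a byte[] that can be passed straight to the Write method of a stream. The two-stage copy and the security requirements of the Marshal class remain exactly as described above.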

And finally there is the pure managed code solution. While not as easy as the prior techniques, it benefits from the fact that the code is easily ported to other .NET languages and does not have the same security restrictions as the other two techniques. The solution is to use the BinaryWriter class to write each member of the structure to the stream.

using (BinaryWriter wr = new BinaryWriter( stm ))
{
  wr.Write( hdr.Tag );
  wr.Write( hdr.Year );
  wr.Write( hdr.Month );
  wr.Write( hdr.Day );
  wr.Write( hdr.RecCount );
  wr.Write( hdr.HeaderSize );
  wr.Write( hdr.RecSize );
  wr.Write( hdr.Reserved1 );
  wr.Write( hdr.Trans );
  wr.Write( hdr.Encrypt );
  wr.Write( hdr.Reserved2_1 );
  wr.Write( hdr.Reserved2_2 );
  wr.Write( hdr.MDX );
  wr.Write( hdr.LangId );
  wr.Write( hdr.Reserved3 ); 
}

This code first creates a BinaryWriter that provides binary methods to access the underlying stream. In this case the underlying stream is designated by the stm argument passed to the BinaryWriter constructor. Then, using the binary writer, each member of the structure is written in order to the stream. I have already mentioned the clear advantages of this technique, but it does carry with it a share of disadvantages. The first and most obvious is that it requires you to write out each member individually, and with larger structures this can be a tedious and error-prone task. This brings me to the second disadvantage: if the data is not written out in the exact order expected by the reader, the reader will either fail to read the structure or, even worse, read corrupt data.

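Just as a closing note, the final technique could also be used to get a byte[] by writing to a MemoryStream and then calling the ToArray method provided by MemoryStream to get a copy of the bytes written. The GetBuffer method gives access to the underlying byte[] without a copy, but that buffer is usually larger than the data, so only the first Length bytes of it are valid.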
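
Here is a minimal sketch of that closing note. HeaderBytes and Build are hypothetical names and only two illustrative members are written. I use ToArray here rather than GetBuffer because ToArray returns exactly the bytes written, whereas the underlying buffer can contain unused capacity.

```csharp
using System;
using System.IO;

public static class HeaderBytes
{
  // Writes two illustrative members and returns exactly the bytes written.
  public static byte[] Build(byte tag, int recCount)
  {
    using (MemoryStream stm = new MemoryStream())
    using (BinaryWriter wr = new BinaryWriter(stm))
    {
      wr.Write(tag);        // 1 byte
      wr.Write(recCount);   // 4 bytes, always little-endian
      wr.Flush();           // make sure everything has reached the MemoryStream
      return stm.ToArray(); // copy of the first Length bytes only
    }
  }
}
```

For example, HeaderBytes.Build(3, 1) returns the five bytes 03 01 00 00 00.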


Conclusion

As with every design, it is a series of compromises that determines the ultimate choices we make. The above techniques provide a number of alternatives, ranging from simplicity of code to flexibility of implementation. I believe that the final option is the most purist and, from a reusability standpoint, the most effective solution.

Archive: Structure from Binary Data

Original Post Date: 8 Jan 2004

The question of moving stream data into a structure seems to be addressed every so often on the newsgroups. While I was building a front-end for a small packet sniffing application I was faced with these same questions. While I was aware of the standard approaches, I decided to investigate what the alternative options were. This article is a summary of the primary ways of mapping binary data to a structure that I looked at.

Defining the structure

The first thing that I did was to define the structure. For this example I am going to use the IP header. The correct definition of the structure depends on the approach taken to solve this problem.

Given that I have spent a considerable amount of time familiarizing myself with the intricacies of the P/Invoke or interoperability services, my first solution was to use System.Runtime.InteropServices to marshal the data into a structure. I promptly defined the following structure to handle the marshaled data.

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct IpHeader
{
  public byte VerLen;
  public byte TOS;
  public short TotalLength; 
  public short ID;
  public short Offset;
  public byte TTL;
  public byte Protocol;
  public short Checksum;
  public int SrcAddr;
  public int DestAddr;
}

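There are two things to note about the definition of the structure. First, it is defined as a sequential structure, which ensures that the order of the members is not altered by the CLR. Second, Pack = 1 ensures that no padding is inserted between the members, so their relative locations match the byte layout of the header exactly.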
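
To see what the Pack setting of the StructLayout attribute actually changes, the marshaled size of two otherwise identical structures can be compared. The type and method names below are hypothetical and exist only for this sketch.

```csharp
using System;
using System.Runtime.InteropServices;

// Pack = 1: members are laid out back to back with no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedSample
{
  public byte Flag;
  public int Value;
}

// Default packing: the int is aligned on a 4-byte boundary,
// so three padding bytes are inserted after Flag.
[StructLayout(LayoutKind.Sequential)]
public struct PaddedSample
{
  public byte Flag;
  public int Value;
}

public static class LayoutDemo
{
  public static int PackedSize() { return Marshal.SizeOf(typeof(PackedSample)); } // 5
  public static int PaddedSize() { return Marshal.SizeOf(typeof(PaddedSample)); } // 8
}
```

With the default packing, any structure that mixes member sizes in this way would no longer line up with the bytes of a header that has no padding between its fields.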

Now that the structure is defined, the next step is to populate the structure with data. In this case the data came from a call to the Receive method of the Socket class. To read the IP header the socket was created using SocketType.Raw. The Receive method fills a byte array with the data, and it is the first 20 bytes of this data that make up the IP header. Because the structure that I have defined and the layout of the data in these 20 bytes are a one-to-one match, in a language like C++ reading this data would be a relatively simple task requiring only a cast.

IpHeader *pHeader = (IpHeader*)packet;

With the structure of the IpHeader overlaid on the buffer, it is simply a matter of accessing the members of the structure. Unfortunately our data is stored in managed memory and using code like the above in C# would require entering an unsafe code block. The following piece of code demonstrates how this could be achieved in C#.

IpHeader iphdr;
unsafe
{
  fixed ( byte *pData = packet)
  {
    iphdr = *(IpHeader*)pData;
  }
}
//Use iphdr...

For the above code to compile successfully you will have to flip the switch to allow unsafe code to be compiled. This can be achieved using either the /unsafe switch for the command line compiler or in the Visual Studio .NET project properties under Configuration Properties>Build you can set the Allow Unsafe Code Blocks to true.

The code in the unsafe block uses the fixed keyword to pin the packet array in memory, ensuring that the garbage collector does not shift the memory around under our feet. Because the memory is fixed, it can be used much like a standard C/C++ pointer, allowing us to cast the block of memory to an IpHeader* and copy the structure it points to into iphdr.

Though this is a relatively simple piece of code, there are a few points to be noted about this solution. First, whenever we allow unsafe code blocks in a C# application we inhibit the CLR’s ability to verify the code. In this case, by fixing the data in memory and accessing it through a pointer, the CLR can no longer check the array bounds and our code is open to a buffer overrun, for example when packet holds fewer than 20 bytes. The second point of concern is that the garbage collector cannot move the fixed memory, which impacts the garbage collection process and the performance of the garbage collector if a collection is required while the buffer is pinned. And finally this solution, as far as I know, cannot be done in VB.NET, so it would have to be written in C# and the assembly referenced from VB.NET.

That brings me to the solution of using the marshaling provided by interop services. The clear advantage of this solution is that it can be applied in both C# and VB.NET and it does not require the use of unsafe code blocks therefore the code remains verifiable. Before I discuss this solution any further let me present a code snippet that demonstrates the solution.

IntPtr pIP = Marshal.AllocHGlobal( len );
Marshal.Copy( packet, 0, pIP, len );
iphdr = (IpHeader)Marshal.PtrToStructure( pIP, typeof(IpHeader) );
Marshal.FreeHGlobal( pIP );

Now the moment I wrote this piece of code it felt like a knife stabbing into my back. The first thing that happens is that a block of memory is allocated on the unmanaged heap. Since the memory is allocated from the unmanaged heap it does not impact the garbage collector in any way. The problem is that the packet is stored in the managed heap, so we first need to copy the data from the buffer in the managed heap to the buffer in the unmanaged heap using Marshal.Copy. Once the data resides in the unmanaged heap we can use Marshal.PtrToStructure to marshal the data back to managed memory in the form of our IpHeader structure. And finally we must free the unmanaged memory that we allocated.

This really gave me a sour taste in my mouth. Every packet that I picked up needed to be shifted from the managed heap to the unmanaged heap and then be marshaled back to the managed heap. Are there words to describe this? Clearly I was not satisfied with this solution.

The options investigated so far have required somehow side-stepping the way the CLR would normally go about its business, introducing risks such as buffer overruns or even memory leaks if we fail to free the unmanaged memory we have allocated.

So is there a completely managed solution to the problem? Well, yes there is, and at first this might seem like the long way around in comparison to the options we have explored so far. However, after I present the code I will demonstrate why in this particular case it was actually less code than the previous solutions. This solution makes use of the BinaryReader to read the data directly into the structure’s members.

System.IO.MemoryStream stm = new System.IO.MemoryStream( packet, 0, len );
System.IO.BinaryReader rdr = new System.IO.BinaryReader( stm );
iphdr.VerLen = rdr.ReadByte();
iphdr.TOS = rdr.ReadByte();
iphdr.TotalLength = rdr.ReadInt16();
iphdr.ID = rdr.ReadInt16();
iphdr.Offset = rdr.ReadInt16();
iphdr.TTL = rdr.ReadByte();
iphdr.Protocol = rdr.ReadByte();
iphdr.Checksum = rdr.ReadInt16();
iphdr.SrcAddr = rdr.ReadInt32();
iphdr.DestAddr = rdr.ReadInt32();

This solution has its own set of advantages and disadvantages. Firstly, the structure definition can lose the StructLayoutAttribute since the physical layout of the structure is no longer important. The primary advantage of this code, in my opinion, is that we are working with code that is 100% managed and there are no cute tricks that could end up tripping us up, although there are other aspects to this solution that could. Another advantage is particular to the case of reading network data, which I will address shortly.

The code creates a MemoryStream on the existing packet buffer and a BinaryReader is created to read the data from the MemoryStream. Then byte for byte (and now and then an int or short) the data is read into an instance of the structure.

As I said earlier, the next advantage is specific to the requirements of this particular case. As you know, when reading data from a socket the multi-byte header fields are in network byte order, or big-endian form. This byte ordering is the opposite of what Intel processors expect, which is little-endian form. Since the packet’s data is big-endian, we are required to convert each Int16 and Int32 (short and int) to little-endian form. I use IPAddress.NetworkToHostOrder to do the byte swapping for me. Since this is required regardless of which of the above solutions was chosen, it would mean calling the function on each non-byte member of the struct. Using the last solution we investigated, this can be accomplished in one step while filling the structure.

iphdr.VerLen = rdr.ReadByte();
iphdr.TOS = rdr.ReadByte();
iphdr.TotalLength = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.ID = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.Offset = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.TTL = rdr.ReadByte();
iphdr.Protocol = rdr.ReadByte();
iphdr.Checksum = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
iphdr.SrcAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
iphdr.DestAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
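
To make the byte swapping concrete, here is a small worked sketch for a single 16-bit field. ByteOrderDemo and ReadNetworkInt16 are hypothetical names, and the sketch assumes a little-endian host such as an Intel machine. BinaryReader always reads little-endian regardless of the host, so on a big-endian host, where NetworkToHostOrder does nothing, this combination would not give the right answer.

```csharp
using System;
using System.IO;
using System.Net;

public static class ByteOrderDemo
{
  // Reads a 16-bit big-endian (network order) value from the start of the buffer.
  public static short ReadNetworkInt16(byte[] buffer)
  {
    using (BinaryReader rdr = new BinaryReader(new MemoryStream(buffer)))
    {
      // BinaryReader reads little-endian: the bytes 0x01, 0x02 come back as 0x0201.
      short raw = rdr.ReadInt16();

      // On a little-endian host this swaps the two bytes, giving 0x0102.
      return IPAddress.NetworkToHostOrder(raw);
    }
  }
}
```

One thing to keep in mind is that the members are signed, so a 16-bit field such as the checksum with its top bit set will come back as a negative short.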

The most obvious concern using this solution is the possibility of misaligned reads, e.g. reading a Byte where you should have read an Int. One mistake of this kind and all the data after that point is misaligned and makes no sense. This solution also requires that you maintain both the structure and the code that populates the structure; if the structure changes in some way, the code to read the structure needs to be updated. And even with medium-sized structures you end up doing a lot of byte counting and checking that every read is performed in the correct order, which is very error-prone.

I have found that including a constructor and/or a static method provides an elegant means of hiding the actual method used to perform the de-serialization. The following pieces of code demonstrate the two possible ways of creating a de-serialized packet using either a static method or constructor.

IpHeader iphdr = IpHeader.FromPacket( packet, len );

Or

IpHeader iphdr = new IpHeader( packet, len );

Typically I implement both, and the static method just creates an instance of the structure using the appropriate constructor.
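
A minimal sketch of how the constructor and static method pair might look, using the BinaryReader approach for the de-serialization. The member names follow the IpHeader structure defined earlier. The body is only one possible implementation and assumes a little-endian host and a header without options.

```csharp
using System;
using System.IO;
using System.Net;

public struct IpHeader
{
  public byte VerLen;
  public byte TOS;
  public short TotalLength;
  public short ID;
  public short Offset;
  public byte TTL;
  public byte Protocol;
  public short Checksum;
  public int SrcAddr;
  public int DestAddr;

  // The constructor hides the actual de-serialization from the caller.
  public IpHeader(byte[] packet, int len)
  {
    using (BinaryReader rdr = new BinaryReader(new MemoryStream(packet, 0, len)))
    {
      VerLen = rdr.ReadByte();
      TOS = rdr.ReadByte();
      TotalLength = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
      ID = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
      Offset = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
      TTL = rdr.ReadByte();
      Protocol = rdr.ReadByte();
      Checksum = IPAddress.NetworkToHostOrder(rdr.ReadInt16());
      SrcAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
      DestAddr = IPAddress.NetworkToHostOrder(rdr.ReadInt32());
    }
  }

  // The static method simply forwards to the constructor.
  public static IpHeader FromPacket(byte[] packet, int len)
  {
    return new IpHeader(packet, len);
  }
}
```

If the packet is shorter than the 20 bytes the constructor reads, the BinaryReader throws an EndOfStreamException rather than reading past the end of the buffer, which is one of the nicer side effects of staying in managed code.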

Conclusion

Though the above solutions are not the only solutions, most other solutions are some form of one of these. Alternatively you could use managed C++ and get the best of both worlds at the expense of verifiable code. The presented solutions are applicable whenever you want to read binary data into a structured form, regardless of the source of the data, be it a network byte stream or data from a disk file.

Selecting an appropriate solution is dependent on your requirements and constraints; personally I tend towards using unsafe code blocks especially if the structures are large (call me lazy) or the final solution of reading the data byte for byte. I just can’t feel good about moving perfectly good memory all over the show.

Saturday, November 7, 2009

Archive – Debugging WindowsIdentity and IsInRole

This is one of those invaluable little utility functions. You never need it until you face a problem trying to determine why IsInRole is returning an unexpected value when you run your application in a production environment rather than on your development machine. Using this function you can quickly determine the list of roles or groups that IsInRole is matching against.

Framework 2.0 and Later

public string[] GetWindowsIdentityRoles(WindowsIdentity identity)
{
  if (identity == null) throw new ArgumentNullException("identity");

  IdentityReferenceCollection groups = identity.Groups.Translate(typeof(NTAccount));
  string[] roles = new string[groups.Count];
  for (int i = 0; i < groups.Count; ++i)
  {
    roles[i] = groups[i].Value;
  }

  return roles;
}


Framework 1.0/1.1 (For that legacy code)


public static string[] GetWindowsIdentityRoles( WindowsIdentity identity )
{
  object result = typeof(WindowsIdentity).InvokeMember( "_GetRoles",
    BindingFlags.Static | BindingFlags.InvokeMethod | BindingFlags.NonPublic,
    null, identity, new object[]{identity.Token}, null );

  return (string[])result;
}