Archive for the ‘stackoverflow’ Tag

Searching for Exceptions in .NET

I recently came across a rather interesting question on StackOverflow that posed the problem of discovering all the exceptions that a given method might throw under every circumstance.

Of course, in the great majority of situations, the XML documentation for the BCL (and ideally any third-party libraries too) should provide information about any exception that might be thrown and any potential reason for it. Indeed, this is generally all one needs to write largely error-safe code. However, not every exception is documented, and for production-quality applications it is often desirable to ensure that there is no realistic chance of an unhandled exception occurring. For this reason, it is sometimes worth doing a rigorous check for all exceptions. Clearly, an application-level unhandled (fatal) exception handler would do the job to some extent, and although this is always a good fallback to have, it is the least elegant way of coping with exceptions.

After some consideration, it became quite apparent that the general task is at least as hard as the halting problem: deciding whether a particular throw statement is ever reached means deciding how arbitrary code behaves, so no exact solution exists. However, with a few simplifications, the problem does become relatively solvable. Most importantly, complex logic that determines whether an exception will be thrown must be ignored, and one must simply assume that any throw statement within a given method could possibly cause an exception under certain conditions.
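To illustrate that simplification in isolation, here is a minimal, self-contained sketch that uses only System.Reflection. It is deliberately cruder than a full analysis: it walks the raw IL of a single method and reports every Exception-derived type constructed with newobj, without following calls or tracking the stack. The names ThrowScanSketch, FindConstructedExceptions and Demo are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;

public static class ThrowScanSketch
{
    // Opcode value -> OpCode, built once by reflecting over the OpCodes class.
    private static readonly Dictionary<short, OpCode> OpCodeByValue =
        typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static)
            .Where(f => f.FieldType == typeof(OpCode))
            .Select(f => (OpCode)f.GetValue(null))
            .ToDictionary(op => op.Value);

    // Returns every Exception-derived type that the method body constructs with newobj.
    // Conservative assumption: anything constructed might be thrown.
    public static HashSet<Type> FindConstructedExceptions(MethodBase method)
    {
        var found = new HashSet<Type>();
        var body = method.GetMethodBody();
        if (body == null)
            return found; // abstract, extern or runtime-implemented method

        byte[] il = body.GetILAsByteArray();
        Type[] typeArgs = method.DeclaringType != null
            ? method.DeclaringType.GetGenericArguments() : null;
        Type[] methodArgs = method.IsGenericMethod ? method.GetGenericArguments() : null;

        int pos = 0;
        while (pos < il.Length)
        {
            // Two-byte opcodes are introduced by the 0xFE prefix byte.
            int raw = il[pos++];
            if (raw == 0xFE)
                raw = 0xFE00 | il[pos++];
            OpCode op = OpCodeByValue[unchecked((short)raw)];

            if (op == OpCodes.Newobj)
            {
                // The operand is a 4-byte metadata token (little-endian, as on x86/ARM).
                int token = BitConverter.ToInt32(il, pos);
                var ctor = method.Module.ResolveMethod(token, typeArgs, methodArgs);
                if (typeof(Exception).IsAssignableFrom(ctor.DeclaringType))
                    found.Add(ctor.DeclaringType);
            }

            pos += OperandSize(op, il, pos);
        }
        return found;
    }

    // Number of operand bytes that follow the opcode.
    private static int OperandSize(OpCode op, byte[] il, int pos)
    {
        switch (op.OperandType)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            case OperandType.InlineSwitch: return 4 + 4 * BitConverter.ToInt32(il, pos);
            default: return 4; // tokens, 32-bit targets, int32 and float32 operands
        }
    }

    // Hypothetical sample method to scan.
    public static int Demo(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (s.Length == 0) throw new InvalidOperationException("empty");
        return s.Length;
    }
}
```

Scanning Demo with `ThrowScanSketch.FindConstructedExceptions(typeof(ThrowScanSketch).GetMethod("Demo"))` should report ArgumentNullException and InvalidOperationException. The obvious limitations are that it also counts exceptions that are constructed but never thrown, and that it misses anything thrown by callees or obtained from fields and return values.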

Here is the complete code for the algorithm I wrote. The GetAllExceptions method is an extension method on MethodBase that returns a read-only collection of exception types, which makes it very straightforward to use.

Notably, the code detects all of

  • instantiated exceptions,
  • exceptions returned from method/property calls,
  • exceptions stored in fields (though if the method return type or field type is non-specific, i.e. a parent class of the actual exception type thrown, that declared type is reported instead).

Exceptions are only counted when the appropriate throw instruction is encountered at some level. Also, the stack and local variables are handled correctly, as far as I can tell, so this method should work soundly in pretty much all cases. (It has been tested with a few quite complex methods within the BCL, as well as simpler user-defined ones.)

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;
using System.Text;
using ClrTest.Reflection;

public static class ExceptionAnalyser
{
    public static ReadOnlyCollection<Type> GetAllExceptions(this MethodBase method)
    {
        var exceptionTypes = new HashSet<Type>();
        var visitedMethods = new HashSet<MethodBase>();
        var localVars = new Type[ushort.MaxValue];
        var stack = new Stack<Type>();
        GetAllExceptions(method, exceptionTypes, visitedMethods, localVars, stack, 0);

        return exceptionTypes.ToList().AsReadOnly();
    }

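    // Recursive worker: walks the IL of 'method', simulating the evaluation stack and
    // local variables by type only, and records the type on top of the stack at each throw.
    public static void GetAllExceptions(MethodBase method, HashSet<Type> exceptionTypes,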
        HashSet<MethodBase> visitedMethods, Type[] localVars, Stack<Type> stack, int depth)
    {
        var ilReader = new ILReader(method);
        var allInstructions = ilReader.ToArray();

        ILInstruction instruction;
        for (int i = 0; i < allInstructions.Length; i++)
        {
            instruction = allInstructions[i];

            if (instruction is InlineMethodInstruction)
            {
                var methodInstruction = (InlineMethodInstruction)instruction;

                if (!visitedMethods.Contains(methodInstruction.Method))
                {
                    visitedMethods.Add(methodInstruction.Method);
                    GetAllExceptions(methodInstruction.Method, exceptionTypes, visitedMethods,
                        localVars, stack, depth + 1);
                }

                var curMethod = methodInstruction.Method;
                if (curMethod is ConstructorInfo)
                    stack.Push(((ConstructorInfo)curMethod).DeclaringType);
                else if (curMethod is MethodInfo)
                    stack.Push(((MethodInfo)curMethod).ReturnParameter.ParameterType);
            }
            else if (instruction is InlineFieldInstruction)
            {
                var fieldInstruction = (InlineFieldInstruction)instruction;
                stack.Push(fieldInstruction.Field.FieldType);
            }
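            else if (instruction is ShortInlineBrTargetInstruction)
            {
                // Branches are deliberately skipped: control flow is not followed, and any
                // stack effect of a conditional branch is ignored.
            }
            else if (instruction is InlineBrTargetInstruction)
            {
                // As above, for the long-form branch instructions.
            }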
            else
            {
                switch (instruction.OpCode.Value)
                {
                    // ld*
                    case 0x06:
                        stack.Push(localVars[0]);
                        break;
                    case 0x07:
                        stack.Push(localVars[1]);
                        break;
                    case 0x08:
                        stack.Push(localVars[2]);
                        break;
                    case 0x09:
                        stack.Push(localVars[3]);
                        break;
                    case 0x11:
                        {
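                            // ldloc.s: the local's index is this instruction's operand, not the
                            // next opcode (assumes ILReader's ShortInlineVarInstruction.Ordinal).
                            var index = ((ShortInlineVarInstruction)instruction).Ordinal;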
                            stack.Push(localVars[index]);
                            break;
                        }
                    // st*
                    case 0x0A:
                        localVars[0] = stack.Pop();
                        break;
                    case 0x0B:
                        localVars[1] = stack.Pop();
                        break;
                    case 0x0C:
                        localVars[2] = stack.Pop();
                        break;
                    case 0x0D:
                        localVars[3] = stack.Pop();
                        break;
                    case 0x13:
                        {
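                            // stloc.s: the local's index is this instruction's operand, not the
                            // next opcode (assumes ILReader's ShortInlineVarInstruction.Ordinal).
                            var index = ((ShortInlineVarInstruction)instruction).Ordinal;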
                            localVars[index] = stack.Pop();
                            break;
                        }
                    // throw
                    case 0x7A:
                        if (stack.Peek() == null)
                            break;

                        exceptionTypes.Add(stack.Pop());
                        break;
                    default:
                        switch (instruction.OpCode.StackBehaviourPop)
                        {
                            case StackBehaviour.Pop0:
                                break;
                            case StackBehaviour.Pop1:
                            case StackBehaviour.Popi:
                            case StackBehaviour.Popref:
                            case StackBehaviour.Varpop:
                                stack.Pop();
                                break;
                            case StackBehaviour.Pop1_pop1:
                            case StackBehaviour.Popi_pop1:
                            case StackBehaviour.Popi_popi:
                            case StackBehaviour.Popi_popi8:
                            case StackBehaviour.Popi_popr4:
                            case StackBehaviour.Popi_popr8:
                            case StackBehaviour.Popref_pop1:
                            case StackBehaviour.Popref_popi:
                                stack.Pop();
                                stack.Pop();
                                break;
                            case StackBehaviour.Popref_popi_pop1:
                            case StackBehaviour.Popref_popi_popi:
                            case StackBehaviour.Popref_popi_popi8:
                            case StackBehaviour.Popref_popi_popr4:
                            case StackBehaviour.Popref_popi_popr8:
                            case StackBehaviour.Popref_popi_popref:
                                stack.Pop();
                                stack.Pop();
                                stack.Pop();
                                break;
                        }

                        switch (instruction.OpCode.StackBehaviourPush)
                        {
                            case StackBehaviour.Push0:
                                break;
                            case StackBehaviour.Push1:
                            case StackBehaviour.Pushi:
                            case StackBehaviour.Pushi8:
                            case StackBehaviour.Pushr4:
                            case StackBehaviour.Pushr8:
                            case StackBehaviour.Pushref:
                            case StackBehaviour.Varpush:
                                stack.Push(null);
                                break;
                            case StackBehaviour.Push1_push1:
                                stack.Push(null);
                                stack.Push(null);
                                break;
                        }

                        break;
                }
            }
        }
    }
}

To be quite honest, I’m not sure whether I’ll need to use this code myself at any point, but I’ve posted it regardless for the benefit of anyone who might require such rigorous exception checking. It was definitely an interesting challenge, at the least.

Any further comments or suggestions would be welcome, as always.

Code Golf: Evaluating Mathematical Expressions

Yesterday I happened to stumble across a code golf question and for no particular reason (except for perhaps boredom) decided to create my own problem and to post it on StackOverflow for the community to reply with their solutions. It actually turned out to be much more popular than I might have anticipated.

A quick definition of code golf for those who are unaware of this enormous (though really quite enjoyable) time sink:

The objective of code golf is simply to write a program/function that solves a given problem using the fewest possible number of characters. This usually involves clever tricks related to the problem and whatever language you use, followed by heavy obfuscation.

Here is the problem specification, copied from my StackOverflow post:

Write a function that takes a single argument that is a string representation of a simple mathematical expression and evaluates it as a floating point value. A “simple expression” may include any of the following: positive or negative decimal numbers, +, -, *, /, (, ). Expressions use (normal) infix notation. Operators should be evaluated in the order they appear, i.e. not as in BODMAS, though brackets should be correctly observed, of course. The function should return the correct result for any possible expression of this form. However, the function does not have to handle malformed expressions (i.e. ones with bad syntax).

Examples of expressions:

1 + 3 / -8                            = -0.5       (No BODMAS)
2*3*4*5+99                            = 219
4 * (9 - 4) / (2 * 6 - 2) + 8         = 10
1 + ((123 * 3 - 69) / 100)            = 4
2.45/8.5*9.27+(5*0.0023)              = 2.68...
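To make the left-to-right rule concrete, here is an ungolfed sketch of the specification as a small recursive-descent evaluator. It is only an illustration, not one of the submitted answers: the names LeftToRightEvaluator and Evaluate are made up, it uses double rather than float, and like the spec it assumes well-formed input.

```csharp
using System;
using System.Globalization;

public static class LeftToRightEvaluator
{
    public static double Evaluate(string expression)
    {
        int pos = 0;
        return ParseSequence(expression, ref pos);
    }

    // Parses "operand (operator operand)*" and applies each operator as soon as
    // its right-hand operand is known, so there is no operator precedence.
    private static double ParseSequence(string s, ref int pos)
    {
        double acc = ParseOperand(s, ref pos);
        while (true)
        {
            SkipSpaces(s, ref pos);
            if (pos >= s.Length || s[pos] == ')')
                return acc; // end of input or end of a bracketed group

            char op = s[pos++];
            double rhs = ParseOperand(s, ref pos);
            switch (op)
            {
                case '+': acc += rhs; break;
                case '-': acc -= rhs; break;
                case '*': acc *= rhs; break;
                case '/': acc /= rhs; break;
                default: throw new FormatException("Unexpected character: " + op);
            }
        }
    }

    // An operand is either a bracketed sub-expression or a (possibly negative) number.
    private static double ParseOperand(string s, ref int pos)
    {
        SkipSpaces(s, ref pos);
        if (s[pos] == '(')
        {
            pos++; // consume '('
            double inner = ParseSequence(s, ref pos);
            pos++; // consume ')'
            return inner;
        }

        int start = pos;
        if (s[pos] == '-')
            pos++; // leading minus sign belongs to the number
        while (pos < s.Length && (char.IsDigit(s[pos]) || s[pos] == '.'))
            pos++;
        return double.Parse(s.Substring(start, pos - start), CultureInfo.InvariantCulture);
    }

    private static void SkipSpaces(string s, ref int pos)
    {
        while (pos < s.Length && s[pos] == ' ')
            pos++;
    }
}
```

Calling `LeftToRightEvaluator.Evaluate("1 + 3 / -8")` gives -0.5, matching the first example above. A minus sign directly before a bracket is not handled, since the spec only asks for negative numbers.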

Now, my own solution isn’t particularly astounding, but I did get it down to 403 characters, and have since cut off a few more (though haven’t bothered to re-obfuscate it). It is in fact my first proper attempt at code golf, so I don’t consider it too bad.

Here it is, in all its obfuscated ugliness:

float e(string x){float v=0;if(float.TryParse(x,out v))return v;x+=';';int t=0;char o,s='?',p='+';float n=0;int l=0;for(int i=0;i<x.Length;i++){o=s;if(
x[i]!=' '){s=x[i];if(char.IsDigit(x[i])|s=='.'|(s=='-'&o!='1'))s='1';if(s==')')
l--;if(s!=o&l==0){if(o=='1'|o==')'){n=e(x.Substring(t,i-t));if(p=='+')v+=n;
if(p=='-')v-=n;if(p=='*')v*=n;if(p=='/')v/=n;p=x[i];}t=i;if(s=='(')t++;}
if(s=='(')l++;}}return v;}

And in a rather more readable form (identical in behaviour):

float Eval(string expr)
{
    float val = 0;
    if (float.TryParse(expr, out val))
        return val;
    expr += ';';
    int tokenStart = 0;
    char oldState, state = '?', op = '+';
    float num = 0;
    int level = 0;
    for (int i = 0; i < expr.Length; i++)
    {
        oldState = state;
        if (expr[i] != ' ')
        {
            state = expr[i];
            if (char.IsDigit(expr[i]) || state == '.' ||
                (state == '-' && oldState != '1'))
                state = '1';
            if (state == ')')
                level--;
            if (state != oldState && level == 0)
            {
                if (oldState == '1' || oldState == ')')
                {
                    num = Eval(expr.Substring(tokenStart, i - tokenStart));
                    if (op == '+') val += num;
                    if (op == '-') val -= num;
                    if (op == '*') val *= num;
                    if (op == '/') val /= num;
                    op = expr[i];
                }
                tokenStart = i;
                if (state == '(')
                    tokenStart++;
            }
            if (state == '(')
                level++;
        }
    }
    return val;
}

The current leading solution is one written in Haskell (a mere 226 chars), with another in Python (237 chars) taking second place. This hardly surprises me – the functional and dynamic languages almost inevitably have more succinct syntax, besides generally being known to be more suitable for creating parsers. (If I hadn’t specified the absence of the BODMAS rules, I would surely have seen a solution containing little more than an “eval” statement!) Interestingly, the top two have both managed to avoid using regex altogether (though other solutions have used it with some success). In my opinion, it’s worth reading through the question to see how the various languages compare at performing the same task.

Please feel free to reply to the StackOverflow question or this post if you have a unique solution (in any language) that you’d like to share.

Update

I ended up spending just a bit longer on this task, since having seen some of the other solutions, it became pretty clear that I could get the char count down a good deal more. With the help of regex, my new solution stands at 294 characters. That in fact seems to be the winner amongst the set of solutions in C-style languages, so I’m quite pleased. (I have now promised myself not to entertain myself any longer with this game, however.)

Here it is in a (relatively) clear form, in case anyone’s interested. (It assumes the System.Text.RegularExpressions namespace has been imported.)

float e(string x)
{
    while (x.Contains("("))
        x = Regex.Replace(x, @"\(([^\(]*?)\)", m => e(m.Groups[1].Value).ToString());

    float r = 0;
    foreach (Match m in Regex.Matches("+" + x, @"\D ?-?[\d.]+"))
    {
        var o = m.Value[0];
        var v = float.Parse(m.Value.Substring(1));
        r = o == '+' ? r + v : o == '-' ? r - v : o == '*' ? r * v : r / v;
    }
    return r;
}
http://stackoverflow.com/questions/928563/code-golf-evaluating-mathematical-expressions/944716#944716

Numerical Analysis for .NET

During my ongoing work on a computational project for university, I recently discovered the need to perform some serious numerical analysis from my C# code. Unfortunately, I must admit that the .NET world only now seems to be catching up in terms of the free and open source libraries it offers for various tasks, and initially I was disheartened to find that there seemed to be nothing available for doing calculations on large (sparse) matrices. After a fair deal of searching, only a couple of somewhat incomplete and no longer maintained matrix libraries turned up. Being an avid user of StackOverflow, however, I decided that if anyone was aware of a library that could do what I needed, I would most likely find them there.

The result was much better than I had even hoped. dnAnalytics is a general-purpose package for numerical analysis in .NET that does almost everything I might possibly ask of it – and, from my first impressions, does it very well indeed. This wonderful find is a well-maintained, fully open-source library with great API documentation (not a wholly unexpected thing, but surprisingly uncommon among so many open source projects). There are several features that stand out as particularly impressive. One undoubtedly is the set of I/O classes for Matlab and delimited files (among other formats). What is more, the library seems to offer both a fully managed version and one that wraps the Intel® Math Kernel Library. I’m not sure how the performance compares between the two (I haven’t yet tried the latter), but it is surely nice to have the pair of options available, much as the .NET BCL offers alternative implementations of its cryptographic algorithms: a) a fully managed version, b) a version built on top of the Windows Crypto API, and c) a version that uses CNG (Cryptography API: Next Generation), introduced with Vista. Perhaps what appeals to me most about this library is that the developers have clearly gone to an effort to make it user-friendly, not only with regard to the documentation, but also by adding an interface friendly to F# coders (likely to be a language of choice for future mathematical/scientific programming), and even visual debuggers for Visual Studio (possibly the only library I’ve seen to date that includes them).

My particular usage of the library requires me to use the linear algebra (specifically, sparse matrix) classes. Although I must point out that the specific algorithm that I was intending to employ for the project was not available (see my later discussion), it did include a host of other ones, primarily focusing on direct and iterative matrix decomposition, which would appear to be quite handy in many circumstances. I haven’t yet had a chance to play with the other areas of the library, but I have noticed that it offers some statistical functions and methods as well as a number of modern pseudo-RNG algorithms such as the Mersenne Twister.

To conclude, I should come back to the point that the most important part of the analysis I require was not (at least directly) contained in the library – finding the eigenvalues or eigendecomposition of large (1000s of rows/columns) matrices, which happens to be in relation to spectral theory, in case you’re curious. Even so, this being such a complex field and one fraught with difficulties when it comes to implementation (numerical instability is a huge problem), I was not surprised to find that an implementation of the Arnoldi or Lanczos algorithm was not present. Fortunately, after a bit more searching around (by this point I knew specifically what I was looking for), I came across the ARPACK library, written in the archaic Fortran 77 language. It did however seem to be exactly what I was looking for: a set of fast routines to find the eigenvalues of large (either dense or sparse) matrices of various types. After only a small amount of pain messing about with MinGW, I managed to get the code compiled nicely into a DLL. At this point, I am of course perfectly able just to use the P/Invoke capabilities of .NET and do some hackery to integrate the ARPACK stuff with my existing code and dnAnalytics. Yet, I am also inclined to do this whole task properly and write a managed wrapper for ARPACK that conforms tightly with dnAnalytics. I could then perhaps submit these wrapper types (along with a few unit tests?) as a repository patch to the dnAnalytics team in the hope that they’ll take it and add it to the next release. As with most other projects at this time, I will have to see what time permits, though I would certainly hope to contribute something substantial to what truly is a terrific project that I would love to see expand further.