Archive for the ‘f#’ Tag

Numerical Analysis for .NET

During my ongoing work on a computational project for university, I recently discovered the need to perform some serious numerical analysis from my C# code.  Unfortunately, I must admit that the .NET world only now seems to be catching up in terms of the free and open source libraries it offers for various tasks, and initially I was disheartened to find that there seemed to be nothing available for doing calculations on large (sparse) matrices. After a fair deal of searching, only a couple of somewhat incomplete and no longer maintained matrix libraries turned up. Being an avid user of StackOverflow, however, I decided that if anyone was aware of some library that could do what I needed, I would most likely find them there.

The result was much better than I had even hoped for. dnAnalytics is a general-purpose package for numerical analysis in .NET that does almost everything I could ask for and, from my first impressions, does it very well indeed. This wonderful find is a well-maintained, fully open-source library with great API documentation (not a wholly unexpected thing, but surprisingly uncommon among so many open source projects). Several features stand out as particularly impressive. One is undoubtedly the set of I/O classes for Matlab and delimited files (among other formats). What is more, the library seems to offer both a fully managed version and one that wraps the Intel® Math Kernel Library. I’m not sure how the performance compares between the two (I haven’t yet tried the latter), but it is surely nice to have the pair of options available, much as the .NET BCL offers alternative implementations of its cryptographic algorithms: a) a fully managed version, b) a version built on top of the Windows Crypto API, and c) a version that uses the CNG (Next Generation) API introduced with Vista. Perhaps what appeals to me most about this library is that the developers have clearly made an effort to make it user-friendly, not only with regard to the documentation, but also by adding an interface friendly to F# coders (likely to be a language of choice for future mathematical/scientific programming), and even visual debuggers for Visual Studio (possibly the only library I’ve seen to date that includes them).

My particular usage of the library requires the linear algebra (specifically, sparse matrix) classes. Although the specific algorithm that I was intending to employ for the project was not available (see my discussion below), the library does include a host of others, primarily focusing on direct and iterative matrix decomposition, which would appear to be quite handy in many circumstances. I haven’t yet had a chance to play with the other areas of the library, but I have noticed that it offers some statistical functions and methods as well as a number of modern pseudo-RNG algorithms such as the Mersenne Twister.

To conclude, I should come back to the point that the most important part of the analysis I require was not (at least directly) contained in the library: finding the eigenvalues or eigendecomposition of large (1000s of rows/columns) matrices, which relates to spectral theory, in case you’re curious. Even so, this being such a complex field and one fraught with implementation difficulties (numerical instability is a huge problem), I was not surprised to find that an implementation of the Arnoldi or Lanczos algorithm was not present. Fortunately, after a bit more searching around (by this point I knew specifically what I was looking for), I came across the ARPACK library, written in the archaic Fortran 77 language. It did, however, seem to be exactly what I was looking for: a set of fast routines to find the eigenvalues of large (either dense or sparse) matrices of various types. After only a small amount of pain messing about with MinGW, I managed to get the code compiled nicely into a DLL. At this point, I could of course just use the P/Invoke capabilities of .NET and do some hackery to integrate the ARPACK stuff with my existing code and dnAnalytics. Yet I am also inclined to do this whole task properly and write a managed wrapper for ARPACK that conforms closely to dnAnalytics. I could then perhaps submit these wrapper types (along with a few unit tests?) as a patch to the dnAnalytics team in the hope that they’ll take it and add it to the next release. As with most other projects at this time, I will have to see what time permits, though I would certainly hope to contribute something substantial to what truly is a terrific project that I would love to see expand further.
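
For anyone wondering what an iterative eigenvalue routine actually does, here is a minimal sketch of power iteration, the simplest relative of the Arnoldi and Lanczos methods. It is written in Python purely for brevity, it does not use dnAnalytics, and it is not ARPACK's algorithm (ARPACK uses an implicitly restarted Arnoldi method, which is far more robust and converges much faster). The list-of-dicts sparse format and the names sparse_matvec and dominant_eigenvalue are illustrative assumptions of this sketch. What it does show is that such methods only ever touch the matrix through matrix-vector products, which is why they suit large sparse matrices.

```python
import math
import random


def sparse_matvec(rows, x):
    """Multiply a sparse matrix by a dense vector.

    rows[i] is a dict mapping column index j to the non-zero entry a_ij
    (a hypothetical storage format chosen only for this sketch).
    """
    return [sum(a * x[j] for j, a in row.items()) for row in rows]


def dominant_eigenvalue(rows, max_iterations=1000, tol=1e-12, seed=0):
    """Estimate the largest-magnitude eigenvalue of a symmetric matrix."""
    rng = random.Random(seed)  # Python's generator is a Mersenne Twister
    x = [rng.uniform(-1.0, 1.0) for _ in rows]
    norm = math.sqrt(sum(v * v for v in x))
    x = [v / norm for v in x]

    estimate = 0.0
    for _ in range(max_iterations):
        y = sparse_matvec(rows, x)
        # Rayleigh quotient x.(Ax); x has unit length at this point.
        new_estimate = sum(a * b for a, b in zip(x, y))
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            return 0.0, x
        x = [v / norm for v in y]
        # Stop once successive estimates agree to within tol.
        if abs(new_estimate - estimate) < tol:
            return new_estimate, x
        estimate = new_estimate
    return estimate, x


# 6x6 tridiagonal test matrix: 2 on the diagonal, -1 on either side.
n = 6
laplacian = [
    {j: (2.0 if j == i else -1.0) for j in (i - 1, i, i + 1) if 0 <= j < n}
    for i in range(n)
]
eigenvalue, eigenvector = dominant_eigenvalue(laplacian)
print(eigenvalue)  # close to 2 + 2*cos(pi/7), roughly 3.8019
```

Power iteration slows to a crawl when the two largest eigenvalues are close together, which is typical of matrices with thousands of rows. That is a large part of why a Krylov-subspace library such as ARPACK is worth the trouble of wrapping.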

Learning F#

Being a long-time C# coder, I finally decided to take a break from imperative programming and try something new. To be honest, I’ve quite often reached the point with my projects where I wonder: couldn’t this be written so much more concisely and elegantly in another way? This thought occurs especially frequently in the context of mathematical and scientific coding. In response, David has suggested (on more than one occasion) that I try F# (specifically this functional language, as he knew he couldn’t get me totally away from .NET yet). In case you’re not aware of it, F# is Microsoft’s attempt at bringing functional programming to .NET, and it has proven very successful so far.

It turned out that F# was surprisingly straightforward to learn, even for a language still in the beta stage (albeit the end thereof). Three or four days of regular coding gave me a pretty good idea of how it can be used for all sorts of purposes. Most likely my familiarity with lambda functions and LINQ in .NET 3.5/C# 3.0 made the task a lot easier. Even so, Microsoft have put together a number of helpful resources/links for getting started. The online documentation is especially useful given the (current) lack of XML intellisense comments inside Visual Studio. The Microsoft Research F# page and the F# Developer Center should be your first stops when learning the language. In addition, there is a pretty active community at hubFS and a host of blogs dedicated to F# out there. It seems like only a matter of time before there’s a multitude of forums and an IRC channel. I get the feeling that due to the ability to use the .NET framework from a functional language (and the fact that Microsoft will soon be making it a first-class language alongside C# and VB.NET), F# could achieve unprecedented popularity.

After reading a few of the guides and beginner tutorials, I came across Project Euler, a series of mathematical/programming challenges that are particularly suited to the functional programming style. This website was really all I needed to get a solid grasp of the language, and was pretty fun aside from that. After trying some of the problems I finally had to admit that functional languages have their place in the programming world alongside imperative ones such as C# and C++. If you’re interested in some of my solutions to the first 23 problems, you can download the zipped project here. (Most solutions are under 5 lines in length, not including helper functions or input data.) I’ve had fair success optimising most of the algorithms, though in one or two cases you’ll find clearly more efficient solutions elsewhere. However, they’re not all terribly well commented, so feel free to question me about any of the algorithms.
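
To give a flavour of the style, here is a sketch of the first two Project Euler problems written as short pipelines of generators and filters. It is in Python rather than F#, it is not taken from the zipped project, and the function names are simply illustrative choices. The shape is what matters: describe a sequence, filter it, and fold it into a result, with no mutable loop counters in the solution itself.

```python
from itertools import takewhile


def multiples_sum(limit):
    """Problem 1: sum of the natural numbers below limit divisible by 3 or 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def fibonacci():
    """Lazily yield the Fibonacci sequence 1, 2, 3, 5, 8, ..."""
    a, b = 1, 2
    while True:
        yield a
        a, b = b, a + b


def even_fibonacci_sum(limit):
    """Problem 2: sum of the even Fibonacci terms that do not exceed limit."""
    terms = takewhile(lambda f: f <= limit, fibonacci())
    return sum(f for f in terms if f % 2 == 0)


print(multiples_sum(1000))            # 233168
print(even_fibonacci_sum(4_000_000))  # 4613732
```

In F# the same idea would typically be written with the pipeline operator and the Seq functions, which is where the under-five-lines figure comes from.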

Unfortunately I have quickly been forced back into the realm of imperative languages due to my project commitments (posts to come soon). Do however expect some upcoming posts and projects to feature F#, or at the least functional techniques.