Archive for the ‘c# 3.0’ Tag
Circular Buffer for .NET
This is a quick announcement that I’ve released my code for an implementation of a generic circular buffer for .NET (written in C# 3.0). The release also includes an implementation of a circular stream, which is a wrapper over a circular buffer of bytes. You can view the project and download the source code over at CodePlex, where I’ve licensed it under the MS-PL, which should hopefully be fairly unrestrictive.
If you’re wondering where this idea came from, I recently came across an interesting use for a circular buffer, and upon finding out that the .NET BCL contained nothing along the lines of a circular buffer (or stream), I decided to implement one myself. I’ve attempted to do everything properly of course (i.e. in the style of the BCL data structures), so hopefully it should be immediately useful to anyone familiar with the concept.
More about this to come later.
LINQ to YAML
LINQ to XML is one of the many technologies introduced with the .NET Framework 3.5, and one that is certainly a step forward in terms of usability. It allows querying in both the functional style (using LINQ and lambda expressions) and the more traditional imperative one, meaning that it’s a great tool for concisely working with XML data in any sort of application, and undoubtedly a significant improvement over the old XML DOM that resides in the System.Xml namespace.
In the spirit of LINQ, and with the advent of YAML, I recntly decided it was about time that this new “markup language” were integrated with LINQ. Surprisingly, there does not already exist anything akin to LINQ to YAML out there (though there are a couple of fairly usable implementations of a YAML reader/writer for .NET). This seemed to me like a good chance to potentially create something that might be used by more than the odd .NET developer or two. My plans are to implement a LINQ to YAML provider either from scratch or on top of one of the existing YAML libraries. (Which option I choose will depend on the state of the existing projects, which I haven’t yet investigated properly. I am however suspecting that it might be worthwhile writing my own, since it would a) teach me all the intricacies of YAML, and b) allow me to support the latest version [1.2], which the existing libraries do not.)
Before I launch into an overview of my intended implementation, here is a little bit about YAML itself, for those who aren’t already familiar with it. Although technically YAML isn’t a markup language (after all, the recursive acronym stands for YAML Ain’t Markup Language) – it is rather a serialisation format – it does essentially fulfill the the role that XML traditionally has, in a variety of common situations. I’m not going to try to sell the format to you right now, but it should suffice to say that you wouldn’t have reached this far in the post if you weren’t already at least intrigued! Without doubt, the format is actively gaining popularity because of it’s ultra-lightweight syntax and suitability for hand editing, perhaps the two points that summarise its advantages over XML.
Anyway, here’s a short example of a YAML document (taken straight from the Wikipedia page), so you can see precisely how pleasant it is to work with (at least for humans).
—
receipt: Oz-Ware Purchase Invoice
date: 2007-08-06
customer:
given: Dorothy
family: Gale
items:
- part_no: A4786
descrip: Water Bucket (Filled)
price: 1.47
quantity: 4
- part_no: E1628
descrip: High Heeled "Ruby" Slippers
price: 100.27
quantity: 1
bill-to: &id001
street: |
123 Tornado Alley
Suite 16
city: East Westville
state: KS
ship-to: *id001
specialDelivery: >
Follow the Yellow Brick
Road to the Emerald City.
Pay no attention to the
man behind the curtain.
...
Of course, the great thing about YAML, which is demonstrated clearly by this example, is that you don’t have to have any real knowledge about YAML to understand exactly and immediately what the data represents, and as a bonus it doesn’t hurt your eyes to stare at for too long! Even the referencing syntax should be fairly self evident. (&id00 and *id001 would surely be nothing new to C programmers.)
The semantics as well as the syntax of YAML obviously differ to those of XML greatly, although there is almost always some sort of correspondence between the features and possibilities that the two formats offer. The only notable missing feature when contrasted to XML is attributes, yet their usefulness is questionable anyway.
Right, so now I ought to explain a bit about how I actually plan to design this library. The basic framework will be virtually equivalent to that of LINQ to XML. In other words, the hierarchy will be largely based around an abstract YamlObject (YObject?) class, and will look very much like the one contained within System.Xml.Linq.
Though LINQ to YAML must of course accomodate for the unique nature of the format, I would initially aim for minimal difference and only significantly adjust the hierarchy when it is found to be necessary. Classes such as XCData and XDocumentType would not apply at all to YAML, yet there would need to be a place for a YReference or such somewhere in the hierarchy. The referencing aspect of YAML will likely prove to be one of the more interesting challenges; while YAML’s lists, maps (dictionaries), and combinations thereof would seem relatively straightforward with regards to emulation of the LINQ to XML design, references would introduce a substantially novel concept. Some sort of implementation of lazy evaluation followed by concrete referencing should be able to solve the problem, but there’s no way to predict how well this might work in practice at this moment.
What I realised only after deciding to create a LINQ to YAML library is that among LINQ providers, LINQ to XML is somewhat special in that the LINQ aspect of it is built on top of LINQ to Objects (i.e. LINQ using IEnumerable<T> objects), with only a relatively small number of extension methods specific to LINQ to XML. Indeed, most LINQ providers (LINQ to Objects and LINQ to SQL among others) require you to implement the IQueryable and IQueryProvider interfaces to provide complex logic for interpreting and returning the results of expressions, as well as evaluating complex expression trees. All this means that I can pretty much just design a DOM to a certain style (i.e. one suited to functional code, like LINQ to XML), and let LINQ to Objects to everything else for me.
As I can’t think of anything more worth mentioning about my project at this time, I shall leave any more specific and complex details to a future post. Still, do by all means feel free to query me about my plans – I would be glad to answer any questions, and even gladder to receive some suggestions as how you think I might design LINQ to YAML, or simply a nod that you might find this useful at some point. I don’t anticipate this project to be a very long one, though I must say that both my work and free-time schedule are likely to be fairly messed up for the next month or two, therefore I’m not going to promise when I’ll get around to my initial release. Whenever it so happens, I will duly post the link to the project page on Launchpad (or wherever I decide to host it).
Strongly-Typed CSV Reader in C#
As part of a project on which I’ve recently started working, I found it necessary to write a class that reads entries from CSV files. Such a simple format, you might think, so why would I bother sharing such trivial code? Indeed, it is a relatively short class, but I thought I’d post it here nonetheless, primarily because I believe its usage promotes a design practice of which I am particularly fond, and I suspect (hope) other people may appreciate as well. There are also a few bits of code that might be considered interesting (and unusual) from a language/design perspective.
When I decided to formalise the logic for reading from CSV files, I firstly thought it would be nice to write something in the spirit of .NET 3.5 – in this case, easily compatible with LINQ, fully generic (strongly-typed), and attribute-oriented (as seems to be the trend in APIs nowadays). Before I launch into any further discussion, here’s the code for the class in full.
using System; using System.ComponentModel; using System.Collections.Generic; using System.IO; using System.Linq; using System.Reflection; using System.Text; namespace NetworkAnalyser { public class CsvReader<TEntry> : IDisposable where TEntry : struct { private StreamReader streamReader; private FieldTypeInfo[] fieldTypeInfos; private bool isDisposed = false; public CsvReader(string path) { streamReader = new StreamReader(path); Initialize(); } public CsvReader(Stream stream) { streamReader = new StreamReader(stream); Initialize(); } ~CsvReader() { Dispose(false); } public void Dispose() { Dispose(true); GC.SuppressFinalize(this); } protected virtual void Dispose(bool disposing) { if (!isDisposed) { if (disposing) { if (streamReader != null) streamReader.Dispose(); } } isDisposed = true; } public IEnumerable<TEntry> ReadAllEntries() { TEntry? entry; while ((entry = ReadEntry()).HasValue) yield return entry.Value; } public TEntry? ReadEntry() { var line = streamReader.ReadLine(); if (line == null) return null; var entry = new TEntry(); var fields = line.Split(new char[] { ',' }, StringSplitOptions.None); FieldTypeInfo fieldTypeInfo; object fieldValue; for (int i = 0; i < fields.Length; i++) { fieldTypeInfo = fieldTypeInfos[i]; fieldValue = fieldTypeInfo.TypeConverter.ConvertFromString(fields[i].Trim()); fieldTypeInfo.FieldInfo.SetValueDirect(__makeref(entry), fieldValue); } return entry; } private void Initialize() { var entryType = typeof(TEntry); fieldTypeInfos = (from fieldInfo in entryType.GetFields(BindingFlags.Instance | BindingFlags.Public) let fieldTypeConverterAttrib = fieldInfo.GetCustomAttributes( typeof(TypeConverterAttribute), true).SingleOrDefault() as TypeConverterAttribute let fieldTypeConverter = (fieldTypeConverterAttrib == null) ? null : Activator.CreateInstance(Type.GetType( fieldTypeConverterAttrib.ConverterTypeName)) as TypeConverter select new FieldTypeInfo() { FieldInfo = fieldInfo, TypeConverter = fieldTypeConverter ?? TypeDescriptor.GetConverter(fieldInfo.FieldType) }).ToArray(); } private struct FieldTypeInfo { public FieldInfo FieldInfo; public TypeConverter TypeConverter; } } }
(Please excuse the utter lack of comments in the code. Most of it is self-explanatory, but admittedly some parts are probably not. I put it together pretty quickly, but I may get around to commenting it some time soon. Some basic error handling might also be nice.)
At this point it may seem rather excessive just to read data from a CSV file, but I hope you’ll agree that it’s worthwhile once you see an example of typical usage.
The first step is to define a structure (struct) that holds each entry in memory. Here we’re going to define one that holds some basic information about a programming language.
public struct LanguageEntry { public string Name; public string[] Paradigms; public string LatestVersion; [TypeConverter(typeof(CustomDateTimeConverter))] public DateTime InitialRelease; [TypeConverter(typeof(CustomDateTimeConverter))] public DateTime LatestRelease; public float Popularity; }
The TypeConverter attributes are completely optional, and are only required when you’re reading some fields that have unusual formats and whose values you would like to convert to something simpler/more accessible (e.g. a string “Jun2002″ to a DateTime object in this case). For any field of a type recognisable by the default type converter, you don’t need to bother, as is shown for the double type. (This actually applies to a very large range of types within the BCL, including System.Drawing.Color, which can be specified in any format that you might use in the propeprty editor of Visual Studio, such as “DarkRed”.)
Finally, here’s a snippet to show how you might actually use the CsvReader<TEntry> class to read from a CSV file. This example reads all entries from the languages.csv file and prints out to the console the names of all functional languages.
using (var languagesReader = new CsvReader<LanguageEntry>("language.csv")) { var languages = from lang in languagesReader.ReadAllEntries() where lang.Paradigms.Contains("Functional") select lang; foreach (var lang in languages) Console.WriteLine(lang.Name); }
Hopefully that’s now convinced you that this is the right way to go about reading data entries from files. What this class provides is completely strongly-typed I/O (reading in this case, though it wouldn’t be very hard to create a similar CsvWriter class), and a declarative manner to defining entry types (or records, to use database termninology).
I’m not going to delve too deeply into the implementation of the class, but I think it’s worth highlighting a few specifics. Going back to the code for the class, the first thing to notice is the Initialize method – this is where much of the interesting stuff is happening. To summarise: it loops over all the public fields of the type specified by TEntry, gets the default type converter for the type of each field (or the one given by TypeConverterAttribute, if it exists), and then stores the FieldInfo along with the TypeConverter in a simple struct. The only other noteworthy point is the call to SetValueDirect in the ReadEntry method. This uses a keyword that’s almost wholly unknown (and undocumented!) to C# developers by the name of __makeref (there are other related ones by the names of __reftype and __refvalue) – I was certainly unaware of it before today. The problem that I initially encountered was one of using the SetValue method, which works perfectly well on classes, but presents a unique problem with structs: namely, because they are value-types, and the obj parameter is of type object, the argument must be boxed (wrapped into a reference type) and placed on the heap rather than the stack, meaning that the heap-based copy gets altered, and not the one you passed to the method (which is on the stack)! What the __makeref keyword does is create a TypeReference that directly references the stack-based object and thus allows SetValueDirect to set the field accordingly.
That’s enough explanation, I think. If you still aren’t sure about how it works precisely, then feel free to comment on this post. I’d also be quite happy to hear what anyone thinks of the general design and implementation, too.
Leave a Comment
Leave a Comment.jpg)
Comments (1)