One of our internal tools eschews XML or JSON configuration files in favour of something more human readable: YAML, using YamlDotNet. For the most part, serialising and deserialising YAML documents to and from .NET objects is as straightforward as using libraries such as Json.NET, but when I was working on some basic serialisation I ran into a few issues.

A demonstration program showing the basics of YAML serialisation

Setting the scene

For this demonstration project, I'm going to use a pair of basic classes.

csharp
internal sealed class ContentCategoryCollection : Collection<ContentCategory>
{
  private ContentCategory _parent;

  public ContentCategory Parent
  {
    get { return _parent; }
    set
    {
      _parent = value;

      foreach (ContentCategory item in this)
      {
        item.Parent = value;
      }
    }
  }

  protected override void InsertItem(int index, ContentCategory item)
  {
    item.Parent = _parent;

    base.InsertItem(index, item);
  }
}

internal sealed class ContentCategory
{
  private ContentCategoryCollection _categories;

  private StringCollection _topics;

  [Browsable(false)]
  public ContentCategoryCollection Categories
  {
    get { return _categories ?? (_categories = new ContentCategoryCollection { Parent = this }); }
    set { _categories = value; }
  }

  [Browsable(false)]
  [DesignerSerializationVisibility(DesignerSerializationVisibility.Hidden)]
  [DefaultValue(false)]
  public bool HasCategories
  {
    get { return _categories != null && _categories.Count != 0; }
  }

  [Browsable(false)]
  [DesignerSerializationVisibility(DesignerSerializationVisibility.Hidden)]
  [DefaultValue(false)]
  public bool HasTopics
  {
    get { return _topics != null && _topics.Count != 0; }
  }

  public string Name { get; set; }

  [Browsable(false)]
  public ContentCategory Parent { get; set; }

  public string Title { get; set; }

  [Browsable(false)]
  public StringCollection Topics
  {
    get { return _topics ?? (_topics = new StringCollection()); }
    set { _topics = value; }
  }
}

The classes are fairly simple, but they do offer some small challenges for serialisation:

  • Read-only properties
  • Parent references
  • Special values - child collections that are only initialised when they are accessed and should be ignored if null or empty

Basic serialisation

Using YamlDotNet, you can serialise an object graph quite simply.

csharp
Serializer serializer;
string yaml;

serializer = new SerializerBuilder().Build();

yaml = serializer.Serialize(_categories);

Basic deserialisation

Deserialising a YAML document into a .NET object is also quite straightforward.

csharp
Deserializer deserializer;

deserializer = new DeserializerBuilder().Build();

using (Stream stream = File.OpenRead(fileName))
{
  using (TextReader reader = new StreamReader(stream))
  {
    _categories = deserializer.Deserialize<ContentCategoryCollection>(reader);
  }
}
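If you just want to check that a type survives a round trip, you don't need a file at all. The following is a minimal sketch, assuming a hypothetical Page class and RoundTripDemo helper that are not part of the sample project; it serialises to a string and reads that string straight back. I've used var for the serialiser and deserialiser because the exact types returned by Build may differ between YamlDotNet releases.

```csharp
using YamlDotNet.Serialization;

// Hypothetical class used only for this example
public sealed class Page
{
  public string Name { get; set; }

  public string Title { get; set; }
}

public static class RoundTripDemo
{
  // Serialises the source object to YAML text, then deserialises that text into a new object
  public static Page RoundTrip(Page source)
  {
    var serializer = new SerializerBuilder().Build();
    var deserializer = new DeserializerBuilder().Build();

    string yaml = serializer.Serialize(source);

    return deserializer.Deserialize<Page>(yaml);
  }
}
```

The object returned by RoundTripDemo.RoundTrip should carry the same Name and Title values as the one passed in.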

Serialisation shortcomings

The following is an example of the YAML produced by the above classes with default serialisation:

yaml
- Categories: []
  HasTopics: true
  Name: intro
  Title: Introducing  {{ applicationname }}
  Topics:
  - whatis.md
  - licenseagreement.md
- &o0
  Categories:
  - Categories: []
    Name: userinterface
    Parent: *o0
    Title: User Interface
    Topics: []
  HasCategories: true
  Name: gettingstarted
  Title: Getting Started
  Topics: []
- Categories: []
  Name: blank
  Title: Blank
  Topics: []

For a format that is "human friendly", this is quite verbose, with a lot of extra clutter: the serialisation has included the read-only properties (which will then cause a crash on deserialisation), and our create-on-demand collections are being created and serialised as empty values. It is also slightly alien when you consider the alias references. While those are undeniably cool (especially as YamlDotNet will recreate the references), the nesting of the properties already implies the relationships, so the aliases are superfluous in this case.

It's also worth pointing out that the order of the serialised values matches the ordering in the code file - I always format my code files to order members alphabetically, so the properties are also serialised alphabetically.

You can also see that, for the most part, the HasCategories and HasTopics properties were not serialised - although YamlDotNet is ignoring the BrowsableAttribute, it is processing the DefaultValueAttribute and skipping values which are considered default, which is another nice feature.

Resolving some issues

Similar to Json.NET, you can decorate your classes with attributes to help control serialisation, and so we'll investigate these first to see if they can resolve our problems simply and easily.

Excluding read-only properties

The YamlIgnoreAttribute class can be used to force certain properties to be skipped, so applying this attribute to properties with only getters is a good idea.

csharp
[YamlIgnore]
public bool HasCategories
{
  get { return _categories != null && _categories.Count != 0; }
}
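To see the effect in isolation, here is a minimal sketch using a hypothetical Label class and YamlIgnoreDemo helper (neither is part of the sample project). The read-only HasName property is decorated with YamlIgnore, so only Name should appear in the output.

```csharp
using YamlDotNet.Serialization;

// Hypothetical class used only for this example
public sealed class Label
{
  public string Name { get; set; }

  // Computed, read-only value: excluded from the YAML output
  [YamlIgnore]
  public bool HasName
  {
    get { return !string.IsNullOrEmpty(this.Name); }
  }
}

public static class YamlIgnoreDemo
{
  // Serialises a Label with default settings and returns the YAML text
  public static string Serialize(string name)
  {
    var serializer = new SerializerBuilder().Build();

    return serializer.Serialize(new Label { Name = name });
  }
}
```

The text returned by YamlIgnoreDemo.Serialize("intro") should contain a Name entry and no mention of HasName.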

Changing serialisation order

We can control the order in which YamlDotNet serialises using the YamlMemberAttribute. This attribute has various options, but for the time being I'm just looking at ordering - I'll revisit this attribute in the next post.

csharp
[YamlMember(Order = 1)]
public string Name { get; set; }

If you specify this attribute on one property to set an order you'll most likely need to set it on all.
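As a minimal sketch of the ordering behaviour, the hypothetical OrderedItem class below declares Title before Name but assigns Name the lower Order value, so Name should be written first. The class and the YamlMemberOrderDemo helper are illustrative only.

```csharp
using YamlDotNet.Serialization;

// Hypothetical class used only for this example
public sealed class OrderedItem
{
  // Declared first, but given the higher order so it is written second
  [YamlMember(Order = 2)]
  public string Title { get; set; }

  // Declared second, but given the lower order so it is written first
  [YamlMember(Order = 1)]
  public string Name { get; set; }
}

public static class YamlMemberOrderDemo
{
  // Serialises an OrderedItem with default settings and returns the YAML text
  public static string Serialize(string name, string title)
  {
    var serializer = new SerializerBuilder().Build();

    return serializer.Serialize(new OrderedItem { Name = name, Title = title });
  }
}
```

Both properties carry an explicit Order here, which sidesteps the question of where un-attributed properties end up.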

Processing the collection properties

Unfortunately, while I could make use of the YamlIgnore and YamlMember attributes to control some of the serialisation, it wouldn't stop the empty collection nodes from being created and then serialised, which I didn't want. I suppose I could finally work out how to make DefaultValue apply to collection classes effectively, but then there wouldn't be much point in this article!

Due to this requirement, I'm going to need to write some custom serialisation code - enter the IYamlTypeConverter interface.
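As an aside, YamlDotNet releases later than the one this article was written against offer a builder option that, as far as I'm aware, covers the empty collection case without custom code: ConfigureDefaultValuesHandling with the DefaultValuesHandling.OmitEmptyCollections flag. Treat the availability of that option in your installed version as an assumption to verify. The sketch below uses a hypothetical Chapter class and OmitEmptyDemo helper.

```csharp
using System.Collections.Generic;
using YamlDotNet.Serialization;

// Hypothetical class used only for this example
public sealed class Chapter
{
  public string Name { get; set; }

  // Always allocated, frequently empty
  public List<string> Topics { get; } = new List<string>();
}

public static class OmitEmptyDemo
{
  // Assumes a YamlDotNet version that provides DefaultValuesHandling.OmitEmptyCollections
  public static string Serialize(Chapter chapter)
  {
    var serializer = new SerializerBuilder()
                     .ConfigureDefaultValuesHandling(DefaultValuesHandling.OmitNull | DefaultValuesHandling.OmitEmptyCollections)
                     .Build();

    return serializer.Serialize(chapter);
  }
}
```

A custom converter is still the more flexible route, as it also gives full control over what is written and in which order.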

Creating a custom converter

To create a custom converter for use with YamlDotNet, we start by creating a new class and implementing IYamlTypeConverter.

csharp
internal sealed class ContentCategoryYamlTypeConverter : IYamlTypeConverter
{
  public bool Accepts(Type type)
  {
  }

  public object ReadYaml(IParser parser, Type type)
  {
  }

  public void WriteYaml(IEmitter emitter, object value, Type type)
  {
  }
}

The first thing to do is specify which types our class can handle via the Accepts method.

csharp
private static readonly Type _contentCategoryNodeType = typeof(ContentCategory);

public bool Accepts(Type type)
{
  return type == _contentCategoryNodeType;
}

In this case, we only care about our ContentCategory class so I return true for this type and false for anything else.

Next, it's time to write the YAML content via the WriteYaml method.

The documentation for YamlDotNet is a little lacking and I didn't find the serialisation support to be particularly intuitive, so the code I'm presenting below is what worked for me, but there may be better ways of doing it.

First we need to get the value to serialise, which is supplied via the value and type parameters. In my example I can ignore type, as I'm only supporting the one type.

csharp
public void WriteYaml(IEmitter emitter, object value, Type type)
{
  ContentCategory node;

  node = (ContentCategory)value;
}

The IEmitter interface (accessed via the emitter parameter) is similar in principle to Json.NET's JsonTextWriter class, except it is less developer friendly. Rather than having a number of Write* methods or overloads similar to BCL serialisation classes, it has a single Emit method which takes in a variety of objects.

Writing property value maps

To create our dictionary map, we start by emitting a MappingStart object. Of course, if you have a start you need an end so we'll close by emitting MappingEnd.

csharp
emitter.Emit(new MappingStart(null, null, false, MappingStyle.Block));

// rest of serialisation code

emitter.Emit(new MappingEnd());

YAML supports block and flow styles. Block is essentially one value per line, while flow is a more condensed, comma-separated style. Block is much more readable for complex objects, but flow is probably more valuable for short lists of simple values.
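To compare the two styles side by side, here is a minimal sketch that drives an Emitter directly, outside of any serialiser. The SequenceStyleDemo helper is hypothetical, and the event constructor calls use the same overloads as the rest of this article, so they may need adjusting for other YamlDotNet versions. When using an emitter on its own, the stream and document events also have to be emitted by hand.

```csharp
using System.IO;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

public static class SequenceStyleDemo
{
  // Writes the values as a single top-level YAML sequence in the requested style
  public static string Emit(SequenceStyle style, params string[] values)
  {
    using (StringWriter writer = new StringWriter())
    {
      IEmitter emitter = new Emitter(writer);

      // A serialiser normally emits these wrapper events for us
      emitter.Emit(new StreamStart());
      emitter.Emit(new DocumentStart());

      emitter.Emit(new SequenceStart(null, null, false, style));

      foreach (string value in values)
      {
        emitter.Emit(new Scalar(null, value));
      }

      emitter.Emit(new SequenceEnd());

      emitter.Emit(new DocumentEnd(true));
      emitter.Emit(new StreamEnd());

      return writer.ToString();
    }
  }
}
```

Calling SequenceStyleDemo.Emit with SequenceStyle.Block should produce one "- value" line per entry, while SequenceStyle.Flow should produce a single bracketed, comma-separated line.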

Next we need to write our key value pairs, which we do by emitting pairs of Scalar objects.

csharp
if (node.Name != null)
{
  emitter.Emit(new Scalar(null, "Name"));
  emitter.Emit(new Scalar(null, node.Name));
}

if (node.Title != null)
{
  emitter.Emit(new Scalar(null, "Title"));
  emitter.Emit(new Scalar(null, node.Title));
}

Although the YAML specification allows for null values, attempting to emit a Scalar with a null value seems to destabilise the emitter, and it will promptly crash on subsequent calls to Emit. For this reason, in the code above I wrap each pair in a null check. (Not to mention that if the value is null, there is probably no need to serialise anything anyway.)
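With more than a couple of properties, the repeated null checks get tedious. One option is to fold the check into a small helper; the sketch below uses a hypothetical EmitPair extension method (not part of YamlDotNet) and a PairDemo class that writes a single mapping through a standalone Emitter so the result can be inspected.

```csharp
using System.IO;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

internal static class EmitterExtensions
{
  // Hypothetical helper: writes a key/value pair, or nothing at all if the value is null
  public static void EmitPair(this IEmitter emitter, string key, string value)
  {
    if (value != null)
    {
      emitter.Emit(new Scalar(null, key));
      emitter.Emit(new Scalar(null, value));
    }
  }
}

internal static class PairDemo
{
  // Builds a one-mapping YAML document containing only the non-null values
  public static string Build(string name, string title)
  {
    using (StringWriter writer = new StringWriter())
    {
      IEmitter emitter = new Emitter(writer);

      emitter.Emit(new StreamStart());
      emitter.Emit(new DocumentStart());
      emitter.Emit(new MappingStart(null, null, false, MappingStyle.Block));

      emitter.EmitPair("Name", name);
      emitter.EmitPair("Title", title);

      emitter.Emit(new MappingEnd());
      emitter.Emit(new DocumentEnd(true));
      emitter.Emit(new StreamEnd());

      return writer.ToString();
    }
  }
}
```

Inside the converter, the two if blocks above would then reduce to emitter.EmitPair("Name", node.Name) and emitter.EmitPair("Title", node.Title).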

Writing lists

With the basic properties serialised, we can now turn to our child collections.

This time, after writing a single Scalar with the property name, instead of writing another Scalar we use the SequenceStart and SequenceEnd classes to tell YamlDotNet we're going to serialise a list of values.

For our Topics property, the values are simple strings so we can just emit a Scalar for each entry in the list.

csharp
if (node.HasTopics)
{
  this.WriteTopics(emitter, node);
}

private void WriteTopics(IEmitter emitter, ContentCategory node)
{
  emitter.Emit(new Scalar(null, "Topics"));
  emitter.Emit(new SequenceStart(null, null, false, SequenceStyle.Block));

  foreach (string child in node.Topics)
  {
    emitter.Emit(new Scalar(null, child));
  }

  emitter.Emit(new SequenceEnd());
}

As the Categories property returns a collection of ContentCategory objects, we can simply start a new list as we did for topics and then recursively call WriteYaml to write each child category object in the list.

csharp
if (node.HasCategories)
{
  this.WriteChildren(emitter, node);
}

private void WriteChildren(IEmitter emitter, ContentCategory node)
{
  emitter.Emit(new Scalar(null, "Categories"));
  emitter.Emit(new SequenceStart(null, null, false, SequenceStyle.Block));

  foreach (ContentCategory child in node.Categories)
  {
    this.WriteYaml(emitter, child, _contentCategoryNodeType);
  }

  emitter.Emit(new SequenceEnd());
}

Deserialisation

In this article, I'm only covering custom serialisation. However, the beauty of this code is that it doesn't generate different YAML from default serialisation; it only excludes values that it knows are defaults or that can't be read back, and provides custom ordering of values. This means you can use the basic deserialisation code presented at the start of this article and it will just work, as demonstrated by the sample program accompanying this post.

For this reason, for the time being the ReadYaml method of our custom type converter simply throws an exception instead of actually doing anything.

csharp
public object ReadYaml(IParser parser, Type type)
{
  throw new NotImplementedException();
}

Using the custom type converter

Now we have a functioning type converter, we need to tell YamlDotNet about it.

At the start of the article, I showed how you create a SerializerBuilder object and call its Build method to get a configured Serializer class. By calling the builder object's WithTypeConverter method, we can enable the use of our custom converter.

csharp
Serializer serializer;
string yaml;

serializer = new SerializerBuilder()
                 .WithTypeConverter(new ContentCategoryYamlTypeConverter())
                 .Build();

yaml = serializer.Serialize(_categories);
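For a compact end-to-end view, the following sketch puts the pieces together for a hypothetical, cut-down Section class rather than the full ContentCategory. It assumes the three-method IYamlTypeConverter interface and the event constructors shown in this article; other YamlDotNet versions may use different signatures. The names Section, SectionYamlTypeConverter and ConverterDemo are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

// Hypothetical, cut-down stand-in for ContentCategory
public sealed class Section
{
  public string Name { get; set; }

  public List<string> Topics { get; } = new List<string>();
}

public sealed class SectionYamlTypeConverter : IYamlTypeConverter
{
  public bool Accepts(Type type)
  {
    return type == typeof(Section);
  }

  public object ReadYaml(IParser parser, Type type)
  {
    // Reading is not handled by this converter, mirroring the approach in the article
    throw new NotImplementedException();
  }

  public void WriteYaml(IEmitter emitter, object value, Type type)
  {
    Section section = (Section)value;

    emitter.Emit(new MappingStart(null, null, false, MappingStyle.Block));

    if (section.Name != null)
    {
      emitter.Emit(new Scalar(null, "Name"));
      emitter.Emit(new Scalar(null, section.Name));
    }

    if (section.Topics.Count != 0) // skip empty lists entirely
    {
      emitter.Emit(new Scalar(null, "Topics"));
      emitter.Emit(new SequenceStart(null, null, false, SequenceStyle.Block));

      foreach (string topic in section.Topics)
      {
        emitter.Emit(new Scalar(null, topic));
      }

      emitter.Emit(new SequenceEnd());
    }

    emitter.Emit(new MappingEnd());
  }
}

public static class ConverterDemo
{
  // Serialises the list using the custom converter for each Section
  public static string Serialize(List<Section> sections)
  {
    var serializer = new SerializerBuilder()
                     .WithTypeConverter(new SectionYamlTypeConverter())
                     .Build();

    return serializer.Serialize(sections);
  }
}
```

Serialising a list that holds one Section with topics and one without should yield a Topics node for the first entry only.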

See the attached demonstration program for a fully working sample.

Update History

  • 2017-04-01 - First published
  • 2020-11-22 - Updated formatting
