XmlBuilder for .NET

An alternative XML creation API

by
last update: January 18th 2008

Another XML API? Are you serious?

First and foremost, I'd like to say that I'm not proposing that we ditch the existing ways of writing XML with .Net code. The existing classes work just fine and have no reason to be avoided.

The API I'll introduce here is just the result of a developer playing with the new .Net 3.5 features (more specifically C# 3.0) and trying to learn how to leverage them in his own libraries.

I'll also warn you that the XmlBuilder library presented in this article works best in C# or any other .Net language that supports multiline lambda functions (VB is not one of them).


The problem with the current APIs

The new version of Visual Basic came out with greater support and a nifty syntax for XML data. For the rest of us, who prefer to write code in C#, writing XML is not that hard, but it involves too many steps for my taste. If it were not for the "xml" in the class names, it might take you a few moments to realize that the code was actually producing XML.

Consider the following simple XML document.

<?xml version="1.0" encoding="utf-8"?>
<children>
    <!--Children below...-->
    <child age="1" referenceNumber="ref-1">child &amp; content #1</child>
    <child age="2" referenceNumber="ref-2">child &amp; content #2</child>
    <child age="3" referenceNumber="ref-3">child &amp; content #3</child>
    <child age="4" referenceNumber="ref-4">child &amp; content #4</child>
    <child age="5" referenceNumber="ref-5">child &amp; content #5</child>
    <child age="6" referenceNumber="ref-6">child &amp; content #6</child>
    <child age="7" referenceNumber="ref-7">child &amp; content #7</child>
    <child age="8" referenceNumber="ref-8">child &amp; content #8</child>
    <child age="9" referenceNumber="ref-9">child &amp; content #9</child>
</children>

Typical C# code using the DOM API would look like this.

XmlDocument xml = new XmlDocument();
XmlElement root = xml.CreateElement("children");
xml.AppendChild(root);

XmlComment comment = xml.CreateComment("Children below...");
root.AppendChild(comment);

for(int i = 1; i < 10; i++)
{
	XmlElement child = xml.CreateElement("child");
	child.SetAttribute("age", i.ToString());
	child.SetAttribute("referenceNumber", "ref-" + i);
	child.InnerText = "child & content #" + i;
	root.AppendChild(child);
}

string s = xml.OuterXml;

Well, that didn't hurt. But remember that one of the reasons we may find the DOM easy to work with is sheer osmosis. This DOM thing is everywhere. No matter how verbose and laborious we may think it is, we just got used to it.

The same XML can be created with the XmlTextWriter as follows.

StringWriter sw = new StringWriter();
XmlTextWriter wr = new XmlTextWriter(sw);

wr.WriteStartDocument();
wr.WriteStartElement("children");
wr.WriteComment("Children below...");

for(int i=1; i<10; i++)
{
	wr.WriteStartElement("child");
	wr.WriteAttributeString("age", i.ToString());
	wr.WriteAttributeString("referenceNumber", "ref-" + i);
	wr.WriteString("child & content #" + i);
	wr.WriteEndElement();
}

wr.WriteEndElement();
wr.WriteEndDocument();


wr.Flush();
wr.Close();
string s = sw.ToString();

Uhmmm. No, thank you. Although this performs better than the DOM approach, I definitely don't see myself creating XML this way. There are just too many things that could go wrong. You could forget one of those WriteEndXXXXX calls and you're toast. This thing actually makes the DOM look good.
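
To make that failure mode concrete, here is a small sketch in which the WriteEndElement call inside the loop is left out. The class and method names are made up for illustration. Nothing throws: the children simply end up nested inside each other, because XmlTextWriter.Close quietly closes whatever is still open.

```csharp
using System.IO;
using System.Xml;

// Hypothetical demo class, not part of any library.
public static class ForgottenEndElementDemo
{
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter wr = new XmlTextWriter(sw);

        wr.WriteStartElement("children");

        for (int i = 1; i <= 2; i++)
        {
            wr.WriteStartElement("child");
            wr.WriteString("content #" + i);
            // Oops: wr.WriteEndElement() should have been called here.
        }

        // This closes the innermost <child>, not <children>.
        wr.WriteEndElement();

        // Close() automatically closes the elements that are still open.
        wr.Close();
        return sw.ToString();
    }
}
```

Calling ForgottenEndElementDemo.Build() yields a second child element sitting inside the first one instead of next to it, and the compiler has no way of warning you about it.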

Nice to meet you XmlBuilder

Maybe it's just me, but I find the above two styles of creating XML way too brittle. It's so easy to forget a required step or to get lost trying to find the right method or object to use.

Take a moment to inspect the following code. Put yourself in the shoes of a developer that is trying to write XML for the first time and needs to choose an XML API.

string s = XmlBuilder.Build(xml =>
{
	xml.Root(children =>
	{
		children.Comment("Children below...");

		for(int i = 1; i < 10; i++)
		{
			children.Element(child =>
			{
				child["age"] = i.ToString();
				child["referenceNumber"] = "ref-" + i;
				child.AppendText("child & content #" + i);
			});
		}
	});
});
			

Did you notice how the code structure maps nicely to the XML document structure? See how there's no way for you to forget one of those AppendChild calls from the DOM or WriteEndElement from the XmlTextWriter?

I particularly like the way the attributes are defined using the indexer syntax. Do you see how I chose to format the lambdas so that they look like C# control blocks? Placing the opening brace of the lambda in the next line created this indented block of code that defines some form of context. The context in this case is "inside this block I'll be building one XML element. When the block ends, the element ends."

Other features

Did I mention that the XML produced by all the above snippets is not nicely formatted? It all comes out as a single, glorious line of XML. Do you know how to make the DOM produce pretty XML with indentation? Don't bother looking. Let me show how the XmlBuilder enables that.

string s = XmlBuilder.Build(xml =>
{
	xml.Indentation = 4;
	xml.Formatting = Formatting.Indented;

	xml.Root(children =>
	{
		// ... same as before
	});
});
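
For comparison, the DOM can be coaxed into producing indented output too, although the route is not obvious from the XmlDocument class itself. What follows is a minimal sketch, assuming the standard XmlWriterSettings approach; the class and method names are illustrative only.

```csharp
using System.IO;
using System.Xml;

// Hypothetical demo class, not part of XmlBuilder.
public static class DomIndentDemo
{
    public static string Build()
    {
        XmlDocument xml = new XmlDocument();
        XmlElement root = xml.CreateElement("children");
        xml.AppendChild(root);

        XmlElement child = xml.CreateElement("child");
        child.SetAttribute("age", "1");
        child.InnerText = "child & content #1";
        root.AppendChild(child);

        // The formatting options live on the writer, not on the document.
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.IndentChars = "    ";
        settings.OmitXmlDeclaration = true;

        StringWriter sw = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(sw, settings))
        {
            xml.Save(writer);
        }
        return sw.ToString();
    }
}
```

It works, but it takes a settings object, a writer and a Save call to get what the two property assignments above do.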
			

Using the lambda parameter name as the created element's name does not always work. The naming rules for C# variables are not the same as those for XML elements. To circumvent this problem you can specify the desired element name.

string s = XmlBuilder.Build(xml =>
{
	xml.Root("my-root", myRoot =>
	{
		myRoot.Element("test-item", item =>
		{
			item.AppendText("Text here");
		});
	});
});
			

Which produces

<?xml version="1.0" encoding="utf-8"?>
<my-root>
    <test-item>Text here</test-item>
</my-root>
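
To see why the parameter name cannot always double as the element name, here is a quick check built on XmlConvert.VerifyNCName, a standard .NET method. The wrapper class is just for illustration. It shows that names such as test-item are perfectly legal in XML even though they could never be C# identifiers.

```csharp
using System.Xml;

// Hypothetical helper class, not part of XmlBuilder.
public static class NameRulesDemo
{
    // Returns true when the (non-empty) text is a legal XML local name.
    public static bool IsValidXmlName(string name)
    {
        try
        {
            XmlConvert.VerifyNCName(name);
            return true;
        }
        catch (XmlException)
        {
            // VerifyNCName throws XmlException for illegal names.
            return false;
        }
    }
}
```

With this helper, IsValidXmlName("test-item") and IsValidXmlName("my-root") return true, while IsValidXmlName("my root") returns false. None of the three could be written as a lambda parameter name.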
			

You don't need to produce a string every time. You can also pass in a Stream or even a file name. You can also specify the desired text encoding.

XmlBuilder.Build("file.xml", Encoding.UTF8, xml =>
{
	// ... same as before
});
			

Another interesting example would be if we were writing some ASP.NET code that returns XML to the browser.

XmlBuilder.Build(Response.OutputStream, Response.ContentEncoding, xml =>
{
	// ... same as before
});
			

What is the trick?

As I said before, the purpose of this API is mostly to experiment with different ways of leveraging the new C# features in API design. In particular, we are playing with lambdas and finding uses for them beyond the obvious ones. I wouldn't be surprised if this design is considered an abuse of the syntax.

Take for example our implementation of the Element() method that we used in the samples.

public virtual void Element(Action<XmlElementBuilder> build)
{
	string name = build.Method.GetParameters()[0].Name;
	Element(name, new Dictionary<string, string>(), build);
}

public virtual void Element(string localName, Action<XmlElementBuilder> build)
{
	Element(localName, new Dictionary<string, string>(), build);
}

public virtual void Element(string localName, IDictionary<string, string> attributes, Action<XmlElementBuilder> build)
{
	XmlElementBuilder child = new XmlElementBuilder(localName, Writer);
	
	Writer.WriteStartElement(localName);
	child._tagStarted = true;

	foreach(var att in attributes)
		child[att.Key] = att.Value;

	build(child);// <-- element content is generated here
	Writer.WriteEndElement();
	_contentAdded = true;
}
			

You will notice that we adopted a new API design convention here: always leave the delegate parameter for last. That way we can conveniently format our code to look like a control block. All the above overloads just cascade the increasing number of parameters to the last overload, which finally calls the delegate.

Instead of referring to this action as calling or invoking the delegate, I like to think of it as executing the block.
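
If you want to try the trick in isolation, the sketch below boils it down to a toy builder on top of XmlWriter. It is not the real XmlBuilder source: MiniElement, MiniBuilder and MiniBuilderDemo are names I made up for this illustration, and the real classes do more bookkeeping. It only shows the two ideas at work, namely reading the lambda's parameter name through reflection and wrapping the block between a start and an end call.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical, stripped-down stand-in for XmlElementBuilder.
public class MiniElement
{
    private readonly XmlWriter _writer;

    public MiniElement(XmlWriter writer)
    {
        _writer = writer;
    }

    // Attributes must be set before any content is added to the element.
    public string this[string name]
    {
        set { _writer.WriteAttributeString(name, value); }
    }

    public void AppendText(string text)
    {
        _writer.WriteString(text);
    }

    // The element name is taken from the lambda's parameter name.
    public void Element(Action<MiniElement> build)
    {
        string name = build.Method.GetParameters()[0].Name;
        Element(name, build);
    }

    // The delegate comes last so the call site reads like a block.
    public void Element(string localName, Action<MiniElement> build)
    {
        _writer.WriteStartElement(localName);
        build(new MiniElement(_writer)); // "execute the block"
        _writer.WriteEndElement();       // the end call can't be forgotten
    }
}

public static class MiniBuilder
{
    public static string Build(Action<MiniElement> build)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        StringWriter sw = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(sw, settings))
        {
            build(new MiniElement(writer));
        }
        return sw.ToString();
    }
}

public static class MiniBuilderDemo
{
    public static string Run()
    {
        return MiniBuilder.Build(doc =>
        {
            doc.Element(children =>
            {
                children.Element(child =>
                {
                    child["age"] = "1";
                    child.AppendText("hi");
                });
            });
        });
    }
}
```

MiniBuilderDemo.Run() builds a children element holding a single child with an age attribute. One caveat I should be upfront about: the name lookup depends on the parameter names surviving in the compiled assembly, so a tool that renames them (an obfuscator, for instance) would change the XML you get.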

If you liked what you saw or just want to see more of how it was done, you can download the code.

Did you find anything wrong in the article? Please let me know and I'll try to fix it as soon as possible.