XML-to-C# Code Generation

Altova probably hates me. Not their products, but the company. I’ve often wanted to give their product line a fair trial, but I never have the time to evaluate it properly, so I’ve never been able to justify buying it or recommending it to my employer. My old user ID shows up in their tutorial videos alongside generic examples of hackers and spammers. For years, I’d reinstall the product to get past the 30-day trial, hoping I’d finally have time to really check their cool tools out. When they killed that ability, I tried doing it within a virtual machine. Now I can’t get a trial key in VMs anymore; perhaps my e-mail domain name is blocked.

But I often forget that there is no real need to invest in a third-party XML code generation tool like Altova’s XMLSpy or MapForce if all you need is a complete object model written in C# for working with a deserialized XML file. After spending hours Googling for XML-to-C# code generators, I realized that the solution was right under my nose. And I don’t have to spend a dime for it.

Why Generate?

You might be asking: why generate C# code at all? Don’t System.Xml.XmlDocument and its XPath support work well enough for whatever you need to do with an XML document? The answer is yes, sometimes. Sometimes Notepad.exe is sufficient to edit an .aspx file, too, but that doesn’t mean you should pass up a good ASP.NET IDE with code generation, like Visual Studio, in favor of Notepad.

In fact, I was happy with XmlDocument until I realized that some of the code I was tasked to maintain consisted of hundreds of lines that did nothing but read CDATA values into a business object’s own properties, like this:

XmlNode node = storyNode.SelectSingleNode("./title");
if (node != null && node.ChildNodes.Count > 0 && node.ChildNodes[0].Value != null)
{
	this._title = node.ChildNodes[0].Value;
}

node = storyNode.SelectSingleNode("./category");
if (node != null && node.ChildNodes.Count > 0 && node.ChildNodes[0].Value != null)
{
	this._category = node.ChildNodes[0].Value;
}

...

This just seemed silly to me. When I started working with a whole new XML schema that was even more complex, I decided that writing all that code by hand would be just ludicrous.

XML -> XSD

Visual Studio 2005 (which, of course, has freely downloadable Express editions) can inspect an XML document and generate an XML Schema (.xsd) from it. It’s really very simple: load the XML file into the IDE, then select "Create Schema" from the "XML" menu. Overwhelmed by the complexity of it all yet?

Bear in mind that the resulting schema is not perfect. It must be validated, by you. If the schema looks fine at first glance, there’s a simple way to test it: programmatically load your XML document while enforcing the schema. For my purposes, I found that most of the adjustments I needed to make were just to change "required" elements to "optional", unless of course they were indeed required.
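
For what it’s worth, here is a minimal sketch of that test, assuming the schema declares no target namespace. The class name SchemaCheck and its Validate method are made up for illustration; the XmlReaderSettings and XmlSchemaSet calls are the stock System.Xml ones. It reads the document end to end with validation switched on and collects whatever the validator complains about:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of any library.
public static class SchemaCheck
{
    // Reads the whole document while enforcing the schema and returns
    // every validation message instead of stopping at the first one.
    public static List<string> Validate(TextReader xml, TextReader xsd)
    {
        List<string> problems = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;

        // Passing null uses the targetNamespace declared in the schema itself.
        using (XmlReader schemaReader = XmlReader.Create(xsd))
        {
            settings.Schemas.Add(null, schemaReader);
        }

        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            problems.Add(e.Severity + ": " + e.Message);
        };

        using (XmlReader reader = XmlReader.Create(xml, settings))
        {
            while (reader.Read()) { } // validation happens as nodes are read
        }
        return problems;
    }
}
```

An empty list means the document passed; otherwise each entry points at an element or attribute worth revisiting in the .xsd. For files on disk, File.OpenText(path) gives you a TextReader for either argument.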

XSD -> C# Code

If the schema’s clean, all you need is the .NET Framework SDK, which comes bundled with Visual Studio 2005. Tucked away therein is XSD.exe, which does the magic for you. All you have to do is pass it the name of the .xsd file along with "/c" (for example, xsd.exe MySchema.xsd /c, where MySchema.xsd stands in for your own file), and it auto-generates a .cs file named after the schema. Add /out: to choose the output directory or /n: to set the namespace of the generated classes.

The generated C# code isn’t always perfect, either. To say nothing of the rough comment stubs, one or two textual content elements were completely ignored in my case: the attributes were exposed as C# properties but the content, which was CDATA, was not. Easy enough to fix. This was likely due to an imperfect XSD file, but since this was really a run-once-and-forget-about-it effort, I was not afraid of diving into the C# to add the missing properties:

        private string _value;
        [System.Xml.Serialization.XmlText()]
        public string value
        {
            get { return _value; }
            set { _value = value; }
        }
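
To put that property in context, here is a small self-contained sketch. The Headline class, its lang attribute, and the Parse helper are all hypothetical stand-ins for whatever your generated class looks like; I’ve also capitalized the property as Value to keep it visually distinct from the setter’s value keyword. The point is only that [XmlText] is what tells XmlSerializer where to put an element’s text or CDATA content:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical element:
//   <headline lang="en"><![CDATA[Hello & welcome]]></headline>
[XmlRoot("headline")]
public class Headline
{
    private string _lang;
    private string _value;

    // The attribute maps to a property, as in the generated code.
    [XmlAttribute("lang")]
    public string lang
    {
        get { return _lang; }
        set { _lang = value; }
    }

    // Without [XmlText], the serializer has nowhere to put the
    // element's text content, so it is silently skipped.
    [XmlText]
    public string Value
    {
        get { return _value; }
        set { _value = value; }
    }

    // Illustrative helper: deserialize a Headline from an XML string.
    public static Headline Parse(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Headline));
        using (StringReader reader = new StringReader(xml))
        {
            return (Headline)serializer.Deserialize(reader);
        }
    }
}
```

Calling Headline.Parse on the element shown in the comment should give you both the attribute and the CDATA text as plain string properties.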

System.Xml.Serialization.XmlSerializer works flawlessly with the generated C# code. I created the following generic class for the generated classes to inherit, so that they automatically offer a Deserialize() method:

using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
using System.IO;

namespace MyProject.XmlGen
{
    public class XmlDeserializer<T>
    {
        public static T Deserialize(string xmlFilePath)
        {
            // FileAccess.Read lets this work on read-only files too
            using (FileStream stream = new FileStream(xmlFilePath, FileMode.Open, FileAccess.Read))
            {
                return Deserialize(stream);
            }
        }
        public static T Deserialize(Stream xmlFileStream)
        {
            return (T)Serializer(typeof(T)).Deserialize(xmlFileStream);
        }

        public static T Deserialize(TextReader textReader)
        {
            return (T)Serializer(typeof(T)).Deserialize(textReader);
        }

        public static T Deserialize(XmlReader xmlReader)
        {
            return (T)Serializer(typeof(T)).Deserialize(xmlReader);
        }

        public static T Deserialize(XmlReader xmlReader, string encodingStyle)
        {
            return (T)Serializer(typeof(T)).Deserialize(xmlReader, encodingStyle);
        }

        public static T Deserialize(XmlReader xmlReader, XmlDeserializationEvents events)
        {
            return (T)Serializer(typeof(T)).Deserialize(xmlReader, events);
        }

        public static T Deserialize(XmlReader xmlReader, string encodingStyle, XmlDeserializationEvents events)
        {
            return (T)Serializer(typeof(T)).Deserialize(xmlReader, encodingStyle, events);
        }

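        // Static fields are per closed generic type, so each T gets
        // its own cached XmlSerializer. (Not synchronized: two threads
        // racing here may each build one, which is wasteful but harmless.)
        private static XmlSerializer _Serializer = null;
        private static XmlSerializer Serializer(Type t)
        {
            if (_Serializer == null) _Serializer = new XmlSerializer(t);
            return _Serializer;
        }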

    }
}

So with this in place, I just declare each generated class like so:

public class MyGeneratedClass : XmlDeserializer<MyGeneratedClass>
{
 ...
}

Literally, now it takes a whopping ONE line of code to deserialize an XML file and access it as a complex object model.

MyGeneratedClass myObject = MyGeneratedClass.Deserialize(xmlFilePath);
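
If you’d rather not touch the generated file at all, one option is to lean on the fact that XSD.exe emits partial classes and attach the base class from a separate, hand-written file. The following is only a sketch under that assumption: Story and its title property are hypothetical, and the XmlDeserializer<T> here is a trimmed-down stand-in for the full class above so that the example compiles on its own.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Trimmed-down stand-in for the XmlDeserializer<T> base class above.
public class XmlDeserializer<T>
{
    private static XmlSerializer _serializer;

    public static T Deserialize(TextReader textReader)
    {
        if (_serializer == null) _serializer = new XmlSerializer(typeof(T));
        return (T)_serializer.Deserialize(textReader);
    }
}

// --- roughly what a generated file might contain (hypothetical) ---
[XmlRoot("story")]
public partial class Story
{
    private string _title;

    [XmlElement("title")]
    public string title
    {
        get { return _title; }
        set { _title = value; }
    }
}

// --- separate hand-written file: only adds the base class ---
public partial class Story : XmlDeserializer<Story>
{
}
```

With both parts compiled together, Story.Deserialize(new StringReader("<story><title>Hello</title></story>")) should hand back a Story whose title is "Hello", and regenerating the first file won’t wipe out the inheritance.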

Cheers.

  1. Steve Smith
    May 30, 2013 at 11:44 am

    I tried your solution and it works great for one file once you get the XSD correct for nulls. My issue is that I have many different responses that start with STEVE5, and I can’t simply change the class name, because the class name is used for the root node. How can I modify this to make it work with 20 types of responses?

    Steve

    • May 30, 2013 at 4:24 pm

      Steve,

      FYI I migrated this blog entry a few years back over to here: http://www.jondavis.net/techblog/post/2007/05/01/XML-to-C-Code-Generation.aspx.

      Unfortunately, I don’t understand what you mean by “responses”, so I can’t answer your question.

      Jon

      • May 31, 2013 at 4:46 am

        Thank you for responding. The issue is when you generate the classes and you have, say, 10 possible XML response objects, all with the root node of Steve. When you generate the classes, they force the partial class base to be Steve, and thus you have ambiguity. I figured out that if I wrap the generated code with a base class name, it all works perfectly.
