Monday, February 20, 2012

Dynamic XPATH in XSLT using C#

Recently I had to convert a flat XML document into a hierarchical one that could be deserialized into .NET entities. Everything went smoothly until I ran into numbered sibling elements like the ones shown below:


...
<CCY1>GBP</CCY1>
<Value1>1.34</Value1>
<CCY2>EUR</CCY2>
<Value2>1.34</Value2>
...


After conversion, they needed to look like this:


....
<CCYDetails>
<CCYDetail>
<CCY>GBP</CCY>
<Value>1.34</Value>
....
</CCYDetail>
</CCYDetails>
...
 
After some research on the internet, I found that this is not straightforward in XSLT 1.0: the element name to select has to be built at run time, and XSLT 1.0 has no built-in way to evaluate an XPath expression held in a string. It needs help from an embedded script. Coming from a .NET background, I could not find much help on this at first. Then I found some code in a forum thread and tweaked it a bit to get it working.


XSLT
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
xmlns:user="http://tempuri.org/user">
<msxsl:script language="CSharp" implements-prefix="user">
<![CDATA[
public XPathNodeIterator FilterNodes(XPathNodeIterator context,string xpath)
{
context.MoveNext();
return context.Current.Select(xpath);
}
]]>
</msxsl:script>
<xsl:template match="/">
<root>
<CCYDetails>
<xsl:for-each select="MinPayInAmendment">
<xsl:call-template name="ccy_loop">
</xsl:call-template>
</xsl:for-each>
</CCYDetails>
</root>
</xsl:template>
<xsl:template name="ccy_loop">
<xsl:param name="num">1</xsl:param>
<xsl:if test="not ($num=17)">
<xsl:variable name="ccy" select="concat('CCY',$num)">
</xsl:variable>
<xsl:variable name="value" select="concat('Value',$num)">
</xsl:variable>
<CCYDetail>
<CCY>
<xsl:value-of select="user:FilterNodes(.,$ccy)"/>
</CCY>
<Value>
<xsl:value-of select="user:FilterNodes(.,$value)"/>
</Value>
</CCYDetail>
</CCYDetail>
<xsl:call-template name="ccy_loop">
<xsl:with-param name="num" select="$num + 1"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>


C#
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

XPathDocument xpath = new XPathDocument(@"C:\input_xml.xml");

// msxsl:script blocks are disabled by default and must be enabled explicitly.
XsltSettings settings = new XsltSettings();
settings.EnableScript = true;
settings.EnableDocumentFunction = true;
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(@"C:\XSLTFile1.xslt", settings, new XmlUrlResolver());

MemoryStream resultStream = new MemoryStream();
XmlWriterSettings writer_settings = new XmlWriterSettings();
writer_settings.Indent = true;
using (XmlWriter writer = XmlWriter.Create(resultStream, writer_settings))
{
    transform.Transform(xpath, null, writer);
} // disposing the writer flushes its buffered output into resultStream

// Rewind the stream and print the transformed XML.
resultStream.Position = 0;
StreamReader stream = new StreamReader(resultStream);
Console.WriteLine(stream.ReadToEnd());
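
To see what the FilterNodes script function does in isolation, here is a minimal sketch that runs the same logic outside the stylesheet. The class name, the ReadNumbered helper and the inline sample XML are made up for illustration; only FilterNodes itself comes from the stylesheet above.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class DynamicXPathDemo
{
    // Same logic as the FilterNodes script function in the stylesheet:
    // move to the first node of the node-set, then evaluate the XPath string against it.
    public static XPathNodeIterator FilterNodes(XPathNodeIterator context, string xpath)
    {
        context.MoveNext();
        return context.Current.Select(xpath);
    }

    // Hypothetical helper: returns the text of the child element named <prefix><num>,
    // or an empty string when no such element exists.
    public static string ReadNumbered(XPathNavigator root, string parentPath, string prefix, int num)
    {
        XPathNodeIterator parent = root.Select(parentPath);
        XPathNodeIterator hit = FilterNodes(parent, prefix + num);
        return hit.MoveNext() ? hit.Current.Value : string.Empty;
    }

    public static void Main()
    {
        // Made-up sample input mirroring the flat XML shown earlier.
        string xml = "<MinPayInAmendment><CCY1>GBP</CCY1><Value1>1.34</Value1>" +
                     "<CCY2>EUR</CCY2><Value2>1.34</Value2></MinPayInAmendment>";
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // The element name is built at run time, just like concat('CCY', $num) in the XSLT.
        for (int num = 1; num <= 2; num++)
        {
            Console.WriteLine(ReadNumbered(nav, "/MinPayInAmendment", "CCY", num) + " " +
                              ReadNumbered(nav, "/MinPayInAmendment", "Value", num));
        }
    }
}
```

The point of the sketch is that the XPath expression is an ordinary string by the time it reaches Select, so it can be assembled from a counter. That is the part the stylesheet delegates to the script block.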



PDF to XML using iTextSharp

Here is a very simple way of creating XML from a PDF document. I used form fields in the PDF document. Then, using iTextSharp, I looped through the AcroFields (form fields) and created a flat XML document from them. You can customize this to create more complex structures if need be.

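// Requires: using System.Xml; using iTextSharp.text.pdf;
string root = "root"; // assumed root element name; the original snippet did not define it
XmlDocument doc = new XmlDocument();
PdfReader reader = new PdfReader(@"C:\Input.pdf");
AcroFields fields = reader.AcroFields;
doc.LoadXml(string.Format("<{0}/>", root));
foreach (string keyName in fields.Fields.Keys)
{
// One element per form field; the field name must be a valid XML element name.
XmlElement elt = doc.CreateElement(keyName);
// Wrap the value in a CDATA section so markup characters in it are kept literally.
elt.AppendChild(doc.CreateCDataSection(fields.GetField(keyName)));
doc.DocumentElement.AppendChild(elt);
}
reader.Close();

doc.Save(@"C:\output.xml");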
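
One customization that may be needed in practice: form field names can contain characters such as [ and ] that are not allowed in XML element names, and CreateElement throws on those. The sketch below shows one possible way to handle it with XmlConvert.EncodeLocalName. It does not use iTextSharp at all; the dictionary, the class name and the field names are made up and simply stand in for what the AcroFields loop would supply.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FieldXmlSketch
{
    // Builds a flat XML document from name/value pairs, one element per field.
    public static XmlDocument Build(string rootName, IDictionary<string, string> fieldValues)
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateElement(rootName));
        foreach (KeyValuePair<string, string> field in fieldValues)
        {
            // EncodeLocalName escapes characters that are not allowed in an element name.
            XmlElement elt = doc.CreateElement(XmlConvert.EncodeLocalName(field.Key));
            elt.AppendChild(doc.CreateCDataSection(field.Value));
            doc.DocumentElement.AppendChild(elt);
        }
        return doc;
    }

    public static void Main()
    {
        // Made-up field names and values standing in for what AcroFields would return.
        var values = new Dictionary<string, string>
        {
            { "form1[0].Name[0]", "Jane <Doe>" },
            { "Amount", "1.34" }
        };
        XmlDocument doc = Build("root", values);
        Console.WriteLine(doc.OuterXml);

        // DecodeName recovers the original field name from the encoded element name.
        Console.WriteLine(XmlConvert.DecodeName(doc.DocumentElement.FirstChild.Name));
    }
}
```

Whoever reads the XML back can call XmlConvert.DecodeName on each element name to get the original field name.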