
Normalize Space Issue With Html Tags

Here's one for you XSLT gurus :-) I have to deal with XML output from a Java program I cannot control. In the docs outputted by this app the html tags remain as

Solution 1:

One XSLT 1.0 solution is an XPath expression that collapses each run of whitespace characters into a single space. The idea is not my own; it is taken from an answer by Dimitre Novatchev.

The advantage over the built-in normalize-space() function is that leading and trailing whitespace (in your case, before and after the b element) is kept as a single space instead of being stripped.
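To make the difference concrete, here is a small Python emulation of the two behaviours. It is only a sketch, not XSLT: the function names `normalize_space` and `normalize_keep_edges` are made up for illustration, and the code assumes the XPath 1.0 notion of whitespace (space, tab, carriage return, line feed).

```python
import re

# XPath 1.0 whitespace: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Emulate XPath normalize-space(): collapse runs, trim both ends."""
    return _XML_WS.sub(" ", text).strip(" ")


def normalize_keep_edges(text: str) -> str:
    """Emulate the concat(...) expression: normalize, but keep one space
    at either end if the original text started or ended with a space."""
    leading = " " if text[:1] == " " else ""
    trailing = " " if text[-1:] == " " else ""
    return leading + normalize_space(text) + trailing


sample = " is   correctly\n spaced "
print(repr(normalize_space(sample)))       # 'is correctly spaced'
print(repr(normalize_keep_edges(sample)))  # ' is correctly spaced '
```

Note that the expression only tests for a literal space character at either end, so a text node that begins or ends with a line break would still lose its edge whitespace.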

EDIT: In response to your edited question, below is that XPath expression incorporated into your stylesheet. Also:

  • Explicitly saying omit-xml-declaration="no" is redundant. It is the default behavior of the XSLT processor.
  • Several of your templates have the same content. I merged them into a single template using the | union operator.

Stylesheet

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Text//*|Instruction//*|Title//*">
    <xsl:value-of select="concat('&lt;',name(),'&gt;')"/>
    <xsl:apply-templates/>
    <xsl:value-of select="concat('&lt;/',name(),'&gt;')"/>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select=
      "concat(substring(' ', 1 + not(substring(.,1,1)=' ')),
              normalize-space(),
              substring(' ', 1 + not(substring(., string-length(.)) = ' '))
              )
      "/>
  </xsl:template>

</xsl:stylesheet>

XML Output

<?xml version="1.0" encoding="UTF-8"?>
<Locator Precode="7">
  <Text LanguageId="7">The next word is &lt;b&gt;bold&lt;/b&gt; and is correctly spaced around the html tag, but the sentence has extra whitespace and line breaks</Text>
</Locator>
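
For readers who want to see what the Text//*|Instruction//*|Title//* template does to inline markup, here is a rough Python emulation using only the standard library's xml.etree.ElementTree. It is a sketch under assumptions, not a replacement for the stylesheet: the names `flatten` and `escape_inline` are hypothetical, the sample input is invented rather than taken from the asker's document, and the whitespace handling of the text() template is left out.

```python
import xml.etree.ElementTree as ET


def flatten(elem: ET.Element) -> str:
    """Return elem's content as text, writing descendant elements as literal tags.

    Like the stylesheet, only the element name is kept; attributes on the
    inline elements are dropped.
    """
    parts = [elem.text or ""]
    for child in elem:
        parts.append(f"<{child.tag}>{flatten(child)}</{child.tag}>")
        parts.append(child.tail or "")
    return "".join(parts)


def escape_inline(xml_source: str) -> str:
    """Hypothetical helper: flatten the root's inline markup and reserialize."""
    root = ET.fromstring(xml_source)
    result = ET.Element(root.tag, root.attrib)
    result.text = flatten(root)
    # The serializer escapes '<' and '>' in text, which yields &lt;b&gt; etc.
    return ET.tostring(result, encoding="unicode")


# Hypothetical input, not the asker's actual document.
sample = '<Text LanguageId="7">The next word is <b>bold</b> and is spaced</Text>'
print(escape_inline(sample))
```

The recursion mirrors the way the template calls xsl:apply-templates between the opening and closing tag strings, so nested inline elements are handled as well.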
