Normalize Space Issue With Html Tags
Here's one for you XSLT gurus :-) I have to deal with XML output from a Java program I cannot control. In the docs outputted by this app the html tags remain as
Solution 1:
An XSLT 1.0 solution is to use an XPath expression that collapses each run of whitespace characters into a single space. The idea is not my own; it is taken from an answer by Dimitre Novatchev.
The advantage over the built-in normalize-space() function is that a leading or trailing space (in your case, before and after the b element) is kept as a single space instead of being stripped.
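To make the difference concrete, here is a minimal sketch that prints both variants side by side for every text node. The input element in the comment is hypothetical, and the square brackets are only there to make the spaces visible.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input: <Text>The next word is <b>bold</b> and so on</Text> -->
  <xsl:template match="text()">
    <!-- Plain normalize-space(): strips the space next to the b element -->
    <xsl:value-of select="concat('[', normalize-space(), ']')"/>

    <!-- Guarded variant: substring(' ', 1) is ' ' and substring(' ', 2) is '',
         so one space is re-added only where the node started or ended with one -->
    <xsl:value-of select="concat('[',
        substring(' ', 1 + not(substring(., 1, 1) = ' ')),
        normalize-space(),
        substring(' ', 1 + not(substring(., string-length(.)) = ' ')),
        ']')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the text node "The next word is " the first bracket holds "The next word is" and the second holds "The next word is " with the trailing space intact. Note that the guard only tests for a literal space character, so a node that begins or ends with a tab or line break would still lose that boundary whitespace.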
EDIT: In response to your edited question, the XPath expression is incorporated into your stylesheet below. Also:
- Explicitly saying omit-xml-declaration="no" is redundant; it is the default behaviour of the XSLT processor.
- Several of your templates have the same content. I merged them into a single template using the union operator |.
Stylesheet
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Text//*|Instruction//*|Title//*">
    <xsl:value-of select="concat('&lt;',name(),'&gt;')"/>
    <xsl:apply-templates/>
    <xsl:value-of select="concat('&lt;/',name(),'&gt;')"/>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select=
      "concat(substring(' ', 1 + not(substring(.,1,1)=' ')),
              normalize-space(),
              substring(' ', 1 + not(substring(., string-length(.)) = ' '))
             )"/>
  </xsl:template>
</xsl:stylesheet>
XML Output
<?xml version="1.0" encoding="UTF-8"?>
<Locator Precode="7">
   <Text LanguageId="7">The next word is &lt;b&gt;bold&lt;/b&gt; and is correctly spaced around the html tag, but the sentence has extra whitespace and line breaks</Text>
</Locator>
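The original input document is not reproduced here, so the following is only a hypothetical reconstruction: an input of roughly this shape, with assumed runs of spaces and line breaks inside the Text element, would be expected to yield the output above when run through the stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input; the exact whitespace layout is an assumption -->
<Locator Precode="7">
  <Text LanguageId="7">The next    word is <b>bold</b> and is correctly spaced
      around the html tag,   but the sentence has extra whitespace
      and line breaks</Text>
</Locator>
```

The text nodes on either side of the b element begin or end with a single literal space, which is exactly the case the guarded expression preserves. The whitespace-only nodes between Locator and Text are removed by xsl:strip-space before the text() template ever sees them.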