XPath puzzle

Any XSLT/XPath experts out there? I’m a little bit stuck. I have a stylesheet that is effectively transforming XHTML into XHTML (best not to ask) and is matching any element with select = "xhtml:*". However, sometimes empty a elements creep into the original XHTML and get copied across to the output. These can play havoc with the CSS and JavaScript used on the final web page, so I’d like to suppress them.

How do I modify the select statement above to select all XHTML elements except for a elements that have either no text node children or have text node children composed solely of white space?

In other words if the input contains <a></a> or <a /> or <a> </a> then it should be skipped (assume for now that any attributes are irrelevant and that we’ll deal with the case where it contains another element node but no text nodes later).

I tried select = "xhtml:*[not(self::a[not(text())])][not(self::a[not(text() = ' ')])]" as a first stab but as well as being very ugly it doesn’t seem to be working. Any ideas?

Tags: xpath, xslt

Mike says:

October 29, 2007 at 9:46 am

you want a jsp based site you do.
😉

Ben says:

November 6, 2007 at 4:27 pm

Check out the XSLT FAQ entry here:
http://www.dpawson.co.uk/xsl/sect2/N3328.html

I think the normalize-space() function is going to help you at a pinch – probably need to see an instance document to help much more.
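
For anyone wanting to try the normalize-space() idea, here is a minimal, untested sketch in XSLT 1.0. It is not the original stylesheet: it assumes the xhtml prefix is bound to the standard XHTML namespace, and it uses a plain copy-through template as a stand-in for whatever the real stylesheet does. The second template matches a elements that have no text node child containing anything other than white space, and outputs nothing for them.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <!-- Stand-in for the real stylesheet: copy each XHTML element,
       its attributes and its children through unchanged. -->
  <xsl:template match="xhtml:*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress a elements with no text node child that contains
       non-white-space characters: <a></a>, <a /> and <a> </a>.
       The predicate gives this pattern a higher default priority
       than xhtml:*, so it wins for those elements. -->
  <xsl:template match="xhtml:a[not(text()[normalize-space()])]"/>

</xsl:stylesheet>
```

The same test can probably be folded into the select expression instead, as select = "xhtml:*[not(self::xhtml:a[not(text()[normalize-space()])])]". Two things to watch: the self:: step needs the xhtml: prefix, because an unprefixed a will not match an element in the XHTML namespace, and text() = ' ' is only true for a text node that is exactly one space. Note that either form also drops an a that contains only another element, such as an image, which is the case set aside for later in the post.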

Steve Pugh says:

November 6, 2007 at 5:29 pm

Mike: You might think that, I couldn’t possibly comment.

Ben: Thanks. That looks very promising.

Mark Caldwell says:

November 14, 2007 at 10:13 pm

Did you find a solution? I’ve been fighting an XSLT processor to stop it closing up elements with nothing but white space in leaving us with and elements which really messes with the CSS…

November 14, 2007 at 10:54 pm

I’ve not had a chance to test things yet but some of the ideas in the FAQ Ben pointed to look like they may put me on the right tracks.

If your XSLT processor is XSLT 2.0 compliant, one thing you can try is changing the output method to XHTML rather than XML. XSLT 1.0 only had text, HTML (4.01) and generic XML methods, so it didn’t do a very good job of producing Appendix C compliant XHTML 1.0. I’ve not tried this yet either, as we’ve only just upgraded to the latest version of Saxon.
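
As a rough, untested illustration of the output method change, here is a minimal XSLT 2.0 stylesheet. The identity template is only a placeholder for the real transformation; the line that matters is xsl:output. With the xhtml method, an XSLT 2.0 serializer should write an empty element whose content model is not EMPTY with separate start and end tags rather than in the minimized form, provided the element is in the XHTML namespace.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask for XHTML serialization instead of generic XML. -->
  <xsl:output method="xhtml" indent="no"/>

  <!-- Placeholder identity transform: copy everything through. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```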
