Hi, I concur: xml:space is the right way to go. Normalizing whitespace at extraction and then setting xml:space="preserve" on the extracted content is likely the safest thing extractors can do in practice. I agree that putting whitespace in inline elements is very bad practice. And it can probably lead ...
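To make the "normalize at extraction, then preserve" idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The function names (normalize, extract_segment), the output element name "source", and the choice to collapse every whitespace run to a single space are assumptions for illustration only, not something prescribed in this thread or by any particular extractor.

```python
import re
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:space attribute (the "xml" prefix is
# bound to this namespace by the XML specification itself).
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def normalize(text):
    """Collapse each run of whitespace to one space (assumed policy)."""
    return re.sub(r"\s+", " ", text)


def extract_segment(elem):
    """Hypothetical extractor step: flatten the element's text, normalize
    its whitespace once, then mark the result xml:space="preserve" so that
    downstream tools do not normalize it a second time."""
    raw = "".join(elem.itertext())
    seg = ET.Element("source")  # element name is an assumption
    seg.set(XML_SPACE, "preserve")
    seg.text = normalize(raw).strip()
    return seg


if __name__ == "__main__":
    para = ET.fromstring("<p>Some   text\n   with <b>inline</b>\n  markup.</p>")
    print(ET.tostring(extract_segment(para), encoding="unicode"))
```

This sketch drops inline markup entirely to keep the example short; a real extractor would carry inline elements through as placeholders, which is exactly where whitespace inside inline elements starts to cause trouble.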