-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Dave,
On 23/03/14 12:32, davep wrote:
> I'm playing with a grammar checker that isn't as yet XML friendly.
> One option is to strip all markup and pass through to the grammar
> checker having expanded any xincludes.
Interesting -- what checker do you use, if I may ask?
> Issues: 1. Plain text output, Ideally block -> newline, inlines
> ->whitespace separation. 2. Indexing is a special. Null template
> for <db:indexterm/> 3. Ditto (remove markup) for toc
>
> Can anyone think of any other 'specials' that might need
> processing to obtain a simple text file ready for a spell checker?
I am working on a style/terminology checker here. These are the rules
I use to prepare the text before the terminology check:
https://www.gitorious.org/style-checker/style-checker/source/999eb9696fed15e75b01eee2febbb28562fc3144:source/xsl-checks/terminology.xslc

You can see that I try to hide things like literals and keys from the
style checker. I chose the ##@sth## placeholder format because I am
using regular expressions and wanted something distinctive that does
not contain any regular expression characters.
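
To illustrate the idea outside of XSLT, here is a minimal Python sketch.
It is not the stylesheet linked above. The element sets (DROP, HIDE,
BLOCK) and the function names are assumptions made up for this example,
and the placeholder names simply reuse the element name. It drops
indexterm and toc, replaces hidden inlines with a ##@name## token, ends
blocks with a newline, and separates other inlines with whitespace.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical element sets; adjust them to the DocBook subset in use.
DROP = {"indexterm", "toc"}                # removed together with their content
HIDE = {"literal", "keycap", "filename"}   # replaced by a ##@name## placeholder
BLOCK = {"para", "title", "listitem"}      # delimited by newlines


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def squeeze(text):
    """Collapse runs of whitespace (including line breaks) to one space."""
    return re.sub(r"\s+", " ", text)


def flatten(elem, out):
    """Append the plain-text rendering of elem and its tail to out."""
    name = local_name(elem.tag)
    if name in HIDE:
        out.append(f" ##@{name}## ")
    elif name not in DROP:
        # Blocks are separated by newlines, everything else by a space.
        sep = "\n" if name in BLOCK else " "
        out.append(sep)
        if elem.text:
            out.append(squeeze(elem.text))
        for child in elem:
            flatten(child, out)
        out.append(sep)
    # Text following the element belongs to the parent and is always kept.
    if elem.tail:
        out.append(squeeze(elem.tail))


def to_plain_text(xml_source):
    """Return one line of text per block, with markup removed."""
    out = []
    flatten(ET.fromstring(xml_source), out)
    lines = (" ".join(line.split()) for line in "".join(out).split("\n"))
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        '<article xmlns="http://docbook.org/ns/docbook">'
        "<title>Intro</title>"
        "<para>Press <keycap>Enter</keycap> to run\n"
        "  <literal>ls -l</literal>"
        "<indexterm><primary>ls</primary></indexterm> now.</para>"
        "</article>"
    )
    print(to_plain_text(sample))
```

XIncludes are not expanded here; the sketch assumes the input has
already been resolved into a single document.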
Hth,
Stefan.
- --
SUSE LINUX Products GmbH, Maxfeldstraße 5, D-90409 Nürnberg
Geschäftsführer: Jeff Hawn, Jennifer Guild, Felix Imendörffer
HRB 16746 (Amtsgericht Nürnberg)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iF4EAREIAAYFAlMwADsACgkQ5AP3bIqhlM1h0gD/YZsuB/RNWJEyPYBhkYoBRoN6
q7EnNviWub9HPF1JmLMA/Ao0nDvCror2CfS/GauSA7LCaISXvkGQFVztP4OQ6c6v
=brM5
-----END PGP SIGNATURE-----