-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Oops. Snapshots are available, but the DNS change hasn't happened yet.
I forgot that I'd added the following to my hosts file a while back:

50.56.245.89 snapshots.docbook.org snapshots

Feel free to go directly to http://50.56.245.89 in the meantime.

David

On 01/12/2012 09:53 AM, David Cramer wrote:
> Yes, Kasun, Peter, and I talked about it then, but are just now
> finding time to fix it.
>
> Btw., Mike Smith has the snapshot builds moved over to the new
> server, so you can again download a snapshot to test the latest
> functionality [1] or check out what the latest output looks like
> [2].
>
> David
>
> [1] http://snapshots.docbook.org/
> [2] http://snapshots.docbook.org/xsl/webhelp/docs/content/ch01.html
>
> On 01/11/2012 09:31 AM, Bill Burns wrote:
>> Thanks, David. I reported this same issue to Kasun about three
>> months ago.
>
>> Bill Burns
>> Verbum Communications, Inc.
>> +1.208.336.6081
>> bburns@verbumcomm.com
>> http://www.verbumcomm.com
>
>> -----Original Message-----
>> From: David Cramer [mailto:david@thingbag.net]
>> Sent: Tuesday, January 10, 2012 9:54 PM
>> To: Bort, Paul
>> Cc: docbook-apps@lists.oasis-open.org
>> Subject: Re: [docbook-apps] WebHelp, English stemmer, problems with
>> specific words
>
>> Hi Paul,
>>
>> Funny you should mention that. I've also been working on the
>> client-side stemmer recently to address the same issue you mention
>> and some others. The problem was that all words ending in vowel+y
>> (relay, array, key, say, day) were being stemmed to -i (relai,
>> arrai, kei, sai, dai) by the client-side stemmer but not by the
>> build-time indexer. I'm mostly done, but I think it still overstems
>> words like arsenal.
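
For reference, the Snowball description of the English stemmer states
step 1c as: replace a final y or Y with i only when the preceding
letter is a non-vowel that is not the first letter of the word. A
minimal sketch of that one rule follows. The function name step1c is
hypothetical, and this is not the code in en_stemmer.js.

```javascript
// Sketch of Snowball English step 1c only (not the full stemmer).
// In that algorithm the vowels are a, e, i, o, u and y; an uppercase Y
// marks a y that is being treated as a consonant.
function step1c(word) {
  var vowels = "aeiouy";
  var last = word.charAt(word.length - 1);
  // length > 2 guarantees the preceding letter is not the first letter
  if (word.length > 2 && (last === "y" || last === "Y")) {
    var prev = word.charAt(word.length - 2);
    if (vowels.indexOf(prev) === -1) {
      return word.slice(0, -1) + "i";
    }
  }
  return word;
}
```

With this rule "relay", "key" and "by" are left alone, while "cry"
becomes "cri".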
>
>> http://docbook.svn.sourceforge.net/viewvc/docbook/trunk/xsl/webhelp/template/content/search/stemmers/en_stemmer.js?r1=9067&r2=9178
>
>> Basically, nothing from the section "Exceptional forms in general"
>> was implemented, and step 1c was implemented incorrectly:
>>
>> http://snowball.tartarus.org/algorithms/english/stemmer.html
>
>> Regarding nucleus etc., I've also committed a fix from a
>> colleague that should always check the index for the full
>> unstemmed word to catch those Latinate terms that are handled
>> correctly by the indexer but not the client-side stemmer:
>
>> http://docbook.svn.sourceforge.net/viewvc/docbook/trunk/xsl/webhelp/template/content/search/nwSearchFnt.js?r1=9105&r2=9172
>
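
The idea of checking the index for both the stem and the full
unstemmed word can be sketched roughly as below. This is only an
illustration: the index shape (a plain object mapping a term to an
array of page ids), the name lookupTerm, and the stem parameter are
all assumptions, not what nwSearchFnt.js actually does.

```javascript
// Hypothetical index shape: { term: [pageId, ...] }.
// Looks up the stemmed form first, then the raw word, and merges the
// page ids without duplicates.
function lookupTerm(index, word, stem) {
  var seen = {};
  var hits = [];
  [stem(word), word].forEach(function (key) {
    var pages = Object.prototype.hasOwnProperty.call(index, key)
      ? index[key]
      : [];
    pages.forEach(function (page) {
      if (!seen[page]) {
        seen[page] = true;
        hits.push(page);
      }
    });
  });
  return hits;
}
```

If the client-side stemmer produced "nucleu" while the indexer stored
"nucleus", the second lookup would still find the pages.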
>> He's also working on always searching the index for things that
>> look like filenames (e.g. build.xml, which it currently
>> tokenizes to 'build' and 'xml').
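
One way to picture the filename case is a tokenizer that keeps a
filename-like token whole in addition to its parts. The pattern and the
name tokenizeQuery below are hypothetical and only meant as a sketch of
the idea, not the change that is being worked on.

```javascript
// Hypothetical pattern: a name, a dot, then a short alphanumeric extension.
var FILENAME_LIKE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9]{1,5}$/;

function tokenizeQuery(query) {
  var tokens = [];
  query.toLowerCase().split(/\s+/).forEach(function (chunk) {
    if (chunk === "") { return; }
    if (FILENAME_LIKE.test(chunk)) {
      tokens.push(chunk); // keep e.g. "build.xml" intact
    }
    // also emit the ordinary word parts
    chunk.split(/[^a-z0-9]+/).forEach(function (part) {
      if (part !== "") { tokens.push(part); }
    });
  });
  return tokens;
}
```

A pattern this loose would also match things like version numbers, so
it would need tightening before real use.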
>
>> Here's a demo of the current state of things:
>
>> http://www.thingbag.net/docbook/docs/content/ch05s01.html
>
>> You can grab the en_stemmer.js and use it now. The
>> nwSearchFnt.js file also has changes related to adding search
>> weighting to the results, so you'd need to take changes from it
>> more carefully.
>
>> We should have a release of the xsls out before too long though.
>
>> Thanks, David
>
>> On 01/10/2012 07:33 PM, Bort, Paul wrote:
>>> Hi,
>
>>> I found the conversation about problems with the stemmer used
>>> with English at
>>>
>>> http://lists.oasis-open.org/archives/docbook-apps/201103/msg00040.html
>>>
>>> very informative in tracking down the problem I'm having with the
>>> stemmer, which is similar. In my case, the word that isn't being
>>> stemmed correctly is "relay". (It comes out as "relai".) This does
>>> break searches: searching for "relay" in a document that should
>>> have six matches returns the error "Your search returned no
>>> results for relai".
>
>>> The solution that I've implemented locally, and offer below
>>> for your consideration, is a list of words to be stemmed
>>> manually. I've tried to follow your coding style but I'm not a
>>> serious JavaScript hacker so I may have stepped on some toes
>>> inadvertently.
>
>>> Regards,
>>> Paul Bort
>>> Systems Engineer
>>> TMW Systems, Inc.
>>> pbort@tmwsystems.com
>>> ----------------------------------
>
>>> --- en_stemmer.js
>>> +++ en_stemmer.js
>>> @@ -54,6 +54,14 @@
>>>      meq1 = "^(" + C + ")?" + V + C + "(" + V + ")?$",  // [C]VC[V] is m=1
>>>      mgr1 = "^(" + C + ")?" + V + C + V + C,            // [C]VCVC... is m>1
>>>      s_v = "^(" + C + ")?" + v;                         // vowel in stem
>>> +
>>> +  var exceptionWords = {
>>> +      "relay":"relay",
>>> +      "relaying":"relay",
>>> +      "relays":"relay",
>>> +      "nucleus":"nucleus",
>>> +      "zeus":"zeus"
>>> +  };
>>>
>>>    return function (w) {
>>>      var stem,
>>> @@ -67,6 +75,8 @@
>>>
>>>      if (w.length < 3) { return w; }
>>>
>>> +    if (w in exceptionWords) { return exceptionWords[w]; }
>>> +
>>>      firstch = w.substr(0,1);
>>>      if (firstch == "y") {
>>>        w = firstch.toUpperCase() + w.substr(1);
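
A side note on the lookup in the patch: the `in` operator also matches
properties inherited from Object.prototype, so a search term such as
"constructor" would pass the `w in exceptionWords` test. A slightly
more defensive sketch is below. The helper name lookupException is
hypothetical; the table is copied from the patch.

```javascript
var exceptionWords = {
  "relay": "relay",
  "relaying": "relay",
  "relays": "relay",
  "nucleus": "nucleus",
  "zeus": "zeus"
};

// Hypothetical helper: returns the manual stem, or null when the word
// is not listed. hasOwnProperty ignores inherited properties.
function lookupException(w) {
  if (Object.prototype.hasOwnProperty.call(exceptionWords, w)) {
    return exceptionWords[w];
  }
  return null;
}
```

The stemmer could then return the listed stem whenever
lookupException(w) is not null and fall through to the normal steps
otherwise.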
>
>
>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
>> For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org
>
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPDx4vAAoJEMHeSXG7afUhqJIH/0dM7XiwCjovObvS0pfjKNC0
obmZsGbV3+03bKXAVbuDDfTtjysdf18sp+AxXDsA7pg2cS4VVNjuimnnTTG3PrKh
rCFIgpoQ+/Z5Cr3R/M8fVmxTkve9ytPn14BWYYlaip84Qt1HUdKPxHuIJXRlbJzl
O42OHoJPXXta5DKWNaqnqo4puwgoagMqVq3ICkiBZdagTJIXPWVWJGJK5RFrc0sq
3btvOVzSgSshC/U7mlq2nxCsNxuFIvwulqXnvHTcQ9PhCYwj8Inc2fFucUiimW+7
YC/P7EzKnOY2AQplvGVwgxlW4DLSNqvkJaATZqJXx1gIFuw8S7U39LNch2Qb2ns=
=rywA
-----END PGP SIGNATURE-----