LSTSRV-L Archives

LISTSERV Site Administrators' Forum

Eric Thomas <[log in to unmask]>
Wed, 20 Jun 2018 20:56:20 +0000
The search order is whatever you select in the pull-down menu before clicking on "Search." The default comes out of a web template that could have been changed over the years or customized either at the list or site level. If you select most to least recent and only get old messages, you need to open a support ticket.

  Eric

From: LISTSERV Site Administrators' Forum <[log in to unmask]> On Behalf Of Landon, Krista J
Sent: Wednesday, June 20, 2018 16:32
To: [log in to unmask]
Subject: Re: Search functionality with HPO 16.0

If it is returning results, and the default is most to least recent, it doesn't make sense that it isn't returning results from 2017-2018 if it's not being tossed aside as an overflow word.

I'll let our server admin know there's an upgrade and we'll test and see how much of an improvement we see.  Thanks.

That said, I don't think that just because a word is used very frequently it should be rejected as an 'overflow' word.  There should be a way to turn that off.  If the word is used frequently and it's a technical term, it is likely very important for the group members on the list, and they may want to be able to find the last 100 messages with that term for good reason.

Krista Landon

Personal
UNCLASSIFIED

From: LISTSERV Site Administrators' Forum <[log in to unmask]> On Behalf Of Eric Thomas
Sent: Wednesday, June 20, 2018 4:26 PM
To: [log in to unmask]
Subject: Re: Search functionality with HPO 16.0

There was a bug between 2013-11-18 and 2017-03-09 causing searches with a mix of overflow and non-overflow words to be incorrectly rejected in some circumstances. Your 2017-02-28 build does not have the fix. The message, "Your search contains only "overflow words" - words that occur so frequently that they are not indexed. Please refine your search" is what LISTSERV 16.5 returns to WA or to a manual search via LCMD or e-mail. The message you are getting appears to be coming from a web template:

+SE T-ERROR-NOSEARCHRES <p class="error">Your search produced no results. Check your search string and the lists that you selected and try again.</p>

I guess WA substituted it, perhaps because of the bug. Anyway, you need a newer build.
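As a quick sanity check of the dates above, here is a minimal Python sketch that tests whether a build date falls in the affected window. The function name is hypothetical, and treating the window as "introduction date included, fix date excluded" is an assumption; only the three dates come from this thread.

```python
from datetime import date

# Dates quoted in this thread. Whether the boundary builds themselves are
# affected is an assumption (start included, fix date excluded).
BUG_INTRODUCED = date(2013, 11, 18)
BUG_FIXED = date(2017, 3, 9)


def build_in_bug_window(build_date: date) -> bool:
    """Return True if the build date lies in the affected window."""
    return BUG_INTRODUCED <= build_date < BUG_FIXED


# The build reported in this thread: 28 Feb 2017.
print(build_in_bug_window(date(2017, 2, 28)))
```

A build dated 2017-02-28 is nine days before the fix, which is why a newer build is needed.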

You can choose the sort order from the search page. The default is most to least recent.

  Eric

From: LISTSERV Site Administrators' Forum <[log in to unmask]> On Behalf Of Landon, Krista J
Sent: Wednesday, June 20, 2018 15:24
To: [log in to unmask]
Subject: Re: Search functionality with HPO 16.0

We have re-indexed all lists, and I've specifically re-indexed the individual lists that I've been using to test the functionality.
We didn't have this problem on our previous version (prior to HPO).  These searches worked as expected.

We're currently running:
LISTSERV Type:     LISTSERV(R) High Performance
LISTSERV Version:  16.0
Build Date:        28 Feb 2017


We haven't been getting the error that the search contains only overflow words.  If I search for a string that I KNOW is in the archives, like BIOS, it comes up with "Your search produced no results. Check your search string and the lists that you selected and try again." even though I see it in the messages right in front of me.  If I put it into Subject Contains, it actually comes up with results, but they are incomplete.

If I search for Ironkey in the "string" box, I don't get anything from after 2011.
If I search for Ironkey in Subject Contains, I get more recent posts (up to 2016). I picked that term because I saw a message from last month with Ironkey in the subject line, and it doesn't show up.

These are just a couple examples.

There should be a configuration interface for Search.  We should be able to determine how many results we want returned and how we want them sorted, at the very least.  Being able to see and remove those "overflow words" would be good too, though re-indexing isn't that much of a problem (except that it doesn't seem to work effectively).
Krista Landon

Personal
UNCLASSIFIED

From: LISTSERV Site Administrators' Forum <[log in to unmask]> On Behalf Of Eric Thomas
Sent: Wednesday, June 20, 2018 2:37 PM
To: [log in to unmask]
Subject: Re: Search functionality with HPO 16.0

Here is how searching and indexing works.


  1.  LISTSERV Classic:
      1.  New DBRINDEX files are created in Classic format.
      2.  Words whose index record exceeds 4096 bytes are overflowed.
      3.  DBRINDEX files imported from an HPO server are supported, but records longer than 4096 bytes are overflowed.
      4.  If there are 100,000 matches...
          i.    LISTSERV fully processes and prepares 100,000 match results, in the requested sort order.
          ii.   LISTSERV returns the first 100 to WA (I think this used to be 50).
          iii.  WA returns 100 matches to browser, with previous/next navigation.

  2.  LISTSERV HPO:
      1.  New DBRINDEX files are created in HPO format.
      2.  The threshold at which words are overflowed adapts to the size of the archives, and is never less than 4096 bytes.
      3.  DBRINDEX files in Classic format are supported, but the overflow threshold is pinned at 4096 bytes (there is no way to "bring back" a word that has been overflowed by Classic). You must reindex to gain the benefits of the adaptive overflow algorithm. Reindexing all lists could take over an hour on a large server, so it is not done automatically when installing an HPO LAK.
      4.  If there are 100,000 matches...
          i.    If possible (this depends on the search), LISTSERV fully processes and prepares only the 100 match results that it is going to return. These are the 100 items that would come on top if LISTSERV were to process all 100,000 matches in the requested sort order.
          ii.   LISTSERV returns 100 matches to WA.
          iii.  WA returns 100 matches to browser, with previous/next navigation.

  3.  With either version:
      1.  Overflowed search operands initially match every message in the archive.
      2.  A search containing only overflowed words is rejected with the message, "Your search contains only "overflow words" - words that occur so frequently that they are not indexed. Please refine your search." There was a bug between 2013-11-18 and 2017-03-09 causing searches with a mix of overflow and non-overflow words to be incorrectly rejected in some circumstances.
      3.  Every matching message is read from the archive and post-processed to confirm that it actually contains the overflow words; messages that do not are removed from the search results.
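The overflow behaviour common to both versions can be illustrated with a toy inverted index in Python. This is only a sketch of the general idea, not LISTSERV's code: every name below is hypothetical, and `max_postings` (a count of messages) stands in for the real byte-based threshold on the index record.

```python
import re
from collections import defaultdict


class OnlyOverflowWords(Exception):
    """Hypothetical error for a search made up solely of overflowed words."""


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(messages, max_postings):
    """Build word -> message ids; words in too many messages are overflowed."""
    postings = defaultdict(set)
    for msg_id, text in messages.items():
        for word in tokenize(text):
            postings[word].add(msg_id)
    overflowed = {w for w, ids in postings.items() if len(ids) > max_postings}
    index = {w: ids for w, ids in postings.items() if w not in overflowed}
    return index, overflowed


def search(messages, index, overflowed, query):
    words = tokenize(query)
    if not words:
        return []
    if words <= overflowed:
        raise OnlyOverflowWords(query)
    # Overflowed operands initially match every message in the archive,
    # so only the indexed words narrow the candidate set.
    candidates = set(messages)
    for word in words - overflowed:
        candidates &= index.get(word, set())
    # Post-process: read each candidate and confirm the overflow words.
    overflow_terms = words & overflowed
    return sorted(m for m in candidates
                  if overflow_terms <= tokenize(messages[m]))


messages = {
    1: "BIOS update for the laptop",
    2: "Ironkey and the BIOS password",
    3: "the weekly digest",
    4: "Ironkey policy",
}
index, overflowed = build_index(messages, max_postings=2)

print(search(messages, index, overflowed, "the ironkey"))
try:
    search(messages, index, overflowed, "the")
except OnlyOverflowWords:
    print("rejected: only overflow words")
```

Here "the" appears in three messages and is overflowed. The mixed query is narrowed by "ironkey" first, and message 4 is then dropped in post-processing because it does not contain "the".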

This is for 16.5 and 16.0/2017a. Older versions handled ordering differently.
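To illustrate the difference between preparing all 100,000 matches and preparing only the 100 that will be returned, here is a small Python sketch. It is not how LISTSERV is implemented: `heapq.nlargest` is used as a generic top-k technique, and the function names, the `date` field, and the integer timestamps are made up for the example.

```python
import heapq
import random


def classic_page(matches, page_size=100):
    """Sort every match in the requested order, then keep the first page."""
    return sorted(matches, key=lambda m: m["date"], reverse=True)[:page_size]


def hpo_page(matches, page_size=100):
    """Select only the page_size most recent matches, without a full sort."""
    return heapq.nlargest(page_size, matches, key=lambda m: m["date"])


# 100,000 hypothetical matches with distinct integer timestamps.
dates = list(range(100_000))
random.Random(0).shuffle(dates)
matches = [{"id": i, "date": d} for i, d in enumerate(dates)]

# Both approaches return the same 100 items, most recent first.
print(classic_page(matches) == hpo_page(matches))
```

Either way the user sees the same first page; the difference is how much work is done for matches that are never displayed.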

Sites like Amazon have gotten people used to searching for just "knife" and finding the nakiri knife they always dreamed of among the first 5 matches because matches are sorted based on past purchases, browsing history, and a host of other private data that we don't even know about, but that is very effective in predicting what we want to see. LISTSERV collects no such data, so if you could search for "email" in LSTSRV-L, the message you are looking for would be very unlikely to be among the first 5. LISTSERV returns the equivalent of 10 Google pages' worth of results and you can paginate for more but, if you value your time, you just have to narrow your search.

Anyway, there is nothing in the HPO search function that is more limiting than in Classic.

  Eric

________________________________

To unsubscribe from the LSTSRV-L list, click the following link:
http://peach.ease.lsoft.com/scripts/wa-PEACH.exe?SUBED1=LSTSRV-L&A=1
