Choosing an open source search engine for a website can be hard, in part because, from the outside, they all return relevant results for a search term and so seem similar. An incremental index, which can add new listings dynamically, keeps administrators from having to regenerate the entire index. Stop words, meaning here terms that a user marks for exclusion, are powerful tools, especially on large websites, because they allow users to limit results. A fuzzy search feature lets the search engine find results that are similar to the keyword even when they do not match it exactly. The ranking system determines the order in which listings are displayed and should reflect how the main website operates.
An open source search engine works from an index of all the listings and pages that can be searched. This index is normally long, and it typically grows as the website is used. Without incremental indexing, the administrator has to regenerate the entire index each time it grows, reprocessing the existing entries along with the new ones; this takes time and a lot of resources. With an incremental index, new listings are added dynamically and there is no reason to regenerate the entire index; the administrator only has to add the new information.
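The difference can be illustrated with a toy inverted index. This is a minimal sketch rather than how any particular search engine stores its data; the `InvertedIndex` class and its `add` and `search` methods are hypothetical names chosen for the example.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: maps each term to the IDs of listings containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Incremental update: only the new listing's terms are touched.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        # Exact term lookup; returns matching listing IDs in ascending order.
        return sorted(self.postings.get(term.lower(), set()))


index = InvertedIndex()
index.add(1, "Pirate movies released this year")
index.add(2, "History of Caribbean pirates")

# A new listing arrives later; the existing entries are not rebuilt.
index.add(3, "Pirate ship models for sale")

print(index.search("pirate"))  # [1, 3]
```

Adding listing 3 only touches the terms it contains, which is why an incremental update is cheaper than rebuilding everything.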
When a user types a search term into the open source search engine, the results are usually relevant. This is not always the case, and the results may lean toward information the user does not want. For example, a user who searches for pirates may only find websites about pirate movies, and not historical information about pirates. A stop word, in the sense used here, lets the user place a "-" mark before a word, which tells the search engine to block results that include that keyword. (Most search engine documentation calls this an exclusion or negation operator, and uses "stop word" for a common word such as "the" that the index ignores.)
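As a rough illustration of the "-" syntax, the sketch below filters a small list of titles. The `parse_query` and `search` functions are hypothetical, and real search engines apply exclusion inside the index rather than by scanning every title.

```python
def parse_query(query):
    """Split a query into required terms and excluded terms (prefixed with '-')."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        elif token != "-":
            required.append(token)
    return required, excluded


def search(listings, query):
    """Return titles containing every required term and no excluded term."""
    required, excluded = parse_query(query)
    results = []
    for title in listings:
        words = set(title.lower().split())
        has_required = all(term in words for term in required)
        has_excluded = any(term in words for term in excluded)
        if has_required and not has_excluded:
            results.append(title)
    return results


listings = [
    "Pirates in the movies",
    "Pirates of the seventeenth century",
]

print(search(listings, "pirates -movies"))
# ['Pirates of the seventeenth century']
```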
A fuzzy search feature may sound imprecise, but it is a useful tool that many open source search engine programs employ. Without it, the search engine can only return websites and listings that match the keyword exactly. A fuzzy search also brings up results that are close to the keyword, such as misspellings or variant forms, so the user receives broader results.
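One simple way to approximate this behavior is string similarity matching. The sketch below uses `difflib.get_close_matches` from the Python standard library; the vocabulary, the `fuzzy_terms` helper, and the 0.8 cutoff are assumptions made for the example, and production search engines typically use edit distance or n-gram techniques instead.

```python
import difflib

# Hypothetical set of terms already present in the index.
vocabulary = ["pirate", "pirates", "private", "parrot", "history"]


def fuzzy_terms(keyword, terms, cutoff=0.8):
    """Return indexed terms similar to the keyword, best match first."""
    return difflib.get_close_matches(keyword.lower(), terms, n=3, cutoff=cutoff)


# A misspelled query still finds nearby indexed terms.
matches = fuzzy_terms("pirat", vocabulary)
print(matches)
```

Lowering the cutoff broadens the results further, at the cost of more unrelated matches.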
The ranking system is how an open source search engine determines relevancy. Some search engines base relevancy on the number of times a keyword appears, when the listing or website was created, the number of links pointing to the website, or other criteria. The administrator should choose a search engine whose ranking reflects how the website itself operates. For example, if the main website allows users to post listings, a date-based search engine usually works best, because the newest listings appear first.
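To show how the choice of ranking changes the result order, here is a minimal sketch that sorts the same two listings by keyword count and by date. The `Listing` type and both ranking functions are hypothetical, and real engines combine many more signals.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Listing:
    title: str
    body: str
    posted: date


def rank_by_keyword_count(listings, keyword):
    """Listings that mention the keyword most often come first."""
    kw = keyword.lower()
    return sorted(
        listings,
        key=lambda item: item.body.lower().split().count(kw),
        reverse=True,
    )


def rank_by_date(listings):
    """Newest listings come first."""
    return sorted(listings, key=lambda item: item.posted, reverse=True)


listings = [
    Listing("Old guide", "pirates pirates pirates", date(2020, 1, 1)),
    Listing("New post", "pirates today", date(2024, 6, 1)),
]

print([item.title for item in rank_by_keyword_count(listings, "pirates")])
# ['Old guide', 'New post']
print([item.title for item in rank_by_date(listings)])
# ['New post', 'Old guide']
```

The same data produces opposite orderings, which is why the ranking method should match how the site is used.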