Solr exact match regarding the number of words
I've been looking for a week for a working solution that would allow the following:
documents: [phrase:"cat"], [phrase:"pussy cat"], [phrase:"cats"]
search query: "cat" => results: "cat", "cats" (but not "pussy cat")
search query: "cats" => results: "cats", "cat" (but not "pussy cat")
I saw several suggestions on the web on how to accomplish this. Somewhere I saw a suggestion to insert marker tokens at the beginning and end of field values when indexing, and then use "phrase queries" that include the marker tokens. In another place I saw a suggestion to calculate the number of unique terms in each document.
I find the second suggestion (calculating the number of words) quite complicated, and I cannot figure out how to use the first suggestion.
So the question is: can anyone give me a hint on how to implement "exact match regarding the requested number of words, while using stemming (word forms)" in Solr?
Any thoughts are appreciated.
Well, I have solved the problem as follows (with prefixes and suffixes):
In solrconfig.xml:
<updateRequestProcessorChain name="exact">
  <!-- copy the original phrase into a second field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">phrase</str>
    <str name="dest">phraseexact</str>
  </processor>
  <!-- wrap the copied value in marker tokens -->
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">phraseexact</str>
    <str name="pattern">^(.*)$</str>
    <str name="replacement">_prefix_ $1 _suffix_</str>
    <bool name="literalReplacement">false</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
<!-- other contents of solrconfig.xml... -->
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">exact</str>
  </lst>
</requestHandler>
In schema.xml:
<field name="phrase" type="text_en" indexed="true" stored="true"/>
<field name="phraseexact" type="text_en" indexed="true" stored="true"/>
After these changes you need to restart the Solr instance and re-index (re-add) the documents.
Now the documents look like this:
{
  "phrase": "test",
  "id": "9c95fac2ed78149c",
  "phraseexact": "_prefix_ test _suffix_",
  "_version_": 1471599816879374300
},
{
  "phrase": "test phrase",
  "id": "9c95fac2ed78123c",
  "phraseexact": "_prefix_ test phrase _suffix_",
  "_version_": 1471599816123474300
}
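To make it clearer what the update chain does to each document, here is a rough Python sketch of the same two steps (clone the field, then wrap it with a regex replacement). It only emulates the idea outside of Solr; the function name `add_exact_field` is made up for illustration, and Solr's own processors are what actually run at index time.

```python
import re

PREFIX = "_prefix_"
SUFFIX = "_suffix_"

def add_exact_field(doc):
    """Return a copy of doc with a marker-wrapped 'phraseexact' field.

    Hypothetical helper: it mimics the clone + regex-replace chain,
    it is not part of Solr.
    """
    out = dict(doc)
    # Step 1: clone the source field into the destination field.
    out["phraseexact"] = out["phrase"]
    # Step 2: wrap the whole value in marker tokens, using the same
    # pattern (^(.*)$) and replacement shape as in the config.
    out["phraseexact"] = re.sub(
        r"^(.*)$", rf"{PREFIX} \1 {SUFFIX}", out["phraseexact"], count=1
    )
    return out

docs = [{"phrase": "test"}, {"phrase": "test phrase"}]
for d in docs:
    print(add_exact_field(d)["phraseexact"])
```

The original `phrase` field is left untouched, so normal (non-exact) searches can still use it.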
If we search the documents with queries like
q=phraseexact:"_prefix_ test _suffix_"
q=phraseexact:"_prefix_ testing _suffix_"
q=phraseexact:"_prefix_ tests _suffix_"
we receive the {"phrase":"test"} document (and not {"phrase":"test phrase"}).
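For anyone wondering why this works, here is a small Python sketch of the idea, not of Solr itself. It assumes the analyzer lowercases, splits on whitespace and stems; `toy_stem` is a deliberately crude stand-in for the real English stemmer, and all function names here are hypothetical. The point is that a phrase query must match consecutive tokens, so once both markers are part of the phrase there is no room for extra words in the document.

```python
PREFIX, SUFFIX = "_prefix_", "_suffix_"

def toy_stem(token):
    """Crude stand-in for an English stemmer (assumption, not Solr's algorithm)."""
    for ending in ("ing", "s"):
        if token.endswith(ending) and len(token) - len(ending) >= 3:
            return token[: -len(ending)]
    return token

def analyze(text):
    """Lowercase, split on whitespace, stem each token."""
    return [toy_stem(t) for t in text.lower().split()]

def wrap(text):
    """Add the marker tokens around a phrase."""
    return f"{PREFIX} {text} {SUFFIX}"

def phrase_match(query, indexed):
    """True if the analyzed query tokens occur consecutively in the indexed text."""
    q, d = analyze(query), analyze(indexed)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

def solr_query(text):
    """Build the q value; only backslashes and double quotes are escaped here."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'phraseexact:"{wrap(escaped)}"'

phrases = ["test", "test phrase"]
for user_input in ("test", "testing", "tests"):
    hits = [p for p in phrases if phrase_match(wrap(user_input), wrap(p))]
    print(solr_query(user_input), "->", hits)
```

Without the markers, `phrase_match("test", "test phrase")` is true in this sketch, which is exactly the unwanted match the markers are there to prevent.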
solr exact-match