lucene - Elasticsearch - How to normalize score when combining regular query and function_score?
Ideally, I am trying to assign weights to queries such that query1 constitutes 30% of the final score and query2 constitutes the other 70%, so that to achieve the maximum score a document has to have the highest possible score on both query1 and query2. My study of the documentation did not yield any hints on how to achieve this, so let's try to solve a simpler problem.
Consider a query in the following form:
{
  "query": {
    "bool": {
      "should": [
        {
          "function_score": {
            "query": {"match_all": {}},
            "script_score": {
              "script": "<some_script>"
            }
          }
        },
        {
          "match": {
            "message": "this test"
          }
        }
      ]
    }
  }
}
The script can return an arbitrary number (for example, it can return 12392002).
How do I make sure that the result of the script does not dominate the overall score?
Is there a way to normalize it? For example, instead of the raw script score, could it return the ratio to max_script_score (the score achieved by the document with the highest script score)?
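To make the target concrete, here is a minimal client-side sketch in Python of the ratio-to-max idea combined with the 30%/70% weighting. It is not something Elasticsearch does for you: it assumes the two raw scores per document have already been fetched separately, that scores are non-negative, and that query1 is the script part and query2 the text match. The names `normalize_by_max` and `combine` are hypothetical helpers.

```python
def normalize_by_max(scores):
    """Scale non-negative scores into [0, 1] by dividing by the largest one."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        # Nothing to scale against: treat every document as scoring 0.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: score / top for doc_id, score in scores.items()}


def combine(script_scores, text_scores, w_script=0.3, w_text=0.7):
    """Weighted sum of the two normalized scores (weights are assumptions)."""
    norm_script = normalize_by_max(script_scores)
    norm_text = normalize_by_max(text_scores)
    doc_ids = set(norm_script) | set(norm_text)
    return {
        doc_id: w_script * norm_script.get(doc_id, 0.0)
        + w_text * norm_text.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Made-up scores: the script value is huge, the text score is small.
script_scores = {"a": 12392002.0, "b": 6196001.0}
text_scores = {"a": 1.0, "b": 2.0}
combined = combine(script_scores, text_scores)
```

With these made-up inputs, document "b" ends up ahead of "a" even though its script score is half as large, because after normalization the text match carries 70% of the weight. A document reaches the maximum of 1.0 only if it is the top scorer on both parts.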
I was recently working on this problem too. I couldn't find any formal documentation on this issue, but when I investigated the results with the explain API, it seems that "queryNorm" is not applied to the score coming directly from the "functions" field. This means that we cannot directly normalize the script value.
However, I think I found a slightly tricky solution to this problem. If you combine the functions field with a query (a match_all query) and give a boost to that query, normalization works on that query; that is, the multiplication of the two scores, the normalized query score and the script score, gives a total normalization. For a better explanation, the query would look like:
{
  "query": {
    "bool": {
      "should": [
        {
          "function_score": {
            "query": {"match_all": {"boost": 1}},
            "functions": [
              {
                "script_score": {
                  "script": "<some_script>"
                }
              }
            ],
            "score_mode": "sum",
            "boost_mode": "multiply"
          }
        },
        {
          "match": {
            "message": "this test"
          }
        }
      ]
    }
  }
}
This answer is not a proper solution to the problem, but I think you can play with this query to obtain the required result. My suggestion is to use the explain API, try to understand what it returns, examine the parameters affecting the final score, and play with the script and boost values to get an optimized solution.
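As a rough sketch of that workflow, the Python snippet below builds the same kind of search body with "explain": true set, so that each hit comes back with a score breakdown. The helper name `build_explained_query` and the default field name are hypothetical; the script is passed as a plain string, as in the queries in this thread, which may need adapting for your Elasticsearch version. How the body is sent (client library, curl) is left out on purpose.

```python
import json


def build_explained_query(script, text, field="message", boost=1):
    """Build a search body combining function_score and match, with explain on."""
    return {
        "explain": True,  # ask Elasticsearch to return a score explanation per hit
        "query": {
            "bool": {
                "should": [
                    {
                        "function_score": {
                            "query": {"match_all": {"boost": boost}},
                            "functions": [{"script_score": {"script": script}}],
                            "score_mode": "sum",
                            "boost_mode": "multiply",
                        }
                    },
                    {"match": {field: text}},
                ]
            }
        },
    }


body = build_explained_query("<some_script>", "this test")
print(json.dumps(body, indent=2))
```

Changing `boost` and the script and then comparing the explanations of the same document is one way to see which factor is driving the final score.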
By the way, a "rescore query" may help a lot in obtaining the 30%-70% ratio on the final score: official documentation
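A minimal sketch of what such a rescore request might look like, built as a Python dict. It assumes the text match is the main query weighted 0.7 and the script part is the rescore query weighted 0.3; the helper name `build_rescore_body` and the window size of 50 are arbitrary choices for illustration.

```python
import json


def build_rescore_body(text, script, window_size=50, w_text=0.7, w_script=0.3):
    """Main query = text match; rescore query = script score, combined by weights."""
    return {
        "query": {"match": {"message": text}},
        "rescore": {
            "window_size": window_size,  # only the top hits per shard are rescored
            "query": {
                "rescore_query": {
                    "function_score": {
                        "query": {"match_all": {}},
                        "script_score": {"script": script},
                    }
                },
                "query_weight": w_text,
                "rescore_query_weight": w_script,
                "score_mode": "total",  # add the two weighted scores together
            },
        },
    }


body = build_rescore_body("this test", "<some_script>")
print(json.dumps(body, indent=2))
```

Note that the weights multiply the raw scores, so this gives a true 30%/70% split only if the script is written to return values on a scale comparable to the text score. Only documents inside the rescore window get the script contribution.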