java - Elasticsearch - How to query one result from 30 million documents quickly
Now the situation is that I want to search Elasticsearch 3 * 1000000 times in a short time. My test setup is 1 ES cluster with a 4-core CPU and 16 GB of memory, and the run takes 8 hours. The query I use is:
xxx/type/_search
{
  "query": {
    "match": {
      "poiname": {
        "query": "xxxxx",
        "operator": "or"
      }
    }
  }
}
And I use a Java HTTP request to query Elasticsearch in Hadoop.
URL url = new URL(searchUrl);
con = (HttpURLConnection) url.openConnection();
con.setDoOutput(true);  // a request body is written, so the request is sent as a POST
con.setDoInput(true);
OutputStreamWriter wr = new OutputStreamWriter(con.getOutputStream());
String query = getQueryJson(field, value);
wr.write(query);
wr.flush();
int httpResult = con.getResponseCode();
if (httpResult == HttpURLConnection.HTTP_OK) {
    // read the whole response body into sb, line by line
    BufferedReader br = new BufferedReader(new InputStreamReader(con.getInputStream(), "utf-8"));
    String line = null;
    while ((line = br.readLine()) != null) {
        sb.append(line + "\n");
    }
    br.close();
}
In fact, we only need 1 result in the response. How can I improve this?
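The getQueryJson helper called in the code above is not shown in the question. A minimal sketch of what it might look like follows, assuming it builds the match query from the top of the post. The "size": 1 entry is an addition not present in the original query: since only one result is needed, it asks Elasticsearch to return a single hit instead of the default ten. The class name and the escapeJson helper are hypothetical.

```java
// Hypothetical sketch: the question does not show getQueryJson, so this is an assumption.
class MatchQuerySketch {

    // Escape backslashes and double quotes so the value can sit inside a JSON string.
    // Control characters are not handled here.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Build the match query from the question, plus "size": 1 (an assumption:
    // only the top hit is needed, so the default of ten hits is unnecessary).
    static String getQueryJson(String field, String value) {
        return "{\"size\":1,\"query\":{\"match\":{\""
                + escapeJson(field) + "\":{\"query\":\"" + escapeJson(value)
                + "\",\"operator\":\"or\"}}}}";
    }

    public static void main(String[] args) {
        System.out.println(getQueryJson("poiname", "an foo eoo"));
    }
}
```

Whether this alone shortens the 8-hour run noticeably is not something the question gives enough information to say; it only trims the response to the one hit that is actually used.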
===================update===============================
For this task:
The document looks like {"doc_name":"an foo eoo","name_id":123456,"other field":"value"}.
We query ES with "ann foo eoo" to get the name_id; we do not need the hits themselves.
We query 3 * 1000000 different doc_name values against Elasticsearch.
Actually, we only need the matching result and do not care what the score is. I attach the terms query below. The minimum_match depends on the number of terms in poiname.
(PS: minimum_match = Math.ceil(number of terms in poiname / 2))
GET xxx/type/_search
{
  "query": {
    "terms": {
      "poiname": [ "an", "foo", "eoo" ],
      "minimum_match": 2
    }
  }
}
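The minimum_match rule above can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, splitting poiname on whitespace is an assumption that has to agree with how the field is analyzed at index time, and the position of minimum_match in the body is copied from the query in this post rather than checked against a particular Elasticsearch version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of building the terms query shown above.
class TermsQuerySketch {

    // minimum_match = ceil(termCount / 2), computed with integer arithmetic.
    static int minimumMatch(int termCount) {
        return (termCount + 1) / 2;
    }

    // Assumption: terms are separated by whitespace.
    static List<String> terms(String poiname) {
        return Arrays.stream(poiname.trim().split("\\s+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Build the request body: a JSON array of the terms plus the derived minimum_match.
    static String termsQueryJson(String field, String poiname) {
        List<String> ts = terms(poiname);
        String array = ts.stream()
                .map(t -> "\"" + t.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"terms\":{\"" + field + "\":[" + array
                + "],\"minimum_match\":" + minimumMatch(ts.size()) + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(termsQueryJson("poiname", "an foo eoo"));
    }
}
```

For the three terms "an", "foo", "eoo" this gives a minimum_match of 2, which agrees with the example query.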
java elasticsearch