Breeding: java - ElasticSearch- How to query one result from 30 million documents quickly -

Tuesday, 15 March 2011

java - ElasticSearch- How to query one result from 30 million documents quickly -

My current situation is that I want to search Elasticsearch 30 million times in a short time. For testing I set up one ES cluster on a machine with a 4-core CPU and 16 GB of memory, and the full run takes 8 hours. The query I use is:

    xxx/type/_search
    {
      "query": {
        "match": {
          "poiname": {
            "query": "xxxxx",
            "operator": "or"
          }
        }
      }
    }

I use Java HTTP requests to query Elasticsearch from Hadoop:

        // Fragment: needs java.io.* and java.net.* imports; con and sb are declared elsewhere.
        URL url = new URL(searchUrl);
        con = (HttpURLConnection) url.openConnection();
        con.setDoOutput(true);
        con.setDoInput(true);

        // Send the JSON query as the request body.
        OutputStreamWriter wr = new OutputStreamWriter(con.getOutputStream());
        String query = getQueryJson(field, value);
        wr.write(query);
        wr.flush();

        // Read the response body line by line on HTTP 200.
        int httpResult = con.getResponseCode();
        if (httpResult == HttpURLConnection.HTTP_OK) {
            BufferedReader br = new BufferedReader(new InputStreamReader(con.getInputStream(), "utf-8"));
            String line = null;
            while ((line = br.readLine()) != null) {
                sb.append(line + "\n");
            }
            br.close();
        }

In fact, we only need one result per response. How can I improve this?
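Since only one result is needed, one option worth trying is to ask Elasticsearch for a single hit through the `size` parameter of the search request body. The sketch below is a hypothetical version of the `getQueryJson` helper used in the snippet above (its real body is not shown in the post), and the class name `SingleHitQuery` is made up for illustration. It only shrinks the response; whether it shortens the total run time would have to be measured.

```java
// Hypothetical stand-in for the getQueryJson helper referenced in the post.
final class SingleHitQuery {

    // Escapes backslashes and double quotes so the value stays valid inside a
    // JSON string. Control characters are not handled in this sketch.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the same match query as in the post, with "size": 1 added so that
    // at most one hit is returned.
    static String getQueryJson(String field, String value) {
        return "{\"size\":1,\"query\":{\"match\":{\"" + escapeJson(field)
                + "\":{\"query\":\"" + escapeJson(value)
                + "\",\"operator\":\"or\"}}}}";
    }

    public static void main(String[] args) {
        System.out.println(getQueryJson("poiname", "xxxxx"));
    }
}
```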

===================update===============================

For this task:

A document looks like {"doc_name":"an foo eoo","name_id":123456,"other field":"value"}.

We query ES with "ann foo eoo" to get the name_id; we do not need the hits themselves.

We need to query Elasticsearch for 30 million different doc_name values.

Actually, we only need the matching result and do not care how high its score is. I have attached the terms query I use. The minimum_match value depends on the number of terms in poiname.

(P.S. minimum_match = Math.ceil(number of terms in poiname / 2))

    GET xxx/type/_search
    {
      "query": {
        "terms": {
          "poiname": [
            "an",
            "foo",
            "eoo"
          ],
          "minimum_match": 2
        }
      }
    }
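The minimum_match rule and the terms query above can be sketched in Java as follows. This is a minimal illustration, not the code actually used: the class and method names are hypothetical, and it assumes the name is split into terms on whitespace, which may differ from how the poiname field is analyzed in the index. The JSON shape simply mirrors the query shown above.

```java
import java.util.StringJoiner;

// Hypothetical helper that rebuilds the terms query from the post.
final class MinimumMatchQuery {

    // minimum_match = ceil(number of terms / 2), as stated in the post.
    static int minimumMatch(int termCount) {
        return (int) Math.ceil(termCount / 2.0);
    }

    // Assumption: terms are obtained by splitting a non-empty name on whitespace.
    static String termsQueryJson(String field, String docName) {
        String[] terms = docName.trim().split("\\s+");
        StringJoiner joined = new StringJoiner(",", "[", "]");
        for (String term : terms) {
            // Escape backslashes and quotes so each term is a valid JSON string.
            joined.add("\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"");
        }
        return "{\"query\":{\"terms\":{\"" + field + "\":" + joined
                + ",\"minimum_match\":" + minimumMatch(terms.length) + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(termsQueryJson("poiname", "an foo eoo"));
    }
}
```

For the three-term name "an foo eoo" this yields minimum_match 2, the same value as in the query above.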

java elasticsearch
