Tuesday, 15 February 2011

java - Simplest way to allow range queries on numeric fields in Lucene 4




I'm revisiting some code written for Lucene 3.0 (I updated it so that it works with 4.0, but I haven't yet taken advantage of the 4.0 improvements).

For numeric fields that I want to be able to run range queries on, I add the field to the document as follows:

    BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
    NumericUtils.intToPrefixCoded(value, 0, bytes);
    doc.add(new Field(fieldName, bytes.utf8ToString(), new FieldType(StringField.TYPE_NOT_STORED)));
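The reason the value is encoded rather than indexed as plain decimal text is that terms are compared as strings, and "10" sorts before "9". The sketch below illustrates the idea with a deliberately simplified encoding (flip the sign bit, then write fixed-width hex). It is not Lucene's actual prefix coding, and `SortableIntDemo`, `encode` and `decode` are hypothetical names used only for illustration.

```java
final class SortableIntDemo {

    // Flip the sign bit so negative values order before positive ones,
    // then render as 8 lowercase hex digits so string order matches numeric order.
    static String encode(int value) {
        return String.format("%08x", value ^ 0x80000000);
    }

    // Reverse of encode(): parse the hex digits and flip the sign bit back.
    static int decode(String encoded) {
        return ((int) Long.parseLong(encoded, 16)) ^ 0x80000000;
    }

    public static void main(String[] args) {
        // Plain decimal strings compare in the wrong order.
        System.out.println("10".compareTo("9") < 0);
        // Encoded strings compare in numeric order.
        System.out.println(encode(9).compareTo(encode(10)) < 0);
        System.out.println(encode(-5).compareTo(encode(3)) < 0);
    }
}
```

With an order-preserving encoding like this, a range over terms behaves like a range over the numbers they represent, which is what the range query relies on.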

I then have a custom query parser that overrides the methods newTermQuery() and newRangeQuery(),

i.e.

    public class ReleaseQueryParser extends MultiFieldQueryParser {

        public ReleaseQueryParser(String[] strings, Analyzer a) {
            super(LuceneVersion.LUCENE_VERSION, strings, a);
        }

        protected Query newTermQuery(Term term) {
            if ((term.field().equals(ReleaseIndexField.NUM_TRACKS.getName()))
                    || (term.field().equals(ReleaseIndexField.NUM_TRACKS_MEDIUM.getName()))
                    || (term.field().equals(ReleaseIndexField.NUM_MEDIUMS.getName()))) {
                try {
                    int number = Integer.parseInt(term.text());
                    BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
                    NumericUtils.intToPrefixCoded(number, 0, bytes);
                    TermQuery tq = new TermQuery(new Term(term.field(), bytes.utf8ToString()));
                    return tq;
                } catch (NumberFormatException nfe) {
                    // If not provided a numeric argument just leave as is, won't give matches
                    return super.newTermQuery(term);
                }
            } else {
                return super.newTermQuery(term);
            }
        }

        @Override
        public Query newRangeQuery(String field, String part1, String part2,
                                   boolean startInclusive, boolean endInclusive) {
            if ((field.equals(ReleaseIndexField.NUM_TRACKS.getName()))
                    || (field.equals(ReleaseIndexField.NUM_TRACKS_MEDIUM.getName()))
                    || (field.equals(ReleaseIndexField.NUM_MEDIUMS.getName()))) {
                BytesRef bytes1 = new BytesRef(NumericUtils.BUF_SIZE_INT);
                BytesRef bytes2 = new BytesRef(NumericUtils.BUF_SIZE_INT);
                NumericUtils.intToPrefixCoded(Integer.parseInt(part1), 0, bytes1);
                NumericUtils.intToPrefixCoded(Integer.parseInt(part2), 0, bytes2);
                part1 = bytes1.utf8ToString();
                part2 = bytes2.utf8ToString();
            }
            TermRangeQuery query = (TermRangeQuery)
                    super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
            return query;
        }
    }

I've always thought this code was rather clunky, and I'm wondering whether it can be simplified?
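One small simplification that is independent of the Lucene API is to stop repeating the three-way field comparison in each overridden method. A minimal sketch, assuming the numeric field names can be collected once into a set; the class name `NumericFieldNames` and the literal field names are hypothetical stand-ins for the values returned by ReleaseIndexField.getName():

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class NumericFieldNames {

    // Hypothetical field names; in the parser these would come from
    // ReleaseIndexField.NUM_TRACKS.getName() and the other two constants.
    private static final Set<String> NUMERIC_FIELDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("tracks", "tracksmedium", "mediums")));

    // True when the field should be handled as a numeric field.
    static boolean isNumeric(String field) {
        return NUMERIC_FIELDS.contains(field);
    }
}
```

Each override could then start with a single `if (NumericFieldNames.isNumeric(field))` check instead of the chained equals() calls.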

Update

I attempted to use IntField as follows:

    doc.add(new IntField(field.getName(), value, new FieldType(StringField.TYPE_NOT_STORED)));

It compiled okay, but the index build test method

    Fields fields = MultiFields.getFields(ir);
    Terms terms = fields.terms(field.getName());
    TermsEnum termsEnum = terms.iterator(null);
    termsEnum.next();
    assertEquals(value, NumericUtils.prefixCodedToInt(termsEnum.term()));

fails with a NullPointerException on the terms.iterator() line.

I changed it to

    doc.add(new IntField(field.getName(), value, new FieldType(IntField.TYPE_NOT_STORED)));

and it worked. I'm surprised that the line

    NumericUtils.prefixCodedToInt(termsEnum.term())

still works; I guess IntField is a wrapper around the BytesRef code I had.
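As far as I understand it, a numeric field indexes the same prefix-coded form as the hand-written code (shift 0), plus additional lower-precision terms at larger shifts, which is what lets a numeric range query match with few terms. The sketch below only illustrates that idea in plain Java: `PrecisionStepDemo` and its methods are hypothetical, and the precision step is passed in rather than assumed to be any particular Lucene default.

```java
import java.util.ArrayList;
import java.util.List;

final class PrecisionStepDemo {

    // Shift amounts at which a 32-bit value would be indexed:
    // 0, precisionStep, 2 * precisionStep, ... while the shift stays below 32.
    static List<Integer> shifts(int precisionStep) {
        List<Integer> result = new ArrayList<>();
        for (int shift = 0; shift < 32; shift += precisionStep) {
            result.add(shift);
        }
        return result;
    }

    // The value with its lowest `shift` bits dropped, i.e. the coarser
    // bucket the value falls into at that precision level.
    static int truncate(int value, int shift) {
        return value >>> shift;
    }
}
```

The shift-0 entry carries the full value, which would explain why decoding the first enumerated term with prefixCodedToInt() still returns the original number.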

I then rewrote the query parser as follows:

    public class ReleaseQueryParser extends MultiFieldQueryParser {

        public ReleaseQueryParser(String[] strings, Analyzer a) {
            super(LuceneVersion.LUCENE_VERSION, strings, a);
        }

        protected Query newTermQuery(Term term) {
            if ((term.field().equals(ReleaseIndexField.NUM_TRACKS.getName()))
                    || (term.field().equals(ReleaseIndexField.NUM_TRACKS_MEDIUM.getName()))
                    || (term.field().equals(ReleaseIndexField.NUM_MEDIUMS.getName()))) {
                return NumericRangeQuery.newIntRange(term.field(),
                        Integer.parseInt(term.text()), Integer.parseInt(term.text()), true, true);
            } else {
                return super.newTermQuery(term);
            }
        }

        @Override
        public Query newRangeQuery(String field, String part1, String part2,
                                   boolean startInclusive, boolean endInclusive) {
            if ((field.equals(ReleaseIndexField.NUM_TRACKS.getName()))
                    || (field.equals(ReleaseIndexField.NUM_TRACKS_MEDIUM.getName()))
                    || (field.equals(ReleaseIndexField.NUM_MEDIUMS.getName()))) {
                return NumericRangeQuery.newIntRange(field,
                        Integer.parseInt(part1), Integer.parseInt(part2), startInclusive, endInclusive);
            } else {
                return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
            }
        }
    }

and it worked. I didn't want to use a range query within newTermQuery(Term term), but I couldn't see an easier way.
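One thing the rewrite loses is the NumberFormatException handling from the first version: Integer.parseInt() is now called directly, so non-numeric query text for one of these fields would throw instead of falling back to the default query. A minimal sketch of a helper that restores the fallback; `NumericQueryText` and `tryParseInt` are hypothetical names, not part of Lucene:

```java
final class NumericQueryText {

    // Returns the parsed value, or null when the text is missing
    // or is not a valid integer.
    static Integer tryParseInt(String text) {
        if (text == null) {
            return null;
        }
        try {
            return Integer.valueOf(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

In newTermQuery() the parser could call `tryParseInt(term.text())` and return super.newTermQuery(term) when the result is null, which matches what the original catch block did.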

Use the IntField or LongField classes provided by Lucene. You can then query the index using the standard NumericRangeQuery of the specified type.

java lucene
