Saturday, 15 June 2013

Java fuzzy search for an entity name with typos as well as abbreviations




I need to implement a fuzzy search in Java for an entity name (for example, a manufacturer name) that takes care of:

(a) typos, and (b) shortened forms such as Limited, Ltd, etc.

Say I need to identify that items 1 through 7 below refer to the same entity, while 8 and 9 refer to a different entity:

1) info scheme technlogies
2) info scheme technlogies
3) info scheme techlology limited
4) info scheme techlology ltd
5) info scheme technlogies limited
6) info scheme ltd
7) limited
8) delivery scheme technologies limited
9) ds limited

Using Levenshtein distance, won't 5 and 8 appear more similar to each other, and likewise 7 and 9, when it should be the other way round in both cases?
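To make the concern concrete, here is a minimal sketch of a plain character-level Levenshtein distance applied to the names above. The class name `LevenshteinDemo` and the method `distance` are hypothetical names chosen for this illustration, not part of any library, and the unit cost for every insert, delete, and substitute is an assumption of the textbook version of the algorithm.

```java
final class LevenshteinDemo {

    // Textbook dynamic-programming edit distance:
    // insert, delete and substitute each cost 1 (assumed unit costs).
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String n5 = "info scheme technlogies limited";
        String n6 = "info scheme ltd";
        String n7 = "limited";
        String n8 = "delivery scheme technologies limited";
        String n9 = "ds limited";

        // Same entity (5 vs 6) compared with a different entity (5 vs 8).
        System.out.println("5 vs 6: " + distance(n5, n6));
        System.out.println("5 vs 8: " + distance(n5, n8));

        // Same entity (7 vs 6) compared with a different entity (7 vs 9).
        System.out.println("7 vs 6: " + distance(n7, n6));
        System.out.println("7 vs 9: " + distance(n7, n9));
    }
}
```

Because the raw distance is at least the difference in string lengths, dropping or abbreviating a whole word (as in 6) costs more edits than swapping one short word for another (as in 8 or 9), which is the ordering problem described above.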

I do not want to have to maintain a pre-defined dictionary of abbreviations: I have a big-data situation, and pre-empting all the possibilities may not be feasible.

Any pointers on whether a single fuzzy method can handle both typos and abbreviations here, or whether I need to go hybrid, and what would be best to use in that case, please?

java fuzzy-search
