Java - Collections.sort() isn't sorting in the right order
I have this code in Java:
List<String> unsorted = new ArrayList<String>();
List<String> beforehash = new ArrayList<String>();
String[] unsortedaux, beforehashaux;
String line = null;
BufferedReader reader = new BufferedReader(new FileReader("c:\\cpd\\temp0.txt"));
while ((line = reader.readLine()) != null) {
    unsorted.add(line);
    beforehash.add(line.split("#")[0]);
}
reader.close();
Collections.sort(beforehash);
beforehashaux = beforehash.toArray(new String[beforehash.size()]);
unsortedaux = unsorted.toArray(new String[unsorted.size()]);
System.out.println(Arrays.toString(beforehashaux));
System.out.println(Arrays.toString(unsortedaux));
It reads a file named temp0.txt, which contains:
carlos magno#261
mateus carl#12
analise soares#151
giancarlo tobias#150
My goal is to sort the names in each string, ignoring the part after "#". I'm using beforehash.add(line.split("#")[0]); to do this. The problem is that it reads the file correctly but sorts in the wrong order. The corresponding outputs are:
[analise soares, giancarlo tobias, mateus carl, carlos magno]
[carlos magno#261, mateus carl#12, analise soares#151, giancarlo tobias#150]
The first result is the "sorted" one; note that "carlos magno" comes after "mateus carl". I cannot find the problem in my code.
The problem is that "carlos magno" starts with a Unicode byte-order mark.
If you copy and paste your sample text ([analise ... carlos magno]) into a Unicode explorer, you'll see that just before the "c" of carlos magno, you've got U+FEFF.
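If you'd rather check this from Java than from a Unicode explorer, something like the sketch below should show the same thing. The class and method names (BomSortDemo, firstCodePoint, sortedCopy) are made up for illustration, and the leading U+FEFF is hard-coded on the first name to mimic what reading the first line of a BOM-prefixed file would give you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BomSortDemo {
    // Returns the first code point of s as a "U+XXXX" label.
    static String firstCodePoint(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    // Sorts a copy of the names using the natural String ordering,
    // which compares char values one by one.
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Assumption: the first name carries an invisible leading U+FEFF.
        List<String> names = Arrays.asList(
                "\uFEFFcarlos magno", "mateus carl", "analise soares", "giancarlo tobias");
        List<String> sorted = sortedCopy(names);

        // The marked name ends up last, because U+FEFF is greater than any ASCII letter.
        String last = sorted.get(3);
        System.out.println(firstCodePoint(last)); // U+FEFF
        System.out.println(last.length());        // 13, one more than the 12 visible characters
    }
}
```

The string length is a quick giveaway too: the printed name looks 12 characters long, but length() reports 13.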
Basically, you'll need to strip this when reading the file. The easiest way is to use:
line = line.replace("\ufeff", "");
... or check first:
if (line.startsWith("\ufeff")) { line = line.substring(1); }
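To see the effect on the sort, here is a minimal sketch that applies the startsWith check before splitting on "#". The names BomStripDemo, stripBom and sortedNames are hypothetical, and the input lines are the sample data from the question with a U+FEFF added in front of the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BomStripDemo {
    // Removes a single leading U+FEFF if present; otherwise returns the line unchanged.
    static String stripBom(String line) {
        return line.startsWith("\uFEFF") ? line.substring(1) : line;
    }

    // Strips the mark, keeps the text before '#', and sorts the resulting names.
    static List<String> sortedNames(List<String> lines) {
        List<String> names = new ArrayList<>();
        for (String line : lines) {
            names.add(stripBom(line).split("#")[0]);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Assumption: the first line starts with U+FEFF, as in the question's file.
        List<String> lines = Arrays.asList(
                "\uFEFFcarlos magno#261", "mateus carl#12",
                "analise soares#151", "giancarlo tobias#150");
        // Prints [analise soares, carlos magno, giancarlo tobias, mateus carl]
        System.out.println(sortedNames(lines));
    }
}
```

Since a byte-order mark can only sit at the very start of the file, running the check on every line is harmless but only ever matters for the first one.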
Note that you should really specify the encoding you want to use when opening the file; use a FileInputStream wrapped in an InputStreamReader.
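A rough sketch of what that could look like, assuming the file is UTF-8 (the question doesn't say which encoding it actually uses, so adjust the charset to match). EncodedFileReader and readLines are illustrative names. Java's UTF-8 decoder does not drop a leading byte-order mark by itself, so the strip is still needed.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class EncodedFileReader {
    // Reads all lines from path, decoding the bytes as UTF-8 (an assumption
    // about the file) and dropping a leading U+FEFF from the first line.
    static List<String> readLines(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Only the first line of the file can carry the mark.
                if (lines.isEmpty() && line.startsWith("\uFEFF")) {
                    line = line.substring(1);
                }
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Pass the file to read as the only argument.
        if (args.length == 1) {
            for (String line : readLines(args[0])) {
                System.out.println(line);
            }
        }
    }
}
```

With the charset spelled out, the result no longer depends on the platform default encoding that FileReader would otherwise pick up.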
java sorting collections