Tuesday, 15 April 2014

.net - Performance issue with reading integers from a binary file at specific locations in F# -




This morning I asked here why my Python code was (a lot) slower than my F# version, but now I'm wondering whether the F# version can be made faster as well. Any ideas on how to create a faster version of the code below, which reads a sorted list of unique indexes from a binary file of 32-bit integers? Note that I tried two approaches, one based on BinaryReader and the other based on MemoryMappedFile (and some more on GitHub).

    open System
    open System.IO

    module SimpleRead =
        let readValue (reader:BinaryReader) cellIndex =
            // set stream to the correct location (4 bytes per cell)
            reader.BaseStream.Position <- cellIndex*4L
            // Int32.MinValue marks a cell without a value
            match reader.ReadInt32() with
            | Int32.MinValue -> None
            | v -> Some(v)

        let readValues fileName indices =
            use reader = new BinaryReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
            // Use a list or array to force creation of the values
            // (otherwise the reader gets disposed before the values are read)
            let values = List.map (readValue reader) (List.ofSeq indices)
            values

    module MemoryMappedSimpleRead =

        open System.IO.MemoryMappedFiles

        let readValue (reader:MemoryMappedViewAccessor) offset cellIndex =
            // position relative to the start of the mapped view
            let position = (cellIndex*4L) - offset
            match reader.ReadInt32(position) with
            | Int32.MinValue -> None
            | v -> Some(v)

        let readValues fileName indices =
            use mmf = MemoryMappedFile.CreateFromFile(fileName, FileMode.Open)
            // map only the byte range between the smallest and largest index
            let offset = (Seq.min indices) * 4L
            let last = (Seq.max indices) * 4L
            let length = 4L+last-offset
            use reader = mmf.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read)
            let values = (List.ofSeq indices) |> List.map (readValue reader offset)
            values
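To make the expected behaviour concrete, here is a minimal, self-contained sketch that writes a tiny test file and reads three cells from it with the same sentinel logic as the BinaryReader version. The `writeTestFile` helper, the temporary file and the sample values are assumptions made only for illustration; they are not part of the original code.

```fsharp
open System
open System.IO

// Same sentinel logic as readValue above: Int32.MinValue marks "no value".
let readValue (reader: BinaryReader) (cellIndex: int64) =
    reader.BaseStream.Position <- cellIndex * 4L
    match reader.ReadInt32() with
    | Int32.MinValue -> None
    | v -> Some v

// Hypothetical helper: write raw 32-bit integers to a file.
let writeTestFile (path: string) (values: int[]) =
    use writer = new BinaryWriter(File.Open(path, FileMode.Create))
    values |> Array.iter (fun (v: int) -> writer.Write v)

// Illustrative data: cell 1 holds the "no value" sentinel.
let path = Path.GetTempFileName()
writeTestFile path [| 10; Int32.MinValue; 30; 40 |]

let result =
    use reader = new BinaryReader(File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read))
    // List.map is eager, so all values are read before the reader is disposed.
    [ 0L; 1L; 3L ] |> List.map (readValue reader)

File.Delete path
printfn "%A" result   // prints [Some 10; None; Some 40]
```

The indices are passed as `int64` here because the byte position is computed with `4L`.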

For comparison, here is the latest NumPy version:

    import numpy as np

    def convert(v):
        # -2147483648 (the minimum int32) marks a cell without a value
        if v != -2147483648:
            return v
        else:
            return None

    def read_values(filename, indices):
        values_arr = np.memmap(filename, dtype='int32', mode='r')
        # list(...) forces the values to be read, also on Python 3 where map is lazy
        return list(map(convert, values_arr[indices]))

Update: Contrary to what I said here before, Python is still a lot slower than the F# version; an error in my Python tests made it appear otherwise. I'm leaving this question here in case someone with in-depth knowledge of BinaryReader or MemoryMappedFile knows of any improvements.

I managed to make the simple reader 30% faster by using reader.BaseStream.Seek instead of reader.BaseStream.Position. I also replaced the lists with arrays, but that didn't change a lot.

The full code of the simple reader is now:

    open System
    open System.IO

    let readValue (reader:BinaryReader) cellIndex =
        // set stream to the correct location
        // (widen to int64 before multiplying so large indices cannot overflow)
        reader.BaseStream.Seek(int64 cellIndex * 4L, SeekOrigin.Begin) |> ignore
        match reader.ReadInt32() with
        | Int32.MinValue -> None
        | v -> Some(v)

    let readValues indices fileName =
        use reader = new BinaryReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
        // Use a list or array to force creation of the values
        // (otherwise the reader gets disposed before the values are read)
        let values = Array.map (readValue reader) indices
        values
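A difference like the 30% between Position and Seek depends on the machine, the file and the runtime, so it is worth measuring locally. The following is only a rough timing sketch: the `time`, `writeTestFile` and `run` helpers, the file size and the index stride are all assumptions for illustration, and a single Stopwatch run is not a rigorous benchmark.

```fsharp
open System
open System.Diagnostics
open System.IO

// Hypothetical helper: run f once and return (elapsed milliseconds, result).
let time (f: unit -> 'a) =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    sw.Elapsed.TotalMilliseconds, result

// The two ways of positioning the stream that are compared above.
let readWithPosition (reader: BinaryReader) (cellIndex: int64) =
    reader.BaseStream.Position <- cellIndex * 4L
    reader.ReadInt32()

let readWithSeek (reader: BinaryReader) (cellIndex: int64) =
    reader.BaseStream.Seek(cellIndex * 4L, SeekOrigin.Begin) |> ignore
    reader.ReadInt32()

// Hypothetical test data: cell i simply contains the value i.
let writeTestFile (path: string) (count: int) =
    use writer = new BinaryWriter(File.Open(path, FileMode.Create))
    for i in 0 .. count - 1 do
        writer.Write i

let path = Path.GetTempFileName()
let count = 1000000
writeTestFile path count

// Sorted, unique indices: every tenth cell.
let indices = [| 0L .. 10L .. int64 (count - 1) |]

let run readOne =
    use reader = new BinaryReader(File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read))
    Array.map (readOne reader) indices

let msPosition, viaPosition = time (fun () -> run readWithPosition)
let msSeek, viaSeek = time (fun () -> run readWithSeek)
File.Delete path

printfn "Position: %.1f ms, Seek: %.1f ms, same values: %b" msPosition msSeek (viaPosition = viaSeek)
```

Running each variant several times and discarding the first run (JIT compilation and file cache warm-up) should give more stable numbers.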

The full code, along with versions in other languages, is on GitHub.

.net f# binaryfiles memory-mapped-files binaryreader
