Haskell - Lazy ByteString: memory exploding in certain cases
Below I have two seemingly functionally equivalent programs. In the first, memory usage remains constant, whereas in the second, memory explodes (using GHC 7.8.2 and bytestring-0.10.4.0 on Ubuntu 14.04 64-bit):
Non-exploding:
    -- noexplode.hs
    -- ghc -O3 noexplode.hs
    module Main where

    import qualified Data.ByteString.Lazy as BL
    import qualified Data.ByteString.Lazy.Char8 as BLC

    num = 1000000000
    bytenull = BLC.pack ""

    -- Count the bytes one at a time until the string is empty.
    countDataPoint arg sum
        | arg == bytenull = sum
        | otherwise = countDataPoint (BL.tail arg) (sum+1)

    -- Each test builds its own lazy bytestring.
    test1 = BL.last $ BL.take num $ BLC.cycle $ BLC.pack "abc"
    test2 = countDataPoint (BL.take num $ BLC.cycle $ BLC.pack "abc") 0

    main = do
        print test1
        print test2
Exploding:
    -- explode.hs
    -- ghc -O3 explode.hs
    module Main where

    import qualified Data.ByteString.Lazy as BL
    import qualified Data.ByteString.Lazy.Char8 as BLC

    num = 1000000000
    bytenull = BLC.pack ""

    -- Count the bytes one at a time until the string is empty.
    countDataPoint arg sum
        | arg == bytenull = sum
        | otherwise = countDataPoint (BL.tail arg) (sum+1)

    -- The only change: the expression used by test1 now has its own name.
    longByteStr = BL.take num $ BLC.cycle $ BLC.pack "abc"

    test1 = BL.last $ longByteStr
    test2 = countDataPoint (BL.take num $ BLC.cycle $ BLC.pack "abc") 0

    main = do
        print test1
        print test2
Additional details:
The difference is that in explode.hs I have taken BL.take num $ BLC.cycle $ BLC.pack "abc" out of the definition of test1 and assigned it to its own value, longByteStr.
Strangely, if I comment out either print test1 or print test2 in explode.hs (but not both), the program does not explode.
Is there a reason why memory explodes in explode.hs but not in noexplode.hs, and why the exploding program (explode.hs) requires both print test1 and print test2 in order to explode?
Why does GHC perform common subexpression elimination in one case but not in the other? Who knows. Maybe the common expressions were killed by inlining. It depends on the internal implementation.
Regarding -ddump-simpl, see this question: Reading GHC Core.
I reproduced it with ghc-7.8.2. It performs common subexpression elimination; you can check this in the output of -ddump-simpl. So you are creating only one lazy bytestring.
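One way to look at that output is to enable the dump from inside the source file. The following is a minimal sketch, not the asker's exact program: the small value of num, the type signatures, and the use of BL.length in place of countDataPoint are assumptions made to keep the example short and quick to run. Compiling it prints the simplified Core, where you can check whether the two BL.take expressions ended up as one shared binding.

```haskell
{-# OPTIONS_GHC -O2 -ddump-simpl -dsuppress-all #-}
-- Compiling this module prints the simplified Core at compile time.
-- To test the CSE hypothesis, -fno-cse can be added to the pragma above;
-- whether that alone prevents the blow-up is an assumption, not a
-- result taken from the answer.
module Main where

import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import Data.Int (Int64)
import Data.Word (Word8)

-- Kept small on purpose (assumption); the question uses 1000000000.
num :: Int64
num = 1000

-- Named expression, as in explode.hs.
longByteStr :: BL.ByteString
longByteStr = BL.take num $ BLC.cycle $ BLC.pack "abc"

test1 :: Word8
test1 = BL.last longByteStr

-- Textually identical expression, written out a second time.
test2 :: Int64
test2 = BL.length (BL.take num $ BLC.cycle $ BLC.pack "abc")

main :: IO ()
main = do
  print test1
  print test2
```

In the dump, look for how many top-level bindings contain the call to cycle: one binding referenced from both tests would indicate that the expressions were merged.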
In the first version you create two lazy bytestrings. print test1 forces the first one, but it is garbage collected on the fly because nobody else uses it. The same goes for print test2: it forces the second bytestring, which is also GC'ed on the fly.
In the second version you create one lazy bytestring. print test1 forces it, but it can't be GC'ed because it is still needed by print test2. As a result, after the first print you have the entire bytestring loaded into memory.
If you remove one print, the bytestring is GC'ed on the fly again, because it is not used anywhere else.
PS. "GC'ed on the fly" means: print takes the first chunk and outputs it to stdout. Then that chunk becomes available for GC. Then print takes the second chunk, and so on.
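If both results are needed from the same bytestring, one option consistent with the explanation above is to compute them in a single traversal, so that no second consumer keeps the already-processed chunks alive. This is a sketch under assumptions, not something proposed in the original answer: the Acc type and lastAndCount are hypothetical names introduced here, and num is reduced so the example finishes quickly.

```haskell
module Main where

import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import Data.Int (Int64)
import Data.Word (Word8)

-- Strict accumulator (hypothetical helper type): the last byte seen so
-- far and the number of bytes counted. The strict fields prevent a
-- chain of thunks from building up during the fold.
data Acc = Acc !Word8 !Int64

-- One left-to-right pass that yields both the last byte and the length.
-- Returns Nothing for an empty input, where there is no last byte.
lastAndCount :: BL.ByteString -> Maybe (Word8, Int64)
lastAndCount bs
  | BL.null bs = Nothing
  | otherwise  = case BL.foldl' step (Acc 0 0) bs of
                   Acc w n -> Just (w, n)
  where
    step (Acc _ n) w = Acc w (n + 1)

main :: IO ()
main = do
  -- Smaller than the question's 1000000000 (assumption, for speed).
  let num = 1000000 :: Int64
      longByteStr = BL.take num $ BLC.cycle $ BLC.pack "abc"
  print (lastAndCount longByteStr)  -- prints: Just (97,1000000)
```

Because the fold is the only consumer of longByteStr, each chunk can be collected as soon as the fold has moved past it, which is the "GC'ed on the fly" behaviour described above.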
haskell lazy-evaluation