c++ - Can I have a heap-like contiguous layout for complete trees based on a depth first order rather than breadth first?
The classical heap data structure puts a complete binary (or, in the generalized version, d-ary) tree into a contiguous array, storing the elements in breadth-first traversal order. This way, all the elements on the same level of the tree are stored contiguously, one after the other.
I'm implementing a data structure which, under the hood, is a complete balanced tree of fixed degree d, and I want to store the tree in contiguous form to save the space of the node pointers. I thought of putting the nodes in the breadth-first order used in heaps, but I'm worried about the cache performance of a typical search from the root down to a leaf, since at each level l we jump over a lot of elements.
Is there a way to obtain a compact contiguous representation of a d-ary complete tree based on a depth-first order instead?
That way, the nodes touched during a search down to a leaf seem to me more likely to be found close to each other. The problem is then how to retrieve the index of the parent and the children of a node, and I'm wondering whether operations on the tree are efficient in general in this setting.
I'm implementing the thing in C++, in case it matters at all.
For simplicity, I'm going to limit the discussion to binary trees, but everything holds true for n-ary trees, too.
The reason heaps (and trees in general) are stored in arrays breadth-first is that it's much easier to add and remove items that way: to grow and shrink the tree. If you're storing depth-first, then either the tree has to be allocated at its maximum expected size, or you have to do a lot of moving items around whenever you add levels.
But if you know you're going to have a complete, balanced, n-ary tree, then the choice of BFS or DFS representation is largely a matter of style. There isn't any particular benefit to one over the other in terms of memory performance. In one representation (DFS) you take the cache misses up front, and in the other case (BFS) you take the cache misses at the end.
Consider a binary tree with 20 levels (i.e. 2^20 - 1 items) that contains the numbers from 0 to (2^20 - 1). Each node occupies 4 bytes (the size of an integer).
With BFS, you incur a cache miss when you read the first block of the tree. But then you have the first 4 levels of the tree in cache, so your next 3 queries are guaranteed to be in cache. After that, you're guaranteed a cache miss whenever the node index is greater than 15, because the left child, at x*2 + 1, is at least 16 positions (64 bytes) away from the parent.
With DFS, you also incur a cache miss when you read the first block of the tree. As long as the number you're searching for is in the left subtree of the current node, you're guaranteed not to get a cache miss for the first 15 levels (i.e. you continually go left). Any branch that goes right will incur a cache miss until you get down to 3 levels above the leaves. At that point, the entire subtree will fit into the cache and your remaining queries will incur no cache misses.
With BFS, the number of cache misses is directly proportional to the number of levels you have to search. With DFS, the number of cache misses is proportional to the path taken through the tree and the number of levels you have to search. On average, the number of cache misses you incur when searching for an item will be the same with DFS as with BFS.
And the math for computing node positions is easier with BFS than with DFS, especially when you want to find the parent of a particular node.
c++ data-structures tree heap