I, Me, MySelf and CS Problems: Advanced Data Structures: B-Tree, B+ Tree, RB Tree, AVL Tree & Their Applications

Introduction

A B-Tree is a balanced m-way search tree. The Wikipedia article is a good introduction to the characteristics and node layout of a B-Tree: http://en.wikipedia.org/wiki/B-tree.

This article discusses an in-memory B-Tree implementation. Although B-Trees are typically used for file system and database indexing (see my disk-based B-Tree version for details at http://www.4atech.net), there is significant value in implementing a B-Tree based collection or dictionary in the .NET Framework, or in any language or platform for that matter.

Advantages of an In-Memory B-Tree

Typical in-memory sorted dictionary data structures today are based on the Binary Tree algorithm, which is not to be confused with the B-Tree. Each node of a Binary Tree contains a single item, whereas a B-Tree node can contain a user-defined number of items, and the tree is kept balanced. This is a very important difference. Because each node holds a single item, storing a large number of items in a Binary Tree produces a tall, narrow node graph with numerous nodes.
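To make the difference in node layout concrete, here is a minimal Python sketch. The class and function names (`BinaryNode`, `BTreeNode`, `btree_contains`) are made up for illustration and are not taken from the implementations linked in this post; the tree is built by hand, so no insertion or rebalancing logic is shown.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BinaryNode:
    """Binary Tree node: exactly one item and at most two children."""
    key: int
    left: Optional["BinaryNode"] = None
    right: Optional["BinaryNode"] = None


@dataclass
class BTreeNode:
    """B-Tree node: many sorted items; an internal node with k keys has k + 1 children."""
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def btree_contains(node: BTreeNode, key: int) -> bool:
    """Binary-search the keys inside a node, then descend into a single child."""
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:  # reached a leaf without finding the key
            return False
        node = node.children[i]


# Hand-built two-level tree (illustrative data): the root's keys separate three leaves.
root = BTreeNode(
    keys=[10, 20],
    children=[BTreeNode([1, 5]), BTreeNode([12, 17]), BTreeNode([25, 30])],
)
```

Here eight items live in four nodes, while a Binary Tree built from `BinaryNode` would need eight separately allocated nodes for the same data.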

In a corresponding B-Tree implementation, the graph tends to be shorter and wider, with far fewer nodes. This is because a B-Tree node is typically configured to hold numerous items. For example, a B-Tree dictionary with 12 items needs only a single node to hold them, while a Binary Tree needs 12 nodes, which can be scattered around the heap (memory). Scale the item count up to thousands, hundreds of thousands, or millions, and the difference becomes a significant optimization in favor of a B-Tree based sorted dictionary. A Binary Tree with a million items needs a million single-item nodes, whereas a B-Tree configured for 12 items per node needs around eighty-three thousand nodes when its nodes are full, and far fewer if the user specifies more items per node.
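The arithmetic behind these figures can be checked with a short Python sketch. The helper names `min_nodes` and `min_levels` are hypothetical, and both assume every node is completely full, which is the best case; a real B-Tree keeps its nodes somewhere between half full and full, so actual node counts will be higher.

```python
def min_nodes(n_items: int, items_per_node: int) -> int:
    """Fewest nodes needed if every node is completely full (ceiling division)."""
    return (n_items + items_per_node - 1) // items_per_node


def min_levels(n_items: int, items_per_node: int) -> int:
    """Fewest levels needed; a full tree of h levels holds (items_per_node + 1)**h - 1 items."""
    fanout = items_per_node + 1
    levels, capacity = 0, 0
    while capacity < n_items:
        levels += 1
        capacity = fanout ** levels - 1
    return levels


n = 1_000_000
print(min_nodes(n, 1), min_levels(n, 1))    # Binary Tree: 1000000 nodes, 20 levels
print(min_nodes(n, 12), min_levels(n, 12))  # B-Tree, 12 items per node: 83334 nodes, 6 levels
```

Even in the best case for both structures, the Binary Tree needs at least 20 levels for a million items, while the 12-item B-Tree needs 6.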

Given this, it is easy to see that a Binary Tree will tend to use more memory and cause more memory fragmentation than an equivalent dictionary built on the B-Tree algorithm. In production environments, we can't afford a system that is prone to memory fragmentation: over time the application's performance degrades, and it can run into out-of-memory conditions due to a lack of contiguous allocable space.

More Detailed Information

Source
B-Tree http://www.codeproject.com/KB/collections/BTreeSortedDictionary.aspx
B+ Tree http://www.codeproject.com/KB/files/NTFSUndelete.aspx
RB Tree http://www.codeproject.com/KB/recipes/redblackcs.aspx
AVL Tree
1 http://www.codeproject.com/KB/architecture/avl_tree.aspx
2 http://www.codeproject.com/KB/architecture/avl_cpp.aspx
3 http://www.codeproject.com/KB/cpp/avltree.aspx


Wednesday, May 25, 2011
