r-tree_空间索引详解-－金锄头文库

R-Tree: Spatial Representation on a Dynamic-Index Structure,Advanced Algorithms & Data Structures Lecture Theme 03 Part I K. A. Mohamed Summer Semester 2006,2,Overview,Representing and handling spatial data The R-Tree indexing approach, style and structure Properties and notions Searching and inserting index Entry-records Deleting and updating Performance analyses Node splitting algorithms Derivatives of the R-Trees Applications,3,Spatial Database (Ia),Consider: Given a city map, index all university buildings in an efficient structure for quick topological search.,4,Spatial Database (Ib),Consider: Given a city map, index all university buildings in an efficient structure for quick topological search.,Spatial object: Contour (outline) of the area around the building(s). Minimum bounding region (MBR) of the object.,5,Spatial Database (Ic),Consider: Given a city map, index all university buildings in an efficient structure for quick relational-topological search.,MBR of the city neighbourhoods. MBR of the city defining the overall search region.,6,Spatial Database (II),Notion: To retrieve data items quickly and efficiently according to their spatial locations. Involves 2D regions. Need to support 2D range queries. Multiple return values desired: Answering a query region by reporting all spatial objects that are fully-contained-in or overlapping the query region (Spatial-Access Method SAM). In general: Spatial data objects often cover areas in multidimensional spaces. Spatial data objects are not well-represented by point-location. An index based on an objects spatial location is desirable.,7,The Indexing Approach,A B-Tree (Rosenberg & Snyder, 1981) is an ordered, dynamic, multi-way structure of order m (i.e. each node has at most m children). The keys and the subtrees are arranged in the fashion of a search tree. Each node may contain a large number of keys, and the number of subtrees in each node, then, may also be large. The B-Tree is designed (among other objectives): to branch out this large number of directions, and to contain a lot of keys in each node so that the height of the tree is relatively short.,8,The R-Tree Index Structure,An R-Tree is a height-balanced tree, similar to a B-Tree. Index records in the leaf nodes contain pointers to the actual spatial-objects they represent. Leaves in the structure all appear on the same level. Spatial searching requires visiting only a small number of nodes. The index is completely dynamic: inserts and deletes can be intermixed with searches. No periodic reorganisation is required.,9,The R-Tree Index Structure,A spatial database consists of a collection of tuples representing spatial objects, known as Entries. Each Entry has a unique identifier that points to one spatial object, and its MBR; i.e. Entry = (MBR, pointer).,10,R-Tree Index Structure Leaf Entries,An entry E in a leaf node is defined as (Guttman, 1984): E = (I, tuple-identifier) Where I refers to the smallest binding n-dimensional region (MBR) that encompasses the spatial data pointed to by its tuple-identifier. I is a series of closed-intervals that make up each dimension of the binding region. Example. In 2D, I = (Ix, Iy), where Ix = xa, xb, and Iy = ya, yb.,11,R-Tree Index Structure Leaf Entries,In general I = (I0, I1, , In-1) for n-dimensions, and that Ik = ka, kb. If either ka or kb (or both) are equal to , this means that the spatial object extends outward indefinitely along that dimension.,12,R-Tree Index Structure Non-Leaf Entries,An entry E in a non-leaf node is defined as: E = (I, child-pointer) Where the child-pointer points to the child of this node, and I is the MBR that encompasses all the regions in the child-nodes pointers entries.,13,Properties,Then an R-Tree must satisfy the following properties: Every leaf node contains between m and M index records, unless it is the root. For each index-record Entry (I, tuple-identifier) in a leaf node, I is the MBR that spatially contains the n-dimensional data object represented by the tuple-identifier. Every non-leaf node has between m and M children, unless it is the root. For each Entry (I, child-pointer) in a non-leaf node, I is the MBR that spatially contains the regions in the child node. The root has two children unless it is a leaf. All leaves appear on the same level.,Let M be the maximum number of entries that will fit in one node. Let m M/2 be a parameter specifying the minimum number of entries in one node.,14,Node Overflow and Underflow,A Node-Overflow happens when a new Entry is added to a fully packed node, causing the resulting number of entries in the node to exceed the upper-bound M. The overflow node must be split, and all its current entries, as well as the new one, consolidated for local optimum arrangement. A Node-Underflow happens when one or more Entries are removed from a node, causing the remaining number of entries in that node to fall below the lower-bound m. The underflow node mu