Project Milestone 3: Implementing the R-Tree

Now that your system has a spatial datatype, the next step is making the system work with spatial indexes. In this assignment we shall implement an R-Tree index in Redbase and use it to answer spatial range queries.

To start with, you must become familiar with the structure of a database in Redbase. When dbcreate test is run, a folder named test is created, and contains two files, attrcat and relcat. These files constitute the ‘catalog’ of the database. The relcat file records the information about the relations in test, and attrcat consists of information about the attributes of different relations in test. When a CREATE TABLE data command is run inside a database, a file data is created in the test folder, containing information stored in that relation. For all subsequent queries issued, the Storage Management (SM) layer uses the catalog and data files to implement the DDL and DML statements. The attrcat, relcat and relation data files are paged and handled by the PF layer of redbase using appropriate functions. No raw data should be written to or read from these files.

At a high level, the following steps take place when the CREATE INDEX query is issued:

  1. Similar to the SELECT and INSERT statements, the query string goes through the lexing and parsing stages, checks for syntax correctness and a query tree is generated.
  2. The interp.cc interprets nodes of the query tree, calling the System Management layer’s SM_Manager::CreateIndex() to create a new index on the input attribute.
  3. SM_Manager::CreateIndex() looks for the information about the relation and the attribute in relcat and attrcat, and calls the IX_Manager::CreateIndex() to create an index on the specified attribute. An index of a relation is a new file that contains an R-Tree, which holds a subset of information of the original relation, and contains pointers to the original relation data file for faster queries.
  4. Currently, the methods in IX_Manager, IX_IndexScan and IX_IndexHandle are empty. Your task is to fill the methods of these classes with the implementation of an R-Tree.

Tasks Assigned There are two tasks you need to finish to complete this assignment: implement the R-tree, and define the INTERSECTS operation that can be used in conjunction with the SELECT or DELETE statements. Here are a few sample queries to test your implementation:

CREATE TABLE DATA (id i, location m); // Note the new datatype MBR denoted by 'm'
INSERT INTO DATA VALUES (1, [1, 2, 3, 4]); // The 4 input values grouped in [] represent a single MBR

CREATE INDEX DATA (location);
SELECT * FROM DATA WHERE INTERSECTS (location, [2, 4, 3, 5]); 
DELETE FROM DATA WHERE INTERSECTS (location, [2, 4, 3, 5]);

You may assume that the system supports indexing on only a single datatype, i.e. an index may be created only on an attribute of MBR type. Please use Piazza for any questions/discussions.

Deliverables

Please submit a snapshot of your code with a README file containing the instructions to run your code, packaged into a single .tar.gz file with the name firstname_lastname.tar.gz. (50 points)

Due date: 7th December, 11:59 pm

How to submit: The submissions will be through iLearn. Please upload the report on iLearn > CS 236 > Assignments > Project milestone 3.