Difference between indexing and hashing in tabular form
Answers
Answered by
0
Indexing is a storage/access method in databases for fast data retrieval — speeding up query operations by creating indexes. Imagine you have a table with million records and you need to retrieve the row where SALARY column value is 5000.
SELECT * FROM T WHERE SALARY=5000;
If there aren’t any indexes on the tablle, this query will have to search through every record in the table (in worst case) to locate the desired record. On the other hand, if there is an index on SALARY column, it will help in quickly retrieving the result.
Hashing is one of the ways to create index structures. The other popular way is B-Trees which are usually disk based structures. If we use hashing to index our data, we will probably have an in-memory hash table where the key is your indexing key (in this case it would be salary).
Given salary S, we first compute hash of this key, and index into the hash table (to the corresponding bucket that computed hashvalue maps to) and retrieve the Value. The value part of <Key,Value> pair can simply be a row-locator.
In RDBMS, there is usually a concept of row-locator which helps in getting to the correct disk block.page# where the row is stored. So the row locator will encapsulate information like block#, offset into block or whatever is necessary to get to the row within the block.
The result of hash based index would be this row locator which can then be used to fetch the row. Thus we have indexed our table data using an in-memory hash table like structure.
You can take a look at the following answer to know about differences between Hash based indexing v/s BTree based indexing.
Between hashing and b-trees, which method is preferable for storing indexes in a database?
SELECT * FROM T WHERE SALARY=5000;
If there aren’t any indexes on the tablle, this query will have to search through every record in the table (in worst case) to locate the desired record. On the other hand, if there is an index on SALARY column, it will help in quickly retrieving the result.
Hashing is one of the ways to create index structures. The other popular way is B-Trees which are usually disk based structures. If we use hashing to index our data, we will probably have an in-memory hash table where the key is your indexing key (in this case it would be salary).
Given salary S, we first compute hash of this key, and index into the hash table (to the corresponding bucket that computed hashvalue maps to) and retrieve the Value. The value part of <Key,Value> pair can simply be a row-locator.
In RDBMS, there is usually a concept of row-locator which helps in getting to the correct disk block.page# where the row is stored. So the row locator will encapsulate information like block#, offset into block or whatever is necessary to get to the row within the block.
The result of hash based index would be this row locator which can then be used to fetch the row. Thus we have indexed our table data using an in-memory hash table like structure.
You can take a look at the following answer to know about differences between Hash based indexing v/s BTree based indexing.
Between hashing and b-trees, which method is preferable for storing indexes in a database?
Similar questions