Hash Table in C/C++ - A Complete Implementation With Examples

A Hash Table in C/C++ (Associative array) is a data structure that maps keys to values. This uses a hash function to compute indexes for a key.

Based on the Hash Table index, we can store the value at the appropriate location.

If two different keys get the same index, we need to use other data structures (buckets) to account for these collisions.

315px-Hash_table_3_1_1_0_1_0_0_SP.svg

Hash Table – Wikipedia Article

The whole benefit of using a Hash Table is due to it’s very fast access time. While there can be a collision, if we choose a very good hash function, this chance is almost zero.

So, on average, the time complexity is a constant O(1) access time. This is called Amortized Time Complexity.

The C++ STL (Standard Template Library) has the std::unordered_map() data structure which implements all these hash table functions.

However, knowing how to construct a hash table from scratch is a crucial skill, and that is indeed what we aim to show you.

Let us understand more about the implementation details.

Any Hash Table implementation has the following three components:

  • A good Hash function to map keys to values
  • A Hash Table Data Structure that supports insert, search and delete operations.
  • A Data Structure to account for collision of keys

Choose a Hash Function

The first step is to choose a reasonably good hash function that has a low chance of collision. But, since this is for illustration, I will be doing the opposite! Reverse Psychology, eh?

We will be working only with strings (or character arrays in C) in this article.

I’ll be using a very simple hash function, that simply sums the ASCII value of the string. I am using this to show you how we will handle collision cases.

#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
    unsigned long i = 0;
    for (int j=0; str[j]; j++)
        i += str[j];
    return i % CAPACITY;
}

You can test this for different strings and check if they collide or not. For example, the strings “Hel” and “Cau” will collide, since they have the same ASCII value.

NOTE: We must return a number within the capacity of the hash table. Otherwise, we may access an unbound memory location, leading to an error.


Define the Hash Table data structures

A Hash table is an array of items, which themselves are a {key: value} pair.

Let’s define our item structure first.

typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
    char* key;
    char* value;
};

Now, the Hash table has an array of pointers which themselves point to Ht_item, so it is a double-pointer.

Other than that, we will also keep track of the number of elements in the Hash table using count, and store the size of the table in size.

typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
    // Contains an array of pointers
    // to items
    Ht_item** items;
    int size;
    int count;
};

Create the Hash Table and its items

We need functions to create a new Hash table into memory and also create its items.

Let’s create the item first. This is very simple since we only need to allocate memory for its key and value and return a pointer to the item.

Ht_item* create_item(char* key, char* value) {
    // Creates a pointer to a new hash table item
    Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
    item->key = (char*) malloc (strlen(key) + 1);
    item->value = (char*) malloc (strlen(value) + 1);
    strcpy(item->key, key);
    strcpy(item->value, value);
    return item;
}

Now, let’s write the code for creating the table. This allocates memory for the wrapper structure HashTable, and sets all it’s items to NULL (Since they aren’t used).

HashTable* create_table(int size) {
    // Creates a new HashTable
    HashTable* table = (HashTable*) malloc (sizeof(HashTable));
    table->size = size;
    table->count = 0;
    table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
    for (int i=0; i<table->size; i++)
        table->items[i] = NULL;
    return table;
}

Now, we are almost done with this part. As a C/C++ programmer, it is your responsibility to free up memory that you’ve allocated on the heap using malloc(), calloc().

So let’s write functions which free up a table item, and the whole table too.

void free_item(Ht_item* item) {
    // Frees an item
    free(item->key);
    free(item->value);
    free(item);
}
void free_table(HashTable* table) {
    // Frees the table
    for (int i=0; i<table->size; i++) {
        Ht_item* item = table->items[i];
        if (item != NULL)
            free_item(item);
    }
    free(table->items);
    free(table);
}

We’ve now completed our groundwork to building a functional Hash Table. Let’s now start writing our insert(), search() and delete() methods.


Insert into the Hash table

We will create a function ht_insert() that performs this task for us.

This takes in a HashTable pointer, a key and a value as parameters.

void ht_insert(HashTable* table, char* key, char* value);

Now, there are certain steps involved in the insert function.

  • Create the item based on the {key : value} pair
  • Compute the index based on the hash function
  • Check if the index is already occupied or not, by comparing key.
    • If it is not occupied. we can directly insert it into index
    • Otherwise, it is a collision, and we need to handle it

I will explain how we will handle collisions after we create the initial model, so let’s wait a bit for that.

The first step is simple. We directly call create_item(key, value).

int index = hash_function(key);

The second and third steps use hash_function(key) to get the index. If we are inserting the key for the first time, the item must be a NULL. Otherwise, the exact key : value pair already exists, or it is a collision.

In this case, we’ll define another function handle_collision(), which as the name suggests, will handle that potential collision for us.

// Create the item
Ht_item* item = create_item(key, value);
// Compute the index
int index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
    // Key does not exist.
    if (table->count == table->size) {
        // Hash Table Full
        printf("Insert Error: Hash Table is fulln");
        free_item(item);
        return;
    }
    // Insert directly
    table->items[index] = item;
    table->count++;
}

Let’s consider the first scenario where the key : value pair already exists (i.e the same item was already inserted before). In this case, we must update the item value only to the new one.

if (current_item == NULL) {
    ....
    ....
}
else {
        // Scenario 1: We only need to update value
        if (strcmp(current_item->key, key) == 0) {
            strcpy(table->items[index]->value, value);
            return;
        }
    else {
        // Scenario 2: Collision
        // We will handle case this a bit later
        handle_collision(table, item);
        return;
    }
}

Okay, so our insert function (without collisions) now looks a bit like this:

void handle_collision(HashTable* table, Ht_item* item) {
}
void ht_insert(HashTable* table, char* key, char* value) {
	// Create the item
	Ht_item* item = create_item(key, value);
	Ht_item* current_item = table->items[index];
	if (current_item == NULL) {
	    // Key does not exist.
	    if (table->count == table->size) {
	        // Hash Table Full
	        printf("Insert Error: Hash Table is fulln");
	        return;
	    }
	    // Insert directly
	    table->items[index] = item;
	    table->count++;
	}
	else {
    		// Scenario 1: We only need to update value
    		if (strcmp(current_item->key, key) == 0) {
        		strcpy(table->items[index]->value, value);
        		return;
    		}
    	else {
			// Scenario 2: Collision
		  	// We will handle case this a bit later
          	handle_collision(table, item);
          	return;
    	}
	}
}

Search Items in the Hash Table

If we want to check if the insertion was done correctly, we also need to define a search function, that checks if the key exists or not, and returns the corresponding value if it does.

char* ht_search(HastTable* table, char* key);

The logic is very simple. It simply moves to non-NULL items and compares the key. Otherwise, we will return NULL

char* ht_search(HashTable* table, char* key) {
    // Searches the key in the hashtable
    // and returns NULL if it doesn't exist
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    // Ensure that we move to a non NULL item
    if (item != NULL) {
        if (strcmp(item->key, key) == 0)
            return item->value;
    }
    return NULL;
}

Testing our basic model

Let’s check if what we’ve written was correct, by writing a main() driver program.

We’ll add one more function print_table() that prints the Hash Table, for illustration.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
    unsigned long i = 0;
    for (int j=0; str[j]; j++)
        i += str[j];
    return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
    char* key;
    char* value;
};
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
    // Contains an array of pointers
    // to items
    Ht_item** items;
    int size;
    int count;
};
Ht_item* create_item(char* key, char* value) {
    // Creates a pointer to a new hash table item
    Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
    item->key = (char*) malloc (strlen(key) + 1);
    item->value = (char*) malloc (strlen(value) + 1);
    strcpy(item->key, key);
    strcpy(item->value, value);
    return item;
}
HashTable* create_table(int size) {
    // Creates a new HashTable
    HashTable* table = (HashTable*) malloc (sizeof(HashTable));
    table->size = size;
    table->count = 0;
    table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
    for (int i=0; i<table->size; i++)
        table->items[i] = NULL;
    return table;
}
void free_item(Ht_item* item) {
    // Frees an item
    free(item->key);
    free(item->value);
    free(item);
}
void free_table(HashTable* table) {
    // Frees the table
    for (int i=0; i<table->size; i++) {
        Ht_item* item = table->items[i];
        if (item != NULL)
            free_item(item);
    }
    free(table->items);
    free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
}
void ht_insert(HashTable* table, char* key, char* value) {
    // Create the item
    Ht_item* item = create_item(key, value);
    // Compute the index
    unsigned long index = hash_function(key);
    Ht_item* current_item = table->items[index];
    if (current_item == NULL) {
        // Key does not exist.
        if (table->count == table->size) {
            // Hash Table Full
            printf("Insert Error: Hash Table is fulln");
            // Remove the create item
            free_item(item);
            return;
        }
        // Insert directly
        table->items[index] = item;
        table->count++;
    }
    else {
            // Scenario 1: We only need to update value
            if (strcmp(current_item->key, key) == 0) {
                strcpy(table->items[index]->value, value);
                return;
            }
        else {
            // Scenario 2: Collision
            // We will handle case this a bit later
            handle_collision(table, index, item);
            return;
        }
    }
}
char* ht_search(HashTable* table, char* key) {
    // Searches the key in the hashtable
    // and returns NULL if it doesn't exist
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    // Ensure that we move to a non NULL item
    if (item != NULL) {
        if (strcmp(item->key, key) == 0)
            return item->value;
    }
    return NULL;
}
void print_search(HashTable* table, char* key) {
    char* val;
    if ((val = ht_search(table, key)) == NULL) {
        printf("Key:%s does not existn", key);
        return;
    }
    else {
        printf("Key:%s, Value:%sn", key, val);
    }
}
void print_table(HashTable* table) {
    printf("nHash Tablen-------------------n");
    for (int i=0; i<table->size; i++) {
        if (table->items[i]) {
            printf("Index:%d, Key:%s, Value:%sn", i, table->items[i]->key, table->items[i]->value);
        }
    }
    printf("-------------------nn");
}
int main() {
    HashTable* ht = create_table(CAPACITY);
    ht_insert(ht, "1", "First address");
    ht_insert(ht, "2", "Second address");
    print_search(ht, "1");
    print_search(ht, "2");
    print_search(ht, "3");
    print_table(ht);
    free_table(ht);
    return 0;
}

Output

Key:1, Value:First address
Key:2, Value:Second address
Key:3 does not exist
Hash Table
-------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
-------------------

Great! This seems to work as expected, so now let’s move on to handling collisions.


Handle Collisions

There are different ways through which a collision can be resolved. We will look at a method called Separate Chaining, which aims to create independent chains for all items that have the same hash index.

We will create these chains using Linked Lists.

Whenever there is a collision, we will add further items that collide on the same index on an Overflow Bucket List. Thus, we will not have to delete any existing records on our hash table.

Due to linked lists having O(n) time complexity for insertion, searching and deletion, in case of a collision, we will have a worst-case access time of O(n) as well. The advantage of this method is that it is a good choice if your hash table has a low capacity.

450px-Hash_table_5_0_1_1_1_1_1_LL.svg

Separate Chaining using Linked Lists- Wikipedia

With that covered, let’s start implementing our good old Linked List!

typedef struct LinkedList LinkedList;
// Define the Linkedlist here
struct LinkedList {
    Ht_item* item;
    LinkedList* next;
};
LinkedList* allocate_list () {
    // Allocates memory for a Linkedlist pointer
    LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
    return list;
}
LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
    // Inserts the item onto the Linked List
    if (!list) {
        LinkedList* head = allocate_list();
        head->item = item;
        head->next = NULL;
        list = head;
        return list;
    }
    else if (list->next == NULL) {
        LinkedList* node = allocate_list();
        node->item = item;
        node->next = NULL;
        list->next = node;
        return list;
    }
    LinkedList* temp = list;
    while (temp->next->next) {
        temp = temp->next;
    }
    LinkedList* node = allocate_list();
    node->item = item;
    node->next = NULL;
    temp->next = node;
    return list;
}
Ht_item* linkedlist_remove(LinkedList* list) {
    // Removes the head from the linked list
    // and returns the item of the popped element
    if (!list)
        return NULL;
    if (!list->next)
        return NULL;
    LinkedList* node = list->next;
    LinkedList* temp = list;
    temp->next = NULL;
    list = node;
    Ht_item* it = NULL;
    memcpy(temp->item, it, sizeof(Ht_item));
    free(temp->item->key);
    free(temp->item->value);
    free(temp->item);
    free(temp);
    return it;
}
void free_linkedlist(LinkedList* list) {
    LinkedList* temp = list;
    while (list) {
        temp = list;
        list = list->next;
        free(temp->item->key);
        free(temp->item->value);
        free(temp->item);
        free(temp);
    }
}

Now, we need to add these “Overflow Bucket” lists to our Hash Table. We want every item to have one such chain, so for the whole table, it is an array of LinkedList pointers.

typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
    // Contains an array of pointers
    // to items
    Ht_item** items;
    LinkedList** overflow_buckets;
    int size;
    int count;
};

Now that we’ve defined our overflow_buckets, we will add functions to create and delete them. We also need to account for them in our old create_table() and free_table() functions.

LinkedList** create_overflow_buckets(HashTable* table) {
    // Create the overflow buckets; an array of linkedlists
    LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
    for (int i=0; i<table->size; i++)
                   buckets[i] = NULL;
    return buckets;
}
void free_overflow_buckets(HashTable* table) {
    // Free all the overflow bucket lists
    LinkedList** buckets = table->overflow_buckets;
    for (int i=0; i<table->size; i++)
          free_linkedlist(buckets[i]);
    free(buckets);
}
HashTable* create_table(int size) {
    // Creates a new HashTable
    HashTable* table = (HashTable*) malloc (sizeof(HashTable));
    table->size = size;
    table->count = 0;
    table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
    for (int i=0; i<table->size; i++)
        table->items[i] = NULL;
    table->overflow_buckets = create_overflow_buckets(table);
    return table;
}
void free_table(HashTable* table) {
    // Frees the table
    for (int i=0; i<table->size; i++) {
        Ht_item* item = table->items[i];
        if (item != NULL)
            free_item(item);
    }
    // Free the overflow bucket linked linkedlist and it's items
    free_overflow_buckets(table);
    free(table->items);
    free(table);
}

Phew! We still have work to do. Let’s now go to the handle_collision() function.

There are two scenarios here. If the overflow bucket list for the item does not exist, we need to create one such list and add the item to it.

Otherwise, we can simply insert the item to the list

void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
    LinkedList* head = table->overflow_buckets[index];
    if (head == NULL) {
        // We need to create the list
        head = allocate_list();
        head->item = item;
        table->overflow_buckets[index] = head;
        return;
    }
    else {
        // Insert to the list
        table->overflow_buckets[index] = linkedlist_insert(head, item);
        return;
    }
 }

So we are done with insertion, but now, we need to update our search function as well, since we may need to look at the overflow buckets as well.

char* ht_search(HashTable* table, char* key) {
    // Searches the key in the hashtable
    // and returns NULL if it doesn't exist
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];
    // Ensure that we move to items which are not NULL
    while (item != NULL) {
        if (strcmp(item->key, key) == 0)
            return item->value;
        if (head == NULL)
            return NULL;
        item = head->item;
        head = head->next;
    }
    return NULL;
}

Finally, we have accounted for collisions in insert() and search()!

To remind you of the current status of the code, I’ll post the complete program until now.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
    unsigned long i = 0;
    for (int j=0; str[j]; j++)
        i += str[j];
    return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
    char* key;
    char* value;
};
typedef struct LinkedList LinkedList;
// Define the Linkedlist here
struct LinkedList {
    Ht_item* item;
    LinkedList* next;
};
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
    // Contains an array of pointers
    // to items
    Ht_item** items;
    LinkedList** overflow_buckets;
    int size;
    int count;
};
static LinkedList* allocate_list () {
    // Allocates memory for a Linkedlist pointer
    LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
    return list;
}
static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
    // Inserts the item onto the Linked List
    if (!list) {
        LinkedList* head = allocate_list();
        head->item = item;
        head->next = NULL;
        list = head;
        return list;
    }
    else if (list->next == NULL) {
        LinkedList* node = allocate_list();
        node->item = item;
        node->next = NULL;
        list->next = node;
        return list;
    }
    LinkedList* temp = list;
    while (temp->next->next) {
        temp = temp->next;
    }
    LinkedList* node = allocate_list();
    node->item = item;
    node->next = NULL;
    temp->next = node;
    return list;
}
static Ht_item* linkedlist_remove(LinkedList* list) {
    // Removes the head from the linked list
    // and returns the item of the popped element
    if (!list)
        return NULL;
    if (!list->next)
        return NULL;
    LinkedList* node = list->next;
    LinkedList* temp = list;
    temp->next = NULL;
    list = node;
    Ht_item* it = NULL;
    memcpy(temp->item, it, sizeof(Ht_item));
    free(temp->item->key);
    free(temp->item->value);
    free(temp->item);
    free(temp);
    return it;
}
static void free_linkedlist(LinkedList* list) {
    LinkedList* temp = list;
    while (list) {
        temp = list;
        list = list->next;
        free(temp->item->key);
        free(temp->item->value);
        free(temp->item);
        free(temp);
    }
}
static LinkedList** create_overflow_buckets(HashTable* table) {
    // Create the overflow buckets; an array of linkedlists
    LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
    for (int i=0; i<table->size; i++)
        buckets[i] = NULL;
    return buckets;
}
static void free_overflow_buckets(HashTable* table) {
    // Free all the overflow bucket lists
    LinkedList** buckets = table->overflow_buckets;
    for (int i=0; i<table->size; i++)
        free_linkedlist(buckets[i]);
    free(buckets);
}
Ht_item* create_item(char* key, char* value) {
    // Creates a pointer to a new hash table item
    Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
    item->key = (char*) malloc (strlen(key) + 1);
    item->value = (char*) malloc (strlen(value) + 1);
    strcpy(item->key, key);
    strcpy(item->value, value);
    return item;
}
HashTable* create_table(int size) {
    // Creates a new HashTable
    HashTable* table = (HashTable*) malloc (sizeof(HashTable));
    table->size = size;
    table->count = 0;
    table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
    for (int i=0; i<table->size; i++)
        table->items[i] = NULL;
    table->overflow_buckets = create_overflow_buckets(table);
    return table;
}
void free_item(Ht_item* item) {
    // Frees an item
    free(item->key);
    free(item->value);
    free(item);
}
void free_table(HashTable* table) {
    // Frees the table
    for (int i=0; i<table->size; i++) {
        Ht_item* item = table->items[i];
        if (item != NULL)
            free_item(item);
    }
    free_overflow_buckets(table);
    free(table->items);
    free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
    LinkedList* head = table->overflow_buckets[index];
    if (head == NULL) {
        // We need to create the list
        head = allocate_list();
        head->item = item;
        table->overflow_buckets[index] = head;
        return;
    }
    else {
        // Insert to the list
        table->overflow_buckets[index] = linkedlist_insert(head, item);
        return;
    }
 }
void ht_insert(HashTable* table, char* key, char* value) {
    // Create the item
    Ht_item* item = create_item(key, value);
    // Compute the index
    unsigned long index = hash_function(key);
    Ht_item* current_item = table->items[index];
    if (current_item == NULL) {
        // Key does not exist.
        if (table->count == table->size) {
            // Hash Table Full
            printf("Insert Error: Hash Table is fulln");
            // Remove the create item
            free_item(item);
            return;
        }
        // Insert directly
        table->items[index] = item;
        table->count++;
    }
    else {
            // Scenario 1: We only need to update value
            if (strcmp(current_item->key, key) == 0) {
                strcpy(table->items[index]->value, value);
                return;
            }
        else {
            // Scenario 2: Collision
            handle_collision(table, index, item);
            return;
        }
    }
}
char* ht_search(HashTable* table, char* key) {
    // Searches the key in the hashtable
    // and returns NULL if it doesn't exist
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];
    // Ensure that we move to items which are not NULL
    while (item != NULL) {
        if (strcmp(item->key, key) == 0)
            return item->value;
        if (head == NULL)
            return NULL;
        item = head->item;
        head = head->next;
    }
    return NULL;
}
void print_search(HashTable* table, char* key) {
    char* val;
    if ((val = ht_search(table, key)) == NULL) {
        printf("%s does not existn", key);
        return;
    }
    else {
        printf("Key:%s, Value:%sn", key, val);
    }
}
void print_table(HashTable* table) {
    printf("n-------------------n");
    for (int i=0; i<table->size; i++) {
        if (table->items[i]) {
            printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
            if (table->overflow_buckets[i]) {
                printf(" => Overflow Bucket => ");
                LinkedList* head = table->overflow_buckets[i];
                while (head) {
                    printf("Key:%s, Value:%s ", head->item->key, head->item->value);
                    head = head->next;
                }
            }
            printf("n");
        }
    }
    printf("-------------------n");
}
int main() {
    HashTable* ht = create_table(CAPACITY);
    ht_insert(ht, "1", "First address");
    ht_insert(ht, "2", "Second address");
    ht_insert(ht, "Hel", "Third address");
    ht_insert(ht, "Cau", "Fourth address");
    print_search(ht, "1");
    print_search(ht, "2");
    print_search(ht, "3");
    print_search(ht, "Hel");
    print_search(ht, "Cau");
    print_table(ht);
    free_table(ht);
    return 0;
}

Delete from the Hash Table

Let’s now finally look at the delete function:

void ht_delete(HashTable* table, char* key);

Again, the method is similar to insertion.

  1. Compute the hash index and get the item
  2. If it is NULL, we don’t need to do anything
  3. Otherwise, after comparing keys, it there is no collision chain for that index, simply remove the item from the table
  4. If a collision chain exists, we must remove that element and shift the links accordingly

I am not providing you with too much detail since this only involves updating the head items and freeing up memory. My suggestion would be to try and implement this yourself.

I will provide you with a working version for comparison.

void ht_delete(HashTable* table, char* key) {
    // Deletes an item from the table
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];
    if (item == NULL) {
        // Does not exist. Return
        return;
    }
    else {
        if (head == NULL && strcmp(item->key, key) == 0) {
            // No collision chain. Remove the item
            // and set table index to NULL
            table->items[index] = NULL;
            free_item(item);
            table->count--;
            return;
        }
        else if (head != NULL) {
            // Collision Chain exists
            if (strcmp(item->key, key) == 0) {
                // Remove this item and set the head of the list
                // as the new item
                free_item(item);
                LinkedList* node = head;
                head = head->next;
                node->next = NULL;
                table->items[index] = create_item(node->item->key, node->item->value);
                free_linkedlist(node);
                table->overflow_buckets[index] = head;
                return;
            }
            LinkedList* curr = head;
            LinkedList* prev = NULL;
            while (curr) {
                if (strcmp(curr->item->key, key) == 0) {
                    if (prev == NULL) {
                        // First element of the chain. Remove the chain
                        free_linkedlist(head);
                        table->overflow_buckets[index] = NULL;
                        return;
                    }
                    else {
                        // This is somewhere in the chain
                        prev->next = curr->next;
                        curr->next = NULL;
                        free_linkedlist(curr);
                        table->overflow_buckets[index] = head;
                        return;
                    }
                }
                curr = curr->next;
                prev = curr;
            }
        }
    }
}

Putting it all together

Now, finally, I will show the entire program for the Hash Table.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
    unsigned long i = 0;
    for (int j=0; str[j]; j++)
        i += str[j];
    return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
    char* key;
    char* value;
};
typedef struct LinkedList LinkedList;
// Define the Linkedlist here
struct LinkedList {
    Ht_item* item;
    LinkedList* next;
};
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
    // Contains an array of pointers
    // to items
    Ht_item** items;
    LinkedList** overflow_buckets;
    int size;
    int count;
};
static LinkedList* allocate_list () {
    // Allocates memory for a Linkedlist pointer
    LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
    return list;
}
static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
    // Inserts the item onto the Linked List
    if (!list) {
        LinkedList* head = allocate_list();
        head->item = item;
        head->next = NULL;
        list = head;
        return list;
    }
    else if (list->next == NULL) {
        LinkedList* node = allocate_list();
        node->item = item;
        node->next = NULL;
        list->next = node;
        return list;
    }
    LinkedList* temp = list;
    while (temp->next->next) {
        temp = temp->next;
    }
    LinkedList* node = allocate_list();
    node->item = item;
    node->next = NULL;
    temp->next = node;
    return list;
}
static Ht_item* linkedlist_remove(LinkedList* list) {
    // Removes the head from the linked list
    // and returns the item of the popped element
    if (!list)
        return NULL;
    if (!list->next)
        return NULL;
    LinkedList* node = list->next;
    LinkedList* temp = list;
    temp->next = NULL;
    list = node;
    Ht_item* it = NULL;
    memcpy(temp->item, it, sizeof(Ht_item));
    free(temp->item->key);
    free(temp->item->value);
    free(temp->item);
    free(temp);
    return it;
}
static void free_linkedlist(LinkedList* list) {
    LinkedList* temp = list;
    while (list) {
        temp = list;
        list = list->next;
        free(temp->item->key);
        free(temp->item->value);
        free(temp->item);
        free(temp);
    }
}
static LinkedList** create_overflow_buckets(HashTable* table) {
    // Create the overflow buckets; an array of linkedlists
    LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
    for (int i=0; i<table->size; i++)
        buckets[i] = NULL;
    return buckets;
}
static void free_overflow_buckets(HashTable* table) {
    // Free all the overflow bucket lists
    LinkedList** buckets = table->overflow_buckets;
    for (int i=0; i<table->size; i++)
        free_linkedlist(buckets[i]);
    free(buckets);
}
Ht_item* create_item(char* key, char* value) {
    // Creates a pointer to a new hash table item
    Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
    item->key = (char*) malloc (strlen(key) + 1);
    item->value = (char*) malloc (strlen(value) + 1);
    strcpy(item->key, key);
    strcpy(item->value, value);
    return item;
}
HashTable* create_table(int size) {
    // Creates a new HashTable
    HashTable* table = (HashTable*) malloc (sizeof(HashTable));
    table->size = size;
    table->count = 0;
    table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
    for (int i=0; i<table->size; i++)
        table->items[i] = NULL;
    table->overflow_buckets = create_overflow_buckets(table);
    return table;
}
void free_item(Ht_item* item) {
    // Frees an item
    free(item->key);
    free(item->value);
    free(item);
}
void free_table(HashTable* table) {
    // Frees the table
    for (int i=0; i<table->size; i++) {
        Ht_item* item = table->items[i];
        if (item != NULL)
            free_item(item);
    }
    free_overflow_buckets(table);
    free(table->items);
    free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
    LinkedList* head = table->overflow_buckets[index];
    if (head == NULL) {
        // We need to create the list
        head = allocate_list();
        head->item = item;
        table->overflow_buckets[index] = head;
        return;
    }
    else {
        // Insert to the list
        table->overflow_buckets[index] = linkedlist_insert(head, item);
        return;
    }
 }
void ht_insert(HashTable* table, char* key, char* value) {
    // Create the item
    Ht_item* item = create_item(key, value);
    // Compute the index
    unsigned long index = hash_function(key);
    Ht_item* current_item = table->items[index];
    if (current_item == NULL) {
        // Key does not exist.
        if (table->count == table->size) {
            // Hash Table Full
            printf("Insert Error: Hash Table is fulln");
            // Remove the create item
            free_item(item);
            return;
        }
        // Insert directly
        table->items[index] = item;
        table->count++;
    }
    else {
            // Scenario 1: We only need to update value
            if (strcmp(current_item->key, key) == 0) {
                strcpy(table->items[index]->value, value);
                return;
            }
        else {
            // Scenario 2: Collision
            handle_collision(table, index, item);
            return;
        }
    }
}
char* ht_search(HashTable* table, char* key) {
    // Searches the key in the hashtable
    // and returns NULL if it doesn't exist
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];
    // Ensure that we move to items which are not NULL
    while (item != NULL) {
        if (strcmp(item->key, key) == 0)
            return item->value;
        if (head == NULL)
            return NULL;
        item = head->item;
        head = head->next;
    }
    return NULL;
}
void ht_delete(HashTable* table, char* key) {
    // Deletes an item from the table
    int index = hash_function(key);
    Ht_item* item = table->items[index];
    LinkedList* head = table->overflow_buckets[index];
    if (item == NULL) {
        // Does not exist. Return
        return;
    }
    else {
        if (head == NULL && strcmp(item->key, key) == 0) {
            // No collision chain. Remove the item
            // and set table index to NULL
            table->items[index] = NULL;
            free_item(item);
            table->count--;
            return;
        }
        else if (head != NULL) {
            // Collision Chain exists
            if (strcmp(item->key, key) == 0) {
                // Remove this item and set the head of the list
                // as the new item
                free_item(item);
                LinkedList* node = head;
                head = head->next;
                node->next = NULL;
                table->items[index] = create_item(node->item->key, node->item->value);
                free_linkedlist(node);
                table->overflow_buckets[index] = head;
                return;
            }
            LinkedList* curr = head;
            LinkedList* prev = NULL;
            while (curr) {
                if (strcmp(curr->item->key, key) == 0) {
                    if (prev == NULL) {
                        // First element of the chain. Remove the chain
                        free_linkedlist(head);
                        table->overflow_buckets[index] = NULL;
                        return;
                    }
                    else {
                        // This is somewhere in the chain
                        prev->next = curr->next;
                        curr->next = NULL;
                        free_linkedlist(curr);
                        table->overflow_buckets[index] = head;
                        return;
                    }
                }
                curr = curr->next;
                prev = curr;
            }
        }
    }
}
void print_search(HashTable* table, char* key) {
    char* val;
    if ((val = ht_search(table, key)) == NULL) {
        printf("%s does not existn", key);
        return;
    }
    else {
        printf("Key:%s, Value:%sn", key, val);
    }
}
void print_table(HashTable* table) {
    printf("n-------------------n");
    for (int i=0; i<table->size; i++) {
        if (table->items[i]) {
            printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
            if (table->overflow_buckets[i]) {
                printf(" => Overflow Bucket => ");
                LinkedList* head = table->overflow_buckets[i];
                while (head) {
                    printf("Key:%s, Value:%s ", head->item->key, head->item->value);
                    head = head->next;
                }
            }
            printf("n");
        }
    }
    printf("-------------------n");
}
int main() {
    HashTable* ht = create_table(CAPACITY);
    ht_insert(ht, "1", "First address");
    ht_insert(ht, "2", "Second address");
    ht_insert(ht, "Hel", "Third address");
    ht_insert(ht, "Cau", "Fourth address");
    print_search(ht, "1");
    print_search(ht, "2");
    print_search(ht, "3");
    print_search(ht, "Hel");
    print_search(ht, "Cau");  // Collision!
    print_table(ht);
    ht_delete(ht, "1");
    ht_delete(ht, "Cau");
    print_table(ht);
    free_table(ht);
    return 0;
}

Output

Key:1, Value:First address
Key:2, Value:Second address
3 does not exist
Key:Hel, Value:Third address
Key:Cau, Value:Fourth address
-------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address => Overflow Bucket => Key:Cau, Value:Fourth address
-------------------
-------------------
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address
-------------------

Conclusion

Hopefully, you’ve understood how a Hash Table can be implemented from scratch in C/C++ and possibly implemented one yourself!

I would also suggest you try other collision handling algorithms and different hash functions on this Hash Table and test their performance.


Download the Code

You can get the code through a Github Gist that I’ve uploaded. Feel free to ask any questions or provide suggestions in the comment section below!


By admin

Leave a Reply

%d bloggers like this: