Asymptotic Analysis of Different Data Structures Operations
--
There are numerous data structures that come handy while problem solving. The main aim of these data structures is to help optimise brute force solutions. A very important step to master data structures is to know which ones are to be used in which cases.
To know when to use which data structure, the most important thing is understanding which kind of operations they help in optimising. Let’s try to explore the operations these different data structures provide optimal solutions for, by analysing the time complexities for the same.
Arrays/Vectors
Arrays are the simplest data structures but they can be used in very elegant problems.
- Accessing an element at a particular index: Accessing a particular element at an index is a simple O(1) operation. We can directly access arr[i] for getting the ith element in array arr.
- Searching for an element in the array: Searching a particular element in the array would require you to loop over the array making it an O(n) operation. This can be optimised in certain special arrays like by using Binary Search in a sorted array in O(logn).
- Adding/Deleting: Adding or removing an element at any place in the array would in general take O(n) as we would require to shift other elements to make place for our element. In some special cases like near the beginning of the array or at the end, it will take O(1).
Conclusion: Arrays should be used if we want to deal with indices and we need to quickly check or update values at particular indices.
Unordered/Hash Map
Unordered Maps or Hash Maps are simple data structures based on the concept of hashing. These can be imagined like a simple store of key-value pairs. Keys for hashmap could be anything — integers, strings or even complex classes/structures.
- Getting a value for a particular key: The main purpose of hash maps is to store certain values corresponding to some keys. The time complexity of getting the value for a key is O(1) in average. As hashing may lead to collisions, it might take O(n) in the worst case. Similarly updating the value for a particular key also takes similar time
- Adding/Deleting key value pairs: Adding new key/value pairs or removing some key/value pairs takes O(1) time on average and O(n) in worst cases.
Conclusion: Unordered Maps or HashMaps are helpful light weight data structures for maintaining frequencies or certain values corresponding to different keys.
Ordered/Tree Map
Ordered Maps or Tree Maps are similar to a Hashmap in functionality but different in internal implementations. Underlying idea behind this kind of map is a balanced Binary Search Tree.
- Getting value for a particular key: Getting a particular key in a balanced BST involves searching for it in the tree. This requires traversing the tree height which takes O(logN) time
- Adding/Deleting new key: Adding new key or deleting certain key is also same as insertion or deletion operation on the balanced BST followed by the balancing operation. Both these operations would take O(logN) time making insertion/deletion in ordered map O(logN)
- Searching elements just smaller than or larger than a value: Ordered maps have BSTs as their underlying data structure and thus easily give out inorder successors and predecessors of any element in O(logN) time. This is an additional operation feature in ordered maps that allows us to do queries of finding elements just smaller or greater than a particular element.
Conclusion: Ordered Maps are similar to HashMaps in most of the features. It provides certain additional optimal operations like upper and lower bounds making it a very useful data structure for cases which require the keys to be sorted.
Stacks/Queues/Deques
Stacks, queues and double ended queues are special linear data structures with rigid ideas of insertions and deletions.
- Insertion of element: Stack is based on Last In First Out Principle (LIFO). It can be visualised as a list where insertion means inserting at the end. Queues are opposite to stack and are based on FIFO (First in First Out). Insertion in Queue happens at the end too. Double Ended Queues are a hybrid version of stacks and queues where they allow insertion to happen both at the beginning and at the end. Operation of insertion in stacks/queues/dequeues is O(1) if done at the required positions. If we want to insert something at a random position, it will take O(n) time instead.
- Deletion of element: As stacks are LIFO, deletion here occurs at the same place as insertion, that is at the end. In Queues, it is opposite, deletions occur from the beginning instead of the end. Dequeues, again being more flexible, allow insertions/deletions from both ends.
Conclusion: Stacks/Queues/Deques are very specific data structures that help in problems where we want to deal with insertions and deletions of either the last entered element or the one that entered first or in some cases, both. Ideally, a deque can be used to implement stacks or queues but memory wise, stacks and queues are more efficient.
Heaps/Priority Queues
Heaps and Priority Queues are data structures specifically aimed at finding smallest or largest elements quickly. They are especially helpful in problems dealing with greedy algorithms.
- Insertion/Deletion: Heaps are balanced binary trees thus making the height of the tree logN, if it has N elements. Inserting a new element in the heap or deleting an element from the heap will take O(logN) time. This will be followed by a heapify function generally to keep the structure of heap intact. Heapify will also take O(logN) time. Priority Queues are a little less extensible because of the fact that they allow only deletions of the root of the heap (which is generally the maximum or the minimum element).
- Finding the max/min element: The root of the heap is maintained to be the min element in the min-heap or the max element if it is a max-heap. So, finding the min/max assuming the heap is correctly ordered is O(1).
Conclusions: Heaps are very beneficial in cases where we want to keep maintaining the maximum or minimum elements according to a particular criteria through certain insertion/deletion operations.