How to implement a data structure in Typescript
Sandro Maglione
Check out my newsletter 👨💻Web development
10 January 2024
•8 min read
Sandro Maglione
Web development
I implemented a Trie
data structure for Effect. During the process I learned all the details of what is required to build a usable and efficient data structure in Typescript:
- What features to consider when implementing a data structure
- Performance (time and space)
- Stack safety
- Immutability
- How to make the data structure inspectable (
JSON.stringify
,toString
) - How to compare equality by value
- How to make the data structure
Iterable
Data structure features
Some feature of a data structure will radically change its implementation.
While the algorithm behind the data structure may be straightforward, adding some requirements can make the actual implementation more challenging
For the Trie
I implemented I considered 3 requirements:
- Performance
- Stack safety
- Immutability
Performance
This is the obvious (and crucial) feature.
The point of a new data structure is to be more performant for a specific usecase.
Therefore, big-O notation is the first aspect to consider.
Fortunately, most of the times you do not need to calculate this metric yourself, you can find plenty of analysis on the web 💁🏼♂️
⚠️ Important: The performance in the end depends on the internal implementation.
Stack safety
This instead is definitely not obvious (it was not obvious for me 💁🏼♂️).
Every time a recursive function is called some space on the stack (memory) is allocated.
If the recursion is too deep, we risk to reach the limit of the stack and the app will crash 🤯
I prevented this issue by not using recursion at all. This introduces some challenges and makes the code more complex.
It's always possible to convert a recursive implementation to non-recursive using an array (manual stack)
Immutability
An immutable data structure cannot be modified in-place: methods like insert
and delete
will not update the original variable but return a new updated value.
This requirement can have an impact on performance.
You cannot create a full copy of all the nodes every time you update some value. Instead, I used a technique called Path Copying:
Path Copying: create a copy only of the nodes in the path from the root to the updated node, while keeping all the other nodes shared
Find the node to add, update or delete, trace the path from the root, and create only the updated nodes
In practice this adds even more complexity to the implementation 💁🏼♂️
There is more.
Every week I build a new open source project, with a new language or library, and teach you how I did it, what I learned, and how you can do the same. Join me and other 600+ readers.
Data structure implementation
The first aspect to consider is the API. For a Trie
these where the main methods:
insert
remove
get
longestPrefixOf
keysWithPrefix
size
Each of these methods require a specific and efficient algorithm.
Fortunately, most of the times you do not need to invent the algorithm yourself
You can search online and find plenty of algorithms with examples. You job is to implement the algorithm considering all the requirements.
Inspectable
Making a data structure inspectable in Typescript requires to implement the following interface:
In Effect the implementation is the following:
Iterable
By implementing the iterable protocol it is possible to allow using a for...of
loop and Array.from
to iterate the values of the data structure:
You can also implement the iterator protocol to allow using the data structure in generator functions (yield
):
I defined a TrieIterator
that satisfies the iterable protocol:
The
next()
method returns one-by-one all the values when iterating over the data structure
Equality
When you compare two instances you usually want the comparison to be based on their values (not by reference).
This requires a custom equality implementation. In Effect this is achieved using the Equal
module:
For Trie
we have the following:
Pipable
Data structures in Effect are also pipable: you have access to a pipe
function to chain method calls:
For reference, you can view here below the standard implementation of the Pipeable
interface:
This can then be used by simply passing an instance of the data structure to pipe
:
Refinement
You also want to check if a value is a valid instance of your data structure (similar to isArray
).
We achieve this by assigning a Symbol
parameter internal to the data structure:
We then assign TypeId
to the data structure while also defining its type variance:
You can read more about type variance here: Covariant, Contravariant, and Invariant in Typescript
This allows to define an isTrie
refinement function that simply checks the existence of the TypeId
property:
Prototype and Object.create
Having done all the above we can finally define the complete prototype instance for the data structure:
We can then create a new instance of Trie
by using Object.create
:
Object.create()
creates a new object using an existing object as the prototype
This is all!
Internal structure of the final implementation of Trie.
There are indeed a lot of aspect to consider when implementing your own data structure.
Nonetheless, remember that the hard part is then the actual implementation. I needed to make sure to satisfy all the requirements.
You can view and read the final implementation on Github:
If you are interested to learn more, every week I publish a new open source project and share notes and lessons learned in my newsletter. You can subscribe here below 👇
Thanks for reading.