|
|
How best to implement a hash table in Mathematica
Posted:
Feb 18, 2012 6:28 AM
|
|
I have an application that resembles dictionary lookup, with significant rates of collision.
The current approach is to initialize the hash table, a 10^4 element list of empty sublists, and then build the table by inserting document object id numbers into the correct sublists as indexed by some important property of the object. The choice of property varies with application. The property may be quantized geographic location, or some attribute of the object itself.
When done building the hash table, query consists of reading the sublist using an index derived in the same way as for the original building of the list. Each sublist may have many entries, and the entries need not be numbers.
Anyway, I have a workable approach using
HashList[[index]]= Union[Flatten[Join[HashList[[index]],{new object ID}]]]]
as the core update function, but I'm wondering how well this will scale to say 10^6 sublists (which may be needed to keep the collisions under control with large datasets), and of course if there is a better way in Mathematica.
I did look at SparseArrays, but they don't support elements that are sublists.
Although this hash table is one dimensional, I have uses for 2D and maybe 3D tables, which are very common when one is hashing physical location.
Thanks in advance,
Joe Gwinn
|
|