[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

SV: SV: Progress Update



>> Now, to make a HashTable out of a DynHashTable, the array and the linked
>> list of the DynHashTable has to be copied into one, new array. With that
>> done, the job of the DynHashTable is finished. Then the HashTable needs
to
>> sort the array (it now uses qsort()), and then index it (ie, create a
hash
>> look-up table).
>>
>Why the heck isn't this a normal hash table? No need to sort, no need
>for linked lists. For the (once in a while) table resizing, you need a
>costly realloc and only binary logical operations (masking the
>significant parts of the key (they are already hashed, since they're in
>the hash table), giving you the straight index).
>
Either you haven't understod what I've said, or I'm misunderstanding you.

A HashTable is a completely standard hashtable. There's two arrays, one for
the hash table itself, and one for the entries that the positions in the
hash table points to. There are no linked lists in HashTable.

The only mild deviation from the normal hash table using chained hashing is
that each segment of the entry array (ie, each segment of entries that hash
to the same position in the hash table), is sorted in ascending order. This
makes it possible to do a binary search.

The advantages of this is partly that worst case performance is changed from
O(n) to O(log n), partly that even hash tables with very high load factors
(like 20) still perform reasonably effeciently. The performance loss for
very small segments (1 or 2 entries, I believe everything above that will
recieve a performance boost) is extremely slight, something like 1
assignment and one division).

DynHashTable is only used when altering the hash table, and it does this in
the most effecient way possible; with only ONE reallocation, nomatter the
amount of entries that get added (it does allocate for a linked lists, but
that's pretty effecient due to a special allocator I'm using). If I moved
the alteration functionality into HashTable and gave it an AddEntry()
method, then that would be extremely ineffecient. I could do a AddEntries().
That would still only require one reallocation, but it would also require
the client code to have the entries to be added in one array, which migth
not always be possible. The solution with DynHashTable is better because it
allows me to do what can be done in terms of optimizing this process, while
making client code both smaller and simpler.

If I am the one misunderstanding you, then I apologize, and I would be very
interested in hearing if you had an idea on how to better implement a hash
table for our porpuse.