[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*To*: "PenguinPlay mailing list" <penguinplay@sunsite.auc.dk>*Subject*: A little more exact specification of the hash-table*From*: "Bjarke Hammersholt Roune" <asmild@post6.tele.dk>*Date*: Sun, 12 Sep 1999 22:02:15 +0200*Reply-To*: penguinplay@sunsite.auc.dk

Actually, now I'm thinking about it, is it you (Christian) or me who is going to do the hash-table thingie? I am under the impression it is me, but I don't remember this explicitly being stated. Just an effort to eliminate possible duplicate effort. Anyways, I thougth I'd write up a little more specific specification of the hash-table and its capabilities. Its only a draft. The hash-table is from here on referred to as "the container". The data it contains, dir and file data, is from here on referred to as "entries". The class ppf_URLInfo is simply referred to as "a path". ** Access The container allows access to its entries by means of a key. This key is a path. The container will search incrementally for an entry, searching containers that are also entries in the container, as specified by the path. The container does not allow for the capability to iterate over all entries. The container searches first by means of a hash-table, then uses a binary search algorithm to narrow the search. The container guarantees that access will, in the best, average and worst case, have a performance of O(log n). ** Serialization The container can tell how much space it would take up were it to be asked to serialize itself and its entries to a target in the state it currently is. [this depends a little on the IO utilised, so either this:] The container supports serialization of itself and its entries to any target that fully implements the interface of the standard library class ostream, and from any source that fully implements the interface of the standard library class istream. [or this:] The container supports serialization of itself and its entries to any target handle opened via the ppfOpen() function, and from any source handle opened via the ppfOpen() function. [one of those] The container guarantees that its serialized output will be of this format, and can only serialize itself from input that takes this format: 4 bytes - 32 bit unsigned integer, having a value equal to N 4 bytes - 32 bit unsigned integer, having a value of S size of N entries - N entries, sorted in ascending order (small-to-large) as based on the value returned by the modulus operator applied to the hash-value of each entry as the first operand, and the value S as the second operand (ie, hash % S). The hash-value stored, however, is not this modulus-returned value, but the actual hash-value of each entry, as produced by the hash-function. For those entries where this modulus-returned value match, they are sorted in ascending order in accordance with their hash-value, and if this also matches, then in accordance to the outcome of a comparision of their respective names. Here, of course, N is the amount of entries, and S is the table-size used by the table that produced the serialized representation. If the table-size desired by a container that is serializing from sotrage, is different from S, performance will be noticeably impacted, though the container will be able to do the conversion. The container guarantees that serialization will, in the best, average and worst case, have performance of O(n). ** Internal representation The container is represented internally by two arrays. The first array is as large as the table-size times 8 (ie, 8 bytes per position). The second array is as large as the amount of entries times the size of each entry. The first array holds a pointer and an unsigned integer value at each position. The pointer points to somewhere in the second array, or is equal to 0 (NULL). The integer holds a value here called D. The entry pointed to (if any), and the next D-1 entries, all share the characteristic that the value returned by the modulus operator given each entries hash-value as its first value, and the table-size as the second (ie, hash % table-size) is equal to the offset into the first array the position in the first array has. The entries in this sub-part of the second array are sorted in ascending order as based on their hash-value. In cases the hash-value matches, the name of the respective entries are used as the key for the comparision. -- Did I forget something?

- Prev by Date:
**Sorry** - Next by Date:
**Re: Sv: I'm back** - Prev by thread:
**Sorry** - Next by thread:
**Re: A little more exact specification of the hash-table** - Index(es):