[Tfug] "Ordering" tuples in a table
Bexley Hall
bexley401 at yahoo.com
Mon Oct 19 23:47:28 MST 2009
Hi Terence,
>> In particular, I am trying to figure out how to
>> adapt to the fact that tuples in the database are
>> not ordered in the same sense that I can explicitly
>> order them in a TEXT (or DATA) segment within my code.
>> (yes, I know you can order them "on output" but the
>> data themselves are not really "ordered" within the
>> table -- forget indices, etc)
>>
>> E.g., the RDBMS doesn't see any difference in the
>> following two lists:
>> Tom Martin
>> Bob Jones
>> Fred Williams
>>
>> vs.
>>
>> Fred Williams
>> Tom Martin
>> Bob Jones
>>
>> And, this is "A Good Thing"! :-/
>
> I think you miss the point, The RDBMS holds the data, not
> the information. An order implies information whicj is not
> in the realm of data storage. In a broad definition of
> RDBMS the order of a set is indeterminate.
That was *exactly* the point I was making! The RDBMS *can't*
preserve the order (unless I artificially impose that with a
special field, etc. that *I* "interpret* as "order")
However, I am sure that other real world problems make
assumptions regarding the order/presence of certain tuples
*in* the database. My question is: how do you handle this
in those "other" applications?
> But, when building table driven code, you often
> implicitly (or explicitly) apply some ordering to
> the contents of your tables! It allows you to
> later exploit that ordering to simplify the
> algorithms that are *driven* from those tables!
>
> In one such table, I *guarantee* circular definitions
> can't exist by imposing the requirement that anything
> referenced by a definition must already be defined
> (i.e., must exist "earlier" in the table). So, I
> can have:
> x = 3
> ...
> y = x + 2
> but this rule prevents:
> x = y - 2
> ...
> y = 5
> (i.e., because, at the time x is defined, y does not
> yet have a definition). Otherwise, I could create
> anomalies inadvertently like:
> x = y - 2
> ...
> y = X + 2
>
> I can force new entries to the (RDBMS) table to observe this
> rule by tying some code to the INSERT method such that any
> references the code can't resolve at INSERT-time are
> rejected.
>
> However, if the table is eventually *dumped*, there is
> no guarantee that rebuilding the table will cause the
> tuples to be re-INSERTed in the same order that was
>
> imposed on the original INSERTions.
> If the INSERT order of the data effects
> the information then the model is wrong.
>
>
>
>
> So, how do you do this? Consider a record for a
>
> "person" having fields that specify name,
> birthdate, etc.
>
> In addition, fields that REFERENCE *other* records
>
> defining the "mother" and "father" for
> that "person".
>
> I.e., how do you refer to a mother who's record
> doesn't
>
> yet exist??
> A CHILD cannot exits before it's PARENT conceived it.
Yes, but there is nothing that prevents the data for the
child from being entered into the database *before* the
data for its parent (other than my bogus INSERT method).
> Some times in data acquisition The
> CHILD is entered before it's parent, Thus deferred
> constraints.
But, how do you ensure those constraints are *eventually* satisfied?
I.e., how can you rollback the offending INSERTs (i.e., remove those
now BOGUS tuples) when you "later" discover that key fields are
missing/incorrect? Especially when there is never really a "commit"
point in the logic?
E.g., when restoring a dump, you can turn off these constraints
while loading the data. There is then a point "at the end"
when you essentially say, "OK, that's everything!" and it *must*
be everything.
Returning to the parent/child example, how (other than the hacked
INSERT method I mentioned) do you force the *user* to specify the
data in a way that follows these rules?
>> (I figure *this* sort of thing has to come up "in the real world"!?)
>
> It does come up it. I have seen in
> countless DBs I have had to 'fix' , but it has never
> been an issue in any DB I have designed, at least the ones I
> have had full information of the data. There in lies the
> rub, can you know enough about the data before you start
> the design.
Look at my two examples. ("parent/child" and "X/Y").
I don't see any way of preventing this problem from coming up
without putting tests in the INSERT (and DROP -- think about it)
methods.
More information about the tfug
mailing list