[Tfug] RDBMS reprise
Bexley Hall
bexley401 at yahoo.com
Wed Jan 23 17:14:11 MST 2008
Hi, Tim,
--- Tim Ottinger <tottinge at gmail.com> wrote:
> Bexley Hall wrote:
> > I don't see any way to *avoid* making two (or
> more)
> > passes at the server to get all the data that an
> > application may need (e.g., the PDA example I
> > cited is a great model for thinking about the
> > sorts of issues that can arise... "How large of
> > an address book do you expect the user to have?
> > How many fields in each 'contact'?" etc.)
>
> Okay. Have you worked on making more, narrower
> queries?
> For instance, if you want the address book, you
> probably don't really want the address book.
<scratches head> Huh? (typos)
> You might want record ID and
> name first, which is less bandwidth and local
I don't worry about bandwidth. Just storage on
the client.
> storage. Maybe
> you also only want As and Bs for the first screen.
> When
> someone selects a person by name, then you might
> want the one person's record.
>
> Did you already do that? The initial question
> sounded like you wanted too much at once.
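For concreteness, Tim's "narrow query" idea might look something like the sketch below, using an in-memory SQLite table as a stand-in for the real server (the table, columns, and data are all hypothetical): fetch only (id, name) pairs, and only for the letters on the first screen.

```python
# Sketch of the "more, narrower queries" approach: first screen pulls
# only (id, name) for A/B names, not whole records. All names here
# are made up for illustration.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
db.executemany("INSERT INTO people (name, phone) VALUES (?, ?)",
               [("Anders, Al", "555-0101"),
                ("Baker, Bo", "555-0102"),
                ("Carter, Cy", "555-0103")])
# Index so the per-letter queries stay cheap, as Tim notes below.
db.execute("CREATE INDEX idx_people_name ON people (name)")

def first_screen(prefixes=("A", "B")):
    """Fetch only (id, name) for names starting with the given letters."""
    rows = []
    for p in prefixes:
        rows += db.execute(
            "SELECT id, name FROM people WHERE name LIKE ? ORDER BY name",
            (p + "%",)).fetchall()
    return rows

print(first_screen())  # only two narrow rows reach the client
```

The full record is then fetched by id only when the user actually selects a name.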
If I take things "in small bites" (as you suggest)
then I run the risk of the dataset changing between
the time I look at the first *portion* of the data
and the *rest* of the data. (e.g., John Doe's
record can disappear, be renamed, etc. between
the time I pick "John Doe" and the time I issue the
query to retrieve the balance of his record.)
While this is unavoidable, it violates the Principle
of Least Surprise: the user *saw* there was a
"John Doe" in his address book and, now, when he
goes to look at John Doe's *record*, the device
complains that there is no such person!
(recall *users* don't understand that this is two
separate queries... they just see data that *was*
present is suddenly *not* present!)
> BTW: Selecting by first letter of name is generally
> much cheaper than setting the 'first' and 'length'
> on a bigger query. Especially if the db is indexed
> appropriately.
>
> > Currently, I qualify each successive query with
> > those from which it was (obviously) derived so
> > that I can tell if a record has "disappeared",
> > etc. I can see no way around this possibility
> > short of explicit locking *or* implicit locking
> > (even client-side locking!)
>
> Probably okay to be optimistic, then. I get
> (id,name) as (1234, "Ottinger, Tim") then when I
> select my name, I select from people where
> id = 1234. If there is no
> such record, then I know that something happened
> on the other end.
*You* (I, the programmer) know this is what happened.
But, Joe User doesn't understand this. Instead, you
get a call to tech support (angry/confused customer)
and you have to explain what *might* have happened.
[note the PDA example falls down here as it is too
simplistic. Imagine looking at a directory's
contents, selecting a file and then being told that
the file doesn't exist! "Huh? I just *saw* it?!!!"
Sure, even if you refresh the directory listing and
show the user that it is no longer present, it
results in confusion on his part: "Where did it go?
Why??"]
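The optimistic two-step Tim describes, and the "disappeared record" case it has to detect, might be sketched like this (again against a hypothetical in-memory table; the real client would turn the None into a user-facing refresh, not an error):

```python
# Optimistic re-fetch by id: the first screen showed (1234, "Ottinger,
# Tim"); the second query selects by that id. If the row is gone, the
# *program* knows what happened -- the UI still has to explain it.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
db.execute("INSERT INTO people VALUES (1234, 'Ottinger, Tim', '555-0199')")

def fetch_record(person_id):
    """Second, narrow query. Returns the row, or None if the record
    vanished between the list query and this one."""
    return db.execute("SELECT id, name, phone FROM people WHERE id = ?",
                      (person_id,)).fetchone()

assert fetch_record(1234) is not None
db.execute("DELETE FROM people WHERE id = 1234")  # changed server-side
if fetch_record(1234) is None:
    # Recovery logic lives here: refresh the list and tell the user
    # the address book changed, rather than "no such person".
    print("record changed on the server; refreshing list")
```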
> If you can't save data, and can't hold locks, then
> you'll have to have some recovery logic. Tradeoffs,
> y'know?
Of course! Hence my point that the cleanest
implementation is to burden the server with this
requirement. And, if the server ends up unable to
satisfy subsequent requests from *other* clients,
then these look like simple "server busy, please
try again later" errors (which, IMHO, are much
easier for people to relate to than "disappearing
names/files").
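A toy sketch of that server-side burden, under the assumption (mine, for illustration) that the server pins a snapshot of the result set per client session and simply refuses new sessions when its memory budget runs out:

```python
# Hypothetical server-side cache: each session gets a pinned copy of
# its result set, so later edits can never make a listed row vanish.
# When the budget is exhausted, *new* clients get a plain "busy" error.
BUDGET = 2     # max concurrent snapshots the server will hold (made up)
snapshots = {} # session id -> pinned rows

def open_session(session_id, rows):
    if len(snapshots) >= BUDGET:
        return "server busy, please try again later"
    snapshots[session_id] = list(rows)  # copy; immune to later edits
    return "ok"

def fetch(session_id, index):
    # Always consistent with what this client's first screen showed.
    return snapshots[session_id][index]

assert open_session("a", ["John Doe"]) == "ok"
assert open_session("b", ["Jane Roe"]) == "ok"
assert open_session("c", ["Latecomer"]).startswith("server busy")
```

Adding memory to the server raises BUDGET for every client at once, which is the upgrade path mentioned below.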
> > Datasets vary. Some may be wide, others narrow.
> > E.g., consider the smarts that you might add to
> > fetch the area code for a particular address
> > based on its ZIP code (I, for one, don't enter
> > area codes in my address book so if I ever
> > forgot Tucson's -- for example -- I would have
> > to make this extra step manually).
>
> Yeah, but you can always get the very least you
> need,
> and not select any data that you *might* need later.
> So you get the fewest columns of the fewest rows you
> can possibly get away with. I don't know any better
> way to deal with having a lack of space on the
> client.
That was my point. Taking bite-sized pieces
complicates the recovery logic, forces it to vary
as a function of the data/application, *and* confuses
the end user. In light of that, it seems that
server-side caching is the better approach. It
also gives a simple upgrade path -- add more memory
to the server and *all* clients benefit.
> Well, there is the idea of pulling in data and
> compressing
> it one way or another. Then you get a little
> storage back
> for a bit of processing. But that's hardly a
> panacea either.
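The storage-for-CPU trade Tim mentions is easy to demonstrate with a stock compressor (zlib here, purely as an example; a real record format would differ):

```python
# Trade a bit of processing for client storage: compress a fetched
# record before caching it locally. The record contents are made up.
import zlib

record = (b'{"id": 1234, "name": "Ottinger, Tim", "notes": "'
          + b"x" * 500 + b'"}')
packed = zlib.compress(record)

print(len(record), "->", len(packed))  # smaller on the client...
restored = zlib.decompress(packed)     # ...at the cost of CPU each access
```

As noted, this only stretches the balloon; it doesn't fix the staleness problem.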
Exactly.
> More data on the client == more stale data on
> client.
But the data makes sense to the user. If John Doe's
address has changed in the intervening
seconds/minutes/hours/days since he saw the list
of names, at least the data that he had represents
a valid data point that *did* exist (at one time).
Contrast that with having a name that *existed*
but a record behind it that no longer does!
> If you're low on resources, you have to be high on
> finesse.
>
> You probably know the flyweight pattern, too. Maybe
> that's helpful. Possibly not.
>
> Area-zip isn't one-to-one, either. That makes it
> harder.
I think ZIP to area code is damn near one-to-one
(though area code to ZIP is not). Regardless, it
is only germane to this bogus example...
> But a very focused query could get just the area
> codes
> for the one zip code, and then it's a matter of
> whether
> you REALLY need the city name and state name.
>
> Sorry I'm not more helpful.
<grin> I suspect it's just one of those problems
that *has* no real (clean) solution. Can't squeeze
both ends of the balloon at the same time!
Thx,
--don