[Tfug] Smoothing curves
Bexley Hall
bexley401 at yahoo.com
Mon Dec 1 07:32:50 MST 2008
I've built a gesture recognizer and I'm now trying
to "back-fill" some of the features that I skipped
over along the way. ;-)
To accommodate a variety of input devices with
different resolutions, reporting rates, etc., I
want to add a level of "filtering" to the data
prior to use. On low resolution devices, this
"smooths" many of the transitions between the
reportable points (which would, otherwise, be
integers). On high resolution devices, this
helps reduce the "jitter" in the reported data.
This also has the desired benefit of allowing it
to be "detuned" for various user characteristics
(e.g., to desensitize it to Parkinsonian tremor,
etc.)
Ideally, such a filter would be configurable -- to
allow its effect to be tailored to the environment
(i.e., user + input device) without impacting the
recognizer itself.
For example, the compensation applied to input from
a direct-acting input device (e.g., pen on tablet)
differs from that of an indirect acting one (e.g.,
mouse). Likewise, the use that either of these
would see in the hands of a young adult differs
greatly from that of an octogenarian! :>
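To make "configurable" concrete, something along these
lines is what I have in mind -- just a sketch (Python),
with exponential smoothing standing in for whatever
filter is finally chosen, and made-up profile names:

class Smoother:
    """Pluggable smoothing stage.  The recognizer only ever
    sees the filtered stream; a (device, user) profile supplies
    the parameters without the recognizer knowing or caring."""

    def __init__(self, strength=0.5):
        # 0.0 = raw passthrough; closer to 1.0 = heavier smoothing
        self.strength = strength
        self._last = None

    def feed(self, x, y):
        """Filter one incoming point (exponential smoothing
        as a stand-in for the eventual filter)."""
        if self._last is None:
            self._last = (float(x), float(y))
        else:
            a = 1.0 - self.strength
            lx, ly = self._last
            self._last = (lx + a * (x - lx), ly + a * (y - ly))
        return self._last

# Hypothetical per-environment profiles:
PROFILES = {
    ("tablet", "young adult"): Smoother(strength=0.2),
    ("mouse",  "tremor"):      Smoother(strength=0.8),
}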
In canvassing the literature, it seems that many
recognizers use such a feature. But, their design
seems to be somewhat ad hoc/arbitrary -- there
is no *reasoned* explanation for why a particular
implementation is chosen over some other (better?)
implementation. Indeed, even the coefficients used
in these filters seem somewhat arbitrary! It's as
if the implementers just "tried something" and
didn't even prove to themselves that their particular
approach had merit -- let alone being ideal! :<
Some, for example, simply "average" each point's
coordinates with its immediate neighbors. I.e.,
X'[i] = (X[i-1] + X[i] + X[i+1]) / 3
Y'[i] = (Y[i-1] + Y[i] + Y[i+1]) / 3
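In code, that's just (a quick sketch; endpoints are passed
through since they lack a full neighborhood):

def moving_average(points):
    """Smooth a stroke with an unweighted 3-point moving average.

    points: list of (x, y) tuples."""
    if len(points) < 3:
        return list(points)
    smoothed = [points[0]]
    for i in range(1, len(points) - 1):
        x = (points[i - 1][0] + points[i][0] + points[i + 1][0]) / 3.0
        y = (points[i - 1][1] + points[i][1] + points[i + 1][1]) / 3.0
        smoothed.append((x, y))
    smoothed.append(points[-1])
    return smoothed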
Others claim to approximate a Gaussian by adjusting
the weights of the neighbors accordingly:
X'[i] = (X[i-1] + 3*X[i] + X[i+1]) / 5
Y'[i] = (Y[i-1] + 3*Y[i] + Y[i+1]) / 5
This latter approach, of course, ignores pesky little
details like the role of "sigma", etc.
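For comparison, here's a sketch of what an *honest* Gaussian
looks like once sigma is made explicit (the radius and the
default sigma below are arbitrary choices of mine):

import math

def gaussian_kernel(radius, sigma):
    """Build a normalized 1-D Gaussian kernel of 2*radius+1 taps."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_smooth(values, radius=1, sigma=0.8):
    """Smooth one coordinate sequence; endpoints are handled by
    clamping the index so the stroke keeps its full length."""
    kernel = gaussian_kernel(radius, sigma)
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel, start=-radius):
            j = min(max(i + k, 0), n - 1)   # clamp at the ends
            acc += w * values[j]
        out.append(acc)
    return out

(FWIW, the 1/5-3/5-1/5 kernel above *is* the 3-tap Gaussian
with sigma of roughly 0.67 samples -- making sigma explicit
just turns it into the tuning knob it should have been all
along.)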
Almost universally, these approaches apply the
filter in the time domain instead of in space. I.e.,
Pi-1 and Pi are separated by a fixed amount of *time*,
as are Pi and Pi+1, etc. I haven't been able to
convince myself that this is correct or incorrect.
<frown> But, neither have the implementers!
An alternative (more rational?) approach is to apply
the filter in *space* -- so that points *farther*
away have considerably less influence on the given
point regardless of their proximity in time. E.g.,
if Pi+1 is "quite far" from Pi, then its weight in
the average should NOT be the same as that of Pi-1
(which may have been physically *closer*) despite
the fact that each is separated in *time* from Pi
by the exact same interval.
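To make that concrete, here's a sketch of the idea: weight
each neighbor by a Gaussian of its Euclidean *distance* from
Pi instead of its index offset. (Sigma here is in device
units -- counts or pixels -- and the values are arbitrary.)

import math

def spatial_smooth(points, radius=2, sigma=5.0):
    """Smooth a stroke where each neighbor's weight falls off
    with its Euclidean distance from the current point, not
    with its offset in time/sample index."""
    n = len(points)
    out = []
    for i in range(n):
        xi, yi = points[i]
        acc_x = acc_y = total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            xj, yj = points[j]
            d2 = (xj - xi) ** 2 + (yj - yi) ** 2
            w = math.exp(-d2 / (2.0 * sigma * sigma))
            acc_x += w * xj
            acc_y += w * yj
            total += w
        out.append((acc_x / total, acc_y / total))
    return out

Note that this version also couples X and Y through the
distance term -- which bears on the next question.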
Lastly, does treating each coordinate independently
of the other really make sense? Or, should any
filter model the location (in 2-space) of those
other points in their impact on the point in question?
Having independent controls for each axis is a big
win for many devices. For example, the horizontal
characteristics of pen-on-tablet motions are very
different from the vertical ones. And, their
effects *swap* when applied to things like mice...
But, it seems (intuitively) like this process should
be modeled as a "flexible stylus" moving through a
"viscous fluid". I.e., the stylus' stiffness and
liquid's viscosity are the parameters being tweaked
to affect the path that the *tip* of the stylus
actually takes. [But, I have *no* experience with
the behaviour of fluids so I can't draw on anything
besides intuition to clarify that to myself :< ]
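FWIW, that intuition *can* be made concrete without any
real fluid dynamics: model the drawn tip as a point mass
dragged toward the raw input by a spring, with a drag term
resisting its motion. A sketch (the constants are purely
illustrative):

def stylus_smooth(points, dt=1.0, stiffness=0.15, damping=0.6):
    """Drag a simulated stylus tip toward each raw point.

    The tip is a unit mass on a spring (stiffness) moving
    through a viscous medium (damping).  Stiffer spring ->
    tighter tracking; more drag -> smoother, laggier path.
    Those two constants are exactly the "stiffness" and
    "viscosity" knobs described above."""
    tx, ty = points[0]          # tip starts on the stroke
    vx = vy = 0.0               # tip velocity
    out = [(tx, ty)]
    for x, y in points[1:]:
        # spring pulls tip toward the raw point; drag opposes motion
        ax = stiffness * (x - tx) - damping * vx
        ay = stiffness * (y - ty) - damping * vy
        vx += ax * dt
        vy += ay * dt
        tx += vx * dt
        ty += vy * dt
        out.append((tx, ty))
    return out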
<shrug>
I'll keep poking around to see if I can find anything
in other application domains that might have parallels
with this. (sigh) Nothing's ever easy! :>
While I don't expect a "solution" here, I would
appreciate any suggestions as to other directions
that I could explore.
Thanks!
--don