
Generalized Analysis Manager (was Re: [seul-sci] Introduction)



Danilo Gonzalez Hashimoto wrote:

>         That's my feeling: Pete's proposed GAM is essential to Linux in Science. If you have a good infrastructure (common scripting language(s), modularity, standardized file formats, etc.) and good basic software (data archiving and cataloging, R-like analysis...) with a nice user interface, 'more specialized people' will write 'more specialized software' to fulfill their needs, with many advantages, on this platform.
> 
>         What do you think should be basic software for scientific purposes? Do you think an interface to other programs is enough, or perhaps rewriting/merging/borrowing is better?

You bring up some really good points. I'm still not sure how to go about it,
but I can draw on some of the experiences that I've had dealing with large
datasets and specialized analyses as part of my own work.

Managing data - To me, this is the greatest of my problems. Getting data
from one dataset into one application for one analysis, into a second for
graphing, and into a third for yet another analysis is time-consuming and
often confusing. For me, making it easier to move data between different
packages would be the primary goal of GAM.

Precision - I've been bitten by the precision bug more times than I care to
admit. Much of the data we deal with on a daily basis comes not as nice
integers but as floating point values. Moving data from one app to another
has the very nasty habit of introducing error into calculations, typically
when values are written out with fewer digits than they are stored with.
For larger numbers this may not be as worrisome, but much of our data is
log-transformed, so small, almost imperceptible changes here are important.
How have other people dealt with this issue?
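
To make the problem concrete, here is a rough Python sketch of a value
passing through a text file on its way from one app to another. The
six-significant-digit export format is only my assumption about a typical
default, not something measured from a particular package, and the sample
value is made up.

```python
import math

def roundtrip(value: float, fmt: str) -> float:
    """Write a value out as text using fmt, then parse it back in."""
    return float(format(value, fmt))

# A made-up measurement, stored log-transformed as much of our data is.
x = math.log(1234.56789012345)

# Assumed lossy export: six significant digits.
lossy = roundtrip(x, ".6g")

# repr() gives the shortest string that parses back to the same double.
exact = float(repr(x))

# Error after undoing the log transform, back on the original scale.
lossy_error = abs(math.exp(lossy) - math.exp(x))
exact_error = abs(math.exp(exact) - math.exp(x))

print(f"lossy round trip changed the value: {lossy != x}")
print(f"error on original scale (lossy): {lossy_error:.3g}")
print(f"error on original scale (exact): {exact_error:.3g}")
```

The point of the sketch is that a tiny change in the logged value becomes
a much larger one once the transform is undone, so any exchange format
would need to carry full precision.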

Interface - As far as going beyond the interface level, I'm hesitant to go
too far in that direction (but I can be convinced otherwise). I'd rather see
a group of motivated people write a killer graphing app or stats app or
whatever, and interface to that. Both Grace and R have good CLIs, and both
seem to be driven by very motivated people. I'm not sure that such energy
could be easily duplicated by another effort. Besides, most of the apps that
we would want already exist in some form (not all as advanced as we'd like,
perhaps), so I think an interface may be the best option.
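
As a minimal sketch of what "interfacing" could look like, the Python
below feeds a script to a command-line program on standard input and
collects what it prints. The helper name run_cli is hypothetical, and the
Python interpreter itself stands in for a real analysis tool; the command
line for any particular package is left out on purpose.

```python
import subprocess
import sys

def run_cli(command, script):
    """Send script to a command-line tool on stdin and return its stdout.

    Hypothetical helper: command is the argument list for whatever tool
    is being driven.
    """
    completed = subprocess.run(
        command,
        input=script,
        capture_output=True,
        text=True,
        check=True,  # raise if the tool exits with an error status
    )
    return completed.stdout

# Stand-in tool: the Python interpreter reading its script from stdin.
output = run_cli([sys.executable, "-"], "print(sum([1, 2, 3]))\n")
print(output.strip())
```

The same pattern should carry over to any tool that reads commands from
standard input, which is why leaning on existing CLIs looks attractive.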

Specificity / Generality - We all do different tasks, so any interface has
to be flexible enough so that a physicist can do his/her work with the same
general interface that a biologist would use without either losing
efficiency. To me, this suggests that the interface should be customizable
by the user. Shell / Perl scripts are powerful tools, and must not be
forgotten.

Logging - For me, this is perhaps the most important part of the interface.
Having a transcript of actions that we can refer to, and perhaps at some
point play back, would certainly reduce errors and missteps in dealing with
analyses. For instance, having already done one set of analyses, I would
like to remove a subset and redo the very same analyses with the new set of
data. Later, I'd like to compare them graphically. Since most of the apps
thus far mentioned have decent CLIs, this may not be so hard to deal with.
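
A small Python sketch of the idea, under stated assumptions: the step
names, the Session class and the replay function are all hypothetical,
and the "analyses" are trivial stand-ins for calls out to real packages.

```python
import json
from statistics import mean

# Hypothetical registry of named analysis steps.
STEPS = {
    "scale": lambda data, factor: [v * factor for v in data],
    "mean": lambda data: mean(data),
}

class Session:
    """Runs steps and keeps a transcript of what was done."""

    def __init__(self):
        self.log = []

    def run(self, step, data, **params):
        # Record the action before performing it.
        self.log.append({"step": step, "params": params})
        return STEPS[step](data, **params)

def replay(log, data):
    """Apply every logged step, in order, to a new dataset."""
    result = data
    for entry in log:
        result = STEPS[entry["step"]](result, **entry["params"])
    return result

# First pass over the full dataset.
session = Session()
full = [1.0, 2.0, 3.0, 4.0]
scaled = session.run("scale", full, factor=10.0)
full_mean = session.run("mean", scaled)

# Remove a subset, then redo the very same analyses from the transcript.
subset = [2.0, 3.0, 4.0]
subset_mean = replay(session.log, subset)

# The transcript is plain data, so it can be saved and consulted later.
transcript = json.dumps(session.log)
```

Because the log holds only step names and parameters, it can be read as
a record of what was done and also replayed on a different dataset.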

Thoughts, ideas?

-- Pete

-- 
Pete St. Onge
pete@seul.org