
Re: XML Interface



In message <oa3e2khkgb.fsf@burrito.fake>, aph@debian.org writes:
>>>>>> "rd" == Roger Dingledine <arma@mit.edu> writes:
>rd> I would be very interested in hearing about a centralized "how to
>rd> do your project with XML" howto page along with some
>rd> application-oriented introductions, rather than the theory-based
>rd> discussion that most faqs and docs seem to focus on. I would be
>rd> willing to help out in guiding the shape of said docs/pages.
>
>Yes -- I had put together a proposed talk about practical use of SGML
>in a real environment.  I put together a prospectus at
><URL:http://www.debian.org/~aph/proposed_bazaar_talk.txt>.
>
>I'd be interested in what sort of thing you would think would be most
>useful.  I.e., fill in the rest of this sentence: "After reading this
>document, the user should be able to ..."

(Sorry for the delay in response. My todo queue is too big...)

I'll give you an example of a project that isn't terribly complex but
uses XML as an integral part. Specifically, 
http://linuxunited.org/projects/news/ is a proposal to develop a standard
exchange format for Linux news articles/updates. From there, people could
develop sites where people input news, and then the news gets distributed
to other news sites in such a way that they can auto-parse the incoming
mail. We were throwing around some ideas on how to implement parts of it,
and lack of available XML resources/clue was a big problem. I understand
XML just fine from a theory point of view -- it's the details that trip me
up: what I can use in my DTD, where I get a parser that makes sense to me,
how I use that parser, etc.

There were three parts to figuring out how to use XML to help implement the
lu-news idea.

* I want to know how to write DTDs. I know C/perl/etc, but I never found
a simple "this is how you write your DTD" page. We threw together some
samples (http://www.seul.org/archives/lu/news/Aug-1998/msg00013.html) that
look pretty good, but I have no idea if they're optimized, or even what
optimizing means (maybe that means I don't actually grok the dtd concept).
It would be nice to have a graphical/text menu for creating a dtd, so you
don't actually have to know the format. This can't be that difficult, and
would benefit a lot of people. Also, it would be nice to have a one- or
two-page cheatsheet (for people who know a given language) as a reference,
written in relatively plain English. (Precisely written context-free grammar
specifications do *not* count as plain English.)

* I want to be able to create my XML file. I have the tags/data (in a cgi
in perl, in this case), and I want to write it out into an XML file that I
can mail to places. I would much prefer a simple set of perl ops (non-OO,
or if OO give me a thorough sample that I can adapt, including constructor/
destructor examples).

* I want to be able to read my XML file. It will get mailed to someplace,
which will parse it and deal appropriately. I want to be able to efficiently
pull out data (all/most of the data) into perl, into a mysql db, etc.
Ideally, I can write a config file that pulls data out in a specified format
and writes into another format, so mail recipients can configure it without
having to learn XML or any parsers.
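
To make the second and third items concrete, here is a minimal sketch of
the kind of non-OO interface I mean. It is in Python rather than perl and
uses only the standard xml.etree.ElementTree module. The element names
(article, title, source, summary) and the helper names build_article and
parse_article are made up for illustration; they are not the lu-news
format, and nothing here validates against a DTD.

```python
import xml.etree.ElementTree as ET


def build_article(fields):
    """Turn a flat dict of tag -> text into an <article> XML string."""
    root = ET.Element("article")  # hypothetical root element name
    for tag, text in fields.items():
        child = ET.SubElement(root, tag)
        child.text = text  # special characters are escaped on output
    return ET.tostring(root, encoding="unicode")


def parse_article(xml_text):
    """Read the XML back into a flat dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


# Hypothetical sample data, as it might arrive from a cgi form.
fields = {
    "title": "Sample headline",
    "source": "example.org",
    "summary": "Fixes & new drivers <details inside>",
}

xml_text = build_article(fields)      # ready to drop into a mail body
round_trip = parse_article(xml_text)  # what the receiving site would see

print(xml_text)
print(round_trip == fields)
```

The point of the sketch is the shape of the interface: one call to write,
one call to read, and the caller never touches escaping or the parser.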

I'll say that again: a couple people in the project may have to know xml
and understand how dtd's work, but I want most people to be able to interact
with the files through a very easy api, using a parser that is installed on
their system. This means stable rpms/debs in the standard distribs. The
first parser I grabbed was the gnome xml parser lib: gah. I haven't checked
out the recent perl XML::Parser -- if it's actually simple and convenient
for performing these tasks, people should give it a *lot* more publicity.
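
As a rough illustration of the "config file" idea from the third item
above, the sketch below lets a recipient describe which pieces of an
incoming article they want using a plain mapping, with no parser knowledge
needed. It is Python with the standard xml.etree.ElementTree module; the
FIELD_MAP contents, the extract helper, and the sample element layout are
all assumptions for the example, not part of any agreed lu-news format.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipient config: output field name -> path inside <article>.
# In practice this could be loaded from a small text file.
FIELD_MAP = {
    "headline": "title",
    "link": "source/url",
    "posted": "date",
}


def extract(xml_text, field_map, default=""):
    """Pull the configured fields out of one article.

    Paths that do not exist in the document yield `default`.
    """
    root = ET.fromstring(xml_text)
    row = {}
    for out_name, path in field_map.items():
        row[out_name] = root.findtext(path, default=default).strip()
    return row


# Hypothetical incoming article; note that it has no <date> element.
SAMPLE = """<article>
  <title>Sample headline</title>
  <source><name>Example Site</name><url>http://example.org/news/1</url></source>
</article>"""

row = extract(SAMPLE, FIELD_MAP)
print(row)
```

The resulting dict could then be handed to whatever the site already uses
(a database insert, a template, a flat file) without that code knowing
anything about XML.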

Ideally, *nobody* will have to care how dtd's work, because there will be a
menu-driven interface to build the thing. I know this goes against the
hacker mentality ("hey cool, another language/spec I can learn"), but all I
want to do is implement XML support and move on. The lu-news project had
some pretty high-profile people on it (Scoop of Freshmeat, Mark Bolzern of
Linuxmall, Jonathan Corbet of LWN, Dave from Linuxtoday, etc), and could
probably be revived into something useful if the interface issues were
magically solved. (Well, and more people had time to deal.)

Several seul-edu projects (http://www.seul.org/edu/) are looking into
supporting xml as a save file format, to help with data exchange between
applications. I asked them what questions they had; I'll include a sample 
response here: (It looks like a good *simple* anti-fud page might be in
order as well)

|I need to know some philosophical things for a start:
|
|1. How could I make my work easier with XML as compared to some
|other data format? ("Why XML?") -- 1 page of text ;-)
|
|2. Are there any GPLed, fast and reliable XML parsers (which I
|could embed), and how would they help me with task No. 1? (I do not
|want to add a forest of rpms to be able to run QZB with help of
|some exotic Haskell interpreter.)
|
|3. What are all those DTDs, and if I develop yet another ML, are
|there any tools which will convert it into specific actions?
|For example, quiz data stored in (to be built) QZB data file
|needs to be interpreted, turned into actions of asking
|questions, storing answers, mailing forms, etc.
|
|4. Are there any tools which could be used to create QZB-ML
|files by 'point and click' users?
|
|5. How can I add non-text files (GIF, WAV, MOV...) into the set
|of QZB-ML documents?
|
|6. What does an XML-aware browser do? How could it be used for good?
|
|Of course,
|
|s/QZB/MyProgram/g
|
|for more generality.
|
|After that go some specific questions, which will depend
|on the tools which will be in use.
|
|Sincerely yours, Roman Suzi (rnd@sampo.karelia.ru)
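
On question 5, one common approach (not something the seul-edu projects
have settled on, as far as I know) is to base64-encode the binary data and
store it as element text; the other usual option is to keep the file
separate and store only a reference to it. A minimal sketch of the first
approach in Python, using only the standard library, follows. The element
name "image", the "type" and "encoding" attributes, and both helper
functions are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def embed_binary(parent, tag, data, mime_type):
    """Attach raw bytes to `parent` as base64 text in a new child element."""
    elem = ET.SubElement(parent, tag, {"type": mime_type, "encoding": "base64"})
    elem.text = base64.b64encode(data).decode("ascii")
    return elem


def extract_binary(elem):
    """Decode the base64 text of `elem` back into raw bytes."""
    return base64.b64decode(elem.text)


quiz = ET.Element("quiz")  # hypothetical root element
fake_gif = b"GIF89a\x00\x01\x02\xff"  # stand-in bytes, not a real image
embed_binary(quiz, "image", fake_gif, "image/gif")

xml_text = ET.tostring(quiz, encoding="unicode")
restored = extract_binary(ET.fromstring(xml_text).find("image"))

print(xml_text)
print(restored == fake_gif)
```

Base64 makes the file roughly a third larger, so for big media files a
reference to an external file is probably the better trade-off.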

I think in general, sorting through the thousands of XML links and documents
out there and producing a simple summary page, pointing to only the *best*
documents, would be a very good start.

I hope this helps,
--Roger