
XML search/extract



On Thu, Jan 14, 1999 at 06:35:01PM +0300, rnd@sampo.karelia.ru wrote:
> On Wed, 13 Jan 1999, Daryl Campbell wrote:

> >What is available, once XML is in place, in terms of search engines
> >for doing structured searches against documents using the XML tags to
> >restrict search?  
> 
> That is what I also want to know.


This is what I use now (all Open Source, of course).

Each entry comes with a simple example of how to extract the full
contents of every <school> element:

sgrep (from http://www.cs.helsinki.fi/~jjaakkol/sgrep.html )
   (a fast, grep-like, XML-aware tool; very much in the Unix philosophy)
   
   sgrep -g xml '"<school>" .. "</school>"' eduml.xml
   
Xtract (from http://www.cs.york.ac.uk/fp/Xtract/)
   (another fast, grep-like, XML-aware tool, loosely based on XQL; see below)

   xtract "*/school" eduml.xml

xql (example Perl script from the XML::XQL module on CPAN, www.cpan.org)
     (centered on XQL, the emerging proposed standard query
     language for XML; see www.w3c.org)
    
   xql "*/school" eduml.xml 
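
For anyone without these tools at hand, the same kind of extraction can be
roughly sketched with Python's standard xml.etree.ElementTree module. This is
only an illustration, not what any of the tools above do internally: the
sample document and the extract_elements helper are made up for the example
(the real eduml.xml is not shown here), and unlike sgrep this approach needs
well-formed input.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for eduml.xml (the real file is not shown).
SAMPLE = """<eduml>
  <school><name>North High</name></school>
  <school><name>South High</name></school>
</eduml>"""


def extract_elements(xml_text, tag):
    """Return every <tag> element (root included, any depth) as serialized text."""
    root = ET.fromstring(xml_text)
    # tostring() also emits the whitespace that follows each element, so strip it.
    return [ET.tostring(el, encoding="unicode").strip() for el in root.iter(tag)]


for chunk in extract_elements(SAMPLE, "school"):
    print(chunk)
```

Each printed chunk is one complete <school> element, start tag to end tag.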
   

The W3C is still working out what the XML Query Language should look like,
and it might end up being split in two.  Meanwhile, the tools above have
worked very well for me.  The Perl script is the most complete but also
slightly slower than the others.  Each of them comes with a nice tutorial.

BTW, none of the above <<require>> a DTD to function... and the first one
(sgrep) does not even <<require>> well-formed XML documents.  
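
To illustrate why a region-based search can cope with input that is not
well-formed, here is a minimal Python sketch that pulls out everything between
a start tag and the nearest matching end tag with a regular expression, without
parsing the document at all. It is an assumption-laden stand-in for the idea,
not a description of how sgrep is implemented; extract_regions and the BROKEN
sample are hypothetical names used only here.

```python
import re


def extract_regions(text, tag):
    """Return every span from an opening <tag ...> to the nearest </tag>.

    Nothing is parsed, so malformed markup elsewhere in the text is ignored.
    Nested elements with the same name and self-closing tags are not handled.
    """
    name = re.escape(tag)
    pattern = re.compile(r"<%s\b[^>]*>.*?</%s\s*>" % (name, name), re.DOTALL)
    return pattern.findall(text)


# Hypothetical input that is not well-formed: <br> is never closed,
# there is no single root element, and <unclosed> has no end tag.
BROKEN = "<school>North<br></school> stray text <school>South</school> <unclosed>"

for region in extract_regions(BROKEN, "school"):
    print(region)
```

A real XML parser would reject this input outright, while the region match
still returns both <school> spans.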

Bruno