
[vernier@ess.vancouver.bc.ca: el]




Here is an example of what I mean by converting from private data formats
to a possible EDUML (see the bottom of this email for the advantages of doing this).

The following data was exported from a commercial school library
administration program called EL (one paragraph per book; non-printable
bytes are shown as hex codes enclosed in tag-like brackets, e.g. <FE>):

------------------------------private EL data format ------------------------

Postiers.<FE>Canadian postal
worker.<FE><FE><FD>&02*E<FD><FD>&AA*N<FD><FE>1[1[Bourgeois, Paulette.
<FD>6[1[Chauveau, Dominique.<FD>3[5[Dans mon coin<FD>2[4[Postes
Canada.<FE>29 MAY 1995<FE>SYS<FE>383.49
BOU<FD>1992<FD><FD><FD><FD><FD><FD><FD><FD>$7.95<FD>0<FD>
<FE><FD><FD>ScholasticCanadaLtd.<FD>
<FD><FD><FD><FD><FD>texte fr. Kim LaFave; ill. Dominique Chauveau.

Baiser
mal<82>fique.<FE><FE><FE><FD>&02*E<FD><FD>&AA*P<FD><FE>1[1[Souli<8A>res,
Robert.<FD>6[1[Jorisch, St<82>phane.<FD>3[5[Billochet<FD>2[4[L<82>gendes -
Qu<82>bec<FE>15 MAY 1997<FE>SYS<FE>F
SOU<FD>1995<FD><FD><FD><FD><FD><FD><FD><FD>$14.95<FD>0<FD><FE><FD><FD>Les
400 coups.<FD>31 p.<FD><FD><FD> <FD><FD>ill. St<82>phane Jorisch.

V<82>lo de No<82>mie.<FE><FE><FE><FD>&02*E<FD><FD>&AA*P<FD><FE>1[1[Pommier,
Maurice.<FD>6[1[Pommier, Maurice.<FD>3[5[Mes premi<8A>res d<82>couvertes de
la lecture.<FD>2[4[Bicyclettes - fiction.<FE>27 APR 1998<FE>SYS<FE>F
POM<FD>1997<FD><FD><FD><FD><FD><FD><FD><FD>$12.95<FD>0<FD><FE><FD><FD>Gallimard.
<FD><FD><FD><FD><FD><FD><82>critetill.parMauricePommier.

-----------------------------end of private EL data format ---------------
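
When hunting for special cases across a whole export, it can help to see every field on its own numbered line. Here is a minimal sketch, assuming that the hex codes shown above are single bytes 0xFD and 0xFE in the real file and that both act as field separators. That is inferred from the sample, not taken from any EL documentation. The function name el_list_fields is made up for illustration.

```shell
# el_list_fields FILE
# Print every field of an EL export on its own numbered line.
# Assumption: bytes 0xFD (octal 375) and 0xFE (octal 376) separate fields.
el_list_fields() {
    # LC_ALL=C makes tr and awk treat the input as plain bytes,
    # so the accented-character bytes (0x82, 0x8A) pass through untouched.
    LC_ALL=C tr '\375\376' '\n\n' < "$1" |
        LC_ALL=C awk '{ printf "%d: %s\n", NR, $0 }'
}
```

Running el_list_fields eldata then shows empty fields as numbered blank entries, which makes it easier to see which positions are always filled and which vary between records.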

So I wrote the following program today. It took 2 hours of work and still
does not cover the special cases that are missing from this sample. This
probably shows I'm not the best Perl programmer in the world :-)

-------------------------------perl conversion script ----------------------
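#!/usr/bin/perl -p
# Pass the exported EL file (here called eldata) as the argument:
#   perl <this script> eldata > library.data.xml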

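# Open the first item; later items are opened at each blank line.
BEGIN {print "<edu:item>\n";}
# Replace the EL separator bytes with spaces.
s/[\xFA\xFB\xFC\xFD\xFE]/ /g;
# A blank line ends one record and starts the next.
s/^$/ <\/edu:item>\n\n<edu:item>/;
# Text before the first "&" code is the title; the codes run up to the first author marker.
s/(.*?)(\&.*?)(.\[1\[)/ <title> $1 <\/title>\n <elcode> $2 <\/elcode>$3/;
s/.\[1\[/\n <author> /g;
s/3\[5\[/\n <subtitle> /g;
s/2\[4\[/\n <subtitle> /g;
s/([0-9]{1,2} [A-Z][A-Z][A-Z] [0-9]{4})/\n <date> $1 <\/date>/;
s/([A-Z]{3} [0-9A-Z\.]+ [A-Z]{3}\s*[c0-9\,]*\s*[c0-9]*)[\s0]*/\n <dewey> $1 <\/dewey>/;
s/(\$[0-9\.]+)\s*([0-9]+)[\s]*(.*?)\s\s+(.*)/\n <cost> $1 $2<\/cost>\n <pub> $3 <\/pub>\n <notes> $4 <\/notes>/;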
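# 0x82 and 0x8A are é and è in the DOS code pages (CP437/CP850).
s/\x82/é/g;
s/\x8A/è/g;
# Close the author and subtitle elements at the end of their lines.
s/(<author>.*)/$1<\/author>/g;
s/(<subtitle>.*)/$1<\/subtitle>/g;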

-------------------------------------------end of perl conversion script---

resulting in the following possibly EDUML compliant data:
(warning: French accents may not show up properly)

-----------------------------------------EDUML compliant data --------------
<edu:item>
 <title> Postiers. Canadian postal worker.    </title>
 <elcode> &02*E  &AA*N   </elcode>
 <author> Bourgeois, Paulette. </author>
 <author> Chauveau, Dominique. </author>
 <subtitle> Dans mon coin </subtitle>
 <subtitle> Postes Canada. </subtitle>
 <date> 29 MAY 1995 </date>
 <dewey> SYS 383.49 BOU 1992         </dewey>
 <cost> $7.95 0</cost>
 <pub> Scholastic Canada Ltd. </pub>
 <notes> texte fr. Kim LaFave; ill. Dominique Chauveau. </notes>
 </edu:item>
	    
<edu:item>
 <title> Baiser mal<E9>fique.     </title>
 <elcode> &02*E  &AA*P   </elcode>
 <author> Souli<8A>res, Robert. </author>
 <author> Jorisch, St<E9>phane. </author>
 <subtitle> Billochet </subtitle>
 <subtitle> L<E9>gendes - Qu<E9>bec </subtitle>
 <date> 15 MAY 1997 </date>
 <dewey> SYS F SOU 1995         </dewey>
 <cost> $14.95 0</cost>
 <pub> Les 400 coups. 31 p. </pub>
 <notes> ill. St<E9>phane Jorisch. </notes>
 </edu:item>
			
<edu:item>
 <title> V<E9>lo de No<E9>mie.     </title>
 <elcode> &02*E  &AA*P   </elcode>
 <author> Pommier, Maurice. </author>
 <author> Pommier, Maurice. </author>
 <subtitle> Mes premi<8A>res d<E9>couvertes de la lecture. </subtitle>
 <subtitle> Bicyclettes - fiction. </subtitle>
 <date> 27 APR 1998 </date>
 <dewey> SYS F POM 1997         </dewey>
 <cost> $12.95 0</cost>
 <pub> Gallimard. </pub>
 <notes> <E9>crit et ill. par Maurice Pommier. </notes>
 </edu:item>
     
---------------------------------------end of EDUML compliant data----------
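
One caveat if the data is meant for generic XML tools: as listed, it has several top-level edu:item elements and bare & characters in the elcode fields, both of which a strict XML parser would reject. Here is a minimal sketch of a post-processing step for these two points only, assuming the file contains no entities yet. The wrapper element edu:library and the namespace URI are placeholders invented here, not part of any EDUML proposal.

```shell
# eduml_wrap FILE
# Escape bare ampersands and wrap all items in a single root element.
# Assumptions: the root element name and the namespace URI are placeholders,
# and FILE does not already contain entities such as &amp;.
eduml_wrap() {
    printf '%s\n' '<edu:library xmlns:edu="urn:example:eduml">'
    # In a sed replacement, \& is a literal ampersand.
    sed 's/&/\&amp;/g' "$1"
    printf '%s\n' '</edu:library>'
}
```

For example, eduml_wrap library.data.xml > library.xml would turn &02*E into &amp;02*E and leave everything else as it is.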

And now, with this data, I can write a one-line shell script to
search for keywords by tag and content of tag, so that students anywhere in the
school can look up the book references they need for their assignments.

as in :   read -p keyword; awk -v RS="\n\n" "/$REPLY/"  library.data.xml
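
The one-liner matches the keyword anywhere in a record. To restrict the match to the content of one tag, something like the following could work. It is a minimal sketch that assumes each element sits on a single line and that records are separated by truly empty lines. The function name search_tag is made up for illustration.

```shell
# search_tag TAG KEYWORD FILE
# Print every blank-line-separated record in which a line holding <TAG>
# also contains KEYWORD in its content (case-insensitive).
search_tag() {
    awk -v tag="$1" -v kw="$2" '
        BEGIN { RS = ""; ORS = "\n\n"; start = "<" tag ">"; kw = tolower(kw) }
        {
            n = split($0, line, "\n")
            for (i = 1; i <= n; i++) {
                s = line[i]
                if (index(s, start)) {
                    gsub(/<[^>]*>/, "", s)   # drop the markup, keep the content
                    if (index(tolower(s), kw)) { print; next }
                }
            }
        }
    ' "$3"
}
```

For example, search_tag author pommier library.data.xml prints only the records whose author line mentions Pommier. Setting RS to the empty string is awk's standard paragraph mode, so this does not depend on a multi-character RS.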

Or I can take a bit more time and write a nice GUI front end for the same
thing, or a CGI-bin script, or whatever.  The data is standard, but let a
thousand clients flourish. Heck, each student could program their own front end!
Remember, before this, the ONLY way to search was via the commercial library
search terminals in the library.

Economic advantage: the school I am doing this for would otherwise be
forced to buy an extension of its 5-station licence for library search
terminals... assuming the Novell server licence is extended to 50 stations
from the current 5.  Many schools are in similar predicaments.


Bruno