
[mule-cgreek:00828] [rasmith@aristotle.tamu.edu: Re: Cgreek package]



I am forwarding an email from Mr. Smith.

------- Start of forwarded message -------
To: TAKAHASHI Naoto <ntakahas@xxxxxxxxx>
Cc: mule-cgreek@xxxxxxxx
Subject: Re: Cgreek package 
In-Reply-To: Your message of "Fri, 14 Jul 2000 20:50:43 +0900."
             <200007141150.UAA02322@xxxxxxxxxxxxxxxx> 
Content-Type: multipart/mixed;
 boundary="Multipart_Fri_Jul_14_11:06:14_2000-1"
Date: Fri, 14 Jul 2000 11:06:15 -0500
From: Robin Smith <rasmith@xxxxxxxxxxxxxxxxxx>

- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: text/plain; charset=US-ASCII

>>>>> "TAKAHASHI" == TAKAHASHI Naoto <ntakahas@xxxxxxxxx> writes:

    TAKAHASHI> I have just started writing a Unicode version of
    TAKAHASHI> tlg-citation.lex in a-la-html way; information other
    TAKAHASHI> than raw text is represented in a tag surrounded by <
    TAKAHASHI> and > (just like you did for <EXC> and </EXC>).  As you
    TAKAHASHI> pointed out above, Emacs must run a function to convert
    TAKAHASHI> the tagged information into text properties after
    TAKAHASHI> having received the output of tlg-citation.lex.  It
    TAKAHASHI> might slow the processing slightly, but I do not think
    TAKAHASHI> the difference is noticeable as long as the buffer
    TAKAHASHI> contains only one work.

This would be a better solution than mine, but I'll send you what I have; it may
be useful anyway.

    >> On another subject, I'll try to send you my most recent version
    >> of tlg2emacs as well as my tlgidt2emacs (which adds a scanner
    >> for .idt file data formats and produces tables of information
    >> and offsets for the individual works contained in a .txt file).
    >> These are very rough at this point: alpha or even pre-alpha.

    TAKAHASHI> Thank you.  Would you please include
    TAKAHASHI> mule-cgreek@xxxxxxxx in the CC: field?  Many people
    TAKAHASHI> would be interested in it.

OK.  Attached are two lex files:
   tlg-citations.lex      I compile this to tlgcites2emacs; it's intended as
			  a replacement for tlg2emacs, with the addition of 
			  line citations.  At the moment, it inserts a reference
			  at the beginning of each line of the format %s.%s.%s.%s.%s
			  (for the five citation levels vwxyz) and does not print
			  the high-level citations abcd (mostly because these are
			  annoyingly reset at the beginning of every 8K block, even
			  though they usually don't change).   
      
   tlg-idt.lex            I compile this to tlgidt2emacs.  Its purpose is to read
			  the .idt files and extract simple tables containing entries

And a little bit of C:

   tlgreadblk.c		  Utterly trivial C code that opens a file, goes to an offset
			  (given as a number of 8K blocks), and dumps to stdout from 
			  that offset to another offset.  In conjunction with the .idt
			  file data, this can be used as a quick pipe to restrict output
			  to a single work (the .idt entries give the starting and ending
			  blocks for each work in a .txt file)
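
For readers without the attachment at hand, the block-range dump described
above can be sketched roughly as follows.  This is not the attached
tlgreadblk.c: the function name dump_blocks, the TLG_BLOCK_SIZE macro, and
the choice to treat the ending block as inclusive (suggested by .idt block
ranges such as 0-4 followed by 4-13) are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdio.h>

#define TLG_BLOCK_SIZE 8192L   /* TLG .txt files are organized in 8K blocks */

/* Hypothetical helper, not the attached tlgreadblk.c.
 * Copies blocks first..last (0-based, inclusive) from in to out.
 * Returns the number of bytes copied, or -1 on a bad range or I/O error. */
long dump_blocks(FILE *in, FILE *out, long first, long last)
{
    char buf[TLG_BLOCK_SIZE];
    long copied = 0;
    long blk;

    assert(in != NULL && out != NULL);
    if (first < 0 || last < first)
        return -1;
    if (fseek(in, first * TLG_BLOCK_SIZE, SEEK_SET) != 0)
        return -1;

    for (blk = first; blk <= last; blk++) {
        size_t n = fread(buf, 1, sizeof buf, in);
        if (n == 0)
            break;                      /* file ended early: stop quietly */
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        copied += (long)n;
    }
    return copied;
}
```

Under these assumptions, calling dump_blocks(fp, stdout, 4, 13) on the Plato
.txt file would emit the blocks that contain the Apologia, which could then
be piped to the scanner.  Because works rarely start on a block boundary,
the output still contains the tail of the preceding work and the head of the
following one.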

The output from tlgidt2emacs consists of entries like this:

(example from tlg0057.idt, Galen):

 Work 001: 
 Adhortatio ad artes addiscendas ()
 Starting citation:     1 1;  Ending citation:     14 30
 Level 0: line                   (1        - 30)
 Level 1: Section                (1        - 14)
 Level 2:                        (         - )
 Level 3:                        (         - )
 Level 4:                        (         - )
 Level 5:                        (         - )
  File blocks 0-5


(another example from tlg0059.idt, Plato:)


 Work 001: 
 Euthyphro ()
 Starting citation:    2 a 1;  Ending citation:    16 a 4
 Level 0: line                   (1        - 4)
 Level 1: section                (a        - a)
 Level 2: Stephanus page         (2        - 16)
 Level 3:                        (         - )
 Level 4:                        (         - )
 Level 5:                        (         - )
  File blocks 0-4
 
 Work 002: 
 Apologia Socratis ()
 Starting citation:    17 a 1;  Ending citation:    42 a 5
 Level 0: line                   (1        - 5)
 Level 1: section                (a        - a)
 Level 2: Stephanus page         (17       - 42)
 Level 3:                        (         - )
 Level 4:                        (         - )
 Level 5:                        (         - )
  File blocks 4-13
 
 Work 003: 
 Crito ()
 Starting citation:    43 a 1;  Ending citation:    54 e 2
 Level 0: line                   (1        - 2)
 Level 1: section                (a        - e)
 Level 2: Stephanus page         (43       - 54)
 Level 3:                        (         - )
 Level 4:                        (         - )
 Level 5:                        (         - )
  File blocks 13-17
 
 
Since the titles of works sometimes contain Greek beta code, I have
been piping the output from tlgidt2emacs to tlgcites2emacs (this is
obviously very inefficient, but I consider all this very much at the
preliminary development stage).  So, I've occasionally added a '&' to
the output from tlgidt2emacs to turn Latin coding back on (otherwise,
you get all sorts of conversions to Greek that you don't want).

Both tlg-citations.lex and tlg-idt.lex contain lots of redundant code,
including things I've experimented with and (for now) turned off and
leftover bits from strategies not being used.  I *think* they're both
at least moderately robust, though I haven't done enormous testing.  
They need to be faster, but I'm sure they can be sped up, once I have
them working correctly, by cleaning up the rules to combine cases that
can be combined.  
   One obvious redundancy is that tlg-idt.lex is really a modified
version of tlg-citations.lex; using both is thus a genuine kludge.  I
think the easiest improvement would be to cut all the beta-code rules
out of tlg-idt.lex, but I haven't done that yet.
   There are command-line options in both tlg-idt.lex and tlg-citations.lex,
and tlg-citations.lex has the capacity to compare citation strings ordinally
according to the rules specified in the TLG documentation.  This is 
needed to find where a work starts and ends: the block information 
from the .idt file only tells which 8K block a work starts in, not where
it starts in that block (works usually do not begin or end on block
boundaries: why the TLG still has this hideous archaic block-structured format
is mysterious to me).  However, there are strange cases that I haven't
tried to accommodate in my comparison function.  For instance, a string that
is normally just an integer value occasionally becomes something very different
for a line or two.  Example: in Aristotle, Posterior Analytics, line 99b7
we have this sequence of references:
99b7
99b8,9
99b9,10
99b10,11
99b11,12
99b12,13
99b13,14
99b15
This of course means that the editor (here Ross) has made changes that
result in lineation shifting.  Unfortunately, that makes the value of the
z-level citation here go: 7 (integer), "8,9" (string), "9,10" (string), etc.
Writing a compare function that can handle all the variants on this (or even
finding out what all the possible variants are) is probably hopeless.
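
A partial workaround for the 99b case can be sketched as follows.  This is
not the citecmp function from the attached scanner, and it does not claim to
follow the TLG documentation's comparison rules: the name cite_level_cmp and
the strategy (compare the leading integers first, then fall back to a plain
string comparison of whatever follows) are assumptions for illustration
only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparison for one citation level; not the attached citecmp.
 * Returns a value <0, 0, or >0, in the manner of strcmp. */
int cite_level_cmp(const char *a, const char *b)
{
    char *resta, *restb;
    long na, nb;

    assert(a != NULL && b != NULL);

    /* Treat the strings as numeric only if both start with a digit. */
    if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
        na = strtol(a, &resta, 10);
        nb = strtol(b, &restb, 10);
        if (na != nb)
            return (na < nb) ? -1 : 1;
        /* Same leading number: a bare "9" sorts before "9,10". */
        return strcmp(resta, restb);
    }
    /* Non-numeric levels (e.g. Stephanus sections a-e): plain strcmp. */
    return strcmp(a, b);
}
```

With this sketch, "7" < "8,9" < "9,10" < "10,11" < ... < "13,14" < "15", so
the sequence above stays in order, whereas a plain strcmp would put "9,10"
after "10,11".  It does nothing sensible for levels that mix a numeric
string with a non-numeric one, and other variants in the TLG data may well
defeat it.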

Please understand that the attached code is really sub-alpha: the pieces work,
but only just.  


Robin Smith              
Department of Philosophy            rasmith@xxxxxxxx
Texas A&M University                Voice (409) 845-5696
College Station, TX 77843-4237      FAX   (409) 845-0458


- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="tlg-citations.lex"
Content-Transfer-Encoding: 7bit

%{

  /******************************************************************
   * Scanner for TLG .txt files, with rules for extracting citation
   * information (based on N. Takahashi's tlg2emacs).    
   * Author: Robin Smith <rasmith@xxxxxxxxxxxxxxxxxx>
   * Date: 13 July 2000
   */

#include "tlg.h"
#include <sys/stat.h>
#include <unistd.h>
#define REF_SIZE 32      /* Length of a citation level (current TLG max is 31) */
#define DESC_SIZE 64     /* Length of a description level (current TLG max is 63) */
#define NUM_HIGH_LEVELS 4  /* How many high levels (author, work num, title, abbr) */
#define NUM_LOW_LEVELS 6   /* How many low levels (v-z, n) */
#define NUM_LEVELS (NUM_HIGH_LEVELS + NUM_LOW_LEVELS)

int quoting = 0;
int lc_l = 129;			/* (charset-id 'latin-iso8859-1) */
int lc_g1 = 156;		/* leading-code-private-21 */
int lc_g2 = 242;		/* default for (charset-id 'cgreek) */

int with_cites=0;               /* toggle: replace with a level indicator */ 
int seeking=-1;                 /* ID being sought (-1=not seeking) */
int emit=0;                    /* used to signal end of ref string */


/* Globals used for passing values to setlevel */ 

int d = 0, l = 0;                 /* integer input, level value */
int ix;                           /* in case we need one */
char c = ' ';                     /* character input */
char *s;                          /* string input */

char cite[REF_SIZE];                    /* Return location for the citation string */


void printl(int);
void printg(int);

/**************************************************** 
 * Added by R. Smith.  
 * Functions to extract TLG encoded data 
 */

int get7(void);
int get14(void);
char get1c(void);
char *getstr(void);


/****************************************************
 * Added by R. Smith
 * Globals and functions for TLG IDT level data
 */

int hierarchy=1;                                /* Are levels hierarchical? */


 /*******************************************************************
  * Citations are stored as strings in the 'levels'
  * array.  Levels Z-V are hierarchical citations:
  * Z is usually a line number, while the meanings of
  * Y-V vary from work to work (often only Y is present).
  * N is a 'non-hierarchical' citation indicator, and its
  * presence in a work implies that Z-V are non-hierarchical,
  * i.e., updating one of them does not imply resetting
  * all lower levels to "1".
  * 
  * A-D are the top-level counters, with these fixed meanings:
  * A = Author number (always used)
  * B = Work number (always used)
  * C = Preferred author abbreviation (optional)
  * D = Preferred work abbreviation (optional)
  * A and B are always 4-digit strings.
  */

enum level {Z=0,Y,X,W,V,N,D,C,B,A} currlevel,tmplevel; 
char levels[NUM_LEVELS][REF_SIZE]={ "","","","","","","","","",""};     

 /* A global to use for search targets */

 char target[NUM_LEVELS][REF_SIZE]={ "","","","","","","","","",""};     

 /* Place to read a target string out of argv.  This is not
  * now  used. The idea is to provide some way of giving
  * an entire reference as a single string which would then be
  * parsed into values.  One possibility is to use the TLG's own
  * dot-separated notation a.b.[c.d.]n.v.w.x.y.z; a clever
  * variant would read the string as a.b.[[[[n.][v.]][w.]][x.]]y.z, 
  * i.e., assume a,b,y,z are always present, parse any other
  * values backwards starting with x.
  */

char target_string[64];          /* NOT CURRENTLY BEING USED */


 /* List of targets already found.  Significance:
  *  0: not looking for this level
  * -1: looking, not yet found
  *  1: was looking, have found
  */

 int found[NUM_LEVELS]={0,0,0,0,0,0,0,0,0,0};

 /*********************************************************
  * Descriptors are arbitrary text strings.  Most works do not
  * include any, and TLG documentation says at most 8 have been
  * used in existing files; however, allowance must be made for
  * up to 26 (i.e. a..z).  They are currently limited to 31
  * characters, but the TLG says this may change.  These are
  * identified by character ('a'...'z') in the text; I access
  * the array elements as i-97 (i.e. i-'a').  The string length
  * is set by DESC_SIZE.
  */
 char descriptors[26][DESC_SIZE];

 /* Functions for setting the IDT level value */

char * setlevel(int which, int how, int val, char ch, char* str);
int getlevel(void);

void setseeking(void);
int citecmp (char *str1, char *str2);       /* Compares idt strings */

/*************************************************
 * Formatting strings.   These determine what is
 * printed by setlevel.
 */

 /* Print absolutely everything */
 char  fmt_all[NUM_LEVELS][REF_SIZE]={"\n%-10s",
			"\n%-10s",
			"\n%-10s",
			"\n%-10s",
			"\n%-10s",
			"\n%-10s",
			"\nAuthor abbr. %s",
			"\nWork abbr. %s",
			"\nWork %s",
			"\nAuthor %s"
 };

 /* Print z-v,n with newline */

char fmt_low[NUM_LEVELS][REF_SIZE]={"\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		     "",
		     "",
		     "",
		     ""
 };

/* Emit newlines when z-n change */
 char fmt_min[NUM_LEVELS][REF_SIZE]={"\n",
		       "\n",
		       "\n",
		       "\n",
		       "\n",
		       "\n",
		       "",
		       "",
		       "",
		       ""
 };

 /* Print nothing at all */
 char fmt_none[NUM_LEVELS][REF_SIZE]={"",
			"",
			"",
			"",
			"",
			"",
			"",
			"",
			"",
			""
 };

 char * fmt=&fmt_min[0][0];



%}

%s LATIN

%s SEEK

%s CITES
%%

    if (seeking>=0) { BEGIN(SEEK); };

<SEEK>{

[\x0-\x7f]+     ;    /* If seeking, we trash everything but references */

}

"_"		printf("--");
"!"		printf(" ");
"?"		;
"`"		;

"^"[0-9]+	printf("\t");

"$"[0-9]*	BEGIN 0;

"&"[0-9]*	BEGIN LATIN;

"%"		printf("{\\dag}");
"%1"		printf("?");
"%2"		printf("*");
"%3"		printf("/");
"%4"		printf("!");
"%5"		printf("|");
"%6"		printf("=");
"%7"		printf("+");
"%8"		printf("%%");
"%9"		printf("&");
"%10"		printf(":");
"%11"		printf(".");
"%12"		printf("*");
"%13"		printf("{\\ddag}");
"%14"		printf("{\\P}");
"%15"		printf("|");
"%16"		printf("|");
"%17"		printf("||");
"%18"		printf("'");
"%19"		printf("-");
"%24"		printf("~");
"%25"		printf("|");
"%26"		;
"%27"		;
"%30"		printg(G_smooth);
"%31"		printg(G_rough);
"%32"		printg(G_acute);
"%33"		printg(G_grave);
"%34"		printg(G_circ);
"%35"		printg(G_smooth_acute);
"%36"		printg(G_rough_acute);
"%37"		printg(G_rough_grave);
"%38"		printg(G_rough_circ);
"%39"		printg(G_uml);
"%100"		printf(";");
"%101"		printf("#");
"%102"		printf("`");
"%103"		printf("\\");
"%104"		printf("^");
"%105"		printf("|||");
"%107"		printf("~");
"%132"		printg(G_acute_uml);
"%133"		printg(G_smooth_grave);
"%134"		printg(G_smooth_circ);
"%160"		printg(G_hyphen);
"%"[0-9]+	printf(" ");

\"1		printf("``");
\"2		printf("''");
\"3		{
		  if (quoting) { printf("'"); quoting = 0; }
		  else { printf("`"); quoting = 1; }
		}
\"4		printf("`");
\"5		printf("'");
\"6		{
		  if (quoting) { printf(">>"); quoting = 0; }
		  else { printf("<<"); quoting = 1; }
		}
\"7		{
		  if (quoting) { printg(G_rangle); quoting = 0; }
		  else { printg(G_langle); quoting = 1; }
		}
\"[0-9]*	printf("\"");

          /* Rules added by R. Smith for page formatting */

"@"	       printf(" ") ;          /* Indentation Marker, space holder */
"@1"           printf("\n") ; /* End of Page  */
"@2"           printf("\n") ; /* End of Column */
"@3"           ; /* Graph, Chart or Table   */
"@4"           ; /* Beginning of Table (obsolete) */
"@5"           ; /* End of Table (obsolete) */
"@6"           printf("\n") ; /* Blank Line (obsolete)  */
"@7"           printf("_____________________\n") ; /* Horizontal Rule (0086)  */
"@8"           printf("    / "); /* Line Break: where a mid-line change in citation forces the 
			       * division of a line, @8 is placed at the end of the former line
			       * segment.  */
"@9"           printf("\n") ; /* Break in text: end-line (papyri)/mid line paragraph (poetry)  */

"@10"          ; /* Line too long for screen  */
"@11"          printf("\n"); /* Column element  */
"@12"          printf("\n"); /* Column subelement  */
"@20"          printf("\n")  ; /* End of columnar text */
"@2"[1-9]      printf("\n") ; /* Begin column n */

"@40"          printf("    ") ; /* Horizontal space filling  */
"@50"          ; /* Reserved for papyri  */
"@51"          ; /* Writing perpendicular to main text */
"@52"          ; /* Writing inverse to main text */

"@6"[0-9]      ;  /* Reserved for inscriptions */






"["		printg(G_lbracket);
"[1"		printg(G_lparen);
"[2"		printg(G_langle);
"[3"		printg(G_lbrace);
"[4"		printf("[[");
"[11"		printg(G_lparen);
"[12"		printf("<=");
"[2"[0-9]	printg(G_lbrace);
"[3"[0-9]	printg(G_lparen);
"[53"		printg(G_lparen);
"["[0-9]+	printg(G_lbracket);

"]"		printg(G_rbracket);
"]1"		printg(G_rparen);
"]2"		printg(G_rangle);
"]3"		printg(G_rbrace);
"]4"		printf("]]");
"]11"		printg(G_rparen);
"]12"		printf("=>");
"]2"[0-9]	printg(G_rbrace);
"]3"[0-9]	printg(G_rparen);
"]53"		printg(G_rparen);
"]"[0-9]+	printg(G_rbracket);

"<"		|
">"		printf("~");
"<1"		|
">1"		printf("_");
"<"[0-9]+	|
">"[0-9]+	printg(G_hyphen);

"{"[0-9]*	printg(G_lbrace);
"}"[0-9]*	printg(G_rbrace);

"#"		printf("'");
"#1"		printg(G_qoppa);
"#2"		printg(G_stigma);
"#3"		printg(G_QOPPA);
"#4"		printg(G_stigma);
"#5"		printg(G_sampi);
"#80"		printg(G_uml);
"#81"		printf("'");
"#82"		printg(G_acute);
"#83"		printg(G_grave);
"#84"		printg(G_circ);
"#85"		printg(G_rough);
"#86"		printg(G_smooth);
"#"[0-9]+	printf(" ");


<INITIAL,SEEK,LATIN>{

\xff             printf("<EOS>");       /* Should never be triggered */
\xfe[\x0]+       ;                      /* Trash end-of-block and NUL pads  */
\xf0\xfe[\x0]*   ; /*printf("<EOF>"); */   /* Might be useful for error checking */
\xf8             printf("<EXC>");       /* See TLG documentation about these */
\xf9             printf("</EXC>");
  
 /* **************************************************************************
  * TODO:
  * The rules for citation strings (characters >0x7f) are each given with two
  * different trailing contexts: (1) more citation info (/[\x80-\xff]), (2) text
  * data  (/[\x20-\x7f]).   The only purpose this has is to signal whether the given
  * pattern match is the last one in a string of citation data, so that we can tell
  * whether it's time to assemble and emit a citation string: so, the only difference
  * between the rules is the value they assign to 'emit'.  There's probably a simpler
  * way to handle this. 
  *
  * To make it easier to tinker with the code, I have not
  * tried to combine rules with identical actions (e.g. the pattern for the first
  * six could be combined to [\x80\x90\xa0\xb0\xc0\xd0]/[\x80-\xff]).
  * The final scanner would be considerably faster with these changes.
  *
  */


\x80/[\x80-\xff]  { emit=0; l=getlevel(); setlevel(l,0,0,c,"");} 
\x90/[\x80-\xff]  { emit=0; l=getlevel(); setlevel(l,0,0,c,"");} 
\xa0/[\x80-\xff]  { emit=0; l=getlevel(); setlevel(l,0,0,c,"");} 
\xb0/[\x80-\xff]  { emit=0; l=getlevel(); setlevel(l,0,0,c,"");} 
\xc0/[\x80-\xff]  { emit=0; l=getlevel(); setlevel(l,0,0,c,"");} 
\xd0/[\x80-\xff]  { emit=0; l=getlevel(); setlevel(l,0,0,c,"");} 

 
\x80/[\x20-\x7f]  { emit=1; l=getlevel(); setlevel(l,0,0,c,"");} 
\x90/[\x20-\x7f]  { emit=1; l=getlevel(); setlevel(l,0,0,c,"");} 
\xa0/[\x20-\x7f]  { emit=1; l=getlevel(); setlevel(l,0,0,c,"");} 
\xb0/[\x20-\x7f]  { emit=1; l=getlevel(); setlevel(l,0,0,c,"");} 
\xc0/[\x20-\x7f]  { emit=1; l=getlevel(); setlevel(l,0,0,c,"");} 
\xd0/[\x20-\x7f]  { emit=1; l=getlevel(); setlevel(l,0,0,c,"");} 

\xe0[\x80-\xff] { fprintf(stderr,"Error: type 0 low-level increment with high level\n"); } 

[\x81-\x87]/[\x80-\xff] { emit=0; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\x91-\x97]/[\x80-\xff] { emit=0; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xa1-\xa7]/[\x80-\xff] { emit=0; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xb1-\xb7]/[\x80-\xff] { emit=0; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xc1-\xc7]/[\x80-\xff] { emit=0; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xd1-\xd7]/[\x80-\xff] { emit=0; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }


[\x81-\x87]/[\x20-\x7f] { emit=1; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\x91-\x97]/[\x20-\x7f] { emit=1; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xa1-\xa7]/[\x20-\x7f] { emit=1; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xb1-\xb7]/[\x20-\x7f] { emit=1; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xc1-\xc7]/[\x20-\x7f] { emit=1; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xd1-\xd7]/[\x20-\x7f] { emit=1; d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }


[\xe1-\xe7][\x80-\xff] { fprintf(stderr,"Error: type 1 low-level set with high level\n"); }
  
\x88[\x80-\xff]/[\x80-\xff]  { emit=0; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\x98[\x80-\xff]/[\x80-\xff]  { emit=0; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xa8[\x80-\xff]/[\x80-\xff]  { emit=0; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xb8[\x80-\xff]/[\x80-\xff]  { emit=0; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xc8[\x80-\xff]/[\x80-\xff]  { emit=0; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xd8[\x80-\xff]/[\x80-\xff]  { emit=0; l=getlevel();d=get7();setlevel(l,1,d,c,""); }


  
\x88[\x80-\xff]/[\x20-\x7f]  { emit=1; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\x98[\x80-\xff]/[\x20-\x7f]  { emit=1; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xa8[\x80-\xff]/[\x20-\x7f]  { emit=1; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xb8[\x80-\xff]/[\x20-\x7f]  { emit=1; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xc8[\x80-\xff]/[\x20-\x7f]  { emit=1; l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xd8[\x80-\xff]/[\x20-\x7f]  { emit=1; l=getlevel();d=get7();setlevel(l,1,d,c,""); }

\xe8[\x80-\xff]{2} { fprintf(stderr,"Error: type 1 low-level set with high level\n"); }

\x89[\x80-\xff]{2}/[\x80-\xff] { emit=0;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\x99[\x80-\xff]{2}/[\x80-\xff] { emit=0;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xa9[\x80-\xff]{2}/[\x80-\xff] { emit=0;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xb9[\x80-\xff]{2}/[\x80-\xff] { emit=0;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xc9[\x80-\xff]{2}/[\x80-\xff] { emit=0;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xd9[\x80-\xff]{2}/[\x80-\xff] { emit=0;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }

  
\x89[\x80-\xff]{2}/[\x20-\x7f] { emit=1;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\x99[\x80-\xff]{2}/[\x20-\x7f] { emit=1;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xa9[\x80-\xff]{2}/[\x20-\x7f] { emit=1;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xb9[\x80-\xff]{2}/[\x20-\x7f] { emit=1;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xc9[\x80-\xff]{2}/[\x20-\x7f] { emit=1;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }
\xd9[\x80-\xff]{2}/[\x20-\x7f] { emit=1;l=getlevel(); d=get7(); c=get1c();setlevel(l,2,d,c,""); }

\xe9[\x80-\xff]{3} { fprintf(stderr,"Error: type 2 low-level set with high level\n");}
	    
[\x8a\x9a\xaa\xba\xca\xda][\x80-\xff][^\xff]*\xff/[\x80-\xff] {
                                                    emit=0; 
                                                    l=getlevel();
						    d=get7();
						    s=getstr();
						    setlevel(l,3,d,c,s); 
                                                   }

[\x8a\x9a\xaa\xba\xca\xda][\x80-\xff][^\xff]*\xff/[\x20-\x7f] {
                                                    emit=1; 
                                                    l=getlevel();
						    d=get7();
						    s=getstr();
						    setlevel(l,3,d,c,s); 
                                                   }


\xea[\x80-\xff]{2}[^\xff]*\xff { fprintf(stderr,"Error: type 3 low-level set with high level\n"); }

[\x8b\x9b\xab\xbb\xcb\xdb][\x80-\xff]{2}/[\x80-\xff] { 
                                                       emit=0;
						       l=getlevel();
					               d=get14(); 
					               setlevel(l,1,d,c,""); 
                                                      }

[\x8b\x9b\xab\xbb\xcb\xdb][\x80-\xff]{2}/[\x20-\x7f] { 
                                                       emit=1;
						       l=getlevel();
					               d=get14(); 
					               setlevel(l,1,d,c,""); 
                                                      }

\xeb[\x80-\xff]{3} { fprintf(stderr,"Error: type 1 low-level set with high level\n"); }

[\x8c\x9c\xac\xbc\xcc\xdc][\x80-\xff]{3}/[\x80-\xff] { 
                                                      l=getlevel();
						      emit=0;
						      d=get14();
						      c=get1c(); 
						      setlevel(l,2,d,c,""); 
                                                     }

[\x8c\x9c\xac\xbc\xcc\xdc][\x80-\xff]{3}/[\x20-\x7f] { 
                                                      l=getlevel();
						      emit=1;
						      d=get14();
						      c=get1c(); 
						      setlevel(l,2,d,c,""); 
                                                     }

\xec[\x80-\xff]{4} { fprintf(stderr,"Error: type 2 low-level set with high level\n"); }

[\x8d\x9d\xad\xbd\xcd\xdd][\x80-\xff]{2}[^\xff]*\xff/[\x80-\xff] {
                                       emit=0; 
				       l=getlevel();
				       d=get14();
				       s=getstr(); 
				       setlevel(l,3,d,c,s); 
                                                                 }

[\x8d\x9d\xad\xbd\xcd\xdd][\x80-\xff]{2}[^\xff]*\xff/[\x20-\x7f] {
                                       emit=1; 
				       l=getlevel();
				       d=get14();
				       s=getstr(); 
				       setlevel(l,3,d,c,s); 
                                                                 }


\xed[\x80-\xff]{3}[^\xff]*\xff { fprintf(stderr,"Error: type 3 low-level set with high level\n"); }

[\x8e\x9e\xae\xbe\xce\xde][\x80-\xff]/[\x80-\xff] { 
                                                   l=getlevel(); 
						   emit=0;
						   c=get1c(); 
						   setlevel(l,4,d,c,""); 
                                                  }


[\x8e\x9e\xae\xbe\xce\xde][\x80-\xff]/[\x20-\x7f] { 
                                                   l=getlevel(); 
						   emit=1;
						   c=get1c(); 
						   setlevel(l,4,d,c,""); 
                                                  }

\xee[\x80-\xff]{2} { fprintf (stderr, "Error: type 4 low-level set with high level\n"); }

[\x8f\x9f\xaf\xbf\xcf\xdf][^\xff]*\xff/[\x80-\xff] {
                                                     emit=0; 
						     l=getlevel();
						     s=getstr(); 
						     setlevel(l,5,d,c,s); 
                                                   } 

[\x8f\x9f\xaf\xbf\xcf\xdf][^\xff]*\xff/[\x20-\x7f] {
                                                     emit=1; 
						     l=getlevel();
						     s=getstr(); 
						     setlevel(l,5,d,c,s); 
                                                   } 


\xef[\x80-\xfe][^\xff]*\xff { l=getlevel(); s=getstr(); setlevel(l,5,d,c,s); }

[\x80-\xff]     ;          /* Should never be reached */

}

<LATIN>{
  "A+"		printl(L_Auml);
  "A/"		printl(L_Aacute);
  "A="		printl(L_Acirc);
  "A\\"		printl(L_Agrave);
  "A%24"	printl(L_Atilde);
  "E+"		printl(L_Euml);
  "E/"		printl(L_Eacute);
  "E="		printl(L_Ecirc);
  "E\\"		printl(L_Egrave);
  "I+"		printl(L_Iuml);
  "I/"		printl(L_Iacute);
  "I="		printl(L_Icirc);
  "I\\"		printl(L_Igrave);
  "O+"		printl(L_Ouml);
  "O/"		printl(L_Oacute);
  "O="		printl(L_Ocirc);
  "O\\"		printl(L_Ograve);
  "O%24"	printl(L_Otilde);
  "U+"		printl(L_Uuml);
  "U/"		printl(L_Uacute);
  "U="		printl(L_Ucirc);
  "U\\"		printl(L_Ugrave);
  "C%25"	printl(L_Ccedil);
  "N%24"	printl(L_Ntilde);
  "Y/"		printl(L_Yacute);
  "a+"		printl(L_auml);
  "a/"		printl(L_aacute);
  "a="		printl(L_acirc);
  "a\\"		printl(L_agrave);
  "a%24"	printl(L_atilde);
  "e+"		printl(L_euml);
  "e/"		printl(L_eacute);
  "e="		printl(L_ecirc);
  "e\\"		printl(L_egrave);
  "i+"		printl(L_iuml);
  "i/"		printl(L_iacute);
  "i="		printl(L_icirc);
  "i\\"		printl(L_igrave);
  "o+"		printl(L_ouml);
  "o/"		printl(L_oacute);
  "o="		printl(L_ocirc);
  "o\\"		printl(L_ograve);
  "o%24"	printl(L_otilde);
  "u+"		printl(L_uuml);
  "u/"		printl(L_uacute);
  "u="		printl(L_ucirc);
  "u\\"		printl(L_ugrave);
  "c%25"	printl(L_ccedil);
  "n%24"	printl(L_ntilde);
  "y/"		printl(L_yacute);
  .		ECHO;
}



"A"		printg(G_alpha);
"B"		printg(G_beta);
"C"		printg(G_xi);
"D"		printg(G_delta);
"E"		printg(G_epsilon);
"F"		printg(G_phi);
"G"		printg(G_gamma);
"H"		printg(G_eta);
"I"		printg(G_iota);
"K"		printg(G_kappa);
"L"		printg(G_lambda);
"M"		printg(G_mu);
"N"		printg(G_nu);
"O"		printg(G_omicron);
"P"		printg(G_pi);
"Q"		printg(G_theta);
"R"		printg(G_rho);
"T"		printg(G_tau);
"U"		printg(G_upsilon);
"V"		printg(G_digamma);
"W"		printg(G_omega);
"X"		printg(G_chi);
"Y"		printg(G_psi);
"Z"		printg(G_zeta);

"A("		printg(G_alpha_rough);
"A(/"		printg(G_alpha_rough_acute);
"A(/|"		printg(G_alpha_rough_acute_isub);
"A(\\"		printg(G_alpha_rough_grave);
"A(\\|"		printg(G_alpha_rough_grave_isub);
"A(="		printg(G_alpha_rough_circ);
"A(=|"		printg(G_alpha_rough_circ_isub);
"A(|"		printg(G_alpha_rough_isub);
"A)"		printg(G_alpha_smooth);
"A)/"		printg(G_alpha_smooth_acute);
"A)/|"		printg(G_alpha_smooth_acute_isub);
"A)\\"		printg(G_alpha_smooth_grave);
"A)\\|"		printg(G_alpha_smooth_grave_isub);
"A)="		printg(G_alpha_smooth_circ);
"A)=|"		printg(G_alpha_smooth_circ_isub);
"A)|"		printg(G_alpha_smooth_isub);
"A/"		printg(G_alpha_acute);
"A/|"		printg(G_alpha_acute_isub);
"A\\"		printg(G_alpha_grave);
"A\\|"		printg(G_alpha_grave_isub);
"A="		printg(G_alpha_circ);
"A=|"		printg(G_alpha_circ_isub);
"A|"		printg(G_alpha_isub);

"E("		printg(G_epsilon_rough);
"E(/"		printg(G_epsilon_rough_acute);
"E(\\"		printg(G_epsilon_rough_grave);
"E)"		printg(G_epsilon_smooth);
"E)/"		printg(G_epsilon_smooth_acute);
"E)\\"		printg(G_epsilon_smooth_grave);
"E/"		printg(G_epsilon_acute);
"E\\"		printg(G_epsilon_grave);

"H("		printg(G_eta_rough);
"H(/"		printg(G_eta_rough_acute);
"H(/|"		printg(G_eta_rough_acute_isub);
"H(\\"		printg(G_eta_rough_grave);
"H(\\|"		printg(G_eta_rough_grave_isub);
"H(="		printg(G_eta_rough_circ);
"H(=|"		printg(G_eta_rough_circ_isub);
"H(|"		printg(G_eta_rough_isub);
"H)"		printg(G_eta_smooth);
"H)/"		printg(G_eta_smooth_acute);
"H)/|"		printg(G_eta_smooth_acute_isub);
"H)\\"		printg(G_eta_smooth_grave);
"H)\\|"		printg(G_eta_smooth_grave_isub);
"H)="		printg(G_eta_smooth_circ);
"H)=|"		printg(G_eta_smooth_circ_isub);
"H)|"		printg(G_eta_smooth_isub);
"H/"		printg(G_eta_acute);
"H/|"		printg(G_eta_acute_isub);
"H\\"		printg(G_eta_grave);
"H\\|"		printg(G_eta_grave_isub);
"H="		printg(G_eta_circ);
"H=|"		printg(G_eta_circ_isub);
"H|"		printg(G_eta_isub);

"I("		printg(G_iota_rough);
"I(/"		printg(G_iota_rough_acute);
"I(\\"		printg(G_iota_rough_grave);
"I(="		printg(G_iota_rough_circ);
"I)"		printg(G_iota_smooth);
"I)/"		printg(G_iota_smooth_acute);
"I)\\"		printg(G_iota_smooth_grave);
"I)="		printg(G_iota_smooth_circ);
"I/"		printg(G_iota_acute);
"I\\"		printg(G_iota_grave);
"I="		printg(G_iota_circ);
"I+"		printg(G_iota_uml);
"I/+"		printg(G_iota_acute_uml);
"I\\+"		printg(G_iota_grave_uml);

"O("		printg(G_omicron_rough);
"O(/"		printg(G_omicron_rough_acute);
"O(\\"		printg(G_omicron_rough_grave);
"O)"		printg(G_omicron_smooth);
"O)/"		printg(G_omicron_smooth_acute);
"O)\\"		printg(G_omicron_smooth_grave);
"O/"		printg(G_omicron_acute);
"O\\"		printg(G_omicron_grave);

"R("		printg(G_rho_rough);
"R)"		printg(G_rho_smooth);

"S1"		printg(G_sigma);
"S"/[-A-Z]	printg(G_sigma);
"S"[23]?	printg(G_fsigma);

"U("		printg(G_upsilon_rough);
"U(/"		printg(G_upsilon_rough_acute);
"U(\\"		printg(G_upsilon_rough_grave);
"U(="		printg(G_upsilon_rough_circ);
"U)"		printg(G_upsilon_smooth);
"U)/"		printg(G_upsilon_smooth_acute);
"U)\\"		printg(G_upsilon_smooth_grave);
"U)="		printg(G_upsilon_smooth_circ);
"U/"		printg(G_upsilon_acute);
"U\\"		printg(G_upsilon_grave);
"U="		printg(G_upsilon_circ);
"U+"		printg(G_upsilon_uml);
"U/+"		printg(G_upsilon_acute_uml);
"U\\+"		printg(G_upsilon_grave_uml);

"W("		printg(G_omega_rough);
"W(/"		printg(G_omega_rough_acute);
"W(/|"		printg(G_omega_rough_acute_isub);
"W(\\"		printg(G_omega_rough_grave);
"W(\\|"		printg(G_omega_rough_grave_isub);
"W(="		printg(G_omega_rough_circ);
"W(=|"		printg(G_omega_rough_circ_isub);
"W(|"		printg(G_omega_rough_isub);
"W)"		printg(G_omega_smooth);
"W)/"		printg(G_omega_smooth_acute);
"W)/|"		printg(G_omega_smooth_acute_isub);
"W)\\"		printg(G_omega_smooth_grave);
"W)\\|"		printg(G_omega_smooth_grave_isub);
"W)="		printg(G_omega_smooth_circ);
"W)=|"		printg(G_omega_smooth_circ_isub);
"W)|"		printg(G_omega_smooth_isub);
"W/"		printg(G_omega_acute);
"W/|"		printg(G_omega_acute_isub);
"W\\"		printg(G_omega_grave);
"W\\|"		printg(G_omega_grave_isub);
"W="		printg(G_omega_circ);
"W=|"		printg(G_omega_circ_isub);
"W|"		printg(G_omega_isub);

"*A"		printg(G_ALPHA);
"*B"		printg(G_BETA);
"*C"		printg(G_XI);
"*D"		printg(G_DELTA);
"*E"		printg(G_EPSILON);
"*F"		printg(G_PHI);
"*G"		printg(G_GAMMA);
"*H"		printg(G_ETA);
"*I"		printg(G_IOTA);
"*K"		printg(G_KAPPA);
"*L"		printg(G_LAMBDA);
"*M"		printg(G_MU);
"*N"		printg(G_NU);
"*O"		printg(G_OMICRON);
"*P"		printg(G_PI);
"*Q"		printg(G_THETA);
"*R"		printg(G_RHO);
"*S"		printg(G_SIGMA);
"*T"		printg(G_TAU);
"*U"		printg(G_UPSILON);
"*W"		printg(G_OMEGA);
"*X"		printg(G_CHI);
"*Y"		printg(G_PSI);
"*Z"		printg(G_ZETA);

"*(A"		{ printg(G_rough); printg(G_ALPHA); }
"*(/A"		{ printg(G_rough_acute); printg(G_ALPHA); }
"*(\\A"		{ printg(G_rough_grave); printg(G_ALPHA); }
"*(=A"		{ printg(G_rough_circ); printg(G_ALPHA); }
"*)A"		{ printg(G_smooth); printg(G_ALPHA); }
"*)/A"		{ printg(G_smooth_acute); printg(G_ALPHA); }
"*)\\A"		{ printg(G_smooth_grave); printg(G_ALPHA); }
"*)=A"		{ printg(G_smooth_circ); printg(G_ALPHA); }
"*/A"		{ printg(G_acute); printg(G_ALPHA); }
"*\\A"		{ printg(G_grave); printg(G_ALPHA); }
"*=A"		{ printg(G_circ); printg(G_ALPHA); }

"*(E"		{ printg(G_rough); printg(G_EPSILON); }
"*(/E"		{ printg(G_rough_acute); printg(G_EPSILON); }
"*(\\E"		{ printg(G_rough_grave); printg(G_EPSILON); }
"*)E"		{ printg(G_smooth); printg(G_EPSILON); }
"*)/E"		{ printg(G_smooth_acute); printg(G_EPSILON); }
"*)\\E"		{ printg(G_smooth_grave); printg(G_EPSILON); }
"*/E"		{ printg(G_acute); printg(G_EPSILON); }
"*\\E"		{ printg(G_grave); printg(G_EPSILON); }

"*(H"		{ printg(G_rough); printg(G_ETA); }
"*(/H"		{ printg(G_rough_acute); printg(G_ETA); }
"*(\\H"		{ printg(G_rough_grave); printg(G_ETA); }
"*(=H"		{ printg(G_rough_circ); printg(G_ETA); }
"*)H"		{ printg(G_smooth); printg(G_ETA); }
"*)/H"		{ printg(G_smooth_acute); printg(G_ETA); }
"*)\\H"		{ printg(G_smooth_grave); printg(G_ETA); }
"*)=H"		{ printg(G_smooth_circ); printg(G_ETA); }
"*/H"		{ printg(G_acute); printg(G_ETA); }
"*\\H"		{ printg(G_grave); printg(G_ETA); }
"*=H"		{ printg(G_circ); printg(G_ETA); }

"*(I"		{ printg(G_rough); printg(G_IOTA); }
"*(/I"		{ printg(G_rough_acute); printg(G_IOTA); }
"*(\\I"		{ printg(G_rough_grave); printg(G_IOTA); }
"*(=I"		{ printg(G_rough_circ); printg(G_IOTA); }
"*)I"		{ printg(G_smooth); printg(G_IOTA); }
"*)/I"		{ printg(G_smooth_acute); printg(G_IOTA); }
"*)\\I"		{ printg(G_smooth_grave); printg(G_IOTA); }
"*)=I"		{ printg(G_smooth_circ); printg(G_IOTA); }
"*/I"		{ printg(G_acute); printg(G_IOTA); }
"*\\I"		{ printg(G_grave); printg(G_IOTA); }
"*=I"		{ printg(G_circ); printg(G_IOTA); }
"*+I"		{ printg(G_uml); printg(G_IOTA); }
"*/+I"		{ printg(G_acute_uml); printg(G_IOTA); }
"*\\+I"		{ printg(G_grave_uml); printg(G_IOTA); }

"*(O"		{ printg(G_rough); printg(G_OMICRON); }
"*(/O"		{ printg(G_rough_acute); printg(G_OMICRON); }
"*(\\O"		{ printg(G_rough_grave); printg(G_OMICRON); }
"*)O"		{ printg(G_smooth); printg(G_OMICRON); }
"*)/O"		{ printg(G_smooth_acute); printg(G_OMICRON); }
"*)\\O"		{ printg(G_smooth_grave); printg(G_OMICRON); }
"*/O"		{ printg(G_acute); printg(G_OMICRON); }
"*\\O"		{ printg(G_grave); printg(G_OMICRON); }

"*(R"		{ printg(G_rough); printg(G_RHO); }
"*)R"		{ printg(G_smooth); printg(G_RHO); }

"*(U"		{ printg(G_rough); printg(G_UPSILON); }
"*(/U"		{ printg(G_rough_acute); printg(G_UPSILON); }
"*(\\U"		{ printg(G_rough_grave); printg(G_UPSILON); }
"*(=U"		{ printg(G_rough_circ); printg(G_UPSILON); }
"*)U"		{ printg(G_smooth); printg(G_UPSILON); }
"*)/U"		{ printg(G_smooth_acute); printg(G_UPSILON); }
"*)\\U"		{ printg(G_smooth_grave); printg(G_UPSILON); }
"*)=U"		{ printg(G_smooth_circ); printg(G_UPSILON); }
"*/U"		{ printg(G_acute); printg(G_UPSILON); }
"*\\U"		{ printg(G_grave); printg(G_UPSILON); }
"*=U"		{ printg(G_circ); printg(G_UPSILON); }
"*+U"		{ printg(G_uml); printg(G_UPSILON); }
"*/+U"		{ printg(G_acute_uml); printg(G_UPSILON); }
"*\\+U"		{ printg(G_grave_uml); printg(G_UPSILON); }

"*(W"		{ printg(G_rough); printg(G_OMEGA); }
"*(/W"		{ printg(G_rough_acute); printg(G_OMEGA); }
"*(\\W"		{ printg(G_rough_grave); printg(G_OMEGA); }
"*(=W"		{ printg(G_rough_circ); printg(G_OMEGA); }
"*)W"		{ printg(G_smooth); printg(G_OMEGA); }
"*)/W"		{ printg(G_smooth_acute); printg(G_OMEGA); }
"*)\\W"		{ printg(G_smooth_grave); printg(G_OMEGA); }
"*)=W"		{ printg(G_smooth_circ); printg(G_OMEGA); }
"*/W"		{ printg(G_acute); printg(G_OMEGA); }
"*\\W"		{ printg(G_grave); printg(G_OMEGA); }
"*=W"		{ printg(G_circ); printg(G_OMEGA); }

"("		printg(G_rough);
"(/"		printg(G_rough_acute);
"(\\"		printg(G_rough_grave);
"(="		printg(G_rough_circ);
")"		printg(G_smooth);
")/"		printg(G_smooth_acute);
")\\"		printg(G_smooth_grave);
")="		printg(G_smooth_circ);
"+"		printg(G_uml);
","		printg(G_comma);
"-"		printg(G_hyphen);
"."		printg(G_fullstop);
"/"		printg(G_acute);
"/+"		printg(G_acute_uml);
":"		printg(G_colon);
";"		printg(G_question);
"="		printg(G_circ);
"\\"		printg(G_grave);
"\\+"		printg(G_grave_uml);
"|"		printg(G_isub);

.		ECHO;



%%

void printl(int code)
{
  printf("%c%c", lc_l, code);
}

void printg(int code)
{
  printf("%c%c%c%c", lc_g1, lc_g2, code / 96 + 160, code % 96 + 160);
}


int get7 (void) {
  int i;
  i=(int)(unsigned char)*yytext++&0x7f;
  return i;
}

int get14 (void) {
  int i;
  i=(int)(unsigned char)*yytext++&0x7f;
  i=(i*128)+((int)(unsigned char)*yytext++&0x7f);
  return i;
}

char get1c (void) {

  char c;
  c=*yytext++&0x7f;
  return c;

}

char * getstr (void) {

     char *charptr;
     charptr=yytext;
     while ((unsigned char)*yytext<0xff) {
       *yytext&=0x7f;     /* strip the high bit in place, then advance */
       yytext++;
     };
     *yytext++=0x00;
     return charptr;
}



int getlevel(void) {
/*    printf("getlevel "); */
  switch (*yytext++&0xf0) {
  case 0x80:
    return Z;
    break;
  case 0x90:
    return Y;
    break;
  case 0xA0:
    return X;
    break;
  case 0xB0:
    return W;
    break;
  case 0xC0:
    return V;
    break;
  case 0xD0:
    hierarchy=0;
    return N;
    break;
  case 0xE0:
    if (toascii(*yytext)<4) {
      return(A-toascii(*yytext++));
    } else {
      return(toascii(*yytext++));
    };
    break;
  };
}

/**********************************************
 * TLG Citation String compare function
 */

int compcite(char *str1, char *str2) {   /* Compares citation strings */
  int d1, d2;
  char c1, c2;
  char *s1; char *s2;

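  /* Unimplemented stub (setlevel calls citecmp, defined below).  Plan:
   * Use strpbrk(str,"0123456789") to get the start of the digits
   * Use strspn(str+last result, "0123456789") for the end of the digits
   * atoi the digit strings into d1/d2
   * compare the digits
   * compare the characters
   */  

  return 0;   /* placeholder so that the int function returns a value */
};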
/***********************************************************************
 *  TLG IDT level updater.  This has to do three things:
 *  (1) update the level string, possibly by incrementing it 
 *        (this may mean incrementing a trailing number after 
 *	  a character or a trailing character after a number as well
 *        as incrementing a number)
 *  (2) reset lower levels to 1, if the levels are hierarchical (indicated
 *      by the global 'hierarchy').
 *  (3) turn hierarchy off if level N is being set.
 *
 *  The return is clunky: it strcats some level strings to the global
 *  'cite' and returns its address.  That could be fixed.   
 */ 


/************************************************************
 * The setlevel routine keeps track of both citations and
 * (should there be any) descriptions.  A citation value
 * may consist of an integer, a string, or an integer followed by
 * a string or character; so the parameters include one of each.
 * After setting the level, it prints in accordance with the 
 * current print citation format.  If in seek mode, it checks for
 * a match with the target string for the level.  (If there was
 * previously a match but now there isn't, we complain.)
 * 
 */


char * setlevel(int which, int how, int val, char ch, char* str) {
  int d,ix,last_i, last_d, first_d;
  char numstr[REF_SIZE];                     /*let's have plenty of room!*/
 
     /*     printf("Setlevel %d %d %d %c %s: ", which,how,val,ch,str);  */  /* Tracer */


  /* which = which level to update (=levels[which])
   * how =   what type of update this is:
   *          0=increment which 
   *          1=set which to val 
   *          2=set which to val+ch
   *          3=set which to val+str
   *          4=change the last char in which to ch 
   *          5=set which to str         
   * val = integer parameter
   * ch  = character parameter
   * str = string parameter
   */

  /* For anything but an increment, we can just set it  */

  if (which == N) { hierarchy = 0;  };
  if (which > N) { hierarchy = 1; } ; 
  switch (how) {

  case 0:         /* INCREMENT */
    
    /* Incrementing is complicated since we don't know whether the level is
     * an integer or a string (in the latter case we increment the last
     * character).  So, we first have to see if the level ends in a digit string.
     * Just to be safe, we count backwards from the end.
     */ 

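    last_i=(int)strlen(&levels[which][0])-1;
    if (last_i<0) {
      /* Empty level (e.g. just reset): assume counting starts at 1 */
      sprintf(&levels[which][0],"%d",1);
    } else if (isdigit((unsigned char)levels[which][last_i])) {
      /* if it ends in a digit, find the whole terminal string and increment it */

      /* Stop ON the first digit of the run, not on the character before it;
       * otherwise atoi sees a leading letter and the level collapses to 1.
       */
      first_d=last_i;
      while (first_d>0 && isdigit((unsigned char)levels[which][first_d-1])) {
	first_d--;
      };
      d=atoi(&levels[which][first_d]);
      d++;
      sprintf(&levels[which][first_d],"%d",d);

      /* Otherwise, just increment the last character in the string */
      
    } else {
      ++levels[which][last_i];
    };    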

    break;

  case 1:                /* SET LEVEL TO INTEGER VALUE  */
    sprintf(&levels[which][0],"%d",val);
    break;

  case 2:                /* SET LEVEL TO INTEGER + CHAR */         
    sprintf(&levels[which][0],"%d%c",val,ch);
    break;
    
  case 3:               /* SET LEVEL TO INTEGER + STRING */
    sprintf(&levels[which][0],"%d%s",val,str);
    break;
    
  case 4:               /* CHANGE LAST CHAR IN LEVEL TO CHAR */
    last_d=strlen(&levels[which][0])-1;
    levels[which][last_d]=ch;   /* use the parameter, not the global 'c' */
    break;
    
  case 5:               /* SET LEVEL TO STRING */
    sprintf(&levels[which][0],"%s",str);
    break;
    
  };

  /* Whatever we do, we reset the lower levels.
   * For levels vwxyz, lower levels are reset to "1" unless the
   * 'hierarchy' flag is turned off.
   * (Open question: are these ever reset independently?)
   */

  if (hierarchy) {
    for (ix=0; ix<which; ix++) {
      sprintf(&levels[ix][0],"%d",1);
    };
  };

  strncpy(&cite[0],&levels[which][0],REF_SIZE);

  /*  for (ix=which-1; ix>=0; ix--) {
      strncat(&cite[0], ".");
      strncat(&cite[0], &levels[ix][0],REF_SIZE);
    }; */

  /* All updating is done, so now we compare the reference
   * if we're seeking. 
   */

  /* First consult found[which] to see if we're looking for this level */

  if (which==seeking) {
    printf("Seeking %d... ",seeking);
    if (citecmp(&levels[which][0],&target[which][0])==0) {
      found[which]=1;
      printf("Matched on id level %d: %s = %s\n",which,&levels[which][0],&target[which][0]);
      setseeking();            /* */
    } else {
      printf("Compare value is  %d\n", citecmp(&levels[which][0],&target[which][0])) ;  
    };
  };
  /* Last action is to print the new value in accordance with 
   * the current format.
   */

  if (which>N)
    { 
      printf(fmt+which*REF_SIZE,&cite[0]);
    }
  else if (emit) {
    printf("\n%s.%s.%s.%s.%s\t",
	   &levels[V][0],
	   &levels[W][0],
	   &levels[X][0],
	   &levels[Y][0],
	   &levels[Z][0]);
  }

  /* leave this now for compatibility with prior calls */
  return &cite[0];
}


/*********************************************************
 * TODO:
 *   When the top-level idt values (file no.,
 *   work no., etc.) are reset, hierarchy
 *   should also be reset to 1.
 *
 *   
 **********************************************************/


/*********************************************************
 * Comparison of citations in accordance with TLG rules
 ********************************************************/

int citecmp (char *str1, char *str2) {   
  int d1, d2;
  int l1, l2;
  int cval;
  char c1, c2;
  char *digits="0123456789";
  char *ptr1;
  char *ptr2;
  
  ptr1=str1; ptr2=str2;

  while (strlen(ptr1)>0 && strlen(ptr2)>0) {
 /*     printf("Starting compare loop...\n"); */
    if (isdigit(*ptr1) || isdigit(*ptr2)) {
 /*       printf("Comparing digit strings...\n"); */
      d1=strtol(ptr1,&ptr1,10);
      d2=strtol(ptr2,&ptr2,10);
      /*       ptr1+=strspn(ptr1,digits); */
      /*        ptr2+=strspn(ptr1,digits); */
      
      if (d1<d2) 
	{
	/*    printf("%d\n",-1); */
	  return(-1);
	} else if (d2<d1){
	/*    printf("%d\n",1); */
	  return(1);	
	};
  /*      printf("Digit strings equal: %d = %d\n", d1, d2); */
      
    } else {
   /*     printf("Comparing text strings...\n"); */
      l1=strcspn(ptr1,digits);
      l2=strcspn(ptr2,digits);
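      /* Compare only the text runs, not the digits that follow them.
       * Using the longer run length makes a shorter run sort first,
       * so cval==0 implies l1==l2.
       */
      cval=strncmp(ptr1,ptr2,(l1>l2)?l1:l2);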
      if (cval!=0) {
	/*  printf("%d\n",cval); */
	return(cval);
      } else {
	/*  printf("%d\n",cval); */
	ptr1+=l1;
	ptr2+=l2;
      };
      
    };
    
  };
  return(0);   /* Fall-through from 'while' */
};


/***********************************************
 * Resets 'seeking' to the highest index in 'found'
 * pointing to a negative value (i.e. the number of
 * the highest ID level not yet matched)
 */

void setseeking (void) {
  int ix;
  if (seeking>=0) {
    seeking=-1;                  /* First say we're not seeking.. */
    for (ix=A; ix>=0; ix--) {    /* Then revise as necessary      */
      if (found[ix]<0) {
	seeking=ix;
	ix=-1;
      };
    };
    if (seeking>=0) {
      printf("Now seeking level %d\n",seeking);
    } else{
      printf("Done seeking\n");
      BEGIN(INITIAL);
    };
  } else {
    /*    printf("Not seeking: going to INITIAL state\n");  */
    BEGIN(INITIAL);
  };  

};

int main(int argc, char *argv[])
{
  FILE *file;
  int arg;
  int offset=0;
  int leading_code;
  char option;
  char *filename="";     /* empty until -f or a positional argument sets it */
  struct stat *buf;

 /*   printf("%d command line arguments: \n",argc-1); */

  /*******************************************************
   * Command line argument parser.  For compatibility, this
   * should behave exactly like Takahashi's tlg2emacs.  So,
   * we process all the command line options first.  Any
   * remaining arguments are interpreted as: filename, lc_g2,
   * error.
   */
  
  if (argc>1) {
    arg=1;
    while (arg<argc-1 && strchr(argv[arg],'-') == argv[arg]) {   /* Sloppy code */
      option = *(argv[arg++]+1);                   /* Increment arg once */
      
      switch (option) {
	
	/* -a: seek author number.  May be useless in a file. */
	
      case 'a':       /* This may be useless; same throughout a file? */
	seeking=1;
	strncpy(&target[A][0],argv[arg++],REF_SIZE-1);
	found[A]=-1;
	printf("Searching for level %d (author) %s...\n",A,&target[A][0]);
	break;

	/* -b: seek work number */

      case 'b':  
	seeking=1;
	strncpy(&target[B][0],argv[arg++],REF_SIZE-1);
	found[B]=-1;
	printf("Searching for level %d (work) %s...\n",B,&target[B][0]);
	break;
	
	/* -c: Choose citation format.  This is not now being used */
	
      case 'c':

        switch (*argv[arg++]) {
	case 'a':
	  fmt=&fmt_all[0][0];
	  break;
	case 'l':
	  fmt=&fmt_low[0][0];
	  break;
	case 'm':                 /* Default */
	  fmt=&fmt_min[0][0];
	  break;
	case 'n':
	  fmt=&fmt_none[0][0];
	  break;
	};
	break;

	/* -f: filename to open.  Same as argv1 in Takahashi, except that '|'= stdin  */

      case 'f':                

	filename = argv[arg++];

	break;

	/* -l: code for lc_g2 (same as argv2 in Takahashi) */

      case 'l':
	lc_g2 = atoi(argv[arg++]); 

	break; 
	
	/* -o: Start reading at block.  Not yet implemented */
      case 'o':
	offset = atoi(argv[arg++])*8192;

	break;

	/* -s: seek target citation. Not now being used; replaced by -nvwxyz below */

      case 's':
        seeking=1;
	strncpy(&target_string,argv[arg++],REF_SIZE-1);
	break;

	/* -n -v -w -x -y -z: Seek low-level citation value */

      case 'n':
	seeking=1;
	strncpy(&target[N][0],argv[arg++],REF_SIZE-1);
	found[N]=-1;
	break;

      case 'v':
	seeking=1;
	strncpy(&target[V][0],argv[arg++],REF_SIZE-1);
	found[V]=-1;
	break;

      case 'w':
	seeking=1;
	strncpy(&target[W][0],argv[arg++],REF_SIZE-1);
	found[W]=-1;
	break;

      case 'x':
	seeking=1;
	strncpy(&target[X][0],argv[arg++],REF_SIZE-1);
	found[X]=-1;
	break;

      case 'y':
	seeking=1;
	strncpy(&target[Y][0],argv[arg++],REF_SIZE-1);
	found[Y]=-1;
	break;

      case 'z':
	seeking=1;
	strncpy(&target[Z][0],argv[arg++],REF_SIZE-1);
	found[Z]=-1;
	break;

      default:
	fprintf(stderr,"unrecognized option: %c\n",option);
        fprintf(stderr, 
		"usage: %s [-abnvwxyz citation] [-c format] [-f file] [-l code] [-o block]\n", argv[0]); 
	exit(-1); 
	break;
      };
    };
    /* The remaining cases are to maintain compatibility with the behavior of Takahashi's
     * tlg2emacs.  Any arguments left are assumed to be a filename and a value for
     * lc_g2; further arguments cause an error.  */

    /* First consume a filename */
    if (argc-arg>0) {
      fprintf(stderr,"%d elements left after parsing options: first %s\n",argc-arg,argv[arg]);

      /* The next option MUST be a filename (including '|' for stdin). */
      if (strlen(filename)>0) {
	fprintf(stderr,"Filename changed to %s\n",argv[arg]);
      };
      filename = argv[arg++];
    };
    /* Then consume lc_g2 */
    if (argc>arg) {
      fprintf(stderr,"lc_g2 changed to %d\n",atoi(argv[arg]));
      lc_g2 = atoi(argv[arg++]);
    }; 
    if (argc>arg) { 
      fprintf(stderr,"%d extra arguments in command line\n",argc-arg);
      exit(-1);
    };
  };

  /* Either open the filename or (if the filename is "|") read stdin */

  if (strcmp(filename,"|")==0){
    yyin=stdin;
  } else {
    file = fopen(filename, "r");
    if (!file) {
      fprintf(stderr, "cannot open %s\n", filename);
      exit(-1);
    };
    yyin=file;
  };

  /* TODO: This is from the old implementation for the -o command line argument,
   * which did an fseek to the file position indicated by -o (in 8K blocks).
   * I've shut this code down because I have made stdin possible
   * as the input.  
   * I was also stating the file to see if the block value was too high, but this
   * seems pointless since setting the
   * seek position beyond the end of the file just produces an EOF anyway,
   * so nothing would be read.  The block-finding function is handled
   * more smoothly through a pipe from tlgreadblk.
   */
/*    buf = malloc(sizeof(struct stat)); */
 /*   if (stat(filename,buf)) { fprintf(stderr,"Error in stat of file\n");}; */
 /*   if ((*buf).st_size<offset) { */
/*      fprintf(stderr,"Offset %d is too large for file; starting at offset 0\n",offset); */
/*      offset=0; */
/*    }; */

/*    if (fseek(file,offset,SEEK_SET)) { */
/*      fprintf(stderr, "error in finding starting block %d\n", offset); */
/*      exit(-2); */
/*    }; */
/*    free(buf); */

  setseeking(); 
  
  c=' ';        /* Frankly, can't remember why I do this here */
  if (seeking>=0) {BEGIN(SEEK);};

  yylex();
}











- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: text/plain; charset=US-ASCII

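For reference, the run-by-run comparison that citecmp in the first attachment performs can be sketched on its own. This is only an illustrative sketch and not code from either scanner: the name cite_compare is hypothetical, and unlike citecmp (which returns 0 as soon as either string runs out) it assumes that a string which runs out first should sort earlier.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the attachments): compare two citation
 * strings run by run, digit runs numerically and text runs bytewise. */
int cite_compare(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            char *end_a, *end_b;
            long na = strtol(a, &end_a, 10);
            long nb = strtol(b, &end_b, 10);
            if (na != nb)
                return na < nb ? -1 : 1;
            a = end_a;
            b = end_b;
        } else {
            /* Length of the non-digit run at the front of each string. */
            size_t la = strcspn(a, "0123456789");
            size_t lb = strcspn(b, "0123456789");
            size_t n = la > lb ? la : lb;
            int cmp = strncmp(a, b, n);
            if (cmp != 0)
                return cmp < 0 ? -1 : 1;
            a += n;   /* equal over n bytes, so both runs had length n */
            b += n;
        }
    }
    /* Assumption: the string that ends first sorts first. */
    if (*a == '\0' && *b == '\0')
        return 0;
    return *a == '\0' ? -1 : 1;
}
```

With this ordering "9" sorts before "10" and "a9" before "a10", which a plain strcmp would get backwards.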

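The increment case of setlevel (how = 0) can be sketched in isolation as well. Again this is a sketch under assumptions and not the scanner's own code: bump_level and its size parameter are hypothetical names, and turning an empty level into "1" is an assumption that the attachments do not specify.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: increment the trailing number of a level string,
 * or its last character when the string does not end in a digit.
 * 'size' is the capacity of the buffer that 'level' points to. */
void bump_level(char *level, size_t size)
{
    size_t len = strlen(level);
    size_t first = len;

    if (len == 0) {                 /* assumption: empty counts as 0 */
        snprintf(level, size, "1");
        return;
    }
    /* Walk back to the first digit of the trailing digit run. */
    while (first > 0 && isdigit((unsigned char)level[first - 1]))
        first--;

    if (first < len) {              /* ends in digits: 12 -> 13, a9 -> a10 */
        long n = strtol(level + first, NULL, 10) + 1;
        snprintf(level + first, size - first, "%ld", n);
    } else {                        /* ends in a letter: 12a -> 12b */
        level[len - 1]++;
    }
}
```

Walking back while the previous character is a digit keeps any leading letters intact, so "a9" becomes "a10" and the letter is not lost.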
- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="tlg-idt.lex"
Content-Transfer-Encoding: 7bit

%{
#include "tlg.h"
#define REF_SIZE 256   /* I have had overruns with 64; I don't know what the max is */
#define DESC_SIZE 256

  /**********************************************************************
   * Scanner for TLG .idt files.  Output is presently limited to basic
   * work information (including starting and ending blocks).
   * Based on N. Takahashi's tlg2emacs, with additional rules added for
   * both citation codes (bytes > 0x7f) and idt table information (prefixed
   * by control bytes).
   *  
   * Author: Robin Smith <rasmith@xxxxxxxxxxxxxxxxxx>
   * Date: 13 July 2000
   */

  int daemon=0;                   /* Run as daemon (currently to be avoided) */
  char command_line[512];
  char command[64];

  char * c_ptr, * c_end;
  char option;

  char * this_cite;              /* Intended to point to a char[6][REF_SIZE] */
  /* TODO: replace all the citation strings  with a structured type  */

  /**********************************
   * We need a little command line
   * parser
   *********************************/

  char * next_command(char * * cstring);


int quoting = 0;
int lc_l = 129;			/* (charset-id 'latin-iso8859-1) */
int lc_g1 = 156;		/* leading-code-private-21 */
int lc_g2 = 242;		/* default for (charset-id 'cgreek) */

 int ix;          /* everyone needs one of these */
 char * ptr;      /* and one of these */ 
/* Globals for idt files
 */

 int idt_length;
 int block_number;
 int section_number;
 char full_ids[10][64];          /* Where we keep the expanded data */

 /*****************************************************
  * We need to keep all the information for a work in 
  * a structure. Decide what to print later.
  */

 struct ref {
   char z[REF_SIZE];
   char y[REF_SIZE];
   char x[REF_SIZE];
   char w[REF_SIZE];
   char v[REF_SIZE];
   char n[REF_SIZE];

 };

 struct exception {
   char file[4];
   int num;
   char start[6][REF_SIZE];
   char end[6][REF_SIZE];
 };
 struct block {
   char file[4];
   int num;
   char cite[6][REF_SIZE];
   char start[6][REF_SIZE];     /* Only used for exception blocks */
   char end[6][REF_SIZE];      /* Only used for exception blocks */
   struct block *next;
   struct work * owner;
 };

 struct section {
   int num;
   char cite[6][REF_SIZE];
   struct block * blocks;
   struct work * owner;
   struct section * next;
 };

 struct work {
   char file[6];              /* TLG file number (4 digits only) */
   char workno[6];
   int worknum;               /* Fix this up later */         
   char author[REF_SIZE];
   char title[REF_SIZE];
   char title_abbr[16];
   char author_abbr[16];
   int start;                  /* First block */
   int end;                    /* Last block  */
   char level_names[6][REF_SIZE];       /* These are arrays Z..N */
   char first_ref[6][REF_SIZE];       /* first citation */         
   char last_ref[6][REF_SIZE];        /* last citation */
   struct block * blocks;      /* TODO: Cut This Out */
   struct section * sections;  /* */
   struct block * exceptions;
   struct work * next;         /* Pointer to next work in list */
 };

 struct author {
   char file[6];
   char name[32];
   char abbr[16];
   struct work * worklist;
 };
 struct author * who;
 struct work * this_work;                 /* The current work */
 struct block * this_block;               /* The end of the current block list */
 struct section * this_section;

 struct work * new_work(struct work * * wklist);
 struct section * new_section(struct section * * sctlist);
 struct block * new_block(struct block * * blklist);

 struct block * last_block(struct block * blklist);   /* just finds the last one */


/* Globals used for passing values to setlevel */ 




int d = 0, l = 0;                 /* integer input, level value */
char c = ' ';                     /* character input */
char *s;                          /* string input */

char cite[REF_SIZE];                   /* Return location for the citation string */


 /* Print absolutely everything */
 char  fmt_all[10][16]={" -z %s ",
			" -y %s ",
			" -x %s ",
			" -w %s ",
			" -v %s ",
			" -n %s ",
			"\nAuthor abbr. %s",
			"\nWork abbr. %s",
			"\nWork %s",
			"\nAuthor %s"
 };

char  fmt_idt[10][16]={" -z %s ",
			" -y %s ",
			" -x %s ",
			" -w %s ",
			" -v %s ",
			" -n %s ",
			"\nAuthor abbr. %s",
			"\nWork abbr. %s",
			"\nWork %s",
			"\nAuthor %s"
 };
 /*TODO:  We don't need the other formats in idt files */

 /* Print z-v,n with newline */

char fmt_low[10][16]={"\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		      "\n%-8s",
		     "",
		     "",
		     "",
		     ""
 };
 char fmt_min[10][16]={"\n",
		       "\n",
		       "\n",
		       "\n",
		       "\n",
		       "\n",
		       "",
		       "",
		       "",
		       ""
 };

 /* Print nothing at all */
 char fmt_none[10][16]={"",
			"",
			"",
			"",
			"",
			"",
			"",
			"",
			"",
			""
 };

 char * fmt=&fmt_none[0][0];

char format_high[5][16]={"%d=%s","%d=%s","%d=%s","%d=%s","%d=%s"};          /* Top-level IDT formats.  The
						      * idea is that we probably don't
						      * want to print these, but we can
						      * use a line break.  
						      */
                
char format_low[6][16]={"-z %s ","-y %s ","-x %s ","-w %s ","-v %s ","-n %s "}; 


void printl(int);
void printg(int);

/**************************************************** 
 * Added by R. Smith.  
 * Functions to extract TLG encoded data 
 */

int get7(void);
int get14(void);
char get1c(void);
char *getstr(void);

/****************************************************
 * Added by R. Smith
 * Globals and functions for TLG IDT level data
 */

 int hierarchy=1;                                /* Are levels hierarchical? */
 enum level {Z=0,Y,X,W,V,N,D,C,B,A} currlevel,tmplevel;  /* The levels for the six lower counters */
 char levels[10][REF_SIZE]={ "","","","","",""};         /* IDT levels stored here as strings     */

 /* Functions for setting the IDT level value */

char * setlevel(int which, int how, int val, char ch, char* str);
int getlevel(void);
%}

%s LATIN

%%


    /* The 'type codes' all begin with a control code    */
\x1[\x00-\xff]{4}\xef\x80[\x80-\xfe]*\xff	 {   /* This means "new author start" */
                  who = malloc(sizeof(struct author));
		  who->worklist=0;  
		  yytext++;
		  idt_length=(int)(unsigned char)*yytext++;
		  idt_length=idt_length*256+(int)(unsigned char)(*yytext++);
		/*      printf(" (%d bytes): ",idt_length);  */
		  block_number=(int)(unsigned char)*yytext++;
		  block_number=block_number*256+(int)(unsigned char)(*yytext++);
		  section_number=1;   /* (reset section number too) */
		  /* I'm just ignoring the author start block for now */
		  yytext++; /*gobble another */
		  ptr=(char *)strchr(yytext++,0xff);
		  ix=0;
	 	  while (yytext<ptr) { 
  		    who->file[ix++]=toascii(*yytext++); 

		  };
		  who->file[ix]=0x0;
		/*    printf("\nFile number is  %s ",&who->file[0]);  */
                }
\x2[\x0-\xff]{4}\xef\x81[\x80-\xfe]*\xff         { 
                  this_work=new_work(&(who->worklist));
		  
		  yytext++;
		  idt_length=(int)(unsigned char)*yytext++;   /* avoid sign extension of bytes > 0x7f */
		  idt_length=idt_length*256+(int)(unsigned char)(*yytext++);
		/*    printf(" (%d bytes): ",idt_length);  */
		  block_number=(int)(unsigned char)*yytext++;
		  block_number=block_number*256+(int)(unsigned char)(*yytext++);
		  section_number=1;
		  this_work->exceptions=0;
		  this_work->start=block_number;
		  /* we also want to reset the counters */
		  for (ix=0;ix<6;ix++) {
		    sprintf(&levels[ix][0],"");
		  };
		  yytext++;
		  ptr=(char *)strchr(yytext++,0xff);
		  ix=0;
	 	  while (yytext<ptr) { 
  		    this_work->workno[ix++]=toascii(*yytext++); 

		  };
		  this_work->workno[ix]=0x0;
		  this_work->worknum=atoi(this_work->workno);
   /*  		  printf("\n Work number %s=%d ",&this_work->workno[0], this_work->worknum);  */
                }

\x3[\x0-\xff]{2}		{
		  this_section=new_section(&(this_work->sections));
		  yytext++;
		  block_number=(int)(unsigned char)*yytext++;
		  block_number=block_number*256+(int)(unsigned char)(*yytext++);
		  this_section->num=section_number++;  
		  this_section->next=0;
		  this_cite=&(this_section->cite[0][0]);
		  /* I suspect this code is never reached */
		  if (yyleng==4) {
		    switch (*yytext) {
		    case 0x08:
		      printf("Starts at block  ");
		      break;
		    case 0x09:
		      printf("  ");
		      break;
		    };
		  };
	    /*  printf(" %s %s -s %d \n",&levels[A][0],&levels[B][0], block_number); */
                }
\x4		;
\x5		;
\x6		;
\x7[\x00-\xff]{8}		printf("\nNew File (obsolete form)");
\x8/[\x80-\xff]+ {      /* Starting citation */ 
     	         /*    printf(" %s -s %d",&levels[A][0],block_number);     */ 
		   this_block=new_block(&(this_section->blocks));
		   this_block->num=block_number;
		   this_cite=&(this_block->cite[0][0]);
		   for (ix=Z;ix<=N; ix++) {
		     strncpy(this_block->owner->first_ref[ix],
			     &levels[ix][0],31);
		   };
                 }
\x9/[\x80-\xff]+ {	  /* Ending citation */
                /*     printf("\n %s -e %d ",&levels[A][0],block_number);   */    
		   this_block=new_block(&(this_section->blocks));
		   this_block->num=block_number;
		   this_block->owner->end=block_number;
		   for (ix=Z;ix<=N; ix++) {
		     strncpy(this_block->owner->last_ref[ix],
			     &levels[ix][0],31);
		   };
		   this_cite=&(this_block->cite[0][0]);
                 }
\xa/[\x80-\xff]+ {	/* Block in list */
		   this_block=new_block(&(this_section->blocks));
		   this_block->num=block_number++;
		   this_cite=&(this_block->cite[0][0]);
		   BEGIN LATIN;
                  }
\xb[\x0-\xff]{2} {
		  yytext++;
		  block_number=(int)(unsigned char)*yytext++;
		  block_number=block_number*256+(int)(unsigned char)(*yytext);
		  this_block=new_block(&(this_work->exceptions));
		  this_block->next=0;

		  this_cite=&(this_block->start[0][0]);
		  this_block->num=block_number;

		/*    printf("\n      Exception record for block %d is at address %d", */
 /*  			 this_block->num, */
 /*  			 this_block); */
	
                }
\xc		{   /* All we do here is get ready to record the end citation */
                   this_cite=&(this_block->end[0][0]);
		   BEGIN LATIN;
                }                  

\xd[\x0-\xff]{2} {
		  yytext++;
		  block_number=(int)(unsigned char)*yytext++;
		  block_number=block_number*256+(int)(unsigned char)(*yytext);
		  this_block=new_block(&(this_work->exceptions));
		  this_block->next=0;
		  this_block->num=block_number;
		  this_cite=&(this_block->cite[0][0]);
		  BEGIN LATIN;
		 /*   printf("\n      Single Exception record for block %d is at address %d", */
 /*  			 this_block->num, */
 /*  			 this_block); */
		/*    printf(" -o %d ",block_number); */
                }
\xe		;
\xf		;

\x10[\x0-\xff]{2}[\x20-\x7f]*	{ 
		  yytext++;
		  l=A-(int)(char)*yytext++;
		  d=(int)(unsigned char)*yytext++;   /* length byte: must not go negative */
		  strncpy(&full_ids[l][0],yytext,d);   /* GET THESE */
		  full_ids[l][d]=0x0;
		  if (l==A) {
		    strncpy(who->name,yytext,d);
		  who->name[d]=0x0;
		  };
		  if (l==B) {
		    strncpy(this_work->title,yytext,d);
		  this_work->title[d]=0x0;
		  };
		  BEGIN LATIN;
		/*    printf("\nAuthor/Work ID %d is %s\n",l,&full_ids[l][0]); */
		  yyless(3);
	   	/*    printf(" (%d bytes):", (int)*yytext);   */
                }

\x11[\x0-\xff]{2}[\x20-\x7f]* 	{  
		  yytext++;
		  l=*yytext++;
		  d=(int)(unsigned char)*yytext++;   /* length byte: must not go negative */
		  /* For now, we just gobble these */
		  strncpy(&full_ids[l][0],yytext,d);
		  full_ids[l][d]=0x0;
		  if (l>=0 && l<=N) {
		    strncpy(this_work->level_names[l],yytext,d);
		    this_work->level_names[l][d]=0;
		  };

		  BEGIN LATIN;
                 }
[\x12-\x1e]	;
         /* This appears to be obsolete; I don't know if it ever occurs */
\x1f[\x00-\xff]{3}	{  
                  BEGIN LATIN;
                  printf("\nID table is ");
		  yytext++;
		  idt_length=(int)(unsigned char)*yytext++;
		  idt_length=idt_length*256+(int)(unsigned char)(*yytext++);
		  idt_length=idt_length*256+(int)(unsigned char)(*yytext);
		  /* printf(" (%d bytes):",idt_length); */
		  }




\xff             printf("<EOS>");       /* Should never be triggered */
\xfe[\x0]*       ;                      /* Trash end-of-block and NUL pads  */
\xf0\xfe[\x0]*   ; /*printf("<EOF>"); */   /* Might be useful for error checking */
\xf8             printf("<EXC>");       /* See TLG documentation about these */
\xf9             printf("</EXC>");


\x80              |
\x90              |
\xa0              |
\xb0              |
\xc0              |
\xd0              { l=getlevel();setlevel(l,0,0,c,"");} 
\xe0[\x80-\xff]   { l=getlevel(); setlevel(l,0,0,c,"");} 

[\x81-\x87]            |
[\x91-\x97]            |
[\xa1-\xa7]            |
[\xb1-\xb7]            |
[\xc1-\xc7]            |
[\xd1-\xd7]            { d=*yytext&0x07;l=getlevel();setlevel(l,1,d,c,""); }
[\xe1-\xe7][\x80-\xff] { d=*yytext&0x0f;l=getlevel();setlevel(l,1,d,c,""); }

\x88[\x80-\xff]    |
\x98[\x80-\xff]    |
\xa8[\x80-\xff]    |
\xb8[\x80-\xff]    |
\xc8[\x80-\xff]    |
\xd8[\x80-\xff]    { l=getlevel();d=get7();setlevel(l,1,d,c,""); }
\xe8[\x80-\xff]{2} { l=getlevel(); d=get7(); setlevel(l,1,d,c,""); }

\x89[\x80-\xff]{2} |
\x99[\x80-\xff]{2} |
\xa9[\x80-\xff]{2} |
\xb9[\x80-\xff]{2} |
\xc9[\x80-\xff]{2} |
\xd9[\x80-\xff]{2} { l=getlevel(); d=get7(); c=get1c(); setlevel(l,2,d,c,"");  }
\xe9[\x80-\xff]{3} { l=getlevel(); d=get7();  c=get1c();  setlevel(l,2,d,c,"");  }

\x8a[\x80-\xff][^\xff]*\xff |
\x9a[\x80-\xff][^\xff]*\xff |
\xaa[\x80-\xff][^\xff]*\xff |
\xba[\x80-\xff][^\xff]*\xff |
\xca[\x80-\xff][^\xff]*\xff |
\xda[\x80-\xff][^\xff]*\xff    { l=getlevel();d=get7();s=getstr();setlevel(l,3,d,c,s); }
\xea[\x80-\xff]{2}[^\xff]*\xff { l=getlevel();d=get7(); s=getstr();setlevel(l,3,d,c,s); }

\x8b[\x80-\xff]{2} |
\x9b[\x80-\xff]{2} |
\xab[\x80-\xff]{2} |
\xbb[\x80-\xff]{2} |
\xcb[\x80-\xff]{2} |
\xdb[\x80-\xff]{2} { l=getlevel();d=get14();setlevel(l,1,d,c,"");}
\xeb[\x80-\xff]{3} { l=getlevel(); d=get14();setlevel(l,1,d,c,"");}

\x8c[\x80-\xff]{3} |
\x9c[\x80-\xff]{3} |
\xac[\x80-\xff]{3} |
\xbc[\x80-\xff]{3} |
\xcc[\x80-\xff]{3} |
\xdc[\x80-\xff]{3} { l=getlevel();d=get14();c=get1c(); setlevel(l,2,d,c,""); }
\xec[\x80-\xff]{4} { l=getlevel() ;d=get14(); c=get1c(); setlevel(l,2,d,c,""); }

\x8d[\x80-\xff]{2}[^\xff]*\xff |
\x9d[\x80-\xff]{2}[^\xff]*\xff |
\xad[\x80-\xff]{2}[^\xff]*\xff |
\xbd[\x80-\xff]{2}[^\xff]*\xff |
\xcd[\x80-\xff]{2}[^\xff]*\xff |
\xdd[\x80-\xff]{2}[^\xff]*\xff {l=getlevel();d=get14();s=getstr();setlevel(l,3,d,c,s);}
\xed[\x80-\xff]{3}[^\xff]*\xff {l=getlevel();d=get14(); s=getstr();setlevel(l,3,d,c,s);}

\x8e[\x80-\xff]{1} |
\x9e[\x80-\xff]{1} |
\xae[\x80-\xff]{1} |
\xbe[\x80-\xff]{1} |
\xce[\x80-\xff]{1} |
\xde[\x80-\xff]{1} {  l=getlevel(); c=get1c(); setlevel(l,4,d,c,""); }
\xee[\x80-\xff]{2} {  l=getlevel(); c=get1c();  setlevel(l,4,d,c,""); }

\x8f[^\xff]*\xff            |
\x9f[^\xff]*\xff            |
\xaf[^\xff]*\xff            |
\xbf[^\xff]*\xff            |
\xcf[^\xff]*\xff            |
\xdf[^\xff]*\xff            { l=getlevel();s=getstr();setlevel(l,5,d,c,s);}
\xef[\x80-\xfe][^\xff]*\xff { l=getlevel(); s=getstr(); setlevel(l,5,d,c,s); }

[\x80-\xff]     ;          /* Should never be reached */

\x0\x0\x0\x0+   ;  /* Trash any run of four or more nulls */

"_"		printf("--");
"!"		printf(" ");
"?"		;
"`"		;

"^"[0-9]+	printf("\t");

"$"[0-9]*	BEGIN 0;

"&"[0-9]*	BEGIN LATIN;

"%"		printf("{\\dag}");
"%1"		printf("?");
"%2"		printf("*");
"%3"		printf("/");
"%4"		printf("!");
"%5"		printf("|");
"%6"		printf("=");
"%7"		printf("+");
"%8"		printf("%%");
"%9"		printf("&");
"%10"		printf(":");
"%11"		printf(".");
"%12"		printf("*");
"%13"		printf("{\\ddag}");
"%14"		printf("{\\P}");
"%15"		printf("|");
"%16"		printf("|");
"%17"		printf("||");
"%18"		printf("'");
"%19"		printf("-");
"%24"		printf("~");
"%25"		printf("|");
"%26"		;
"%27"		;
"%30"		printg(G_smooth);
"%31"		printg(G_rough);
"%32"		printg(G_acute);
"%33"		printg(G_grave);
"%34"		printg(G_circ);
"%35"		printg(G_smooth_acute);
"%36"		printg(G_rough_acute);
"%37"		printg(G_rough_grave);
"%38"		printg(G_rough_circ);
"%39"		printg(G_uml);
"%100"		printf(";");
"%101"		printf("#");
"%102"		printf("`");
"%103"		printf("\\");
"%104"		printf("^");
"%105"		printf("|||");
"%107"		printf("~");
"%132"		printg(G_acute_uml);
"%133"		printg(G_smooth_grave);
"%134"		printg(G_smooth_circ);
"%160"		printg(G_hyphen);
"%"[0-9]+	printf(" ");

\"1		printf("``");
\"2		printf("''");
\"3		{
		  if (quoting) { printf("'"); quoting = 0; }
		  else { printf("`"); quoting = 1; }
		}
\"4		printf("`");
\"5		printf("'");
\"6		{
		  if (quoting) { printf(">>"); quoting = 0; }
		  else { printf("<<"); quoting = 1; }
		}
\"7		{
		  if (quoting) { printg(G_rangle); quoting = 0; }
		  else { printg(G_langle); quoting = 1; }
		}
\"[0-9]*	printf("\"");

"@"		printf(" ");
"@"[0-9]+	printf("\n");

"["		printg(G_lbracket);
"[1"		printg(G_lparen);
"[2"		printg(G_langle);
"[3"		printg(G_lbrace);
"[4"		printf("[[");
"[11"		printg(G_lparen);
"[12"		printf("<=");
"[2"[0-9]	printg(G_lbrace);
"[3"[0-9]	printg(G_lparen);
"[53"		printg(G_lparen);
"["[0-9]+	printg(G_lbracket);

"]"		printg(G_rbracket);
"]1"		printg(G_rparen);
"]2"		printg(G_rangle);
"]3"		printg(G_rbrace);
"]4"		printf("]]");
"]11"		printg(G_rparen);
"]12"		printf("=>");
"]2"[0-9]	printg(G_rbrace);
"]3"[0-9]	printg(G_rparen);
"]53"		printg(G_rparen);
"]"[0-9]+	printg(G_rbracket);

"<"		|
">"		printf("~");
"<1"		|
">1"		printf("_");
"<"[0-9]+	|
">"[0-9]+	printg(G_hyphen);

"{"[0-9]*	printg(G_lbrace);
"}"[0-9]*	printg(G_rbrace);

"#"		printf("'");
"#1"		printg(G_qoppa);
"#2"		printg(G_stigma);
"#3"		printg(G_QOPPA);
"#4"		printg(G_stigma);
"#5"		printg(G_sampi);
"#80"		printg(G_uml);
"#81"		printf("'");
"#82"		printg(G_acute);
"#83"		printg(G_grave);
"#84"		printg(G_circ);
"#85"		printg(G_rough);
"#86"		printg(G_smooth);
"#"[0-9]+	printf(" ");

<LATIN>{
  "A+"		printl(L_Auml);
  "A/"		printl(L_Aacute);
  "A="		printl(L_Acirc);
  "A\\"		printl(L_Agrave);
  "A%24"	printl(L_Atilde);
  "E+"		printl(L_Euml);
  "E/"		printl(L_Eacute);
  "E="		printl(L_Ecirc);
  "E\\"		printl(L_Egrave);
  "I+"		printl(L_Iuml);
  "I/"		printl(L_Iacute);
  "I="		printl(L_Icirc);
  "I\\"		printl(L_Igrave);
  "O+"		printl(L_Ouml);
  "O/"		printl(L_Oacute);
  "O="		printl(L_Ocirc);
  "O\\"		printl(L_Ograve);
  "O%24"	printl(L_Otilde);
  "U+"		printl(L_Uuml);
  "U/"		printl(L_Uacute);
  "U="		printl(L_Ucirc);
  "U\\"		printl(L_Ugrave);
  "C%25"	printl(L_Ccedil);
  "N%24"	printl(L_Ntilde);
  "Y/"		printl(L_Yacute);
  "a+"		printl(L_auml);
  "a/"		printl(L_aacute);
  "a="		printl(L_acirc);
  "a\\"		printl(L_agrave);
  "a%24"	printl(L_atilde);
  "e+"		printl(L_euml);
  "e/"		printl(L_eacute);
  "e="		printl(L_ecirc);
  "e\\"		printl(L_egrave);
  "i+"		printl(L_iuml);
  "i/"		printl(L_iacute);
  "i="		printl(L_icirc);
  "i\\"		printl(L_igrave);
  "o+"		printl(L_ouml);
  "o/"		printl(L_oacute);
  "o="		printl(L_ocirc);
  "o\\"		printl(L_ograve);
  "o%24"	printl(L_otilde);
  "u+"		printl(L_uuml);
  "u/"		printl(L_uacute);
  "u="		printl(L_ucirc);
  "u\\"		printl(L_ugrave);
  "c%25"	printl(L_ccedil);
  "n%24"	printl(L_ntilde);
  "y/"		printl(L_yacute);
  .		ECHO;
}

"A"		printg(G_alpha);
"B"		printg(G_beta);
"C"		printg(G_xi);
"D"		printg(G_delta);
"E"		printg(G_epsilon);
"F"		printg(G_phi);
"G"		printg(G_gamma);
"H"		printg(G_eta);
"I"		printg(G_iota);
"K"		printg(G_kappa);
"L"		printg(G_lambda);
"M"		printg(G_mu);
"N"		printg(G_nu);
"O"		printg(G_omicron);
"P"		printg(G_pi);
"Q"		printg(G_theta);
"R"		printg(G_rho);
"T"		printg(G_tau);
"U"		printg(G_upsilon);
"V"		printg(G_digamma);
"W"		printg(G_omega);
"X"		printg(G_chi);
"Y"		printg(G_psi);
"Z"		printg(G_zeta);

"A("		printg(G_alpha_rough);
"A(/"		printg(G_alpha_rough_acute);
"A(/|"		printg(G_alpha_rough_acute_isub);
"A(\\"		printg(G_alpha_rough_grave);
"A(\\|"		printg(G_alpha_rough_grave_isub);
"A(="		printg(G_alpha_rough_circ);
"A(=|"		printg(G_alpha_rough_circ_isub);
"A(|"		printg(G_alpha_rough_isub);
"A)"		printg(G_alpha_smooth);
"A)/"		printg(G_alpha_smooth_acute);
"A)/|"		printg(G_alpha_smooth_acute_isub);
"A)\\"		printg(G_alpha_smooth_grave);
"A)\\|"		printg(G_alpha_smooth_grave_isub);
"A)="		printg(G_alpha_smooth_circ);
"A)=|"		printg(G_alpha_smooth_circ_isub);
"A)|"		printg(G_alpha_smooth_isub);
"A/"		printg(G_alpha_acute);
"A/|"		printg(G_alpha_acute_isub);
"A\\"		printg(G_alpha_grave);
"A\\|"		printg(G_alpha_grave_isub);
"A="		printg(G_alpha_circ);
"A=|"		printg(G_alpha_circ_isub);
"A|"		printg(G_alpha_isub);

"E("		printg(G_epsilon_rough);
"E(/"		printg(G_epsilon_rough_acute);
"E(\\"		printg(G_epsilon_rough_grave);
"E)"		printg(G_epsilon_smooth);
"E)/"		printg(G_epsilon_smooth_acute);
"E)\\"		printg(G_epsilon_smooth_grave);
"E/"		printg(G_epsilon_acute);
"E\\"		printg(G_epsilon_grave);

"H("		printg(G_eta_rough);
"H(/"		printg(G_eta_rough_acute);
"H(/|"		printg(G_eta_rough_acute_isub);
"H(\\"		printg(G_eta_rough_grave);
"H(\\|"		printg(G_eta_rough_grave_isub);
"H(="		printg(G_eta_rough_circ);
"H(=|"		printg(G_eta_rough_circ_isub);
"H(|"		printg(G_eta_rough_isub);
"H)"		printg(G_eta_smooth);
"H)/"		printg(G_eta_smooth_acute);
"H)/|"		printg(G_eta_smooth_acute_isub);
"H)\\"		printg(G_eta_smooth_grave);
"H)\\|"		printg(G_eta_smooth_grave_isub);
"H)="		printg(G_eta_smooth_circ);
"H)=|"		printg(G_eta_smooth_circ_isub);
"H)|"		printg(G_eta_smooth_isub);
"H/"		printg(G_eta_acute);
"H/|"		printg(G_eta_acute_isub);
"H\\"		printg(G_eta_grave);
"H\\|"		printg(G_eta_grave_isub);
"H="		printg(G_eta_circ);
"H=|"		printg(G_eta_circ_isub);
"H|"		printg(G_eta_isub);

"I("		printg(G_iota_rough);
"I(/"		printg(G_iota_rough_acute);
"I(\\"		printg(G_iota_rough_grave);
"I(="		printg(G_iota_rough_circ);
"I)"		printg(G_iota_smooth);
"I)/"		printg(G_iota_smooth_acute);
"I)\\"		printg(G_iota_smooth_grave);
"I)="		printg(G_iota_smooth_circ);
"I/"		printg(G_iota_acute);
"I\\"		printg(G_iota_grave);
"I="		printg(G_iota_circ);
"I+"		printg(G_iota_uml);
"I/+"		printg(G_iota_acute_uml);
"I\\+"		printg(G_iota_grave_uml);

"O("		printg(G_omicron_rough);
"O(/"		printg(G_omicron_rough_acute);
"O(\\"		printg(G_omicron_rough_grave);
"O)"		printg(G_omicron_smooth);
"O)/"		printg(G_omicron_smooth_acute);
"O)\\"		printg(G_omicron_smooth_grave);
"O/"		printg(G_omicron_acute);
"O\\"		printg(G_omicron_grave);

"R("		printg(G_rho_rough);
"R)"		printg(G_rho_smooth);

"S1"		printg(G_sigma);
"S"/[-A-Z]	printg(G_sigma);
"S"[23]?	printg(G_fsigma);

"U("		printg(G_upsilon_rough);
"U(/"		printg(G_upsilon_rough_acute);
"U(\\"		printg(G_upsilon_rough_grave);
"U(="		printg(G_upsilon_rough_circ);
"U)"		printg(G_upsilon_smooth);
"U)/"		printg(G_upsilon_smooth_acute);
"U)\\"		printg(G_upsilon_smooth_grave);
"U)="		printg(G_upsilon_smooth_circ);
"U/"		printg(G_upsilon_acute);
"U\\"		printg(G_upsilon_grave);
"U="		printg(G_upsilon_circ);
"U+"		printg(G_upsilon_uml);
"U/+"		printg(G_upsilon_acute_uml);
"U\\+"		printg(G_upsilon_grave_uml);

"W("		printg(G_omega_rough);
"W(/"		printg(G_omega_rough_acute);
"W(/|"		printg(G_omega_rough_acute_isub);
"W(\\"		printg(G_omega_rough_grave);
"W(\\|"		printg(G_omega_rough_grave_isub);
"W(="		printg(G_omega_rough_circ);
"W(=|"		printg(G_omega_rough_circ_isub);
"W(|"		printg(G_omega_rough_isub);
"W)"		printg(G_omega_smooth);
"W)/"		printg(G_omega_smooth_acute);
"W)/|"		printg(G_omega_smooth_acute_isub);
"W)\\"		printg(G_omega_smooth_grave);
"W)\\|"		printg(G_omega_smooth_grave_isub);
"W)="		printg(G_omega_smooth_circ);
"W)=|"		printg(G_omega_smooth_circ_isub);
"W)|"		printg(G_omega_smooth_isub);
"W/"		printg(G_omega_acute);
"W/|"		printg(G_omega_acute_isub);
"W\\"		printg(G_omega_grave);
"W\\|"		printg(G_omega_grave_isub);
"W="		printg(G_omega_circ);
"W=|"		printg(G_omega_circ_isub);
"W|"		printg(G_omega_isub);

"*A"		printg(G_ALPHA);
"*B"		printg(G_BETA);
"*C"		printg(G_XI);
"*D"		printg(G_DELTA);
"*E"		printg(G_EPSILON);
"*F"		printg(G_PHI);
"*G"		printg(G_GAMMA);
"*H"		printg(G_ETA);
"*I"		printg(G_IOTA);
"*K"		printg(G_KAPPA);
"*L"		printg(G_LAMBDA);
"*M"		printg(G_MU);
"*N"		printg(G_NU);
"*O"		printg(G_OMICRON);
"*P"		printg(G_PI);
"*Q"		printg(G_THETA);
"*R"		printg(G_RHO);
"*S"		printg(G_SIGMA);
"*T"		printg(G_TAU);
"*U"		printg(G_UPSILON);
"*W"		printg(G_OMEGA);
"*X"		printg(G_CHI);
"*Y"		printg(G_PSI);
"*Z"		printg(G_ZETA);

"*(A"		{ printg(G_rough); printg(G_ALPHA); }
"*(/A"		{ printg(G_rough_acute); printg(G_ALPHA); }
"*(\\A"		{ printg(G_rough_grave); printg(G_ALPHA); }
"*(=A"		{ printg(G_rough_circ); printg(G_ALPHA); }
"*)A"		{ printg(G_smooth); printg(G_ALPHA); }
"*)/A"		{ printg(G_smooth_acute); printg(G_ALPHA); }
"*)\\A"		{ printg(G_smooth_grave); printg(G_ALPHA); }
"*)=A"		{ printg(G_smooth_circ); printg(G_ALPHA); }
"*/A"		{ printg(G_acute); printg(G_ALPHA); }
"*\\A"		{ printg(G_grave); printg(G_ALPHA); }
"*=A"		{ printg(G_circ); printg(G_ALPHA); }

"*(E"		{ printg(G_rough); printg(G_EPSILON); }
"*(/E"		{ printg(G_rough_acute); printg(G_EPSILON); }
"*(\\E"		{ printg(G_rough_grave); printg(G_EPSILON); }
"*)E"		{ printg(G_smooth); printg(G_EPSILON); }
"*)/E"		{ printg(G_smooth_acute); printg(G_EPSILON); }
"*)\\E"		{ printg(G_smooth_grave); printg(G_EPSILON); }
"*/E"		{ printg(G_acute); printg(G_EPSILON); }
"*\\E"		{ printg(G_grave); printg(G_EPSILON); }

"*(H"		{ printg(G_rough); printg(G_ETA); }
"*(/H"		{ printg(G_rough_acute); printg(G_ETA); }
"*(\\H"		{ printg(G_rough_grave); printg(G_ETA); }
"*(=H"		{ printg(G_rough_circ); printg(G_ETA); }
"*)H"		{ printg(G_smooth); printg(G_ETA); }
"*)/H"		{ printg(G_smooth_acute); printg(G_ETA); }
"*)\\H"		{ printg(G_smooth_grave); printg(G_ETA); }
"*)=H"		{ printg(G_smooth_circ); printg(G_ETA); }
"*/H"		{ printg(G_acute); printg(G_ETA); }
"*\\H"		{ printg(G_grave); printg(G_ETA); }
"*=H"		{ printg(G_circ); printg(G_ETA); }

"*(I"		{ printg(G_rough); printg(G_IOTA); }
"*(/I"		{ printg(G_rough_acute); printg(G_IOTA); }
"*(\\I"		{ printg(G_rough_grave); printg(G_IOTA); }
"*(=I"		{ printg(G_rough_circ); printg(G_IOTA); }
"*)I"		{ printg(G_smooth); printg(G_IOTA); }
"*)/I"		{ printg(G_smooth_acute); printg(G_IOTA); }
"*)\\I"		{ printg(G_smooth_grave); printg(G_IOTA); }
"*)=I"		{ printg(G_smooth_circ); printg(G_IOTA); }
"*/I"		{ printg(G_acute); printg(G_IOTA); }
"*\\I"		{ printg(G_grave); printg(G_IOTA); }
"*=I"		{ printg(G_circ); printg(G_IOTA); }
"*+I"		{ printg(G_uml); printg(G_IOTA); }
"*/+I"		{ printg(G_acute_uml); printg(G_IOTA); }
"*\\+I"		{ printg(G_grave_uml); printg(G_IOTA); }

"*(O"		{ printg(G_rough); printg(G_OMICRON); }
"*(/O"		{ printg(G_rough_acute); printg(G_OMICRON); }
"*(\\O"		{ printg(G_rough_grave); printg(G_OMICRON); }
"*)O"		{ printg(G_smooth); printg(G_OMICRON); }
"*)/O"		{ printg(G_smooth_acute); printg(G_OMICRON); }
"*)\\O"		{ printg(G_smooth_grave); printg(G_OMICRON); }
"*/O"		{ printg(G_acute); printg(G_OMICRON); }
"*\\O"		{ printg(G_grave); printg(G_OMICRON); }

"*(R"		{ printg(G_rough); printg(G_RHO); }
"*)R"		{ printg(G_smooth); printg(G_RHO); }

"*(U"		{ printg(G_rough); printg(G_UPSILON); }
"*(/U"		{ printg(G_rough_acute); printg(G_UPSILON); }
"*(\\U"		{ printg(G_rough_grave); printg(G_UPSILON); }
"*(=U"		{ printg(G_rough_circ); printg(G_UPSILON); }
"*)U"		{ printg(G_smooth); printg(G_UPSILON); }
"*)/U"		{ printg(G_smooth_acute); printg(G_UPSILON); }
"*)\\U"		{ printg(G_smooth_grave); printg(G_UPSILON); }
"*)=U"		{ printg(G_smooth_circ); printg(G_UPSILON); }
"*/U"		{ printg(G_acute); printg(G_UPSILON); }
"*\\U"		{ printg(G_grave); printg(G_UPSILON); }
"*=U"		{ printg(G_circ); printg(G_UPSILON); }
"*+U"		{ printg(G_uml); printg(G_UPSILON); }
"*/+U"		{ printg(G_acute_uml); printg(G_UPSILON); }
"*\\+U"		{ printg(G_grave_uml); printg(G_UPSILON); }

"*(W"		{ printg(G_rough); printg(G_OMEGA); }
"*(/W"		{ printg(G_rough_acute); printg(G_OMEGA); }
"*(\\W"		{ printg(G_rough_grave); printg(G_OMEGA); }
"*(=W"		{ printg(G_rough_circ); printg(G_OMEGA); }
"*)W"		{ printg(G_smooth); printg(G_OMEGA); }
"*)/W"		{ printg(G_smooth_acute); printg(G_OMEGA); }
"*)\\W"		{ printg(G_smooth_grave); printg(G_OMEGA); }
"*)=W"		{ printg(G_smooth_circ); printg(G_OMEGA); }
"*/W"		{ printg(G_acute); printg(G_OMEGA); }
"*\\W"		{ printg(G_grave); printg(G_OMEGA); }
"*=W"		{ printg(G_circ); printg(G_OMEGA); }

"("		printg(G_rough);
"(/"		printg(G_rough_acute);
"(\\"		printg(G_rough_grave);
"(="		printg(G_rough_circ);
")"		printg(G_smooth);
")/"		printg(G_smooth_acute);
")\\"		printg(G_smooth_grave);
")="		printg(G_smooth_circ);
"+"		printg(G_uml);
","		printg(G_comma);
"-"		printg(G_hyphen);
"."		printg(G_fullstop);
"/"		printg(G_acute);
"/+"		printg(G_acute_uml);
":"		printg(G_colon);
";"		printg(G_question);
"="		printg(G_circ);
"\\"		printg(G_grave);
"\\+"		printg(G_grave_uml);
"|"		printg(G_isub);

.		ECHO;

%%

void printl(int code)
{
  printf("%c%c", lc_l, code);
}

void printg(int code)
{
  printf("%c%c%c%c", lc_g1, lc_g2, code / 96 + 160, code % 96 + 160);
}


int get7 (void) {
  int i;
  i=(int)(unsigned char)*yytext++&0x7f;
  return i;
}

int get14 (void) {
  int i;
  i=(int)(unsigned char)*yytext++&0x7f;
  i=(i*128)+((int)(unsigned char)*yytext++&0x7f);
  return i;
}

char get1c (void) {

  char c;
  c=*yytext++&0x7f;
  return c;

}

char * getstr (void) {

     char *charptr;
     charptr=yytext;
     while ((unsigned char)*yytext<0xff) {
       *yytext&=0x7f;   /* strip the high bit in place */
       yytext++;        /* advance separately: "*yytext=*yytext++&0x7f" is undefined */
     };
     *yytext++=0x00;
     return charptr;
}

int getlevel(void) {
 /*     printf("getlevel ");  */
  switch (*yytext++&0xf0) {
  case 0x80:
    return Z;
    break;
  case 0x90:
    return Y;
    break;
  case 0xA0:
    return X;
    break;
  case 0xB0:
    return W;
    break;
  case 0xC0:
    return V;
    break;
  case 0xD0:
    hierarchy=0;
    return N;
    break;
  case 0xE0:
    if (toascii(*yytext)<4) {
      return(A-toascii(*yytext++));
    } else {
      return(toascii(*yytext++));
    };
    break;
  };
}

/***********************************************************************
 *  TLG IDT level updater.  This has to do three things:
 *  (1) update the level string, possibly by incrementing it 
 *        (this may mean incrementing a trailing number after 
 *	  a character or a trailing character after a number as well
 *        as incrementing a number)
 *  (2) reset lower levels to 1, if the levels are hierarchical (indicated
 *      by the global 'hierarchy').
 *  (3) turn hierarchy off if level N is being set.
 *
 *  The return is clunky: it strcats some level strings to the global
 *  'cite' and returns its address.  That could be fixed.   
 */ 

char * setlevel(int which, int how, int val, char ch, char* str) {
  int d,ix,last_i, last_d, first_d;
  char numstr[16];                     /*let's have plenty of room!*/

  /*   printf("Setlevel %d %d %d %c %s: ", which,how,val,ch,str); */  /* Tracer */


  /* which = which level to update (=levels[which])
   * how =   what type of update this is:
   *          0=increment which 
   *          1=set which to val 
   *          2=set which to val+ch (
   *          3=set which to val+str
   *          4=change the last char in which to ch 
   *          5=set which to str         
   * val = integer parameter
   * ch  = character parameter
   * str = string parameter
   */

  /* For anything but an increment, we can just set it 
   */
  if (which==N) { hierarchy=1; };
  if (which>N) { printf("WARNING: Setlevel called for a high-level id\n"); };
  switch (how) {

  case 0:
    /* This is what we have to do to increment a counter of unknown type 
     * First we have to see if the level currently ends in a digit  string.
     * Just to be safe, we count backwards from the end.
     */ 
    
    
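    last_i=(int)strlen(&levels[which][0])-1;

    if (last_i<0) {
      /* Empty level: there is nothing to increment, so (by assumption)
       * start counting at 1 rather than index before the string. */
      sprintf(&levels[which][0],"%d",1);

    } else if (isdigit((unsigned char)levels[which][last_i])) {
      
      /* Back up to the first digit of the trailing digit string */
      first_d=last_i;
      while (first_d>0 && isdigit((unsigned char)levels[which][first_d-1])) {
	first_d--;
      };
      d=atoi(&levels[which][first_d]);
      d++;
      sprintf(&levels[which][first_d],"%d",d);
      
    } else {
      ++levels[which][last_i];

    };    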

    break;

  case 1:
    sprintf(&levels[which][0],"%d",val);
    break;
  case 2:
    sprintf(&levels[which][0],"%d%c",val,ch);
    break;
    
  case 3:
    sprintf(&levels[which][0],"%d%s",val,str);
    break;
    
  case 4:
    last_d=strlen(&levels[which][0])-1;
    levels[which][last_d]=ch;
    break;
    
  case 5:
    sprintf(&levels[which][0],"%s",str);
    break;
    
  };

  /* Whatever we do, we reset the lower levels if 
   * the levels are hierarchical.
   */
  if (hierarchy) {
    for (ix=0; ix<which; ix++) {
      sprintf(&levels[ix][0],"%d",1);
    };
  };

  strncpy(&cite[0],&levels[which][0],16);
  
  
 /*   printf("Saving records in %d...\n",this_cite); */
  for (ix=0;ix<6;ix++) {
/*      printf("Copying id level %d to %d..\n",ix,this_cite+ix); */
    strncpy(this_cite+(ix*REF_SIZE),&levels[ix][0],REF_SIZE);
  };
  return &cite[0];
}

/*************************************
 * Citation string comparison
 */

int citecmp (char *str1, char *str2) {   
  int d1, d2;
  int l1, l2;
  int cval;
  char c1, c2;
  char *digits="0123456789";
  char *ptr1;
  char *ptr2;
  
  ptr1=str1; ptr2=str2;

  while (strlen(ptr1)>0 && strlen(ptr2)>0) {
 
    if (isdigit(*ptr1) || isdigit(*ptr2)) {
      d1=strtol(ptr1,&ptr1,10);
      d2=strtol(ptr2,&ptr2,10);
      
      if (d1<d2) 
	{
	  return(-1);
	} else if (d2<d1){
	  return(1);	
	};
      
    } else {
      l1=strcspn(ptr1,digits);
      l2=strcspn(ptr2,digits);
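      /* Compare only the non-digit prefixes; if one is a prefix of the
       * other, the shorter one sorts first. */
      cval=strncmp(ptr1,ptr2,(l1<=l2)?l1:l2);
      if (cval==0 && l1!=l2) {
	cval=(l1<l2)?-1:1;
      };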
      if (cval!=0) {
	return(cval);
      } else {
	ptr1+=l1;
	ptr2+=l2;
      };      
    };    
  };
  return(0);   /* Fall-through from 'while' */
};


/*********************************************************
 * TODO:
 *   When the top-level idt values (file no.,
 *   work no., etc.) are reset, hierarchy
 *   should also be reset to 1.
 *
 *   We could do a cleaner job of escape-level updates.
 *
 *   Add ways to choose a citation string format
 *   Add choices of idt values (including no print at all?)
 *   
 **********************************************************/

/****************************************************
 * Follows the 'next' links from a work to null, then
 * creates a new work linked to the last work.  Returns 
 * a pointer to this work.
 */

struct work * new_work(struct work * * wklist) {
  struct work * wptr;
  wptr=*wklist;

  if (wptr) {                        /* wklist!=null */
    while (wptr->next) {             /* wklist->next!=null */
      wptr=wptr->next;
    };                         /* wptr!=null, wptr->next==null, wptr is old last work */

    wptr->next=malloc(sizeof(struct work));   /* create new work; its address goes in last work */

    wptr=wptr->next;                        /* and set wptr to it */

  } else {                     /* wklist==null*/
    wptr=malloc(sizeof(struct work));    /* wklist==new work */
    *wklist=wptr;

  };
    wptr->next=0;                       /* wklist!=null, wklist->next==null */
    return(wptr);

};
/************************************************/
struct section * new_section(struct section * * sctlist) {
  struct section * sptr;
  sptr=*sctlist;

  if (sptr) {                        /* sctlist!=null */
    while (sptr->next) {             /* sctlist->next!=null */
      sptr=sptr->next;
    };                         /* sptr!=null, sptr->next==null, sptr is old last section */

    sptr->next=malloc(sizeof(struct section));   /* create new section; its address goes in last section */

    sptr=sptr->next;                        /* and set sptr to it */

  } else {                     /* sctlist==null*/
    sptr=malloc(sizeof(struct section));    /* sctlist==new section */
    *sctlist=sptr;

  };
    sptr->next=0;                       /* sctlist!=null, sctlist->next==null */
    sptr->owner=this_work;
    return(sptr);


};

/******************/

struct block * new_block(struct block * * blklist) {
  struct block * bptr;
  bptr=*blklist;

  if (bptr) {                        /* blklist!=null */
    while (bptr->next) {             /* blklist->next!=null */
      bptr=bptr->next;
    };                         /* bptr!=null, bptr->next==null, bptr is old last block */

    bptr->next=malloc(sizeof(struct block));   /* create new block; its address goes in last block */

    bptr=bptr->next;                        /* and set bptr to it */

  } else {                     /* blklist==null*/
    bptr=malloc(sizeof(struct block));    /* blklist==new block */
    *blklist=bptr;

  };
    bptr->next=0;                       /* blklist!=null, blklist->next==null */
    bptr->owner=this_work;
    return(bptr);


};


struct block * last_block (struct block * blklist) {
 
  while (blklist->next) {
    blklist=blklist->next;
  };
  return blklist;
};

char * next_command(char * * cstring) {
  char * limit;
  char whitespace[]=" \t";
  char * c_start, * c_end;
  limit=*cstring+strlen(*cstring);              /* points to the terminating NUL */
  c_start=*cstring+strspn(*cstring,whitespace); /* points to first non-ws */
  c_end=c_start+strcspn(c_start,whitespace);    /* points to next ws, or to limit */
  if (c_end<limit) {
    *c_end=0x0;                                 /* terminate the command */
    *cstring=c_end+1;                           /* point to next command beyond */
  } else {
    *cstring=limit;                             /* do not read past the end of the string */
  };
  return c_start;
};

void print_work(int worknum) {
  struct work * workp;
  workp=who->worklist;
  while (workp && workp->worknum!=worknum) {
    workp=workp->next;
  };
  if (workp) {
 
    printf("\n&Work %s: \n%s& (%s)\n",workp->workno,workp->title, workp->title_abbr);
    printf("&Starting citation: %s %s %s %s %s %s;  ",
	   workp->sections->blocks->cite[N],
	   workp->sections->blocks->cite[V],
	   workp->sections->blocks->cite[W],
	   workp->sections->blocks->cite[X],
	   workp->sections->blocks->cite[Y],
	   workp->sections->blocks->cite[Z]);
    printf("&Ending citation: %s %s %s %s %s %s\n",
	   last_block(workp->sections->blocks)->cite[N],
	   last_block(workp->sections->blocks)->cite[V],
	   last_block(workp->sections->blocks)->cite[W],
	   last_block(workp->sections->blocks)->cite[X],
	   last_block(workp->sections->blocks)->cite[Y],
	   last_block(workp->sections->blocks)->cite[Z]);


    for (ix=0;ix<=N;ix++) {
      if (strlen(workp->level_names[ix])<REF_SIZE) {
	printf("&Level %d: %-20s   (%-8s - %s)\n",ix,
	       workp->level_names[ix],
	       workp->sections->blocks->cite[ix],
	       last_block(workp->sections->blocks)->cite[ix]
	       );
      };


    };
    printf("& File blocks %d-%d\n",workp->start,workp->end);

    /* Turn off block printing */  
   /*   this_section=workp->sections; */
/*      while (this_section) { */
/*        printf(" Section %d:\n", this_section->num); */
/*        this_block=this_section->blocks; */
/*        while (this_block) { */
/*  	printf("  Block %d: n=%s v=%s w=%s x=%s y=%s z=%s\n", */
/*  	       this_block->num, */
/*  	       this_block->cite[5], */
/*  	       this_block->cite[4], */
/*  	       this_block->cite[3], */
/*  	       this_block->cite[2], */
/*  	       this_block->cite[1], */
/*  	       this_block->cite[0]); */
	
/*  	this_block=this_block->next; */
/*        };      */
/*        this_section=this_section->next; */
/*      }; */
  /*    printf("\n Exceptions list:\n"); */

/*      this_block=workp->exceptions; */

/*      printf("  Exceptions list: %d\n",this_block); */
/*      if (this_block) {printf(" This exception block %d; next exception at %d\n", */
/*  			    this_block->num, */
/*  			    this_block->next); }; */
   /*   while (this_block) { */
/*        printf("  Exception : n=%s v=%s w=%s x=%s y=%s z=%s\n"); */
	     
/*  	       this_block->cite[5], */
/*    	     this_block->cite[4], */
/*    	     this_block->cite[3], */
/*    	     this_block->cite[2], */
/*    	     this_block->cite[1], */
/*    	     this_block->cite[0]); */

      
/*        this_block=this_block->next; */
/*      };    */  

  } else {
    printf("No work matching number %d\n",worknum);
  };

};

void print_author(void) {
  this_work=who->worklist;
  while (this_work) {
    print_work(this_work->worknum);
    this_work=this_work->next;
  }; 
  
};



/******************************************************************
 *
 *                    MAIN STARTS HERE
 *
 *****************************************************************/


int main(int argc, char *argv[])
{
  FILE *file;
  FILE *dump;
  if (argc < 2 || argc > 3) {
    fprintf(stderr, "usage: %s filename [leading_code]\n", argv[0]);
    exit(-1);
  }
  
  if (argc == 3) {
    lc_g2 = atoi(argv[2]);
  };

  if (strcmp(argv[1],"|")==0) {
    yyin = stdin;
  } else {
    file = fopen(argv[1], "r");
    if (!file) {
      fprintf(stderr, "cannot open %s\n", argv[1]);
      exit(-1);
    };
    yyin = file;
  };
  c=' ';

  dump = fopen("/dev/null", "w");   /* yyout is written to, so open for writing */
  if (!dump) {
    fprintf(stderr, "cannot open /dev/null\n");
  }  else {
    yyout= dump;
  };

  yylex();

  if (strcmp(argv[1],"|")!=0) {   /* 'file' was only opened when not reading stdin */
    fclose(file);
  };
  yyout=stdout;
  print_author();


  /*******************************************************************
   * This command interpretation loop was written while experimenting with
   * using the program as a kind of daemon: the idea is that the process
   * is started by emacs with a process-buffer, initially outputs the IDT
   * data to that buffer, and then takes commands for other idt information
   * from the tty to that buffer.  I'm no longer exploring this.
   * Setting 'daemon' to 0 shuts this off.
   * */
 
  while (daemon) {
    if (!fgets(command_line,512,stdin)) break;  /* Read a whole command line; stop at EOF */
    command_line[strcspn(command_line,"\n")]=0x0;   /* Trim the final newline, if any */
    c_ptr=command_line; 
    c_end=c_ptr+strlen(command_line);

    while (c_ptr<c_end) {
      sprintf(command,"%s",next_command(&c_ptr));   /* never use input as a format string */

      if (command[0]=='-') {

	option = command[1];
	
	switch (option) {
	  
	  /* -a, -b are search arguments */

	case 'a':       /* This may be useless; same throughout a file? */
	/*    seeking=1; */
	  sprintf(command,"%s",next_command(&c_ptr));
	  printf("Seeking author %s\n",command);
/*  	  strncpy(&target[A][0],argv[++arg],31); */
/*  	  found[A]=-1; */
/*  	  printf("Searching for level %d (author) %s...\n",A,&target[A][0]); */
	  break;
	  
	case 'b':       /* Add 'W' as a synonym? */
	/*    seeking=1; */
	  sprintf(command,"%s",next_command(&c_ptr));
	  printf("Seeking work %s\n",command);
/*  	  found[B]=-1; */
/*  	  printf("Searching for level %d (work) %s...\n",B,&target[B][0]); */
	  break;
	  
	  
	case 'c':
	  sprintf(command,"%s",next_command(&c_ptr));
	  
	  switch (command[0]) { 
	  case 'a': 
	    fmt=&fmt_all[0][0];
	    break;
	  case 'l':
	    fmt=&fmt_low[0][0];
	    break;
	  case 'n':
	    fmt=&fmt_none[0][0];
	    break;
	  };
	  break;
	  
	case 'f':
	  sprintf(command,"%s",next_command(&c_ptr));

	  printf("Filename to open: %s\n",command ); 
	  break;
	  
	case 'l':
	/*    lc_g2 = atoi(argv[++arg]);  */
	  /*  printf("Setting charset code to %d\n",lc_g2); */
	  break; 
	  
	case 'o':
	/*    offset = atoi(next_command(&c_ptr)); */
	  /*    printf(" at block %d ...", offset);  */
	  break;
	 
	case 'q':
	  printf("Quitting...\n");
	  exit(0);

	case 's':
	 /*   seeking=1; */
/*  	  strncpy(&target_string,argv[++arg],31); */
/*  	   Parse the citation string */ 
/*  	  printf("Searching for citation %s...\n",target_string); */
	  break;
	  
	  /* The next six cases are the low-level IDTs */
	  
	case 'n':
	  sprintf(command,"%s",next_command(&c_ptr));

/*  	  found[N]=-1; */
  	  printf("Searching for n-level idt %s...\n",command); 
	  break;

	case 'p':
	  sprintf(command,"%s",next_command(&c_ptr));
	  switch (command[0]) {
	  case 'a':
	    print_author();
	    break;
	  case 'w':
	    sprintf(command,"%s",next_command(&c_ptr));
	    print_work(atoi(command));
	    break;

	  default:
	    printf("Unknown print request\n");
	    break;
	  };
	  break;

	case 'v':
	  sprintf(command,"%s",next_command(&c_ptr));

/*  	  found[N]=-1; */
  	  printf("Searching for v-level idt %s...\n",command); 
	  break;
	  
	case 'w':
	  sprintf(command,"%s",next_command(&c_ptr));
	  /* w-level idt: the argument is consumed but not yet used */
	  break;
	  
	case 'x':
	  sprintf(command,"%s",next_command(&c_ptr));
  	  printf("Searching for x-level idt %s...\n",command); 
/*  	  found[N]=-1; */
	  break;
	  
	case 'y':
	  sprintf(command,"%s",next_command(&c_ptr));
  	  printf("Searching for y-level idt %s...\n",command); 
/*  	  found[N]=-1; */
	  break;
	  
	case 'z':
	  sprintf(command,"%s",next_command(&c_ptr));
  	  printf("Searching for z-level idt %s...\n",command); 

	  break;
	  
	default:
	  fprintf(stderr,"unrecognized option: %c\n",option);
	  fprintf(stderr,
		  "usage: %s [-abnvwxyz citation] [-s citation] [-c a|l|n] [-f file]\n"
		  "       [-l code] [-o block] [-p a|w number] [-q]\n", argv[0]);
	/*    exit(-1);  */
	  break;
	};
      };
      /****************************/
    };
    
  };
  
}












- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: text/plain; charset=US-ASCII


- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="tlgreadblk.c"
Content-Transfer-Encoding: 7bit

#include <stdio.h>
#include <stdlib.h>   /* exit, atoi */

/* A simple hose: read a TLG file from block startblock through
 * block endblock (inclusive, 8192 bytes per block) and pipe it
 * to standard output.
 */



int main (int argc, char *argv[]) {

  unsigned char buf[8192];
  int start, end, i;
  FILE *file;

  if (argc!=4) {
    fprintf(stderr,"usage: %s filename startblock endblock\n",argv[0]);
    exit(-1);
  };
  start = atoi(argv[2]);
  end = atoi(argv[3]);
  if (start>end) {
    fprintf(stderr,"ending block %d is before starting block %d\n",end,start);
    exit(-1);
  };
  /*  printf("Reading %s from %d to %d\n", argv[1], start, end); */
  file = fopen(argv[1], "rb");   /* TLG data is binary */
  if (!file) {
    fprintf(stderr, "cannot open %s\n", argv[1]);
    exit(-1);
  };
  if (fseek(file,(long)start*8192,SEEK_SET)) {
    fprintf(stderr,"Seek error\n");
    exit(-1);
  };

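  for (i=0;i<end-start+1;i++) {
    if (fread(buf,8192,1,file)<1) {
      /* report the absolute block number, not the loop index */
      fprintf(stderr,"Read error at block %d\n",start+i);
      exit(-2);
    };
    if (fwrite(buf,8192,1,stdout)<1) {
      fprintf(stderr,"Write error at block %d\n",start+i);
      exit(-2);
    };
  };
  fclose(file);
  return 0;
}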

- --Multipart_Fri_Jul_14_11:06:14_2000-1
Content-Type: text/plain; charset=US-ASCII





- --Multipart_Fri_Jul_14_11:06:14_2000-1--
------- End of forwarded message -------

-- 
TAKAHASHI Naoto
Electrotechnical Laboratory, Japan
ntakahas@xxxxxxxxx
http://www.etl.go.jp/~ntakahas/