CISIS – Utility Programs

Content

  1. Acronym Description
  2. MX Utility
    1. MX
  3. Utilities for the master file
    1. MXF0
    2. MXTB
    3. MXCP
    4. MSRT
    5. RETAG
    6. CTLMFN
    7. MKXRF
    8. I2ID
    9. ID2I
    10. CRUNCHMF
  4. Utilities for inverted files
    1. IFKEYS
    2. IFLOAD
    3. MYS
    4. IFMERGE
    5. MKIY0
  5. Installation of the CISIS utilities
  6. Execution of the utilities
  7. Syntax conventions
  8. MX Utility
    1. General description
    2. Syntax
    3. Parameters. General description
    4. Initialization parameters (setup)
    5. Parameters that indicate the database source
    6. Parameters for processing data
    7. Parameters for searching records
    8. Parameters that carry out processes

The CISIS Utilities are a group of programs written in the C programming language that call functions offered by the CISIS Interface to carry out operations on the Isis family of databases, such as finding and displaying records or maintaining databases. They can also carry out special functions such as sorting master file records, generating tables from a master file, changing field tags, etc.

This group of utility programs is offered in four versions: 10/30, 16/60, LIND and FFI. The main differences are the length of the inverted file keys and the maximum supported record size in bytes, as shown in the following table.

                           10/30    16/60    LIND     FFI
Inverted file key length   30       60       60       60
Maximum record size        32,767   32,767   32,767   1,048,576

Note: version 10/30 is the only one compatible with CDS/ISIS from UNESCO.

For more details on the structure of the master and inverted files see the Appendix: Structure of ISIS database records.

The particular characteristics of these programs can be verified in the version declaration, which you can obtain by running a utility with the what parameter.

For example:

mx what
CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MX
CISIS Interface v5.2a/.iy0/Z/4GB/GIZ/DEC/ISI/UTL/INVX/B7/FAT/CIP/CGI/MX
Copyright (c)BIREME/PAHO 2006. [http://www.bireme.br/products/cisis]

Acronym Description

  • v5.2a : Version number
  • PC32 : Computer used (in this case Windows PC)
  • L : Lind version if it is present
  • M : Multi-user support
  • 32767 : Maximum size of the record in bytes, value by default
  • 10/30 : Inverted file keys
  • I : Permission to update the I/F
  • Utility MX : Name of the program
  • .iy0 : Single physical file for I/F (made by mkiy0)
  • Z : Compressed I/F (made by myz – discontinued)
  • 4GB : Maximum size of the master file
  • GIZ : Gizmo
  • DEC : Decod
  • ISI : Iso-2709 import
  • UTL : Ciutl module
  • INVX : Multiple I/F searching
  • B7 : Version of the internal search mechanism
  • FAT : Fatal()
  • CIP : Cipar()
  • CGI : Supports operation in a CGI environment
  • MX : C isis_mx()
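
The acronym list above can be read mechanically from the first banner line. The following is only an illustrative sketch in Python (it is not part of CISIS): the function name parse_banner and the returned dictionary keys are assumptions made for this example, and it handles only the first line of the what output, the one that ends in "- Utility <name>".

```python
def parse_banner(line):
    """Split the first 'what' banner line into its flags and the program name.

    Hypothetical helper for illustration; the layout is taken from the
    example banner shown in this manual.
    """
    prefix = "CISIS Interface "
    if not line.startswith(prefix):
        raise ValueError("not a CISIS banner line")
    # Everything before " - Utility " is a slash-separated list of flags.
    flags_part, _, program = line[len(prefix):].partition(" - Utility ")
    flags = flags_part.split("/")
    return {
        "version": flags[0],            # e.g. v5.2a
        "platform": flags[1],           # e.g. PC32
        "lind": "L" in flags[2:],       # L appears only in the LIND version
        "multi_user": "M" in flags[2:],
        "flags": flags,
        "program": program.strip(),
    }


info = parse_banner("CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MX")
print(info["version"], info["platform"], info["lind"], info["program"])
```

Note that the key-length flag 10/30 is itself written with a slash, so it appears as two consecutive entries in the flags list.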

MX Utility

MX

The MX program is a general-purpose utility for working with MicroISIS databases. It can carry out most of the CISIS Interface functions, including the import and export of ISO-2709 records, searches, global pattern changes, joining records from a master file by record number or inverted file key, adding fields with data generated through a field selection table (FST), and field editing functions.

Utilities for the master file

MXF0

Analyzes all the records of a given master file, producing information on fields present and the characters used in them.

MXTB

The MXTB program tabulates the content of fields, for example the number of times each author, each descriptor, or each combination of author and publication title appears.
The result of running MXTB is a master file that contains one record for each different phrase found (category). These records have fields for storing the category and its frequency.

MXCP

Copies records from an input master file to an output master file, allowing the input data to be modified by global changes and/or by procedures that suppress leading and trailing spaces, blank spaces, non-printable characters and final punctuation characters. It also converts fields that contain a specific delimiter into repeatable fields, and it can discard input fields according to the values of their tags. Another feature of MXCP is the recovery (undelete) of records deleted from a master file.

MSRT

Sorts the records of a master file in ascending order, according to keys that are generated by applying a format to the records.

RETAG

This program has two functions:

  • Change the tags of the fields in a given master file, according to a renumbering table.
  • Unlock a master file.

CTLMFN

Displays and updates the master file control record. It can be used when a master file is reinitialized by accident.

MKXRF

A program for restoring the master file: it reads the .mst file and creates the corresponding .xrf file.
It can be used to restore all the active records of a master file that has been logically reinitialized.

I2ID

Reads a master file and generates an ASCII file that can be edited and modified.
It works together with the ID2I utility that carries out the opposite
function: reads an ASCII file and converts the data into a master file.

ID2I

Reads an ASCII file generated by I2ID (or one with the same structure) and converts the data into master file records.

CRUNCHMF

Converts the master file from one operating system to another, for example from Windows to Linux.

Utilities for inverted files

IFKEYS

Displays the terms from the inverted file and the number of postings of each of them.
Optionally the terms can be selected by the tags they were extracted
from.

IFLOAD

Loads an inverted file from the link file, according to the processing options. It accepts other formats as well as the standard CDS/ISIS link
file.

MYS

Sorts the link file in order to create the inverted file.

IFMERGE

Combines various inverted files from different master files into a single inverted file, with a procedure to recover the records from the source
master files.

MKIY0

Combines the six files that make up the inverted file into a single file.

CRUNCHIF

Converts the inverted file from one operating system to another, for example from Windows to Linux.

Installation of the CISIS utilities

The installation of the CISIS utilities consists of creating a directory, usually \CISIS\SYS\, and copying all the utilities into it.
For convenience, the \CISIS\SYS directory can be added to the operating system PATH, so that the utilities can be run from any location without referencing the \CISIS\SYS directory.

Examples

The examples are mostly based on the CDS database, and assume that it is located in the directory:
\CISIS\DATABASES\
The examples are run against the database and will often modify it, so it is advisable to make a backup copy first.

Execution of the utilities

The CISIS programs are executed by command, either from the operating system prompt, from MS-DOS batch files, or from UNIX shell scripts.

Any program that uses CISIS can be executed by entering its name and one or more parameters, if the directory \cisis\sys (the directory where the CISIS utilities are found) is included in the system PATH list. If you do not provide parameters in the command call, each utility program displays a brief description of its use.
For example, typing only the name MXCP at the DOS prompt displays:

CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MXCP  
Copyright (c)BIREME/PAHO 2006. [http://www.bireme.br/products/cisis] 
mxcp {in=<file>|<dbin>} [create=]<dbout> [<option> [...]]

options:

{from|to|loop|count|tell|offset}=<n>
 gizmo=<dbgiz>[,<tag_list>]
 undelete
 clean [mintag=1] [maxtag=9999]
 period=.[,<tag_list>]
 repeat=%[,<tag_list>]
 log=<filename> 

Ex:

 mxcp in create=out clean period=.,3 repeat=;,7 
 in = 3 « Field 3 occ 1. »
 3 «Field 3 occ 2 . »
 7 « Field 7/1;Field 7/2 ;Field 7/3.»
 out = 3 «Field 3 occ 1»
 3 «Field 3 occ 2»
 7 «Field 7/1»
 7 «Field 7/2»
 7 «Field 7/3.»  
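
The effect of clean, period=.,3 and repeat=;,7 in the example above can be imitated with a short Python sketch. This is an illustration only and makes several assumptions: the function name mxcp_like is invented here, clean is modelled as trimming and collapsing blank spaces, and the order in which the real MXCP applies these options is not stated in this manual.

```python
def mxcp_like(fields, period=None, repeat=None):
    """Imitate mxcp clean/period/repeat on a list of (tag, text) fields.

    period and repeat are (character, set_of_tags) pairs or None.
    Hypothetical model for illustration, not the MXCP implementation.
    """
    result = []
    for tag, text in fields:
        pieces = [text]
        # repeat: split the field into occurrences at the delimiter
        if repeat is not None and tag in repeat[1]:
            pieces = text.split(repeat[0])
        for piece in pieces:
            # clean (assumed behaviour): trim ends and collapse runs of blanks
            piece = " ".join(piece.split())
            # period: drop the final punctuation character for the listed tags
            if period is not None and tag in period[1] and piece.endswith(period[0]):
                piece = piece[: -len(period[0])].rstrip()
            if piece:
                result.append((tag, piece))
    return result


record_in = [
    (3, " Field 3 occ 1. "),
    (3, "Field 3 occ 2 . "),
    (7, " Field 7/1;Field 7/2 ;Field 7/3."),
]
record_out = mxcp_like(record_in, period=(".", {3}), repeat=(";", {7}))
for tag, text in record_out:
    print(tag, f"«{text}»")
```

As in the MXCP example, the final period of "Field 7/3." is kept, because period=.,3 applies only to tag 3.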

The parameters are given as a list separated by blank spaces. For that reason, each individual parameter should be put in quotation marks when it contains blank spaces or any special system character (such as angle brackets, pipe, etc.).
The following example executes the MX program with three parameters (database name, search expression and a display format specification):

mx \cisis\databases\cds "plants*water" "pft=mfn/,'Ti: 'v24/,(|Au: |v70/)"  

In order to use a quotation mark as part of the parameter, it should be preceded by a backslash:

mx \cisis\databases\cds "plants*water" "pft=mfn/, \" Ti: \"v24/,(|Au: |v70/)"  

Note: The dollar sign, apostrophe, asterisk, question
mark, semicolon, and other characters that have
special significance in UNIX systems should also
be put in quotation marks.
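
When the utilities are called from a script rather than typed by hand, the quoting rules above can be delegated to a library. The following Python sketch is an illustration under stated assumptions: it targets POSIX shells only (MS-DOS quoting rules differ), the UNIX-style database path is hypothetical, and nothing is executed, so mx does not need to be installed.

```python
import shlex

# One list element per MX parameter; no manual quoting is needed here.
# The database path below is a hypothetical UNIX-style location.
args = [
    "mx",
    "/cisis/databases/cds",
    "plants*water",
    "pft=mfn/,'Ti: 'v24/,(|Au: |v70/)",
]

# shlex.join adds the quoting a POSIX shell needs for each parameter.
command_line = shlex.join(args)
print(command_line)

# Round trip: splitting the quoted line gives back the original parameters.
assert shlex.split(command_line) == args
```

Passing the same list directly to subprocess.run(args) would avoid the shell altogether, so that special characters in the search expression or format reach the program unchanged.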

Syntax conventions

The following conventions are used to describe the syntax of the CISIS Utility Programs:

   
<parameter>                mandatory parameter
[<parameter>]              optional parameter
{<option 1>|<option 2>}    choose between <option 1> and <option 2>
<option> [...]             <option> can be repeated

Note: Some parameters are reserved words and, if they
are used, they should be written exactly as indicated,
respecting upper and lower case.

For example, MXCP has the general syntax:

mxcp <dbin> [create=]<dbout> [<option> [...]]  
options:  
{from|to|loop|count|tell|offset}=<n>  
gizmo=<dbgiz>[,<tag_list>]  
undelete  
clean [mintag=1] [maxtag=9999]  
period=.[,<tag_list>]  
repeat=%[,<tag_list>]  
log=<filename> 

showing that two parameters are mandatory: (a) name of the input database and (b) name of the output database.
Then, the command:

mxcp \cisis\databases\cds newcds  

copies the master file cds located in the directory \cisis\databases to the master file newcds located in the same directory. The master file newcds must already exist, otherwise an error will be produced.
If newcds does not exist you can create it using the optional parameter create as can be seen in the following example:

mxcp \cisis\databases\cds create=newcds  

The processing options are indicated using the optional parameters. For example, to indicate the range of records to process, use from and to:

mxcp \cisis\databases\cds create=newcds from=10 to=20 

MX Utility

MX is a general-purpose program for CDS/ISIS databases that carries out most of the CISIS Interface functions. Like the other CISIS utility programs, MX is executed from the operating system command line, with parameters indicating the operations to carry out.
MX is used, for example, to search for and show a set of database records, according to a search expression and a display format, as in the following line:

mx \cisis\databases\cds "plants * water" "pft=mfn,x1,v24/"

Also, MX allows free-text searches even if an inverted file does not exist.
MX can also read ISO-2709 files or ASCII files that use delimiters as field separators. In these cases the input records are converted to master file records as they are read.

The following procedures can be applied to the input records:

  1. Global change of patterns.
  2. Joining of records, by record number or inverted file key.
  3. Add fields with data generated by a field selection table.
  4. Import and export of records, specified through a format language.

Records processed by MX can be sent to a master file, an ISO-2709 file or to standard output (which can be redirected to a file or printer). Lines produced by a format can be sent to the operating system.
The execution of MX can generate a call to the operating system so that a certain program is run.
The result of applying a Field Selection Table (FST) to a master file can be sent to a link file or combined into an inverted file.
The output file can be the same as the input.
MX also works in multi-user environments.

General description

In order to run MX it is necessary to specify the location of the data it will work on. You can provide a master file, an ISO-2709 file or a text file. This is the only mandatory parameter for the program.

The line:

mx \cisis\databases\cds  

generates an on-screen listing of the cds database, which is found in the \cisis\databases directory. The records are displayed without formatting.
Other processing parameters can be specified, for example:

mx \cisis\databases\cds from=10 to=20  

presents the records 10 to 20 of the database cds on the screen. The database can be found in the directory \cisis\databases and the records are displayed without formatting.
The command line

mx \cisis\databases\cds from=10 to=20 "pft=mfn,x1,v24(0,7)/" 

displays records 10 to 20 from the cds database on screen, applying the format specified in the parameter pft=mfn,x1,v24(0,7)/. The database is found in the \cisis\databases directory.
Bear in mind that the order in which the optional parameters are entered does not affect the execution of MX: the parameters are executed in the order in which they appear in the syntax declaration.
So the previous line could also be written as either of the following:

mx \cisis\databases\cds to=20 "pft=mfn,x1,v24(0,7)/" from=10
mx \cisis\databases\cds "pft=mfn,x1,v24(0,7)/" from=10 to=20

The following pairs of commands are equivalent:

mx \cisis\databases\cds pft=@file1 proc=@miproc.prc
mx \cisis\databases\cds proc=@miproc.prc pft=@file1

mx \cisis\databases\cds gizmo=gizfile1 proc=@miproc.prc
mx \cisis\databases\cds proc=@miproc.prc gizmo=gizfile1

In both commands of the second pair the gizmo parameter is applied before the proc parameter, because the order of execution is determined by the syntax and not by the order on the command line.
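
The equivalence can be pictured as a stable sort of the processing parameters by a fixed stage order. The Python sketch below is only a model of that idea, not MX code: it uses the execution order given in this manual (gizmo, join and/or jchk, proc, fst, pft), and the names STAGE and execution_order are invented for the example.

```python
# Stage numbers follow the execution order stated in this manual;
# join and jchk share a stage ("join and/or jchk").
STAGE = {"gizmo": 0, "join": 1, "jchk": 1, "proc": 2, "fst": 3, "pft": 4}


def execution_order(params):
    """Return processing parameters in the order they would be applied.

    Hypothetical model: only the parameter names listed in STAGE are
    accepted, and parameters of the same stage keep their typed order.
    """
    def stage(param):
        name = param.split("=", 1)[0]
        return STAGE[name]

    return sorted(params, key=stage)


first = execution_order(["gizmo=gizfile1", "proc=@miproc.prc"])
second = execution_order(["proc=@miproc.prc", "gizmo=gizfile1"])
print(first)
print(second)
```

Both calls give the same list, which is why the two command lines of each pair behave identically.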

If the first parameter is a database and its corresponding inverted file is available, the set of records to be processed can be obtained through a search.

The following example returns cds database records, that are found in the \cisis\databases directory, containing the words plants and water.

mx \cisis\databases\cds "plants * water"

MX can read input data from an ISO-2709 file or a delimited text file.
The following line displays the first five records of an ISO-2709 file called cds.iso, that is found in the \cisis\databases directory.

mx iso=\cisis\databases\cds.iso to=5

In the next example MX uses an ASCII file called name as its input source, whose content is:

Author 1|title 1|^aParis^bUnesco^c1965
 |title 2|^aParis^bUnesco^c1965
Author 3|title 3|^aParis^bUnesco^c1965 

This can be executed with the following call to MX:

mx seq=name "pft=mfn,c11,v1,c21,v2,c31,v3/" now 

That generates the output:

000001    Author 1  title 1   ^aParis^bUnesco^c1965
000002              title 2   ^aParis^bUnesco^c1965
000003    Author 3  title 3   ^aParis^bUnesco^c1965
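
The way a delimited text file becomes records can be sketched in a few lines of Python. This is a simplified model, not the MX implementation: the function names read_seq and show are invented here, values are assumed to be trimmed of surrounding blanks, and the column handling only approximates the c11, c21 and c31 commands for values short enough to fit in their columns.

```python
def read_seq(lines, delimiter="|"):
    """Turn delimited text lines into (mfn, fields) records.

    Each line becomes one record; the n-th value on the line is stored
    under tag n. Hypothetical model for illustration.
    """
    records = []
    for mfn, line in enumerate(lines, start=1):
        values = line.rstrip("\n").split(delimiter)
        fields = {tag: value.strip() for tag, value in enumerate(values, start=1)}
        records.append((mfn, fields))
    return records


def show(mfn, fields):
    """Rough analogue of pft=mfn,c11,v1,c21,v2,c31,v3/ with fixed columns."""
    line = f"{mfn:06d}"
    for column, tag in ((11, 1), (21, 2), (31, 3)):
        # Pad up to the column (1-based), then append the field content.
        line = line.ljust(column - 1) + fields.get(tag, "")
    return line


source = [
    "Author 1|title 1|^aParis^bUnesco^c1965",
    " |title 2|^aParis^bUnesco^c1965",
    "Author 3|title 3|^aParis^bUnesco^c1965",
]
for mfn, fields in read_seq(source):
    print(show(mfn, fields))
```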

The processed records can be stored in a master file. The following lines create a master file sample:


mx \cisis\databases\cds "plants * water" create=sample -all now
mx iso=\cisis\databases\cds.iso to=5 create=sample -all now
mx seq=name create=sample -all now 

These records can also be exported to an ISO-2709 file, for example sample.iso:

mx \cisis\databases\cds "plants * water" iso=sample.iso -all now
mx iso=\cisis\databases\cds.iso to=5 iso=sample.iso -all now
mx seq=name iso=sample.iso -all now

When MX carries out one or more processes that modify records (whether read from a database, ISO-2709 file or a text file), these modifications are carried out in memory and do not modify the database, unless explicitly indicated. The modified records can be viewed on the screen or written to an output file. The main modification processes are:

• procedures for changing global patterns (gizmo).
• joining records (or part of them) from another database (join).
• carrying out field operations (proc).
• applying a field selection table from CDS/ISIS and aggregating the results in a file in memory (fst).

The following example displays records from the master file cds; the number of the record to display is typed at the keyboard.

• MS-DOS:

mx seq=con "join=cds='mfn='v1" "proc='D1/1D32001'"  

• UNIX:

mx seq=/dev/ttyp0 "join=cds='mfn='v1" "proc='D1/1D32001'"  

MX can carry out the record modifications on the same master file that it uses as input:

mx cds "proc='D24'" copy=cds -all now  

The example deletes field 24 from all the records of the cds database, making the changes on the same database.
MX can take its parameters from a text file, which makes it possible to overcome the following operating system limitations:

  1. A call to MX has more characters than are permitted in an operating system command line (128 characters in MS-DOS, 512 characters in UNIX).
  2. The command line typed at the keyboard contains special operating system characters.

The following example shows how to use a parameters file:

mx in=somefile  

Where the file somefile contains:

\cisis\databases\cds  
proc='D1'  
copy=\cisis\databases\cds  
-all  
now  
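
A parameter file of this kind is easy to generate from a script. The Python sketch below is an illustration only: the helper name write_param_file is invented here, and the file is written to a temporary directory so that the example does not touch any real database. It writes one parameter per line, which is the layout shown above.

```python
import tempfile
from pathlib import Path


def write_param_file(path, params):
    """Write one MX parameter per line (hypothetical helper)."""
    Path(path).write_text("\n".join(params) + "\n", encoding="ascii")


PARAMS = [
    r"\cisis\databases\cds",
    "proc='D1'",
    r"copy=\cisis\databases\cds",
    "-all",
    "now",
]

with tempfile.TemporaryDirectory() as workdir:
    param_file = Path(workdir) / "somefile"
    write_param_file(param_file, PARAMS)
    # Read the file back to confirm that each parameter is on its own line.
    written = param_file.read_text(encoding="ascii").splitlines()

print(written)
```

Because each parameter sits on its own line, no quotation marks are needed around parameters in this file, as the example file above shows.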

Syntax

MX version 5.2a, syntax table:

CISIS Interface v5.2a/PC32/L/M/32767/16/60/I - Utility MX
Copyright (c)BIREME/PAHO 2006. [http://www.bireme.br/products/cisis]
mx [cipar=<file>] [{mfrl|load}=<n>] [cgi={mx|<v2000_fmt>}] [in=<file>]
 {[db=]<db>|
 seq[/1m]=<file>|
 iso[={marc|<n>}]=<isofile> [isotag1=<tag>]|
 dict=<if>[,<keytag>[,<posttag>[/<postsperrec>]]] [k{1|2}=<key>]}
options:
 {from|to|loop|count|tell|btell}=<n>
 text[/show]=<text>
 [bool=]{<bool_expr>|@<file>} [invx=<tag101_mf>] [tmpx=<tmp_mf>]
 gizmo=<gizmo_mf>[,<taglist>] [gizp[/h]=<out_mfx>] [decod=<mf>]
 join=<mf>[:<offset>][,<taglist>]=<mfn=_fmt>
 join=<db>[:<offset>][,<taglist>]=<upkey_fmt> [jmax=<n>]
 jchk=<if>[+<stwfile>]=<upkey_fmt>
 proc=[<proc_fmt>|@<file>]
 D{<tag>[/<occ>]|*}
 A<tag><delim><data><delim>
 H<tag> <length> <data>
 <TAG[ <stripmarklen>[ <minlen>]]><data></TAG>
 S[<tag>]
 R<mf>,<mfn>
 G<gizmo_mf>[,<taglist>]
 Gsplit[/clean]=<tag>[={<char>|words|letters|numbers|trigrams}]
 Gsplit=<tag>=6words[/if=<if>]
 Gload[/<tag>][/nonl][=<file>]
 Gmark[/<tag>]{/<elem>|/keys|/decs|/<mf>,<otag>[,<ctag>]}=<if>
 Gmarx[/<tag>]/<elem>[@<att>="x"] =<tag>[:&[<att>]|/c[=224]|/i]
 Gdump[/<tag>][/nonl][/xml][=<file>]
 =<mfn>
 X[append=]<mf>
 convert=ansi [uctab={<file>|ansi}] [actab={<file>|ansi}]
 fst[/h]={<fst>|@[<file>]} [stw=@[<file>]]
 [mono|mast|full] {create|copy|append|merge|updatf}=<out_mf>
 [out]iso[={marc|<n>}]=<out_isofile> [outisotag1=<tag>]
 fullinv[/dict][/keep][/ansi]=<out_if> [maxmfn=<n>|master=<mf>]
 ln{1|2}=<out_file> [+fix[/m]]
 fix=<out_file> tbin=<tag>
 tab[/lines:100000/width:100/tab:<tag>]=<tab_fmt>
 {prolog|pft|epilog}={<diplay_fmt>|@<file>} [lw={<n>|0}]
 {+|-}{control|leader|xref|dir|fields|all} mfrl now

MX takes the parameters in the order shown in the table. The initialization (setup) parameters should come first, followed by the source of the input data, and finally the optional processing parameters. There are some exceptions, which are pointed out in the manual; for example, btell= should go before bool=.

Parameters. General description

If you enter the name of the MX program without parameters, a menu of all possible options and a brief description of their use is displayed, as is shown in the previous section.

Initialization parameters (setup)

When one or more of the optional initialization parameters (files, mfrl, fmtl, load) are present, they should be placed before any other parameter.

Parameters that indicate the database source

A mandatory parameter that indicates the database source (database name, ISO-2709 file or text file) should be the first parameter, unless initialization parameters are present, in which case it should be entered immediately after them.

Parameters for processing data

Optional parameters that carry out tasks on the input data. In the command line these follow the parameter that indicates the source.

Note: By default, MX assumes that each string of characters that follows the input source and does not begin with a reserved word (from, to, join, etc.) is a search expression.

Processing parameters can be classified as:

Parameters for searching records

With these parameters you define a subgroup of source data on which to perform a task. This subgroup can be defined by:

  • A search (bool)
  • A free-text search expression (text)
  • A range of records (whose limits are indicated with from, to)
  • Number of records (count)
  • Repeating for each record until a condition is met (loop)

Parameters that carry out processes

These are parameters which call internal processes that carry out tasks in memory on a set of records. These tasks could be:

  • Carry out global changes (gizmo)
  • Join records (join)
  • Compare master files with inverted files (jchk)
  • Carry out modifications in the record fields (proc)
  • Apply the field selection table (fst) to records
  • Apply formats to the records (pft)

Note: The order of execution for these processes is: gizmo, join and/or jchk, proc, fst and pft.