CISIS – Utility Programs
Content
- Acronym Description
- MX Utility
- Utilities for the master file
- Utilities for inverted files
- Installation of the CISIS utilities
- Execution of the utilities
- Syntax conventions
- MX Utility
The CISIS Utilities are a group of programs developed in the C programming language that call functions offered by the CISIS Interface in order to carry out different functions on the Isis family of databases, such as finding and displaying records, maintenance of databases, etc. They can also carry out special functions that allow you to arrange master file records, generate tables from a master record, change the field tags etc.
This group of utility programs is offered in four versions: 10/30, 16/60, LIND and FFI. The main differences are in the length of the inverted file keys and the maximum supported record size measured in bytes, as shown in the following table.
Characteristic | 10/30 | 16/60 | LIND | FFI |
---|---|---|---|---|
Inverted file key length | 30 | 60 | 60 | 60 |
Maximum record size (bytes) | 32,767 | 32,767 | 32,767 | 1,048,576 |
Note: version 10/30 is the only one compatible with CDS/ISIS from UNESCO
For more details on the structure of the master and inverted files see the Appendix: Structure of ISIS database records.
The particular characteristics of these programs can be verified in the version declaration, which you can obtain with the what parameter.
For example:
mx what
CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MX
CISIS Interface v5.2a/.iy0/Z/4GB/GIZ/DEC/ISI/UTL/INVX/B7/FAT/CIP/CGI/MX
Copyright (c)BIREME/PAHO 2006. [http://www.bireme.br/products/cisis]
Acronym Description
- v5.2a : Version number
- PC32 : Computer used (in this case Windows PC)
- L : Lind version if it is present
- M : Multi-user support
- 32767 : Maximum size of the record in bytes, value by default
- 10/30 : Inverted file keys
- I : Permission to update the I/F
- Utility MX : Name of the program
- .iy0 : Single physical file for I/F (made by mkiy0)
- Z : Compressed I/F (made by myz, discontinued)
- 4GB : Maximum size of the master file
- GIZ : Gizmo
- DEC : Decod
- ISI : Iso-2709 import
- UTL : Ciutl module
- INVX : Multiple I/F searching
- B7 : Version of the internal search mechanism
- FAT : Fatal()
- CIP : Cipar()
- CGI : Supports operation in a CGI environment
- MX : C isis_mx()
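The first line of the version declaration can be taken apart mechanically. The following Python sketch is only an illustration: the function name parse_what and the assumed layout (version, platform, optional flags, record size, two key lengths, separated by slashes) are inferred from the sample lines in this section, not from a CISIS specification.

```python
def parse_what(line):
    """Split the first line of a `what` declaration into named parts.

    Assumes the layout seen in this document's samples, e.g.
    'CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MX'.
    """
    head, _, utility = line.partition(" - Utility ")
    tokens = head.replace("CISIS Interface ", "", 1).split("/")
    # After version and platform, numeric tokens are sizes, the rest are flags.
    numbers = [int(t) for t in tokens[2:] if t.isdigit()]
    flags = [t for t in tokens[2:] if not t.isdigit()]
    return {
        "version": tokens[0],
        "platform": tokens[1],
        "lind": "L" in flags,
        "multi_user": "M" in flags,
        "if_update": "I" in flags,
        "max_record_size": numbers[0],
        "key_lengths": (numbers[1], numbers[2]),
        "utility": utility.strip(),
    }


info = parse_what("CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MX")
print(info["version"], info["key_lengths"], info["utility"])
```

A declaration that does not follow the sampled layout (for example one with fewer numeric parts) would make this sketch fail, so it should be treated as a reading aid rather than a parser for every CISIS build.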
MX Utility
MX
The MX Program is a general purpose utility for working with MicroISIS databases. It can carry out most of the CISIS Interface
functions, including the import/export of ISO-2709 records, searches,
processing global changes of patterns, joining records from a master
file by record number or key from the inverted file, incorporating fields
with data generated through a field selection table (FST), and functions
relating to the editing of fields.
Utilities for the master file
MXF0
Analyzes all the records of a given master file, producing information on fields present and the characters used in them.
MXTB
The MXTB program counts the content of the fields, for example, number of times that each author appears, each descriptor, or the
combination of an author and title of the publication, etc.
The result of running MXTB is a master file that contains a record for
each different phrase found (category). These records have fields for
storing the category and its frequency.
MXCP
Copies records from an input master file to an output master file, allowing the input data to be modified by global editing and/or procedures that suppress spaces at the beginning and end, blank spaces, non-printable characters and final punctuation characters. It also converts fields that contain a specific delimiter into repeatable fields and it can discard input fields, according to the values of their tags. Another characteristic of MXCP is the recovery (undelete) of records deleted from a master file.
MSRT
Sorts the records of a master file in ascending order, according to keys that are generated by applying a format to the records.
RETAG
This program has two functions:
- Change the tags of the fields in a given master file, according to a renumbering table.
- Unlock a master file.
CTLMFN
Displays and updates the master file control record. It can be used when a master file is reinitialized by accident.
MKXRF
A program for restoring the master file, that reads the .mst file and creates the corresponding .xrf file.
It can be used to restore all the active records in a master file, which
has been logically reinitialized.
I2ID
Reads a master file and generates an ASCII file that can be edited and modified.
It works together with the ID2I utility that carries out the opposite
function: reads an ASCII file and converts the data into a master file.
ID2I
Reads an ASCII file generated by I2ID (or with the same structure as a file generated by this) and converts those data into master file records.
CRUNCHMF
Converts the master file from one operating system to another, for example from Windows to Linux.
Utilities for inverted files
IFKEYS
Displays the terms from the inverted file and the number of postings of each of them.
Optionally the terms can be selected by the tags they were extracted
from.
IFLOAD
Loads an inverted file from the link file, according to the processing options. It accepts other formats as well as the standard CDS/ISIS link
file.
MYS
Sorts the link file in order to create the inverted file.
IFMERGE
Combines various inverted files from different master files into a single inverted file, with a procedure to recover the records from the source
master files.
MKIY0
Combines the six files that make up the inverted file into a single file.
CRUNCHIF
Converts the inverted file from one operating system to another, for example from Windows to Linux.
Installation of the CISIS utilities
The installation of the CISIS utilities consists of creating a directory, usually \CISIS\SYS\, and copying all the utilities into this.
For convenience it is possible to add the \CISIS\SYS directory to the operating system PATH, so that it is possible to run the utilities from the location you are in, without having to reference the \CISIS\SYS directory.
Examples
The examples are mostly based on the database CDS, and presume that it is located in the directory:
\CISIS\DATABASES\
They are carried out on the database and often will modify it, therefore it is advisable to make a backup copy.
Execution of the utilities
The CISIS programs are executed by a command entered at the operating system prompt, or from batch files in MS-DOS or shell scripts in UNIX.
Any program that uses CISIS can be executed by entering its name and one or more parameters, if the directory \cisis\sys (the directory where the CISIS utilities are found) is included in the system PATH list. If you do not provide parameters in the command call, each utility program displays a brief description of its use.
For example, typing only the name MXCP at the DOS prompt displays:
CISIS Interface v5.2a/PC32/M/32767/10/30/I - Utility MXCP
Copyright (c)BIREME/PAHO 2006. [http://www.bireme.br/products/cisis]
mxcp {in=<file>|<dbin>} [create=]<dbout> [<option> [...]]
options:
{from|to|loop|count|tell|offset}=<n>
gizmo=<dbgiz>[,<tag_list>]
undelete
clean [mintag=1] [maxtag=9999]
period=.[,<tag_list>]
repeat=%[,<tag_list>]
log=<filename>
Ex:
mxcp in create=out clean period=.,3 repeat=;,7
in = 3 « Field 3 occ 1. »
3 «Field 3 occ 2 . »
7 « Field 7/1;Field 7/2 ;Field 7/3.»
out = 3 «Field 3 occ 1»
3 «Field 3 occ 2»
7 «Field 7/1»
7 «Field 7/2»
7 «Field 7/3.»
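The effect of the clean, period and repeat options in the example above can be imitated in a few lines of Python. This is a minimal sketch that reproduces only the sample shown: the function name mxcp_like, the order in which the steps are applied, and the reading of clean as trimming leading and trailing spaces are assumptions, not a description of how MXCP is implemented.

```python
def mxcp_like(fields, period_tags=(3,), repeat_char=";", repeat_tags=(7,)):
    """Imitate `clean period=.,3 repeat=;,7` on a list of (tag, value) pairs."""
    out = []
    for tag, value in fields:
        # repeat=;,7 : split the field into occurrences at each delimiter
        parts = value.split(repeat_char) if tag in repeat_tags else [value]
        for part in parts:
            part = part.strip()  # clean: drop leading and trailing spaces
            # period=.,3 : drop a final period, only for the listed tags
            if tag in period_tags and part.endswith("."):
                part = part[:-1].rstrip()
            out.append((tag, part))
    return out


record_in = [
    (3, " Field 3 occ 1. "),
    (3, "Field 3 occ 2 . "),
    (7, " Field 7/1;Field 7/2 ;Field 7/3."),
]
for tag, value in mxcp_like(record_in):
    print(tag, "«" + value + "»")
```

As in the MXCP example, the final period of field 7 is kept because period= is restricted to tag 3.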
The parameters are entered as a list separated by blank spaces; for that reason, each individual parameter should be put in quotation marks when it contains blank spaces or any special system character (such as angle brackets, pipe, etc.).
The following example executes the MX program with three parameters (database name, search expression and a display format specification):
mx \cisis\databases\cds "plants*water" "pft=mfn/,'Ti: 'v24/,(|Au: |v70/)"
In order to use quotation marks as part of a parameter, each one should be preceded by a backslash:
mx \cisis\databases\cds "plants*water" "pft=mfn/, \" Ti: \"v24/,(|Au: |v70/)"
Note: The dollar sign, apostrophe, asterisk, question
mark, semicolon, and other characters that have
special significance in UNIX systems should also
be put in quotation marks.
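When a command line is generated by a script rather than typed, the quoting can be delegated to a library. The sketch below uses Python's standard shlex module to quote each parameter for a UNIX shell. The helper name unix_command_line and the UNIX-style path /cisis/databases/cds are assumptions made for the illustration.

```python
import shlex


def unix_command_line(program, *parameters):
    """Join a program name and its parameters, quoting each parameter
    so that a POSIX shell passes it to the program unchanged."""
    return " ".join([program] + [shlex.quote(p) for p in parameters])


line = unix_command_line(
    "mx",
    "/cisis/databases/cds",               # assumed UNIX location of the cds database
    "plants*water",                       # search expression
    "pft=mfn/,'Ti: 'v24/,(|Au: |v70/)",   # display format
)
print(line)

# Splitting the line the way a POSIX shell would gives back the parameters.
print(shlex.split(line)[1:])
```

shlex.quote follows POSIX shell rules only, so this approach does not apply to the MS-DOS prompt.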
Syntax conventions
The following conventions are used to describe the syntax of the CISIS Utility Programs:
- <parameter> : mandatory parameter
- [<parameter>] : optional parameter
- {<option 1>|<option 2>} : choose between <option 1> and <option 2>
- <option> [...] : <option> can be repeated
Note: Some parameters are reserved words and, if they
are used, they should be typed exactly as indicated,
including upper and lower case.
For example, MXCP has the general syntax:
mxcp <dbin> [create=]<dbout> [<option> [...]]
options:
{from|to|loop|count|tell|offset}=<n>
gizmo=<dbgiz>[,<tag_list>]
undelete
clean [mintag=1] [maxtag=9999]
period=.[,<tag_list>]
repeat=%[,<tag_list>]
log=<filename>
showing that two parameters are mandatory: (a) name of the input database and (b) name of the output database.
Then, the command:
mxcp \cisis\databases\cds newcds
copies the master file cds located in the directory \cisis\databases to the master file newcds. The master file newcds must already exist, otherwise an error will be produced.
If newcds does not exist you can create it using the optional parameter create as can be seen in the following example:
mxcp \cisis\databases\cds create=newcds
The processing options are indicated with the optional parameters. To indicate, for example, the range of records to process, use from and to:
mxcp \cisis\databases\cds create=newcds from=10 to=20
MX Utility
MX is a general use program for CDS/ISIS databases that carries out most of the CISIS Interface functions. Similarly to the other CISIS utility programs, MX is executed from the operating system command line, indicating the operations to carry out with parameters.
MX is used, for example, to search for and show a set of database records, according to a search expression and a display format, as in the following line:
mx \cisis\databases\cds "plants * water" "pft=mfn,x1,v24/"
Also, MX allows free-text searches even if an inverted file does not exist.
MX can also read ISO-2709 files or ASCII files that use delimiters as field separators. In these cases the input records are converted to master file records as they are read.
The following procedures can be applied to the input records:
- Global change of patterns.
- Joining of records, by record number or inverted file key.
- Add fields with data generated by a field selection table.
- Import and export of records, specified through a format language.
Records processed by MX can be sent to a master file, an ISO-2709 file or to a standard output (which can be directed to a file or printer). Lines produced by a format can be sent to the operating system.
The execution of MX can generate a call to the operating system so that a certain program is run.
The result of applying a Field Selection Table (FST) to a master file can be sent to a link file or combined into an inverted file.
The output file can be the same as the input.
MX also works in multi-user environments.
General description
In order to run MX it is necessary to specify where the data it will work on are located. You can provide a master file, an ISO-2709 file or a text file. This is the only mandatory parameter for the program.
The line:
mx \cisis\databases\cds
generates an on-screen listing of the cds database, which is found in the \cisis\databases directory. The records are displayed without formatting.
Other processing parameters can be specified, for example:
mx \cisis\databases\cds from=10 to=20
presents the records 10 to 20 of the database cds on the screen. The database can be found in the directory \cisis\databases and the records are displayed without formatting.
The command line
mx \cisis\databases\cds from=10 to=20 "pft=mfn,x1,v24(0,7)/"
displays records 10 to 20 from the cds database on screen, applying the format specified in the parameter pft=mfn,x1,v24(0,7)/. The database is found in the \cisis\databases directory.
It is important to bear in mind that the order in which the optional parameters are entered does not affect the execution of MX: they are executed in the order in which they appear in the syntax declaration, not in the order in which they are typed.
So the previous line could be:
mx \cisis\databases\cds to=20 "pft=mfn,x1,v24(0,7)/" from=10
mx \cisis\databases\cds "pft=mfn,x1,v24(0,7)/" from=10 to=20
The following declarations are equivalent:
mx \cisis\databases\cds pft=@file1 proc=@miproc.prc
mx \cisis\databases\cds proc=@miproc.prc pft=@file1
mx \cisis\databases\cds gizmo=gizfile1 proc=@miproc.prc
mx \cisis\databases\cds proc=@miproc.prc gizmo=gizfile1
In both of the last two lines the gizmo parameter is applied before the proc parameter, because the execution order is fixed by the syntax.
If the first parameter is a database and its corresponding inverted file is available, the set of records to be processed can be obtained through a search.
The following example returns cds database records, that are found in the \cisis\databases directory, containing the words plants and water.
mx \cisis\databases\cds "plants * water"
MX can read input data from an ISO-2709 file or a delimited text file.
The following line displays the first five records of an ISO-2709 file called cds.iso, that is found in the \cisis\databases directory.
mx iso=\cisis\databases\cds.iso to=5
In the next example MX uses an ASCII file called name as its input source, whose content is:
Author 1|title 1|^aParis^bUnesco^c1965
|title 2|^aParis^bUnesco^c1965
Author 3|title 3|^aParis^bUnesco^c1965
This can be executed with the following call to MX:
mx seq=name "pft=mfn,c11,v1,c21,v2,c31,v3/" now
That generates the output:
000001 Author 1 Title 1 ^aParis^bUnesco^c1965
000002 Title 2 ^aParis^bUnesco^c1965
000003 Author 3 Title 3 ^aParis^bUnesco^c1965
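The way a delimited line becomes a record, and the way the format places each field at a fixed column, can be imitated in Python. This is a rough sketch and not MX's own logic. It assumes that fields are numbered from 1 in the order they appear, that an empty field is simply absent from the record, and that cN moves the output to column N. The function names are hypothetical.

```python
def split_delimited(line, delimiter="|"):
    """Map a delimited line to {tag: value}; empty fields are left out."""
    parts = line.split(delimiter)
    return {tag: value for tag, value in enumerate(parts, start=1) if value}


def render(mfn, fields):
    """Imitate pft=mfn,c11,v1,c21,v2,c31,v3/ for a single record."""
    out = f"{mfn:06d}"                      # mfn: six digits, zero filled
    for column, tag in ((11, 1), (21, 2), (31, 3)):
        # cN: pad with spaces so that the next field starts at column N
        out = out.ljust(column - 1) + fields.get(tag, "")
    return out


lines = [
    "Author 1|title 1|^aParis^bUnesco^c1965",
    "|title 2|^aParis^bUnesco^c1965",
    "Author 3|title 3|^aParis^bUnesco^c1965",
]
for mfn, line in enumerate(lines, start=1):
    print(render(mfn, split_delimited(line)))
```

The sketch does not handle a field longer than the gap to the next column. In that case the padding is skipped and the fields run together.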
The processed records can be stored in a master file. The following lines create a master file sample:
mx \cisis\databases\cds "plants * water" create=sample -all now
mx iso=\cisis\databases\cds.iso to=5 create=sample -all now
mx seq=name create=sample -all now
These records can also be exported to an ISO-2709 file, for example sample.iso:
mx \cisis\databases\cds "plants * water" iso=sample.iso -all now
mx iso=\cisis\databases\cds.iso to=5 iso=sample.iso -all now
mx seq=name iso=sample.iso -all now
When MX carries out one or more processes that modify records (whether read from a database, ISO-2709 file or a text file), these modifications are carried out in memory and do not modify the database, unless explicitly indicated. The modified records can be viewed on the screen or written to an output file. The main modification processes are:
• procedures for changing global patterns (gizmo).
• joining records (or part of them) from another database (join).
• carrying out field operations (proc).
• applying a field selection table from CDS/ISIS and aggregating the results in a file in memory (fst).
The following example shows records from the master file cds, reading from the keyboard the number (MFN) of each record to display.
• MS-DOS:
mx seq=con "join=cds='mfn='v1" "proc='D1/1D32001'"
• UNIX:
mx seq=/dev/ttyp0 "join=cds='mfn='v1" "proc='D1/1D32001'"
MX can carry out the record modifications on the same master file that it uses as input:
mx cds "proc='D24'" copy=cds -all now
The example deletes the field tag 24 from all the cds database records, making the changes on the same database.
MX can take parameters from a text file, which makes it possible to get around the following operating system limitations:
- A call to MX has more characters than are permitted in an operating system command line (128 characters in MS-DOS, 512 characters in UNIX).
- The command line entered at the keyboard contains special operating system characters.
The following example shows how to use a parameters file:
mx in=somefile
Where the file somefile contains:
\cisis\databases\cds
proc='D1'
copy=\cisis\databases\cds
-all
now
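A parameters file of this kind is easy to generate from a script, which avoids both the length limit and the quoting problems. The Python sketch below writes one parameter per line and builds the corresponding argument list. The helper name write_parameter_file and the use of a temporary directory are assumptions for the illustration, and the command is only constructed, not run.

```python
import os
import tempfile


def write_parameter_file(path, parameters):
    """Write one MX parameter per line, as in the somefile example."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write("\n".join(parameters) + "\n")


parameters = [
    r"\cisis\databases\cds",
    "proc='D1'",
    r"copy=\cisis\databases\cds",
    "-all",
    "now",
]

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "somefile")
write_parameter_file(path, parameters)

# Argument list for the call "mx in=<file>"; it is not executed here.
command = ["mx", "in=" + path]
print(command)
```

Because each parameter sits on its own line of the file, no shell quoting is needed for the apostrophes in the proc parameter.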
Syntax
MX version 5.2a, syntax table:
CISIS Interface v5.2a/PC32/L/M/32767/16/60/I - Utility MX
Copyright (c)BIREME/PAHO 2006. [http://www.bireme.br/products/cisis]
mx [cipar=<file>] [{mfrl|load}=<n>] [cgi={mx|<v2000_fmt>}] [in=<file>]
{[db=]<db>|
seq[/1m]=<file>|
iso[={marc|<n>}]=<isofile> [isotag1=<tag>]|
dict=<if>[,<keytag>[,<posttag>[/<postsperrec>]]] [k{1|2}=<key>]}
options:
{from|to|loop|count|tell|btell}=<n>
text[/show]=<text>
[bool=]{<bool_expr>|@<file>} [invx=<tag101_mf>] [tmpx=<tmp_mf>]
gizmo=<gizmo_mf>[,<taglist>] [gizp[/h]=<out_mfx>] [decod=<mf>]
join=<mf>[:<offset>][,<taglist>]=<mfn=_fmt>
join=<db>[:<offset>][,<taglist>]=<upkey_fmt> [jmax=<n>]
jchk=<if>[+<stwfile>]=<upkey_fmt>
proc=[<proc_fmt>|@<file>]
D{<tag>[/<occ>]|*}
A<tag><delim><data><delim>
H<tag> <length> <data>
<TAG[ <stripmarklen>[ <minlen>]]><data></TAG>
S[<tag>]
R<mf>,<mfn>
G<gizmo_mf>[,<taglist>]
Gsplit[/clean]=<tag>[={<char>|words|letters|numbers|trigrams}]
Gsplit=<tag>=6words[/if=<if>]
Gload[/<tag>][/nonl][=<file>]
Gmark[/<tag>]{/<elem>|/keys|/decs|/<mf>,<otag>[,<ctag>]}=<if>
Gmarx[/<tag>]/<elem>[@<att>="x"] =<tag>[:&[<att>]|/c[=224]|/i]
Gdump[/<tag>][/nonl][/xml][=<file>]
=<mfn>
X[append=]<mf>
convert=ansi [uctab={<file>|ansi}] [actab={<file>|ansi}]
fst[/h]={<fst>|@[<file>]} [stw=@[<file>]]
[mono|mast|full] {create|copy|append|merge|updatf}=<out_mf>
[out]iso[={marc|<n>}]=<out_isofile> [outisotag1=<tag>]
fullinv[/dict][/keep][/ansi]=<out_if> [maxmfn=<n>|master=<mf>]
ln{1|2}=<out_file> [+fix[/m]]
fix=<out_file>
tbin=<tag>
tab[/lines:100000/width:100/tab:<tag>]=<tab_fmt>
{prolog|pft|epilog}={<diplay_fmt>|@<file>} [lw={<n>|0}]
{+|-}{control|leader|xref|dir|fields|all} mfrl now
MX takes the parameters in the order shown in the table: first the initialization (setup) parameters, then the source of the input data, and finally the optional processing parameters. There are some exceptions, which are pointed out in the manual; for example, btell= should go before bool=.
Parameters. General description
If you enter the name of the MX program without parameters, a menu of all possible options and a brief description of their use is displayed, as is shown in the previous section.
Initialization parameters (setup)
When one or more of the optional initialization parameters (files, mfrl, fmtl, load) are present, they should be placed before any other parameter.
Parameters that indicate the database source
A mandatory parameter that indicates the database source (database name, ISO-2709 file or text file). It should be the first parameter, unless initialization parameters are present, in which case it should be entered immediately after them.
Parameters for processing data
Optional parameters that carry out tasks on the input data. In the command line these follow the parameter that indicates the source.
Note: By default, MX assumes that each string of characters that is found from the input source and that does not begin with a reserved word (from, to, join, etc.) is a search expression.
Processing parameters can be classified as:
Parameters for searching records
With these parameters you define a subgroup of source data on which to perform a task. The method of defining this subgroup can be by:
• A search (bool)
• A free-text search expression (text)
• A range of records (whose limits are indicated with from, to)
• Number of records (count)
• Repeating for each record until a condition is met (loop)
Parameters that carry out processes
These are parameters which call internal processes that carry out tasks in memory on a set of records. These tasks could be:
• Carry out global changes (gizmo)
• Join records (join)
• Compare master files with inverted files (jchk)
• Carry out modifications in the record fields (proc)
• Apply the field selection table (fst) to records
• Apply formats to the records (pft)
Note: The order of execution for these processes is: gizmo, join and/or jchk, proc, fst and pft.
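The fixed execution order can be pictured with a short Python sketch that takes the parameters as typed and lists the processes in the order stated in the note. It is only an illustration of that note: the names EXECUTION_RANK and execution_sequence are hypothetical, and giving join and jchk the same rank (so that they keep their typed order relative to each other) is an assumption.

```python
# Rank of each process according to the note: gizmo, join and/or jchk, proc, fst, pft.
EXECUTION_RANK = {"gizmo": 0, "join": 1, "jchk": 1, "proc": 2, "fst": 3, "pft": 4}


def process_name(parameter):
    """Return the keyword of a parameter, e.g. 'fst' for 'fst/h=@cds.fst'."""
    return parameter.split("=", 1)[0].split("/", 1)[0]


def execution_sequence(parameters):
    """List the process parameters in execution order, whatever order they were typed in."""
    processes = [p for p in parameters if process_name(p) in EXECUTION_RANK]
    # sorted() is stable, so parameters with equal rank keep their typed order.
    return sorted(processes, key=lambda p: EXECUTION_RANK[process_name(p)])


typed = ["pft=@file1", "proc=@miproc.prc", "gizmo=gizfile1", "from=10"]
print(execution_sequence(typed))
```

Parameters that are not processes, such as from=10, are left out of the result because the note does not assign them a position.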