EDIT: editor and recode module

Overview

The EDIT module is used to edit data. It has its own specialized syntax.

The following facilities are provided:

data editing        use text-editor like commands to correct and
                    check data
CORR                correction sub-processor

General commands
----------------
EDIT          change the current variable
EXIT Q QUIT   leave the editor
VAR           information on the current variable
HELP          on-line help
SAVE          save current state
UNDO          undo changes
COMPUTE       compute variable attributes
SET           control EDIT settings
CLDES         clear descriptor
STATISTICS    show statistics on current variable

Management
----------
CREATE        create a new variable
REORDER       reorder variables
SORT          sort cases

Add/delete cases
----------------
AGROUP        add a new group
DGROUP        delete a group
ICAS          insert cases
DCAS          delete cases

Other
-----
RECODE        recode
PUT           conditional recode
COPY          conditional copy
MOVE          data matrix modification
CONSTRUCT     data matrix modification



EDIT
        EDIT v1
EDIT is a special module within EDA grouping a number of WA editing and other transformation tasks, requiring special treatment (different syntax, additional security and the like).

The editor module is entered with v1 as the current variable. (The EDIT command within the editor changes the current variable.)

Working modes
The editor works in two different modes. In edit mode the editor is used as a special module, i.e. the user leaves the editor only when all the editing is done (meanwhile the normal EDA commands are not allowed). In immediate mode all the editor commands and keywords are preceded by "/"; only one edit command is executed and control returns to EDA. Immediate mode is especially useful when only one edit command is needed. Some EDIT commands are not available in immediate mode because they work from a copy of the WA, and this copy exists only when the editor is entered in edit mode.

    Commands not available in immediate mode
    ----------------------------------------
    SAVE        UNDO      MOVE      CONSTRUCT
    REORDER     COPY      PUT       SET
    RECODE

The EDA edit and recode module allows you to correct cases and the information attached to variables, blocks and files, to recode, and to add and delete cases and groups in existing variables. It also performs various transformations affecting single variables and/or whole WAs.

Within EDA the editor is a separate module, i.e. it performs its own syntax analysis. The general syntactical definitions hold also for the EDIT commands, but their use may differ from the normal command mode. Some special constructs have also been added.

As in normal EDA mode, a ? (question mark) followed by the command name displays short syntactical information on that command. A HELP command displays an overview.

There are different types of EDIT commands:

Keyword type commands

This type of command always applies to the current variable, except for the commands regarding blocks (WAs) and files. The current variable may be changed with the EDIT n command within the editor. Typing in a keyword prints the information about the item. An equal sign '=' followed by a value (number or variable) or a string replaces the old value. The equal sign must be the very next line after the keyword.

These commands have to be used carefully, because they allow you to modify important information stored with variables, blocks and files, which is used to control the program flow. Modifying the number of cases, minima and maxima, as well as block and variable types might yield surprising results. As a general rule, a WA in which this important information has been modified should never be stored on a system file for later use. The following keywords are legal editor keywords. The HELP keyword prints all available EDIT commands.

EDIT keywords


     keyword        item
     -------        ----
WA/file information:
     WADESCRIPTOR   WA descriptor
     WALABEL        WA name (label)
     GVAR cas#      value of GVAR for cas#
     CASID cas#     case identifier

Current variable editing:
     DESC           variable descriptor *)
     LABEL          variable label *)
     TABLE          tie to table
     CENTER         estimate of center (default=median)
     MIN            minimum **)
     MAX            maximum **)
     NCAS           n of cases **)
     TYPE           type of variable **)

Case edit commands (details below):
     VAL [cas#]     case cas#
     P <k>          display cases
     N <k>          advance case pointer
     L val          locate value

*)  See also TED LABEL
**) Caution: Use carefully as these modify vital information. Incorrect use may cause loss of data.

Case edit commands

The keyword commands VAL, P, N, L and the additional command CORR together form a powerful case editor system. They share a pointer to the current case, which is manipulated explicitly or implicitly by these commands. This pointer is set to 1 initially and when changing the current variable.

The VAL [cas#] command displays the current case or the case specified by cas# (and the pointer is set to cas#).

The N (Next) command (default <k> is 1) sets the pointer to case p+<k>, where p is the current case, and <k> a positive or negative number. Positioning of the pointer beyond the limit of the variable (less than 1, or greater than N) sets the pointer to 1.

The P command, if <k> is unsigned, works like the VAL command, i.e. it displays either the current case or the case specified by <k>. If a sign is present it works like the N command, except that the cases between the current case and case p+<k> are displayed.

L searches for <val>, starting from the case following the pointer and continuing to the Nth (last) case. If the value is found the pointer is set to that case, otherwise it remains unchanged (No find). If no <val> is given, the last <val> is used (initially zero). The comparison tolerance parameter (FUZZ) also applies to the L (locate) command (see below).

All these commands may be followed by the = sign to modify the case pointed to by the current case pointer. In normal EDIT mode these three commands (P, L and N) are repeated automatically when a command line starting with a blank is entered.

While in edit mode cases modified by any command are marked with a '*', until you leave the editor, or do a SAVE command.
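The pointer rules for N and L can be illustrated with a small Python sketch. It is not part of EDA: the function names are hypothetical, cases are numbered from 1 as in the text, and the fuzz comparison is assumed to be "absolute difference not greater than FUZZ", which the manual does not spell out.

```python
def advance(pointer, n_cases, k=1):
    """Model of N <k>: move the pointer by k cases (k may be negative).

    A target outside 1..n_cases resets the pointer to 1, as described above.
    """
    target = pointer + k
    return target if 1 <= target <= n_cases else 1


def locate(values, pointer, val, fuzz=0.0):
    """Model of L val: search from the case after the pointer to the last case.

    Returns the case number of the first match, or the unchanged pointer
    when nothing is found ("No find"). The <= comparison is an assumption.
    """
    for case in range(pointer + 1, len(values) + 1):
        if abs(values[case - 1] - val) <= fuzz:
            return case
    return pointer
```

For instance, with five cases, advancing from case 5 by one wraps the pointer back to case 1, and a failed search leaves the pointer where it was.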



CORR


  CORR
The CORR command invokes the case correction processor, which reads case correction lines and processes them. With the exception of two special characters, all input to CORR is considered a correction statement.
     /    quit CORR, return to EDIT
     ?    help, display syntax

Correction statements correct the current case for the current variable. Initially the current case/variable is the same as with all other EDIT case related commands, e.g. advancing the case pointer using the N(ext) command. Correction statements also set the current case/variable for the case related EDIT commands, i.e. on exit from CORR the current case/variable is the one set by CORR specifications.

Correction statements have the following format:

     [var#:][cas#=][val]
var# is a valid variable reference (name, number or simple expression). If var# is not present, the current variable is used. var# sets the current variable.

cas# is any valid case reference (casid, case number or simple expression). If it is absent the current case is modified and the pointer is set to the next case (1 if the current case is the last case). Otherwise the pointer is set to the case specified by cas#.

val is the value replacing the current value. If val is not specified, the last value specified (initially zero) is used.

Note that only an empty line does not modify a case (it sets the pointer to the next case).

Some examples:

  ID1=20     case 'ID1' is set to 20 on the current variable,
             current case is now the next case after 'ID1'
  190.2      the current case is set to 190.2, sets the next current
             case
  =          the current case is set to the previous used correction
             value, i.e. here 190.2, set the next current case
  <return>   no correction, set the next current case
  ID9=       set case 'ID9' to the current correction value, i.e. 190.2,
             next case is the case following 'ID9'
  VARX:ID9=-1 current variable is 'VARX', set case 'ID9' to -1
  VARZ:      new current variable is VARZ; current case is the same,
             therefore 'ID9' for that variable is set to -1.
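The statement format [var#:][cas#=][val] can be illustrated by a small Python parser. This is only a sketch of the syntax, not EDA code: the names Correction and parse_correction are hypothetical, each field is kept as raw text, and it is assumed that ':' and '=' do not occur inside variable or case references. An empty line yields None, because it is the only input that does not modify a case.

```python
from typing import NamedTuple, Optional


class Correction(NamedTuple):
    var: Optional[str]  # text before ':', or None to keep the current variable
    cas: Optional[str]  # text before '=', or None to use the current case
    val: Optional[str]  # replacement value, or None to reuse the last value


def parse_correction(line: str) -> Optional[Correction]:
    """Split one correction line into its three optional fields."""
    text = line.strip()
    if text == "":
        return None  # empty line: no correction, only the pointer advances
    var = cas = None
    if ":" in text:
        var, text = text.split(":", 1)
    if "=" in text:
        cas, text = text.split("=", 1)
    return Correction(var or None, cas or None, text or None)
```

Applied to the examples above, "VARX:ID9=-1" splits into the variable VARX, the case ID9 and the value -1, while a lone "=" gives three empty fields, i.e. the last value is applied to the current case.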

General utility edit commands



EDIT
     EDIT v1
Changes the current variable to v1, i.e. v1 becomes the variable you are currently editing. Note that you can only change to existing variables. If you need to work on a new one, use either the CREATE command or the NEW command.

EXIT


Q or QUIT
EXIT
Q
QUIT
Leave the editor and return to EDA mode.

VAR
Displays the current variable.

HELP
Lists all available edit commands.

COMP (*)
COMPUTE [NOREFERENCE or NOCENTER]
COMPUTE vlist [NOREFERENCE or NOCENTER]
Recomputes the median (default center/reference), the minimum and the maximum, i.e. the attributes stored with each variable. By default only the current variable is recomputed. You may also specify a variable list instead. This is particularly useful in situations where macros create or append to variables without recomputing the center, the minimum and the maximum.

Note that unless you specify the NOREFERENCE or NOCENTER option, COMP replaces the current center/reference value by the median of the variable, i.e. if you have stored your own center value it will be lost after a COMP command.

SAVE


Saves the current state of the WA. You may use UNDO later on, before leaving the editor, to restore that state.

SAVE is supplied to offer an UNDO option during complex edit operations. You need not use SAVE to apply the modifications when leaving EDIT. All modifications operate directly on the data matrix. SAVE cannot be used in immediate mode.

UNDO

UNDO undoes all modifications to the WA since the last SAVE or since entering the editor. UNDO cannot be used in immediate mode.
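The relation between SAVE and UNDO can be pictured as a single snapshot. The following Python sketch is an illustration only; the class EditSession and its representation of the WA as a dictionary of lists are assumptions, not the EDA implementation.

```python
import copy


class EditSession:
    """Toy model of edit mode: one working WA plus one saved snapshot."""

    def __init__(self, wa):
        self.wa = wa                       # modifications act directly on this
        self._saved = copy.deepcopy(wa)    # state saved on entering the editor

    def save(self):
        """SAVE: replace the snapshot with the current state."""
        self._saved = copy.deepcopy(self.wa)

    def undo(self):
        """UNDO: restore the state of the last SAVE (or of editor entry)."""
        self.wa = copy.deepcopy(self._saved)
```

Because there is only one snapshot in this model, an UNDO after a SAVE goes back to the saved state and not to the state on entry.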

STATISTICS


STATISTICS [Start=case#] [End=case#]
Displays the minimum, the maximum and the sum for the current variable.

These statistics are shown for all cases and for the cases between the first case and the current case.

If the Start= or End= option is present, these values are shown for the cases specified by these range options.

This command is helpful, when entering data from e.g. a printed table including totals somewhere. This command can then be used to check the data entered against these totals.
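As an illustration of what is displayed, here is a minimal Python sketch. The function name statistics and its return format are hypothetical; it also assumes that a missing Start defaults to the first case, a missing End to the last case, and that both bounds are inclusive, none of which the text states explicitly.

```python
def statistics(values, current, start=None, end=None):
    """Return min, max and sum as described for the STATISTICS command.

    values  : cases of the current variable (case 1 is values[0])
    current : current case pointer (1-based)
    """
    def summary(cases):
        return {"min": min(cases), "max": max(cases), "sum": sum(cases)}

    if start is not None or end is not None:
        lo = 1 if start is None else start          # assumed default
        hi = len(values) if end is None else end    # assumed default
        return {"range": summary(values[lo - 1:hi])}
    return {"all": summary(values), "to_current": summary(values[:current])}
```

When checking typed-in data against a printed subtotal, the "to_current" sum is the figure to compare after positioning the pointer on the last case covered by that subtotal.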

SET

        SET    |   FUZZ=val
               |   ALL [OFF]
               |   STATUS
               |   CASID  [NOMOD]
               |   CASID  [ONLY]
Controls parameters and switches for some groups of editor commands. FUZZ sets a precision criterion for the recodification commands (see the section on recodes). ALL controls the commands which modify either a single variable or the whole WA. CASID controls the AGROUP, DGROUP, ICAS and DCAS commands. STATUS displays the current status of the switches.

CLDESCR
     CLDESCRIPTOR
(Clear descriptor.) Modifications of a variable (calculations, recodes, etc.) mark its descriptor with a modification stamp in positions 46, 47 and 48: *r* for recode, *t* for calculations, etc. This command clears these stamps from all descriptors in the WA.

This command does exactly the same as the LABEL CLEAR command in normal EDA mode.

CREATE


        CREATE Var=v# N=ncases [Const=value]
Creates a new variable v# having N cases (all values set to 0, or to value if Const=value is present). This command is needed, e.g. to create a new variable before COPYing it: if the target variable does not exist, the editor will refuse to change the current variable.

Data entry



NEW


APPEND
Two EDIT commands allow you to enter new data or to append cases to existing variables.
    NEW <vlist> [N=ncas]

APPEND <vlist> N=ncas

The NEW command creates a number of variables specified by <vlist>. If N= is not specified and the WA is rectangular, that N is used, otherwise N is required. The command queries the value(s) for each case separately (compare to *READ) and asks after N cases have been input for the variable label(s) and descriptor(s) for each variable.

The APPEND command does the same, but appends N cases (N is then required) to the existing variables specified by the <vlist>. (Compare this with the NEWVAR facility.)

Variable and WA oriented commands

This group of commands deals with whole variables or the complete WA. The commands may apply to the whole work area or to single variables only, controlled by the ALL switch. The default settings are as follows: the switch is set if the work area is rectangular, and cleared if the WA is not rectangular. The SET command is used to alter these settings: SET ALL sets the switch ON, and SET ALL OFF turns it off.

REORDER
        REORDER  vlist [A=start#]
Reorders the WA: The variables (old order) in the vlist are copied into sequential positions starting with start# (default the first variable). Variables lying between start# and start# plus the number of variables on the list are deleted, unless they figure in the vlist. Variables outside this range are not modified.

This command does not work in immediate mode, because reordering is done from the saved state of the WA (which is saved either by the SAVE command or upon entering the editor). This also implies that the user may modify the WA using any editor command; the reordered variables (vlist) will still be copied from the saved version.
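One way to read the REORDER rules is as a sequence of copies from the saved WA into the current WA. The Python sketch below follows that reading; the function reorder and the list-of-variables representation are hypothetical, and variables are numbered from 1 as in the text.

```python
def reorder(current, saved, vlist, start=1):
    """Copy saved variables named in vlist into positions start, start+1, ...

    current : variables of the working WA (variable 1 is current[0])
    saved   : variables of the saved WA, same layout
    vlist   : variable numbers in their old order
    Positions outside the target range keep their current content.
    """
    result = list(current)
    for offset, var in enumerate(vlist):
        result[start - 1 + offset] = saved[var - 1]
    return result
```

In this reading a variable inside the target range that is not on the list is simply overwritten, which corresponds to "deleted" in the description, and a variable that was edited after the last SAVE is copied in its saved form.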

SORT



        SORT [GVAR | CASID]
Sorts the whole WA or a single variable using the current variable as sort key. If only one variable is sorted the casids are not altered, whereas if the whole WA is sorted the sequence of the casids is modified accordingly.

With rectangular WA in ALL mode two additional options are available: sort the WA on the CASID or on the GVAR.
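The difference between sorting the whole WA and sorting a single variable can be sketched in Python as follows. The function sort_wa is hypothetical, the WA is assumed to be rectangular and held as a dictionary of equally long lists, and ascending order is an assumption since the text does not state the sort direction.

```python
def sort_wa(variables, casids, key, all_mode=True):
    """Sort by the variable named key (the current variable).

    all_mode=True  : every variable and the casids follow the new order
    all_mode=False : only the key variable is sorted, casids stay as they are
    """
    column = variables[key]
    order = sorted(range(len(column)), key=lambda j: column[j])
    if not all_mode:
        result = dict(variables)
        result[key] = [column[j] for j in order]
        return result, list(casids)
    result = {name: [col[j] for j in order] for name, col in variables.items()}
    return result, [casids[j] for j in order]
```

Sorting a single variable breaks the correspondence between its values and the casids, which is why the two modes are worth distinguishing before use.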

Add & Delete cases

The following commands are sensitive to the CASID switch setting as well as, like the preceding commands, to the ALL switch. These commands are used to add or delete cases, but also to solve some editing problems where the sequence of the cases and/or case identifications is wrong.

CASID [NOMOD]
Normally case identifications are always modified, i.e. adapted to the new situation (added or deleted cases). If you wish to edit cases only, e.g. if you have read 27 cases instead of 26 but your casids are correct, you set the CASID switch to NOMOD. It can also be used to alter a series of cases without altering the casids.

CASID [ONLY]
In some instances you might wish to edit only the casids and not the cases. Then you specify the CASID ONLY switch.

The default setting of these switches is OFF, i.e. casids and cases are modified, even when the ALL switch is off.

The cases and casids are solicited from the terminal. Cases are solicited for each variable separately.

Commands


AGROUP
        AGROUP  N=ncas [G=group#] [FILE {"fnam"}]
Adds new cases to existing variables, starting at the end (=n) of the existing variable(s). If no group number is specified, no group numbers are inserted (default group 0, if a GVAR is active). The N parameter (number of cases to add) must be specified. If FILE is present, the cases are read from an external file named ADDFILE. The same rules as for the RAWIN file apply to the ADDFILE. If FILE is specified, the variables must be contiguous; an empty variable in the middle of non-empty variables stops the input. A GVAR need not be stored for this command to function.

DGROUP
  DGROUP G=group#
The specified group is dropped from the WA or from the current variable only. A GVAR must be stored for this command to work. Outside the editor DELETE GROUP=groupnum performs the same function.

ICAS
  ICAS  [A=cas#]
Insert a case after the position specified by the A parameter. If A is not present, the case is added at the end of the variable, i.e. the new case is case n+1.

DCAS
  DCAS A=cas#
Delete the case specified by A from the work area. Outside EDIT DELETE CASE=case performs the same function.
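The positional rules of ICAS and DCAS for a single variable can be sketched in Python. The function names are hypothetical, the handling of casids and of the CASID switch is deliberately left out, and case numbers are 1-based as in the text.

```python
def icas(values, new_value, after=None):
    """Model of ICAS [A=cas#]: insert a case after position `after`.

    Without `after` the case is added at the end, i.e. it becomes case n+1.
    """
    pos = len(values) if after is None else after
    return values[:pos] + [new_value] + values[pos:]


def dcas(values, case):
    """Model of DCAS A=cas#: delete the case with the given number."""
    return values[:case - 1] + values[case:]
```

Both functions return a new list, so the cases following the insertion or deletion point are renumbered implicitly.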

Recodifications

Note that the commands described here are somewhat obsolete, as the IF command does most of this better. Remember that the current variable is edited and recoded. Only conditions may refer to other variables in the WA. Changes apply only to the current variable. The modification stamp for these commands is *r*.

General note: As these commands are usually only for specialised users performing tricky sequences of operations on data, the commands explained below do not handle labels and descriptors automatically; you will have to use the appropriate EDIT commands mentioned in this chapter.

Please remember that for most standard tasks there are easier commands available in normal EDA mode.


          SET FUZZ=val
Sets comparison tolerance (fuzz) for the RECODE, PUT and COPY commands. The default value is 0. The parameters where this tolerance applies are marked with a "**". This facility may also be used for specifying ranges instead of discrete values.

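All recodes are performed from the saved data matrix onto the current data matrix, i.e. the values tested are those of the saved state and the results are written to the current state. To re-recode a value that has already been recoded you should therefore do a SAVE command first.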

RECODE


    RECODE oldval (**) INTO newval
oldval of the current variable is recoded into newval. A range can be specified using the fuzz facility.
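The interplay of RECODE, the fuzz tolerance and the saved data matrix can be sketched in Python. The function recode is hypothetical, and the comparison "absolute difference not greater than fuzz" is an assumption about how the tolerance is applied.

```python
def recode(saved, current, oldval, newval, fuzz=0.0):
    """Model of RECODE oldval INTO newval for one variable.

    The test is made on the saved values, the result is written into the
    current values; cases that do not match keep their current value.
    """
    return [newval if abs(s - oldval) <= fuzz else c
            for s, c in zip(saved, current)]
```

Under these assumptions, recoding 1 into 2 and then 2 into 3 without an intervening SAVE leaves the former 1s at 2, because the second recode still tests the saved values. With a fuzz of 1, RECODE 5 INTO 0 behaves like a recode of the range 4 to 6.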

PUT
    PUT newval IF var#   | EQUAL=crit ** |
                         | NOTEQ=crit ** |
                         | GT=crit       |
                         | LESS=crit     |
Changes the current variable into newval, directed by the condition specified. var# is the criterion variable, which may be any variable in the WA. The possible conditions are equality (E), non-equality (N) (both optionally qualified by the preset fuzz value), greater than (G) and less than (L).

COPY
    COPY vi  [ IF  vj    | EQUAL=crit ** |
                         | NOTEQ=crit ** |
                         | GREATER=crit  |
                         | LESS=crit     | ]
Copies variable vi into the current variable. Optionally this copy can be directed by a condition. The syntax for the condition is the same as for the PUT command.
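The conditional recodes PUT and COPY share the same condition logic, which the following Python sketch illustrates. The functions matches, put and copy_var are hypothetical names, the fuzz test is assumed to be "absolute difference not greater than fuzz", and the caller decides which state (saved or current) of the criterion variable is passed in.

```python
def matches(x, cond, crit, fuzz=0.0):
    """Evaluate one condition on a value of the criterion variable."""
    if cond == "EQUAL":
        return abs(x - crit) <= fuzz   # fuzz applies (marked ** above)
    if cond == "NOTEQ":
        return abs(x - crit) > fuzz    # fuzz applies (marked ** above)
    if cond == "GREATER":
        return x > crit
    if cond == "LESS":
        return x < crit
    raise ValueError("unknown condition: " + cond)


def put(current, newval, crit_var, cond, crit, fuzz=0.0):
    """Model of PUT: set cases to newval where the condition holds."""
    return [newval if matches(x, cond, crit, fuzz) else c
            for c, x in zip(current, crit_var)]


def copy_var(current, source, crit_var=None, cond=None, crit=None, fuzz=0.0):
    """Model of COPY: copy source cases, optionally only where the condition holds."""
    if crit_var is None:
        return list(source)
    return [s if matches(x, cond, crit, fuzz) else c
            for c, s, x in zip(current, source, crit_var)]
```

Cases for which the condition does not hold keep their value in the current variable in both functions.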

Low level commands

This section contains commands that should only be used by a knowledgeable user, because they apply directly to the data matrix to allow for very flexible transformations. Checks for incorrect operations are very incomplete. These operations apply to the whole data matrix, whether there is data in a variable or not; it is the user's responsibility to check the results and to set the appropriate variable attributes.

Before explaining the operations themselves, the organization of the data matrix and the information attached to it has to be explained in more detail:

The numeric data are stored in a matrix data(i,j), where i is the variable number (max NVAR) and j the case number (max MCAS). A LABEL and a descriptor are attached to each row (variable), and a CASID to each column (case). Each variable also has three words of status (TYPE, N and TABLE), manipulated by the corresponding edit commands; of these, N (the number of cases) is essential for the following commands. Each variable also has three stored values (MIN, MAX and CENTER), which should be set either by the corresponding command or by the COMP command. The user should make sure that either the old number of cases is correct or that it is set correctly. The safest way is to use CREATE and then do a COMP when leaving the editor.

MOVE


     MOVE      |  DATA   I=iind  J=jind K=kind |
               |  LABEL  I=iind  J=jind        |
               |  DESC   I=iind  J=jind        |
               |  CASID  I=iind  J=jind        |
This command transfers data or dictionary information guided by the specified indices. MOVE cannot be used in immediate mode, because it operates from the saved state of the WA (the state is saved upon entering the editor, or by the SAVE command after a series of modifications). This also means that the user can alter the WA without affecting the operation of this command, i.e. unless a SAVE is done, the initial (unmodified) WA is used.

The first parameter field tells what to transfer; the other parameters are indices. Note that for the DATA transfer the transfer is done to the current variable, and that each transfer command transfers only one item (see the macros for repetitive executions).

The commands do the following (using the previous index notations):

  DATA:      data(icurr,iind)=datacopy(jind,kind)

             icurr      current variable
             iind       parameter 1..MCAS
             jind       parameter 1..NVAR (from variable)
             kind       parameter 1..MCAS
             data       WA matrix
             datacopy   saved state of WA

  LABEL:     label(iind)=labelcopy(jind)

             iind       parameter 1..NVAR
             jind       parameter 1..NVAR
             label      variable labels (vector)
             labelcopy  saved state of labels

  DESC:      descr(iind)=descrcopy(jind)

             same as above for descriptors

  CASID:     casid(iind)=casidcopy(jind)

             iind       parameter 1..MCAS
             jind       parameter 1..MCAS
             casid      case identifications
             casidcopy  saved state of case identifications
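The index notation of MOVE translates directly into assignments between the saved and the current state. The Python sketch below is an illustration only: move_data and move_item are hypothetical names, the matrix is held as a list of rows (one row per variable), and all indices are 1-based as in the notation of this section.

```python
def move_data(data, datacopy, icurr, iind, jind, kind):
    """MOVE DATA: data(icurr, iind) = datacopy(jind, kind).

    icurr is the current variable; the value comes from the saved matrix.
    """
    data[icurr - 1][iind - 1] = datacopy[jind - 1][kind - 1]


def move_item(target, saved, iind, jind):
    """MOVE LABEL, DESC or CASID: target(iind) = saved(jind)."""
    target[iind - 1] = saved[jind - 1]
```

Each call transfers exactly one item, so moving a whole variable or a block of labels means repeating the call with different indices.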



CONSTRUCT
   CONSTRUCT ROW=var# COL=var# I=indx J=indy [VTARG=var#]
The variable constructed is either the current variable or the variable given on the V parameter. This command creates a new variable using a double index on the data matrix (saved matrix). If the R (row index) or C (column index) parameters are equal to 0, the program uses by default the table ties as row index and the GVAR, if stored, as column index. The R/C parameters allow you to use an arbitrary variable from the WA instead. If one of the indices is 0, the data item is not used for constructing the new variable; otherwise all data items corresponding simultaneously to indx and indy are put into the current variable (old content destroyed). As for the MOVE command, the indexed data is taken from the saved matrix and put into the WA. The command also creates new case ids, which are the first two characters of the column index (i.e. casid) and the first two characters from the row indexed label. An extensive example of this command can be found in the section on data structures.