Features and commands
EDA is a consistently designed and developed analysis environment for
interactive, exploratory research. The EDA design is
screen-oriented, not paper-oriented.
The user communicates with EDA through a simple, yet powerful command
language designed for true interactive work. On-line help and
command editing facilities are provided.
The results of a terminal session may be placed into a print
file, either completely or selectively. Print files can be reviewed
(selection of images) and pretty-printed. Additional options and a text
editor facilitate the transfer of the results to a text formatter.
Variables to be analyzed are
held in a work area (data matrix). Variables are referred to
either by name or position. Case identifiers and a
grouping variable are associated with each work area. Variables may
be tied together to form groups.
In addition to its name and
descriptor, a variable can be annotated with an attached document.
Documents (text of any length) are structured using various
levels of documentation allowing for selective search and
display. Numerical information may be embedded within documents and then
searched and extracted for analysis.
The data analysis commands offered in EDA cover three main areas (see
tables I and II for an overview of the analysis commands):
- Exploratory data analysis
- Multidimensional methods
- Cluster analysis
Data input and files
Like most systems, EDA handles raw data
files and its own files (archive files), stored in an EDA specific file
system. (A normal user is completely shielded from the operating system.)
There are facilities for data input from the interactive
terminal and dialogue-assisted building of documented system files.
Special attention has been paid to the communication with standard
packages, especially SPSS and SAS (EDA produces SPSS setups, as well as
SAS Data steps).
The PC version features a read/write interface to spreadsheet programs.
Data editing and correction
A number of data editing and
correction commands are grouped in a special module of the EDA
analysis system, the data editor, which provides more security (changes may
be undone) and offers text-editor-like commands for data editing.
Powerful commands allow for conditional and unconditional
transformations using algebraic and logical expressions. The
transformation facility is designed to encourage numerical experimentation.
In addition to the standard operations and functions (more than 100), EDA
includes statistical functions and a large number of exploratory functions.
Computations may be performed on variables, matrix rows and columns and
scalars (including individual matrix elements and scalar variables).
For instance, it is easy to replace all outliers with the median of the
corresponding variable. Transformations are easily performed on several
variables using control structures (loops).
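The outlier example can be sketched in Python. This is not EDA command syntax, which this text does not show; it is only an illustration of the idea. It assumes that an outlier is a value lying more than 1.5 quartile spreads outside the quartiles (one common fence rule), and the function name and the sample data are hypothetical.

```python
from statistics import median, quantiles


def replace_outliers_by_median(values, k=1.5):
    """Return a copy of values with outliers replaced by the median.

    Assumption: an outlier lies outside the fences
    [q1 - k * spread, q3 + k * spread], where spread = q3 - q1.
    EDA's own outlier rule may differ.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    med = median(values)
    return [v if lower <= v <= upper else med for v in values]


# Hypothetical work area: variable name -> list of observations.
data = {
    "x": [1, 2, 3, 4, 100],
    "y": [10, 11, 12, 13, 14],
}

# A loop over several variables, in the spirit of a control structure.
cleaned = {name: replace_outliers_by_median(col) for name, col in data.items()}
```

The function returns new lists, so the original variables stay available for comparison, which suits the kind of numerical experimentation described above.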
A case selection lets you analyze selected observations without
altering the data matrix. Cases may be selected based on logical
expressions, group memberships and the like.
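Case selection that leaves the data matrix untouched can be pictured as keeping a list of selected case positions next to the data. The Python sketch below illustrates that idea only; it is not EDA syntax, and the names `select_cases`, `view` and the sample variables are hypothetical.

```python
def select_cases(data, condition):
    """Return the positions of the cases for which condition is true.

    data maps variable names to equally long lists of observations;
    condition receives one case as a dict (variable name -> value).
    """
    n_cases = len(next(iter(data.values())))
    return [
        i
        for i in range(n_cases)
        if condition({name: col[i] for name, col in data.items()})
    ]


def view(data, name, selected):
    """Return the values of one variable for the selected cases only."""
    return [data[name][i] for i in selected]


# Hypothetical work area with a grouping variable.
data = {
    "age": [23, 41, 35, 62],
    "group": [1, 2, 1, 2],
    "income": [1800, 3200, 2500, 2900],
}

# Logical expression combining group membership and a numeric condition.
selected = select_cases(data, lambda case: case["group"] == 2 and case["age"] > 40)
incomes = view(data, "income", selected)
```

Because only positions are stored, dropping the selection restores the full set of cases without any copying or undo step.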
Macro and abbreviation facilities are provided. Together
with user-formatted output procedures, control structures allowing for
repetitive execution, and scalar variables, commands may be tailored to
one's needs or new commands built.
Macros may be stored in macro-archives. EDA comes with a sample macro
library, containing for instance an interface to the SPAD data analysis
The toolbox contains a series of general purpose commands used
for file handling and other data and text processing tasks, including
sorting, data checking, file concatenation, modification and many other
useful operations performed on files.
The EDA text editor is used for editing output from commands,
documents, macros, variable descriptions, case identifiers or any text
Other facilities include data aggregation, generation of
"artificial" data (random etc.), matrix manipulation (e.g.
transposition of the work area), weighting, counting, percent checking
and many other transformation tools for common tasks.
many tools are implemented: teacher-student communication, user or group
profiles, as well as user monitoring.
As EDA is a consistently developed
interactive program, a large set of commands deals with the control of
the user's environment in order to meet the specific needs of each user
or group of users. EDA contains many more commands and facilities which
cannot be adequately described in this short text.
Commands (main analysis commands)
ADDFIT fit an additive model to a table
BOXPLOT box and whisker plot with outlier display; parallel boxplots and reexpression diagnostics
BREAK (cross) break of variables (interval coding)
COMPARE compare variables
DIAGNOSTIC diagnostic routines (e.g. for assessing normality of variables)
DLINE density line
DISPLAY basic univariate statistics, including hinges, fences and Tukey's biweight; trimming options
FREQUENCY frequency tables
HISTOGRAM several forms of histograms
LINE resistant line, Tukey-line, and LSQ line
LIST data lister: numerical and coded displays (many forms); sort options for cases and variables; blanking options
LOWESS scatterplot smoothing
MAP simple cartography
MDIAG multivariate diagnostic methods
PLOT plots one, two, three or more variables; many symbol types, outlier elimination, large printer size plots, tools for detailed analysis (zooming, case identification), transformations etc.
PROFILE displays profiles of single cases or groups
REGRESS biweight multiple regression (Tukey)
REEXPRESS search for appropriate reexpression
QSUMMARY quick (numerical) summaries of variables
SHOW conditional numerical and coded displays of variables
SMOOTH free smoothing (running medians)
STEMLEAF stem and leaf display (simple, back-to-back, groupwise)
SUMMARY numerical summaries and letter values
TRACES hinge and letter value tracing
ANACOR correspondence analysis (Benzecri)
CANON canonical analysis
CFIX fit two configurations
CFIT configuration comparison (Procrustes rotation and other techniques)
FACTOR principal components and principal axes; options include Gabriel's biplot
MDS multidimensional scaling (Kruskal-Shepard)
MINISSA smallest space analysis (Guttman-Lingoes)
SCORES factor scoring
TSCALE metric dimensional scaling
CLUSTER non-hierarchical clustering (4 methods)
HIERARCHY hierarchical clustering (6 methods)
VHIERARCH hierarchical clustering on variables (10 methods)
TREE detailed analysis of the hierarchical tree
BASSOC compute binary association measures
C1,C2 analyzes the result matrices from dimensional analyses (configurations): numerical and coded displays of loadings and scores, plotting (including simultaneous plots of configurations), profiles of configurations etc.
CORRELATE compute a correlation matrix (options are variance-covariance matrix, rank transformation, robustness transformation and jackknife)
DISTANCE computes a distance matrix
GANALYSIS, GSUMMARY analyze and compare groups (numerical and coded summaries). Used for detailed analysis of a cluster analysis.
MATRIX inspect a matrix (distance, correlation): numerical and coded lists, checking, matrix manipulation
ROTATE rotates a configuration
TRACES group analysis (using boxplots)
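To give an idea of the running-median smoothing mentioned under SMOOTH in the list above, here is a minimal Python sketch of the simplest such smoother, running medians of three with the end values copied unchanged. It does not show how EDA implements or invokes SMOOTH; the function name and the sample series are hypothetical.

```python
from statistics import median


def running_median3(values):
    """Smooth a sequence with running medians of three.

    Each interior value is replaced by the median of itself and its two
    neighbours. Assumption: the two end values are copied unchanged.
    """
    if len(values) < 3:
        return list(values)
    smoothed = [values[0]]
    for i in range(1, len(values) - 1):
        smoothed.append(median(values[i - 1:i + 2]))
    smoothed.append(values[-1])
    return smoothed


# Hypothetical series with two isolated spikes (9 and 10).
series = [1, 9, 2, 3, 10, 4]
smooth = running_median3(series)
```

A median of three ignores a single isolated spike entirely, which is why running medians are called resistant smoothers, in contrast to running means.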