
SAS
Summary Guide

School of Applied Statistics
November, 03


Contents

1. Introduction
    1.1 Structure of a SAS Job
    1.2 SAS Language
    1.3 SAS Variables
    1.4 SAS Data Sets
2. Introduction to the DATA Step
    2.1 DATA Statement
    2.2 Sources of Input
    2.3 Input of Raw Data
    2.4 Formats: Input and Output
    2.5 How SAS Executes a DATA Step
    2.6 Transformation of Data
    2.7 Missing Values
    2.8 Modifying an Existing SAS Data Set
    2.9 Output from a SAS DATA Step
    2.10 Output to Create Stored ASCII Files
3. Introduction to the PROC Step
4. Basic Procedures
5. More on the DATA Step
    5.1 IF - THEN - ELSE Statements
    5.2 Selecting Observations
    5.3 DO and END Statements
    5.4 DO Loops
    5.5 Arrays
    5.6 RETAIN
    5.7 DROP and KEEP
    5.8 RENAME and LABEL
6. Data Management
    6.1 SET
    6.2 MERGE
    6.3 UPDATE
7. Statistical Procedures
8. Graphical Procedures
9. Output Delivery System (ODS)
10. Further Facilities
11. Publications


1. Introduction
This handout is meant as a brief introduction to the syntax of the SAS package which is
available on UNIX workstations and PC computers at The University of Reading. The SAS
language is similar for all versions but there are differences in file access and storage. This
document is designed to give a brief synopsis of many basic commands used in the Data step
and the general structure to some statistical procedures (Proc). It is, by no means, complete
and there are numerous specialised manuals published by SAS Institute (some of which are in
Room G16 in the School of Applied Statistics).

1.1 Structure of a SAS Job
A SAS program consists of a sequence of one or more steps and each step may contain
several SAS statements. There are two kinds of step:-
• The DATA step which is used to create and manipulate SAS data sets
• The PROC step which is used for analysing or processing SAS data sets
A SAS job is made up of any number of these steps. The beginning of one step signifies the
ending of the previous step.

1.2 SAS Language
SAS statements can begin in any column of a line and can be continued on subsequent lines.
Each SAS statement must end with a semicolon. The language is mainly case-insensitive (i.e.
upper and lower case can be freely mixed), although text inside quotes is case-sensitive.
There are three types of SAS statements:-
• Statements which appear in the DATA step
• Statements which appear in the PROC step
• Statements which can appear anywhere (global statements)
Comments can also be included in a SAS program and are useful for annotating it. An
asterisk at the start of a statement comments out that single statement, up to its semicolon.
e.g. * This is a comment ;

or to comment out a block of lines use the /* and */ delimiter pairs:-
e.g. /* This is a comment
which will not be acted upon by SAS */

1.3 SAS Variables
There are two types of SAS variable - numeric and character. They can have the following
attributes:-
LENGTH      numeric variables     3 - 8 bytes on UNIX and PC (2 - 8 on some mainframe systems)
            character variables   1 - 32767 bytes / characters (1 - 200 before Version 7)
INFORMAT    format SAS uses to read a data value into a variable
FORMAT      format SAS uses to write each value of a variable
LABEL       descriptive label of up to 256 characters



1.4 SAS Data Sets
A SAS data set is a collection of data values arranged in a rectangular table, the rows
representing observations and the columns representing variables. Each variable must be
given a name which consists of 1 - 32 characters. The name must start with a letter or an
underscore and can contain only letters, digits and underscores. Avoid special characters in
variable names such as . or $ . Special variables within SAS are denoted by names that begin
and end with an underscore.
SAS data sets can be either temporary or permanent. Temporary data sets are given a one-
level name by the user which is automatically prefixed with WORK. by the SAS system.
This name can be omitted altogether, in which case SAS names the data sets DATA1,
DATA2 ... for the 1st, 2nd ... data sets defined. Temporary data sets are erased on leaving the
current SAS session. Permanent data sets must be given a two-level name by the user linking
to their storage location.
e.g. LIBNAME PERM 'complete_pathname';
PROC PRINT DATA=PERM.STUDENTS;
RUN;
Permanent SAS data sets are stored differently between versions and allocated different file
extensions. However, all data sets are upward compatible. There are several words which
should not be used as the first part of the SAS data set name. These include such words as
PRINT, EXEC, DATA etc. and also SAS reserved names such as LIBRARY, MAPS, WORK
etc.
SAS automatically documents a permanent data set to include a data set label, variable
attributes and history information. The data are stored in the form in which SAS uses them,
therefore saving computer time and making it unnecessary to execute input statements each
time the data set is used.

2. Introduction to the DATA Step
2.1 DATA Statement
The DATA statement signals the beginning of the DATA step and gives a name to the SAS
data set being created. This SAS data set can be used as input to any subsequent DATA or
PROC steps.
e.g. a) DATA PERM.PATIENTS;    creates a permanent data set
     b) DATA SCHOOL;           creates a temporary data set
     c) DATA;                  creates a temporary data set with default name DATAn
     d) DATA _NULL_;           does not create a data set

2.2 Sources of Input
a) The DATALINES or CARDS statement is used when the data are in the same file as the
SAS statements:-
DATA REGRESS;
INPUT X Y Z;
DATALINES;
61 44 29
17 6 43
.
.
b) The INFILE statement is used to read data from an external file on your workdisk:-
DATA REGRESS;
INFILE 'file_identifier';
INPUT X Y Z;
The file identifier in the INFILE statement is the full pathname and filename of the external
data file, residing on your disk, which is to be linked to your SAS program.

2.3 Input of Raw Data
The INPUT statement is used to describe the raw input data. There are three types of input
mode which can be mixed in one INPUT statement:-
• LIST (or free-field)
• COLUMN
• FORMATTED

a) LIST INPUT
This mode of input simply lists the variables in the order in which they appear in the input
data
e.g. INPUT NAME $ AGE SEX $;
INPUT NAME $ Q1-Q32;
where $ is used after a variable name to indicate a character variable whose value has a
default length of 8 with no embedded blanks. Values must be separated by at least one space
(free format).
b) COLUMN INPUT
With this mode of input the columns are specified within which each variable value is located
e.g. INPUT CANNAME $ 1-15 PARTY $ 20-24 VOTES 30-40;

The data values can be read in any order and blank fields are automatically set to missing.
Embedded blanks are allowed in character data by specifying the maximum length of a value.
c) FORMATTED INPUT
This is a very flexible method of input as it is possible to read data in virtually any form. SAS
keeps track of its position on the input lines with a 'pointer'
e.g. INPUT @3 QUEST3 +10 QUEST12 / @60 RESPONSE;
There are various types of 'pointer' controls each having a different meaning. Listed below
are some of the more frequently used ones:-
@n      move the pointer to column n
+n      move the pointer forward n columns
#n      move the pointer to line n
/       move the pointer to the next line
Whichever mode of input is used, the following trailing 'pointer' controls can be used to
hold the current data line:-
@       'hold' the data line for the next INPUT statement in the current execution of the
        DATA step
@@      'hold' the data line across further executions of the DATA step
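The two trailing pointer controls can be sketched as follows. This is only an illustration: the data set names PAIRS and ADULTS, the variables and the data values are all made up for the example.

```sas
/* Double trailing @: three observations are read from one data line */
DATA PAIRS;
   INPUT X Y @@;
   DATALINES;
61 44  17 6  25 31
;
RUN;

/* Single trailing @: read a record type from column 1, hold the line, */
/* then decide whether to read the rest of it                          */
DATA ADULTS;
   INPUT TYPE $ 1 @;
   IF TYPE = 'A' THEN INPUT @3 NAME $ AGE;
   ELSE DELETE;
   DATALINES;
A Smith 34
C Jones 9
A Patel 51
;
RUN;
```

In the second step only the records whose first column is A are kept. A line held by a single trailing @ is released automatically when the next execution of the DATA step begins.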

2.4 Formats: Input and Output
A set of directions for reading a value is called an INFORMAT and a set of directions for
printing a value is called a FORMAT. It is possible to specify formats for numeric and
character variables and also date and time variables. There is a large number of FORMAT
and INFORMAT specifications; refer to SAS Language Reference Version 8 for further
information.
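As a small sketch of the difference between the two, the step below reads a date with the DATE9. informat and prints it with the DDMMYY10. format. The data set VISITS and its variables are hypothetical names chosen for the example.

```sas
DATA VISITS;
   /* The colon applies the DATE9. informat while still using list input */
   INPUT PATNO VISIT : DATE9. COST;
   /* Formats control how the stored values are displayed */
   FORMAT VISIT DDMMYY10. COST 8.2;
   DATALINES;
101 03NOV2003 25.5
102 17DEC2003 110
;
RUN;

PROC PRINT DATA=VISITS;
RUN;
```

VISIT is stored as a SAS date value (a number of days); the FORMAT statement only changes how it is written, for example as 03/11/2003 for the first observation.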

2.5 How SAS Executes a DATA Step
A DATA step that reads data is executed once for each observation (or input record) read. A
DATA step that does not contain an INPUT, SET, MERGE or UPDATE statement is executed
once. The SAS variable _N_ is automatically generated for each DATA step; its value is the
number of times that SAS has begun executing the step (_N_ is not written to the data set,
so it is not directly available outside the current DATA step). All variables referred to in
the DATA step, for example the variables named in the INPUT statement and any new
variables generated, make up the program data vector.
For each execution of the DATA step:-
• The program data vector is initialised to missing.
• The data values of the current observation are read using the INPUT statement. Any
new variables are computed and added to the program data vector and any variables not
wanted are dropped.
• The values in the program data vector are then added to the data set being created
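The cycle described above can be watched in the SAS log with a PUT statement. The sketch below uses a made-up data set SQUARES with made-up values.

```sas
DATA SQUARES;
   INPUT X;
   X2 = X * X;
   /* Writes one line to the SAS log for each execution of the step */
   PUT _N_= X= X2=;
   DATALINES;
2
5
9
;
RUN;
```

The step executes once per data line, so the log shows _N_ increasing from 1 to 3, while the data set SQUARES contains only X and X2.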

2.6 Transformation of Data
There is a range of standard functions available in SAS for transforming data. For a full list
of these functions consult the SAS Language Reference. Manipulation and transformation of
data is carried out in the DATA step with the resulting variable being added to the data set
automatically.
e.g. SUM=X + Y;
X2=X * X; or X2=X**2;
LX=LOG(X);

2.7 Missing Values
Variables with missing values on input are specified in SAS by a full stop or a blank field.
On output numeric variables are displayed as a full stop and character variables as a blank
field. For numeric variables it is also possible to specify up to 27 special missing value
symbols ( A - Z and _ ) to distinguish between different kinds of missing data. This is done
using the MISSING statement:-
DATA;
INPUT X;
MISSING A B;
IF X = 99 THEN X = .A;
IF X = 999 THEN X = .B;
CARDS;

a) .A is used to distinguish from the variable name A
b) A variable is set to missing if the input field contains only a full stop or is blank.
c) A variable is set to missing if the input field contains an illegal character
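The MISSING statement also allows the special symbols to appear directly in the raw data. A minimal sketch, assuming a hypothetical survey in which A stands for "absent" and R for "refused":

```sas
DATA SURVEY;
   MISSING A R;          /* A and R in numeric fields are special missing values */
   INPUT ID ANSWER;
   DATALINES;
1 4
2 A
3 R
4 .
;
RUN;

/* The MISSING option shows each kind of missing value as its own row */
PROC FREQ DATA=SURVEY;
   TABLES ANSWER / MISSING;
RUN;
```

Here ANSWER is stored as 4, .A, .R and the ordinary missing value, so the different reasons for missing data remain distinguishable.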

2.8 Modifying an Existing SAS Data Set
Once data have been read into a SAS data set it is possible to modify that data in other DATA
steps while keeping the original data set unchanged and without having to re-input the data
from the raw data file. This is easily done by transferring data from the existing SAS data set
into another one.
e.g. DATA NEW;
SET PERM.PATIENTS;
DOSE=PILL_A*QTY_A;
Each time the SET statement is executed another observation is transferred from the existing
SAS data set PERM.PATIENTS to the SAS data set being created, which is called NEW.

2.9 Output from a SAS DATA Step
OUTPUT statements allow you to control when an observation is written to one of the SAS
data sets which are currently being created.
e.g. OUTPUT;
OUTPUT MISSDATA;
When an OUTPUT statement is executed SAS will immediately output the current values to
the named or current SAS data set. OUTPUT statements are useful for:-
a) Creating 2 or more observations from 1 record of input data
b) Combining several observations into one observation
c) Creating more than one SAS data set from one input file
e.g. DATA HARV1 HARV2;
SET COMPLETE;
IF HARVEST=1 THEN OUTPUT HARV1;
IF HARVEST=2 THEN OUTPUT HARV2;
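Use a) above, creating several observations from one input record, might look like the sketch below. The data set LONG and the variables SUBJECT, T1-T3, TIME and SCORE are invented for the illustration.

```sas
DATA LONG;
   INPUT SUBJECT T1 T2 T3;
   /* Each OUTPUT writes the current values as a separate observation */
   TIME = 1; SCORE = T1; OUTPUT;
   TIME = 2; SCORE = T2; OUTPUT;
   TIME = 3; SCORE = T3; OUTPUT;
   DROP T1 T2 T3;
   DATALINES;
1 10 12 15
2 8 9 11
;
RUN;
```

Two input records give six observations. Once an OUTPUT statement appears anywhere in a DATA step, SAS no longer writes an observation automatically at the end of each execution.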



2.10 Output to Create Stored ASCII Files
The FILE and PUT statements are used within a DATA step and are analogous to the INFILE
and INPUT statements. The FILE statement links SAS to a specific external file, while the
PUT statement specifies the output record format.
e.g. DATA CREATE;
SET CLASSNO;
FILE 'file_identifier';
PUT NAME $ 1-8 SEX $ 11 AGE 13-14;

3. Introduction to the PROC Step
Some of the procedures available in SAS are:-
Basics:      CHART, CONTENTS, CORR, DATASETS, FORMAT, FREQ, MEANS,
             PLOT, PRINT, SORT, SUMMARY, TABULATE, TRANSPOSE,
             UNIVARIATE
Statistics:  ANOVA, CANCORR, CANDISC, CLUSTER, DISCRIM, FACTOR, GLM,
             PRINCOMP, REG, TTEST
Graph:       GCHART, GCONTOUR, GMAP, GPLOT, GSLIDE, G3D, G3GRID

SAS procedures analyse and process SAS data sets as follows:-
a) Read SAS data sets
b) Perform the requested task
c) Print results
d) Create SAS output data sets (optional)

Most SAS procedures have default option settings for the more common situations or
analyses. However, information can be given to the PROC step to specify:-
a) Which data set to process
b) Which variables to process
c) Whether to process the data in subsets
The PROC statement is used to begin a procedure.
e.g. PROC MEANS DATA=PERM.PATIENTS MEAN STD;

Some of the more commonly used statements within the PROC step are:-
a) General statements common to many procedures
VAR       Specifies variables to be analysed
ID        Specifies a variable whose values identify observations in the SAS data set
BY        Specifies that the data set is to be processed in groups
          N.B. The data set must have already been sorted in the order of the current
          BY group.
WEIGHT    Specifies a variable whose values are the relative weights for the observations
WHERE     Subsets observations to be analysed based on specified criteria

b) Statements specific to individual procedures
TABLES Table request in PROC FREQ
PLOT Plot request in PROC PLOT
MODEL Model specification in PROC ANOVA, PROC GLM, PROC REG etc.

c) Statements describing variable attributes
FORMAT Specifies formats for printing variable values
LABEL Associates descriptive labels with variable names

Lists of names can be abbreviated:-
a) Range of variables (in data set order)    VAR SEX -- TEMP;
b) Numeric suffix range                      VAR Q1 - Q20;
c) Range of numeric variables only           VAR AGE-NUMERIC-TEMP;
d) Range of character variables only         VAR NAME-CHARACTER-SEX;
e) All numeric variables                     VAR _NUMERIC_;
f) All character variables                   VAR _CHARACTER_;

4. Basic Procedures
PROC CHART
This procedure produces horizontal and vertical bar charts, pie charts, star charts and block
charts for numeric and character variables. The charts can represent frequencies and
cumulative frequencies, percentages and cumulative percentages, sums and means.

PROC CHART DATA = data_set_name options ;
HBAR variable_list ; produces horizontal bar chart
VBAR variable_list ; produces vertical bar chart
PIE variable_list ; produces pie chart
STAR variable_list ; produces star chart
BLOCK variable_list ; produces block chart
BY variable_list ;



PROC CORR
This procedure computes correlation coefficients between variables. Various univariate
statistics are also computed.

PROC CORR DATA = data_set_name options ;
VAR variable_list ;
WITH variable_list ;
WEIGHT variable ;
FREQ variable ;
BY variable_list ;

PROC FORMAT
This procedure is used to define formats that attach labels to variable values on output.
Formats can be defined for either numeric or character variables. They can be used in
PUT statements in a DATA step and in FORMAT statements in a PROC step. They can also be
used in FORMAT statements in a DATA step, in which case they are then associated with
the variable for the remainder of the SAS job, unless changed.

PROC FORMAT options ;
VALUE format_name value1 = label1
value2 = label2
. .
valuen = labeln ;

format_name   Must be a unique SAS name, which must begin with a $ for character variables
values        Can be a single number or a range of numbers, or several numerical or
              character values
labels        Can contain a maximum of 40 characters and must be enclosed in quotes

e.g. PROC FORMAT;
VALUE $SEXFMT 'M' = 'Male' 'F' = 'Female';
VALUE AGEFMT 1 - 16 = 'Child' 17 - High = 'Adult';

The formats defined above can be used in other procedures as follows:-
PROC PRINT DATA = PERM.PATIENTS;
VAR SEX AGE;
FORMAT SEX $SEXFMT. AGE AGEFMT. ;

NB. The full stop after SEXFMT and AGEFMT is essential

PROC FREQ
This procedure produces 1 - way to n - way frequency tables of character and numeric
variables.

PROC FREQ DATA = data_set_name options ;
WEIGHT weighting_variable ;
BY variable_list ;
TABLES table_request / options ;

In the TABLES specification the values of the last variable form the columns and the values
of the second last variable form the rows.
e.g. TABLES VAR1; one - way table
TABLES VAR1 * VAR2; two - way table

PROC MEANS
This procedure is used to produce simple univariate statistics for numeric variables. The
options available allow you to specify which statistics you want calculated e.g. mean,
standard deviation, minimum. If no statistics are specifically requested in the MEANS
statement, then variable name, N, mean, standard deviation, minimum, maximum are
printed automatically.

PROC MEANS DATA = data_set_name options ;
BY variable_list ;
VAR variable_list ;
ID variable_list ;
FREQ variable ;
OUTPUT OUT = output_data_set_name statistics ;
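A sketch of PROC MEANS with BY-group processing and an output data set is given below. It assumes that PERM.PATIENTS contains the variables SEX and AGE, as in the earlier PROC PRINT example; the names SORTED, AGESTATS, MEANAGE and SDAGE are invented here.

```sas
LIBNAME PERM 'complete_pathname';

/* BY-group processing needs the data sorted by the BY variable */
PROC SORT DATA=PERM.PATIENTS OUT=SORTED;
   BY SEX;
RUN;

PROC MEANS DATA=SORTED N MEAN STD;
   BY SEX;
   VAR AGE;
   /* One observation per BY group, holding the named statistics */
   OUTPUT OUT=AGESTATS MEAN=MEANAGE STD=SDAGE;
RUN;
```

The statistics listed on the PROC statement control what is printed, while the keywords on the OUTPUT statement control what is stored in AGESTATS.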



PROC PLOT
This procedure produces line-printer plots for both numeric and character variables. Various
options are available for specifying the plotting symbol, scaling the axes, drawing reference
lines, superimposing 2 or more plots and drawing contour plots.

PROC PLOT DATA = data_set_name options ;
PLOT vertical_variable * horizontal_variable / options ;
BY variable_list ;

PROC PRINT
This procedure prints the values in a SAS data set.

PROC PRINT DATA = data_set_name options ;
BY variable_list ;
VAR variable_list ;
ID variable_list ;
PAGEBY variable ;
SUM variable_list ;
SUMBY variable ;

PROC SORT
This procedure rearranges the observations in an existing SAS data set or creates a new data
set containing the rearranged observations. Multiple sorting groups can be specified and
variables can be sorted in ascending or descending order.

PROC SORT DATA = data_set_name OUT = output_data_set_name options ;
BY variable_list ;

Variables are automatically sorted in ascending order; for descending order put
DESCENDING before each such variable name in the BY statement. The SORT procedure should
always be used when subsequent procedures process the data set in groups using the BY
statement. It is possible to process a data set without sorting it beforehand by using the
NOTSORTED option on the BY statement of the procedure being used. However, SAS
assumes that consecutive observations with the same BY value are grouped together although
the BY values are not necessarily sorted in alphabetic or numeric order.
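For example, the following sketch sorts by one variable in ascending order and by a second in descending order, then prints the result in groups. It assumes a temporary data set STUDENTS containing SEX and AGE, as used in section 6.1; RANKED is a made-up name.

```sas
/* Ascending by SEX, and within each SEX from oldest to youngest */
PROC SORT DATA=STUDENTS OUT=RANKED;
   BY SEX DESCENDING AGE;
RUN;

/* The sorted data set can now be processed in BY groups */
PROC PRINT DATA=RANKED;
   BY SEX;
RUN;
```

Because OUT= is given, STUDENTS itself is left in its original order.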



PROC SUMMARY
This procedure produces a SAS data set containing statistics similar to the MEANS
procedure, but much more efficiently. PROC SUMMARY does not produce any printed
output and the data does not have to be sorted in order to produce subgroup statistics. An
OUTPUT and a VAR statement must be specified, and any number of OUTPUT statements
can be used. The VAR statement must precede the OUTPUT statements.

PROC SUMMARY DATA = data_set_name options ;
CLASS variable_list ;
VAR variable_list ;
BY variable_list ;
FREQ variable ;
ID variable_list ;
OUTPUT OUT = output_data_set_name statistics ;

PROC TABULATE
This procedure provides a more flexible alternative to the FREQ procedure for producing
tables. Each cell in the table contains a descriptive statistic e.g. mean, standard deviation,
etc. TABULATE will generate tables defined by the TABLE statement. Classification
variables must be specified with the CLASS statement, while the variables to be tabulated i.e.
whose values are to be the cell contents must be specified by the VAR statement. Each
expression in the TABLE statement defines the categories for the table's dimensions - page,
row and column.

PROC TABULATE DATA = data_set_name options ;
CLASS variable_list ;
VAR variable_list ;
BY variable_list ;
FREQ variable ;
FORMAT variable_list format ;
LABEL variable = 'label' ;
TABLE page_expression, row_expression, column_expression ;
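A minimal TABULATE sketch, again assuming the STUDENTS data set of section 6.1 with the variables SEX, WEIGHT and AGE:

```sas
PROC TABULATE DATA=STUDENTS;
   CLASS SEX;                 /* classification variable: defines the rows */
   VAR WEIGHT AGE;            /* analysis variables: fill the cells        */
   /* Rows are the levels of SEX; columns are N, MEAN and STD of each variable */
   TABLE SEX, (WEIGHT AGE)*(N MEAN STD);
RUN;
```

The comma in the TABLE statement separates the row expression from the column expression, and the asterisk crosses each analysis variable with each statistic.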



PROC TRANSPOSE
This procedure transposes data sets, changing observations into variables and variables into
observations. An output data set is created automatically and named according to the
DATAn convention if a name is not specified.

PROC TRANSPOSE DATA = data_set_name options ;
VAR variable_list ;
ID variable ;
IDLABEL variable ;
COPY variable_list ;
BY variable_list ;
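The effect of the VAR and ID statements can be seen in a small sketch. The data sets SCORES and TSCORES and their contents are hypothetical.

```sas
DATA SCORES;
   INPUT NAME $ TEST1 TEST2 TEST3;
   DATALINES;
Ann 61 70 65
Bob 55 48 72
;
RUN;

PROC TRANSPOSE DATA=SCORES OUT=TSCORES;
   VAR TEST1 TEST2 TEST3;   /* each of these becomes an observation      */
   ID NAME;                 /* values of NAME become the new variable names */
RUN;

PROC PRINT DATA=TSCORES;
RUN;
```

TSCORES has one observation per test, with the variables Ann and Bob plus the automatic variable _NAME_ recording which original variable each row came from.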

5. More on the DATA Step
5.1 IF - THEN - ELSE Statements
These statements are used to execute a further SAS statement conditional on some
expression.

IF expression THEN statement;
ELSE statement ;

THEN statement is executed if expression is non zero, non missing or true
ELSE statement is executed if expression is zero, missing or false

There are eight relational operators:-
LT or <      LE or <=     GT or >      GE or >=
NL or ~<     NG or ~>     EQ or =      NE or ~=

In addition there are three logical operators:-
NOT or ~     AND or &     OR or |

e.g. DATA ;
/* LENGTH stops FEMALE being truncated to the 4 characters of MALE */
LENGTH SEX $ 6 ;
IF CODE = 1 OR CODE = 2 THEN SEX = 'MALE' ;
ELSE SEX = 'FEMALE';

e.g. DATA ;
INPUT AGE ;
IF 0 < AGE < 10 THEN AGEGRP = 1 ;
IF 10 <= AGE < 19 THEN AGEGRP = 2 ;
IF AGE >= 19 THEN AGEGRP = 3 ;

Any observations with values not included in one of the categories will produce missing or
blank values.
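The same grouping can also be written as a chain of ELSE IF statements, so that later conditions are only tested when the earlier ones fail. This is a sketch with made-up data; unlike the example above it places zero and negative ages in group 1, and it tests for missing values first because SAS treats a missing value as smaller than any number.

```sas
DATA GROUPS;
   INPUT AGE;
   IF AGE = . THEN AGEGRP = .;        /* keep missing ages missing */
   ELSE IF AGE < 10 THEN AGEGRP = 1;
   ELSE IF AGE < 19 THEN AGEGRP = 2;
   ELSE AGEGRP = 3;
   DATALINES;
4
15
.
37
;
RUN;
```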

5.2 Selecting Observations
If not all observations are to be included in the data set being created they can be excluded by
the DELETE statement or the subsetting IF statement. The DELETE statement stops the
processing of an observation:-

e.g. DATA MALES ;
INPUT AGE SEX $ ;
IF SEX = 'F' THEN DELETE ;

The subsetting IF statement allows an observation to pass if the expression is true:-

e.g. DATA MALES ;
INPUT AGE SEX $ ;
IF SEX = 'M' ;

The result from both of the above DATA steps is the same.

5.3 DO and END Statements
DO statements specify that any statements following the DO are to be executed until a
matching END appears.

e.g. DATA ;
INPUT AGE SEX $ FAMILY $ ;
IF SEX = 'F' THEN DO ;
AGE = AGE - 5 ;
FAMILY = 'NEW' ;
END ;
ELSE AGE = AGE + 3 ;

5.4 DO Loops
DO loops allow a range of statements, within a DATA step, to be repeated either a specified
number of times or while a specified condition holds.

DO variable = start TO stop ;
DO variable = start TO stop BY increment ;
DO WHILE (expression) ;
DO UNTIL (expression) ;
DO OVER array_name ;

Each must have a matching END statement to terminate execution.

e.g. DO N = 1 TO 20 ;
DO N = 1 TO 20 BY 4 ;
DO WHILE (N < 20) ;
DO UNTIL (N = 20) ;
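Two complete loops are sketched below. The data sets GROWTH and DOUBLING, the variables and the 5% growth rate are all invented for the illustration.

```sas
/* Iterative DO: one observation is written for each of 5 years */
DATA GROWTH;
   AMOUNT = 100;
   DO YEAR = 1 TO 5;
      AMOUNT = AMOUNT * 1.05;
      OUTPUT;
   END;
RUN;

/* DO WHILE: the condition is tested before each pass through the loop */
DATA DOUBLING;
   AMOUNT = 100;
   YEARS = 0;
   DO WHILE (AMOUNT < 200);
      AMOUNT = AMOUNT * 1.05;
      YEARS = YEARS + 1;
   END;
RUN;
```

DO UNTIL differs from DO WHILE in that its condition is tested at the end of each pass, so the loop body is always executed at least once.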

5.5 Arrays
Arrays in SAS are useful for processing a lot of SAS variables in the same way.

ARRAY array_name [index_variable] array_elements ;

e.g. ARRAY A Q1 - Q5 ;
DO OVER A ;
A = LOG(A) ;
END ;

Array elements are substituted for the array name in SAS statements depending on the value
of the index variable. SAS will use its own internal index variable if none is defined. In the
example above the DO group is executed for every element in the array.
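The same transformation can be written with an explicitly subscripted array and an ordinary iterative DO loop, which is the form documented in later SAS releases. In this sketch the data set LOGGED, the array name QS and the index variable I are made-up names.

```sas
DATA LOGGED;
   INPUT Q1-Q5;
   ARRAY QS{5} Q1-Q5;
   DO I = 1 TO DIM(QS);              /* DIM returns the number of elements */
      IF QS{I} > 0 THEN QS{I} = LOG(QS{I});
      ELSE QS{I} = .;                /* the log of zero or less is undefined */
   END;
   DROP I;
   DATALINES;
1 10 100 0 5
;
RUN;
```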

5.6 RETAIN
This statement retains a variable value from the last execution of the DATA step. Normally
all variables are set to missing before each execution of the DATA step. Initial values can
also be assigned to the variables.

RETAIN variable ;
RETAIN variable initial_value ;
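A typical use is a running total. In the sketch below the data set RUNNING and the variables SALES and TOTAL are hypothetical.

```sas
DATA RUNNING;
   INPUT SALES;
   RETAIN TOTAL 0;            /* TOTAL starts at 0 and is not reset to missing */
   TOTAL = TOTAL + SALES;
   DATALINES;
10
25
5
;
RUN;
```

TOTAL takes the values 10, 35 and 40. Without the RETAIN statement it would be reset to missing at the start of each execution, and every sum would be missing.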

5.7 DROP and KEEP
The DROP statement excludes named variables from a data set or analysis and the KEEP
statement includes only named variables in a data set or analysis. Both statements can be
used in the DATA step or as data set options which appear after the data set name on PROC
steps.



e.g. DATA PERM.PATIENTS ;
DROP PATNO ;

DATA PERM.PATIENTS(DROP = PATNO) ;

PROC PRINT DATA = PERM.PATIENTS(KEEP = AGE SEX) ;

5.8 RENAME and LABEL
The RENAME statement is used to rename variables.

RENAME old_name = new_name ;

The LABEL statement assigns labels of up to 256 characters to variables.

LABEL variable = 'label' ;

6. Data Management
6.1 SET
Reads observations from 1 or more SAS data sets and can interleave observations.

a) Subset the observations      DATA FEMALES ;
                                SET STUDENTS ;
                                IF SEX = 'F' ;

b) Subset the variables         DATA SMALL ;
                                SET STUDENTS ;
                                DROP WEIGHT AGE ;

c) Add a new variable           DATA ADD ;
                                SET STUDENTS ;
                                WTKG = WEIGHT / 2.2 ;

d) Multiple output data sets    DATA MALES FEMALES ;
                                SET STUDENTS ;
                                IF SEX = 'M' THEN OUTPUT MALES ;
                                IF SEX = 'F' THEN OUTPUT FEMALES ;

e) Multiple input data sets     DATA ALL ;
   (Concatenate)                SET MALES FEMALES ;

f) Multiple input data sets     DATA ALL ;
   (Interleave)                 SET MALES FEMALES ;
                                BY NAME ;

6.2 MERGE
Combines observations from two or more SAS data sets and places them side by side.
a) One-to-one Merging
If there are the same number of observations in each data set and if the observations are in the
same order then they can be combined as shown below. The two data sets are placed side by
side in the combined data set being created.
DATA COUPLES ;
MERGE HUSBANDS WIVES;

For any duplicate variable name in the data sets, only the values of that variable from the last
named data set will be saved.

b) Match Merging
The two data sets, having already been sorted by the BY variable(s), are placed side by side,
matching observations that have the same values of the variables in the BY statement.

DATA STABLE ;
MERGE HORSE TRAINER ;
BY OWNER ;
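A fuller sketch of the match merge above, including the sorting step, is shown below. The IN= variables INHORSE and INTRAIN are hypothetical names; they are temporary flags that record whether each data set contributed to the current observation.

```sas
PROC SORT DATA=HORSE;
   BY OWNER;
RUN;

PROC SORT DATA=TRAINER;
   BY OWNER;
RUN;

DATA STABLE;
   MERGE HORSE (IN=INHORSE) TRAINER (IN=INTRAIN);
   BY OWNER;
   /* Keep only owners that appear in both data sets */
   IF INHORSE AND INTRAIN;
RUN;
```

Without the subsetting IF, owners found in only one data set are also kept, with missing values for the variables from the other data set.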

6.3 UPDATE
Updates a master file with a transaction file where the BY variable is the KEY for matching
observations.

DATA SURGERY;
UPDATE SURGERY BLOODCT;
BY PATIENT;

This should be used only when, for a master data set, there are several changes that can be
applied all in one job.
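The sketch below shows how missing values in the transaction data set are handled. The data set names follow the example above, but the variables WBC and RBC and all the values are invented.

```sas
DATA SURGERY;                 /* master: one observation per patient */
   INPUT PATIENT WBC RBC;
   DATALINES;
1 7.1 4.5
2 6.3 4.9
;
RUN;

DATA BLOODCT;                 /* transactions, already in PATIENT order */
   INPUT PATIENT WBC RBC;
   DATALINES;
2 . 5.2
3 8.0 4.1
;
RUN;

DATA SURGERY;
   UPDATE SURGERY BLOODCT;
   BY PATIENT;
RUN;
```

Patient 1 is unchanged, patient 2 keeps the old WBC (a missing transaction value does not overwrite the master) but receives the new RBC, and patient 3 is added as a new observation.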



7. Statistical Procedures
There is a wide range of statistical procedures available in SAS for carrying out such
techniques as analysis of variance and covariance, linear and non-linear regression analysis,
multivariate methods and non-parametric methods. A few examples of some of the more
widely used procedures are given below. For more details on all the procedures available for
statistical analysis, consult the appropriate manuals.

PROC ANOVA
This procedure is used to carry out an analysis of variance of balanced data (see also PROC
GLM). Many of the statements which can be used with this procedure are not necessary for
standard analyses.
Required statements, which must appear in this order:-
PROC ANOVA DATA=data_set_name options ;
CLASS variable_list ;
MODEL dependent_variables = effects / options ;

Statements which must appear before the first RUN statement:-
BY variable_list ;
ABSORB variable_list ;
FREQ variable ;

Statements which can appear after the MODEL statement and can be used interactively:-
MEANS effects / options ;
TEST H = effects E = effect ;
MANOVA H = effects E = effect M = equations / options ;
REPEATED factor_names / options ;

e.g. PROC ANOVA DATA = EXPT ;
CLASS METHOD VARIETY ;
MODEL YIELD = METHOD VARIETY METHOD * VARIETY ;
BY YEAR ;



PROC GLM
This procedure can be used to fit general linear models to data to enable statistical methods
such as analysis of variance, analysis of covariance, regression analysis (including
comparison of regressions) and multivariate analysis of variance to be carried out.
Unbalanced data and data with missing values can also be analysed using this procedure.
There are numerous statements and options available with this procedure, but most
applications only use a few of them.
Statements which must precede the MODEL statement:-
PROC GLM DATA=data_set_name options ;
CLASS variable_list ;

Required statement:-
MODEL dependent_variables = independent_variables / options ;

Statements which must appear before the first RUN statement:-
ABSORB variable_list ;
BY variable_list ;
FREQ variable ;
ID variable_list ;
WEIGHT weighting_variable ;

Statements which can appear after the MODEL statement and can be used interactively:-
CONTRAST 'label' effect_values / options ;
ESTIMATE 'name' effect_values / options ;
LSMEANS effects / options ;
MANOVA H = effects E = effect M = equations / options ;
MEANS effects / options ;
OUTPUT OUT = output_data_set_name ;
RANDOM effects / options ;
REPEATED factor_names / options ;
TEST H = effects E = effect / options ;


e.g. PROC GLM DATA = EXPT2 ;
CLASS TREAT SUBJECT TIME ;
MODEL RESP = TREAT SUBJECT(TREAT) TIME TREAT * TIME ;
TEST H = TREAT E = SUBJECT(TREAT) ;
LSMEANS TREAT TIME TREAT*TIME ;
OUTPUT OUT = NEW P = RHAT R = RESID ;



PROC TTEST
This procedure carries out a simple t-test on the means of two groups of observations. The
grouping factor specified by the CLASS statement must have only two levels.

Required statements:-
PROC TTEST DATA = data_set_name options ;
CLASS variable ;

Optional statements:-
BY variable_list ;
VAR variable_list ;

e.g. PROC TTEST DATA = EXPT5 ;
CLASS SEX ;
VAR SCORE ;

PROC NLIN
This procedure is used to fit nonlinear regression models. The model to be fitted has to be
specified, as do the parameters to be estimated, initial guesses for them, and possibly the
partial derivatives of the model with respect to each parameter. Some models are difficult to
fit and in these cases the initial guesses can be critical. There is no guarantee that the
procedure will be able to fit the model successfully.
Required statements:-
PROC NLIN DATA = data_set_name options ;
PARMS parameter = values ;
MODEL dependent_variable = expression ;

Optional statements:-
BOUNDS expressions ;
ID variable_list ;
DER.parameter = expression ;
OUTPUT OUT = output_data_set_name ;

e.g. PROC NLIN DATA = EXPT3 ;
PARMS B0 = 0.5 B1 = 0.08 ;
MODEL Y = B0*(1-EXP(-B1*X)) ;
DER.B0 = 1-EXP(-B1*X) ;
DER.B1 = B0*X*EXP(-B1*X) ;
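
Since the initial guesses can be critical, the PARMS statement also accepts a range of starting values for each parameter, and NLIN begins iterating from the best combination on that grid. The sketch below illustrates this together with BOUNDS and OUTPUT. The data set GROWTH is simulated here purely for illustration, and the grid limits are arbitrary assumptions.

```sas
/* Simulated data for illustration: true B0 = 0.6, B1 = 0.1 plus noise */
DATA GROWTH ;
  DO X = 1 TO 20 ;
    Y = 0.6*(1-EXP(-0.1*X)) + 0.01*RANNOR(123) ;
    OUTPUT ;
  END ;
RUN ;

PROC NLIN DATA = GROWTH ;
  /* grid of starting values: 4 values of B0 by 4 values of B1 */
  PARMS B0 = 0.1 TO 1.0 BY 0.3
        B1 = 0.02 TO 0.20 BY 0.06 ;
  BOUNDS B0 > 0, B1 > 0 ;              /* keep both parameters positive */
  MODEL Y = B0*(1-EXP(-B1*X)) ;
  DER.B0 = 1-EXP(-B1*X) ;              /* partial derivative w.r.t. B0 */
  DER.B1 = B0*X*EXP(-B1*X) ;           /* partial derivative w.r.t. B1 */
  OUTPUT OUT = FITTED P = YHAT R = RESID ;
RUN ;
```

A coarse grid such as this one is usually enough to avoid a poor starting point; a finer grid costs more evaluations without necessarily improving convergence.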



PROC REG
This procedure is used to fit linear regression models. There are other regression procedures
such as RSQUARE, RSREG and STEPWISE for selecting subsets of independent variables
in a multiple regression analysis, fitting quadratic response surfaces and carrying out
stepwise regression, respectively.
Required statement:
    PROC REG DATA = data_set_name options ;

Required statement for model fitting; can be used interactively:
    MODEL dependent_variables = independent_variables / options ;

Must appear before the first RUN statement:
    VAR variable_list ;
    FREQ variable ;
    WEIGHT weighting_variable ;
    ID variable ;

Can appear anywhere after a MODEL statement and can be used interactively:
    ADD variable_list ;
    DELETE variable_list ;
    MTEST equations ;
    OUTPUT OUT = output_data_set_name ;
    PLOT y_variate*x_variate ;
    REFIT ;
    RESTRICT equations ;
    REWEIGHT condition ;
    TEST equations ;


e.g. PROC REG DATA = EXPT4 ;
MODEL POP = YEAR ;
OUTPUT OUT = REGOUT P = EPOP R = RESID ;
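
The interactive statements allow a model to be modified without restarting the procedure. The following is a minimal sketch, assuming a hypothetical data set TRENDS with variables Y, X1 and X2 (all values invented). It also uses the PRINT statement of PROC REG, which is not listed in the summary above, to display the refitted model.

```sas
/* Hypothetical data for illustration */
DATA TRENDS ;
  INPUT Y X1 X2 ;
  DATALINES ;
 3.1 1 2
 4.9 2 1
 7.2 3 4
 8.8 4 3
11.1 5 6
12.9 6 5
15.3 7 8
16.8 8 7
;
RUN ;

PROC REG DATA = TRENDS ;
  VAR Y X1 X2 ;       /* list every variable that may be added later */
  MODEL Y = X1 ;      /* initial model */
RUN ;
  ADD X2 ;            /* add a regressor interactively */
  PRINT ;             /* display the refitted model */
RUN ;
  TEST X2 = 0 ;       /* test the added coefficient */
RUN ;
QUIT ;
```

The VAR statement matters here: a variable can only be brought in with ADD if it was named in VAR or in a MODEL statement before the first RUN.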

8. Graphical Procedures
Most of the procedures that produce high-quality, hard-copy graphical output work in the
same way as those mentioned in section 4. Most have names prefixed by the letter G, e.g.
GCHART and GPLOT. Additional global statements allow the user to specify more precisely
the axes, symbols, patterns, etc. used to represent the data. This topic is beyond the scope
of this Summary Guide, but information can be found in the two volumes of the SAS/GRAPH
manual. The various versions of SAS access graphics devices in different ways when
producing hard copy, so refer to the appropriate SAS Companion Guide for more complete
information.
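
As a brief illustration of the global statements, the sketch below uses SYMBOL and AXIS definitions with PROC GPLOT. It assumes SAS/GRAPH is available; the data set POPDATA and its values are invented for illustration.

```sas
/* Hypothetical data for illustration */
DATA POPDATA ;
  INPUT YEAR POP @@ ;
  DATALINES ;
1950 10.2 1960 12.1 1970 14.3 1980 15.9 1990 18.4
;
RUN ;

SYMBOL1 VALUE = DOT INTERPOL = JOIN ;       /* plot dots joined by lines */
AXIS1 LABEL = ('Year') ;                    /* horizontal axis label */
AXIS2 LABEL = (ANGLE = 90 'Population') ;   /* rotated vertical axis label */

PROC GPLOT DATA = POPDATA ;
  PLOT POP*YEAR / HAXIS = AXIS1 VAXIS = AXIS2 ;
RUN ;
QUIT ;
```

Global statements such as SYMBOL1 and AXIS1 stay in effect for later procedures in the same session until they are redefined.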



9. Output Delivery System (ODS)
Many procedures produce output data sets that can be used in further calculations, e.g.
parameter estimates from a regression analysis. However, some of the more common
procedures lacked this facility. Since version 7, the Output Delivery System (ODS) has made
it much simpler to save data sets, to produce formatted output for high-resolution printers
and to produce web-quality output using HTML.
ODS also gives more effective control over the output stream and a greater choice of
output objects that can be saved as data sets.
ODS is a vast topic with many individual statements. Each statement (shown in the next
table) has its own set of options, which are not shown here and are best described in the
manual.
Table of ODS Statements
ODS EXCLUDE    Specify output objects to exclude from ODS destinations.
ODS HTML       Open, manage, or close the HTML destination. If the
               destination is open, you can create HTML output.
ODS LISTING    Open, manage, or close the Listing destination.
ODS OUTPUT     Create a SAS data set from an output object and manage
               the selection and exclusion lists for the Output destination.
ODS PATH       Specify which locations to search for the definitions that
               were created by PROC TEMPLATE, as well as the order in
               which to search for them.
ODS PRINTER    Open, manage, or close the Printer destination. If the
               destination is open, you can create Printer output.
ODS SELECT     Specify output objects for ODS destinations.
ODS SHOW       Write to the SAS log the specified selection or
               exclusion list.
ODS TRACE      Write to the SAS log a record of each output object that is
               created, or suppress the writing of this record.
ODS VERIFY     Print or suppress a warning that a style definition or a table
               definition that is used is not supplied by SAS Institute.
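
A minimal sketch of three of these statements in use follows: ODS TRACE to discover the names of the output objects, ODS OUTPUT to save one of them as a data set, and ODS HTML to write the results to a file. The data set POPDATA, its values, the output data set name EST and the file name reg.html are all assumptions made for illustration.

```sas
/* Hypothetical data for illustration */
DATA POPDATA ;
  INPUT YEAR POP @@ ;
  DATALINES ;
1950 10.2 1960 12.1 1970 14.3 1980 15.9 1990 18.4
;
RUN ;

ODS TRACE ON ;                          /* log the name of each output object */
PROC REG DATA = POPDATA ;
  MODEL POP = YEAR ;
RUN ;
QUIT ;
ODS TRACE OFF ;

ODS OUTPUT ParameterEstimates = EST ;   /* save this object as a data set */
ODS HTML FILE = 'reg.html' ;            /* open the HTML destination */
PROC REG DATA = POPDATA ;
  MODEL POP = YEAR ;
RUN ;
QUIT ;
ODS HTML CLOSE ;                        /* close the file so it can be viewed */

PROC PRINT DATA = EST ;                 /* the saved parameter estimates */
RUN ;
```

The first PROC REG run exists only to reveal object names in the log; once a name such as ParameterEstimates is known, the tracing step can be dropped.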



10. Further Facilities
There are many more facilities in SAS in addition to those documented here.
These include:-
• A macro processing language.
• A full-screen editor (FSP) enabling data to be entered and updated. It also contains a
spreadsheet facility.
• Interactive matrix language (IML), a very powerful module for programming matrix
algebra, useful for statistical and mathematical applications.
• Time series module (ETS) for carrying out econometric and time-series analysis.

11. Publications
There is a vast range of SAS manuals for both UNIX and PC versions. They can be ordered
from:-
SAS Software Ltd.
Wittington House
Henley Road
Medmenham
Marlow
SL7 2EB

The Main Library on campus has a few manuals for reference, based on previous versions. In
addition, users of SAS at The University of Reading can read the current documentation
on-line by registering at
http://v8doc.sas.com/sashtml/

