
SAS Basics


Fundamentals Of SAS Programming

SAS Windows
Large organisations and training institutes prefer using SAS Windows. SAS Windows
has many utilities that help reduce the time required to write code.

The following image shows the different parts of SAS Windows.

 Log Window: This is the execution window. Here you can check how your
program ran. It also displays errors, warnings and notes.
 Code Window: This window is also known as the editor window. Think of it as a
blank sheet of paper or a notepad where you write your SAS code.
 Output Window: As the name suggests, this window displays the output of the
program that you write in the editor.
 Result Window: It is an index that lists all the outputs of the programs run in
one session. Because it holds the results of a single session, it will be empty
after you close the software and restart it.
 Explorer Window: It lists all the libraries in the system. You can also
browse the system-supported files here.

SAS Data Sets


SAS data sets are also called data files. A data file consists of rows and columns. Rows
hold observations and columns hold variables.
SAS Variables
SAS has two types of variables:

 Numeric variables: This is the default variable type. These variables are used in
mathematical expressions.
 Character variables: Character variables are used for values that are not used
in mathematical expressions. They are treated as text or strings. In an INPUT
statement, a variable is read as a character variable when a '$' sign follows the
variable name.

SAS Libraries
A SAS library is a collection of SAS files that are stored in the same folder or directory on
your computer.

 Temporary Library: In this library, the data set gets deleted when the SAS
session ends.
 Permanent Library: Data sets are saved permanently. Hence, they are available
across sessions.

Users can also define their own libraries, known as user-defined libraries, by using
the keyword LIBNAME. These are also permanent libraries.
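
A minimal sketch of defining a library with LIBNAME. The library name mylib and the folder path are hypothetical; replace the path with a folder that exists on your system.

```sas
/* Assumption: /home/user/sasdata is a hypothetical folder that already exists. */
libname mylib "/home/user/sasdata";

/* A data set written with the mylib. prefix is stored in that folder,
   so it is still available in later sessions. */
data mylib.employee_info;
    input Emp_ID Emp_Name $ Emp_Vertical $;
    datalines;
101 Mak SQL
102 Rama SAS
;
run;
```

A data set created without a library prefix goes to the temporary library instead.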

SAS Programming: SAS Code Structure 


SAS programming is based on two building blocks:

 DATA Step: The DATA step creates a SAS data set and then passes the data
on to a PROC step.
 PROC Step: The PROC step processes the data.

A SAS program should follow these rules:

 Almost every program begins with either a DATA step or a PROC step
 Every SAS statement ends with a semicolon
 A step ends with the RUN or QUIT keyword
 SAS code is not case sensitive
 You can write one statement across several lines, or several statements
on one line

Now that we have seen a few basic terminologies, let us get started with SAS
programming with this basic code:
DATA Employee_Info;
input Emp_ID Emp_Name$ Emp_Vertical$;
datalines;
101 Mak SQL
102 Rama SAS
103 Priya Java
104 Karthik Excel
105 Mandeep SAS
;
Run;
In the above code, we created a data set called Employee_Info. It has three
variables: one numeric variable, Emp_ID, and two character variables, Emp_Name
and Emp_Vertical. The Run statement executes the DATA step, which creates the data set.

The image below shows the output of the above mentioned code.

Suppose you want to see the result in print view. You can do that by adding a PROC PRINT
step; the rest of the code remains the same.
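
As a sketch, the same DATA step followed by a PROC PRINT step looks like this:

```sas
data Employee_Info;
    input Emp_ID Emp_Name $ Emp_Vertical $;
    datalines;
101 Mak SQL
102 Rama SAS
103 Priya Java
104 Karthik Excel
105 Mandeep SAS
;
run;

/* Print every observation of the data set created above. */
proc print data=Employee_Info;
run;
```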
We just created a data set and saw how the PRINT procedure works. Now, let us take
the above data set and use it for further programming. Let's say we want to add each employee's
date of joining to the data set. So we create a variable called DOJ, add it to the input and print
the result.
If DOJ is read as a numeric variable, a value such as 18/08/2013 is not a valid number,
so DOJ is set to missing. How do we solve this problem? One way is to follow the
DOJ variable with '$'. This makes DOJ a character variable, so the date values are
read as text and can be printed. Let us make the changes to the code and see the output.

DATA Employee_Info;
/* :$10. reads DOJ as text up to 10 characters long; a bare $ would keep only the first 8 */
input Emp_ID Emp_Name$ Emp_Vertical$ DOJ :$10.;
datalines;
101 Mak SQL 18/08/2013
102 Rama SAS 25/06/2015
103 Priya Java 21/02/2010
104 Karthik Excel 19/05/2007
105 Mandeep SAS 11/09/2016
;
Run;
PROC PRINT DATA=Employee_Info;
Run;
The output screen will display the following output.

SAS- INPUT METHODS


The input methods are used to read raw data. The raw data may come from an
external source or from in-stream datalines. The INPUT statement creates a variable with
the name that you assign to each field, so every variable must be named in the INPUT
statement. The same variables appear in the resulting SAS data set. Below are the
different input methods available in SAS.

 List Input Method
 Named Input Method
 Column Input Method
 Formatted Input Method
Each input method is described below.

List Input Method


In this method the variables are listed with their data types. The raw data must be
checked carefully so that the order of the declared variables matches the data. The delimiter
(usually a space) should be uniform between any pair of adjacent columns. Any missing
value will cause problems, because the values that follow it are read into the wrong variables.
Example
The following code and the output show the use of the List Input Method.
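
A minimal sketch of list input, reusing the employee values from earlier in this document. The data set name list_input is an illustrative choice.

```sas
data list_input;
    /* Fields are read in order and are separated by blanks. */
    input Emp_ID Emp_Name $ Emp_Vertical $;
    datalines;
101 Mak SQL
102 Rama SAS
103 Priya Java
;
run;

proc print data=list_input;
run;
```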

On running the above code we get the following output.

Named Input Method


In this method the variables are listed with their data types, each followed by an
equals sign. The raw data is written so that every value is preceded by its variable
name, in the form name=value. The delimiter (usually a space) should be uniform
between any pair of adjacent columns.
Example
The following code and the output show the use of Named Input Method.
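
A minimal sketch of named input. The data set name named_input is an illustrative choice; each value in the datalines carries its variable name.

```sas
data named_input;
    /* The = after each variable name requests named input. */
    input Emp_ID= Emp_Name= $ Emp_Vertical= $;
    datalines;
Emp_ID=101 Emp_Name=Mak Emp_Vertical=SQL
Emp_ID=102 Emp_Name=Rama Emp_Vertical=SAS
Emp_ID=103 Emp_Name=Priya Emp_Vertical=Java
;
run;

proc print data=named_input;
run;
```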

On running the above code we get the following output.

Column Input Method


In this method the variables are listed with their data types and the range of columns
that holds each value. For example, if an employee name
contains at most 9 characters and each employee name starts at the 10th column, then
the column range for the employee name variable will be 10-18.
Example
Following code shows the use of Column Input Method.
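
A minimal sketch of column input. The column ranges below are assumptions that match how the sample datalines are laid out: ID in columns 1-3, name in columns 5-11, vertical in columns 13-17.

```sas
data column_input;
    /* Each variable is followed by the columns that hold its value. */
    input Emp_ID 1-3 Emp_Name $ 5-11 Emp_Vertical $ 13-17;
    datalines;
101 Mak     SQL
102 Rama    SAS
103 Priya   Java
104 Karthik Excel
105 Mandeep SAS
;
run;

proc print data=column_input;
run;
```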

Formatted Input Method


In this method each variable is read from a fixed starting column, and an informat
after the variable name gives the width of the field. As every variable has a fixed
starting point, the number of columns between the starting points of any pair of adjacent
variables is the largest width the first variable can have. The pointer control
'@n' is used to specify the starting column position of a variable as the nth column.
Example
The following code shows the use of the Formatted Input Method.
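
A minimal sketch of formatted input. The starting columns and the informat widths (3., $7., $5.) are assumptions chosen to match the sample datalines.

```sas
data formatted_input;
    /* @n moves the pointer to column n; the informat sets the field width. */
    input @1 Emp_ID 3. @5 Emp_Name $7. @13 Emp_Vertical $5.;
    datalines;
101 Mak     SQL
102 Rama    SAS
103 Priya   Java
104 Karthik Excel
105 Mandeep SAS
;
run;

proc print data=formatted_input;
run;
```
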
SAS OPERATORS
An operator in SAS is a symbol used in a mathematical, logical or comparison
expression. These symbols are built into the SAS language, and several operators can
be combined in a single expression to give a final result.
SAS operators fall into the following categories.

 Arithmetic Operators
 Logical Operators
 Comparison Operators
 Minimum/Maximum Operators
 Concatenation Operator

Arithmetic Operators
The table below describes the arithmetic operators. Let's assume two data
variables V1 and V2 with values 8 and 4 respectively.
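
A short sketch of the arithmetic operators applied to V1 = 8 and V2 = 4. The result variable names are illustrative.

```sas
data arithmetic_ops;
    V1 = 8;
    V2 = 4;
    sum_     = V1 + V2;   /* 12   */
    diff_    = V1 - V2;   /* 4    */
    product_ = V1 * V2;   /* 32   */
    quot_    = V1 / V2;   /* 2    */
    power_   = V1 ** V2;  /* 4096 */
run;

proc print data=arithmetic_ops noobs;
run;
```
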
Comparison Operators
The table below describes the comparison operators. These operators
compare the values of the variables, and the result is a truth value: 1 for
TRUE and 0 for FALSE. Let's assume two data variables V1 and V2 with
values 8 and 4 respectively.
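
A short sketch of the comparison operators applied to V1 = 8 and V2 = 4. Each result is stored as 1 or 0. The result variable names are illustrative.

```sas
data comparison_ops;
    V1 = 8;
    V2 = 4;
    eq_ = (V1 = V2);       /* 0: 8 is not equal to 4 */
    ne_ = (V1 ne V2);      /* 1 */
    lt_ = (V1 < V2);       /* 0 */
    le_ = (V1 <= V2);      /* 0 */
    gt_ = (V1 > V2);       /* 1 */
    ge_ = (V1 >= V2);      /* 1 */
    in_ = (V1 in (4, 8));  /* 1: 8 is in the list */
run;

proc print data=comparison_ops noobs;
run;
```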
Minimum/Maximum Operators
The below table describes the details of the Minimum/Maximum operators. These
operators compare the values of the variables across a row and the minimum or
maximum value from the list of values in the rows is returned.
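
A short sketch of the two operators in a DATA step, where >< returns the minimum and <> returns the maximum. The values are illustrative.

```sas
data minmax_ops;
    V1 = 8;
    V2 = 4;
    V3 = 12;
    min_ = V1 >< V2 >< V3;  /* 4  */
    max_ = V1 <> V2 <> V3;  /* 12 */
run;

proc print data=minmax_ops noobs;
run;
```

In a WHERE expression <> means "not equal" instead, so the MIN and MAX functions are often the clearer choice.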

Concatenation Operator
The below table describes the details of the Concatenation operator. This operator
concatenates two or more string values. A single character value is returned.
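
A short sketch of the concatenation operator ||. The strings are illustrative.

```sas
data concat_op;
    first_ = 'Hello';
    second_ = 'SAS';
    /* Join the two values with a single blank in between. */
    greeting_ = first_ || ' ' || second_;  /* Hello SAS */
run;

proc print data=concat_op noobs;
run;
```
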

SAS Loops
In SAS programming, we often need
to execute a block of code several times. It is inconvenient to write the same
set of statements again and again. This is where loops come into the picture. In SAS, the
Do statement is used to implement loops. It is also known as the Do loop. The image
below shows the general form of the Do loop statements in SAS.

Following are the types of DO loops in SAS:

 Do Index: The loop continues from the start value till the stop value of the index
variable.
 Do While: The loop continues as long as the While condition is true.
 Do Until: The loop continues till the Until condition becomes true.

Do Index loop
We use an index variable as a start and stop value for Do Index loop. The SAS
statements get executed repeatedly till the index variable reaches its final value.
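
A minimal sketch of a Do Index loop that adds up the numbers 1 to 5. The variable names are illustrative.

```sas
data do_index;
    total = 0;
    do i = 1 to 5;
        total = total + i;
        output;  /* write one observation per pass */
    end;
run;

proc print data=do_index noobs;
run;
```

The last observation has total = 15.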
Do While Loop
The Do While loop uses a WHILE condition. This Loop executes the block of
code when the condition is true and keeps executing it, till the condition becomes false.
Once the condition becomes false, the loop is terminated.
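
A minimal sketch of a Do While loop. The condition is checked before each pass, so the block runs only while n is at most 5. The variable names are illustrative.

```sas
data do_while;
    n = 1;
    do while (n <= 5);
        square = n * n;
        output;      /* write one observation per pass */
        n = n + 1;
    end;
run;

proc print data=do_while noobs;
run;
```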
Do Until Loop
The Do Until loop uses an Until condition. The condition is checked at the end of each
pass, so the block of code runs at least once and keeps running while
the condition is false. Once the
condition becomes true, the loop is terminated.
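
A minimal sketch of a Do Until loop. The loop stops once n exceeds 5. The variable names are illustrative.

```sas
data do_until;
    n = 1;
    do until (n > 5);
        square = n * n;
        output;      /* write one observation per pass */
        n = n + 1;
    end;
run;

proc print data=do_until noobs;
run;
```
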
DECISION MAKING VIA SAS

Decision making structures require the programmer to specify one or more conditions to
be evaluated or tested by the program, along with a statement or statements to be
executed if the condition is determined to be true, and optionally, other statements to be
executed if the condition is determined to be false.
Following is the general form of a typical decision making structure found in most of the
programming languages −
An IF statement consists of a boolean expression followed by SAS statements.

Syntax
The basic syntax for creating an if statement in SAS is −
IF (condition);
If the condition evaluates to true, the respective observation is processed; otherwise
it is left out of the output data set.
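
A minimal sketch of this form of IF, using a few of the salary values that appear later in this document. The data set name high_salary and the 600 cut-off are illustrative.

```sas
data high_salary;
    input empid name $ salary;
    /* Keep only observations with a salary above 600. */
    if (salary > 600);
    datalines;
1 Rick 623.3
2 Dan 515.2
3 Mike 611.5
;
run;

proc print data=high_salary;
run;
```

Only Rick and Mike remain in the output data set.
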
An IF-THEN-ELSE statement consists of a boolean expression with a THEN
statement, followed by an ELSE statement.

Syntax
The basic syntax for creating an IF-THEN-ELSE statement in SAS is −
IF (condition) THEN result1;
ELSE result2;
If the condition evaluates to true, result1 is executed; otherwise result2 is executed.

An IF-THEN-ELSE-IF statement consists of a boolean expression with a THEN
statement, followed by one or more ELSE IF statements.
Syntax
The basic syntax for creating an IF-THEN-ELSE-IF statement in SAS is −
IF (condition1) THEN result1;
ELSE IF (condition2) THEN result2;
ELSE IF (condition3) THEN result3;
The conditions are tested in order, and the result of the first condition that evaluates
to true is executed.
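
A minimal sketch of an IF-THEN / ELSE IF chain. The variable band and the cut-off values are illustrative assumptions.

```sas
data salary_band;
    input empid name $ salary;
    length band $ 6;
    /* Only the first matching branch is executed. */
    if (salary >= 700) then band = 'High';
    else if (salary >= 600) then band = 'Medium';
    else band = 'Low';
    datalines;
1 Rick 623.3
2 Dan 515.2
4 Ryan 729.1
;
run;

proc print data=salary_band;
run;
```

Rick is labelled Medium, Dan Low and Ryan High.
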
An IF-THEN-DELETE statement consists of a boolean expression followed by a THEN
DELETE statement.

Syntax
The basic syntax for creating an IF-THEN-DELETE statement in SAS is −
IF (condition) THEN DELETE;
If the condition evaluates to true, the respective observation is deleted and is not
written to the output data set.
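
A minimal sketch of IF-THEN-DELETE. The data set name and the 600 cut-off are illustrative.

```sas
data without_low_salary;
    input empid name $ salary;
    /* Drop every observation with a salary below 600. */
    if (salary < 600) then delete;
    datalines;
1 Rick 623.3
2 Dan 515.2
3 Mike 611.5
;
run;

proc print data=without_low_salary;
run;
```

Dan's observation is removed; Rick and Mike are kept.
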
ARRAYS
Arrays in SAS are used to store and retrieve a series of values using an index value.
The index represents the location in a reserved memory area.

Syntax
In SAS an array is declared by using the following syntax −
ARRAY ARRAY-NAME(SUBSCRIPT) ($) VARIABLE-LIST ARRAY-VALUES
In the above syntax −
 ARRAY is the SAS keyword to declare an array.
 ARRAY-NAME is the name of the array which follows the same rule as variable
names.
 SUBSCRIPT is the number of values the array is going to store.
 ($) is an optional parameter to be used only if the array is going to store character
values.
 VARIABLE-LIST is the optional list of variables which are the place holders for
array values.
 ARRAY-VALUES are the actual values that are stored in the array. They can be
declared here or can be read from a file or dataline.
Examples of Array Declaration
Arrays can be declared in many ways using the above syntax. Below are the examples.
/* Declare an array of length 5 named AGE with initial values. */
ARRAY AGE[5] (12 18 5 62 44);

/* Declare an array of length 5 named COUNTRIES with index values starting at 0. */
ARRAY COUNTRIES(0:4) A B C D E ;

/* Declare an array of length 5 named QUESTS which contains character values. */
ARRAY QUESTS(1:5) $ Q1-Q5;

/* Declare an array whose length is set by the number of variables supplied. */
ARRAY ANSWER(*) A1-A100;
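
A minimal sketch of using an array inside a DATA step. The array name scores, its variables and its values are illustrative.

```sas
data array_demo;
    /* s1-s3 are the placeholder variables; (70 85 90) are their initial values. */
    array scores[3] s1-s3 (70 85 90);
    total = 0;
    do i = 1 to dim(scores);
        total = total + scores[i];  /* access each element by its index */
    end;
    drop i;
run;

proc print data=array_demo noobs;
run;
```

The resulting observation has total = 245.
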

MERGING DATA SETS IN SAS


Multiple SAS data sets can be merged on a specific common variable to give a
single data set. This is done using the MERGE statement and the BY statement. The total
number of observations in the merged data set is often less than the sum of the numbers
of observations in the original data sets. This is because the variables from both data sets
are combined into one record when there is a match in the value of the common
variable.
There are two prerequisites for merging data sets −

 Input data sets must have at least one common variable to merge on.
 Input data sets must be sorted by the common variable(s) used for the merge.
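
A minimal sketch of meeting the second prerequisite with PROC SORT. The data set dept_unsorted and its values are illustrative.

```sas
data dept_unsorted;
    input empid dept $;
    datalines;
3 IT
1 IT
2 OPS
;
run;

/* Sort by the common variable so the data set is ready for a BY-merge. */
proc sort data=dept_unsorted out=dept_sorted;
    by empid;
run;

proc print data=dept_sorted;
run;
```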

Syntax
The basic syntax for the MERGE and BY statements in SAS is −
MERGE Data-Set1 Data-Set2;
BY Common-Variable;
Following is the description of the parameters used −
 Data-Set1, Data-Set2 are data set names written one after another.
 Common-Variable is the variable based on whose matching values the data sets
will be merged.

Data Merging
Let us understand data merging with the help of an example.
Example
Consider two SAS data sets one containing the employee ID with name and salary and
another containing employee ID and department. In this case to get the complete
information for each employee we can merge these two data sets. The final data set will
still have one observation per employee but it will contain both the salary and
department variables.
DATA SALARY;
INPUT empid name $ salary ;
DATALINES;
1 Rick 623.3
2 Dan 515.2
3 Mike 611.5
4 Ryan 729.1
5 Gary 843.25
6 Tusar 578.6
7 Pranab 632.8
8 Rasmi 722.5
;
RUN;
DATA DEPT;
INPUT empid dEPT $ ;
DATALINES;
1 IT
2 OPS
3 IT
4 HR
5 FIN
6 IT
7 OPS
8 FIN
;
RUN;
DATA All_details;
MERGE SALARY DEPT;
BY empid;
RUN;
PROC PRINT DATA = All_details;
RUN;

SAS - Functions
SAS has a wide variety of built-in functions which help in analysing and processing
data. These functions are used in DATA step statements. They take data
variables as arguments and return a result, which is stored in another variable.
The number of arguments depends on the function. Some
functions accept no arguments, while others accept a fixed number of arguments.
Syntax
The general syntax for using a function in SAS is as below.
FUNCTIONNAME(argument1, argument2...argumentn)
Here the argument can be a constant, variable, expression or another function.

Function Categories
Depending on their usage, the functions in SAS are categorised as below.

 Mathematical
 Date and Time
 Character
 Truncation
 Miscellaneous

Mathematical Functions
These are the functions used to apply some mathematical calculations on the variable
values.
Examples
The below SAS program shows the use of some important mathematical functions.
data Math_functions;
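
A minimal sketch of such a program follows. The choice of functions (ABS, SQRT, MAX, MIN, MOD) and the input values are illustrative and are not taken from the original listing.

```sas
data Math_functions;
    abs_val  = ABS(-7);        /* absolute value: 7        */
    sqrt_val = SQRT(16);       /* square root: 4           */
    max_val  = MAX(8, 4, 12);  /* largest argument: 12     */
    min_val  = MIN(8, 4, 12);  /* smallest argument: 4     */
    mod_val  = MOD(10, 3);     /* remainder of 10 / 3: 1   */
run;

proc print data=Math_functions noobs;
run;
```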

When the above code is run, we get the following output −


Date and Time Functions
These are the functions used to process date and time values.
Examples
The below SAS program shows the use of date and time functions.
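
A minimal sketch of such a program follows. The functions shown (MDY, YEAR, MONTH, DAY, INTCK) and the dates are illustrative choices; the dates reuse two joining dates from the employee example.

```sas
data date_functions;
    /* Build a SAS date value from month, day and year. */
    doj = MDY(8, 18, 2013);
    year_  = YEAR(doj);   /* 2013 */
    month_ = MONTH(doj);  /* 8    */
    day_   = DAY(doj);    /* 18   */
    /* Number of 1 January boundaries crossed between the two dates. */
    years_between = INTCK('year', doj, MDY(9, 11, 2016));  /* 3 */
    format doj date9.;    /* display doj as 18AUG2013 */
run;

proc print data=date_functions noobs;
run;
```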
When the above code is run, we get the following output −

Character Functions
These are the functions used to process character or text values.
Examples
The below SAS program shows the use of character functions.
data character_functions;

/* Convert the string into lower case */
lowcse_ = LOWCASE('HELLO');

/* Convert the string into upper case */
upcase_ = UPCASE('hello');

/* Reverse the string */
reverse_ = REVERSE('Hello');

/* Return the nth word */
nth_letter_ = SCAN('Learn SAS Now',2);
run;

proc print data = character_functions noobs;
run;
When the above code is run, we get the following output −
