Introduction to SAS
Economics 6896
Spring, 2001
Tony Lima
Table of Contents
VIII. Univariate Data Analysis
IX. Multivariate Data Analysis
Table of Figures
Figure 5: New Library dialog box
Figure 6: Save As MyProductSales
Figure 9: Import Wizard, Select File
Figure 10: Import Wizard, Select Library and Member
Figure 11: Import Wizard, Create SAS Statements
Figure 13: SAS ASSIST Start Menu
Figure 14: SAS ASSIST Block Menu
Figure 15: Enter Data Interactively
Figure 16: Creating a SAS Table
Figure 18: New Table Structure
Figure 19: Sample data structure
Figure 22: Edit Data selection window
Figure 23: Edit Data One Record at a Time
Figure 25: The column selection dialog box
Figure 26: The seasonal adjustment setup
Figure 27: Tests for seasonality
Figure 29: Domestic airline passenger miles are seasonal
Figure 31: The MEANS Procedure results
Figure 32: Interactive data analysis window
Figure 33: Interactive data analysis variable selection
Figure 35: Descriptive statistics
Figure 41: Select a Product Type
Figure 1: SAS Opening screen
Figure 2: Available Libraries
Figure 3: The Prdsale table
Figure 4: Save As Dialog Box
A SAS catalog is a SAS file that contains entries. The entries in a catalog serve a variety of utility purposes. For example, the function key settings that you use in a display manager session are stored in a KEYS entry. Catalogs are created by procedures or the CATALOG window. The entries within a catalog are created in various ways, depending on the type of entry.
A SAS data library is defined as a collection of SAS files that are recognized as a unit by the SAS System. On directory-based systems, it is a group of files in the same directory.
There is no limit to the number of SAS files you can store in a SAS data library. You can have different kinds of SAS files in one library. SAS files are stored and retrieved according to the SAS data library to which they belong.
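The same idea can be expressed in program form. The following is a minimal sketch, assuming a directory named c:\mysasdata already exists; the path and the libref mylib are illustrative choices, not values required by SAS.

```sas
/* Assign the libref MYLIB to an existing directory.          */
/* The path is an assumption; substitute a directory you own. */
libname mylib 'c:\mysasdata';

/* Copy the sample table SASHELP.PRDSALE into the new library. */
data mylib.MyProductSales;
    set sashelp.prdsale;
run;

/* List the SAS files now stored in the library. */
proc contents data=mylib._all_ nods;
run;
```

Every SAS file written with the mylib prefix lands in that one directory, which is what makes the directory a library.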
Figure 5: New Library dialog box
Figure 6: Save As MyProductSales
Figure 7: Column attributes
Even though you can't see all of it, the Format box says DOLLAR12.2. This means the column is formatted as dollars (other currencies are available), the total number of spaces allowed is 12 and there are 2 places to the right of the decimal point.
The total number of spaces must be large enough to accommodate the largest number. When counting spaces, be sure to allow for the currency sign, the decimal point, any commas, and any special characters associated with negative numbers.
Rule of thumb: when in doubt, increase the column width. Disk space is cheap; reformatting (and possibly losing) your data table is expensive.
To close the Format window, click the x box in the upper right corner.
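The width arithmetic for DOLLAR12.2 can be tried out in a short DATA step. This is only a sketch with made-up values: 123456.789 needs 11 spaces as currency (1 dollar sign, 6 digits, 1 comma, 1 decimal point, 2 decimals), while 1234567.891 needs 13 and therefore does not fit in a width of 12.

```sas
data _null_;
    small = 123456.789;    /* needs 11 spaces as currency */
    large = 1234567.891;   /* needs 13 spaces as currency */

    /* Width 12 is enough for the smaller value. */
    put small dollar12.2;

    /* Width 12 is too narrow for the larger value, so SAS */
    /* cannot show it in the full currency layout.         */
    put large dollar12.2;

    /* A wider format leaves room for the whole value. */
    put large dollar14.2;
run;
```

The results are written to the Log window, where the three lines can be compared side by side.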
Figure 8: Import Wizard
Figure 9: Import Wizard, Select File
Figure 10: Import Wizard, Select Library and Member
Figure 11: Import Wizard, Create SAS Statements
Figure 12: The Excel Table
The first row should contain the field names. SAS wants these names to be 8 characters or fewer, in all capital letters, and preceded by a single quote ('). Note that the SAS field name can be much longer; however, you must make this change by changing the field attributes after the table is imported.
Each row is one data record. If you're creating longitudinal data, create a separate table for each period, then worry about working with the longitudinal data later.
If you can, format each data column in Excel. Select the column (excluding the first row, which contains the field name), right click in any selected cell, choose Format Cells, click the Number tab and change the format to whatever you want SAS to make it. Stick to Currency, Number and Date types if you can. If you have text fields, make them General type in Excel.
You can select a column of data by clicking in the top cell of the first row of actual data (row 2), scrolling to the last row and clicking in the bottom cell of the column while holding down the shift key. If you put columns that have exactly the same data type next to each other, you can format them all at once by clicking the top left cell and the bottom right cell while holding down the shift key.
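An import like the one the wizard performs can also be written as a PROC IMPORT step. The sketch below is not the wizard's exact output: the file path, the libref mylib and the table name sales are hypothetical, and DBMS=XLSX assumes a recent SAS release with SAS/ACCESS Interface to PC Files installed (older releases use other DBMS= values).

```sas
/* Hypothetical library location; adjust to your own directory. */
libname mylib 'c:\mysasdata';

/* Read the first worksheet of a hypothetical workbook. */
proc import datafile='c:\mysasdata\sales.xlsx'
            out=mylib.sales
            dbms=xlsx
            replace;
    getnames=yes;   /* take the field names from the first row */
run;

/* Check the names, types and formats SAS assigned. */
proc contents data=mylib.sales;
run;
```

Running PROC CONTENTS right after the import is a quick way to confirm that each Excel column arrived with the data type you intended.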
Figure 13: SAS ASSIST Start Menu
Figure 14: SAS ASSIST Block Menu
Figure 15: Enter Data Interactively
Usually the best option is to enter data in tabular form. However, if you have a large number of fields, you should enter the data one record at a time. In section VI.A.10 we'll see why.
Before we leave this screen, note the weird button labeled "Goback." SAS uses this instead of the more usual "Close" button. Goback returns you to the screen you were previously looking at.
Figure 16: Creating a SAS Table
Figure 17: Select a library
Figure 18: New Table Structure
This window is where you set each field's name, data type, length, label, and format. Let's look at these one at a time:
1. Name is the name of the field. It should be 8 characters long or less, contain only letters and numbers, and begin with a letter, not a number.
2. Data types can be numeric (N) or character (C or $).
3. The default length is 8. This is also the maximum length for numeric variables. For character variables, the maximum length is 200.
4. The label can be up to 40 characters long. It should describe what's in the field.
5. The format is the way the data will be displayed. The DOLLAR12.2 format is the best known. For more information about SAS formats, search the help file for the following topics: FORMATS: Numeric formats;
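The five attributes listed above map onto statements in a DATA step. Here is a minimal sketch, assuming a hypothetical library directory and made-up field names and values; it is one way to build such a table in code, not a record of what the New Table Structure window generates.

```sas
/* Hypothetical library location; adjust to your own directory. */
libname mylib 'c:\mysasdata';

data mylib.sales;
    /* Name, type and length: $ marks a character field. */
    length product $ 20 units 8 revenue 8;

    /* Label: a short description of each field. */
    label product = 'Product name'
          units   = 'Units sold'
          revenue = 'Sales revenue';

    /* Format: how the values are displayed. */
    format revenue dollar12.2;

    /* Two illustrative records, one per line. */
    input product $ units revenue;
    datalines;
Widget 10 1234.50
Gadget 3 99.95
;
run;

/* Show the resulting structure. */
proc contents data=mylib.sales;
run;
```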
Figure 19: Sample data structure
Chances are you've made a mistake somewhere. SAS informs you of this by refusing to close the window and highlighting the area where there's a problem. There may be an error message on the message bar at the bottom of the screen. The area may not be highlighted; look for the field in which the mouse cursor is placed.
Figure 20: Data entry
SAS expects you to enter data in a tabular format. However, our fields are too wide and there are too many of them to see them all at once. Let's enter a record.
This is more difficult than it should be. SAS wants you to use the Tab key to move between fields. Unfortunately, it will only let you move between fields you can see on the screen. That means you must use the horizontal scroll bar to bring more field names into view, select the next empty field, enter the data, and so on.
When you enter the data for the last field in a record, press Enter to start a new record. Figure 21 shows part of the data for record 1.
Figure 21: First new record
Select File/Save from the SAS menu, then click X to close the data entry table.
Click the Data Mgmt button, then select Edit/Browse and Edit data from the popup menus. You'll see Figure 22.
Figure 22: Edit Data selection window
This time click the Single row button. SAS will remember the last library and table you opened and make those the default (Figure 23).
Figure 23: Edit Data One Record at a Time
If you're happy with this selection, click the Run button.
Otherwise, select a new library by clicking the Libref button and/or a new table by clicking the Table button. Remember to select the library first if you want to change both.
You'll see Figure 24.
Figure 24: The first record
To add a new record, select Edit/Add new record. If SAS won't let you type data, press the Ins key once. (The cursor should look like a block, not a vertical line.)
Guess what? You have to select Edit/Add new record again.
Figure 25: The column selection dialog box
Figure 26: The seasonal adjustment setup
Figure 27: Tests for seasonality
Figure 28: Seasonal factors
Your results should look like Figure 29.
Figure 29: Domestic airline passenger miles are seasonal
Figure 30: Summary Statistics
Figure 31: The MEANS Procedure results
"The Class selection gives you a list of all the columns in the selected table, excluding those that already are in use in the active task. Choose the column(s) you want to use as classification column(s).
A classification column is any column, numeric or character, that is used for classifying the data into groups or categories of information. A class column normally has a small number of discrete values, or unique levels, which define subgroups of the data.
In this task the class column is used to compute statistics separately for categories of rows."
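In program form, a classification column corresponds to the CLASS statement of PROC MEANS. The sketch below assumes the sample table SASHELP.PRDSALE with its PRODTYPE and ACTUAL columns; the choice of statistics is illustrative and need not match what the Summary Statistics task requests.

```sas
/* Summary statistics for ACTUAL, computed separately */
/* for each value of the class column PRODTYPE.       */
proc means data=sashelp.prdsale n mean std cv skewness kurtosis;
    class prodtype;
    var actual;
run;
```

Removing the CLASS statement gives one set of statistics for the whole table instead of one set per category.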
Figure 32: Interactive data analysis window
Figure 33: Interactive data analysis variable selection
Figure 34: Histogram
Figure 35: Descriptive statistics
About time, isn't it?
The Time series option is used to correct for autocorrelation.
Figure 36: Regression window
Figure 37: Subset Data window
Figure 38: Build WHERE Clause
TIP: SAS doesn't behave like the rest of the Windows world. When you click a radio button, it automatically opens the next window. Be prepared for this!
Figure 39: Column selection
To make a selection from this window, click the column you want (PRODTYPE) and the button. You won't need to click OK.
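Subsetting on PRODTYPE in this way corresponds to a WHERE statement in a program. The following is a sketch only: it assumes the sample table SASHELP.PRDSALE, the value 'FURNITURE' is an assumed product type, and the regression of ACTUAL on PREDICT is an illustrative model, not necessarily the one set up in the Regression window.

```sas
proc reg data=sashelp.prdsale;
    /* Keep only the rows for one product type (assumed value). */
    where prodtype = 'FURNITURE';

    /* Illustrative model: actual sales on predicted sales. */
    model actual = predict;
run;
quit;
```

The WHERE statement filters rows before the procedure sees them, so the original table is left unchanged.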
Figure 40: Build WHERE clause
Figure 41: Select a Product Type
Figure 42: Regression Output
Note that you can save this output as a text file. First, though, notice that the "Output" window is automatically opened. You might look at the "Log" window, too.
When you save the output, select the RTF data type. This will let you easily read your output file into Word for editing. (Don't edit the results!)
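Output can also be routed to an RTF file from within a program. This is a minimal sketch, assuming a SAS release that supports the ODS RTF destination; the file path and the model are hypothetical.

```sas
/* Open an RTF file for output (hypothetical path). */
ods rtf file='c:\mysasdata\regression.rtf';

/* Any procedure output produced here is written to the file. */
proc reg data=sashelp.prdsale;
    model actual = predict;
run;
quit;

/* Close the file so Word can open it. */
ods rtf close;
```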
/* Assign the libref WINE to the directory that holds the data tables. */
libname wine 'c:\wine';

/* Each PROC REG step fits two stepwise models of nprice on one table.   */
/* SIMPLE and CORR print descriptive statistics and correlations.        */
/* SLS=0.10 is the significance level a variable needs to stay in.       */
PROC REG data=wine.vf95cf SIMPLE CORR;
    MODEL nprice=laplus lagold lasilv labrnz ocplus ocgold ocsilv ocbrnz
                 rivplus rivgold rivsilv rivbrnz
                 sfplus sfgold sfsilv sfbrnz
                 dmnplus dmngold dmnsilv dmnbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
    MODEL nprice=sacplus sacgold sacsilv sacbrnz
                 nwiplus nwigold nwisilv nwibrnz
                 wcplus wcgold wcsilv wcbrnz
                 sdplus sdgold sdsilv sdbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
run;

PROC REG data=wine.vf95cs SIMPLE CORR;
    MODEL nprice=laplus lagold lasilv labrnz ocplus ocgold ocsilv ocbrnz
                 rivplus rivgold rivsilv rivbrnz
                 sfplus sfgold sfsilv sfbrnz
                 dmnplus dmngold dmnsilv dmnbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
    MODEL nprice=sacplus sacgold sacsilv sacbrnz
                 nwiplus nwigold nwisilv nwibrnz
                 wcplus wcgold wcsilv wcbrnz
                 sdplus sdgold sdsilv sdbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
run;

PROC REG data=wine.vf95ch SIMPLE CORR;
    MODEL nprice=laplus lagold lasilv labrnz ocplus ocgold ocsilv ocbrnz
                 rivplus rivgold rivsilv rivbrnz
                 sfplus sfgold sfsilv sfbrnz
                 dmnplus dmngold dmnsilv dmnbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
    MODEL nprice=sacplus sacgold sacsilv sacbrnz
                 nwiplus nwigold nwisilv nwibrnz
                 wcplus wcgold wcsilv wcbrnz
                 sdplus sdgold sdsilv sdbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
run;

PROC REG data=wine.vf95gw SIMPLE CORR;
    MODEL nprice=laplus lagold lasilv labrnz ocplus ocgold ocsilv ocbrnz
                 rivplus rivgold rivsilv rivbrnz
                 sfplus sfgold sfsilv sfbrnz
                 dmnplus dmngold dmnsilv dmnbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
    MODEL nprice=sacplus sacgold sacsilv sacbrnz
                 nwiplus nwigold nwisilv nwibrnz
                 wcplus wcgold wcsilv wcbrnz
                 sdplus sdgold sdsilv sdbrnz
          / SELECTION=STEPWISE DETAILS SLS=0.10;
run;
quit;