Informats and Formats

  • Informats tell SAS how to interpret input data.
  • Formats allow you to "recode" SAS variables without changing the original variable.
  • You can use predefined informats and formats.
  • You can create your own formats.
  • Formats can be created "on the fly" or stored in a permanent library.

Informats

Informats allow SAS to read nonstandard input, such as numbers with embedded commas, scientific notation, or dates. The full list of character, numeric, and date informats is in the SAS Language (version 6) manual.

 

One of the most common uses of an informat is to tell SAS that a number should be read as a date or time value. For example, the informat mmddyy6. tells SAS to read the number 122495 as the date December 24, 1995. SAS then stores this date as a SAS date value, which is the number of days since January 1, 1960.
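
As a quick check of this behavior, the INPUT function can apply the same informat to a character string. This is only a sketch: the variable names d and check are made up for illustration, and the two-digit year 95 is assumed to resolve to 1995, which holds under the usual YEARCUTOFF= settings.

```sas
data _null_;
  /* read the text 122495 as month, day, year */
  d     = input('122495', mmddyy6.);
  /* build the same date directly, for comparison */
  check = mdy(12, 24, 1995);
  /* write both raw SAS date values to the log, then d as a readable date */
  put d= check=;
  put d worddate20.;
run;
```

Both d and check should appear in the log as 13141, the number of days from January 1, 1960 to December 24, 1995.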

When decimal points are not included in the input data, the informat w.d tells SAS where to insert the decimal point. The w is the total width of the field and the d is the number of digits to the right of the decimal point. For example, the number 212345 read with the informat 6.2 would be read as 2123.45.
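
The effect of the d value can be seen with a short test step. The sketch below uses the INPUT function on two literal strings (the variable names a and b are arbitrary). It also shows a related point that is easy to miss: when the data already contain a decimal point, that decimal point is used and the d value is ignored.

```sas
data _null_;
  /* no decimal point in the data: 6.2 implies two decimal places */
  a = input('212345', 6.2);
  /* a decimal point in the data takes precedence over the d value */
  b = input('21.345', 6.2);
  /* log shows a=2123.45 b=21.345 */
  put a= b=;
run;
```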

 

The following example code will read 3 fields (sales date, price and quantity) from a "raw file" that looks like this:

  10184  45000     30
  10284  37500     40
  40684  10000     34
 103085 152300     35
 121785    225      7
   
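  data sales;
  infile temp;
  /* The fields are separated by blanks, so column pointers (@n) are   */
  /* used. Assumed layout: date in columns 1-6, price in columns 8-13, */
  /* quantity in columns 16-20. Adjust the pointers to match your file. */
  input @1  sdate     mmddyy6.
        @8  price     6.2
        @16 quantity  5.;
  run;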

 

A proc print of the SAS data set would look like this:

 

 SDATE     PRICE   QUANTITY
  8766    450.00      30
  8767    375.00      40
  8862    100.00      34
  9434   1523.00      35
  9482      2.25       7

 

Formats

Remember: the variable sdate is now a SAS date (the number of days since Jan 1, 1960). To get the SAS date to print in a more readable style, such as December 12, 1985, you associate the variable with a format. Formats tell SAS how to print the variables. SAS has predefined formats such as WORDDATEw. or MMDDYYw. that would give you December 12, 1985 or 12/12/85 respectively. The list of pre-existing formats is in the SAS Language (version 6) manual. You can also create your own formats by using proc format.

A few things to keep in mind about formats:

  • The w in a format name is the width of the formatted output.
  • A format does NOT change the original value of the variable.
  • A format name always ends in a period when you refer to it (for example, agefmt.), or contains one (for example, 6.2).
  • Formats can be numeric, character or date.
  • Character formats always begin with $
  • In SAS version 6, format names cannot be longer than 8 characters (later releases allow longer names).
  • Do NOT overlap values in ranges such as 1-10, 10-20, 20-30. Use 1-<10, 10-<20, 20-<30 instead.
  • You must tell SAS to associate a format with a variable.
  • A single format can be used for multiple variables.
  • The SAS keywords low, high, and other can be used in value ranges.
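
The point about non-overlapping ranges can be sketched as follows. The format name incfmt and its labels are hypothetical and chosen only for illustration. The -< operator excludes the upper endpoint, so a value of exactly 10 falls in the second range and nowhere else.

```sas
proc format;
  /* incfmt is a made-up name; each range excludes its upper endpoint */
  value incfmt  low-<10 = 'under 10'
                10-<20  = '10 to under 20'
                20-<30  = '20 to under 30'
                30-high = '30 and over'
                other   = 'missing';
run;

data _null_;
  /* 10 sits on a boundary; it should print as '10 to under 20' */
  x = 10;
  put x incfmt.;
run;
```

For numeric formats the low keyword does not include missing values, which is why the other line is the one that picks them up here.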

In this example, the SAS date sdate will be printed with the month, day and year written out, using the SAS format worddate20.

proc print data=sales;
format sdate worddate20.;
run;

 

 

                          
    OBS             SDATE      PRICE     QUANTITY     
     1    January 1, 1984     450.00        30 
     2    January 2, 1984     375.00        40  
     3      April 6, 1984     100.00        34     
     4   October 30, 1985    1523.00        35   
     5  December 17, 1985       2.25         7      
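
Because a format only changes how a value is displayed, the same variable can be printed with a different format and still be compared as a number. The step below is a sketch of that idea using the sales data set from above. The choice of mmddyy8. and the cutoff date are arbitrary.

```sas
/* same data set, different display format: stored values are untouched */
proc print data=sales;
  format sdate mmddyy8.;
  /* the comparison uses the underlying SAS date, not the printed text */
  where sdate >= '01JAN1985'd;
run;
```

With the five observations shown earlier, only the two 1985 sales should be listed.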

 

Proc Format allows you to create your own formats.

The formats and their values are created in a proc format. The formats are later associated with variables in a data or proc step using the format statement. The format statement starts with the word FORMAT, followed by the variable name and then the name of the format. The format name must end in a period to distinguish it from a variable name.

format age agefmt.;

Remember, the proc format creates the format and the format statement associates the format with the variable(s).

In this example, the formats $sexfmt, yesno and agefmt are created in a single proc format.

  proc format;
     value $sexfmt  'f' = 'female'
                    'm' = 'male';
                 
     value yesno     1  = 'yes'
                     5  = 'no'
                  other = 'bad data';

     value agefmt   low - 2    = 'infant'
                      3 - 5    = 'toddler'
                      6 - 10   = 'grade school'
                     11 - 14   = 'middle school'
                     15 - 18   = 'high school'
                     19 - high = 'adult';
 
  proc print data=temp;
  format sex $sexfmt. quest1 quest4 quest7 yesno. age agefmt.;
  run;

 

The proc format creates three different formats. $sexfmt is a character format, so its name must begin with a $ sign, and both the values (f and m) and the formatted text must be in quotation marks. The yesno format is numeric, so only the formatted text is in quotation marks. In the proc print, the yesno format is associated with three variables (quest1, quest4 and quest7). The "other" keyword assigns the label "bad data" to any value other than 1 or 5.

The agefmt format is also numeric and is an example of using a range of values. Low and High are special SAS keywords that stand for the lowest and highest values of the variable.

It is common practice to use a format in a proc freq for variables such as age or income where you want to aggregate within a range. Remember: do not overlap ranges. The following example uses the sales data set created in the first example.

proc format;
   value qty low-10  = 'low-10'
              11-20   = '11-20'
              21-30   = '21-30'
              31-40   = '31-40'
              41-50   = '41-50'
              51-high = '51-high';

proc freq data=sales;
tables quantity;
format quantity qty.;
run;

 

 

sales quantity

                                             Cumulative  Cumulative
             QUANTITY   Frequency   Percent   Frequency    Percent
             ------------------------------------------------------
             low-10            1      20.0           1       20.0
             21-30             1      20.0           2       40.0
             31-40             3      60.0           5      100.0
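
The formats above are created "on the fly" and disappear when the SAS session ends. A minimal sketch of storing the same qty format in a permanent library might look like the following. The libref myfmts and the directory name are placeholders, not part of the original example, and the FMTSEARCH= option is assumed to be available in your release.

```sas
/* 'your-format-directory' is a placeholder path; myfmts is a made-up libref */
libname myfmts 'your-format-directory';

/* write the format to the catalog myfmts.formats instead of the WORK library */
proc format library=myfmts;
  value qty  low-10  = 'low-10'
             11-20   = '11-20'
             21-30   = '21-30'
             31-40   = '31-40'
             41-50   = '41-50'
             51-high = '51-high';
run;

/* in a later session: reissue the libname, then add the library to the search list */
options fmtsearch=(myfmts);

proc freq data=sales;
  tables quantity;
  format quantity qty.;
run;
```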

 

 

 

Common date format problem:

When my SAS data set was created, the dates were read in as a number rather than a date. How do I create a SAS date from a number in an existing SAS data set?

  data temp;
  set sasuser.example;
  newdate = input(put(olddate,8.), mmddyy8.);
  run;

 

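This code takes the number 12171985 and turns it into the SAS date 9482 (Dec. 17, 1985). If you only have a 2-digit year (121785), use put(olddate,6.) together with mmddyy6. so that the width of the character string matches the width of the informat. If your date is in year, month and day order (851217), use put(olddate,6.) together with yymmdd6.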
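
A possible variation, offered as a sketch rather than a required change, is to use the z8. format in place of 8. inside the PUT function. z8. pads with leading zeros, so a value stored without its leading zero (for example 1171985 for January 17, 1985) is converted to the text 01171985 before the informat reads it. The FORMAT statement is an addition so that newdate displays as a date; date9. is an arbitrary choice.

```sas
data temp;
  set sasuser.example;
  /* z8. pads with leading zeros, so 1171985 becomes '01171985' */
  newdate = input(put(olddate, z8.), mmddyy8.);
  /* display newdate as a date (e.g. 17DEC1985) rather than a day count */
  format newdate date9.;
run;
```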

Copyright ©2014, The Pennsylvania State University | Last modified Aug 21, 2008