Using the PSID

An introduction to the use of the Panel Study of Income Dynamics (PSID) data collection

INTRODUCTION

The PSID is an annual survey of U.S. families and the individuals who make up those families. The first wave, in 1968, sampled approximately 4,800 U.S. families living in 40 states. These same families have been re-interviewed annually. As family members move out of the Wave 1 households and form new families, the PSID continues to survey them and their new households. By 1992 the number of families had grown to 9,829, encompassing 53,013 individuals.

The survey is conducted annually by interviewing a single adult who answers all questions for the family. The questions cover the family's income for the previous year, employment, current family composition and demographics. In some years a longer questionnaire has been used, which includes questions about each individual's attitudes, expectations and behavior. The PSID data are intended for cross-sectional, longitudinal and intergenerational analysis, which may be performed at either the individual or the family level. The large number of families re-interviewed annually over several decades, combined with the detailed and diverse questionnaire, makes this a valuable survey that is widely used by researchers in the social sciences.
 

HISTORY

The PSID began as an outgrowth of 'The Survey of Economic Opportunity' (SEO), a survey that was part of President Johnson's War on Poverty. The SEO was conducted in 1966 and 1967 by the Census Bureau and was intended to build an understanding of the determinants of family well-being. In 1968 the Census Bureau asked the Survey Research Center (SRC) at the University of Michigan to take over the survey. The SRC started with a sample of 2,000 households from the 1966/1967 SEO study. Because the original study emphasized poverty, this sample consisted primarily of low-income families. The new study was meant to be more representative of the U.S. population, so the SRC added approximately 3,000 new households and attached probability-of-selection weights to each family in the survey. The SRC also wrote a new questionnaire and interviewed the new sample of households; the results became Wave 1 of the PSID.

In the late 1980s it became clear that the original PSID sample families under-represented the Latino population of the United States. Not only did the sample fail to reflect the 1968 Latino population of the U.S., it had become even less representative over time because of a surge of immigration from Latin America during the intervening years. To correct this, a sample of 2,043 Latino households was added to the PSID in 1990. This sample came from the Latino National Political Survey at the University of Texas, Austin. The sampled households are of Mexican, Puerto Rican and Cuban origin, groups that represent 89% of the U.S. Latino population. The new households were assigned a 'Latino Family weight' as well as an overall PSID weight.

TOPICS COVERED IN THE QUESTIONNAIRES

The original purpose of the PSID was to study the effects of income on the well-being (or poverty) of U.S. families over time. The main, or core, questions therefore concern economic activities and demographic changes: income sources and amounts, employment, changes in family composition, and residential location. From time to time questions have been added to explore other topics, including 'happiness' and intelligence, but the main focus remains economic and demographic. Much of the detailed information is gathered about the household in general and about its primary adults: the Head of the household and the spouse/'spouse' of the Head. A much smaller amount is gathered about each individual currently living in a PSID household.

The following are examples of the information that is gathered about the household, the Head of the household and the spouse of the Head of the household:

  • Sources and amounts of the previous year's income, including wage income and hours worked, pension payments, inheritance, rent payments, unemployment payments and child support. Some of these are reported as annual amounts and some as monthly amounts.
  • Employment information, including the occupation code, whether the job was farming, and whether the job was government work.
  • Time spent on housework, and dollars saved by doing household repairs, automobile repairs or sewing clothes.
  • Demographics: marital status, educational background, information on parents and grandparents, current family members and their ages, changes in family composition since last year, and members who moved out since last year.
  • Geographical information: current state and city of the household, as well as the size of the largest city in the county.


FILES AVAILABLE - MAIN FILES

Currently, the PSID consists of two main kinds of data file: the Annual Family files and the Cross Year Individual file.

The Annual Family files are structured with one record per household surveyed during a particular wave, or year. A family is uniquely identified by the Family ID assigned to the household for that year, and the file is sorted by that ID. If a PSID family did not participate in a particular wave, there is no record for that family in the file for the wave it missed. To help merge a file with other PSID files, each family record contains a variable holding the annual Family ID for each previous wave of the PSID.

The Cross Year Individual file is structured with one record for each person who has ever been a member of a PSID family. An individual is uniquely identified by the 1968 Family Interview Number (V30001) and the 1968 Person Number (V30002), and the file is sorted by these two variables. Every family that has ever been in the survey is assigned a 1968 Family number, and every individual who has ever been in the survey is assigned a 1968 Person number, whether or not the person or family was interviewed in 1968. To help merge the Individual file with other PSID files, each record in the Individual file also contains the annual Family ID of the household where the individual lived for each year of the survey.

Prior to 1990, the PSID was released as a Cross Year Family-Individual Response File, a Cross Year Individual Nonresponse File and a Cross Year Family File. That is, the data for ALL waves of the PSID were included in a single file. In 1990, however, the file exceeded the length the computers allowed, and the separate Annual Family file system that is currently in use was devised. This eliminated the need for the nonresponse file. If a family is nonresponsive in a particular year, it is simply not included in that wave's file. If an individual becomes nonresponsive, all of that individual's variables are entered as missing for that year in the Individual file.
 

FILES AVAILABLE - SUPPLEMENTAL FILES

On an occasional basis, the PSID issues supplemental data files on specialized topics. They contain details of the specific topic that are not included in the main files. In some cases the additional detail comes from questions added to the survey. These files rarely cover all of the survey years and do not necessarily include non-sample members. The following is a sample of the currently available supplemental files. For a complete list, check the following web site:

http://psidonline.isr.umich.edu/Data/zipSuppData.html

Marriage History File - 1985-1993

This file contains one record per marriage for every individual who was of marriage age between 1985 and 1993. Each record contains the 1968 ID numbers of the individual and of one spouse, and some details of the marriage, including the dates of the marriage and of the divorce (if any). For each individual who never married, a single record was generated, consisting mostly of missing data.
 

Relationship File

This file is largely intended to explain in detail the relationships (blood, marital or cohabitation) between all pairs of individuals who were ever part of a PSID family prior to 1983. It was intended to be used to analyze the living arrangement patterns of the households.
 

Time and Money Transfer File

This file reflects data collected in 1988, detailing transfers of time or money between relatives and friends. Each record represents a single transfer in 1987.
 

Work-History file

This is a unique file. It was first issued in 1984 and is rebuilt every year as a stand-alone supplemental file, so it does not need to be merged into the Family file. It contains that wave's ENTIRE Family record for all HEADS and WIVES/'WIVES', along with all details of their work history. Work history includes information about each spell of employment and unemployment.
 

Child Birth and Adoption History - 1985-1993

This file contains one record per birth or adoption.  The information includes identifiers for each parent and the child, details concerning  the birth or the adoption as well as  the birth dates of both parents.
 



IMPORTANT VARIABLES

The following are definitions of some of the most important terms and variables of general interest to all researchers who use the PSID files.


An individual is defined as a member of a household during a particular wave if he or she was either residing in the interviewed family (excluding temporary roommates, renters, and visiting friends or relatives) or was temporarily away in an institution (college, jail, hospital or the military).



Every individual in the study is defined as either a Sample family member or a Non-Sample family member:

  A SAMPLE PSID Family member:

  •        was a member of a household which was interviewed in the first wave (1968) OR
  •        was born after 1968 into an original family OR
  •        moved out of a first wave family and formed a new family (typically children who grew up).
                Weights are calculated for sample members only.

  A Non-Sample family member:

  •         joined the panel study through marriage, cohabitation or co-residency as an adult OR
  •         joined the panel study as the child of a non-sample adult.
               Non-sample family members are no longer interviewed if they leave a sample household.



1968 Family ID and 1968 Person Number

An individual can be uniquely identified throughout all waves by using the 1968 Family ID along with the 1968 Person Number. All individuals in the study are assigned these two 1968 variables whether or not they were actually part of the study in 1968.
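
One way to apply this in code is to combine the two variables into a single key. The sketch below assumes the records have already been read into Python dictionaries keyed by the variable names V30001 and V30002. The function name, the factor of 1000 and the sample records are illustrative assumptions, not part of the PSID files.

```python
def person_id(record):
    """Combine the 1968 Family ID (V30001) and 1968 Person Number (V30002).

    Assumes the person number is below 1000, so multiplying the family ID
    by 1000 cannot make two different people share a key.
    """
    family_1968 = int(record["V30001"])
    person_1968 = int(record["V30002"])
    if not 0 <= person_1968 < 1000:
        raise ValueError("person number outside the expected range")
    return family_1968 * 1000 + person_1968


# Hypothetical records, not real PSID data.
individuals = [
    {"V30001": 4, "V30002": 1},
    {"V30001": 4, "V30002": 170},
    {"V30001": 5, "V30002": 1},
]
ids = [person_id(record) for record in individuals]
```

Keeping the two source variables alongside the combined key makes it easy to sort the result in the same order as the Individual file.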

It is more complicated to uniquely identify a family. The 1968 Family ID can be used to follow generations of a family, but not a specific household. Every year that a household takes part in the survey, it is assigned an interview number for that year. The Annual Family files contain both the 1968 Family number and that year's interview number. The annual interview number is the variable used to follow a particular household over time.

The 1968 Person Number was not assigned randomly within a family. For instance, '01' is assigned to the 1968 Head of Household. The following table lists the possible values of the 1968 Person Number variable and their meanings. Notice that Sample Members have a Person Number less than 170 and Non-Sample Members have a Person Number of 170 or greater.

1968 Person # :

        SAMPLE MEMBERS:

001-019 People living in sample families at the time of the wave 1 interview (1968) AND individuals living in the Latino sample households in 1990.
020     Husbands of the Head living in an institution at the time of the wave 1 interview.
021-026 Children or step-children who were under 25 years old in 1968 who were living in an institution.
030-169 People who were born into PSID families after the wave 1 interview AND had at least one SAMPLE parent in the study at the time of birth.

        NON-SAMPLE MEMBERS:

170-226 Individuals who moved into the households after 1968 OR were born into a PSID family after 1968 AND had no sample parent.
227-228 Husband or wife of wave 1 head who moved out or died in the year prior to 1968.
400-499 Non-sample individuals aged 65 or older who are treated as if they were sample members.
900     Covers a variety of non-sample individuals. They are included in some of the supplemental files BUT are NOT included in the Individual File.
        Example: An individual who was an adult child prior to wave 1 (used in the childbirth/adoption file).
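
The table above can be turned into a small lookup. This is a minimal sketch: the ranges are copied from the table, while the function name and the choice to return None for values the table does not list are assumptions made for illustration. Check the code book before relying on it.

```python
# (low, high, status) ranges copied from the 1968 Person Number table.
PERSON_NUMBER_RANGES = [
    (1, 19, "sample"),
    (20, 20, "sample"),
    (21, 26, "sample"),
    (30, 169, "sample"),
    (170, 226, "non-sample"),
    (227, 228, "non-sample"),
    (400, 499, "non-sample"),
    (900, 900, "non-sample"),
]


def sample_status(person_number):
    """Return 'sample', 'non-sample', or None if the value is not in the table."""
    for low, high, status in PERSON_NUMBER_RANGES:
        if low <= person_number <= high:
            return status
    return None
```

Returning None rather than guessing keeps unexpected values visible, so they can be checked by hand.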

 



Sequence Number

Every year that a family participates in the survey it is assigned an Interview Number for that year, and every member of the family is assigned a Sequence Number that identifies the individual within the family. The Sequence Number is assigned as follows:

01-20. Individuals in the family at the time of the current interview: 01 for the Head, 02 for his wife, etc.
51-59. Individuals in institutions at the time of the current interview; they were nonresponsive.
71-80. Individuals who moved out of the household OR out of an institution between the previous and current interviews, but who were not included in another responding household. All such individuals are nonresponsive.
81-89. Individuals who were living in the previous wave but died before the current interview.
00.    Born or moved in after the current interview, OR nonresponse for the current wave, OR mover-out nonresponse in the previous wave.
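
The Sequence Number ranges above can be decoded with a short function. This is a sketch only: the labels and function names are invented for illustration, and the ranges are taken from the list above.

```python
def sequence_status(sequence_number):
    """Translate a Sequence Number into a short, illustrative label."""
    if sequence_number == 0:
        return "not present or nonresponse"
    if 1 <= sequence_number <= 20:
        return "in family"
    if 51 <= sequence_number <= 59:
        return "in institution"
    if 71 <= sequence_number <= 80:
        return "moved out"
    if 81 <= sequence_number <= 89:
        return "died"
    return "unlisted value"


def is_head(sequence_number):
    """Sequence Number 01 marks the Head for that wave."""
    return sequence_number == 1
```

A typical use is to keep only records where sequence_status returns "in family" before computing household-level statistics for a wave.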


 Relationship to Head

The Head of Household is the householder who is usually interviewed and about whom most of the personal information is gathered. Because of this, it is important to know who the Head is and how every other family member relates to the Head. The PSID uses a definition of Head of Household similar to the one used by the Bureau of the Census. It has become quaintly old-fashioned, but remains in use for consistency's sake. The Head of Household is preferably the adult male head. If the family is a married-couple family, the husband is the Head unless he is severely disabled. The Relationship to Head, or RELHD, variable is included in the Individual file and defines how each individual is related to the Head:

      1968 - 1982 - Original values of Relationship of Individual to Head of Household (RELHD):

1 Head
2 Wife/"Wife"
3 Son or daughter, including step-children and foster children
4 Brother or sister of head
5 Father or mother of head
6 Grandchild, niece, nephew, other relatives under 18
7 Other relative, including in-laws, other adult relatives
8 Husband or Wife of Head who moved out or died in 1967 (1968), non-relative (after 1968)
9 Husband of current head
0 Individual from core sample who was born or moved in after the current wave, non-responsive in current wave, or individual from Latino supplement in 1990

 

In 1983 it was determined that these categories were too narrow, and the possible values were greatly expanded. One important example is the Head's wife. Prior to 1983 a single value, '2', sufficed for both wife and cohabitor. Since 1983, wife is '20', cohabitor or 'wife' is '22', and girlfriend is '88'. A girlfriend is defined as a first-year cohabitor; if the woman is still cohabiting after that year, her RELHD value becomes '22'. Another example is that all children of the Head were '3', but now there are several values: children of Head (30), stepchildren of Head (33), children of cohabitor but not of Head (35), children-in-law (37) and children of girlfriend (83).

    1983 - Revised values of  Relationship of Individual  to Head of the Household  (RELHD):

10. Head
20. Legal wife
22. "Wife"--female cohabitor who has lived with Head  for a year or more or who was present in the  previous wave
30. Son or daughter of Head (includes adopted children  but not stepchildren)
33. Stepson or stepdaughter of Head (children of legal wife [code 20] who are not children of Head)
35. Son or daughter of "wife" but not Head (includes only those children whose mother's relationship to Head is 22 but who are not Head's children)
37. Son-in-law or daughter-in-law of Head (includes stepchildren-in-law)
38. Foster son or foster daughter, not legally adopted
40. Brother or sister of Head (includes step and half sisters and brothers)
47. Brother-in-law or sister-in-law of Head; i.e., brother or sister of legal wife.
48. Brother or sister of Head's cohabitor (the cohabitor's relationship code=22 or 88)
50. Father or mother of Head (includes stepparents)
57. Father-in-law or mother-in-law of Head (includes parents of legal wives [code 20] only)
58. Father or mother of Head's cohabitor (the cohabitor's relationship code=22 or 88)
60. Grandson or granddaughter of Head (includes only legal wife's [code 20] grandchildren; those of a cohabitor are coded 97)
65. Great-grandson or great-granddaughter of Head (includes only legal wife's [code 20] great-grandchildren; those of a cohabitor are coded 97)
66. Grandfather or grandmother of Head (includes stepgrandparents)
67. Grandfather or grandmother of legal wife (code 20)
68. Great-grandfather or great-grandmother of Head
69. Great-grandfather or great-grandmother of legal wife (code 20)
70. Nephew or niece of Head
71. Nephew or niece of legal wife (code 20)
72. Uncle or Aunt of Head
73. Uncle or Aunt of legal wife (code 20)
74. Cousin of Head
75. Cousin of legal wife (code 20)
83. Children of first-year cohabitor but not of Head (this child's parent is coded 88)
88. First-year cohabitor of Head
90. Legal husband of Head
95. Other relative of Head
96. Other relative of legal wife (code 20)
97. Other relative of cohabitor (the cohabitor's  code=22 or 88)
98. Other non relatives (includes homosexual friends, friends of children of the FU, etc.)
99. NA relationship; NA who is householder (V19017=9999); two primaries share HU (V19016=8 or 9)
00. Inap.: FU is in institution
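
Because the RELHD codes changed in 1983, comparisons across waves need a common grouping. The sketch below collapses a few codes from both schemes into coarse groups. The group names, the function name and the decision to lump every other code into "other" are assumptions made for illustration; a real analysis should define its own grouping from the code books.

```python
# Codes taken from the two RELHD tables above; the grouping is illustrative.
OLD_GROUPS = {1: "head", 2: "wife or cohabitor", 9: "husband of head"}
NEW_GROUPS = {
    10: "head",
    20: "wife or cohabitor",
    22: "wife or cohabitor",
    88: "wife or cohabitor",
    90: "husband of head",
}


def relhd_group(year, code):
    """Map a RELHD code to a coarse group, using the scheme for that year."""
    groups = OLD_GROUPS if year < 1983 else NEW_GROUPS
    return groups.get(code, "other")
```

Passing the wave year alongside the code matters because the same number can mean different things in the two schemes.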





GOTCHAS

  • The Head of Household for a household may not be the same person in all waves of the survey. The Head could be a man in one year and his wife in another (if the man dies, for example), or the father one year and his son the next.

  • When using data from multiple waves, check for inconsistencies in the exact definition of each variable between waves, as well as for differences in coding. A major example is the RELHD variable discussed above. A second example involves employment status. Before 1976 the categories were:
    1. Working now, or only temporarily laid off (70.5%)
    2. Looking for work, unemployed ( 4.5%)
    3. * Retired, permanently disabled (18.3%)
    4. Housewife (4.8%)
    5. Student (1.7%)
    6. Other (0.1%)

    After 1976 they have become:

    1. Working now (70.3%)
    2. Only temporarily laid off ( 1.0%)
    3. Looking for work, unemployed ( 3.4%)
    4. * Retired (14.8%)
    5. * Permanently disabled ( 3.5%)
    6. Housewife ( 5.2%)
    7. Student ( 1.6%)
    8. Other ( 0.2%)
  • Some of the variables refer to the current year and some to the previous year:
                     Current year:  relation to Head, years of education, age, geographic information
                     Previous year: income variables, including hours worked, income earned, commute time,
                                    time on household chores, food expenditures, income tax
  • Most interviews are conducted with the Head of the Household. If the Head is not available, the answers are given by another household member, a 'proxy'. You may want to exclude proxy responses, depending on your research question.

  • Some background questions are asked only when the Head of the Household is a new Head, but the answers are included in each subsequent annual file. Be aware that the values of these variables are carried forward unchanged from one year to the next.
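
As an illustration of the employment status example in the list above, the later eight-category coding can be collapsed back onto the earlier six categories. The mapping follows the two category lists as given; the names used here are hypothetical, and the code books remain the authority on exactly which waves use which coding.

```python
# Later code -> earlier code, following the two category lists above.
LATER_TO_EARLIER_EMPLOYMENT = {
    1: 1,  # Working now -> Working now, or only temporarily laid off
    2: 1,  # Only temporarily laid off -> same combined category
    3: 2,  # Looking for work, unemployed
    4: 3,  # Retired -> Retired, permanently disabled
    5: 3,  # Permanently disabled -> Retired, permanently disabled
    6: 4,  # Housewife
    7: 5,  # Student
    8: 6,  # Other
}


def harmonize_employment(later_code):
    """Return the earlier-scheme code, or None for an unlisted value."""
    return LATER_TO_EARLIER_EMPLOYMENT.get(later_code)
```

Collapsing toward the coarser scheme is the safe direction, since the earlier data cannot be split into the finer categories.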

    The moral is: READ THE CODE BOOKS


Published Research

The following web site contains a list of published papers and books which have used the data from the PSID:

                http://psidonline.isr.umich.edu/Publications/Bibliography/default.aspx


LOCATION of DATA

On the PopNET at PRI:

At PRI, the following location contains SAS data sets of all currently available Family files, the Individual file and a few supplemental files, namely the 1968-1985 Marriage History file and the 1985-1992 Childbirth and Adoption History file:

        /home/sas_data/psid
 
 

Code Books and raw data files downloaded from The PSID web site at U of Michigan are at the following location:

        /home/data/psid
 

Each subdirectory of /home/data/psid represents a separate PSID file and contains the raw data, the Code Book, and the SAS and SPSS programs. These programs can be edited to extract only the variables needed for your research. Because the files contain thousands of variables, working with an entire data file is best avoided whenever possible.

Directories whose names are a year contain the Annual Family files:

            Example: The 1982 Family File is located in the directory /home/data/psid/1982
                     The raw data:  dat7439.f82.gz   (NOTE: the file is gzip-compressed)
                     The code book: cb7439.fam82
                     The SAS code:  sa7439.fam82
                     The SPSS code: sp7439.fam82
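
If you prefer to read a compressed raw data file directly, rather than through the supplied SAS or SPSS programs, the following sketch shows one approach. It assumes the raw file is fixed-width text with one record per line. The column positions and the name EXAMPLE_VAR are hypothetical placeholders; the real positions must be taken from the code book or from the SAS/SPSS program in the same directory.

```python
import gzip

# Hypothetical layout: variable name -> (start, end) slice positions, 0-based.
# Replace these with the positions given in the code book.
EXAMPLE_LAYOUT = {"V8202": (0, 4), "EXAMPLE_VAR": (4, 6)}


def parse_line(line, layout):
    """Cut one fixed-width record into a dictionary of string values."""
    return {name: line[start:end].strip() for name, (start, end) in layout.items()}


def read_fixed_width_gz(path, layout):
    """Read a gzip-compressed fixed-width text file without unpacking it on disk."""
    with gzip.open(path, "rt", encoding="latin-1") as handle:
        return [parse_line(line.rstrip("\n"), layout) for line in handle]
```

Listing only the variables you need in the layout keeps the result small, in the same spirit as editing the SAS or SPSS programs.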

The 'Individ' directory contains the Individual file and is structured like the Annual Family file directories. The following directories contain supplemental files: 'child-dev', 'childbirth' and 'marital_hist'. Each includes a README file that explains the contents of the directory. The 'old_structure' directory contains raw data ONLY for the pre-1990 cross-year files. The Code Books and the SAS and SPSS programs for these files are not online, but hard copies of the Code Books are available at PRI and should be consulted if you want to work with these files. The 'new_structure' directory contains the post-1990 files as they were first downloaded from the PSID institute.

There are hard copies of all PSID code books in the PRI library in 706 Oswald as well as in the computer lab in 806 Oswald Tower.
 
 

At the PSID institute at the University of Michigan :

            http://psidonline.isr.umich.edu/

This is the PSID site. It contains detailed documentation about the survey, citations of published research based on the PSID, and information on the availability of the newest waves.


MERGING PSID DATA SETS
 

When the PSID switched from releasing a single cross-year family file to releasing a separate family file for each year, it solved the file size problem but created a bit more work for the researcher. Most research requires multiple years of the survey as well as individual-level variables. Luckily, merging multiple files into a single file is fairly simple if you keep track of the variables that connect individuals to their families for each year of the survey. The important facts to remember are:

  1. The 1968 Family ID combined with the 1968 Person Number uniquely identifies an individual over all waves of the survey.
  2. The ANNUAL family number identifies a household for a particular wave of the survey.
  3. The INDIVIDUAL file contains all of the annual family numbers under which the individual has participated in the survey.
  4. The Annual Family files contain all of the previous annual family numbers under which the family has participated in the survey.
  5. The annual Sequence Number '01' identifies the Head of Household for that wave of the survey.
  6. The Annual Family files are sorted by the annual family number.
  7. The Individual file is sorted first by the 1968 Family number and then by the 1968 Person Number.
  8. To keep the size of your file manageable, keep only the variables necessary for your research.

You can merge the files of the PSID in many different ways, too many to describe in detail here. To merge the Individual file with one of the Annual Family files, match on the annual family number for that wave, which is contained in both the family file and the Individual file.
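
The merge just described can be sketched in a few lines. The example assumes both files have already been read into lists of dictionaries. It uses the 1982 wave, where the annual family number is V8202 in the Family file and V30373 in the Individual file. The function name, the HYP_ variables, the records, and the use of 0 to stand for "no interview that wave" are all assumptions made for illustration.

```python
def merge_individuals_with_families(individuals, families, individual_key, family_key):
    """Attach the matching family record for one wave to each individual."""
    families_by_number = {family[family_key]: family for family in families}
    merged = []
    for person in individuals:
        family = families_by_number.get(person[individual_key])
        if family is None:
            # No family record for this wave (for example, nonresponse).
            continue
        row = dict(family)
        row.update(person)
        merged.append(row)
    return merged


# Hypothetical records; HYP_INCOME and HYP_AGE are made-up variable names.
families_1982 = [
    {"V8202": 101, "HYP_INCOME": 20000},
    {"V8202": 102, "HYP_INCOME": 35000},
]
individuals = [
    {"V30001": 4, "V30002": 1, "V30373": 101, "HYP_AGE": 40},
    {"V30001": 4, "V30002": 2, "V30373": 101, "HYP_AGE": 38},
    {"V30001": 9, "V30002": 1, "V30373": 0, "HYP_AGE": 55},
]
merged_1982 = merge_individuals_with_families(
    individuals, families_1982, "V30373", "V8202"
)
```

Indexing the family records by family number first means each individual is matched with a single dictionary lookup, which matters when the Individual file has tens of thousands of records.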
 

The PSID web site at Michigan has a good, detailed document on merging files:

                  http://psidonline.isr.umich.edu/Guide/FAQ.aspx#6
 

The following table lists the annual family number variable for each year, both in that year's Annual Family file and in the Individual file.

Year    Family File    Individual File
1968    V3             V30001
1969    V442           V30020
1970    V1102          V30043
1971    V1802          V30067
1972    V2402          V30091
1973    V3002          V30117
1974    V3402          V30138
1975    V3802          V30160
1976    V4302          V30188
1977    V5202          V30217
1978    V5702          V30246
1979    V6302          V30283
1980    V6902          V30313
1981    V7502          V30343
1982    V8202          V30373
1983    V8802          V30399
1984    V10002         V30429
1985    V11102         V30463
1986    V12502         V30498
1987    V13702         V30535
1988    V14802         V30570
1989    V16302         V30606
1990    V17702         V30642
1991    V19002         V30689
1992    V20302         V30733
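
For scripted merges it can be convenient to hold this table in code. The sketch below copies the variable names from the table; the dictionary and function names are illustrative assumptions.

```python
# year -> (variable in that year's Family file, variable in the Individual file)
FAMILY_NUMBER_VARIABLES = {
    1968: ("V3", "V30001"),
    1969: ("V442", "V30020"),
    1970: ("V1102", "V30043"),
    1971: ("V1802", "V30067"),
    1972: ("V2402", "V30091"),
    1973: ("V3002", "V30117"),
    1974: ("V3402", "V30138"),
    1975: ("V3802", "V30160"),
    1976: ("V4302", "V30188"),
    1977: ("V5202", "V30217"),
    1978: ("V5702", "V30246"),
    1979: ("V6302", "V30283"),
    1980: ("V6902", "V30313"),
    1981: ("V7502", "V30343"),
    1982: ("V8202", "V30373"),
    1983: ("V8802", "V30399"),
    1984: ("V10002", "V30429"),
    1985: ("V11102", "V30463"),
    1986: ("V12502", "V30498"),
    1987: ("V13702", "V30535"),
    1988: ("V14802", "V30570"),
    1989: ("V16302", "V30606"),
    1990: ("V17702", "V30642"),
    1991: ("V19002", "V30689"),
    1992: ("V20302", "V30733"),
}


def merge_keys(year):
    """Return (family_file_variable, individual_file_variable) for a wave."""
    if year not in FAMILY_NUMBER_VARIABLES:
        raise KeyError(f"no family number variables recorded for {year}")
    return FAMILY_NUMBER_VARIABLES[year]
```

Looking the names up by year avoids retyping variable names when the same merge is repeated for several waves.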



by benson — last modified September 04, 2007 06:03 PM

Copyright ©2008, The Pennsylvania State University
