Using the PSID
An introduction to the use of the Panel Study of Income Dynamics data collection
INTRODUCTION
The PSID is an annual survey of U.S. families and the individuals who make up those families. The first wave was in 1968 and sampled approximately 4,800 U.S. families who lived in 40 states. These same families have been re-interviewed annually. As family members move out of the Wave 1 households and form new families, the PSID continues to survey them and their new households. By 1992 the number of families had grown to 9,829, encompassing 53,013 individuals.
The survey is conducted annually by interviewing a single adult who
answers all questions for the family. The questions cover the family's
previous year's income, employment, current family composition and
demographics. In some years a longer questionnaire has been used, which
includes questions about each individual's attitudes, expectations and
behavior. The results of the PSID are intended to be used for
cross-sectional, longitudinal, and intergenerational analysis, which
may be performed at either the individual or the family level.
The large number of families who have been re-interviewed annually over
several decades, combined with the detailed and diverse questionnaire,
makes this a valuable and therefore widely used survey among researchers
in the social sciences.
HISTORY
The PSID began as an outgrowth of a survey that was part of President Johnson's War on Poverty, titled 'The Survey of Economic Opportunity' (SEO). This survey was conducted in 1966 and 1967 by the Census Bureau and was intended to provide an understanding of the determinants of family well-being. In 1968 the Census Bureau asked the Survey Research Center (SRC) at the University of Michigan to take over the survey. The SRC started with a sample of 2,000 households from the 1966/1967 SEO study. Because the emphasis of the original study was on poverty, this sample primarily included low-income families. The new study was to be more representative of the U.S. population, so the SRC added approximately 3,000 new households and attached probability-of-selection weights to each family in the survey. The SRC also wrote a new questionnaire and interviewed the new sample of households; the results became Wave 1 of the PSID.
In the late 1980s it became clear that the original PSID sample families under-represented the Latino population of the United States. Not only did the sample not reflect the 1968 Latino population of the U.S., it had become even less representative over time because of a surge of immigration from Latin America to the U.S. during the intervening years. To correct this, a sample of 2,043 Latino households was added to the PSID in 1990. This sample came from the Latino National Political Survey at the University of Texas, Austin. The households are of Mexican, Puerto Rican and Cuban origin, and represent 89% of the U.S. Latino population. The new households were assigned a 'Latino Family weight' as well as an overall PSID weight.
TOPICS COVERED IN THE QUESTIONNAIRES
The original purpose of the PSID was to study the effects of income on U.S. families' well-being (or poverty) over time. Therefore, the main, or core questions of the questionnaires concern economic activities and demographic changes. This includes questions about income sources and amounts, employment, family composition changes and residential location. From time to time additional questions have been added to explore other topics including 'happiness' and intelligence. But the main focus remains economic and demographic in nature. Much of the detailed information gathered is about the household in general and about the primary adults in the household, the head of the household and the spouse/'spouse' of the head of the household. A much smaller amount is gathered about each individual currently living in a PSID household.
The following are examples of the information that is gathered about the household, the Head of the household and the spouse of the Head of the household:
- Sources and amounts of the previous year's income, including: wage income and hours worked, pension payments, inheritance, rent payments, unemployment payments and child support. Some of these are reported as annual amounts and some as monthly amounts.
- Employment information, including the occupation code, whether the job was as a farmer, and whether the job was government work.
- Time spent on housework, and dollars saved by performing household repairs, automobile repairs or by sewing clothes.
- Demographics: Marital status, education background, information on parents and grandparents, current family members and their ages, changes in family composition from last year, members who moved out since last year.
- Geographical information: Current state and city of household as well as the size of the largest city in the county.
FILES AVAILABLE - MAIN FILES
Currently, the PSID consists of two main kinds of data file: the Annual Family files and the Cross Year Individual file.
The Annual Family files are structured with one record per household surveyed during a particular wave or year. A family is uniquely identified and sorted by the Family ID assigned to the household for that year. If a PSID family did not participate in a particular wave, there is no record for that family in the file representing the wave they missed. To help merge a file with other PSID files, the family record contains a variable holding the annual Family ID for each previous wave of the PSID.
The Cross Year Individual file is structured with one record for each person who has ever been a member of a PSID family. An individual is uniquely identified and sorted by the 1968 Family Interview Number (V30001) and 1968 Person Number (V30002). Every family who has ever been in the survey is assigned a 1968 Family number and every individual who has ever been in the survey is assigned a 1968 Person number, whether or not the person or family was interviewed in 1968. To help merge the Individual file with other PSID files, the records in the Individual file also contain the annual Family IDs of the household where the individual lived for each year of the survey.
Prior to 1990, the PSID was released as a Cross Year Family-Individual
Response File, a Cross Year-Individual Nonresponse File and a Cross Year
Family File. That is, the data for ALL waves of the PSID were
included in a single file. In 1990, however, the file exceeded the
length the computers allowed, and the separate annual Family file system
that is currently in use was devised. This eliminated the need for the
nonresponse file. If a family is nonresponsive in a particular year, it
is simply not included in that wave's file. If an individual becomes
nonresponsive, all of that individual's variables are entered as missing
for that year in the Individual file.
FILES AVAILABLE - SUPPLEMENTAL FILES
On an occasional basis, the PSID issues supplemental data files on specialized topics. They contain details of the specific topic which are not included in the main files. In some cases the additional detail comes from additional questions added to the survey. These files rarely cover all of the survey years and do not necessarily include non-sample members. The following is a sample of currently available supplemental files. For a complete list, check the following web site:
http://psidonline.isr.umich.edu/Data/zipSuppData.html
Marriage History File - 1985-1993
This file contains one record per marriage for every individual who
was of marriage age between 1985-1993. Each record contains the 1968
ID numbers of the individual and of one spouse, and some details of the
marriage including the date of the marriage and divorce (if any).
A single record was generated for each individual who never married consisting
mostly of missing data.
Relationship File
This file is largely intended to explain in detail the relationships
(blood, marital or cohabitation) between all pairs of individuals
who were ever part of a PSID family prior to 1983. This file
was intended to be used to analyze the living arrangement patterns
of the households.
Time and Money Transfer File
This file reflects data collected in 1988, detailing transfers
of time or money between relatives and friends. Each record represents
a single transfer in 1987.
Work-History file
This is a unique file. It was first issued in 1984 and is rebuilt
every year as a stand-alone supplemental file. It does not
need to be merged into the Family file. It contains that wave's
ENTIRE Family Record for all HEADS and WIVES/"WIVES", and all details of
their work history. Work history includes information about each
spell of employment and unemployment.
Child Birth and Adoption History - 1985-1993
This file contains one record per birth or adoption. The information
includes identifiers for each parent and the child, details concerning
the birth or the adoption as well as the birth dates of both parents.
IMPORTANT VARIABLES
The following are definitions of some of the most important terms and variables of general interest to all researchers who use the PSID files.
An individual is defined as a member of a household during a particular wave if he was either residing in the interviewed family (not temporary roommates, renters or visiting friends or relatives) or was temporarily away in an institution (college, jail, hospital or the military).
Every individual in the study is defined as either a Sample family member or a Non-Sample family member:
A SAMPLE PSID Family member:
- was a member of a household which was interviewed in the first wave (1968) OR
- was born after 1968 into an original family OR
- moved out of a first wave family and formed a new family (typically children who grew up).
A Non-Sample family member:
- joined the panel study through marriage, cohabitation or co-residency as an adult OR
- joined the panel study as the child of a non-sample adult.
1968 Family ID and 1968 Person Number
An individual can be uniquely identified throughout all waves by using the 1968 Family ID along with the 1968 Person Number. All individuals in the study are assigned these two 1968 variables whether or not they were actually part of the study in 1968.
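As a minimal sketch of how the two 1968 variables can be combined, the Python below builds one integer key per person. Multiplying the Family ID by 1000 is only one possible convention and is an assumption here; it relies on the Person Number never exceeding three digits. The record layout (dictionaries with a made-up `wave` field) is purely illustrative.

```python
def person_key(family_id_1968, person_number_1968):
    """Combine the 1968 Family ID and 1968 Person Number into one integer key."""
    # Assumption: Person Numbers have at most three digits.
    if not 1 <= person_number_1968 <= 999:
        raise ValueError("1968 Person Number is expected to be between 1 and 999")
    return family_id_1968 * 1000 + person_number_1968


# Made-up records for illustration: V30001 and V30002 are the 1968 Family
# Interview Number and 1968 Person Number; "wave" is a hypothetical field.
records = [
    {"V30001": 4, "V30002": 1, "wave": 1968},
    {"V30001": 4, "V30002": 170, "wave": 1975},
    {"V30001": 4, "V30002": 1, "wave": 1975},
]

# Group the waves in which each person appears under a single key.
waves_by_person = {}
for rec in records:
    key = person_key(rec["V30001"], rec["V30002"])
    waves_by_person.setdefault(key, []).append(rec["wave"])
```

Keeping the pair as a tuple, `(family_id, person_number)`, works equally well as a dictionary key and avoids the three-digit assumption.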
It is more complicated to uniquely identify a family. The 1968 Family ID could be used to follow generations of a family, but not a specific household. Every year that a household takes part in the survey, it is assigned an interview number for that year. The annual family files contain both the 1968 Family number and that year's interview number. This annual interview number is the variable that will be used to follow a particular household over time.
The 1968 Person number was not assigned randomly within a family. For instance, '01' is assigned to the 1968 Head of Household. The following table contains the possible values of the 1968 Person Number variable and their meanings. Notice that Sample Members have a Person number less than 170 and Non-Sample Members have a Person number of 170 or greater.
1968 Person # :
SAMPLE MEMBERS:
| 001-019 | People living in sample families at the time of the wave 1 interview (1968) AND individuals living in the Latino sample households in 1990. |
| 020 | Husbands of the Head living in an institution at the time of the wave 1 interview. |
| 021-026 | Children or step-children who were under 25 years old in 1968 who were living in an institution. |
| 030-169 | People who were born into PSID families after the wave 1 interview AND had at least one SAMPLE parent in the study at the time of birth. |
NON-SAMPLE MEMBERS:
| 170-226 | Individuals who moved into the households after 1968 OR were born into a PSID family after 1968 AND had no sample parent. |
| 227-228 | Husband or wife of wave 1 head who moved out or died in the year prior to 1968. |
| 400-499 | Non-sample individuals aged 65 or older who are treated as if they were sample members. |
| 900- | Covers a variety of non-sample individuals. They are included in some of the supplemental files BUT are NOT included in the Individual File. Example: an individual who was an adult child prior to wave 1 (used in the childbirth/adoption file). |
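The ranges in the table above can be turned into a small lookup. The following Python is only a sketch: the function name and the returned labels are made up for illustration, and any value the table does not list is reported as undocumented rather than guessed at.

```python
def classify_person_number(person_number):
    """Classify a 1968 Person Number using the ranges listed in the table."""
    if 1 <= person_number <= 26 or 30 <= person_number <= 169:
        return "sample"
    if 170 <= person_number <= 228:
        return "non-sample"
    if 400 <= person_number <= 499:
        # Non-sample individuals aged 65 or older, treated as sample members.
        return "non-sample, treated as sample"
    if person_number >= 900:
        # Appear only in some supplemental files, not in the Individual file.
        return "supplemental files only"
    return "undocumented"  # value not covered by the table
```

For example, `classify_person_number(1)` identifies the 1968 Head as a sample member, while `classify_person_number(170)` is the first non-sample value.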
Sequence Number
Every year that a family participates in the survey it is assigned an Interview Number for that year, and every member of the family is assigned a Sequence Number to identify the individual within the family. The Sequence number is assigned to an individual in the following manner:
| 01-20. | Individuals in the family at the time of the current interview: 01 for the Head, 02 for his wife, etc. |
| 51-59. | Individuals in institutions at the time of the current interview; they were nonresponsive. |
| 71-80. | Individuals who moved out of the household OR out of an institution between the previous and current interviews, but who were not included in another responding household. All such individuals are nonresponsive. |
| 81-89. | Individuals living in the previous wave but who died before the current interview. |
| 00. | Born or moved in after the current interview OR nonresponse for the current wave OR mover-out nonresponse in the previous wave. |
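A short Python sketch of how the Sequence Number ranges above might be decoded when selecting the people actually present in a household for a given wave. The function names and labels are illustrative assumptions, not PSID terminology, and values outside the listed ranges are flagged rather than interpreted.

```python
def sequence_status(sequence_number):
    """Translate an annual Sequence Number into a status label (per the table)."""
    if sequence_number == 0:
        return "not in family this wave or nonresponse"
    if 1 <= sequence_number <= 20:
        return "in family at interview"
    if 51 <= sequence_number <= 59:
        return "in institution"
    if 71 <= sequence_number <= 80:
        return "moved out"
    if 81 <= sequence_number <= 89:
        return "died since previous wave"
    return "undocumented"  # value not covered by the table


def is_present(sequence_number):
    """True for individuals living in the family at the time of the interview."""
    return 1 <= sequence_number <= 20


# Made-up Sequence Numbers for one household in one wave.
household = [1, 2, 3, 51, 71, 81, 0]
present = [seq for seq in household if is_present(seq)]
```

Filtering on `is_present` before computing household-level counts keeps movers-out, institutionalized members and the deceased from being counted as current residents.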
Relationship to Head
The Head of Household is the householder who is usually interviewed and about whom most of the personal information is gathered. Because of this it is important to know who the Head is and how every other family member relates to the Head. The PSID uses a definition for Head of Household which is similar to the one used by the Bureau of the Census. It has become quaintly old-fashioned, but remains in use for consistency's sake. The Head of Household is preferably the adult male head. If the family is a married-couple family, the husband is the Head unless he is severely disabled. The Relationship to Head or RELHD variable is included as part of the Individual file and defines how each individual is related to the Head:
1968 - 1982 - Original values of Relationship of Individual to Head of Household (RELHD):
| 1 | Head |
| 2 | Wife/"Wife" |
| 3 | Son or daughter, including step-children and foster children |
| 4 | Brother or sister of head |
| 5 | Father or mother of head |
| 6 | Grandchild, niece, nephew, other relatives under 18 |
| 7 | Other relative, including in-laws, other adult relatives |
| 8 | Husband or Wife of Head who moved out or died in 1967 (1968), non-relative (after 1968) |
| 9 | Husband of current head |
| 0 | Individual from core sample who was born or moved in after the current wave, non-responsive in current wave, or individual from Latino supplement in 1990 |
In 1983 it was determined that these categories were too narrow, and thereafter the possible values were greatly expanded. One important example is the Head's wife. Prior to 1983 a single value, '2', sufficed for both wife and cohabitor. Since 1983, wife is '20', cohabitor or 'wife' is '22' and girlfriend is '88'. A girlfriend is defined as a first-year cohabitor; if the woman is still cohabiting after that year, her RELHD value becomes '22'. Another example is that all children of the Head were formerly '3', but now there are several values: children of Head (30), stepchildren of Head (33), children of cohabitor but not of Head (35), children-in-law (37) and children of girlfriend (83).
1983 - Revised values of Relationship of Individual to Head of the Household (RELHD):
| 10. | Head |
| 20. | Legal wife |
| 22. | "Wife"--female cohabitor who has lived with Head for a year or more or who was present in the previous wave |
| 30. | Son or daughter of Head (includes adopted children but not stepchildren) |
| 33. | Stepson or stepdaughter of Head (children of legal wife [code 20] who are not children of Head) |
| 35. | Son or daughter of "wife" but not Head (includes only those children whose mother's relationship to Head is 22 but who are not Head's children) |
| 37. | Son-in-law or daughter-in-law of Head (includes stepchildren-in-law) |
| 38. | Foster son or foster daughter, not legally adopted |
| 40. | Brother or sister of Head (includes step and half sisters and brothers) |
| 47. | Brother-in-law or sister-in-law of Head; i.e., brother or sister of legal wife. |
| 48. | Brother or sister of Head's cohabitor (the cohabitor's relationship code=22 or 88) |
| 50. | Father or mother of Head (includes stepparents) |
| 57. | Father-in-law or mother-in-law of Head (includes parents of legal wives [code 20] only) |
| 58. | Father or mother of Head's cohabitor (the cohabitor's relationship code=22 or 88) |
| 60. | Grandson or granddaughter of Head (includes only legal wife's [code 20] grandchildren; those of a cohabitor are coded 97) |
| 65. | Great-grandson or great-granddaughter of Head (includes only legal wife's [code 20] great-grandchildren; those of a cohabitor are coded 97) |
| 66. | Grandfather or grandmother of Head (includes stepgrandparents) |
| 67. | Grandfather or grandmother of legal wife (code 20) |
| 68. | Great-grandfather or great-grandmother of Head |
| 69. | Great-grandfather or great-grandmother of legal wife (code 20) |
| 70. | Nephew or niece of Head |
| 71. | Nephew or niece of legal wife (code 20) |
| 72. | Uncle or Aunt of Head |
| 73. | Uncle or Aunt of legal wife (code 20) |
| 74. | Cousin of Head |
| 75. | Cousin of legal wife (code 20) |
| 83. | Children of first-year cohabitor but not of Head (this child's parent is coded 88) |
| 88. | First-year cohabitor of Head |
| 90. | Legal husband of Head |
| 95. | Other relative of Head |
| 96. | Other relative of legal wife (code 20) |
| 97. | Other relative of cohabitor (the cohabitor's code=22 or 88) |
| 98. | Other non relatives (includes homosexual friends, friends of children of the FU, etc.) |
| 99. | NA relationship; NA who is householder (V19017=9999); two primaries share HU (V19016=8 or 9) |
| 00. | Inap.: FU is in institution |
GOTCHAS
- The Head of Household for a household may not always be the same person in all waves of the survey. The head could be a man in one year and his wife in another (if the man dies, for example), or may be the father one year and his son the next.
- When using data from multiple waves, check for inconsistencies in the exact definition of each variable between waves as well as for differences in coding. A major example is the RELHD variable discussed above. A second example involves employment status. Before 1976 the categories were:
- Working now, or only temporarily laid off (70.5%)
- Looking for work, unemployed (4.5%)
- * Retired, permanently disabled (18.3%)
- Housewife (4.8%)
- Student (1.7%)
- Other (0.1%)
From 1976 onward the categories were:
- Working now (70.3%)
- Only temporarily laid off (1.0%)
- Looking for work, unemployed (3.4%)
- * Retired (14.8%)
- * Permanently disabled (3.5%)
- Housewife (5.2%)
- Student (1.6%)
- Other (0.2%)
- Some of the variables refer to the current year and some to the previous year:
Previous year: income-related variables, including hours worked, income earned, commute time, time on household chores, food expenditures and income tax.
- Most interviews are conducted with the Head of the Household. But if the head is not available, the answers are given by another household member, a 'proxy'. You may want to eliminate proxy responses, depending upon your research question.
- Some background questions are asked only when the Head of the Household is a new Head, but are included in each subsequent annual file. Be aware that these variables have not changed from one year to the next.
The moral is: READ THE CODE BOOKS
PUBLISHED RESEARCH
The following web site contains a list of published papers and books which have used the data from the PSID:
http://psidonline.isr.umich.edu/Publications/Bibliography/default.aspx
LOCATION OF DATA
On the PopNET at PRI:
At PRI, the following location contains SAS data sets of all currently available Family files, the Individual file and a few supplemental files: the 1968-1985 Marriage History file and the 1985-1992 Childbirth and Adoption History file.
/home/sas_data/psid
Code Books and raw data files downloaded from The PSID web site at U of Michigan are at the following location:
/home/data/psid
Each subdirectory of /home/data/psid represents a separate PSID file and contains: the raw data, the Code Book and the SAS and SPSS programs. These programs can be edited to extract only the variables needed for your research. Because the files contain thousands of variables, no one works with the entire data files if it can be helped.
Directories whose names are a year contain the Annual Family Files.
Example: The 1982 Family File is located in the directory /home/data/psid/1982
The raw data : dat7439.f82.gz
(NOTE: the file is ZIPPED)
The code book: cb7439.fam82.
The SAS code : sa7439.fam82
The SPSS code: sp7439.fam82
The 'Individ' directory contains the Individual file and is structured similarly to the Annual Family file directories. The following directories contain supplemental files: 'child-dev', 'childbirth' and 'marital_hist'. They all include a README file which explains the contents of the directory. The 'old_structure' directory contains raw data ONLY for the pre-1990 cross-year files. The Code Books and SAS and SPSS files are not online, but there are hard copies of the Code Book at PRI that should be referenced if you want to work with these files. The 'new_structure' directory contains the post-1990 files as they were first downloaded from the PSID institute.
There are hard copies of all PSID code books in the PRI library in 706
Oswald as well as in the computer lab in 806 Oswald Tower.
At the PSID institute at the University of Michigan:
http://psidonline.isr.umich.edu/
This is the PSID site. It contains detailed documentation about the survey, citations of published research based on the PSID and information on the availability of the newest waves.
MERGING PSID DATA SETS
When the PSID changed its policy from creating a single cross year family file each year to creating an annual family file each year, they solved the file size problem, but created a bit more work for the researcher. Most research requires multiple years of the survey and individual-level variables. Luckily, it is a fairly simple process to merge multiple files into a single file, if you keep track of the variables which connect an individual to his or her family for each year of the survey. The important facts to remember are:
- The 1968 Family ID combined with the 1968 Individual ID uniquely identifies an individual over all waves of the survey.
- The ANNUAL family number identifies a household for a particular wave of the survey.
- The INDIVIDUAL file contains a list of the annual family numbers under which this individual has participated in the survey.
- The Annual Family files contain a list of all previous annual family numbers under which this family has participated in the survey.
- The ANNUAL person number (the Sequence Number) '01' is the Head of Household for that wave of the survey.
- The Annual Family files are in sort order by the annual family number.
- The Individual file is in sort order first by the 1968 Family number, and then by the 1968 Individual number.
- To keep the size of your file manageable, keep only the variables necessary for your research.
You can merge the files of the PSID in many different ways, too numerous
to go into detail here. To merge the Individual file with one of
the Annual Family files, use the Annual family number contained in the
family file and in the Individual file for that wave.
The PSID web site at Michigan has a good detailed document on merging files:
http://psidonline.isr.umich.edu/Guide/FAQ.aspx#6
The following table lists the Annual Family number variable in each Annual Family file and the corresponding variable in the Individual file.
| Year | Family File variable | Individual File variable |
|---|---|---|
| 1968 | V3 | V30001 |
| 1969 | V442 | V30020 |
| 1970 | V1102 | V30043 |
| 1971 | V1802 | V30067 |
| 1972 | V2402 | V30091 |
| 1973 | V3002 | V30117 |
| 1974 | V3402 | V30138 |
| 1975 | V3802 | V30160 |
| 1976 | V4302 | V30188 |
| 1977 | V5202 | V30217 |
| 1978 | V5702 | V30246 |
| 1979 | V6302 | V30283 |
| 1980 | V6902 | V30313 |
| 1981 | V7502 | V30343 |
| 1982 | V8202 | V30373 |
| 1983 | V8802 | V30399 |
| 1984 | V10002 | V30429 |
| 1985 | V11102 | V30463 |
| 1986 | V12502 | V30498 |
| 1987 | V13702 | V30535 |
| 1988 | V14802 | V30570 |
| 1989 | V16302 | V30606 |
| 1990 | V17702 | V30642 |
| 1991 | V19002 | V30689 |
| 1992 | V20302 | V30733 |
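The merge described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended procedure: records are assumed to have already been read into dictionaries keyed by variable name, the function name is made up, and `FAM_INCOME` is a hypothetical stand-in for whatever family-level variable you keep. Only the variable pairs in `ID_VARS` come from the table above.

```python
# Annual Family number variables, copied from the table above:
# year -> (variable in that year's Family file, variable in the Individual file)
ID_VARS = {
    1968: ("V3", "V30001"),     1969: ("V442", "V30020"),
    1970: ("V1102", "V30043"),  1971: ("V1802", "V30067"),
    1972: ("V2402", "V30091"),  1973: ("V3002", "V30117"),
    1974: ("V3402", "V30138"),  1975: ("V3802", "V30160"),
    1976: ("V4302", "V30188"),  1977: ("V5202", "V30217"),
    1978: ("V5702", "V30246"),  1979: ("V6302", "V30283"),
    1980: ("V6902", "V30313"),  1981: ("V7502", "V30343"),
    1982: ("V8202", "V30373"),  1983: ("V8802", "V30399"),
    1984: ("V10002", "V30429"), 1985: ("V11102", "V30463"),
    1986: ("V12502", "V30498"), 1987: ("V13702", "V30535"),
    1988: ("V14802", "V30570"), 1989: ("V16302", "V30606"),
    1990: ("V17702", "V30642"), 1991: ("V19002", "V30689"),
    1992: ("V20302", "V30733"),
}


def merge_individuals_with_family(individuals, families, year):
    """Attach one wave's family record to each individual in that family."""
    fam_var, ind_var = ID_VARS[year]
    # Index the family records by that year's annual family number.
    family_by_id = {fam[fam_var]: fam for fam in families}
    merged = []
    for person in individuals:
        family = family_by_id.get(person.get(ind_var))
        if family is None:
            continue  # no family record for this person in this wave
        merged.append({**person, **family})
    return merged


# Made-up data: two people in 1982 family 101, one person with no 1982 family.
individuals = [
    {"V30001": 4, "V30002": 1, "V30373": 101},
    {"V30001": 4, "V30002": 170, "V30373": 101},
    {"V30001": 7, "V30002": 3, "V30373": None},
]
families_1982 = [{"V8202": 101, "FAM_INCOME": 25000}]  # FAM_INCOME is hypothetical
merged_1982 = merge_individuals_with_family(individuals, families_1982, 1982)
```

Repeating the call for each wave of interest, and keeping only the variables you need from each file, gives a person-level file with the family variables for every year attached.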

