Donna J. Peterson


Existing records will be the source of important data in many evaluations, particularly when conducting a needs assessment. Yet secondary data sources are often overlooked because it is assumed that attempts to determine the need for a program must rely on primary data sources (Kettner, Moroney, & Martin, 1990). However, when time and resources are a factor, existing records often provide the most efficient and effective strategy. 

Existing records refer to data acquired from secondary sources rather than from original data collection efforts (Hatry, 1994; Kettner et al., 1990). In other words, this information was not collected during the course of the present evaluation, but instead for other purposes. Two broad categories of existing records are: 1) those collected as part of an agency's normal process of implementing a program (sometimes called agency records and/or utilization data) and 2) existing data such as surveys or reports completed by outside sources. Existing records may include information on number of program participants, participant characteristics, vital statistics, spending levels, income and poverty levels, resources acquired and spent, productivity, school dropout rates, test scores, divorce proceedings, and recidivism rates. 

Sources of this information can include governmental agencies at local, state, national, or international levels and nongovernmental organizations. Some governmental agencies gather data regularly (McKenzie & Smeltzer, 1997). Certain data collection is required by law (e.g., census, births, deaths, notifiable diseases), while other data collection is voluntary (e.g., use of seatbelts). Additionally, most programs are required to keep records on their participants and the resources expended (Rossi & Freeman, 1993). Various types of existing data can be obtained from published reports, on the World Wide Web, or from the actual agency records.


*Provide information about the incidence (the number of new cases), prevalence (the number of existing cases), and rate (the proportion of a population with the particular concern) of a particular programmatic concern in a population (Rossi & Freeman, 1993)
*Aid in definition and selection of a target population
*Aid in defining objectives and setting goals of a new program
*Help improve the planning and design of new programs
*Provide information about how "services" are being delivered and the activities of staff and program participants (some agency records only)
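
As a minimal sketch of the three measures named in the first item above, the Python below counts incidence, prevalence, and rate from a small set of case records. The records, field names, population size, and reporting year are all invented for illustration; they do not come from any real data source.

```python
# Hypothetical case records: the year each case began and whether it is still open.
cases = [
    {"id": 1, "onset_year": 1996, "active": True},
    {"id": 2, "onset_year": 1997, "active": True},
    {"id": 3, "onset_year": 1997, "active": False},
    {"id": 4, "onset_year": 1995, "active": True},
]
population = 2000    # assumed size of the population of interest
report_year = 1997   # assumed reporting year

# Incidence: the number of new cases that began during the reporting year.
incidence = sum(1 for c in cases if c["onset_year"] == report_year)

# Prevalence: the number of cases that currently exist.
prevalence = sum(1 for c in cases if c["active"])

# Rate: the proportion of the population with the concern.
rate = prevalence / population

print(incidence, prevalence, rate)
```

Real agency records rarely arrive this tidy, so the definitions of "new" and "existing" cases used by the source should be confirmed before counting.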


*If you are using agency records, your information will apply only to those individuals participating in that program -- agency records exclude data on individuals who are not participating but who could or should be.

*If you are using published reports or data collected by outside sources, you will not have information about the individuals involved in the specific program which you are evaluating.

*Existing records will not give you precise information about your geographic area unless the data were collected specifically in your area (for example, if you access your state's records on teen pregnancy rates, those rates may not accurately reflect the pregnancy rates in your own community).

*Published reports will not allow you to determine the impact of your program on its actual participants.



If you are using the Five-Tiered Approach to Program Evaluation outlined in the State Strengthening Evaluation Guide (Callor, Betts, Carter, & Marczak, 1997), existing records can be helpful at several levels. 

TIER 1 - Program Definition

One task in Tier 1 is to assess community needs and assets to determine the need for a specific program. Turning to existing records at this stage can be helpful in a number of ways. First, they will be a quick and easy way to determine the occurrence of a problem in your population and help you select your target audience. For example, if you want to provide a respite program for families with developmentally or physically challenged children, you may be able to use existing records from your county health services to determine the number of families facing this concern in your area. Existing records can also help you define objectives, set goals and design your program. For example, after determining that several of these families in your area have limited incomes, you may choose to focus on low-income families who would be the least likely to be able to purchase desired respite care. By accessing additional information from published reports, you see that the time parents spend with their nonchallenged children is less than what they would like. As a result, you decide that, in addition to providing "time off" for parents, another goal will be to enable them to spend some quality time with their other children. Having your goals and objectives in mind will then help you design your program.

TIER 3 - Understanding and Refining

As part of Tier 2, you were keeping records on the characteristics of participants, how service was provided, the time each staff member spent working with each family, and the way the participants felt about the program. The goal of Tier 3 is to utilize the information gathered in Tier 2 to help you improve your program. This may be a good time to turn to records from other agencies with a similar program. For example, if you notice that your program is not serving families from rural areas, while another program is, comparing your recruiting procedures and service delivery methods may help you make decisions about refining your program. 

TIER 4 - Progress Toward Objectives

The goal of Tier 4 is to document program effectiveness. While it is usually most advantageous to obtain original data specifically for your program outcomes, there may be times when using existing records may provide another option and be more cost effective. For example, assume that another goal of your respite care program is to enhance academic competency and extracurricular activities of the physically challenged children in the program. While you could gather your own data through academic assessments and questionnaires asking about their extracurricular activities, it might be more efficient and cost effective to get such data from existing school records. You may find through school records that children in your respite care program made significantly greater improvements on their grades and joined more extracurricular activities over the academic year than their counterparts who did not participate in the program. You should, however, be cautious about interpreting such data and drawing conclusions about its link to your program. Since the records were not collected specifically for your project, you probably had little control over issues such as when, how, and what types of data were collected. For these reasons, you should be more stringent and cautious when interpreting existing data. Also, be sure to keep in mind that there may be other factors outside of your control that have contributed to the outcome. 


*Availability and accessibility of data -- The data is ready for use when you are; there is no waiting for data collection (Kettner et al., 1990; McKenzie & Smeltzer, 1997)

*Low cost -- There are usually no costs associated with the use of existing records (O'Sullivan & Rassel, 1995). Individuals have free access to data collected by the government (McKenzie & Smeltzer, 1997). The information can be obtained from the agency that collected the data, by searching the World Wide Web, or by finding it in a library that serves as a US government depository for government documents (e.g., college and university libraries and large public libraries). If utilizing agency records to evaluate one of the organization's programs, you will likely have free access to their existing records. 

*Minimum staff needed -- Because you will be using already existing information, fewer individuals will be needed than if you had to collect new information. (McKenzie & Smeltzer, 1997) 

*Comparative or longitudinal data may be available -- For example, if you want to see the trends in teenage pregnancy rates over several years or would like to compare rates in different regions of the United States, referring to existing records would be the most efficient procedure. (Kettner et al., 1990; O'Sullivan & Rassel, 1995)



*Missing or incomplete data will affect the overall accuracy of the information (Hatry, 1994)

*Some information needed for program evaluation may not be available; limitations exist based on what information was collected (McKenzie & Smeltzer, 1997). 

*Depending on the source, existing data may not be in the format you require -- For example, it may not be specific to your geographic location (e.g., the data was collected in New York City, but your program is in a small Kansas community) or it may be in an overly aggregated form (e.g., you want to break participants out by their specific grade level, but the statistics you are utilizing only report information for elementary students, middle school students, and high school students) (Hatry, 1994; Kettner et al., 1990)

*Unknown, different or changing definitions of data -- Concepts may not have been defined and measured in the same way over time or across sources (Hatry, 1994). This will affect the reliability and comparability of the information (O'Sullivan & Rassel, 1995; Rossi & Freeman, 1993). One common example is the use of race and/or ethnic categories. Some studies may describe only ethnicity, while others may mix race and ethnicity. 

*Difficulty in gaining access to necessary records due to confidentiality issues -- This is often a problem when human services, education, and criminal justice programs are being evaluated or when sensitive information is being collected. (Hatry, 1994; McKenzie & Smeltzer, 1997)



Using existing records for an evaluation often raises unexpected problems (Hatry, 1994). Therefore, "the challenge is to make needed adjustments that do not compromise the overall quality of the evaluation" (Hatry, 1994, p. 385).

*Missing or incomplete data -- In such cases, first determine whether the missing information prevents you from answering questions important to the evaluation.

1. Gather the missing or incomplete data (although this is often not possible)

2. Leave out the data

3. Assign values to the missing data that best represent the group (e.g., replace missing information with the mean of the available data)

4. Delete incomplete cases, but weight each complete case to compensate for those that were excluded
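
Options 3 and 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, and the choice to weight complete cases within each area are hypothetical, and the source does not prescribe a particular weighting scheme.

```python
from statistics import mean

# Hypothetical records: monthly income by area; None marks a missing value.
records = [
    {"area": "rural", "income": 1200},
    {"area": "rural", "income": None},
    {"area": "urban", "income": 2000},
    {"area": "urban", "income": 2400},
]

def impute_with_mean(rows, field):
    """Option 3: replace missing values with the mean of the available values."""
    available = [r[field] for r in rows if r[field] is not None]
    fill = mean(available)
    return [dict(r, **{field: fill if r[field] is None else r[field]}) for r in rows]

def weight_complete_cases(rows, field, group):
    """Option 4: drop incomplete cases and weight the remaining cases in each
    group so the group keeps its original share of the total."""
    weighted = []
    for g in sorted({r[group] for r in rows}):
        members = [r for r in rows if r[group] == g]
        complete = [r for r in members if r[field] is not None]
        if not complete:
            continue  # no complete case is left to stand in for this group
        weight = len(members) / len(complete)
        weighted.extend(dict(r, weight=weight) for r in complete)
    return weighted

imputed = impute_with_mean(records, "income")
weighted = weight_complete_cases(records, "income", "area")
weighted_mean = (
    sum(r["income"] * r["weight"] for r in weighted)
    / sum(r["weight"] for r in weighted)
)
print(imputed[1]["income"], weighted_mean)
```

In this toy data the two options give different answers, which is a reminder to report which approach was used and why.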

*Data unavailable in format required -- Instead of having specific data on each individual, the data is often combined or aggregated. However, evaluators often want the information broken apart. 

1. Collect new data for the breakouts desired

2. Go back to the individual records to gather the information

3. Exclude the unavailable breakouts from the evaluation

*Unknown, different, or changing definitions of data -- Determine how the information you are using was defined and gathered. This information is needed to determine the accuracy and comparability of the data. 

1. Identify the definitions and data collection procedures used and check for changes during the period covered in the evaluation

2. Make adjustments when differences in definitions or collection procedures are found

a. Exclude the variables for which data are unavailable in compatible definitions

b. Check the original data and make appropriate adjustments (e.g., if one source obtained monthly income and another obtained annual income, you could multiply the monthly amount by 12 to calculate annual income)

3. Discuss percentage changes rather than absolute values -- This will provide relatively accurate comparisons if the definitions and collection procedures remained stable over the period included in the evaluation

4. Keep a record of unsolved data definition problems and estimate their effect on your findings

5. Exclude the analysis of that variable if the problem cannot be corrected
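
The adjustment in step 2b and the percentage comparison in step 3 can be sketched as follows. The function names and the figures are hypothetical and serve only to show the arithmetic.

```python
def to_annual(amount, period):
    """Step 2b: convert an income figure to an annual amount."""
    periods_per_year = {"monthly": 12, "annual": 1}
    return amount * periods_per_year[period]

def percent_change(old, new):
    """Step 3: percentage change from an earlier value to a later one."""
    return (new - old) * 100 / old

# Two hypothetical sources reporting the same household's income differently.
source_a = to_annual(1500, "monthly")
source_b = to_annual(18000, "annual")

# Hypothetical case counts from two years measured with the same definition.
change = percent_change(200, 230)

print(source_a, source_b, change)
```

As step 3 notes, a percentage change is only meaningful when the definition and collection procedure stayed the same across the period being compared.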

*Data linked across time and participants -- Sometimes you may want to use data from different programs. However, separate agencies may use different identifiers or track participants in different ways. 

1. Closely examine names to identify variations (e.g., consider multiple identifiers such as age, address, and social security number to ensure that you are referring to the same person)

2. Use a standard time period for collecting information -- If individuals are selected without consideration of program length or if outcomes are not measured for the same time period, the information obtained cannot be compared 
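
One way to act on step 1 is to compare several identifiers at once rather than relying on names alone. The sketch below is a hypothetical illustration: the field names, the sample records, and the rule that two agreeing identifiers are enough are all assumptions, and birth year and ZIP code stand in for the age and address identifiers mentioned above.

```python
def normalize_name(name):
    """Lowercase a name and strip punctuation and extra spaces."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def same_person(a, b, min_matches=2):
    """Treat two records as one person when enough identifiers agree."""
    matches = 0
    if normalize_name(a["name"]) == normalize_name(b["name"]):
        matches += 1
    if a["birth_year"] == b["birth_year"]:
        matches += 1
    if a["zip"] == b["zip"]:
        matches += 1
    return matches >= min_matches

# Hypothetical records for what may be the same participant at three agencies.
agency_a = {"name": "O'Brien, Mary", "birth_year": 1984, "zip": "66002"}
agency_b = {"name": "OBrien  Mary", "birth_year": 1984, "zip": "66002"}
agency_c = {"name": "Mary Obrien", "birth_year": 1979, "zip": "67301"}

print(same_person(agency_a, agency_b), same_person(agency_a, agency_c))
```

A rule this simple will miss reordered names and will sometimes join different people, so uncertain matches are best reviewed by hand.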

*Confidentiality and privacy considerations -- Protect the privacy of individuals in the following ways:

1. Do not record participants' names

2. Assign each individual a code number, and store the list that cross-references the number to names separately; destroy this list after the evaluation requirements are met 

3. Obtain data without identifiers (such as name, address, social security number) from the agency

4. Do not include any details in evaluation reports that would allow a reader to link a particular finding to a specific individual

5. Obtain written permission before mentioning a specific case
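
Steps 1 through 3 can be sketched as a small routine that separates identifiers from the analysis data. Everything here is an assumption for illustration: the field names, the sample participants, and the use of random eight-character codes.

```python
import secrets

def assign_codes(participants):
    """Split records into a de-identified data set and a separate crosswalk."""
    crosswalk = {}      # code -> name; store apart from the data, destroy later
    deidentified = []   # the records the evaluator actually analyzes
    for person in participants:
        code = secrets.token_hex(4)
        while code in crosswalk:   # regenerate on the rare collision
            code = secrets.token_hex(4)
        crosswalk[code] = person["name"]
        # Drop direct identifiers before the record enters the analysis file.
        cleaned = {k: v for k, v in person.items()
                   if k not in ("name", "address", "ssn")}
        cleaned["code"] = code
        deidentified.append(cleaned)
    return deidentified, crosswalk

# Hypothetical participants with made-up identifiers.
participants = [
    {"name": "A. Example", "address": "1 Main St", "ssn": "000-00-0000", "visits": 4},
    {"name": "B. Sample", "address": "2 Oak Ave", "ssn": "000-00-0001", "visits": 7},
]
data, key = assign_codes(participants)
print(data)
```

The crosswalk returned as `key` is the list that step 2 says to store separately and destroy once the evaluation requirements are met.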


The procedure for using existing records will vary somewhat depending on the type and source of the information. The following section provides general guidelines for working with agency records and existing data from outside sources.

Obtaining Data From Agency Records (Hatry, 1994)

Before actual data collection begins: 

1. Make friends with staff who collected the data. If you are asking for information from people you do not know, getting to know these individuals early can be beneficial throughout the process. 

2. Try to work with the staff most familiar with the records. Ask about possible problems, such as changes in the way concepts were defined, problems in getting the information, and reliability and validity problems. This will tell you what information you will be able to obtain and will help you decide how to handle problems which may arise. 

3. If you ask the agency to provide information instead of requesting access to their files, you can make the task easier for staff by providing advance notice, putting your request in writing, providing clear descriptions of the information you need, and indicating why the information is needed. However, remain open to alternatives that may be suggested. 

4. Obtain samples of the data formats and definitions before you actually compile your data to see what information is available. This will help you decide whether to sacrifice some of the desired information, to obtain information not currently in the desired form, or to accept the data as it is. 

During actual data collection: 

5. Identify the periods of time and geographical areas that apply to each piece of information collected. This information may be necessary if adjustments need to be made or if discrepancies need to be explained when writing your report. 

After initial data have been obtained:

6. Decide how to handle missing or incomplete information for each item of interest. 

7. Check for illogical and inconsistent data, and try to obtain the correct information. 

8. Have staff who initially collected the data verify it if they are able and willing. 

9. Thank the staff for assisting. 

10. Provide necessary cautions in your report. Be sure to indicate how any problems with the data may have affected your findings. 
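
The check for illogical and inconsistent data in step 7 can be partly automated. The sketch below is a hypothetical example: the fields, the sample records, and the three rules are assumptions, and a real set of checks would follow the definitions supplied by the agency.

```python
from datetime import date

def find_problems(record):
    """Return a list of illogical or inconsistent values found in one record."""
    problems = []
    if not 0 <= record["age"] <= 120:
        problems.append("age out of range")
    if record["exit_date"] is not None and record["exit_date"] < record["entry_date"]:
        problems.append("exit date precedes entry date")
    if record["hours_of_service"] < 0:
        problems.append("negative hours of service")
    return problems

# Hypothetical agency records; the second one contains two entry errors.
records = [
    {"id": 101, "age": 34, "entry_date": date(1997, 3, 1),
     "exit_date": date(1997, 9, 1), "hours_of_service": 40},
    {"id": 102, "age": 340, "entry_date": date(1997, 5, 1),
     "exit_date": date(1997, 2, 1), "hours_of_service": 12},
]

flagged = {r["id"]: find_problems(r) for r in records if find_problems(r)}
print(flagged)
```

A list like `flagged` gives the staff who collected the data something concrete to verify, which ties in with step 8.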

Obtaining Existing Data From Outside Sources

There are various sources of information collected and reported by governmental agencies that are accessible to the public. Statistics from these federal sources can be applied to your own geographic location. For example, if you know the prevalence of a problem is 2.5 per 100, and there are 26,000 individuals in your area, you can estimate that about 650 individuals in your area are affected (26,000 x .025 = 650). Below is a description of information available from the federal government. Similar information is available from state and local governments.
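
The scaling step in the example above can be checked with a short Python sketch. The function name is hypothetical. The result is only an estimate, because a rate measured elsewhere may not reflect conditions in your own community.

```python
def expected_cases(rate_per_100, population):
    """Scale a rate expressed per 100 people to a local population."""
    return population * rate_per_100 / 100

local_estimate = expected_cases(2.5, 26_000)
print(local_estimate)  # 650.0
```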

Recall that access to information collected by the government is free. Reports can be obtained from the agency that collects the data, from a library that is a US government depository for government documents (e.g., many college and university libraries and large public libraries), or on the World Wide Web.

Some of the major statistical agencies of the federal government that collect, compile, analyze, and publish data for general use, and that may be especially applicable, include: 1) the National Center for Health Statistics, 2) the National Institutes of Health, 3) the Bureau of the Census, 4) the Bureau of Labor Statistics, 5) the Department of Agriculture, 6) the National Center for Education Statistics, and 7) the Bureau of Justice Statistics.

1. The National Center for Health Statistics (NCHS)

The National Center for Health Statistics is one of the seven divisions of the Centers for Disease Control and Prevention (CDC) (O'Sullivan & Rassel, 1995). This agency obtains vital statistics and collects information through a variety of national surveys, including the National Health Interview Survey (NHIS), the National Health and Nutrition Examination Survey (NHANES), the Youth Risk Behavior Surveillance System (YRBSS), and the Health Records Survey. Basic vital statistics are published in the "Monthly Vital Statistics Report: Provisional Data" and in annual volumes of the "Vital Statistics of the United States." Data from the NHIS and NHANES are published in the "Vital and Health Statistics" series. Information from the YRBSS appeared in the "Morbidity and Mortality Weekly Report" in 1995.

2. The National Institutes of Health (NIH)

NIH is the "flagship of federal government health websites, and the gateway for many other public health websites." NIH includes 24 separate Institutes, Centers, and Divisions, including the National Institute on Aging, the National Institute on Alcohol Abuse and Alcoholism, the National Institute of Child Health and Human Development, the National Institute on Drug Abuse, and the National Institute of Mental Health. While a great deal of public health information can be accessed from this site, it is important to point out that much of the material is quite technical.

3. The Bureau of the Census

The Bureau of the Census is part of the US Department of Commerce. Information collected by this agency is typically an excellent source of demographic information because the Bureau is concerned with producing high-quality data that can be compared across time (O'Sullivan & Rassel, 1995). The "Statistical Abstract of the United States" has been published since 1878 and provides a summary of statistics on the social, political, and economic organization of the US. A new edition is published in January and includes data for up to two years prior to the current date. Information from several surveys is included in this publication. A brief description of various surveys conducted by the Census Bureau follows (McKenzie & Smeltzer, 1997).

The Decennial Census of Population and Housing. The Bureau takes a census of the US every ten years to obtain information on population, income, employment, family size, education, type of dwelling, and other social indicators. 

The Current Population Survey. This is a monthly survey of households conducted by the Bureau of Labor Statistics and the Bureau of the Census to gather population and labor-force information. The Census Bureau examines only the population data and produces the "Current Population Reports." The data usually represent approximately 60,000 households.

The Survey of Income and Program Participation. The Census Bureau and the Social Services Administration created the SIPP to collect monthly information on income, employment, and receipt of government assistance. This shows the effect of federal assistance on recipients and on the level of federal spending. Approximately 30,000 households are included. Each household participates for 2 ½ years, with data collected once every four months, so only 25% of the households are contacted in any given month.

4. The Bureau of Labor Statistics (BLS)

The Bureau of Labor Statistics is an independent national agency which collects, analyzes, and disseminates statistical data on employment and unemployment, prices and living conditions, compensation and working conditions, and productivity and technology. The BLS releases information on labor-force participation from the "Current Population Survey" in its monthly report on the nation's employment and unemployment rates.

5. The US Department of Agriculture (USDA)

There are four agencies within the US Department of Agriculture with a major focus on research, education, and economics. The Agricultural Research Service's (ARS) goal is to "ensure an adequate supply of high quality, safe food and other agricultural products to meet the nutritional needs of consumers, sustain a competitive food and agricultural economy, to enhance quality of life and economic opportunity for rural citizens and society as a whole, and to maintain a quality environment and natural resource base." One of the research objectives of this agency is to improve human nutrition and well-being. The Cooperative State Research, Education and Extension Service (CSREES) "works with partners and customers to advance research, extension and higher education in the food and agricultural sciences and related environmental and human sciences to benefit people, communities and the Nation." The Economic Research Service (ERS) "produces economic and other social science information to serve the general public and to help Congress and the Administration develop, administer and evaluate agricultural and rural policies and programs." The National Agricultural Statistics Service (NASS) provides statistical information and services for producers, financial institutions, and agricultural organizations, services, and business.

6. The National Center for Education Statistics (NCES)

The National Center for Education Statistics is part of the US Department of Education. The purpose of this agency is to collect and report "statistics and information showing the condition and progress of education in the United States and other nations in order to promote and accelerate the improvement of American education." Some of the major surveys conducted by NCES are: Early Childhood Longitudinal Study, High School and Beyond, National Adult Literacy Survey, National Household Education Survey, Private School Survey, and Schools and Staffing Survey.

7. The Bureau of Justice Statistics (BJS)

The Bureau of Justice Statistics is part of the US Department of Justice. This agency collects information on crimes reported to the police, including juvenile justice statistics. The website contains links to statistics from the FBI and other federal agencies, to international crime and justice statistics, to other crime statistics websites, and to the Office of Juvenile Justice and Delinquency Prevention.

Additional Sources of Statistics on the Internet

*The US Department of Education's Educational Resources Information Center (ERIC) has links to sources of demographic data on educational achievement, general demographic data, and school demographic data.

*The University of Michigan Documents Center has information on agriculture, business and industry, consumers, cost of living, demographics, economics, education, energy, environment, finance and currency, foreign economics, foreign governments, foreign trade, government finances, health, housing, labor, military, politics, science, sociology, transportation, and weather.

*CYFERNet, the Cooperative Extension System's Children, Youth and Family Information Service, also has links to various children, youth, and family statistics and demographics.

*FedStats is known as a one stop shopping site for federal statistics. Although over 70 agencies in the US Federal Government produce statistics of interest to the public, visitors to this site currently have access to statistics from 14 of these agencies. Other agencies will be added as the site develops. However, a list of all agencies is accessible to help point you toward additional sources of information.

*The Centers for Disease Control and Prevention website contains links to scientific data, studies, laboratory information, health statistics, and to the CDC's individual Centers, Institutes, and Offices.

*The Institute for Research on Poverty (IRP) at the University of Wisconsin-Madison publishes a newsletter three times per year called "Focus." This newsletter includes articles on poverty-related research and issues. While the publication can be ordered (free of charge) in print form, articles are also available at IRP's website.

*The Children's Defense Fund is a private, nonprofit organization supported by foundations, corporation grants, and individual donations. The website includes state-by-state data on key indicators that measure aspects of children's lives.


1. Baj, J. and others. (1991). A feasibility study of the use of unemployment insurance wage-record data as an evaluation tool for JTPA. Available from EDRS.

This report discusses the use of state unemployment insurance wage-record data to assess the effectiveness of Job Training Partnership Act (JTPA) programs. It discusses several issues associated with the use of existing records: coverage, accuracy, timeliness, and confidentiality. 

2. Federal Interagency Forum on Child and Family Statistics (1997). America's children: Key national indicators of well-being. Available on the web at

This report is the result of a collaboration by several government agencies who collect and report data on children. It describes population and family characteristics and contains information on key indicators of children's well-being: economic security (poverty and income, food security, housing problems, parental employment, and health insurance), health (prenatal care, infant mortality, low birthweight, immunizations, activity limitation, child mortality, adolescent mortality, teen births, and a summary health measure), behavior and social environment (cigarette smoking, alcohol use, substance abuse, and victims of violent crime), education (difficulty speaking English, family reading to children, early childhood education, math and reading proficiency, high school completion, detached youth, and higher education), and child abuse and neglect. There is also a discussion of the sources and limitations of the data presented.

3. Hernandez, D. J. (1995). Changing demographics: Past and future demands for early childhood programs. The Future of Children, 5(3). Available on the web at

This article describes how demographic changes among American families from the mid-1800s to the present have influenced the demand for early childhood care and educational programs.

4. Hernandez, D. J. (1997). Child development and the social demography of childhood. Child Development, 68, 149-169.

This article describes historic and current trends and statistics regarding the demographic characteristics and environments of children. The information can be used by individuals to compare their study populations to the general population of children. It also includes an Appendix with questions to measure the demographic characteristics and environments of children.

5. National Commission for Employment Policy. (1992). Using unemployment insurance wage-record data for JTPA Performance management. Available from EDRS.

This report discusses the potential use of Unemployment Insurance wage records to track the employment and earnings experiences of participants in programs of the Job Training Partnership Act (JTPA). It is an example of how to use existing records or "share data." One section of the report describes the data itself, the data sharing experience, confidentiality issues, and issues involved in data sharing, such as costs and data accuracy. 

6. O'Sullivan, E., & Rassel, G. R. (1995). Research methods for public administrators (2nd ed.). White Plains, NY: Longman Publishers USA.

Chapter 9 in this book (Secondary Data Analysis: Finding and Analyzing Existing Data) provides a very detailed description about the use of secondary data. It discusses strategies for identifying, accessing, and evaluating the quality of existing data and describes the general content of major US Census Bureau population surveys and vital records.

7. Record exchange process: A set of records for handicapped students in vocational education. Available from EDRS.

This article describes a process for the sharing of records by agencies.

8. Soriano, F. I. (1995). Conducting needs assessments: A multidisciplinary approach. Thousand Oaks, CA: Sage Publications.

This book discusses needs assessments from the initial stages of planning and developing (methods and measures) to evaluating and reporting findings. Examples and exercises are included. 


Callor, S., Betts, S. C., Carter, R., & Marczak, M. S. (1997). State strengthening evaluation guide. Tucson, AZ: USDA/CSREES & The University of Arizona. 

Hatry, H. P. (1994). Collecting data from agency records. In J. S. Wholey, H. P. Hatry, & K. E. Newcomer (Eds.), Handbook of practical program evaluation (pp. 374-385). San Francisco: Jossey-Bass Publishers.

Kettner, P. M., Moroney, R. M., & Martin, L. L. (1990). Designing and managing programs: An effectiveness-based approach. Newbury Park, CA: Sage Publications.

McKenzie, J. F., & Smeltzer, J. L. (1997). Planning, implementing, and evaluating health promotion programs: A primer (2nd ed.). Boston: Allyn and Bacon.

O'Sullivan, E., & Rassel, G. R. (1995). Research methods for public administrators (2nd ed.). White Plains, NY: Longman Publishers USA.

Rossi, P. H., & Freeman, H. E. (1993). Evaluation: A systematic approach (5th ed.). Newbury Park, CA: Sage Publications.
