n. a set of records within a records series that are chosen for permanent retention through a statistical process that ensures each file within the series has an equal chance of being included in the retained set

McKay 1978, 286
Many archivists and historians fear that wonderfully detailed letters from constituents, reflecting more than just the basic topic the constituent writes about, may be destroyed by weeding. Study of extensive runs of this correspondence indicates, however, that these meaty letters form a small portion of the total series. And a truly random sample allows 20 percent of such letters to be saved, just as it reduces the routine mail by a uniform 80 percent.

Boles 1981, 127
Survey statisticians assume that the results of a systematic sample approximate those of a simple random sample. The justification for this assumption rests on the realization that there are two ways by which selection can be randomized. The first is by application of random number tables during the process of selection. The second is by the randomization of the population elements prior to selection.

Hull 1981, 13
The true random sample must avoid at all costs any suspicion of special cases or special pleading, neither may it relate to nth. items or years or in any other way take particular interests into consideration. If all these matters are rigorously excluded it is then argued that the result will be acceptable for quantitative analysis or for any other statistical purpose.

Kepley 1984, 239–240
A random sample differs in that a random number chart is used to determine the units to be retained. Random sampling has been described as “purer” statistically because it dramatically reduces the possibility of the sample being biased by the systematic selection process itself. If, for example, one chooses every tenth file from a series where the tenth file relates always to the same subject, the sample would not be representative; rather, it would be skewed toward that subject to the exclusion of others.

Guptill 1985, 65
A random sample was taken so as not to skew the results in favor of large or small companies, and a list was prepared of all company contacts for the use of scholars.

Melvin 1992, 48
In some cases, totally random samples of unimportant files were saved to document their substantial but insignificant existence.

a set of records within a records series that are chosen for permanent retention through a process that employs a preset pattern of selection to produce a sample of the whole

Lewinson 1957, 304
For a random sample he must make provision for “removing from the files every fifth, tenth, twentieth, etc., case depending upon the percentage” required; “a random sample in the ratio of 1 to 5”; or “10 area cases from each of the selected cities, 25 . . . from critical areas, 20 . . . from each of the 8 Litigation Offices.”

a set of records within a records series that are chosen for permanent retention through a process that ensures the retention of a statistical sample of each type or category of file within the series in the retained set at the same percentage as it was within the whole

Anderson 1980, 176
An alternate course is to select significant information from each file and then to select a random sample of complete files for preservation. By pulling out and preserving summaries of quantitative and demographic information from each case file, the archivist can both retain linkages between different records on the same individual and protect against the possibility of a non-representative random sample.

Hindus, Hammett, and Hobson 1980, 43
Because of our belief that the earlier years would yield more interesting files, we decided to oversample the early years by taking equal sample sizes for each decade, despite the fact there were far fewer cases in the early decades. That is, we formed a stratified random sample, with decades representing the strata. In this way, we could be sure of obtaining a substantial amount of information for each time period as well as information on the entire population of cases.
Notes
The random sample may be divided into three subcategories: simple random, stratified random, and systematic random.

The first, the simple random sample, is a pure form of the process: the records in the sample are chosen via a totally random process based on the necessary sample size for the body of records. If the records in play are quite homogeneous and there is no need to save them in their entirety, the simple random sample is likely the best methodology.

A systematic random sample (also known as a systematic sample) is one without absolute randomness; instead, a pattern of selection is employed. Systematic samples include those that consist of every 100th file in a series, all records from years ending with a 0 or a 5, or every tenth file within each set of 100 files that occurs after each randomly selected file in a series.

A stratified random sample (also known as a stratified sample) is a type of sample created only when working with a heterogeneous set of records. In a set of case files with different types of cases that take up different percentages of the files, a stratified sample can ensure that a statistically valid sample is created for each case type. For instance, a stratified sample of a series of criminal court case files would consist of a simple or systematic random sample that was statistically valid for each type of crime (or each stratum) of the series.
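
The three subcategories can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than an established archival procedure: the function names, the invented series of 1,000 criminal court case files, the 10 percent retention rate, and the choice of a random starting point for the systematic sample. The stratified function uses proportional allocation, so each case type keeps roughly the same share of the sample as it has in the whole series.

```python
import random
from collections import Counter, defaultdict


def simple_random_sample(files, n, rng):
    """Draw n files so that every file has an equal chance of selection."""
    return rng.sample(files, n)


def systematic_sample(files, interval, rng):
    """Keep every interval-th file, beginning at a randomly chosen offset."""
    start = rng.randrange(interval)
    return files[start::interval]


def stratified_sample(files, stratum_of, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    strata = defaultdict(list)
    for f in files:
        strata[stratum_of(f)].append(f)
    sample = []
    for members in strata.values():
        # Keep at least one file so that no stratum disappears entirely.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical series: (file number, case type) pairs, invented for this sketch.
case_types = ["theft"] * 700 + ["assault"] * 200 + ["fraud"] * 100
series = list(enumerate(case_types, start=1))

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

simple = simple_random_sample(series, 100, rng)
systematic = systematic_sample(series, 10, rng)
stratified = stratified_sample(series, lambda f: f[1], 0.10, rng)

# Count how many files of each case type the stratified sample retained.
stratified_counts = Counter(case_type for _, case_type in stratified)
```

In this invented series the stratified sample retains 70 theft, 20 assault, and 10 fraud files, matching the 70/20/10 split of the whole. The simple random sample of the same size carries no such guarantee for the smaller case types, and the systematic sample depends on the order in which the files happen to be arranged.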