Hurley, Catherine B. and Mahmoud, Hosam M. (1994) Analysis of an Algorithm for Random Sampling. Probability in the Engineering and Informational Sciences, 8. pp. 153-168. ISSN 0269-9648
Preview
CH_analysis of.pdf
Download (631kB) | Preview
Abstract
We analyze a standard algorithm for sampling m items without replacement from a computer file of n records. The algorithm repeatedly selects a record at random from the file, rejecting records that have previously been selected, until m records are obtained. The running time of the algorithm has two components: a rejection component and a search component. We show that the probability distribution of the rejection component undergoes an infinite series of phase transitions, depending on the order of magnitude of m relative to n. We identify an infinite number of ranges of m, each with a different behavior. The rejection component is distributed as a linear combination of Poisson-like random variables. The search component is customarily done using a hash table with separate chaining. The analysis of the hashing scheme in this problem differs from previous hashing analyses, as the number of lookups in the hash table for each insertion has a geometric distribution. We show that the average overall cost of searching is asymptotically linear with only two phase transitions in the coefficient of linearity.
Item Type: | Article |
---|---|
Keywords: | analysis; algorithm; random sampling; |
Academic Unit: | Faculty of Science and Engineering > Mathematics and Statistics Faculty of Science and Engineering > Research Institutes > Hamilton Institute |
Item ID: | 15466 |
Depositing User: | Dr. Catherine Hurley |
Date Deposited: | 09 Feb 2022 12:11 |
Journal or Publication Title: | Probability in the Engineering and Informational Sciences |
Publisher: | Cambridge University Press |
Refereed: | Yes |
Related URLs: | |
URI: | https://mu.eprints-hosting.org/id/eprint/15466 |
Use Licence: | This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA). Details of this licence are available here |
Repository Staff Only (login required)
Downloads
Downloads per month over past year