1. **Problem statement:** We have a database with 100 records, 10 corrupted and 90 uncorrupted. A data analyst randomly selects 8 records. We want the probability distribution for the number of corrupted records selected, and then find the mean and variance of this distribution.
2. **Distribution type:** This is a hypergeometric distribution because we are sampling without replacement from two groups (corrupted and uncorrupted).
3. **Formula for hypergeometric probability:**
$$P(X = k) = \frac{\binom{10}{k} \binom{90}{8-k}}{\binom{100}{8}}$$
where $k$ is the number of corrupted records selected. Because the sample holds 8 records and 10 corrupted records are available, $k$ ranges from 0 to $\min(8, 10) = 8$.
4. **Calculate probabilities:** For each $k = 0, 1, 2, \dots, 8$, evaluate:
$$P(X=k) = \frac{\binom{10}{k} \binom{90}{8-k}}{\binom{100}{8}}$$
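The evaluation in step 4 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the names `N`, `K`, `n`, `hypergeom_pmf`, and `pmf` are chosen here for the example and are not part of the original solution.

```python
from math import comb

# Assumed names for the quantities in the problem statement:
N = 100  # total records in the database
K = 10   # corrupted records
n = 8    # records selected by the analyst


def hypergeom_pmf(k: int) -> float:
    """Probability of drawing exactly k corrupted records in n draws."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Tabulate P(X = k) over the support k = 0, ..., min(n, K).
pmf = {k: hypergeom_pmf(k) for k in range(min(n, K) + 1)}

for k, p in pmf.items():
    print(f"P(X={k}) = {p:.6f}")

# The probabilities should sum to 1 up to floating-point rounding.
print("total =", sum(pmf.values()))
```

Most of the probability mass sits on $k = 0$ and $k = 1$, which is consistent with a mean below 1.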
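5. **Mean and variance formulas for hypergeometric distribution:** Here $N = 100$ is the population size, $K = 10$ is the number of corrupted records, and $n = 8$ is the sample size.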
- Mean:
$$\mu = n \frac{K}{N} = 8 \times \frac{10}{100} = 0.8$$
- Variance:
$$\sigma^2 = n \frac{K}{N} \left(1 - \frac{K}{N}\right) \frac{N-n}{N-1} = 8 \times \frac{10}{100} \times \left(1 - \frac{10}{100}\right) \times \frac{100-8}{99}$$
6. **Calculate variance:**
$$\sigma^2 = 8 \times 0.1 \times 0.9 \times \frac{92}{99} = 0.72 \times 0.929293 \approx 0.6691$$
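As a cross-check on the arithmetic, the closed-form mean and variance can be compared with values computed directly from the probability distribution. The sketch below uses exact rational arithmetic from the standard library; the variable names are assumptions made for this example.

```python
from fractions import Fraction
from math import comb

N, K, n = 100, 10, 8  # population size, corrupted records, sample size

# Closed-form hypergeometric mean and variance, kept as exact fractions.
p = Fraction(K, N)
mean_formula = n * p
var_formula = n * p * (1 - p) * Fraction(N - n, N - 1)

# The same quantities computed directly from the definition of the PMF.
pmf = {
    k: Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))
    for k in range(min(n, K) + 1)
}
mean_direct = sum(k * pk for k, pk in pmf.items())
var_direct = sum((k - mean_direct) ** 2 * pk for k, pk in pmf.items())

print("mean:", mean_formula, "=", float(mean_formula))
print("variance:", var_formula, "=", float(var_formula))
print("formulas match direct sums:",
      mean_formula == mean_direct and var_formula == var_direct)
```

Working in fractions avoids rounding in intermediate steps, so the two routes can be compared for exact equality rather than to a tolerance.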
7. **Summary:**
- Probability distribution: $P(X=k) = \frac{\binom{10}{k} \binom{90}{8-k}}{\binom{100}{8}}$ for $k=0,1,...,8$
- Mean: $0.8$
- Variance: approximately $0.669$
This gives the full probability distribution and its mean and variance for the number of corrupted records selected.