On CIs: the CI for p that uses p-hat in the variance is too liberal, in that it produces intervals whose coverage probability tends to be much smaller than the nominal level. For example, it is not unusual for intervals with nominal 95% coverage to have actual coverage probabilities more than 10% lower than that, even for samples as large as 30. Using the t multiplier instead of the z makes the intervals a little longer and hence increases the coverage probability. Thus, the t turns out to be a decent practical solution for no good theoretical reason.
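The coverage claim can be checked directly, since for a fixed n and true p the coverage probability is just a binomial sum over the outcomes whose interval contains p. What follows is only a rough sketch, not a definitive treatment: the function name `wald_coverage`, the choice n = 30, the grid of p values, and the use of n - 1 = 29 degrees of freedom for the t multiplier (taken as the table value 2.045) are all assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_coverage(n, p, multiplier):
    """Exact coverage probability of p_hat +/- multiplier * sqrt(p_hat(1-p_hat)/n).

    Sums the Binomial(n, p) probabilities of every count x whose
    interval contains the true p.
    """
    total = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half_width = multiplier * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


n = 30                               # illustrative sample size
z = NormalDist().inv_cdf(0.975)      # standard normal multiplier, about 1.96
t = 2.045                            # table value of t(0.975, df = 29); df = n - 1 is an assumption

for p in (0.05, 0.10, 0.20, 0.50):   # illustrative true proportions
    cov_z = wald_coverage(n, p, z)
    cov_t = wald_coverage(n, p, t)
    print(f"p = {p:.2f}   z coverage = {cov_z:.3f}   t coverage = {cov_t:.3f}")
```

Because the t interval is the z interval stretched about the same center, every outcome covered with z is also covered with t, so the t coverage can never come out lower. How far below nominal either one falls depends heavily on p, and is worst for p near 0 or 1.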
There are many other "solutions" to this problem, some of the better ones coming out of the Bayesian approach. But it is difficult to bring all of this into an introductory course. For AP, we will have to accept the standard textbook approach.
A good resource is the book by Santner and Duffy (1989), THE STATISTICAL ANALYSIS OF DISCRETE DATA, Springer-Verlag.