The recent questions about poker hands and Texas Hold'em have me just concerned enough to post anew a message I've posted a couple of times before, at the risk of sounding like a broken record. My concern is solely that teachers new to AP Statistics may not realize that such questions lie outside the curriculum; I don't mean to discourage teachers from teaching some combinatorics, nor from discussing the topic on the listserv.
It used to be that a lot of college statistics courses were really "probability and statistics", and would include a hefty dose of the first before beginning the second. I know of one university course in which probability is intentionally taught first not because it prepares students for statistics, but because it typically gives students a lot of trouble and thus weeds out early the students who will (supposedly) have a tough time with statistics, saving them future agony.
I don't agree with this practice, as students who have trouble with combinatorics problems often have no trouble with applied statistics. And many colleges have come to that conclusion and now teach introductory statistics without requiring a full probability course. This is reflected in the AP Statistics curriculum, which includes only as much probability as usefully serves the statistics that we teach. Below is a list of some probability topics that AP Statistics students should know. I don't claim it's exhaustive--you should read the course syllabus carefully to see such a list--but I think it hits most of the big topics. If you look at old AP exams, both free-response and the released multiple-choice questions, you'll get a good feel for what's expected of students in the way of probability. I wouldn't trust non-College-Board course prep guides too much, though; I've found that some of them seem to misjudge the relative importance of some topics, notably probability.
Here is my list of big probability ideas, in the order I teach them. It doesn't include random variables:
-- Be able to draw a two-factor contingency table containing either data counts (from which probability estimates may be made) or probabilities. Know that joint probabilities are at row-column intersections, that marginal (i.e., unconditional) probabilities are the sums of rows and columns, and that the total of these for all columns or all rows (or all joint cells) must equal 1. Use the table to estimate conditional probabilities or (less usefully) probabilities of unions.
-- Know the definition of conditional probability, P(B|A)=P(A^B)/P(A), and its consequence, P(A^B) = P(A)*P(B|A).
-- Independence means that P(B|A)=P(B). This occurs when events don't have an impact on one another, and should be questioned when there's the possibility that the outcome of one event does have an impact on the other. A consequence of independence (not the definition) is that P(A^B) = P(A)*P(B).
-- Be able to draw a probability tree properly and know that the probabilities in it are unconditional, then conditional, then (on the "leaves") joint probabilities. Be able to use it to solve problems involving conditional probabilities. These include "Bayes's Theorem" problems, so called because their solutions implicitly invoke Bayes's Theorem; you don't actually need to teach or name Bayes's Theorem explicitly.
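For anyone who likes to see the tree arithmetic spelled out, here is a minimal sketch in Python. The function name `tree_leaves` and the three input probabilities are made up purely for illustration; they don't come from any real data set. The sketch multiplies along the branches (unconditional times conditional) to get the joint "leaf" probabilities, then uses the leaves to reverse the conditioning, which is all a "Bayes's Theorem" problem asks for.

```python
from fractions import Fraction


def tree_leaves(p_a, p_b_given_a, p_b_given_not_a):
    """Return the four joint ("leaf") probabilities of a two-stage tree.

    First-stage branches carry unconditional probabilities, second-stage
    branches carry conditional ones, and each leaf is their product:
    P(A^B) = P(A) * P(B|A).
    """
    p_not_a = 1 - p_a
    return {
        ("A", "B"): p_a * p_b_given_a,
        ("A", "not B"): p_a * (1 - p_b_given_a),
        ("not A", "B"): p_not_a * p_b_given_not_a,
        ("not A", "not B"): p_not_a * (1 - p_b_given_not_a),
    }


# Hypothetical inputs, chosen only to keep the fractions tidy.
leaves = tree_leaves(
    p_a=Fraction(1, 10),
    p_b_given_a=Fraction(9, 10),
    p_b_given_not_a=Fraction(2, 10),
)

# P(B) is the sum of the leaves that end in B.
p_b = leaves[("A", "B")] + leaves[("not A", "B")]

# Reverse the conditioning straight from the definition:
# P(A|B) = P(A^B) / P(B).
p_a_given_b = leaves[("A", "B")] / p_b

print("P(B)   =", p_b)
print("P(A|B) =", p_a_given_b)
```

Using `Fraction` rather than decimals keeps the answers exact, so students can check them against a hand-drawn tree.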
That's roughly four or five class days. Towards the end of the school year there always seem to be a fair number of teachers who feel rushed, so I think it's good to point out to new teachers now a place where time may be saved. The syllabus shouldn't feel packed and you shouldn't feel rushed to teach the whole course in a year. So it's reasonable to ask yourself whether you're spending too long on probability. Most importantly, notice what is not in my list above: license plates, couples at round-table dinner parties, and poker hands. That is, no combinatorics. The only combinatorial element I know of in the entire curriculum is the binomial coefficient, and that can be skipped altogether, so long as students are comfortable using their calculators to compute binomial probabilities directly.
And here's another suggestion for how to teach the topics above. Instead of doing so with playing cards, dice, and balls in urns (real or just in problem statements), consider getting simple data sets that include two categorical factors with two to five levels each, and some interesting interplay between them, such as U.S. region and smoking habits. Ask the obvious questions: Which region has the people most likely to smoke? Is that true among the chain-smokers as well as the light smokers? Where is the highest concentration of smokers? Did you use the same numbers to answer that question as the first question? (One should be A|B, the other B|A.) Is smoking independent of the region of the U.S. a person is from? Two or three data sets like this could cover all the relevant topics in probability, and they're pretty easy to find.
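To make the A|B versus B|A distinction concrete, here is a small Python sketch. The counts are placeholders I invented so the example runs; in class you would substitute a real data set. The names `counts`, `p_smoker_given_region`, and `p_region_given_smoker` are likewise just illustrative.

```python
# Placeholder counts (NOT real data): region by smoking status.
counts = {
    "Northeast": {"smoker": 300, "non-smoker": 1700},
    "Southeast": {"smoker": 200, "non-smoker": 600},
    "West": {"smoker": 150, "non-smoker": 1050},
}


def p_smoker_given_region(region):
    """P(smoker | region): condition on a row of the table."""
    row = counts[region]
    return row["smoker"] / sum(row.values())


def p_region_given_smoker(region):
    """P(region | smoker): condition on the smoker column."""
    total_smokers = sum(row["smoker"] for row in counts.values())
    return counts[region]["smoker"] / total_smokers


# "Which region has the people most likely to smoke?"
highest_rate = max(counts, key=p_smoker_given_region)

# "Where is the highest concentration of smokers?"
most_smokers = max(counts, key=p_region_given_smoker)

# Marginal P(smoker), for the independence question:
# independence would require P(smoker | region) = P(smoker) in every region.
grand_total = sum(sum(row.values()) for row in counts.values())
p_smoker = sum(row["smoker"] for row in counts.values()) / grand_total

print("Highest smoking rate:", highest_rate)
print("Most smokers live in:", most_smokers)
for region in counts:
    print(region, round(p_smoker_given_region(region), 4), "vs", round(p_smoker, 4))
```

With these placeholder numbers the two questions have different answers, which is exactly the point of asking students whether they used the same numbers both times.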
I used to feel like I had to make up fake data sets (which I hate doing) to get good examples of independence, since it's so rare in real data. I thought of two things I could do with real data to still teach the topic. One is this. I would tell students that independence means P(B|A)=P(B), without telling them that a consequence is P(A^B)=P(A)*P(B). Then I would give them some real data, but only marginal counts. I would then ask them to fill in the joint cell counts assuming that the factors are independent. They would have to play with the numbers and figure out the one way to make everything work out right. We would then talk about the multiplication "rule" of independence. Then I would show them the real joint counts and ask whether they thought the factors were independent. They'd say no, and we'd talk about the fact that smoking rates are higher in the southeast than in the northeast, for example.
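For teachers who want an answer key for the fill-in-the-cells exercise, here is a minimal Python sketch. It assumes you have only the marginal counts; the marginals below are invented placeholders, and the function name `independent_joint_counts` is my own. Under independence each joint count is forced to be (row total * column total) / grand total, which is the multiplication rule written in counts rather than probabilities.

```python
def independent_joint_counts(row_totals, col_totals):
    """Fill in joint counts from marginals, assuming independence.

    Independence means P(A^B) = P(A) * P(B). Multiplying both sides by the
    grand total gives: count = row_total * col_total / grand_total.
    """
    grand_total = sum(row_totals.values())
    if grand_total != sum(col_totals.values()):
        raise ValueError("row and column totals must add to the same grand total")
    return {
        (row, col): row_total * col_total / grand_total
        for row, row_total in row_totals.items()
        for col, col_total in col_totals.items()
    }


# Placeholder marginal counts (NOT real data).
row_totals = {"Northeast": 2000, "Southeast": 800, "West": 1200}
col_totals = {"smoker": 650, "non-smoker": 3350}

expected = independent_joint_counts(row_totals, col_totals)

for (region, status), count in expected.items():
    print(f"{region:10s} {status:11s} {count:8.1f}")
```

The filled-in cells need not be whole numbers, which is itself a useful talking point before you reveal the real joint counts.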
The other thing I would do is this. I'd give them a smallish sample data set in which the numbers didn't produce exact independence but were "close", and again I'd ask them whether they thought the factors were independent. Many of them, going strictly by the numbers, would say no. Some others would say "almost" or "close". That would give us a chance to foreshadow the future idea of sampling variability, and how the population might have independence even though the sample doesn't fit the equations perfectly.
And that's it. If I ever had a student who felt lost on the AP exam because we hadn't counted license plate permutations, she never told me about it.
(I have had success teaching probability this way myself and I do encourage other teachers to try it, especially if you feel a need to save time somewhere in the syllabus. But the person who knows a class best is always their own teacher, so you are the best judge of whether this approach will work for you and your students. If others have their preferred ways to teach probability, I expect I'm not alone in being interested in them sharing their strategies on the listserv.)