

loom91


Re: Help with a school project, a statistical survey on education
Posted: Jul 22, 2007 2:36 PM


On Jul 21, 4:36 pm, The Qurqirish Dragon <qurqiri...@aol.com> wrote:
> On Jul 20, 10:55 am, loom91 <loo...@gmail.com> wrote:
>
> > I'm also looking for specific help on the following topics:
>
> > i)What is a suitable measure of whether girls are stronger at some
> > subjects while boys at other subjects?
>
> Assuming you have the individual subject-test scores for each person,
> then let mu_bi be the population mean for boys in subject i, and mu_gi
> be the population mean for girls in subject i. You want to test the
> hypotheses (1) mu_bi > mu_gi and (2) mu_gi > mu_bi. Performing this
> sort of hypothesis test is one of the first things that most
> statistics texts cover when considering two related populations.
> If exactly one test fails, then your evidence supports a gender bias
> in the subject. If both tests fail, then the evidence indicates no
> bias (or, more correctly stated, fails to show there is a bias. If
> this is a beginning statistics class, then you should be certain to
> tell them the difference between those two phrasings)
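For reference, the pair of one-sided tests described in the quoted reply might be sketched as below. This is only a rough illustration: it assumes a large-sample normal approximation with unequal variances (the quoted reply does not say which test to use), the function name `one_sided_p` is hypothetical, and the marks are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def one_sided_p(a, b):
    """Approximate p-value for the alternative mean(a) > mean(b).

    Assumption: large-sample normal approximation with unequal
    variances. With small samples a t-based test is more appropriate.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 1.0 - NormalDist().cdf(z)

# Made-up marks in one subject, purely for illustration.
boys = [62, 71, 55, 68, 74, 59, 66, 70]
girls = [64, 69, 58, 73, 77, 61, 65, 72]

p_boys_higher = one_sided_p(boys, girls)   # test (1): mu_bi > mu_gi
p_girls_higher = one_sided_p(girls, boys)  # test (2): mu_gi > mu_bi
```

A small p-value in exactly one of the two tests would point towards a difference in that direction. The eight made-up values per group are far too few for the normal approximation to be trusted.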
Thanks for your comments. But we are not aiming to measure the absolute difference between boys and girls. I had tried to give a picture of what we are looking for above:
"i) What is a suitable measure of whether girls are stronger at some subjects while boys at other subjects? I'm thinking of comparing the percentage of total marks obtained in one subject, standardised against the whole population. For example, consider the variable X = percentage of total marks earned in History+Geography. Next, we define the standardised (wrt the entire population) variate corresponding to X, let it be Z. Now we compute the mean of Z over the girls schools (E_1(Z)) and the mean over the boys schools (E_2(Z)).
If the first value is larger than the second value (it seems one will have to be positive and the other negative), then we may say that girls prefer humanities more over other subjects than boys. Next we can do the same analysis on the boys vs girls population in coed schools and see if the difference is less. By using the absolute marks instead of expressing it as percentage of total marks, we can also compare the relative performance (as opposed to preference) of boys and girls in humanities. The same can be done for languages and sciences. Is this a statistically sound measure (unlikely, since I just made it up)? What are the alternatives?"
As you see, we are not comparing whether boys score more marks than girls in math. We are aiming to measure whether girls are weaker in math than boys *relative to the other subjects*. For example, if a boy scores very low marks in all subjects, but scores comparatively better in Biology, you would say he was strong in Biology, even though many boys scored more in Biology than him. This is why I was expressing marks in the subject as a percentage of total marks obtained, to judge the relative contribution of the subject irrespective of whether the student is good or bad overall. Then I standardise wrt the whole population to see whether the subject contributes more or less to a student's score than the population average. By taking the mean over the boys population and seeing whether it is more than the mean over the girls population, we can judge whether the subject contributes more to the totals of one sex than the other. Does this make sense? Will it work well? Will something else work better?
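To make the proposed measure concrete, here is a minimal sketch of the computation, assuming each student record holds a sex label, the marks in the subject, and the total marks. The records and the names `students` and `group_mean` are hypothetical, invented only for illustration.

```python
from statistics import mean, pstdev

# Hypothetical records: (sex, marks in the subject, total marks).
students = [
    ("g", 80, 400), ("g", 60, 350), ("g", 45, 200),
    ("b", 70, 420), ("b", 30, 210), ("b", 50, 380),
]

# X = percentage of each student's total marks earned in the subject.
x = [100.0 * subject / total for _, subject, total in students]

# Z = X standardised with respect to the whole population.
mu, sigma = mean(x), pstdev(x)
z = [(xi - mu) / sigma for xi in x]

def group_mean(sex):
    """Mean of Z over the students of one sex."""
    return mean(zi for (s, _, _), zi in zip(students, z) if s == sex)

e_girls = group_mean("g")  # E_1(Z)
e_boys = group_mean("b")   # E_2(Z)
```

Because Z has mean zero over the whole population, the two group means weighted by group size always sum to zero, so whenever they differ one is positive and the other negative. The sign alone therefore says nothing about whether the gap is larger than chance would produce.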
> > iii)Is there some easily available (preferably free) software that
> > will let me do all this analysis (brownie points for fitting
> > probability distributions and graphing)? It would be a nightmare to do
> > this by hand since we usually work with less than 50 data points
> > instead of several hundred.
>
> Off hand, I don't know of free software, but it is likely that your
> school has one or more of them already on the school's computers. For
> that matter, at this level even Excel will have sufficient tools
> (although you may need to install the statistical measurements pack.)
Actually, our proposed stat computer lab is stalled because there are not enough plugpoints to put up two more computers :)
> > iv)As it stand right now, we will sample two boys schools, two girls
> > schools and one coed school. Is this enough to be statistically
> > significant? How many data points should we sample from each school?
> > Should this be a constant or proportional to the total number of
> > students?
>
> To compare the individual schools, of course, this is fine (as long as
> you have a decent sized sample from each). To compare TYPES of
> schools, then no, it is insufficient, as you only have a few data
> points. As for the sample size (from each school), I would suggest a
> minimum of the larger of 30 or 5% of the student population (these
> numbers are the same at 600 students) This way you can likely use a
> normal approximation to score distributions, even if the scores are
> not normally distributed. Many statistical tests have simpler forms
> for normally distributed data. This may allow you to have the class do
> the analysis by hand. If you use a software package, then this need is
> not important, obviously. In any event, large sample sizes will
> improve your confidence levels in the hypothesis tests.
We plan on taking 50 data points from each school, about 20-25% of the class size. So you say that we should take the same number of points even if one school has more students than the other? Also, do you mean that the sample size is insufficient to draw reliable conclusions about whether coed schools really lessen the gender differences?
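The rule of thumb in the quoted reply (the larger of 30 or 5% of the student population) can be written as a one-line helper. The function name `min_sample_size` is hypothetical, and rounding the 5% figure up is an assumption, since the reply does not say how to round.

```python
import math

def min_sample_size(n_students):
    """Larger of 30 or 5% of the student population.

    Assumption: the 5% figure is rounded up to a whole student.
    Dividing by 20 avoids floating-point error from 0.05 * n.
    """
    return max(30, math.ceil(n_students / 20))

small_school = min_sample_size(250)   # 5% is 12.5, so the floor of 30 applies
break_even = min_sample_size(600)     # 5% of 600 is exactly 30
large_school = min_sample_size(1000)  # 5% of 1000 is 50
```

Under this rule the minimum stays at 30 for any school of up to 600 students and grows in proportion only beyond that size.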
> > v)Finally, is the whole proposition so glaringly ridiculous that all
> > serious statisticians will simply laugh at it? I hope not :redface:
>
> Not at all. It is great if you can use an example like this (as
> opposed to textbook work). This should be a very good problem for a
> first-year statistics class. Of course, if your results show a
> significant difference between the schools in your district, there may
> be some bruised egos in the administration(s), but that is a problem
> outside the scope of statistics ;)
Just to clear up something, you don't think I'm the teacher, do you? I'm just an 11th-grade student. Our syllabus covers very little estimation, mostly descriptive statistics, so it would be helpful if you could give me a few pointers about common problems encountered by statisticians when doing this type of study.
Also, the question about which I've absolutely no idea at all is the following:
ii) What is a good way of identifying whether the population in a school indeed consists of discrete strata? This could be good students/bad students (there is indication from previous results that this may be the case) or in coed schools boys/girls (very likely the case). In case of coed schools, there may even be four strata: good boys, good girls, bad boys, bad girls. It will be interesting to study whether bad boys vs girls show more difference than good boys vs girls. All this sounds very pretty, but I don't know how to separate the population into strata.
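One possible starting point, offered only as a sketch and not as something suggested anywhere in this thread, is a one-dimensional two-means split of the marks into a low group and a high group. The function name `two_means_split` and the marks are hypothetical.

```python
from statistics import mean

def two_means_split(scores, iterations=100):
    """Split scores into a low and a high group (1-D 2-means).

    Assumption: two groups are wanted. The centres start at the
    minimum and maximum score and are refined until they stop moving.
    """
    lo_c, hi_c = min(scores), max(scores)
    low, high = list(scores), []
    for _ in range(iterations):
        # Assign each score to the nearer centre (ties go to the low group).
        low = [s for s in scores if abs(s - lo_c) <= abs(s - hi_c)]
        high = [s for s in scores if abs(s - lo_c) > abs(s - hi_c)]
        if not high:  # all scores identical: nothing to split
            break
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo_c, hi_c):
            break
        lo_c, hi_c = new_lo, new_hi
    return low, high

# Made-up marks with an obvious gap, purely for illustration.
marks = [31, 35, 38, 40, 42, 71, 74, 78, 80, 83]
low, high = two_means_split(marks)
```

A split like this produces two groups whenever the marks are not all identical, so on its own it does not show that the strata are real. A histogram of the marks with two clearly separated peaks would be better evidence that they are.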
Can you help me? Thanks.
Molu



