Re: Please critique my scheme for re-weighting source data
Posted: Feb 23, 2012 6:19 PM
On Feb 23, 11:20 am, Jennifer Murphy <JenMur...@jm.invalid> wrote:
> On Thu, 23 Feb 2012 13:56:58 -0500, Rich Ulrich
> <rich.ulr...@comcast.net> wrote:
>
> >You give no hint, that I notice, of what it is that you
> >are trying to accomplish.
>
> >For most purposes of inference that come to my mind,
> >the extreme cases -- the ones that you seem to propose
> >to drop -- are the most informative and most interesting.
> >So I conclude that your interests are probably the opposite
> >(in some fashion) from what my naive interests would be.
>
> >I repeat-- What are you trying to do?
>
> I am trying to calculate for each word the relative likeliness that it
> would be encountered by an average well-educated person in their daily
> activities: reading the paper, listening to the news, attending classes,
> talking to other people, reading books, etc.
>
> The raw scores that I have already do that, but I question the
> weighting. I do not think that the average person encounters the types of
> words typically found in academic journals at the same frequency as they
> would those found in newspapers or magazines. Therefore, I want to
> re-weight the five sources to reflect a more average experience.

The "average" well-educated person will never read an academic journal. Whether or not a well-educated person will read a novel (or fiction in general) will depend strongly on whether that person is male or female.
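
To make the re-weighting concrete, here is a minimal Python sketch, assuming each word has a relative frequency per source and the combined score is a weighted average of those. The source names, the numbers, and the `reweight` helper are all hypothetical, made up for illustration; nothing here comes from the actual data.

```python
def reweight(per_source_freq, weights):
    """Combine per-source relative frequencies into one weighted average."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # A source missing from per_source_freq counts as frequency 0.
    return sum(w * per_source_freq.get(s, 0.0) for s, w in weights.items()) / total


# Hypothetical per-million frequencies of one word in five sources.
freq = {
    "spoken": 10.0,
    "fiction": 20.0,
    "magazine": 30.0,
    "newspaper": 40.0,
    "academic": 100.0,
}

equal = {source: 1.0 for source in freq}   # every source counts the same
no_academic = dict(equal, academic=0.0)    # assumption: journals are never read

print(reweight(freq, equal))        # 40.0
print(reweight(freq, no_academic))  # 25.0
```

With equal weights this word scores 40.0; setting the academic weight to zero drops it to 25.0, which shows how strongly a single source can drive the result. Weights could also be chosen per reader group (for example, a lower fiction weight for one group) if that split matters.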
RGV