Topic: Expectation Maximization Initialization
Replies: 2   Last Post: Jul 11, 2006 5:19 AM

 David Jones Posts: 637 Registered: 12/7/04
Re: Expectation Maximization Initialization
Posted: Jul 11, 2006 5:19 AM

A.G.McDowell wrote:
> In article <44b233c4$1@news.nwl.ac.uk>, David Jones <dajxxx@ceh.ac.uk>
> writes
>> dbmining@gmail.com wrote:
>>> How is the expectation maximization algorithm initialized? How are
>>> the initial values estimated?
>>
>>
>> In principle this should not matter too much, so you would need to
>> judge how much to worry about being sophisticated. However,
>> convergence can be slow.
>>
>> You can try setting missing values temporarily to a mean value,
>> although this will lead to an initially underestimated variance. You
>> could use a second stage to this where missing values are replaced by
>> random values generated from a first stage (poor) fitted model.
>>
>> You might find it beneficial to start from several different initial
>> parameter sets in order to help you judge whether to stop iterating.
>>
>> David Jones
>>
>>
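
A small sketch of the two initial fills described in the quoted advice above, using only the Python standard library. A univariate normal model is assumed, and the data, the missing-value pattern, and the function names mean_fill and second_stage_fill are made up for illustration.

```python
import random
import statistics

def mean_fill(values):
    """First stage: replace missing entries (None) with the observed mean."""
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    return [mu if v is None else v for v in values]

def second_stage_fill(values, rng):
    """Second stage: replace missing entries with random draws from the
    (poor) normal model fitted to the mean-filled data."""
    filled = mean_fill(values)
    mu = statistics.fmean(filled)
    sigma = statistics.pstdev(filled)
    return [rng.gauss(mu, sigma) if v is None else v for v in values]

rng = random.Random(0)
full = [rng.gauss(10.0, 2.0) for _ in range(200)]
# Hide every fourth value to simulate missing data (50 of 200 missing).
data = [None if i % 4 == 0 else x for i, x in enumerate(full)]

observed = [v for v in data if v is not None]
var_observed = statistics.pvariance(observed)
var_mean_filled = statistics.pvariance(mean_fill(data))
var_second_stage = statistics.pvariance(second_stage_fill(data, rng))

print("observed only :", var_observed)
print("mean-filled   :", var_mean_filled)
print("second stage  :", var_second_stage)
```

The filled-in values sit exactly at the mean, so they add nothing to the sum of squares while increasing the count. That is why var_mean_filled comes out as the observed variance times the observed fraction (150/200 here), which is the initial underestimate mentioned above.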

> Does this advice stem from an assumption about the presence (or rather
> absence) of multiple local minima?

No, it was rather about the possible slow convergence. Differences
between successive iterations might be compared with differences
between starting points as some guide to how long convergence might
take, with the possibility of restarting from some averaged value.
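
One way such a comparison might be set up, sketched for a univariate normal sample with some values missing at random. This is an assumed toy setting, not necessarily the one meant above. The data, the three starting points, the 1e-8 tolerance, and the names em_step, step and spread are all illustrative.

```python
import math

def em_step(observed, n_missing, mu, var):
    """One EM iteration for a normal sample with n_missing missing values."""
    n = len(observed) + n_missing
    # E-step: expected sufficient statistics, with each missing value
    # contributing E[x] = mu and E[x^2] = mu^2 + var.
    sum_x = sum(observed) + n_missing * mu
    sum_x2 = sum(x * x for x in observed) + n_missing * (mu * mu + var)
    # M-step: maximum-likelihood update from the completed statistics.
    new_mu = sum_x / n
    new_var = sum_x2 / n - new_mu * new_mu
    return new_mu, new_var

def distance(a, b):
    """Euclidean distance between two (mu, var) pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

observed = [9.1, 10.4, 11.8, 8.7, 10.9, 12.3, 9.6, 10.2]
n_missing = 8  # half the sample is missing, which slows convergence
starts = [(0.0, 1.0), (5.0, 10.0), (20.0, 4.0)]  # (mu, var) starting points

states = list(starts)
for iteration in range(1, 201):
    new_states = [em_step(observed, n_missing, mu, var) for mu, var in states]
    # Largest change between successive iterations over all runs.
    step = max(distance(old, new) for old, new in zip(states, new_states))
    # Largest distance between the runs started from different points.
    spread = max(distance(a, b) for a in new_states for b in new_states)
    states = new_states
    if step < 1e-8 and spread < 1e-8:
        break

print("iterations used:", iteration)
print("final (mu, var) per start:", states)
```

Watching step shrink while spread is still large suggests the runs are crawling rather than converged. If a restart is wanted, the average of the current states is one candidate value.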

> I have seen Expectation
> Maximization considered in situations where the presence of multiple
> local minima was obvious (for example, to fit mixture models where
> the presence of a minimum at e.g. A=1, B=2 implied the existence of
> one at A=2, B=1). In this case you might try multiple starts from
> random initial parameters in the hope that at least one of your
> random starts would lead you to converge to a local minimum which was
> also a global minimum. I am not sure what you could deduce about
> convergence from the different local minima found.

My limited experience was not in this context, and without multiple
maxima.
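
For the mixture-model case raised in the quoted question, a rough sketch of multiple random starts might look like the following. It is deliberately simplified: two equal-weight normal components with a known, shared standard deviation, so that only the two means A and B are estimated. Those restrictions, the synthetic data, and the names em_fit and log_likelihood are assumptions made for illustration. Each random start is also run with A and B swapped, to show the symmetry between A=1, B=2 and A=2, B=1. EM maximizes the likelihood, so the "minima" in the question correspond to maxima of the log-likelihood (minima of its negative).

```python
import math
import random

SIGMA = 0.3  # known, shared standard deviation (an assumption of this sketch)

def log_kernel(x, mean):
    """Log of the normal density at x, up to an additive constant."""
    return -0.5 * ((x - mean) / SIGMA) ** 2

def em_fit(data, a, b, iterations=200):
    """EM for the two means of an equal-weight, known-variance mixture."""
    for _ in range(iterations):
        # E-step: responsibility of component A for each point.
        resp = []
        for x in data:
            pa = math.exp(log_kernel(x, a))
            pb = math.exp(log_kernel(x, b))
            resp.append(pa / (pa + pb))
        # M-step: responsibility-weighted means.
        a = sum(r * x for r, x in zip(resp, data)) / sum(resp)
        b = (sum((1 - r) * x for r, x in zip(resp, data))
             / sum(1 - r for r in resp))
    return a, b

def log_likelihood(data, a, b):
    norm = math.log(SIGMA * math.sqrt(2 * math.pi))
    return sum(
        math.log(0.5 * math.exp(log_kernel(x, a))
                 + 0.5 * math.exp(log_kernel(x, b))) - norm
        for x in data
    )

rng = random.Random(1)
data = ([rng.gauss(1.0, SIGMA) for _ in range(100)]
        + [rng.gauss(2.0, SIGMA) for _ in range(100)])

fits = []
for _ in range(5):
    a0 = rng.uniform(min(data), max(data))
    b0 = rng.uniform(min(data), max(data))
    # Run each random start and its mirror image (A and B swapped).
    for start in ((a0, b0), (b0, a0)):
        a, b = em_fit(data, *start)
        fits.append((log_likelihood(data, a, b), a, b))

for ll, a, b in fits:
    print(f"logL={ll:.6f}  A={a:.4f}  B={b:.4f}")
best_ll, best_a, best_b = max(fits)
```

Each mirrored pair of starts should end at the same two means with the labels exchanged and the same log-likelihood. In this well-separated toy case, then, the different end points say nothing about convergence beyond the labelling. With real multimodal likelihoods, comparing the log-likelihood across starts is the usual way to pick among the solutions found.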

David Jones