Math Forum » Discussions » sci.math.* » sci.math.num-analysis.independent

Topic: binary search with arbitrary or random split
Replies: 4   Last Post: Jul 24, 2012 12:25 PM

Ben Bacarisse

Posts: 1,244
Registered: 7/4/07
Re: binary search with arbitrary or random split
Posted: Jul 24, 2012 10:27 AM

pam <pamelafluente@libero.it> writes:

> On Jul 24, 1:05 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> pam <pamelaflue...@libero.it> writes:
> https://groups.google.com/group/comp.theory/browse_thread/thread/f712492127047301/ffb8397007f4abf9#ffb8397007f4abf9
>

>> > ... And actually splitting deterministically at 1 (or n) we would
>> > somehow be out of a "divide and conquer" scenario
>> > (as indicated above by John), as in effect, as Patricia noted, we are
>> > examining the elements one by one.

>>
>> It's not "splitting deterministically at 1 or n"; it's splitting that
>> happens, by outrageous fortune, at 1 or n all n-1 times.  If that can
>> happen, it forms the worst case for random splitting.

>
> I agree with that. I have a problem with the probability being 0, in
> the asymptotic analysis.


Where, in the standard definitions of worst-case complexity, does the
limit of some probability get used? As far as I know, it does not.

Here's an example worked out in full:

// search for t in d[1]..d[n]:
binary-search(t, d, n)
{
    l := 1; h := n
    while h >= l do
        m := (l + h) / 2
        if d[m] = t then
            return m
        else if d[m] > t then
            h := m - 1
        else
            l := m + 1
    done
    return false
}

Unless I've got something wrong (I probably have[1]), the number of
comparisons done for an input of size n is bounded above by
floor(log2(n)) + 1. This is an exact and tight bound. No case ever takes
more, and there is always some input of size n which takes this number of
compares.
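As a sanity check, here is the same loop as a runnable Python sketch (my own translation, 0-based indices, with the probe count returned so the bound can be checked directly):

```python
import math

def binary_search(t, d):
    """Midpoint binary search over the sorted list d.
    Returns (index of t or None, number of loop iterations)."""
    l, h = 0, len(d) - 1
    probes = 0
    while h >= l:
        probes += 1
        m = (l + h) // 2
        if d[m] == t:
            return m, probes
        elif d[m] > t:
            h = m - 1
        else:
            l = m + 1
    return None, probes

# The probe count never exceeds floor(log2(n)) + 1, whatever the target.
n = 1000
d = list(range(n))
worst = max(binary_search(t, d)[1] for t in range(-1, n + 1))
assert worst <= math.floor(math.log2(n)) + 1
```

Trying every target, present or absent, exercises every path through the search, which is what makes the bound tight rather than merely an upper estimate.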

Here there is no need to use big O, but we might as well. The algorithm
is O(log(n)) because the base of the log function and the +1 are lost.

What changes if we write m := uniform(l, h)? What is the upper bound on
the cost now? It's n -- again exactly and tightly. There is always
some execution that takes n compares, and the probability that this
occurs is non-zero -- for all n. We never consider an "infinite" case
when the probability is zero.
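To make "non-zero probability for all n" concrete, here is a hedged sketch (names and the adversarial stub are mine, not from the post): the same loop with a uniform split, plus a "random" source that always picks the low end, which realises the n-probe execution.

```python
import random

def random_split_search(t, d, rng=random):
    """Search sorted list d for t, probing a uniformly chosen index
    in [l, h] instead of the midpoint.  Returns (index or None, probes)."""
    l, h = 0, len(d) - 1
    probes = 0
    while h >= l:
        probes += 1
        m = rng.randint(l, h)
        if d[m] == t:
            return m, probes
        elif d[m] > t:
            h = m - 1
        else:
            l = m + 1
    return None, probes

class AlwaysLow:
    """Stand-in for the unluckiest possible sequence of random picks."""
    def randint(self, l, h):
        return l

# A target larger than every element then forces exactly n probes:
# each probe eliminates a single element.
n = 100
_, probes = random_split_search(n, list(range(n)), rng=AlwaysLow())
assert probes == n
```

The stub shows the worst case is a legitimate execution of the algorithm, not a limit: for every finite n, genuine uniform picks produce this run with probability (1/n)(1/(n-1))...(1/1) > 0.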

[1] See Bentley, Jon, 1986. "Programming Pearls", Addison-Wesley. Lots
of professional programmers got some detail of binary search wrong when
it was set as an exercise.

>> > So i think this can be legitimately ruled out, as a real "degenerate"
>> > case. No? This for the deterministic split.

>>
>> Now I can answer your split at 2 or n-1 question.  That also gives O(n)
>> search so you'd have to rule that case out as well.  In fact, splitting
>> at k and n-k for any constant k (with some other rule when n <= k) gives
>> O(n) search, so they all have to be ruled out, too!

>
>
> Here i am getting lost. Assume that M is the middle index in an
> arbitrarily large array.
> Are you saying that if i split, for instance, deterministically at M-1
> i suddenly get a O(n) search
> instead of O(log n) ?


No. Splitting at M+k for any (possibly negative) constant k, and
splitting at M*k for any constant k in (0,1), both produce logarithmic
bounds.

I was talking about splitting at k and/or n-k. That's generalising from
the pathological case of splitting at 1 and/or n. Eliminating only one
element at a time is not, asymptotically, worse than eliminating 2 or
3 or 4 or... They are all O(n).
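A small worst-case recurrence makes the contrast visible. This is my own sketch, not code from the thread: `split(s)` gives the (1-based) probe position in a range of size s, and the larger remaining side always survives, which is the unsuccessful-search worst case.

```python
def worst_probes(n, split):
    """Worst-case probe count when a range of size s is probed at
    position split(s) (clamped into [1, s]) and the larger of the two
    remaining sides always survives."""
    probes = 0
    while n > 0:
        probes += 1
        i = max(1, min(n, split(n)))
        n = max(i - 1, n - i)   # keep the worse remaining side
    return probes

n = 1_000_000
# Midpoint shifted by a constant (M + 5): still logarithmic.
assert worst_probes(n, lambda s: s // 2 + 5) < 50
# Probing a constant position k = 3: each probe removes at most 3
# elements, so the count is linear in n.
assert worst_probes(n, lambda s: 3) > n // 4
```

The same harness shows why k and n-k splits for any constant k land in the linear camp: they bound the number of elements removed per probe by a constant, independent of n.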

>> But to answer...  Yes, anything with probability zero
>> is not considered because it can't happen.

>
> I thought so.


Just to be sure: a zero probability means the case can't happen, so it
has no place in a worst-case analysis.

>> The asymptotic bound is not what happens when n gets very large...
>
> well it seems to me that there is a limit in the definition.


You can define O(f(x)) in terms of limits but that's got nothing to do
with any algorithm. Remember, we work out the worst-case cost first as
a function of n. We may then choose to summarise that by using big O
notation. If we do, we don't re-visit the code to see what happens "in
the limit". We already know, exactly, the cost of the algorithm for all
n and the big O step is a technical re-writing of this function in a
simpler form.

In my examples above, the worst-case costs are floor(log2(n)) + 1 and n --
exactly. No need to use big O, or take any limits. One is logarithmic
and the other is linear.

> So, to recap, what would be the answer? That only a deterministic
> search
> using exactly the middle point is O(logn), while any other
> deterministic and the random-split search
> is O(n) ?


No. For example, a random split based on

m := l + (h - l)/uniform(2,5)

would be logarithmic: every probe discards at least roughly a fifth of
the current range. The key distinction is between reducing the range by
a constant fraction of its size and reducing it by a fixed amount
independent of n.
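A worst-case simulation of such a fractional split (my own sketch; I write the probe as m := l + (h - l) // d so it always lands inside [l, h], and fix d = 5, the least favourable divisor a uniform(2,5) draw can produce):

```python
def worst_probes_fractional(n, divisor=5):
    """Worst-case probes for m := l + (h - l) // divisor over [1, n]:
    every probe discards roughly 1/divisor of the range, so the count
    stays logarithmic even if the draw is always least favourable."""
    l, h = 1, n
    probes = 0
    while h >= l:
        probes += 1
        m = l + (h - l) // divisor
        if m - l > h - m:
            h = m - 1   # left side is larger: it survives (worst case)
        else:
            l = m + 1   # right side is larger: it survives
    return probes

# Well under 100 probes for n = 10**6 (roughly log base 5/4 of n),
# nowhere near the linear cost of a constant-offset split.
assert worst_probes_fractional(1_000_000) < 100
```

With divisor = 2 this degenerates to ordinary midpoint binary search, so the same function also reproduces the floor(log2(n)) + 1 behaviour.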

--
Ben.



