Search All of the Math Forum:

Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Notice: We are no longer accepting new posts, but the forums will continue to be readable.

Topic: binary search with arbitrary or random split
Replies: 6   Last Post: Aug 7, 2012 5:58 PM

 Messages: [ Previous | Next ]
 Ben Bacarisse Posts: 1,972 Registered: 7/4/07
Re: binary search with arbitrary or random split
Posted: Jul 24, 2012 6:02 PM

pam <pamelafluente@libero.it> writes:

> On Jul 24, 7:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>

>>
>> What would that fix?  uniform(2, n-1) is as bad as uniform(1, n).

> ...
>>
>> By the way, your use of big O is getting out of hand (we all do it
>> because it's become a short-hand for something else).  You can't say
>> that something has a complexity of O(log(n)) and not O(n).  Proper
>> binary search is O(n), it just happens to O(log(n)) as well!  If this
>> surprises you, check the definition.

> ...
>>
>> "Complexity", unadorned, usually means worst-case complexity.

>
>
> Well i understand we want an upper bounding function. But i guess you
> normally refer to the "smallest"
> (or "tightest" one) or else one could always place some enormous
> function f(n) in O(f(n)).

There is other notation for other bounds. Big O is just an upper bound
so all functions that are O(log(x)) are also O(x) and O(x^2). If you
want a tighter bound you write Theta(f(x)).

> However, in this particular instance i think that Ben's observation is
> really key.
> In fact i believe that talking of splitting at "given place"s, makes
> the problem "ill
> defined" and allows making formal remarks which do not attack the
> essence of the matter under discussion.

I don't know what observation of mine leads to this, but I did not
suggest that anything discussed so far leads to ill defined problems.

> What we should really do, i think, is saying that the splitting will
> occur at place n/k, with k a continuous Uniform in (1, n)

Why should we do this? To what end? n/uniform(1,n) has the same range
as uniform(1,n), the only difference is the distribution.

> This way, i think, we are defining a proper generalization of the,
> say, "mid point" binary search. A divide and conquer procedure.

Except that it does not, in the worst case, divide. The effect is still
a O(n) algorithm.

> Viewed under this perspective, suggested by the observation made by
> Ben, I think we can solve various problems.

I can't see how you get here from anything I've said. You must have
misunderstood what I was saying.

<snip>
>> > I think that including events with 0 probability might be rather

>>
>> It would be wrong, not misleading.  That you say this, suggests that did
>> not agree with my rather detailed previous message.  Which bit suggests
>> that something with zero probability is being considered?

>
> You seems to be saying that vanishing events should not be included.

If it can happen, include it; if it can't, ignore it. It seems so clear
to me, but you keep coming back with slightly vague phrasing ("vanishing
events" -- what are they?) as if you understand what I'm saying but
would rather it were not so.

> But this seems in contrast with the post from Patricia:
>

>>However, for the algorithms we are talking about, the probability of a
> split sequence that will result in n comparisons is strictly positive
> for all finite problem sizes. There is no finite N that is so big we
> can
> eliminate that case for all values of n greater than N.

Patricia and I are in agreement here. No contrast at all that I can see.

> So having reformulated the problem assigning a distribution to the the
> split point n/k (or its "floor")
> (as a function of n) the question still remains: are you saying that
> this is no better than O(n) or can have the better
> bound O(logn) ?
> This is a huge difference as it can mean seconds instead of millions
> years
> if n is large, for some important problems.

Of course we can get better than O(n) -- just use k=2. If you insist on
a random algorithm with better than O(n) worst-case bounds, split at
n/uniform(2, 2.00001).

What's going on here? If there were any down side at to using n/2 I can
see why one might consider alternatives -- even random ones (some people
do for quicksort, partitioning for example) --- but since n/2 is simple,
easy to calculate and optimal you can't be proposing an alternative.

--
Ben.

Date Subject Author
7/24/12 Ben Bacarisse
7/24/12 Patricia Shanahan
8/7/12 Ralph Hartley
7/24/12 Ben Bacarisse