Search All of the Math Forum:
Views expressed in these public forums are not endorsed by
Drexel University or The Math Forum.



Re: binary search with arbitrary or random split
Posted:
Jul 24, 2012 6:02 PM


pam <pamelafluente@libero.it> writes:
> On Jul 24, 7:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote: > >> >> What would that fix? uniform(2, n1) is as bad as uniform(1, n). > ... >> >> By the way, your use of big O is getting out of hand (we all do it >> because it's become a shorthand for something else). You can't say >> that something has a complexity of O(log(n)) and not O(n). Proper >> binary search is O(n), it just happens to O(log(n)) as well! If this >> surprises you, check the definition. > ... >> >> "Complexity", unadorned, usually means worstcase complexity. > > > Well i understand we want an upper bounding function. But i guess you > normally refer to the "smallest" > (or "tightest" one) or else one could always place some enormous > function f(n) in O(f(n)).
There is other notation for other bounds. Big O is just an upper bound so all functions that are O(log(x)) are also O(x) and O(x^2). If you want a tighter bound you write Theta(f(x)).
> However, in this particular instance i think that Ben's observation is > really key. > In fact i believe that talking of splitting at "given place"s, makes > the problem "ill > defined" and allows making formal remarks which do not attack the > essence of the matter under discussion.
I don't know what observation of mine leads to this, but I did not suggest that anything discussed so far leads to ill defined problems.
> What we should really do, i think, is saying that the splitting will > occur at place n/k, with k a continuous Uniform in (1, n)
Why should we do this? To what end? n/uniform(1,n) has the same range as uniform(1,n), the only difference is the distribution.
> This way, i think, we are defining a proper generalization of the, > say, "mid point" binary search. A divide and conquer procedure.
Except that it does not, in the worst case, divide. The effect is still a O(n) algorithm.
> Viewed under this perspective, suggested by the observation made by > Ben, I think we can solve various problems.
I can't see how you get here from anything I've said. You must have misunderstood what I was saying.
<snip> >> > I think that including events with 0 probability might be rather >> > misleading. >> >> It would be wrong, not misleading. That you say this, suggests that did >> not agree with my rather detailed previous message. Which bit suggests >> that something with zero probability is being considered? > > You seems to be saying that vanishing events should not be included.
If it can happen, include it; if it can't, ignore it. It seems so clear to me, but you keep coming back with slightly vague phrasing ("vanishing events"  what are they?) as if you understand what I'm saying but would rather it were not so.
> But this seems in contrast with the post from Patricia: > >>However, for the algorithms we are talking about, the probability of a > split sequence that will result in n comparisons is strictly positive > for all finite problem sizes. There is no finite N that is so big we > can > eliminate that case for all values of n greater than N.
Patricia and I are in agreement here. No contrast at all that I can see.
> So having reformulated the problem assigning a distribution to the the > split point n/k (or its "floor") > (as a function of n) the question still remains: are you saying that > this is no better than O(n) or can have the better > bound O(logn) ? > This is a huge difference as it can mean seconds instead of millions > years > if n is large, for some important problems.
Of course we can get better than O(n)  just use k=2. If you insist on a random algorithm with better than O(n) worstcase bounds, split at n/uniform(2, 2.00001).
What's going on here? If there were any down side at to using n/2 I can see why one might consider alternatives  even random ones (some people do for quicksort, partitioning for example)  but since n/2 is simple, easy to calculate and optimal you can't be proposing an alternative.
 Ben.



