Search All of the Math Forum:
Views expressed in these public forums are not endorsed by
Drexel University or The Math Forum.
|
|
|
|
Re: binary search with arbitrary or random split
Posted:
Jul 24, 2012 6:02 PM
|
|
pam <pamelafluente@libero.it> writes:
> On Jul 24, 7:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote: > >> >> What would that fix? uniform(2, n-1) is as bad as uniform(1, n). > ... >> >> By the way, your use of big O is getting out of hand (we all do it >> because it's become a short-hand for something else). You can't say >> that something has a complexity of O(log(n)) and not O(n). Proper >> binary search is O(n), it just happens to O(log(n)) as well! If this >> surprises you, check the definition. > ... >> >> "Complexity", unadorned, usually means worst-case complexity. > > > Well i understand we want an upper bounding function. But i guess you > normally refer to the "smallest" > (or "tightest" one) or else one could always place some enormous > function f(n) in O(f(n)).
There is other notation for other bounds. Big O is just an upper bound so all functions that are O(log(x)) are also O(x) and O(x^2). If you want a tighter bound you write Theta(f(x)).
> However, in this particular instance i think that Ben's observation is > really key. > In fact i believe that talking of splitting at "given place"s, makes > the problem "ill > defined" and allows making formal remarks which do not attack the > essence of the matter under discussion.
I don't know what observation of mine leads to this, but I did not suggest that anything discussed so far leads to ill defined problems.
> What we should really do, i think, is saying that the splitting will > occur at place n/k, with k a continuous Uniform in (1, n)
Why should we do this? To what end? n/uniform(1,n) has the same range as uniform(1,n), the only difference is the distribution.
> This way, i think, we are defining a proper generalization of the, > say, "mid point" binary search. A divide and conquer procedure.
Except that it does not, in the worst case, divide. The effect is still a O(n) algorithm.
> Viewed under this perspective, suggested by the observation made by > Ben, I think we can solve various problems.
I can't see how you get here from anything I've said. You must have misunderstood what I was saying.
<snip> >> > I think that including events with 0 probability might be rather >> > misleading. >> >> It would be wrong, not misleading. That you say this, suggests that did >> not agree with my rather detailed previous message. Which bit suggests >> that something with zero probability is being considered? > > You seems to be saying that vanishing events should not be included.
If it can happen, include it; if it can't, ignore it. It seems so clear to me, but you keep coming back with slightly vague phrasing ("vanishing events" -- what are they?) as if you understand what I'm saying but would rather it were not so.
> But this seems in contrast with the post from Patricia: > >>However, for the algorithms we are talking about, the probability of a > split sequence that will result in n comparisons is strictly positive > for all finite problem sizes. There is no finite N that is so big we > can > eliminate that case for all values of n greater than N.
Patricia and I are in agreement here. No contrast at all that I can see.
> So having reformulated the problem assigning a distribution to the the > split point n/k (or its "floor") > (as a function of n) the question still remains: are you saying that > this is no better than O(n) or can have the better > bound O(logn) ? > This is a huge difference as it can mean seconds instead of millions > years > if n is large, for some important problems.
Of course we can get better than O(n) -- just use k=2. If you insist on a random algorithm with better than O(n) worst-case bounds, split at n/uniform(2, 2.00001).
What's going on here? If there were any down side at to using n/2 I can see why one might consider alternatives -- even random ones (some people do for quicksort, partitioning for example) --- but since n/2 is simple, easy to calculate and optimal you can't be proposing an alternative.
-- Ben.
|
|
|
|