Versions of quick sort differ in how the partitions are determined. A reasonable guess is that some of those versions are not worst-case O(n log n) for sorting n keys. In that case, for sufficiently large n, on a normal computer, any version of heap sort will beat that version of quick sort in number of comparisons, time in seconds, joules of energy to run the computer, etc.
It is this point that has much of the academic computer science community saying that no sorting algorithm based on comparing keys two at a time can beat heap sort.
Sure, we don't know how big n has to be. In practice, in an actual case, the n might have to be too big for current computers.
Sure, in practice, for some value of n, on some list of n keys, some version of heap sort, and some version of quick sort, the quick sort might run 10 times faster in seconds of elapsed time than heap sort.
I'm not completely sure what you are saying, but do you actually think a heap sort is in general faster than a quicksort or merge sort? You realize that the worst case of a quick sort is easily avoided, right? It only happens if the pivot lands on the minimum or maximum value on every single partition, e.g., an already sorted array with a naive first-element pivot.
> It is this point that has much of the academic computer science community saying that no sorting algorithm based on comparing keys two at a time can beat heap sort.
I think you're the only one saying that. Where did you get this idea? I just showed you a quicksort being 10 times faster than a heapsort. You can try this for yourself.
> Sure, we don't know how big n has to be. In practice, in an actual case, the n might have to be too big for current computers.
This is never going to be true. Quicksort and other sorts exploit locality much better. Partitioning an array gradually makes the sorting more local. That is going to work better relative to a heap sort as data gets bigger, not worse.
> Sure, in practice, for some value of n, on some list of n keys, some version of heap sort, and some version of quick sort, the quick sort might run 10 times faster in seconds of elapsed time than heap sort.
No, on any modern computer other sorts are going to beat a heapsort. It isn't exotic; that is just how it works. Even if you have to scan every element first to get a distribution of your elements to choose your pivot, that is still going to be O(n log n) and it will still beat a heap sort.
Heap sorts are not the fastest way to sort. They can be consistent and they can split the time between inserting and extracting elements.
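To make both points above concrete, here is a minimal sketch in Python (the language, function names, and sizes are my own choices, not anything from the thread): a heap sort that splits its work between pushing keys into a heap and extracting them, next to a quicksort that picks its pivot at random so that no fixed input, such as an already sorted array, reliably triggers the quadratic worst case.

    import heapq
    import random

    def heap_sort(keys):
        # Split the work between inserting into a binary heap
        # and extracting the minimum n times.
        heap = []
        for k in keys:
            heapq.heappush(heap, k)
        return [heapq.heappop(heap) for _ in range(len(heap))]

    def quick_sort(keys):
        # Random pivot: the quadratic worst case still exists in principle,
        # but no fixed input triggers it reliably.
        if len(keys) <= 1:
            return list(keys)
        pivot = random.choice(keys)
        less = [k for k in keys if k < pivot]
        equal = [k for k in keys if k == pivot]
        greater = [k for k in keys if k > pivot]
        return quick_sort(less) + equal + quick_sort(greater)

    data = [random.random() for _ in range(10_000)]
    assert heap_sort(data) == sorted(data) == quick_sort(data)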
> I'm not completely sure what you are saying, but do you actually think a heap sort is in general faster than a quicksort or merge sort
"in general" is not very specific.
The main issue is the big-O expression.
Again, if win in the big-O expression,
then, in the assumptions of the context,
for sorting n keys for all sufficiently
large n will win.
Sure, maybe the heap sort O( n log(n) ) is
wrong. In that case, maybe could beat
heap sort.
How could O( n log(n) ) be wrong? Well, for large enough n we could encounter virtual memory paging that makes the cost of a comparison grow with n, and then an algorithm with better locality of reference and less paging might win. So, for such a context, we would have to redo the big-O derivation.
Merge sort is based on comparing keys two at a time, and that was the assumption in Gleason's argument. So, Gleason's argument should also apply to merge sort.
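For reference, the bound being discussed appears to be the standard information-theoretic lower bound for comparison sorting: n keys have n! possible orderings, and each two-way comparison can at best halve the set of orderings still consistent with the answers so far, so any such sort needs at least log2(n!) ~ n log2(n) comparisons in the worst case. A small sketch of that arithmetic (using lgamma to get log2(n!) without computing n! itself; the helper name and the sample values of n are mine):

    import math

    def comparison_lower_bound(n):
        # log2(n!) = lgamma(n + 1) / ln(2); any sort that compares keys
        # two at a time needs at least this many comparisons in the
        # worst case to distinguish all n! orderings.
        return math.lgamma(n + 1) / math.log(2)

    for n in (10**3, 10**6, 10**9):
        print(n, comparison_lower_bound(n), n * math.log2(n))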
Some people prefer heap sort or quick sort because they are in place and merge sort is not.
I wrote and you quoted:

> It is this point that has much of the academic computer science community saying that no sorting algorithm based on comparing keys two at a time can beat heap sort.
> I think you're the only one saying that.
Also Gleason, Knuth, and much of the academic computer science community.
> Where did you get this idea? I just showed you a quicksort being 10 times faster than a heap sort.
Your "10 times" is not for all values of
number of keys n. The crucial issue is the
big-O expression, and your "10 times" says
next to nothing about the big-O
expression.
Again, the big-O expression is crucial
because winning in the big-O expression
means winning for all sufficiently large
n, and that is, sadly, the only really
meaningful means we have of comparing "in
general" two pieces of code.
Instead, sure, if in a particular application there is an upper bound on n, say always less than 10,000,000, then use the code that is fastest from 1 to 10,000,000, or fastest in expectation over the probability distribution of n from 1 to 10,000,000 in that application. Or maybe in some application you don't care about the average execution time but definitely always want execution time faster than some given T milliseconds for all n from 1 to 10,000,000. Okay, pick the code that best achieves this.
Gleason's work didn't cure cancer, exceed the speed of light, say what is in the center of a black hole, say what happened before the big bang or will happen after the heat death of the universe, ....

Instead, Gleason stated and proved a theorem in applied math. The theorem shows what is the best possible given the assumptions. Similarly for what Knuth said about heap sort and Gleason's theorem. A lot of progress in pure/applied math and science is like that -- not everything, just a specific result given some specific assumptions.
Or, assume that Joe has a new computer, 1000 cores, a 10 GHz clock, and 500 GB of first level cache, and writes some really careful code in assembler that makes full use of the 1000 cores, to sort 10 billion 32 bit numbers via bubble sort. Then it runs in O(n^2) time, that is, for some constant k_Joe it runs in

    k_Joe n^2 = k_Joe (10^10)^2

milliseconds.
Along comes Bill with a PC based on a 64 bit single core processor with a 2 GHz clock. Bill's code uses heap sort. Then Bill's code runs in O( n log(n) ) time, that is, for some constant k_Bill it runs in

    k_Bill (n log(n)) = k_Bill (10^10 log(10^10))

milliseconds. And Bill is a bit sloppy and uses an interpreted language, so the constant k_Bill is large.
Then the ratio of running times is

    (k_Joe / k_Bill) (n / log(n))

Well, assume for simplicity, and without loss of generality, that the log has base 10. Then

    log(n) = log(10^10) = 10

so that

    n / log(n) = 10^10 / 10 = 1,000,000,000
Well, let's account for Joe's 1000 cores to get a ratio of 1,000,000, and for Joe's 5 times faster clock speed to get 200,000. Say the interpreted language Bill used is 10 times slower than compiled C, for a ratio of 20,000, and say that Joe's hand coding in assembler is 2 times faster than C code, for a ratio of 10,000. So, still, Bill's code runs 10,000 times faster than Joe's.
That's an example of how the algorithm that wins in big-O wins for all sufficiently large n, even against coding in assembler on 1000 cores with a 10 GHz clock.
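That bookkeeping is easy to check mechanically. A minimal sketch that just redoes the ratio arithmetic from the paragraphs above (the variable names are mine, and the 1000x, 5x, 10x, and 2x adjustment factors are the post's assumptions, not measurements):

    import math

    n = 10**10
    ratio = n / math.log10(n)   # n / log(n) with k_Joe / k_Bill taken as 1: 1,000,000,000

    ratio /= 1000               # Joe's 1000 cores speed Joe up 1000x
    ratio /= 5                  # Joe's 10 GHz clock vs Bill's 2 GHz clock
    ratio /= 10                 # Bill's interpreted language is 10x slower than compiled C
    ratio /= 2                  # Joe's assembler is 2x faster than C

    print(ratio)                # 10000.0: Bill's heap sort still wins by about 10,000x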
Sure, for n = 100, maybe Joe's code wins -- I'll let you find the ratio.
That's what the computer science community noticed some decades ago and, thus, settled on big-O as the way to compare algorithms and code. Again, heap sort meets the Gleason bound in the worst case. Since in this context the Gleason bound says that O( n log(n) ) is the best anyone can do, the best quick sort can do is tie.

And if the quick sort code does not choose the partitions carefully, quick sort won't be O( n log(n) ) in the worst case, and for sufficiently large n it will lose by whatever ratio you want, 1,000,000:1 if you like.
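To see the careless-partition case concretely, here is a small sketch that counts comparisons for a quicksort that always takes the first element as its pivot: on an already sorted input the count grows roughly as n^2 / 2, while n log2(n) stays far below it. The counting wrapper and the choice of n are mine, just for illustration.

    import math

    def first_pivot_quicksort_comparisons(keys):
        # Quicksort with the first element as pivot, using an explicit
        # stack so the deep worst case cannot hit the recursion limit.
        # Returns the number of key comparisons performed.
        comparisons = 0
        stack = [list(keys)]
        while stack:
            part = stack.pop()
            if len(part) <= 1:
                continue
            pivot, rest = part[0], part[1:]
            less, not_less = [], []
            for k in rest:
                comparisons += 1
                (less if k < pivot else not_less).append(k)
            stack.append(less)
            stack.append(not_less)
        return comparisons

    n = 2000
    print(first_pivot_quicksort_comparisons(list(range(n))))  # about n^2 / 2 on sorted input
    print(round(n * math.log2(n)))                            # about n log2(n), for comparison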
Look, guys, all this is just from, say, the first week of a ugrad course in computer science.

Arguing with the Gleason bound -- about like arguing with the Pythagorean theorem.

Don't argue with me. Instead, send a question to a comp/sci prof at your local university giving him the URL of this post.
> Also Gleason, Knuth, and much of the academic computer science community.
Prove it; show me who is saying 'nothing can beat a heapsort'.
> says next to nothing about the big-O expression.
Why do you think heap sort is the only n log n sort? Link something that proves what you say.
> settled on big-O as the way to compare algorithms and code
Time determines speed; that's what this thread is about. I already linked you a benchmark of 32 million floats. Link me something that actually backs up what you are saying.
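For anyone who wants to rerun this kind of comparison rather than argue about it, here is a minimal timing harness; it pits a heapq-based heap sort against Python's built-in sort (Timsort, also a comparison sort) rather than the C++ code in the benchmark linked at the end of the thread, so the absolute numbers will differ, and the array size is just a placeholder.

    import heapq
    import random
    import time

    def heap_sort(keys):
        heap = list(keys)
        heapq.heapify(heap)
        return [heapq.heappop(heap) for _ in range(len(heap))]

    data = [random.random() for _ in range(1_000_000)]  # placeholder size

    start = time.perf_counter()
    a = heap_sort(data)
    t_heap = time.perf_counter() - start

    start = time.perf_counter()
    b = sorted(data)
    t_builtin = time.perf_counter() - start

    assert a == b
    print(f"heap sort: {t_heap:.3f} s, built-in sort: {t_builtin:.3f} s")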
> Arguing with the Gleason bound -- about like arguing with the Pythagorean theorem.
The only thing that comes up is your comment. I've shown you actual results; you keep saying 'maybe, possibly, in some scenario I can't show, this other one wins, so it is the fastest'. You are hallucinating a reality that you can't demonstrate.
I guess this is the "I already told you" part of the conversation, but you didn't link a single thing.
All you did was repeat your claim over and over. No benchmarks, no link to a 'Gleason bound', and nothing but hallucinating hypothetical scenarios and declaring that somehow they would back up your claims that go against literally all benchmarks and computer science knowledge. If I'm wrong, show me some links.
If something runs in less time, it's faster. If something runs in 1/10th the time of something else, it is an order of magnitude faster. I don't think I've ever seen someone try to redefine 'faster' before.
You keep trying to equate naive algorithmic complexity and speed, but you haven't even linked anything that shows heapsort is better than other sorts like quicksort at that. You haven't actually linked anything to back up any of what you're saying, even the irrelevant off-topic claims.
https://johnysswlab.com/why-is-quicksort-faster-than-heapsor...