Hacker News
Intel's New Low: Commissioning Misleading Core I9-9900K Benchmarks [video] (youtube.com)
161 points by esistgut on Oct 9, 2018 | 72 comments


I worked in HPC when NVIDIA started taking serious market share from Intel. My memory of Intel's performance comparisons is that they were often technically unsupportable once you scratched the surface.

In one case, a third party demonstrating how much faster Intel Xeon Phi was for deep learning admitted that their results compared highly-optimised code to unoptimised code.

This doesn’t surprise me at all.


I've been in the same boat and I completely agree. One thing that's unexpected to people is that getting decent performance out of a GPU is actually easier than out of a CPU - vectorization and multithreading are unified in the parallel programming model, and cache optimizations are mostly not needed. Those are the two biggest time sinks when optimizing for a CPU, solved right there. What you instead have to care about is resource utilization per thread, and that is IMO way easier to reason about and optimize for.


Depends on how many if-statements / branches your code takes.

If you have simple if-statements and all branches are grouped together to be SIMD'd easily... then yeah. GPU threads kind of are like normal CPU threads.

But as soon as you have a serious degree of thread-divergence, your performance tanks. That's why things like Chess engines (which ARE parallel problems at heart), execute poorly on GPUs. Because even though it is massively parallel, chess has too many if-statements and can't extend to SIMD very easily.

--------

Raytracing algorithms are funny: they group rays together so that the GPU can SIMD over them more easily. But without the "re-grouping" step, performance is bad.

Ex: A bunch of rays start at the camera. Some might hit a diffuse surface like wood... some might hit a subsurface-scattering surface like skin, and others might hit a metallic surface. GPU-raytracing algorithms then save off all the rays, and process all "diffuse" rays together, to minimize divergence.

You can't just follow a singular ray in raytracing on a GPU. You gotta re-group to SIMD units for maximum performance.
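Here's a rough sketch of that re-grouping step, assuming hypothetical per-ray material IDs produced by an earlier intersection kernel (it's just a Thrust sort-by-key, not any particular renderer's actual code):

    #include <thrust/device_vector.h>
    #include <thrust/sequence.h>
    #include <thrust/sort.h>

    // Hypothetical buffers: material_id[i] says what ray i hit (wood, skin, metal, ...),
    // ray_index[i] is just that ray's slot. Sorting indices by material groups similar
    // rays together so each shading pass runs over a mostly-uniform SIMD batch.
    void regroup_rays_by_material(thrust::device_vector<int>& material_id,
                                  thrust::device_vector<int>& ray_index)
    {
        thrust::sequence(ray_index.begin(), ray_index.end());       // 0, 1, 2, ... one per ray
        thrust::sort_by_key(material_id.begin(), material_id.end(), // sort the material keys...
                            ray_index.begin());                     // ...and permute ray indices with them
        // Then launch one shading kernel per material over its contiguous range,
        // instead of letting every SIMD group mix diffuse, skin and metal branches.
    }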


Are there any good guides or tutorials? I've found GPUs difficult, in part because I don't really know where to start.

FWIW, I have an AMD GPU with ROCm. HIP is a lot like CUDA, so NVIDIA-focused tutorials ought to be fine, with the caveat that I'd have to be aware of hardware differences.


The main issue IMO is thread-divergence. Because "threads" on a GPU are really SIMD-elements, things work very differently.

Let's use a simple example:

    for(int i=0; i<10000000; i++){
        if(someCondition) 
            break;
        else 
            doStuff();
    }
On a CPU, the thread will break out of the loop early when "someCondition" is hit. But on a GPU, the SIMD group only moves past the loop once every thread in it is done.

GPUs execute roughly 32 threads with the same instruction pointer. Let's say thread #0 hits "someCondition" first. Thread #0 is then set to "disabled", but it still has to wait for the 31 other threads to finish the loop before continuing.

Even if 31 threads have hit "someCondition" and broken out of the loop, the 32nd thread will keep executing the loop until it is done (and threads 0 through 30 will "execute with" the 32nd thread, but throw away the results).

That's the key with SIMD. Threads run in groups of ~32, in lockstep. All 32 threads must step through if-statements and loops together.

In most cases, both sides of an if/else statement will be executed by the whole group, with the results "thrown out" by the GPU (via execution masks) for the threads that didn't take that branch.
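To make that concrete, here's a minimal CUDA sketch of the loop above, with hypothetical stand-ins for "someCondition" and "doStuff" (each lane wants to exit at a different iteration, which is exactly what forces the divergence):

    // Hypothetical stand-ins: lane k of each 32-wide group exits after k iterations.
    __device__ bool someCondition(int tid, int i) { return i >= (tid % 32); }
    __device__ void doStuff(int* out)             { atomicAdd(out, 1); }

    __global__ void divergent_loop(int* work_done)
    {
        int tid = blockIdx.x * blockDim.x + threadIdx.x;

        for (int i = 0; i < 10000000; i++) {
            if (someCondition(tid, i))
                break;              // this lane gets masked off, but its group keeps looping
            doStuff(work_done);     // lanes that already "broke" sit idle while the rest run this
        }
        // The group only truly leaves the loop once all 32 of its lanes have hit the break,
        // so the slowest lane (tid % 32 == 31 here) sets the pace for the other 31.
    }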



This article says that Techspot has obtained the 9900 (without being bound by an NDA, as others are), and that Intel is releasing misleading results while other reviewers are bound by an NDA, but that Techspot is not going to show their own benchmarks--which could refute the misleading results Intel has authorized for public release--out of a sense of "professionalism."

Actually, any "professional" journalism outfit would show the public the newsworthy results, not withhold this information from readers because of a desire not to piss off industry contacts.


The thing is, a site only needs to piss off an industry contact once to be taken off their list, essentially neutering itself in the future and making it less competitive compared to other websites.

It's a sad state of affairs, but that's the reality of it.


They were already neutered; they weren't given a part to review. The part they allegedly have was probably obtained through unofficial means, since it is highly unlikely Intel would just give out parts to media outlets ahead of a launch without an NDA attached.


In the video he portrays the decision as being out of respect for the sources under NDA, not out of respect for Intel.

Not sure I fully get it though; a scoop is a scoop.


The point of a review embargo is to prevent things from collapsing into a race to be first with the scoop, by encouraging/forcing everyone to take a reasonable amount of time to prepare their review. Cooperating with review embargoes is usually the best way to promote honest and fair coverage, which is what most professional tech journalists really want.

NDAs that try to restrict the content of a review and not just the publication date are reprehensible, but that doesn't seem to be what's going on here. Intel's just trying to soften the ground with their own misleading numbers, but still permitting the trustworthy sources to do their usual thing.


That isn't always the point of a review embargo. Take a game with heavy marketing that the company realizes is going to get bad reviews. Sometimes they will delay reviews of things like that as long as possible, until the day before or the day of release.


Notably, the video game industry pioneered the conditional embargo - where you can publish a review freely as long as it's above a certain score. There's pretty much no explanation for this other than "we want to lie to customers", and there's pretty much no definition of journalistic integrity that would permit signing it, but it happens nevertheless.

(Which, interestingly, means some sites that don't issue numeric scores are barred-by-default from those games. I think that's sometimes been used to spot these embargoes?)


The computer hardware market is pretty different from the video game market. Video games tend to be much more reliant on pre-orders and the first few weeks of sales, while computer hardware is subject to seasonal fluctuations but otherwise maintains strong ongoing sales throughout the product cycle. Games are also much less amenable to objective comparative analysis.

In the computer hardware world, a review embargo that coincides with the product hitting the shelves does not carry any negative connotations about expectations for the product's reception.


Anyone who pre-orders a video game is buying for some special sense of status, not product quality.


I'm not a big gamer, but aren't they often cheaper if you pre-order?


Intel is violating the embargo in order to give primacy to their false narrative. That's odious.


What's the use in releasing the 9900K results? The thesis of the article is that Principled Technologies biased the results, and they've shown that with the Ryzen vs 8700K results.


Reposting (without the inflammatory 'did you even watch the video' comment) at the request of asr:

The video doesn't dispute the benchmarks of the 9900, but rather the published comparison benchmarks of the Ryzen 7 2700X, which used default, unconfigured memory settings to push the numbers down, while the 9900 benchmarks used properly configured memory settings (as they should).


It is mostly out of respect for your peers who will respect the NDA; if you release early, the others lose the views and can't respond until the embargo is over.


But they give results for the fairly similar last generation processor, with similar behaviour. "Pissing off industry contacts" isn't needed.


[flagged]


The last line of your comment is wholly unnecessary (and explicitly not allowed on HN).


so we can't call someone out who apparently didn't even watch the linked content, while still posting a comment that criticizes the content?

I'm curious about where in the rules it says we shouldn't call someone out on their BS.


From the guidelines

"Please don't insinuate that someone hasn't read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that.""


I don't want to split hairs here, but the linked content is not an article. I understand the spirit of it, but it technically does not apply here.


Upvoted because you are correct that I didn't watch the video (and I usually hate commenters who don't look at the linked content so I get you). But I can't just start watching videos at work... and I posted my comment under a link to the related article, so I think it was fair to comment after reading the article but not watching the video.

I am interested to see your original comment if you want to repost (including whatever caused it to be flagged if you like, now that this thread is old I doubt anyone will care/notice).


I resubmitted my comment, without the 'did you even watch the video' bit.


Also they seem to have tested AMD CPUs in "game mode" which disabled half of the cores (one of two CCXes)...


I like that "Game Mode" makes the AMD CPUs perform worse in game benchmarks...


Game Mode was added for Threadripper (a workstation CPU), which has multiple dies - many games were hit heavily by NUMA.

It's also available for desktop parts to disable one CCX when a game doesn't use more than 8 threads and you want to squeeze a couple more % of perf.

Game Mode is disabled by default and requires AMD software to be installed to enable it - they went the extra mile to gimp the competition.


The reviewers used Game Mode on a CPU without NUMA, so it was pointless and only meant to hurt performance.


It can still help, if used in precisely the right circumstance. Not that this makes your statement any less true, I'm just being needlessly pedantic.

One amusing thing I've noticed is that, playing the same game (Minecraft), Windows runs it ~25% faster in game mode than default NUMA.

In Linux the situation is reversed; it runs 10% faster with NUMA enabled, maybe because the Java garbage collector is NUMA-aware and I'm using enough memory that it's split across NUMA nodes anyway.

But either mode is faster in Linux than Windows.


Does Intel really think this approach is good for them? As a technical person, all I see is a company in trouble with products they need to lie about. This goes beyond market speak - it's deceptive.


The people who won't be fooled by this are likely the customers who are interested in the actual 10% difference for the high end and likely want this chip anyway.


I'm not sure about that.

At $499, the i9-9900K is almost competing against the 12-core Threadripper 2920X ($649, 12 cores/24 threads, 4.4GHz clock, 60 PCIe lanes, quad-channel memory).

I think most people will find more use out of +4 cores (granted, on a NUMA platform) than higher clocks. Cores for compiling code, rendering, video editing, etc. etc.

Pretty much only gamers want +Clock speeds, and more and more games actually use all-cores these days (Doom, Battlefield, etc. etc.)

-----------

That's the thing. The i9-9900K isn't even a "high-end chip" anymore. It's at best "highest of the mid-range" now that the HEDT category (AMD Threadripper, or Intel-X) exists.

Once you get into 8 cores/16 threads, I start to worry about dual-channel memory and 16 PCIe lanes + 4GB/s DMI to the southbridge. It's getting harder and harder to "feed the beast". A more balanced HEDT system (like Threadripper's quad-channel memory + 60 PCIe lanes) just makes more sense.


> Pretty much only gamers want +Clock speeds

I wish. We use a commercial path-tracer that scales very well to many cores, GPUs and entire clusters when it's chewing away at a single fixed scene or animation.

But in interactive mode many scene modifications are bottlenecked on a single or few threads and locks until it gets back into the highly optimized rendering code paths. So a lot of work goes into quickly shutting down as many background threads as possible to benefit from high turbo-boost clocks on Xeon Gold processors so the user doesn't have to wait long and then ramp them back up when it's just rendering the fixed scene.


Agreed. Games aren't the only thing people do with lots of cores / HEDT. Give me a 128 core machine and I'll happily keep them busy all day with work. No need for a heater either.


You have MSRP prices. Real-life prices are currently around $600 for both!


Preorders for the 9900K are $578 USD.

For that you can get a Ryzen 2700X, a nice motherboard, and a 256GB SSD. The performance delta shouldn't be more than a 15% deficit for the Ryzen, and only in a few specific games.


Keep seeing this 'delta' word. From context it is basically like 'difference'. But is there a delta between difference and delta?


Delta makes you sound smarter.


it is less typing, and less data. save the planet.


'd' works too, or did for Leibniz..


'Δ' must be a sign of Greenpeace membership, then.


Only if you've memorized how to type it. If you have to Google it beforehand, it serves as cheap signaling at best and an environmental transgression at worst.


[flagged]


Speaking of reddit, there is a subreddit called /r/changemyview where you can award deltas (Δ) to people that have changed your view or provided a particularly compelling insight.


'Delta' has connotations of being quantitative that 'difference' doesn't have.


'difference' can be subjective, 'delta' is a measurement of an amount of difference.


It comes from mathematical writing: the Greek letter delta (∆) is often used as a variable for the difference between two values.

So maybe you could translate it as "quantified difference".


Delta = numerical comparisons only, or at least that's the only context I've ever seen delta used in.


If you have a program that uses AVX instructions heavily, the delta can hit ~2x.


AVX512 downclocks, so you won't see the full 2x scaling.

Cloudflare ran into some trouble with this: https://blog.cloudflare.com/on-the-dangers-of-intels-frequen...


AVX512 doesn't exist on the i9-9900k.

You need the "X-series" to get AVX512.


AVX512 would be ~4x, but this Intel CPU doesn't have it.

AVX2 is ~2x; Ryzen/AMD fakes AVX2 instructions with multiple SSE instructions.

Some AVX2 instructions downclock, but not by much; I see very close to 2x speedup over SSE2 with some workloads. Some of the downclock loss is made up for because there are more instructions available (gather, etc.).

AVX512 might hit more than 4x improvement over SSE on some workloads despite the downclocks, due to all of the masking features. I have seen results consistent with this, 2nd hand. (I don't own an AVX512 cpu)

Anyway all of these things depend on workload, cpu, compiler etc. But it does happen!
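For anyone wondering where the ~2x figure comes from, here's a toy host-side sketch (a made-up scale-and-add over floats, n assumed to be a multiple of 8): the SSE loop moves 4 floats per instruction, the AVX2 loop 8, so the arithmetic instruction count roughly halves. Whether that turns into a real 2x still depends on memory bandwidth and downclocking.

    #include <immintrin.h>
    #include <cstddef>

    // y[i] = a * x[i] + y[i], 4 floats per iteration with SSE.
    void saxpy_sse(float a, const float* x, float* y, std::size_t n) {
        __m128 va = _mm_set1_ps(a);
        for (std::size_t i = 0; i < n; i += 4) {
            __m128 vx = _mm_loadu_ps(x + i);
            __m128 vy = _mm_loadu_ps(y + i);
            _mm_storeu_ps(y + i, _mm_add_ps(_mm_mul_ps(va, vx), vy));
        }
    }

    // Same thing with AVX2/FMA: 8 floats per iteration, half the loop trips.
    void saxpy_avx2(float a, const float* x, float* y, std::size_t n) {
        __m256 va = _mm256_set1_ps(a);
        for (std::size_t i = 0; i < n; i += 8) {
            __m256 vx = _mm256_loadu_ps(x + i);
            __m256 vy = _mm256_loadu_ps(y + i);
            _mm256_storeu_ps(y + i, _mm256_fmadd_ps(va, vx, vy));  // FMA: one op instead of mul+add
        }
    }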


> AVX512 would be ~4x, AVX2 is ~2x, Ryzen fakes AVX2 instructions with multiple SSE instructions.

Not quite. Everything is fake, because everything is encoded into micro-ops first.

Ryzen's internal micro-op engine is 128 bits wide. But it has 4 pipes... each handling 128 bits at a time. So any 256-bit instruction will simply use two pipes at once.

-------

So the 256-bit instruction does in fact, execute at once.

The difference is that Intel has 3 pipelines, each of which can do a 256-bit instruction by itself.

-------

In effect: Ryzen is 4x128-bit pipelines, with the ability to merge pipelines together to do a 256-bit instruction.

Intel is 3x256-bit pipelines, with the ability (on Skylake-X) to merge pipelines together to do a 512-bit instruction.

In any case, Intel has wider pipes than Ryzen. Intel Skylake is effectively a 256-bit CPU, while Ryzen is only a 128-bit CPU.


Decent summary:

> I don’t have too much of an issue with Intel commissioning the report itself, and the Principled Technologies report is very transparent as they clearly state how they tested the games and configured the hardware. The results and testing methods are heavily biased, but they haven’t attempted to hide their dodgy methods. You can dig into the specs and find all the details, it’s still dodgy but it’s a paid report, so it’s somewhat expected.


I disagree.

The vast majority of buyers, and sadly a huge chunk of the tech press, won't be able to tell from looking at the settings whether or how much the benchmark was skewed in Intel's favor.

And it was plenty skewed.


They disabled half the AMD cores.


Further reason to never purchase Intel again.


yeah. I went Ryzen for my new laptop. 100% happy with it, and this kind of crap from Intel just solidifies my decision.


Which Laptop?


I got (off some sweet discounts that expired over the weekend) a ThinkPad E485 with a 128GB M.2 SSD, 16GB RAM, a Ryzen 2700U, and a 1080p screen for $720 shipped.


I got an HP ENVY x360 15z about a month ago, during HP's Labor Day sale.

Ryzen 5 2500U, 256GB SSD, 16GB RAM, for only $859.

I'm perfectly happy with the speed of the CPU, and the integrated GPU is fantastic (it can even play Destiny 2 on low settings at 720p). That, and the build quality is absolutely superb.

Keep in mind, I bought this for programming at home, not gaming. But I'm glad that it works as a portable light gaming rig when needed.


I had had some hope that with a new CEO at the helm this behavior would stop.


There is no new CEO, only an interim/acting CEO.


Interesting follow-up : Interview w/ Principled Technologies on Intel Testing (9900K) - https://youtu.be/qzshhrIj2EY


Intel is failing benchmarks, security vulns all around. Nvidia is failing to deliver the price/perf they promised years ago. AMD is the opposite of both and its stock continues to dive. Go figure.


> AMD is the opposite of both and its stock continues to dive. Go figure.

AMD's share price is up almost 1400% in 2 years, what on earth are you on about?


It's currently on a bit of a slump after some financial firms lowered their targets of the stock price.

I have a feeling it'll make its way back up though.


I think when you outpace the general index by multiple factors over years, a small pull-back is not only expected, it's healthy.


Exactly. It jumped from bankruptcy level ($1.50) to the bottom of its peers ($20) in August! Since then it peaked in September and is in freefall again, when it should have been rising faster.


Stock performance is about a lot more than individual product technical specs. That said, I see AMD is down a bit recently but it's had a significant run-up the past 6 months (~2X) and even more over the past couple of years after being basically flat for ages. And it's got quite a high PE (58) which tends to make stocks vulnerable to even middling news. (By contrast, Intel's PE is about 17.)



