Question x86 and ARM architectures comparison thread.

511 · Jan 14, 2026

GCC 13 vs GCC 16 + Clang 16 oof

poke01 · Jan 14, 2026

511 said:
GCC 13 vs GCC 16 + Clang 16 oof

Btw the compilers aren’t comparable. GCC 16 didn’t even exist when this was tested.

It should just say Xcode 16.1 or just Apple clang 16.1; I don’t know why Michael adds the misleading stuff.

511 · Jan 14, 2026

poke01 said:
Btw the compilers aren’t comparable. GCC 16 didn’t even exist when this was tested.

It should just say Xcode 16.1 or just Apple clang 16.1; I don’t know why Michael adds the misleading stuff.

Sure, but Clang 16 is from 2024 and GCC 13 is from 2023, so I doubt changes for Zen5/ARL would be in GCC 13. He would have been better off testing with an LLVM-based compiler from around the same timeframe; that would have been a fairer comparison. And if he is using Apple's proprietary compiler, then Intel/AMD might as well use their own compiler toolchains.

poke01 · Jan 14, 2026

511 said:
Sure, but Clang 16 is from 2024 and GCC 13 is from 2023, so I doubt changes for Zen5/ARL would be in GCC 13. He would have been better off testing with an LLVM-based compiler from around the same timeframe; that would have been a fairer comparison. And if he is using Apple's proprietary compiler, then Intel/AMD might as well use their own compiler toolchains.

Here is a newer GCC, 14.2; it’s faster than 13.2. Michael didn’t test FFmpeg compilation in this review, but we can compare LLVM compilation.

AMD Ryzen 9 9950X3D Delivers Excellent Performance For Linux Developers, Creators & Technical Computing Review - Phoronix

www.phoronix.com

Also these don’t look like wall readings to me.

And here is the M4 Max in a MacBook.

M4Max Benchmarks [2503281-NE-M4MAX949371] - OpenBenchmarking.org

openbenchmarking.org

What I’m most impressed by is the perf/W of the M4 Max, which uses around 100 watts at the wall.

I’m going to test my own 9800X3D on LLVM and see the time difference.

511 · Jan 14, 2026

@poke01 can you run it using AOCC as well?

MS_AT · Jan 14, 2026

poke01 said:
It should just say Xcode 16.1 or just Apple clang 16.1; I don’t know why Michael adds the misleading stuff.

Because he doesn't care about the fine details, only about scale and automation. That is why you will see all kinds of inconsistencies when you scrutinize the benchmarks from Phoronix 😉 I guess on macOS gcc is simply an alias for clang, for convenience.

511 said:
I doubt changes for Zen5/ARL would be in GCC 13. He would have been better off testing with an LLVM-based compiler from around the same timeframe; that would have been a fairer comparison. And if he is using Apple's proprietary compiler, then Intel/AMD might as well use their own compiler toolchains.

Support for Zen5 in mainline LLVM is still a joke. It's a copy-paste of the Zen4 backend, which itself only recently got fixed and was a copy-paste of Zen3. AMD is dropping the ball there.

poke01 said:
Michael didn’t test FFmpeg compilation in this review, but we can compare LLVM compilation.

poke01 said:
I’m going to test my own 9800X3D on LLVM and see the time difference.

Be sure to build for the same target, because by default each compiler will target the architecture it's running on, which will trigger different code/data paths in the compiler 😉 Make sure to use the same options and, depending on the platform, the right compiler package 😉 [As in my table, there was a large difference between clang 18.1.8 builds depending on the package source].
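
A minimal sketch of what pinning the options could look like for an LLVM build. The helper name `print_llvm_configure` and the `llvm`/`build` directory names are hypothetical; the CMake variables are standard LLVM/CMake ones. It only prints the configure line, so the output from each machine can be compared before anything is built. Note that this pins the options and the list of backends being compiled; making the host compiler itself emit code for a different architecture would need a cross-compilation setup, which is not shown here.

```shell
#!/bin/sh
# Hypothetical helper: print one pinned LLVM configure command so every
# machine in the comparison builds the same sources with the same options.
#   $1 = C compiler, $2 = C++ compiler, $3 = backend list (e.g. X86 or AArch64)
print_llvm_configure() {
    # LLVM_TARGETS_TO_BUILD fixes which backends get compiled, so the amount
    # of work is the same on every host instead of depending on defaults.
    printf 'cmake -G Ninja -S llvm -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=%s -DCMAKE_CXX_COMPILER=%s -DLLVM_TARGETS_TO_BUILD=%s\n' \
        "$1" "$2" "$3"
}

# Usage (dry run, nothing is executed):
#   print_llvm_configure clang clang++ X86
```

Saving that line on each machine and diffing the files is a cheap way to confirm the options really match.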

poke01 · Jan 14, 2026

511 said:
@poke01 can you run it using AOCC as well?

Will do tomorrow. It’s almost midnight here. I need to sleep.

Covfefe · Jan 14, 2026

Nothingness said:
How did they measure power for the x86 machines? As I previously wrote, I only trust power at the wall, after all this is what the machines I run consume.

Software readings for everything. Here's the review.

The M4 showing was all the more impressive when looking at the CPU power consumption exposed by powermetrics compared to the Intel/AMD RAPL/PowerCap results on Linux.
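
For context, a minimal sketch of how a software power reading of that kind can be derived on Linux, assuming the powercap sysfs interface is exposed. The helper name `avg_watts` and the exact sysfs path are assumptions and may differ per system. This is package power as reported by the CPU, not power at the wall.

```shell
#!/bin/sh
# Average power in watts from two energy counter samples.
#   $1 = first reading (microjoules), $2 = second reading (microjoules),
#   $3 = seconds between the two readings.
# Assumes the counter did not wrap around between the samples.
avg_watts() {
    awk -v e0="$1" -v e1="$2" -v dt="$3" \
        'BEGIN { printf "%.2f\n", (e1 - e0) / 1000000 / dt }'
}

# Example on a machine that exposes powercap (path is an assumption and
# reading it may require root):
#   f=/sys/class/powercap/intel-rapl:0/energy_uj
#   a=$(cat "$f"); sleep 1; b=$(cat "$f"); avg_watts "$a" "$b" 1
```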

Nothingness · Jan 14, 2026

MS_AT said:
I guess on macOS gcc is simply an alias for clang, for convenience.

Correct. If one wants a real gcc, it can be installed via Homebrew and used this way:

$ gcc-15 --version
gcc-15 (Homebrew GCC 15.2.0) 15.2.0
$ gcc --version
Apple clang version 17.0.0 (clang-1700.6.3.2)
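
A small sketch for checking which compiler a given command really is before trusting a benchmark label. The function name `compiler_family` is hypothetical; it only looks at the first line of the `--version` output, like the two lines above.

```shell
#!/bin/sh
# Classify a compiler from the first line of its `--version` output.
# Prints "clang", "gcc" or "unknown".
compiler_family() {
    case "$1" in
        *clang*)     echo clang ;;   # e.g. "Apple clang version 17.0.0 ..."
        *gcc*|*GCC*) echo gcc ;;     # e.g. "gcc-15 (Homebrew GCC 15.2.0) 15.2.0"
        *)           echo unknown ;;
    esac
}

# Usage:
#   compiler_family "$(gcc --version | head -n 1)"
```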

poke01 · Jan 16, 2026

Tested on an undervolted AMD 9800X3D. Power consumption in btop was 95 W, with a peak of 110 W. @MS_AT @511

Latest clang 21.1.6

Test4 Benchmarks [2601158-NE-TEST4834135] - OpenBenchmarking.org

openbenchmarking.org

Latest GCC 15.2.1

Test2 Benchmarks [2601151-NE-TEST2270825] - OpenBenchmarking.org

openbenchmarking.org

Oh, and AMD's compiler is pure dogwater. It's based on clang 17.0.6.
PTS crashed after it finished, so there is no web link, but I managed to get a pic of the XML file.

It took 870 seconds, double the other compilers.

511 · Jan 16, 2026

@poke01 I know of a funny thing: you can use Intel's compiler and pass a generic flag 🤣 (like -O2 -march=x86-64-v3, or something similar depending on what you passed to those compilers).

MS_AT · Jan 16, 2026

poke01 said:
Oh, and AMD's compiler is pure dogwater. It's based on clang 17.0.6.

I would expect it to be the slowest; after all, it's supposed to run extra optimization passes so that the binary it produces is faster. I doubt anyone pays much attention to how fast the compiler itself runs.

poke01 said:
Latest clang 21.1.6

I guess this was the distribution's clang, not the official clang, since the latest clang is 21.1.8? I am not sure how CachyOS is sourcing it, or whether they are building it from scratch. Also, do you know how I can read the exact command used for the build from OpenBenchmarking? I wasn't able to find it.

poke01 said:
Latest GCC 15.2.1

That is a surprisingly good result. I wonder if CachyOS is doing the sane thing and has ditched ld in favour of lld underneath. Or maybe it's the other way around and the build defaults to the system linker, which by default should be ld, so both are using the slow linker, hmm. I am too unfamiliar with Linux to do more than guess 😉

511 said:
I know of a funny thing: you can use Intel's compiler and pass a generic flag 🤣 (like -O2 -march=x86-64-v3, or something similar depending on what you passed to those compilers).

Well, you can run icx and tell it to compile for znver5 outright; it's LLVM-based after all 😉 [It's another thing altogether that AMD still has not managed to merge proper scheduler data into upstream LLVM.] https://godbolt.org/z/Ts6TxPef4

511 · Jan 16, 2026

MS_AT said:
Well, you can run icx and tell it to compile for znver5 outright; it's LLVM-based after all 😉 [It's another thing altogether that AMD still has not managed to merge proper scheduler data into upstream LLVM.] https://godbolt.org/z/Ts6TxPef4

I just don't know whether Intel still does the fancy AMD bottlenecking in their compilers; this might get rid of it.

poke01 · Jan 16, 2026

MS_AT said:
do you know how I can read the exact command used for the build from OpenBenchmarking?

phoronix-test-suite install pts/build-llvm to install

phoronix-test-suite run pts/build-llvm-1.6.0 to run

Then select 1, for Ninja, when it asks you to choose.

MS_AT · Jan 16, 2026

511 said:
I just don't know whether Intel still does the fancy AMD bottlenecking in their compilers; this might get rid of it.

Their own compiler, I mean icc, is deprecated. Their new compiler (icx) is a tuned LLVM with extra optimization passes for Intel hardware, as far as I understand. So it does not cripple AMD chips the way the old one used to. It just doesn't apply the extra passes, but the generic tunings apply to AMD chips too. Mystical is using icx to build y-cruncher for Zen, or at least was the last time I checked, and he found it the best available for the purpose at the time.

poke01 said:
phoronix-test-suite install pts/build-llvm to install

phoronix-test-suite run pts/build-llvm-1.6.0 to run

Ah, I guess I was not precise enough. I meant the exact cmake command used to run the build itself, but I think that can be inferred from the PTS sources 🙂 Thanks anyway 🙂

511 · Jan 16, 2026

MS_AT said:
Their own compiler, I mean icc, is deprecated. Their new compiler (icx) is a tuned LLVM with extra optimization passes for Intel hardware, as far as I understand. So it does not cripple AMD chips the way the old one used to. It just doesn't apply the extra passes, but the generic tunings apply to AMD chips too. Mystical is using icx to build y-cruncher for Zen, or at least was the last time I checked, and he found it the best available for the purpose at the time.

ICX is arguably the best compiler for x86_64, so I am not really surprised here, but I just was not sure about the AMD paths. Thanks for letting me know.

poke01 · Jan 16, 2026

The Intel compiler is faster than GCC, but mainline clang is the best compiler.

511 · Jan 16, 2026

Well, looks like the optimization is for Intel-only stuff.

MS_AT · Jan 19, 2026

511 said:
Well, looks like the optimization is for Intel-only stuff.

Why do you think so? I mean the benchmark results presented here do not show, in my opinion, enough data to confirm or deny this claim.

1. ICX took longer to compile LLVM (clang) than LLVM (clang) did, but we do not know how fast the compiled binary is. The extra time can come from running more optimization passes than vanilla clang does. So it might be that it's actually the extra optimization that makes it slower.
2. We do not know how icx was built: by icx, by gcc, or by clang. This can influence the performance of icx. For example, the "default" way of building a clang release, at least as I remember it, used 3 phases. 1) You build clang with the system compiler, whatever it might be, which gives you clang_1. 2) You use clang_1 to compile clang while gathering profile information for profile-guided optimization. 3) You produce clang_2 using clang_1, consuming the PGO data from 2). clang_2 is then the final release artifact. The influence of that process can be seen in my previous posts, where the difference between the 18.1.8 builds came down to the memory allocator used and whether PGO was applied.

In other words, your statement might of course be true, but based on this thread alone I don't think we have sufficient evidence to back it up 😉
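
The three phases can be sketched roughly as the dry-run plan below. This is a simplified illustration, not the actual release script: the helper name `print_pgo_plan`, the directory layout, and the `profiles/` location of the raw profile files are assumptions, and the training run between phases 2 and 3 is left out. `LLVM_BUILD_INSTRUMENTED` and `LLVM_PROFDATA_FILE` are existing LLVM CMake options. Nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands of a three-phase PGO
# clang build. $1 = system C compiler, $2 = system C++ compiler,
# $3 = absolute path of the work directory containing the llvm sources.
print_pgo_plan() {
    sys_cc=$1
    sys_cxx=$2
    root=$3
    common="-G Ninja -S $root/llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang"

    # Phase 1: the system compiler builds clang_1 (plus llvm-profdata,
    # which is needed later to merge the raw profiles).
    echo "cmake $common -B $root/stage1 -DCMAKE_C_COMPILER=$sys_cc -DCMAKE_CXX_COMPILER=$sys_cxx"
    echo "ninja -C $root/stage1 clang llvm-profdata"

    # Phase 2: clang_1 builds an instrumented clang. Running that clang on a
    # training workload (not shown) writes raw profiles, which are merged.
    echo "cmake $common -B $root/instr -DCMAKE_C_COMPILER=$root/stage1/bin/clang -DCMAKE_CXX_COMPILER=$root/stage1/bin/clang++ -DLLVM_BUILD_INSTRUMENTED=IR"
    echo "ninja -C $root/instr clang"
    echo "$root/stage1/bin/llvm-profdata merge -output=$root/clang.profdata $root/instr/profiles/*.profraw"

    # Phase 3: clang_1 rebuilds clang using the merged profile; the result
    # is clang_2, the release artifact.
    echo "cmake $common -B $root/stage2 -DCMAKE_C_COMPILER=$root/stage1/bin/clang -DCMAKE_CXX_COMPILER=$root/stage1/bin/clang++ -DLLVM_PROFDATA_FILE=$root/clang.profdata"
    echo "ninja -C $root/stage2 clang"
}

# Usage (dry run): print_pgo_plan gcc g++ /path/to/work
```

Whether a given clang or icx package went through something like this is exactly the kind of detail that can move compile-time results.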

511 · Jan 19, 2026

MS_AT said:
Why do you think so? I mean the benchmark results presented here do not show, in my opinion, enough data to confirm or deny this claim.

I am dumb, I mistook it for runtime. My bad.

johnsonwax · Jan 19, 2026

poke01 said:
The Intel compiler is faster than GCC, but mainline clang is the best compiler.

Looks like Apple won the compiler wars.

poke01 · Jan 22, 2026

The CPU Performance Of The NVIDIA GB10 With The Dell Pro Max vs. AMD Ryzen AI Max+ "Strix Halo" - Phoronix

www.phoronix.com

GB10 vs Strix Halo CPU

wajeehakhan · Jan 29, 2026

From my understanding, ARM cores are extremely efficient in single-threaded tasks under low power, which makes them great for laptops and mobile devices. On the other hand, x86 cores still shine in multi-threaded workloads and server environments.
I think both architectures have their strengths depending on the use case, and it’s exciting to see innovation from both sides. Looking forward to seeing how efficiency and performance evolve in the coming years!

cytg111 · Jan 31, 2026

I don't want to start a new thread for something that might be meh. This is neither x86 nor ARM, so the question is: is it comparable? Q.ANT.

As I understand it, it's not a phantom product; it's rolling through fabs right now ... photonic computing, and it wipes out anything Blackwell/Rubin.

DavidC1 · Feb 1, 2026

cytg111 said:
I don't want to start a new thread for something that might be meh. This is neither x86 nor ARM, so the question is: is it comparable? Q.ANT.

As I understand it, it's not a phantom product; it's rolling through fabs right now ... photonic computing, and it wipes out anything Blackwell/Rubin.

Nvidia has its AI stranglehold through the GPU, which is programmable and much more general-purpose than a specialized accelerator like that. Proprietary and specialized accelerators are always faster, but it's usually the general-purpose hardware that takes the greater share, unless the accelerators are at least an order of magnitude faster. They went from gaming to workstation 3D, and then to crypto mining, and when that popped they went to AI. GPUs keep finding new niches, while special-purpose chips are a dead end.

It's the computer equivalent of a person only being able to do one thing well. The world/trend changes and the person is out of a job.
