
Discussion Apple Silicon SoC thread

Page 482

Eug

Lifer
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s
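The 2.6 TFLOPS figure above lines up with the EU count if you assume the commonly reported configuration: 8 ALUs per execution unit, an FMA counting as two floating-point ops, and a GPU clock of roughly 1.278 GHz (the clock and ALU width are assumptions, not stated in the post):

```python
# Sanity-check of the M1 GPU throughput figure above.
# Assumed (not from the post): ~1.278 GHz GPU clock, 8 ALUs per EU,
# 2 FLOPs per ALU per cycle (fused multiply-add).

eus = 128                 # execution units (from the spec list)
alus_per_eu = 8           # assumed ALU width per EU
flops_per_alu = 2         # FMA counts as two floating-point ops
clock_hz = 1.278e9        # assumed GPU clock

tflops = eus * alus_per_eu * flops_per_alu * clock_hz / 1e12
print(f"{tflops:.2f} TFLOPS")  # 2.62 TFLOPS, matching the quoted 2.6
```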

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from occasional slight clock speed differences).

EDIT:


M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second-generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, HEVC (H.265), and ProRes
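The "100 GB/s" unified-memory figure above follows from the usual assumed configuration of LPDDR5-6400 on a 128-bit bus (neither number is stated in the post):

```python
# Where the M2's "100 GB/s" unified-memory figure comes from.
# Assumed (not from the post): LPDDR5-6400 on a 128-bit bus.

transfer_rate = 6400e6    # transfers/s, assumed LPDDR5-6400
bus_width_bits = 128      # assumed total memory bus width

bandwidth_gbs = transfer_rate * bus_width_bits / 8 / 1e9
print(f"{bandwidth_gbs:.1f} GB/s")  # 102.4 GB/s, quoted as "100 GB/s"
```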

M3 Family discussion here:


M4 Family discussion here:


M5 Family discussion here:

 
Oh wow. That's probably the fastest BMC in the world.
All the ones I've used are slow as hell so those Apple DC people are very lucky.
Not exactly a BMC.
This patent https://patents.google.com/patent/US20250377933A1
probably describes the system.

There was a video released by the WSJ a few weeks ago showing manufacturing at an Apple Texas plant that showed a few frames of this device being put together.

My recollection is that the system is perhaps best thought of as something like a commercial nV system, with a front-end CPU that gives jobs to the worker GPUs (connected in a ring via TB).
But it's more flexible than that because the objects in the ring are M class SoCs not just GPUs, so you can also give them CPU-based jobs, which makes the system something like AWS Lambda.

But the essence is that the outside world talks to one SoC (perhaps used only as a CPU and security agent, not a GPU) and that SoC generates tasks given to the internal ring of worker SoCs.
It seems like Apple, given the HW they have, have kinda rolled the equivalent of Nitro and BMC and a few other functions into this "controller" SoC. Obviously that means factoring their OS somewhat differently from everyone else, but they're used to that by now!

(Oh, and BTW don't read the tweet comments! You will feel your IQ dropping as you see one stupid statement after another.)
 
That's how Apple described it in their documentation. The interface to the tool needs to be AS to use their hardware based E2EE, and it then decodes the package, figures out the model to run, farms it out, packages back up the results and sends it home.

What wasn't clear was what the "farms it out" step would look like. I suspected that a single Ultra wouldn't make sense if the job was coming off a Max or Ultra (otherwise why not just run it on the local machine), but maybe the focus was phones/Airs and not Ultras. This suggests that maybe you can take a big job off an Ultra, send it up, and have it farmed out to a bunch of whatever these are, getting results back meaningfully quicker than the Ultra could manage itself.

Basically it's a load balancer for AI. Nitro is also a good analogy. It needs dedicated Apple silicon for E2EE, and it needs enough compute to decode/encode and compress. Ideally you virtualize that using their new container tool, one instance per core, and nuke and replace it fresh after each request. Sounds like Apple did an extra run of M2 Ultras before they took that production offline.
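The decode, pick-a-model, farm-out, package-up flow described in the last two posts can be sketched very loosely as follows. Every name here is invented, and the XOR "cipher" is a toy stand-in for Apple's hardware-backed E2EE, not a description of it:

```python
# Loose sketch of the flow above: the controller decrypts the request,
# picks a model, farms out the work, then packages and re-encrypts the
# result. XOR is a toy placeholder for real E2EE; all names are invented.

def xor_crypt(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)  # NOT real crypto, stand-in only

MODELS = {"small": lambda text: text.upper(),   # hypothetical worker jobs
          "large": lambda text: text[::-1]}

def handle_request(encrypted: bytes, key: int, model: str) -> bytes:
    plaintext = xor_crypt(encrypted, key).decode()  # decode the package
    result = MODELS[model](plaintext)               # "farm it out"
    return xor_crypt(result.encode(), key)          # package it back up

key = 0x5A
reply = handle_request(xor_crypt(b"hello", key), key, "small")
print(xor_crypt(reply, key).decode())  # HELLO
```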
 