Yes, NEON is 128-bit, but Apple Silicon currently offers 4x SIMD units (for 4x 128-bit ops per cycle), while most AVX2 implementations can do 2x 256-bit ops per cycle, so it's more or less the same. Some Intel CPUs with AVX-512 can do more than one 512-bit operation per cycle, but it is unclear to me under which conditions.

To make comparisons more difficult, there is also a difference in clocks. M1 will run at ~2.9-3.0 GHz while running heavy SIMD code, while x86 CPUs can clock significantly higher. However, since we are talking about sustained multithreaded throughput, the x86 chip is probably operating close to its base frequency (which is ~3 GHz for modern mobile x86), so we should get comparable throughput.

And then there is the matter of data loads and stores. Modern x86 CPUs can sustain two 256-bit fetches from L1D per cycle, while M1 can't. If I understand the available Firestorm info correctly, it will only do two 128-bit loads from L1D per cycle on most data (and it can do an additional 128-bit load if the load overlaps with a previous store). This might limit M1 performance on workloads where the ratio of operations to memory transfers is low. Stockfish could hit one of those cases (it processes large batches of data but does only a few operations per load), so it might be limited by L1D performance - but who knows. At the same time, M1 has much larger caches, so it might have an advantage if you have to process large amounts of data with less predictable access patterns (which does not seem to be the case for Stockfish specifically).

Finally, looking at the ISA itself, it's a bit of a mixed bag.
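The back-of-envelope arithmetic behind these claims can be sketched in a few lines. This is only a sketch of the estimates quoted in the discussion (unit counts, vector widths, ~3 GHz clocks, L1D fetch widths); none of these numbers are measurements, and real sustained throughput depends on the workload.

```python
# Rough per-cycle SIMD and L1D bandwidth comparison, using the figures
# quoted in the text above (estimates, not measured values).

def simd_bits_per_cycle(units: int, width_bits: int) -> int:
    """Peak SIMD bits processed per cycle = number of units * vector width."""
    return units * width_bits

# Apple M1 (Firestorm): 4 NEON units, 128-bit vectors.
m1_simd = simd_bits_per_cycle(units=4, width_bits=128)
# Typical AVX2 x86 core: 2 units, 256-bit vectors.
x86_simd = simd_bits_per_cycle(units=2, width_bits=256)
assert m1_simd == x86_simd == 512  # same peak SIMD width per cycle

# Under sustained multithreaded SIMD load, both are assumed to sit
# near ~3 GHz (M1's SIMD clock vs. x86 base clock), so peak
# throughput in bits per nanosecond comes out comparable too.
m1_ghz, x86_ghz = 3.0, 3.0
print(m1_simd * m1_ghz, x86_simd * x86_ghz)

# L1D load bandwidth per cycle is where they differ: Firestorm does
# two 128-bit loads per cycle on most data (a third only when a load
# overlaps a previous store), while modern x86 sustains two 256-bit loads.
m1_l1d = 2 * 128
x86_l1d = 2 * 256
print(x86_l1d / m1_l1d)  # x86 has ~2x the L1D fetch width per cycle
```

This ratio is why a load-heavy kernel with few operations per load (as suggested for Stockfish's NNUE inference) could end up L1D-limited on M1 even though raw SIMD throughput is on par.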