Intel just announced their new Sunny Cove architecture, which comes with a lot of new bells and whistles. The Intel processor line-up has been based on the Skylake architecture since 2015, so the new architecture is a breath of fresh air for the world’s largest chip maker. Intel has been in the limelight this year thanks to the hardware vulnerabilities known as Spectre and Meltdown; the new designs have of course been patched against those weaknesses.
The new architecture (said to be part of the Ice Lake-U CPU) comes with a lot of new promises, such as faster cores, five allocation units, and upgrades to the L1 and L2 caches. There is also support for the AVX-512 (Advanced Vector Extensions 512) instruction set, which will improve performance for neural networks and other vector arithmetic.
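To give a flavor of what that looks like in code, here’s a minimal C sketch (the function name is ours, and it assumes an AVX-512F capable chip plus a compiler flag like -mavx512f): a single fused multiply-add instruction processes sixteen packed floats at once, the building block of neural-network inner loops.

    #include <immintrin.h>

    /* out[i] = a * x[i] + y[i] for 16 floats in one fused
       multiply-add -- the kind of kernel AVX-512 accelerates. */
    void saxpy16(float a, const float *x, const float *y, float *out)
    {
        __m512 va = _mm512_set1_ps(a);   /* broadcast the scalar      */
        __m512 vx = _mm512_loadu_ps(x);  /* load 16 unaligned floats  */
        __m512 vy = _mm512_loadu_ps(y);
        _mm512_storeu_ps(out, _mm512_fmadd_ps(va, vx, vy));
    }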
Another significant change is support for 52 bits of physical address space and 57 bits of linear address space. Today’s x64 CPUs can only use bits 0 to 47, for an address space spanning 256 TB. The additional bits mean a bump to a whopping 4 PB of physical memory and 128 PB of virtual address space.
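Doing the arithmetic: each extra address bit doubles the space, so 2^48 bytes is 256 TB, 2^52 bytes is 4 PB, and 2^57 bytes is 128 PB.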
The new offering was demoed on the company’s 10 nm process, which incidentally is the same one used for the previously launched Cannon Lake. The new processors are due in the second half of 2019 and are being heavily marketed as a boon for the cryptography and artificial intelligence industries. The claim is that for AI, memory-to-CPU distance has been reduced for faster access, and that special cryptography-specific instructions have been added.
I hope we see more “sunny” names that mock cloud computing
Not as jovial, but there’s Sunrise Point:
https://en.wikipedia.org/wiki/List_of_Intel_chipsets#LGA_1151_rev_1
Next will be titled Silver Lining.
If so, I DEMAND that the chip(set) has an actual silver lining!
I read the linked announcement — it doesn’t mention Meltdown/Spectre by name. There is a mention of “embedded security features” but that is more likely to be referring to crypto and not security vulnerabilities.
So I’m wondering why the HAD title on this article puts emphasis on Meltdown when it apparently wasn’t mentioned at all.
Another big question is *which ones* get patched, if any.
Spectre variant 1 will be with us literally forever, so long as CPUs speculate: they may speculate past a bounds check, and we will always have to serialize those in software to prevent it (unless you think you can remove all possible side channels).
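For anyone who hasn’t seen it, here’s a minimal C sketch of the variant-1 pattern (array names follow the original Spectre paper; the sizes are illustrative). Without the fence, the CPU can speculate past the bounds check and pull secret-dependent data into the cache:

    #include <emmintrin.h>   /* _mm_lfence */
    #include <stddef.h>
    #include <stdint.h>

    extern uint8_t array1[16];
    extern uint8_t array2[256 * 4096];
    extern size_t  array1_size;

    uint8_t victim(size_t x)                  /* attacker controls x */
    {
        if (x < array1_size) {
            _mm_lfence();                     /* serialize: block the speculative load */
            return array2[array1[x] * 4096];  /* without the fence, which cache line
                                                 gets touched leaks array1[x] */
        }
        return 0;
    }

That fence is exactly the serialization cost mentioned above; every one of them stalls the pipeline.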
I’ve seen processors coming that patch against variants 1-3, but not 1.1 or 1.2, and perhaps not L1TF or any others. That’s to say nothing of the issues with co-resident hyperthreads, though I’ve heard tell that it’s likely everyone will be dropping hyperthreading in future CPUs as an easier fix. Even for the CPUs that fix the first few, it took months for them to be announced. There’s a REALLY long lag on fixing architectural stuff in hardware. They’ve probably been working on this architecture for the last three years or so and have only known about Spectre for a third of that.
They aren’t going to drop SMT anytime soon, but it’s probably going to change how operating systems behave and allocate threads. We’re probably going to see a point where specific cores are used only by the OS and secured processes, with the rest left user-accessible to run whatever programs you want. Regressing from speculative execution and SMT would be insane. A lot of the patches, especially for Linux, to combat the SMT vulnerabilities are way too aggressive, sometimes slashing performance by 30-40%. That’s like blowing up your bridges, power plants, factories, and airports because your enemy invented a wheel.
There have been plenty of cases where dropping SMT has been an acceptable compromise, especially when there are other issues with SMT, like TLBleed.
Intel called the changes “Side Channel Mitigations” in other slide decks and verbally confirmed (at least to me) that these target Spectre/Meltdown. The Sunny Cove architecture seems to have more units tasked with load/store and address calculations, which is what you need to offset the performance losses caused by the mitigations.
saying a hardware design has been patched makes my skin crawl
What does a crawling skin look like?
Where does it crawl to?
How do you look without a skin?
Does it come back on its own or do you have to go searching for it?
How do you put it on again?
Soo many questions…
It crawls around while still remaining attached to your body. So if you lie on the floor, you can move around without moving your limbs.
Almost as much as when people say they’ve “compiled” VHDL or Verilog.
First you synthesize it, then you put it through place-and-route, people.
Your comment is both condescending and wrong.
I compile verilog a few times a day, every work day, and have for the past 20 years. My verilog gets compiled (usually by VCS) into a simulation .exe at least 100 times for every time it gets synthesized to gates.
You’re a special case where simulation on a CPU is desired. That never produces a logic circuit implemented in *any* kind of hardware, FPGA or ASIC or similar.
So does any CISC system.
So does not knowing that nothing Intel or AMD has made in the last 20 years was CISC.
Thought I was the only one! Which bytes should be patched to fix the hardware problem?
“Security vulnerability patched” So… they took out the Intel Management Engine?
my kind of guy ^^^
IME is like a cryptographic algorithm with a hard-coded backdoor.
You just gotta know where the door is and the secret phrase, and it’s blown wide open.
It’s also REALLY attractive to copyright holders. A big part of why it exists (and the PSP, for that matter) is to underpin DRM schemes. Both the IME and PSP are finding use in Windows 10’s DRM extensions, where they do all of the decryption because the OS (especially on Linux) “can’t be trusted” (just like those pesky users).
AMD recently started putting PSP-like security processors in their graphics cards. I’ve heard stories that it’s to prevent people from modifying the BIOS and overclocking the card, but I think it’s pretty obvious that it’s to further lock down the “approved” content pathway. Your legally purchased content will have to be passed to the PSP to be decrypted, which will magically pass it to its counterpart in your graphics card, which will pass it down your approved HDMI cable to your approved display device.
…and the proof for these claims can be found where? There’s a huge difference between hardware support for DRM schemes on the GPU and offloading decryption to the management engine.
(Also you didn’t purchase the content, just a license to decrypt and view it.)
https://imgs.xkcd.com/comics/content_protection.png
You spelt Integrated Maleware Engine wrong.
Maleware?
The ugly side of penetration testing. :-d
Are there any tech-savvy governments around the world that have implemented a total ban on the use of all Intel ME and AMD PSP processors yet?
I am guessing that China and Russia would have some kind of ban. Like China has its Sunway processor (inspired by the DEC Alpha) for military use. It would make no sense if a single secret FISA court order to Intel could open a giant backdoor and nobble your hardware.
China has its “863 Program”, a government-funded project to render China independent of foreign technology.
From what I’ve heard, China is running PowerPC and variants thereof, while Russia is developing chips based on MIPS.
Both China and Russia have used everything from x86/ARM/MIPS/POWER derivatives to fully home-grown designs over the years. China even has access to an AMD EPYC clone, the Hygon Dhyana, through a joint-venture with AMD. This clone currently powers the 38th fastest supercomputer in the world, https://www.top500.org/system/179593 , while a home-grown chip powers the third fastest, https://www.top500.org/system/178764 .
TEN MILLION CORES!?!?
RISC-V should change that.
RISC-V is just an Instruction Set plus some definitions. Actual implementations are not bound to being open source, can and will ship with proprietary firmware blobs and security engines, and designers can add proprietary instructions. SiFive has released some code necessary to use their SoCs under open source licenses, but others won’t. I don’t expect to see anything for Western Digital’s line of chips, for example. Hex-Five, a silver sponsor of the RISC-V Foundation, is already offering a proprietary solution called MultiZone which is basically equivalent to Intel’s ME, AMD’s PSP or ARM’s Trustzone.
RISC-V was not built for privacy, security, open source etc. It was built to give hardware designers more freedom than e.g. ARM does, and at a much lower price. But at the end of the day it will be more chaotic than ARM and actual chips will offer everything the market needs, which means also having secure enclaves and management engines.
” But at the end of the day it will be more chaotic than ARM and actual chips will offer everything the market needs, which means also having secure enclaves and management engines.”
*shrug* And why shouldn’t it? At the end of the day, freedom-loving open-sourcers need security as much as (or more than) anyone else. As for management, that should be obvious to anyone working with a large number of computers.
Hackers reference?
Sunway, what a strange processor.
It uses an on-chip network bus to communicate between the cores and has a separate RAM controller for every core.
Neat. Reminds me of this video about the Cray XC30 supercomputer. It also networks between cores with their own memory, but not on-chip.
https://youtu.be/XEdrIpeXQnw
I think you are confusing nodes with processors. The Cray XC30 uses standard Xeon processors.
“Performance improvements”, in that they’ve fixed the performance regression from the original spec. execution vulnerabilities
What application with that level of CPU power could make use of that address space?
My guess would be Intel’s long term vision is all storage devices mapped into the 64 bit address space. Kinda like Unix/Linux mmap() or WIN32 CreateFileMapping() now, but automatically for everything on all your much-larger-in-the-future storage devices.
The article specifically mentions non-volatile optane will have 350 ns average access time, compared to 10 us for optane SSD they’re selling now (and not mentioned but presumably everyone knows normal NAND SSD has latency in the ~100 us range).
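For anyone who hasn’t used it, the mmap() idea in miniature (a sketch with error checks omitted; “data.bin” is a placeholder name): the file’s bytes show up directly in your address space and the kernel pages them in on demand, with no read() copies.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("data.bin", O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        const char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        /* p[0] .. p[st.st_size - 1] now read like ordinary memory */
        munmap((void *)p, st.st_size);
        close(fd);
        return 0;
    }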
In memory databases and in memory compute. When you no longer have to copy data into working memory (because all of it is working memory) then you’re looking at an astonishing leap in computing ability. This is what 3DXPoint (a.k.a. Optane) is designed for. But the memory bus needed to be modified to achieve useful bandwidth. I suspect that is part of the new chip, along with the increase in address space.
Windows 12?
half life 3 confirmed?
I am willing to bet $5 that Microsoft will not release a “Windows 13”.
But they might, it seems they have been prepping us for it with Bob, Windows Millennium Edition, Vista, Windows 8…
Well, you could calculate pi for starters.
And Microsoft Office gets more bloated every year, so it’s only a matter of time…
That’s some extremely round circles.
Right now: everything that needs to process a lot of data really fast. HPE was already offering servers for in-memory databases with 80 TB of RAM years ago. Some development machines already had to work around the 256TB limit.
In the future: with NVMe and NVDIMMs there will be a lot more we want to map into the physical address space than just normal RAM.
This specific processor will not have the full physical address space theoretically provided by this five-level page table extension. The address pins available on the processor don’t need to match the maximum architecturally accessible memory, so doing the extension once, in a way that will be good enough for the foreseeable future, is logical.
Now, the big thing with this is the expansion of the virtual address space, which can be used for e.g. sparsely mapped data structures or a single-level storage system (map all solid-state storage into the memory space).
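A minimal Linux-flavoured sketch of that sparse-mapping trick (the 1 TB figure is just for illustration): reserve an enormous stretch of virtual address space up front, and physical pages only get allocated when something is actually touched. A 57-bit linear space makes far bigger reservations plausible.

    #include <sys/mman.h>

    #define SPAN (1ULL << 40)   /* reserve 1 TB of virtual space */

    void *reserve_sparse(void)
    {
        /* MAP_NORESERVE: don't charge swap up front; pages are
           faulted in lazily on first touch. */
        return mmap(NULL, SPAN, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    }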
“…special cryptography-specific instructions have been added”
No thanks NSA
Just be glad they didn’t add any block-chain specific instructions.
There was an article I read recently on LWN about Linux’s use of onboard random number generators. The on-CPU ones from Intel are used to stir the pot, as it were, where entropy is hashed in from other sources, but the generator isn’t credited with any entropy itself. Since it goes through an AES whitening stage, as the article points out, for all we know the hardware random number generator implements:
return AES_encrypt(NSA_key, my_counter++);
I don’t know that anyone will trust them.
On the other hand, Intel has stubbornly refused to fix the last few Spectre variants (Spectre RSB, TLBleed, etc.), which has had people up in arms over their ability to leak information from software cryptographic algorithms. According to Intel, if you write them correctly, or use *theirs*, there’ll be no issue, as everything is nice and balanced (doing equal work in every conditional path, etc.), so it doesn’t matter if people can snoop on you. This has been met with some scepticism.
“Today’s x64 CPUs can only use bits 0 to 47, for an address space spanning 256 TB. The additional bits mean a bump to a whopping 4 PB of physical memory and 128 PB of virtual address space.”
The NSA Utah Data Center will be delighted.
There is almost no point to having that much physical memory on a single processor, because you can’t really populate it with real-life memory. There are fanout and signal-integrity constraints on how many memory modules you can pile together. While you can get around that by adding extra hardware, it’ll add a lot of latency.
The processor then becomes a bottleneck, as you have a serious mismatch between processing power and memory. So in the end, it is meaningless for a lot of applications.
You are correct—but only for older memory tech. See my comment above.
IBM’s POWER9 is happily working with those memory buffers.
Apart from that: When you remove one bottleneck, a different one will start to hit you. Nothing new here, thanks. Just proceed with your daily doings and let the engineers do what is necessary.
>In early 2016 IM Flash announced that the first generation of solid-state drives would achieve 95000 IOPS throughput with 9 microsecond latency.
The latency is still multiple orders of magnitude higher than conventional memory. For comparison, an old 6800/6502 runs a 1 us cycle. :P
My point still remains: it is hard to have a vast amount of memory with low latency. It is a big issue that they recognized in the early days of computing, which led to the memory hierarchy. There is only so much you can do with large memory arrays; bus capacitance is the limit. While you can mux busses together and add pipelines, you are adding latency.
In the end you have a single-socket CPU with a huge amount of data behind very large latency. Most supercomputing setups decided to split the memory into smaller islands with their own processors/computing units. This lets them get the latency down and gain some parallel performance.
Bubul:
Like I said, it is useless for -MOST- applications. No need to add any insults, as some of us are engineers here. I haven’t seen any real engineering comments from you here.
Again, you’re only looking backward to old tech. The new non-volatile memory (3DXPoint/Optane) is slower than DRAM, but close enough to replace it. Then, instead of moving memory back and forth between SSD and RAM, the CPU just uses the storage AS memory. A lot of work the CPU formerly had to do managing that memory swapping goes away, along with the time spent doing it.
I’m skeptical of the patches. Sure, they doubtless mitigate the exact vulnerabilities already known as Meltdown and Spectre, but those are just specific implementations of a whole class of information-leakage attacks. I doubt they’ve overhauled the entire architecture to the point that the whole class is moot — it’d be a much bigger story if they had.
Until then, this is just a press release to say “we put some new straw over where the wolf blew holes in the old straw, everything is fine”, while ignoring the heavy sticks and bricks that nobody wants to lift.
There’s still the issue of how those patches affect performance.
Yes, but to a fair extent you have to wait for someone to invent the (cr)hack before you can mitigate against it. You can’t protect against the unknown, not in reality.
It will be interesting to see if they do pop in the second half of next year (2019). AMD is supposed to pop in the first half with another refresh, I believe I read. Nothing like a little ‘competition’ to move computing forward.
256TB to 4PB … And here I struggle to just find ‘affordable’ good 32GB – 64GB of DDR4 memory in my home systems! Some company/agency with deep pockets will probably find a use for that much memory!
If memristor memory ever became a thing, that might be possible.
If memristors are even a thing. Jury’s still out on that one.
Well there’s still things we don’t understand.
https://arstechnica.com/science/2018/12/phase-change-memory-built-from-layers-of-atomically-thin-materials/
Everspin has been shipping MRAM for over a decade. All state-of-the-art silicon fabs (basically everybody except Intel) can produce the necessary cells. Most of the shipped designs currently are embedded devices which just use a few kilobits of it due to the high price, but DIMM and M.2 modules between 128 and 256 MB in size have been demoed since around 2016.
These “alternate” memory technologies unfortunately don’t tend to scale with shrinking process nodes, so they are not going to compete with 1T DRAM cells in the density game. DRAM is moving to class 1x, i.e. the 18-19 nm range, and slowly shrinking down to the mid-teens in small steps. That’s the level of density any of these memories would have to reach to win on density. Flash cheats by storing multiple bits as different voltage levels and stacking cells in 3D structures.
As for DDR4, I bought two 16 GB modules for around $150 on sale. The prices are slowly falling. Companies that build systems with large amounts of memory do not care too much about memory prices like we do.
DRAM could easily deploy the same trick as flash; after all, DRAM is basically just a matrix of sample-and-hold circuits holding analogue voltages, but the refresh/regeneration circuitry would be more complicated (and refresh periods shorter).
Intel Inside
Idiot Outside
Lest we forget
http://uncyclopedia.wikia.com/wiki/File:Satan_inside.jpg
Can anyone say why Spectre and Meltdown patches are not optional? Machines that don’t connect to the internet don’t need them. I have several PCs crunching data which aren’t online, and they’re slower due to patches for security I don’t need or want. I wonder why.
Because most computers are online, and you can’t give everyone a custom CPU. That’s really what it boils down to.
While attacks haven’t been *seen* in the wild yet (though it’s dubious whether you could even detect them), it’s too juicy a target for nobody to try reducing it to practice.
You can easily disable the software mitigations, at least on Linux.
Don’t patch them, then? I’d rather people have to opt out of security than opt in. Windows will probably drop the patches on you; not much you can do there. Your microcode updates should be up to you, though. Linux could certainly be compiled without the security mitigations.
You can disable each individual security mitigation through the Linux kernel command line. No need to compile a special kernel; just configure GRUB et al. accordingly.
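For example (a sketch; exact flag names vary by kernel version, so check Documentation/admin-guide/kernel-parameters.txt for yours), edit /etc/default/grub and run update-grub:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet nopti nospectre_v2 spec_store_bypass_disable=off"

Here nopti turns off the Meltdown page-table isolation, nospectre_v2 drops the retpoline/IBRS mitigations, and the last flag re-enables speculative store bypass.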
As Moore’s law runs out of steam and extensions pile up, it’s looking more and more like CISK.
Moore’s Law ran out of steam more than ten years ago. Intel would have had to fit >100 billion transistors into the Xeon Phi in 2016 to keep up with 2008 numbers, but they managed only 8 billion. NVIDIA would have had to fit more than 120 billion transistors onto a Tesla V100 die to keep up with 2011 numbers, but they only did ~21 billion.
Just stop using the whole term “Moore’s Law” already.
Well yes, it’s dead. Good enough? The point is, there was a time when it was thought CISK was the answer, then it turned out RISK was better. Soon CISK processor producers started translating CISK into RISK at runtime in hardware, so that their secret RISK processor inside could look like their CISK processor line. Now, as we can no longer expect ever-increasing speed, we’ve moved to multicore. But that does not help single-thread execution. For that, the new answer is all these extensions. Effectively going back to CISK: special hardware that only works when a small set of specific instructions is used.
Komputer?
The micro-ops every larger CPU now uses internally are actually not RISC either. In my opinion they’re much closer to VLIW than anything else.
“Patched”???
So they were not able to come up with a new design and were forced to Frankenstein old stuff together.
No, thanks! I will stay with mainframes of the 80s, they still kick ass.
You’re reading this on an 80s mainframe? How!? Man I’d love one of those!
My NCR Tower, with an MMU and an Ethernet card, is more than capable of rendering basic HTML. Java on the other hand….
Java’s dead. But sadly you’re probably referring to JavaScript, which bloody infests the place. Still, there’s an inverse proportion between the worthwhileness of a website and its JavaScript use. The most useful stuff is often just text and a few optional images to make it look nice, maybe a photo of something for reference. The real JavaScript cesspools are sites aimed at “sharing” shit for Facebook. There’s surely a Venn diagram somewhere waiting to be drawn on all this.
Please tell us about your Tower; dead computers are surely part of the remit here. And if they’re not, there are still loads of us here interested. I learned C on a 68020 (I think) with 12 MB RAM, a Honeywell Bull Unix System V mini, on Wyse 60 terminals with real green text. And COBOL.
How many users did your tower formerly serve?
Just an FFS, but asking Google about “NCR minicomputer” and some other stuff, the first dozen results come up without “NCR” or “minicomputer”, cos it thought I might not’ve been serious when I asked. ’sake!
…I think this powerful processor may be aimed at managed cloud computing.
And what about the backdoors for US government usage?
The USA accuses Huawei and …?
It’s called alternatively “national interest” or hypocrisy, depending on how wide your global perspective is.