Radeon Vega revealed: 5 things we need to know about AMD’s cutting-edge graphics cards

“Wait for Vega.” For a past 6 months, that’s been a summary from a Radeon faithful, as Nvidia’s brutal GeForce GTX 1070 and GTX 1080 stomped above AMD’s Radeon RX 400-series graphics cards.

While Nvidia’s absolute new 16nm Pascal GPU pattern beam all a proceed from a squalid $120 GTX 1050 to a strong $1,200 GTX Titan X, AMD’s 14nm Polaris graphics are designed for some-more mainstream video cards, and a flagship Radeon RX 480 is no compare for Nvidia’s higher-end brawlers. Thus “Wait for Vega” has spin a rallying cry for AMD supporters with a lust for face-melting gameplay—Vega being a codename of a new enthusiast-class 14nm Radeon graphics pattern teased on AMD roadmaps for early 2017.

dsc06959 Gordon Mah Ung

The AMD GPU roadmap that started a wait for Vega.

Unfortunately, a wait will continue, as a new pattern won’t seem in shipping cards until someday after in a initial half of 2017. But during CES, Vega is apropos some-more than a small codename: AMD is finally divulgence some technical teases for Radeon’s performance-focused response to Nvidia’s titans, including how a new GPU intertwines graphics opening and memory architectures in radical new ways.

Before we dive in too deeply, here’s a high-level overview of a Vega technical pattern preview.

vega arch preview overview primary AMD

A technical preview of AMD’s Radeon Vega graphics architecture.

All those difference will spin suggestive in time. Let’s start with what we wish to hear about first.

1. Damn, it’s fast


In a preview shown to reporters and analysts in December, AMD played 2016’s high Doom on an early Radeon Vega 10 graphics label with all cranked to Ultra during 4K resolution. Doom scales like a champ, nonetheless that’s ruin on any graphics card: Even a GTX 1080 can’t strike a 60 frames per second normal during those settings, per Techspot. Radeon Vega, meanwhile, floated between 60 and 70fps. Sure, it was using Vulkan—a graphics API that favors Radeon cards in Doom—rather than DirectX 11. But, prohibited damn, a demo was impressive.

A integrate of other sightings in new weeks endorse Vega’s speed. At a New Horizon livestream that introduced AMD’s Ryzen CPU to a world, a association showed Star Wars: Battlefront using on a PC that pairs Ryzen with Vega. The twin maxed out a 4K monitor’s 60Hz speed with all cranked to Ultra. The GTX 1080, on a other hand, hits usually bashful of 50fps, Techspot’s contrast shows.

Meanwhile, a since-deleted leak in a Ashes of a Singularity database in early Dec showed a GPU with a Device ID “687F:C1” leading many GTX 1080s in benchmark results. Here’s a twist: The Device ID shown in a support rate conceal during AMD’s new Vega preview with Doom reliable that Vega 10 is indeed 687F:C1.

These numbers come with all sorts of caveats: Vega 10 isn’t in a final form yet, we don’t know either a graphics label AMD teased is Vega’s beefiest incarnation, all 3 of those benchmarked games heavily preference Radeon, et cetera.

But all that said, Vega positively looks rival on a graphics opening front, partly since AMD designed Vega to work smarter, not usually harder. “Moving a right information during a right time and operative on it a right way,” was a vital thought for a team, according to Mike Mantor, an AMD corporate associate focused on graphics and together discriminate architecture—and a immeasurable partial of that stems from restraining graphics estimate some-more closely with Vega’s radical memory design.

2. All about memory

When it comes to onboard memory, Vega is officious revolutionary—just like a predecessor.

AMD’s tide high-end graphics cards, a Radeon Fury series, brought cutting-edge high-bandwidth memory to a world. Vega carries on a flame with softened next-gen HBM2, bolstered by a new “high-bandwidth cache controller” introduced by AMD.

Technical stipulations singular a initial era of HBM to a small 4GB of capacity, that in spin singular a Fury array to 4GB of onboard RAM. Thankfully, HBM’s tender speed hid that smirch in a immeasurable infancy of games, nonetheless now HBM2 tosses those shackles by a wayside. AMD hasn’t strictly reliable Vega’s capacity, nonetheless a conceal during a Doom demo suggested that sold graphics label packaged 8GB of RAM. And that super-fast RAM is removing even faster, with AMD’s Joe Macri saying that HBM2 offers twice a bandwidth per pin of HBM1.

hbm cache controller AMD

Vega’s high-bandwidth cache and cache controller transparent a universe of memory potential.

But as it turns out, HBM was usually a beginning. “It’s an evolutionary record we can take by time, make it bigger, faster, make all these pivotal improvements,” pronounced Macri, a pushing force behind HBM’s creation. Vega builds on HBM’s shoulders with a introduction of a new high-bandwidth cache and high-bandwidth cache controller, that mix to form what Radeon trainer Raja Koduri calls “the world’s many scalable GPU memory architecture.”

AMD crafted Vega’s high-bandwidth memory pattern to assistance propel memory pattern brazen in a universe where perfect graphics opening keeps improving by leaps and bounds, nonetheless memory capacities and capabilities have remained comparatively static. The HB cache replaces a graphics card’s normal support buffer, while a HB cache controller provides fine-grained control over information and supports a whopping 512 terabytes—not gigabytes, terabytes—of practical residence space. Vega’s HBM pattern can enhance graphics memory over onboard RAM to a some-more extrinsic memory complement able of doing several memory sources during once.

vega 512tb practical address AMD

That’s expected to make a biggest impact in veteran applications, such as a new Radeon Instinct lineup or a cutting-edge Radeon Pro SSG card that swindle high-capacity NAND memory directly to a graphics processor. “This will concede us to bond terabytes of memory to a GPU,” David Watters, AMD’s conduct of Industry Alliances, told PCWorld when a Radeon Pro SSG was revealed, and this new cache and controller pattern designed for HBM’s blazing-fast speeds should supercharge those capabilities even more.

To expostulate a intensity advantages home, AMD suggested a photorealistic distraction of Macri’s home vital room. The 600GB stage routinely takes hours to render, nonetheless a multiple of Vega’s bravery and a new HBM2 pattern pumps it out in small minutes. AMD even authorised reporters to pierce a camera around a room in real-time, despite rather sluggishly. It was an eye-opening demo.

vega memory allocation AMD

Koduri stressed that games can also advantage from a high-bandwidth cache controller’s fine-grained, energetic information management, citing Witcher 3 and Fallout 4, any of that indeed use reduction than half of a memory allocated by a games when they’re using during 4K resolution. “And those are well-optimized games!” he said. Memory final are usually removing greedier in high-profile games, and doubly so during bleeding-edge resolutions. Here’s anticipating that a HB cache’s finer controls interconnected with HBM’s perfect speed—and other tweaks we’ll plead after in this article—alleviates that somewhat.

AMD also says that destiny generations of games could take advantage of high-bandwidth memory pattern to upload immeasurable information sets directly to a graphics processor, rather than doing it with a some-more hands-on proceed as finished today.

3. Efficient tube management

The proceed graphics cards describe games isn’t really efficient. Case in point: a next stage from Deus Ex: Mankind Divided. It packs in a whopping 220 million polygons, according to Koduri, nonetheless usually 2 million or so are indeed manifest to a player. Enter Vega’s new programmable geometry pipeline.

deus ex AMD
deus ex polygons AMD

Rendering a stage is a multi-step process, with graphics cards estimate zenith shaders before flitting a information on to a geometry engine for additional work. Vega speeds things adult with a assistance of obsolete shaders that fast brand a polygons that aren’t manifest to players so that a geometry engine doesn’t rubbish time on them. Yay, efficiency!

Vega also blazes by information during over twice a rise throughput of a predecessors, and includes a new “Intelligent Workgroup Distributor” to urge charge bucket balancing from a really commencement of a pipeline.

primitive shaders vega AMD

Vega’s Primitive Shaders.

These tweaks expostulate home how AMD’s infiltration in consoles can advantage PC gamers, too. The impulse for a bucket balancing tweaks comes from console developers used to operative “closer to a metal” than PC developers, who highlighted it as a intensity area for alleviation for AMD, Raja Koduri says.

4. Right task, right time

AMD designed Vega to “smartly report past a work that doesn’t have to be done,” according to Mike Mantor. The final tidbits done open by a association expostulate that home.

Vega continues AMD’s multi-year pull to revoke memory bandwidth expenditure (a query that Nvidia’s also embarked upon). Its next-gen pixel engine includes a “draw tide binning rasterizer” that improves opening and saves energy by teaming with a high-bandwidth cache controller to some-more good routine a scene. After a geometry engine performs a (already reduced volume of) work, Vega identifies overlapping pixels that won’t be seen by a user and so don’t need to be rendered. The GPU afterwards discards those pixels rather than wasting time digest them. The pull tide binning rasterizer’s pattern “lets us revisit a pixel to be rendered usually once,” according to Mantor.

draw tide binning rasterizer AMD


The revamped Vega pattern also now feeds describe back-ends from a pixel engine into a larger, common L2 cache, rather than pumping them directly into a memory controller. AMD says that should assistance urge opening in GPU discriminate applications that rest on deferred shading. (For a excellent overview on a topic, check out this ExtremeTech article on how L1 and L2 caches work.)

5. Next-gen discriminate engine

vega ncu AMD

Finally, AMD teased Vega’s “Next-gen discriminate engine,” that is able of 512 8-bit operations per clock, 256 16-bit operations per clock, or 128 32-bit operations per clock. The 8- and 16-bit ops mostly matter for appurtenance learning, mechanism vision, and other GPU discriminate tasks, nonetheless Koduri says a 16-bit ops can come in accessible for certain gaming tasks that need reduction difficult correctness as well. (The AMD-powered PlayStation 4 Pro also supports 256 16-bit operations per clock.)

vega ncu optimized AMD

Vega’s New Compute Unit can perform dual 16-bit ops during once.

Coincidentally enough, a Vega NCU can perform dual 16-bit ops simultaneously, doubled adult and scheduled together. This wasn’t probable in prior AMD GPUs, Koduri says. Vega’s next-gen discriminate section has been optimized for a GPU’s aloft time speeds and aloft instructions-per-cycle—though AMD declined to divulge a core time speeds for Vega usually yet.

Waiting for Vega

The wait for Vega continues, nonetheless now we have some thought of a ace dark adult a Radeon Technologies Group’s sleeve. These technical teases yield usually adequate of a glance to smooth a alarm of graphics enthusiasts while divulgence tantalizingly small in a proceed of tough news relating to consumer-focused Vega graphics cards. (AMD doesn’t wish to uncover a palm to Nvidia too much, after all.) It’s transparent that AMD’s attempting some nifty new tricks to urge a potency and intensity of Vega both in games and veteran uses. Nitty-gritty sum are certain to drip-drop out over a entrance months.

Fingers crossed that Vega comes earlier rather than later, however. AMD teased a 14nm Polaris GPU architecture during CES 2016 nonetheless unsuccessful to indeed launch a Radeon RX 480 until a really finish of June. Vega’s been slapped with a recover window someday in a initial half of 2017, so if AMD waits until E3 to launch this new era of enthusiast-class graphics cards, Nvidia’s brutal GTX 1080 will have already been on a streets for a full year.

Vega looks extremely darned intriguing nonetheless even a many doctrinaire Radeon loyalists can usually wait for so prolonged to build a new rig, generally with AMD’s much-hyped Ryzen processors rising very, really soon.

Do you have an unusual story to tell? E-mail stories@tutuz.com