Friday, March 10, 2023

Houston Texas Appliance Parts: Arm Engineer Lauded for Concurrency Modeling Work

Houston Texas Appliance Parts

Arm Engineer Lauded for Concurrency Modeling Work

by Houston Texas Appliance Parts on Friday 10 March 2023 03:21 PM UTC-05

Arm distinguished engineer Jade Alglave has been named a finalist in the Blavatnik Awards, a program that recognizes young faculty-rank scientists in the UK and internationally, administered by the New York Academy of Sciences.

Alglave, who is also a professor of computer science at University College London, is being recognized for her ongoing work to develop a formal way of describing concurrency behavior in multi-core and multi-processor systems. Bugs caused by concurrency issues can be extremely difficult to replicate, as they generally only occur when systems are under stress. Preventing bugs like this from occurring in the first place is therefore crucial to ensuring reliable multi-core systems in everything from supercomputers to smartphones.

Highlighting Alglave's "remarkable achievement," Arm chief architect Richard Grisenthwaite told EE Times that Alglave's work should be celebrated, not only since it highlights her as a female role model for budding computer scientists, but also because her methodology's widespread applicability beyond Arm's ecosystem means it has already had significant impact across the industry.

Alglave and Grisenthwaite at work at Arm
Alglave and Grisenthwaite at work at Arm. (Source: Andrew Gemmell/The Last Word TV)

Concurrency

Alglave's work is centered on a formal way to describe concurrency behaviors of multi-core systems.

In almost all modern computing systems, multiple cores work in parallel, with different threads of execution running independently on each core. These threads must communicate, but operating independently means they can get out of synch.

Alglave's example is a pink pony, drawn by two CPUs exchanging information via shared memory. The first processor creates a pink triangle, and sends a flag to the other processor to let it know the triangle is complete. Then, the other processor can retrieve the triangle and complete the pony.

"If a reordering happens—and there are many different types of reordering—perhaps the triangle gets created but gets stuck along the way, or the flag happens to travel faster," Alglave said. "If the other processor looks for the triangle before it arrives, you get a [broken] pony. You need a barrier to ensure the flag doesn't arrive before the data, so the [message passing] protocol behaves the way you expected."

Rendering of a horse showing broken rendering due to concurrency bug
The pony on the right illustrates concurrency bugs, with data missing from the shared memory when the second processor tried to retrieve it. (Source: Arm)

As processors get more and more complicated, the problem gets worse—while the hardware may present the illusion that a program is run one instruction after the other, in practice, reordering happens widely as it is required to get the best performance. So, it's important to have a set of rules that express how much reordering is allowed, while not making it too complex for software programmers to understand.

One of the solutions is to add special instructions called barriers, which prevent reordering.

"We don't want people to have to think too much about which barrier to use; we want people to be able to reorder things," Alglave said. "So, [it's about] striking the balance, and more specifically, enunciating how to use barriers precisely is sometimes where prose is not enough, because you can argue forever about which barrier to use."

Preventing concurrency bugs - code sample
The message passing communication protocol written in Arm assembly code. The version on the right has added barriers (highlighted in green) that prevent the concurrency bug. (Source: Arm)

Alglave's work over the last 15 years has had several facets. Central to her work is the domain-specific programming language, Cat, which she developed in collaboration with Luc Maranget during her PhD. Cat is used to express the model—the list of formal rules for communication that are legal in the concurrent system under consideration, whether that's Arm hardware, another hardware architecture, an operating system or another concurrent system. Then there are tools that allow engineers to test what they've built against the relevant model (the tool suite is available online).

Grisenthwaite said the Cat language has been particularly helpful in formalizing an expression of the Arm architecture's concurrency behavior.

"I looked at the [Arm] architecture for a long time and tried to write down in the English language what reorderings were allowed, what behaviors we are meant to see… I tied myself in knots, and that's putting it mildly," he said. "[Alglave's] fundamental innovation is coming up with a language, and the tooling that allows you to express this in a mathematically rigorous way."

This makes formal reasoning about concurrency behavior possible, Grisenthwaite added. Using Alglave's tools, the developer can present a scenario and ask the tools whether certain behaviors are allowed, then get an answer (yes or no) and a graphical representation of why or why not.

One of the biggest problems with concurrency bugs is they often occur when the system is under stress and are thus extremely rare (Grisenthwaite suggested one failure might occur in 10,000 runs). This makes them extremely difficult to catch and fix. The tests written by Alglave's tool are designed to mimic these stress conditions and force reorderings to see if they produce a bug.

Reordering with barriers

Alglave and her team at Arm have been working on Arm's concurrency model for three years, adding features of the architecture to the model one by one.

"[Arm's] model allows people who write code for Arm hardware to know the rules, so they know when they need to add an explicit barrier, or when not to," Alglave said. "Hardware folks also benefit from having that set of rules to double check they've understood correctly which reorderings they are permitted to do."

The average application programmer probably won't ever need to use the model, Grisenthwaite stresses. For Arm's off-the-shelf cores, and implementations like the DSU (DynamIQ Shared Unit), Arm has already taken care of concurrency behaviors. Simple ordering rules are also built into programming languages like C.

"For other companies building processors on the Arm architecture… however much they reorder, however much they innovate in their designs, this allows their memory system experts to know whether they've done something that's going to break the world's software in very subtle ways, but ways that matter," Grisenthwaite said. This would apply to the handful of customers building their own Arm-based CPUs, including the team who worked on Fujitsu and Riken's Fugaku supercomputer, which Grisenthwaite describes as a "hugely concurrent system."

Alglave's team has extended Arm's model to bring in not just ordinary memory-to-memory communication, but also system software-oriented features like page table management and instruction-to-data communications.

"It turns out there's more and more about the way that processors communicate with each other that can be expressed in this format and can use this methodology, it's not a point solution to a particular problem, it's a very good way of reasoning generally about concurrency," said Grisenthwaite, adding that Alglave's methodology has become "a foundational tool in the architecture development process."

Industry-wide significance

Alglave, before joining Arm, also worked with companies including Nvidia and IBM to demonstrate the tools and methodology.

"We did find a few bugs on their deployed hardware, which caught their attention," she said.

The Cat language is flexible enough to apply to programming languages and operating systems. Colleagues in academia have written a model for C++, for example, and Alglave also previously worked on building a concurrency model for Linux.

"It's interesting to have language models and hardware models, because then you can ask, 'Did I compile this correctly?'," she said. "It's the same for operating systems. Linux is written in a dialect of C, so you write a Litmus test in that specific dialect of C and ask a question about can it behave that way. You have a set of rules as to how Linux threads are allowed to talk to each other, and the tool will tell you yes or no."

The potential of the Cat language extends to heterogeneous systems, such as CPU-GPU combinations. There have been industry initiatives to tackle this, like the Heterogeneous Systems Architecture (developed by the HSA Foundation), which aimed to reduce communication latency between CPUs, GPUs and other types of processors, and ease programming—the specification used the Cat language. (Heterogeneous systems are outside the current scope of Alglave's work at Arm).

"We recognize that at the language level, at the operating system level, at the hypervisor level, and at the hardware level, there are concurrency issues that need to be expressed," Grisenthwaite said. "Cat is a great tool for doing that… [we want to] encourage people to use this [methodology] and make it more ubiquitous; that's something Arm is very supportive of because it's consistent with our principles of wanting to work in partnership across the entire industry."

Future work

One area Alglave has identified for future work is applying her methodology earlier in the hardware design process.

"One thing that would be very interesting, and I think quite challenging both scientifically and from an engineering point of view is, can we use those rules as written in Cat to write SystemVerilog assertions for EDA tools, like we do for sequential or functional behaviors?" she said.

Currently, Cat tests can be generated and run pre-shipping, but applying them earlier in the chip design process, and more formally, would mean stronger guarantees that designs are following the concurrency rules of the architecture.

"There is a tremendous amount of research that can go in that direction," Grisenthwaite said. "[Proving designs] is one of the areas we're going to be investing in more formal methods for, because as designs get more complicated, it's harder to know if the designs are correct. Formal methods have a really strong place in that process."

The post Arm Engineer Lauded for Concurrency Modeling Work appeared first on EE Times.


Appliance Parts
Pennsylvania Philadelphia PA Philadelphia
March 10, 2023 at 03:00PM


California Vermont Michigan Montana
https://unitedstatesvirtualmail.blogspot.com/2023/03/houston-texas-appliance-parts-arm.html
March 10, 2023 at 06:08PM

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.