WEBINAR

Designing an Industry Standard API to Manage Multicore System Resources

Presented on August 11, 2009

 
Managing application-layer shared resources on a multicore chip requires features such as synchronization primitives and memory allocation and management. Most OSs provide rich resource management features, but many multicore programmers must work within systems that use multiple OSs because the cores have different instruction sets, because the memory is not uniformly accessible by all cores, or because the application requires multiple OS instances or even multiple different operating systems. These resource management needs cannot be met using proprietary OS APIs, standards such as POSIX or Pthreads, or even direct use of the atomic primitives provided by a given instruction set. The Multicore Association®'s MRAPI® provides a simple yet comprehensive standard API for resource management that supplies these features and enables application portability. This talk gives an overview of MRAPI®: motivation, related work, features, status, and next steps.

The Presenter: Jim Holt
Jim leads the Processor Core Architecture and Modeling Team for Freescale’s Networking and Multimedia group. Jim has 27 years of industry experience focused on microprocessor and SoC architecture, distributed systems, design verification, and optimization. Jim is an IEEE Senior Member, is a member of the Industrial Advisory Board for the Department of Computer Science at Texas State University, and is a board member for the Multicore Association. He is also chair of the Integrated Systems & Circuits Science area for the Semiconductor Research Corporation (SRC), and chair of the Multicore Resource API Working Group for the Multicore Association. Jim has over 30 refereed publications in journals and conferences, and frequently serves on research proposal selection committees and on program committees for peer-reviewed journals and conferences. Jim earned a Ph.D. in Electrical and Computer Engineering from the University of Texas at Austin, and an MS in Computer Science from Texas State University.
 
 

Questions and Answers from the Webinar


Q: Can you elaborate on how hardware accelerators will interact with embedded processors using MRAPI®? An API is a library of C/C++ functions, but it is not clear how an API can be used with a hardware accelerator, which can be very application specific.

A: The API can be implemented on top of a hardware accelerator. For example, an SoC may have hardware acceleration for mutexes, in which case an MRAPI® implementation could utilize that hardware accelerator without the programmer needing to know how to interact with it directly. Alternatively, if you want to cooperatively manage access to hardware accelerators, such as an MPEG encoder, your application would create an MRAPI® mutex and use locks on that mutex to govern when application threads could access the encoder. MRAPI® does not attempt to manage the resources directly as this capability belongs to the operating system or hypervisor code.
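
The cooperative locking pattern in this answer can be sketched in C. The sketch below uses a POSIX threads mutex as a stand-in and makes no MRAPI® calls; encoder_t, encode_frame, and run_encoder_demo are hypothetical names chosen for illustration, and the "encoder" is only a counter. It shows several threads agreeing to take one lock before touching the shared device.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define FRAMES_PER_THREAD 1000

/* Hypothetical stand-in for a shared accelerator such as an MPEG encoder. */
typedef struct {
    pthread_mutex_t lock;  /* cooperative lock every thread agrees to take */
    int busy;              /* 1 while some thread is inside the encoder */
    int frames_encoded;    /* total frames processed so far */
    int overlaps;          /* times a thread found the encoder already busy */
} encoder_t;

static encoder_t g_encoder = { PTHREAD_MUTEX_INITIALIZER, 0, 0, 0 };

/* Encode one frame; the mutex keeps all other threads out meanwhile. */
static void encode_frame(encoder_t *enc)
{
    assert(enc != NULL);
    pthread_mutex_lock(&enc->lock);
    if (enc->busy)
        enc->overlaps++;       /* would indicate broken mutual exclusion */
    enc->busy = 1;
    enc->frames_encoded++;     /* pretend the hardware did some work */
    enc->busy = 0;
    pthread_mutex_unlock(&enc->lock);
}

static void *worker(void *arg)
{
    encoder_t *enc = arg;
    for (int i = 0; i < FRAMES_PER_THREAD; i++)
        encode_frame(enc);
    return NULL;
}

/* Runs the workers; returns total frames encoded, or -1 on any failure. */
int run_encoder_demo(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    g_encoder.busy = 0;
    g_encoder.frames_encoded = 0;
    g_encoder.overlaps = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, &g_encoder) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    if (started != NUM_THREADS || g_encoder.overlaps != 0)
        return -1;
    return g_encoder.frames_encoded;
}
```

Calling run_encoder_demo() from a main function should report NUM_THREADS * FRAMES_PER_THREAD frames. The lock is purely cooperative: nothing stops a thread that skips the mutex, which matches the answer's point that the API does not manage the device itself.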


Q: Does the MRAPI® have test cases?

A: MRAPI® itself does not have test cases. However, as with the MCAPI® example implementation, which is available from the Multicore Association, ultimately we should have an MRAPI® example implementation that will contain test cases.


Q: Do you have implementations of MRAPI® that can be tested by the specification reviewers?

A: We hope to publish a simple POSIX implementation along with the release of the spec.


Q: I assume MRAPI® relies upon a "local" resource manager? In other words, MRAPI® must store state, and so does it need a way to allocate state storage?

A: It depends on the specific MRAPI® implementation as to how resources are managed. Our initial implementation stores state in shared memory protected with a semaphore.
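
As a rough illustration of state kept in shared memory and protected with a semaphore, here is a minimal C sketch. It is not the working group's implementation: shared_state_t, state_create, state_alloc_handle, and run_shared_state_demo are hypothetical names, and the code assumes a Linux-like POSIX system that supports anonymous shared mappings and unnamed process-shared semaphores.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping record living in shared memory. */
typedef struct {
    sem_t guard;       /* process-shared semaphore protecting next_handle */
    int next_handle;   /* next resource handle to hand out */
} shared_state_t;

/* Map an anonymous shared region and initialize the state inside it. */
static shared_state_t *state_create(void)
{
    shared_state_t *st = mmap(NULL, sizeof *st, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return NULL;
    if (sem_init(&st->guard, 1 /* shared across processes */, 1) != 0) {
        munmap(st, sizeof *st);
        return NULL;
    }
    st->next_handle = 0;
    return st;
}

/* Hand out one handle while holding the semaphore. */
static int state_alloc_handle(shared_state_t *st)
{
    int handle;
    while (sem_wait(&st->guard) == -1 && errno == EINTR)
        ;                      /* retry if interrupted by a signal */
    handle = st->next_handle++;
    sem_post(&st->guard);
    return handle;
}

/* Parent and child each allocate per_process handles from the same state.
   Returns the final handle count, or -1 on error. */
int run_shared_state_demo(int per_process)
{
    assert(per_process >= 0);
    shared_state_t *st = state_create();
    if (st == NULL)
        return -1;

    int total = -1;
    pid_t pid = fork();
    if (pid == 0) {            /* child: allocate, then exit immediately */
        for (int i = 0; i < per_process; i++)
            state_alloc_handle(st);
        _exit(0);
    }
    if (pid > 0) {             /* parent: allocate, then wait for the child */
        int status = 0;
        for (int i = 0; i < per_process; i++)
            state_alloc_handle(st);
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            total = st->next_handle;
    }
    sem_destroy(&st->guard);
    munmap(st, sizeof *st);
    return total;
}
```

Because every increment happens under the semaphore, run_shared_state_demo(n) should return 2 * n; without the semaphore the two processes could lose updates.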


Q: I saw a statement that other solutions are too heavyweight because they target distributed systems. Does this mean that your goal is not to target the distributed system? What happens when we have a multi-chip multi-core? Isn't this the same as a distributed system?

A: To be clear, MRAPI® targets cores on a chip and chips on a board. MRAPI® is not intended to scale beyond that scope.


Q: Is it possible to hide the differences between local and remote memory by just the different properties of these memories? Remote memory will have higher latency, some access restrictions, etc.

A: The working group has considered the possibility of allowing the "promotion" of local memory to remote memory, and then allowing all memory accesses to occur through the API. This would effectively hide the difference, but at a performance cost. For now this is a deferred feature.


Q: In many hardware systems transitions between low power (or no power) and fully working conditions are extremely frequent. In such systems, some state change callbacks will become a nightmare. How are you planning to handle the situation?

A: If the application does not want to be disturbed by frequent callbacks, it would be better for it to poll MRAPI® periodically at a time of its own choosing. This is certainly possible with MRAPI®.
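
The polling approach in this answer can be sketched generically in C. This is not MRAPI® code: resource_status_t, resource_set_state, resource_poll, and run_poll_demo are hypothetical names, and the sketch assumes a single writer that updates a change counter alongside the state. It shows how one poll, made when the application chooses, collapses many transitions into a single report of the latest state.

```c
#include <assert.h>
#include <stdatomic.h>

typedef enum { POWER_OFF, POWER_LOW, POWER_ON } power_state_t;

/* Hypothetical status record updated by whoever owns the resource. */
typedef struct {
    _Atomic unsigned changes;  /* incremented once per state transition */
    _Atomic int state;         /* latest power_state_t value */
} resource_status_t;

static void resource_init(resource_status_t *r, power_state_t s)
{
    atomic_init(&r->changes, 0u);
    atomic_init(&r->state, (int)s);
}

/* Writer side: publish the new state, then bump the change counter. */
static void resource_set_state(resource_status_t *r, power_state_t s)
{
    atomic_store(&r->state, (int)s);
    atomic_fetch_add(&r->changes, 1u);
}

/* Reader side: returns 1 and writes the latest state if anything changed
   since the previous poll, otherwise returns 0 and leaves *out alone. */
static int resource_poll(resource_status_t *r, unsigned *last_seen,
                         power_state_t *out)
{
    unsigned now = atomic_load(&r->changes);
    if (now == *last_seen)
        return 0;
    *last_seen = now;
    *out = (power_state_t)atomic_load(&r->state);
    return 1;
}

/* 1000 transitions occur, but the application polls only twice. */
int run_poll_demo(void)
{
    resource_status_t r;
    unsigned last_seen = 0;
    power_state_t seen = POWER_OFF;
    int reports = 0;

    resource_init(&r, POWER_OFF);
    for (int i = 0; i < 1000; i++)
        resource_set_state(&r, (i % 2 == 0) ? POWER_ON : POWER_LOW);

    reports += resource_poll(&r, &last_seen, &seen);  /* sees a change */
    reports += resource_poll(&r, &last_seen, &seen);  /* nothing new */
    assert(seen == POWER_LOW);  /* the final transition (i = 999) */
    return reports;
}
```

With callbacks the application would have been interrupted 1000 times; here it learns the final state from one poll, at the cost of not seeing intermediate states.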


Q: How would MRAPI® handle a request for HW accelerators if these accelerators are actually powered off because of inactivity?

A: First, keep in mind that a primary goal of MRAPI® is to provide primitive services that higher layer pieces of hardware or software can utilize. In other words, MRAPI® contains the primitives for high level resource sharing. Features such as power management require more functionality than what MRAPI® was intended to provide (to maintain efficiency). So, in such a scenario requiring power management, the application would determine that there was no acceleration available and would have to find an alternative means to perform its work, perhaps by executing code on the CPU.
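
The fallback described in this answer might look like the following C sketch. All names here (accel_t, checksum_accel, checksum_cpu, checksum) are hypothetical, the powered flag is an assumed way of learning that the accelerator is unavailable, and the "accelerated" path simply reuses the CPU routine so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PATH_ACCELERATOR, PATH_CPU } exec_path_t;

/* Hypothetical accelerator handle; powered models whether the block is on. */
typedef struct {
    int powered;
} accel_t;

/* Software fallback: a simple additive checksum computed on the CPU. */
static uint32_t checksum_cpu(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Stand-in for offloaded work; a real driver call would go here. */
static uint32_t checksum_accel(const accel_t *acc, const uint8_t *data,
                               size_t len)
{
    assert(acc != NULL && acc->powered);
    return checksum_cpu(data, len);  /* same result, pretend it was hardware */
}

/* Use the accelerator when it is available, otherwise run on the CPU.
   *path_used reports which path was taken. */
uint32_t checksum(const accel_t *acc, const uint8_t *data, size_t len,
                  exec_path_t *path_used)
{
    if (acc != NULL && acc->powered) {
        *path_used = PATH_ACCELERATOR;
        return checksum_accel(acc, data, len);
    }
    *path_used = PATH_CPU;
    return checksum_cpu(data, len);
}
```

The decision lives in the application, not in the resource API: the caller checks availability and picks the path, which is the division of responsibility the answer describes.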


Q: Are there any plans to include trigger APIs? For example, invoke callback when a particular resource hits some pre-defined conditions/threshold?

A: Currently there are no threshold-related callbacks other than counter wrap-arounds. MRAPI® may consider this for a future version.
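
To make the idea of a counter wrap-around callback concrete, here is a small generic C sketch. It does not use the MRAPI® interface: counter_t, counter_add, and count_wrap are hypothetical names, and an 8-bit counter is assumed only so that a wrap is easy to trigger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*wrap_cb)(void *user);

/* Hypothetical event counter that notifies its owner when it wraps. */
typedef struct {
    uint8_t value;    /* 8-bit counter, wraps from 255 back to 0 */
    wrap_cb on_wrap;  /* called once per wrap-around; may be NULL */
    void *user;       /* opaque pointer passed back to the callback */
} counter_t;

/* Add delta to the counter and fire the callback if the value wrapped. */
static void counter_add(counter_t *c, uint8_t delta)
{
    uint8_t before = c->value;
    c->value = (uint8_t)(before + delta);
    /* A smaller value after adding means the counter passed 255. */
    if (c->value < before && c->on_wrap != NULL)
        c->on_wrap(c->user);
}

/* Example callback: count how many wrap-arounds have been seen. */
static void count_wrap(void *user)
{
    int *wraps = user;
    (*wraps)++;
}

/* Increment 300 times; returns the number of wrap callbacks received. */
int run_counter_demo(uint8_t *final_value)
{
    int wraps = 0;
    counter_t c = { 0, count_wrap, &wraps };

    for (int i = 0; i < 300; i++)
        counter_add(&c, 1);

    if (final_value != NULL)
        *final_value = c.value;
    return wraps;
}
```

After 300 increments the 8-bit counter has wrapped once and holds 300 - 256 = 44. A threshold trigger of the kind the question asks about would compare the value against a configured limit at the same point in counter_add.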


Q: For primitives, did the MRAPI® working group consider including read-copy-update locks (RCUs)?

A: The MRAPI® working group did consider read-copy-update locks. After discussion with some of the original creators of the RCU code for Linux, we determined that for now there is not sufficient evidence that a high performance user-level implementation of RCU is feasible. We intend to monitor developments in this area, as we are aware that it is an active area of research.


Q: The specified MRAPI® primitives are necessary, but seem to be insufficient. I would think that the goal of MRAPI® would be to include the ability to write a resource manager that any application that uses MRAPI® could plug into. That implies that at a minimum: resource enumerations should be standardized, or a mechanism for self-describing enumerations be created.

A: MRAPI® is intended to provide some of the primitives that could be used for creating a higher level resource manager. However, it is also intended to be useful for application level programmers to write multicore code, and for this reason it was kept minimal and orthogonal to other Multicore Association APIs. The working group believes that a full-featured resource manager would require all of the Multicore Association APIs, e.g., MCAPI®, MRAPI®, and MTAPI.


Q: Are any companies currently incorporating, or planning to incorporate, MRAPI® in their products? If so, can you name the products?

A: At this time there have been no public announcements. There is at least one university research project that is looking at MRAPI® for heterogeneous multicore computing. We expect more activities to emerge once the spec is released officially.


Q: Is MRAPI® planned to be processor agnostic?

A: Yes, that is the plan.


Q: Is MRAPI® dependent on any other Resource Management standards and/or approaches?

A: No, there should be no such dependencies in MRAPI®.