Mixed-Criticality Architectures

Provided by: Roman Obermaisser and Donatus Weber (USIEGEN)

Introduction

Mixed-criticality architectures with support for modular certification make the integration of application subsystems with different safety assurance levels both technically and economically feasible. Strict segregation of these subsystems is a key requirement to avoid fault propagation and unintended side-effects due to integration. Also, mixed-criticality architectures must deal with the heterogeneity of subsystems that differ not only in their criticality, but also in the underlying computational models and the timing requirements. Non safety-critical subsystems often demand adaptability and support for dynamic system structures, while certification standards impose static configurations for safety-critical subsystems. Several aspects such as time and space partitioning, heterogeneous computational models and adaptability were individually addressed at different integration levels including distributed systems, the chip-level and software execution environments.

A holistic architecture for the seamless mixed-criticality integration encompassing distributed systems, multi-core chips, operating systems and hypervisors is a research problem that is addressed within the European project DREAMS. DREAMS provides a hierarchical mixed-criticality platform with support for strict segregation of subsystems, heterogeneity and adaptability [1].

1. Requirements of an Architecture for Mixed-Criticality Systems

Logically, a mixed-criticality system consists of application subsystems with corresponding criticality levels (e.g., steer-by-wire, powertrain, comfort in a car), which can be further decomposed into components (e.g., engine control and transmission control in the case of the powertrain subsystem).

A platform for mixed-criticality systems provides resources for the execution of multiple application subsystems and their components. For each component, the platform must provide an execution environment with corresponding resources (e.g., processing, memory, input/output) as depicted in Figure 1.

 

Partitioning is a prerequisite for modular certification, where each application subsystem is certified to the respective level of criticality.  Partitioning ensures the independence of execution environments regardless of design faults affecting the components. Hence, a component within a partition is a Fault Containment Region (FCR) [2] that delimits the immediate impact of a fault. Other components can only be affected by faulty inputs and not via shared resources such as the same underlying processor core.

To simplify timing and resource analysis, mixed-criticality platforms should ensure determinism and temporal independence of safety-critical subsystems from non safety-critical ones. Temporal predictability and low jitter (i.e., a small difference between maximum and minimum computational and communication delays) also improve the quality of control in mixed-criticality systems. Many safety-critical application subsystems are hard real-time control applications, where the achievement of control stability and safety depends on the completion of activities (such as reading sensor values, performing computations, communicating and controlling actuators) in bounded time.
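The jitter definition above can be illustrated with a minimal sketch. The function name and the delay samples are hypothetical and serve only to make the definition concrete.

```python
def jitter(delays_us):
    """Return the jitter of a set of delays: maximum minus minimum."""
    if not delays_us:
        raise ValueError("at least one delay sample is required")
    return max(delays_us) - min(delays_us)


# Hypothetical end-to-end delays (in microseconds) of one control loop.
samples_us = [480, 495, 510, 502, 488]
print(jitter(samples_us))  # 510 - 480 = 30
```

A platform with a static, time-triggered schedule aims to keep this value small and known in advance, so the control design can compensate for the delay instead of having to tolerate its variation.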

Mixed-criticality systems consist of heterogeneous application subsystems that differ not only in their criticality, but also exhibit dissimilar requirements in terms of timing (e.g., firm, soft, hard, non real-time) and different models of computation (e.g., dataflow, time-triggered messaging, distributed shared memory). Also, subsystems can have contradicting requirements for the underlying platform, such as different tradeoffs between predictability, certifiability and performance in processor cores, hypervisors, operating systems and networks.

Another important requirement is adaptability. Primarily due to safety concerns, reconfiguration of safety-critical subsystems is often reduced to selecting system-wide modes out of statically defined scheduling tables. Non safety-critical subsystems, on the other hand, often demand higher flexibility and dynamic system structures, where components enter and leave at run-time, interact with each other in variable setups and realize different global application services.
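Mode-based reconfiguration of this kind can be sketched as a lookup in a fixed set of precomputed tables. The mode names, partition names and slot lengths below are assumptions made for illustration; they are not taken from DREAMS or any particular platform.

```python
from types import MappingProxyType

# Hypothetical system-wide modes. Each schedule table is a tuple of
# (partition name, slot length in ms) and is fixed at design time.
SCHEDULES = MappingProxyType({
    "startup": (("diagnostics", 10), ("steer_by_wire", 10)),
    "normal": (("steer_by_wire", 5), ("powertrain", 5), ("comfort", 10)),
    "degraded": (("steer_by_wire", 10), ("powertrain", 10)),
})


def select_mode(mode):
    """Return the precomputed schedule table for a system-wide mode."""
    if mode not in SCHEDULES:
        # Only statically defined modes are permitted; nothing is
        # computed or modified at run-time.
        raise ValueError(f"unknown mode: {mode!r}")
    return SCHEDULES[mode]


print(select_mode("degraded"))
```

Because the set of tables is read-only and finite, each mode can be analysed offline, which is what makes this restricted form of adaptability acceptable for certification.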

2. Mixed-Criticality Integration using Operating Systems and Hypervisors

The software execution environments for components of mixed-criticality systems are realized by operating systems and hypervisors with time and space partitioning. In the temporal domain, partitioning is achieved by applying scheduling mechanisms that assign available resources to tasks based on time slots. Spatial partitioning can be achieved by allocating a partition to a unique address space which is not accessible by other partitions.
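The two partitioning dimensions can be sketched as follows, assuming a cyclic schedule with fixed time slots and one contiguous address range per partition. All names, slot lengths and addresses are hypothetical; real hypervisors configure this through their own tooling and enforce it in hardware (e.g., via an MMU or MPU).

```python
# Temporal partitioning: a repeating major frame of (partition, slot length in ms).
MAJOR_FRAME = [("safety_ctrl", 4), ("powertrain", 3), ("infotainment", 3)]

# Spatial partitioning: one half-open address range [start, end) per partition.
REGIONS = {
    "safety_ctrl": (0x1000_0000, 0x1010_0000),
    "powertrain": (0x1010_0000, 0x1030_0000),
    "infotainment": (0x2000_0000, 0x2400_0000),
}


def active_partition(t_ms, frame=MAJOR_FRAME):
    """Return the partition that owns the processor at time t_ms."""
    frame_len = sum(length for _, length in frame)
    offset = t_ms % frame_len
    for name, length in frame:
        if offset < length:
            return name
        offset -= length
    raise AssertionError("unreachable: offset is always inside the frame")


def regions_disjoint(regions):
    """Check that no two partitions share any address."""
    spans = sorted(regions.values())
    return all(prev_end <= start
               for (_, prev_end), (start, _) in zip(spans, spans[1:]))


print(active_partition(13))       # offset 3 in the 10 ms frame
print(regions_disjoint(REGIONS))
```

The schedule lookup depends only on the time and the static table, so the behaviour of one partition cannot shift the slots of another.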

Several software execution environments are available on the market, ranging from bare-metal (type 1) hypervisors to full-featured operating systems capable of hosting heterogeneous guest operating systems in their partitions. Examples of software execution environments capable of running mixed-criticality applications include PikeOS [3], VxWorks [4], LynxOS [5] and XtratuM [6].

3. Mixed-Criticality Integration using Multi-Core Processors

Many mixed-criticality systems involve high performance embedded computing applications [7], which require significant computational power from the underlying platform. Since the performance of single-core processors has reached its limits [8], architectures with multi-core processors are becoming a design choice for embedded systems. On the other hand, multi-core systems usually have a more complex hardware architecture. Typical sources of indeterminism are caches, buses and shared memories, which make the system less predictable and complicate the computation of worst-case execution times.

These challenges are addressed by deterministic chip-level architectures that allow mixed-criticality applications to be mapped to multi-core processors. Examples of deterministic chip-level architectures are GENESYS [9], ACROSS [10], COMPSOC [11,12], PARMERASA [13,14,15] and IDAMC [16,17,18]. These architectures provide partitioning based on Time-Division Multiple Access (TDMA) at network interfaces and routers, or on priority-based arbitration mechanisms.
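TDMA-based arbitration at a network interface can be sketched as a static slot table that decides which core may inject a message in a given clock cycle. The table, the core names and the slot length are hypothetical and do not describe any of the architectures cited above.

```python
# Hypothetical TDMA table: entry i names the core that owns slot i.
TDMA_TABLE = ["core0", "core1", "core0", "core2"]


def may_inject(core, cycle, slot_cycles=8, table=TDMA_TABLE):
    """Return True if `core` owns the TDMA slot containing `cycle`."""
    slot = (cycle // slot_cycles) % len(table)
    return table[slot] == core


def worst_case_wait_slots(core, table=TDMA_TABLE):
    """Longest run of consecutive slots that `core` does not own."""
    owned = [i for i, owner in enumerate(table) if owner == core]
    if not owned:
        raise ValueError(f"{core!r} owns no slot")
    # The table repeats, so the last owned slot wraps to the first one.
    following = owned[1:] + [owned[0] + len(table)]
    return max(nxt - cur - 1 for cur, nxt in zip(owned, following))


print(may_inject("core1", 8))          # cycle 8 lies in slot 1
print(worst_case_wait_slots("core1"))  # slots 2, 3 and 0 belong to others
```

The waiting bound follows from the table alone and is independent of the traffic generated by other cores, which is the property that simplifies worst-case timing analysis.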

4. Mixed-Criticality Integration in Distributed Systems

In the avionic domain, Distributed Integrated Modular Avionics (DIMA) introduces the concepts and guidelines for mixed-criticality integration based on distributed systems [19,20]. DIMA is an evolutionary step beyond IMA that combines concepts of federated and integrated avionic systems.

The communication in DIMA needs to support temporal and spatial partitioning between integrated modules. Therefore, suitable communication protocols like Avionics Full DupleX Switched Ethernet (AFDX) are used. In addition, the communication system must support fault-tolerance such as active redundancy (e.g., self-checking pairs, triple modular redundancy) and replicated communication channels.
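As an illustration of active redundancy, triple modular redundancy can be sketched as a majority vote over three replica outputs. This sketch assumes replica-deterministic components whose correct outputs match exactly; the function name is hypothetical.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Return the value produced by at least two of three replicas."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        # All three replicas disagree: the fault hypothesis of a
        # single faulty replica has been violated.
        raise ValueError("no majority among replica outputs")
    return value


print(tmr_vote(7, 7, 9))  # one faulty replica is masked
```

Voting masks a single faulty replica only if the replicas fail independently, which is why each replica must form its own Fault Containment Region.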

In the automotive domain, mixed-criticality integration using distributed systems is supported based on time-triggered communication networks. FlexRay is a time-triggered communication protocol with a static segment that supports segregation of the messages from different nodes in the time and value domains [21]. Local and central guardians protect the message transmission in static time slots based on a priori knowledge about the identity of the sender nodes and the predefined message timing with respect to a global time base. Temporal and spatial partitioning is also addressed in future Ethernet-based automotive communication networks. For example, a timing analysis of switched Ethernet topologies for mixed-criticality traffic is done in [22].
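The guardian concept can be sketched as a check of the sender identity against a static slot schedule and the global time. This is a strongly simplified illustration: the node names and slot length are hypothetical, and details of the real FlexRay communication cycle (e.g., the dynamic segment) are ignored.

```python
# Hypothetical static schedule: slot index -> node permitted to transmit.
SLOT_OWNER = {0: "brake_ecu", 1: "steering_ecu", 2: "engine_ecu"}
SLOT_US = 50                            # assumed slot length in microseconds
CYCLE_US = SLOT_US * len(SLOT_OWNER)    # assumed cycle of static slots only


def guardian_allows(sender, global_time_us):
    """Return True if `sender` owns the static slot at `global_time_us`."""
    slot = (global_time_us % CYCLE_US) // SLOT_US
    return SLOT_OWNER.get(slot) == sender


print(guardian_allows("steering_ecu", 60))  # time 60 lies in slot 1
print(guardian_allows("steering_ecu", 10))  # slot 0 belongs to brake_ecu
```

Since the check relies only on a priori knowledge, a node that transmits outside its slot (e.g., a babbling idiot) can be blocked without interpreting the message contents.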

5. References

[1] R. Obermaisser, D. Weber, "Architectures for mixed-criticality systems based on networked multi-core chips," Emerging Technology and Factory Automation (ETFA), 2014.

[2] J. Lala and R. Harper, “Architectural principles for safety-critical real-time applications,” in Proc. of the IEEE, ser. 1, vol. 82, Jan. 1994, pp. 25–40.

[3] H. Theiling, “White Paper PikeOS and Time-Triggering,” 2013.

[4] P. Parkinson and L. Kinnan, “Safety-Critical Software Development for Integrated Modular Avionics,” 2007.

[5] LynuxWorks, “LynxOS User’s Guide, Release 4.0, DOC-0453-02,” www.lynuxworks.com/support/lynxos/docs/0453-02-los4_ug.pdf, 2005, last visited on 01/03/2014.

[6] A. Crespo, I. Ripoll, and M. Masmano, “Partitioned embedded architecture based on hypervisor: The xtratum approach.” in EDCC. IEEE Computer Society, 2010, pp. 67–72.

[7] S. Mu, C. Wang, M. Liu, D. Li, M. Zhu, X. Chen, X. Xie, and Y. Deng, “Evaluating the potential of graphics processors for high performance embedded computing,” in Design, Automation Test in Europe Conference Exhibition (DATE), 2011, March 2011, pp. 1–6.

[8] P. Gelsinger, “Microprocessors for the new millennium: Challenges, opportunities, and new frontiers,” in Proc. of the Solid State Circuit Conference. IEEE Press, 2001.

[9] R. Obermaisser and H. Kopetz, GENESYS: An ARTEMIS Cross-Domain Reference Architecture for Embedded Systems. Suedwestdeutscher Verlag fuer Hochschulschriften, 2009. [Online]. Available: books.google.de/books

[10] ACROSS Consortium, “D2.2 Functional Specification of Middleware and System Components,” www.across-project.eu/download/ACROSS_D2.2.pdf, 2010, last visited on 01/04/2014.

[11] K. Goossens, A. Azevedo, K. Chandrasekar, M. D. Gomony, S. Goossens, M. Koedam, Y. Li, D. Mirzoyan, A. Molnos, A. B. Nejad, A. Nelson, and S. Sinha, “Virtual execution platforms for mixed-time-criticality systems: The compsoc architecture and design flow,” SIGBED Rev., vol. 10, no. 3, pp. 23–34, Oct. 2013. [Online]. Available: doi.acm.org/10.1145/2544350.2544353

[12] K. Goossens and A. Hansson, “The aethereal network on chip after ten years: Goals, evolution, lessons, and future,” in Proceedings of the 47th Design Automation Conference, ser. DAC ’10. New York, NY, USA: ACM, 2010, pp. 306–311. [Online]. Available: doi.acm.org/10.1145/1837274.1837353

[13] T. Ungerer, C. Bradatsch, M. Gerdes, F. Kluge, R. Jahr, J. Mische, J. Fernandes, P. G. Zaykov, Z. Petrov, B. Boddeker, S. Kehr, H. Regler, A. Hugl, C. Rochange, H. Ozaktas, H. Cassé, A. Bonenfant, P. Sainrat, I. Broster, N. Lay, D. George, E. Quiñones, M. Panic, J. Abella, F. J. Cazorla, S. Uhrig, M. Rohde, and A. Pyka, “parmerasa – multicore execution of parallelised hard real-time applications supporting analysability.” in DSD. IEEE, 2013, pp. 363–370.

[14] T. Ungerer, F. Cazorla, P. Sainrat, G. Bernat, Z. Petrov, C. Rochange, E. Quiñones, M. Gerdes, M. Paolieri, J. Wolf, H. Cassé, S. Uhrig, I. Guliashvili, M. Houston, F. Kluge, S. Metzlaff, and J. Mische, “Merasa: Multicore execution of hard real-time applications supporting analyzability,” Micro, IEEE, vol. 30, no. 5, pp. 66–75, Sept 2010.

[15] J. Wolf, M. Gerdes, F. Kluge, S. Uhrig, J. Mische, S. Metzlaff, C. Rochange, H. Cassé, P. Sainrat, and T. Ungerer, “Rtos support for parallel execution of hard real-time applications on the merasa multicore processor,” in Object/Component/Service-Oriented Real-Time Distributed Computing (ISORC), 2010 13th IEEE International Symposium on, May 2010, pp. 193–201.

[16] B. Motruk, J. Diemer, R. Buchty, R. Ernst, and M. Berekovic, “Idamc: A many-core platform with run-time monitoring for mixed-criticality.” in HASE. IEEE Computer Society, 2012, pp. 24–31.

[17] S. Tobuschat, P. Axer, R. Ernst, and J. Diemer, “Idamc: A noc for mixed criticality systems,” in RTCSA, 2013, pp. 149–156.

[18] J. Diemer and R. Ernst, “Back suction: Service guarantees for latency-sensitive on-chip networks,” in Networks-on-Chip (NOCS), 2010 Fourth ACM/IEEE International Symposium on, May 2010, pp. 155–162.

[19] R. Wolfig and M. Jakovljevic, “Distributed ima and do-297: Architectural, communication and certification attributes,” in Digital Avionics Systems Conference, 2008. DASC 2008. IEEE/AIAA 27th, Oct 2008, pp. 1.E.4–1–1.E.4–10.

[20] T. Wang and G. Qingfan, “Research on distributed integrated modular avionics system architecture design and implementation,” in Digital Avionics Systems Conference (DASC), 2013 IEEE/AIAA 32nd, Oct 2013, pp. 1–53.