Integrating dataflow and non-dataflow real-time application models on multi-core platforms
Ref: CISTER-TR-170512       Publication Date: 23 May 2017

Applications in all segments of computing, including embedded systems, are steadily becoming more complex because of the increasing range of functionality they offer. This complexity requires platforms whose performance satisfies the growing computational demands. This need has driven the adoption of multi-core processors in embedded systems, since they increase performance at reasonable energy consumption.
Future real-time embedded systems will increasingly run mixed application models with timing constraints on the same multi-core platform. These application models are dataflow applications with timing constraints and traditional real-time applications modelled as independent arbitrary-deadline tasks. Examples of such mixed embedded systems are Autonomous Driving Systems and Unmanned Aerial Vehicles. These systems require guarantees that all running applications satisfy their timing constraints. Moreover, to be cost-efficient in terms of design, they require efficient mapping strategies that maximize the use of system resources and thereby reduce overall cost.
This work proposes a complete approach whose main goal is to integrate mixed application models (dataflow and traditional real-time applications) with timing requirements on the same multi-core platform. The approach guarantees that the mapped applications satisfy their timing constraints and maximizes the utilization of the platform resources. Three main algorithms are proposed to achieve this goal. The first algorithm, called slack-based merging, is an offline dataflow graph reduction technique that aims to decrease the complexity of dataflow applications, and thereby their analysis time. The algorithm reduces the run-time of our approach by 82% to 90% compared to when it is not used. The experimental evaluation with real application models from the SDF3 benchmark shows that the reduced graph: 1) respects the timing constraints, i.e. throughput and latency, of the original application graph, and 2) when the throughput constraint is relaxed with respect to the maximal throughput of the graph, the merging algorithm achieves a larger reduction in graph size. The second algorithm, called Timing Parameter Extraction, extracts timing parameters, i.e. offsets, periods and deadlines, of dataflow applications with timing constraints, i.e. throughput and latency, converting them into periodic arbitrary-deadline tasks. These tasks execute in a way that preserves the dependencies of the original dataflow application using the offset parameter, while satisfying its timing constraints using the period and deadline parameters. This algorithm is a means to unify the two mixed application models into a single real-time task set. Its main advantage is that the extraction of the timing parameters is independent of the specific scheduler being used, of other applications running in the system, and of the details of the particular platform.
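To make the target task model concrete (this is a minimal sketch of the periodic arbitrary-deadline task model with offsets that the extraction produces, not the extraction algorithm itself; all names and numbers are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class PeriodicTask:
    """Periodic arbitrary-deadline task: job k is released at
    offset + k * period and must finish by its release time plus
    the relative deadline (which may exceed the period)."""
    offset: int    # phase encoding dataflow precedence
    period: int    # derived from the throughput constraint
    deadline: int  # relative deadline, may be > period

    def release(self, k: int) -> int:
        return self.offset + k * self.period

    def absolute_deadline(self, k: int) -> int:
        return self.release(k) + self.deadline

# Hypothetical producer/consumer pair of actors: the consumer's
# offset delays its first job until the producer's first job is
# guaranteed to have completed, preserving the data dependency.
producer = PeriodicTask(offset=0, period=10, deadline=8)
consumer = PeriodicTask(offset=8, period=10, deadline=12)
```

Here the offsets enforce precedence (the consumer's job 0 is released exactly at the producer's job-0 deadline), while period and deadline carry the throughput and latency constraints.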
In addition, the experimental evaluation shows that the reduced-size dataflow graphs generated by the slack-based merging algorithm, in particular for applications that do not need to execute at maximum throughput, help to speed up the extraction of the timing parameters.
The third algorithm is called communication-aware mapping, which allocates the mixed application models on a 2D-mesh multi-core platform after unifying them. The mapping process considers the timing constraints of the applications and maximizes resource utilization of the platform, while accounting for the communication cost of the dataflow applications. The algorithm is based on a novel mapping heuristic called Sensitive-Path-First, which surpasses the well-known First Fit bin-packing heuristic in terms of number of allocated applications and runtime by up to 28% and 22%, respectively. The experimental evaluation reveals a direct relation between the number of allocated applications and the availability of communication resources, which demonstrates the importance of considering communication cost. We also show that ignoring communication cost, as is frequently done in existing work, allows 76% more applications to be mapped, although the applications in the system are then no longer guaranteed to satisfy their timing constraints.
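For reference, the First Fit bin-packing baseline mentioned above can be sketched as follows (a utilization-based allocation; the Sensitive-Path-First heuristic is the thesis's own contribution and is not reproduced here, and the function names are illustrative):

```python
def first_fit(utilizations, num_cores, capacity=1.0):
    """First Fit: place each task on the first core whose remaining
    utilization capacity can still accommodate it. Returns the
    per-core task-index lists, or None if some task does not fit."""
    load = [0.0] * num_cores
    assignment = [[] for _ in range(num_cores)]
    for task_id, u in enumerate(utilizations):
        for core in range(num_cores):
            if load[core] + u <= capacity + 1e-9:  # tolerance for float sums
                load[core] += u
                assignment[core].append(task_id)
                break
        else:
            return None  # task did not fit on any core
    return assignment

# Four tasks on two unit-capacity cores: tasks 0 and 2 land on
# core 0, tasks 1 and 3 overflow to core 1.
result = first_fit([0.6, 0.5, 0.4, 0.3], num_cores=2)
```

Note that plain First Fit, as sketched here, only packs processor utilization; it does not model the network-on-chip links of a 2D-mesh platform, which is precisely the communication cost the thesis's mapping accounts for.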
Together, these three algorithms achieve the main goal of this thesis and help enable embedded real-time systems to map and schedule mixed application models. The complete approach and the three algorithms presented in this thesis have been validated through proofs and experimental evaluation.

Hazem Ali

PhD Thesis, Faculdade de Engenharia da Universidade do Porto.

Record Date: 24 May 2017