This paper summarizes the keynote I gave at the SEAMS 2020 conference. Noting the power of natural evolution that makes living systems extremely adaptive, I describe how artificial evolution can be employed to solve design and optimization problems in software. Thereafter, I discuss the Evolution of Things, that is, the possibility of evolving physical artefacts, and zoom in on a (r)evolutionary way of creating 'bodies' and 'brains' of robots for engineering and fundamental research.
Neural networks are powerful tools for automated decision-making, seeing increased application in safety-critical domains such as autonomous driving. Due to their black-box nature and large scale, reasoning about their behavior is challenging. Statistical analysis is often used to infer probabilistic properties of a network, such as its robustness to noise and inaccurate inputs. While scalable, statistical methods can only provide probabilistic guarantees on the quality of their results and may underestimate the impact of low-probability inputs that lead to undesired behavior of the network.
We investigate here the use of symbolic analysis and constraint solution space quantification to precisely quantify probabilistic properties in neural networks. We demonstrate the potential of the proposed technique in a case study involving the analysis of ACAS-Xu, a collision avoidance system for unmanned aircraft control.
Control theoretical techniques have been successfully adopted as methods for self-adaptive systems design to provide formal guarantees about the effectiveness and robustness of adaptation mechanisms. However, the computational effort to obtain guarantees poses severe constraints when it comes to dynamic adaptation. To overcome these limitations, in this paper we propose a hybrid approach that combines software engineering, control theory, and AI to design for software self-adaptation. Our solution proposes a hierarchical and dynamic system manager with performance tuning. Due to the gap between high-level requirements specification and the internal knob behavior of the managed system, a hierarchically composed component architecture seeks a separation of concerns towards a dynamic solution. Therefore, a two-layered adaptive manager was designed to satisfy the software requirements, with parameter optimization through regression analysis and an evolutionary meta-heuristic. The optimization relies on the collection and processing of performance, effectiveness, and robustness metrics w.r.t. control-theoretical metrics at both offline and online stages. We evaluate our work with a prototype of the Body Sensor Network (BSN) in the healthcare domain, which is widely used as a demonstrator by the community. The BSN was implemented on the Robot Operating System (ROS) architecture, and concerns about system dependability are taken as adaptation goals. Our results reinforce the necessity of performing well in such a safety-critical domain and contribute substantial evidence on how hybrid approaches that combine control- and AI-based techniques for engineering self-adaptive systems can provide effective adaptation.
When a self-adaptive system needs to adapt, it has to analyze the possible options for adaptation, i.e., the adaptation space. For systems with large adaptation spaces, this analysis process can be resource- and time-consuming. One approach to tackle this problem is using machine learning techniques to reduce the adaptation space to only the relevant adaptation options. However, existing approaches only handle threshold goals, while practical systems often also need to address optimization goals. To tackle this limitation, we propose a two-stage learning approach called Deep Learning for Adaptation Space Reduction (DLASeR). DLASeR applies a deep learner first to reduce the adaptation space for the threshold goals and then ranks these options for the optimization goal. A benefit of deep learning is that it does not require feature engineering. Results on two instances of the DeltaIoT artifact (with different sizes of adaptation space) show that DLASeR outperforms a state-of-the-art approach for settings with only threshold goals. The results for settings with both threshold goals and an optimization goal show that DLASeR is effective with a negligible effect on the realization of the adaptation goals. Finally, we observe no noteworthy effect on the effectiveness of DLASeR for larger adaptation spaces.
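The two-stage shape of such an approach can be sketched as follows, with trivial linear models standing in where DLASeR would use deep learners; the option names, feature vectors, weights, and goals below are all invented for illustration:

```python
# Hypothetical adaptation options for an IoT network:
# (option id, feature vector: [transmission power, packet copies])
options = [
    ("opt-a", [0.2, 1.0]),
    ("opt-b", [0.8, 2.0]),
    ("opt-c", [0.5, 3.0]),
]

# Stage 1: predict whether an option satisfies the threshold goals
# (e.g., packet loss below a bound) -- a toy linear classifier here.
def meets_thresholds(features):
    score = 1.5 * features[0] + 0.4 * features[1]  # assumed learned weights
    return score > 1.0

# Stage 2: rank the surviving options on the optimization goal
# (e.g., minimize energy consumption) -- a toy linear regressor here.
def predicted_energy(features):
    return 2.0 * features[0] + 0.7 * features[1]

reduced = [(oid, f) for oid, f in options if meets_thresholds(f)]
ranked = sorted(reduced, key=lambda of: predicted_energy(of[1]))
```

The analyzer then only needs to verify the top-ranked survivors instead of the whole adaptation space.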
Advances in Machine Learning (ML) have brought previously hard-to-handle problems within arm's reach. However, this power comes at the cost of unassured reliability and a lack of transparency. Overcoming this drawback is very hard due to the probabilistic nature of ML. Current approaches mainly tackle this problem by developing more robust learning procedures. Such algorithmic approaches, however, are limited to certain types of uncertainties and cannot deal with all of them, e.g., hardware failure. This paper discusses how this problem can be addressed at the architectural rather than the algorithmic level, to assess a system's dependability properties in early development stages. Moreover, we argue that Self-Adaptive Systems (SAS) are well suited to safeguard ML against various uncertainties. As a step towards this, we propose classes of dependability into which ML-based systems may be categorized and discuss which assurances can be made for each class, and how.
Identifying insufficient requirements, such as missing or lacking requirements, is important to prevent serious accidents in CPSs (Cyber-Physical Systems) such as launch vehicles and spacecraft, which are often required to be self-adaptive. At JAXA (Japan Aerospace Exploration Agency), several review boards are in place to verify the requirements of space systems from various viewpoints, which are often derived from the reviewers' experience with anomalies. However, the impossibility of assigning a well-experienced reviewer to every review opportunity highlights the importance of sharing their viewpoints. In this paper, we aim to extract and exploit associations between development documents and archived anomaly reports about space systems, thereby identifying lacking requirements during review. An association between these two kinds of documents can be treated as a review viewpoint, which contributes to preventing previously experienced anomalies. To cope with this problem, we propose a CNN (Convolutional Neural Network) model that predicts the correlation of documents. A collection of judgments about meaningfully related pairs of text was prepared to train the proposed model to help detect the lack of requirements. Experimental results show that the performance of the proposed method is significantly better than that of baseline methods (i.e., 71.0% in F-measure). Further investigation has shown that not only word similarity but also other attributes are necessary to solve our problem.
In the last four years, the number of distinct autonomous vehicle platforms deployed in the streets of California increased 6-fold, while the reported accidents increased 12-fold. This trend shows no signs of subsiding, as it is fueled by a constant stream of innovations in hardware sensors and machine learning software. Meanwhile, if we expect the public and regulators to trust autonomous vehicle platforms, we need to find better ways to add technological complexity without increasing the risk of accidents. We studied this problem from the perspective of reliability engineering, in which a given risk of an accident has a severity and a probability of occurring. Timely information on accidents is important for engineers to anticipate and reuse previous failures to approximate the risk of accidents in a new city. However, this is challenging in the context of autonomous vehicles because of the sparse nature of data on the operational scenarios (driving trajectories in a new city). Our approach was to mitigate data sparsity by reducing the state space through monitoring of multiple-vehicle operations. We then minimized the risk of accidents by determining the proper allocation of tests for each equivalence class. Our contributions comprise (1) a set of strategies to monitor the operational data of multiple autonomous vehicles, (2) a Bayesian model that estimates changes in the risk of accidents, and (3) a feedback control loop that minimizes these risks by re-allocating test effort. Our results are promising in the sense that we were able to measure and control risk for a diversity of changes in the operational scenarios. We evaluated our models with data from two real cities with distinct traffic patterns and made the data available to the community.
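The general shape of Bayesian risk estimation with proportional test allocation can be sketched in a few lines; the equivalence-class names, counts, and the Beta-Bernoulli model below are illustrative assumptions, not the paper's actual model or data:

```python
# Beta-Bernoulli posterior risk per operational equivalence class,
# with test effort allocated in proportion to posterior mean risk.
classes = {
    # name: (observed accidents, observed safe runs) -- invented numbers
    "dense-urban": (4, 96),
    "highway": (1, 199),
    "suburban": (0, 100),
}

def posterior_mean(accidents, safe, alpha=1.0, beta=1.0):
    # Beta(alpha, beta) prior updated with Bernoulli accident outcomes.
    return (alpha + accidents) / (alpha + beta + accidents + safe)

def allocate_tests(classes, budget):
    risks = {c: posterior_mean(a, s) for c, (a, s) in classes.items()}
    total = sum(risks.values())
    return {c: round(budget * r / total) for c, r in risks.items()}

allocation = allocate_tests(classes, budget=100)
```

Feeding fresh monitoring data back into the counts and re-running the allocation is, in miniature, the feedback control loop the abstract describes.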
Recent approaches to testing autonomous driving systems (ADS) are able to generate a scenario in which the autonomous car collides, and a different ADS configuration that avoids the collision. However, such test information is too low-level to be used by engineers to improve the ADS. In this paper, we consider a path planner component provided by our industry partner, which can be configured through a set of weights. We propose a technique to automatically re-engineer the path planner as a self-adaptive path planner (SAPP) following the MAPE loop reference architecture. The Knowledge Base (KB) of SAPP contains descriptions of collision scenarios discovered through testing, and the corresponding alternative weights that avoid the collisions. We foresee two main usages of SAPP. First, designers are provided with a prototype that should facilitate the re-implementation of the path planner. Second, SAPP can be useful for improving the diversity of testing, as performing test case generation on SAPP guarantees finding dangerous situations different from those used to build SAPP. Preliminary experiments indicate that SAPP can effectively adapt on the basis of the solutions stored in the KB.
Safety-critical adaptive software systems, such as those used in aircraft, must ensure that the system remains in safe regions during adaptation in order to avoid catastrophic failures. We present a framework that uses hierarchical statistical models and builds on techniques from computer experiment design and active learning to characterize the boundaries between safe and unsafe regions with a minimal number of test cases. The boundaries are then represented as parametric geometric shapes that provide easy-to-understand feedback to the system designer. We illustrate our framework using the NASA adaptive flight control system IFCS.
Self-Adaptive Systems (SASs) reflect on both their own state and their environment and change their behavior to satisfy the expected objectives. Cloud systems are self-adaptive by nature, especially considering resources used in a pay-as-you-go manner. Satisfying trustworthiness properties (the worthiness of a service based on evidence of its trust) also demands self-adaptation capabilities. Unfortunately, developers lack an easy-to-use platform to support the assessment of such properties and to execute the required adaptations. This paper presents TMA, a platform that implements a MAPE-K control loop for cloud systems, supported by a distributed monitoring system based on probes. Quality Models are used to express trustworthiness properties, resulting in scores that are used to plan adaptations through evaluation rules. These plans are executed by actuators. A demo shows the scaling up/down of the number of containers in a cloud application composed of a set of web services from TPC benchmarks, as a result of changes observed in the environment.
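A MAPE-K loop driving container scaling, as in the demo, can be sketched in miniature; the quality-model formula, the thresholds, and the in-memory actuator below are illustrative stand-ins, not TMA's actual rules or API:

```python
# Minimal MAPE-K-style loop for container scaling.
knowledge = {"containers": 2, "max_containers": 10}

def monitor(probe_reading):
    # Probes report raw measurements, e.g. service latency.
    return {"latency_ms": probe_reading}

def analyze(metrics):
    # Toy quality model: map latency to a trustworthiness score in [0, 1].
    return max(0.0, min(1.0, 1.0 - metrics["latency_ms"] / 1000.0))

def plan(score):
    # Evaluation rules turn scores into an adaptation plan.
    if score < 0.5 and knowledge["containers"] < knowledge["max_containers"]:
        return "scale_up"
    if score > 0.9 and knowledge["containers"] > 1:
        return "scale_down"
    return "no_op"

def execute(action):
    # The actuator applies the plan; here it just updates the knowledge.
    delta = {"scale_up": 1, "scale_down": -1}.get(action, 0)
    knowledge["containers"] += delta
    return knowledge["containers"]

# One loop iteration: a slow probe reading triggers scaling up.
containers = execute(plan(analyze(monitor(700))))
```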
Two of the main paradigms used to build adaptive software employ different types of properties to capture relevant aspects of the system's run-time behavior. On the one hand, control systems consider properties that concern static aspects like stability, as well as dynamic properties that capture the transient evolution of variables, such as settling time. On the other hand, self-adaptive systems consider mostly non-functional properties that capture concerns such as performance, reliability, and cost. In general, it is not easy to reconcile these two types of properties or to identify under which conditions they constitute a good fit to provide run-time guarantees. There is a need to identify the key properties in the areas of control and self-adaptation, and to characterize and map them to better understand how they relate and possibly complement each other. In this paper, we take a first step to tackle this problem by: (1) identifying a set of key properties in control theory, (2) illustrating the formalization of some of these properties employing temporal logic languages commonly used to engineer self-adaptive software systems, and (3) illustrating how to map key properties that characterize self-adaptive software systems into control properties, leveraging their formalization in temporal logics. We illustrate the different steps of the mapping on an exemplar case in the cloud computing domain and conclude with identifying open challenges in the area.
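To give a flavor of what such a formalization can look like, one common way to express settling time in a bounded (metric-style) temporal logic is shown below; the formula, the bound $T_s$, and the tolerance $\varepsilon$ are illustrative and not taken from the paper:

```latex
% Settling time read as: "within time T_s, the output y settles and
% from then on stays within \varepsilon of the setpoint r":
\Diamond_{[0, T_s]} \, \Box \, \bigl( |y(t) - r| \leq \varepsilon \bigr)
```

The outer bounded "eventually" captures the transient phase, while the inner "globally" captures the steady-state band familiar from control theory.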
A distributed system's functionality must continuously evolve, especially when the environmental context changes. Such required evolution imposes unbearable complexity on system development. An alternative is to make systems able to self-adapt by opportunistically composing at runtime to generate systems of systems (SoSs) that offer value-added functionality. The success of such an approach calls for abstracting the heterogeneity of systems and enabling the programmatic construction of SoSs with minimal developer intervention. We propose a general ontology-based approach to describe distributed systems, seeking to achieve abstraction and enable runtime reasoning between systems. We also propose an architecture for systems that utilize such ontologies to enable systems to discover and 'understand' each other, and potentially compose, all at runtime. We detail features of the ontology and the architecture through two contrasting case studies: one on controlling multiple systems in a smart home environment, and another on the management of dynamic computing clusters. We also quantitatively evaluate the scalability and validity of our approach through experiments and simulations. Our approach enables system developers to focus on high-level SoS composition without being constrained by deployment-specific implementation details.
The rapidly changing workload of service-based systems can easily cause under-/over-utilization of the component services, which can consequently affect the overall Quality of Service (QoS), such as latency. Self-adaptive service composition rectifies this problem but poses several challenges: (i) the effectiveness of adaptation can deteriorate due to over-optimistic assumptions on the latency and utilization constraints, at both local and global levels; and (ii) the benefits brought by each composition plan are often short-term and not designed for long-term benefit---a natural prerequisite for sustaining the system. To tackle these issues, we propose a two-level constraint reasoning framework for sustainable self-adaptive service composition, called DATESSO. In particular, DATESSO consists of a refined formulation that differentiates the 'strictness' of latency/utilization constraints at two levels. To strive for long-term benefits, DATESSO leverages the concept of technical debt and time-series prediction to model the utility contribution of the component services in the composition. The approach embeds a debt-aware two-level constraint reasoning algorithm in DATESSO to improve the efficiency, effectiveness, and sustainability of self-adaptive service composition. We evaluate DATESSO on a service-based system with the real-world WS-DREAM dataset and compare it with other state-of-the-art approaches. The results demonstrate the superiority of DATESSO over the others in utilization, latency, and running time, while it is also likely to be more sustainable.
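The debt-aware idea can be illustrated with a toy utility function, where a moving-average forecast stands in for the time-series prediction and "debt" accrues when a plan's short-term gain is paid for by predicted future SLA violations; all formulas, names, and numbers are invented for illustration and are not DATESSO's actual model:

```python
# Naive moving-average forecast over a latency history (ms).
def forecast(history, window=3):
    recent = history[-window:]
    return sum(recent) / len(recent)

def plan_utility(current_latency, latency_history, sla_ms=200.0,
                 debt_rate=0.5):
    immediate = max(0.0, sla_ms - current_latency)      # short-term benefit
    predicted = forecast(latency_history + [current_latency])
    debt = debt_rate * max(0.0, predicted - sla_ms)     # long-term penalty
    return immediate - debt

# A plan that looks good now (150 ms) but whose history trends toward
# violating the 200 ms SLA scores lower than its immediate latency
# alone would suggest.
u = plan_utility(150.0, [180.0, 260.0, 320.0])
```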
Self-adaptive systems continuously adapt to internal and external changes in their execution environment. In context-based self-adaptation, adaptations take place in response to the characteristics of the execution environment, captured as a context. However, in large-scale adaptive systems operating in dynamic environments, multiple contexts are often active at the same time, requiring simultaneous execution of multiple adaptations. Complex interactions between such adaptations might not have been foreseen or accounted for at design time. For example, adaptations can partially overlap, requiring only partial execution of each, or they can be conflicting, requiring some of the adaptations not to be executed at all, in order to preserve system execution. To ensure a correct composition of adaptations, we propose ComInA, a novel reinforcement learning based approach, which autonomously learns interactions between adaptations as well as the most appropriate adaptation composition for each combination of active contexts, as they arise. We present an initial evaluation of ComInA in an urban public transport network simulation, where multiple adaptations to buses, routes, and stations are required. Early results show that ComInA correctly identifies whether adaptations are compatible or conflicting and learns to execute adaptations which maximize system performance. However, further investigation is needed into how best to utilize such identified relationships to optimize a wider range of metrics and utilize more complex composition strategies.
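The core learning problem, mapping combinations of active contexts to a composition of adaptations, can be sketched with tabular Q-learning on a toy transport scenario; the contexts, actions, reward function, and hyperparameters are invented stand-ins, not ComInA's actual design:

```python
import random

# Actions: which composition of two adaptations A and B to execute.
actions = ["none", "A", "B", "A+B"]

def reward(context, action):
    if context == ("rush-hour", "road-closed"):
        # A and B conflict in this context: executing both is penalized.
        return {"none": 0.0, "A": 1.0, "B": 1.0, "A+B": -1.0}[action]
    # Otherwise the adaptations are compatible and compose well.
    return {"none": 0.0, "A": 0.5, "B": 0.5, "A+B": 1.0}[action]

def best(q, ctx):
    return max(actions, key=lambda a: q.get((ctx, a), 0.0))

def train(episodes=2000, alpha=0.2, eps=0.2, seed=1):
    rng = random.Random(seed)
    q = {}
    contexts = [("rush-hour", "road-closed"), ("rush-hour",)]
    for _ in range(episodes):
        ctx = rng.choice(contexts)
        # Epsilon-greedy: mostly exploit, sometimes explore.
        a = rng.choice(actions) if rng.random() < eps else best(q, ctx)
        key = (ctx, a)
        q[key] = q.get(key, 0.0) + alpha * (reward(ctx, a) - q.get(key, 0.0))
    return q

q = train()
```

After training, the learned policy executes both adaptations when they are compatible and only one of them when they conflict.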
Self-adaptive Systems (SASs) are one way to address the ever-growing complexity of software systems by allowing a system to react to changes in its operating environment. In today's systems, self-adaptation is typically realized with a control loop, of which the MAPE-K feedback loop is a prominent example. Research uses the notion of patterns to describe the distribution and decentralization of individual control loop components, or of control loops and their underlying managed subsystems. While there are some well-accepted standards on which components a managed subsystem has to implement so that it can interact with the control loop, research still lacks best practices for communication within and across control loops. This paper aims to identify several research challenges that currently exist in this domain. It then presents ideas for upcoming research on distributed SASs that rely on roles, benchmarking, and inter- and intra-loop communication for control loops, and discusses ongoing work on a self-adaptive distributed benchmarking application. Finally, an evaluation strategy is presented to provide evidence of viable results to the community.
Today's computing world features a growing number of cyber-physical systems that require the cooperation of many physical devices. Examples include autonomous cars and co-working robots, which are expected to appropriately adapt to any possible context they find themselves in (e.g. the presence of a nearby human).
However, the controlling software continues to be developed using established object-oriented modelling techniques like UML, which do not natively possess a notion of context and thus may introduce accidental complexity. With increasing complexity, the probability of introducing software errors rises, which can have fatal consequences in cyber-physical systems. To address this, we envision a model-driven architecture for self-adaptive cyber-physical systems that explicitly models structured context. Entities are modelled as message-passing parallel processes and can play roles in specific contexts, which dynamically alter their behaviour and relationships with other parts of the system. Since the planning of complex adaptations can be cumbersome in real-world scenarios, we envision an intuitive formulation of adaptations as graph rewriting rules on the context model.
This paper discusses the current state of research and identifies open research challenges. Based on this, the envisioned architecture as well as an evaluation strategy are presented.
Self-adaptive systems increasingly need to reason about and adapt both structural and behavioral system aspects, such as in mobile service robots, which must reason about the missions they need to achieve and the architecture of the software executing them. Deciding how to best adapt these systems to run-time changes is challenging because it entails considering mutual dependencies between the software architecture that the system is running and the outcome of plans for completing tasks, while also considering multiple trade-offs and uncertainties. Considering all these aspects in planning for adaptation often yields large solution spaces that cannot be adequately explored at run time. We address this challenge by proposing a planning approach able to consider the impact of mutual dependencies between software architecture and task planning on the satisfaction of mission goals. The approach is able to reason quantitatively about the outcome of adaptation decisions, handling both the reconfiguration of the system's architecture and the adaptation of task plans under uncertainty and in a rich trade-off space. Our results show: (i) feasibility of run-time decision-making for self-adaptation in an otherwise intractable solution space by dividing and conquering adaptation into architecture reconfiguration and task planning sub-problems, and (ii) improved quality of adaptation decisions with respect to decision making that does not consider dependencies between architecture and task planning.
The concept of Internet of Things (IoT) has led to the development of many complex and critical systems such as smart emergency management systems. IoT-enabled applications typically depend on a communication network for transmitting large volumes of data in unpredictable and changing environments. These networks are prone to congestion when there is a burst in demand, e.g., as an emergency situation is unfolding, and therefore rely on configurable software-defined networks (SDN). In this paper, we propose a dynamic adaptive SDN configuration approach for IoT systems. The approach enables resolving congestion in real time while minimizing network utilization, data transmission delays and adaptation costs. Our approach builds on existing work in dynamic adaptive search-based software engineering (SBSE) to reconfigure an SDN while simultaneously ensuring multiple quality of service criteria. We evaluate our approach on an industrial national emergency management system, which is aimed at detecting disasters and emergencies, and facilitating recovery and rescue operations by providing first responders with a reliable communication infrastructure. Our results indicate that (1) our approach is able to efficiently and effectively adapt an SDN to dynamically resolve congestion, and (2) compared to two baseline data forwarding algorithms that are static and non-adaptive, our approach increases data transmission rate by a factor of at least 3 and decreases data loss by at least 70%.
Modern software systems, such as cyber-physical systems (CPSs), operate in complex and dynamic environments. With continuous and unanticipated change in the operational environment, these systems are subjected to a variety of uncertainties. Self-adaptive CPSs (SACPSs) can adjust their behavior or structure at run time as a response to changes in their perceived environment. Self-adaptation is commonly realized through a MAPE-K feedback loop that incorporates knowledge newly derived from data sensed through run-time monitoring during the operation of decentralized SACPSs. However, to build this knowledge, run-time observations must be aggregated and reasoned over, since observations made by the decentralized systems might be conflicting. In this paper, we propose a domain-independent approach for observation aggregation and knowledge modeling in SACPSs that can deal with inaccurate, partial, and conflicting observations, based on the formalisms of Subjective Logic.
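As a small illustration of how Subjective Logic can merge conflicting observations, the standard cumulative fusion operator combines two binomial opinions (belief, disbelief, uncertainty) into one; the monitors and opinion values below are invented, and the base rate is assumed shared and omitted for brevity:

```python
# Cumulative fusion of two binomial opinions (b, d, u) in Subjective
# Logic; assumes at least one opinion has non-zero uncertainty.
def cumulative_fusion(op_a, op_b):
    (ba, da, ua), (bb, db, ub) = op_a, op_b
    denom = ua + ub - ua * ub
    return (
        (ba * ub + bb * ua) / denom,  # fused belief
        (da * ub + db * ua) / denom,  # fused disbelief
        (ua * ub) / denom,            # fused uncertainty
    )

# Two monitors disagree about "obstacle present": one leans toward
# belief, the other toward disbelief, both with some uncertainty.
b, d, u = cumulative_fusion((0.6, 0.2, 0.2), (0.2, 0.5, 0.3))
```

Note that fusion reduces the remaining uncertainty below that of either input opinion, reflecting that two independent observations jointly carry more evidence.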
Smart systems have become key solutions for many application areas, including autonomous farming. The trend we now see in smart systems is a shift from single, isolated autonomic and self-adaptive components to larger ecosystems of heavily cooperating components. This increases the reliability and often the cost-effectiveness of the system by replacing one big, costly device with a number of smaller and cheaper ones. In this paper, we demonstrate the effect of synergistic collaboration among autonomic components in the domain of smart farming---in particular, the use case we employ in the demonstration stems from the AFar-Cloud EU project. We exploit the concept of autonomic component ensembles to describe situation-dependent collaboration groups (so-called ensembles). The paper shows how autonomic component ensembles can easily capture complex collaboration rules and how they can include both controllable autonomic components (i.e. drones) and non-controllable environment agents (flocks of birds in our case). As part of the demonstration, we provide an open-source implementation that covers both the specification of the autonomic components and ensembles of the use case, and the discrete event simulation and real-time visualization of the use case. We believe this is useful not only to demonstrate the effectiveness of architectures of collaborative autonomic components for dealing with real-life tasks, but also to build further experiments in the domain.
Software systems are playing an increasingly important role in many domains of our society. To ensure that software supports the public good, the software engineers who create and maintain it shall adhere to ethical principles. A joint task force of the IEEE and ACM has brought such a set of principles together in a Code of Ethics. These principles describe responsibilities of software engineers and guidelines to assist them when making decisions for the benefit of the public good. With the emergence of computing systems that take autonomous decisions, there is growing consensus that new ethical principles will be required. Since self-adaptive systems are characterized by autonomy, the need for new principles applies to these systems. Based on the Code of Ethics and leveraging ongoing initiatives, we suggest an initial set of new ethical principles for autonomous and self-adaptive systems, as an inspiration for an extended Code of Ethics for this important class of systems.
The main goal of any feedback control system is essentially to remove humans from the loop; this has always been the goal in the engineering of control systems. The MAPE-K loop is the embodiment of a feedback control loop in self-adaptive software systems, but the complete removal of humans from the control loop has not been thoroughly debated. One of the reasons is that software systems are socio-technical systems, and as such, humans need to be considered right from the inception of such systems; otherwise their deployment is bound to fail. However, as software self-adaptation progresses, enabling higher assurances to be placed on the deployment of these systems to the point where humans become dispensable, some ethical questions need to be raised. Similar questions were raised in the past when the first automatic systems became intrinsic to the industrial fabric. The difference between then and now is that the impact was then confined to portions of society, but now the implications are much wider, particularly if we consider software systems that are able to change themselves. If humans are not aware of those changes and their implications, they cease to be in tune with the system they are operating, and accidents will inevitably ensue. The point of no return in self-adaptive software systems refers to the moment in their technical maturity when any human involvement with the operation of a system is perceived to create more harm than benefit. Confronted with this situation, software engineers need to start asking themselves some basic ethical questions. Do we really need to consider humans as an integral part of self-adaptive software systems? If humans are removed from the control loop, what kind of assurances will be needed for society to accept such systems?
When developing autonomous systems, engineers and other stakeholders make great efforts to prepare the system for all foreseeable circumstances. However, such systems are still bound to encounter situations that were not considered at design time. For reasons like safety, cost, or ethics it is often highly desired that these new cases be handled correctly upon first encounter. In this paper, we first justify our position that there will always exist unpredicted events and conditions, driven by, e.g., new inventions in the real world, the diversity of world-wide system deployments and uses, and the possibility that multiple events that were unforeseen at design time (or overlooked, or knowingly abandoned following cost-benefit-risk calculations) will not only occur, but will occur together. We then argue that despite the unpredictability, handling such situations is indeed possible. Hence, we offer and exemplify design principles, which, when applied in advance, can improve the system's ability to deal with unpredicted situations. We conclude with a discussion of how this work and a much-needed thorough study of the unexpected can contribute toward a foundation of engineering principles for developing trustworthy next-generation autonomous systems.
Attacks against business logic rules occur when the attacker exploits the domain rules in a malicious way. Such attacks have not received sufficient attention in research so far. In this paper, we propose a novel self-protecting approach that defends a system against the exploitation of business logic vulnerabilities. The approach empowers a system with a self-protecting layer to protect it against attacks aimed at misusing business logic rules. The approach maintains up-to-date domain knowledge which is analyzed using runtime verification to detect logical attacks. When attacks are discovered they are dynamically mitigated by applying proper system reconfigurations at runtime. We evaluate the approach using a case from the domain of hotel booking systems.
Many self-adaptive systems benefit from human involvement and oversight, where a human operator can provide expertise not available to the system and can detect problems that the system is unaware of. One way of achieving this is by placing the human operator on the loop - i.e., providing supervisory oversight and intervening in the case of questionable adaptation decisions. To make such interaction effective, explanation is sometimes helpful to allow the human to understand why the system is making certain decisions and calibrate confidence from the human perspective. However, explanations come with costs in terms of delayed actions and the possibility that a human may make a bad judgement. Hence, it is not always obvious whether explanations will improve overall utility and, if so, what kinds of explanation to provide to the operator. In this work, we define a formal framework for reasoning about explanations of adaptive system behaviors and the conditions under which they are warranted. Specifically, we characterize explanations in terms of explanation content, effect, and cost. We then present a dynamic adaptation approach that leverages a probabilistic reasoning technique to determine when the explanation should be used in order to improve overall system utility.
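The explain-or-not trade-off can be made concrete with a back-of-the-envelope expected-utility calculation; the probabilities, utilities, and delay cost below are invented for illustration and the paper's probabilistic reasoning is far richer than this sketch:

```python
# Expected utility of acting, given the probability the human operator
# makes a good judgement, the payoffs of good/bad judgements, and the
# cost of the delay an explanation introduces.
def expected_utility(p_good_judgement, u_good, u_bad, delay_cost):
    return (p_good_judgement * u_good
            + (1 - p_good_judgement) * u_bad
            - delay_cost)

def should_explain(p_without, p_with, u_good=10.0, u_bad=-20.0,
                   delay_cost=2.0):
    without = expected_utility(p_without, u_good, u_bad, 0.0)
    with_expl = expected_utility(p_with, u_good, u_bad, delay_cost)
    return with_expl > without

# An explanation that raises the odds of a good judgement from 0.6 to
# 0.9 is worth the delay; a marginal improvement to 0.65 is not.
worth_it = should_explain(0.6, 0.9)
not_worth = should_explain(0.6, 0.65)
```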
Advanced systems such as IoT comprise many heterogeneous, interconnected, and autonomous entities operating in often highly dynamic environments. Due to their large scale and complexity, large volumes of monitoring data are generated and need to be stored, retrieved, and mined in a time- and resource-efficient manner. Architectural self-adaptation automates the control, orchestration, and operation of such systems. This can only be achieved via sophisticated decision-making schemes supported by monitoring data that fully captures the system behavior and its history.
Employing model-driven engineering techniques, we propose a highly scalable, history-aware approach to store and retrieve monitoring data in the form of enriched runtime models. We take advantage of rule-based adaptation, where change events in the system trigger adaptation rules. We first present a scheme to incrementally check model queries, in the form of temporal logic formulas that represent the conditions of adaptation rules, against a runtime model with history. Then we equip the model with the capability to retain only the information that is relevant to the queries. Finally, we demonstrate the scalability of our approach via experiments on a simulated smart healthcare system employing a real-world medical guideline.
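The idea of retaining only query-relevant history can be illustrated with an incremental monitor for a bounded-response query, "every alert is followed by a response within k steps": instead of the full event log, it keeps only the pending alerts. The event names, the query, and the FIFO matching are illustrative assumptions, not the paper's actual query language or model:

```python
class BoundedResponseMonitor:
    """Incrementally checks: every alert gets a response within k steps."""

    def __init__(self, k):
        self.k = k
        self.pending = []    # timestamps of alerts awaiting a response
        self.violations = 0
        self.now = 0

    def step(self, event):
        self.now += 1
        if event == "alert":
            self.pending.append(self.now)
        elif event == "response":
            if self.pending:
                self.pending.pop(0)  # match the oldest pending alert
        # Expire alerts whose deadline has passed: each is a violation.
        while self.pending and self.now - self.pending[0] >= self.k:
            self.pending.pop(0)
            self.violations += 1

m = BoundedResponseMonitor(k=3)
# First alert is answered in time; second alert gets no response.
for e in ["alert", "tick", "response", "alert", "tick", "tick", "tick"]:
    m.step(e)
```

The monitor's state is bounded by the number of simultaneously pending alerts, regardless of how long the system runs, which is the essence of keeping a history-aware model scalable.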