SoHeal '18: Proceedings of the 1st International Workshop on Software Health

Full Citation in the ACM Digital Library

SESSION: Keynote

Lessons learned from the linux kernel: creating sustained healthy communities

The Linux Kernel is one of the most successful open source projects to date. After 26 years, the rate of code contribution continues to be high, new developers are still being attracted to participate, and the code is in widespread use. By analyzing the contributions, we can see how individuals impact the kernel's evolution as a whole, as do the organizations in the kernel ecosystem. So what lessons can we learn from this information? What information is relevant to software community health in general that is not captured by traditional health metrics? This keynote will discuss how the insights from the Linux kernel are being applied to other Linux Foundation open source projects to create healthy, vibrant communities producing useful code for us all.

SESSION: Software ecosystem health

Maintaining third-party libraries through domain-specific category recommendations

Proper maintenance of third-party libraries contributes toward sustaining a healthy project, mitigating the risk of it becoming outdated and obsolete. In this paper, we propose domain-specific categories (i.e., groupings of libraries that perform similar functionality) in library recommendations that aid in library maintenance. Our empirical study covers 2,511 GitHub projects and 150 domain-specific categories of Java libraries. Our results show that a system uses up to six different categories in its dependencies. Furthermore, recommending domain-specific categories is practical (i.e., with an accuracy between 66% and 81% for multiple categories) and its suggestion of libraries within that domain is comparable to existing techniques.

The effect of generic strategies on software ecosystem health: the case of cryptocurrency ecosystems

Thus far, no research has been done into the effect of business strategies on software ecosystem health. This research aims to fill that gap by combining the Open Source Ecosystem Health Operationalization (OSEHO) and generic strategies. These models are combined and tested on five cases of cryptocurrency ecosystems: Ripple, Ethereum, Litecoin, IOTA and zCash. Findings suggest that the generic strategy Focused Differentiation has the biggest positive impact on ecosystem health. Further research is necessary to see if this is also true for more mature and more stable ecosystems.

Healthy until otherwise proven: some proposals for renewing research of software ecosystem health

The software ecosystem has become a central conceptualisation for characterising the contemporary software business world. To understand and evaluate ecosystems, the concept of 'ecosystem health' was borrowed from the field of biology. In a 'healthy' ecosystem the participants flourish; in an unhealthy one they suffer. Yet there is a lack of empirical validation for the current approach, and the concept itself has certain limitations. This paper presents a critique of current ecosystem health measurement and evaluation approaches. In addition, it discusses three proposals that could help to refocus academic research on software ecosystem health.

On the nature of software sub-ecosystems and their health

Background. The concept of sub-ecosystems, widely used in natural ecosystem theory, has never been introduced in software ecosystem analysis. It provides a perspective on software ecosystems that can be used to create a better understanding of them and more effective ecosystem health analysis. Objectives. The objective of this research is to introduce the concept of sub-ecosystems to the field of software ecosystems. An extension of the Open Source Ecosystem Health Operationalization (OSEHO) for measuring the health of a sub-ecosystem is created and evaluated with three small case studies. Method. A literature review of both software and natural ecosystem research is used for the definitions of key concepts. Design Science is used for the extension of the OSEHO. Finally, for the case studies, data is gathered from several data repositories and analyzed to show how the concept of sub-ecosystems is used. Results. The concept of software sub-ecosystems is defined. In addition, an extension to the OSEHO framework is introduced for considering sub-ecosystems in health assessments. Conclusion. The subject of sub-ecosystems provides a promising new perspective on software ecosystems that improves the understanding of this research field for both researchers and practitioners. Additionally, the extended OSEHO framework can be used to more accurately measure the health of an ecosystem by looking at both larger and smaller ecosystems around it.

SESSION: Health vs. cloud, finances and global development

National boundaries and semantics of artefacts in open source development

Global software development has long been recognised as a paradigm shift in modern software development. As an immediate effect, co-location of workers in the same building or office is no longer seen as necessary. Coordination in distributed socio-technical systems is mostly achieved by means of the artifacts produced by the developers who are part of a project's team.

Geographic distance profoundly affects the ability to collaborate. With communication becoming less frequent, the challenge is for it to become more effective. This is especially complex when different nationalities, languages and cultures are part of the same development effort. Open source software is an example of a distributed, multi-lingual development effort. As such, the main resulting artefacts are discussions and source code. The resulting semantic corpus can differ depending on whether the authors come from the same ethnic and language groups or from different ones.

The purpose of this paper is to evaluate the artifacts in the context of their semantics, and how semantic corpora are affected by development and languages. By using a selection of Open Source projects developed within national boundaries, we compare their semantic richness, and how their class content is reflected in their identifiers. We also compare these national projects to a successful, international project. The aim is to discover how national boundaries influence the semantics of the developed code.

Cloudhealth: a model-driven approach to watch the health of cloud services

Cloud systems are complex and large systems where services provided by different operators must coexist and eventually cooperate. In such a complex environment, controlling the health of both the whole environment and the individual services is extremely important to timely and effectively react to misbehaviours, unexpected events, and failures. Although there are solutions to monitor cloud systems at different granularity levels, how to relate the many KPIs that can be collected about the health of the system and how health information can be properly reported to operators are open questions.

This paper reports the early results we achieved in the challenge of monitoring the health of cloud systems. In particular, we present CloudHealth, a model-based health monitoring approach that can be used by operators to watch specific quality attributes. The CloudHealth Monitoring Model describes how to operationalize high-level monitoring goals by dividing them into subgoals, deriving metrics for the subgoals, and using probes to collect the metrics. We use the CloudHealth Monitoring Model to control the probes that must be deployed on the target system, the KPIs that are dynamically collected, and the visualization of the data in dashboards.

Exploring the effect of software ecosystem health on the financial performance of the open source companies

Background. It is currently unknown how software ecosystem health affects the financial performance of open source companies. This is a problem, because open source is becoming increasingly popular and more knowledge is necessary for companies on how to capitalize on this phenomenon. Objectives. With this paper, insight is developed into the relation between software ecosystem health and financial performance of the open source company that nurtures it. Method. A case study on two open source companies, Cloudera and Hortonworks, is performed. The software ecosystem health and financial performance of both companies are assessed. Results. Cloudera is healthier in terms of robustness and niche creation, while Hortonworks is healthier in terms of productivity. Financially, Cloudera performs better than Hortonworks. Conclusion. The following hypotheses are formulated. Software ecosystem health has an expected positive relation with financial performance. Niche creation health is the main contributor to this relation, robustness health is a minor reason for this relation, and software ecosystem productivity health has little to no influence.