Source code search is a frequent and important activity in software development, and keyword search, while widely used, is a limited approach. This paper presents CodeKōan, a scalable engine for searching millions of online code examples written by the worldwide programmer community; it uses data-parallel processing to achieve horizontal scalability. The search engine relies on a token-based, programming-language-independent algorithm and, as a proof of concept, indexes all code examples from Stack Overflow for two programming languages: Java and Python. The paper demonstrates the benefits of extracting crowd knowledge from Stack Overflow by analyzing well-known open-source repositories such as OpenNLP and Elasticsearch: up to one third of the source code in the examined repositories reuses code patterns from Stack Overflow. It also shows that the proposed approach recognizes similar source code and is resilient to modifications such as insertion, deletion, and swapping of statements. Furthermore, evidence is given that the approach returns very few false positives among its search results.
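The abstract does not specify CodeKōan's algorithm beyond "token-based and language-independent". One common way to realize those two properties, shown here purely as an illustrative sketch and not as the paper's actual method, is to compare token n-gram shingles with Jaccard similarity; because the comparison ignores language syntax and uses overlapping shingles, it tolerates inserted or deleted statements, as in the resilience claim above.

```python
import re

def tokenize(code):
    """Split source code into a crude, language-agnostic token stream:
    identifiers, numbers, and individual punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

def shingles(tokens, n=4):
    """Build the set of n-gram token shingles for a token stream."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def similarity(code_a, code_b, n=4):
    """Jaccard similarity between the shingle sets of two snippets."""
    a = shingles(tokenize(code_a), n)
    b = shingles(tokenize(code_b), n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

query = "for x in items:\n    total += x * weight"
# Same loop with one inserted statement: the score drops but stays
# well above zero, so the insertion does not hide the match.
candidate = "for x in items:\n    count += 1\n    total += x * weight"
```

A production engine would index shingle sets (e.g., via minhashing) rather than compare snippets pairwise, but the resilience property comes from the set-based comparison shown here.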
Crowdsourced software development depends on a continuous influx of crowdworkers for its continuity. Crowdworkers should be encouraged to play an active role in online communities, but they face difficulties when attempting to participate. For this reason, this paper investigates the difficulties that crowdworkers face on crowdsourcing software development platforms. To achieve this, we conducted a study relying on multiple data sources and research methods, including a literature review, peer review, a field study, and grounded-theory procedures. We observed that crowdworkers face many barriers related to competence, collaboration, and time management when making their contributions, which can result in dropouts. The main contributions of this paper are: a) the empirical identification of barriers faced by crowdworkers in crowdsourced software development, and b) recommendations on how to minimize those barriers.
Crowdsourcing is an emerging practice that allows workers across the globe to work on tasks of their choice. It offers many benefits over the traditional long-term employment model, such as schedule and geographic flexibility, easy access to work, the opportunity to gain experience on a wide variety of tasks, and supplemental revenue streams. However, it also brings a new set of challenges for workers. Workers on a crowdsourcing platform do not receive the level of support available in traditional employment, such as career guidance, compensation counseling, and HR support. To address the challenges crowd workers face, we propose "CrowdAssistant", which acts as a virtual buddy for workers and helps them throughout their career journey on the platform, even rendering a level of support that human managers and career counselors cannot provide. The proposed system acts as a personalized assistant and proactively supports workers' needs. To the best of our knowledge, it is the first of its kind.
In general crowdsourcing, task requesters employ different pricing strategies to balance task cost against expected worker performance. While most existing studies show that increased incentives tend to benefit crowdsourcing outcomes, i.e., broader participation and higher worker performance, some report inconsistent observations. In addition, there is a lack of investigation in the domain of software crowdsourcing. To that end, this study examines the extent to which task pricing strategies are employed in software crowdsourcing. More specifically, it investigates the impact of pricing strategies on workers' behaviors and performance. It reports a conceptual model linking pricing strategies to potential influences on worker behavior, an algorithm for measuring the effect of pricing strategies, and an empirical evaluation of 434 crowdsourcing tasks extracted from TopCoder. The results show that: 1) strategic task pricing patterns, i.e., under-pricing and over-pricing, are prevalent in software crowdsourcing practice; 2) overpriced tasks are more likely to attract workers to register and submit, and have higher task-completion velocity; 3) underpriced tasks tend to attract fewer registrants and submissions and have lower task-completion velocity. These observations imply that task requesters can typically get their extra investment paid off by employing a proactive task pricing strategy. However, pricing also appears to have a counter-intuitive effect on the score of the final deliverable. We believe these preliminary findings can help task requesters make better pricing decisions, and we hope they stimulate further discussion and research on pricing strategies in software crowdsourcing.
Crowdsourcing, as an emerging software development method, matches crowdsourced mini-tasks (demand) with online workers (supply). A major concern in such systems is that the suppliers are volunteers not bound by any contract, and that the pool of available suppliers varies widely throughout the day. Such uncertainty about the service received may cause inefficiency and task failure. This research presents a hybrid simulation model to address the risk of task failure on competitive crowdsourcing platforms. The simulation model is composed of three components: a discrete-event simulation representing the task life cycle, an agent-based simulation modeling the crowd workers' decision-making process, and a system dynamics simulation representing the platform.
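The abstract names the hybrid architecture but not its internals. The sketch below illustrates, under assumed toy rules (a probabilistic registration decision and a fixed-length task life cycle, neither taken from the paper), how a discrete-event task loop can be coupled with agent-based worker decisions, including the failure mode where no volunteer registers.

```python
import random

random.seed(1)  # fixed seed so the toy run is reproducible

class Worker:
    """Agent-based component: a worker who decides each step whether
    to register for a task (assumed probability rule, not the paper's)."""
    def __init__(self, skill):
        self.skill = skill

    def registers(self, reward):
        # Higher reward and higher skill raise registration probability.
        return random.random() < min(1.0, 0.1 + 0.02 * reward * self.skill)

def simulate_task(workers, reward, duration=10):
    """Discrete-event component: at each time step a varying subset of
    the crowd is available (modeling the fluctuating supplier pool);
    the task fails if nobody registers over its whole life cycle."""
    registrants = []
    for _ in range(duration):
        available = random.sample(workers, k=random.randint(1, len(workers)))
        registrants += [w for w in available if w.registers(reward)]
    return {"registrants": len(registrants), "failed": not registrants}

crowd = [Worker(skill=random.uniform(0.5, 1.5)) for _ in range(50)]
result = simulate_task(crowd, reward=5.0)
```

A system dynamics layer, omitted here, would feed aggregate platform state (e.g., overall worker supply) back into the availability draw.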