Tell them apart: Distilling technology differences from crowd-scale comparison discussions

Yi Huang, Chunyang Chen, Zhenchang Xing, Tian Lin, Yang Liu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

32 Scopus citations

Abstract

Developers can use different technologies for many software development tasks in their work. However, when faced with several technologies with comparable functionalities, it is not easy for developers to select the most appropriate one, because comparing technologies by trial and error is time-consuming. Instead, developers can resort to expert articles, read official documents or ask questions on Q&A sites for technology comparison, but whether they obtain a comprehensive comparison is a matter of chance, as online information is often fragmented or contradictory. To overcome these limitations, we propose the diffTech system, which exploits the crowdsourced discussions from Stack Overflow and assists technology comparison with an informative summary of different comparison aspects. We first build a large database of comparable software technologies by mining tags in Stack Overflow, and locate comparative sentences about comparable technologies with NLP methods. We further mine prominent comparison aspects by clustering similar comparative sentences and representing each cluster with its keywords. The evaluation demonstrates both the accuracy and usefulness of our model, and we implement a practical website for public use.
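The pipeline outlined in the abstract (locate comparative sentences about a technology pair, cluster similar sentences, label each cluster with keywords) can be illustrated with a minimal sketch. The abstract does not specify the NLP or clustering methods, so everything below is an assumption for illustration only: the regex cue for comparatives, the Jaccard-overlap grouping, the stopword list, the sample sentences, and all function names are hypothetical and are not the diffTech implementation.

```python
import re
from collections import Counter

# Hypothetical sample data; not taken from the paper.
SENTENCES = [
    "PostgreSQL is faster than MySQL for complex queries.",
    "MySQL queries are faster than PostgreSQL in my benchmark.",
    "I installed MySQL and PostgreSQL yesterday.",
    "PostgreSQL has better documentation than MySQL.",
]

# Assumed stopword list, kept tiny for the example.
STOPWORDS = {"is", "are", "than", "the", "a", "an", "and", "for",
             "in", "my", "has", "i", "more", "less"}

# Crude cue (an assumption): "more"/"less" or a word ending in "er",
# followed later by "than". Prone to false positives such as "other ... than".
COMPARATIVE = re.compile(r"\b(?:more|less|\w+er)\b.*\bthan\b", re.IGNORECASE)


def tokens(sentence):
    """Lowercase word tokens; '+' and '#' are kept for names like c++ or c#."""
    return re.findall(r"[a-z0-9+#]+", sentence.lower())


def is_comparative(sentence, tech_a, tech_b):
    """True if the sentence mentions both technologies and has a comparative cue."""
    words = set(tokens(sentence))
    return tech_a in words and tech_b in words and bool(COMPARATIVE.search(sentence))


def keywords(sentence, tech_a, tech_b):
    """Content words of a sentence, excluding stopwords and the technology names."""
    return {w for w in tokens(sentence)
            if w not in STOPWORDS and w not in (tech_a, tech_b)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(sentences, tech_a, tech_b, threshold=0.2):
    """Greedy grouping: join the first cluster containing a similar-enough sentence."""
    clusters = []  # each cluster is a list of (sentence, keyword_set) pairs
    for sentence in sentences:
        kw = keywords(sentence, tech_a, tech_b)
        for members in clusters:
            if any(jaccard(kw, other) >= threshold for _, other in members):
                members.append((sentence, kw))
                break
        else:
            clusters.append([(sentence, kw)])
    return clusters


def label_cluster(members, top_n=2):
    """Most frequent keywords in a cluster; ties are broken alphabetically."""
    counts = Counter(w for _, kw in members for w in kw)
    return sorted(counts, key=lambda w: (-counts[w], w))[:top_n]


comparative = [s for s in SENTENCES if is_comparative(s, "postgresql", "mysql")]
for members in cluster_sentences(comparative, "postgresql", "mysql"):
    print(label_cluster(members), [s for s, _ in members])
```

On the sample data, the two speed-related sentences fall into one group and the documentation sentence into another, while the non-comparative sentence is filtered out. A real system would need far more robust sentence detection and similarity measures than this toy cue and threshold.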

Original language: English
Title of host publication: ASE 2018 - Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering
Editors: Christian Kastner, Marianne Huchard, Gordon Fraser
Publisher: Association for Computing Machinery, Inc
Pages: 214-224
Number of pages: 11
ISBN (Electronic): 9781450359375
State: Published - 3 Sep 2018
Externally published: Yes
Event: 33rd IEEE/ACM International Conference on Automated Software Engineering, ASE 2018 - Montpellier, France
Duration: 3 Sep 2018 to 7 Sep 2018

Publication series

Name: ASE 2018 - Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering

Conference

Conference: 33rd IEEE/ACM International Conference on Automated Software Engineering, ASE 2018
Country/Territory: France
City: Montpellier
Period: 3/09/18 to 7/09/18

Keywords

  • Differencing similar technology
  • NLP
  • Stack Overflow
