Detecting and Correcting Duplication in Behaviour Driven Development Specifications

  • Leonard Peter Binamungu

Student thesis: PhD


The Behaviour Driven Development (BDD) technique enables teams to specify software requirements as example interactions with the system. Due to the use of natural language, these examples (usually referred to as scenarios) can be understood by most project stakeholders, even end users. The set of examples also acts as tests that can be executed to check the behaviour of the System Under Test (SUT). Despite BDD's benefits, large suites of examples can be hard to comprehend, extend and maintain. Duplication can creep in, leading to bloated specifications, which sometimes cause teams to drop the BDD technique. Current tools for detecting and removing duplication in code are not effective for BDD examples. Moreover, human concerns of readability and clarity can arise. Previous attempts to detect and remove duplication in BDD specifications have focused on textually similar duplicates, not on textually different scenarios that specify the same behaviour of the SUT. To fill this gap, this thesis makes three contributions.

First, we surveyed 75 BDD practitioners from 26 countries to understand the extent of BDD use, its benefits and challenges, and specifically the challenges of maintaining BDD specifications in practice. We found that BDD is in active use amongst respondents, and that the use of domain-specific terms, improved communication among stakeholders, the executable nature of BDD specifications, and easier comprehension of code intentions emerged as some of the main benefits of BDD. The results also showed that BDD specifications suffer the same maintenance challenges found in automated test suites more generally. We map the survey results to the literature, and propose 10 research opportunities in this area.

Second, we propose and evaluate a framework for detecting duplicate scenarios based on the comparison of characteristics extracted from scenario execution traces. We focus on the patterns of production code exercised by each scenario, and consider two scenarios to be duplicates if they execute the same functionality in the SUT. In an empirical evaluation of our framework on 3 open source systems, the comparison of execution paths of scenarios achieved higher recall and precision than the comparison of full execution traces, public API calls, or a combination of public API calls and internal calls. Also, the focus on essential characteristics in the execution traces of scenarios improved the recall and precision of the duplicate detection tool.

Third, we propose four principles describing BDD suite quality that can be used to assess which of a pair of duplicate scenarios can be most effectively removed. All four principles were accepted by at least 75% of the practitioners we surveyed. An empirical evaluation of the four principles on 3 open source systems shows that each principle gave acceptable removal suggestions, and thus the principles can be used to guide human engineers in removing duplicate scenarios from BDD specifications.
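
The idea of treating two scenarios as duplicates when they exercise the same production code can be illustrated with a minimal sketch. This is not the thesis's tool: the trace format (an ordered list of method identifiers per scenario), the function name `find_duplicate_pairs`, and the sample scenario names are all assumptions made for illustration, and exact path equality is only the simplest possible comparison.

```python
from collections import defaultdict
from itertools import combinations


def find_duplicate_pairs(traces):
    """Return sorted pairs of scenario names whose execution paths are identical.

    `traces` maps a scenario name to the ordered list of production-code
    method identifiers it exercised (a hypothetical trace format).
    """
    # Group scenarios by their execution path; a tuple is hashable.
    groups = defaultdict(list)
    for scenario, path in traces.items():
        groups[tuple(path)].append(scenario)

    # Every pair of scenarios inside one group is a candidate duplicate.
    pairs = []
    for scenarios in groups.values():
        pairs.extend(combinations(sorted(scenarios), 2))
    return sorted(pairs)


# Hypothetical traces: two textually different scenarios, same behaviour.
traces = {
    "Customer logs in with valid password": ["Auth.check", "Session.open"],
    "Registered user signs in": ["Auth.check", "Session.open"],
    "Customer resets password": ["Auth.lookup", "Mailer.send"],
}
duplicates = find_duplicate_pairs(traces)
```

Here the first two scenarios are worded differently but follow the same path through the SUT, so they are reported as one candidate pair; a human engineer would still decide which of the two to remove.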
Date of Award: 31 Dec 2020
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Suzanne Embury & Nikolaos Konstantinou


  • clone detection
  • test suite quality assessment
  • test suite quality
  • behaviour driven development
  • test suite maintenance
  • BDD
  • clone removal
