Supporting High-Level to Low-Level Requirements Coverage Reviewing with Large Language Models
Title language:
English
Original book title:
Mining Software Repositories (MSR) conference, Lisbon, Portugal
Original abstract:
Refining high-level requirements into low-level requirements is a common task, especially in safety-critical systems engineering. The objective is to describe every important aspect of the high-level requirement in a low-level requirement, ensuring a complete and correct implementation of the system's features. To this end, standards and regulations for safety-critical systems require reviewing the coverage of each high-level requirement by all of its low-level requirements to ensure that no aspects are missing. Supporting automatic requirements coverage reviewing is difficult because high-level and low-level requirements reside at different levels of abstraction, are natural-language heavy, and often use different vocabulary. Unfortunately, this problem has received notably little attention from the research community.
With the rise of Large Language Models (LLMs) that have been trained on a huge corpus of text and hence might "understand" the context of high-level and low-level requirements, we would expect to be able to address this problem. This paper presents the first study to explore the performance of LLMs in checking requirements coverage. For evaluation, we selected requirements from five publicly available data sets and evaluated whether GPT-3.5 and GPT-4 can detect whether the traced low-level requirements cover a high-level requirement. GPT-3.5 with a zero-shot plus explanation prompting strategy correctly classifies covered high-level requirements across four projects, and it correctly identifies incomplete coverage due to a single removed low-level requirement with 99.7% recall across the complete evaluation data set.
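The evaluation idea described in the abstract (removing a single low-level requirement and checking whether the incomplete coverage is detected) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the prompt wording, the function names `build_prompt`, `leave_one_out`, and `recall_on_incomplete`, and the `classify` callback, which stands in for a call to an LLM such as GPT-3.5. None of it is taken from the paper itself.

```python
from typing import Callable, List


def build_prompt(hlr: str, llrs: List[str]) -> str:
    """Assemble a zero-shot prompt that also asks for an explanation.

    The wording is hypothetical, not the prompt used in the study.
    """
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(llrs, start=1))
    return (
        "High-level requirement:\n"
        f"{hlr}\n\n"
        "Low-level requirements:\n"
        f"{numbered}\n\n"
        "Do the low-level requirements fully cover the high-level requirement? "
        "Answer 'yes' or 'no' and explain your answer."
    )


def leave_one_out(llrs: List[str]) -> List[List[str]]:
    """Return one incomplete variant per requirement, each missing exactly one."""
    return [llrs[:i] + llrs[i + 1:] for i in range(len(llrs))]


def recall_on_incomplete(
    hlr: str, llrs: List[str], classify: Callable[[str], bool]
) -> float:
    """Fraction of incomplete variants that are flagged as not covered.

    `classify` is a hypothetical stand-in for the model call; it receives
    the prompt and returns True when the answer is "covered".
    """
    variants = leave_one_out(llrs)
    if not variants:
        return 0.0
    detected = sum(1 for v in variants if not classify(build_prompt(hlr, v)))
    return detected / len(variants)
```

In practice, `classify` would send the prompt to the model and parse the yes/no answer from its response; the explanation part of the answer can be kept for the human reviewer.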