Florian Stifter, "Automated Unit Test Generation and Improvement with Large Language Models (LLMs)", 12-2024
Original Title:
Automated Unit Test Generation and Improvement with Large Language Models (LLMs)
Language of the Title:
English
Original Abstract:
The rise of Artificial Intelligence (AI) and Large Language Models (LLMs) has transformed numerous sectors, making tools such as OpenAI's ChatGPT widely accessible. In industrial applications, one promising use of LLMs is the automated generation of unit tests. This thesis evaluates OpenAI's LLMs, specifically gpt-3.5-turbo and gpt-4-turbo, in generating and enhancing unit tests for the open-source project JodaMoney. The research is structured around three progressive scenarios: (1) generating unit tests for systems with no existing test coverage, (2) augmenting coverage for systems with existing tests, and (3) improving the quality of existing unit tests. Experiments with these scenarios reveal that test quality depends on the amount of contextual information provided, the complexity of the system under test, and the number of test methods requested per LLM request. Simpler systems and methods yield higher-quality results, especially when more information is available to the LLMs.

This thesis offers practical insights into the potential and limitations of leveraging LLMs for unit test generation, emphasizing key considerations and challenges in achieving effective results for industrial applications. Across the three scenarios examined, tailored setups were developed that yielded acceptable outcomes for the system under test, JodaMoney. However, achieving high-quality results required considerable effort. Successful outcomes depended heavily on providing detailed information in the prompts and on factors such as iteratively refining prompts with identified issues and limiting the number of test methods requested per batch. These findings underscore the importance of thoughtful configurations and strategies when employing LLMs to generate or improve unit tests, particularly in complex or industrial contexts.
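The abstract notes that results improved when prompts carried detailed context about the system under test and requested only a limited number of test methods per request. A minimal sketch of that prompt-batching idea is given below; the function name, prompt wording, and default batch size are illustrative assumptions, not details taken from the thesis:

```python
def build_test_generation_prompt(class_name, method_signatures, class_source,
                                 max_methods_per_batch=3):
    """Build one LLM prompt for a small batch of target methods.

    Returns the prompt string and the list of methods left for later
    batches, so a caller can iterate until all methods are covered.
    (Hypothetical helper, sketched for illustration.)
    """
    # Limit the number of test methods requested per LLM request,
    # one of the factors the thesis identifies as affecting quality.
    batch = method_signatures[:max_methods_per_batch]
    remaining = method_signatures[max_methods_per_batch:]

    # Include the full class source as context: more information
    # available to the LLM tended to yield higher-quality tests.
    prompt = (
        f"You are an expert Java developer. Write JUnit unit tests "
        f"for the class {class_name}.\n\n"
        f"Class source:\n{class_source}\n\n"
        "Generate tests only for these methods:\n"
        + "\n".join(f"- {m}" for m in batch)
    )
    return prompt, remaining
```

The returned remainder list allows iterative refinement: each batch's prompt can be extended with issues found in earlier LLM output before the next request is sent.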