Safeguarding the Development and Usage of Agentic AI Applications in the Enterprise Context
Monash University
Associate Supervisor: Dr Chetan Arora
Co-supervised with A/Prof. Chakkrit Tantithamthavorn · Dr Michael Fu (University of Melbourne) · Joey Chua
Research Project
The rapid adoption of Large Language Model (LLM)-powered software, and the growing use of AI coding agents, have drastically improved software development and operational efficiency in enterprise settings. However, these systems introduce new and complex security vulnerabilities across their lifecycle.
Traditional LLM applications, such as chatbots, remain highly susceptible to runtime jailbreak attacks, while autonomous agents can be targeted by attacks hidden within the file system that steer the agent towards harmful actions. Although the utility of these LLM-powered applications is immense, their inherent vulnerabilities, exposed both through direct user interaction and through engineer interaction during development, remain largely unknown to enterprise adopters.
This thesis aims to quantify and evaluate the risks associated with the development and usage of LLM applications and agentic AI, and then to propose defensive measures that prevent such vulnerabilities from being exploited to compromise systems in the enterprise context. The work spans adaptive runtime guardrails, black-box testing methodologies for pre-trained models, and the engineering of secure retrieval-augmented generation (RAG)-based virtual assistants in practice.
Publications
Rui Yang, Michael Fu, Chakkrit Tantithamthavorn, Chetan Arora, Gunel Gulmammadova, and Joey Chua. ASE 2025. Introduces AdaptiveGuard, a framework for adaptive runtime safety monitoring of LLM-powered software systems, capable of detecting and responding to unsafe outputs in real time without requiring access to model internals.
Rui Yang, Michael Fu, Chakkrit Tantithamthavorn, Chetan Arora, Lisa Vandenhurk, and Joey Chua. Journal of Systems and Software 226 (2025): 112436. Reports on a practical study of engineering RAG-based virtual assistants, examining real-world challenges, design decisions, and lessons learned from deploying such systems in industry.