An Anomaly Detection Model Training Method Based on LLM Knowledge Distillation

Abstract

In contemporary machine learning, large pre-trained models such as large language models (LLMs) like GPT have achieved outstanding success, but their deployment and practical application are limited by the large computational resources they require, which makes them especially difficult to use in current edge computing networks. We therefore propose an anomaly detection model training method based on LLM knowledge distillation, which aims to use the knowledge of an LLM to improve the performance of lightweight anomaly detection models. We first take the large pre-trained language model Llama 2 7B as the base model and perform supervised fine-tuning with the LoRA method on the large-scale textual dataset UNSW-NB15 for the anomaly detection task, so that the model captures rich domain knowledge and semantic information. We then apply knowledge distillation so that the fine-tuned LLM guides the training of the lightweight model, enabling the lightweight model to learn representations and features similar to those of the LLM and thereby transferring the LLM's knowledge. Experimental results show that our method significantly reduces the computational overhead of the model while maintaining high accuracy, and also improves the model's generalization ability and its recognition rate on anomalous data, making it suitable for anomaly detection on resource-constrained edge devices.
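
The distillation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss function, so the code below assumes the common soft-target formulation: a temperature-scaled KL divergence between teacher and student outputs, blended with the ordinary cross-entropy on the ground-truth label. The function names, the temperature of 2.0, and the weight alpha of 0.5 are hypothetical choices made for illustration, not values reported by the authors.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed soft-target distillation loss for one sample.

    alpha weights the teacher-matching term (KL divergence on softened
    distributions); (1 - alpha) weights cross-entropy on the hard label.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))
    # Cross-entropy of the student (temperature 1) against the true label.
    ce = -log_softmax(student_logits)[label]
    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Example: two classes (0 = normal, 1 = anomaly); the teacher predicts "anomaly".
teacher = [-1.0, 2.0]
agreeing_student = [-0.5, 1.5]
disagreeing_student = [1.5, -0.5]
print(distillation_loss(agreeing_student, teacher, label=1))
print(distillation_loss(disagreeing_student, teacher, label=1))
```

A student whose logits agree with the teacher receives a smaller loss than one that disagrees, which is the signal that lets the lightweight model imitate the fine-tuned LLM during training.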
