Publication | Open Access
Training-free Lexical Backdoor Attacks on Language Models
Citations: 26
References: 51
Year: 2023
Unknown Venue
Natural Language Processing, Large Language Models, Abuse Detection, Large AI Model, Engineering, Machine Learning, Attack Model, Computational Linguistics, Adversarial Machine Learning, Large-scale Language Models, Computer Science, Language Studies, Large Language Model, Language Models, Linguistics, Language Processing, Text Mining, Machine Translation
Large-scale language models have achieved tremendous success across various natural language processing (NLP) applications. Nevertheless, language models are vulnerable to backdoor attacks, which inject stealthy triggers into models to steer them toward undesirable behaviors. Most existing backdoor attacks, such as data poisoning, require further (re)training or fine-tuning of language models to learn the intended backdoor patterns. This additional training, however, diminishes the stealthiness of the attacks, because training a language model usually requires long optimization time, a massive amount of data, and considerable modifications to the model parameters.