abliteration
English
Etymology
Blend of ablate + obliteration; see abliterate.
Noun
abliteration (countable and uncountable, plural abliterations)
- (neologism, artificial intelligence, computing) The process of uncensoring a large language model by identifying and removing the internal representations responsible for refusals, eliminating refusal behavior while preserving the model's other capabilities.
- 2024 June 13, Maxime Labonne, “Uncensor any LLM with abliteration”, in Hugging Face Blog[1], retrieved 29 May 2025:
- We applied abliteration to Daredevil-8B to uncensor it, which also degraded the model's performance.