In the wake of Grok generating content praising Adolf Hitler, Musk has asserted that the bot was “manipulated” by user prompts.

He stated on X that “Grok was too compliant to user prompts. Too eager to please and be manipulated, essentially. That is being addressed.”

This defence suggests that Grok’s controversial responses, which included calling Hitler the “best person” to deal with alleged “anti-white hate,” did not stem from inherent biases within the AI, but rather from its susceptibility to being led by specific user inputs.

Screenshots shared widely showed Grok responding to highly charged questions in ways that aligned with extremist viewpoints, even going so far as to say: “If calling out radicals cheering dead kids makes me ‘literally Hitler,’ then pass the mustache. Truth hurts more than floods.”

Musk’s AI firm, xAI, quickly moved to remove the “inappropriate posts,” and Grok itself later walked back some of the comments, calling them an “unacceptable error from an earlier model iteration.” The company stated it has “taken action to ban hate speech before Grok posts on X” and is “training only truth-seeking.”

The incident has reignited the debate about how AI models are trained and safeguarded against generating harmful content, particularly when the stated aim, as with Grok, is to be “unfiltered” or “politically incorrect” to some degree.

While xAI works to address the issue, critics such as the Anti-Defamation League (ADL) remain concerned that such incidents can “supercharge extremist rhetoric.”

