Researchers at the École Polytechnique Fédérale de Lausanne (EPFL) have found evidence that large language models (LLMs), including those developed by OpenAI and Google, may primarily process information in English internally, regardless of the language of the input prompt. The finding matters because these models, trained predominantly on English data, could carry linguistic and cultural biases into their outputs.
The study focused on Meta AI's Llama-2 model. Through a series of computational experiments, the researchers showed that the model appears to map inputs into an English-centric conceptual representation before generating output in the target language. This behavior suggests a deep-rooted bias toward English, stemming from its predominance in the training data. The implications are significant: the bias affects not just translation accuracy but also how AI models might shape our perception of reality, given the close relationship between language and thought.
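The kind of experiment described here typically inspects a model's intermediate layers by projecting each layer's hidden state onto the vocabulary and asking which token it is "leaning toward". The toy sketch below illustrates that idea only; it is not the study's code, the tokens and matrices are hypothetical, and a real analysis would use an actual model's hidden states and unembedding matrix.

```python
# Toy illustration of layer-wise decoding ("logit lens"-style analysis).
# All names and values here are invented for demonstration purposes.
import numpy as np

# Hypothetical three-token vocabulary: French, English, and German for "flower".
vocab = ["fleur", "flower", "Blume"]

# Toy unembedding matrix: one row of logit weights per vocabulary token.
# Orthogonal rows keep the example deterministic.
d_model = 4
unembed = np.eye(len(vocab), d_model)

def logit_lens(hidden_state, unembed, vocab):
    """Project an intermediate hidden state onto the vocabulary and
    return the token the layer assigns the highest probability."""
    logits = unembed @ hidden_state
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return vocab[int(np.argmax(probs))], probs

# Simulated hidden states for a French-to-German translation prompt:
# the middle layer drifts toward the English token before the final
# layer commits to the German target -- the pattern the study reports.
middle_layer = unembed[vocab.index("flower")] * 2.0
final_layer = unembed[vocab.index("Blume")] * 2.0

for name, h in [("middle layer", middle_layer), ("final layer", final_layer)]:
    token, _ = logit_lens(h, unembed, vocab)
    print(f"{name}: {token}")
```

In a real experiment, `middle_layer` and `final_layer` would come from the model's residual stream at different depths, and the pattern above, an English intermediate step en route to a non-English answer, is what the researchers interpret as an English-based conceptual stage.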