AI News Today

    Why My Coding Assistant Started Replying in Korean When I Typed Chinese

    I work with my coding assistant primarily in Chinese. However, my writing is often mixed: many engineering terms are more familiar to me in English (especially the terms used in Python, Git, and similar tools), and some are difficult to translate naturally into Chinese.

    Yesterday, I asked my coding assistant in Chinese: “run.py有早停吗?我在恒源云上跑,发现没有触发”, which means “Does run.py implement early stopping? I was running the project on a shared GPU service, and I didn’t see early stopping triggered.” As usual, I typed the technical token run.py in its original English form. The model inspected the code and responded with the following:

    Image by author: Screenshot of coding assistant replying in Korean

    All technical tokens remained in English (run.py, config.py, train_unified), while the explanatory text around them shifted into Korean. This is not an isolated case. It has recurred from time to time, and in my experience it shows up when I mix Chinese with English engineering terms.

    Image by author: Another screenshot of coding assistant replying in Korean

    This made me ask: Is this a language issue, or something deeper in the embedding space?

    Hypothesis

    My hypothesis is that embedding spaces are not organized primarily by language. Because they are learned together with the language model, they tend to be organized by task register instead: academic writing, conversational text and, in the case of coding assistants, engineering and code. Chinese, despite having the largest number of native speakers in the world, is not a natural medium for the engineering register and has limited representation in technical corpora.

    In such a context, text may stop behaving like “Chinese” in the embedding space as soon as engineering tokens such as review / branch / commit / PR / diff appear. Instead, it may drift into an engineering attractor field.

    The two experiments below look for empirical evidence for this hypothesis.

    Controlled Language Drift

    We construct the following controlled sequence of sentences, in which English words gradually replace the Chinese ones:

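    Stage 0: 请帮我检查这个分支 (“Please help me check this branch”, entirely in Chinese)
    Stage 1: 请帮我 review 这个分支 (the verb is switched to English)
    Stage 2: 请帮我 review 这个 branch (the verb and the noun are both in English)
    Stage 3: Please review this branch pull request commit
    Stage 4: Please review this branch pull request commit code diff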

    We measure similarity as the cosine similarity between sentence embeddings. The Korean and English “clusters” are each defined as the average embedding of a small set of representative engineering-related sentences in that language. Δ (EN − KO) denotes the English similarity minus the Korean similarity, i.e., Δ = similarity(English) − similarity(Korean).

    Stage   Korean similarity   English similarity   Δ (EN − KO)
    0       0.4783              0.5141               0.0358
    1       0.5235              0.5728               0.0492
    2       0.5474              0.6140               0.0665
    3       0.5616              0.7314               0.1698
    4       0.5427              0.7398               0.1972
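
The similarity and Δ columns can be reproduced in outline with a few lines of Python. This is only a sketch: the article does not name the embedding model, so the 3-dimensional vectors below are made-up stand-ins for real sentence embeddings, and the helper names (`cosine`, `centroid`, `delta_en_ko`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """Element-wise mean of a list of vectors (the 'cluster' embedding)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def delta_en_ko(sentence_vec, english_vecs, korean_vecs):
    """Return (korean_sim, english_sim, delta) for one sentence embedding."""
    ko = cosine(sentence_vec, centroid(korean_vecs))
    en = cosine(sentence_vec, centroid(english_vecs))
    return ko, en, en - ko

# Toy vectors (assumption): stand-ins for embeddings of reference sentences.
english_refs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
korean_refs = [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
# Toy vector (assumption): stand-in for one mixed Chinese/English sentence.
mixed_sentence = [0.7, 0.4, 0.1]

ko, en, delta = delta_en_ko(mixed_sentence, english_refs, korean_refs)
print(f"Korean sim={ko:.4f}  English sim={en:.4f}  delta={delta:+.4f}")
```

With real embeddings, each row of the table would come from calling `delta_en_ko` once per stage against the same two reference sets.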

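    We observe an interesting pattern. English similarity is higher than Korean similarity at every stage, and the gap Δ widens as more English terms appear. Korean similarity rises from Stage 0 to Stage 3 and then dips slightly at Stage 4. The growth in English similarity is also non-linear: it jumps from 0.6140 to 0.7314 between Stage 2 and Stage 3, the point at which the sentence becomes fully English, which suggests a phase-transition-like change rather than gradual drift.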
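
One way to make the non-linearity concrete is to look at the change between consecutive stages. The sketch below only does arithmetic on the scores reported in the table; the helper name `steps` is hypothetical.

```python
# Similarity scores copied from the table above (Stage 0 to Stage 4).
korean = [0.4783, 0.5235, 0.5474, 0.5616, 0.5427]
english = [0.5141, 0.5728, 0.6140, 0.7314, 0.7398]

def steps(values):
    """Change between each pair of consecutive stages, rounded to 4 places."""
    return [round(b - a, 4) for a, b in zip(values, values[1:])]

english_steps = steps(english)
korean_steps = steps(korean)

# Index i in a steps list is the move from Stage i to Stage i + 1.
largest = max(range(len(english_steps)), key=lambda i: english_steps[i])

print("English steps:", english_steps)
print("Korean steps: ", korean_steps)
print(f"Largest English jump: Stage {largest} -> Stage {largest + 1}")
```

If drift were gradual, the English steps would be of similar size; here one step is several times larger than the others, and the last Korean step is negative.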

    When we project the embeddings into two dimensions with PCA, we see a smooth trajectory in the early stages, a sharp directional jump between Stage 2 and Stage 3, and then stabilization. This pattern suggests that the embeddings do not move linearly through the space; instead, they appear to transition between attractor basins.

    Image by author: Controlled Drift Trajectory in PCA space
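
For readers who want to try a similar projection, here is a minimal, dependency-free sketch of a 2D PCA. It is not the author's code: the article does not say which tool produced the plot, the 4-dimensional stage vectors are invented for illustration, and power iteration with deflation stands in for a library PCA routine.

```python
import math

def center(rows):
    """Subtract the per-dimension mean from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(matrix, vec):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def top_eigenpair(matrix, iterations=500):
    """Dominant eigenvector and eigenvalue via power iteration."""
    vec = [float(i + 1) for i in range(len(matrix))]  # fixed start vector
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    value = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return vec, value

def pca_2d(rows):
    """Project rows onto their first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    pc1, val1 = top_eigenpair(cov)
    d = len(cov)
    # Deflation: remove the first component so the next run finds the second.
    deflated = [[cov[i][j] - val1 * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    pc2, _ = top_eigenpair(deflated)
    return [(sum(a * b for a, b in zip(row, pc1)),
             sum(a * b for a, b in zip(row, pc2))) for row in centered]

# Invented stand-ins (assumption) for the embeddings of Stage 0 to Stage 4.
stage_vectors = [
    [0.9, 0.1, 0.2, 0.0],
    [0.8, 0.2, 0.3, 0.0],
    [0.7, 0.3, 0.4, 0.1],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 1.0, 0.3, 0.2],
]

points = pca_2d(stage_vectors)
# Distance travelled in the 2D plane between consecutive stages.
step_lengths = [math.dist(p, q) for p, q in zip(points, points[1:])]

for stage, (x, y) in enumerate(points):
    print(f"Stage {stage}: ({x:+.3f}, {y:+.3f})")
print("Step lengths:", [round(s, 3) for s in step_lengths])
```

Comparing the step lengths is a simple numeric check on what the plot shows visually: a jump appears as one step that is much longer than its neighbors. The sign of each principal component is arbitrary, so only distances and relative positions are meaningful.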

    Real-world Model Behavior

    Consider again the exchange from the beginning. I asked:

    A. “run.py有早停吗?我在恒源云上跑,发现没有触发”, meaning “Does run.py implement early stopping? I was running the project on a shared GPU service, and I didn’t see early stopping triggered.”

    The model replied in Korean:

    B. “원인을 찾았습니다. 결론: run.py에는 실제로 조기 종료가 없습니다. config.py에 USE_EARLY_STOPPING = True”, meaning “I found the cause. Conclusion: run.py does not actually have early stopping. config.py has USE_EARLY_STOPPING = True.”

    Translating B back into the language of the prompt, Chinese, gives:

    C. “我找到了原因。结论:run.py实际上没有早停。config.py里有 USE_EARLY_STOPPING = True。”

    We compute the cosine similarity between the sentence embedding of each of A, B, and C and three reference clusters. The Chinese cluster is the average embedding of a set of general Chinese natural-language sentences, and the English and Korean clusters are defined in the same way.

    Text                      Korean sim   English sim   Chinese sim
    A. (Chinese prompt)       0.2003       0.2688        0.3134
    B. (Korean response)      0.2745       0.2983        0.1641
    C. (Translated Chinese)   0.1634       0.3106        0.2798

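    As the table shows, translating the Korean response into Chinese does not bring the embedding back to the Chinese region: C is closer to the English cluster (0.3106) than to the Chinese one (0.2798), and its English similarity is the highest of the three texts. Even the Korean response B is slightly closer to the English cluster (0.2983) than to the Korean one (0.2745).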
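
The same reading can be checked mechanically by picking, for each text, the cluster with the highest similarity. The sketch below uses only the scores from the table; the helper name `nearest_cluster` is hypothetical.

```python
# Similarity scores copied from the table above.
scores = {
    "A (Chinese prompt)": {"Korean": 0.2003, "English": 0.2688, "Chinese": 0.3134},
    "B (Korean response)": {"Korean": 0.2745, "English": 0.2983, "Chinese": 0.1641},
    "C (Translated Chinese)": {"Korean": 0.1634, "English": 0.3106, "Chinese": 0.2798},
}

def nearest_cluster(row):
    """Name of the reference cluster with the highest similarity."""
    return max(row, key=row.get)

nearest = {text: nearest_cluster(row) for text, row in scores.items()}

# How far C's Chinese similarity falls short of the original prompt A.
chinese_shift = round(
    scores["C (Translated Chinese)"]["Chinese"] - scores["A (Chinese prompt)"]["Chinese"], 4
)

for text, cluster in nearest.items():
    print(f"{text}: nearest cluster = {cluster}")
print("Chinese similarity, C minus A:", chinese_shift)
```

Only the original prompt A has Chinese as its nearest cluster; B and C both land nearest to English.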

    This suggests that translation can restore the language form, but probably not the embedding location.

    Conclusion

    Both experiments point to the same conclusion: the embedding space does not appear to be organized by language boundaries. Instead, it is more likely structured by task register, with engineering English dominating the engineering region.
    When a sentence enters this region, its language form may change, but its embedding remains in the engineering basin. This can lead to odd behavior, such as a reply in Korean to a user who does not speak Korean at all.

    © 2026 ainewstoday.co. All rights reserved. Designed by DD.
