Have you ever asked ChatGPT not to do something, only to find that it quietly keeps doing it?
For me, it's the em dash. I don't hide that I use AI to support my writing, but I prefer to remove the more obvious signals. I like things to feel intentional and a little more "me".
It sounds like a simple request. And yet, it rarely sticks. It's not a major problem, just one of those persistent details that makes me wonder what's going on underneath.
Today I watched a thoughtful conversation with Yuval Noah Harari. One point stayed with me. He suggested that modern AI systems are shaped less by the instructions we give them, and more by the behaviours they observe. In other words, they learn from we do, not what we say.
It made me think about how often that's true in life, too.
The table below outlines how this shaping occurs at different stages of AI development. It is a quiet reminder that whether we are training machines or shaping our own habits, what we model often speaks louder than what we say.
Why Behaviour Outweighs Instructions
Training phase | What the model “soaks up” | Practical Affect |
---|---|---|
Pre-training on billions of documents | Raw statistical patterns of human language—including exaggeration, sarcasm, half-truths, kindness, cruelty, etc. | Unless we scrub or weight the data, every pattern (good or bad) leaves a trace. |
Fine-tuning / RLHF (Reinforcement Learning from Human Feedback) | Examples that humans actively reward or punish | The model learns to imitate approved behaviour, not necessarily instructed values. Over time, the reward signal dominates written rules. |
Live deployment / online learning (recommendation engines, chatbots that keep adapting) | Real-time clicks, watch-time, up-votes, prompts | Systems keep adjusting toward the behaviours users reinforce—even if those behaviours contradict “official” policies. |
What began as a small edit request, a quiet rebellion against the em dash, revealed something meaningful. These systems follow patterns, respond to repetition, and reflect the behaviours that re reinforced.
There’s a thread in the OpenAI Developer Community about debugging the em dash. Maybe the lesson here isn’t just about punctuation. Maybe it’s about the quiet power of consistency in how we train, how we show up, and who we become.
No comments:
Post a Comment