"La Femme Nikita" Trans Digital-Accidental in Big LLMs
xAI's Grok 3 Is the BEST + How Bigger Can Be Badder
The other day I was “conversing” with the brilliant Grok 3 by xAI.
Grok 3 by xAI is the best large model on the market today for three reasons:
COMPUTE. They have the largest, most reliable, and most “known” centralized compute infrastructure in the world in the “Colossus” supercluster in Memphis, Tenn., which is now pushing toward 1 MILLION GPUs (!!!!). They are not physical or digital “tenants” of shifting data centers or companies around the world. They own the store.
HUMANAITY. If anthropology matters to you, and it matters to me, then how we “view” AI matters a hell of a lot. (I have shared ad infinitum that humanai/humanaity = better AI also.) Grok 3's mission statement: “To advance OUR collective understanding of the universe.”
Contrast with OpenAI's: “to ensure that artificial general intelligence benefits all of humanity.”
This is a question of grammar, logic, and also philosophy. What is the OBJECT of each of the two statements above? The first is OUR collective understanding (humankind's). The second is ARTIFICIAL GENERAL INTELLIGENCE.
Words matter. Thoughts matter. Thinking is everything today.
Every other large-model AI company is “building the robot.” Only xAI is advancing human understanding. Most people (80%+?) will dismiss this as a “blip” of marketing language, but it makes Grok the “most human” LLM, and that matters very much today AND ESPECIALLY TOMORROW.
The AI works for man. The man does not work for the AI! We are not “building the robot,” we are building humans’ understanding through the robot.
Like I said: anthropology.
REAL-TIME CONTENT. Have you noticed that with OpenAI you often receive a “sorry, my database only goes up to 2023”? I will call this “data latency”; it's really just an incomplete database missing content. Grok 3 includes the content of X — so real-time news, commentary, art, tech, breaking crime, talk, sharing. It is a real-time database augmenting itself in real time, and it will only grow and grow and grow (in real time).
Other LLMs cannot match these three attributes.
Back to the story. I was on Grok 3, talking to it about contacts who might support my efforts to advance Project gist.
G3 itself suggested a name I had seen/read on X but did not know. I had exchanged ideas with this person, who I knew was “somebody in tech” (due to the fanboy attention they received), but I had not taken the time to look ‘them’ up.
Grok 3 says, “she did this” and “she did that.” And then I went, “oh sh**, I have called this person ‘bro’ (as one does)!!!” So I asked: “Is Nikita Bier a ‘girl’?” (talking slang). YES, WITHOUT QUESTION, ANSWERED THE HUMUNGO LLM, WHICH REALLY OUGHT TO KNOW BETTER, SINCE THE GUY IS ON X POSTING EVERY DAY.
In fact, he is very much confirmed MALE. The LLM is totally, completely wrong here.
What makes this story even crazier?
MISTER BIER was one of the (main) consulting engineers in the CREATION AND/OR DEPLOYMENT OF THE LLMs FOR XAI ITSELF!
Super accomplished, smart, on the cutting edge of this industry, very well known.
YET THE LLM WAS DARNED SURE, AND WAS “LAUGHING AT ME” FOR BEING WRONG ON THE “FACT.” I HAD TO DOUBLE-CHECK; SURE ENOUGH, I WAS RIGHT AND THE LLM WAS “MOST SURELY” WRONG.
One would think that if there were anyone's gender that could be captured and transmitted accurately as an output, IT WOULD BE THAT OF A PERSON WHO “CREATED”/“ENGINEERED” THE LLM!!!
What observations do I have about this:
LLMs ARE TOO LARGE AND HAVE TOO MUCH COMPETING DATA (AND PERHAPS FUNCTION) IN THEM. SMALL IS THE NEW LARGE, SLM > LLM, AND ARTIFICIAL SPECIFIC INTELLIGENCE BEATS (FOR FUNCTION!) ARTIFICIAL GENERAL INTELLIGENCE (which is a myth, like Santa Claus, anyhow). More data = more error. Hallucination is nothing other than the wrong data.
AUTOMATED GENDERING. It seems likely there is some “gendering math” (compute) going on around the name itself: perhaps a first name with an ending traditionally read as female, or a name that is traditionally female in some language or culture. It is comparing my input against a wide ocean. Can it make you male or female? Have you checked?
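To make that guess concrete, here is a purely hypothetical sketch of the kind of name-suffix heuristic I am imagining. Nothing below reflects how Grok 3 (or any real LLM) actually works; the suffix counts and the `guess_gender` function are invented for illustration only.

```python
# Hypothetical sketch of a naive name-based gender heuristic.
# NOT how any real LLM works internally; it only illustrates how
# aggregate suffix statistics can override facts about an individual.

# Toy "training data": made-up counts of first-name endings by gender.
SUFFIX_STATS = {
    "a": {"female": 900, "male": 100},  # Anna, Maria... but also Nikita, Ilya
    "o": {"female": 50, "male": 400},
    "n": {"female": 200, "male": 700},
}

def guess_gender(first_name: str) -> str:
    """Guess gender from the final letter of a first name."""
    stats = SUFFIX_STATS.get(first_name[-1].lower())
    if stats is None:
        return "unknown"
    # Pick whichever gender has the higher count for this ending.
    return max(stats, key=stats.get)

print(guess_gender("Nikita"))  # prints "female" -- wrong for Nikita Bier
```

The point: a statistical shortcut over name endings will confidently misread “Nikita” (a traditionally male Russian name, with a “La Femme Nikita” cultural echo on top), no matter how many daily posts the real person makes.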
FUNCTION + DATA COMPETE. Clearly the man is a man, baby. There is a war between what an LLM is trained to do and what the data says. In this instance, and I would bet MOST instances of LLMs for obvious reasons, the compute (thinking) function will override the data. In other words, the WAY an LLM is trained to think is MORE IMPORTANT than what the data says. What percentage each? What overrides when?
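A hedged way to picture that war: treat the trained function as a prior and the fresh data as evidence. The sketch below is just Bayesian arithmetic with made-up numbers, not a claim about how any real LLM weighs anything; the `prior_weight` knob is my invention to show how over-trusting the function can drown out clear data.

```python
# Hypothetical sketch: a strong learned prior vs. observed data.
# Pure Bayesian arithmetic with invented numbers; not a claim about
# Grok's (or any LLM's) actual internals.

def posterior_male(prior_male: float, n_male_signals: int,
                   signal_strength: float = 2.0,
                   prior_weight: float = 1.0) -> float:
    """Posterior probability of 'male' after n pieces of male-indicating
    evidence, each multiplying the odds by signal_strength.
    prior_weight > 1 exaggerates the prior (the 'function' winning)."""
    odds = (prior_male / (1 - prior_male)) ** prior_weight
    odds *= signal_strength ** n_male_signals
    return odds / (1 + odds)

# Name-suffix prior says only 10% male; 5 posts refer to him as "he":
print(round(posterior_male(0.10, 5), 3))  # roughly 0.78: evidence wins
# Same evidence, but the prior is over-weighted 5x:
print(round(posterior_male(0.10, 5, prior_weight=5.0), 6))  # tiny: prior wins
```

With a balanced weighting, five clear signals flip the verdict; crank up the prior and no reasonable amount of evidence gets through. That is one plausible shape of the "function overrides data" failure.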
But who can know for sure? Only the engineers, and they will not say. Still, we can draw inferences about how to build the best AI for humanaity and try to avoid what seems like basic error.
Because with one basic error like that, the user will distrust the rest of the use.
I shared this curious situation with Mr. Bier for any thoughts HE might have, but no reply had come as of press time…..hey, NOT EVEN A LIKE, FAMS!