IMAGINE THERE'S NO LANGUAGE

It's Easy if you Try

AI INSTANTANEOUS LANGUAGE TRANSLATION IS REALITY AND GOOD/EFFECTIVE - CHANGING HISTORY IN VERY LARGE WAYS

Simply stunning, these days with language + AI, and a whole Bible cautionary story about it.

»»» There is also a Bible laudatory story about it, remember.

One language for everyone, or more properly maybe, no languages. Or all language?

A singular logos?

A key component of the future growth of AI is language. In the area of Law, this is especially so. And especially for the USA market, which of course is all-language culturally. One can also think about a world shrinking through technology. Will someone in China be doing your law in 20 years?

I have called the ‘times’ we are in the GUDA - the Global Unified Data Age - where all information (all data) from every culture and language is culled (in “The Great Collation”) into ONE database.

This is meant to not just overcome nation and geography, language and culture but TIME itself. An elimination of chronology, with respect to data.

So all art, historical, writings, documents, video, audio, social media, business communications, official writings, informal chats, emails, photos, all — of past time, present time, and (today embedding AI function into all) future time(s) updating in real time — brought in one place for all/anyone in the world to access in their native ‘linguistic operating system’.

In Law, in my lifetime, or soon after I am gone, we will have accomplished putting all of the Law onto the blockchain available to anyone anywhere anytime any country any language “from a spigot” — Law on demand.

»»»> There are (2) major efforts underway toward a blockchain “DAO of Law”: AGLI - Artificial General Legal Intelligence, or maybe AGSI (Specific) is better? These underlie Hugging Face and others… yet are fractional at this moment.

Pile of Law

Pile of Law [Stanford University] is a large-scale, open-source dataset comprising approximately 256 GB of primarily English-language legal and administrative texts, with a strong focus on U.S. sources. It aggregates data from 35 diverse origins, including court opinions and filings, government agency publications, contracts, legislative records, and administrative rules.

The dataset is designed for pretraining language models in the legal domain, while emphasizing responsible data filtering techniques to address issues like privacy, toxicity, and bias, drawing from legal norms. It is hosted on Hugging Face and continues to grow, enabling research in natural language processing (NLP) for legal applications such as improving access to justice.

MultiLegalPile

MultiLegalPile [University of Bern] is an extensive 689 GB multilingual legal corpus that spans 24 languages (including all official EU languages) and 17 jurisdictions, incorporating diverse legal text types such as caselaw, legislation, and contracts. It integrates subsets like Native Multi Legal Pile (112 GB), Eurlex Resources (179 GB), Legal mC4 (106 GB), and even the Pile of Law (292 GB).

The dataset supports pretraining NLP models under fair use principles, with most content under permissive licenses (e.g., CC BY-NC-SA 4.0 or more open variants), and is openly available on Hugging Face for advancing multilingual legal AI research.
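As a quick sanity check on the MultiLegalPile figures quoted above, here is a minimal Python sketch that adds up the four listed subsets and shows each one's share of the whole. The sizes are taken directly from the description; the names `subsets_gb` and `composition` are made up for this illustration and are not part of any dataset tooling.

```python
# Subset sizes in GB, exactly as quoted in the MultiLegalPile description.
subsets_gb = {
    "Native Multi Legal Pile": 112,
    "Eurlex Resources": 179,
    "Legal mC4": 106,
    "Pile of Law": 292,
}


def composition(sizes):
    """Return the total size and each subset's share of it, in percent."""
    total = sum(sizes.values())
    shares = {name: round(100 * gb / total, 1) for name, gb in sizes.items()}
    return total, shares


total_gb, shares = composition(subsets_gb)
print(f"Total: {total_gb} GB")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The four subsets add up to the stated 689 GB, with Pile of Law as the largest single piece. Note that the 292 GB quoted here for Pile of Law is larger than the roughly 256 GB given in its own description, possibly because that dataset continues to grow.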

A key gate along the way is instantaneous AI language translation. How are we coming with it? We are there. In 2025, we are here.

See a sample above for reference. [and this company operating for at least 2 years]

[Note: we looked at a different deal in 2018 which seemed to have everything working fine at that time.]

See an EZDub AI 1-minute review below for reference.

+++ PS - Even my ZOOM does a transcription, instant, which can be manipulated with AI; even my SUBSTACK does a transcription, which can be manipulated with AI.
