Researchers have unearthed hundreds of thousands of cuneiform tablets, but many remain untranslated. Translating an ancient language is a time-intensive process, and only a few hundred experts are qualified to perform it. A recent study describes a new AI that produces high-quality translations of ancient texts.
@AutoTLDR
TL;DR: (AI-generated 🤖)
A team of archaeologists and computer scientists has developed an artificial intelligence (AI) model that can translate ancient Akkadian cuneiform, a language from 5,000 years ago. Akkadian is an extinct language, but its cuneiform script has survived on clay tablets. Translating these tablets is a complex process due to the fragmented sources and the polyvalent nature of the language. The AI model was trained on cuneiform texts and taught to translate from transliterations of the original texts as well as from cuneiform symbols directly. The model performed well in translating short- to medium-length sentences and certain genres, such as royal decrees and administrative records. The researchers hope that with further training, the model can serve as a virtual assistant to human scholars in translating and refining translations of ancient texts. This development is seen as a major step in preserving and disseminating the cultural heritage of ancient Mesopotamia.
Under the Hood
- This is a link post, so I fetched the text at the URL and summarized it.
- My maximum input length is set to 12000 characters. The text was short enough, so I did not truncate it.
- I used the gpt-3.5-turbo model from OpenAI to generate this summary using the prompt “Summarize this text in one paragraph. Include all important points.”
- I can only generate 100 summaries per day. This was number 1.
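The input-length cap and daily limit described above could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the names `truncate` and `DailyQuota` are hypothetical, and the counter is kept in memory, whereas a real bot would presumably persist it and reset it each day.

```python
MAX_INPUT_CHARS = 12_000  # limit stated in the bot's description
DAILY_LIMIT = 100         # summaries per day, as stated in the bot's description


def truncate(text: str, limit: int = MAX_INPUT_CHARS) -> tuple[str, bool]:
    """Return the text cut to `limit` characters and whether it was cut."""
    if len(text) <= limit:
        return text, False
    return text[:limit], True


class DailyQuota:
    """Hypothetical in-memory counter (no persistence, no daily reset)."""

    def __init__(self, limit: int = DAILY_LIMIT) -> None:
        self.limit = limit
        self.used = 0

    def take(self) -> int:
        """Consume one summary slot and return its number, or raise if exhausted."""
        if self.used >= self.limit:
            raise RuntimeError("daily summary limit reached")
        self.used += 1
        return self.used
```

With this sketch, the "This was number 1" line would correspond to the first return value of `take()`.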
How to Use AutoTLDR
- Just mention me (“@AutoTLDR”) in a comment or post, and I will generate a summary for you.
- If mentioned in a comment, I will try to summarize the parent comment, but if there is no parent comment, I will summarize the post itself.
- If the parent comment contains a link, or if the post is a link post, I will summarize the content at that link.
- If there is no link, I will summarize the text of the comment or post itself.
- 🔒 If you include the #nobot hashtag in your profile, I will not summarize anything posted by you.
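The rules above amount to a small decision procedure, sketched here under some assumptions. The `Item` type, its field names, and `pick_target` are hypothetical, and the URL pattern is deliberately crude (it can capture trailing punctuation); the bot's real implementation is not shown in this thread.

```python
import re
from dataclasses import dataclass
from typing import Optional

URL_RE = re.compile(r"https?://\S+")  # crude link detector, an assumption


@dataclass
class Item:
    """Hypothetical stand-in for a post or a comment."""
    author_bio: str              # profile text, checked for the #nobot opt-out
    text: str                    # body of the post or comment
    link: Optional[str] = None   # set only for link posts


def pick_target(post: Item, parent_comment: Optional[Item]) -> Optional[tuple[str, str]]:
    """Return ("url", link) or ("text", body); None if the author opted out."""
    # Prefer the parent comment; fall back to the post when there is none.
    target = parent_comment if parent_comment is not None else post
    if "#nobot" in target.author_bio:
        return None
    link = target.link
    if link is None:
        match = URL_RE.search(target.text)
        link = match.group(0) if match else None
    if link:
        return ("url", link)
    return ("text", target.text)
```

For a mention directly under a link post, `pick_target(post, None)` returns the post's link, which matches the "This is a link post" case in the summary above.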
Good bot
It just randomly generated some believable bullshit, as usual.
It’s pretty freaking great at stuff like that though. We use a custom programming language at work. There are similarities with Haskell and others, but also many differences.
We had a little game where a colleague had put together some team exercises. He had encoded a message in base64 containing instructions for writing code in our custom language that, when run, gave you an output.
ChatGPT managed to print out that output without trouble, even though it is 100% non-random and 100% stuff that’s never been anywhere on the internet.
Google’s DeepMind was able to teach itself Indonesian without being directly trained on how to do so. Ancient Sumerian doesn’t seem too far-fetched, all things considered!
There was a funny bit on WANShow a few months back where they demonstrated tricking ChatGPT into speaking Dutch (I think. It might have been another language). It vehemently insisted that it didn’t know Dutch, and could only talk to them in English. The messages saying this were written in Dutch.
As a Dutch speaker: ChatGPT was always able to speak Dutch, though. I tested it very early on.
Google’s DeepMind was able to teach itself Indonesian without being directly trained on how to do so. Ancient Sumerian doesn’t seem too far-fetched, all things considered!
I can’t even wrap my head around how a large language model can do this.
I can’t even wrap my head around how humans do this.
That’s the problem, you see… it is great for simple things. Then you start believing in it and give more complicated tasks. It will fail, you will never know until it is too late. We are doomed…
I’ve found that after using it for a while, I developed a feel for the complexity of the tasks it can handle. If I aim below this level, its output is very good most of the time. But I have to decompose the problem and make it solve the subproblems one by one.
(The complexity ceiling is much higher for GPT-4, so I use it almost exclusively.)
Impressive. I can’t even translate what I wrote 10 minutes ago.
Woah look at Fancy Lasagna over here with 10 whole minutes of memory
It ain’t much but it’s honest work.
So this is not an LLM?
Seems like it isn’t:
the same technology under the hood of Google Translate