Adam Conner-Simons introduces an MIT CSAIL system that aims to help linguists decipher languages without prior knowledge of their relations to other languages.
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have recently developed a new system that can automatically decipher a lost language without needing prior knowledge of its relation to other languages.
They also showed that their system can itself determine relationships between languages, and they have used it to corroborate recent scholarship suggesting that the language Iberian is not actually related to Basque.
The team’s goal is for the system to be able to decipher lost languages that have eluded linguists for decades, using just a few thousand words.
By incorporating linguistic constraints, MIT Professor Regina Barzilay and MIT PhD student Jiaming Luo developed a decipherment algorithm that can handle the vast space of possible transformations and the scarcity of a guiding signal in the input.
The team applied their algorithm to Iberian, considering Basque as a candidate relative alongside less likely candidates from the Romance, Germanic, Turkic, and Uralic families.
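To make the idea of comparing a lost language against several candidate relatives concrete, here is a minimal toy sketch. It is not the CSAIL team's algorithm: it swaps in a simple weighted edit distance, in which substitutions between similar sounds cost less, and ranks candidate languages by how closely their vocabulary matches the undeciphered words. The word lists, the language names `lang_a` and `lang_b`, the `CLOSE_PAIRS` table, and all function names are invented for illustration and do not come from real Iberian or Basque data.

```python
from typing import Dict, List, Tuple

# Assumption: a hand-picked set of sound pairs treated as "close" in pronunciation.
CLOSE_PAIRS = {frozenset("pb"), frozenset("td"), frozenset("kg"), frozenset("sz")}


def sub_cost(a: str, b: str) -> float:
    """Cost of turning character a into b: free if equal, cheap if similar."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in CLOSE_PAIRS:
        return 0.5
    return 1.0


def weighted_edit_distance(src: str, tgt: str) -> float:
    """Edit distance with unit insert/delete cost and sound-aware substitution."""
    prev = [float(j) for j in range(len(tgt) + 1)]
    for i, a in enumerate(src, start=1):
        curr = [float(i)]
        for j, b in enumerate(tgt, start=1):
            curr.append(min(
                prev[j] + 1.0,                 # delete a
                curr[j - 1] + 1.0,             # insert b
                prev[j - 1] + sub_cost(a, b),  # substitute a -> b
            ))
        prev = curr
    return prev[-1]


def closeness(lost_words: List[str], known_vocab: List[str]) -> float:
    """Mean length-normalized distance to each word's best match (lower is closer)."""
    total = 0.0
    for word in lost_words:
        total += min(
            weighted_edit_distance(word, known) / max(len(word), len(known))
            for known in known_vocab
        )
    return total / len(lost_words)


def rank_candidates(
    lost_words: List[str], candidates: Dict[str, List[str]]
) -> List[Tuple[str, float]]:
    """Sort candidate languages from closest to most distant."""
    scores = {name: closeness(lost_words, vocab) for name, vocab in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Invented placeholder data, not real words from any language.
lost = ["pata", "kito"]
candidates = {
    "lang_a": ["bada", "gido"],
    "lang_b": ["mune", "sorl"],
}
ranking = rank_candidates(lost, candidates)
```

In this toy setup, a candidate whose words differ from the lost words only by similar sounds scores as closer than one with no resemblance. A real system has to cope with far more possible transformations and much weaker signal than this sketch suggests.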
In future work, the team hopes to expand beyond connecting texts to related words in a known language, an approach referred to as cognate-based decipherment.
The team’s new approach would involve identifying the semantic meaning of words even when researchers do not know how to read them.