AI learns language

AI can self-learn human language norms and patterns

At the start of this year, scientist Gary Marcus told CNBC that the most important AI breakthrough in 2022 “will likely be one that the world doesn’t immediately see”. The ‘suspense’ created by Marcus’s statement rests on AI’s ability to learn on its own, and it builds with each new AI discovery this year.

The field of AI has indeed seen a wave of advances in 2022.

  • For example, Meta researchers have recently developed artificial intelligence that, by analyzing brainwaves, can “hear” what people are hearing.
  • To the despair of human artists, an AI-created artwork won ‘the first-place blue ribbon’ and its $300 prize. AI is now creating art, music, and articles.
  • A couple of days earlier, Google’s DeepMind trained virtual bots to play matches of 2v2 football with one another in a bid to get AI to work together in teams.

And most recently, researchers at MIT, Cornell University, and McGill University have taken a step further in this direction by developing an AI system that can learn human language norms and patterns on its own.

Can AI learn human language norms and patterns on its own?

According to the findings published in Nature Communications, the machine-learning model is given words, along with examples of how those words change form in one language to express grammatical functions such as tense, case, or gender. From these, it generates rules that explain why the forms of those words change. For example, it might learn that the letter “a” must be added to the end of a word in Serbo-Croatian to turn the masculine form into the feminine.
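To make the idea of "learning a rule from examples" concrete, here is a minimal Python sketch. It is not the researchers' model: it only checks whether a single shared suffix explains every word pair. The function name `infer_suffix_rule` and the adjective pairs are illustrative assumptions, not taken from the paper.

```python
def infer_suffix_rule(pairs):
    """Return the one suffix that maps every base form to its inflected form.

    Returns None when the pairs cannot be explained by a single suffix.
    """
    suffixes = set()
    for base, inflected in pairs:
        if not inflected.startswith(base):
            return None  # the change is not pure suffixation
        suffixes.add(inflected[len(base):])
    # A single shared suffix is the rule; anything else is ambiguous.
    return suffixes.pop() if len(suffixes) == 1 else None


# Illustrative masculine/feminine adjective pairs (hypothetical input,
# chosen for this sketch rather than taken from the paper).
PAIRS = [("mlad", "mlada"), ("star", "stara"), ("nov", "nova")]

print(infer_suffix_rule(PAIRS))  # prints: a
```

A real system has to cope with sound changes, prefixes, and exceptions, which is where a richer search over possible rules becomes necessary.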

This model can also learn higher-level linguistic patterns that can be used across multiple languages, enhancing its performance.

The model was trained and tested on problems from linguistics textbooks that covered 58 different languages. Each problem included a set of words and the corresponding word-form changes. The model produced a correct set of rules to explain those word-form changes in 60% of the cases.

“One of the motivations of this work was our desire to study systems that learn models of datasets that are represented in a way that humans can understand,” said Kevin Ellis, an assistant professor of computer science at Cornell University and the paper’s lead author.

To develop an AI system that could automatically generate a model from many related datasets, the researchers chose to analyze the relationship between phonology (the study of sound patterns) and morphology (the study of word structure).

The researchers devised a model that could learn a grammar, or set of rules for creating words, using a machine-learning technique known as Bayesian Program Learning. With this approach, the model solves a problem by writing a computer program.

In this case, the program is the grammar that the model judges to be the most likely explanation of the words and meanings in a linguistics problem. The researchers built the model using Sketch, a well-known program synthesizer created by Solar-Lezama at MIT.
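The notion of choosing the "most likely" grammar can be illustrated with a toy Python sketch. It does not use Sketch and is not the authors' implementation. It assumes a hand-written list of candidate rules, a prior that favors shorter rules, and an all-or-nothing likelihood. All names (`suffix_rule`, `lookup_rule`, `log_posterior`, `best_rule`) and the word pairs are hypothetical.

```python
import math


def suffix_rule(suffix):
    """A tiny 'program': append a fixed suffix to any word."""
    return {"name": f"append '{suffix}'",
            "cost": 1 + len(suffix),  # rough description length
            "apply": lambda word: word + suffix}


def lookup_rule(pairs):
    """A 'program' that memorizes every pair instead of generalizing."""
    table = dict(pairs)
    return {"name": "memorize every pair",
            "cost": sum(len(a) + len(b) for a, b in pairs),
            "apply": lambda word: table.get(word)}


def log_posterior(rule, pairs):
    # Prior: each extra unit of description length halves the probability.
    log_prior = -rule["cost"] * math.log(2)
    # Likelihood: 1 if the rule reproduces every observed form, else 0.
    explains = all(rule["apply"](base) == form for base, form in pairs)
    log_likelihood = 0.0 if explains else -math.inf
    return log_prior + log_likelihood


def best_rule(candidates, pairs):
    """Pick the candidate with the highest (unnormalized) posterior."""
    return max(candidates, key=lambda rule: log_posterior(rule, pairs))


# Hypothetical data for this sketch, not taken from the paper.
pairs = [("mlad", "mlada"), ("star", "stara"), ("nov", "nova")]
candidates = [suffix_rule("e"), lookup_rule(pairs), suffix_rule("a")]

print(best_rule(candidates, pairs)["name"])  # prints: append 'a'
```

Both the memorizing rule and the suffix rule reproduce the data, but the shorter one wins under the prior. This is the sense in which a compact grammar counts as the better explanation in this sketch.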

When the model was tested on 70 textbook problems, it found a grammar that matched the complete set of words in the problem in 60% of the cases, and it correctly matched most of the word-form changes in 79% of the cases.

The model frequently produced surprising results. In one case, it found not only the expected answer to a Polish language problem but also another valid answer that exploited an error in the textbook. In Ellis’s view, this shows how well the model can “debug” linguistics analyses.


In the future, the researchers hope to use this method to find unexpected solutions to problems in other academic fields. They might also apply the technique in other situations where higher-level knowledge can be applied across interrelated datasets. For instance, according to Ellis, they might develop a method to infer differential equations from data on the motion of numerous objects.

Through continuous, untiring research, developments in AI are now turning, one after another, into the significant breakthroughs that Marcus had in mind.
