A new AI model for summarizing scientific literature can now help researchers wade through the flood of new publications and identify the cutting-edge papers they want to read.
On November 16, the Allen Institute for Artificial Intelligence (AI2) rolled the model out on its flagship product, Semantic Scholar, an AI-powered search engine for scientific papers.
It provides a one-sentence tl;dr (too long; didn’t read) summary under every computer science paper (for now) when users search or visit an author’s page.
In an era of information overload, using AI to summarize text has been a popular natural-language processing (NLP) problem.
The researchers first created a dataset called SciTLDR, which contains roughly 5,400 pairs of scientific papers and corresponding single-sentence summaries.
To supplement these 5,400 pairs, the researchers compiled a second dataset of 20,000 pairs of scientific papers and their titles. They reasoned that because titles are themselves a form of summary, the extra data would further improve the model’s results.
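The idea of mixing the two corpora can be sketched as a simple data-augmentation step. This is an illustrative sketch only, assuming a sequence-to-sequence setup in which each example is tagged with a control token so the model learns both tasks jointly; the function name, token strings, and data layout are hypothetical, not the authors' actual implementation.

```python
# Hypothetical sketch of title-based data augmentation for summarization.
# The control tokens and structure are illustrative assumptions.

def build_training_examples(paper_summary_pairs, paper_title_pairs):
    """Merge paper->summary and paper->title pairs into one training set.

    Each example is prefixed with a control token, so at inference time
    the model can be steered toward producing a tl;dr-style summary.
    """
    examples = []
    for paper, summary in paper_summary_pairs:
        examples.append({"source": "<|TLDR|> " + paper, "target": summary})
    for paper, title in paper_title_pairs:
        examples.append({"source": "<|TITLE|> " + paper, "target": title})
    return examples

# Tiny usage example with placeholder text:
pairs = [("Abstract of paper A ...", "One-sentence summary of A.")]
titles = [("Abstract of paper B ...", "Title of Paper B")]
data = build_training_examples(pairs, titles)
print(len(data))  # 2
```

The appeal of this design is that the 20,000 title pairs act as cheap auxiliary supervision: titles already exist for every paper, so no extra human annotation is needed.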
This was confirmed through experimentation. By comparison, the next best abstractive method compresses scientific papers by an average of only 36.5 times.
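A compression factor like the "36.5 times" cited above is typically the length of a paper divided by the length of its summary. The following sketch illustrates that arithmetic with word counts; the specific lengths are made-up placeholders, not figures from the study.

```python
# Illustrative only: computing a compression factor as the ratio of
# the paper's word count to the summary's word count.

def compression_factor(paper_text: str, summary_text: str) -> float:
    return len(paper_text.split()) / len(summary_text.split())

paper = " ".join(["word"] * 5000)   # hypothetical ~5,000-word paper
summary = " ".join(["word"] * 20)   # hypothetical 20-word summary
print(compression_factor(paper, summary))  # 250.0
```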
In the long term, the team also plans to work on summarizing multiple documents at a time, which could be useful for researchers entering a new field, or perhaps even for policymakers wanting to get up to speed quickly.