Few areas of human activity have embraced generative artificial intelligence with as much enthusiasm as academia and scientists in general, both in universities and research institutes as well as in companies.
Physicists were not far behind, and the lack of results in the search for new physics – explanations that go beyond current physics, whose models and theories are known to be incomplete – led the scientific community to believe that it had finally found the tool that would help to unravel the secrets that still stubbornly refuse to be understood today.
There are, in fact, reasons for enthusiasm, especially when it comes to analyzing enormous masses of data, from cosmological data to those resulting from particle impacts in large colliders.
But there is a major problem when trying to use artificial intelligence to generate new ideas and explanations: AI models are trained with data and explanations that we already have, which means they are almost entirely dependent on what we already know – including the aggregations and interpretations we make of the data.
Veena Krishnaraj and Adrian Bayer, from Princeton University in the USA, then decided to investigate this effect and try to find ways to overcome it. But the results were not encouraging.
Unexplained effects receive trivial explanations.
Artificial intelligence is already being widely used in cosmology to analyze the Universe, but testing new hypotheses that attempt to go beyond the standard cosmological model, known as the Lambda-CDM (Λ-CDM) model , remains extremely computationally demanding.
Although the ΛCDM describes many properties of the Universe – from its expansion to the distribution of galaxies – physicists know that it is incomplete: Experimental observations show that phenomena such as massive neutrinos, anomalous behaviors of gravity, and the very notions of dark matter and dark energy are demanding new physics, which has explanations that the current model does not provide.
Testing these alternatives requires running a huge number of high-precision simulations of virtual universes under different physical assumptions, which often demands immense computational resources. This is where scientists believed that artificial intelligence could make a difference.
It turns out that effects unexplained by current theory often closely resemble patterns already explained by the standard cosmological model. In these cases, AI tends to interpret new information using categories learned during training, making it more difficult—rather than easier—to recognize genuinely new effects.
Researchers observed this behavior in simulations involving massive neutrinos. Certain effects produced by the neutrino’s mass closely resemble variations associated with a parameter of the ΛCDM model known as Sigma 8 (σ8), which describes the intensity with which matter clumps together in the Universe. As a result, the AI failed to distinguish between the different characteristics of the neutrinos. This is what experts call “negative transfer.”
“Negative transfer is not random. It is driven by underlying physical degeneracies in the model,” Krishnaraj explained. “In other words, different physical parameters can produce very similar observable effects, making it difficult for AI to correctly distinguish between them. Therefore, this is something we need to take into account and try to mitigate.”
Transfer learning
The team has already made progress in this direction, and has tested whether a technique known as transfer learning – a method in which AI systems reuse knowledge acquired in one task to accelerate learning in another – could make this process more efficient.
To do this, the researchers first trained a neural network on simulations based on the ΛCDM model – this is known as pre-training – and then adapted it to more complex cosmological models, which include possible new physics. “It’s basically a shortcut,” explained Bayer. “Normally, people train AI directly on the most computationally expensive simulations. What we do instead is first use simpler, less expensive ΛCDM simulations to give the AI an idea of what’s happening, and only then move on to the more complex models.”
In fact, the technique yielded results: Transfer learning reduced the number of simulations required by more than ten times, which is excellent, although insufficient to eliminate the need for supercomputers more powerful than those currently available.
The big problem is that the simulations clearly showed the emergence of the negative transfer phenomenon, which was far from being a random phenomenon, as the researcher had already pointed out: “Pre-training can accelerate inference, but it can also hinder the learning of new physical phenomena,” the team concluded.
Source: www.inovacaotecnologica.com.br
Source link
