Big data has become a major challenge for astronomers, who must analyze huge data sets from increasingly powerful space instruments. To address this, a Southwest Research Institute (SwRI) team has developed a machine learning tool that efficiently labels large, complex data sets so that deep learning models can sift through them and identify potentially hazardous solar events. The new labeling tool can be applied or adapted to other challenges involving large data sets.
As space-based instrument packages collect more complex data in ever-growing volumes, it has become harder for scientists to process the data and analyze relevant trends. Machine learning (ML) has become a critical tool for processing large, complex data sets, because algorithms learn from existing data to make decisions or predictions that can take into account more information at once than humans can. However, to take advantage of ML techniques, humans need to label all the data first, which is often a colossal endeavor.
“Labeling data with meaningful annotations is a critical step in supervised ML. However, labeling data sets is tedious and time consuming,” said Dr. Subhamoy Chatterjee, a SwRI postdoctoral researcher specializing in solar astronomy and instrumentation and lead author of a paper about these findings published in the journal Nature Astronomy. “New research shows how convolutional neural networks (CNNs), trained on crudely labeled astronomical videos, can be leveraged to improve the quality and breadth of data labeling and reduce the need for human intervention.”
Deep learning techniques can automate the processing and interpretation of large amounts of complex data by extracting and learning complex patterns. The SwRI team used videos of the solar magnetic field to identify regions where strong, complex magnetic fields emerge on the surface of the sun. Such regions are a major precursor to space weather events.
“We trained CNNs using crude labels, manually verifying only our disagreements with the machine,” said co-author Dr. Andrés Muñoz-Jaramillo, a SwRI solar physicist with expertise in machine learning. “We then retrained the algorithm with the corrected data and repeated this process until we were all in agreement. While flux emergence labeling is typically done manually, this iterative interaction between the human and the ML algorithm reduces manual verification by 50%.”
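The train, check, correct, retrain loop described above can be sketched in a few lines. This is only an illustrative toy, not the SwRI team's code: a one-dimensional threshold classifier stands in for the CNN, and the names `train_threshold`, `iterative_relabel`, and `oracle` (the human expert) as well as the sample data are all assumptions made for the example.

```python
def train_threshold(features, labels):
    """Toy stand-in for CNN training (assumption): choose the cut-off
    that agrees with the largest number of current labels."""
    best_t, best_hits = None, -1
    for t in sorted(set(features)):
        hits = sum((x >= t) == bool(y) for x, y in zip(features, labels))
        if hits > best_hits:
            best_t, best_hits = t, hits
    return best_t


def iterative_relabel(features, crude_labels, oracle, max_rounds=10):
    """Retrain, then ask the human (oracle) to check only the samples
    where the model disagrees with the current label."""
    labels = list(crude_labels)
    verified = set()  # samples a human has already looked at
    for _ in range(max_rounds):
        t = train_threshold(features, labels)
        preds = [int(x >= t) for x in features]
        disputed = [i for i, (p, y) in enumerate(zip(preds, labels))
                    if p != y and i not in verified]
        if not disputed:  # model and humans agree: stop
            break
        for i in disputed:
            labels[i] = oracle(i)  # manual check of a disputed sample
            verified.add(i)
    return labels, len(verified)


# Hypothetical data: one feature per video, two crude labels are wrong.
features = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]
crude_labels = [0, 1, 0, 0, 1, 1, 0, 1]

final_labels, n_checked = iterative_relabel(
    features, crude_labels, oracle=lambda i: true_labels[i]
)
print(final_labels, n_checked)
```

In this toy run the human inspects only the disputed samples rather than every sample, which is the source of the saving in manual effort.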
Iterative labeling approaches such as active learning can save significant time and reduce the cost of making big data ML ready. Furthermore, by gradually masking the videos and looking for the moment when the ML algorithm changed its classification, SwRI scientists used the trained ML algorithm to provide an even richer and more useful database.
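One way to picture the masking (hiding) step is the sketch below. It is a guess at the general idea rather than the published procedure: the direction of masking (trailing frames first), the zero fill value, the per-frame flux numbers, and the helper names `classify` and `first_flip_frame` are all assumptions.

```python
def classify(frames):
    """Hypothetical stand-in for the trained CNN: reports 'emergence' (1)
    if any visible frame has a flux value above 5.0."""
    return int(any(f > 5.0 for f in frames))


def first_flip_frame(video, classifier):
    """Mask more and more trailing frames (replace them with 0.0) and
    return the index of the frame whose masking first changes the
    classifier's answer, or None if the answer never changes."""
    full_label = classifier(video)
    for n_masked in range(1, len(video) + 1):
        visible = len(video) - n_masked
        masked = video[:visible] + [0.0] * n_masked
        if classifier(masked) != full_label:
            return visible  # masking this frame flipped the label
    return None


# Hypothetical per-frame flux values for one video.
video = [0.5, 0.8, 1.2, 2.5, 6.1, 9.0, 12.3]
flip_index = first_flip_frame(video, classify)
print(flip_index)
```

The returned index marks the earliest frame the toy classifier needs in order to keep its decision, which is the kind of timing information that can be stored alongside each label.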
“This database will be critical in developing new methodologies to predict the emergence of the complex regions that lead to space weather events, which may increase the lead time we have to prepare for space weather,” said co-author Dr. Derek Lamb of SwRI, who specializes in the evolution of magnetic fields on the sun’s surface.
Subhamoy Chatterjee et al, Efficient labelling of solar flux evolution videos by a deep learning model, Nature Astronomy (2022). DOI: 10.1038/s41550-022-01701-3
Provided by Southwest Research Institute
Citation: Scientists demonstrate machine learning tool to efficiently process complex solar data (2022, July 6), retrieved 6 July 2022 from https://phys.org/news/2022-07-scientists-machine-tool-efficiently-complex.html