The paper “Unsupervised Data Pattern Discovery on the Cloud”, written with the support of the NEANIAS team, has been published.
The article was presented as a discussion paper at the 20th International Conference of the Italian Association for Artificial Intelligence (AIxIA2021) and published on December 3 on CEUR-WS.org.
- Authors: Thomas Cecconello [1], Lucas Puerari [1] and Giuseppe Vizzari [1].
- Affiliations: [1] Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Milan, Italy.
Abstract
Scientific research implies the production of data describing phenomena still not studied and well understood. Sometimes the amount and rate of generation of produced data can be overwhelming, and anyway tools supporting a computer assisted analysis of scientific data can support systematic forms of data driven analysis. Machine learning can be an instrument in an overall flow including domain experts and computer scientists. Adopted machine learning approaches need to be unsupervised, employing just the input data as a teacher. We propose a two-step workflow: (i) achieving a compact representation of elements of the dataset by means of representation learning techniques, shifting the analysis from cumbersome representations to compact vectors in a latent space, and (ii) clustering points associated to instances to suggest patterns to the domain experts that will evaluate their potential meaning within the domain. The paper presents the rationale of the approach within a cloud-based setting, and first experiments on an image dataset from the literature.
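As a rough illustration of the two-step workflow described in the abstract, the sketch below first compresses each instance into a short latent vector and then clusters those vectors. It is not the authors' implementation: the abstract does not name the representation learning technique, so plain PCA is used here as a simple stand-in, followed by a basic k-means. The synthetic data and the function names `pca_embed` and `kmeans` are assumptions made for this example only.

```python
import numpy as np

def pca_embed(X, n_components):
    """Step (i), stand-in: project rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(Z, k, n_iter=50):
    """Step (ii): Lloyd's k-means on the latent vectors Z."""
    # Farthest-first initialisation: start from the first point, then
    # repeatedly add the point farthest from the centers chosen so far.
    centers = [Z[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old one if a cluster is empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for a dataset: two well-separated groups in 64 dimensions.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 64)),
    rng.normal(5.0, 1.0, size=(100, 64)),
])

Z = pca_embed(X, n_components=2)     # compact vectors in a latent space
labels, centers = kmeans(Z, k=2)     # candidate patterns for expert review
```

In a setting like the one the abstract describes, the cluster labels would be handed to domain experts as suggestions to evaluate, not treated as final answers.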
Acknowledgments.
The authors want to acknowledge support by the NEANIAS (Novel EOSC Services for Emerging Atmosphere, Underwater & Space Challenges) project, funded by the EC Horizon 2020 research and innovation programme under Grant Agreement No. 863448.
Get the article at http://ceur-ws.org/Vol-3078/paper-52.pdf.




