Goodbye CINECA! Goodbye Li! Last week concluded my (almost) two-month stay at CINECA in Bologna. I worked on an interesting project and got to experience life in Bologna. Project: The first big part of my project was data …

Goodbye CINECA! Goodbye Li! Read More »

In my last blog post I talked about over/under sampling as a method to address unbalanced datasets. Because it is a data transformation method, we must be cautious when evaluating the performance of classifiers trained with over/under sampling. In …

Preprocessing pipeline and train/test set separation Read More »

The goal of my project is to construct a classifier (learner) that will be able to recognize (and possibly predict) abnormal behavior of an HPC system. Naturally, abnormal behavior and faults represent a relatively small part of the overall data collected from …

Learning on unbalanced classes Read More »

A data lake is a store of large quantities of data with little to no structure. It combines data from different sources that were not originally intended to be part of a bigger monitoring infrastructure. In the case of …

Swimming in the data lake Read More »

My name is Martin Molan. I come from Slovenia and I have just finished the first year of the master’s program at IPS at the Jozef Stefan Institute in Ljubljana. Before that I obtained a BA in mathematics from the University of Ljubljana. …

Martin Molan Read More »


Project reference: 1906. Anomaly detection is one of the timeliest problems in managing HPC facilities. Clearly, this problem involves many technological issues, including big data analysis, machine learning, virtual machine manipulation, and authentication protocols. Our research group has already prepared a …

Anomaly detection of system failures on HPC machines using Machine Learning Techniques Read More »
