NEW APPROACHES IN MULTIVARIATE DATA ANALYSIS FOR PETROLEOMICS STUDIES
petroleomics; mass spectrometry; organic geochemistry; exploratory analysis; multiplex network; multidimensional scaling.
Interpreting the complex data generated by ultra-high-resolution mass spectrometry is a significant challenge in petroleomics and demands advanced tools for data analysis and visualization. In this context, this work investigated the molecular composition of crude oils from the Sergipe-Alagoas Basin, Brazil, focusing on polar compounds and their relationship with geochemical and physicochemical properties. The study is divided into two chapters, both based on Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) data. In the first chapter, the molecular composition of fourteen oils was studied by positive- and negative-ion electrospray ionization, ESI(±), FT-ICR MS analysis combined with a multiplex network data processing method, which enabled pairwise comparison of the samples and the identification of five distinct groups. The groups were interpreted based on the contents of asphaltenes, resins, and methanol-extracted polar compounds (Polar-MeOH), and on the relative abundance of the compound classes obtained by FT-ICR MS. In addition, the analysis of diagnostic ratios of classic biomarkers supported the proposal of two new ratios (N2/N and NS/N), which reflect the origin and thermal maturity of the studied oils. The second chapter, still under development, addresses the molecular study of ten crude oils from ESI(-) FT-ICR MS data, focusing on naphthenic acidity and its relationship with polar composition. The samples were grouped by multidimensional scaling (MDS) and k-means clustering into four distinct acidity levels, using total acid number (TAN), mass percentage of naphthenic acids (% NA), and naphthenic acid index (NAI). The analysis of the petroleomic data is expected to show that the properties of the molecular formulas shared among the samples, together with the formulas exclusive to the classes, differentiate the oils into the four acidity groups.
The results obtained so far indicate that chemometric tools are effective in differentiating crude oils by molecular composition and acidity, providing information relevant to the petroleum industry, particularly for exploration, production, and refining.
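The grouping step described for the second chapter (MDS followed by k-means on TAN, % NA, and NAI) can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: the three variables are standardized before use, classical (Torgerson) MDS on Euclidean distances stands in for whichever MDS variant the study used, and k-means is a plain Lloyd's iteration written in NumPy. The input values are synthetic placeholders, not data from the study, and all function and variable names are hypothetical.

```python
import numpy as np


def classical_mds(X, n_components=2):
    """Classical (Torgerson) MDS on Euclidean distances between rows of X."""
    # Pairwise squared Euclidean distances between samples.
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    n = d2.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ d2 @ J
    # Keep the leading eigenpairs; clip tiny negative eigenvalues to zero.
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    vals = np.clip(vals[order], 0.0, None)
    return vecs[:, order] * np.sqrt(vals)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct samples.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic placeholder values for ten hypothetical oils (NOT study data).
# Columns: TAN, % NA, NAI. Four made-up acidity levels with small noise.
rng = np.random.default_rng(42)
group_means = np.array([
    [0.2, 0.1, 1.0],
    [0.8, 0.4, 2.0],
    [1.5, 0.9, 3.5],
    [2.5, 1.6, 5.0],
])
group_sizes = [3, 3, 2, 2]
X = np.vstack([
    mean + rng.normal(scale=0.05, size=(size, 3))
    for mean, size in zip(group_means, group_sizes)
])

# Standardize each variable so that no single scale dominates the distances.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

coords = classical_mds(Z, n_components=2)  # 2-D map of the samples
labels, centers = kmeans(coords, k=4)      # four assumed acidity groups
print(labels)
```

In this sketch the clustering runs on the two-dimensional MDS coordinates, so the groups can be read directly off the MDS map. Clustering the standardized variables themselves is an equally plausible design, and the abstract does not say which one was used.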