A Computational Framework for Nutrient Density Assessment and Food Categorization

Rafael Julio Suseno
Kevin Matthew Siregar
Devi Dwi Purwanto

Abstract

In this paper, we explore clustering as a way to build a nutritional map of foods based on their nutrient content. We evaluated two clustering algorithms, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and Agglomerative Clustering, on the foods in the Kaggle Food-101 dataset, which contains nutritional features such as protein, carbohydrate, fat, and energy density. To reduce the dimensionality of the data and make clustering more efficient, we applied Principal Component Analysis (PCA) before clustering. Agglomerative Clustering with PCA produced better clustering quality than DBSCAN, with a higher Silhouette Score (0.41) and a higher Calinski-Harabasz Index (85.97). The clusters produced by Agglomerative Clustering with PCA also separated foods by nutritional profile into high-protein foods, high-carbohydrate foods, and nutritionally balanced foods.
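The comparison described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins generated by a hypothetical helper (`make_synthetic_foods`), and the number of principal components, the Ward linkage, and the DBSCAN parameters `eps` and `min_samples` are placeholder choices, since the abstract does not report them.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.metrics import calinski_harabasz_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def make_synthetic_foods(seed=0, per_group=40):
    """Hypothetical stand-in data: protein, carbohydrate, fat (g/100 g) and energy (kcal/100 g)."""
    rng = np.random.default_rng(seed)
    centers = np.array([
        [25.0, 5.0, 8.0],    # high-protein group (assumed values)
        [6.0, 60.0, 3.0],    # high-carbohydrate group (assumed values)
        [12.0, 25.0, 12.0],  # balanced group (assumed values)
    ])
    macros = np.vstack([rng.normal(c, 2.0, size=(per_group, 3)) for c in centers])
    macros = np.clip(macros, 0.0, None)          # nutrient amounts cannot be negative
    energy = macros @ np.array([4.0, 4.0, 9.0])  # Atwater factors: 4, 4, 9 kcal/g
    return np.column_stack([macros, energy])


def score(X, labels):
    """Return (silhouette, Calinski-Harabasz), or None if fewer than two clusters remain.

    Points labeled -1 (DBSCAN noise) are excluded before scoring.
    """
    mask = labels != -1
    n_clusters = len(set(labels[mask].tolist()))
    if n_clusters < 2 or mask.sum() <= n_clusters:
        return None
    return (
        silhouette_score(X[mask], labels[mask]),
        calinski_harabasz_score(X[mask], labels[mask]),
    )


def compare(X, n_components=2, n_clusters=3, eps=0.5, min_samples=5):
    """Standardize, reduce with PCA, then cluster with both algorithms and score each."""
    X_scaled = StandardScaler().fit_transform(X)
    X_pca = PCA(n_components=n_components).fit_transform(X_scaled)
    agg = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward").fit_predict(X_pca)
    db = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X_pca)
    return {
        "agglomerative": (agg, score(X_pca, agg)),
        "dbscan": (db, score(X_pca, db)),
    }


if __name__ == "__main__":
    for name, (labels, scores) in compare(make_synthetic_foods()).items():
        print(name, scores)
```

Both metrics are computed on the PCA-reduced features, and higher values indicate more compact, better-separated clusters. DBSCAN results are sensitive to `eps` and `min_samples`, so a fair comparison would tune them rather than rely on the placeholders used here.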
