Machine Learning Analysis on MNIST and California Housing Datasets

Authors

  • Hussein Shaa'lan

Keywords:

Machine Learning, Classification, Regression, MNIST, California Housing, Logistic Regression, Random Forest, Model Evaluation, Predictive Modeling

Abstract

This research compares machine learning models on two basic problem types: classification and regression. For classification, a Logistic Regression model was trained on the MNIST handwritten-digit dataset and reached 91.76% accuracy, a good baseline for a linear model. For regression, Linear Regression and a Random Forest Regressor were applied to the California Housing data. The Random Forest model clearly outperforms Linear Regression, with an R² score of 0.8088 versus 0.6195 and a lower prediction error. This advantage reflects the ability of ensemble methods to capture non-linear, complex relationships in real-world data. The results underline the need to choose models according to the nature of the data: simple linear models can be effective on well-structured datasets, whereas complex regression tasks call for more sophisticated models. The work offers a comparative view of how model complexity affects predictive performance across machine learning problems.
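The regression comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's actual pipeline: the helper name `compare_regressors`, the 80/20 split, the seed, the forest size, and the synthetic stand-in data are all hypothetical choices made here so the example runs without downloading anything. It uses scikit-learn's `LinearRegression` and `RandomForestRegressor` and reports R² and RMSE on a held-out split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def compare_regressors(X, y, seed=0):
    """Fit a linear model and a random forest on one split; return test metrics.

    Hypothetical helper: split ratio, seed, and forest size are assumptions,
    not settings reported in the abstract.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "r2": r2_score(y_test, pred),
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        }
    return results


# Synthetic stand-in data with a non-linear target (an assumption for
# illustration only; this is NOT the California Housing data).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(600, 3))
y = X[:, 0] ** 2 + np.sin(2.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=600)

results = compare_regressors(X, y)
for name, scores in results.items():
    print(f"{name}: R2={scores['r2']:.3f}, RMSE={scores['rmse']:.3f}")
```

To try the same comparison on the real dataset, the arrays could be loaded with `sklearn.datasets.fetch_california_housing(return_X_y=True)`, which downloads the data on first use. The scores obtained will depend on the split and hyperparameters, so they need not match the figures reported in the abstract.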

Published

2026-04-21

Section

Articles