MACHINE LEARNING APPROACH TO ETHEREUM FRAUD DETECTION: COMPARATIVE ANALYSIS OF LOGISTIC REGRESSION AND RANDOM FOREST

Authors

  • Nathania Vannesa, University of Pembangunan Nasional Veteran Jawa Timur
  • Nusrat Zahan Nisha, Universiti Malaysia Pahang Al-Sultan Abdullah
  • Shindi Shella May Wara, University of Pembangunan Nasional Veteran Jawa Timur

        DOI:

        https://doi.org/10.15282/

        Keywords:

        Ethereum Fraud Detection, Blockchain Security, Machine Learning, Random Forest, Logistic Regression

        Abstract

        Ethereum's pseudonymous and decentralized nature has made fraud difficult to trace since the network's inception. The fraud pattern examined here involves reuse of the same funds across copied unique receiving addresses, together with differences in the time between a user's first and last transactions. This paper compares the performance of Logistic Regression and Random Forest models in detecting anomalous Ethereum transactions and identifies the features that most strongly influence the predictions. Existing approaches tend to overlook lightweight, interpretable models that could be used for real-time filtering, or they focus on Bitcoin rather than Ethereum. To fill this gap, the study applies machine learning to detect behavioural patterns associated with fraud. The dataset comes from a publicly available repository, Ethereum Fraud Detection. Preprocessing comprised data cleaning, correlation analysis, removal of redundant variables (such as ERC20 token features), and feature scaling with MinMaxScaler; the data were then split into 80% training and 20% testing subsets. Random Forest outperformed Logistic Regression, reaching 90.40% accuracy and an F1-score of 89.17%. The confusion matrix shows that the model identifies non-fraud instances well, while also revealing room for improvement in detecting actual fraud cases. These results offer organizations actionable guidance on deploying real-time detection, prioritizing high-risk transaction signals, and adopting adaptive machine learning architectures. The paper encourages further research into deep-learning-based anomaly detection and demonstrates the potential of feature-based machine learning for strengthening Ethereum security infrastructure.
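
The abstract reports accuracy and F1-score and notes that the confusion matrix shows strong non-fraud detection but weaker fraud detection. The following is a minimal sketch of how those quantities relate, written in plain Python. The function names and the toy labels are hypothetical illustrations, not the paper's code, data, or results; the scaling helper mirrors the per-feature formula that MinMaxScaler applies by default.

```python
# Illustrative sketch only: toy labels and values, not the paper's data or results.

def min_max_scale(values):
    """Rescale one feature column to [0, 1] using (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: nothing to spread, map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the fraud class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = fraud, 0 = non-fraud.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
scaled = min_max_scale([2.0, 4.0, 10.0])

print(confusion_counts(y_true, y_pred))
print(metrics)
print(scaled)
```

In this toy case most non-fraud instances are classified correctly, yet half of the fraud cases are missed, so accuracy is higher than recall on the fraud class. This is why a metric such as F1-score is worth reporting alongside accuracy when the fraud class is the one of interest.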

        Published

        2026-06-09

        How to Cite

        [1]
        Nathania Vannesa, Nusrat Zahan Nisha, and Shindi Shella May Wara, “MACHINE LEARNING APPROACH TO ETHEREUM FRAUD DETECTION: COMPARATIVE ANALYSIS OF LOGISTIC REGRESSION AND RANDOM FOREST”, IJSECS, vol. 12, no. 1, pp. 50–59, Jun. 2026, doi: 10.15282/.
