The imputation of missing tides data in Malaysian tourism areas using basic statistical methods

Authors

  • Nor Zila binti Abd Hamid, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Adlina binti Sahrin, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Nurul Bahiyah binti Abd Wahid, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Nur Hamiza binti Adenan, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Nor Hafizah Binti Md Husin, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Noor Wahida binti Md. Junus, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Nor Suriya binti Abd Karim, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia
  • Rawdah Adawiyah binti Tarmizi, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900, Tanjong Malim, Perak, Malaysia

DOI:

https://doi.org/10.15282/daam.v5i2.9725

Keywords:

Imputation method, Missing data, Basic statistical method, Tides data, Malaysian tourism area

Abstract

This study compares five basic statistical methods for imputing missing tide data in three Malaysian tourist areas: Kota Kinabalu, Penang, and Langkawi Island. The methods are Top Bottom Mean, 6-Hour Mean, 12-Hour Mean, Daily Mean, and Linear Interpolation. The data were recorded hourly over 14 days (336 hours) in 2019 and are complete and continuous. To simulate missing data, 10%, 20%, 30%, 40%, and 50% of the observations were discarded. The methods were evaluated using three performance indices: Correlation Coefficient (CC), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). Overall, Linear Interpolation was the best of the five methods for imputing missing tide data. It is hoped that this study can help the Department of Survey and Mapping Malaysia (JUPEM) impute missing tide data.
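The evaluation procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tide series is a synthetic cosine signal rather than the 2019 observations, the missing hours are drawn at random (the abstract does not say how data were discarded), and all function names (`linear_interpolate`, `rmse`, `mae`, `correlation`, `simulate`) are hypothetical. Only Linear Interpolation is shown, because the abstract does not define the four mean-based methods.

```python
import math
import random


def linear_interpolate(series):
    """Fill None gaps on a straight line between the nearest observed neighbours.

    Gaps at either end are filled with the nearest observed value (an assumption).
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def rmse(actual, estimate):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimate)) / len(actual))


def mae(actual, estimate):
    """Mean Absolute Error."""
    return sum(abs(a - e) for a, e in zip(actual, estimate)) / len(actual)


def correlation(actual, estimate):
    """Pearson correlation coefficient; NaN if either input has zero variance."""
    mean_a = sum(actual) / len(actual)
    mean_e = sum(estimate) / len(estimate)
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimate))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    if var_a == 0 or var_e == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_e)


def simulate(fraction, hours=336, seed=0):
    """Discard a fraction of a synthetic hourly series, impute it, and score it."""
    rng = random.Random(seed)
    # Synthetic stand-in for tide height: two cosine components, NOT real data.
    truth = [
        1.5 + math.cos(2 * math.pi * t / 12.42) + 0.3 * math.cos(2 * math.pi * t / 24.0)
        for t in range(hours)
    ]
    missing = sorted(rng.sample(range(hours), int(fraction * hours)))
    gappy = list(truth)
    for i in missing:
        gappy[i] = None
    filled = linear_interpolate(gappy)
    actual = [truth[i] for i in missing]
    estimate = [filled[i] for i in missing]
    return rmse(actual, estimate), mae(actual, estimate), correlation(actual, estimate)


if __name__ == "__main__":
    for fraction in (0.1, 0.2, 0.3, 0.4, 0.5):
        r, m, c = simulate(fraction)
        print(f"{int(fraction * 100)}% missing: RMSE={r:.4f} MAE={m:.4f} CC={c:.4f}")
```

The scores are computed only at the discarded hours, so they measure how well the imputed values recover the withheld observations. Lower RMSE and MAE and a CC closer to 1 indicate a better method.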

Published

2024-09-30

Section

Research Articles

How to Cite

N. Z. binti A. Hamid, “The imputation of missing tides data in Malaysian tourism areas using basic statistical methods”, Data Anal. Appl. Math., vol. 5, no. 2, pp. 17–22, Sep. 2024, doi: 10.15282/daam.v5i2.9725.
