Analisis Korpus Jaringan Leksikal Seksual Menggunakan Perisian Graphcoll Berdasarkan Laporan Jenayah BH Online (2004–2020)

Corpus-based Analysis of a Sexual Lexical Network Using Graphcoll based on BH Online Crime Reports (2004–2020)

Authors

  • Muhamad Fadzllah Zaini, Universiti Pendidikan Sultan Idris, Malaysia
  • Md. Zahril Nizam Md. Yusoff, Universiti Pendidikan Sultan Idris, Malaysia
  • Darwalis Sazan, Universiti Sains Malaysia, Malaysia
  • Farah Fazlinda Mohamad, Universiti Pendidikan Sultan Idris, Malaysia

DOI:

https://doi.org/10.15282/ijleal.v15i2.11411

Keywords:

Crime, Corpus linguistics, Sexual, Newspaper text, Thematic

Abstract

Seksual dalam konteks jenayah merujuk kepada pelanggaran undang-undang seperti yang digariskan dalam Kanun Keseksaan, Akta 792 dan Akta 840, dan sering dikaitkan dengan implikasi serius terhadap kesejahteraan mental dan sosial masyarakat. Kajian ini bertujuan menganalisis pola kolokasi perkataan seksual dalam Korpus Perlaporan Akhbar Jenayah BHonline (2004–2020) bagi memahami gambaran jenayah seksual dalam wacana media. Data korpus terdiri daripada 3,064,134 token, 48,351 jenis kata dan 7,950 teks, dan dianalisis menggunakan pendekatan Statistik Linguistik Korpus melalui perisian #LancsBoxX. Dapatan menunjukkan bahawa perkataan seksual berkolokasi secara signifikan dengan kata seperti kanak-kanak, penderaan, dan gangguan, yang mencerminkan hubungan rapat antara jenayah seksual dan kerentanan mangsa. Kolokasi tersebut dikelaskan kepada beberapa tema utama iaitu konflik [gangguan (11.9), penderaan (11.7), pedofilia (8)], budaya [gejala (8.1)], serta pencirian sifat [kanak-kanak (11.2), pemangsa (8.4)]. Kata lain seperti amang, penganiayaan, bapa, anak, polis, dan guru turut muncul dalam jaringan kolokasi dan menggambarkan konteks pelaku, mangsa serta institusi terlibat. Secara keseluruhan, pola kolokasi ini memperlihatkan bagaimana media membingkai isu jenayah seksual melalui pilihan leksikal yang konsisten. Hasil kajian dapat membantu pihak berkepentingan merangka dasar serta strategi pencegahan yang lebih berkesan dalam menangani jenayah seksual di Malaysia.

Sexual acts in the context of crime refer to conduct that violates the law as provided under the Penal Code, Act 792, and Act 840, and are often associated with psychological harm and long-term social implications. This study analyses the collocational patterns of the word seksual ("sexual") in the BHonline Crime News Corpus (2004–2020) to understand how sexual crimes are represented in Malaysian media discourse. The corpus comprises 3,064,134 tokens, 48,351 word types, and 7,950 texts, and was analysed using a quantitative Corpus Linguistics approach with the #LancsBox X software. The findings show that seksual collocates significantly with terms such as children, abuse, and harassment, indicating strong lexical associations between sexual crime and the vulnerability of victims. These collocations are classified into three main themes: conflict [harassment (11.9), abuse (11.7), paedophilia (8)], socio-cultural framing [symptom (8.1)], and characterisation [children (11.2), predator (8.4)]. Additional collocates such as molestation, maltreatment, father, child, police, and teacher also appear in the collocation network and illustrate the roles of perpetrators, victims, and institutional actors. Overall, the collocational patterns show how the media frames sexual crime through consistent, recurring lexical choices. These findings can help policymakers and relevant agencies design more effective policies and prevention strategies for addressing sexual crime in Malaysia.
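As a rough illustration of what a collocation score of this kind measures, the following Python sketch counts the words that occur within a fixed span of a node word and scores each one with mutual information (MI). The abstract does not state which association measure, span, or threshold the study used, so the MI formula, the window-size correction, the function names, and the toy sentence are all assumptions made for illustration. This is not the GraphColl implementation.

```python
import math
from collections import Counter


def collocate_counts(tokens, node, span=5):
    """Count tokens that occur within `span` positions left or right of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - span)
        end = min(len(tokens), i + span + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def mi_score(tokens, node, collocate, span=5):
    """Mutual information: log2(observed / expected co-occurrence).

    The expected frequency uses a window-size correction (2 * span), which is
    an assumption of this sketch, not a detail taken from the study.
    Returns None when the two words never co-occur inside the window.
    """
    observed = collocate_counts(tokens, node, span)[collocate]
    if observed == 0:
        return None
    freq = Counter(tokens)
    expected = freq[node] * freq[collocate] * (2 * span) / len(tokens)
    return math.log2(observed / expected)


# Invented toy data, not drawn from the BHonline corpus.
toy_tokens = (
    "polis siasat kes gangguan seksual di sekolah "
    "polis tahan suspek penderaan seksual kanak-kanak"
).split()

for word, count in collocate_counts(toy_tokens, "seksual", span=1).most_common():
    print(word, count, round(mi_score(toy_tokens, "seksual", word, span=1), 2))
```

On a real corpus the same counts would normally be filtered by a minimum collocate frequency and a score threshold before being drawn as a network. Those cut-offs are choices the analyst makes and are not specified in the abstract.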

Author Biographies

  • Md. Zahril Nizam Md. Yusoff, Universiti Pendidikan Sultan Idris, Malaysia

    Md. Zahril Nizam Md. Yusoff is a senior lecturer in the Department of Malay Language and Literature, Faculty of Languages and Communication, Universiti Pendidikan Sultan Idris, 35900, Tg. Malim, Perak, Malaysia, with expertise in discourse analysis, forensic linguistics and syntax, and is active in efforts to raise the standing of the Malay language at the level of the Ministry of Higher Education.

  • Darwalis Sazan, Universiti Sains Malaysia, Malaysia

    Darwalis Sazan is a senior lecturer at the School of Distance Education, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia, with expertise in discourse analysis, and is active in research and publication.

  • Farah Fazlinda Mohamad, Universiti Pendidikan Sultan Idris, Malaysia

    Farah Fazlinda Mohamad is a senior lecturer in the Department of Communication and Media, Faculty of Languages and Communication, Universiti Pendidikan Sultan Idris, 35900, Tg. Malim, Perak, Malaysia, with expertise in communication and mass media, and is active in publication, research and consultancy at the industry level.

Published

2025-12-22

Issue

Vol. 15 No. 2 (2025)

Section

Articles

How to Cite

Zaini, M. F., Md. Yusoff, M. Z. N., Sazan, D., & Mohamad, F. F. (2025). Analisis Korpus Jaringan Leksikal Seksual Menggunakan Perisian Graphcoll Berdasarkan Laporan Jenayah BH Online (2004–2020): Corpus-based Analysis of a Sexual Lexical Network Using Graphcoll based on BH Online Crime Reports (2004–2020). International Journal of Language Education and Applied Linguistics, 15(2), 95–110. https://doi.org/10.15282/ijleal.v15i2.11411
