SKETCHING THE SEMANTIC CHANGE OF JAHANAM AND HIJRAH: A CORPUS-BASED APPROACH TO MANUSCRIPTS OF ARABIC-INDONESIAN LEXICON

A significant number of words have been borrowed from Arabic into Indonesian. Among them are the words jahanam and hijrah, which can be traced diachronically in literary and religious texts. This paper scrutinizes the semantic change of jahanam and hijrah as they are used in both old manuscripts and modern texts. To observe their semantic change, the collocations and concordances of their contexts were analysed. The research data were taken from the Malay Concordance Project (MCP), which comprises 165 classical Malay texts including some Islamic texts, from the Corpora Collection of Leipzig University, and from WebCorp Live of Birmingham City University. Using corpus linguistics methods, this research demonstrates how words change semantically through time. The results of this study can be used as material for the preparation of an Indonesian-Arabic etymological dictionary.


Introduction
The manuscripts scattered across the Indonesian archipelago have brought a great deal of foreign vocabulary with them. Many foreign words were then absorbed into Malay (Liaw Yock Fang, 1991; Riddell, 2012). Some were absorbed in complete form (keeping the exact spelling and pronunciation), while others were adjusted in spelling, pronunciation, or both. The arrival of Islam in the archipelago had a great influence not only on religion and culture but also on language (Pusat Pembinaan dan Pengembangan Bahasa, 1996; Berg, 2007; Jones et al., 2007; van Dam, 2010). This can be seen from the many Arabic words that have enriched Malay and Indonesian vocabulary through absorption and borrowing (Julul et al., 2019). In 1996, the book Senarai Kata Serapan dalam Bahasa Indonesia (List of Loanwords in Indonesian) recorded 1,495 loanwords from Arabic. That number may have increased since then.
The meanings of many Arabic loanwords have now deviated far from their original meanings. Such change is natural because a language, including its vocabulary, always develops and changes. Contact with outside cultures causes some words to change in sound, form, and/or meaning, especially when language users have no prior knowledge of the etymology or history of the word.
Scholars currently tend to study one manuscript at a time, carefully, to extract its core content and lessons, because studying many manuscripts takes an unpredictable amount of time. An effective and efficient method for studying large numbers of manuscripts is corpus linguistics, which can reduce work that would take months or even years to a matter of minutes. Corpus linguistics is a research method that utilizes corpus data, that is, a collection of pieces of language text in electronic form, selected according to external criteria to represent, as far as possible, a language or language variety as a source of data for linguistic research (Sinclair, 2004). O'Donnell (2008) further explains that corpus data contain authentic language data designed and collected according to sampling procedures in an electronic or machine-readable form. The data are representative of a language and are used for linguistic investigation. In corpus linguistics, digitalization does not mean merely transferring a text from print to digital media; it means making the manuscripts readable by corpus tools. To be used in a corpus, manuscripts must be converted into plain text files.
Researchers interested in studying manuscripts through corpora are now served by the Malay Concordance Project (MCP). MCP is a corpus of classical Malay texts developed by Ian Proudfoot (1991), comprising 165 texts and 5.8 million words, including 140,000 verses, dating from the 14th to the 20th century. The texts were collected from reputable philologists (Gallop, 2013). The corpus provides useful information about the contexts in which words are used, where particular terms or names occur in texts, and patterns of morphology and syntax.
Many studies have been conducted using MCP, including linguistic research. Among them are Siaw-Fong Chung's (2011) corpus-based study of the uses of the affix "ter" in Malay, Wade and Li's (2012) study of Anthony Reid and the study of the Southeast Asian past, and Maziar's (2012) study of Malay kingship in Kedah, its religion, trade, and society. MCP has made research on Malay and Indonesian much easier. There are other corpora of Malay and Indonesian, but the texts they contain are not manuscripts. Their oldest texts date from the 20th century, and some originated in digital form. For some research, this does not cover every need. One way to address this is to compare two or more corpora from different eras.
A corpus like MCP can be arranged diachronically so that users can see from the concordances or collocates whether a word keeps the same meaning or takes on a different one over time. We can also compare a corpus of old data with a corpus of more recent data. From the dates shown in the corpus we can also see when the changes occurred.
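The diachronic arrangement described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name hits_by_century, the (year, sentence) record format, and the three toy sentences are all hypothetical and are not drawn from MCP or any other corpus used in this study.

```python
from collections import defaultdict


def hits_by_century(records, node):
    """Group the sentences that contain `node` by the century of their source."""
    grouped = defaultdict(list)
    for year, text in records:
        tokens = text.lower().split()  # naive whitespace tokenisation
        if node in tokens:
            century = (year - 1) // 100 + 1  # e.g. 1650 -> 17th century
            grouped[century].append(text)
    return dict(grouped)


# Hypothetical toy records (year, sentence); NOT real corpus data.
records = [
    (1650, "masuk ke dalam neraka jahanam"),
    (1890, "api neraka jahanam itu"),
    (2013, "sambal jahanam ini pedas sekali"),
]

result = hits_by_century(records, "jahanam")
for century in sorted(result):
    print(century, result[century])
```

Reading the grouped lines century by century is one simple way to notice when a word starts to appear in new contexts; a real study would still inspect the full concordance lines rather than rely on such a grouping alone.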
One example of the use of digitized Islamic texts for linguistic research is the study by Nor Hashimah et al. (2012) on the change of meaning of the word alim in Malaysian. Although the focus of that research is cognitive semantic analysis, the words analysed and the data sources used are examples of how digital Islamic texts can be utilized in transdisciplinary linguistic research. Nor Hashimah et al. analysed how the meaning of the Arabic-origin word alim has changed through time by looking at the concordance of alim in the UKM-DBP (Malaysian Language and Library Council) corpus. The UKM-DBP corpus is a 5-million-word data set taken from the UKM-DBP data bank, which comprises newspapers, magazines, and books. Unfortunately, as mentioned previously, the corpus contains many Islamic texts but none of them are old manuscripts. From the corpus, they found that the word alim has four new meanings in addition to its original meaning. From that finding they continued to the cognitive semantic analysis.
To the best of our knowledge, no previous research has investigated the semantic change of Indonesian words of Arabic origin using corpus methods. Using corpora to analyse semantic change is advantageous in three ways: (1) it enables us to track the different uses of a word across time, (2) it gives us authentic evidence of changes in a language through time, and (3) it helps us to identify new meanings of a word. This paper therefore investigates the Indonesian lexicon of Arabic origin. More specifically, the words jahanam and hijrah are tracked. These two words have recently been used frequently in the media. This study aims to describe how these words have changed semantically from their Arabic originals and to trace them diachronically. Additionally, this research demonstrates how Islamic manuscripts can be used for study in applied linguistics by exploiting the available online corpora.

Method
To analyse the semantic change of the words under study, we used a corpus-based approach. Three corpora were used in this research, namely MCP, the Indonesian corpus from the Corpora Collection of Leipzig University (Leipzig Corpora), and WebCorp Live. Among the Islamic manuscripts stored in MCP that were used as sources of data in this study are the following:
Tuhfat al-Nafis (MS 1889+)
Muhimmat al-Nafa'is (1892)
al-Imam (1906-1908)
Itqān al-Mulk bi Ta'dīl al-Sulūk (1911)
Leipzig Corpora is a collection of online corpora developed by the Institute of Computer Science, University of Leipzig, Germany, which contains 358 corpora in 252 languages. The corpus data were taken from internet pages. All corpora were processed in the same way so that each contains well-formed sentences in its language. The sentences were then sorted randomly so that the original texts cannot be reconstructed in full and the use of web data does not violate copyright. The Indonesian corpus in Leipzig Corpora consists of 74,329,815 sentences, 7,964,109 types, and 1,206,281,985 tokens. The data come from the years 2010 to 2014. The year of appearance is recorded in the corpus, so the data can be sorted chronologically.
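For readers unfamiliar with the terms, a token is every running word in a corpus and a type is every distinct word form. The short Python sketch below illustrates the distinction on a hypothetical five-word sample and then computes the type-token ratio from the Leipzig figures reported above. The helper name type_token_counts and the sample phrase are assumptions made for illustration only.

```python
def type_token_counts(tokens):
    """Return (number of distinct word forms, number of running words)."""
    return len(set(tokens)), len(tokens)


# Hypothetical sample phrase, used only to show the difference.
sample = "neraka jahanam dan api neraka".split()
n_types, n_tokens = type_token_counts(sample)
print(n_types, n_tokens)  # "neraka" occurs twice but counts as one type

# Figures reported for the Indonesian corpus in Leipzig Corpora.
leipzig_types = 7_964_109
leipzig_tokens = 1_206_281_985
ratio = leipzig_types / leipzig_tokens
print(f"type-token ratio: {ratio:.4f}")
```

A ratio this small is expected for a very large corpus, because frequent words are repeated many times while the stock of distinct forms grows much more slowly.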
WebCorp Live is an online corpus project developed by Birmingham City University. Like Leipzig Corpora, WebCorp Live retrieves data from web pages on the internet. While Leipzig Corpora uses its own search engine, WebCorp uses the Bing search engine for the Indonesian language. In addition, because WebCorp is a tool that is connected directly to the internet at the time of the search, the amount of data cannot be known in advance; it depends on what is available online. The range of years also depends on the availability of data: the earliest data are from the early 2000s and the most recent are from the current year.
This research used collocation and concordance features as the basis for analysing jahanam and hijrah in their contexts. Collocation was used to describe the broader context of these words. Concordance provides the overall uses of the lexical items, whereas collocation determines the textual behaviour of the node word. Through concordance the lexical items are explored with specific reference to the dynamic nature of their meaning, while collocation specifies the co-occurrence of various lexical items with the node words (Zahra & Abbas, 2018).
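The two features can be illustrated with a minimal Python sketch. It is not the implementation used by MCP, Leipzig Corpora, or WebCorp Live, whose internals are not described here; the function names kwic and collocates, the window size, and the toy sentence are all assumptions made for illustration. The collocate function returns raw co-occurrence counts only, not a statistical association measure.

```python
from collections import Counter


def kwic(tokens, node, window=3):
    """Keyword-in-context lines: (left context, node word, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=3):
    """Raw counts of words occurring within `window` tokens of the node word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts


# Hypothetical toy sentence; NOT taken from any of the corpora.
tokens = "api neraka jahanam itu kekal dan siksa neraka jahanam amat pedih".split()

for left, node, right in kwic(tokens, "jahanam", window=2):
    print(f"{left:>15} [{node}] {right}")

print(collocates(tokens, "jahanam", window=2).most_common(3))
```

In this toy example neraka is the most frequent neighbour of jahanam, which mirrors on a very small scale the kind of pattern the concordance and collocation tools make visible across a whole corpus.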

Results and Discussion
Semantic change of Jahanam
The first word analysed in this research was "jahannam", or "jahanam" in the Indonesian context. In its original usage in the Quran and Islamic texts, jahannam is the name of a hell. In the MCP data, 84 concordance lines contain the word jahanam (with one m) or Jahannam (with two). Those concordance lines come from several texts. In the data from the 14th to the 19th century, jahan(n)am most often collocates with neraka (34 times). The two words form the phrase neraka jahanam, as can be seen in the following concordance lines. In other data, the word jahan(n)am is paired with words relating to sin, the torment of hell, and religion. Only in the 20th century is the word jahanam placed in contexts other than religion, as can be seen in the following concordances.
In the other corpus, Leipzig Corpora, the search for the word jahanam yields two different results. The first is the result for jahanam in Islamic texts; the second is the result for jahanam in general texts. The first result shows the collocations in Figure 1. Digitised Islamic texts in Leipzig Corpora associate the word jahanam strongly with hell, devil, eternal, fire, punishment, torment, and Allah. However, more recent corpus data show quite a different usage of the word in context, as depicted in Tables 4 and 5.
The word jahanam in recent corpora is no longer used solely in religious contexts; its use is now widespread. The word is used in any context associated with evil or crime. Some people even use the word to name and describe food. The meaning of jahanam has expanded to 'evil', 'strong as hell', and 'hot as hell' (see Table 5).
For Indonesians who understand Arabic, this may be neither new nor interesting, but for Muslims in other countries and cultures it might be shocking. Some cultures might never associate good, tasty food with hell. For linguists, it is the expanded meaning, and what lies beyond it, that draws attention. This is a researchable phenomenon.

Sketching the meaning of Hijrah
Another word widely used among young Indonesian millennials is hijrah. Etymologically, hijrah means moving from one place to another. In Islam, hijrah refers to the departure of the Prophet Muhammad from Mecca to Medina with the aim of saving himself and spreading Islamic teachings (as shown in Figure 3). Hijrah is also related to the Islamic calendar, which begins with the Prophet Muhammad's move to Medina.
Data from the 14th to the 19th century in MCP show that hijrah, or its variant hijrat, occurs in many texts and refers to the year of the Islamic calendar. The word hijrah or hijrat collocates with numbers of years and with such words as nabi, baginda, Muharam, and 12 Rabiulawal, as demonstrated in Table 6 below.
TZA 2.595c    ri bulan Zulkaedah yang mulia , empat puluh enam hijrat anbia , pukul sepuluh pagi bertolak dia. | Pukul lima
TZA 2.1182a   ... lama zamani , sampailah hari bertabal sultani. | Hijrah Nabi Allah Rahmah ... , seribu tiga ratus masa terje
TZA 2.1252a   ........ saat ketika , memulai kerija sultan paduka. | Hijrat Nabi akhiru'l-zamani , seribu tiga ratus hai ikhwani ,
TZA 2.1323a   peri , berlengkap kealatan di dalam negeri. | Pada hijrat Nabiu'l-mukhtara , seribu tiga ratus sudah dikira , li
S 1Aug36:2    seperti hari raya puasa, hari raya haji, pada tahun hijrah 12 Rabiulawal hari lahir Nabi kita Muhammad dan s
At present, however, the meaning of hijrah has expanded. The word is now associated with repentance or change. The changes, however, tend to concern dress or fashion, such as wearing a gamis, braids, or growing a beard. All these things are regarded as the embodiment of the devotion and repentance of hijrah. The word is also expanding into many fields: it is no longer only a religious term but is also used in sport. The concordance lines in Table 7 exemplify the co-occurrences of hijrah in the field of sport, showing the collocation of the word with Real Madrid, club, transfer, liga, United, Milan, etc.
Aside from its use in the context of sport (Tables 7 and 8), hijrah can also be found in the context of Indonesian celebrities' fashion. Celebrities in Indonesia who decide to change their appearance to a more religious look, or their way of life to a more religious one, are said to be doing hijrah.
This phenomenon is used by the media as a commodity. That is why, in the recent corpora, the word hijrah can be found in contexts such as the concordance lines in Table 9.
The corpus data presented above show that the meanings of the words jahanam and hijrah have changed from their original meanings over time. The word jahanam is no longer used solely in religious contexts; its uses are now widespread. The word is used in any context associated with evil or crime. Some people even use the word to name and describe food. The meaning of jahanam has expanded to 'evil', 'strong as hell', and 'hot as hell'. The same applies to the word hijrah. This word is no longer used only in the field of Islam but has expanded into sport, especially soccer. In soccer, hijrah means to change one's home club. The use of the word hijrah in this context occurs only in Indonesian.
In terms of the types of change in meaning described by Riemer (2010) and Geeraerts (2010), jahanam and hijrah have undergone an expansion of meaning. According to them, semantic change can be influenced by many factors, among them: a) development in the field of science and knowledge, b) development of word usage, c) development of social culture, d) exchange of sensory responses, and e) association of a word.
In the case of these two words, the change in meaning occurs because of the development of social culture. Indonesia is a country with a Muslim majority, so its people understand a number of Arabic words related to religion. Because they understand the meaning, Indonesian speakers do not feel guilty when using such a word in other contexts, as long as the meaning is almost the same or does not stray too far (Alasmari et al., 2017).
The data also show that the difference in the use of the two words in context mostly arose in the 20th century. In this century, Indonesian people have been flooded with information from abroad, which has brought many changes to the social and cultural order. This is what causes changes in the meaning of several words, including jahanam and hijrah.

Conclusion
This study has shown how the words jahanam and hijrah as used in Indonesian differ in both use and meaning from their Arabic originals. It has also shown that old manuscripts, including Islamic texts, compiled in a corpus are very useful for analysing changes in the meaning of words with corpus linguistics methods. The change in meaning can be seen from the contexts in which the words are used as well as from their collocations. Analysing this phenomenon manually, one text after another, would take months. With digitised texts in a corpus, it took only a couple of minutes to obtain the results.
To date, a dedicated Islamic corpus containing only Islamic texts has not yet been developed. The digitalization of Islamic texts could be a starting point for the compilation of such a corpus. Considering that Islamic teaching based on the Quran is fixed and does not change over time, an Islamic corpus could become a reference corpus for analysing various phenomena in subjects related to Islam.