On Designing an Automated Malaysian Stemmer for the Malay Language
2000
Conference Paper
Online and interactive information retrieval systems are likely to play an increasing role in the Malay Language community. To facilitate and automate the matching of morphological term variants, a stemmer based on common affix removal algorithms is proposed as part of the design of an information retrieval system for the Malay Language. Stemming is the morphological process of normalizing word tokens to their essential roots. The proposed stemmer strips prefixes and suffixes from each word. An experiment conducted on web sites selected from the World Wide Web showed substantial improvements in the number of words indexed.
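The abstract describes the stemmer only at a high level. The following is a minimal Python sketch of generic prefix and suffix stripping, not the authors' algorithm. The affix lists, the MIN_STEM_LEN threshold, and the function name strip_affixes are all illustrative assumptions; the affixes are a small sample of common Malay ones, and the sketch ignores spelling changes caused by prefixes (for example, men + tari becoming menari).

```python
# Illustrative sketch only: affix lists and threshold are assumptions,
# not taken from the paper.

# Longer affixes come first so that "mem" is tried before "me", etc.
PREFIXES = ("meng", "meny", "peng", "peny", "mem", "men", "pem", "pen",
            "ber", "ter", "per", "me", "pe", "di", "ke", "se")
SUFFIXES = ("kan", "nya", "lah", "kah", "an", "i")

# Hypothetical guard: never strip an affix if fewer letters than this remain.
MIN_STEM_LEN = 4


def strip_affixes(word: str) -> str:
    """Remove at most one prefix and one suffix from a word."""
    stem = word.lower()

    # Strip the first (longest) prefix that leaves a long enough stem.
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM_LEN:
            stem = stem[len(prefix):]
            break

    # Strip the first (longest) suffix that leaves a long enough stem.
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM_LEN:
            stem = stem[:-len(suffix)]
            break

    return stem


if __name__ == "__main__":
    for token in ("berjalan", "makanan", "membaca", "bukunya", "dimakan"):
        print(token, "->", strip_affixes(token))
```

With these assumed lists, "makanan" and "dimakan" both reduce to "makan", so an index keyed on the stem would match either variant. The minimum-length guard is a crude stand-in for the root-word dictionary that a production stemmer would more likely use to avoid over-stemming.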
Author(s): | Tai, SY. and Ong, CS. and Abullah, NA. |
Book Title: | Fifth International Workshop on Information Retrieval with Asian Languages |
Journal: | Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages |
Pages: | 207-208 |
Year: | 2000 |
Month: | October |
Publisher: | ACM Press |
Department(s): | Empirical Inference |
Bibtex Type: | Conference Paper (inproceedings) |
DOI: | 10.1145/355214.355247 |
Event Name: | Fifth International Workshop on Information Retrieval with Asian Languages |
Event Place: | Hong Kong, China |
Address: | New York, NY, USA |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
Links: | PostScript, Web |
BibTeX:
@inproceedings{3421,
  title = {On Designing an Automated Malaysian Stemmer for the Malay Language},
  author = {Tai, SY. and Ong, CS. and Abullah, NA.},
  journal = {Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages},
  booktitle = {Fifth International Workshop on Information Retrieval with Asian Languages},
  pages = {207-208},
  publisher = {ACM Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {New York, NY, USA},
  month = oct,
  year = {2000},
  doi = {10.1145/355214.355247},
  month_numeric = {10}
}