ArabicStemmer

A simple web app that allows the user to enter an Arabic word and retrieve stem predictions from three of NLTK's Arabic stemming algorithms.

This was created as a learning project to explore Python stemming algorithms for Arabic, to experiment with SvelteKit (especially its API functionality), and to try out Node child processes.

How To Use

Enter an Arabic word in the input field. Submitting the form runs three of NLTK's Arabic stemming algorithms and returns their predictions to the user in a table.

The user can enter words in two ways:

Type the word using an Arabic keyboard
Type using the Latin script and use the integrated Yamli tool to select the transliterated Arabic

How It Works

The application takes the form entry, calls the Python script, and returns the result to the user.

The web app itself was scaffolded using SvelteKit.

In this example, the app's API uses Node to spawn a child process that runs the Python script with NLTK. The script takes the form entry as input and returns the predicted stems as JSON. The script is bundled with the app for convenience, but it can also be decoupled and hosted elsewhere to serve standard API calls.
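As a rough illustration of the Python side, here is a minimal sketch of a script that takes one word on the command line and prints the predictions as JSON. The file name `stem.py`, the function names, and the JSON shape are all hypothetical. The choice of ISRI, ARLSTem, and Snowball as the three stemmers is also an assumption, since the repository's actual script may use different ones.

```python
import json
import sys
from typing import Callable, Dict


def load_nltk_stemmers() -> Dict[str, Callable[[str], str]]:
    """Map a label to the stem function of three NLTK Arabic stemmers (assumed choice)."""
    # Imported lazily so predict_stems() can be exercised without NLTK installed.
    from nltk.stem.arlstem import ARLSTem
    from nltk.stem.isri import ISRIStemmer
    from nltk.stem.snowball import SnowballStemmer

    return {
        "isri": ISRIStemmer().stem,
        "arlstem": ARLSTem().stem,
        "snowball": SnowballStemmer("arabic").stem,
    }


def predict_stems(word: str, stemmers: Dict[str, Callable[[str], str]]) -> str:
    """Run every stemmer on the word and return the results as a JSON string."""
    cleaned = word.strip()
    predictions = {name: stem(cleaned) for name, stem in stemmers.items()}
    # ensure_ascii=False keeps Arabic letters readable instead of \u escapes.
    return json.dumps({"word": cleaned, "stems": predictions}, ensure_ascii=False)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Force UTF-8 so the parent process receives Arabic text intact.
    sys.stdout.reconfigure(encoding="utf-8")
    print(predict_stems(sys.argv[1], load_nltk_stemmers()))
```

On the Node side, a call such as `child_process.spawn('python3', ['stem.py', word])` could launch this script and parse its standard output as JSON. Passing the word as an argument rather than building a shell command string avoids quoting problems with user input.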

Use the App

In its current basic form, the app needs a few setup steps before it will run.

Clone the repo from GitHub

git clone git@github.com:Wollaston/ArabicStemmer.git # using SSH

Create a virtual environment for working with the Python component of the app

python3 -m venv venv

Activate the virtual environment and install NLTK in it

source venv/bin/activate # on Windows: venv\Scripts\activate
pip install nltk

Install the SvelteKit and Node dependencies

npm install

Launch the app on localhost

npm run dev # prints the local URL, including the port

Why Three Predictions?

During experimentation, it became clear that NLTK's existing Arabic stemming algorithms are imperfect, especially at identifying word roots, although they are generally accurate with standard vocabulary.

Therefore, the app provides three predictions so the user can compare them when assessing the accuracy of the responses.

These algorithms are:

Next Steps

Explore additional Arabic stemming algorithms and incorporate the useful ones
Decouple the Python script from the app for more efficient hosting options
Create a local desktop app for a standalone client
Provide the ability to link stemmed responses to a root-based Arabic dictionary and/or provide examples of words based on that stem and root
Add further tooling and guidance to the app, for example the Buckwalter Arabic Morphological Analyzer
Add proper error handling
Incorporate stem and root verifiers, and warn the user accordingly if the predicted stem does not match an established Arabic stem or root
- This may be useful when working with roots that are not three letters, or with hamzated/geminated/assimilated words
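The stem and root verifier idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `strip_diacritics`, `root_warnings`, and `SAMPLE_ROOTS` are hypothetical names, the three-root sample stands in for a real root dictionary, and the letter-count check is a simple heuristic that flags a result for review rather than rejecting it.

```python
import unicodedata
from typing import Iterable, List


def strip_diacritics(text: str) -> str:
    """Remove combining marks (such as Arabic short-vowel signs and shadda)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def root_warnings(predicted: str, known_roots: Iterable[str]) -> List[str]:
    """Return warnings for a predicted root; an empty list means no concerns."""
    bare = strip_diacritics(predicted.strip())
    reference = {strip_diacritics(root) for root in known_roots}
    warnings = []
    if len(bare) != 3:
        # Heuristic only: roots with other lengths exist and deserve a closer look.
        warnings.append(f"'{bare}' has {len(bare)} letters, not three")
    if bare not in reference:
        warnings.append(f"'{bare}' was not found in the reference root list")
    return warnings


# Tiny illustrative sample; a real verifier would load a full root dictionary.
SAMPLE_ROOTS = {"كتب", "درس", "علم"}
```

For example, `root_warnings("كتاب", SAMPLE_ROOTS)` yields two warnings, because the input has four letters and is absent from the sample list, while `root_warnings("كتب", SAMPLE_ROOTS)` yields none.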
