Diacritics stripper (Remover) tutorial.

Hello and salam;

I have been working on this project for a couple of hours now, and I thought I should share my findings, since I searched the internet and found nothing at all in this area. What I am going to show you is how to remove Arabic diacritics from Arabic text dynamically in Flash using ActionScript.

What I am about to explain is not an undocumented feature, a hidden API, or some sort of hack; it is simply my approach to solving this problem. In fact, this method will work in Flash Player version 5 and up.

This method for removing Arabic diacritics is based on the charCodeAt and fromCharCode methods of the String class in ActionScript.
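To make the two methods concrete, here is a small round-trip sketch. It is written in the plain ECMAScript subset that JavaScript shares with ActionScript, so it can be tried in any JavaScript console; I am assuming it behaves the same way inside Flash. The variable names are only for illustration.

```javascript
// Build a two-character string from char codes: meem (1605) followed by fatha (1614).
var word = String.fromCharCode(1605, 1614);

// charCodeAt reads the numeric code of the character at a given index.
var first = word.charCodeAt(0);   // code of the letter meem
var second = word.charCodeAt(1);  // code of the fatha diacritic

// String.fromCharCode goes the other way, from codes back to text.
var rebuilt = String.fromCharCode(first, second);
```

Here rebuilt ends up equal to word, which is the property the whole method relies on: text can be turned into numbers, filtered, and turned back into text.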

The thing is, all the Arabic-specific diacritics have a Unicode character code (not an ASCII one) between 1611 and 1618, that is, U+064B to U+0652. Arabic characters, as you might know, are double-byte characters: in UTF-8 each one is represented by two bytes instead of one, as is the case for basic Latin characters.
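The range check itself is tiny. Below is a minimal sketch of it; isArabicDiacritic is a hypothetical helper name of my own, not something built into ActionScript, and the code is again plain ECMAScript that I am assuming runs unchanged in Flash.

```javascript
// The eight marks covered by the 1611..1618 range, in order:
// 1611 fathatan, 1612 dammatan, 1613 kasratan, 1614 fatha,
// 1615 damma,    1616 kasra,    1617 shadda,   1618 sukun

// Hypothetical helper: true when a char code falls inside that range.
function isArabicDiacritic(code) {
    return code >= 1611 && code <= 1618;
}

// Example: check the second character of meem + fatha.
var sample = String.fromCharCode(1605, 1614);
var secondIsDiacritic = isArabicDiacritic(sample.charCodeAt(1));
var firstIsDiacritic = isArabicDiacritic(sample.charCodeAt(0));
```

With this sample, secondIsDiacritic is true (the fatha) and firstIsDiacritic is false (the letter meem).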

So in pseudocode, this is how to remove the diacritics dynamically:

– Start by storing all the char codes of the input text in an Array. You will need a for loop that grabs them one at a time with the charCodeAt method of the String class.

– Loop through the Array and check each char code. If it is between 1611 and 1618, just omit it; otherwise store it in a new Array.

– Loop through the purified Array and convert the char codes back to characters using the fromCharCode method, storing the results in yet another Array.

– Finally, join the elements of the last Array, and voila!
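Put together, the steps above could look something like the sketch below. This is not the exact script behind the sample, just one way to follow the pseudocode; stripDiacritics and the variable names are my own, and the code sticks to the ECMAScript subset shared with ActionScript on the assumption that it runs the same in Flash.

```javascript
// Hypothetical function following the four pseudocode steps literally.
function stripDiacritics(text) {
    var i;

    // Step 1: store the char code of every character in an Array.
    var codes = [];
    for (i = 0; i < text.length; i++) {
        codes.push(text.charCodeAt(i));
    }

    // Step 2: keep only the codes outside the 1611..1618 diacritics range.
    var kept = [];
    for (i = 0; i < codes.length; i++) {
        if (codes[i] < 1611 || codes[i] > 1618) {
            kept.push(codes[i]);
        }
    }

    // Step 3: convert the remaining codes back to characters.
    var chars = [];
    for (i = 0; i < kept.length; i++) {
        chars.push(String.fromCharCode(kept[i]));
    }

    // Step 4: join the characters into the final string.
    return chars.join("");
}

// Usage: kaf, ta, ba, each followed by a fatha (1614).
var vocalized = String.fromCharCode(1603, 1614, 1578, 1614, 1576, 1614);
var plain = stripDiacritics(vocalized);
```

Here plain contains only the three letters, with char codes 1603, 1578 and 1576. The three separate Arrays mirror the steps one to one, which keeps the logic easy to follow; the same result could be had with a single loop that appends each non-diacritic character straight to the output.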

Below is a sample of the diacritics remover ActionScript in action.

Diacritics Stripper
