Today I was trying to generate PDF reports with Geraldo Reports, and I needed them to include Arabic text. Arabic is a script with two essential features:
- It is written from right to left.
- The characters change shape according to their surrounding characters.
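Both features can be seen in the Unicode character database itself. The following is a small sketch using only Python's standard unicodedata module (Python 3 syntax); the letter DAD is just an arbitrary example and the variable names are made up for illustration.

```python
import unicodedata

DAD = "\u0636"  # ARABIC LETTER DAD, the base (logical) letter

# Feature 1: direction. Arabic letters carry the bidi class "AL"
# (right-to-left Arabic), while Latin letters carry "L" (left-to-right).
print(unicodedata.bidirectional(DAD))  # AL
print(unicodedata.bidirectional("a"))  # L

# Feature 2: contextual shapes. Unicode encodes the four shapes of DAD as
# separate "presentation form" code points, U+FEBD through U+FEC0.
dad_forms = [chr(cp) for cp in range(0xFEBD, 0xFEC1)]
for ch in dad_forms:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Each presentation form normalizes (NFKC) back to the same base letter.
assert all(unicodedata.normalize("NFKC", ch) == DAD for ch in dad_forms)
```

A renderer that doesn't support Arabic draws the base letter as it is and never picks one of these shapes for you.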
So when you try to print Arabic text in an application – or a library – that doesn’t support Arabic you’re pretty likely to end up with something that looks like this:
We have two problems here. First, the characters are in their isolated form, which means that every character is rendered regardless of its surroundings. Second, the text is written from left to right.
To solve the latter issue all we have to do is to use the Unicode bidirectional algorithm, which is implemented purely in Python in python-bidi. If you use it you’ll end up with something that looks like this:
The only issue left to solve is to reshape those characters and replace them with their correct shapes according to their surroundings.
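To give an idea of what reshaping means, here is a deliberately naive sketch. It is not the code of the ported library: it assumes Python 3, it derives the shapes from the Unicode names in the standard unicodedata module, and it ignores diacritics and lam-alef ligatures, which a real reshaper has to handle. The names forms_of and naive_reshape are made up for this illustration.

```python
import unicodedata

FORM_NAMES = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")


def forms_of(ch):
    """Map form name -> presentation-form character for one base letter.

    Returns an empty dict for anything without presentation forms
    (spaces, Latin letters, digits, ...).
    """
    name = unicodedata.name(ch, "")
    if not name.startswith("ARABIC LETTER"):
        return {}
    forms = {}
    for form in FORM_NAMES:
        try:
            forms[form] = unicodedata.lookup(f"{name} {form} FORM")
        except KeyError:
            pass  # e.g. ALEF has no INITIAL or MEDIAL form
    return forms


def naive_reshape(text):
    """Replace each Arabic letter by the shape its neighbours call for."""
    table = [forms_of(ch) for ch in text]
    out = []
    for i, ch in enumerate(text):
        cur = table[i]
        if not cur:
            out.append(ch)  # not an Arabic letter: leave untouched
            continue
        prev = table[i - 1] if i > 0 else {}
        nxt = table[i + 1] if i + 1 < len(text) else {}
        # A letter joins forward only if it has an INITIAL form, and
        # joins backward only if it has a FINAL form.
        joins_prev = "FINAL" in cur and "INITIAL" in prev
        joins_next = "INITIAL" in cur and "FINAL" in nxt
        if joins_prev and joins_next:
            form = "MEDIAL"
        elif joins_prev:
            form = "FINAL"
        elif joins_next:
            form = "INITIAL"
        else:
            form = "ISOLATED"
        out.append(cur.get(form, ch))
    return "".join(out)


# BEH + YEH + TEH: expect an initial, a medial and a final shape.
for ch in naive_reshape("\u0628\u064A\u062A"):
    print(unicodedata.name(ch))
```

The output of a function like this is still in logical order, so it still has to go through the bidirectional algorithm before rendering.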
I solved this issue more than four years ago in a small application that I wrote in Visual Basic. My solution was naive, but it worked well. A few days ago I faced the same problem, rendering Arabic text correctly, but this time on Android. I searched and used the solution in this SO answer, which is pretty similar to the solution provided in Better Arabic Reshaper.
Today I ported the solution in Better Arabic Reshaper from Java to Python, tweaked it a little bit, and used it to successfully render Arabic text in PDF, and the result was:
Pretty cool, right? Here is another test with English text and some diacritics in it:
It looks fine! In Word, the same text looks like this:
Amazing. Now it is time for you to use the ported library along with python-bidi to solve those issues.
Usage
import arabic_reshaper
from bidi.algorithm import get_display

#...
reshaped_text = arabic_reshaper.reshape(u'اللغة العربية رائعة')
bidi_text = get_display(reshaped_text)
pass_arabic_text_to_render(bidi_text)
#...
Demo
You can try an online demo of this script on my Python/Django site here: Arabic Reshaper Online.
Download
The source code is licensed under the GNU General Public License (GPL).
Project on GitHub
Source code download from GitHub
Have fun and enjoy!




May God bless you, brother Abdullah.
I hope this will help me with many of the things that don't support the Arabic language.
I hope we can get in touch by email.
And may He bless you too, brother Khaled.
You can contact me through my email address, which is on the About Me page.
Hello Abd,
Thank you for this *extremely* valuable port. Quick question, regarding “single letters”.
Your algorithm reshapes an isolated letter, such as ض (\u0636), into a shaped one: ﺿ (\uFEBF).
I don’t think this is correct (?). I am considering adding a line of code at the very top of the function “get_reshaped_word” to exclude one-letter words. Would that make sense?
def get_reshaped_word(unshaped_word):
    if len(unshaped_word) == 1: return unshaped_word  ### <-- New
    unshaped_word = replace_lam_alef(unshaped_word)
    decomposed_word = DecomposedWord(unshaped_word)
    …
Hi Louis,
Thanks for your reply and bug report. I have fixed it now and the fix is on GitHub, so you can download the library again.
May God bless you.
It works perfectly for me.
Thanks.
Thanks for this project. I just wanted to let you know that my problem with this error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)
was solved by putting the following lines in arabic_reshaper.py:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
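For readers who want to see where this error comes from, here is a minimal sketch in Python 3 syntax. It is not part of the library and the helper name try_encode is made up: it reproduces the error by encoding Arabic text as ASCII, and shows the alternative of encoding explicitly as UTF-8 at the point where the text leaves the program.

```python
text = "\u0627\u0644\u0644\u063A\u0629"  # five Arabic letters


def try_encode(s, encoding):
    """Return the encoded bytes, or the exception if encoding fails."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


# ASCII cannot represent any of the five letters, so encoding fails.
ascii_result = try_encode(text, "ascii")
print(type(ascii_result).__name__, ascii_result.start, ascii_result.end)

# UTF-8 can represent them; each of these letters takes two bytes.
utf8_result = try_encode(text, "utf-8")
print(len(utf8_result))
```

Encoding explicitly where the text is written out avoids depending on whatever default encoding the interpreter happens to use.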
First of all, thanks for sharing this code, but I have a problem with the example that you provided.
pass_arabic_text_to_render(bidi_text)
NameError: name 'pass_arabic_text_to_render' is not defined
Welcome,
You get this error because pass_arabic_text_to_render is not a real method. It is only a placeholder: instead of that line you should call your own rendering method, one that accepts the Arabic text and renders it. That might be PDF printing, a PIL image, or anything else.
Cheers.
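To make the placeholder concrete, here is a hypothetical stand-in (Python 3 syntax) that simply writes the prepared text to a file-like object. The function body and the placeholder string are assumptions for illustration only; in a real program the body would call your own PDF or image drawing code.

```python
import io


def pass_arabic_text_to_render(text, out):
    """Hypothetical stand-in renderer: write the prepared text to `out`.

    In a real program this body would call your own drawing code instead,
    for example whatever draws a string onto a PDF page or an image.
    """
    out.write(text + "\n")


# Placeholder for the already reshaped and reordered string (bidi_text).
bidi_text = "\uFEBD"

buffer = io.StringIO()
pass_arabic_text_to_render(bidi_text, buffer)
print(repr(buffer.getvalue()))
```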
Assalam Alykum
Thank you, brother, for your great effort and for sharing it. Now I can finally use beautiful Arabic fonts on Linux for OpenERP Arabic reports.
arabic_reshaper.py was suggested as part of a solution for OpenERP Arabic reports in https://github.com/barsi/openerp-rtl
I have noticed that there is a vertical alignment problem when generating the reports: the data is not well aligned vertically. I am just asking whether this issue is related to the reshaper or to ReportLab's representation of the Arabic font.
Note that before I used the solution in the link [ https://github.com/barsi/openerp-rtl ], some fonts were well aligned but had the square-box issue. Now they render correctly but are not well aligned vertically!
Wa Alaikom Al Salaam,
Thanks, Razan, for using this solution. I think the problem you’re having is due to the font you’re using, because I’ve used multiple fonts with Arabic text and Python and they worked well without this vertical alignment problem. You should experiment with multiple fonts until you find the best one for you. I tried Arial and Helvetica, so try them if you want.
Good luck…
Thanks. I’ve tried various fonts, even Arial, but I still have the same problem. I have now found that the alignment for reports in the ReportLab engine is handled in the paragraph.py file, and that is where the problem comes from, so I am trying some tricks.
Thanks again.
Sorry I wasn’t of help to you.
Thanks in advance