Article written by

A passionate pythonist geek looking for problems to solve :P

55 responses to “Python Arabic Text Reshaper”

  1. Khaled

    God bless you, brother Abdullah.
    I hope this will help me with the many things that do not support Arabic.
    I hope we can get in touch by email.

  2. Louis

    Hello Abd,
    Thank you for this *extremely* valuable port. A quick question regarding “single letters”.
    Your algorithm reshapes an isolated letter such as ض (\u0636) into a shaped one: ﺿ (\uFEBF).
    I don’t think this is correct. I am considering adding a line at the very top of the function “get_reshaped_word” to exclude one-letter words. Would that make sense?

    def get_reshaped_word(unshaped_word):
        if len(unshaped_word) == 1: return unshaped_word  ### <-- New
        unshaped_word = replace_lam_alef(unshaped_word)
        decomposed_word = DecomposedWord(unshaped_word)
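
One quick way to check which positional form a code point encodes is to look up its Unicode character name. The sketch below uses only the standard-library `unicodedata` module; the helper name `describe` is made up for illustration and is not part of the reshaper.

```python
import unicodedata

def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))

# The base letter DAD followed by its four presentation forms.
for ch in (u"\u0636", u"\uFEBD", u"\uFEBE", u"\uFEBF", u"\uFEC0"):
    print(describe(ch))
```

The printed names show whether a given code point is the isolated, final, initial, or medial form, which helps when judging what a one-letter word should map to.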

  3. waleed

    God bless you.
    It works perfectly for me.

  4. Cüneyt Sina Koca

    Thanks for this project. I just wanted to let you know that my problem with the error:

    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

    was solved by adding the following lines:

    import sys
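
The error above is what Python raises whenever non-ASCII text is pushed through the ASCII codec. The following is a minimal Python 3 sketch of that failure and of one generic way around it, encoding to UTF-8 explicitly. It is not the fix this comment refers to, which is cut off after `import sys`. The sample string and the helper name `to_utf8_bytes` are illustrative.

```python
text = "\u0645\u0631\u062d\u0628\u0627"  # five Arabic letters, positions 0-4

def to_utf8_bytes(s):
    """Encode text as UTF-8 bytes, which can represent any Unicode character."""
    return s.encode("utf-8")

try:
    text.encode("ascii")  # the ASCII codec only covers ordinals 0..127
except UnicodeEncodeError as exc:
    print("ASCII failed:", exc.reason)

data = to_utf8_bytes(text)
print(len(text), "characters ->", len(data), "bytes")
```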

  5. Mohamed LICHOURI

    First of all, thanks for sharing this code, but I have a problem with the example that you provided:

    NameError: name 'pass_arabic_text_to_render' is not defined
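
A `NameError` like this means the example calls a name that nothing has defined. Assuming `pass_arabic_text_to_render` is a placeholder for whatever function draws text in your own application, rather than something the library supplies, a stand-in can be defined before the call. The body below only records the text and is purely illustrative.

```python
rendered = []  # collects what would be drawn; stands in for a real canvas

def pass_arabic_text_to_render(text):
    """Hypothetical stand-in: replace the body with your GUI or PDF drawing call."""
    rendered.append(text)

# An already reshaped and reordered sample; in practice this would be the
# result of reshaping followed by bidi reordering.
bidi_text = u"\uFE8E\uFE92\uFEA3\uFEAE\uFEE3"
pass_arabic_text_to_render(bidi_text)
```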

  6. Razan

    Assalam Alykum

    Thank you, brother, for your great effort and for sharing it. Now I can finally use beautiful Arabic fonts on Linux for OpenERP Arabic reports,
    which was suggested as part of a solution for OpenERP Arabic reports in

    I have noticed a vertical alignment problem when generating the reports: the data is not well aligned vertically. I am just asking whether this issue is related to the reshaper or to ReportLab's representation of the Arabic font.

    Note that before I used the solution in the link [ ], some fonts were well aligned but had the square-box issue. Now they render correctly but are not well aligned vertically!

  7. Razan

    Thanks in advance :)

  8. Marek


    I’m using your module together with bidi, and the Arabic text itself is clearly correct and well wrapped, whether in the console or in a text editor. However, I need to render Arabic text properly as a Paragraph entity in Reportlab, and I’m facing a problem with word wrap: the RTL text is wrapped, but each new line appears above the previous one, not under it. How did you get past this?

    best regards and thanks for your effort

  9. egamal

    Peace be upon you.
    A problem appears when printing a long sentence that spans more than one line.

  10. Amine

    Thank you so much, really a wonderful job, thanks thanks thanks

  11. Josh

    Thank you for this extremely valuable port, which helped generate printed registration rolls for over a million voters in Libya.

    There is a minor bug with the lam-alef glyphs, which appears to be from the original Java package, as I have noted in GitHub issue #2.

    We have also mirrored the RTL branch of reportlab to GitHub, in case others would like to use it without installing mercurial.

    1. Marek

      Hi Josh & Abd Allah,

      I am still confused about how to break a block of Arabic text into lines in a reportlab paragraph. Starting from the right side of the page, the text should run to the left margin and continue on a new line below, starting again from the right. This is not what happens when I run the code against the reportlab-rtl branch. In the PDF I got this:
      ‫و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة‬
      ‫إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ‬
      instead of this:
      إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة

      This is the complete code (using reportlab-rtl, python-bidi and Abd Allah’s reshaper):

      from reportlab.lib.pagesizes import A4
      from reportlab.platypus.doctemplate import SimpleDocTemplate
      import arabic_reshaper  # Abd Allah's code
      from bidi.algorithm import get_display  # python_bidi
      from reportlab.pdfbase import pdfmetrics
      from reportlab.pdfbase.ttfonts import TTFont
      from reportlab.lib.styles import ParagraphStyle
      from reportlab.lib.enums import TA_RIGHT
      from reportlab.platypus.para import Paragraph

      pdf_doc = SimpleDocTemplate(pdf_file, pagesize=A4)  # pdf_file: output path, defined elsewhere
      arabic_text = u'إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة'
      arabic_text = arabic_reshaper.reshape(arabic_text)  # join characters
      arabic_text = get_display(arabic_text)  # change orientation by using bidi
      #english_text = 'If we take into account the nature of climate variability and inter-annual variability and those on long-term addition to the lack of accuracy of measurements and calculations used'
      pdfmetrics.registerFont(TTFont('Arabic-normal', 'KacstOne.ttf'))
      style = ParagraphStyle(name='Normal', fontName='Arabic-normal', fontSize=12, leading=12. * 1.2)
      style.alignment = TA_RIGHT
      pdf_doc.build([Paragraph(arabic_text, style)])


      1. Josh

        Marek, in your ParagraphStyle make sure you set wordWrap='RTL'. Otherwise, reportlab-rtl will act as if it’s LTR text.

        1. Marek

          I have tried it already (it looks very promising :-)), but unfortunately it has no effect, at least with my code…
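
One plausible cause of this symptom, offered as an assumption rather than a diagnosis of reportlab-rtl: `get_display` is applied to the whole paragraph before the text is broken into lines, so the words that should come first end up at the end of the string, and a wrapper that fills lines from the start of the string puts them on the last line. The sketch below wraps the logical text first and reorders each line separately. It uses `textwrap` from the standard library and a `display` callable. Plain string reversal stands in for `get_display` so the example runs without extra packages, and Latin placeholder words stand in for Arabic ones.

```python
import textwrap

def wrap_then_display(text, width, display):
    """Break logical-order text into lines, then reorder each line for display."""
    return [display(line) for line in textwrap.wrap(text, width)]

def display_then_wrap(text, width, display):
    """Reorder the whole paragraph first, then wrap: lines come out bottom-up."""
    return textwrap.wrap(display(text), width)

def reverse(s):
    """Stand-in for bidi.algorithm.get_display on pure right-to-left text."""
    return s[::-1]

logical = "w1 w2 w3 w4 w5 w6"
print(display_then_wrap(logical, 8, reverse))  # first line holds the LAST words
print(wrap_then_display(logical, 8, reverse))  # first line holds the FIRST words
```

In a PDF the line breaks depend on font metrics, so wrapping by character count is only a rough stand-in for what a layout engine does.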

  12. Yashar Bazli

    Hi Bro.
    How can I install it?!
    Thank you

  13. Marek

    Hi Josh & Abd Allah,
    I was trying the reportlab-rtl branch with the reshaper and bidi. Reportlab’s paragraph doesn’t seem to be RTL enabled, because the block of Arabic text is not broken into lines properly. Text running from the right side is expected to continue on a new line below, starting again from the right. Instead, the new line appears above. Is this feature missing in the reportlab Paragraph class for RTL text? It works for LTR.
    all the best

  14. Samir Sabri


    I have rewritten your library in Haxe so that I can port it to PHP, JavaScript, C#, C++, and Java, but I didn’t rewrite the get_display method.
    The question is: why should I use get_display to reverse the text? I can reverse it by iterating through the letters in a simple loop, right?

    Also, I tried it, but I got this result: , so why does the ALEF look like a LAM? Please see here:

  15. Samir Sabri

    Here is the result after using a fully unicode font (Traditional Arabic Font)

    There is an empty space under the shadda. What do you recommend?

  16. Muhannad

    Man awesome, worked like magic! THUMBS UP

  17. faisal

    First, thank you for the good work.
    However, I did not understand why you need the python-bidi library.
    You can do without it by adding this code
    RTL = ""
    for letter in reshaped_text:
        RTL = letter + RTL

    at the end.

    And here is the complete program:

    # -*- coding: utf-8 -*-

    import arabic_reshaper
    reshaped_text = arabic_reshaper.reshape(u'اللغة العربية رائعة')
    RTL = ""
    for letter in reshaped_text:
        RTL = letter + RTL
    print(RTL)

    Thank you.
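
Plain reversal works for a string that contains only right-to-left letters, but it also reverses any embedded left-to-right run such as a number or a Latin word, which is the case a bidi implementation handles. The Python 3 sketch below shows the difference using a deliberately simplified fix that re-reverses ASCII alphanumeric runs. It is not the Unicode Bidirectional Algorithm, and the function names are made up for illustration.

```python
import re

def naive_rtl(text):
    """Reverse every character, as the loop above does."""
    return text[::-1]

def rtl_keep_ascii_runs(text):
    """Reverse the text, then restore the order inside ASCII letter/digit runs."""
    reversed_text = text[::-1]
    return re.sub(r"[0-9A-Za-z]+", lambda m: m.group(0)[::-1], reversed_text)

sample = u"\u0639\u0627\u0645 2024"  # an Arabic word followed by a year
print(ascii(naive_rtl(sample)))            # the digits come out reversed
print(ascii(rtl_keep_ascii_runs(sample)))  # the digits keep their order
```

Punctuation, brackets, and mixed-direction nesting need more rules than this, which is where a full bidi library earns its place.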

  18. Zubair

    Salam o Alykum.
    Newbies cannot understand how to use this script. I have the same issues with OpenERP reports and also with Arabic fonts on the right/left side.
    It would be great if someone could give a bit more detail on how to use this script from scratch on Ubuntu 14.04.

  19. Kursat Aker

    Dear Abd,

    Thanks for the reshaper. I do not know Arabic. I was testing the arabic reshaper. I would like to get some feedback if possible.

    My understanding is that the proper way to write yeh followed by teh would be “یت”. My question is: what should the Arabic reshaper produce once it is applied to یﺖ?
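
The second string above already contains a presentation-form code point (TEH FINAL FORM, U+FE96) rather than the base letter. If input may arrive partly shaped like this, one option, assumed here rather than taken from the reshaper, is to fold presentation forms back to base letters with Unicode NFKC normalization before reshaping. The sketch uses only `unicodedata`; `to_base_letters` is a made-up helper name.

```python
import unicodedata

def to_base_letters(text):
    """Fold compatibility characters (including Arabic presentation forms) via NFKC."""
    return unicodedata.normalize("NFKC", text)

shaped = u"\u06CC\uFE96"  # FARSI YEH + TEH FINAL FORM
unshaped = to_base_letters(shaped)
print([unicodedata.name(ch) for ch in unshaped])
```

NFKC also folds other compatibility characters, so it is worth checking that this is acceptable for the text being processed.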

  20. Kursat Aker

    Dear Abd,

    Thank you for your answer. I am not very good with git or GitHub.

    So, I would like to point out a line where YEH and ALEF MAKSURA get mixed up in your code, in ARABIC_GLYPHS:

    u'\u06CC' : [u'\u06CC', u'\uFEEF', u'\uFEF3', u'\uFEF4', u'\uFEF0', 4]

    FEEF and FEF0 are the codes for ALEF MAKSURA.

    Instead, this line should be:

    u'\u06CC' : [u'\u06CC', u'\uFEF1', u'\uFEF3', u'\uFEF4', u'\uFEF2', 4]
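
The Unicode character names make the distinction in this comment easy to check. The sketch below, standard library only, prints the names of the code points in both versions of the table row, plus the dedicated FARSI YEH presentation forms at U+FBFC to U+FBFF for comparison. Which set the reshaper ought to use is left open here; the dictionary labels are illustrative.

```python
import unicodedata

rows = {
    "original": [u"\u06CC", u"\uFEEF", u"\uFEF3", u"\uFEF4", u"\uFEF0"],
    "proposed": [u"\u06CC", u"\uFEF1", u"\uFEF3", u"\uFEF4", u"\uFEF2"],
    "farsi yeh forms": [u"\uFBFC", u"\uFBFD", u"\uFBFE", u"\uFBFF"],
}

# Map each label to the list of Unicode names of its code points.
names = {label: [unicodedata.name(ch) for ch in chars]
         for label, chars in rows.items()}

for label, row in names.items():
    print(label)
    for name in row:
        print("   ", name)
```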



  21. Adnan

    Dear Abdullah,

    I’m new to ReportLab and I’m trying to use your reshaper code to create an Urdu document. My full code is as follows:

    from reportlab.pdfgen import canvas
    from reportlab.pdfbase import pdfmetrics
    from reportlab.pdfbase.ttfonts import TTFont
    from reportlab.lib.pagesizes import A4

    import arabic_reshaper
    from bidi.algorithm import get_display

    pdfmetrics.registerFont(TTFont('Urdu', 'JameelNooriNastaleeq.ttf'))  # Urdu Nastaleeq font

    c = canvas.Canvas(filename='test2.pdf', pagesize=A4)  # pagesize is a (width, height) tuple, not the string 'A4'

    x = 250
    y = 500

    text = u'عدنان الحسن'
    reshaped_text = arabic_reshaper.reshape(text)
    bidi_text = get_display(reshaped_text)



    Unfortunately, I get nothing in the PDF file generated by the code above. The output from my code can be seen at the following link:

    I would be grateful if you could suggest a solution to this problem.

    Thanks in advance!

    1. Zubair

      Adnan Bhai Salam o Alykum.

      1. Adnan

        Thanks Zubair,
        Can you share the document so that I can get the web links easily?


        1. Adnan

          Okay, I managed to repeat the process you showed in the video, but still, I wasn’t able to get anything in pdf file. It’s an empty pdf file.

  22. Zubair

    Salam o Alykum Mr. Adnan, this link was there in video description
    here is the source
