
Massachusetts Windows Arabic

. . . ENSE petit placidam sub libertate quietem . . .

Temporary Download Package (7 May 2001)

Not all the links found here have been connected up yet (or even entered).

In particular, links about downloading two fonts and two keyboards from this site are not functional. All this freebie material is temporarily available in one single zipped file


That package does not contain an automatic setup program, only a description of how to do the setting up yourself. If you are not comfortable with the file structure as seen from MS-DOS, do not fool with this stuff yourself; find a technician who for sure understands what is written in the INSTALL.TXT file in the download package. Read this main page to the end before doing anything with the downloaded files.

Please ignore all the links below that offer to download fonts and/or keyboards individually.

The following link to an unfinished abstract discussion of

Unicode and Arabic-Script Unicode

is also located here on a temporary basis only.


I.    Preliminaries
II.   About "Localized" Arabic Windows
III.  Cheap Windows Arabic vs. Dear Windows Arabic
IV.   The Arabic Browser (Internet Explorer 5)
V.    Writing to the Arabic Browser (µsoft KBDA1.KBD)
VI.   Arabic-Script Arabic Beyond the Browser (µsoft WORD 9, MINIPAD)
VII.  The Transliterated Keyboard (modified KBDA1.KBD with Persian)
VIII. Read-Only Transliteration ("Xlit1256" font)
IX.   Text Processing
X.    Writing Arabic in Learned Transliteration ("Dushizat" font & keyboard)


This collection of webpages is about reading and writing Arabic using ordinary Microsoft Windows, not the "localized" kind. The special emphases are on (1) browsing in Arabic, and (2) doing academic work in and about Arabic.

If you are not a Windows customer, there is very little for us to say. What there is you will find here.

This document is self-referential in the matter of fonts. Passages will be included that will appear correctly only after you download the necessary fonts. Here is the same hemistich presented three different ways:

السيف أصدق أنباء من الكتب
السيف أصدق أنباء من الكتب
alsayfu ’aṣdaqu ’anbā’an mina lkutubi

The first line will have to remain Eurogibberish, unfortunately, in the main discussion here, because a single page of HTML must be all in one character set. If we use Codepage 1256 (which is "Windows Arabic" by definition) the second line will appear as Arabic-script Arabic also, which is pointless. If you glance over here for a moment, you can see exactly the same three lines on a webpage declared to be in Arabic.

Here is a picture of how those lines are supposed to look once you have the fonts:

Another minor concession to HTML is that this document will use curly brackets {{ }} where angle brackets would be more suitable, especially around URL's and the names of keyboard keys.

About "Localized" Arabic Windows

If you want menus, icon labels, help files &c. to be in Arabic (even just optionally in Arabic), this document is probably not for you. Simply buy the latest "Arabic Windows" software from Microsoft and follow the vendor's instructions. We write for people who work primarily in a European language and want Arabic only occasionally and only in special places, not all over the screen all the time.

A case for occasional users buying localized Windows does exist, however, at least temporarily. There is a lot of database material on CD-ROMs made in the Middle East that won't work unless it has Arabic Windows. The best solution is probably to partition your hard disk and hand over a certain percentage to Arabic Windows. It need not be a very large percentage, because only system software has to be located on the logical (or physical) drive you devote to Arabic Windows. The database files themselves can be located anywhere. That is, after you have installed Arabic Millennium on drive D:, you can install the programs off your Lebanese CD-ROMs on C: nevertheless. They just won't function unless you booted from D:.

The word "temporarily" was uttered a moment ago. Microsoft seems to assume that very soon the whole 95-98-Me world will be swallowed up by the NT-2000 world. With Windows 2000, "localization" is not necessary in the same sense. Every user gets a copy of the operating system that can be localized anywhere. That arrangement will be nice when it arrives universally, but meanwhile there is no reason related to Arabic for a single user to prefer Windows 2000 to Windows Me. Those Lebanese CD-ROMs very likely will not run under Windows 2000 anyhow.

So much disposes of "Arabic Windows." The rest of this discussion will be about doing Arabic things under ordinary USA Windows, and mostly the 9X kind of USA Windows, which is more recent, outnumbers NT/2000 installations vastly, costs less, occupies less disk space, runs faster, looks better, and is easier to find software and support for. If you don't have a computer yet, go buy one with Microsoft Windows Millennium Edition on it and everything that follows will apply to you.

Cheap Windows Arabic vs. Dear Windows Arabic

Our idea of cheap is extremely downscale, being nearly synonymous with "free." Our idea of dear can be set out in a table of prices for a single product. The expensive way of doing Arabic-script Arabic is called Microsoft Word, in the current versions that are available only inside the "productivity suite" called either Office 2000 (for Windows) or Office 2001 (for Mac). Here are some numbers on that product:

        List Price     Faculty/Staff   Students
Mac      $499           $225            $60
Windows  $499           $145            $68


{{ }}.

The only other item mentioned hereabouts that costs anything at all is the iCab browser for the Macintosh. It is free for now, but will cost $29.99 when they finalize it.

Oops. Windows itself isn't free, not really. But since it probably came with your machine, we will mention only that in the USA and according to

{{ }}

you are supposed to buy ARABIC Windows from

Aptec Int'l (Europe), London, UK
Microsoft Product Manager: 44 171 627 1000

On another site, one in Beirut, the number $349.95 is mentioned. Academic discounts may or may not be available from APTEC and Mr. Dufani.

The Arabic Browser

There are several reasons to put browsing the web first: reading seems logically to come before writing, and the web is where there is lots of stuff to read. Web Arabic is, of course, Arabic-script Arabic, and it is logical to discuss "the real stuff" before going into transliterations of it. Most important of all, thinking about Arabic on the Web allows us to make a stipulative definition of the expression "Windows Arabic," viz.,

Windows Arabic is Codepage 1256.

Exactly what Codepage 1256 is you may read about in English and hexadecimal on a separate page.
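
For readers who would rather see the definition at work than read about it, here is a minimal sketch in Python. It is not part of the download package. It assumes only the codec names `cp1256` (Windows Arabic) and `cp1252` (Windows Western) that ship with Python; `show_both_ways` is a name made up for this illustration.

```python
# One sequence of bytes, two readings. Nothing in the bytes themselves says
# which codepage they were written in; the reader has to be told.

ARABIC = "السيف أصدق أنباء من الكتب"

def show_both_ways(raw):
    """Return (reading as Windows Arabic, reading as Windows Western)."""
    as_arabic = raw.decode("cp1256")
    # errors="replace" because a few byte values are undefined in cp1252
    as_western = raw.decode("cp1252", errors="replace")
    return as_arabic, as_western

raw = ARABIC.encode("cp1256")   # Codepage 1256 spends one byte per character
arabic_view, western_view = show_both_ways(raw)
print(arabic_view)    # Arabic-script Arabic
print(western_view)   # the same bytes as Eurogibberish
```

The point of the sketch is that "gibberish" and "Arabic" are the same data. Only the declared codepage differs.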

If you are not a nerd, only an aspiring or accomplished Arabist, the important thing is that Codepage 1256 is what the 570 old books ("more than a million pages of Arab heritage!") at

{{ }}

and the 85 old books at

{{ }}

are written in. Not to mention 95% of the 1,001,001 lesser Arabic websites.

Codepage 1256 is not, being after all a Microsoft product, quite perfect. Its deficiencies relate only to full vocalization, however, so we have banished our pedantic exposé of them to a sidebar.

Inside Internet Explorer version 5 (IE5 hereinafter), Codepage 1256 is called "Arabic (Windows)" and for ordinary English prose purposes we will call it "Windows Arabic." In IE5 it appears especially importantly on the "View" menu's "Encoding" submenu. Once your browser is set up right, there should be three other Arabic encodings available. These cover 80% of the other 5% of Arabic websites, probably. The only one of these codes there is occasion to use with any site one would ever care to revisit is "Arabic (ASMO 708)" and the offending sites come, astonishingly, from Google:

{{ }}

That is "World }} Arabic" in the very useful "Google Web Directory." All the subcategorical lists under it are in the wrong encoding also. Like the 78 links to newspapers at "World }} Arabic }} Akhbár wa'A'lám }} jará'id" of which al-Ahrám

{{ }}

(the second item) is of course sensibly in Windows Arabic. And it is a pleasure to report that al-Nahár

{{ }}

(the first item chez Google) has now got its HTML act together and appears in Windows Arabic. Last time we visited there (a couple of months ago), the front page was encoded in "none of the above" as far as we could ascertain.

That is the way the wind is blowing, even in Beirut. So allow us to repeat:

Windows Arabic is Codepage 1256.
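
What clicking through the "Encoding" submenu amounts to can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of what IE5 does internally. The list of candidate codecs is our own choice, and Python's `iso8859_6` codec is used as a stand-in for "Arabic (ASMO 708)," which is an assumption. `decode_candidates` is a made-up name.

```python
# Decode the same bytes under several Arabic encodings and let a human
# pick the reading that looks like Arabic.

CANDIDATES = ["cp1256", "iso8859_6", "cp720", "cp864"]  # assumed list

def decode_candidates(raw):
    """Map each candidate codec name to its reading of raw."""
    return {name: raw.decode(name, errors="replace") for name in CANDIDATES}

# Pretend a page was written in ISO 8859-6 rather than Windows Arabic:
word = "الأهرام"
raw = word.encode("iso8859_6")
readings = decode_candidates(raw)
for name, text in readings.items():
    print(f"{name:10} {text}")
```

No attempt is made to guess the right encoding automatically. Many letters sit at the same byte values in Codepage 1256 and ISO 8859-6, so a wrong guess can still look half plausible, and the human eye is the safer judge.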

How do you get Windows Arabic to display? Basically you must repair to

{{ }}

and download the latest version of IE5, making sure that you ask for the "minimal" or "customized" installation, not the "typical" one. You will then get a chance to pick various optional components, of which the crucial one is called "Arabic Text Support." The whole process is set out in detail with screen shots of the Microsoft installation process at

{{ }}.

If you go there and see a page of Arabic already, no problem. But if that page is a trackless wilderness of the

أكثر من مليون صفحة من التراث العربي

sort, click on the one bit of English in sight, the words "To View Arabic Text" at the second bullet under the on-screen keyboard in the right frame, and you will find the full instructions we mentioned.

So now you have an Arabic browser.

Writing to the Arabic Browser

Although browsing is 99.44% a matter of reading, there is that 0.56% component of writing also. On the al-Warráq page you just saw one possible solution, the on-screen ouija board or miniature keyboard that you click on with the mouse. If you never want to communicate with any other Arabic pages that lack this feature, fine, but even the search engines don't necessarily provide such a crutch. They assume you can type Arabic in. And so you can, but only after a certain amount of fussing.

To enable multiple keyboards at all, let alone exotic Oriental ones, you must go through the following process under Windows:

Start  (at the far southwest of your screen)
  Settings
    Control Panel
      Keyboard  (double-click on the icon)
        Language (click on the tab)

This should bring you to approximately here:

That picture comes from a Windows Me system with everything already installed. The dialog box looks slightly different for 95, 98 and 2000, but not distressingly so. It is important to understand that this very important dialog box does two quite different things. It "installs a language," so to speak, and then it installs a keyboard for that language. Each language may have multiple keyboards associated with it, although only one can be installed at a time. In fact, sticking to what comes with Windows and/or IE5, English has four or five different keyboards and Arabic three.

At this point you should consider how much of what we shall talk about you are interested in. Everything in the above set-up is pertinent to doing Arabic, not just the obvious Saudi-Arabia part. Even the "Polish (Programmers)" is there for an Arabic-related reason. Especially the "United States-International" keyboard is there for an Arabic-related reason. If you have never visited this dialog box before, the one and only line in the window should read "En English (United States) ... United States 101."

If you are interested in writing Arabic in Latin-script transliteration and you have 95/98/Me (not NT/2000), you should read the rest of this document first before doing anything at all, and then come back here and switch English to "United States-International" and install Polish with the "Polish (Programmers)" keyboard at the same time as you install the Arabic keyboard itself, which is all that will be described at this point. Even if you have no use for writing transliterated Arabic, though, you might be interested in the vendor-provided "United States-International" keyboard (US-IK) if you want to type French, German, Italian, Spanish ... diacritics with a minimum of fuss. Examine the Microsoft documentation in the vicinity of

{{ }},

which is a page well worth bookmarking for the on-screen display of ALL Microsoft's keyboard lay-outs, Arabic included.

To install only the standard Arabic keyboard, though, for the moment, click on the "Add" button and select Arabic from the drop-down list that appears. If Arabic does not appear in the list, either it is already installed or you haven't yet installed the "Arabic Text Support" package for IE5. That step came first for a reason. If all goes well, you will get "Ar Arabic ... Arabic (Saudi Arabia) 101 keys" just like in the screen shot above. There are only three keyboards for Arabic, really; the country names are pure piffle. If you are interested in our schemes for transliteration, you must leave the 101-key Arabic keyboard in place. If not, the alternatives are "102-keys" and "102-keys AZERTY." If you take the default, this is what you get:

KBDA1.KBD mockup

The little yellow label means the mouse was hovering over the Zá’ key when the screen shot of the sample keyboard available at the last URL cited above was taken. The "U+0638" gives the Unicode hexadecimal value for the letter(?)/glyph(?). Unicode is of no practical importance. What matters is Codepage 1256, as has already been pointed out. But of course where the keys are located is even more important from our present point of view.

(( DIGRESSION: The above on-screen keyboard is not functional, except that and will show the corresponding values of the keys. If you want something that looks like that but really works, something which generates the character when you click on the picture of a key, it is available in Expensive Windows Arabic as follows:

    Microsoft Office Tools
      Microsoft Visual Keyboard

If you have a default installation of Office 2000, you can make a desktop shortcut to it at

C:\Program Files\Microsoft Office\Office\vkey.exe

and it can float on top of other windows. It is resizable down to 369H x 169V pixels and you can set the font it uses. ))

Now that you have installed more than one language, the "Internat" process should be running in the background. It appears in that mysterious list of things you get when you type {Ctrl}{Alt}{Del}, and it also manifests itself visibly in the form of a blue blob with two white letters in it over in the system tray at the east edge of the taskbar by the clock. You can probably guess what "En" and "Ar" signify. If you click on this blob, you will get a menu of available languages and a second click will switch you to one of them. It is important to understand that language/keyboard switching happens outside of any particular program. That means that there is nothing inside most particular programs to switch things for you automatically. ِ ءَن تيپى عن تهى -- oops! sorry, trying to type English with the Arabic keyboard active.... (Expect that to happen to you twelve times a day for the next twenty years.)

You are now 100.00% Arabic-browser enabled. You may rush to, say,

{{ }},

switch to Arabic, type "lpjsf" (it would probably be) and begin your quest for information on the medieval market inspector. It is not clear exactly what you should type on a keyboard laid out the Arabic way. Using the "Arabic (Massachusetts) 101-key" keyboard, one could type "mHtsb" instead.

... Ooops, not a good recommendation. Al-Misbár is having another one of its incomprehensible snits. (One match for "bgd!d," and that one in a page evidently written in Linear B? Forget it!) Try instead

{{ }}

where you can get the whole H-s-b entry from Lisán al-'Arab with all the rubrics wearing a tasteful pink. Unfortunately, /muHtasib/ itself does not seem to be specially marked.... This site definitely makes a better example, though, since they naturally assume that if you want to look things up in an Arabic-Arabic dictionary, you can type Arabic words in. (Al-Misbár thoughtfully gives an on-screen keyboard to help you enter whatever it is that they are going to fail to find for you.)

Arabic-Script Arabic Beyond the Browser

At this point the cheap/expensive dichotomy becomes important. If you have Expensive Windows Arabic, a.k.a. Office 200X, you can simply select what you like from those AJEEB search results, move it to the clipboard with and paste it into WinWord 9.0 -- or into anything else in the Office suite, presumably. Your difficulties are far from over, but how to set up Office and Word to cope with Arabic and especially mixed Arabic-European text is a topic we haven't yet sufficiently studied to care to recite upon.

Except for a couple of really urgent things. If you don't want to lose your mind altogether, start up the brontosaurus (Microsoft Word, that is) and when it finally lumbers into sight go to

    Right-to-Left  (click on the tab)

and then under "Cursor Control" set "Movement:" to "Visual" and "Visual Selection:" to "Continuous." You probably want the whole box to look as follows:

WinWord 9 R2L Settings

There are a hundred other settings that Microsoft has the defaults wrong about, but this dialog box stands out as the greatest single obstacle to anything ever happening as you expect when you innocently try to do something with Arabic text the same way you would with English. Next in importance, wherever it is, is the "feature" that resets tabs and margins automatically based on the phases of the moon and the customer's mother's maiden name. And then there's the one that starts formatting your typing as lists without ever being asked to... and the one that .... Referring to only the previous version and of course with no mention of Arabic, there is a whole book about what is wrong with this [exp. del.] product, see

{{ }}.

There doesn't seem to be a "Windows 2K Annoyances" book. Yet. Maybe we will start work on an Arabic appendix ourselves.

Before turning to Cheap Windows Arabic proper, we should consider an area of overlap, the great Arabic cut-and-paste problem. From Microsoft's latest browser to the latest version of Microsoft's pricey business suite, it works fine. But it seems to be deliberately intended at Redmond that Arabic cut-and-paste should never work under any other circumstances. Specifically, IE5 puts text on the clipboard in the RTF ("rich text format") mode and in plain text mode, but if it knows that there is Arabic in that text, it replaces every text character with a question mark. The result can be very peculiar: you paste a passage into another program, and the fonts and typesizes and colors and pictures and gridlines are solemnly preserved. The one thing missing is the information content. There doesn't seem to be any plausible commercial reason for this behavior and perhaps one should attribute it to sheer brain damage.

In any case, the great thing is to figure out how to work around it. If you have Expensive Arabic Windows, the thing to do is save the pertinent text in the "Encoded Text" format you can choose in the drop-down listbox at the bottom of the "File\Save As" dialog box, specifying "Arabic (Windows)" as the pertinent encoding. Then open that file with the program you couldn't paste to directly.

And so we arrive at Cheap Windows Arabic, which is by definition when we do not have (or do not use) WinWord 9. Since this section is about Arabic-script Arabic, the essential thing is to get something, anything, that will display Windows Arabic in the customary script under vanilla or USA Windows. The best freebie we have found so far for this purpose is MINIPAD, an Arabicization of NOTEPAD more or less. You can download it from

{{ }};

you might also want to ... no, here, allow us ....

MINIPAD Keyboard

That's the conventional Arabic keyboard, and it might be more useful to have it in printed form by your computer than to use the on-screen "visual keyboard" that does not give both the Arabic and the English for each key. MINIPAD's screen font is tiny, when viewed at 1152x864 screen resolution, anyway, but tolerable. The printer font (ARBNSK.TTF, "Arb Naskh") is quite presentable.

As usual with an editor called "....PAD," it won't handle anything but short texts, meaning in this case texts under 30,000 or 32K bytes long. That sounds very small, but it would be enough for at least fifteen pages of al-Warráq Arabic, quite enough to accommodate al-'Asma'í's discourse on the bedouin's vocabulary of the sheep, which immortal work just happens to be available at al-Warráq. Actually, the sheep book fits in twice. The multiple-window format is misleading: you can only have 30,000 bytes total, not 30,000 in each child window. MINIPAD's worst feature is that it crashes when you try to open a file that is too long.
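
Since an oversize file brings the program down, it may be worth chopping long texts up beforehand. The following Python sketch is one way to do that, under stated assumptions: the 30,000-byte ceiling is simply the figure reported above, the text is measured as Codepage 1256 bytes, and `split_for_minipad` is a name invented here. It has nothing to do with MINIPAD's own code.

```python
LIMIT = 30_000  # ceiling reported above; treat the exact figure as approximate

def split_for_minipad(text, limit=LIMIT, encoding="cp1256"):
    """Split text at line boundaries into pieces of at most `limit` bytes.

    A single line longer than `limit` is left as one oversize piece.
    """
    pieces, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        n = len(line.encode(encoding))      # size of this line on disk
        if current and size + n > limit:    # piece is full: start a new one
            pieces.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        pieces.append("".join(current))
    return pieces
```

Remember that the limit applies to everything open at once, so the pieces have to be opened one at a time.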

Returning to the cut-and-paste problem, there is no impossible difficulty in the case of al-Warráq. You have only to e-mail the text of a page to yourself and then open it with Eudora or some other mail reader of non-Microsoft provenance. The Arabic will appear as gibberish, but when you copy it to the clipboard and paste it into MINIPAD, it will turn right back into Arabic-script Arabic again. Of course that plan is valid only for the one site. Most places don't have any such e-mail arrangement, needless to say. The challenge is to get the stuff out of IE5 without everything turning into pumpkins. Let us now attempt it with the Second Greatest Classical Arabic Website, namely,

{{ }},

or rather with their web-based search at

{{ }}.

We have written a whole document about this subsite.

????? 1 ?? 70 ?????? ?????? ??????? ??????? 1.52 - ?????? ??????? ????? 7 ?? ?????? }} ???? ??????? }} ?????: 58 {?????? ????? ???? ????? ???? ??? ????? ??? ?? ???? ??? ???? ???? ???? ?????? ???? ??????} --????-- ?????: ??? ?????? ???? ?????? ??????? ???????? ??? ?????? ??? ???? ???? ??? ???? ???? ????: (????? ???? ???? ?? ???? ????? ??? ??? ???? ????? ?? ??????? ?????? ???? ??????). "????" ??? ??? ?????? ??? ????? ??????? ?? ????? ?????. ???? ?????. ??? ?????: ???? ?? ?? ??? ??? ????? ???????. ???? ???? "??? ????" ??? ?????? ??????. ???? ??? ??????? "????" ???? --????-- (????? ??????: 972)

So much for direct cut-and-paste! But the al-Muhaddith report format is clear enough. This text that we have tried to paste into WinWord 2.0 (the pre-brontosaurial version from ten years ago) from IE5 gives the first of seventy matches. The bibliographical stuff is in blue. The extract is found on page 58 of the book. The citation is 972 letters long. The extract from the book is in black. The word sought is in red, being in this case /!lmHtsb/. There's enough here for a production of Hamlet without the Prince of Denmark, no doubt about it.
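
The pattern in that wreckage (letters gone, digits and punctuation intact) is just what one gets by squeezing Arabic text into a Western codepage. We do not know what IE5 actually does on the way to the clipboard, so the Python sketch below is only a guess that reproduces the symptom. `lossy_to_western` is a name made up for the purpose.

```python
# Every character with no slot in the target codepage comes out as "?";
# digits, spaces and punctuation pass through untouched.

def lossy_to_western(text):
    return text.encode("cp1252", errors="replace").decode("cp1252")

sample = "نتيجة 1 من 70"
print(lossy_to_western(sample))   # prints "????? 1 ?? 70"
```

Once the question marks are in place the information is gone for good. No choice of font or encoding on the receiving end will bring it back.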

In this case, the solution is tolerably easy. "View\Source" brings up NOTEPAD and the HTML for the page, which in this case contains the authentic gibberish:

نتيجة 1 من 70
الجامع لأحكام القرآن، الإصدار 1.52 - للإمام القرطبي
الجزء 7 من الطبعة
سورة الأعراف الآية: 58 {والبلد الطيب يخرج نباته بإذن ربه والذي خبث لا يخرج إلا نكدا كذلك نصرف الآيات لقوم يشكرون}

--مزيد-- قتادة: مثل للمؤمن يعمل محتسبا متطوعا، والمنافق غير محتسب؛ قال رسول الله صلى الله عليه وسلم: (والذي نفسي بيده لو يعلم أحدهم أنه يجد عظما سمينا أو مرماتين حسنتين لشهد العشاء). "نكدا" نصب على الحال، وهو العسر الممتنع من إعطاء الخير. وهذا تمثيل. قال مجاهد: يعني أن في بني آدم الطيب والخبيث. وقرأ طلحة "إلا نكدا" حذف الكسرة لثقلها. وقرأ ابن القعقاع "نكدا" بفتح --مزيد--  (مجموع الأحرف: 972)
Paste that into MINIPAD and the question of who is gibbering looks rather different:

Muhaddith Page in MINIPAD

If we tell you that the book is /!lj!m& l!'Hk!m !lqr!@n/ of the Imám al-Qurtubíy, you can make it out even from that degraded screen shot. (It's quite clear if you zoom the above picture to 200%, and you can see that we guessed wrong what the number 58 means.) Of course at this point you would clearly want a utility to chop HTML mark-ups out of documents and leave only the text. We have prepared one that specializes in Arabic text. No doubt there are more reliable ones available than that hasty concoction. We will discuss that nerdish sort of thing separately.
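
To give an idea of what such a chopper involves, here is a minimal Python sketch. It is emphatically not the utility mentioned above, only an illustration built on Python's standard `html.parser` module. The class name `TextOnly`, the function name `html_bytes_to_text`, and the choice of which tags count as line breaks are all our own assumptions.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect the text of a page and throw the mark-up away."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0                      # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("br", "p", "div", "tr"):
            self.parts.append("\n")         # assumed line-breaking tags

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def html_bytes_to_text(raw, encoding="cp1256"):
    """Decode page bytes as Windows Arabic and return the bare text."""
    parser = TextOnly()
    parser.feed(raw.decode(encoding, errors="replace"))
    parser.close()
    return "".join(parser.parts).strip()
```

The sketch works on the saved bytes of the page rather than on anything that has passed through the clipboard, which is the whole point of going by way of "View\Source."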

If you only want to copy-and-paste some scrap of Arabic that doesn't cross any format boundaries, this procedure would be tolerable even without automated assistance. It is not always available, however. Websites can prevent "View\Source" from working, quite apart from encryption properly so called. Furthermore the text may not even be in the so-called "source," when you do view it. This is the case at al-Warráq, although it doesn't matter there because they let you e-mail pages out. Even with Expensive Arabic Windows, you will run into websites that don't let you copy text at all.

Another struggle with the brute suggests that we ought to say more about Microsoft Word for Windows (WinWord 9) and not just blithely abandon you to its help files. It is a terrifically "powerful" program, and you can do quite remarkable things with it, as for instance put up a Library of Congress virtual catalog card exactly the way it looks at

{{ }}

and then talk about the transliterated Arabic there in terms of the "real thing." But the price of this power is complication. WinWord 9 can almost certainly do anything one might ever want to do with Arabic, not to mention lesser languages, but that doesn't mean that one will figure out how to do it in this lifetime. For instance, try to select one word of that virtual library card. Since the whole fetch of text it was in (the author entry for a book) was a hyperlink, there seemed no way to touch it with the mouse and not activate the link rather than select part of the text. Since it was only one word, we wound up typing it in from scratch, even though it involved two of those exotic overscore characters they do Arabic with in Washington.

Now we remembered reading about exactly this hyperlink problem in the annoyances book mentioned above, but unfortunately we didn't remember what the solution is, because it was a library copy of the book and we didn't have it at the computer. The moral of the story is that we should advise you not only to read but to own a good book on WinWord 9. Before trying anything the least bit complicated in Arabic, make sure you understand what the program does with text in English, above all how paragraph formatting is stored in the paragraph marker at the end of the paragraph and what that implies and why Woody Leonard in op. cit. takes the line that you can't be serious about WinWord if you don't have the tabs and paragraph markers showing all the time. You don't have to actually do that. It can look very distracting. But you should certainly understand why he recommends it and turn the hidden characters on frequently to make sure you understand what is happening.

Specifically about Arabic, you should (but won't) keep a whole bunch of things in mind whenever you stick the cursor ("caret") somewhere and either start typing or start selecting text.

(a) The Keyboard. This is, as we have noticed, outside of WinWord altogether, and therefore a fortiori outside paragraph four of your document. If you were typing Arabic and jumped the cursor to a patch of English, nobody tells the keyboard software that this happened. It may seem obvious to a human being that if you plunge into the middle of an Arabic word you will almost certainly not want to type English. But machines and software can be very dumb about certain things, and this is one of them.

(b) Direction of Entry. You must clearly understand that right-to-left data entry is NOT the same thing as Arabic, nor left-to-right the same thing as English. This distinction becomes particularly important inside dialog boxes. The fact that the flashing gizmo is located at the east end of the little box does not necessarily mean that Arabic is going to happen when you hit a key. You are quite as likely to get English shoved in backwards instead.

For use inside dialog boxes especially (where you usually cannot get at a program's menu or icons), there are five (5) special keyboard shortcuts that you must simply memorize:

{Alt}{LeftShift}   gives left-to-right entry.
{Alt}{RightShift}  gives right-to-left entry.
{Ctrl}{LeftShift}  should enforce English.
{Ctrl}{RightShift} should enforce Arabic.
{Ctrl}{V}          pastes text from the clipboard.

The last item has nothing specifically to do with Arabic, but learning it can save you an immense number of keystrokes. It also helps to cope with the gibberish-that-should-be-Arabic problem, which is never going to go away altogether. The "should enforce English" is weasel-worded, because what these keys do is cycle through the whole list of keyboards. If there are only English and Arabic, they will work as described, but after you add Greek and Russian things begin to get a little tricky and you'll probably have to remember to set the language right before getting into a dialog box at all.

(c) Alignment of Paragraphs. This is again a completely separate question from either Arabic/English or R2L/L2R. Unlike them, it affects how the document looks after it is entered, not what happens when you strike a key.

(d) Fonts. Unlike all the above, the font for the next character entered is picked up from context when you jump from place to place in your document. You cannot, however, tell anything about what language you are in by looking at the name of the current font in the WinWord toolbar. The standard Microsoft fonts ("Arial", "Courier New", "Tahoma", and "Times New Roman") are understood by WinWord 9 to be the names of Arabic typefaces as well as Latin ones.

(e) Language. WinWord's idea of what language a block of text is in is only remotely related to yours or mine. It primarily answers the question about what spell-checker to use. However, since the computer coding of Windows Arabic and ordinary English ASCII scarcely overlap at all, the program can usually tell the difference. For instance, if you open a plain text Arabic document like those available at

{{ }}

in WinWord 9, the program will correctly decide that the document is in "Arabic (Windows)" even though it does ask you to confirm this analysis. With short bits of text, however, especially those brought in by pasting, it may get the analysis wrong. The results can be very unfortunate, since you cannot use the "Tools\Language\Set Language" procedure to do what a human being would call setting the language. That is, you cannot compel the program to show Windows Arabic as Eurogibberish or vice versa. Or at least if one can do this trick, we haven't figured out how. As noted, all that procedure does is set aside some notes for spell checking. Once WinWord has decided that text is Eurogibberish, it does no good at all to select it and try to force it into a specifically Arabic font like "Arabic Transparent." Sometimes the problem can be resolved by saving the problem passage as an "encoded text" file, of course with the "Arabic (Windows)" encoding, and then reopening it. Sometimes the problem cannot be resolved at all.

When you install Office 2000 (and also Office 2001, presumably), you must notify Office specifically as well as Windows generally that you intend to use Arabic. The procedure goes like this:

    Microsoft Office Tools
      Microsoft Office Language Settings.

Then make sure the boxes labeled Arabic and Farsi are checked. The notice about these languages not being supported by the system is nothing to be alarmed about. It really only means that Microsoft wishes you had bought Windows 2000. At least for Arabic, it means that. If you want to write Persian without Windows 2000, you will need the "Arabic (Massachusetts) 101 Keys" keyboard, to which topic we may now turn.

The Transliterated Keyboard

We are now almost, but not quite, finished with Arabic-script Arabic and ready to move on to Latin-script Arabic, the really central topic of this discussion. The "transliterated keyboard" occupies an intermediate position here, since it may well be of interest to people who have no use at all for transliteration proper. The idea of it has already been indicated above. Is that market-inspector word, /mHtsb/, to be entered with keys like "lpjsf" or with keys like "mHtsb"? If you touch-type in Arabic, the answer will be obvious. On the other hand, if you touch-type in Arabic, you almost certainly won't have read this far in this document. When we have to hunt and peck in Arabic anyway, the case for what is engraved on the keys having some discernible connection with the Arabic letters intended seems overwhelming.

Unfortunately, this case does not overwhelm anybody at Microsoft. Let us all bombard Redmond with complaints until they provide something so easy to do (for them) and so useful for us. But meanwhile, we happen to have a hack.... Let us make very plain that this is a hack we are talking about. It involves rearranging a file whose format Microsoft does not document. We are far from the first people to do this, however. The "Arabic (Massachusetts)" keyboard came into being by way of

{{ }},

which rearranged the Microsoft Arabic keyboard file to include Persian. We would recommend it to you, conceivably, except that it makes Persian possible at the cost of making Arabic impossible. There is no taa marbuwta. Not serious. But you might look at the site anyway and see what you think of Mr. Mazar's Persian fonts.

For that matter, there are people out there in WWWonderland

{{ }}

trying to sell exactly this trick, which really ought to have drawn the attention of Microsoft's lawyers if anything was ever going to. So probably we can get away with it for scholarly and nonprofit purposes.

The trick consists of replacing the file KBDA1.KBD (in a canonical installation of Windows, C:\WINDOWS\SYSTEM\KBDA1.KBD) with something different that has the same filename. This replacement must happen after the original version from Microsoft is put in place, or else Windows will unload the original version and wipe out the new one. Once it is made, not only will the filename be the same for different contents, so will the system name. The screen shot above showed "Arabic (Saudi Arabia) 101 Keys" when in fact what is there is "Arabic (Massachusetts)." These names are not inside the keyboard files themselves, so nothing can be done about the nomenclature problem.
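
For anybody who would rather script the swap than do it by hand, the replacement step might be sketched roughly as follows. This is only an illustration under assumptions: the function name, the ".orig" backup suffix, and the idea of running Python on the machine at all are ours, not anything from Microsoft or from the download package.

```python
import shutil
from pathlib import Path

def swap_keyboard_file(system_dir, replacement, name="KBDA1.KBD"):
    """Back up the installed keyboard file once, then overwrite it.

    system_dir  -- e.g. C:\\WINDOWS\\SYSTEM in a canonical installation
    replacement -- path to the modified file (hypothetical location)
    """
    target = Path(system_dir) / name
    backup = target.with_name(target.name + ".orig")  # hypothetical backup name
    if not target.exists():
        # The Microsoft original has to be in place first; see the text above.
        raise FileNotFoundError(f"{target} is missing; install the Microsoft keyboard first")
    if not backup.exists():
        shutil.copy2(target, backup)   # keep the Microsoft original exactly once
    shutil.copy2(replacement, target)  # same filename, different contents
    return backup
```

Restoring the Microsoft layout is then just a matter of copying the backup over KBDA1.KBD again.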

So there it is. If you want it, you can download it and see what you think. That is, you can do so if you have Windows 95 or 98 or Me. The keyboard files for NT/2000 are organized entirely differently.

It is important to mention that in addition to transliterating the keyboard, so to speak, "Arabic (Massachusetts)" allows you to write Persian. On this front again it is a puzzle what Microsoft can possibly be thinking of. When you downloaded the "Arabic Text Support" package for IE5, you acquired fonts that contain the extra letters for Persian, but the only way you will get a keyboard that knows about them from Microsoft is to buy Windows 2000. Why didn't they at least put it into Office? There doesn't seem to be any good commercial reason for their policy, and the fact that a Persian keyboard is available in any version makes plain that it cannot somehow have to do with boycotting the Islamic Republic.

A screen shot of the keyboard layout will not be inserted at this point, because the "Microsoft Visual Keyboard" also pretends not to know that the Persian letters are in those fonts and puts up square boxes instead. Suffice it to say that here in Massachusetts the Arabic alphabet runs

c ! b p t + % j C H x d ] r z J s $ S D T Z & g f q G K l m n h w y e

Let us quote one famous line, putting some vowel keystrokes in,

!lsyfu 'Sdqu 'nb!cA mn !lktbi * fy Hd:hi !lHd:u byn !ljid:i w!ll&ibi

As to "Windows Persian," let those who are interested examine our separate discussion of some of the problems and peculiarities thereof.

Read-Only Transliteration

From this point on, we will be concerned with reading and writing Arabic in Latin script. On the computer side this is largely a matter of fonts, of which we shall make available at least two. The first of these is called "Xlit1256" and by now you should be able to guess what the 1256 part is about. What this font does is simply to put plausible transliteration glyphs where Windows Arabic (Codepage 1256) puts the Arabic alphabet.

N.B., Codepage 1256 is not itself an Arabic font. How could it be? It assigns only one number for each letter, whereas most letters need three or four different glyphs in an Arabic font. On the other hand, one mark for one Arabic letter is (more or less) what we want in a transliteration scheme. It would be crazy to try to represent absolute-final-initial-medial variants in Latin letters, as crazy as to try to indicate where Arabic has fancy ligatures. To illustrate how it works, here is that search result from al-Muhaddith (minus the HTML and plus a little formatting) displayed first in "Xlit1256" and then in "Times New Roman":

نتيجة 1 من 70
الجامع لأحكام القرآن، الإصدار 1.52 - للإمام القرطبي
الجزء 7 من الطبعة
}} سورة الأعراف }} الآية: 58 {والبلد الطيب يخرج نباته بإذن ربه والذي خبث لا يخرج إلا نكدا كذلك نصرف الآيات لقوم يشكرون}--مزيد-- قتادة: مثل للمؤمن يعمل محتسبا متطوعا، والمنافق غير محتسب؛ قال رسول الله صلى الله عليه وسلم: (والذي نفسي بيده لو يعلم أحدهم أنه يجد عظما سمينا أو مرماتين حسنتين لشهد العشاء). "نكدا" نصب على الحال، وهو العسر الممتنع من إعطاء الخير. وهذا تمثيل. قال مجاهد: يعني أن في بني آدم الطيب والخبيث. وقرأ طلحة "إلا نكدا" حذف الكسرة لثقلها. وقرأ ابن القعقاع "نكدا" بفتح --مزيد&(مجموع الأحرف: 972)

نتيجة 1 من 70 الجامع لأحكام القرآن، الإصدار 1.52 - للإمام القرطبي الجزء 7 من الطبعة }} سورة الأعراف }} الآية: 58 {والبلد الطيب يخرج نباته بإذن ربه والذي خبث لا يخرج إلا نكدا كذلك نصرف الآيات لقوم يشكرون}--مزيد-- قتادة: مثل للمؤمن يعمل محتسبا متطوعا، والمنافق غير محتسب؛ قال رسول الله صلى الله عليه وسلم: (والذي نفسي بيده لو يعلم أحدهم أنه يجد عظما سمينا أو مرماتين حسنتين لشهد العشاء). "نكدا" نصب على الحال، وهو العسر الممتنع من إعطاء الخير. وهذا تمثيل. قال مجاهد: يعني أن في بني آدم الطيب والخبيث. وقرأ طلحة "إلا نكدا" حذف الكسرة لثقلها. وقرأ ابن القعقاع "نكدا" بفتح --مزيد&(مجموع الأحرف: 972)

That display was created by copying the text out of MINIPAD, where it naturally appears as "real Arabic," pasting it into WinWord 2.0 twice, and then applying the fonts mentioned to the two copies of it. Plus some HTML for the web, admittedly, but nothing that affects the text itself. The Eurogibberish and/or "Xlit1256" transliteration simply IS Windows Arabic.
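
The claim that the Eurogibberish simply IS Windows Arabic can be checked outside of any word processor. Here is a minimal sketch using Python's standard codecs (our choice of tool, not anything the page itself relies on): the same five bytes are read once as Codepage 1256 and once as Codepage 1252.

```python
word = "محتسب"

raw = word.encode("cp1256")        # one byte per Arabic letter
as_arabic = raw.decode("cp1256")   # what an Arabic-aware program shows
as_western = raw.decode("cp1252")  # the same bytes seen as Western European

print(raw.hex())     # e3cdcad3c8
print(as_arabic)     # محتسب
print(as_western)    # ãÍÊÓÈ
```

Nothing in the file changes between the two views; only the interpretation laid on top of the numbers does, which is all that a font like "Xlit1256" exploits.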

A detailed discussion of how "Xlit1256" looks and why, giving examples of Windows Arabic with different degrees of vocalization, is available for examination.

To summarize that material briefly, the above specimen admittedly does not look very much like how scholars usually transliterate Arabic, but it is important to understand that the main reason it doesn't is that the Orientalists put in almost all of the vowels and the Arabs leave almost all of them out. Englsh wld strt lkng vry strng too if we took to doing that.

Trivially, the above specimen would look a bit more Orientalistic if there were underscored t and d instead of eð and þorn. If you prefer it that way, we could make up a font that does it that way. The points about the representation that are not indifferent and that some will probably dislike are especially the exclamation mark for 'alif and the plus for tá’ marbúta+, and maybe the e for 'alif maqsúra+.

These points all have to do with cases where the scholars want to write what is one single thing in Arabic script or Arabic pronunciation a number of different ways in Latin script. Given what is happening with "Xlit1256," that we are just laying a font on top of Codepage 1256 and not deploying any software that could recognize and contextualize and treat different cases differently, we must choose one mark for each code number Microsoft represents Arabic with and stick with it in all cases. Therefore it is better to write 'alif as an exclamation point (which at least looks like the original) rather than as either an apostrophe/smooth breathing or as a long A vowel somehow indicated, because those representations will be spectacularly wrong in lots of cases.
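
The one-mark-per-code-number constraint amounts to a plain lookup table with no context in it. A rough sketch, assuming Python and using only the handful of letters whose marks can be read off the /mHtsb!/ example in this document (the table is partial and hypothetical, and the real font's glyphs need not be these exact ASCII characters):

```python
# Hypothetical partial table: one fixed mark per Arabic letter, no context.
MARKS = {
    "ا": "!",  # 'alif, always the exclamation mark
    "ب": "b",
    "ت": "t",
    "ح": "H",
    "س": "s",
    "م": "m",
}
TABLE = str.maketrans(MARKS)

def xlit(text):
    """Replace each listed Arabic letter by its single fixed mark."""
    return text.translate(TABLE)

print(xlit("محتسبا"))  # mHtsb!
```

Because the table never looks at neighboring letters, 'alif comes out as "!" whether it is carrying a hamza-less initial vowel or marking a long A, which is exactly the compromise described above.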

The extreme case of this problem is shadda, which the Orientalists want to write as many different ways as there are consonants. We can't do that with just a font. Shadda doesn't happen to occur in the above extract. We have decided to use that superscript 2 that exists more or less uselessly in the standard vanilla ANSI/USA Windows Codepage 1252. That is to say, the name of the mark in question is شَدَّة. Or more likely, شدّة. Even more likely is شدة, but the point would disappear if we wrote it that way in this context.

"Xlit1256" is the ultimate in Cheap Windows Arabic. Get used to it a little, and you don't need to worry about downloading IE5 with exactly the right options. You can just point your good old Mozilla 0.0 at the front page of al-Warráq and tell it what font to use, and there you are:

The usefulness of "Xlit1256" Arabic for making out what other people have written already is manifest, but why would anybody want to write MORE of it from scratch, especially if she has WinWord 9 (or even just MINIPAD)? We have assumed that nobody would. But that was a mistake. It can in fact be quite useful, and there is no reason we should not reveal this fact to the world. (Let us at this point stop putting quotes around the name of one font and use "Xlit1256 Arabic" to mean the general idea which that font involves, the idea of displaying Arabic as a left-to-right language with an alphabet defined by Codepage 1256. Exactly what glyphs are used in such a display is a different and lower-order sort of question. "Helvetica" and "Baskerville" are not the names of two different languages, only of two very superficially different ways of writing Latin.)

We used Xlit1256 Arabic for preparing our first Arabic bilingual webpage, namely,

The Al-’Asma‘í Sheep Page

If you have installed the "Arabic Text Support" module for IE5, you may examine that page and see that it contains Arabic-script Arabic and perhaps not notice that Xlit1256 is even there. Yet it is, indirectly. Down in the appendix for programmers, there is a picture of some of it. On a webpage, it must necessarily appear in a picture, since it would be foolish to do a public page relying on a font nobody else possesses. The picture is of the HTML mark-ups for the page itself, and the point of the picture is that what appears as Arabic-script Arabic out in WWWonderland first appeared as Xlit1256 Arabic inside a very plain editor. UltraEdit, the editor's name is, and one can recommend it highly for programming or any other kind of text preparation where one font for the whole document is adequate. If that one font happens to be "Xlit1256," you can type in things like this:

Alasmaei Capitula XVII DE OVE (Linguâ Arabicâ Confecta)
الاصمعي -- أنباء العلماء في أسماء الشاء

And the second line of the link will duly appear as Arabic-script Arabic when you look at it with the browser. Assuming that you have put the mysterious incantation

<meta http-equiv="Content-Type" content="text/html; charset=windows-1256">

at the top of the page, that is, and that the browser in question is IE5 with Arabic Text Support.
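
If the page is generated by a script rather than typed into an editor, the same arrangement might look roughly like this. It is a sketch under assumptions: Python is our choice, build_page is a made-up helper, and the tag is written with the angle brackets that real HTML requires.

```python
META = '<meta http-equiv="Content-Type" content="text/html; charset=windows-1256">'

def build_page(title, body):
    """Return the page as Codepage 1256 bytes, ready to be written to disk."""
    html = f"<html><head>{META}<title>{title}</title></head><body>{body}</body></html>"
    # Raises UnicodeEncodeError if the text strays outside Codepage 1256.
    return html.encode("cp1256")

page = build_page("DE OVE", "<p>الاصمعي</p>")
```

The declaration in the tag and the encoding actually used for the bytes have to agree; the tag by itself converts nothing.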

We've already seen in those MINIPAD examples what becomes of HTML when one attempts to manipulate Arabic-script Arabic directly. As you'll probably soon learn from struggling with WinWord 9, really bilingual and bidirectional text, paragraphs and sentences in English with chunks of Arabic in them or the other way around, is not just twice as hard to manage as one language alone, but at least four times as hard. Maybe EIGHT times as hard. With the above arrangement, we revert to the vicinity of twice as hard, since you do still have to switch "languages" in a case like the HTML scrap above. That is to say, you must switch keyboards from (in this case) "Polish (Programmers)" to "Arabic (Massachusetts)."

The Arabic transliteration characters of the "Xlit1256" font were deliberately borrowed from a sans-serif font and embedded in a typewriter one. This hybrid arrangement was needed because it is important to be able to see at a glance

that "barf" is not the equivalent of "بَرف."

That whole centered line is shown in "Xlit1256." The two four-letter words in it are utterly different. If it were part of that sheep HTML page, the second instance would appear as Arabic-script Persian.

Since the two strings are not at all the same, searching for one will not find the other. And this point is more than just a nuisance, because it affords a quick, although inelegant, solution to an obvious problem that arises now that so many long books are available in Windows Arabic. How can you search them? In general, that is, and not because somebody gave you a search program that goes only with a particular corpus on that particular CD-ROM? Well, with Expensive Windows Arabic you can load a text like al-Tabarí's Qur'án commentary into WinWord 9 and wait a week for all 15,557,993 bytes to load, and after that, search it in Arabic-script Arabic. (There are a bunch of things you ought to know about before trying to search and sort Arabic with the brontosaur, but further research is needed before writing them up for you.)

Or use Cheap Windows Arabic and UltraEdit, set the font to "Xlit1256," switch the keyboard to Massachusetts Arabic, bring up the search box and type in "mHtsb" (i.e., "محتسب") and you can know in about five seconds rather than fifteen minutes that it occurs five times and, with a little puzzling, discern from the context around each occurrence what the word means. After you decide that the third one looks interesting, you can cut it out and then paste it over to the brontosaur and print it out in loveliest Tahoma.

Text Processing

By this term we mean such academic enterprises as the preparation of lexicons or concordances with computer tools. That last search procedure begins to get into the general area. With nothing more than UltraEdit (and the old 16-bit version at that), we could search 94 files (296,997,784 bytes) and get a report that runs to about 800 (wrapped) lines and is formatted as follows:

Find 'محتسب' in 'SBUKR.TXT' :
SBUKR.TXT(16701): % سألت رسول الله صلى الله عليه وسلم عن الطاعون، فأخبرني أنه: (عذاب يبعثه الله على من يشاء، وأن الله جعله رحمة للمؤمنين، ليس من أحد يقع الطاعون، فيمكث في بلده صابرا محتسبا، يعلم أنه لا يصيبه إلا ما كتب الله له، إلا كان مثل أجر شهيد).

Search complete, found 'محتسب' 432 time(s).

That process took less than thirty seconds. If we had been serious about the search, rather than running it only as an illustration, we would have printed the whole report out ("Xlit1256" looks wretched on the screen, but quite decent when printed) and then worked through it with a red pencil before deploying WinWord 9 to look at particular passages in Arabic-script Arabic.
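
A report of that general shape can be produced with very little machinery, which is rather the point. The following is a sketch in Python, not a description of how UltraEdit works inside; the function name and the sample file name are invented, and whether UltraEdit counts per line or per occurrence is something we have not checked.

```python
def find_in_bytes(name, data, needle, encoding="cp1256"):
    """Search one file's raw bytes; return (report_lines, hit_count)."""
    pattern = needle.encode(encoding)  # the search string as Codepage 1256 bytes
    report = []
    hits = 0
    for number, line in enumerate(data.splitlines(), start=1):
        count = line.count(pattern)    # plain byte matching, no word boundaries
        if count:
            hits += count
            report.append(f"{name}({number}): {line.decode(encoding)}")
    return report, hits

sample = "صابرا محتسبا\nلا شيء\nغير محتسب".encode("cp1256")
report, hits = find_in_bytes("SAMPLE.TXT", sample, "محتسب")
print(f"Search complete, found 'محتسب' {hits} time(s).")
```

Since Codepage 1256 is one byte per letter, searching the bytes and searching the letters come to the same thing, and no Arabic-aware software is involved at any point.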

That happens, not accidentally, to be essentially the same search we did at

{{ }},

where we found only 70 instances. Those 94 books are the al-Muhaddith treasury, downloaded and unzipped without reference to their own indexing software. The difference (presumably) is that they were matching for the exact word, whereas we matched just the string of characters, which could be embedded in a longer word. In fact you can see in the above instance from the Sahíh of al-Bukhárí that the match found was with /mHtsb!/ (accusative singular indefinite) and not with the citation form of the word.
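
The presumed difference between the two counts is easy to reproduce in miniature. The sketch below assumes Python and its re module; the whole-word rule used here (no letter or digit on either side) is our guess at what "exact word" would mean, not a documented feature of the al-Muhaddith software.

```python
import re

def count_matches(text, word):
    """Return (substring_count, whole_word_count) for word in text."""
    substring = text.count(word)
    # A whole word has no word character immediately before or after it.
    whole = len(re.findall(rf"(?<!\w){re.escape(word)}(?!\w)", text))
    return substring, whole

text = "صابرا محتسبا ثم محتسب"
print(count_matches(text, "محتسب"))  # (2, 1)
```

The accusative form with its trailing 'alif is caught by the substring search and skipped by the whole-word one, the same effect as in the Bukhárí example.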

This example should give a general idea of what we mean by text processing, namely pretending (if at all possible) that Arabic is a European language written left to right and without connections or contextual variants of the letters, and then using the million-and-one existing tools to work with it, only reverting to Arabic-script Arabic at the very end when displaying the results. Codepage 1256 presents no obstacle in itself to this approach. But problems do arise, problems which mean that the Xlit1256 approach by itself is not all we need. For instance, the best concordance program we have found seems unable to handle Windows Arabic because some of the values in it would (in Codepage 1252, or Eurogibberish) be upper- and lower-case variations of one another and therefore, in this program's unfortunate opinion, interchangeable.
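
To see which Arabic letters such a program would confuse, one can simply ask which pairs of Codepage 1256 values are case variants of one another when read as Codepage 1252. A small sketch, assuming Python's standard codecs; the helper name is ours.

```python
def colliding_pairs():
    """List pairs of Arabic-block characters whose bytes are an
    upper/lower-case pair when the same bytes are read as Codepage 1252."""
    pairs = []
    for code in range(0x80, 0x100):
        raw = bytes([code])
        try:
            west = raw.decode("cp1252")
        except UnicodeDecodeError:
            continue                      # a few values are undefined in 1252
        lower = west.lower()
        if lower == west or len(lower) != 1:
            continue                      # not an upper-case letter
        try:
            partner = lower.encode("cp1252")
        except UnicodeEncodeError:
            continue
        a, b = raw.decode("cp1256"), partner.decode("cp1256")
        if "\u0600" <= a <= "\u06ff" and "\u0600" <= b <= "\u06ff":
            pairs.append((a, b))
    return pairs

for upper_slot, lower_slot in colliding_pairs():
    print(upper_slot, lower_slot)
```

Among the pairs reported are hamza-on-'alif with miym and Haa' with yaa', so a case-blind concordance would file quite unrelated words together.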

Writing Arabic in Learned Transliteration

This section will be the last. It is centered around the second of our give-away fonts. This one is called "Dushizat" for reasons to be explained shortly, and it is available for download here.

The object of this font is to make it easier to write Arabic the way that academic Orientalists in Europe in fact do write it in learned articles and books. There is, however, a good deal of variation in their practice, partly from country to country, and partly on the basis of whether one likes or dislikes digraphs as opposed to letters with diacritics. It has, incidentally, come to mind that somebody might desire a font with all these variations included in it, so that one could reproduce exactly what Brockelmann or Massignon or Levi della Vida or the Encyclopaedia of Islam actually wrote in such-and-such a place. It would contain, for instance, G with a dot above it, with a line above it, with a hachek above it, with a breve above it for Turkish, and maybe more variations still. That might be called a maximum transliteration font. "Dushizat" is a minimum one. Minimalism is where the mysterious name comes from. The very least needed over and above ASCII (or call it plain A-to-Z) to write Arabic in the Orientalistic way is

D and S and H and Z and T   with dots under them, plus
U and I and A               with macrons.

The mnemonic "Dushizat" refers to that inventory. The word defines an absolute minimum; the actual "Dushizat" font is a little more generous. You also get E and O with macrons, plus optional versions of 'ayn and 'alif. (Optional, because you could use the rough and smooth breathings (0x91-0x92 in Codepage 1252) or even the grave accent and apostrophe of plain ASCII.) Here is the Arabic (Massachusetts) alphabet again, with an approximate version in "Dushizat":

c ! b p t + % j C H x d ] r z J s $ S D T Z & g f q G K l m n h w y e

' (ā) b p (t/h/-) th j ch ḥ kh d dh r z zh s sh ṣ ḍ ṭ ẓ ʿ gh f q k g k l m n h w y (ā)

And what ’Abú Tammám wrote would be Orientalized as something like

Al-sayfu 'aṣdaqu 'anbā'an mina 'l-kutubi * fī ḥaddihi 'l-ḥaddu bayna 'l-jiddi wa-l-laʿibi

We have written at length elsewhere on the relationship of Orientalist transliteration to Arabic-script Arabic. In general, the worst problem is that you pretty well have to put in almost all the short vowels, which requires that you know which vowels to put in.

Apart from that challenge, which pertains to any Orientalizing representation, the worst problem with "Dushizat" in particular is the necessity of digraphs with H as the second member. Of course no learned Arabist would ever dream that the word 'ashalu has a shiyn in it and not a siyn followed by a há’, yet all but the best might stumble over less common words. The matter of having to put in all the vowels is pertinent here. If you don't put them in and you also use -h digraphs, all those third-person suffixes beginning with há’ become a real puzzle in analysis. One can, with E. G. Browne, write such a word 'as-halu, to be sure.

Furthermore, some of us purely esthetically have never liked the look of such digraphs when they have to be doubled, as in, say, yuhadhdhibuhá.

When it came to making up a font, however, we decided to abandon esthetics and aim at a minimal representation, because every glyph that is required for transliterated Arabic means something else has to be sacrificed. With "Dushizat" what is sacrificed from Codepage 1252 is mainly Scandinavian/Germanic glyphs plus a few of the less frequent special symbols. (Hopefully the less frequent ones -- our judgment may not overlap with yours.) As it is, you can still write French and German and Italian and Spanish and Latin and Portuguese as well as English and Arabic and Persian without ever needing to switch to another font. "Dushizat" is designed for the small class of people who might conceivably wish to do exactly that in the course of a single document, or even perhaps a single paragraph.

In fact, a stronger reason for minimalism than the font is the keyboard. The glyphs needed for all these languages have to be available with combinations of keystrokes which are not too contortionistic and above all which can be remembered easily. The "United States-International" keyboard, US-IK, which has already been mentioned, was where we started on the keyboard end. As with the "Arabic (Massachusetts)" keyboard for Arabic-script Arabic, the "Dushizat" keyboard for Latin-script Arabic has the same name as a Microsoft keyboard file, but somewhat different contents. This item again will be of use only to Windows 95-98-Me customers. The full details are available in the download package here.

In summary, we have (1) put the Arabic in where Microsoft has those Nordic letters and special characters, (2) retained all the German-French-Spanish items, but removed some redundant synonymous ways of doing them present in the original US-IK, and (3) switched the dead key for the acute accent and umlaut one step to the left, using the semi-colon key rather than the apostrophe key. The result, as regards typing Latin-script Arabic, goes like this:

	a with macron   {AltGr}{A}
	d with dot      {AltGr}{D}
	e with macron   {AltGr}{E}
	h with dot      {AltGr}{H}
	i with macron   {AltGr}{I}
	o with macron   {AltGr}{O}
	s with dot      {AltGr}{S}
	t with dot      {AltGr}{T}
	u with macron   {AltGr}{U}
	z with dot      {AltGr}{Z}
	(ayn            {AltGr}{Q}
	)alif           {AltGr}{P}

With the first ten of these, the upper-case letters add {Shift}:

Ā Ḍ Ē Ḥ Ī Ō Ṣ Ṭ Ū Ẓ.

These triple keystrokes can be a pretty stiff test for an arthritis victim, but it is difficult to see how to avoid something at least that complicated unless we get keyboards with more basic keys on them. This arrangement is vastly better than a couple dozen others available from various vendors. {AltGr}, by the way, is a name for the RIGHT {Alt} key, which the US-IK makes completely different from the vanilla or left {Alt} key that is used with menus, et cetera. With the on-screen Microsoft keyboards linked to above, this key is marked plain {Alt} if it works the same as the other {Alt}, but {AltGr} when it shifts you into a different set of characters. The dead keys, if a given layout has any, appear as orange in these pictures.
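
For readers who think better in tables than in keystrokes, the {AltGr} layer can be modeled as a dictionary. This is only a sketch: it is written in Python, it uses Unicode characters where the real "Dushizat" font reuses Codepage 1252 slots, and it follows the inventory given earlier (dots under D, S, H, Z, T and macrons over U, I, A, plus the extra E and O).

```python
# {AltGr}+key -> character, spelled as Unicode escapes so that each value
# is a single precomposed letter.
ALTGR = {
    "A": "\u0101",  # a with macron
    "D": "\u1e0d",  # d with dot below
    "E": "\u0113",  # e with macron
    "H": "\u1e25",  # h with dot below
    "I": "\u012b",  # i with macron
    "O": "\u014d",  # o with macron
    "S": "\u1e63",  # s with dot below
    "T": "\u1e6d",  # t with dot below
    "U": "\u016b",  # u with macron
    "Z": "\u1e93",  # z with dot below
}

def keystroke(key, shift=False):
    """Character produced by {AltGr}{key}, or {Shift}{AltGr}{key} if shift."""
    char = ALTGR[key.upper()]
    return char.upper() if shift else char

print("".join(keystroke(k) for k in "ADEHIOSTUZ"))
print("".join(keystroke(k, shift=True) for k in "ADEHIOSTUZ"))
```

The 'ayn and 'alif keys are left out of the sketch, since they have no upper-case partners.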

Unfortunately there exist laptop computers that dispense with this second {Alt} key. The recommended solution is to plug in a full-size keyboard when you want to write transliterated Arabic or Persian.

The divergence from the Microsoft US-IK about French and German is that the acute accents and umlauts become available ONLY by the dead-key method. The synonymous {AltGr} method now gives the Arabic items.

Shifting the dead key for acute/umlaut has nothing to do with Arabic, since as you can see from the above table, Dushizat Arabic doesn't use the dead keys. At issue is an objection to the Microsoft US-IK in its original form. Single and double quotes are very likely to be followed by vowels, which then on the US-IK turn up minus the quote mark but plus an unwanted diacritic. Maddening! Semi-colon and colon, on the other hand, are most likely to be followed by a space, which with Microsoft dead keys forces the literal dead key to be displayed.

Here is where that exotic "Polish (Programmers)" keyboard finally comes in. The Polish language is nothing to the purpose, because this is a plain English keyboard unless you go out looking for trouble with {AltGr}. The point of installing it is that it has no dead keys at all, which is what a programmer needs when writing "A80D:E934" and suchlike character sequences that would scarcely occur in the everyday prose of any language. If one could have two different keyboards active for English, that solution would be preferable. As with providing a transliterated keyboard for the Middle East, Microsoft could improve the product with extreme ease, whereas working from the outside one is reduced to strange and kludgey strategies.

Between the discussion above of installing the "Arabic (Massachusetts)" keyboard and the explanations included with the download packages and the other discussions linked to, that is all we need say here about the "Dushizat" package. To conclude, here are links to a couple of webpages mainly or entirely written in it:

A Specimen of Brockelmann's Geschichte der Arabischen Literatur
Surah II, Verse 282 Once Again