assamese script misrepresentation(cmb)


Upload: dr-satyakam-phukan

Post on 26-Nov-2015

DESCRIPTION

This document is a report presented to the Government of the state of Assam, India, describing in full the issue of misrepresentation of the Assamese script in the International Standards.

Keywords: India; Assam; Assamese; Assamese Script; Misrepresentation; International Standards; ISO; Unicode; ALA-LC

TRANSCRIPT

  • Dated Guwahati the 18th of March 2014

    To
    Mr D. S. Pegu
    The Managing Director
    Assam Electronics Development Corporation Ltd (AMTRON)
    Bamunimaidam, Guwahati
    Assam

    Subject : Report on ASSAMESE SCRIPT MISREPRESENTATIONS IN INTERNATIONAL STANDARDS

    Sir, I have been working on the issue of the Assamese script misrepresentations in the International Standards since 2011. I recently returned from New Delhi, where I attended a meeting of the Bureau of Indian Standards (BIS) on the 5th of February 2014 at their invitation. I received the minutes of the meeting by email a few days ago. A panel has been instituted to look into the entire issue, and the Department of Information Technology, Government of Assam, has been nominated as one of the members of the said panel.

    It is in this context that I am enclosing a comprehensive report titled ASSAMESE SCRIPT MISREPRESENTATIONS IN INTERNATIONAL STANDARDS.

    I hope that this will aid the Government of Assam in solving this long-standing problem.

    Thanking you.

    Yours sincerely

    Dr Satyakam Phukan
    General Surgeon
    Hemchandra Road
    Jorpukhuripar, Uzanbazar
    Guwahati, Assam
    Phone : 99540 46357

    Copy to :

    1. Chief Secretary Government of Assam, Dispur, Guwahati

    2. Mr Rajiv Kr Bora, IAS , Principal Secretary, Deptt. of IT, Government of Assam

    3. Mr Jishnu Barua IAS, Principal Secretary to Hon'ble Chief Minister, Assam

    4. Mr Anurag Goel, IAS, Commissioner & Secretary, Deptt. of IT, Government of Assam

  • ASSAMESE SCRIPT MISREPRESENTATIONS IN INTERNATIONAL STANDARDS

    The International Alphabet of Sanskrit Transliteration (IAST) is a transliteration scheme that allows a lossless romanization of Indic scripts as employed by the Sanskrit language. IAST is based on a standard established by the International Congress of Orientalists at Geneva in 1894. It allows a lossless transliteration of Devanāgarī (and other Indic scripts, such as Śāradā script).

    The Indian Script Code for Information Interchange (ISCII) was first adopted in 1988. ISCII has IAST as the basis of its transliteration. An updated ISCII was adopted by the Bureau of Indian Standards in 1991, after the draft finalised by the Computer Media Sectional Committee had been approved by the Electronics and Telecommunication Division Council.

    One of the opening paragraphs of the ISCII document states:

    "There are 15 officially recognized languages in India: Hindi, Marathi, Sanskrit, Punjabi, Gujarati, Oriya, Bengali, Assamese, Telugu, Kannada, Malayalam, Tamil, Urdu, Sindhi and Kashmiri. Out of these, Urdu, Sindhi and Kashmiri are primarily written in Perso-Arabic scripts, but get written in Devanagari too (Sindhi is also written in the Gujarati script). Apart from Perso-Arabic scripts, all the other 10 scripts used for Indian languages have evolved from the ancient Brahmi script and have a common phonetic structure, making a common character set possible. The Northern scripts are Devanagari, Punjabi, Gujarati, Oriya, Bengali and Assamese, while the Southern scripts are Telugu, Kannada, Malayalam and Tamil.

    The ISCII code table is a superset of all the characters required in the ten Brahmi-based Indian scripts. For convenience, the alphabet of the official script Devanagari (with diacritic marks for non-Devanagari alphabets) has been used in the standard. For notational simplicity, elsewhere, the term Indian scripts implies Brahmi-based Indian scripts."

    ISCII retained most of the transliteration characteristics of IAST. The Assamese script, as represented in the ISCII standard, was therefore not properly represented, since Assamese differs widely from Sanskrit in phonology. IAST is not applicable to the Assamese script.


    In 1991 an encoding called the Unicode Standard, prepared by the Unicode Consortium/Inc., was started, and its Indic script encoding is said to be based on ISCII. Many documents describe the Unicode encoding for the Indic scripts as a superset of ISCII. The Unicode Standard is synchronised with ISO 10646, maintained by the International Organization for Standardization (ISO).

    The Assamese alphabet was not separately encoded by Unicode. Following its policy of unification, the Unicode Consortium/Inc. eclipsed the Assamese script into Bengali in the Unicode Standard. The uniqueness of the Assamese script was perhaps unknown to the mainly American experts of the Unicode Consortium/Inc. Unicode compensated for this by including two graphically dissimilar Assamese script characters in the Unicode/ISO 10646 Bengali code chart, converting them into Bengali characters.

    Assamese letter "ৰ" (Ra) is described as Bengali letter "র" (Ra) with middle diagonal.

    Assamese letter "ৱ" (Waba) is described as Bengali letter "র" (Ra) with lower diagonal.

    "ক্ষ" was not represented as a letter but as a ligature, i.e. a conjunct form of two letters:

    ক + ষ = ক্ষ, transliterating as Khsya,

    whereas the Assamese letter transliterates as:

    ক্ষ = Khya.

    The fact that many of the Assamese letters, although similar in graphical form to Bengali letters, have an entirely different identity was not given due consideration by the Unicode Standard. The same was repeated in ISO 10646, as this Standard is synchronised with the Unicode Standard.
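    The character names cited above can be checked against the Unicode Character Database. The following is a minimal sketch, assuming Python 3 and its standard `unicodedata` module; the helper name `describe` and the `SAMPLES` dictionary are illustrative and do not come from the report. It lists the formal name of each code point for the two letters and for the Khya conjunct as it is stored in text (KA + VIRAMA + SSA).

```python
import unicodedata

# The two letters the report calls graphically dissimilar, plus the Khya
# conjunct, which has no code point of its own and is stored as a sequence.
SAMPLES = {
    "Ra": "\u09F0",
    "Waba": "\u09F1",
    "Khya": "\u0995\u09CD\u09B7",
}


def describe(text):
    """Return (code point, formal Unicode name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, text)
        for code, name in describe(text):
            print("   ", code, name)
```

    The output shows both letters carrying names that begin with BENGALI LETTER RA, and Khya resolving to three separate Bengali-named code points rather than one letter.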

    In all, the Assamese script is misrepresented in, or absent from, 4 international Standards:

    A. ISO 15924
    International Standard for Names of the Scripts

    B. ISO 10646 = Unicode Standard
    Universal Character Set (UCS)

    C. ISO 15919
    International Standard for Indic Scripts Transliteration

    D. ALA-LC Romanization Table
    Romanization charts maintained by the US Library of Congress

    The present status of the Assamese script in each of these International Standards is described in detail below:

    A. ISO 15924

    ISO stands for the International Organization for Standardization. ISO is an international representative body formed by a network of national standards bodies. These national standards bodies make up the ISO membership, and they represent ISO in their country. In India, the Government of India's Bureau of Indian Standards (BIS) represents India in ISO. ISO prepares Standards for use in diverse fields.

    This International Standard provides a code for the presentation of names of scripts. The codes were devised for use in terminology, lexicography, bibliography, and linguistics, but they may be used for any application requiring the expression of scripts in coded form. This International Standard also includes guidance on the use of script codes in some of these applications.

    ISO has appointed the Unicode Consortium as the Registration Authority for this International Standard, ISO 15924, i.e. Codes for the representation of names of scripts. Michael Everson of Evertype has been appointed Registrar by the Registration Authority.

    Status of Assamese script in ISO 15924: Not included

    Copy of ISO 15924 attached hereto as DOCUMENT A.01.

    B. ISO 10646 and Unicode Standard

    International Standard ISO/IEC 10646, Information technology, defines the Universal Character Set (UCS). The following lines are quoted from the ISO 10646 document:

    "This International Standard specifies the Universal Coded Character Set (UCS). It is applicable to the representation, transmission, interchange, processing, storage, input, and presentation of the written form of the languages of the world as well as of additional symbols."

    This is the Standard in which the characters of a script recognized in ISO 15924 are encoded. It is synchronized with the Unicode Standard maintained by the Unicode Consortium, incorporated as the non-profit company Unicode Incorporated in the state of California, United States of America.

    It is in these synchronized International Standards that Assamese is included as a subset of Bengali.

    Assamese letter "ৰ" (Ra) is described as Bengali letter "র" (Ra) with middle diagonal.

    Assamese letter "ৱ" (Waba) is described as Bengali letter "র" (Ra) with lower diagonal.

    "ক্ষ" was not represented as a letter but as a ligature, i.e. a conjunct form of two letters:

    ক + ষ = ক্ষ, transliterating as Khsya,

    whereas the Assamese letter transliterates as:

    ক্ষ = Khya.

    By a recent change, Unicode has labelled these as Additions for Assamese, adding the term Assamese to the chart. Using the Bengali encoding in this Standard, Assamese can be typed on a computer. Apart from that, however, the Assamese script has no identity in this encoding, and all functions other than typing are distorted, disabled or handicapped. This state of affairs constitutes a grave injustice to the Assamese script, one that reflects on the well-being of the Assamese language and people in general. The current version of the Bengali Code Chart is attached hereto as DOCUMENT B.01.

    My friend Pastor Azizul Haque and I made representations to the Unicode Consortium, seeking rectification of this grave injustice done to the Assamese script, by emails dated the 13th and 21st of July 2011. Please see DOCUMENT B.02. The Unicode Consortium responded by writing on the matter to the Department of Information Technology, Government of India. A copy of the document is attached hereto as DOCUMENT B.03.

    The Government of India has also responded and sought the opinion of the respective state Governments of Assam, West Bengal, Bihar and Manipur. The document is attached hereto as DOCUMENT B.04.

    On the 9th of January 2012, Pastor Azizul Haque and I sent a memorandum to the Hon'ble Chief Minister of Assam, Mr Tarun Gogoi, on the subject "Non-representation/Erroneous nomenclature of the Assamese script/writing system in the Unicode Character Set (U.C.S) of the Unicode Consortium", with the appeal to take up the matter and take steps to obtain a separate slot/range/place for the Assamese script/writing system in the Universal Character Set (UCS) of the Unicode Consortium. On the 18th of February 2012, the Department of Information Technology, Government of Assam, sent an official communication to the Department of Electronics and Information Technology, Government of India, asking it to request the Unicode Consortium to allot a separate slot/range/block for the Assamese script. The document is attached herewith as DOCUMENT B.05.

    Following that, on the 13th of June 2012, a meeting on the issue of Assamese and Unicode was organised in New Delhi by the Department of Electronics and Information Technology, Government of India. A copy of the minutes of the meeting is attached hereto as DOCUMENT B.06.

    At present the issue has shifted, through ISO, into the realm of the Bureau of Indian Standards (BIS), as described below.

    C. ISO 15919

    ISO 15919 is the transliteration standard for Indic scripts. The following lines are quoted from it:

    "1 Scope

    This International Standard provides tables which enable the transliteration into Latin characters from text in Indic scripts which are largely specified in rows 09 to 0D of UCS (ISO/IEC 10646-1 and Unicode).

    The tables provide for the Devanagari, Bengali (including the characters used for writing Assamese), Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, and Telugu scripts which are used in India, Nepal, Bangladesh and Sri Lanka. The Devanagari, Bengali, Gujarati, Gurmukhi, and Oriya scripts are North Indian scripts, and the Kannada, Malayalam, Tamil, and Telugu scripts are South Indian scripts."

    ISO 15919 describes why script conversion is required as follows: "Script conversion is often required for documents such as historical and literary texts, geographical texts (including maps and atlases), bibliographies, catalogues, lists and passports (and other identification documents). Text in Devanagari script or other Indic scripts sometimes needs to be shown in Latin script, where users, or equipment that they are using, cannot read or write the text." A copy of the ISO 15919 document is attached hereto as DOCUMENT C.01.

    I found that the transliteration chart for Assamese was missing from ISO 15919, and that the Bengali transliteration chart provided there cannot be applied to the Assamese script.

    I wrote by email to the International Organization for Standardization (ISO) on the 21st of July 2012, asking for help in correcting the transliteration error for the Assamese script in ISO 15919.


    On the 2nd of October 2012 I received a reply from ISO, informing me that I should take up the matter with the Bureau of Indian Standards (BIS), the Indian Government's representative in ISO.

    Accordingly, I approached the BIS on the issue of the Assamese script in ISO 15919 by email dated the 18th of October 2012. I received a reply from Mr N K Pal, then Head of the MSD division of the BIS, asking me to submit a proposal for the same. The relevant portion of his communication is quoted below:

    "We hope that we have interpreted correctly that you want separate tables to be included for transliteration into Latin characters from text in Assamese script instead of clubbing them together with Bengali scripts as has been presently done.

    In order to point out the problem in right perspective to ISO, could we request you to kindly provide the exact changes (clause-wise) you would like to propose in the existing ISO 15919, a copy of which is hereby enclosed for your ready reference please. It would be appreciated if any documentary evidence in support of your comments be provided to us for facilitating the decision.

    You are further informed that the International Standard, ISO 15919:2001, along with your specific comments will be circulated to all members of MSD 5 Sectional Committee for its consideration. Based on the decision of the MSD 5 Sectional Committee, ISO/TC 46 will be formally requested by BIS to suitably amend ISO 15919:2001.

    We wish an early resolution to this problem and we assure you that we will keep you posted on the developments"

    On the 24th of November 2012 I sent a proposal regarding the Assamese script in ISO 15919. A copy of the proposal is attached herewith as DOCUMENT C.02.

    My proposal was discussed in the 14th meeting of the Documentation and Information Sectional Committee, MSD 5 (the National Mirror Committee to ISO/TC 46), held on 14 December 2012 at BIS, New Delhi. The decision was that, in order to include the Assamese script in ISO 15919, the Assamese script first needs to be included in ISO 10646-1. For this, the matter was handed over to the LITD division of the BIS. The relevant portions of the minutes are quoted below:

    "The Committee considered the information given in item 10.3 of the Agenda regarding suggestions from Dr. Satyam Phukan on ISO 15919:2001 'Information and documentation - Transliteration of Devanagri and related Indic scripts into latin characters' for corrections in Assamese transliteration required in this ISO Standard by providing separate tables to cover the transliteration of Assamese characters into latin, as because transcription and transliteration of Assamese script was different from Bengali. Detailed proposal received from Dr. Satyam Phukan subsequently was also tabled during the meeting.

    The Committee noted that Dr Phukan, in his detailed proposal, has also pointed out that in the International Standard ISO/IEC 10646-1 on 'Information technology - Universal Multi-Octet Character set (UCS) - Part 1: Architecture and basic multilingual plane' (which is a necessary adjunct to ISO 15919:2001), the Assamese script was not recognized as a separate, distinct script from Bengali which needs correction first. He also informed that the Department of Information Technology, Govt. of Assam had already sent a proposal to that effect to the DIT, Govt of India for taking the necessary steps for obtaining a separate range for the Assamese script in Unicode in ISO/IEC 10646-1 standard.

    He, therefore, proposed that necessary steps shall first be taken for obtaining a separate range/block in ISO 10646-1 standard and only after that the proposal for providing separate Transliteration tables for Assamese script in ISO 15919 standard will be feasible."

    A copy of the email strings containing my communication with ISO and the BIS up to this point in time is attached hereto as DOCUMENT C.03.

    Subsequent to this, I was asked by the BIS to provide comments on InScript keyboard layouts for typing the Assamese script on computers and on ISO 10646, and I provided the same to them. I attach copies of the same as DOCUMENT C.04 and DOCUMENT C.05 respectively.

    On the 10th of January 2014 I was invited to attend and speak at the Fifth Meeting of LITD 20, to be held on the 5th of February 2014 at the BIS office in New Delhi. The part of the agenda for the said meeting that required my attendance was as follows:

    "4. COMMENTS RECEIVED ON ISO 10646-1 IT - UNIVERSAL CODED CHARACTER SET

    4.1 Dr. Satyakam Phukan, has sent a proposal for separate Unicode for Assamese language in ISO 10646-1 Information technology - Universal Coded Character Set (UCS). This ISO Standard specifies the universal coded character set and applicable to the representation, transmission, interchange, processing, storage, input, and presentation of the written form of the languages of the world.

    In this standard, Assamese language is given under Bengali script with differences mentioned separately as given in enclosed file. Dr. Phukan mentioned that Assamese is a separate script and not a subscript of Bengali script. Thus, separate Universal coded character set is to be provided to Assamese script in ISO/IEC 10646-1 standard by issuing an amendment to the same. The separate universal code as proposed by him is mentioned in enclosed file.

    The Committee may decide."

    The Agenda document is attached herewith as DOCUMENT C.06.

    I attended the meeting, taking along with me two other persons: Mr Durlav Gogoi, an Assamese software maker, and Dr Bhaskarjyoti Sarma, Asst. Prof., Deptt. of Assamese, Dibrugarh University. I gave a presentation on the issue of the Assamese script and the ISO Standards, stressing the need for a separate range/slot/place for the Assamese script in the ISO 10646-1 Standard. The matter has now been handed over to a panel headed by the noted scholar Dr Peribhaskar Rao, which is to examine the issues relating to the Assamese language. The relevant portions of the minutes are quoted below:

    "4. COMMENTS RECEIVED ON ISO 10646-1 IT - UNIVERSAL CODED CHARACTER SET

    4.1 Dr. Satyakam Phukan, gave a presentation on his proposal for a separate place/slot/range for Assamese script in ISO 10646-1 Information technology - Universal Coded Character Set (UCS). He mentioned that Assamese is a separate script and not a subscript of Bengali script. However, the committee thinks that this issue needs to be discussed separately in detail. The committee decided to form a Panel regarding the issues raised by Dr. Phukan. The work of the panel is to examine the issues related to Assamese language as it relates to various ISO standards. The composition of the panel will be as follows subject to their acceptance:

    a) Shri Peri Bhaskarrao - Convener

    b) Dept. of IT, Assam

    c) Shri Manoj Jain, DeiTY

    d) Sh Mahesh Kulkarni, CDAC

    e) Dr. Dilip Kumar Kalita, Abilac, Assam

    f) Secretary of IT, West Bengal

    4.2 Based on the recommendations of the panel, further action will be taken in the next meeting of this committee."

    A copy of the minutes of the meeting is attached hereto as DOCUMENT C.07.

    D. ALA-LC Romanization Tables

    The American Library Association - Library of Congress (ALA-LC) sets standards for romanization, or the representation of text in other writing systems using the Latin script. This standard is maintained by the Library of Congress of the Government of the United States of America.

    This system is used by North American libraries and the British Library. Assamese is one of the languages represented in the said standard. A copy of the standard's home page on the Internet is attached hereto as DOCUMENT D.01. The standard also has Bengali as one of its languages. Surprisingly, the Table for Assamese has been presented with exactly the same content as the Table for Bengali. Copies of the two Tables, Assamese and Bengali, are attached hereto as DOCUMENT D.02 and DOCUMENT D.03 respectively. I wrote to the United States Library of Congress authorities and presented a corrected form of the Assamese Romanization, but without separate markers for the multiple graphemes representing a single phoneme. A copy of my corrected document is attached hereto as DOCUMENT D.04. The United States Library of Congress authorities rejected my proposal by an email dated the 2nd of November 2012. The relevant portion of their communication is quoted below:

    "With regard to Mr. Satyakam Phukan's suggestion, we can't use the Romanization table purely based on pronunciation. The reasons are:

    1. ALA-LC Romanization tables are developed for use when the consistent transliteration of a Non-Roman (vernacular) script into the Roman Alphabet is needed.

    2. Romanization attempts to transliterate the original script, the guiding principle is a one-to-one mapping of characters in the source language into the target script, with less emphasis on how the result sounds when pronounced according to the reader's language.

    3. It would be virtually impossible to retrieve the original word in Assamese language from the Romanized word based on pronunciation."

    The only shortcoming of my proposal was that I did not employ markers to distinguish the graphemes of letters which represent a single phoneme in the Assamese alphabet. Had they told me that, I might have been able to alter it to their needs. But they closed any scope for that with the following statement, which I quote: "The current Assamese romanization table reflects the goals of the ALA-LC romanization tables as developed by the library community. We appreciate your interest in the Assamese romanization table. Please let us know if we may be of further assistance."

    The communications with the United States Library of Congress authorities are attached hereto as DOCUMENT D.05.
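    The reversibility argument in this exchange can be illustrated with a short sketch. It assumes, for illustration only, that the three sibilant letters শ, ষ and স share one pronunciation in Assamese, and it uses two hypothetical romanization tables: neither is the ALA-LC table nor the table proposed in DOCUMENT D.04, and the marker characters are arbitrary. A table can be inverted only when no two graphemes share the same Latin output, which is what separate markers would provide.

```python
# Hypothetical tables for illustration only; neither is the ALA-LC table
# nor the table proposed in DOCUMENT D.04.
SHA, SSA, SA = "\u09B6", "\u09B7", "\u09B8"  # the letters শ, ষ, স

# Purely pronunciation-based: all three graphemes collapse to one output.
PRONUNCIATION_BASED = {SHA: "x", SSA: "x", SA: "x"}

# Same base letter, but with arbitrary combining marks as markers
# (acute accent, dot below, none) so each grapheme stays distinct.
WITH_MARKERS = {SHA: "x\u0301", SSA: "x\u0323", SA: "x"}


def is_reversible(table):
    """A table can be inverted only if no two graphemes share a romanization."""
    return len(set(table.values())) == len(table)


def invert(table):
    """Build the Latin-to-grapheme table, refusing ambiguous input."""
    if not is_reversible(table):
        raise ValueError("two graphemes share one romanization")
    return {latin: grapheme for grapheme, latin in table.items()}


if __name__ == "__main__":
    print("pronunciation-based reversible:", is_reversible(PRONUNCIATION_BASED))
    print("with markers reversible:", is_reversible(WITH_MARKERS))
```

    Under these assumptions the pronunciation-only table fails the check and the marked table passes it, which matches both the third reason given by the Library of Congress and the remedy described above.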


    CONSEQUENCES OF THE MISREPRESENTATIONS OF THE ASSAMESE SCRIPT IN INTERNATIONAL STANDARDS

    The immediate effects of the misrepresentations of the Assamese script in the International Standards are the eclipsing of the Assamese script in the ISO 10646-1/Unicode Standard, followed by the non-inclusion of the Assamese script in all other International Standards where it should have had its presence. The net results can be summarized as follows:

    1. Loss of identity of the Assamese Script

    In the present situation, unless it is rectified by the collective effort of the people and the Government of Assam, there is nothing called the ASSAMESE SCRIPT in the National and International Standards.

    2. Loss of historical heritage hundreds of years old

    The Assamese script is one of the oldest, or say one of the most ancient, of the Indic scripts. Specimens of this script in stone and metal inscriptions have been found at sites not only in Assam but also in the Arakan/Rakhine state of Myanmar/Burma, dating back to as early as the 5th/6th century AD. In fact, some of the best-preserved specimens of the Assamese script have been discovered in the Arakan/Rakhine state of Myanmar/Burma. The ancient inscriptions in stone and metal, and the writings, are in three main languages: Assamese, Sanskrit and Pali.


    The Assamese script was developed in Assam during the Kamrup era of Assam's history. The script changes its character according to the language in which it is used. The script as written for Assamese is different from the one in which Sanskrit, or languages following the Sanskrit type of phonology, are written. It is this difference which differentiates the Bengali script from the Assamese script. Although the graphical representations of many of the letters are similar in appearance, a large number of them represent totally different entities. A similar situation exists between the three major European scripts, namely Latin, Greek and Cyrillic. The similar-looking characters of the Latin, Greek and Cyrillic scripts have separate representations in the international encodings, namely ISO 10646 and the Unicode Standard. The same principle can be applied to give separate encodings to the Assamese and Bengali scripts. This representation of multiple forms is known in computer parlance as Duplication. A chart of Latin, Greek and Cyrillic duplication is attached herewith as DOCUMENT E.01.
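    The duplication described above can be observed directly. The following is a minimal sketch, assuming Python 3 and its standard `unicodedata` module; the capital letter A is chosen here only as a familiar example, and the names `LOOKALIKES` and `names` are illustrative. The three characters look alike in most fonts, yet each script has its own code point and its own formal name.

```python
import unicodedata

# Three capital letters that look alike in most fonts:
# Latin A, Greek Alpha and Cyrillic A.
LOOKALIKES = ["\u0041", "\u0391", "\u0410"]


def names(chars):
    """Return the formal Unicode name of each character."""
    return [unicodedata.name(ch) for ch in chars]


if __name__ == "__main__":
    for ch, name in zip(LOOKALIKES, names(LOOKALIKES)):
        print(f"U+{ord(ch):04X}", ch, name)
```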

    3. Handicaps and disabilities in the operation of the Assamese script

    Except for the ability to type on computers, most other functions that need to be performed in the operation of the Assamese script are distorted, disabled or handicapped. This is further compounded by the fact that the Assamese script is missing from all the ISO standards that exist for the scripts of the world.

    Due to the absence of the graphical form of the last letter of the Assamese alphabet, ক্ষ (Khya), in ISO 10646/Unicode, a proper sorting operation is impossible in the present Assamese script (included in Bengali).
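    The sorting problem can be reproduced with a short sketch. It assumes Python 3; the three sample words and the helper `khya_last_key` are illustrative and not taken from the report. Plain sorting compares code points, so a word beginning with Khya is filed under ক (KA), the first code point of the conjunct. The tailored key below is a deliberate simplification, not a full collation algorithm: it swaps the Khya sequence for a private-use sentinel so that it sorts after every letter in the Bengali block, following the report's description of Khya as the last letter.

```python
KHYA = "\u0995\u09CD\u09B7"  # Khya stored as KA + VIRAMA + SSA
SENTINEL = "\uE000"          # private-use code point, sorts after U+09FF

# Illustrative sample words, written as escapes to keep the code points explicit.
WORDS = [
    "\u09B9\u09BE\u09A4",              # begins with HA
    "\u0995\u09CD\u09B7\u09AE\u09BE",  # begins with Khya
    "\u0996\u09AC\u09F0",              # begins with KHA
]


def khya_last_key(word):
    """Sort key that treats the Khya sequence as one unit placed last."""
    return word.replace(KHYA, SENTINEL)


by_code_point = sorted(WORDS)
tailored = sorted(WORDS, key=khya_last_key)

if __name__ == "__main__":
    print("code point order:", by_code_point)
    print("tailored order:  ", tailored)
```

    In code point order the Khya word comes first; with the tailored key it comes last. A production system would rely on a proper locale-aware collation rather than this sentinel trick.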

    When a web page in the present Assamese script (included in Bengali) is translated on the Internet, the translation takes place between Bengali and the target language, for example English.

    [Screenshot]

    When searching for any matter in the search engines in the Assamese script (included inside Bengali), two phenomena are noted. If the search word or words contain the two graphically dissimilar letters (ৰ and ৱ), then all the references that surface in the search are Assamese. But if they are not there, then the majority of the search results, particularly on the first page, are invariably Bengali. Documents showing this phenomenon are attached herewith as DOCUMENT E.02 and DOCUMENT E.03.
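    The two search phenomena are consistent with software having only the two distinct letters to go by. The sketch below is a hypothetical heuristic written for illustration, assuming Python 3; it is not how any particular search engine or translation service works, and the function names are invented here. Text containing U+09F0 or U+09F1 can be labelled Assamese, while any other text in the Bengali block (U+0980 to U+09FF) cannot be told apart by its code points alone.

```python
# The two letters in the Bengali block that the report identifies as Assamese.
ASSAMESE_ONLY = {"\u09F0", "\u09F1"}


def in_bengali_block(ch):
    """True if the character lies in the Unicode Bengali block."""
    return "\u0980" <= ch <= "\u09FF"


def guess_language(text):
    """Hypothetical guess based on code points only."""
    if any(ch in ASSAMESE_ONLY for ch in text):
        return "Assamese"
    if any(in_bengali_block(ch) for ch in text):
        return "ambiguous (Assamese or Bengali)"
    return "other"


if __name__ == "__main__":
    print(guess_language("\u0996\u09AC\u09F0"))  # contains U+09F0
    print(guess_language("\u09B9\u09BE\u09A4"))  # shared letters only
```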

    I have seen these things from my practical experience, but more such technical problems are sure to be discovered by many others, now or in the future.

    But the computer experts of the Government of India (DEITy) have a solution for all of these: using patching software to rectify them. In fact, in the computer world there is a patch for every problem. I came to realise this important fact personally through my interactions with the officials of the DEITy, Government of India, in the meeting held on the 5th of February 2014 at the Manak Bhawan office of the Bureau of Indian Standards (BIS) in New Delhi. In this particular instance the patches will be required for users of the Assamese script included/eclipsed inside Bengali, not for users of Bengali. If we are to go by their stance, we will be leaving the future generations of computer users of the Assamese script the gift of a script full of patches, crippled and clumsy.

    CONCLUSION

    1. It is highly essential that the Assamese script is encoded separately from Bengali in all the International Standards, unless we choose to procrastinate and agree to have a defective and crippled Assamese script for use on computers.

    2. To do this, it should be conclusively proved that Assamese is indeed a separate script in spite of having a large number of characters similar in graphic form.

    3. It should be clearly shown that many of these similar graphic characters in reality have a different identity.

    4. The basis for the differing identity lies in the differing phonology of Assamese and Bengali. This difference between Assamese and Bengali applies to all other Indian scripts as well, and this basic difference is therefore, in reality, the difference between the Assamese and Sanskrit scripts.


    5. The difference between Assamese and Sanskrit becomes most obvious in the transliteration and transcription of these scripts. The present progress on the issue was initiated by my pointing out transliteration errors for the Assamese script in the ISO 15919:2001 Standard.

    6. There has been a huge degree of omission on the part of present-day Assamese scholars in the matter of highlighting the transliteration and transcription differences between Assamese and Sanskrit.

    7. It is by this act of Sanskritisation of the Assamese script, and also of the language, that the present crisis of the Assamese script has been generated.

    8. It is the utmost duty of the Government of Assam to clear all such misrepresentations of the Assamese script and language first on the home front itself; only then can we proceed with the rectification of the misrepresentations of the Assamese script at the national and international level.

    9. The only effect, though not essentially a problem, that will occur if Assamese and Bengali are given two distinct scripts in the International Encodings is duplication. Duplication, as mentioned above, already exists in the ISO 10646/Unicode Standard between Latin, Greek and Cyrillic; in addition, there is some amount of duplication between the southern Indian scripts and the Myanmar and Chakma scripts. Some problems have been caused by unscrupulous persons on the Internet resorting to illegal activities like phishing, but only in the case of the Latin, Greek and Cyrillic scripts. The primary situation between those scripts and that between Assamese and Bengali are not the same. Moreover, the decision-making power on whether the Assamese and Bengali scripts are to be separated or not should be in the hands of the Governments of Assam and West Bengal and the sovereign country of Bangladesh, and the people of these places. If the respective Governments and the people so decide, and take the decision to bear the consequences, if any, then the International Organisations and the American company named Unicode Incorporated should comply with the same.

    10. It is also to be remembered at the end that conservation of one's script is a Right guaranteed by The Constitution of India. Article 29(1) of the Constitution of India states as follows:

    "Cultural and Educational Rights

    29. (1) Any section of the citizens residing in the territory of India or any part thereof having a distinct language, script or culture of its own shall have the right to conserve the same."

    Hence the identity of this ancient script should not, at any cost, be allowed to be destroyed and become extinct in the name of technology and modernization.

    Dr Satyakam Phukan
    General Surgeon
    Hem Chandra Road
    Jorpukhuripar, Uzanbazar
    Guwahati, Assam
    P.I.N : 781001
    Phone : 99540 46357

    Dated : Guwahati the 18th of March 2014

  • 10/20/12 ISO 15924 - Alphabetical Code List
    www.unicode.org/iso15924/iso15924-codes.html

    ISO 15924 Code Lists

    Codes for the representation of names of scripts
    Codes pour la représentation des noms d'écritures

    Table 1: Alphabetical list of four-letter script codes
    Liste alphabétique des codets d'écriture à quatre lettres

    Code | N° | English Name | Nom français | Property Value Alias | Date
    Afak | 439 | Afaka | afaka | | 2010-12-21
    Aghb | 239 | Caucasian Albanian | aghbanien | | 2012-10-16
    Arab | 160 | Arabic | arabe | Arabic | 2004-05-01
    Armi | 124 | Imperial Aramaic | araméen impérial | Imperial_Aramaic | 2009-06-01
    Armn | 230 | Armenian | arménien | Armenian | 2004-05-01
    Avst | 134 | Avestan | avestique | Avestan | 2009-06-01
    Bali | 360 | Balinese | balinais | Balinese | 2006-10-10
    Bamu | 435 | Bamum | bamoum | Bamum | 2009-06-01
    Bass | 259 | Bassa Vah | bassa | | 2010-03-26
    Batk | 365 | Batak | batik | Batak | 2010-07-23
    Beng | 325 | Bengali | bengalî | Bengali | 2004-05-01
    Blis | 550 | Blissymbols | symboles Bliss | | 2004-05-01
    Bopo | 285 | Bopomofo | bopomofo | Bopomofo | 2004-05-01
    Brah | 300 | Brahmi | brahma | Brahmi | 2010-07-23
    Brai | 570 | Braille | braille | Braille | 2004-05-01
    Bugi | 367 | Buginese | bouguis | Buginese | 2006-06-21
    Buhd | 372 | Buhid | bouhide | Buhid | 2004-05-01
    Cakm | 349 | Chakma | chakma | Chakma | 2012-02-06
    Cans | 440 | Unified Canadian Aboriginal Syllabics | syllabaire autochtone canadien unifié | Canadian_Aboriginal | 2004-05-29
    Cari | 201 | Carian | carien | Carian | 2007-07-02

    Cham | 358 | Cham | cham (čam, tcham) | Cham | 2009-11-11
    Cher | 445 | Cherokee | tchérokî | Cherokee | 2004-05-01
    Cirt | 291 | Cirth | cirth | | 2004-05-01
    Copt | 204 | Coptic | copte | Coptic | 2006-06-21
    Cprt | 403 | Cypriot | syllabaire chypriote | Cypriot | 2004-05-01
    Cyrl | 220 | Cyrillic | cyrillique | Cyrillic | 2004-05-01
    Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | cyrillique (variante slavonne) | | 2004-05-01
    Deva | 315 | Devanagari (Nagari) | dévanâgarî | Devanagari | 2004-05-01
    Dsrt | 250 | Deseret (Mormon) | déseret (mormon) | Deseret | 2004-05-01
    Dupl | 755 | Duployan shorthand, Duployan stenography | sténographie Duployé | | 2010-07-18
    Egyd | 070 | Egyptian demotic | démotique égyptien | | 2004-05-01
    Egyh | 060 | Egyptian hieratic | hiératique égyptien | | 2004-05-01
    Egyp | 050 | Egyptian hieroglyphs | hiéroglyphes égyptiens | Egyptian_Hieroglyphs | 2009-06-01
    Elba | 226 | Elbasan | elbasan | | 2010-07-18
    Ethi | 430 | Ethiopic (Geʻez) | éthiopien (geʻez, guèze) | Ethiopic | 2004-10-25
    Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | khoutsouri (assomtavrouli et nouskhouri) | Georgian | 2012-10-16
    Geor | 240 | Georgian (Mkhedruli) | géorgien (mkhédrouli) | Georgian | 2004-05-29
    Glag | 225 | Glagolitic | glagolitique | Glagolitic | 2006-06-21
    Goth | 206 | Gothic | gotique | Gothic | 2004-05-01
    Gran | 343 | Grantha | grantha | | 2009-11-11
    Grek | 200 | Greek | grec | Greek | 2004-05-01
    Gujr | 320 | Gujarati | goudjarâtî (gujrâtî) | Gujarati | 2004-05-01
    Guru | 310 | Gurmukhi | gourmoukhî | Gurmukhi | 2004-05-01
    Hang | 286 | Hangul (Hangŭl, Hangeul) | hangûl (hangŭl, hangeul) | Hangul | 2004-05-29

    Hani | 500 | Han (Hanzi, Kanji, Hanja) | idéogrammes han (sinogrammes) | Han | 2009-02-23
    Hano | 371 | Hanunoo (Hanunóo) | hanounóo | Hanunoo | 2004-05-29
    Hans | 501 | Han (Simplified variant) | idéogrammes han (variante simplifiée) | | 2004-05-29
    Hant | 502 | Han (Traditional variant) | idéogrammes han (variante traditionnelle) | | 2004-05-29
    Hebr | 125 | Hebrew | hébreu | Hebrew | 2004-05-01
    Hira | 410 | Hiragana | hiragana | Hiragana | 2004-05-01
    Hluw | 080 | Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs) | hiéroglyphes anatoliens (hiéroglyphes louvites, hiéroglyphes hittites) | | 2011-12-09
    Hmng | 450 | Pahawh Hmong | pahawh hmong | | 2004-05-01
    Hrkt | 412 | Japanese syllabaries (alias for Hiragana + Katakana) | syllabaires japonais (alias pour hiragana + katakana) | Katakana_Or_Hiragana | 2011-06-21
    Hung | 176 | Old Hungarian (Hungarian Runic) | runes hongroises (ancien hongrois) | | 2012-10-16
    Inds | 610 | Indus (Harappan) | indus | | 2004-05-01
    Ital | 210 | Old Italic (Etruscan, Oscan, etc.) | ancien italique (étrusque, osque, etc.) | Old_Italic | 2004-05-29
    Java | 361 | Javanese | javanais | Javanese | 2009-06-01
    Jpan | 413 | Japanese (alias for Han + Hiragana + Katakana) | japonais (alias pour han + hiragana + katakana) | | 2006-06-21
    Jurc | 510 | Jurchen | jurchen | | 2010-12-21
    Kali | 357 | Kayah Li | kayah li | Kayah_Li | 2007-07-02
    Kana | 411 | Katakana | katakana | Katakana | 2004-05-01
    Khar | 305 | Kharoshthi | kharochthî | Kharoshthi | 2006-06-21
    Khmr | 355 | Khmer | khmer | Khmer | 2004-05-29
    Khoj | 322 | Khojki | khojkî | | 2011-06-21

    Knda | 345 | Kannada | kannara (canara) | Kannada | 2004-05-29
    Kore | 287 | Korean (alias for Hangul + Han) | coréen (alias pour hangûl + han) | | 2007-06-13
    Kpel | 436 | Kpelle | kpèllé | | 2010-03-26
    Kthi | 317 | Kaithi | kaithî | Kaithi | 2009-06-01
    Lana | 351 | Tai Tham (Lanna) | taï tham (lanna) | Tai_Tham | 2009-06-01
    Laoo | 356 | Lao | laotien | Lao | 2004-05-01
    Latf | 217 | Latin (Fraktur variant) | latin (variante brisée) | | 2004-05-01
    Latg | 216 | Latin (Gaelic variant) | latin (variante gaélique) | | 2004-05-01
    Latn | 215 | Latin | latin | Latin | 2004-05-01
    Lepc | 335 | Lepcha (Róng) | lepcha (róng) | Lepcha | 2007-07-02
    Limb | 336 | Limbu | limbou | Limbu | 2004-05-29
    Lina | 400 | Linear A | linéaire A | | 2004-05-01
    Linb | 401 | Linear B | linéaire B | Linear_B | 2004-05-29
    Lisu | 399 | Lisu (Fraser) | lisu (Fraser) | Lisu | 2009-06-01
    Loma | 437 | Loma | loma | | 2010-03-26
    Lyci | 202 | Lycian | lycien | Lycian | 2007-07-02
    Lydi | 116 | Lydian | lydien | Lydian | 2007-07-02
    Mahj | 314 | Mahajani | mahâjanî | | 2012-10-16
    Mand | 140 | Mandaic, Mandaean | mandéen | Mandaic | 2010-07-23
    Mani | 139 | Manichaean | manichéen | | 2007-07-15
    Maya | 090 | Mayan hieroglyphs | hiéroglyphes mayas | | 2004-05-01
    Mend | 438 | Mende | mendé | | 2010-03-26
    Merc | 101 | Meroitic Cursive | cursif méroïtique | Meroitic_Cursive | 2012-02-06
    Mero | 100 | Meroitic Hieroglyphs | hiéroglyphes méroïtiques | Meroitic_Hieroglyphs | 2012-02-06
    Mlym | 347 | Malayalam | malayâlam | Malayalam | 2004-05-01
    Mong | 145 | Mongolian | mongol | Mongolian | 2004-05-01
    Moon | 218 | Moon (Moon code, Moon script, Moon type) | écriture Moon | | 2006-12-11

    Mroo | 199 | Mro, Mru | mro | | 2010-12-21
    Mtei | 337 | Meitei Mayek (Meithei, Meetei) | meitei mayek | Meetei_Mayek | 2009-06-01
    Mymr | 350 | Myanmar (Burmese) | birman | Myanmar | 2004-05-01
    Narb | 106 | Old North Arabian (Ancient North Arabian) | nord-arabique | | 2010-03-26
    Nbat | 159 | Nabataean | nabatéen | | 2010-03-26
    Nkgb | 420 | Nakhi Geba ('Na-'Khi ²Ggŏ-¹baw, Naxi Geba) | nakhi géba | | 2009-02-23
    Nkoo | 165 | N'Ko | n'ko | Nko | 2006-10-10
    Nshu | 499 | Nüshu | nüshu | | 2010-12-21
    Ogam | 212 | Ogham | ogam | Ogham | 2004-05-01
    Olck | 261 | Ol Chiki (Ol Cemet', Ol, Santali) | ol tchiki | Ol_Chiki | 2007-07-02
    Orkh | 175 | Old Turkic, Orkhon Runic | orkhon | Old_Turkic | 2009-06-01
    Orya | 327 | Oriya | oriyâ | Oriya | 2004-05-01
    Osma | 260 | Osmanya | osmanais | Osmanya | 2004-05-01
    Palm | 126 | Palmyrene | palmyrénien | | 2010-03-26
    Perm | 227 | Old Permic | ancien permien | | 2004-05-01
    Phag | 331 | Phags-pa | 'phags pa | Phags_Pa | 2006-10-10
    Phli | 131 | Inscriptional Pahlavi | pehlevi des inscriptions | Inscriptional_Pahlavi | 2009-06-01
    Phlp | 132 | Psalter Pahlavi | pehlevi des psautiers | | 2007-11-26
    Phlv | 133 | Book Pahlavi | pehlevi des livres | | 2007-07-15
    Phnx | 115 | Phoenician | phénicien | Phoenician | 2006-10-10
    Plrd | 282 | Miao (Pollard) | miao (Pollard) | Miao | 2012-02-06
    Prti | 130 | Inscriptional Parthian | parthe des inscriptions | Inscriptional_Parthian | 2009-06-01
    Qaaa | 900 | Reserved for private use (start) | réservé à l'usage privé (début) | | 2004-05-29
    Qabx | 949 | Reserved for private use (end) | réservé à l'usage privé (fin) | | 2004-05-29

    Rjng | 363 | Rejang (Redjang, Kaganga) | redjang (kaganga) | Rejang | 2009-02-23
    Roro | 620 | Rongorongo | rongorongo | | 2004-05-01
    Runr | 211 | Runic | runique | Runic | 2004-05-01
    Samr | 123 | Samaritan | samaritain | Samaritan | 2009-06-01
    Sara | 292 | Sarati | sarati | | 2004-05-29
    Sarb | 105 | Old South Arabian | sud-arabique, himyarite | Old_South_Arabian | 2009-06-01
    Saur | 344 | Saurashtra | saurachtra | Saurashtra | 2007-07-02
    Sgnw | 095 | SignWriting | SignÉcriture, SignWriting | | 2006-10-10
    Shaw | 281 | Shavian (Shaw) | shavien (Shaw) | Shavian | 2004-05-01
    Shrd | 319 | Sharada, Śāradā | charada, shard | Sharada | 2012-02-06
    Sind | 318 | Khudawadi, Sindhi | khoudawadî, sindhî | | 2010-12-21
    Sinh | 348 | Sinhala | singhalais | Sinhala | 2004-05-01
    Sora | 398 | Sora Sompeng | sora sompeng | Sora_Sompeng | 2012-02-06
    Sund | 362 | Sundanese | sundanais | Sundanese | 2007-07-02
    Sylo | 316 | Syloti Nagri | sylotî nâgrî | Syloti_Nagri | 2006-06-21
    Syrc | 135 | Syriac | syriaque | Syriac | 2004-05-01
    Syre | 138 | Syriac (Estrangelo variant) | syriaque (variante estranghélo) | | 2004-05-01
    Syrj | 137 | Syriac (Western variant) | syriaque (variante occidentale) | | 2004-05-01
    Syrn | 136 | Syriac (Eastern variant) | syriaque (variante orientale) | | 2004-05-01
    Tagb | 373 | Tagbanwa | tagbanoua | Tagbanwa | 2004-05-01
    Takr | 321 | Takri, Ṭākrī, Ṭāṅkrī | tâkrî | Takri | 2012-02-06
    Tale | 353 | Tai Le | taï-le | Tai_Le | 2004-10-25
    Talu | 354 | New Tai Lue | nouveau taï-lue | New_Tai_Lue | 2006-06-21
    Taml | 346 | Tamil | tamoul | Tamil | 2004-05-01
    Tang | 520 | Tangut | tangoute | | 2010-12-21
    Tavt | 359 | Tai Viet | taï viêt | Tai_Viet | 2009-06-01

    Telu | 340 | Telugu | télougou | Telugu | 2004-05-01
    Teng | 290 | Tengwar | tengwar | | 2004-05-01
    Tfng | 120 | Tifinagh (Berber) | tifinagh (berbère) | Tifinagh | 2006-06-21
    Tglg | 370 | Tagalog (Baybayin, Alibata) | tagal (baybayin, alibata) | Tagalog | 2009-02-23
    Thaa | 170 | Thaana | thâna | Thaana | 2004-05-01
    Thai | 352 | Thai | thaï | Thai | 2004-05-01
    Tibt | 330 | Tibetan | tibétain | Tibetan | 2004-05-01
    Tirh | 326 | Tirhuta | tirhouta | | 2011-12-09
    Ugar | 040 | Ugaritic | ougaritique | Ugaritic | 2004-05-01
    Vaii | 470 | Vai | vaï | Vai | 2007-07-02
    Visp | 280 | Visible Speech | parole visible | | 2004-05-01
    Wara | 262 | Warang Citi (Varang Kshiti) | warang citi | | 2009-11-11
    Wole | 480 | Woleai | woléaï | | 2010-12-21
    Xpeo | 030 | Old Persian | cunéiforme persépolitain | Old_Persian | 2006-06-21
    Xsux | 020 | Cuneiform, Sumero-Akkadian | cunéiforme suméro-akkadien | Cuneiform | 2006-10-10
    Yiii | 460 | Yi | yi | Yi | 2004-05-01
    Zinh | 994 | Code for inherited script | codet pour écriture héritée | Inherited | 2009-02-23
    Zmth | 995 | Mathematical notation | notation mathématique | | 2007-11-26
    Zsym | 996 | Symbols | symboles | | 2007-11-26
    Zxxx | 997 | Code for unwritten documents | codet pour les documents non écrits | | 2011-06-21
    Zyyy | 998 | Code for undetermined script | codet pour écriture indéterminée | Common | 2004-05-29
    Zzzz | 999 | Code for uncoded script | codet pour écriture non codée | Unknown | 2006-10-10

    Copyright © 2004-2012 ISO, Unicode, Inc., & Evertype. All Rights Reserved
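    How such a code list is consulted can be sketched briefly. The sketch below assumes only a handful of rows copied by hand from Table 1 above; the dictionary `SCRIPT_CODES` and the helper `find_codes` are hypothetical names introduced for illustration and are not part of ISO 15924 or of any library.

```python
# A few rows copied from Table 1 above: code -> (number, English name).
# This is an excerpt for illustration, not the full registry.
SCRIPT_CODES = {
    "Beng": (325, "Bengali"),
    "Deva": (315, "Devanagari (Nagari)"),
    "Sylo": (316, "Syloti Nagri"),
    "Tibt": (330, "Tibetan"),
    "Tirh": (326, "Tirhuta"),
}

def find_codes(name_fragment: str) -> list:
    """Return the codes whose English name contains the given fragment."""
    fragment = name_fragment.lower()
    return sorted(
        code
        for code, (_number, name) in SCRIPT_CODES.items()
        if fragment in name.lower()
    )

print(find_codes("bengali"))   # ['Beng']
print(find_codes("assamese"))  # []
```

    The empty result for "assamese" mirrors the printed list as a whole: none of its English names contains the word Assamese, while "Bengali" has the code Beng (325).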

  • DOCUMENT B.01

  • DOCUMENT B.02

  • From: "Lisa Moore"
    To: "Swaran Lata"; "Manoj Jain"
    Cc:
    Subject: Assamese writing system in Unicode
    Date: Wednesday, August 17, 2011 11:24 PM

    Dear Manoj and Swaran,

    The Unicode office has recently received several emails from some members of the

    Assamese community, objecting to the way Assamese is addressed in Unicode. They

    feel Assamese is considered a sub-class to Bengali in Unicode, and it should be given

    its own block.

    Currently, two characters in the Bengali code block have annotations Assamese and

    the text of the Unicode Standard states that the Bengali script is used to write

    Assamese in Assam and a number of other minority languages. Based on our review,

    the Bengali script adequately covers the Assamese language.

    We have replied to the Assamese authors that we are in receipt of their emails and

    will be in contact with the Government of India regarding this request.

    As this request is coming from India and has political implications, we feel it is an

    issue that the Government of India will wish to address. The Unicode Technical

    Committee will take no action based on the current correspondence. If you would

    like copies of the various emails, please let us know. I append one such email below.

    DOCUMENT B .03

  • Most sincerely,

    Lisa Moore
    Chair, Unicode Technical Committee
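    The email above does not identify the two annotated characters. On the assumption that they are U+09F0 and U+09F1, the two letters of the Bengali block that Assamese uses for "ra" and "wa", their formal names can be inspected with Python's standard `unicodedata` module. U+09B0, the "ra" used in Bengali, is added for comparison; the helper name `formal_name` is illustrative.

```python
import unicodedata

# Assumed candidates: Bengali "ra", then the two letters used in Assamese.
CANDIDATES = ["\u09B0", "\u09F0", "\u09F1"]

def formal_name(ch: str) -> str:
    """Return the formal Unicode character name of one character."""
    return unicodedata.name(ch)

for ch in CANDIDATES:
    print(f"U+{ord(ch):04X}  {ch}  {formal_name(ch)}")
```

    All three formal names begin with the word BENGALI, including the two letters that Bengali itself does not use, which is the naming practice the emails in this document object to.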

    From: SATYAKAM PHUKAN [mailto:[email protected]]
    Sent: Wednesday, July 13, 2011 10:11 AM
    To: [email protected]
    Subject: Erroneous nomenclature of the Assamese writing system as a subclass of Bengali in Unicode Consortium/Inc

    To
    Ms Magda Danish
    Unicode Consortium/Inc
    USA

    Subject : Erroneous nomenclature of the Assamese writing system as a subclass of Bengali in Unicode Consortium/Inc.

    Madam,

    There has been considerable displeasure over the naming of writing system of the

    Assamese language as a subclass of Bengali .

    The fact needs to be cleared up regarding the nomenclature of the similar alphabets

    used by the Assamese, Maithili, Bengali and Manipuri languages. This script is

    actually the KAMRUPI script, it developed in the ancient kingdom of Kamrup, the

    precursor or the older name of Assam. Kamrup had fixed boundaries from east to

    west. In east it ended in present eastern border of India and in the west it extended up

    to the river Korotoya now in the areas of so-called north Bengal. The indigenous

    people of this area of so-called north Bengal still differentiates themselves from the

    Bengalis and a movement for a separate state of Kamatapur spearheaded by extremist

    organisations like KLO (Kamatapur Liberation Organisation) with close links with

  • extremist ULFA is there.

    From time to time the ancient kingdom of Kamrup in pre-Muslim era used to make

    conquering forays into adjacent areas of mainland India and had ruled many areas of

    mainland India. Consequently whole of Bengal, eastern part of Bihar mainly the

    Mithila area and Orissa were under rule of Kamrup for considerable period of time.

    The KAMRUPI SCRIPT is used only in areas which were part of ancient kingdom of

    Kamrup or were under the rule of Kamrup. So in Bihar this script is used only in

    Mithila area which was once under Kamrup rule but not in areas of Bihar to the west

    of it which were never under Kamrup rule. The similarities of these languages are

    also due to this fact of the Kamrup rule in all these areas. Currently the Maithilis ie.

    the indigenous people of Mithila use the Devnagari script for most purposes but they

    still retain the use of the Kamrupi script for religious purposes. But there is a move by

    several of the Maithili scholars to revive the Mithilakshar script, the name with which

    this form of Kamrupi script is known there. The alphabets of the Assamese and the

    Maithili versions are almost same and these scripts are phonetically complete.

    Whereas the form used by the Bengali is phonetically lacking because they do not

    have any alphabet to represent the sound "wa". This is because the Bengali was using

    the same script as is used by the Assamese till the coming of the British. Notable

    example is the alphabet for the sound "ra", they were using the same alphabet the

    Assamese and the Maithili uses till the British period. Then they started using the

    alphabet the Maithili uses for denoting "wa" for representing the sound "ra" and in

    the process ended up having no alphabet for representing "wa". The sound "wa" in

    the present Bengali is represented on assumption by the alphabet used to denote the

  • sound "ya".

    The largest in number and oldest in time of the literatures in this system of writing

    belong exclusively to the Assamese language. Maithili is the second oldest in terms

    of time of writing and literary history of Bengali in the pre-British era comes much

    later than the other two.

    In the British period due to unscrupulous manipulation by a section of Bengali

    scholars and intelligentsia, the British rulers were manipulated in believing that the

    Assamese is just a peasant form or patois of the Bengali language. Thus started more

    than seventy years of Bengali imposition as the official and educational language in

    Assam. It was due to the struggle of the American Baptist Missionaries

    complemented with the effort of the budding Assamese intelligentsia and right

    thinking Bengali intellectuals who opposed their parochial compatriots, that

    Assamese was given back its rightful place in Assam.

    There is now a move by a section of intellectuals to rename the writing system as

    EASTERN NAGARI, this is far more erroneous because this script has nothing to

    do with Nagari form of writing except a common source of borrowing of the schema

    or concept but not the alphabets from the Brahmi script.

    The script with which the Kamrupi script share the highest similarity is the Tibetan

    script. The major similarities of the Kamrupi and the Tibetan system of writing is the

  • use of angular or triangular form or shapes in the alphabets. I have mapped out the

    alphabets in the attached pdf file. The Kamrupi-Tibetan linkages which are quite

    obvious even to a layman have been overlooked and not thought of by other scholars

    who had written on the subject. The Unicode has therefore wrongly written about the

    Bengali(???)/Assamese script as being very similar with Devnagari. We are in the

    process of making a website on the issue of wrong nomenclature of this writing

    system, which will be in the form of a memorandum with provisions for obtaining

    signatures online and send to the Unicode Consortium for rectification of the

    injustice.

    The proposal for the same has three options for the rectification :

    FIRST : Give a separate slot to the Assamese writing system / fonts

    SECOND : Rename the script as KAMRUPI

    THIRD : Rename the script as AMBM (Assamese-Maithili-Bengali-Manipuri)

    The third option is given on the basis of chronological basis of the use of this script in

    the ancient and pre-British era.

    Let's hope the Unicode Consortium will take steps in the right direction. Please inform us

    whether it will be necessary for us to go ahead with the proposed Website for the purpose

  • of rectifying the mistake/injustice or sending the required information or documents

    by E-mail will suffice.

    Dr Satyakam Phukan
    Jorpukhuripar, Uzanbazar
    Guwahati, Assam (INDIA)
    P.I.N : 781001
    Phone : +91 99540 46357
    E-mail : [email protected]

    The links below will give more information, the first on the controversy and the second on the roots and connections of the Assamese language.

    http://rajivkonwar100.blogspot.com/2011/07/assamese-experts-question-sahitya-sabha.html

    Roots and Strings of the Assamese Language

    From: AZIZ-UL HAQUE [mailto:[email protected]]
    Sent: Thursday, July 21, 2011 8:25 AM
    To: [email protected]; [email protected]
    Cc: [email protected]
    Subject: Assamese writing system in unicode

    To
    Ms. Magda Danish
    Unicode Consortium Inc.
    USA

    Dear Madam

    Greetings from Guwahati, Assam, India. I am grieved to know that the Assamese

    writing system has been kept as a sub-class of Bengali in Unicode.

    In 1836 the British rulers imposed Bengali in Assam thinking that it was a patois or

  • colloquial dialect or distortion of Bengali. It took about 37 years of struggle which

    was spearheaded by the American Baptist Missionaries led by Dr. Miles Bronson in

    convincing the British administration that Assamese was a distinct language.

    Assamese was finally reinstated in 1873.

    The origin of Assamese script can be traced back to as early as 300 B.C., found in

    inscription on stones during the reign of Ashoka the Great. Thus, it has a long history

    and it developed through the ages. The ancient name of Assam was Kamrup and for a

    considerable period its territory was extended to the Mithila area of Bihar, Orissa and

    Bengal. There are sure proofs of distinct Kamrupi script which was written in the 8th

    century. The people of those areas came under the influence of culture and language

    of Kamrup. Moreover, there had been cordial relations of Kamrup with the

    neighboring kingdoms. The people of those areas either used this ancient Assamese

    script or borrowed the idea of this script. That is why there is a close affinity of

    Assamese with Bengali, Maithili, Oria (pronunciation) and Manipuri. There are

    many historical and documentary evidences to show that Assamese is a distinct

    language from Bengali.

    Therefore, Madam, we feel that a separate slot be given to Assamese or rename the

    script as Kamrupi. A third option can be to rename the script as AMBM for

    Assamese-Maithili-Bengali-Manipuri basing on the chronological development and

    use of the script in the ancient times.

  • Please let me know if I need to suffice with documentary evidences.

    Hope you will look into the matter and do the needful.

    With regards.
    Yours sincerely
    A. Haque

    Address: Aziz-ul Haque, Pastor, Guwahati Baptist Church, Panbazar, Guwahati-781001, Assam, India. Phone-09864023020.

  • N Ravi Shanker
    Additional Secretary
    Email: [email protected]
    Fax: +91-11-24363099

    GOVERNMENT OF INDIA
    MINISTRY OF COMMUNICATIONS AND INFORMATION TECHNOLOGY
    DEPARTMENT OF INFORMATION TECHNOLOGY
    ELECTRONICS NIKETAN
    6, C.G.O. COMPLEX
    New Delhi-110003

    D.O. No. 13(4)/2011-HCC(TDIL)

  • DOCUMENT B.05

  • BY SPEED POST

    GOVERNMENT OF INDIA

  • Minutes of the Meeting on Unicode for Assamese Writing System

    Department of Electronics & Information Technology, New Delhi

    June 13th, 2012

    Concerns were raised by Govt. of Assam regarding the nomenclature/ representation of the Assamese

    Writing System in the Unicode Standard. A meeting of Assamese & Bengali experts and Officials of Govt.

    of Assam and West Bengal and some industry experts was organized on June 13th, 2012 at 11:00 AM at

    Department of Electronics & Information Technology, Electronics Niketan, Lodhi Road, New Delhi to

    address the issues raised. The list of the participants is placed at Annexure-I

    Representatives of Government of Assam mentioned that there is erroneous nomenclature of the

    Assamese writing system as a subclass of Bengali in the Unicode Standard. They submitted that the

    script used for writing Assamese, Bengali, Maithili and Manipuri is "Kamrupi" as it was the writing

    system used in ancient Kamrup state, whereas in the Unicode Standard it is mentioned as "Bengali". An

    email dated 13-7-2011 was sent to Unicode Consortium by Shri Satyakam Phukan. In the mail Shri

    Phukan had given his arguments for the name change and suggested three options:

    First: Give a separate slot to the Assamese writing system / fonts

    Second: Rename the script as KAMRUPI

    Third: Rename the script as AMBM (Assamese-Maithili-Bengali-Manipuri)

    Based on this e-mail, Unicode Consortium included Assamese also in the Code-Chart list hosted on

    Unicode website (http://www.unicode.org/charts/). "Bengali" was changed as "Bengali and Assamese"

    on this web link.

    Representatives of Government of Assam also requested to change the name of Bengali script as

    "Assamese-Bengali" script based on alphabetical order so as to give due recognition to Assamese

    Writing System also.

    Assamese experts also requested to allocate a separate code block for existing Assamese writing system

    and to cater to futuristic needs of various dialects of Assamese.

    It was also discussed that name change request need to be examined by Unicode Consortium as

    neighboring country Bangladesh is also using Bengali.

    Experts from industry apprised the members that the internet security threat may arise with the

    duplicate encoding of the same glyph with the implementation of Internationalized Domain Names

    (IDN).
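    The kind of check that the IDN concern implies can be sketched as follows. This is a rough illustration only: it assumes that the first word of a character's Unicode name (LATIN, CYRILLIC, BENGALI and so on) is an acceptable stand-in for its script, which is a simplification, and the function names `scripts_in` and `is_mixed_script` are hypothetical. Real IDN registries apply their own, stricter rules.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Return the set of script words found among the letters of a label.

    Simplification: the first word of the Unicode character name is
    treated as the script of that letter.
    """
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def is_mixed_script(label: str) -> bool:
    """True when the letters of the label come from more than one script."""
    return len(scripts_in(label)) > 1

# "paypal" in Latin letters, then with its first two letters swapped
# for the look-alike Cyrillic letters U+0440 and U+0430.
print(is_mixed_script("paypal"))              # False
print(is_mixed_script("\u0440\u0430ypal"))    # True
```

    A check of this kind only catches look-alikes drawn from different scripts; it says nothing about how two separately encoded but identical-looking scripts would be handled, which is the question the industry experts raised.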

    Based on the discussions following points were agreed:

    1. The name change of the script from "Bengali" to "Bengali and Assamese" reflected on the

    website needs to be reflected in other parts of the text in the standard. The names of the

    characters are indicated currently as per Bengali Script and additional annotation with

    respect to Assamese can be taken up with Unicode for addition. Govt. of Assam may submit

    the detailed proposals covering these aspects.

    2. Government of Assam shall examine the futuristic need of additional requirements of

    Assamese and its dialects and submit a report to DeitY alongwith the requisite documentary

    support.

    The meeting ended with the vote of thanks to the Chair.

    DOCUMENT B.06

  • Annexure-I

    1. Dr. Rajendra Kumar, Joint Secretary, DeitY, New Delhi (In Chair)

    2. Shri Shantanu Thakur, IAS, Commissioner, Excise Department, Govt of Assam

    3. Shri M.K.Yadava, IFS, Managing Director, AMTRON

    4. Mrs. Swaran Lata, DeitY, New Delhi

    5. Prof. B B Chaudhury, ISI, Kolkata; Govt. of West Bengal & SNLTR-West Bengal

    6. Shri Monoj Kr. Baruah, Dy. Manager, AMTRON

    7. Prof. Lilabati Saikia Bora, Department of Assamese, Gauhati University

    8. Dr. Sikhar Sarma, Professor & Head, Dept. Of IT, Gauhati University

    9. Dr. Utpal Sharma, Associate Professor, Tezpur University

    10. Shri Bhaskarjyoti Sarma, Lecturer, Department of Assamese, Dibrugarh University

    11. Dr. Shakuntala Mahanta, HSS Dept, IIT Guwahati

    12. Prof. Sivaji Bandyopadhyay, Jadavpur University, Kolkata

    13. Shri Debashis Mazumdar - Joint Director, CDAC, Kolkata

    14. Shri Akshat Joshi, CDAC, Pune

    15. Shri Vijay Kumar, DeitY, New Delhi

    16. Shri Manoj K Jain, DeitY, New Delhi

  • Reference number
    ISO 15919:2001(E)

    © ISO 2001

    INTERNATIONAL STANDARD
    ISO 15919

    First edition
    2001-10-01

    Information and documentation - Transliteration of Devanagari and related Indic scripts into Latin characters

    Information et documentation - Translittération du Devanagari et des écritures indiennes liées en caractères latins

    DOCUMENT C.01

  • PDF disclaimer

    This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

    Adobe is a trademark of Adobe Systems Incorporated.

    Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

    © ISO 2001

    All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

    ISO copyright office
    Case postale 56, CH-1211 Geneva 20
    Tel. + 41 22 749 01 11
    Fax + 41 22 749 09 47
    E-mail [email protected]
    Web www.iso.ch

    Printed in Switzerland

  • Contents

    1 Scope
    2 Conformance
    3 Normative references
    4 Terms and definitions
    5 Abbreviated terms
    6 Characteristics of Indic scripts
    7 Transliteration tables
    8 Special requirements and recommendations
    8.1 Special requirements
    8.2 Recommendations
    9 Options
    10 Tables for uniform transliteration of Indic scripts
    11 Transliteration scheme for limited character set
    12 Recommended transliteration of Indic schemes for Perso-Arabic characters
    13 Additional Indic scripts
    14 Reverse transliteration
    Annex A (normative) Tables for uniform transliteration
    Annex B (normative) Transliteration table for limited (7-bit) character set
    Annex C (normative) Recommended transliteration of Indic schemes for Perso-Arabic characters
    Annex D (informative) Examples of Indic characters used for Perso-Arabic
    Annex E (informative) Additional Indic scripts
    Annex F (informative) Reverse transliteration of Indic scripts
    F.1 Overview
    F.2 Examples of reverse transliteration in modern Indic languages
    F.3 Reverse transliteration in Vedic texts
    Bibliography

    DOCUMENT C.01

    Foreword

    ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

    International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.

    Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

    Attention is drawn to the possibility that some of the elements of this International Standard may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

    International Standard ISO 15919 was prepared by Technical Committee ISO/TC 46, Information and documentation, Subcommittee SC 2, Conversion of written languages.

    Annexes A, B and C form a normative part of this International Standard. Annexes D, E and F are for information only.

    Introduction

    Script conversion is often required for documents such as historical and literary texts, geographical texts (including maps and atlases), bibliographies, catalogues, lists and passports (and other identification documents).

    Text in Devanagari script or other Indic scripts sometimes needs to be shown in Latin script, where users, or equipment that they are using, cannot read or write the text.

  • INTERNATIONAL STANDARD ISO 15919:2001(E)

    Information and documentation - Transliteration of Devanagari and related Indic scripts into Latin characters

    1 Scope

    This International Standard provides tables which enable the transliteration into Latin characters from text in Indic scripts which are largely specified in rows 09 to 0D of UCS (ISO/IEC 10646-1 and Unicode).

    The tables provide for the Devanagari, Bengali (including the characters used for writing Assamese), Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, and Telugu scripts which are used in India, Nepal, Bangladesh and Sri Lanka. The Devanagari, Bengali, Gujarati, Gurmukhi, and Oriya scripts are North Indian scripts, and the Kannada, Malayalam, Tamil, and Telugu scripts are South Indian scripts.

    The Burmese, Khmer, Thai, Lao and Tibetan scripts which also share a common origin with the Indic scripts, and which are used predominantly in Myanmar, Cambodia, Thailand, Laos, Bhutan and the Tibetan Autonomous Region within China, are not covered by this International Standard.

    This International Standard applies to transliteration of Devanagari, and to Indic scripts related to Devanagari, independent of the period in which it is or was used (i.e. for Devanagari script it can be used for transliterating text in classical Sanskrit, Hindi, Marathi, and the Vedic language, for instance).

    Other Indic scripts whose character repertoires are covered by the tables may also be transliterated using this International Standard.

    Options in this International Standard are defined in clause 9.

    2 Conformance

    Text originally in non-Latin script which is converted to a Latin-script representation conforms to this International Standard with or without any of the specific recommendations, if it follows the rules defined in 8.1 and the conversion tables given in clause 7 and normative annexes A and B, with or without following any of the three recommendations given in 8.2 and clause 12, all in accordance with the options defined in clause 9.

    A claim of conformance shall specify which options have been chosen, and which recommendations have been followed.

    3 Normative references

    The following normative documents contain provisions which, through reference in this text, constitute provisions of this International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.

    ISO/IEC 10646-1, Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane

    ISO/IEC 646:1991, Information technology - ISO 7-bit coded character set for information interchange

    4 Terms and definitions

    For the purposes of this International Standard, the following terms and definitions apply.

    4.1 conversion
    representing graphic characters from a source script by the graphic characters of a target script, most commonly by romanization

    NOTE The two basic methods of conversion of a system of writing are transliteration and transcription. The use of the terms source script and target script in transliteration is analogous to the terms source language and target language in translation.

    4.2 script
    set of graphic characters used for the written form of one or more languages

    4.3 graphic character
    character (other than a control character) that has a visual representation, normally handwritten, printed or displayed

    NOTE A graphic character is a single element of a script. Examples are letters, conjunct characters, numerical digits, punctuation marks or diacritical marks.

    4.4 reverse transliteration
    process whereby the characters of a target script are transliterated into those of the source script

    NOTE This International Standard aims to enable reverse-transliterated text to be identical to the original source text up to equivalent orthography. However, non-reversible transcription-like transliterations are often found to be useful when quoting recent material.

    4.5 romanization
    conversion of non-Latin graphic characters into Latin graphic characters, using either transliteration or transcription

    4.6 transcription
    representation of the sounds of a source language by graphic characters associated with a target language

    4.7 transliteration
    representation of the graphic characters of a source script by the graphic characters of a target script

    NOTE In transcription, pronunciation conventions are of primary importance, while in transliteration, writing conventions are of primary importance.

    4.8 UCS
    Universal Multiple-Octet Coded Character Set (UCS) as defined in ISO/IEC 10646-1

    NOTE 1 The Indic scripts listed in ISO/IEC 10646-1:1993 form a subset (with identical codes) of the Indic scripts listed in ISO/IEC 10646-1:2000. Similarly, the Indic scripts listed in the Unicode standard (version 1.0 onwards) form a subset (with identical codes) to the Indic scripts listed in ISO/IEC 10646-1:2000 and the Unicode standard, version 3.0. Any of these standards provide valid character codes for the specific characters concerned.

    NOTE 2 ISO/IEC 10646-1 is increasingly used for providing character identifiers in a wide range of International Standards, including some in this International Standard. Use of these identifiers does not impose any requirements to use ISO/IEC 10646-1 or any other character coding standard to represent either the source characters or the target characters in any computer system or in information interchange.

    5 Abbreviated terms

    Ben. Bengali script

    Dev. Devanagari script

    Guj. Gujarati script

    Gur. Gurmukhi script

    Kan. Kannada script

    Mal. Malayalam script

    Ori. Oriya script

    Tam. Tamil script

    Tel. Telugu script

    Sin. Sinhala script

    P-A. Perso-Arabic script

    6 Characteristics of Indic scripts

    Characters in Indic scripts represent vowels, consonants and their combinations; nasalization, breathings, numerals and punctuation.

    Each vowel has a full form (occupying a full character space in text, and required when beginning a word or in vowel hiatus) and a combining form (mātrā) used when the vowel follows a consonant, except that the short a standing at the beginning of Indic alphabets has only a full form, because no mātrā is required (see below).

    Consonants include stops, semivowels, spirants, and other speech sounds. Stop consonants are arranged in classes, or vargas, according to the point of articulation, and within each class are subdivided into unvoiced or voiced, unaspirated or aspirated consonants, and a nasal consonant.

    Characters for consonants are most simply quoted in a form which includes the inherent vowel a, as in the first consonant ka in Table 1. The inherent vowel is removed by the virāma sign of the relevant script (Dev. ◌्, Ben. ◌্, Guj. ◌્, Gur. ◌੍, Ori. ◌୍, Tam. ◌், Tel. ◌్, Kan. ◌್, Mal. ◌്, Sin. ◌්). The relevant mātrā is used when any other vowel follows a consonant. Consonant clusters frequently form conjunct characters. Use of virāma to form consonant clusters is unusual, except in Tamil where it is the normal method. When a mātrā is associated with a consonant, it replaces the inherent vowel. Mātrās have various forms, even in a single script, and details may be found in dictionaries and grammars.
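    The interplay of inherent vowel, combining vowel sign and vowel-suppressing sign described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it covers a handful of Devanagari letters, the Latin values are the commonly published ISO 15919 romanizations, and every name in it (CONSONANTS, MATRAS, transliterate) is hypothetical rather than taken from the standard. None of the special rules of clause 8 is applied.

```python
# Minimal sketch: inherent vowel, matra and virama handling for a few
# Devanagari letters. The mapping is a tiny illustrative subset, not the
# full Table 1, and the clause 8 rules are deliberately ignored.
VIRAMA = "\u094D"      # Devanagari sign virama
INHERENT = "a"         # inherent vowel carried by every bare consonant

CONSONANTS = {
    "\u0915": "k",     # ka
    "\u0924": "t",     # ta
    "\u0928": "n",     # na
    "\u092E": "m",     # ma
    "\u0930": "r",     # ra
    "\u0932": "l",     # la
}

MATRAS = {
    "\u093E": "\u0101",  # vowel sign aa -> a with macron
    "\u093F": "i",       # vowel sign i
    "\u0940": "\u012B",  # vowel sign ii -> i with macron
    "\u0941": "u",       # vowel sign u
    "\u0942": "\u016B",  # vowel sign uu -> u with macron
}


def transliterate(text: str) -> str:
    """Romanize text, emitting the inherent vowel unless a matra or virama follows."""
    out = []
    pending = False  # True while the last consonant still owes its inherent vowel
    for ch in text:
        if ch in MATRAS:
            out.append(MATRAS[ch])   # the matra replaces the inherent vowel
            pending = False
        elif ch == VIRAMA:
            pending = False          # the virama removes the inherent vowel
        else:
            if pending:
                out.append(INHERENT)
            if ch in CONSONANTS:
                out.append(CONSONANTS[ch])
                pending = True
            else:
                out.append(ch)       # unknown characters pass through unchanged
                pending = False
    if pending:
        out.append(INHERENT)
    return "".join(out)
```

    With this sketch, the three-consonant sequence ka, ma, la yields "kamala", while ma, vowel sign i, ta, virāma, ra yields "mitra": the virāma suppresses the a after t and the two consonants are read as a cluster.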

    It is important to note that many Indic characters have variant forms. Such differences of orthography are not distinguished in this International Standard.

    Devanagari is used for writing various modern languages, such as Hindi, Marathi, Rajasthani and other languages in India, and Nepali in Nepal. Devanagari and most of the other Indic scripts are used for writing classical languages often used in religious texts, such as the Sanskrit and Vedic languages, and Pali. In some cases, text in Indic scripts uses additional characters for writing words in languages which do not normally use these scripts. Thus some Urdu consonants are typically represented by adding a dot (nuqta) below certain letters (see Table 1, normative annex C and informative annex D). Two English vowels may also be represented. Devanagari has also been extended to write South Indian languages.

    Sinhala script (used in Sri Lanka) has additional letters, in comparison with the scripts which are used in India, Nepal and Bangladesh. Tamil script (used in South India and also in Sri Lanka) uses fewer characters, in comparison with other scripts which are used in India, Nepal, Bangladesh and Sri Lanka.

    When the Bengali script is used to write the Assamese language (in parts of North India), two characters not used in writing Bengali are required. Hence the Assamese script is sometimes regarded as separate from the Bengali script.

    7 Transliteration tables

    7.1 The transliteration from each Indic script to the Latin script shall be as specified in the Tables 1 to 10 and A.3, subject to the rules specified in 8.1 and the options specified in clause 9.

    7.2 The structure of the transliteration tables is explained in the following paragraphs.

    The target characters (Latin script) fall within the ranges 0020-01FF and 0300-0332 of ISO/IEC 10646-1:2000.
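    The two target ranges can be turned into a simple repertoire check on transliterated output. The sketch below is an assumption-laden reading, not something the standard prescribes: it assumes that letters with diacritics are compared in decomposed form (base letter plus combining mark), since a precomposed letter such as U+1E6D lies outside both ranges while its parts lie inside them. The names TARGET_RANGES and in_target_repertoire are hypothetical.

```python
import unicodedata

# Target repertoire as stated in the text: 0020-01FF and 0300-0332.
TARGET_RANGES = ((0x0020, 0x01FF), (0x0300, 0x0332))


def in_target_repertoire(text: str, decompose: bool = True) -> bool:
    """Return True if every character of text falls inside the target ranges.

    With decompose=True (an assumption of this sketch) the text is first put
    into Unicode normalization form NFD, so that a base letter and its
    combining diacritic are checked separately.
    """
    if decompose:
        text = unicodedata.normalize("NFD", text)
    return all(
        any(low <= ord(ch) <= high for low, high in TARGET_RANGES)
        for ch in text
    )
```

    Under this reading, a romanized word containing a macron or a dot below passes the check once decomposed, whereas any remaining Indic character fails it.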

    The repertoires for many of the source characters fall within the following ranges of ISO/IEC 10646-1:2000, for the script concerned:

    0900-097F Devanagari

    0980-09FF Bengali

    0A00-0A7F Gurmukhi

    0A80-0AFF Gujarati

    0B00-0B7F Oriya

    0B80-0BFF Tamil

    0C00-0C7F Telugu

    0C80-0CFF Kannada

    0D00-0D7F Malayalam

    0D80-0DFF Sinhala
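    The ranges listed above lend themselves to a direct lookup from a character to the script block that contains it. The following Python sketch simply encodes the ten ranges as given; the names SCRIPT_BLOCKS and script_of are hypothetical, and the lookup says nothing about which language a character is used to write.

```python
from typing import Optional

# The ten source ranges exactly as listed in the text (inclusive bounds).
SCRIPT_BLOCKS = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0980, 0x09FF, "Bengali"),
    (0x0A00, 0x0A7F, "Gurmukhi"),
    (0x0A80, 0x0AFF, "Gujarati"),
    (0x0B00, 0x0B7F, "Oriya"),
    (0x0B80, 0x0BFF, "Tamil"),
    (0x0C00, 0x0C7F, "Telugu"),
    (0x0C80, 0x0CFF, "Kannada"),
    (0x0D00, 0x0D7F, "Malayalam"),
    (0x0D80, 0x0DFF, "Sinhala"),
]


def script_of(ch: str) -> Optional[str]:
    """Return the block name for a single character, or None if it is outside all ranges."""
    code_point = ord(ch)
    for low, high, name in SCRIPT_BLOCKS:
        if low <= code_point <= high:
            return name
    return None
```

    Note that the two characters required for Assamese (code points 09F0 and 09F1) fall inside the range labelled Bengali, so a lookup of this kind reports them under that block name.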

    Some additional Indic scripts whose character repertoires are included in the character repertoires of these scripts are listed in informative annex E.

    Consonants are shown with their inherent vowel a.

    Only a single form of each Indic character is shown, just as in ISO/IEC 10646-1. Specifications of alternative forms of these characters, including shapes when these are included in conjunct forms or in consonant-vowel combinations, are outside the scope of this International Standard.

    This clause gives tables for each script, with references to the rules of 8.1. Numerals are shown in Table A.3 of annex A. Tables 1 to 10 are in the order of ISO 10646-1:2000. Vowels are shown in full form followed by a typical form of the corresponding mātrā.

    Normative annex A gives tables showing linguistically equivalent characters in each script (except that Gurmukhi Bindi is not exactly equivalent to anusvara in the other scripts). Extended and ancient characters, apart from numerals, are shown in Table A.2 unless an equivalent modern character exists in another script, in which case they are enclosed in round brackets in Table A.1. (See also the requirements in clause 10.) In Tables A.1 to A.3 the scripts are ordered according to similarity of character repertoires.

    A few rare characters for which attestation is not currently available are omitted.

    Normative annex B gives the transliteration table (Table B.1) that shall be used when it is necessary to avoid use of Latin letters with diacritics.

    Normative annex C gives the recommended method of transliterating Indic characters specified as representing Perso-Arabic characters (Table C.1 and its rules of application).

    In the Ref. column of all these tables, the 3-digit decimal references are derived from hexadecimal to decimal conversion of character codes in ISO/IEC 10646-1:2000. Note that the earlier International Standard ISO/IEC 10646-1:1993 also includes these decimal codes explicitly in its tables, in case visual comparisons are required between this International Standard and ISO/IEC 10646-1.

    3-digit decimal characters with an additional letter refer to characters not in ISO/IEC 10646-1:2000.

    The order of characters in tables follows approximate alphabetical order, rather than the order in ISO/IEC 10646-1:2000.
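    The derivation of the Ref. column can be illustrated with a short Python sketch. It rests on an assumption inferred from the tables rather than stated in the text: the 3-digit reference appears to be the decimal value of the low-order byte of the character code (for example, code 0905 gives 005 and code 0985 gives 133). The function names ref_for and char_for are hypothetical, and references carrying an additional letter (such as 048a) are outside the scope of the sketch because they denote characters not in ISO/IEC 10646-1:2000.

```python
def ref_for(ch: str) -> str:
    """Return the 3-digit decimal reference for a character.

    Assumption of this sketch: the reference is the low-order byte of the
    character code, written in decimal and zero-padded to three digits.
    """
    return f"{ord(ch) & 0xFF:03d}"


def char_for(ref: int, block_start: int) -> str:
    """Recover the character for a numeric reference within a given script range.

    block_start is the first code of the script's range, e.g. 0x0900 for
    Devanagari or 0x0980 for Bengali. Only the high-order byte of block_start
    is used, since the reference supplies the low-order byte.
    """
    if not 0 <= ref <= 0xFF:
        raise ValueError("reference must fit in one byte (0-255)")
    return chr((block_start & 0xFF00) | ref)
```

    Read this way, reference 240 in the Bengali table corresponds to code 09F0 and reference 241 to code 09F1, the two characters required for writing Assamese.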

    Table 1 Transliteration of Devanagari script

    Ref. | Indic | Transliteration | Ref. | Indic | Transliteration | Ref. | Indic | Transliteration
    005 | अ | a; Rule 2 (note a) | 027 | छ | cha | 053 | व | va
    006 | आ ◌ा | ā | 028 | ज | ja | 054 | श | śa
    007 | इ ◌ि | i | 029 | झ | jha | 055 | ष | ṣa
    008 | ई ◌ी | ī | 030 | ञ | ña | 056 | स | sa
    009 | उ ◌ु | u | 031 | ट | ṭa | 057 | ह | ha
    010 | ऊ ◌ू | ū | 032 | ठ | ṭha | 088 | क़ | qa
    011 | ऋ ◌ृ | r̥ | 033 | ड | ḍa | 089 | ख़ |
    096 | ॠ ◌ॄ | r̥̄ | 034 | ढ | ḍha | 090 | ग़ | ġa
    012 | ऌ ◌ॢ | l̥ | 035 | ण | ṇa | 091 | ज़ | za
    097 | ॡ ◌ॣ | l̥̄ | 036 | त | ta | 092 | ड़ | ṛa
    015 | ए ◌े | e | 037 | थ | tha | 093 | ढ़ | ṛha
    016 | ऐ ◌ै | ai | 038 | द | da | 094 | फ़ | fa
    019 | ओ ◌ो | o | 039 | ध | dha | 048a | | (note c)
    020 | औ ◌ौ | au | 040 | न | na | 051 | ळ | ḷa
    013 | ऍ ◌ॅ | ê (note b) | 042 | प | pa | 002 | ◌ं | ṁ; Rules 3, 5, 8 (note a)
    017 | ऑ ◌ॉ | ô (note b) | 043 | फ | pha | 001 | ◌ँ | m̐; Rules 4, 5, 8 (note a)
    021 | क | ka | 044 | ब | ba | 003 | ◌ः | ḥ
    022 | ख | kha | 045 | भ | bha | 003a | |
    023 | ग | ga | 046 | म | ma | 003b | |
    024 | घ | gha | 047 | य | ya | 061 | ऽ | Rule 15 (note a)
    025 | ङ | ṅa | 048 | र | ra | | |
    026 | च | ca | 050 | ल | la | | |

    NOTE 1 Additional characters from Extended Devanagari may be found in Table A.1. See also Table D.1.

    NOTE 2 The treatment of Vedic accents may be found in 8.1 (Rule 14 in clause 8), 8.2 and Table B.1.

    a See clause 8.
    b English vowels as in bêṭa, bôla, English bat, ball.
    c Used in Marathi and Nepali.

    Table 2 Transliteration of Bengali script

    Ref. | Indic | Transliteration | Ref. | Indic | Transliteration | Ref. | Indic | Transliteration
    133 | অ | a; Rule 2 (note a) | 154 | চ | ca | 174 | ম | ma
    134 | আ ◌া | ā | 155 | ছ | cha | 175 | য | ya
    135 | ই ◌ি | i | 156 | জ | ja | 176 | র | ra
    136 | ঈ ◌ী | ī | 157 | ঝ | jha | 240 | ৰ | ra (note b)
    137 | উ ◌ু | u | 158 | ঞ | ña | 178 | ল | la
    138 | ঊ ◌ূ | ū | 159 | ট | ṭa | 241 | ৱ | va (note b)
    139 | ঋ ◌ৃ | r̥ | 160 | ঠ | ṭha | 182 | শ | śa
    224 | ৠ ◌ৄ | r̥̄ | 161 | ড | ḍa | 183 | ষ | ṣa
    140 | ঌ ◌ৢ | l̥ | 162 | ঢ | ḍha | 184 | স | sa
    225 | ৡ ◌ৣ | l̥̄ | 163 | ণ | ṇa | 185 | হ | ha
    143 | এ ◌ে | e | 164 | ত | ta | 220 | ড় | ṛa
    144 | ঐ ◌ৈ | ai | 165 | থ | tha | 221 | ঢ় | ṛha
    147 | ও ◌ো | o | 166 | দ | da | 223 | য় | ẏa; Rule 9 (note a)
    148 | ঔ ◌ৌ | au | 167 | ধ | dha | 156a | | za (note c)
    149 | ক | ka | 168 | ন | na | 172a | | wa (note c)

    150 //// khakhakha