custom scripting language 101

Upload: redbandit

Post on 14-Apr-2018


  • 7/30/2019 Custom Scripting Language 101

    1/29

    Custom Scripting Language 101

    By Justin Sterling (Celestialkey)

    July 20, 2010
    July 22, 2010


    Index

    1. What is a scripting language?
    2. What makes up a scripting language?
    3. Creating Rules and Token Identifiers
    4. The Script Reader Header
    5. The Script Reader Source
    6. Code Listing


    What is a Scripting Language?

    When you ask what a scripting language is, you will often get varying answers. The wiki definition, "A scripting language, script language or extension language is a programming language that allows control of one or more software applications.", is good enough for generalization purposes, though it is vaguer than what you would normally hear. For our purposes, a scripting language can be considered more specifically to be a third-party language, independent of the actual application, that modifies the flow of the main application.

    Script languages come in many different forms. You have the very coherent scripting languages such as TorqueScript (Torque Engine), UnrealScript (Unreal Tournament II + III), Lua (Tibia and other games), JASS (Warcraft III TFT), Game Maker Script (Game Maker 5.4+), and the list continues. These are professionally created languages that help players interact with the game and create customizable worlds. Then there are other languages such as BrainFuck,

        ++++++++++[>+++++++>++++++++++>+++>+.

    Outputs "Hello World" in BrainFuck

    LOLCODE,

        HAI
        CAN HAS STDIO?
        VISIBLE "HAI WORLD!"
        KTHXBYE

    Outputs "Hai World" in lolcats speech

    Chef,

        Put cinnamon into 2nd mixing bowl

    Pushes an item onto a stack

    Befunge,

        "dlroW olleH">:v
                     ^,_@

    and even more. These languages are esoteric and are sometimes almost impossible to read without at least a minor mastery of the language. They are FAR more advanced than anything we will cover, since they are amazingly huge and complex. We will keep ours simple enough to finish this eBook quickly, yet sophisticated enough to build a language from scratch and have it be useful.


    What makes up a scripting language?

    Scripting languages are built from five main parts. You first have a symbol table, which contains everything related to your language: all the keywords, the symbols, the data types, etc. After the symbol table comes the lexical analyzer, which converts a character stream (the file loaded with your script) into tokens (keywords, operators, data types, etc.). From there, the tokens are passed into a parser, which collects the tokens passed in and builds a syntax tree (how an operation or keyword is to be used). After the tree is built, your new command is passed into a semantic checker that looks for any semantic errors inside the command (e.g. 2 parameters are given as opposed to 3, or you're missing a parenthesis). You then have an intermediate code generator, which does nothing more than execute the command passed in. You also have a few optional parts to add in, such as a bytecode generator to "assemble" the code into an easier-to-read format, or an optimizer to increase the speed of the code.

    Let's take an example of the final product we will end up with at the end of this eBook. Below is an example of a working script that will properly run in our application.

        // Test Script
        // Celestialkey
        // For the book!
        // Http://Www.CelestialCoding.com
        // The '//' will denote a comment
        Var name def ;                            // Create a variable and just place "def" inside it
        Var realName def ;
        Set name Celestialkey ;                   // Set variables to strings
        Print "Hello from " name "!!!" "\n" ;     // Print to the screen with normal
                                                  // string, formatting, and with variables
        Print "What is your name?" "\n" "|> " ;   // Ask the user for their name
        GetInput name ;                           // Get input and store it inside
                                                  // variable 'name' that we created
        Set realName name ;                       // Set variables equal to each other
        Print "Hello, " realName "!!!" "\n" ;
        //
        $ ;                                       // End of script character sequence

    For now, let's start at the bare bones and check out how to create a lexical analyzer from scratch.


    Creating the Rules and Token Identifiers

    When writing a lexical analyzer, you need to plan out how you want your language to be structured. You can theoretically create an entire new language similar to C++ or a more esoteric one similar to the above examples. Ours will be very high level, and the language will be interpreted, not compiled. This keeps things simple.

    Our language will follow these rules:

    1. All lines will end with a space followed by a semicolon
    2. When concatenating strings, a space is required between each part
    3. Variables must be declared before they can be used
    4. The value "$ ;" will designate the end of the script (eos)
    5. This language will not support if/then or grouping (brackets)
    6. This language will be basic, but can easily be expanded on
    7. No whitespace can exist between the final command and the eos. Using a comment line can cheat this though.
    8. Comments are designated by '//'

    and have the following main tokens:

    1. Print
    2. ;
    3. Var
    4. GetInput
    5. Set

    Pretty simple when you compare it to the other MAIN high-level languages out there such as C++ or C. You can easily add your own features to this language though. With those rules now defined, we need to ask ourselves, "How is this going to store everything?" That's a very good question. I quite literally spent 48 hours coding and recoding different designs to make a flexible language. To be honest, it is very difficult. The best way to create a script language is to convert the script into opcodes and then build a virtual machine to run the custom-made assembly inside. That is beyond the scope of this book though, so I had to think of a way to make an interpreter. After some debate over 3 good designs that finally worked, I chose the one with the ability to access variables easily, since that is where most work is normally done when you script anything. The basic gist of the lexical analyzer and parser is below. Remember, this is the flow of logic, not the pseudocode.

        Scan each line character by character looking for white space or special
        characters that describe the end of a token or beginning of a
        special token.
        After a special character has been found, push the token onto a vector
        that stores all discovered tokens.
        Repeat each process until the token '$ ;' is encountered.

        Load each token from the discovered tokens list and compare them to the
        function token list. If a match is found, set it to the current
        position which would equate to one of the enumerated functions
        predefined.
        After loading the operation/function, assume all remaining tokens to be
        parameters until encountering a ';'.
        Upon hitting a ';', the new command is pushed onto a command list stack
        to be executed later.
        Repeat each process until the token '$ ;' is encountered.

        Load one command at a time and select the correct operation to use.
        Once the correct operation is discovered, loop through the remaining
        parameters list and perform the requested operation by
        parsing the parameter into a single string.
        Execute the command.

    Pretty straightforward, right? The code is a bit messy once we jump into it though. Let's take a look at the header code for our script reader.


    The Script Reader Header

        #ifndef CELESTIALANALYZER_H_
        #define CELESTIALANALYZER_H_

        // Standard headers needed by the declarations and code that follow
        #include <cstdlib>
        #include <fstream>
        #include <iostream>
        #include <string>
        #include <vector>

        using namespace std;

        // Functions List
        enum{
            PRINT,
            ADD,
            VAR,
            GETINPUT,
            SET,
            SUBTRACT
        };

        // Variable types list
        enum{
            _STRING,    // String
            _INT,       // Integer
            _OPR        // Operation
        };

        struct command{
            int Operation;
            vector<string> ParameterList;
        };

        struct variable{
            string name;
            string value;
        };

        struct varType{
            int Type;
            string sValue;
            int iValue;
        };

        class CelestialAnalyzer{
        private:
            vector<string>   m_TokenList;          // Will store all tokens we compare for
            vector<string>   m_DiscoveredTokens;   // Will store all found tokens
            vector<command>  m_CommandList;        // Will store parsed commands from token list
            vector<variable> m_VariableList;       // Will store variables created in the script

        public:
            CelestialAnalyzer();
            void ReadInTokens(ifstream *f);
            varType AnalyzeParameter(string, int);
            void Parse();
            int ParseCommand(int);
            int ExecuteCommands();
        };

        #endif

    Let us break this into chunks and review them one by one. Skipping the headers and the defines, we come to this part:

        // Functions List
        enum{
            PRINT,
            END_LINE,
            VAR,
            GETINPUT,
            SET
        };

    This is the function list. It will have a 1:1 relation to our TokenList, meaning the data we push onto the token list will have to follow the same order as this. These values will be used to assign function 'operations' to each command. Notice we will not be using END_LINE. That is more of a space filler so we maintain the 1:1 relation to our TokenList.
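The 1:1 relation can be sketched as a lookup whose result is directly comparable to the enum. This is only an illustration under stated assumptions: `makeTokenList` and `lookupOperation` are hypothetical helper names, not functions from the book's class, and the list uses the END_LINE variant of the enum.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Same order as the token strings below; END_LINE only fills the slot of ";".
enum { PRINT, END_LINE, VAR, GETINPUT, SET };

// Hypothetical helper: builds the token list in enum order.
std::vector<std::string> makeTokenList() {
    return {"Print", ";", "Var", "GetInput", "Set"};
}

// Hypothetical helper: returns the position of `token` in the list,
// which equals the matching enum value, or -1 if the token is unknown.
int lookupOperation(const std::vector<std::string>& tokenList,
                    const std::string& token) {
    for (std::size_t i = 0; i < tokenList.size(); ++i) {
        if (tokenList[i] == token) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

If the filler entry were dropped from only one of the two lists, every later token would map to the wrong operation, which is why the order has to be kept in sync.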

        // Variable types list
        enum{
            _STRING,    // String
            _INT,       // Integer
            _OPR        // Operation
        };

    The variable types will be used when we go to look at parameters. We will analyze each one and then decide if it is an integer, an operator, or a string. In our case, since we are keeping this simple, we will go off the basis of an assumption and check the very first character. If it is 0 through 9, then we will assume the entire parameter is an integer. If it falls into the '+', '-', etc. area, we would say it is an operator. I just added this to allow for expansion. We won't actually use it in this book.
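A first-character check of this kind might look like the sketch below. It is an assumption-laden illustration rather than the book's AnalyzeParameter(): `ParamType`, `Classified`, and `classify` are hypothetical names, and the operator set ('+', '-', '=') is chosen only for the example.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

enum ParamType { TYPE_STRING, TYPE_INT, TYPE_OPR };

struct Classified {
    ParamType type;
    std::string sValue;  // always holds the original text
    int iValue;          // only meaningful when type == TYPE_INT
};

// Decide the type of a parameter by looking at its first character only.
Classified classify(const std::string& p) {
    Classified v{TYPE_STRING, p, 0};
    if (p.empty()) {
        return v;  // nothing to inspect, treat as an empty string
    }
    const unsigned char first = static_cast<unsigned char>(p[0]);
    if (std::isdigit(first)) {
        v.type = TYPE_INT;
        v.iValue = std::atoi(p.c_str());  // parses the leading digits
    } else if (p[0] == '+' || p[0] == '-' || p[0] == '=') {
        v.type = TYPE_OPR;
    }
    return v;
}
```

Note that the digit test compares against characters ('0' to '9'), not the numbers 0 to 9, and that anything unrecognized falls back to a string.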

        struct command{
            int Operation;
            vector<string> ParameterList;
        };

    Creating a vector of type 'command' is the simplest way I've found to store the commands after parsing has completed. In this structure, we have an int which will store the operation to be performed. This will be loaded using the 1:1 relation I explained earlier. ParameterList holds the tokens that follow the operation.

        struct variable{
            string name;
            string value;
        };

    All variables are treated as strings to make things simple. If we need to perform math, we can always just convert the character array into an integer using atoi() or something similar.
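One caveat with atoi() is that it returns 0 for text that is not a number, so a variable holding "def" and one holding "0" look the same. A minimal sketch of a stricter conversion, assuming std::strtol is acceptable as the "something similar" (`toInt` is a hypothetical helper name, and range checking is left out):

```cpp
#include <cstdlib>
#include <string>

// Converts `text` to an int only if the whole string is a base-10 integer.
// Returns true on success and stores the result in `out`.
bool toInt(const std::string& text, int& out) {
    if (text.empty()) {
        return false;
    }
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0') {
        return false;  // no digits at all, or trailing non-numeric characters
    }
    out = static_cast<int>(value);  // overflow is not checked in this sketch
    return true;
}
```

A caller could try `toInt(var.value, n)` before doing math and fall back to treating the value as plain text when it returns false.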

        struct varType{
            int Type;
            string sValue;
            int iValue;
        };

    varType is used when we analyze parameters. The results are returned in either string or integer format. Sometimes the conversion will simply fill both of them out and let us, the programmers, select which one to use.

        class CelestialAnalyzer{
        private:
            vector<string>   m_TokenList;          // Will store all tokens we compare for
            vector<string>   m_DiscoveredTokens;   // Will store all found tokens
            vector<command>  m_CommandList;        // Will store parsed commands from token list
            vector<variable> m_VariableList;       // Will store variables created in the script

        public:
            CelestialAnalyzer();
            void ReadInTokens(ifstream *f);
            varType AnalyzeParameter(string, int);
            void Parse();
            int ParseCommand(int);
            int ExecuteCommands();
        };

    This is our main class that we instantiate and use. In the private area of the class, we create 4 vectors. The first one stores all the function tokens we look for, such as Print, Set, GetInput, etc. The second vector stores ALL tokens found in the text file. Anything that has whitespace inside it is broken apart unless certain rules apply, such as an opening quotation mark. When one is found, whitespace is no longer treated as a separator during the read until another quotation mark is found. Next we have a command vector that stores all the commands parsed by the parser. Last we have a variable list that stores all found variables. We use this to access and modify variables as well.
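The quotation rule for the discovered-token vector can be illustrated with a small stand-alone tokenizer. This is a simplified sketch, not the book's ReadInTokens(): `tokenize` and `tokenizeString` are hypothetical names, comments are not handled, and empty tokens between consecutive whitespace characters are skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits a character stream on whitespace, except inside double quotes,
// where whitespace is kept as part of the token.
std::vector<std::string> tokenize(std::istream& in) {
    std::vector<std::string> tokens;
    std::string token;
    bool inQuotes = false;
    char c;
    while (in.get(c)) {
        if (inQuotes) {
            if (c == '"') {              // closing quote ends the string token
                tokens.push_back(token);
                token.clear();
                inQuotes = false;
            } else {
                token.push_back(c);      // spaces included
            }
        } else if (c == '"') {           // opening quote starts a string token
            if (!token.empty()) {
                tokens.push_back(token);
                token.clear();
            }
            inQuotes = true;
        } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            if (!token.empty()) {        // whitespace ends the current token
                tokens.push_back(token);
                token.clear();
            }
        } else {
            token.push_back(c);
        }
    }
    if (!token.empty()) {
        tokens.push_back(token);         // last token before end of input
    }
    return tokens;
}

// Convenience wrapper for trying the tokenizer on a string.
std::vector<std::string> tokenizeString(const std::string& text) {
    std::istringstream in(text);
    return tokenize(in);
}
```

Feeding it the line `Print "Hello from " name "!!!" ;` yields five tokens, with the space inside "Hello from " preserved.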

    We then enter the public area, and the first thing on the list is the constructor. We push all our function tokens onto the TokenList inside the constructor. We then have ReadInTokens(). The function is obvious by the name, but notice it accepts an ifstream pointer as the parameter. This means you have to open the file outside of the parser and then pass the file pointer in. This function will call Parse() after breaking up the file into many tokens. During Parse() the function will loop through all tokens inside m_DiscoveredTokens and group them into commands and parameters. The basic syntax rule for this is that the command will always be the first token following a ';' token.

    Parse() will call ParseCommand() until a -1 is returned (-1 means the next token is the end or a '$'). ParseCommand will then call ExecuteCommands() after it finishes parsing all commands together. ExecuteCommands() will then call AnalyzeParameter() repeatedly for all parameters each function has. After the analysis is complete, the program will simply run it.
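The grouping rule (the first token after a ';' is the command, and everything up to the next ';' is a parameter) can be sketched as follows. This is a minimal sketch, not the book's Parse()/ParseCommand() code: `Command`, `groupCommands`, and the `OP_` names are hypothetical, and empty tokens are assumed to have been filtered out already.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operation ids, in the same order as the token table below.
enum Op { OP_PRINT, OP_END_LINE, OP_VAR, OP_GETINPUT, OP_SET };

struct Command {
    int operation = -1;                 // index into the token table, -1 if unknown
    std::vector<std::string> params;    // tokens between the operation and ';'
};

// Groups a flat token list into commands; stops at the "$" end-of-script token.
std::vector<Command> groupCommands(const std::vector<std::string>& tokens) {
    static const std::vector<std::string> table =
        {"Print", ";", "Var", "GetInput", "Set"};
    std::vector<Command> commands;
    Command current;
    bool haveOp = false;
    for (const std::string& tok : tokens) {
        if (tok == "$") {
            break;                                   // end of script
        }
        if (tok == ";") {                            // end of statement
            if (haveOp) {
                commands.push_back(current);
            }
            current = Command();
            haveOp = false;
        } else if (!haveOp) {                        // first token names the operation
            for (std::size_t i = 0; i < table.size(); ++i) {
                if (table[i] == tok) {
                    current.operation = static_cast<int>(i);
                }
            }
            haveOp = true;
        } else {
            current.params.push_back(tok);           // the rest are parameters
        }
    }
    return commands;
}
```

For the tokens of `Var name def ; Set name Celestialkey ; $ ;` this produces two commands, the first with operation OP_VAR and parameters "name" and "def".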

    Okay! Are you guys still with me at this point? If not, be sure to direct your questions to the forums [http://www.CelestialCoding.com]. Now that we have the main header for our script reader completed, let's go to the actual source code itself.


    The Script Reader Source

    This code will take up multiple pages, so I am going to break it up one function at a time. I will go in chronological order (first function called to last function called).

        CelestialAnalyzer::CelestialAnalyzer(){
            m_TokenList.push_back("Print"); // Print
            m_TokenList.push_back(";");
            m_TokenList.push_back("Var");
            m_TokenList.push_back("GetInput");
            m_TokenList.push_back("Set");
        }

    Here is our constructor, where we push everything onto the vector. Remember to push them on in the same order as your enumerated Function/Operations list.

        void CelestialAnalyzer::ReadInTokens(std::ifstream *f){
            char c;
            string token;
            bool readFile = true;
            do{
                f->get(c);
                switch(c){
                case '{':
                case '}':
                    token.clear();
                    token.push_back(c);
                    m_DiscoveredTokens.push_back(token);
                    token.clear();
                    break;
                case '/':{ // Comments
                    char buf[255];
                    f->get(c);
                    if(c == '/')
                        f->getline(buf, 255, '\n');
                    break;
                }
                case '\"':{ // "Used for entire strings"
                    token.clear();
                    bool readString = true;
                    while(readString){
                        f->get(c);
                        switch(c){
                        case '\"':
                        case '\n':
                        case '\r':
                        case '\x1A': // ^Z (ASCII 26)
                            if(c == '\"') // Only a closing quote completes the string
                                m_DiscoveredTokens.push_back(token);
                            token.clear();
                            readString = false;
                            break;
                        default:
                            token.push_back(c);
                            break;
                        }
                    }
                    break; // Do not fall through into the whitespace case
                }
                case '\n':
                case '\r':
                case ' ':
                case '\x1A': // ^Z (ASCII 26)
                case '\t':
                    m_DiscoveredTokens.push_back(token);
                    token.clear();
                    break;
                default:
                    token.push_back(c);
                    break;
                }
                if(f->eof())
                    readFile = false;
            } while(readFile);
            f->close();

            cout // [rest of listing missing in the source]

        int CelestialAnalyzer::ParseCommand(int x){
            command newCom;
            newCom.Operation = -1;
            while(m_DiscoveredTokens[x].compare(";") != 0){
                if(newCom.Operation < 0){
                    for(unsigned int i=0; i // [rest of listing missing in the source]

        int CelestialAnalyzer::ExecuteCommands(){
            char buf[25];
            for(unsigned int c = 0; c // [rest of listing missing in the source]

                        {
                        case '\\':{
                            if((z+1) < temp.size()) // Avoid buffer overflow
                            {
                                if(temp[z+1] == 'n')
                                {
                                    s.push_back('\n');
                                    z++;
                                } else if(temp[z+1] == 't')
                                {
                                    s.push_back('\t');
                                    z++;
                                }
                            }
                            break;
                        default:
                            s.push_back(temp[z]);
                        }
                        }
                    }
                }
                break;
            case _OPR:
                break;
            }
        }
        cout // [rest of listing missing in the source]

                break;
            }
        }
        cout // [rest of listing missing in the source]

                    break;
                case '+':
                case '=':
                    v.Type = _OPR;
                    v.sValue = p;
                    break; // Without this break, the type is overwritten by the default case
                default:
                    v.Type = _STRING; // Unknowns are simply returned as strings
                    v.sValue = p;
                }
            }
            break;
            case GETINPUT:{
                break;
            }
            default:
                break;
            }
            return v;
        }

    This should look pretty obvious. You check the first character of the string. If it is a number, then it is an int. Otherwise, it is an operator or a string. Placing it all together along with our test file, we can expect to get the output below.


    Of course! I forgot to give you the test code to run this! Sorry about that, I was excited over the next part of the book! The source code for the test bed is below.

        #include "CelestialAnalyzer.h"

        CelestialAnalyzer ca;

        int main(){
            ifstream scriptFile("script.txt", ios::in);
            if(!scriptFile){
                cout // [rest of listing missing in the source]


    Code Listing

    You will find a complete code listing here without any modifications. Downloading the project files will give you the exact same results. If you want a hard copy of the source code, head to http://www.CelestialCoding.com. Any questions can also be directed there.

    [Main.cpp]

        #include "CelestialAnalyzer.h"

        CelestialAnalyzer ca;

        int main(){
            ifstream scriptFile("script.txt", ios::in);
            if(!scriptFile){
                cout // [rest of listing missing in the source]


    [CelestialAnalyzer.h]

        #ifndef CELESTIALANALYZER_H_
        #define CELESTIALANALYZER_H_

        // Standard headers needed by the declarations and code that follow
        #include <cstdlib>
        #include <fstream>
        #include <iostream>
        #include <string>
        #include <vector>

        using namespace std;

        // Functions List
        enum{
            PRINT,
            ADD,
            VAR,
            GETINPUT,
            SET,
            SUBTRACT
        };

        // Variable types list
        enum{
            _STRING,    // String
            _INT,       // Integer
            _OPR        // Operation
        };

        struct command{
            int Operation;
            vector<string> ParameterList;
        };

        struct variable{
            string name;
            string value;
        };

        struct varType{
            int Type;
            string sValue;
            int iValue;
        };

        class CelestialAnalyzer{
        private:
            vector<string>   m_TokenList;          // Will store all tokens we compare for
            vector<string>   m_DiscoveredTokens;   // Will store all found tokens
            vector<command>  m_CommandList;        // Will store parsed commands from token list
            vector<variable> m_VariableList;       // Will store variables created in the script

        public:
            CelestialAnalyzer();
            void ReadInTokens(ifstream *f);
            varType AnalyzeParameter(string, int);
            void Parse();
            int ParseCommand(int);
            int ExecuteCommands();
        };

        #endif


    [CelestialAnalyzer.cpp]

    This file also contains the attempts that I made while creating the lexical analyzer portion of this file. I tried both C and C++ ways before finally deciding on C++ and working through how to do it using C++. I only left in the attempts in case anyone was curious about other methods for reading in a file and checking it. In the end though, it appeared that a switch statement worked best out of all the options. Note that some of the structures or variables included with the attempts might no longer exist inside the header file. They also might have been modified since the trial version was written. Use them only for curiosity, not for actual parsing or lexical analysis.

        #include "CelestialAnalyzer.h"

        CelestialAnalyzer::CelestialAnalyzer(){
            m_TokenList.push_back("Print"); // Print
            m_TokenList.push_back(";");
            m_TokenList.push_back("Var");
            m_TokenList.push_back("GetInput");
            m_TokenList.push_back("Set");
            CurrentVariableId = 0;
        }

        varType CelestialAnalyzer::AnalyzeParameter(string p, int Operation){
            varType v;
            switch(Operation){
            case VAR:
                {
                    switch(p[0]){
                    // Digits are compared as character literals, not as the numbers 0 to 9
                    case '1': case '2': case '3': case '4': case '5':
                    case '6': case '7': case '8': case '9': case '0':
                        v.Type = _INT;
                        v.iValue = atoi(p.c_str());
                        break;
                    case '+':
                    case '=':
                        v.Type = _OPR;
                        v.sValue = p;
                        break; // Without this break, the type is overwritten by the default case
                    default:
                        v.Type = _STRING; // Unknowns are simply returned as strings
                        v.sValue = p;
                    }
                }
                break;
            case PRINT:
                {
                    switch(p[0]){
                    case '1': case '2': case '3': case '4': case '5':
                    case '6': case '7': case '8': case '9': case '0':
                        v.Type = _INT;
                        v.iValue = atoi(p.c_str());
                        v.sValue = p;
                        break;
                    case '+':
                    case '=':
                        v.Type = _OPR;
                        v.sValue = p;
                        break;
                    default:
                        v.Type = _STRING; // Unknowns are simply returned as strings
                        v.sValue = p;
                    }
                }
                break;
            case GETINPUT:{
                string tmp;
                getline(cin, tmp);
                break;
            }
            default:
                break;
            }
            return v;
        }

        int CelestialAnalyzer::ExecuteCommands(){
            char buf[25];
            for(unsigned int c = 0; c // [rest of listing missing in the source]

                        break;
                    }
                }
                m_VariableList.push_back(var);
            }
            break;
            case PRINT:{
                string s;
                varType v;
                bool usedVar = false;
                for(unsigned int i=0; i // [rest of listing missing in the source]

                                s.push_back(temp[z]);
                            }
                        }
                    }
                }
                break;
            case _OPR:
                break;
            }
        }
        cout // [rest of listing missing in the source]

        {
            for(unsigned int i=0; i // [part of listing missing in the source] -1){
                test = ParseCommand(test); // Recursion... Somewhat
            }
            cout // [rest of listing missing in the source]

                        case '\"':
                        case '\n':
                        case '\r':
                        case '\x1A': // ^Z (ASCII 26)
                            if(c == '\"') // Only a closing quote completes the string
                                m_DiscoveredTokens.push_back(token);
                            token.clear();
                            readString = false;
                            break;
                        default:
                            token.push_back(c);
                            break;
                        }
                    }
                    break; // Do not fall through into the whitespace case
                }
                case '\n':
                case '\r':
                case ' ':
                case '\x1A': // ^Z (ASCII 26)
                case '\t':
                    m_DiscoveredTokens.push_back(token);
                    token.clear();
                    break;
                default:
                    token.push_back(c);
                    break;
                }
                if(f->eof())
                    readFile = false;
            } while(readFile);
            f->close();

            cout // [rest of listing missing in the source]

        }

        void CelestialAnalyzer::ReadInTokens(ifstream *f){
            vector<string> tokens;
            string token;
            char c;
            bool eof = false;   // End of file
            bool eos = false;   // End of statement
            bool eot = false;   // End of token
            int curVarId = 0;
            int curFuncId = -1;
            do{
                f->get(c); // Get a character
                if(c == ';')
                    eos = true;
                if(c == '\n' || c == ' ' || c == '\x1A' || c == '\r' || c == '\t' || f->eof()){
                    for(unsigned int i=0; i // [part of listing missing in the source]

                f->get(c); // Read in each character
                if(c == '\n' || c == ' ' || c == '\x1A' || c == '\r' || c == '\t' || f->eof()){ // 26 ASCII ^Z (end of file marker)
                    for(unsigned int i=0; i // [part of listing missing in the source]

                if(f->eof())
                    eof = true;
            } while (!eof);
            f->close();

            // for(unsigned int i=0; i