The VISL System
The VISL System: Research and applicative aspects of IT-based learning

Eckhard Bick
web: http://visl.sdu.dk

Abstract

The paper presents an integrated interactive user interface for teaching grammatical analysis through the Internet medium (Visual Interactive Syntax Learning), developed at Southern Denmark University, covering 14 different languages, half of which are supported by live grammatical analysis of running text. For reasons of robustness, efficiency and correctness, the system's internal tools are based on the Constraint Grammar formalism (Karlsson, 1990, 1995), but users are free to choose from a variety of notational filters, supporting different descriptive paradigms, with a current teaching focus on syntactic tree structures and the form-function dichotomy. The original kernel of programs was built around a multi-level parser for Portuguese (Bick, 1996, 2000) developed in a dissertation framework at Århus University and used as a point of departure for similar systems in other languages. Over the past 5 years, VISL has grown from a teaching initiative into a full-blown research and development project with a wide range of secondary projects, activities and language technology products. Examples of application-oriented research are NLP-based teaching games, machine translation and grammatical spell checking. The VISL group has repeatedly attracted outside funding for the development of grammar teaching tools, semantics-based Constraint Grammars and the construction of annotated corpora.

Online Proceedings of NODALIDA 2001

1. Background

When the VISL project started in 1996, its primary goal was to further the integration of IT tools and IT-based communication routines into the university language teaching milieu at Odense University (Denmark), and more specifically, to develop tools for Visual Interactive Syntax Learning. The initiative was funded jointly by CTU (Center for Teknologi-Støttet Uddannelse) and Odense University for 3 years, and the languages involved were English, German, French and Portuguese.

Already in the early stages of the project it became clear that a distinction would have to be made as to whether the language data to be used in the teaching interface would be limited to textbook examples or extend to unrestricted natural language text. We decided to develop both a "closed" and an "open" system, and to design the teaching applications for maximal synergy, such that they would be able to take input from both the closed and the open language data sources, and do so in a largely language-independent way.

For the closed system, a notational formalism was developed that allowed the textual expression of graphical syntactic tree structures, and databases of manually analysed sentences were built for all participating languages, the original target being 500 textbook sentences and 500 running text sentences. With the help of enthusiastic students and teachers, these "closed corpus" databases are constantly being enlarged, and today VISL covers 14 languages, among them the basic Romance and Germanic languages as well as a number of more exotic specimens, like Arabic, Japanese and Esperanto.

The open system is research-based and centered around the Constraint Grammar paradigm, introduced by Fred Karlsson at Helsinki University in the early 1990s (Karlsson, 1991, 1995). The 1996 role model for the syntactic VISL system was my Portuguese CG parsing system (Bick 1996, 2000), which featured a full dependency analysis of subclause structure and a prototype CG-to-tree syntax transformation grammar. I have since developed similar CG-based systems for Danish, Spanish and Esperanto. For English and German, VISL has corrected and extended licensed commercial CG systems from the Finnish software firm Lingsoft.

2. A unified approach to grammar

The central principle of VISL's language analysis is its focus on surface structure (expressed as either dependency relations or syntactic tree structures) and the form-function dichotomy. Following Bache et al. (1993, 1999), function symbols start with upper-case letters, form symbols with lower-case letters, and both are combined in a colon-separated symbol (text) or function-over-form symbol (graphics).
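
As a rough illustration of the colon-separated convention, the following Python sketch splits a textual node label into its function and form parts and checks the upper-case/lower-case rule. The names `NodeLabel` and `parse_label` and the exact pattern are assumptions made for this example, not part of the VISL software; the optional parenthesised sub-form is modelled on labels such as P:v-fin(v-pr) in the paper's tree examples.

```python
import re
from typing import NamedTuple, Optional


class NodeLabel(NamedTuple):
    """A function:form label, e.g. 'Cs:np' or 'P:v-fin(v-pr)' (hypothetical type)."""
    function: str            # starts with an upper-case letter, e.g. 'S', 'Od'
    form: str                # starts with a lower-case letter, e.g. 'np', 'v-fin'
    subform: Optional[str]   # optional refinement in parentheses, e.g. 'v-pr'


# Assumed pattern: FUNCTION ':' form, optionally followed by '(subform)'.
_LABEL = re.compile(
    r"^(?P<function>[A-Z][A-Za-z]*)"
    r":(?P<form>[a-z][a-z-]*)"
    r"(?:\((?P<subform>[a-z][a-z-]*)\))?$"
)


def parse_label(symbol: str) -> NodeLabel:
    """Split a colon-separated symbol; reject labels that break the case rule."""
    match = _LABEL.match(symbol)
    if match is None:
        raise ValueError(f"not a function:form symbol: {symbol!r}")
    return NodeLabel(match["function"], match["form"], match["subform"])


if __name__ == "__main__":
    for symbol in ("S:prop", "Cs:np", "P:v-fin(v-pr)"):
        print(symbol, "->", parse_label(symbol))
```

Keeping function and form as separate fields is what makes it cheap to display either one alone, or both, depending on the notation a learner has chosen.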
For the dependency notation, international CG conventions are followed, with upper-case letters for all primary tags, using the @ symbol to introduce function tags, and arrowheads (>, <) for head-oriented dependency markers.

VISL light vertical tree              VISL vertical tree
(non-graphical notation)              (non-graphical notation)

UTT:cl(fcl)                           STA:fcl
S:prop VISL                           S:prop VISL
P:v(v-pr) er                          P:v-fin(v-pr) er
Cs:g(np)                              Cs:np
=D:art et                             =DN:art et
=H:n forskningsprojekt                =H:n forskningsprojekt
=D:cl(fcl)                            =DN:fcl
==S:pron(pron-rel) der                ==S:pron-rel der
==P:v(v-pr) involverer                ==P:v-fin(v-pr) involverer
==Od:g(np)                            ==Od:np
===D:pron(pron-indef) mange           ===DN:pron-indef mange
===D:adj forskellige                  ===DN:adj forskellige
===H:n sprog                          ===H:n sprog

VISL [VISL] <heur> <*> PROP NOM @SUBJ>
er [være] <vk> V PR AKT @FMV
et [en] ART NEU S IDF @>N
forskningsprojekt [forskningsprojekt] N NEU S IDF NOM @<SC
,
der [der] <rel> INDP nG nN NOM @SUBJ>
involverer [involvere] <vt> V PR AKT &MV @FS-N<
mange [mange] <quant> DET nG P NOM @>N
forskellige [forskellig] ADJ nG P nD NOM @>N
sprog [sprog] N NEU P IDF NOM @<ACC
.

Meeting regularly over 4 years, the VISL group of university teachers has invested considerable effort in discussing the compatibilities, incompatibilities and blind spots of different national and linguistic grammar traditions, and agreed on a common superset of symbols. Recently, a reduced symbol set for propaedeutic use and schools, "VISL light", was agreed upon, and the Danish X-and-O system was adapted to match the function categories used in VISL light. At the lowest level, 11 word classes and 14 primary functions are used.

Predicator (P), Verbal (V)
Auxiliary (Vaux), also as <> (D)
Main verb (V*, Vm), also as head (H, K)
Verb chain particle (Vp), also as <> (D), simplified as adverbial (A)
Infinitive marker (Vi, INFM), also as <> (D)
Subject (S)
Formal or provisional subject (Sf), possibly with the subclass of situative subject (Ss)
Direct (accusative) object (Od)
Formal or provisional object (Of)
Indirect (dative) object (Oi)
Prepositional object (Op), Danish "emneled", possibly simplified as adverbial (A)
⊗ Subject complement (Cs), Subject predicative (Ps)
[⊗] Free subject predicative (fPs, fCs), simplified as adverbial (A)
⊕ Object complement (Co), Object predicative (Po)
[⊕] Free object predicative (fPo, fCo), simplified as adverbial (A)
Adverbial (A), with possible subdivision of free (fA) or bound (bA, bAs, bAo)
Head (H), Kernel (K)
<> Dependents (D)
Subordinator (SUB)
↔ Co-ordinator (CO)
# Conjunct (CJT)
«» Underspecified constituent at clause level (e.g. clause body)

3. Internet-based teaching tools

One lesson to be learned from the VISL project is that it is not at all easy to introduce IT-based tools into an existing teaching environment. Apart from hardware problems (there never being enough compatible and updated machines in the right room at the right time), there is the very central problem of psychological resistance against the new medium, simply because it may feel too "technical".
All things technical have a very low acceptance rate in the Humanities, and teachers often resent the personal investment in time and effort needed to acquire the necessary skills, not to mention changes in teaching material and exams. There is, of course, a fundamental difference in terms of "technicality" between a human teacher and a computer terminal: the latter lacks the teacher's naturalness, interactivity, flexibility and tutoring capacities. On the other hand, computers do have evident teaching advantages: they can integrate the senses, making use of colours, pictures and sounds in a more flexible and impressive manner than paper can. Also, a computer program can "know" more, in terms of facts and examples and within a well-defined subject matter, than a human teacher. And last, but not least, a computer system, especially if accessible through the internet, can teach an unlimited number of students at the same time in what optimally still amounts to an individual manner.

Given these advantages, it makes sense to invest some effort in addressing the four main disadvantages listed above. The VISL grammar teaching interface tries to make advances with regard to the following four principles:

(i) Flexibility

The VISL interface is notationally flexible, i.e. the user can choose between several notational conventions (e.g. flat dependency grammar, enriched text, meta-text notation, tree structures), and move back and forth between different levels of complexity. For instance, depending on the exercise chosen, the type and number of grammatical categories used (e.g. word classes) may be changed. In order to make work more colourful, it is also possible to move between textbook material, copied "live" texts, randomized test sentences and one's own creative idiolect.

In the tree structure example below, the user can switch back and forth between letter symbols and graphical symbols, more than double the number of categories, or reduce the tree to a pure function tree (green only).
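
To make the idea of a notational filter concrete, here is a minimal Python sketch, not the VISL implementation, that reads the non-graphical vertical tree notation from Section 2 and prints it either in full or reduced to function labels only. It assumes, judging from that single example, that the first line is the root, that clause-level constituents carry no prefix, and that each "=" adds one level of embedding; `Node`, `parse_vertical_tree` and `render` are names invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The "VISL vertical tree" example from Section 2 of the paper.
EXAMPLE = """\
STA:fcl
S:prop VISL
P:v-fin(v-pr) er
Cs:np
=DN:art et
=H:n forskningsprojekt
=DN:fcl
==S:pron-rel der
==P:v-fin(v-pr) involverer
==Od:np
===DN:pron-indef mange
===DN:adj forskellige
===H:n sprog
"""


@dataclass
class Node:
    function: str                    # e.g. 'S', 'Od', 'DN'
    form: str                        # e.g. 'np', 'v-fin(v-pr)'
    word: Optional[str] = None       # surface word on terminal nodes
    children: List["Node"] = field(default_factory=list)


def _parse_line(body: str) -> Node:
    label, _, word = body.partition(" ")
    function, _, form = label.partition(":")
    return Node(function, form, word.strip() or None)


def parse_vertical_tree(text: str) -> Node:
    """Build a tree; level = number of leading '=' plus one (assumed convention)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    root = _parse_line(lines[0])
    stack = [root]                   # stack[k] is the open node at level k
    for line in lines[1:]:
        body = line.lstrip("=")
        level = len(line) - len(body) + 1
        if level > len(stack):
            raise ValueError(f"line skips a level: {line!r}")
        del stack[level:]            # close nodes deeper than the new one
        node = _parse_line(body)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def render(node: Node, show_form: bool = True, level: int = 0) -> List[str]:
    """Serialize the tree; show_form=False yields a pure function tree."""
    label = f"{node.function}:{node.form}" if show_form else node.function
    line = "=" * max(level - 1, 0) + label
    if node.word:
        line += f" {node.word}"
    lines = [line]
    for child in node.children:
        lines.extend(render(child, show_form, level + 1))
    return lines


if __name__ == "__main__":
    tree = parse_vertical_tree(EXAMPLE)
    print("\n".join(render(tree, show_form=False)))
```

Because the filter is applied only when the tree is rendered, the same stored analysis can serve every notation without being re-parsed.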

[Figure: two panels, "Letter symbols" and "Graphical symbols"]

VISL's unique integration of teaching and research tools would even allow the user to experiment with different kinds of subjects or add a couple of place and time adverbials and rerun the sentence in free-text mode, with exactly the same graphical setup and pedagogical functionality.

(ii) Interactivity

VISL's Java tree interface for grammatical analysis allows the step-by-step interactive inspection, construction and labelling of syntactic trees using menus, mouse clicks and drag-and-drop movements, all known from basic text processor functionality.

In the first example below, a student has recognized the np "min hest", but has yet to assemble "lyst" onto the predicator ("har") of the adverbial subclause to the left.

When a sentence proves problematic or incomprehensible, the user can modify it, or ask for the computer's opinion (show-me option). In grammar games like Paintbox, Post Office or Shoot-the-Verb, interactivity integrates a certain element of competition, and is further enhanced by sound effects, timers and high scores.

(iii) Naturalness

A major drawback of most language teaching software (or, for that matter, language analysis software) is that it does not run on free, natural language, but only on a small set of predefined sentences or structures ("toy lexica" or "toy grammars"),