Key word analysis

Dr Li Wenzhong
Faculty of International Studies, Henan Normal University
July, 2006

What is a key word?

- Basic assumptions
  - The real world is represented by our knowledge of it
  - Our knowledge of the world is represented in different semantic fields
  - Semantic fields are represented in a variety of lexical fields
  - Lexical fields are realized in groups of words
  - Specific topics motivate the use of specific words
  - Each text with a specific topic has a group of unique words that are not usually found in texts on other topics
  - According to statistics, about 25–30% of the words in any given text are unique topical words (key words)

Measuring key words

- Rationale
  - Key words occur unusually frequently in a given text
  - When two texts or corpora are compared, the words with unusually high frequency are regarded as key words
  - Assuming each text contains a group of unique words, texts on similar topics must share, to a great extent, the key words they use

Key key words

- If we have, say, 300 texts on the same or similar topic(s), we can compute both the key words for each text and the key words these texts share, and thus obtain key key words
- The keyness of a key key word can be measured by counting the number of texts in which it occurs as a key word

Research questions

- What relationships are displayed between the topic and the words in a text?
- What words and associates do the learners use when writing a topical essay?
- What is the internal structure of the key words for a topic?

Procedures

- Observed text vs. reference corpus
  - The observed corpus (OC) must be smaller than the reference corpus (RC)
  - Wordlists must be computed for each text (corpus) separately
- Observed corpus vs. reference corpus
  - An observed corpus consists of a collection of texts on the same or similar topic(s)

Steps

- Clean text processing
- Topic grouping
- Split the text
- Make a word list and a wordlist database
- Make a keyword list database
- Retrieve keywords and their associates
- Clump and match clumps of associates

A flowchart

[Flowchart. Node labels: observed corpus, reference corpus, same-topic essays, splitting, wordlist, wordlist database, wordlist comparison, keyword database, keyword query, key keyword query, associate query]

Results

No.  Key word   Freq. (OC)  % (OC)  Freq. (RC)  % (RC)  χ² (keyness)  p value (threshold 0.000001)
1    JOB        10          6.58    519         0.01    4,124.4       0.000000
2    SKILLS     4           2.63    254                 1,146.1       0.000000
3    PERSON     3           1.97    522         0.01    284.1         0.000000
4    COMPANY    3           1.97    527         0.01    281.4         0.000000
5    COULD      5           3.29    2,900       0.08    160.5         0.000000
6    I          6           3.95    4,589       0.12    149.2         0.000000
7    HIGHER     3           1.97    1,549       0.04    92.9          0.000000
8    ACHIEVED   2           1.32    589         0.02    89.5          0.000000
9    MY         2           1.32    860         0.02    60.4          0.000000
10   DO         3           1.97    2,560       0.07    54.4          0.000000
11   ALWAYS     2           1.32    977         0.03    52.9          0.000000
12   POSITION   2           1.32    1,225       0.03    41.6          0.000000
13   DEVELOPED  2           1.32    1,484       0.04    33.8          0.000000
14   THEY       4           2.63    7,459       0.20    33.2          0.000000

My view on Job-hopping

- Key words (1), agents: we, I
- Key words (2), action words: find, change, like, want, view, think, challenge, choose, select, enjoy, engage, devote
- Key words (3), descriptive words: new, better, good, changing, interesting, different, constant, stable, fixed, boring
- Key words (4), associates: ability, skills, fields, position, salary, chance, experience
- Key words (5), rank words: job, jobs, work

Figure 2: Key word network for "My view on job-hopping"

The Problems of Water Shortage

- Key words (1), agents: factories, industries, population, cities, human beings, development, government
- Key words (2), action words: protect, prevent, save, control, solve, use, drink, measures, reuse
- Key words (3), descriptive words: fresh, global, serious, clean, dirty, limited
- Key words (4), associates: rain, rivers, wells, sea, resources, lakes, underground, ice
- Key words (5), rank words: shortage, pollution, problem, crisis, consumption

DIY

- Task 1: Clean the text, mark the end of each text, and find a reference corpus wordlist
- Task 2: Split the texts
- Task 3: Make a wordlist for the observed text(s)
- Task 4: Make a keyword list
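
Tasks 3 and 4, together with the key key word count described earlier, can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The slides do not name the software or the exact statistic used, so the sketch takes the χ² column of the Results table at face value and applies a plain Pearson chi-square without continuity correction. The function names (`wordlist`, `chi_square`, `keywords`, `key_key_words`), the `min_freq` cut-off, and the tokenization rule are all hypothetical. The default threshold of 23.93 is the approximate chi-square critical value for one degree of freedom at p = 0.000001. Scores computed this way should not be expected to reproduce the table exactly.

```python
import re
from collections import Counter


def wordlist(text):
    """Task 3: frequency list of lower-cased word tokens (simple tokenizer, an assumption)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def chi_square(a, b, c, d):
    """Pearson chi-square for a 2x2 table, no continuity correction.

    a = frequency of the word in the observed corpus (OC)
    b = frequency of the word in the reference corpus (RC)
    c = all other tokens in the OC, d = all other tokens in the RC
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def keywords(oc, rc, min_freq=2, threshold=23.93):
    """Task 4: words that are unusually frequent in the OC relative to the RC.

    Assumes both wordlists are non-empty. Returns tuples of
    (word, OC frequency, RC frequency, chi-square), sorted by descending score.
    """
    oc_total, rc_total = sum(oc.values()), sum(rc.values())
    result = []
    for word, a in oc.items():
        b = rc.get(word, 0)
        # Keep only positive key words: relatively more frequent in the OC.
        if a < min_freq or a / oc_total <= b / rc_total:
            continue
        score = chi_square(a, b, oc_total - a, rc_total - b)
        if score >= threshold:
            result.append((word, a, b, round(score, 1)))
    return sorted(result, key=lambda row: row[3], reverse=True)


def key_key_words(keyword_lists):
    """Count, for each key word, the number of texts in which it is key."""
    counts = Counter()
    for kws in keyword_lists:
        counts.update({row[0] for row in kws})
    return counts.most_common()


if __name__ == "__main__":
    # Toy reference wordlist; a real RC would hold millions of tokens.
    rc = Counter({"the": 5000, "of": 4990, "job": 5, "water": 5})
    texts = ["the job of the job the job", "water water the water", "job job job the"]
    per_text = [keywords(wordlist(t), rc) for t in texts]
    print(per_text[0])
    print(key_key_words(per_text))
```

Computing one keyword list per text, as the Procedures require, is what makes the key key word count meaningful: a word counts once for every text in which it passes the threshold, regardless of how often it occurs inside that text.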