
Speech and Language Processing: An Introduction to Natural Language Processing, Part 2


Document type: PDF
Pages: 535

Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition. Daniel Jurafsky and James H. Martin. Copyright © 2007, All rights reserved. Draft of October 22, 2007. Do not cite without permission.

14 STATISTICAL PARSING

Two roads diverged in a wood, and I -
I took the one less traveled by...
Robert Frost, The Road Not Taken

The characters in Damon Runyon's short stories are willing to bet "on any proposition whatever", as Runyon says about Sky Masterson in The Idyll of Miss Sarah Brown, from the probability of getting aces back-to-back to the odds against a man being able to throw a peanut from second base to home plate. There is a moral here for language processing: with

enough knowledge we can figure the probability of just about anything. The previous chapters introduced detailed models of syntactic structure and its parsing. In this chapter we show that it is possible to build probabilistic models of syntactic knowledge and use some of this probabilistic knowledge in efficient probabilistic parsers.

One crucial use of probabilistic parsing is to solve the problem of disambiguation. Recall from Ch. 13 that sentences on average tend to be very syntactically ambiguous, due to problems like coordination ambiguity and attachment ambiguity. The CKY and Earley parsing algorithms could represent these ambiguities in an efficient way, but were not equipped to resolve them. A probabilistic parser offers a

solution to the problem: compute the probability of each interpretation and choose the most probable one. Thus, due to the prevalence of ambiguity, most modern parsers used for natural language understanding tasks (thematic role labeling, summarization, question answering, machine translation) are of necessity probabilistic.

Another important use of probabilistic grammars and parsers is in language modeling for speech recognition. We saw that N-gram grammars are used in speech recognizers to predict upcoming words, helping constrain the acoustic model search for words. Probabilistic versions of more sophisticated grammars can provide additional predictive power to a speech recognizer. Of course humans have to deal with the same pro

blems of ambiguity, and psychological experiments suggest that people use something like these probabilistic grammars in human language-processing tasks (e.g., human reading or speech understanding).

The most commonly used probabilistic grammar is the probabilistic context-free grammar (PCFG), a probabilistic augmentation of context-free grammars in which each rule is associated with a probability. We introduce PCFGs in the next section, showing how they can be trained on a hand-labeled Treebank grammar, and how they can be parsed. We present the most basic parsing algorithm for PCFGs, which is the probabilistic version of the CKY algorithm that we saw in Ch. 13. We

then show several ways to improve on this basic probability model. One method of improving a trained Treebank grammar is to change the names of the non-terminals. By making the non-terminals sometimes more specific and sometimes more general, we can come up with a grammar with a better probability model that leads to improved parsing scores. Another augmentation of the PCFG works by adding more sophisticated conditioning factors, extending PCFGs to handle probabilistic subcategorization information and probabilistic lexical dependencies.

Finally, we describe the standard PARSEVAL metrics for evaluating parsers, and discuss some psychological results on human parsing.

14.1 Probabilistic Context-Free Grammars

The simplest probabilistic augmentation of the context-free grammar is the Probabilistic Context-Free Grammar (PCFG), also known as the Stochastic Context-Free Grammar (SCFG), first proposed by Booth (1969). Recall that a context-free grammar G is defined by four parameters (N, Σ, R, S); a probabilistic context-free grammar augments each rule in R with a conditional probability. A PCFG is thus defined by the following components:

N  a set of non-terminal symbols (or variables)
Σ  a set of terminal symbols (disjoint from N)
R  a set of rules or productions, each of the form A → β [p], where A is a non-terminal, β is a string of symbols from the infinite set of strings (Σ ∪ N)*, and p is a number between 0 and 1 expressing P(β|A)
S  a designated start symbol
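As a purely illustrative sketch, the four components might be represented like this in Python, with the sum-to-1 constraint on each non-terminal's expansions checked explicitly (the S and NP rule probabilities follow Fig. 14.1; the terminal set is a tiny invented sample):

```python
# Illustrative representation of a PCFG's four components (N, Σ, R, S).
rules = {                                   # R: {A: [(beta, P(beta|A)), ...]}
    "S":  [(("NP", "VP"), 0.80), (("Aux", "NP", "VP"), 0.15), (("VP",), 0.05)],
    "NP": [(("Pronoun",), 0.35), (("Proper-Noun",), 0.30),
           (("Det", "Nominal"), 0.20), (("Nominal",), 0.15)],
}
nonterminals = set(rules)                   # N
terminals = {"does", "can"}                 # Σ (tiny illustrative sample)
start_symbol = "S"                          # S

# Constraint: for each non-terminal A, the probabilities P(beta | A)
# of all its expansions must sum to 1.
for lhs, expansions in rules.items():
    total = sum(p for _, p in expansions)
    assert abs(total - 1.0) < 1e-9, f"expansions of {lhs} sum to {total}"
```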

Here p expresses the probability that the given non-terminal A will be expanded to the sequence β. That is, p is the conditional probability of a given expansion β given the left-hand-side (LHS) non-terminal A. We can represent this probability as

P(A → β)

or as

P(A → β | A)

or as

P(RHS | LHS)

S → NP VP                [.80]
S → Aux NP VP            [.15]
S → VP                   [.05]
NP → Pronoun             [.35]
NP → Proper-Noun         [.30]
NP → Det Nominal         [.20]
NP → Nominal             [.15]
Nominal → Noun           [.75]
Nominal → Nominal Noun   [.20]
Nominal → Nominal PP     [.05]
VP → Verb                [.35]
VP → Verb NP             [.20]
VP → Verb NP PP          [.10]
VP → Verb PP             [.15]
VP → Verb NP NP          [.05]
VP → VP PP               [.15]
PP → Preposition NP      [1.0]

Verb → book [.30] | include [.30] | prefer [.40]
Pronoun → I [.40] | she [.05] | me [.15] | you [.40]
Proper-Noun → Houston [.60] | TWA [.40]
Aux → does [.60] | can [.40]
Preposition → from [.30] | to [.30] | on [.20] | near [.15] | through [.05]

Figure 14.1  A PCFG which is a probabilistic augmentation of the L1 miniature English CFG grammar and lexicon of Fig. ?? in Ch. 13. These probabilities were made up for pedagogical purposes and are not based on a corpus (since any real corpus would have many more rules, and so the true probabilities of each rule would be much smaller).
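Under the PCFG's independence assumptions, the probability of a complete parse tree is the product of the probabilities of all the rules used in its derivation. A small worked sketch: the imperative derivation of "book Houston" below is invented for illustration, but its rule probabilities follow the figure:

```python
# Probability of a parse tree = product of the probabilities of all rules used.
# Rule probabilities follow Fig. 14.1; the tiny parse below is illustrative.
rule_probs = {
    ("S", ("VP",)): 0.05,
    ("VP", ("Verb", "NP")): 0.20,
    ("NP", ("Proper-Noun",)): 0.30,
    ("Verb", ("book",)): 0.30,
    ("Proper-Noun", ("Houston",)): 0.60,
}

def tree_probability(rules_used):
    """Multiply the probabilities of the rules in one derivation."""
    p = 1.0
    for rule in rules_used:
        p *= rule_probs[rule]
    return p

# Derivation of "book Houston":
#   S -> VP -> Verb NP -> book Proper-Noun -> book Houston
derivation = [
    ("S", ("VP",)),
    ("VP", ("Verb", "NP")),
    ("Verb", ("book",)),
    ("NP", ("Proper-Noun",)),
    ("Proper-Noun", ("Houston",)),
]
p = tree_probability(derivation)   # .05 * .20 * .30 * .30 * .60
```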

Thus, if we consider all the possible expansions of a non-terminal, the sum of their probabilities must be 1. Fig. 14.1 shows a PCFG: a probabilistic augmentation of the L1 miniature English CFG grammar and lexicon. Note that the probabilities of all of the expansions of each non-terminal sum to 1. Also note that these probabilities were made up for pedagogical purposes; in any real grammar there are a great many more rules for each non-terminal, and hence the probabilities of any particular rule would tend to be much smaller.

A PCFG is said to be consistent if the sum of the probabilities of all sentences in the language equals 1. Certain kinds of recursive rules cause a grammar to be inconsistent by causing infinitely looping derivations for some sentences.

See Booth and Thompson (1973) for more details on consistent and inconsistent grammars.
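To see inconsistency concretely, consider a hypothetical two-rule grammar S → S S [p] | a [1−p] (an invented example, not one from the text). The total probability q of all finite derivations satisfies q = p·q² + (1−p), and iterating this map from 0 converges to that mass: for p ≤ 1/2 it reaches 1 (consistent), while for p > 1/2 it reaches (1−p)/p < 1, with the missing mass lost to derivations that never terminate:

```python
def finite_mass(p, iterations=20000):
    """Total probability of all finite derivations from S under the
    hypothetical grammar  S -> S S [p] | a [1-p],
    found by iterating the fixed-point equation q = p*q^2 + (1-p)."""
    q = 0.0
    for _ in range(iterations):
        q = p * q * q + (1.0 - p)
    return q

# p = 0.4: consistent -- essentially all probability mass is on finite trees.
# p = 0.6: inconsistent -- the mass converges to (1-p)/p = 2/3, not 1.
```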

