

16 Phrase breaks

There are two methods for predicting phrase breaks in Festival, one simple and one sophisticated. The method is selected through the parameter Phrase_Method, and the phrasing itself is carried out by the module Phrasify.

The first method is by CART tree. If parameter Phrase_Method is cart_tree, the CART tree in the variable phrase_cart_tree is applied to each word to see if a break should be inserted or not. The tree should predict categories BB (for big break), B (for break) or NB (for no break). A simple example of a tree to predict phrase breaks is given in the file `lib/phrase.scm'.

(set! simple_phrase_cart_tree
'
((Token.punc in ("?" "." ":"))
  ((BB))
  ((Token.punc in ("'" "\"" "," ";"))
   ((B))
   ((n.name is 0)
    ((BB))
    ((NB))))))
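
To use a tree like this one, both the parameter and the tree variable need to be set. The following is a minimal sketch, assuming the tree above has already been loaded under the name simple_phrase_cart_tree. It uses only the names given in this section together with Festival's Parameter.set.

```scheme
;; Select CART-tree phrasing (the parameter value named in this section).
(Parameter.set 'Phrase_Method 'cart_tree)
;; Phrasify consults phrase_cart_tree, so point it at the example tree.
(set! phrase_cart_tree simple_phrase_cart_tree)
```

With these settings, the example tree predicts BB after a word followed by "?", "." or ":", B after a word followed by a quote, comma or semicolon, BB on the last word (where n.name is 0), and NB everywhere else.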

The second and more elaborate method of phrase break prediction is used when the parameter Phrase_Method is prob_models. In this case a probabilistic model is used, in which the probability of a break after a word is based on the part of speech of the neighbouring words and the previous word. This is combined with an ngram model of the distribution of breaks and non-breaks, using a Viterbi decoder to find the optimal phrasing of the utterance. The results using this technique are good, and hold up on unseen data from other researchers' phrase break tests (see black97b). However, the phrasing does sometimes sound wrong, suggesting that further work is still required.
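
One way to read the description above, offered as a sketch rather than as the exact scoring Festival performs: let c_i be the part of speech of word i and j_i (either B or NB) the decision made after word i. The Viterbi decoder then searches for the break sequence that maximises the product of the per-word break probabilities and the ngram probabilities of the break sequence. The symbols and the ngram order N are notation introduced here for illustration, and any scaling or normalisation of the two terms is omitted.

```latex
\hat{\jmath}_{1..n} \;=\; \operatorname*{arg\,max}_{j_1,\dots,j_n}
  \prod_{i=1}^{n}
  P(j_i \mid c_{i-1}, c_i, c_{i+1}) \,
  P(j_i \mid j_{i-N+1}, \dots, j_{i-1})
```

The first factor corresponds to the part-of-speech model and the second to the ngram over breaks and non-breaks.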

The following variables are used for parameters. They are all set in the file `lib/phrase.scm'.

break_pos_ngram_name
The name of a loaded ngram that gives probability distributions of B/NB given previous, current and next part of speech.
break_ngram_name
The name of a loaded ngram of B/NB distributions. This is typically a 6 or 7-gram.
pos_p_start_tag
As with the part of speech tagger, this is the most likely part of speech tag immediately before the start of an utterance. This is typically a sentence-final punctuation mark.
pos_pp_start_tag
As with the part of speech tagger, this is the most likely part of speech tag two positions before the start of an utterance (that is, the tag before pos_p_start_tag). This is typically a noun of some sort.
pos_n_start_tag
This is the most likely part of speech tag after the end of the utterance. I have usually made this a determiner. However, if the utterance does not end in a punctuation mark, this should probably be a punctuation mark. Currently the system does not deal with this case.
phrase_type_tree
If set, this should be a CART tree to predict two levels of break (BB or B) from the B and NB values predicted. This is currently crude and is based on sentence-final punctuation only. This will be improved.
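
The variables above might be set along the following lines. This is only a sketch. The ngram names are hypothetical placeholders for models that have already been loaded. The tag values are illustrative choices matching the descriptions above, not values taken from `lib/phrase.scm'. Whether the names should be given as symbols or strings is also an assumption here.

```scheme
;; Select the probabilistic method.
(Parameter.set 'Phrase_Method 'prob_models)
;; Hypothetical names: both ngrams must already be loaded under these names.
(set! break_pos_ngram_name 'my_break_pos_ngram)
(set! break_ngram_name 'my_break_ngram)
;; Illustrative tag values; substitute the tags of the tagset in use.
(set! pos_p_start_tag "punc")   ; sentence-final punctuation
(set! pos_pp_start_tag "nn")    ; a noun
(set! pos_n_start_tag "dt")     ; a determiner
```

phrase_type_tree is left unset in this sketch, so only the B and NB decisions from the decoder would apply.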

