António Manuel Horta Branco
João Silva

R – NLX-Natural Language Group, Universidade Nova de Lisboa


In this talk, we report on a series of experiments on the tagging of gender and number inflection, designed to test the rationale that inflection forms a linguistic system separate from the part-of-speech (POS) system. These experiments indicate that the best-performing statistical approach to inflection tagging is a standalone HMM inflection tagger whose training set pairs tokens, each formed by concatenating a word form with its correct POS tag, with tags carrying the values of the inflection features. If POS and inflection tagging are to be performed jointly, then the best solution is a single tagger that assigns tags carrying both POS and inflection information. We also discuss why, given the current state of the art and the empirical data collected so far in our study, we nevertheless believe that the most accurate approach to nominal featurization will be a symbolic one.


Date: 2005-Nov-18     Time: 15:30:00     Room: 336

