An Analysis of Sentence Segmentation Features for
Broadcast News, Broadcast Conversations, and Meetings
Sebastien Cuendet [email protected]
Elizabeth Shriberg [email protected]
Benoit Favre [email protected]
James Fung [email protected]
Dilek Hakkani-Tur [email protected]
- ICSI
- SRI International
333 Ravenswood Ave
Menlo Park, CA 94025, USA
1947 Center Street
Berkeley, CA 94704, USA
ABSTRACT General Terms Keywords
1. INTRODUCTION
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 200X ACM X-XXXXX-XX-X/XX/XX ... 5.00.
2. METHOD 2.1 Data and annotations
2.2 Automatic speech recognition
- 2.3 Features
- 2.5 Metrics
Lexical features.
2.6 Chance performance computation
Prosodic Features.
3. RESULTS AND DISCUSSION 3.1 Performance by feature group
2.4 Boosting Classifiers
TDT4 ref TDT4 stt MRDA ref MRDA stt BC stt
80 60 40 20
0
- Chance
- Dur
- Energy Pitch
- Turn
- Pause AllPros
- Lex Lex+Pau ALL
Features
3.2 Performance by feature subgroup
0.40 0.35 0.30 0.25 0.20
TDT4 stt MRDA stt BC stt
PITCH
SUBGROUPS
ENERGY SUBGROUPS
- Range
- Reset
- Slope
- Range
- Reset
- Slope
5. ACKNOWLEDGMENTS
4. SUMMARY AND CONCLUSIONS
6. REFERENCES