Continuous speech recognition using syllables

Jones, Rhys; Mason, John; Downey, Simon

Conference Paper/Proceeding/Abstract

Rhys Jones, John Mason, Simon Downey

Eurospeech '97, Pages: 1171 - 1174

Swansea University Authors: Rhys Jones, John Mason

Full text not available from this repository.

Abstract

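The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based Hidden Markov Models (HMMs) are built. Refinements including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies are found to improve recogniser performance. It is found that the syllable-based recogniser gives recognition accuracies of over 60%, which compares with 35% as the baseline accuracy for monophone recognition. It is envisaged that practical applications of syllable recognition could be in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.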

Published in:	Eurospeech '97
ISSN:	1018-4074
Published:	Grenoble, France European Speech Communication Association: ESCA 1997
URI:	https://cronfa.swan.ac.uk/Record/cronfa63336

first_indexed	2023-05-02T17:07:43Z
last_indexed	2024-11-15T18:01:24Z
id	cronfa63336
recordtype	SURis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2023-06-13T13:22:48.4681384</datestamp><bib-version>v2</bib-version><id>63336</id><entry>2023-05-02</entry><title>Continuous speech recognition using syllables</title><swanseaauthors><author><sid>896a6aacfd217fb099481697a43bfe80</sid><ORCID>0000-0003-3928-4701</ORCID><firstname>Rhys</firstname><surname>Jones</surname><name>Rhys Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>284b34c63a5cbc71055047daf2ee1392</sid><firstname>John</firstname><surname>Mason</surname><name>John Mason</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-05-02</date><deptcode>CACS</deptcode><abstract>The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based Hidden Markov Models (HMMs) are built. Refinements including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies are found to improve recogniser performance. It is found that the syllable-based recogniser gives recognition accuracies of over 60%, which compares with 35% as the baseline accuracy for monophone recognition. 
It is envisaged that practical applications of syllable recognition could be in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Eurospeech '97</journal><volume/><journalNumber/><paginationStart>1171</paginationStart><paginationEnd>1174</paginationEnd><publisher>European Speech Communication Association: ESCA</publisher><placeOfPublication>Grenoble, France</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint>1018-4074</issnPrint><issnElectronic/><keywords/><publishedDay>25</publishedDay><publishedMonth>9</publishedMonth><publishedYear>1997</publishedYear><publishedDate>1997-09-25</publishedDate><doi/><url/><notes/><college>COLLEGE NANME</college><department>Culture and Communications School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>CACS</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders/><projectreference/><lastEdited>2023-06-13T13:22:48.4681384</lastEdited><Created>2023-05-02T17:59:35.9756576</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">School of Culture and Communication - Media, Communications, Journalism and PR</level></path><authors><author><firstname>Rhys</firstname><surname>Jones</surname><orcid>0000-0003-3928-4701</orcid><order>1</order></author><author><firstname>John</firstname><surname>Mason</surname><order>2</order></author><author><firstname>Simon</firstname><surname>Downey</surname><order>3</order></author></authors><documents/><OutputDurs/></rfc1807>
title	Continuous speech recognition using syllables
author_id_str_mv	896a6aacfd217fb099481697a43bfe80 284b34c63a5cbc71055047daf2ee1392
author_id_fullname_str_mv	896a6aacfd217fb099481697a43bfe80_*_Rhys Jones 284b34c63a5cbc71055047daf2ee1392_*_John Mason
author	Rhys Jones John Mason
author2	Rhys Jones John Mason Simon Downey
format	Conference Paper/Proceeding/Abstract
container_title	Eurospeech '97
container_start_page	1171
publishDate	1997
institution	Swansea University
issn	1018-4074
publisher	European Speech Communication Association: ESCA
college_str	Faculty of Humanities and Social Sciences
hierarchytype
hierarchy_top_id	facultyofhumanitiesandsocialsciences
hierarchy_top_title	Faculty of Humanities and Social Sciences
hierarchy_parent_id	facultyofhumanitiesandsocialsciences
hierarchy_parent_title	Faculty of Humanities and Social Sciences
department_str	School of Culture and Communication - Media, Communications, Journalism and PR{{{_:::_}}}Faculty of Humanities and Social Sciences{{{_:::_}}}School of Culture and Communication - Media, Communications, Journalism and PR
document_store_str	0
active_str	0
published_date	1997-09-25T08:21:32Z
_version_	1827825351034667008
score	11.055822
