Conference Paper/Proceeding/Abstract

Continuous speech recognition using syllables

Rhys Jones, John Mason, Simon Downey

Eurospeech '97, Pages: 1171 - 1174

Swansea University Authors: Rhys Jones, John Mason

Full text not available from this repository: check for access using links below.

Abstract

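The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based Hidden Markov Models (HMMs) are built. Refinements including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies are found to improve recogniser performance. It is found that the syllable-based recogniser gives recognition accuracies of over 60%, which compares with 35% as the baseline accuracy for monophone recognition. It is envisaged that practical applications of syllable recognition could be in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.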

Published in: Eurospeech '97
ISSN: 1018-4074
Published: Grenoble, France: European Speech Communication Association (ESCA), 1997
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa63336
first_indexed 2023-05-02T17:07:43Z
last_indexed 2023-05-02T17:07:43Z
id cronfa63336
recordtype SURis
fullrecord <?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>63336</id><entry>2023-05-02</entry><title>Continuous speech recognition using syllables</title><swanseaauthors><author><sid>896a6aacfd217fb099481697a43bfe80</sid><ORCID>0000-0003-3928-4701</ORCID><firstname>Rhys</firstname><surname>Jones</surname><name>Rhys Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>284b34c63a5cbc71055047daf2ee1392</sid><firstname>John</firstname><surname>Mason</surname><name>John Mason</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-05-02</date><deptcode>AMED</deptcode><abstract>The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based Hidden Markov Models (HMMs) are built. Refinements including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies are found to improve recogniser performance. It is found that the syllable-based recogniser gives recognition accuracies of over 60%, which compares with 35% as the baseline accuracy for monophone recognition. 
It is envisaged that practical applications of syllable recognition could be in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>Eurospeech '97</journal><volume/><journalNumber/><paginationStart>1171</paginationStart><paginationEnd>1174</paginationEnd><publisher>European Speech Communication Association: ESCA</publisher><placeOfPublication>Grenoble, France</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint>1018-4074</issnPrint><issnElectronic/><keywords/><publishedDay>25</publishedDay><publishedMonth>9</publishedMonth><publishedYear>1997</publishedYear><publishedDate>1997-09-25</publishedDate><doi/><url/><notes/><college>COLLEGE NANME</college><department>Media</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>AMED</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders/><projectreference/><lastEdited>2023-06-13T13:22:48.4681384</lastEdited><Created>2023-05-02T17:59:35.9756576</Created><path><level id="1">Faculty of Humanities and Social Sciences</level><level id="2">School of Culture and Communication - Media, Communications, Journalism and PR</level></path><authors><author><firstname>Rhys</firstname><surname>Jones</surname><orcid>0000-0003-3928-4701</orcid><order>1</order></author><author><firstname>John</firstname><surname>Mason</surname><order>2</order></author><author><firstname>Simon</firstname><surname>Downey</surname><order>3</order></author></authors><documents/><OutputDurs/></rfc1807>