Type Forum
License CC BY 4.0

How accurately can the Google Web Speech API recognize and transcribe Japanese L2 English learners’ oral production?

Abstract

The ultimate aim of our research project was to use the Google Web Speech API to automate scoring of elicited imitation (EI) tests. However, in order to achieve this goal, we had to take a number of preparatory steps. We needed to assess how accurate this speech recognition tool is in recognizing native speakers’ production of the test items; we had to assess its accuracy with our Japanese EFL learners; and, on the basis of these trials, we needed to evaluate the potential for using the API for our purposes. By comparing our own assessments of the learners’ pronunciation with the system’s ability to transcribe their utterances, we were able to ascertain that the learners’ pronunciation of certain sounds is probably the single biggest reason for a fall in recognition accuracy compared to native speaker input. However, we argue that pronunciation may not be an insurmountable barrier to using this speech recognition system for our EFL purposes. By going through this double screening process, we feel we have arrived at a set of items which can be used to assess students’ grammatical ability in an EI test using a custom Google Web Speech system.

Citation

Ashwell, T., & Elam, J. R. (2017). How accurately can the Google Web Speech API recognize and transcribe Japanese L2 English learners’ oral production? The JALT CALL Journal, 13(1), 59-76. https://doi.org/10.29140/jaltcall.v13n1.212

DOI

https://doi.org/10.29140/jaltcall.v13n1.212