TOEFL iBT training in the CALL classroom

Ritsumeikan University
richinwit@hotmail.com

This paper describes a CALL course for EFL students at a Japanese university in which Moodle quiz activities were used in conjunction with NanoGong (a sound-recording plugin) to simulate TOEFL iBT-style speaking and writing exercises. In addition to describing our experience implementing the course, we share survey and interview feedback from students, which touches upon themes such as learner anxiety toward peer assessment and a positive response toward the use of simplified scoring rubrics. We conclude with practical suggestions for teachers interested in carrying out iBT training with Moodle in their CALL rooms.


Introduction
In early 2010, we were put in charge of designing a spring semester CALL class for the highest-level, first-year students in the business faculty of our university. Knowing that some students would be taking the TOEFL in the near future in order to study abroad as part of exchange programs, and since the faculty had set a minimum TOEFL score for graduation, we chose to design a course around improving students' test-taking ability on the writing and speaking parts of the TOEFL iBT. We thought that weekly training would not only improve their scores on the test and potentially motivate them; we also recognized the educational value of the iBT: namely, that its exercises demand high standards of expressing oneself in English in a timely, structured, clear, and, in some cases, opinionated fashion. Whether or not students would actually take the TOEFL iBT exam, we believed that time spent on these iBT exercises would be of long-term benefit, especially in terms of communicating clearly in speech and writing. We also felt that it would improve students' ability to write using a keyboard and browser-based text entry form - something identified in previous research as an area in which Japanese students could benefit from training (see McDonald & Foss, 2007; Susser, 2008) - as well as increase practical language-related skills.
The purpose of this paper, therefore, is to share what we did in this class and what we learned, in the hope that it may be beneficial to other educators interested in either using the CALL classroom to improve their students' speaking abilities or for TOEFL iBT training. After describing how we set up and implemented the course on Moodle to attempt to simulate iBT testing conditions, we will discuss the main problems we encountered - both pedagogical and technical - and what we did to overcome them. We will also share student feedback from an end-of-semester survey and from qualitative interviews. Finally, we will make specific recommendations to other educators hoping to use the CALL classroom in a similar way.

The TOEFL iBT
Every year almost a million people across 165 countries take the TOEFL exam (Educational Testing Service, 2010a) as part of their efforts to secure admission to schools of their choice, win scholarships, gain professional accreditation, apply for visas, etc. Described as a test aimed at "measuring non-native speakers' ability to use English to communicate effectively in college and university settings," it is used by over 6,000 colleges and universities worldwide, as well as by government organizations and agencies (Riley & Wyatt, 2009).
Whether or not we, as language educators, believe the TOEFL to be an accurate measure of a learner's true command of the English language, or even if we disagree with the whole premise of a worldwide, standardized English test (see McCrostie, 2006) or its inevitable washback effect in language education (Bailey, 1999; Cheng, 2008), the practical value of the TOEFL to young, ambitious people is real: they simply need a good TOEFL score to open the doors necessary to realizing their future dreams. In addition, the types of skills required in the test (i.e., the reformulation of arguments heard in lectures in both written and spoken discourse) may well be skills required in future studies or careers.
The TOEFL iBT was introduced in 2005 as an alternative to the paper-based and computer-based versions of the TOEFL already in use. As a four-skills test, the iBT responded to criticisms that the TOEFL lacked a speaking component (Powers, 2010). The tasks were also reformulated to correspond to the actual skills required of students in an English-language academic setting (e.g., the integrated writing and speaking tasks), rather than testing discrete knowledge about forms of the English language that could be coached, which had led to negative washback effects on teaching and learning (Cumming, Grant, Mulcahy-Ernt, & Powers, 2004). And by being internet based, the iBT offers flexibility: candidates can now take the TOEFL with greater regularity and even make multiple attempts over a relatively short span of time. Without a doubt, the TOEFL iBT remains a popular measure of students' English ability at academic institutions in North America and around the world (Ohkubo, 2009).

Our CALL course
All new students in the business administration faculty at our institution take a CALL class as part of the first-year English curriculum. CALL classes have up to 35 students and are generally taught by native Japanese instructors. Students are streamed into levels according to their performance on an in-house placement exam. The primary purpose of the CALL class is to develop students' reading, listening, and speaking skills. For the 2010 spring semester, other CALL teachers were using a video-based textbook designed to foster topic-based inquiry and discussion on world culture, along with Gyuto-e, a CALL solution for Japanese EFL learners designed to improve grammar, reading, vocabulary, and listening.
The business faculty divides students into two primary streams: business administration and international business administration. In the fall of 2009, we were told that one of us would teach the upper-level CALL class of each stream. We were given free rein to design the course, the only stipulation being that we had to have our students use the Gyuto-e software as part of the course requirements. Knowing that many of the students in the top classes would be planning to go abroad during their university years and would most likely take the TOEFL exam at some point in the near future, we decided to focus on TOEFL iBT speaking and writing exercises. We believed this would not only help students improve their scores on the exam; we also saw the educational potential of the exercises toward helping students develop their ability to quickly organize and deliver coherent, structured, and opinionated speeches and essays - skills that are valuable in any language, including one's mother tongue. Given that Japanese takers of the TOEFL test historically tend to rank toward the bottom of test-takers in Asian countries (Hagerman, 2009) and most recently have continued that trend (Educational Testing Service, 2010b), there also seems to be some support for the idea that in Japan there is a need for new, innovative methods for TOEFL training.
The Business class was composed of 26 students (58% male, 42% female) while the International Business class had 16 students (50% male, 50% female). As might have been expected for the highest-level classes, a few of the students had actually lived abroad as children, while others had done one-year homestays in high school. Most of the students, however, had never spent much time abroad. All students took the CALL class in combination with a Communication and Writing Class (taught by native English speakers), as well as a Reading Class and a Listening Class (both taught by native Japanese speakers). Therefore, each group of students met four times a week with different teachers for a total of 6 hours of English.
Our CALL classes met on Thursday mornings in the CALL room. In each room were approximately 45 Windows PC desktop computers, along with a teacher's desk equipped with two computers: one for demonstrating to students and one for administration, the latter having the Wingnet application installed to monitor student activity. In addition, the room had a sound system for playing audio. Between each pair of student computers was an additional monitor for projecting video or whatever was happening on the teacher's computer. A teaching assistant was also present, though he or she was generally not involved in class activities.
Our general plan for the course was to have students begin each class with self-directed study on Gyuto-e for 30 minutes, and then use the remaining hour to focus on iBT training. For the semester, we chose to focus on the following three iBT-type exercises:
1. Independent Speaking: Students are shown a question on the screen, which is also read to them by a native speaker. They have 15 seconds to plan their answer and 45 seconds to speak their answer into the computer microphone, which records their response. An evaluator scores the response according to a 4-point scale.
2. Integrated Writing: Students are shown a short reading and given three minutes to examine it and take notes (if they wish). Then, they listen to a short lecture or spoken opinion (2-3 min.) on the subject, while taking notes. Finally, they are shown the reading again and are given 20 minutes to compose a response on the computer in the form of a short essay, attempting to integrate content from both the lecture and the reading. An evaluator scores the essay according to a 5-point rubric.
3. Integrated Speaking: Students are shown a short written announcement on the screen (approximately 80 words), which is followed by a 1-2 min. spoken response from a native speaker, generally agreeing or disagreeing with the announcement. Students can take notes if they wish. Students are then given 30 seconds to plan their response and 60 seconds to deliver it into the microphone. An evaluator scores the response on a 4-point scale, taking into account how successful the student was in integrating content from both the reading and listening.
Grades for the course were to be determined by performance on iBT exercises (40%), completion percentage of Gyuto-e (30%), and active in-class participation (30%).

Integrating Moodle with a sound recorder plugin
A self-hosted Moodle (v1.9.8) was chosen as the primary Course Management System (CMS), chiefly because the proprietary CMS provided by the university (Blackboard) gave teachers few administration privileges and no options for customization beyond the default settings; essentially, it was a closed system. For our purposes, we needed a CMS that could be configured so that students could record their voices directly into the browser when logged in. It also needed to give teachers easy access to the recordings, so that they could assign a grade based on a rubric, while also providing a space to give students feedback; something along the lines of what Mendori, Daniels, and Shinomori (2009) describe with their Flash Media Server. Moodle was our CMS of choice. Our plan was to search for a sound-recording plugin that would work in a timed Moodle quiz activity, both for simulating actual iBT conditions and for subsequent easy grading.
What we wanted to avoid most of all was having students record their voices offline, save the files as MP3 or WAV, give each audio file a specific name for easy tracking, and then upload the files one by one to folders on Moodle. After that, the teacher would have had to find the sound file for each student, open it in a sound player application, listen to it, and then go back to Moodle to assign a grade for the attempt. This would have been far too laborious and time-consuming.
In our search, we found two plugins that seemed promising: Audio Recorder (v1.1) and NanoGong (v3.2). Unfortunately, we could not get Audio Recorder to work in conjunction with Moodle quiz activities. And even though Kalamarz (2010) and Dang and Robertson (2010) have had some recent luck using NanoGong with Moodle, we were unable to get the NanoGong activity module to function correctly: it simply would not show the teacher a listing of all student recordings made at any given time. Attempts to solve the problem, such as looking for help on the developer's forums and fiddling with the code, proved fruitless (note: NanoGong 4.0 was released shortly after our course began in the spring of 2010).
We did not consider the use of a third-party online service such as Voicethread (which does feature integration with Moodle), mostly for the same reasons that Daniels (2008) points out: concerns about privacy and ownership of content. These concerns were particularly relevant in our case, since we planned to use the student-generated content for official assessment purposes.
We did, however, discover that one of the features NanoGong offers is integration with the extended HTML editor on Moodle. It inserts a little speaker-shaped icon amongst the myriad other buttons and tools listed there. A user simply clicks the speaker icon, and a pop-up window appears with the NanoGong recorder. A message can be recorded using the PC's external microphone and then inserted as an audio object into the text window. Once saved, the submitted text displays the embedded audio object (in the form of the same speaker icon); clicking the object brings up the NanoGong recorder, allowing the user to listen to the embedded audio.
We then realized that all we had to do was create a timed quiz activity using the "essay" type Moodle quiz question. We could train students to access the NanoGong recorder through the extended HTML editor and record their own verbal responses; once they demonstrated proficiency in its use, we could quiz them under conditions simulating the iBT as closely as possible, simply by setting a time limit after which the quiz would close.
Another benefit of using Moodle for these activities was that it served as a gathering point for the student recordings and teacher feedback on those recordings based on the scoring rubric. This allowed students to check the teacher's comments against the rubric and against their own performance. As previously explained, trying to keep things together in one place was not only practical for teachers and students, but it also provided a record of learning for the students, giving them an opportunity to reflect on each response before attempting the next task a week later. This habit of alternating action and reflection has the potential to "equip the student to better exploit future opportunities for reflection in a self-managed way" (Levy & Kennedy, 2004, p. 64). Or as Schwienhorst (2008, p. 11) puts it, students "should be encouraged to critically reflect on their learning process and develop a personally meaningful relation to it."

Implementation of course
After having the students register themselves on Moodle and familiarize themselves with its interface during the first day of class, we proceeded to use the weekly course format blocks to guide students through the weekly activities, giving them a list of what they were expected to do that day, along with examples of iBT responses, transcripts, audio recordings, practice quizzes, etc. We handled one iBT exercise at a time, spending approximately three weeks on it, culminating in a graded quiz. The sum of all three graded quizzes, plus the midterm and final (each with multiple iBT activities) would comprise the iBT quiz component of the final course grade. In what follows, we will describe briefly how we implemented each iBT exercise in class using Moodle and the CALL room.

Independent Speaking
In Week 2 of the course we began our iBT training with Independent Speaking because it was the simplest. After introducing the task to the students and having them practice face-to-face several times, we then let them try it themselves on Moodle. We learned right away that simulating iBT conditions was going to be difficult for technical reasons. So after the first day of iBT practice, we made three decisions:
1. Doing away with time settings on Moodle quizzes. It took varying amounts of time for each student to load the recorder after clicking the sound icon. This was for two reasons. First, we were using Firefox, since our up-to-date version of Moodle wasn't working properly with the older versions of Internet Explorer that the university provided. Firefox would often ask permission before allowing a pop-up containing a script, which required students to click the "allow" button to continue. Second, recorder load times simply differed for each student, depending on the computer they were using and whether they were among the first in the class to start the loading process or among the last. It seemed that those who clicked first got faster load times.
The variations in voice recorder load times meant that some students ran out of time before they had a chance to finish their recordings. And if we increased the time limit, then the students whose recorders loaded faster would end up having more than the allotted time to complete their answers. Since the load time could not be predicted accurately, we decided to abolish the time limit in the quiz settings and time the students manually ourselves. That way, each student not only had ample time to load the recorder, but also had time to check their microphone recording levels. When everyone's recorder was loaded and working properly, we started the exercise, verbally announcing its beginning and end.

2. Reading the question aloud rather than embedding a recording of it into a Moodle quiz.
Because of the loading time issue described above, we felt it would be better to read the questions aloud to the students rather than depend on students starting a pre-recorded question in their browsers. Even if we had been able to set the recording to play automatically upon starting the quiz, the varying load times of the voice recorder would have made this problematic.
3. Displaying the question on the center screen. Due to the same recorder loading time issue, we decided it would be necessary to remove the test question from the Moodle quiz as well, displaying it on the center monitor instead. Doing so ensured that all students, when ready to go, had equal time to view the question while it was being read, during their 15-second planning time, and for their 45-second recording time. This was necessary in order to achieve our aim of simulating actual test conditions that treated all students in the same way.
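Since we abandoned Moodle's built-in quiz timer in favor of manual timing, even a simple countdown script on the teacher's machine can keep phase lengths consistent from class to class. The sketch below is our own illustration and was not part of the course software; the default phase lengths follow the Independent Speaking task (15 s planning, 45 s speaking), and the `announce` and `wait` parameters are injectable purely so the timer can be exercised without actually sleeping.

```python
import time

# Phase lengths (seconds) for the iBT Independent Speaking task as
# described above: 15 s planning, 45 s recording. For the Integrated
# Speaking task, substitute [("Planning", 30), ("Recording", 60)].
PHASES = [("Planning", 15), ("Recording", 45)]

def run_task(phases=PHASES, announce=print, wait=time.sleep):
    """Announce the beginning and end of each timed phase.

    `announce` and `wait` can be swapped out (e.g. for a no-op) so the
    phase sequence can be checked without printing or sleeping.
    """
    log = []
    for name, seconds in phases:
        announce(f"{name}: begin ({seconds} seconds)")
        log.append((name, "begin"))
        wait(seconds)
        announce(f"{name}: end")
        log.append((name, "end"))
    return log

# In class the teacher would simply call run_task(), which blocks for
# the full 60 seconds while printing the phase announcements.
```

A text-to-speech call or an audible beep could replace `print` in `announce` if verbal announcements by the teacher are impractical.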
During Week 3, we introduced a modified version of the official iBT scoring rubrics published by ETS and had the students evaluate each other during face-to-face practice. Rather than use a 4-point scale that lumped all three categories together (delivery, language use, and topic development), we allotted a separate 4-point scale to each of the three, resulting in three separate scores and a maximum total of 12 points.
We thought that doing so would provide more specific feedback to the students, allowing each to home in on the particular aspect of their response that needed the most improvement. We also simplified the descriptions contained in the rubric, as they were far too verbose, being intended for professional educators and evaluators rather than for learners themselves. Finally, we thought that the practice of peer evaluation would develop students' awareness of what makes a good response and how to recognize its three different aspects, not only when listening to the responses of other students, but also when listening to their own recorded responses on Moodle. Doing so might also draw students into reflecting on their own strengths and weaknesses, not simply in terms of sentence construction but also in terms of pronunciation and intonation.

Our training with Independent Speaking culminated in a graded quiz in Week 4. Students recorded their responses and the teacher evaluated each response using the quiz module, not only listing the 4-point scores for each category and the resulting 12-point total, but also giving written feedback for the students to ponder.
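The 12-point scheme is simple enough to capture in a few lines. The following sketch is illustrative only: the category names come from the ETS rubric, while the validation logic, function names, and example scores are our own.

```python
# The three categories of the modified speaking rubric, each scored
# 0-4 and summed to a 12-point total.
CATEGORIES = ("delivery", "language_use", "topic_development")
MAX_PER_CATEGORY = 4

def total_score(scores):
    """Validate per-category scores and return the 12-point total."""
    for cat in CATEGORIES:
        s = scores[cat]
        if not 0 <= s <= MAX_PER_CATEGORY:
            raise ValueError(f"{cat} score {s} outside 0-{MAX_PER_CATEGORY}")
    return sum(scores[cat] for cat in CATEGORIES)

# Hypothetical example: a response strong on delivery but weak on
# topic development.
example = {"delivery": 4, "language_use": 3, "topic_development": 2}
print(total_score(example))  # 9 out of a maximum of 12
```

Keeping the three category scores separate, rather than storing only the total, is what lets the feedback point at the specific aspect needing improvement.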

Integrated Writing
This section was less technically challenging than the other two, simply because it did not involve the NanoGong sound recorder. We carried it out by using the center screen to display the reading, the classroom sound system to play the lecture, and an essay-type question in the Moodle quiz activity for students to compose their response. As a result of our experiences with the Independent Speaking exercise, timing was once again done manually by the teacher. Students were able to compare their essays - and those of their peers - with sample responses, both from the teacher and from selected student essays. A simplified 5-point rubric was used, based on the official ETS published version.
For some of the lectures we used actual recordings from iBT practice tests, while for others we wrote our own and read them aloud to the students. We found that we could match the content and level of the readings and lectures to our students much better than the iBT samples we were able to find. While the most proficient of our students - several of whom had lived abroad as children - were able to cope with genuine sample iBT questions that we tried in the first week of practice for this exercise, others found the content of the lectures impossible to understand. We therefore felt the need to simplify the lectures slightly, so that all students had a real chance to practice the skills involved in the iBT exercises.

Integrated Speaking
For this exercise, the exact same Moodle quiz set-up was used as for Independent Speaking. After reading the announcement on the center screen, students listened to the native speaker's response, either on the classroom speakers or from the teacher himself. Then, after having 30 seconds to plan, students recorded their 60-second responses into a Moodle quiz using the sound recorder.
As with Integrated Writing lectures, we ended up writing a number of our own announcements and responses, partly due to lack of access to sufficient iBT practice materials, but also to make the content a little more interesting and relevant to our students.
Peer and teacher evaluations were carried out using a 12-point scoring rubric similar to the one we used for Independent Speaking exercises. It was also adapted from the official 4-point iBT rubric published by the ETS.

Student feedback
On the final day of the semester, we administered a survey via Moodle to gather feedback on the students' experience. The survey contained both quantitative and qualitative items. For the former, we used a 6-point Likert scale (to discourage the student tendency to choose the middle value), while the latter was used to follow up on answers to the Likert-scale questions. The survey questions were written in English, but students were encouraged to respond to the qualitative items in Japanese, in order to better express themselves. The survey was not anonymous, but students were assured that it was for research purposes and that their responses would not in any way affect their grades in the class. In addition to the survey, we also conducted short face-to-face interviews in English with small groups of students. After reviewing the survey results and listening to the interview recordings, we identified two emergent issues worthy of further discussion.
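For readers tallying similar 6-point Likert data, a minimal aggregation sketch follows. The scale labels and the sample responses are hypothetical illustrations, not our actual survey items; note that a 6-point scale has no neutral midpoint, so every respondent leans one way.

```python
from collections import Counter

# Assumed labels for a 6-point Likert scale with no neutral midpoint.
SCALE = ("strongly disagree", "disagree", "slightly disagree",
         "slightly agree", "agree", "strongly agree")

def percentages(responses):
    """Return the rounded percentage of responses for each scale label."""
    counts = Counter(responses)
    n = len(responses)
    return {label: round(100 * counts[label] / n) for label in SCALE}

def top_two_agreement(responses):
    """Share (%) of respondents choosing 'agree' or 'strongly agree'."""
    hits = sum(1 for r in responses if r in ("agree", "strongly agree"))
    return round(100 * hits / len(responses))

# Hypothetical sample: four responses to one survey item.
sample = ["strongly agree", "agree", "slightly agree", "disagree"]
print(top_two_agreement(sample))  # 50
```

Reporting both the full distribution and a top-two summary, as we do in the results below, shows how concentrated agreement is rather than just its overall direction.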

Student attitudes toward peer review on Moodle
Throughout the semester, teachers evaluated the official quizzes using the modified iBT rubrics, accompanied by written comments. Students, however, prepared for these quizzes by practicing in pairs and using the same rubrics to evaluate each other. This peer review process consisted of face-to-face evaluation using the rubrics we provided, on the one hand, but also of students listening to the recordings and reading the essay responses of their peers on Moodle. When surveyed, students seemed more receptive toward having access to the recordings and essays of their peers, but were slightly less enthusiastic about their peers being able to access their own work (see Figures 1 and 2). Reasons for students not wanting their recordings to be listened to by their peers varied, but some of the most common had to do with shyness and lack of confidence. As one student put it: "I don't think it is a good idea to let students listen to each other's answers in Moodle because there are a lot of people who don't have confidence in English and it would be uncomfortable if other students listened to his/her answer."
Although they were uncomfortable with having their recordings listened to by others, many expressed a desire to listen to their peers' recordings.
Student A: I want to hear other student's record because I think listening to others is really useful to improve my skill. But the other hand some people won't like to open their record to someone else.

Student B:
If we could listen the other people's recordings, I think we could learn good things from that. So, I want to listen the others' recording.

Student C:
Listening each others recording make it possible for students to give some advice to each other.
Interestingly, however, students were slightly less concerned about other students reading what they had written for the essay-based responses. This may suggest that students are more self-conscious about, and/or lack confidence in, their speaking ability compared to their writing ability. This seems to echo the findings of Dang and Robertson (2009), who reported insufficient confidence or potential criticism as possible reasons for students not using voice-recording options for communication with their peers in an online course. Perhaps there is something about vocal expression that is more personal than written expression, or it could be that the well-established practice of peer review in writing courses means that students are more used to it. On a related note, Stricker and Attali (2010) found that attitudes toward the speaking section of the iBT were largely unfavorable among test takers from a variety of different countries, suggesting either a dislike for speaking tests in general or perhaps uneasiness with the iBT format, such as its absence of interaction.
Furthermore, another recent study in Taiwan found that learner anxiety on an integrated (reading-then-speaking) speaking activity was less than on a speaking-only activity, possibly due to the presence of textual input before speaking, even though learners reported that the integrated activity was more difficult (Heng-Tsung & Shao-Ting, 2010).

Student feedback on use of simplified iBT scoring rubrics
Another emergent issue from our inquiry was the overwhelmingly positive response toward the use of the simplified iBT scoring rubrics described above. Students largely thought that the rubrics not only made the process of evaluating iBT responses easier and more focused, but also led to greater feelings of improvement and accomplishment toward the end of the semester. For example, 75% of students "strongly agreed" or "agreed" that the 12-point modified speaking rubric was useful, and another 14% "slightly agreed." As one student expressed it: "Because the scoring standards were clearly shown in the rubric, I felt that it guided me to speak and write according to a comprehensible, very effective method." [translated from Japanese]

Another student had this to say: "Although the speaking exercises were difficult, the rubric helped me become aware of my own language power and how I could try hard to improve it." [translated from Japanese]

Not everyone, however, felt that the rubrics were entirely helpful. One student actually thought that the rubric complicated matters: "It was useful when the teacher scored my assignments, because I could see what was good and what was bad. But when I scored other people's assignments, it was difficult to decide the score from 12-point scoring rubric because there were too much list in one score."

Recommendations for teachers
The following is a set of recommendations for teachers interested in using Moodle with an audio-recording plugin to integrate iBT-style speaking and writing exercises into their classes.
• Search for, install, and configure your sound recording plugin well before the semester begins, and test it in the same room that you will be teaching in. Make sure it works before introducing it to your students!
• Have students practice in pairs or small groups before having them record their responses online. This builds trust, rapport, and a sense of camaraderie amongst the students. It also breaks up the monotony and loneliness of staring at computer screens.
• Consider starting with a reading-then-speaking integrated speaking activity before attempting an independent speaking activity, to build confidence from the beginning. A positive experience with an activity that causes less anxiety should lead to better performance later on (see Heng-Tsung & Shao-Ting, 2010).
• Make an attempt to alleviate students' fears concerning peer review of their recorded and/or written responses. This could be done either by making student responses posted on Moodle anonymous to their peers (but not to the teacher) or by forming small groups of students who evaluate each other all semester, so that they build trust over time.
• Strongly consider the use of simplified rubrics for self, peer, and teacher evaluations.

Conclusion
As with any first-time implementation of a particular approach to a learning task in the CALL room, our iBT simulation project was not without its unexpected adjustments and technical issues. It did, however, shed some light on the use of Moodle to record student voices for the purpose of both reflection and evaluation. Furthermore, our follow-up with students at the end of their experience revealed some of their attitudes toward the process, particularly concerning peer review and the use of simplified rubrics. Perhaps most interesting to us was the fact that students expressed more hesitancy or shyness toward their peers listening to their recorded responses to the iBT speaking exercises than toward those same peers reading what they had written in the iBT essay responses. Further research into this issue could generate valuable findings. It would also be worth investigating to what extent students' self-reflection and self-evaluation of their work improves their ability to answer the TOEFL iBT speaking exercises, and whether they experience a sense of accomplishment over the course of the semester. Finally, further research could be done to find out whether the act of repeated reflection by students on their own portfolio of spoken recordings fosters greater learner autonomy over time.