While it is becoming increasingly common (and easy) for lectures to be placed online as videos or podcasts, such media are often of limited use. For example, I missed a lecture last week for my numerical analysis class. Fortunately, the professor always posts his lectures on Blackboard, and I watched the whole thing a couple of nights later. Peachy. Yet when I wanted to review a particular concept covered in a class a few weeks earlier, I spent far too much time searching through the 90-minute lecture for the bits I was interested in.
MIT researchers believe that a variety of speech recognition tools (as well as some intense cataloging tools they built) have matured to the point at which they can reliably address this problem. According to an article in MIT's Technology Review, researchers can now
break up a lengthy academic lecture into manageable chunks, pinpoint the location of keywords, and direct the user to them. Announced last month, the MIT Lecture Browser website gives the general public detailed access to more than 200 lectures publicly available through the university's OpenCourseWare initiative. The search engine leverages decades' worth of speech-recognition research at MIT and other institutions to convert audio into text and make it searchable.
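To make the "pinpoint the location of keywords" idea concrete, here is a minimal sketch of how a timestamped transcript could be turned into a searchable index. This is not the Lecture Browser's actual implementation, which the article does not describe in detail; the function names, the `(start_seconds, word)` transcript format, and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict


def build_index(words):
    """Map each lowercased word to the times (in seconds) at which it was spoken.

    `words` is a hypothetical recognizer output: an iterable of
    (start_seconds, word) pairs in the order they were spoken.
    """
    index = defaultdict(list)
    for start_seconds, word in words:
        index[word.lower()].append(start_seconds)
    return index


def find(index, keyword):
    """Return every timestamp at which `keyword` was recognized (case-insensitive)."""
    return index.get(keyword.lower(), [])


# Made-up sample data standing in for a transcribed lecture.
transcript = [
    (12.0, "today"), (12.4, "we"), (13.1, "cover"),
    (13.5, "Newton"), (14.0, "iteration"),
    (1830.2, "newton"), (1830.7, "converges"),
]

index = build_index(transcript)
print(find(index, "Newton"))  # timestamps a player could jump to
```

With something like this, my hunt through a 90-minute lecture would reduce to a lookup followed by a jump to each returned timestamp.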
Of particular note is the tool's ability to handle the "rambling and mumbling" of a typical lecture, technical terminology, and the accents of many lecturers, whose native languages are often not English:
They trained the software to understand particular accents using accurate transcriptions of short snippets of recorded speech. To help the software identify uncommon words--anything from "drosophila" to "closed-loop integrals"--the researchers provided it with additional data, such as text from books and lecture notes, which assists the software in accurately transcribing as many as four out of five words. If the system is used with a nonnative English speaker whose accent and vocabulary it hasn't been trained to recognize, the accuracy can drop to 50 percent. (Such a low accuracy would not be useful for direct transcription but can still be useful for keyword searches.)
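The closing parenthetical deserves a quick back-of-the-envelope check. A keyword that matters in a lecture is usually spoken more than once, and a search only needs one of those mentions to be transcribed correctly. The sketch below assumes, purely for illustration, that each mention is recognized independently with the same probability; real recognition errors are likely correlated (a speaker who garbles a term once may garble it every time), so treat the result as an optimistic estimate rather than a measured figure.

```python
def hit_probability(word_accuracy, mentions):
    """Chance that at least one of `mentions` occurrences is transcribed correctly.

    Assumes (unrealistically) that each occurrence is recognized
    independently with probability `word_accuracy`.
    """
    return 1.0 - (1.0 - word_accuracy) ** mentions


# 50 percent per-word accuracy, keyword spoken 1 to 5 times.
for mentions in range(1, 6):
    print(mentions, hit_probability(0.5, mentions))
```

Under that assumption, a term spoken five times at 50 percent accuracy would still be found about 97 percent of the time, which is why a transcript too poor to read can still support search.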