I heard about Podzinger on the Inside the Net podcast and thought it sounded neat, but I couldn't believe it did what they suggested: a Google-like service that trawls mp3 podcasts (NOT iTunes AAC-encoded content - Ha!), runs speech recognition on them and then builds a hash table based on timecode, so that when you do a text search for something that was said you can play the file from a few seconds before the utterance, right there in the browser!
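To make the idea concrete, here is a minimal sketch of how such a lookup might work. This is my guess at the general shape, not Podzinger's actual implementation: the function names, the three-second lead-in and the sample transcript are all made up for illustration, and I've assumed the recognizer hands back (time, word) pairs.

```python
from collections import defaultdict


def build_index(transcript):
    """Map each lowercased word to the times (in seconds) at which it was spoken."""
    index = defaultdict(list)
    for seconds, word in transcript:
        index[word.lower()].append(seconds)
    return index


def playback_points(index, word, lead_in=3.0):
    """Return start offsets a few seconds before each utterance, never below zero."""
    return [max(0.0, t - lead_in) for t in index.get(word.lower(), [])]


# Hypothetical recognizer output: (time in seconds, recognized word)
transcript = [(1.5, "welcome"), (12.0, "media"), (12.5, "portal"), (95.0, "Media")]
index = build_index(transcript)
print(playback_points(index, "media"))
```

A real service would also have to cope with recognition errors, multi-word phrases and many files at once, none of which this sketch attempts.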
Now my interest comes from the fact that in the mid-80s I was doing a degree in maths & programming and spent my final year on a thesis on speech recognition. I did build a system that could reliably recognize about two dozen words spoken by one person - all coded in native x86 and Pascal! I was aware of multiple-speaker / large-vocabulary recognizers, but this system is something else. I notice that BBN are the company behind it, and since they built the original nodes of the ARPANET they clearly have a long technical pedigree.
Anyhow - give it a go - type in a phrase like "Media Portal" and see which podcasts are talking about homebrewed PVRs (for example).