Audio transcription engine

Does anyone here have experience automating a sound file transcription workflow on a Mac?

I am after something that would process a list of recordings and save each one along with a rough text transcript, so that the recordings can be searched later.

The search could happen anywhere: Finder, DEVONthink, etc.

First of all, I am looking for a transcription engine (automatable or not) that charges neither by usage nor by subscription, as I aim to transcribe quite a lot. I am OK with a one-off payment, but I do not want a recurring cost.

Here is what I’ve tried/seen so far:

  • Descript: great experience, but too expensive since it charges by usage, and I am not sure it can be automated.
  • Looping playback via Audio Hijack or similar into Draft / Apple dictation: a free solution, but the file has to play for its full duration, and dictation can hit a time limit.
  • A homemade script using the Python SpeechRecognition library: relatively slow, and limited to short recordings with the free version.
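
For illustration, the batch workflow described above (walk a folder of recordings, save a rough transcript next to each one) could be sketched roughly as below. This is a minimal sketch under assumptions, not a tested setup: the function names, the list of file suffixes and the "recording.wav.txt" naming are all made up for the example, and the engine is passed in as a plain function so that any transcriber could be swapped in. The optional engine at the end uses the third-party SpeechRecognition package and its free Google web recognizer, which is the one that struggles with long recordings.

```python
from pathlib import Path
from typing import Callable

# Assumption: only formats that SpeechRecognition's AudioFile can read.
AUDIO_SUFFIXES = {".wav", ".aiff", ".aif", ".flac"}


def find_recordings(folder: Path) -> list[Path]:
    """Return the audio files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )


def transcribe_folder(folder: Path, transcribe: Callable[[Path], str]) -> list[Path]:
    """Write a plain-text transcript next to each recording that lacks one.

    `transcribe` is any function taking an audio path and returning text.
    Returns the transcript files created during this run.
    """
    created = []
    for audio in find_recordings(folder):
        # "talk.wav" -> "talk.wav.txt", so two formats never collide.
        sidecar = audio.with_name(audio.name + ".txt")
        if sidecar.exists():
            continue  # already transcribed on an earlier run
        try:
            text = transcribe(audio)
        except Exception as exc:
            # One bad file should not stop the whole batch.
            print(f"skipped {audio.name}: {exc}")
            continue
        sidecar.write_text(text, encoding="utf-8")
        created.append(sidecar)
    return created


def speech_recognition_engine(audio: Path) -> str:
    """Optional engine using the SpeechRecognition package (pip install SpeechRecognition)."""
    import speech_recognition as sr  # imported lazily; only needed for this engine

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(audio)) as source:
        data = recognizer.record(source)  # read the whole file into memory
    return recognizer.recognize_google(data)  # free web API, short clips only
```

It could then be run as `transcribe_folder(Path("~/Recordings").expanduser(), speech_recognition_engine)`, where the folder path is only a placeholder. Since the transcripts are ordinary text files sitting next to the audio, tools that index plain text should be able to find them.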

There are dozens of paid services and transcription APIs online, but all of them charge on a usage model (a fee per x minutes) or a subscription with bundled minutes, which does not sit well with an always-on audio archiving engine.


Unfortunately I have not got a solution for you, but I have had great success with the Otter app on my iPhone for transcription. They may offer other products that would do what you need.

I agree with @timlawson here. I would look at Otter. As someone who has had a lifetime of getting interviews and group discussions transcribed, apps like Otter almost feel like a miracle.

Remember that to get this done manually, I would always estimate 4x the length of the audio, and that is if you can get hold of a really good transcriber. It can be up to 8x for a multiple-person interview.

You don’t mention the amount of transcription you will need but, frankly, a month of Otter, or some of the others, is an absolute steal for the cost. You can turn the subscription off and on as you wish.

My most recent workflow has been to record on Zoom, download the audio, and upload it to Otter, which then transcribes it (there is a way of doing this automatically, I just haven’t done it yet). The site allows you to follow the interview through the transcription. I then download the document and check it. Given that I did the interview myself, I tend to archive the audio and use the transcription for analysis.

I have not set up any automation, however.

Nick
