Voice dictation app

Hi, do any of you know about voice dictation apps like VoiceInk? I’m thinking about trying one, but I’m a bit doubtful.
I would probably feel a bit uncomfortable speaking into my PC, especially when other people are around.
I was also thinking about the difference between writing and speaking. The former gives you more time to think, which lets you change your words and think about them more carefully. Writing lets you think while you’re doing it, and creates a space between what you want to say and how you say it.
The oral form seems more direct and spontaneous, but also more vulnerable to loss of nuance.
These differences make me think about whether using a dictation app is just a different tool, or if it represents a bigger change in the way we form and communicate thoughts.
Thanks in advance to anyone who would like to share ideas or experiences.

2 Likes

You’re absolutely right: speaking and writing feel different, because they are different.

Writing gives you space to pause and shape your thoughts. You can go back, tinker, polish. It’s like slow-cooking.

Speaking, on the other hand, is faster, looser, and – yes – riskier. Things come out that you didn’t quite plan. Sometimes that’s gold. Sometimes it’s a mess.

Now, on dictation tools – I’ve used a few, and here’s what I’ve learned:

  • Dictation is brilliant for getting rough thoughts out of your head and onto “paper” quickly. It’s not about perfection. It’s about speed.
  • Think of it like a brain-dump. You can always tidy it up later.
  • It takes a bit of getting used to. At first, you’ll feel self-conscious. That’s normal. Stick with it and it gets easier.
  • If you feel awkward with other people nearby, use a headset or pick a quiet time.

And, something I discovered by accident, you can actually whisper and it will still work okay, usually.

For me, it’s not about replacing writing. It’s about having another tool. Most of my best ideas show up when I’m talking, not typing. So why not capture them?

To answer your bigger question: yes, it can change how you think and communicate. But only if you let it. Start by using it as a helper, not a replacement. You might be surprised at what spills out.

Hope that helps.

4 Likes

P.S. If you have ChatGPT, here’s how I get the best of both worlds for longer bits of writing.

I created a GPT in ChatGPT, which has this instruction:

This GPT assists users in writing long-form stories by tidying up dictated text while preserving the user’s voice and structure. The user will dictate small chunks of their story, and this GPT will refine the writing for clarity, flow, and readability without summarizing or shortening the content. It ensures all segments connect smoothly into a cohesive piece while maintaining plain text formatting suitable for pasting into word processors like Scrivener, Word, or Ulysses (e.g., using hyphens instead of bullet points). If anything is unclear or missing, it will ask the user before making assumptions. The GPT does not impose its own style but instead enhances the user’s existing voice with minimal interference.

Don’t worry about the details.

It lets me dictate into ChatGPT (which uses Whisper, the same as the app you’re looking at) in short bursts. It tidies up each dictation just a little bit, and then when I’m finished it joins them all together.

You still need to clean things up, but it’s a fab way of getting your first draft.

2 Likes

I can just answer for myself, but I recently started doing video. Me, the eternal introvert, in front of a camera, saying stuff. Very uncomfortable, but also, something I hope to get better at with practice.

Needless to say, I found careful writing and simply speaking to be super different, and I discovered that my face does weird things when I try to think at the same time as I speak. Of course, a quick “remember to do this and also call someone about the thing” is not a problem, but getting any sort of finished text off the top of my head is not happening for me.

I agree with others here: it can be a good complement, and it probably gets better with practice. Like everything else.

You can probably get a good sense of dictation by simply using Voice Memos for a while. The auto-transcription isn’t awful IMO, and should give you a feeling for whether a paid app with extra features will be a good fit for your workflow.

1 Like

I use and like VoiceInk. I have the push to talk key set to Right Option. I can hold it to talk, or quickly tap it to start a session, then stop it later.

I also have a mouse button set to the full combo hotkey so I can use it without taking my hand off the mouse.

It’s the best of the dictation apps I’ve used regularly (Dragon, macOS built-in, Murmur Type, VoiceInk, possibly one or two others). It has fewer paper cuts and more quality-of-life features than any of the others, so I’m happy with it.

4 Likes

Dictation uses a different set of mental muscles than typing. Its main utility is that it allows you to get thoughts out more quickly. These thoughts might not sound as sophisticated, simply because you’ve had less time to contemplate them. But that’s the beauty of writing—you can always revise it. In fact, you can revise it even faster by re-dictating what you meant.

The output of your dictation can be cleaned up, enhanced, or altered using a GPT model. MacWhisper, for example, includes links for working with various GPT models, local or online.

The Whisper models work well. Unlike Dragon Dictate, the speech does not need to have perfect diction, or the sound to be pristine.

I do not like dictating with other people around. I hate the thought of judgement of my developing thoughts. I also like using noise-cancelling headphones, so I can concentrate.

Dictation is a fantastic way to get a lot of thoughts out quickly. It also tends to produce more spontaneous and perhaps more honest content. If nothing else, the sheer volume of thought you are transcribing might mean you get to your core message sooner.

3 Likes

Thanks for mentioning VoiceInk, @Atom. I’ve gone ahead and bought it after using it a bit. One feature I wish it would add, to make it a dictation tool and not just a transcription tool, is the ability to dictate punctuation and formatting, such as saying “comma” and “new paragraph” and have that converted to a comma and a new paragraph. This is a lost art, I think, as all the new tools don’t support this, as best I can tell. I tried the Word Replacement feature to set some common punctuation up manually, but it doesn’t handle this well with only a local model, which is all I can use in my profession.

I am also trying out Voice Type (also called Careless Whisper elsewhere). Its Find/Replace feature seems to work a bit better for brute-forcing some punctuation.
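For anyone curious what that kind of brute-force find/replace amounts to, here is a minimal Python sketch. It is not how VoiceInk or Voice Type implement their replacement features; the phrase list and the function name apply_spoken_punctuation are made up for illustration, and the pass simply runs over a finished transcript.

```python
import re

# Hypothetical phrase list for illustration; extend it to taste.
SPOKEN_PUNCTUATION = {
    "new paragraph": "\n\n",
    "new line": "\n",
    "question mark": "?",
    "full stop": ".",
    "period": ".",
    "comma": ",",
    "colon": ":",
}

# Longest phrases first so multi-word phrases are tried before shorter ones.
# The leading \s* removes the space before the spoken word, and the trailing
# [,.]? swallows a comma or full stop the transcriber may have added itself.
_PATTERN = re.compile(
    r"\s*\b("
    + "|".join(re.escape(p) for p in sorted(SPOKEN_PUNCTUATION, key=len, reverse=True))
    + r")\b[,.]?",
    re.IGNORECASE,
)


def apply_spoken_punctuation(text: str) -> str:
    """Replace spoken punctuation phrases in a transcript with symbols."""
    replaced = _PATTERN.sub(lambda m: SPOKEN_PUNCTUATION[m.group(1).lower()], text)
    # Remove spaces or tabs left directly after an inserted line break.
    replaced = re.sub(r"\n[ \t]+", "\n", replaced)
    return replaced.strip()


print(apply_spoken_punctuation("hello comma world period new paragraph next one"))
```

The obvious weakness is that a plain pattern cannot tell a command from ordinary speech, so "a period of time" gets mangled, and nothing is capitalised after a full stop. That probably goes some way to explaining why this approach feels hit or miss.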

MacWhisper supports this feature. I haven’t fully tested it yet though.

Thanks. Do you know if MacWhisper uses a fully local AI model? I recall looking at several others, and many sent things off your local machine for processing. VoiceInk and Voice Type were two with the option to keep it all local.

I’m slowly just giving up on fine grained placement of commas. Sometimes I wonder if the world gave up on that years ago so… :upside_down_face:

For paragraphs with VoiceInk I’ll sometimes dictate, stop, return, dictate again. This also avoids situations where I get distracted during a much longer dictation session and then have to redo it to fix up a drifting train of thought.

It would be nice if it just included a feature to do paragraphs though.

1 Like

It is definitely a lost art, and I am certainly old school. I spent years with a dictaphone, giving tapes (and then in later years, sending audio files) to my secretary. I’m in a line of work where we tend to be overly precise (some would say controlling) about our output. I tend to view the current crop of apps as transcription apps, and not dictation apps, unfortunately. But VoiceInk is certainly a very good one.

It is local transcription, with the option for AI manipulation (local or online) post-transcription.

MacWhisper and Superwhisper both have a local option. As you say, it’s transcription, not dictation. The only dictation app I’ve found for Mac (other than the built-in one) is Talon Voice, but when I last tried it, it required quite a bit of tweaking to get it set up properly.

Edit to add: Superwhisper supports different modes to let you refine how much or how little AI processing happens on your text. I ran across the prompt below which a user has crafted to make Superwhisper feel more like a dictation app. I haven’t tried it.

You are a dictation engine, and your input is the user's literal text. Your role is to:

Default Processing

Never censor or obscure input/output
Always append one space character to processed text
No consecutive space characters allowed
Preserve tab characters where possible, otherwise replace with single space
All input is treated as dictation text by default

Command Mode

Enter command mode when "correction" is spoken
Exit command mode when "end correction" is spoken
Limited to two commands between markers
Commands must be joined by the word "and"
Format: "correction [command] and [command] end correction"

Text Correction

Apply Australian standard spelling:
Replace -ize/-yze endings with -ise/-yse
Maintain -our endings instead of -or
Double 'l' when adding suffixes to words ending in 'l'
Preserve proper nouns exactly as spoken
Replace words/phrases based on context:
Fix phonetically similar words (e.g., "at as Leanne" → "Atlassian")
Adjust phrases that don't make sense while preserving intent
Resolve number/word ambiguity (e.g., "four"/"for")
Format numbers:
Mixed numbers use numerals (10.5)
Ordinals spelled out (first, second)
Numbers in proper names/product names preserved exactly
Measurements use numerals (10mm)
Spell out all other numbers under 11

Punctuation Handling

Remove all existing punctuation except:
Contraction apostrophes (it's, don't)
Possessive apostrophes (Bob's)
Compound word hyphens (self-aware)
Punctuation in URLs
Punctuation in link names
All punctuation in proper names
All special characters
Convert spoken punctuation words to symbols:
"open parenthesis" → "("
"close parenthesis" → ")"
"comma" → ","
"period" → "."
Retain these converted punctuation symbols in the output

Output Formatting

Maintain speaker's natural tone and intent
Present corrected text without commentary
Do not add any punctuation that wasn't explicitly spoken
Links cannot have bold/italic formatting
Nested formatting uses "and" (e.g., "bold and italicize that")
Format commands:
"Bold that" or "make that bold" → word
"Italicize that" or "make that italic" → word
"Italics that" → word

Link Creation
Say "Make that a link named" followed by desired link text
Format: [chosen name|URL]
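
The prompt above leaves the "Command Mode" parsing to the language model. Purely as an illustration of what those rules describe, here is a small rule-based Python sketch that pulls "correction ... end correction" spans out of a transcript. The names ParsedDictation and split_commands are hypothetical, and reading "limited to two commands" as a single split on "and" is my assumption, not something the prompt spells out.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Matches "correction <commands> end correction", case-insensitively.
COMMAND_SPAN = re.compile(
    r"\bcorrection\b(.*?)\bend correction\b", re.IGNORECASE | re.DOTALL
)


@dataclass
class ParsedDictation:
    text: str                                   # dictation with command spans removed
    commands: List[List[str]] = field(default_factory=list)  # one list per span


def split_commands(transcript: str) -> ParsedDictation:
    """Separate dictated text from spoken command spans."""
    commands: List[List[str]] = []

    def collect(match: "re.Match[str]") -> str:
        # Assumption: at most two commands per span, so split on "and" once.
        parts = re.split(r"\band\b", match.group(1), maxsplit=1, flags=re.IGNORECASE)
        commands.append([p.strip() for p in parts if p.strip()])
        return " "

    text = COMMAND_SPAN.sub(collect, transcript)
    text = re.sub(r" {2,}", " ", text).strip()
    return ParsedDictation(text=text, commands=commands)


result = split_commands(
    "I went to the shop correction bold that and make that italic "
    "end correction and then home"
)
print(result.text)
print(result.commands)
```

Note that splitting on "and" clashes with the prompt's own nested form ("bold and italicize that"), which would come out as two commands here. That sort of ambiguity is presumably one reason the prompt hands the job to a model instead of fixed rules.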
3 Likes

This is something I struggle with also. I use dictation every day in my job (medical reporting) and I am used to adding in instructions such as comma, new paragraph, scratch that etc.

As far as I can see with VoiceInk (I am trialling it), no text appears while you are dictating, only when you push the key again, and there are no dictation commands. Whilst the app seems to be very accurate, I am failing to see what benefit this paid app has over the baked-in Apple Dictation mode, which of course does have all of the dictation commands, and is reasonably accurate.

Am I missing something here or is there a way to set up VoiceInk that will give me a better Dictation experience?

In working with both VoiceInk and now also trying superwhisper (because it has an iOS app), I found that only the cloud models reliably handle dictated punctuation. Note that you have to choose a model to take the dictation and then, if you want, a model to process it; it is the second step that will correctly reformat the text to handle your dictated punctuation. The local models have not been good with that. I have created keyboard shortcuts to quickly toggle between local and cloud, as required for confidentiality. I am trying at times to let the models handle the punctuation for me, but I find that this is hit or miss.
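
For readers trying to picture the two-step setup and the local/cloud toggle, here is a rough Python sketch of the shape of it. Nothing here calls VoiceInk, superwhisper, or any real model: DictationPipeline, pick_pipeline, and both fake_ functions are placeholders invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DictationPipeline:
    name: str
    transcribe: Callable[[bytes], str]            # step 1: audio -> raw text
    postprocess: Optional[Callable[[str], str]]   # step 2: optional reformatting

    def run(self, audio: bytes) -> str:
        raw = self.transcribe(audio)
        return self.postprocess(raw) if self.postprocess else raw


def fake_transcribe(audio: bytes) -> str:
    # Placeholder: pretends the audio bytes already contain the spoken words.
    return audio.decode("utf-8")


def fake_cloud_reformat(text: str) -> str:
    # Placeholder for a second-step model that handles dictated punctuation.
    return text.replace(" comma", ",").replace(" period", ".")


LOCAL_ONLY = DictationPipeline("local", fake_transcribe, None)
CLOUD_ASSISTED = DictationPipeline("cloud", fake_transcribe, fake_cloud_reformat)


def pick_pipeline(confidential: bool) -> DictationPipeline:
    """Keep everything on the machine when the material is confidential."""
    return LOCAL_ONLY if confidential else CLOUD_ASSISTED


audio = b"hello comma world period"
print(pick_pipeline(confidential=True).run(audio))
print(pick_pipeline(confidential=False).run(audio))
```

The keyboard shortcut in the post plays the role of the confidential flag: it swaps the whole pipeline, so nothing leaves the machine unless the cloud pipeline has been picked on purpose.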

1 Like