The New York Times's Farhad Manjoo goes all in on voice: "I Didn't Write This Column. I Spoke It."

#1

This is really interesting: Farhad Manjoo experiments with the “screenless Internet” – doing everything by voice instead of computer or phone. He’s having a lot of success with it.

I love to walk and have sometimes speculated that someday in the future I’d be able to do my whole job just walking around, talking and listening into AirPods. Manjoo seems to be doing just that – today. Or he’s nearly there.

Manjoo:

Here’s what I do: Instead of writing, I speak. When a notable thought strikes me — I could be pacing around my home office, washing dishes, driving or, most often recently, taking long, aimless strolls on desolate suburban Silicon Valley sidewalks — I open RecUp, a cloud-connected voice-recording app on my phone. Because I’m pretty much always wearing wireless headphones with a mic — yes, I’m one of those AirPod people — the app records my voice in high fidelity as I walk, while my phone is snug in my pocket or otherwise out of sight.

And so, on foot, wandering about town, I write. I began making voice memos to remember column ideas and short turns of phrases. But as I became comfortable with the practice, I started to compose full sentences, paragraphs and even whole outlines of my columns just by speaking.

Then comes the magical part. Every few days, I load the recordings into Descript, an app that bills itself as a “word processor for audio.” Some of my voice memos are more than an hour long, but Descript quickly (and cheaply) transcribes the text, truncates the silences and renders my speech editable and searchable. Through software, my meandering memos are turned into a skeleton of writing. The text Descript spits out is not by any means ready for publication, but it functions like a pencil sketch: a rough first draft that I then hammer into life the old-fashioned way, on a screen, with a keyboard…

I do the best of my research through interviews – somebody talks to me and I write down what they say. Additionally, I’m often talking with colleagues and writing down to-dos during the conversation. Hard to imagine going screenless for those things.

I recently realized I’ve been wearing my AirPods wrong. Well, I knew they were wrong before; rather than having the stems hanging down, like most people, I screwed the AirPods into my ears and the stems stuck out horizontally. But recently I realized that they were actually more comfortable if I wore them the regular way. They felt like they were going to fall out, but they are pretty secure. And they stay connected to the iPhone better, and respond better to touch controls.

Also, I can hear external sounds very clearly when I wear them properly. For many people, that’s a flaw in the AirPods, but to me it’s a feature. If I want to talk with someone standing in front of me, I can leave the AirPods in and talk with them normally. Indeed, if I have a few phone calls during the day, I sometimes just leave the AirPods in my ears.

So yes, I can see the screenless internet coming, and it's not far away. We’ll still use our phones and PCs quite a bit, just a lot less than before, much as we still use our PCs quite a bit but less than we did before smartphones came along.

A nitpicky note on Manjoo’s column: I don’t understand what this RecUp app does that Voice Memos doesn’t do. I get that it lets you record without having to title your individual recordings, but you can do that with Voice Memos too. Just … don’t title them.

I found Manjoo’s column to be quite exciting, actually. So much so that I wanted to write this response right away. So I reached for the keyboard near the couch, propped up the iPad, and tapped out this post. Nope, we’re not at the screenless internet just yet.

#2

I’m fascinated by this topic, and I think this transition is going to be interesting. It’s hard for me to imagine the screenless internet, but then I think of how my kids have taken to voice assistants while I almost never use them. And it’s not just because I had problems with Siri. We have had a couple of Echos in my house, and it’s interesting to see how my kids pick them up naturally but my wife and I often forget they are even there!

Great article, thanks for sharing!

#3

15 years ago it was difficult to imagine the way phones have eclipsed PCs today.

The punditocracy underestimates the potential of voice interface, and overestimates VR.

#4

The extraordinary thing here is that he fails to recognize that disabled computer users have been doing this for decades.

#5

Sure, but there was always too much friction to make it worthwhile for abled users. Now it’s frictionless to the point of making carrying a pad and pen seem… (trying to avoid using the word friction here) … unnecessary?
I’m a dev, so this would never work for me, but I have been looking more and more at the Drafts voice memo thing. Just being able to say “pick up milk” and then “look into using $SOFTWARE for $PROJECT,” and then every day putting one thing into my shopping list and the other into Jira, is actually very powerful.

#6

Macalope did a reasonable takedown of Manjoo’s piece the other day “The Incredible Shrinking Apple.”

Manjoo is also the guy who called out Apple for being boring in '16 because they didn’t do ‘moonshot’ programs like Google. And last year he wrote the risible ‘IT’S TIME FOR APPLE TO BUILD A LESS ADDICTIVE IPHONE’ piece, suggesting, “It could also needle you: ‘Farhad, you spent half your week scrolling through Twitter. Do you really feel proud of that?’ It could offer to help: ‘If I notice you spending too much time on Snapchat next week, would you like me to remind you?’”

Manjoo is the closest thing the NYTimes has to a stunt tech reporter, writing eyeball-grabbing pieces that often fall apart when you go into them more deeply, so I’m automatically wary of anything he writes.

#7

Since I read Manjoo’s piece, I’ve put Just Press Record on my home screen and Apple Watch. It simultaneously records voice memos and gives you a voice transcription. Handy for when you get home and look at the voice transcription and see “rutabaga cadillac buy virago liederhosen” and think to yourself, “OK, what did I actually say there that Siri then munged?” You have the recording right there to listen to (spoiler: you’ll hear 17 seconds of wind noise with the faint sound of your own voice in the background and you can’t figure out what you’re saying from THAT either).

#8

I have never once used transcription software that didn’t get lots of words wrong. I find it useful for short snippets, but for anything longer it takes me more time to correct the software than it would have to just type it in the first time. It is especially bad when you get into any kind of specialized vocabulary. Try talking about classical music to your computer: Tchaikovsky, Mendelssohn, Yannick Nézet-Séguin. Forget about it. Computer acronyms are also hopeless; I can just imagine how medical terms show up. I’d love to ditch my keyboard someday, but in my experience, we’re very far from that.

#9

Wait a couple of years and you’ll probably be surprised by what it says back to you!

#10

I’ve been trying out Descript, mentioned in the NYT article. My biggest hangup right now is that I can’t seem to dictate punctuation. For example, in Dragon (or with my secretary), I can dictate “Hello John comma how are you question mark” and get in return, “Hello John, how are you?” Descript types out the words “comma” and “question mark.” Does anyone know of a similarly priced service that recognizes punctuation commands, that isn’t tied to the iPhone? I like to carry around a dictaphone to capture thoughts, and in the past would plug it into Dragon for transcription.
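
If a service types out the command words literally, one possible workaround is to clean up the transcript afterwards. Below is a minimal sketch in Python, assuming the transcript is exported as plain text. The command table and the function name `apply_spoken_punctuation` are hypothetical and not part of Descript or Dragon.

```python
import re

# Hypothetical command table; extend it to match your own dictation habits.
SPOKEN_PUNCTUATION = {
    "question mark": "?",
    "exclamation point": "!",
    "full stop": ".",
    "period": ".",
    "comma": ",",
    "colon": ":",
    "semicolon": ";",
}

# Try longer phrases first, and swallow the whitespace before each command
# so the symbol attaches to the preceding word.
_PATTERN = re.compile(
    r"\s*\b("
    + "|".join(
        re.escape(phrase)
        for phrase in sorted(SPOKEN_PUNCTUATION, key=len, reverse=True)
    )
    + r")\b",
    flags=re.IGNORECASE,
)


def apply_spoken_punctuation(text: str) -> str:
    """Replace dictated punctuation words with the symbols they name."""

    def swap(match: re.Match) -> str:
        return SPOKEN_PUNCTUATION[match.group(1).lower()]

    return _PATTERN.sub(swap, text)


print(apply_spoken_punctuation("Hello John comma how are you question mark"))
# prints: Hello John, how are you?
```

The obvious limitation is that this is a blind text substitution: a sentence that legitimately uses one of the command words ("the Jurassic period") gets rewritten too, so the output still needs a read-through.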