Copy all .md files on my Mac (synology drive) to .txt files to make Dropbox search possible

Ward · June 11, 2022, 12:58pm

For two years I have been asking Dropbox to extend its content search from .txt files to .md files. They don’t want to do it, so I want to create my own workaround: copy all .md files on my Mac (Synology drive) to .txt files so that Dropbox search can find them. The file name and location should be exactly the same. I want to select a top folder and then use Hazel, Shortcuts, or another solution to automatically duplicate all .md files to .txt files.

I need the .md files for Obsidian, and I need the .txt files to search in Dropbox on my iPhone or iPad and find all the content inside the .md (.txt) files. I don’t understand why Dropbox doesn’t want to do this, because an .md file is just a text file. No development needed; just add it to the file types that get indexed.

But I need it, and that’s why I want to duplicate all the files. Not just once: I want to run a workflow that does this all the time and overwrites all the .txt files with the new content of the .md files. I will only work with the .md files and use the .txt files just for searching content while I am on my iPhone. Does anyone know a good solution for this? Or software that I can buy that can do this? Or Hazel or Shortcuts?
This is the request on Dropbox forum.

jcarucci · June 11, 2022, 3:12pm

You can’t use the built-in search in Finder? Why do you have to use the Dropbox search?

Ulli · June 11, 2022, 3:41pm

You can do this easily with Hazel: you just need to duplicate the file and swap the extension from .md to .txt.
But I also don’t understand why you want to search with Dropbox instead of Finder.
And how do you want to handle files where you have changed the “.md part”?!

margaretamartin · June 11, 2022, 4:40pm

It’s easy to do this one-time in Finder.

But if you want this to be ongoing without intervention, then something like Hazel will fit the bill. It would have to be on a Mac that was always on, however.

A shortcut could do this (I think), but there’s no way to have it continually monitor a folder to do this automatically. Maybe you could schedule the shortcut to run periodically, but I’m not sure that’s a good idea.

Ward · June 11, 2022, 5:59pm

In my office the files are on a Synology. It is 40 TB of data and millions of files. I use Devonthink on my Mac and Dtsearch on my Windows PCs to search that many files. As far as I know, the built-in search can only search files on my Mac or on an external SSD plugged directly into the Mac. I have software running that copies part of this to Dropbox in real time (about 8 TB). On my iPhone, Spotlight can only search locally on the iPhone. With the Dropbox app on my iPhone I can search the content of TXT, PDF, Word, and Excel files across all 8 TB in seconds. I tested OneDrive Business, MS SharePoint, and Google Drive, and they can’t do what Dropbox can. But Dropbox cannot search MD files. If you know an easier solution, I am all ears.

Ward · June 11, 2022, 6:01pm

I am testing with Hazel. But how can I tell Hazel to do this with all files in a directory? I have a Mac mini that can be always on.

Ward · June 11, 2022, 6:02pm

I want to always overwrite the TXT files with the latest version of the MD files.

Ulli · June 11, 2022, 6:20pm

Have you taken a look at Devonthink?

Ward · June 11, 2022, 6:40pm

I am using Devonthink on my Mac. But DT can only handle 250,000 files per database on the Mac, so I am creating about 400 DT databases, one per project. But syncing that to the iPhone is not going to work. Only Dropbox can handle that many files, as far as I know. The same goes for Obsidian: it crashes and becomes very slow with millions of files. I created a vault per project and that works perfectly. I now have more than 250 vaults. But Obsidian can only sync 5 vaults.

jec0047 · June 11, 2022, 6:51pm

Maybe a folder action, created with Automator?

nlippman · June 11, 2022, 9:20pm

I would think Hazel would be the way to go.

Assuming that all of these files are contained under one root folder (they don’t have to be; you would just have to duplicate the process for each root folder), then I would think the following would work:

Create a rule for the root folder that matches files if all of the following are true:

The extension is “md”
The date modified is not in the last XX minutes where XX is whatever is reasonable for your workflow in terms of how quickly after you touch an MD file the txt file needs to be updated

For the action, you would:

Copy the file to a new file using the file name and extension .txt
Set the copy action to replace the .txt file if it already exists
Apply the rule to sub folders which will let it chain down your folder tree
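
The same rule can be approximated outside Hazel. This is only a shell sketch, not how Hazel works internally; the function name copy_md_to_txt and the 5-minute default are assumptions made for illustration. It copies every .md file that has not been modified in the last N minutes to a .txt file with the same name in the same folder, replacing any existing copy.

```shell
#!/bin/sh
# Sketch: mirror each .md under ROOT to a sibling .txt (hypothetical helper).
# Usage: copy_md_to_txt ROOT [MIN_AGE_MINUTES]
copy_md_to_txt() {
    root=$1
    min_age=${2:-5}   # assumed default: skip files touched in the last 5 minutes
    # -mmin +N matches files last modified more than N minutes ago
    find "$root" -type f -name '*.md' -mmin +"$min_age" -exec sh -c '
        for md do
            # strip the trailing .md and append .txt; cp overwrites an existing copy
            cp -p "$md" "${md%.md}.txt"
        done' sh {} +
}
```

It could be called as copy_md_to_txt ~/Dropbox/Projects 10, where the path is a placeholder for your own root folder. Note that find walks the whole tree on every run, so this does not answer the performance question for millions of files.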

One concern I have is how long it would take Hazel to actually complete a run of this nature. Remember that Hazel basically has to scan the folder, look at every file in it, compare each file to its database to see if it needs processing (e.g., has anything changed since it last looked at the file), check the file against the rules, and execute the actions if appropriate. It has to do that for your entire database of 40TB of files!

What I do not know is how much Hazel relies on file system events (which allow it to process only files which have changed) vs doing an exhaustive scan of the folder (and in your case sub folders) to check each and every file. If the former, this will be very efficient. If the latter, it most definitely will not be.

You can experiment - create the rule without the action of applying to sub folders to ensure it works, then apply to sub folders, and see how quickly it completes. Note that the first time through will probably take a very long time since every file in your database has to be touched and copied to a .txt file, but hopefully after that it is fast if Hazel only relies on file system events to find files to process.

I do have a question, however. If I understood your posting, you have about 40TB of files on a Synology to which this process applies, with 8TB in Dropbox, which is actually where you want this to apply. How much of that is actually the markdown files in question? That has important implications, because you are talking about duplicating all of these files, so this could mean a substantial amount of additional storage space.

nlippman · June 11, 2022, 9:25pm

Now that I think about it, here’s another idea.

Rather than duplicating all of the files into txt files, you might consider creating aliases or links to the md files instead.

Aliases are a macOS feature, while links and symlinks rely on the underlying “Unix-y” nature of macOS. I have never really delved into aliases myself, so I don’t know much about the implications of creating a huge number of them, but creating either symlinks or hard links might be a good solution.

From Hazel you would probably have to create them using an action that runs a shell script, but the script itself is fairly straightforward. You would want a rule that creates a new link to every .md file using a .txt extension, and then Dropbox should just work for your searching. Since the link points to the exact same file on disk, you would not incur any additional storage overhead other than for the directory entries themselves.
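
As a rough sketch of such a script, using hard links: the function name link_md_as_txt is made up for this example, and it assumes the volume holding the files supports hard links at all.

```shell
#!/bin/sh
# Sketch: give every .md under ROOT a hard-linked twin ending in .txt
# (hypothetical helper; assumes the file system supports hard links).
link_md_as_txt() {
    root=$1
    find "$root" -type f -name '*.md' -exec sh -c '
        for md do
            txt="${md%.md}.txt"
            # skip pairs that already share one file; otherwise (re)create the link,
            # with -f replacing a stale .txt that points at older content
            [ "$md" -ef "$txt" ] || ln -f "$md" "$txt"
        done' sh {} +
}
```

A hard link is tied to the file, not to its name. If an editor saves by writing a new file and renaming it over the old one, the .txt keeps pointing at the old content until the script runs again.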

What I do not know however, is how / if Dropbox handles either hard links or symlinks, and so I would strongly suggest creating a test folder with a small number of files to test things out before doing this wholesale on your database.

jec0047 · June 11, 2022, 9:42pm

Per the Dropbox forum:

Symlinks and aliases are not currently supported by Dropbox…

Ward · June 12, 2022, 5:19am

Super. Thank you for the good explanation. I will try Hazel because I think it is the best solution, as Dropbox does not support symlinks. I will try Hazel on a small folder first. The MD files are only a small part of the 40 TB: less than 5%, as most are PDF and MS Office files, which Dropbox supports. As MD and TXT files are very small, storage will not be a problem. I don’t like the fact that I will see all files twice in Obsidian, but I don’t see another solution. I have bought David Sparks’s Hazel course and that will help. But I think following your solution will be a great start.

Lars · June 12, 2022, 11:53am

The problem:

You have a folder structure with .md files. You want to replicate the folders/files, but change the extensions to .txt. That already creates an issue with syncing files, since identical files will be different (because of the extension) for every syncing tool out there.

I was trying to wrap my head around it and found no easy way to sync folders while accounting for the different extensions. So before every sync, the .txt files have to be removed from the destination, the destination repopulated with the .md files, and those renamed to .txt.

Assumptions:
~/Documents/mdfiles is the source directory
~/Documents/txtfiles is the destination directory (your Dropbox folder)

So, three steps:

Purge the destination folder of all .txt files
Sync source to destination folder (all files)
Recursively rename all .md to .txt

This can be done in the shell with this command sequence:

rsync -a --delete ~/Documents/mdfiles/ ~/Documents/txtfiles/; find ~/Documents/txtfiles -name "*.md" | while IFS= read -r filename; do mv -v "${filename}" "$(echo "${filename}" | sed -e 's/\.md$/.txt/')"; done

Warning: Please double (triple) check the folder names!!! “mdfiles” for your source, “txtfiles” for your destination

You can save it as a script (don’t forget to make it executable with chmod +x) and have it run in the background (cron) and/or run it with Hazel, Keyboard Maestro,…

If you want to duplicate all .md files to .txt files (and update all .txt from .md) in the current folder structure:

find ~/Documents/txtfiles -name "*.txt" -delete; find ~/Documents/txtfiles -name "*.md" | while IFS= read -r filename; do cp -v "${filename}" "$(echo "${filename}" | sed -e 's/\.md$/.txt/')"; done

WARNING: all .txt will be deleted!
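
If purging is not wanted, a variant can leave existing .txt files in place and only overwrite the ones whose .md is newer. This is a sketch under assumptions: the function name sync_md_to_txt is hypothetical, and it relies on the shell’s test supporting -nt, which is common but not part of POSIX.

```shell
#!/bin/sh
# Sketch: refresh sibling .txt copies without deleting anything first
# (hypothetical helper; relies on the widely supported -nt file test).
sync_md_to_txt() {
    root=$1
    find "$root" -type f -name '*.md' -exec sh -c '
        for md do
            txt="${md%.md}.txt"
            # copy when the .txt is missing or older than its .md;
            # cp -p keeps the timestamp so unchanged files are skipped next run
            if [ ! -e "$txt" ] || [ "$md" -nt "$txt" ]; then
                cp -p "$md" "$txt"
            fi
        done' sh {} +
}
```

The trade-off is that a .txt whose .md has been deleted or renamed stays behind, which the purge step would have cleaned up. Other .txt files with no matching .md are never touched.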

margaretamartin · June 13, 2022, 8:03pm

I don’t fully understand your workflow, but do the searchable .txt files need to be in the Obsidian vault? Since you’re searching in Dropbox, it seems unnecessary. And potentially confusing, because you could end up editing the .txt file instead of the .md.

So if not, perhaps you need to replicate the folder structure of the .md files for the .txt files, and keep that folder structure outside of the Obsidian vault. I’m not sure if Hazel can move the duplicate (.txt) to the right folder in a similar file structure. However, I’ll bet the Mac app Keyboard Maestro can, and I find it is easier to “program” than writing AppleScript.

I don’t think Keyboard Maestro can monitor a folder for changes, but I think you can get around this by attaching a Folder Action to the folder that triggers a simple AppleScript to run the Keyboard Maestro action.

You might want to look at David Sparks’s Keyboard Maestro field guide. It’s a fast way to get up and running with this powerful and complex program.

Ward · June 18, 2022, 5:38am

Lars, thanks a lot for the script. But I do not understand why I have to delete all the TXT files. I just want to overwrite them with a new copy.

Ward · June 18, 2022, 5:41am

Thank you. I will watch David Sparks’s Hazel guide and Keyboard Maestro field guide and see which is best to use. The MD and TXT files need to be in the project folder. Every project is a vault in Obsidian and every project is a folder.

Ward · June 20, 2022, 10:16am

Hi. I can’t find how to do it in Hazel, even after watching David Sparks’s field guide. I have put my question in the Hazel forum.