This is my workflow to process annotations from PDFs out to Obsidian. The key batch steps from my workflow that I will post are:
- Storing annotations from the PDF as markdown text bundles including asset images
- Re-exposing tags that may have otherwise been hidden in annotation notes
- Re-setting the folder and file names in the textbundle to be viewable in Obsidian
I am annotating with Bookends on iPadOS. The processing is independent of this first choice. You must however have Highlights on macOS for the first step. As a heads up, you must (at least for now) also have BBEdit on macOS for the second step.
This post is about the first step. Follow ups (perhaps not immediately) will outline the other steps. The recommended process for the first step is to
- Put your annotated PDFs in a folder on macOS. I tag the annotated PDFs with the tag `annotated` as a way to find them.
- Select the annotated PDFs at the Finder level.
- Run the AppleScript below. The script will cycle through all selected PDFs, opening them in Highlights and exporting the annotations as a markdown textbundle folder to a default folder location.
(*
save annotations in PDF to markdown textbundle using Highlights
2021-07-29
jjw
Instructions
--
* select a set of PDF files at the Finder level
* run this script
--> output is a textbundle folder of annotations from all PDFs selected
Caveats
--
The current version of this script will crash if the default folder already
contains a copy of the annotation textbundle folder.
This uses lots of AppleEvents with delays because Highlights is
entirely and frustratingly unscriptable.
*)
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
-- set the default folder location to save the annotation textbundles
property theDefaultFolder : "/Volumes/Databases/Journal Annotations"
-- quit Highlights at the end?
property quitHighlightsOnEnd : true
on run {}
set theCount to 1
-- get the list
tell application "Finder" to set theSelectedList to the selection as alias list
if theSelectedList = {} then return
-- start with Highlights
tell application "Highlights" to activate
repeat with theSelection in theSelectedList
set theFileName to the POSIX path of theSelection as text
my saveHighlightsAnnotations(theFileName, theCount)
set theCount to theCount + 1
end repeat
if quitHighlightsOnEnd is true then tell application "Highlights" to quit
return
end run
on saveHighlightsAnnotations(theFileName, theCount)
tell application "Highlights" to open POSIX file theFileName
delay 1
tell application "System Events"
-- save as textbundle
keystroke "t" using option down
delay 1
-- save to the default location
if theCount = 1 then
keystroke "g" using {shift down, command down}
delay 0.5
keystroke theDefaultFolder
delay 0.5
keystroke return
delay 0.5
end if
-- this next step clicks the SAVE dialog button
-- the script will crash here if the file already exists
keystroke return
delay 1
-- close the window
keystroke "w" using command down
end tell
return
end saveHighlightsAnnotations
–
JJW
My processing chain is documented below. As a starting point, I …
- Annotate the PDF in Bookends on iPadOS.
- Sync the annotated PDF back to Bookends on macOS.
You can use whatever apps you choose for the above steps as long as the annotations are not flattened and the annotated PDF is stored on macOS. I have tagged the annotated PDFs at the Finder level with the tag `annotated` so that I can select them all at once for the next step.
- Extract the annotations to a textbundle using the Highlights app on macOS and the script below. The script can be run as a drag + drop applet or it can be invoked from the Scripts menu. You should not create an applet and double-click on it to run, as this approach can play havoc with what is passed as the selection to the `on open` handler.
(*
extract annotations in PDFs to markdown textbundle folder using Highlights
2021-07-30
jjw
Instructions
--
* select a set of PDFs at the Finder level
* run this script
OR
* drag + drop a set of PDFs onto this script application
--> output is a textbundle folder of annotations from all PDFs selected
Caveats
--
This uses lots of AppleEvents with delays because Highlights is
entirely and frustratingly unscriptable.
*)
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
-- set the default folder location to save the annotation textbundles
property theDefaultFolder : "/Volumes/Databases/Journal Annotations"
-- remove any existing textbundles (default is to rename them with the suffix _copy)
property removeExisting : false
-- quit Highlights at the end?
property quitHighlightsOnEnd : true
-- display report dialog at end?
property displayReport : true
-- run with preselected set
on run {}
tell application "Finder" to set theSelectedList to the selection as alias list
if theSelectedList = {} then return
my batchProcess(theSelectedList)
end run
-- drag and drop files onto script application
on open FileSets
my batchProcess(FileSets)
end open
on batchProcess(FileSets)
tell application "Highlights" to activate
set theCount to 0
repeat with theSelection in FileSets
-- reset for each item so that one failed lookup does not skip the remaining files
set isFile to true
set theFilePathName to the POSIX path of theSelection as text
try
tell application "System Events" to set fileExtension to name extension of (theSelection as alias)
on error
set isFile to false
end try
if isFile is true then
if ((fileExtension is "pdf") or (fileExtension is "PDF")) then
my checkforExisting(theFilePathName)
my extractAnnotationsviaHighlights(theFilePathName, theCount)
set theCount to theCount + 1
end if
end if
end repeat
if quitHighlightsOnEnd is true then tell application "Highlights" to quit
if displayReport then
tell application "Finder"
activate
display alert "Extracted annotations from " & theCount & " PDFs."
end tell
end if
return
end batchProcess
on checkforExisting(theFilePathName)
set itExists to true
set theTBFolderPrefix to my extractFileName(theFilePathName)
set theTBFolderName to theDefaultFolder & "/" & theTBFolderPrefix & ".textbundle"
try
POSIX file theTBFolderName as alias
on error
set itExists to false
end try
if itExists is true then
if removeExisting is false then
set theCopyTBFolderName to theDefaultFolder & "/" & theTBFolderPrefix & "_copy.textbundle"
try
POSIX file theCopyTBFolderName as alias
set copyExists to true
on error
set copyExists to false
end try
if copyExists is true then
set theCmd to "rm -r " & (the quoted form of theCopyTBFolderName)
do shell script theCmd
end if
set theCmd to "mv " & (the quoted form of theTBFolderName) & " " & (the quoted form of theCopyTBFolderName)
do shell script theCmd
else
set theCmd to "rm -r " & (the quoted form of theTBFolderName)
do shell script theCmd
end if
end if
end checkforExisting
on extractFileName(theFilePathName)
set cTID to text item delimiters
set text item delimiters to "/"
set theFolderName to text item -1 of theFilePathName
set text item delimiters to "."
set theFileName to text 1 thru text item -2 of theFolderName
set text item delimiters to cTID
return theFileName as text
end extractFileName
on extractAnnotationsviaHighlights(theFileName, theCount)
tell application "Highlights" to open POSIX file theFileName
delay 1
tell application "System Events"
-- save as textbundle
keystroke "t" using option down
delay 1
-- save to the default location
if theCount = 0 then
keystroke "g" using {shift down, command down}
delay 0.5
keystroke theDefaultFolder
delay 0.5
keystroke return
delay 0.5
end if
-- this next step clicks the SAVE dialog button
-- (checkforExisting has already moved or removed any existing textbundle of the same name)
keystroke return
delay 1
-- close the window
keystroke "w" using command down
end tell
return
end extractAnnotationsviaHighlights
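The housekeeping done by the checkforExisting handler can also be sketched as a shell function, for anyone who wants to run it outside AppleScript. This is only a minimal sketch under stated assumptions: the function name `set_aside_existing` and its arguments are hypothetical and not part of the script above, and it assumes the same naming as the handler (an existing `<name>.textbundle` is renamed to `<name>_copy.textbundle` unless removal is requested).

```shell
# Hypothetical helper (not part of the AppleScript above):
# set aside or delete an existing textbundle before a new export.
# Usage: set_aside_existing <pdf path> <output folder> [remove]
set_aside_existing() {
    pdf="$1"
    out_dir="$2"
    mode="${3:-keep}"                  # pass "remove" to delete instead of renaming
    name="$(basename "$pdf")"
    name="${name%.*}"                  # strip the last extension (.pdf)
    bundle="$out_dir/$name.textbundle"
    copy="$out_dir/${name}_copy.textbundle"

    [ -e "$bundle" ] || return 0       # nothing exists yet, nothing to do
    if [ "$mode" = "remove" ]; then
        rm -r "$bundle"
    else
        rm -rf "$copy"                 # drop any older copy first
        mv "$bundle" "$copy"
    fi
}
```

A call might look like `set_aside_existing "$HOME/Papers/example.pdf" "/Volumes/Databases/Journal Annotations"`, where `example.pdf` is a placeholder file name.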
The AppleScript above will extract all forms of annotations, including “picture” annotations. Here is a snapshot example of a picture annotation that I made using Bookends on iPadOS; it is extracted to the assets folder of the textbundle folder.

- I use #hashtag notations in the note fields of annotations. Highlights extracts the #hashtags INSIDE the image link, in the form that the grep string below matches (`![#hashtag note text]`). This leaves them hidden from any markdown editor. The AppleScript below exposes these “hidden” #hashtags. It requires BBEdit on macOS. I welcome any help to convert this to use `sed` at the OS level (I am baffled by what to use for the proper escape sequence to capture the required closing brackets). You use this by selecting the text.markdown files inside the .textbundle folder created in the above step.
(*
expose hidden hashtags
version 2021-07-29
author jjw
*)
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
-- should BBEdit quit when done?
property quiteBBEditonEnd : true
-- display report dialog at end?
property displayReport : true
-- grep strings for BBEdit
-- DO NOT CHANGE
property BBEgrepString : "!\\[(#\\S*) (.*)\\]"
property BBEreplaceString : "\\1 \\2 !\\[\\2\\]"
(*
property sedgrepString : "s/![\\(#[:alpha:]*\\) \\([:alpha:]*\\)\\]"
property sedreplaceString : "/\\1 \\2 ![\\2\\]"
*)
-- run with preselected set
on run {}
tell application "Finder" to set theSelectedList to the selection as alias list
if theSelectedList = {} then return
my extractTagsinFileswBBE(theSelectedList)
end run
-- drag and drop files onto script application
on open FileSets
my extractTagsinFileswBBE(FileSets)
end open
on extractTagsinFileswBBE(FileSets)
tell application "BBEdit" to activate
set theCount to 0
repeat with theSelection in FileSets
set theFilePathName to POSIX path of theSelection as text
tell application "BBEdit"
open POSIX file theFilePathName
delay 0.5
tell text of front text window to replace BBEgrepString using BBEreplaceString options {starting at top:true, search mode:grep}
save active document of front window
close front window
end tell
end repeat
if quiteBBEditonEnd then tell application "BBEdit" to quit
return
end extractTagsinFileswBBE
(*
on extractTagsinFilewShell(theFileName)
set theCMD to "sed " & the quoted form of (grepString & replaceString) & " < " & the quoted form of theFileName
do shell script theCMD
return
end extractTagsinFilewShell
*)
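One possible `sed` version of the BBEdit replacement is sketched below. It is not a tested drop-in for the AppleScript, and it rests on assumptions: the function names are hypothetical, the input is assumed to look like `![#hashtag note text](...)` as implied by the BBEdit grep string, and the pattern deliberately stops at the first closing bracket instead of using a greedy `.*`. With `sed -E`, a closing bracket outside a bracket expression needs no escape, and inside one it is matched by putting it first, as in `[^]]`.

```shell
# Move the leading #hashtag and the note text out in front of each image link,
# mirroring the BBEdit replacement \1 \2 ![\2]. Reads stdin, writes stdout.
expose_hashtags() {
    sed -E 's/!\[(#[^][:space:]]*) ([^]]*)]/\1 \2 ![\2]/g'
}

# Rewrite one text.markdown file in place via a temporary file.
# (BSD and GNU sed disagree on the syntax of -i, so it is avoided here.)
expose_hashtags_in_file() {
    tmp="$(mktemp)" || return 1
    expose_hashtags < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, `expose_hashtags_in_file "example.textbundle/text.markdown"` would rewrite that one file, where `example.textbundle` is a placeholder name. Image links whose text does not start with a #hashtag are left unchanged.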
- As a final step (this can also be the second step), I convert the `.textbundle` folders to plain folders. This exposes the internals to both DEVONthink and Obsidian. I use the script below. You use this by selecting the .textbundle folder (unlike the above script, where you select the text.markdown file).
(*
convert textbundle folder to regular markdown folder
version 2021-07-29
author jjw
*)
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
-- define a prefix to rename the text.markdown markdown file
property AnnotationFilePrefix : "Annotations_"
-- remove textbundle folder after converting
property removeTBFolder : true
property theMDFileName : "text.markdown"
on run {}
-- get the list
tell application "Finder" to set theSelectedFolderList to the selection as alias list
if theSelectedFolderList = {} then return
my convertBatchFolders(theSelectedFolderList)
end run
on open theFolderList
my convertBatchFolders(theFolderList)
end open
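-- DEVONthink smart rule entry point
-- NOTE: convertBatchFoldersDT is not defined in this post, so this handler will not run as posted
on performSmartRule(theRecords)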
set theCount to my convertBatchFoldersDT(theRecords)
display alert "Successfully converted " & theCount & " records from textbundles to regular folders."
end performSmartRule
on convertBatchFolders(theFolderList)
set theRootPath to ""
set theName to ""
repeat with theFolder in theFolderList
set theFilePath to POSIX path of theFolder
set {theRootPath, theName} to my getBaseNames(theFilePath)
-- convert the text.markdown file name
set thecurrentMDFilePathName to theFilePath & theMDFileName
set thedesiredMDFilePathName to theFilePath & theName & ".md"
set theCmd to "mv " & (the quoted form of thecurrentMDFilePathName) & " " & (the quoted form of thedesiredMDFilePathName)
do shell script theCmd
-- convert the .textbundle folder name
set theNewMDFolderName to theRootPath & "/" & theName
set theCmd to "rsync -av " & (the quoted form of theFilePath) & " " & (the quoted form of theNewMDFolderName)
do shell script theCmd
-- remove folder?
if removeTBFolder is true then
set theCmd to "rm -r " & (the quoted form of theFilePath)
do shell script theCmd
end if
end repeat
return
end convertBatchFolders
on getBaseNames(FullPathName)
set cTID to text item delimiters
set text item delimiters to {"/"}
if (FullPathName ends with "/") then
set baseName to text item -2 of FullPathName
set rootName to text 1 thru text item -3 of FullPathName
else
set baseName to last text item of FullPathName
set rootName to text 1 thru text item -2 of FullPathName
end if
if (baseName contains ".") then
set text item delimiters to {"."}
set nameWithoutExtension to text 1 thru text item -2 of baseName as text
else
set nameWithoutExtension to baseName as text
end if
set text item delimiters to cTID
return {rootName, nameWithoutExtension}
end getBaseNames
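The same conversion can be sketched directly in the shell. This is a minimal sketch, not the script above: the function name `convert_textbundle` is hypothetical, it uses a plain `mv` rather than the `rsync` plus `rm` pair, and it refuses to run if the target folder already exists instead of merging into it.

```shell
# Hypothetical helper: turn <name>.textbundle into a plain folder <name>
# whose markdown file is renamed from text.markdown to <name>.md.
convert_textbundle() {
    bundle="${1%/}"                               # drop a trailing slash
    parent="$(dirname "$bundle")"
    name="$(basename "$bundle" .textbundle)"      # folder name without the suffix
    target="$parent/$name"

    [ -d "$bundle" ] || return 1                  # not a folder
    [ -e "$target" ] && return 1                  # do not overwrite anything
    mv "$bundle/text.markdown" "$bundle/$name.md" || return 1
    mv "$bundle" "$target"
}
```

Calling `convert_textbundle "/Volumes/Databases/Journal Annotations/example.textbundle"` (with `example` as a placeholder) would leave a folder `example` containing `example.md` and the untouched `assets` folder.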
I apologize that this is a long post with a lot of code. I’d consider a more formal posting approach (e.g. GitHub or a public Dropbox) if interest dictates. I am not as proficient in such methods at the moment.
I hope this information is of some benefit to some folks.
Enjoy!
–
JJW
Thank you for this workflow. Very helpful and a needed one for Bookends!