My processing chain is documented below. As a starting point, I …
- Annotate the PDF in Bookends on iPadOS.
- Sync the annotated PDF back to Bookends on macOS.
You can use whatever apps you choose for the above steps as long as the annotations are not flattened and the annotated PDF is stored on macOS. I have tagged the annotated PDFs at the Finder level with the tag "annotated" so that I can select them all at once for the next step.
- Extract the annotations to a textbundle using the Highlights app on macOS and the script below. The script can be run as a drag + drop applet or it can be invoked from the Scripts menu. You should not create an applet and double-click on it to run, as this approach can play havoc with what is passed as the selection to the `on open` handler.
```applescript
(*
extract annotations in PDFs to markdown textbundle folder using Highlights
2021-07-30
jjw

Instructions
--
* select a set of PDFs at the Finder level
* run this script
OR
* drag + drop a set of PDFs onto this script application
--> output is a textbundle folder of annotations from all PDFs selected

Caveats
--
This uses lots of AppleEvents with delays because Highlights is
entirely and frustratingly unscriptable.
*)

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

-- set the default folder location to save the annotation textbundles
property theDefaultFolder : "/Volumes/Databases/Journal Annotations"
-- remove any existing textbundles (default is to keep them, suffixed with _copy)
property removeExisting : false
-- quit Highlights at the end?
property quitHighlightsOnEnd : true
-- display report dialog at end?
property displayReport : true

-- run with preselected set
on run {}
    tell application "Finder" to set theSelectedList to the selection as alias list
    if theSelectedList = {} then return
    my batchProcess(theSelectedList)
end run

-- drag and drop files onto script application
on open FileSets
    my batchProcess(FileSets)
end open

on batchProcess(FileSets)
    tell application "Highlights" to activate
    set theCount to 0
    repeat with theSelection in FileSets
        -- reset for every item so that one failure does not skip all later files
        set isFile to true
        set theFilePathName to the POSIX path of theSelection as text
        try
            tell application "System Events" to set fileExtension to name extension of (theSelection as alias)
        on error
            set isFile to false
        end try
        if isFile is true then
            if ((fileExtension is "pdf") or (fileExtension is "PDF")) then
                my checkforExisting(theFilePathName)
                my extractAnnotationsviaHighlights(theFilePathName, theCount)
                set theCount to theCount + 1
            end if
        end if
    end repeat
    if quitHighlightsOnEnd is true then tell application "Highlights" to quit
    if displayReport then
        tell application "Finder"
            activate
            display alert "Extracted annotations from " & theCount & " PDFs."
        end tell
    end if
    return
end batchProcess

-- move aside (or remove) a textbundle left over from an earlier run
on checkforExisting(theFilePathName)
    set itExists to true
    set theTBFolderPrefix to my extractFileName(theFilePathName)
    set theTBFolderName to theDefaultFolder & "/" & theTBFolderPrefix & ".textbundle"
    try
        POSIX file theTBFolderName as alias
    on error
        set itExists to false
    end try
    if itExists is true then
        if removeExisting is false then
            set theCopyTBFolderName to theDefaultFolder & "/" & theTBFolderPrefix & "_copy.textbundle"
            try
                POSIX file theCopyTBFolderName as alias
                set copyExists to true
            on error
                set copyExists to false
            end try
            -- only one _copy is kept; an older one is deleted first
            if copyExists is true then
                set theCmd to "rm -r " & (the quoted form of theCopyTBFolderName)
                do shell script theCmd
            end if
            set theCmd to "mv " & (the quoted form of theTBFolderName) & " " & (the quoted form of theCopyTBFolderName)
            do shell script theCmd
        else
            set theCmd to "rm -r " & (the quoted form of theTBFolderName)
            do shell script theCmd
        end if
    end if
end checkforExisting

-- return the file name without its folder path and extension
on extractFileName(theFilePathName)
    set cTID to text item delimiters
    set text item delimiters to "/"
    set theFolderName to text item -1 of theFilePathName
    set text item delimiters to "."
    set theFileName to text 1 thru text item -2 of theFolderName
    set text item delimiters to cTID
    return theFileName as text
end extractFileName

on extractAnnotationsviaHighlights(theFileName, theCount)
    tell application "Highlights" to open POSIX file theFileName
    delay 1
    tell application "System Events"
        -- save as textbundle
        keystroke "t" using option down
        delay 1
        -- save to the default location (only needed for the first file)
        if theCount = 0 then
            keystroke "g" using {shift down, command down}
            delay 0.5
            keystroke theDefaultFolder
            delay 0.5
            keystroke return
            delay 0.5
        end if
        -- this next step clicks the SAVE dialog button
        -- the script will crash here if the file already exists
        keystroke return
        delay 1
        -- close the window
        keystroke "w" using command down
    end tell
    return
end extractAnnotationsviaHighlights
```
The script will extract all forms of annotations, including “picture” annotations. Here is a snapshot example of an annotation that I made using Bookends on iPadOS that is extracted in the assets folder of the textbundle folder.
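The existing-output check in the script above (the `checkforExisting` handler) is easy to misread in AppleScript, so here is a minimal Python sketch of the same decision logic. It is an illustration only and not part of my workflow; the function name `clear_existing_bundle` and its parameters are made up for this example.

```python
import shutil
from pathlib import Path


def clear_existing_bundle(folder, stem, remove_existing=False):
    """Make room for a fresh '<stem>.textbundle' inside 'folder'.

    Mirrors the AppleScript handler: either delete the old bundle or
    move it aside as '<stem>_copy.textbundle', keeping a single copy.
    (Hypothetical helper; names are assumptions for illustration.)
    """
    folder = Path(folder)
    bundle = folder / f"{stem}.textbundle"
    if not bundle.exists():
        return  # nothing to clear, the save dialog will not collide

    if remove_existing:
        shutil.rmtree(bundle)
        return

    backup = folder / f"{stem}_copy.textbundle"
    if backup.exists():
        shutil.rmtree(backup)  # an older backup is discarded first
    bundle.rename(backup)
```

Calling `clear_existing_bundle("/Volumes/Databases/Journal Annotations", "Paper")` before each export would leave at most one `Paper_copy.textbundle` behind, which is the behavior with `removeExisting` set to false.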

- I use #hashtag notations in the note fields of annotations. Highlights extracts the #hashtags INSIDE the bracketed part of the image link markup (the `![#hashtag ...]` form that the grep pattern in the script below matches). This leaves them hidden from any markdown editor. The AppleScript below exposes these “hidden” #hashtags. It requires BBEdit on macOS. I welcome any help to convert this to use `sed` at the OS level (I am baffled by what to use for the proper escape sequence to capture the required closing brackets). You use this by selecting the text.markdown files inside the .textbundle folder created in the above step.
```applescript
(*
expose hidden hashtags
version 2021-07-29
author jjw
*)

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

-- should BBEdit quit when done?
property quiteBBEditonEnd : true
-- display report dialog at end?
property displayReport : true

-- grep strings for BBEdit
-- DO NOT CHANGE
property BBEgrepString : "!\\[(#\\S*) (.*)\\]"
property BBEreplaceString : "\\1 \\2 !\\[\\2\\]"

(*
property sedgrepString : "s/![\\(#[:alpha:]*\\) \\([:alpha:]*\\)\\]"
property sedreplaceString : "/\\1 \\2 ![\\2\\]"
*)

-- run with preselected set
on run {}
    tell application "Finder" to set theSelectedList to the selection as alias list
    if theSelectedList = {} then return
    my extractTagsinFileswBBE(theSelectedList)
end run

-- drag and drop files onto script application
on open FileSets
    my extractTagsinFileswBBE(FileSets)
end open

on extractTagsinFileswBBE(FileSets)
    tell application "BBEdit" to activate
    set theCount to 0
    repeat with theSelection in FileSets
        set theFilePathName to POSIX path of theSelection as text
        tell application "BBEdit"
            open POSIX file theFilePathName
            delay 0.5
            -- move each hashtag and its note text in front of the image link
            tell text of front text window to replace BBEgrepString using BBEreplaceString options {starting at top:true, search mode:grep}
            save active document of front window
            close front window
        end tell
        -- count processed files so that the report below is meaningful
        set theCount to theCount + 1
    end repeat
    if quiteBBEditonEnd then tell application "BBEdit" to quit
    if displayReport then display alert "Processed " & theCount & " files."
    return
end extractTagsinFileswBBE

(*
on extractTagsinFilewShell(theFileName)
    set theCMD to "sed " & the quoted form of (grepString & replaceString) & " < " & the quoted form of theFileName
    do shell script theCMD
    return
end extractTagsinFilewShell
*)
```
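For anyone without BBEdit, the same substitution can be sketched in Python. This is not the `sed` version I asked about, just a minimal illustration that reuses the grep pattern from the script above unchanged. The function names `expose_hashtags` and `expose_hashtags_in_file` are made up for this example, and I am assuming the text.markdown files are UTF-8.

```python
import re
from pathlib import Path

# Same pattern as BBEgrepString: "![#tag note text]" at the start of an image link.
HIDDEN_TAG = re.compile(r"!\[(#\S*) (.*)\]")


def expose_hashtags(markdown):
    """Copy '#tag note text' in front of the image link that hides it."""
    # Same replacement as BBEreplaceString: tag, text, then the link without the tag.
    return HIDDEN_TAG.sub(r"\1 \2 ![\2]", markdown)


def expose_hashtags_in_file(path):
    """Rewrite one text.markdown file in place (assumes UTF-8 text)."""
    path = Path(path)
    path.write_text(expose_hashtags(path.read_text(encoding="utf-8")), encoding="utf-8")
```

With this, a line such as `![#todo check the units](assets/img1.png)` becomes `#todo check the units ![check the units](assets/img1.png)`, while image links that do not start with a #hashtag are left alone.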
- As a final step (this can also be the second step), I convert the .textbundle folders to plain folders. This exposes the internals to both DEVONthink and Obsidian. I use the script below. You use this by selecting the .textbundle folder (unlike the above script, where you select the text.markdown file).
```applescript
(*
convert textbundle folder to regular markdown folder
version 2021-07-29
author jjw
*)

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

-- define a prefix to rename the text.markdown markdown file
property AnnotationFilePrefix : "Annotations_"
-- remove textbundle folder after converting
property removeTBFolder : true
property theMDFileName : "text.markdown"

on run {}
    -- get the list
    tell application "Finder" to set theSelectedFolderList to the selection as alias list
    if theSelectedFolderList = {} then return
    my convertBatchFolders(theSelectedFolderList)
end run

on open theFolderList
    my convertBatchFolders(theFolderList)
end open

-- NOTE: convertBatchFoldersDT is not included in this listing,
-- so this smart rule entry point will not work as posted
on performSmartRule(theRecords)
    set theCount to my convertBatchFoldersDT(theRecords)
    display alert "Successfully converted " & theCount & " records from textbundles to regular folders."
end performSmartRule

on convertBatchFolders(theFolderList)
    set theRootPath to ""
    set theName to ""
    repeat with theFolder in theFolderList
        -- the POSIX path of a folder alias ends with "/"
        set theFilePath to POSIX path of theFolder
        set {theRootPath, theName} to my getBaseNames(theFilePath)
        -- convert the text.markdown file name
        set thecurrentMDFilePathName to theFilePath & theMDFileName
        set thedesiredMDFilePathName to theFilePath & theName & ".md"
        set theCmd to "mv " & (the quoted form of thecurrentMDFilePathName) & " " & (the quoted form of thedesiredMDFilePathName)
        do shell script theCmd
        -- convert the .textbundle folder name
        set theNewMDFolderName to theRootPath & "/" & theName
        set theCmd to "rsync -av " & (the quoted form of theFilePath) & " " & (the quoted form of theNewMDFolderName)
        do shell script theCmd
        -- remove folder?
        if removeTBFolder is true then
            set theCmd to "rm -r " & (the quoted form of theFilePath)
            do shell script theCmd
        end if
    end repeat
    return
end convertBatchFolders

-- split a full path into {parent folder, name without extension}
on getBaseNames(FullPathName)
    set cTID to text item delimiters
    set text item delimiters to {"/"}
    if (FullPathName ends with "/") then
        set baseName to text item -2 of FullPathName
        set rootName to text 1 thru text item -3 of FullPathName
    else
        set baseName to last text item of FullPathName
        set rootName to text 1 thru text item -2 of FullPathName
    end if
    if (baseName contains ".") then
        set text item delimiters to {"."}
        set nameWithoutExtension to text 1 thru text item -2 of baseName as text
    else
        set nameWithoutExtension to baseName as text
    end if
    set text item delimiters to cTID
    return {rootName, nameWithoutExtension}
end getBaseNames
```
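The conversion itself is only a copy and a rename, so it can be sketched without `mv`, `rsync`, or `rm`. The Python below is a minimal illustration under my own assumptions, not a replacement for the script: the function name `convert_textbundle` is made up, it needs Python 3.8 or later for `dirs_exist_ok`, and unlike the AppleScript it renames text.markdown in the new folder, so the original bundle stays untouched unless it is removed at the end.

```python
import shutil
from pathlib import Path


def convert_textbundle(bundle, remove_bundle=True):
    """Copy 'Name.textbundle' to a plain folder 'Name' next to it.

    Inside the new folder, 'text.markdown' is renamed to 'Name.md'.
    (Hypothetical helper; names are assumptions for illustration.)
    """
    bundle = Path(bundle)
    target = bundle.with_suffix("")  # "Paper.textbundle" -> "Paper"

    # Copy everything, including the assets folder with picture annotations.
    shutil.copytree(bundle, target, dirs_exist_ok=True)

    markdown_file = target / "text.markdown"
    if markdown_file.exists():
        markdown_file.rename(target / f"{target.name}.md")

    if remove_bundle:
        shutil.rmtree(bundle)
    return target
```

Calling `convert_textbundle("/Volumes/Databases/Journal Annotations/Paper.textbundle")` would leave a `Paper` folder containing `Paper.md` and the `assets` folder.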
I apologize that this is a long post with a lot of code. I’d consider a more formal posting approach (e.g. GitHub or a public Dropbox) if interest dictates. I am not as proficient in such methods at the moment.
I hope this information is of some benefit to some folks.
Enjoy!
–
JJW