Hi all. I have a question about manipulating text files on the Mac. In my day job at MPOW I have to upload large text files to a remote server. The files are parsed by character position (e.g. characters 3-7 hold a certain value, etc.). Recently the server-side parsing tool was updated, and for it to accept our files I need to remove the text that falls within a certain character range. I have been doing this with Excel and its Text to Columns function (i.e. parse the file by the character guide, remove the text I don’t need, and then resave the file as a text file), but since the recent Excel upgrade the Text to Columns feature is not working as before and I can’t scroll down to enter the column delimiters as needed. I have used BBEdit to view text files and that works great, so I wanted to know if there is a way in BBEdit (or another app) to remove all text from a set character range in a text file while also preserving the character spacing. Thanks in advance.
I don’t fully understand the task you are trying to accomplish. Are you trying to remove the first n characters from a file, where n is always the same number but may consist of different characters? Or are you trying to remove a varying number of characters from the beginning of files? And do you need to replace the removed characters with space characters?
Either way, I strongly expect that the Find and Replace feature using grep/regex will accomplish what you need, especially if the number of characters you are removing is the same for every file. It’s just a matter of figuring out the right regex pattern. https://regexr.com will likely help you out a lot; I use it frequently.
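To make the regex idea concrete, here is a rough sketch in Python rather than in BBEdit itself. The column numbers (blank out columns 14-16, keep everything else) are only an assumption for illustration, and the function name is made up. The pattern captures the first 13 characters of each line, matches the next 3, and puts back the captured part plus 3 spaces, so everything after the range stays in the same column. The same pattern shape should carry over to a grep-style find and replace, though the replacement syntax may differ from tool to tool.

```python
import re

# Assumed layout (illustration only): blank columns 14-16 of every line.
# ^ with re.MULTILINE anchors at the start of each line; "." never matches
# a newline, so lines shorter than 16 characters are left untouched.
PATTERN = re.compile(r"^(.{13}).{3}", re.MULTILINE)


def blank_columns_14_to_16(text: str) -> str:
    """Overwrite columns 14-16 of each line with spaces, keeping line length."""
    return PATTERN.sub(lambda m: m.group(1) + " " * 3, text)


sample = "0123456789ABCDEFGHIJ\nshort line\n"
print(blank_columns_14_to_16(sample))
```

Because the three characters are replaced by three spaces instead of being deleted, every line keeps its original length.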
If you need to make several ordered or iterative passes, then BBEdit’s Text Factory feature can help you here. I’ve used this a few times to process files that needed a large number of changes done in a certain order. My most recent one did 8 different passes, combining regex and standard searches and replacements.
You’d have to design this with an understanding of your own text source(s), but I would think that the highly flexible BBEdit Text Factory feature would be a good possibility.
I have used Text Factories to parse very large text documents and apply a chain of transformations, regular expressions as well as formatting. Start small with smaller sample documents, verify the factory works, then proceed.
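For anyone who wants to see the "ordered passes" idea outside of BBEdit, here is a minimal sketch in Python. This is not how Text Factories are implemented, just the same concept: a list of named steps applied in order, with a cheap check after each one. The three passes and the sample data are hypothetical.

```python
import re
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], str]]

# Hypothetical passes; order matters because each pass sees the output
# of the one before it. Literal and regex replacements are mixed.
STEPS: List[Step] = [
    ("normalize line endings", lambda t: t.replace("\r\n", "\n")),
    ("blank N/A markers", lambda t: t.replace("N/A", "   ")),
    ("blank ISO dates", lambda t: re.sub(
        r"\d{4}-\d{2}-\d{2}", lambda m: " " * len(m.group()), t)),
]


def run_passes(text: str, steps: List[Step]) -> str:
    """Apply each pass in order and fail loudly if one adds or drops lines."""
    for name, step in steps:
        lines_before = text.count("\n")
        text = step(text)
        if text.count("\n") != lines_before:
            raise ValueError(f"pass {name!r} changed the number of lines")
    return text


sample = "ID1 N/A 2024-01-31\r\nID2 abc 2023-12-01\r\n"
print(repr(run_passes(sample, STEPS)))
```

Running it on a small sample first, as suggested above, makes it easy to see which pass is responsible if the result looks wrong.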
If this is something you do a lot, you could look into the cut command, used from the terminal:
cut -c1-13,17-25 file.txt > upload.txt
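This would select columns 1 through 13 and 17 through 25, in other words eliminating the characters in columns 14 through 16 (and anything past column 25). Note that cut closes the gap rather than padding it with spaces, so everything after column 13 shifts left.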
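Since the original question asked about preserving the character spacing, here is a small Python sketch contrasting the two behaviours: keeping only selected columns the way cut -c does, versus overwriting a range with spaces so later columns stay where they are. The function names are my own, and the column numbers just mirror the cut example.

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # 1-based, inclusive, same convention as cut -c


def keep_columns(line: str, ranges: Iterable[Range]) -> str:
    """Mimic cut -c: keep only the listed ranges, closing up the gaps."""
    return "".join(line[start - 1:end] for start, end in ranges)


def blank_columns(line: str, start: int, end: int) -> str:
    """Overwrite columns start..end with spaces so the line length is unchanged."""
    removed = line[start - 1:end]
    return line[:start - 1] + " " * len(removed) + line[end:]


line = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(keep_columns(line, [(1, 13), (17, 25)]))  # gap closed, column 26 dropped
print(blank_columns(line, 14, 16))              # gap padded, nothing shifts
```

The second function is the one that fits a fixed-width parser, because columns 17 onward keep their positions.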
Yeah, I’d do it with a shell script, or if I were lacking the requisite amount of whisky to deal with that, Python. Well, I’d probably use Python even if I did have the whisky.
Many good suggestions above, all could work well. Even a fairly simple perl script could be used as an intermediary solution.
However, I strongly recommend that you get to work on whatever software generates the text file you receive in the first place. Make sure you correct the source and remove the need for manual intervention of every file. Integration points between two systems are always a weak link, so great care needs to be taken in order to reduce the number of potential failure points.
I feel like this recent Six Colors article may give you some help. Ignore the Excel part if you want to.
Thanks everyone for your advice. I appreciate it. I managed to resolve the issue within BBEdit with a simple text manipulation.
@airwhale I agree! This is an ongoing discussion with our tech team. There has been some trouble with the fix being implemented, so I have had to manipulate the file.
Thanks. I have to look into this option further. It could be very useful for some of my projects.