OpenAI not helpful; hope you all are in resolving my "sed" issue

This might not be the appropriate forum to ask this question, but I have a head-scratching problem with using sed in the Terminal app. The sed call in question sits in a long command-line pipe alongside several other sed invocations that change more static things in the data lines (I intend to put them all in a command file to tidy up the command line).

The data being processed is a CSV file of addresses like this:

1 The Road, The Town, The County, The Post Code
1a The Road, The Town, The County, The Post Code
1b The Road, The Town, The County, The Post Code
2 The Road, The Town, The County, The Post Code
3 The Road, The Town, The County, The Post Code
4a The Road, The Town, The County, The Post Code
4b The Road, The Town, The County, The Post Code
4c The Road, The Town, The County, The Post Code
15z Street, Other Town, No County, Different Post Code

I want to replace the lower case a, b, c, … z after the digits with upper case A, B, C, … Z respectively.

Given all the discussion in other topics here about the use of ChatGPT, I asked it for a working sed command. It responded with:

sed 's/\([0-9]\)\([a-z]\)/\1\U\2/'

which resulted in U being inserted between the last digit and the lower case letter. Other responses included:

sed 's/\([0-9]\)\([a-z]\)/\1$(echo \2 | tr [:lower:] [:upper:])/'

which resulted in the echo being inserted.

When told that didn’t work and the \U didn’t work it responded with:

sed 's/\([0-9]\)\([a-z]\)/\1\U\2\E/'

which you can guess merely added U and E to the output. The question/response then cycled through the same output.

I have tried using sed’s -E option to enable extended REs but that does not improve the result.

So I have given up on an AI-produced solution and am now asking the sensible real intelligence for help.

BTW using Monterey 12.6 with bash version 3.2.57.1 (installed via homebrew) and whatever version of sed came with Monterey.

StackOverflow might be the best place to ask this question. But, since you are here… I suspect the sed commands are not returning the expected result because you are using the built-in sed, which is derived from BSD, and not GNU sed, which has more options. I think if you install gnu-sed from Homebrew you should be able to use the substitution patterns recommended. ChatGPT is still not quite right, but this should do the trick:

cat test.csv | gsed 's/^[0-9]*[a-z] /\U&/'
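For anyone who would rather not install anything: the \U escape in the replacement is a GNU extension, which is why the stock sed just prints a literal U. Below is a rough sketch that sticks to plain POSIX sed features (the hold space and the y command) instead. The function name upcase_house_suffix is only an illustrative label, and the pattern assumes the number-plus-letter sits at the start of the line and is followed by a space, as in the sample data. I believe this form is accepted by BSD sed as well as GNU sed, but treat that as an assumption and try it on a copy of the file first.

```shell
# Uppercase the single letter that follows a leading house number,
# using only POSIX sed commands (no \U), so GNU sed is not required.
upcase_house_suffix() {
    # For lines that start with digits, one lowercase letter, then a space:
    #   h  save the whole line in the hold space
    #   s  cut the pattern space down to the first field, e.g. "4b"
    #   y  map lowercase to uppercase, giving "4B"
    #   G  append a newline plus the saved original line
    #   s  delete the newline and the original first field
    sed -e '/^[0-9][0-9]*[a-z] /{' \
        -e 'h' \
        -e 's/ .*//' \
        -e 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' \
        -e 'G' \
        -e 's/\n[^ ]*//' \
        -e '}'
}

printf '%s\n' '4b The Road, The Town, The County, The Post Code' | upcase_house_suffix
# prints: 4B The Road, The Town, The County, The Post Code
```

Lines without a letter after the number (such as "2 The Road, …") do not match the address and pass through unchanged, so the function can sit anywhere in the existing pipe.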

Why not use Python? It will have much better support for parsing a csv file.


I would use python but this was meant to be a quick and dirty change. Sadly csvkit doesn’t help me either. Ultimately the reader of this CSV file (and others like it) is SQLite3.

Dang. Been caught out by BSDisms before and too many times. Off to homebrew and get me a real sed.

Thanks.


I would’ve done it in Excel :see_no_evil:

I run a Microsoft-free zone.

Plus this is all meant to be a single command pipeline.

(I think) the quick and dirty way of doing this in python is to ignore that it’s a CSV at all: read the file in line by line from stdin; iterate through the characters until you find the first non-numeric one, replace it with an upper case version of itself, and write the line out to stdout. Then put that python program into your command pipeline. Unless I’m missing something pretty obvious (always a very real possibility), this could be done in something like 5 lines of python, even if it’s a bit of a kluge.


@ACautionaryTale Won’t be using python for this as GNU sed worked (if I was going to write my own program it would be in Algol-68 :neutral_face:); thanks @ibuys.

The other tools I use in this and other related pipelines, namely csvkit, are written in python. But now that I have a tool that works as it is supposed to, nothing additional needs to be written, at least for the moment.


This also seems to work:

awk '{ if ($1 ~ /[0-9][a-z]/) {x=toupper($1); $1=""; printf "%s%s\n", x, $0} else print}'


Returning to this. gsed works, but I would rather it were invoked as plain sed in place of the Apple-provided one.

I tried setting up a symbolic link named sed that points to gsed, but despite the Homebrew bin directory appearing earlier in PATH than /usr/bin, the Apple/BSD sed is always run.
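A few things that might be worth checking here, offered as a sketch rather than a diagnosis. bash remembers where it last found each command, so a session that had already run /usr/bin/sed can keep using it until the cache is cleared with hash -r (or a new shell is opened). Homebrew's gnu-sed formula also ships unprefixed names in its own libexec/gnubin directory, so a hand-made symlink should not be needed. The snippet assumes bash and a Homebrew-installed gnu-sed; the variable name gnubin is only illustrative.

```shell
# List every sed the shell can find, in PATH order; the first one wins.
type -a sed

# Homebrew's gnu-sed keeps unprefixed names (sed rather than gsed) in a
# separate gnubin directory. Prepend it only if Homebrew and that
# directory are actually present.
if command -v brew >/dev/null 2>&1; then
    gnubin="$(brew --prefix gnu-sed 2>/dev/null)/libexec/gnubin"
    if [ -d "$gnubin" ]; then
        PATH="$gnubin:$PATH"
        export PATH
    fi
fi

# Clear bash's cache of command locations so the new PATH order is used
# in the current session.
hash -r

# GNU sed answers --version; the BSD sed shipped with macOS rejects it,
# so an empty result here means the Apple sed is still being picked up.
sed --version 2>/dev/null | head -n 1 || true
```

If that behaves, the PATH lines could go in ~/.bash_profile so every new shell picks up GNU sed first. It is also worth confirming with type -a that sed is not an alias or shell function, since either would take priority over anything on PATH.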