Git for Writing or other Non-Coding Projects?

Although this is almost enough to convince me not to start using git (for prose), I’d still like to better understand the shortcomings. Apart from the problem of paragraphs being long lines for git, my impression is also that git diff (and pretty much any diff tool?) struggles with larger insertions. (But I may not have understood how to use it.) What I mean is that once you insert a larger chunk of text, whatever comes after that insertion will show up as modified, even though it wasn’t changed at all.

From what I understand, this is because git diff operates on a line-by-line basis and doesn’t have a large context window that would enable it to notice that a piece of text isn’t actually new but has just moved down. I read that git diff --patience tries to remedy this, but I haven’t tried it and I’m wondering whether it is worth trying in the world of prose.
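For anyone who wants to check this on their own text, here is a minimal sketch. It assumes git is installed; the file names and sentences are made up for illustration. It compares two plain files with git diff --no-index (no repository needed), once for a large insertion and once for a one-word change using --word-diff.

```shell
#!/bin/sh
# Sketch only: file names and text are invented for this example.
work=$(mktemp -d)
cd "$work" || exit 1

# Case 1: two new paragraphs inserted before an untouched last paragraph.
printf 'First paragraph, one long line.\n\nLast paragraph, left untouched.\n' > old.md
printf 'First paragraph, one long line.\n\nA new paragraph.\n\nAnother new paragraph.\n\nLast paragraph, left untouched.\n' > new.md

# --no-index compares two files outside any repository.
# It exits with status 1 when the files differ, hence "|| true".
insertion_diff=$(git diff --no-index --no-color --patience old.md new.md || true)
printf '%s\n' "$insertion_diff"

# Count removed and added lines, skipping the "---" and "+++" header lines.
removed=$(printf '%s\n' "$insertion_diff" | grep -v '^--- ' | grep -c '^-' || true)
added=$(printf '%s\n' "$insertion_diff" | grep -v '^+++ ' | grep -c '^+' || true)
echo "removed lines: $removed, added lines: $added"

# Case 2: one word changed inside a long line.
printf 'The quick brown fox jumps over the lazy dog.\n' > a.md
printf 'The quick red fox jumps over the lazy dog.\n' > b.md

# --word-diff marks changed words inline instead of repeating the whole line.
word_diff=$(git diff --no-index --no-color --word-diff a.md b.md || true)
printf '%s\n' "$word_diff"
```

In this small case the untouched last paragraph is not reported as changed, and the word-level view narrows a long-line change down to the word that differs. Real manuscripts may behave differently, so it is worth trying on your own files.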

Git just versions files. It excels at plain text, e.g. source code and Markdown.

You can use any diff tool you like. I use BeyondCompare and it does a pretty good job of getting differences down to a few characters.

However, rather than take the advice of anyone here, run a small safe-to-fail experiment. Maybe you will like it, maybe not.

Since I last wrote in this thread we’ve switched our entire website development to astro/markdown/mdx. See: https://agilepainrelief.com

I’m the ideal target audience for this - I’m a recovering software developer.

4 Likes

I’m thinking the obsidian git plugin might be a good way of trying out git.

When I first tried git a few months ago in an attempt to share some R and Python code (for data analysis) with a colleague, I used git CLI commands and found it rather laborious. Part of it may be me climbing the learning curve while what I really needed to do was analyze data, but I sensed that even if I eventually learned the important commands (and understood the overall logic of git), git would still feel like a burden rather than something making things easier. But if the Obsidian plugin does most of the version control in the background, it would actually relieve me from overthinking which notes I should split or not split and to what extent I want to document a note’s history…

I’m using git with Obsidian on 1 vault but NOT using the git plug-in. I am a git novice but I found it much faster and easier to just learn the minimum git commands I need and run them from the command line. There are really only a very few you actually need to learn to function quite well.

The Obsidian git plug-in only handles an entire vault and the vault I am using has 2 main sections, the database schema creation code and the support technical documentation. So I put those things into 2 separate folders and set each folder as a git instance pointing to a different repository on GitLab (we don’t use GitHub). Works well and we are also syncing that vault to all team members via Obsidian sync.
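A rough sketch of that layout, for illustration only. The folder names and the GitLab URLs are hypothetical placeholders, not the actual project. Each subfolder gets its own repository and its own remote, while the vault folder itself is not a repository.

```shell
#!/bin/sh
# Sketch only: folder names and URLs below are hypothetical.
vault=$(mktemp -d)
mkdir -p "$vault/schema" "$vault/docs"

# One independent repository per folder; the vault root stays a plain folder.
git -C "$vault/schema" init -q
git -C "$vault/docs" init -q

# Point each repository at a different (made-up) remote.
git -C "$vault/schema" remote add origin https://gitlab.example.com/team/schema.git
git -C "$vault/docs" remote add origin https://gitlab.example.com/team/docs.git

schema_url=$(git -C "$vault/schema" remote get-url origin)
docs_url=$(git -C "$vault/docs" remote get-url origin)
echo "schema -> $schema_url"
echo "docs   -> $docs_url"
```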

Meant to call this out for further discussion above and messed up…

There are really only a few things you need to learn about git. This is my personal summary and my personal git cheat sheet.

Using Git

  1. Git saves everything you commit and push
  2. Fewer branches is better
  3. Rebase can be your friend but can also mess you up so use with caution
  4. Git commands do not all use the same syntax, deal with it
  5. The ProGit book is excellent

My Git command cheat sheet

git remote add upstream https://…

use the https clone location from the upstream site

git remote --verbose

shows all the remotes you have

git status

Tell me where I am with my local on-mac repo

git fetch origin

This pulls things down but does not merge them into my system. They are just available. If you only have one remote that is origin the command is just git fetch

git pull origin branchname

Do a fetch, then a merge. If you only have one remote (origin) and you have already checked out the branch you want, then you only need to do a git pull

git checkout branchname

git checkout -b hcase/feature-branch

Create and checkout a feature branch based on the branch you are at. We use a person’s name as a directory and then the feature branch as it makes the commit history cleaner. This one is for a feature branch that Henry Case is working on. (yes I have read Neuromancer)

git add <modified_file_names>

Stage changes for commit

git commit -m"a short description of your work"

Records the staged changes in your local repository

git push

push your feature branch to GitLab

Create a merge request in GitLab.

- Make sure the source branch is your feature branch.
- Make sure the destination branch is `develop`

I also found that, of the visual tools you can use, GitKraken has the most understandable picture of the git tree on your repositories. But I don’t use it to do any repo work, just to look at what the state of the repo is.

2 Likes

I agree, it creates an undesirably complex workflow without adding any real advantages. Pages and Time Machine have you covered.

I use GitHub for coding but would not use it for writing, neither for group nor for individual work. It only really shines when built into an IDE like JetBrains, as they have an awesome AI that writes commits for you and everything is automated (no going back to 1980s style terminal commands required!)

You would more likely spend days tweaking it to make it work for a purpose it’s not designed for, and not get any writing done!!

2 Likes

I appreciate you sharing your cheat sheet (I copied it into my notes), but I can’t help but notice that it also proves exactly this point:

I don’t understand half of what’s in the cheat sheet :laughing: (which is my problem, not that of the cheat sheet). From the last time I used git, I remember what remote means, but I already forgot what origin means and I don’t know what an upstream site is. And the whole thing with branches is giving me such headaches that I would probably avoid using them (at the cost of missing perhaps one of git’s strongest features). And so on and so forth…

Non technical uses of git for me…

  1. Periodic versions of my Obsidian vault… especially if I am opening it on another machine for a short time.
  2. Managing my website using pelican and GitHub pages.
  3. Managing my CV but I use TeX so that’s okay. Maybe not so useful for Word users.
1 Like

Upstream are the places you can send your updates to. More important for open source items where you do not have write privileges on the main repository. You basically make a copy under your username and you can then push your changes up to it. When you are ready you do a merge request to the origin, basically asking the maintainers of the open source project to allow your updates to be added. The extra layer is so that it’s harder for nefarious code to creep into an open source repo. Not impossible, but it does require more effort. And because you can trace where it came from, you can find and block that user in the future. Also the open nature means more people will see the problem code and catch it.

Branches are your friends. You can test multiple options. So in writing documentation I have the main published branch of the AnimalTrakker® manual as main. I decide I’m going to totally rearrange it and so I make a branch. In our system I put my branches and changes into a subdirectory with my name so everyone working on the project can see who did it, but if I were the only person I’d just create a branch.

My new branch is called oogiem/docs_reorg

I can work with all the files there until I am satisfied it works and looks ok. To save my work I push early and often to my upstream remote. When it’s all ready I can then ask the maintainers to merge it in to main.

In a novel I can use Git to play with scenes, decide I don’t like that whole thread of action and end up moving it to a subsequent novel or a short story, or just orphaning that branch and trying something else.

Branches are like scratch paper you can use to test things out before committing to using them.
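A minimal sketch of the scratch-paper idea, assuming git is installed. The file name, the branch name me/heist-subplot, and the author identity are hypothetical and only there to make the example self-contained.

```shell
#!/bin/sh
# Sketch only: names, text, and identity are invented for this example.
export GIT_AUTHOR_NAME="Example Writer" GIT_AUTHOR_EMAIL="writer@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
# Name the first branch "main" regardless of the local default.
git symbolic-ref HEAD refs/heads/main

printf 'Chapter one.\n' > novel.md
git add novel.md
git commit -q -m "First draft of chapter one"

# Scratch paper: a branch for an experimental subplot.
git checkout -q -b me/heist-subplot
printf 'A heist goes wrong.\n' >> novel.md
git commit -q -am "Try a heist subplot"

# Back on main, the draft is exactly as it was before the experiment.
git checkout -q main
main_lines=$(wc -l < novel.md | tr -d ' ')

# If the experiment works out, merge it in. If not, skip this step and
# simply leave the branch alone or delete it.
git merge -q me/heist-subplot
merged_lines=$(wc -l < novel.md | tr -d ' ')

echo "lines on main before merge: $main_lines, after merge: $merged_lines"
```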

The ProGit book is really helpful.

3 Likes

Useful tips, thanks!

One more I’ve seen (which you’re probably already aware of) for those writing long texts in Markdown stored in git is to split paragraphs up into short phrases, with a single phrase on a line.

That’s apparently because Git (by default?) only compares versions on lines, not on individual words. If you write paragraphs ‘normally’ (using soft-wrapping, and marking paragraphs with double returns), then Git will mark the whole paragraph changed if only one word has been altered – not always convenient with long paragraphs.

But as Markdown ignores single returns, you can use this to split the paragraph into shorter lines, which makes identifying the precise location of changes easier.

In other words  
if you break your paragraph down into phrases,  
one per line,  
the job of using diff tools 
to find the mistakes you've corrected
or other changes between versions
is made easier.

In other words if you break your paragraph down into phrases, one per line, the job of using diff tools to find the mistakes you’ve corrected or other changes between versions is made easier.

You still use double returns to mark the end of the paragraph and Markdown and HTML still render the paragraph properly.
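To see the effect in a small experiment, the sketch below fixes the same typo in two versions of a paragraph, one stored as a single long line and one stored with one phrase per line, and compares what a plain line-based diff reports. The file names and the sentence are made up for illustration.

```shell
#!/bin/sh
# Sketch only: file names and text are invented for this example.
work=$(mktemp -d)
cd "$work" || exit 1

# The paragraph as one long line, before and after fixing one typo.
printf 'If you break your paragraph down into phrases, the job of finding the mitsakes is made easier.\n' > long_old.md
printf 'If you break your paragraph down into phrases, the job of finding the mistakes is made easier.\n' > long_new.md

# The same paragraph with one phrase per line.
printf 'If you break your paragraph down into phrases,\nthe job of finding the mitsakes\nis made easier.\n' > split_old.md
printf 'If you break your paragraph down into phrases,\nthe job of finding the mistakes\nis made easier.\n' > split_new.md

# diff exits with status 1 when the files differ, hence "|| true".
long_diff=$(diff long_old.md long_new.md || true)
split_diff=$(diff split_old.md split_new.md || true)

echo "Long-line version:"
printf '%s\n' "$long_diff"
echo "One-phrase-per-line version:"
printf '%s\n' "$split_diff"
```

The long-line version reports the whole paragraph as changed, while the split version reports only the phrase that contains the corrected word.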

Derek Sivers recommends this as a good technique even if you’re not using Git (Writing one sentence per line | Derek Sivers), but I don’t think the idea originates with him.

2 Likes

Not quite sure where to reply in this thread. I think there is a lot of over complication going on.

Git for one person writing is very easy. You need all of 2 commands.

  • Commit - takes whatever has changed since your last commit and makes a copy of those changes in your repo.
  • Push - sends those changes to a remote server if you’re using one
    You don’t need branches or 90% of the other stuff
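On the command line, that two-step routine could look roughly like the sketch below. It is an illustration under assumptions: the remote is a bare repository on the local disk standing in for a hosted service, and the file name, commit messages, and author identity are made up. Note that the command line has one extra step that GUIs usually hide, namely staging a brand-new file with git add.

```shell
#!/bin/sh
# Sketch only: paths, names, and identity are invented for this example.
export GIT_AUTHOR_NAME="Example Writer" GIT_AUTHOR_EMAIL="writer@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for a hosted remote: a bare repository on the local disk.
git init -q --bare remote.git

git init -q essay
cd essay || exit 1
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git remote add origin "$work/remote.git"

printf 'Opening paragraph.\n' > essay.md
git add essay.md                          # needed once for each new file
git commit -q -m "Start the essay"
git push -q -u origin main                # first push also records where to push next time

# From here on, the routine is just commit and push.
printf 'Second paragraph.\n' >> essay.md
git commit -q -am "Add second paragraph"  # -a includes changes to already tracked files
git push -q

local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$work/remote.git" rev-parse refs/heads/main)
echo "local:  $local_head"
echo "remote: $remote_head"
```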

Any simple Git GUI will make it easy to use: Best 12 Free Open Source Git GUI Clients for macOS

GitHubDesktop is easy enough one to start with.

3 Likes

I’m converting my site right now to Astro. It’s such a great framework once you get the hang of it. The community is unbelievable, and hopefully more people move over to Astro. Hopefully more bloggers will use it and teach businesses like Squarespace and Wix a lesson about portability.

1 Like

The Astro community is as welcoming as talk.mpu and that’s saying something.

1 Like

Thank you for explaining this. It helps a lot.

Though what I gather from this is that there are even more copies of my text to be aware of:

  1. one local copy, outside of git
  2. the upstream copy (which may also be stored locally or remotely or both)
  3. the main branch/ origin, into which my upstream copy will eventually be merged.
  4. add multiple upstream branches to that and :exploding_head:

While I understand the benefit of having a lot of copies, it also means that you need to be aware of which copy to work on at any point in time (or when doing particular tasks). I think this comes a bit more naturally when writing code, especially in a team, but it’s not how most people go about writing texts (articles, books, …). Even without git, I sometimes find myself scratching my head when I return to a writing project after some weeks of interruption and can’t remember what the file was called or whether I wrote it in Word or in LaTeX or what. - I’m not expecting anyone to solve this stupidity of mine. I’m just saying that git might add more of that kind of friction…

(Although I suspect that once you have wrapped your head around how git works or developed a routine of how to work with it, it might well make my stupidity less of a problem.)

And that may or may not be a benefit. Before I moved more or less all my notes onto the computer, I had so many paper snippets and scribbles all over the place that I couldn’t find a note when I needed it (or wondered whether I actually wrote it down or just had the idea).

The reason I like writing in LaTeX (or Markdown) is that it makes it super easy to comment out some parts or write notes to your future self while being able to produce a clean version of the document at any time. I think a lot of the different versions of a text that git could take care of are actually in my main document all the time, just some of them are commented out and therefore invisible in the output document.

What an ingenious idea! Why did I never come up with that? I love it. This also works in LaTeX and it should indeed make the use of diff/git a bit easier.

I found this quick video helpful to get a basic understanding of the different places where versions are stored:

And in this one, someone shows how they used git to write a book. It really helps to get a better understanding of what you can do, like using commit messages for documenting progress/direction or using branches for chapters (though I’m not exactly sure what the benefits of the latter are):

(Note, however, that the Kanban Board feature in GitKraken has been discontinued)

How is that any different from trying to keep track of backups, preventing you from losing stuff you wrote before or even the way scrivener keeps snapshots? Git just provides a much more robust way to get back to any place you were. It also provides redundancy and backups and allows for very tiny editing changes to be compared easily.

Though what I gather from this is that there are even more copies of my text to be aware of

While I’m not actually recommending git for your use case, git takes care of all of that, it can show you where the different versions are, and the differences between them.
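As a small illustration of what "showing you the versions and the differences" can mean in practice, here is a sketch with a throwaway repository. The file name, text, and author identity are made up. It lists the saved versions, shows what changed between the last two, and brings an older version back.

```shell
#!/bin/sh
# Sketch only: names, text, and identity are invented for this example.
export GIT_AUTHOR_NAME="Example Writer" GIT_AUTHOR_EMAIL="writer@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

printf 'Draft one.\n' > chapter.md
git add chapter.md
git commit -q -m "Draft one"

printf 'Draft two, heavily rewritten.\n' > chapter.md
git commit -q -am "Rewrite the chapter"

# List the saved versions, newest first.
git --no-pager log --oneline

# Show what changed between the previous version and the current one.
git --no-pager diff HEAD~1 HEAD -- chapter.md

# Read the older version without touching the file on disk.
old_text=$(git show HEAD~1:chapter.md)

# Bring the older version back into the working folder.
# The newer version is still in the history and can be restored the same way.
git checkout HEAD~1 -- chapter.md
restored_text=$(cat chapter.md)
echo "restored: $restored_text"
```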

Food for thought. You now know more than most software developers about how source control works.

All most people need to know is that source control saves copies of text files and keeps track of changes efficiently without consuming insane amounts of disk space.

I’ve now spent some days familiarizing myself with git by actually using it while writing a report (in LaTeX/knitr) with a colleague and I completely see (even more) why non-developers would stay away from it. The learning curve is steep, even when using a GUI like Tower (I might switch to GitKraken or Tower for a bit more ease of use).

It simply takes time to understand the differences and relationships between, for example,

  • your local working copy
  • your copy of the main branch
  • your private local branch
  • any other branches
  • the stash
  • the index

Concrete example: what is the difference between having a private branch to which I commit and which I occasionally merge with main versus having only the main branch to which I commit locally and occasionally push to github?

I think those statements are misleading. While git takes care of a lot, it is very easy to shoot yourself in the foot, and even if chances are that you won’t lose data, figuring out how exactly to restore it (or in what order) can be laborious and time consuming. To prevent yourself from making mistakes, you either need to (sort of) know what you’re doing or have someone set things up for you and then follow a strict regime of when to do what and avoid any deviations until you know what you’re doing. Thing is: you probably won’t learn what you’re doing unless you mess up at some point.

Another example: my colleague and I originally started off by sharing a Dropbox folder and agreeing via chat who edits what file. Since it nevertheless happened that we had to use the detailed Dropbox history to restore some changes that somehow got lost, I created a git repo in our shared folder and started committing from our (shared) local working folder. - The setup sorta works, but when it came to making sure that we don’t overwrite each other’s changes, I learned that the way to do this is ”always pull before you push”. Sounds good until you realize that you can’t do that if you don’t have a remote. So I had to set up a remote repo on GitHub and share it with my colleague.
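For readers wondering why "always pull before you push" matters, here is a rough sketch with two working copies and a shared remote. Everything in it is hypothetical: the remote is a bare repository on the local disk rather than GitHub, and the folder names alice and bob, the file names, and the identity are invented. It shows a push being refused because the other person pushed first, and then succeeding after a pull.

```shell
#!/bin/sh
# Sketch only: paths, names, and identity are invented for this example.
export GIT_AUTHOR_NAME="Example Writer" GIT_AUTHOR_EMAIL="writer@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for the shared remote: a bare repository whose default branch is "main".
git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/main

# First author clones the (still empty) remote and pushes the first commit.
git clone -q shared.git alice 2>/dev/null
git -C alice symbolic-ref HEAD refs/heads/main
printf 'Report skeleton.\n' > alice/report.tex
git -C alice add report.tex
git -C alice commit -q -m "Add report skeleton"
git -C alice push -q origin main

# Second author clones and starts working.
git clone -q shared.git bob

# Both now commit independently; the first author pushes first.
printf 'Introduction.\n' > alice/intro.tex
git -C alice add intro.tex
git -C alice commit -q -m "Add introduction"
git -C alice push -q origin main

printf 'Methods.\n' > bob/methods.tex
git -C bob add methods.tex
git -C bob commit -q -m "Add methods"

# Pushing without pulling is refused, so nothing gets overwritten.
if git -C bob push -q origin main 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

# Pull (fetch plus merge) the other author's work, then push again.
git -C bob pull -q --no-rebase --no-edit origin main
if git -C bob push -q origin main 2>/dev/null; then second_push=accepted; else second_push=rejected; fi

echo "push before pull: $first_push, push after pull: $second_push"
```

In this sketch the two authors edit different files, so the pull merges cleanly. If both had edited the same lines, the pull would stop and ask for the conflict to be resolved by hand.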

Our setup is still not ideal, since we’re still sharing the Dropbox folder (which makes it easy to look at each other’s work in progress without having to go via GitHub). I couldn’t quite figure out how to split the shared folder into two so that our respective working folders and local repositories are separate. Because git gets confused when I try to move the .git folder into a subfolder of the repo, the best I could come up with was emptying the entire folder, creating two top-level folders for my colleague and me, and then moving the previous contents of the shared folder into one of these subfolders. In fact, it would make sense to copy those contents into both of those folders, but I’m not sure it will work for my colleague to continue using my local git repo.

I’m not asking for help here. I’m sharing these examples merely to convey an idea of why I think that git is not at all easy.

With all that said, I am probably going to continue using git also for other projects because it is also very powerful once you get the hang of it.

1 Like

Yes and no. For a single user, using it primarily as a single-stream version control system (i.e. without different branches and such), it’s very, very simple. That’s OP’s use case.

When you try to do separate branches and all that other stuff, yes - it gets more complicated. But that stuff would be just as hard (if not harder) to do without a system like git.

I would think that trying to integrate any version control system with Dropbox would be a huge hassle, mostly because a VCS is supposed to be a “source of truth.” But if it’s in Dropbox, it’s not. Dropbox is the source of truth, and the VCS is subject to issues with simultaneous overwrites and such. Basically, you’re implementing a VCS to eliminate issues, and then re-introducing them by keeping the VCS itself in Dropbox. :slight_smile:

I’d strongly consider paying for a GitHub account and using a private repo.

2 Likes

You don’t need to pay for private repos on GitHub anymore. It’s a pretty good option. You’re limited in number of contributors, settings, runner minutes etc. but that wouldn’t matter for a solo developer just wanting backup and sync.

3 Likes