RecurseChat - Little app to use a local LLM

I found this interesting, will take it for a spin and report back:

3 Likes

Supports Mistral, LLaVA, and Wizard Coder (I hadn’t heard of the latter two). It also supports OpenAI’s GPT-3.5 and GPT-4 via API key, so it’s a very nice wrapper for avoiding a ChatGPT subscription and only paying per use.

One thing I’ve noticed is that the proprietary GPT models have a lot of fine-tuning. For example, asking Mistral the question “Which country lies between Portugal and France?” gives a confusing answer, while GPT answers it correctly.

Nice!!

Something I’m trying: https://lmstudio.ai/

Does it allow you to index local docs (e.g., PDFs) so you can ask local LLMs about them?

There’s also Faraday.

Both LM Studio and Faraday seem more flexible and tweakable, but what I like about RecurseChat is that it offers a clean, Mac-like interface.

Thanks for the recommendation. I have been playing with RecurseChat with Mistral as the engine, using it to reformat text from MacWhisper Pro, and it does this well. Compared to ChatGPT, though, it tends to hallucinate data with alarming frequency, so it needs tighter boundaries.

It does not have the ability to have a chat with a PDF/local docs.

I do enjoy the simple interface, which is worth paying for. It also works relatively quickly.

2 Likes

Well, it would depend on which direction you travel …

:stuck_out_tongue:

Now that would be a smart answer! XD

Thanks for the review! This is Xiaoyi, the dev behind RecurseChat. What features in LM Studio and Faraday would you like to see in RecurseChat? Our goal is to maintain the simplicity of the user interface while offering great customization options. Local models have their unique edges, but I agree with some of the limitations you have brought up. Since launch, we have also added support for creating models for custom OpenAI API endpoints.

@shandy @NiranS We have recently implemented chatting with local docs including PDFs, using a local LLM, embedding model, and vector database, all running locally. You can see a demo here: https://twitter.com/chxy/status/1777234458372116865

4 Likes

The one feature that I’d like to see in RecurseChat is the ability to format paragraphs. What I’d like to do is to take the output from Whisper.ai and format it into paragraphs. The output is generally one very long paragraph that has no breaks in it.

I have used a couple of local LLMs: RecurseChat and GPT4All. In the past, despite my instructions to leave the text the same, if I put in text from Whisper.ai it would summarize it or alter the text.

I have been having more luck with just paragraph formatting recently.

Thanks for implementing PDF chat and RAG. Is there any way of unleashing RecurseChat on a folder of files for RAG? Any plans for an API to allow external programs to use RecurseChat?

1 Like

I happen to have both. What’s your workflow? How do you use RecurseChat to split the MacWhisper chunk of text into paragraphs?

Thanks for the suggestions.

For splitting transcripts into paragraphs, I’m not familiar with a fix, but it looks like a common issue for Whisper:

Chatting with folders is on our radar. I have a use case for myself, asking questions of an Obsidian vault (a folder of Markdown files), and another user has brought this up as well.

2 Likes

I wish I had a sophisticated workflow with Typinator or Keyboard Maestro.

I will prompt RecurseChat:

Format the text into paragraphs. Do not otherwise alter the text, except for spelling errors. Do not label the paragraphs.
and then paste the text I want formatted.
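
(If you ever want to automate that copy and paste, here’s a minimal sketch against a local OpenAI-compatible endpoint; LM Studio serves one at http://localhost:1234/v1 by default, and the model name below is a placeholder for whatever you have loaded.)

```python
# Minimal sketch: send a Whisper transcript to a local OpenAI-compatible
# endpoint for paragraph formatting. The endpoint URL and model name are
# assumptions; adjust them for your setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

PROMPT = (
    "Format the text into paragraphs. Do not otherwise alter the text, "
    "except for spelling errors. Do not label the paragraphs."
)

def format_transcript(transcript: str) -> str:
    response = client.chat.completions.create(
        model="mistral-7b-instruct",  # placeholder; use the model you loaded
        temperature=0.2,  # a low temperature discourages rewriting
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```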

Previously I had issues with RC summarizing rather than formatting the text. It worked properly the last time I tried this. I have not tested this function with longer pieces of text.

MacWhisper Pro recently added a ChatGPT integration (it needs an API token). One of the available options is formatting into paragraphs, and using GPT-3.5 makes this task cheaper.

ChatGPT has always formatted the text into paragraphs properly. I have recently stopped the subscription and was looking for cheaper options. I have not tried Claude for the task yet.

Thanks for the insights.

We have previously had some issues with summarizing long text because the default context length was too small. After bumping the context length, it’s fixed.
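
(For anyone running GGUF models outside the app, the same knob exists at the library level; a minimal llama-cpp-python sketch, with a placeholder model path:)

```python
# The library-level analogue of bumping the context length: llama-cpp-python
# defaults to a small n_ctx, which truncates long transcripts.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct.Q5_K_M.gguf",  # placeholder path
    n_ctx=8192,  # raise the context window for long inputs
)
```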

I found this comment from Reddit: https://www.reddit.com/r/LocalLLaMA/comments/18vtwkc/comment/kfu1hb4/

And tried the karen_theeditor_v2_strict_mistral_7b.Q5_K_M.gguf model from TheBloke/Karen_TheEditor_V2_STRICT_Mistral_7B-GGUF on Hugging Face in RecurseChat (you can create a custom GGUF model in the model tab).

It seems promising, since it works well with a couple of samples (it doesn’t modify the original text at all), with the config below:

Note that:

  • You’d need to customize the instructions, user prefix, user suffix, assistant prefix, and stop sequences to match the ChatML prompt template (you can see the template on the model home page above; a concrete sketch follows this list)
  • The assistant prefix ends in two newlines (\n\n); without the second one, it seems to stop after one paragraph
  • Increase Max Output Tokens if you are dealing with long text
  • I lowered the temperature a bit, but it also seems to be okay at 0.7
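
To make the first note concrete, here is an illustrative sketch of those settings for a ChatML model; the field names mirror the bullets above rather than RecurseChat’s exact UI labels:

```python
# Illustrative ChatML settings for the Karen editor model. Field names are
# assumptions based on the notes above, not RecurseChat's exact labels.
settings = {
    "instructions": "<|im_start|>system\n"
                    "You are an editor. Reformat the user's text into "
                    "paragraphs without changing the wording.<|im_end|>\n",
    "user_prefix": "<|im_start|>user\n",
    "user_suffix": "<|im_end|>\n",
    # Two trailing newlines: with only one, output stopped after a paragraph.
    "assistant_prefix": "<|im_start|>assistant\n\n",
    "stop_sequences": ["<|im_end|>"],
    "max_output_tokens": 4096,  # raise for long transcripts
    "temperature": 0.7,
}
```
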
1 Like

Thanks for the extra configuration help.

1 Like

I want to make sure I’m understanding this correctly.

I am a lawyer and I have built a digital research library over many years. Will this app use my documents (which include many voluminous PDF files) as the basis to answer my questions?

For example, if I ask it to check the requirements in California for giving a trust notification upon death, it will check against my documents and not random sites on the web?

If so…cool! Would be an excellent tool.

Not yet.

What it currently does is answer against a PDF or Markdown file you upload to the chat session with RecurseChat, so if you had all of your knowledge base in one big honking PDF it would theoretically work (with limitations from the selected model’s token window and whatnot).

I also see that in the current RecurseChat version you can upload several PDFs to a chat session, but I’m not sure if the second one would replace the first one or add its content to it.

I have tried uploading a single financial statement PDF from my bank and asking some questions, and it did fine; although it failed to answer some questions, it generally behaved as “OCR on steroids” and is fast. One very nice detail is that, at least when working with a Mistral model, RecurseChat will capture the context from the ground truth reported by the model and display it as footnotes that you can hover over to check the actual PDF text.

Working on a collection of documents stored in a folder requires some additional local infrastructure for indexing, but it is something I would expect to be on the roadmap. Tagging @xyc so they can see the feedback.

One other use case would be summarising long PDFs into a few paragraphs and exporting that to Markdown, so you can keep the original document alongside a summary. Automating this would also be very cool.

While we don’t support this use case right now, it’s one of the exact use cases we are building towards. For the time being, we support dragging and dropping multiple PDF files into a session, but the UX will probably break down if there are too many files. And yes, the chat-with-PDF feature should refer to the original document and cite the reference where possible.

We are in the process of designing the UX for dragging and dropping a folder into the UI and starting to chat with it. One consideration is that if there are many files in your folder, it might take a relatively long time to index, but that should only happen once, and subsequent queries should be fast.

1 Like

Thanks for the feedback on the PDF chatting. If you have some question/answer failure examples on a non-sensitive document, feel free to post them here or send an email to support@recurse.chat and I can take a look (we are also thinking about creating better avenues for reporting issues).

> What it currently does is answer against a PDF or Markdown file you upload to the chat session with RecurseChat, so if you had all of your knowledge base in one big honking PDF it would theoretically work (with limitations from the selected model’s token window and whatnot).

> I also see that in the current RecurseChat version you can upload several PDFs to a chat session, but I’m not sure if the second one would replace the first one or add its content to it.

For multiple PDFs in one chat session, RecurseChat indexes every one of them (splitting into chunks, embedding, and saving to a vector database). The PDFs’ combined text length can go way beyond the model context window. We then use the question to retrieve the most relevant chunks and prompt the LLM, so both the first and second files are used as context. We don’t have a UI for adding or removing individual files in a chat session yet.
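
To make that pipeline concrete, a minimal sketch of the same chunk, embed, and retrieve flow (not RecurseChat’s actual code; the embedding model and chunk sizes are assumptions):

```python
# Minimal retrieval-augmented generation sketch: chunk the documents, embed
# the chunks, then retrieve the chunks most relevant to a question.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any local embedding model

def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character-based chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def build_index(doc_texts: list[str]):
    """Index every document; all chunks share one vector index."""
    chunks = [c for doc in doc_texts for c in chunk(doc)]
    return chunks, embedder.encode(chunks, convert_to_tensor=True)

def retrieve(question: str, chunks, embeddings, k: int = 4) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embedder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q, embeddings, top_k=k)[0]
    return [chunks[hit["corpus_id"]] for hit in hits]

# The retrieved chunks are then pasted into the LLM prompt as context, e.g.
# "Answer using only the excerpts below:\n" + "\n---\n".join(top_chunks)
```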

> One other use case would be summarising long PDFs into a few paragraphs and exporting that to Markdown, so you can keep the original document alongside a summary. Automating this would also be very cool.

That would be a cool feature, for sure. Summarizing is a great use case, and it is where we could use more tuning; currently the chat-with-PDF feature leans more towards question and answer.

This has just been launched by OpenAI, and it solves your use case:

https://platform.openai.com/docs/assistants/tools/file-search

It’s still a little bit geeky, but soon Mac apps will use these APIs to present a friendly UI. Being a cloud service, trust and privacy concerns are to be considered, though. I would wait for a local-only solution; it will happen soon enough.
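
For reference, a rough sketch of that file search setup based on the docs linked above (the API was in beta at the time, so names may have changed; the file name is a placeholder):

```python
# Rough sketch of OpenAI's file_search tool (Assistants API, beta); see the
# linked docs for the authoritative version.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create a vector store and upload library documents into it.
vector_store = client.beta.vector_stores.create(name="research-library")
with open("trust_notifications_ca.pdf", "rb") as f:  # placeholder file
    client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id, files=[f]
    )

# Create an assistant that searches those files when answering.
assistant = client.beta.assistants.create(
    model="gpt-4-turbo",
    instructions="Answer strictly from the attached documents, with citations.",
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```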

3 Likes