Open Source RAG with Gemma and Langchain | (Deploy LLM on-prem)

Farzad Roozitalab (AI RoundTable)

10 months ago

5,936 views


Comments:

@NinVibe - 05.04.2024 11:56

I have a question. Once you start using this app and upload documents with the 'upload PDF or doc file' button, do the documents stay in the app's data so you can use them whenever you want, or will they be deleted? (Sorry if you answered this in the video; I probably missed it.)

@THAMEEMMULANSARIS - 19.09.2024 16:06

Do we need to run 'python src/llm_serve.py' for every execution, or is it enough to run it once and then call the hosted endpoint from Postman?

@SaddamBinSyed - 05.08.2024 20:34

Hi, thanks for the great video. I tried it on my RTX card with 12 GB of VRAM but could not set it up due to an out-of-memory (OOM) error. I set the quantization flag as well (as shown in the notebook), but it still throws OOM. Could you please advise: is there any way I can plug Ollama models into this code for testing? Help appreciated.

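A back-of-envelope estimate helps explain the OOM above. Assuming a 7B-parameter model in the Gemma-7B class (these figures are my own rough arithmetic, not from the video), the weights alone in fp16 already exceed a 12 GB card:

```python
def model_weight_gib(n_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed just for the model weights, in GiB."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 2**30

# fp16: ~13 GiB of weights alone -- over a 12 GB card before any activations.
print(round(model_weight_gib(7, 16), 1))
# 4-bit: ~3.3 GiB of weights; if OOM still occurs, it usually comes from the
# model being loaded in fp16 first, or from the KV cache at long contexts.
print(round(model_weight_gib(7, 4), 1))
```

This is why quantization must take effect at load time, not after the full-precision weights have already been placed on the GPU.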
@tk-tt5bw - 26.04.2024 09:24

This is great, but I think it would be better if you did your projects on Colab.

@AliFarooq-yg6fn - 18.04.2024 16:57

Hi, I have a question. I already implemented RAG with Mistral/Dolphin 7B, and I have also tested some advanced RAG techniques like ensembles, parent-child, and multi-query retrieval. I don't have a GPU, so I run my LLM on an LM Studio server. In your video you say we need a GPU; can we also use a CPU and build the same interface and project? I also want to deploy my app on a server. I watched your video where you use Flask to deploy the app, but that is local; what do I have to do to deploy it on a server? I also came across the Heroku platform for deploying applications with a GPU. I'm confused: can I build the application like you did on my local system without a GPU for testing, and then deploy it on a server?

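On the CPU question above: LM Studio exposes an OpenAI-compatible HTTP server (by default at http://localhost:1234/v1), so a CPU-only setup can point the app's requests there instead of a GPU-hosted model. A minimal stdlib-only sketch of building such a request (the model name, prompt, and context string are placeholders of my own):

```python
import json
from urllib.request import Request, urlopen

def build_chat_request(base_url: str, question: str, context: str) -> Request:
    """Build an OpenAI-style chat-completions request for a local server."""
    payload = {
        "model": "local-model",  # LM Studio serves whichever model is loaded
        "messages": [
            {"role": "system",
             "content": "Answer only from the provided context:\n" + context},
            {"role": "user", "content": question},
        ],
        "temperature": 0.0,
    }
    return Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("http://localhost:1234/v1", "What is RAG?", "retrieved context")
# urlopen(req) would send it once the LM Studio server is running.
```

Because the endpoint is OpenAI-compatible, the same request shape works unchanged whether the backend is LM Studio locally or a GPU server later, which also answers the deployment half of the question: only `base_url` changes.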
@KhanhLe-pu5wx - 12.04.2024 09:43

What do you have in the .env file?

@ashwinisivanandan1715 - 01.04.2024 22:07

Hey! Thanks for the lovely videos.
Just a question: how do I work without a GPU? It's asking me for an NVIDIA driver, which is not installed on my system, and I don't intend to install it either. Any workaround you can suggest?

@musumo1908 - 01.04.2024 00:41

These videos are amazing! Can you release a version with the document-summary task added back, subject to the LLM model? Thanks!

@TooyAshy-100 - 30.03.2024 22:18

Hi,
How can open-source tools and frameworks be utilized to evaluate the performance of a Retrieval-Augmented Generation (RAG) system that integrates Large Language Models (LLMs) like Google Gemma?

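On the evaluation question above: open-source frameworks such as Ragas or TruLens score generation-side qualities like faithfulness and answer relevance, but the retrieval side can be measured with no dependencies at all, using hit rate and mean reciprocal rank over a small hand-labelled query set. A toy sketch (the data is invented for illustration):

```python
def hit_rate_and_mrr(results, k=3):
    """results: list of (ranked_doc_ids, relevant_doc_id) pairs.
    Returns (hit rate @ k, mean reciprocal rank @ k)."""
    hits, rr = 0, 0.0
    for ranked, relevant in results:
        top_k = ranked[:k]
        if relevant in top_k:
            hits += 1
            rr += 1.0 / (top_k.index(relevant) + 1)
    n = len(results)
    return hits / n, rr / n

# Two queries: the first finds its relevant doc at rank 2, the second misses.
eval_set = [(["d4", "d1", "d9"], "d1"), (["d3", "d7", "d2"], "d5")]
print(hit_rate_and_mrr(eval_set))  # (0.5, 0.25)
```

Running this over the retriever alone, before any LLM is involved, usually isolates whether a weak answer comes from retrieval or from the model (Gemma, in this case).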
@venys1388 - 09.03.2024 06:01

Thank you so much

@gbrbreenecommerce - 06.03.2024 22:15

Thanks for your efforts; I really like your explanation style. I have a question that has been bugging me: how do you control the LLM's response and make sure it comes only from the RAG context? Are there specific techniques for that? Also, how do you provide feedback about a response to the LLM? I can see you showing thumbs up and down here, but where is this feedback saved, and how is the LLM informed of it? Sorry if my questions are basic (maybe they are for others), but for me it is very important to understand them. Thank you again.

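On the grounding half of the question above, the usual technique is a system prompt instructing the model to answer only from the retrieved context and to say it doesn't know otherwise. On the feedback half, thumbs up/down votes are typically just persisted for later analysis or fine-tuning data, not fed back to the LLM live. A sketch of local persistence (the schema is my own assumption, not what the video's repository actually uses):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path here would persist across runs
conn.execute("""CREATE TABLE feedback (
    question TEXT,
    answer   TEXT,
    vote     INTEGER  -- +1 for thumbs up, -1 for thumbs down
)""")

def record_feedback(question: str, answer: str, vote: int) -> None:
    """Store one thumbs up/down vote alongside the Q&A pair it rates."""
    conn.execute("INSERT INTO feedback VALUES (?, ?, ?)",
                 (question, answer, vote))
    conn.commit()

record_feedback("What is Gemma?", "An open LLM family from Google.", 1)
rows = conn.execute("SELECT vote FROM feedback").fetchall()
print(rows)  # [(1,)]
```

Downvoted pairs can later be reviewed to improve prompts or retrieval, or exported as preference data; nothing reaches the model at inference time.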
@Mike-Denver - 05.03.2024 01:59

This is awesome, thank you! One thing though: by now almost everyone has pointed out that Gemma is a weak model, and many experiments on YouTube show it underperforming. How about using another open LLM? Can you experiment with LM Studio?

@bald_ai_dev - 03.03.2024 07:28

Great stuff. Instead of using PDFs, can you do a tutorial on using a large CSV file for RAG?

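For CSV data the common pattern is one document per row, which is what LangChain's `CSVLoader` produces. A dependency-free sketch of the same idea (the sample data is invented):

```python
import csv
import io

def csv_rows_to_docs(csv_text: str) -> list[str]:
    """Turn each CSV row into a 'key: value' text chunk ready for embedding."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["\n".join(f"{k}: {v}" for k, v in row.items()) for row in reader]

sample = "name,price\nwidget,9.99\ngadget,19.99\n"
docs = csv_rows_to_docs(sample)
print(docs[0])  # "name: widget" / "price: 9.99" on two lines
```

Spelling out the column names in each chunk matters: a bare "widget,9.99" embeds poorly, while "name: widget\nprice: 9.99" keeps the semantics the retriever needs.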
@ginisksam - 03.03.2024 07:22

Hi. Thanks for the detailed, step-by-step walkthrough of your code. I'm enjoying it and learning at the same time. 🙏

@rocufw - 03.03.2024 02:16

Are there any good open-source embedding models? If someone wants to keep their data private, won't ada require sending the data to OpenAI?

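On the question above: yes, open-source families like BAAI's bge models and sentence-transformers run entirely locally, so no text leaves the machine (unlike OpenAI's ada endpoint). Retrieval then reduces to cosine similarity between vectors; a toy sketch with hand-made vectors standing in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings; in practice these would come from a locally run model,
# e.g. something like SentenceTransformer("BAAI/bge-small-en").encode(...).
query = [1.0, 0.0, 1.0]
doc_a = [0.9, 0.1, 0.8]  # points roughly the same way as the query
doc_b = [0.0, 1.0, 0.0]  # orthogonal to the query
print(cosine(query, doc_a) > cosine(query, doc_b))  # True
```

Vector stores like Chroma or FAISS do exactly this comparison at scale, and both also run fully on-prem.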
@omidsa8323 - 03.03.2024 01:09

Just another fantastic video! Thanks Farzad

@saeednsp1486 - 02.03.2024 23:52

Hi Farzad, how are you?
I'm testing RAG platforms; currently I'm using privateGPT + Ollama (Mistral Instruct fp16 v0.2) + bge-m3 + bge-reranker-large.
I have a 3090. The inference is super fast, but the results are not satisfying.

My question is: what's the best RAG platform right now if you want to go fully open source?
Also, please try your RAG with more complex PDF files, not easy text files like a story.

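The stack above already uses two-stage retrieval: a fast first pass (bge-m3 embeddings) followed by a cross-encoder rerank (bge-reranker-large) that rescores each query-passage pair. The shape of that second stage can be sketched with a toy scorer; here plain word overlap stands in for the real cross-encoder, purely for illustration:

```python
def rerank(query: str, passages: list[str], top_n: int = 2) -> list[str]:
    """Rescore candidate passages against the query and keep the best top_n.
    Word overlap is a stand-in for a cross-encoder such as bge-reranker."""
    query_words = set(query.lower().split())

    def score(passage: str) -> int:
        return len(query_words & set(passage.lower().split()))

    return sorted(passages, key=score, reverse=True)[:top_n]

candidates = [
    "gemma is an open model",
    "the cat sat",
    "open source rag with gemma",
]
print(rerank("open gemma model", candidates, top_n=2))
```

With complex PDFs, unsatisfying results often trace back to the parsing and chunking stage rather than the reranker, so inspecting the extracted chunks before blaming the models is usually the first diagnostic step.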