In this workshop, you will learn how to build LLM-based apps, such as a question-answering system with RAG, in LangChain using Llama-3 running at 1,000 tokens per second on the SambaNova AI Platform.
Level: Intermediate
Abstract: SambaNova delivers generative AI capabilities to the enterprise. In this workshop, you will learn:
● About SambaNova’s full-stack generative AI platform, powered by the SN40L AI chip and delivering unparalleled performance for training and inference
● About Samba-1, a trillion-parameter Composition of Experts (CoE) model, and how it can be used in enterprise settings
● How to build and deploy a question-answering app end-to-end with retrieval augmented generation (RAG) for enterprise search using the following suite: LangChain as the framework, Unstructured for pre-processing text documents, E5-large-v2 for embeddings, ChromaDB as the vector store, and Llama-3-8B-Instruct running at a record speed of 1,000 tokens per second via SambaNova
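To give a flavor of the RAG pipeline covered in the last bullet, here is a minimal, self-contained sketch of the retrieval step. It uses a toy bag-of-words embedding and an in-memory chunk list as stand-ins for E5-large-v2 and ChromaDB; all function names, the sample chunks, and the prompt template are illustrative, not taken from the workshop code.

```python
# Hypothetical sketch of RAG retrieval: embed chunks, rank by cosine
# similarity to the query, and build a context-grounded prompt.
# The bag-of-words "embedding" is a deliberate stand-in for a real model.
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for E5-large-v2)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query (in-memory 'vector store')."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Illustrative pre-chunked documents (in the workshop, Unstructured does this).
chunks = [
    "SambaNova's SN40L chip powers fast LLM inference.",
    "ChromaDB stores document embeddings for retrieval.",
    "Streamlit apps provide a simple web UI.",
]

question = "Which database stores embeddings?"
context = retrieve(question, chunks)[0]
# The prompt plus context would then be sent to the LLM endpoint.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In the actual exercises, LangChain wires these stages together: the embedding model and ChromaDB replace `embed` and `retrieve`, and the assembled prompt goes to the Llama-3-8B-Instruct endpoint.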
This workshop is designed for tech professionals, engineers, and anyone interested in enterprise generative AI applications.
Prerequisites: Programming experience (ideally in Python), a GitHub account, and a laptop
Assets: We will provide a link to the GitHub repo with step-by-step instructions for installing the required libraries and running the Jupyter notebooks and Streamlit apps. We will also provide SambaNova API keys for the CoE and Llama-3 endpoints.
GitHub Repo: https://github.com/sambanova/ai-starter-kit/tree/main/workshops/ai_engineer_2024/
Dev Setup for Exercise 1: https://github.com/sambanova/ai-starter-kit/blob/main/workshops/ai_engineer_2024/basic_examples/README.md
Dev Setup for Exercise 2: https://github.com/sambanova/ai-starter-kit/blob/main/workshops/ai_engineer_2024/ekr_rag/README.md