20240702 Présentation Plateforme GenAI.pdf

Iguane Solutions ©2024 - ig1.com
2024.07.02 - Plug n Play Gen AI Platform

Iguane Solutions ©2024 - all rights reserved - ig1.com
IG1
A New Era
2

IG1
Index
Plug n Play Gen AI Platform Event
3
PLUG N PLAY
GEN AI PLATFORM
CUSTOMER
TESTIMONIAL
QUESTIONS

IG1
1. Plug n Play Gen AI Platform
4

IG1
Plug n Play Gen AI Platform
Layered Architecture for Gen AI Infrastructure
PLUG N PLAY
GEN AI PLATFORM
Layered Architecture
Concept: Gen AI
5

IG1
6
Context: Artiﬁcial Intelligence Era
PLUG N PLAY
GEN AI PLATFORM
Market Size
1
source: https://www.precedenceresearch.com/artiﬁcial-intelligence-market
$ 1 807 bn
$ 638bn
2024 2030

IG1
AI Platform Concept
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
What is an AI Platform ?
7

IG1
Not everyone can move
to the public cloud or use OpenAI .
8
PLUG N PLAY
GEN AI PLATFORM
Context

IG1
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
1. Hardware & Cloud :
Infrastructure
2. Model Foundation: LLM & RAG usage
3. Integration, Orchestration &
Deployment tooling
4. Gen AI Applications
9

IG1
Layered Architecture for Gen AI Infrastructure / Layer 01: Hardware & Cloud
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
Unpacking and Initial Setup
Unpack the Hardware
Carefully unpack the servers, NVidia GPUs, and other hardware components.
Purpose: Ensures that all components are intact and ready for installation.
Rack the Servers
Install the servers into the designated racks in the data center.
Purpose: Provides a secure and organized physical setup.
Connect Power and Networking
Connect the servers to power sources and the data center network.
Purpose: Ensures the servers are powered and networked for subsequent conﬁguration.
Hardware Conﬁguration
Install NVidia GPUs
Physically install the NVidia GPUs into the servers according to the manufacturer's instructions.
Purpose: Provides the necessary hardware acceleration for AI computations.
Verify Hardware Connections
Ensure all connections are secure and components are properly seated.
Purpose: Prevents hardware failures and connectivity issues during operation.
Physical Servers
10

IG1
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
Operating System Installation
Install the OS
Install the OS: Install IG1 AI OS, a specially designed operating system tailored for AI services, leveraging
our deep expertise and capability in managing "plug and play" platforms for AI.
Purpose: Provides the underlying operating system for all software and services.
Update the System
Run system updates to ensure all packages are up to date.
Purpose: Ensures the system has the latest security patches and features.
GPU Drivers and CUDA Installation
NVidia Drivers
Install the latest NVidia drivers for the GPUs.
Purpose: Enables the operating system to communicate with the GPUs.
CUDA Toolkit
"CUDA toolkit" is embedded in IG1 OS.
Purpose: Provides the necessary libraries and tools for developing and running GPU-accelerated applications.
Base System
11

IG1
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
KUBE by IG1 for AI
Installation and Configuration
Install KUBE by IG1
Follow the installation guide for KUBE by IG1 to set up the virtualization layer.
Purpose: Provides a platform for managing virtual machines and containers.
Configure Networking
Set up networking within KUBE to ensure communication between nodes and external access.
Purpose: Ensures seamless communication and data transfer within the cluster and with external clients.
Cluster Installation
Initialize KUBE Cluster
Initialize the KUBE cluster to create a control plane and add worker nodes.
Purpose: Establishes the core infrastructure for managing containerized applications.
Verify Cluster Health
Check the health and status of the KUBE cluster to ensure all components are functioning correctly.
Purpose: Identifies and resolves any issues before proceeding with further setup.
12

IG1
LLM Model Setup
LLM Model Setup
Download LLM
Obtain the LLM from the appropriate source.
Purpose: Provides the base AI model for various applications.
LLM Optimization
Optimization consists in optimising resource usage by preparing and enhancing LLMs through a process
called quantization. Quantization increases inference performance without signiﬁcantly compromising
accuracy. Our quantization management services utilize the AWQ project, which provides excellent
performance in terms of speed and accuracy.
LLMs Inference servers
Similar to database engines, LLMs inference servers run LLMs for inference or embedding. IG1 installs
and manages all the necessary services for the proper functioning of LLM models. For this, we rely on
several instances of:
- VLLM , ideal for models without quantization FP16,
- Nvidia Triton Inference server , for optimized models with Nvidia TensorRT-LLM
- TGI (Text Generation Inference) for Hugging Face models
Layered Architecture for Gen AI Infrastructure / Layer 02: Model Foundation
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
13

IG1
RAG Setup
RAG (Retrieval-Augmented Generation) Setup
Integrate RAG Components
Set up the necessary RAG components (example using the LlamaIndex framework):
- Retriever: Finds the most relevant information from the data.
- Generator: Uses the retrieved information to generate accurate responses.
- Embedding: Transforms data into vector representations to improve retrieval accuracy.
- Reranking: Organizes and prioritizes the retrieved results based on relevance.
Purpose: Enhances the LLM with retrieval-augmented capabilities for more accurate and relevant responses.
Deploy RAG Pipeline
Deploy the RAG pipeline within the KUBE environment.
Purpose: Ensures the RAG system is operational and integrated with the LLM model.
Layered Architecture for Gen AI Infrastructure / Layer 02: Model Foundation
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
14

IG1
Integration of AI Services
Integrate various AI services seamlessly to ensure efﬁcient communication and operation.
This includes:
API Integrations
Connect your AI models to various APIs for extended functionalities, including data retrieval, processing,
and user interface interactions.
Data Pipelines
Establish data pipelines to ensure smooth data ﬂow between different components, facilitating real-time
data processing and analysis.
The API Core acts as a Proxy LLM , balancing the load between LLMs inference server instances .
LiteLLM , deployed in High Availability, is used for this purpose. It offers wide support for LLM servers,
robustness, and usage information and API key storage through PostgreSQL . LiteLLM also enables
synchronization between different instances and sends LLM usage information to our
observability tools .
Integration AI
Services
Layered Architecture for Gen AI Infrastructure / Layer 03: Integration, Orchestration & Deployment Tooling
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
15

IG1
Observability &
Traceability
Observability and Traceability
Implement observability tools to gain insights into the behavior and performance of your AI applications:
Centralized Logging
Aggregate logs from different services and applications in a central location for easier analysis and
troubleshooting
Metrics Collection
Collect metrics on various aspects of your applications' performance, such as response times, error rates,
and resource usage.
Distributed Tracing
Use distributed tracing to track requests as they ﬂow through different services, helping to identify
bottlenecks and optimize performance..
The LLMs observability layer collects usage data and execution traces, ensuring proper LLM
management . IG1 efﬁciently manages LLM usage through a monitoring stack connected to the
LLM orchestrator . Lago and OpenMeter collect information , which is then transmitted to our
central observability system, Sismology .
Layered Architecture for Gen AI Infrastructure / Layer 03: Integration, Orchestration & Deployment Tooling
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
16

IG1
Chat
Layered Architecture for Gen AI Infrastructure / Layer 04: AI Applications
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
17

IG1
Dev Copilot
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
18

IG1
PLUG N PLAY
GEN AI PLATFORM
Context
92% of US-based developers have already
used an AI assistant for coding 2
92%
2
https://github.blog/2023-06-13-survey-reveals-ais-impact-on-the-developer-experience/
19

IG1
Low Code LLM tool
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
20

IG1
API Setup
. . .
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
21

IG1
Physical Servers Base System KUBE by IG1 for AI
LLM Model Setup RAG Setup
Integration AI
Services
Observability &
Traceability
Chat Dev Copilot Low Code LLM tool API Setup
. . .
PLUG N PLAY
GEN AI PLATFORM
Concept: Gen AI
22

IG1
23
Artiﬁcial Intelligence Era
PLUG N PLAY
GEN AI PLATFORM
Our Mission
Our mission
Help organizations to
beneﬁt from AI
by providing them AI platforms

IG1
Service
Benefit from our expert-led AI services that deliver
tailored solutions, from infrastructure design to
ongoing support, ensuring seamless integration
and immediate usability . With advanced security,
data integrity oversight, and personalized interfaces,
our services enhance your operations with
integrated AI tools for efficient DevOps, MLOps, and
AIOps, supporting scalable and effective AI
management.
Hosting
Elevate your AI applications with our cutting-edge
hardware and cloud infrastructure, featuring
NVIDIA GPU-equipped servers, optimized
Linux-based IG1 AI OS, and KUBE by IG1 for
efficient virtual machine and container management.
Our comprehensive solutions cover everything from
initial server setup to seamless deployment,
ensuring exceptional performance and flexibility for
your AI workloads.
Software
Enhance your AI projects with our all-inclusive
software solutions designed for deploying and
managing Large Language Models (LLMs) and
Retrieval-Augmented Generation (RAG)
systems . Our packages offer powerful tools for API
integration, data pipeline management,
containerized deployment, and comprehensive
observability, ensuring smooth operations and
insightful performance metrics for optimal resource
management.
Pricing
Our Offers , starting July 2024
+ +
24
PLUG N PLAY
GEN AI PLATFORM
Our Offers

IG1
Full RAG: Embedding,
Reranking resources
and Vector DB
LLMs inference servers for multi LLMs
Slack Support 24 / 7
Control your data,
hosted in France
KUBE by IG1 for AI
Train, Fine-tune et run your own Models
Metrology & Supervision
by Sismology
Ollama to
OpenAI API translator
H100 & H200 GPUs
AIOps, MLOps
Consulting
Support
25
Plug n Play Platforms for Gen AI
Models quantization
for LLMs optimization
OpenAI API-compatible core

IG1
2. Customer Testimonials
26

IG1
Customer Testimonials: Easybourse
27
Data Security and GDPR Compliance
CUSTOMER
TESTIMONIAL
Context Highly regulated
banking sector
Absolute data
security

IG1
28
Expertise and innovation at Iguana Solutions
CUSTOMER
TESTIMONIAL
Research & Development
● Recognized expertise on highly innovative topics
● Beneﬁt from one year of R&D
● Numerous tests to evaluate what works

IG1
29
Need for internal resources
CUSTOMER
TESTIMONIAL
HR
● No internal resources dedicated to R&D on GenAI
● Difﬁculty progressing alone on the subject

IG1
30
Deployed Gen AI Platform
CUSTOMER
TESTIMONIAL
Easybourse
Gen AI Platform
1. Hardware & Cloud Infrastructures
2. LLM & RAG Deployments
3. Integration, Orchestration, and Deployment Tools
4. AI Applications

IG1
31
Utilization of Code Copilot
CUSTOMER
TESTIMONIAL
Use Cases
● Immediate adoption of Copilot for coding assistance
● Ease of deployment: Plug n Play in less than an hour
● Proven efﬁciency from the ﬁrst uses
● Unfailing and high-quality support and guidance

IG1
32
Deployment and Next Steps
CUSTOMER
TESTIMONIAL
Use Cases
● Initial deployment of GenAI for operations
● Current testing of the IG1 platform with RAG

IG1
33
Key Beneﬁts for Easybourse
CUSTOMER
TESTIMONIAL
Key Beneﬁts
● Guaranteed data security, no DPO discussions
● Access to results from over a year of R&D
● True Plug n Play: operational in no time
● Very high-quality professional service

IG1
Questions
34

36

20240702 Présentation Plateforme GenAI.pdf

Related slideshows

More Related Content

20240702 Présentation Plateforme GenAI.pdf