
Vector databases as the data foundation for generative AI applications


Table of contents

Introduction

Introduction to vector databases

Vector databases for GenAI: use cases

LLM RAG chatbot
“What is PRO?” response without RAG
“What is PRO?” response with RAG
Financial Services
Telecommunications
High-Dimensional Data Management
Similarity Search and Nearest Neighbour Queries
Scalability and Real-Time Performance
Integration with Machine Learning Workflows
Indexing and Query Optimization

Project considerations
Building your use case
Data storage and cleanliness
Security and compliance
Architecture, scalability, and performance
Talent gap

Trends and future

Open source
Hardware acceleration
Advanced machine learning models

Conclusion

Next steps

Introduction
The generative AI and MLOps spaces are undergoing rapid change, having moved
from the realm of specialist research to mainstream accessibility in just a couple
of years. One of the most exciting aspects of generative AI is its versatility across
industries. Whether in real estate, hospitality, retail, or telecommunications,
organisations around the globe are discovering ways to take advantage of GenAI to
drive development, growth, and innovation.

GenAI has a wide variety of use cases: it can be used in knowledge creation, system
recommendations, personalised ads and communication, sales trends prediction,
customer feedback analysis and much more.

Despite its potential, many organisations struggle to initiate their GenAI
journey because of the many complexities and technologies involved. Launching a
GenAI project requires an MLOps platform to manage activities such as model fine-
tuning, data cleaning, and automating machine learning workloads. Additionally,
navigating the array of libraries and frameworks can be challenging.

But there is another key enabler behind GenAI that is often overlooked – vector
databases. Modern vector databases, optimised for storing and retrieving vector
representations of data, are crucial to the successful deployment of GenAI models
in production applications.

This guide focuses on vector databases as a foundational element for GenAI,
providing an introduction to their function and exploring their use cases across
financial, telecommunications, retail, e-commerce, and public sector industries.
It also highlights key considerations for starting your GenAI project and offers
insights into future trends in AI and MLOps development.

Introduction to vector databases
Although vector databases are not new to the market, they have come into the
spotlight amid the rapid development of GenAI and LLMs. According to a report by
MarketsandMarkets, the global vector database market is projected to reach USD
4.3 billion by 2028, which represents a compound annual growth rate of 23.3%.

Vector databases are a significant element in machine learning and AI
applications. Their importance stems from their ability to effectively manage and
query vast quantities of high-dimensional data (datasets that have a large number
of features or variables), which is commonly used in AI and ML. As the adoption of
AI and machine learning continues to rise, the reliance on vector databases as key
data infrastructure will only increase.

Vector databases store and index text documents, rich media, audio, geospatial
coordinates, tables, and graphs into vectors for fast retrieval and similarity search.
These vectors represent points in N-dimensional spaces, effectively encapsulating
the context of an asset. Search tools can probe these spaces with low-latency
queries to find similar assets among neighbouring data points. These search tools
typically do this by exploiting the efficiency of different methods for obtaining,
for example, the k-nearest neighbours (k-NN) from an index of vectors.
OpenSearch, for example, is a leading open source vector database. It offers the
k-NN plugin and augments this functionality by providing your conversational
applications with other essential features, such as fault tolerance, resource access
controls, and a powerful query engine.
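As a sketch of what this looks like in practice, the Python dicts below show the shape of a k-NN index mapping and query following the OpenSearch k-NN plugin's documented API. The index name, field names, and the tiny 3-dimensional vectors are illustrative placeholders; real embeddings typically have hundreds of dimensions.

```python
# Index settings and mapping: "index.knn": true enables the k-NN plugin,
# and the "knn_vector" field type declares the embedding dimension.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "doc_embedding": {"type": "knn_vector", "dimension": 3},
            "text": {"type": "text"},
        }
    },
}

# Query body: ask for the k nearest neighbours of a query vector.
query_body = {
    "size": 2,
    "query": {
        "knn": {
            "doc_embedding": {
                "vector": [0.1, 0.2, 0.3],
                "k": 2,
            }
        }
    },
}

# With the opensearch-py client, these bodies would be passed to
# client.indices.create(index="docs", body=index_body) and
# client.search(index="docs", body=query_body).
```

The `size` parameter controls how many hits the query returns, while `k` controls how many neighbours each shard retrieves before results are merged.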

To explore how OpenSearch can further enhance your AI infrastructure for your
GenAI needs, visit the page: https://canonical.com/data/opensearch.

Vector databases for GenAI: use cases
Vector databases enable a variety of use cases. In this section, we explore specific
use cases across different sectors, illustrating how and when vector databases can
be effectively utilised to maximise the potential of generative AI.

LLM RAG chatbot


The first use case involves developing a chatbot or virtual assistant, a solution with
broad applications across various industries and appealing to many companies.

To build an effective chatbot or virtual assistant, you can leverage a vector
database to maintain context and provide coherent, contextually relevant
responses. The creation of a large language model (LLM) chatbot is closely
linked with Retrieval Augmented Generation (RAG), which enhances its ability to
generate accurate and relevant replies.

What is retrieval augmented generation? RAG is a method that enhances the
precision and reliability of generative AI models by drawing on facts sourced
from external databases. Renowned LLMs like GPT, BERT, PaLM, and LLaMA can
see performance improvements from access to additional structured and
unstructured data, such as public or internal documents, websites, and other
text content. RAG ensures that your conversational application generates
accurate results with contextual relevance and domain-specific knowledge, even
in areas where the pertinent facts were not part of the initial training dataset.

Consider the query: “What is Ubuntu Pro?” Without RAG, the response might
simply state that Ubuntu Pro is a subscription-based service that offers various
benefits. However, with RAG, enabled by vector databases through the k-NN
plugin, the response would be more detailed and precise. It would not only
describe Ubuntu Pro as a subscription model but also highlight its specific value
propositions, such as security updates, compliance packages, service level
agreements (SLA), and additional benefits.

“What is PRO?” response without RAG

Pro is a subscription-based service that offers additional features and
functionality to users. For example, Pro users can access exclusive content,
receive priority customer support, and more. To become a Pro user, you can sign
up for a Pro subscription on our website. Once you have signed up, you can access
all of the Pro features and benefits.

“What is PRO?” response with RAG

Ubuntu Pro is an additional stream of security updates and packages that meet
compliance requirements, such as FIPS or HIPAA, on top of an Ubuntu LTS. It
provides an SLA for security fixes for the entire distribution (‘main and universe’
packages) for ten years, with extensions for industrial use cases. Ubuntu Pro is
free for personal use, offering the full suite of Ubuntu Pro capabilities on up to 5
machines.

Notice the difference between the two answers: RAG can drastically improve
the accuracy of an LLM’s responses.

Let’s now consider an example of how you can build your own chatbot with RAG
using OpenSearch.

To build your own chatbot, start by aggregating your source documents. First,
gather the content and context that you want your application to use for
providing accurate answers. This could include various documents or data scraped
from multiple web pages. Load this content into memory and divide it into
chunks that are suitable for embedding.

Next, generate embeddings for these text chunks and store them in the
vector index. You can then use similarity search to retrieve documents that
provide context for your queries. The search engine uses approximate k-NN
techniques with a distance metric such as cosine similarity to find and return
the most relevant documents for your question.
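The retrieval step above can be sketched in miniature. In this illustration, a toy bag-of-words embedding stands in for a real embedding model, and brute-force cosine similarity stands in for a vector index such as OpenSearch; the document text, chunk size, and function names are all invented for the example.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 12) -> list[str]:
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy embedding: a sparse bag-of-words vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Build the vector index from source documents, then query it for context.
docs = [
    "Ubuntu Pro is a subscription that adds security updates and compliance",
    "OpenSearch is an open source search and analytics suite",
]
index = [(c, embed(c)) for doc in docs for c in chunk(doc)]
context = retrieve("what does Ubuntu Pro add", index, k=1)
```

The retrieved `context` chunks would then be prepended to the user's question in the prompt sent to the LLM, which is the "augmented generation" half of RAG.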

To manage and integrate these components, use a tool like LangChain, which
helps orchestrate the infrastructure and ensure smooth integrations. Once set
up, you can start querying your application, specifying the context needed for
accurate responses.

Finally, perform a few inferences with the language model using the context
documents retrieved from your search engine. This process will yield responses
that are more contextually relevant and insightful, reflecting the most up-to-date
information.

This example shows how RAG can enhance data retrieval, improve knowledge
sharing, and enrich the results of your LLMs for more accurate and relevant
answers.

Canonical’s Charmed OpenSearch is a simple and robust technology that can
enable RAG capabilities. With it, any business can leverage RAG to transform their
knowledge bases. For a more detailed explanation, check out our blog “Large
Language Models (LLMs) Retrieval Augmented Generation (RAG) using Charmed
OpenSearch”.

Next, let’s look at some example use cases from financial services,
telecommunications, retail and e-commerce, and the public sector.
Financial Services
Fraud detection. Fraud detection is critical in the financial industry, where
even a small instance of fraud can have significant repercussions. Traditional
methods often struggle with the sheer volume and complexity of transaction
data. Vector databases store transaction data as high-dimensional vectors,
allowing for more sophisticated analysis. Generative AI models can then compare
current transactions against historical patterns to detect anomalies indicative of
fraudulent behaviour. For instance, a bank can use a generative AI model to detect
unusual transaction patterns in real time. The model queries a vector database
that contains vectors representing historical transaction data, allowing it to flag
suspicious activities based on similarity to known fraudulent patterns.
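As a simplified illustration of this idea, the sketch below flags a transaction whose nearest neighbour among historical transaction vectors is unusually far away. The two-dimensional feature vectors (normalised amount and hour of day) and the distance threshold are invented for the example; a real system would use far richer features and a tuned or learned threshold.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_suspicious(txn, history, threshold=2.0):
    """Flag a transaction whose nearest historical neighbour is > threshold away."""
    nearest = min(euclidean(txn, h) for h in history)
    return nearest > threshold

# Historical (normalised amount, hour-of-day) vectors for one customer.
history = [(0.2, 0.4), (0.25, 0.5), (0.3, 0.45), (0.22, 0.6)]

normal = is_suspicious((0.24, 0.5), history)   # close to past behaviour
outlier = is_suspicious((5.0, 0.1), history)   # far from everything seen before
```

In production, the `min` over history would be replaced by a k-NN query against the vector database, so the check stays fast even over millions of stored transactions.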

Personalised financial advice. Providing personalised financial advice involves
analysing a vast amount of customer-specific data, including spending habits,
investment goals, and risk tolerance. Vector databases can efficiently store and
retrieve this data, enabling generative AI models to create highly personalised
financial strategies.

Credit scoring and risk assessment. Accurate credit scoring and risk assessment
require the integration of diverse data points such as credit history, income,
spending behaviour, and much more. Vector databases allow for the efficient
storage and retrieval of this complex data, facilitating sophisticated analysis by
generative AI models to generate risk scores and make lending recommendations.

Algorithmic trading. Algorithmic trading relies on real-time analysis of vast
amounts of market data to make quick and profitable trading decisions. Vector
databases can store historical market data in a format that allows for rapid
querying and analysis, supporting the high-speed decision-making required in
algorithmic trading. To illustrate: an investment firm uses a generative AI-driven
trading system that queries a vector database of historical stock prices, trading
volumes, and news articles. The system generates trading strategies that adapt to
current market conditions, maximising profits while minimising risks.

Customer segmentation and targeting. Effective customer segmentation
requires analysing various attributes such as transaction histories, demographics,
product usage and many other parameters. Vector databases can handle this
high-dimensional data efficiently, enabling generative AI models to create
detailed and accurate customer segments for targeted marketing.

Telecommunications
Customer support automation. Customer support automation involves providing
accurate and context-aware responses to customer inquiries. Vector databases
store past customer interactions and other relevant data as high-dimensional
vectors, which generative AI models can query to produce appropriate responses.
For instance, a telecom company deploys a chatbot powered by generative AI that
queries a vector database of previous customer interactions and service records.
The chatbot generates responses that are not only relevant to the current query
but also consistent with the customer’s past preferences.

Network optimisation. Optimising network performance requires analysing
large volumes of network traffic data to identify and resolve bottlenecks. Vector
databases can handle this data efficiently, enabling generative AI models to
perform real-time analysis and optimisation, predicting future congestion points
and providing solutions to optimise network routing.
Predictive maintenance. Predictive maintenance involves analysing equipment
maintenance logs and sensor data to predict and prevent equipment failures.
Vector databases allow for the efficient storage and retrieval of this high-
dimensional data, enabling generative AI models to make accurate predictions.
A telecom operator could use a generative AI model to predict when network
equipment is likely to fail. The model would access a vector database with
historical maintenance logs and sensor readings, generating maintenance
schedules that prevent unexpected downtimes.

Churn prediction. In churn prediction, organisations analyse customer behaviour
to identify those at risk of leaving. Vector databases store customer usage
patterns, service complaints, and support interaction histories, enabling
generative AI models to identify at-risk customers and generate retention
strategies.

Enhanced customer profiling. Enhanced customer profiling involves creating
detailed customer profiles based on various attributes such as service
preferences, usage patterns, and demographic information. Vector databases can
efficiently store and retrieve this high-dimensional data, enabling generative AI
models to generate personalised service plans and targeted promotions.

Retail and E-commerce
Personalised product recommendations. Personalised product
recommendations involve analysing customer preferences and behaviour to
suggest relevant products. Vector databases store customer browsing history,
purchase history, and product attributes as high-dimensional vectors, enabling
generative AI models to come up with tailored suggestions.

Customer sentiment analysis. Customer sentiment analysis is the process of
examining reviews and feedback to understand customer opinions and improve
products. Vector databases turn customer reviews and feedback into high-
dimensional vectors, enabling generative AI models to produce insights on
sentiment and identify common issues.
Dynamic pricing optimisation. Adjusting prices based on market trends and
customer behaviour can help maximise revenue. Vector databases store
competitor prices, sales data, and customer demand patterns as high-dimensional
vectors, enabling generative AI models to produce dynamic pricing strategies.

Inventory management and demand forecasting. Inventory management and
demand forecasting involve analysing sales data and predicting future demand
to optimise inventory levels. Vector databases can store sales history, seasonal
trends, and market conditions as high-dimensional vectors, enabling generative AI
models to generate accurate forecasts.

Enhanced visual search. Enhanced visual search uses AI to match product images
with relevant items in an e-commerce catalogue. Vector databases turn product
images and descriptions into vectors, enabling generative AI models to perform
accurate and fast visual searches. An online marketplace that implements a visual
search feature powered by a generative AI model could enable customers to find
products by uploading photos or searching with visual similarity.

Public Sector
Citizen services automation. Public sector organisations can use AI to handle
inquiries and service requests from citizens. Vector databases store past
interactions, service records, feedback and service appointment scheduling,
enabling generative AI models to give accurate and context-aware responses.

Disaster response and management. Disaster response and management
requires effective coordination of efforts and resources. Vector databases store
data from various sources such as weather data, emergency reports, and resource
availability, facilitating generative AI models to produce actionable insights.

Public health monitoring. Public health monitoring involves tracking and
predicting public health trends to control disease outbreaks. Vector databases
store medical records, laboratory results, and epidemiological data as high-
dimensional vectors, enabling generative AI models to generate early warnings
and intervention strategies.
Urban planning and development. Urban planning and development refers
to designing projects that cater to future growth and sustainability. Vector
databases store demographic data, zoning regulations, and infrastructure plans as
high-dimensional vectors, so generative AI models can generate optimal designs.
Consider this scenario: a city planning commission uses a generative AI model
to design urban development projects. The model queries a vector database
containing vectors of environmental impact assessments, housing market
data, community engagement data, transportation and utility infrastructure,
generating optimal designs that cater to future growth and sustainability goals.

Common themes
You have probably noticed that many of these use cases share similarities in
the role that vector databases play. Let’s summarise the main common
themes.

High-Dimensional Data Management

Vector databases store, manage, and query high-dimensional data, which is crucial
for AI applications that rely on complex, multi-feature datasets.

Similarity Search and Nearest Neighbour Queries

AI models often require the ability to find similar data points or the nearest
neighbours in a high-dimensional space, which vector databases optimise through
efficient indexing and querying mechanisms.

Scalability and Real-Time Performance

Vector databases are built to handle large-scale datasets and support real-time
querying, which is essential for AI applications that demand quick response times
and can scale with growing data volumes.

Integration with Machine Learning Workflows

Vector databases integrate seamlessly with machine learning workflows, enabling
efficient storage and retrieval of embeddings and feature vectors generated by AI
models.

Indexing and Query Optimization

Advanced indexing techniques (e.g., Approximate Nearest Neighbour search)
speed up the querying process in high-dimensional spaces, which is vital for AI-
driven applications requiring fast data retrieval.
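One classic approximate technique is random-hyperplane hashing, a form of locality-sensitive hashing (LSH): vectors whose projections share signs land in the same bucket, so a query scans only its own bucket instead of the whole dataset. The sketch below is a minimal illustration of that bucketing idea; the hyperplanes are fixed for reproducibility (real LSH samples them at random), and production systems use more refined indexes such as HNSW graphs.

```python
from collections import defaultdict

# Fixed example hyperplanes in 4 dimensions (real LSH draws these at random).
HYPERPLANES = [
    [1, -1, 0, 0],
    [0, 0, 1, -1],
    [1, 0, -1, 0],
]

def bucket_key(v):
    """Hash a vector to a bit-tuple: the sign of each hyperplane projection."""
    return tuple(int(sum(p * x for p, x in zip(plane, v)) >= 0)
                 for plane in HYPERPLANES)

def build_index(vectors):
    """Group vectors into buckets by their hash key."""
    buckets = defaultdict(list)
    for v in vectors:
        buckets[bucket_key(v)].append(v)
    return buckets

def ann_query(buckets, q, all_vectors):
    """Scan only the query's bucket; fall back to a full scan if it is empty."""
    candidates = buckets.get(bucket_key(q)) or all_vectors
    return min(candidates,
               key=lambda v: sum((a - b) ** 2 for a, b in zip(q, v)))

data = [(1, 0, 0, 0), (0.9, 0.1, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1)]
buckets = build_index(data)
nearest = ann_query(buckets, (0.92, 0.08, 0, 0), data)  # -> (0.9, 0.1, 0, 0)
```

The trade-off is the usual one for ANN methods: the query inspects only a fraction of the data, at the cost of occasionally missing the true nearest neighbour when it hashes to a different bucket.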

Project considerations
The above-mentioned use cases can offer significant benefits to companies across
various industries. However, when embarking on an AI/ML project, it is essential to
consider several key factors.

Building your use case


When building your use case, it is important to establish your needs as an
organisation or as a team. Adopting new trending technologies without a properly
defined need can lead to a waste of money and resources. Before investing your
time and effort into developing any use case, it is important to state a clear
business objective and make sure that it aligns with your organisation’s goals.
The outcome of your use case should be measurable, scalable and sustainable.
Having that in mind will help you to identify a relevant use case and craft a
successful project strategy.

Data storage and cleanliness


Data should be clean and ready for the use case. Data storage solutions must
be reliable, scalable, and secure to handle large volumes of data efficiently. This
includes using databases or cloud storage services that provide fast access and
robust backup mechanisms to prevent data loss. Equally important is maintaining
data cleanliness, which involves removing duplicates, correcting errors, and
ensuring consistent formatting. Clean data is vital for training AI models
accurately, as poor-quality data can lead to skewed results and reduced model
performance.

Canonical’s data portfolio offers an enterprise-ready suite of popular open
source data management solutions. The suite makes it easy to build, deploy, and
run data-intensive applications with Apache Spark, Apache Kafka, MongoDB,
OpenSearch, MySQL and PostgreSQL.

Security and compliance


Data is at the heart of any AI project. It represents the intellectual property
of the organisation, and it carries an essential responsibility when handling
customer information. Organisations therefore need to build secure ML solutions
that protect both their data and the artefacts produced from it. Whether
dealing with machine learning models or the logs
they produce, ensuring the security of these components is as crucial as securing
cloud deployments or databases. Machine learning tooling should not be an
exception.

Security and compliance considerations become increasingly critical when
scaling or deploying projects in a production environment. New laws and
regulations are rapidly emerging, and it is advisable to ensure that systems are
secure and compliant from the outset, rather than trying to catch up with these
requirements later.

In highly regulated industries such as healthcare, the public sector,
telecommunications, and financial services, there are numerous compliance
requirements that must be adhered to and considered from the very beginning.
Security and compliance requirements, including data privacy regulations
like GDPR, telecommunications regulations, security standards, consumer
protection laws, and secure network and data transactions, should be prioritised
from the initial stages of any project. For example, in some jurisdictions,
telecommunications companies must comply with network neutrality regulations,
which prohibit discrimination in the treatment of internet traffic based on factors
such as source, destination, or content.

Thus, security and compliance are increasingly important and will continue to
shape the further development of data and the AI domain.

Architecture, scalability, and performance


When starting an AI/ML project, it is vital to focus on the architecture, scalability,
and performance of the system to ensure long-term success. The architecture
should be designed with adherence to industry standards and best practices,
which include modularity, fault tolerance, and ease of maintenance.

Scalability is another key consideration, as it ensures that the system can
handle increasing volumes of data and higher computational demands without
degradation in performance. Implementing scalable solutions involves designing
for horizontal and vertical scaling, utilising cloud resources effectively, and
optimising data pipelines.

Performance, meanwhile, must be managed through efficient algorithms,
resource management, and real-time processing capabilities to meet the needs of
dynamic applications. By integrating these elements, you can create a system that
not only aligns with organisational goals but also delivers efficient, reliable, and
adaptable AI/ML solutions capable of evolving with technological advancements
and business growth.

Talent gap
The talent gap in the tech industry, particularly in data science, is a growing
concern as highlighted by the U.S. Bureau of Labor Statistics, which projects a
nearly 28% increase in jobs requiring data science skills by 2026. This escalating
demand for data science specialists creates a significant challenge, not only
for those developing AI/ML applications but also for the teams responsible for
managing and operating these systems once they are deployed. The shortage
of qualified professionals exacerbates the difficulty of filling critical roles and
maintaining high standards of performance and innovation. Addressing the talent
gap is crucial for ensuring that AI/ML projects are effectively implemented and
managed, and for sustaining the industry’s growth. When starting AI/ML projects,
organisations should consider investing in training, development, and strategic
hiring to bridge this gap and support the successful development of MLOps
initiatives.

To address the skills shortage, consider offloading management and
operational responsibility for OpenSearch through managed services. Canonical
can manage your OpenSearch deployments in the cloud of your choice. We
take care of deploying, hardening, patching, optimising and upgrading your
OpenSearch. Learn more here: https://canonical.com/data/opensearch/managed.

Trends and future
The generative AI and MLOps spaces are developing at lightning speed. Although the
pace of changes is truly breathtaking, we can project some trends that will shape
the future of generative AI developments.

Open source
Open source technology is becoming a significant driver of innovation due
to its flexibility and wide-ranging applications. The AI sector has increasingly
adopted open source practices, with many tools, libraries, and frameworks now
emerging from the community. Looking to the future, the focus is expected to
shift from standardised, out-of-the-box solutions towards providing adaptable
infrastructure that enables organisations to develop customised use cases. As the
industry evolves, this approach will challenge the traditional single-vendor model.
This shift will empower organisations to create solutions tailored to their needs
and unique use cases. Such a transition will facilitate more significant innovation
and agility, enabling companies to leverage open source technologies in ways that
best suit their operational requirements and strategic goals.

This trend will also challenge the reliance on proprietary solutions with limited
flexibility. The rise of adaptable, open source infrastructure is likely to disrupt
this model, encouraging a more diverse and integrated approach to technology
adoption. This shift will not only foster greater collaboration and innovation but
also drive the development of more cohesive and synergistic solutions that better
meet the dynamic needs of modern organisations.

At Canonical, we have been developing secure and well-supported open source
technologies for 20 years. Our commitment is to ensure that the open source
ecosystem remains both secure and comprehensive, supporting vector databases,
big data tools, and MLOps solutions to effectively meet organisational challenges.
You can explore more here: https://canonical.com/data.

Data storage and compute
The landscape of data storage and computation will evolve to better address the
new needs of organisations while minimising costs. To achieve this, data should be
stored in the most cost-effective cloud object storage solutions and retrieved on
demand for specific queries. This approach ensures that organisations only incur
costs for data retrieval rather than ongoing storage expenses.

Equally important is the optimisation of computation processes; by refining
query scopes in advance, organisations can reduce the need for extensive
data processing. This targeted approach enhances operational efficiency and
streamlines resource utilisation, ultimately contributing to more cost-effective
and agile data management strategies.

Canonical’s portfolio of open source data solutions such as MongoDB,
OpenSearch, PostgreSQL, MySQL, Kafka and Spark simplify operations at any
scale, with advanced features for scaling, security, backup and monitoring.

Hardware acceleration
Hardware acceleration is becoming increasingly essential for optimising the
performance of vector databases, LLMs and Generative AI projects. Recent
innovations in GPU technology, driven by advancements from NVIDIA, Intel, and
others, are crucial for handling the intensive computational demands of these
applications. These improvements ensure that systems with high-performance
requirements, such as vector databases, achieve substantial performance levels
on specialised hardware.

Furthermore, there is a growing shift towards ARM-based CPUs, which are
emerging as a viable alternative to the traditional x86 architecture. Processors
like AWS’s Graviton and GCP’s Ampere highlight this trend, offering enhanced
energy efficiency and cost-effectiveness. These ARM-based solutions not only
provide competitive performance but also address cost and energy concerns,
making them a compelling choice for modern computing needs. This evolution
in hardware underscores a strategic move towards more efficient and high-
performance computing solutions.

Advanced machine learning models
The field of machine learning continues to advance with machine learning
models becoming increasingly sophisticated to support diverse use cases and
enhance software embeddings. Notably, in the context of vector databases,
the emergence of advanced retrieval models built on encoders such as BERT marks
a significant development. These models are designed to improve efficiency and effectiveness
in vector-based searches, addressing limitations of traditional methods and
enhancing similarity search capabilities.

Vector retrieval accuracy is gaining paramount importance with the growing
integration of vector databases in production environments and cutting-edge RAG
applications. As the technology evolves, improving search quality and precision is a
key focus. Innovations like the ColBERT retrieval model and advanced hybrid vector
search techniques are central to this effort, helping to minimise information loss
and optimise domain-specific retrieval processes. These advancements represent
a critical step forward in ensuring that vector databases can deliver accurate and
efficient search results.
Conclusion
This guide has introduced vector databases, highlighted their importance in
the GenAI ecosystem and offered insights into their applications within various
industries, including finance, telecommunications, retail, e-commerce, and the
public sector. Additionally, we have discussed the key considerations for planning
and executing AI projects, as well as explored future trends in the field of AI and
machine learning.

For organisations looking to advance their AI and machine learning initiatives,
establishing a solid data foundation is essential. OpenSearch offers a reliable
and scalable solution for this purpose. Canonical’s team is ready to support you
throughout this journey, ensuring your projects run and scale successfully.

Next steps
To explore Canonical’s AI and data portfolios, visit: canonical.com/ai and
canonical.com/data.

To discuss how we can support your team, please get in touch.

To learn more, watch our vector databases for GenAI webinar.

© Canonical Limited 2023. Ubuntu, Kubuntu, Canonical and their associated logos are the registered trademarks of Canonical Ltd. All
other trademarks are the properties of their respective owners. Any information referred to in this document may change without
notice and Canonical will not be held responsible for any such changes.

Canonical Limited, Registered in Isle of Man, Company number 110334C, Registered Office: 2nd Floor, Clarendon House, Victoria
Street, Douglas IM1 2LN, Isle of Man, VAT Registration: GB 003 2322 47
