Data For GenAI
Introduction
Project considerations
Building your use case
Data storage and cleanliness
Security and compliance
Architecture, scalability, and performance
Talent gap
Conclusion
Next steps
Introduction
The generative AI and MLOps spaces are undergoing rapid change, having moved
from the realm of specialist research to mainstream accessibility in just a couple
of years. One of the most exciting aspects of generative AI is its versatility across
industries. Whether in real estate, hospitality, retail, or telecommunications,
organisations around the globe are discovering ways to take advantage of GenAI to
drive development, growth, and innovation.
GenAI has a wide variety of use cases: it can be used for knowledge creation,
recommendation systems, personalised ads and communication, sales trend
prediction, customer feedback analysis, and much more.
Despite its potential, many organisations struggle to begin their GenAI
journey because of the many complexities and technologies involved. Launching a
GenAI project requires an MLOps platform to manage activities such as model fine-
tuning, data cleaning, and automating machine learning workloads. Additionally,
navigating the array of libraries and frameworks can be challenging.
But there is another key enabler behind GenAI that is often overlooked: vector
databases. Modern vector databases, optimised for storing and retrieving vector
representations of data, are crucial to the successful deployment of GenAI models
in production applications.
Introduction to vector databases
Although vector databases are not new to the market, they have come into the
spotlight amid the rapid development of GenAI and LLMs. According to a report by
MarketsandMarkets, the global vector database market is projected to reach USD
4.3 billion by 2028, representing a compound annual growth rate of 23.3%.
Vector databases store and index vector representations of text documents, rich
media, audio, geospatial coordinates, tables, and graphs for fast retrieval and
similarity search. These vectors represent points in N-dimensional spaces,
effectively encapsulating the context of an asset. Search tools can probe these
spaces with low-latency queries to find similar assets among neighbouring data
points. They typically do this by exploiting the efficiency of different methods for
obtaining, for example, the k-nearest neighbours (k-NN) from an index of vectors.
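As a minimal illustration of the idea, here is a brute-force k-NN sketch in Python. This is not how a production vector database indexes data (real systems use approximate structures such as HNSW graphs to avoid a full scan); the document names and two-dimensional vectors are purely illustrative.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two points in N-dimensional space
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(index, query, k):
    # Brute-force k-NN: rank every stored vector by distance to the query.
    # Vector databases avoid this full scan with approximate indexes.
    ranked = sorted(index.items(), key=lambda item: euclidean(item[1], query))
    return [name for name, _ in ranked[:k]]

index = {
    "doc-a": [0.9, 0.1],
    "doc-b": [0.8, 0.2],
    "doc-c": [0.1, 0.9],
}
print(knn(index, [1.0, 0.0], k=2))  # → ['doc-a', 'doc-b']
```

The query vector [1.0, 0.0] sits closest to doc-a and doc-b, so those two are returned as its nearest neighbours.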
OpenSearch, for example, is a leading open source vector database. It offers the
k-NN plugin and augments this functionality by providing your conversational
applications with other essential features, such as fault tolerance, resource access
controls, and a powerful query engine.
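For illustration, a k-NN-enabled index in OpenSearch can be created with a mapping along these lines. The index name, field names, and vector dimension are placeholders chosen for this sketch; consult the k-NN plugin documentation for the options available in your OpenSearch version.

```shell
curl -X PUT "localhost:9200/docs-index" -H 'Content-Type: application/json' -d'
{
  "settings": { "index.knn": true },
  "mappings": {
    "properties": {
      "embedding": { "type": "knn_vector", "dimension": 384 },
      "text":      { "type": "text" }
    }
  }
}'
```

The `dimension` must match the output size of whichever embedding model you use to vectorise your documents.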
To explore how OpenSearch can further enhance your AI infrastructure for your
GenAI needs, visit the page: https://canonical.com/data/opensearch.
Vector databases for GenAI: use cases
Vector databases enable a variety of use cases. In this section, we explore specific
use cases across different sectors, illustrating how and when vector databases can
be effectively utilised to maximise the potential of generative AI.
Consider the query: “What is Ubuntu Pro?” Without retrieval-augmented
generation (RAG), the response might simply state that Ubuntu Pro is a
subscription-based service that offers various benefits. However, with RAG,
enabled by vector databases through the k-NN plugin, the response would be more
detailed and precise. It would not only describe Ubuntu Pro as a subscription
model but also highlight its specific value propositions, such as security updates,
compliance packages, service level agreements (SLAs), and additional benefits.
Ubuntu Pro is an additional stream of security updates and packages that meet
compliance requirements, such as FIPS or HIPAA, on top of an Ubuntu LTS. It
provides an SLA for security fixes for the entire distribution (‘main and universe’
packages) for ten years, with extensions for industrial use cases. Ubuntu Pro is
free for personal use, offering the full suite of Ubuntu Pro capabilities on up to 5
machines.
Notice the difference between the two answers: RAG can drastically
improve the accuracy of an LLM’s responses.
Let’s now consider an example of how you can build your own chatbot with RAG
using OpenSearch.
To build your own chatbot, start by aggregating your source documents. First,
gather the content and context that you want your application to use for
providing accurate answers. This could include various documents or data scraped
from multiple web pages. Load this content into memory and divide it into chunks
to create embeddings from the selected documents, which you will then upload
to your vector index.
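The chunking step above can be sketched as follows. The chunk size and overlap are illustrative choices for this sketch, not recommendations; real applications tune them to the embedding model's context window.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    # Split a document into overlapping word-based chunks so that
    # context spanning a chunk boundary is not lost.
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
    return chunks

document = "Ubuntu Pro is a subscription that adds security coverage " * 100
chunks = chunk_text(document)
print(len(chunks))  # → 6 chunks for this 900-word document
```

Each resulting chunk would then be passed to an embedding model, and the vector uploaded to the index alongside the chunk's text.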
Next, generate embeddings for these text chunks and store them in the
vector index. You can then use similarity search to retrieve documents that
provide context for your queries. The search engine will use approximate k-NN
search with a similarity measure, such as cosine similarity, to find and return the
most relevant documents based on your question.
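The retrieval step can be sketched with cosine similarity over toy embeddings. In a real application the query vector would come from an embedding model and the search would run against OpenSearch's approximate k-NN index rather than this in-memory scan; the chunk ids and three-dimensional vectors here are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(index, query_vec, k=2):
    # Return the k chunk ids whose embeddings are most similar to the query.
    ranked = sorted(index, key=lambda d: cosine_similarity(d["vec"], query_vec),
                    reverse=True)
    return [d["id"] for d in ranked[:k]]

index = [
    {"id": "pro-overview",  "vec": [0.9, 0.1, 0.0]},
    {"id": "pro-security",  "vec": [0.8, 0.3, 0.1]},
    {"id": "release-notes", "vec": [0.0, 0.1, 0.9]},
]
print(retrieve(index, [1.0, 0.2, 0.0]))  # → ['pro-overview', 'pro-security']
```

The two chunks pointing in roughly the same direction as the query vector are returned; the unrelated chunk is not.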
To manage and integrate these components, use a tool like LangChain, which
helps orchestrate the infrastructure and ensure smooth integrations. Once set
up, you can start querying your application, specifying the context needed for
accurate responses.
Finally, perform a few inferences with the language model using the context
documents retrieved from your search engine. This process will yield responses
that are more contextually relevant and insightful, reflecting the most up-to-date
information.
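Putting the pieces together, the inference step amounts to prepending the retrieved chunks to the user's question before calling the model. The `generate` call shown in the comment is a placeholder for whatever LLM client you use, not a real API; the prompt template is one simple possibility among many.

```python
def build_rag_prompt(question, context_chunks):
    # Assemble a prompt that grounds the model in retrieved context.
    context = "\n\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

chunks = [
    "Ubuntu Pro adds a stream of security updates on top of an Ubuntu LTS.",
    "Ubuntu Pro is free for personal use on up to 5 machines.",
]
prompt = build_rag_prompt("What is Ubuntu Pro?", chunks)
print(prompt)
# response = generate(prompt)  # placeholder for your LLM client call
```

Because the model sees the retrieved chunks verbatim, its answer can cite specifics (such as the free personal-use tier) that it might otherwise omit or get wrong.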
This example shows how RAG can enhance data retrieval, improve knowledge
sharing, and enrich the results of your LLMs for more accurate and relevant
answers.
Next, let’s look at some example use cases from financial services,
telecommunications, retail and ecommerce and the public sector.
Financial Services
Fraud detection. Fraud detection is critical in the financial industry, where
even a small instance of fraud can have significant repercussions. Traditional
methods often struggle with the sheer volume and complexity of transaction
data. Vector databases store transaction data as high-dimensional vectors,
allowing for more sophisticated analysis. Generative AI models can then compare
current transactions against historical patterns to detect anomalies indicative of
fraudulent behaviour. For instance, a bank can use a generative AI model to detect
unusual transaction patterns in real-time. The model queries a vector database
that contains vectors representing historical transaction data, allowing it to flag
suspicious activities based on similarity to known fraudulent patterns.
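As an illustrative sketch of the similarity-based flagging described above (the vectors, feature meanings, and threshold are synthetic, not a real scoring model), a transaction can be flagged when it lies far from every vector of known-legitimate historical behaviour:

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_anomalous(transaction_vec, history, threshold=0.5):
    # A transaction is suspicious when it is far from every vector of
    # historical legitimate behaviour; the threshold is illustrative
    # and would be tuned on real data.
    nearest = min(distance(transaction_vec, h) for h in history)
    return nearest > threshold

# Toy 2-D vectors, e.g. (normalised amount, hour of day / 24)
history = [[0.1, 0.3], [0.2, 0.35], [0.15, 0.4]]
print(is_anomalous([0.12, 0.33], history))  # close to history → False
print(is_anomalous([0.95, 0.99], history))  # far from history → True
```

In production, the `history` lookup would be a k-NN query against the vector database rather than an in-memory scan, and transaction vectors would have many more features.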
Credit scoring and risk assessment. Accurate credit scoring and risk assessment
require the integration of diverse data points such as credit history, income,
spending behaviour, and much more. Vector databases allow for the efficient
storage and retrieval of this complex data, facilitating sophisticated analysis by
generative AI models to generate risk scores and make lending recommendations.
Telecommunications
Customer support automation. Customer support automation involves providing
accurate and context-aware responses to customer inquiries. Vector databases
store past customer interactions and other relevant data as high-dimensional
vectors, which generative AI models can query to produce appropriate responses.
For instance, a telecom company deploys a chatbot powered by generative AI that
queries a vector database of previous customer interactions and service records.
The chatbot generates responses that are not only relevant to the current query
but also consistent with the customer’s past preferences.
Enhanced visual search. Enhanced visual search uses AI to match product images
with relevant items in an e-commerce catalogue. Vector databases turn product
images and descriptions into vectors, enabling generative AI models to perform
accurate and fast visual searches. An online marketplace that implements a visual
search feature powered by a generative AI model could enable customers to find
products by uploading photos or searching with visual similarity.
Public Sector
Citizen services automation. Public sector organisations can use AI to handle
inquiries and service requests from citizens. Vector databases store past
interactions, service records, feedback and service appointment scheduling,
enabling generative AI models to give accurate and context-aware responses.
Common themes
You probably noticed that many of these use cases have a lot of similarities when
it comes to the role of vector databases. Let’s summarise the main common
themes.
• Vector databases store, manage, and query high-dimensional data, which is
crucial for AI applications that rely on complex, multi-feature datasets.
• AI models often require the ability to find similar data points or the nearest
neighbours in a high-dimensional space, which vector databases optimise through
efficient indexing and querying mechanisms.
• Vector databases are built to handle large-scale datasets and support real-time
querying, which is essential for AI applications that demand quick response times
and can scale with growing data volumes.
Project considerations
The use cases mentioned above can offer significant benefits to companies across
various industries. However, when embarking on an AI/ML project, it is essential to
consider several key factors.
Security and compliance, in particular, are increasingly important and will
continue to shape the further development of the data and AI domains.
Talent gap
The talent gap in the tech industry, particularly in data science, is a growing
concern as highlighted by the U.S. Bureau of Labor Statistics, which projects a
nearly 28% increase in jobs requiring data science skills by 2026. This escalating
demand for data science specialists creates a significant challenge, not only
for those developing AI/ML applications but also for the teams responsible for
managing and operating these systems once they are deployed. The shortage
of qualified professionals exacerbates the difficulty of filling critical roles and
maintaining high standards of performance and innovation. Addressing the talent
gap is crucial for ensuring that AI/ML projects are effectively implemented and
managed, and for sustaining the industry’s growth. When starting AI/ML projects,
organisations should consider investing in training, development, and strategic
hiring to bridge this gap and support the successful development of MLOps
initiatives.
Trends and future
The generative AI and MLOps spaces are developing at lightning speed. Although
the pace of change is truly breathtaking, we can project some trends that will
shape the future of generative AI development.
Open source
Open source technology is becoming a significant driver of innovation due
to its flexibility and wide-ranging applications. The AI sector has increasingly
adopted open source practices, with many tools, libraries, and frameworks now
emerging from the community. Looking to the future, the focus is expected to
shift from standardised, out-of-the-box solutions towards providing adaptable
infrastructure that enables organisations to develop customised use cases. As the
industry evolves, this approach will challenge the traditional single-vendor model.
This shift will empower organisations to create solutions tailored to their needs
and unique use cases. Such a transition will facilitate more significant innovation
and agility, enabling companies to leverage open source technologies in ways that
best suit their operational requirements and strategic goals.
This trend will also challenge the reliance on proprietary solutions with limited
flexibility. The rise of adaptable, open source infrastructure is likely to disrupt
this model, encouraging a more diverse and integrated approach to technology
adoption. This shift will not only foster greater collaboration and innovation but
also drive the development of more cohesive and synergistic solutions that better
meet the dynamic needs of modern organisations.
Data storage and compute
The landscape of data storage and computation will evolve to better address the
new needs of organisations while minimising costs. To achieve this, data should be
stored in cost-effective cloud object storage and retrieved on demand for specific
queries. This approach keeps ongoing storage expenses low, with organisations
paying for compute and retrieval only when queries actually need the data.
Hardware acceleration
Hardware acceleration is becoming increasingly essential for optimising the
performance of vector databases, LLMs and Generative AI projects. Recent
innovations in GPU technology, driven by advancements from NVIDIA, Intel, and
others, are crucial for handling the intensive computational demands of these
applications. These improvements ensure that systems with high-performance
requirements, such as vector databases, achieve substantial performance levels
on specialised hardware.
Next steps
To explore Canonical’s AI and data portfolios, visit: canonical.com/ai and
canonical.com/data.
© Canonical Limited 2023. Ubuntu, Kubuntu, Canonical and their associated logos are the registered trademarks of Canonical Ltd. All
other trademarks are the properties of their respective owners. Any information referred to in this document may change without
notice and Canonical will not be held responsible for any such changes.
Canonical Limited, Registered in Isle of Man, Company number 110334C, Registered Office: 2nd Floor, Clarendon House, Victoria
Street, Douglas IM1 2LN, Isle of Man, VAT Registration: GB 003 2322 47