Stop my AI hallucinating

Working with AI is powerful and useful, but AI models have a tendency to confidently make things up. That’s a problem, but there is an emerging solution…

Learn

Vector Databases

One way to give an AI access to relevant external information is to do a vector search, based on similarity to the question asked, and return the results in the prompt...

Experience

Mike Taylor

Built a 50-person growth agency.
Premium subscription required.
Python experience recommended.
1. Scenario
Vexnomics Content Meeting
You’re trying to use ChatGPT for writing content, but it keeps making stuff up. It doesn’t have much of your domain expertise in the training data, so you need a way to let it look up the relevant context.
Charlotte Cook
at Vexnomics

Have you heard of vector databases?

They work by searching based on similarity rather than just “text contains X”.

You get embeddings from the AI model. These are like the location of a concept or piece of text relative to other concepts in the model

Then you put them into a vector database like Pinecone

That lets you search by meaning

It’s useful for a whole host of things, but in our situation it can help the AI stop hallucinating by giving it the right context for the topic we ask it to write a blog post about

This course is a work of fiction. Unless otherwise indicated, all the names, characters, businesses, data, places, events and incidents in this course are either the product of the author's imagination or used in a fictitious manner. Any resemblance to actual persons, living or dead, or actual events is purely coincidental.

2. Brief

Vector databases are an important tool for AI development. By providing a way to store and search for meaning, rather than specific keywords, vector databases can enable more accurate results when using AI tools.

Vector databases use embeddings to store information. Embeddings are coordinates in a multi-dimensional space: a simplified 2D embedding, for example, would have just an X and a Y coordinate. The embeddings used in real AI tools have far more dimensions, typically hundreds to a few thousand (OpenAI's text-embedding-ada-002 model, for instance, produces 1,536). The closeness of two points in this space indicates how similar the things they represent are. For example, the word 'dog' and the word 'cat' would be close together, while 'elephant' would be further away.
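To make the idea of "closeness" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The 2D coordinates below are made up purely for illustration; real embeddings come from a model and have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 2D coordinates for illustration only (an assumption, not model output).
toy_embeddings = {
    "dog": (0.9, 0.8),
    "cat": (0.8, 0.9),
    "elephant": (0.2, 0.9),
}

dog_cat = cosine_similarity(toy_embeddings["dog"], toy_embeddings["cat"])
dog_elephant = cosine_similarity(toy_embeddings["dog"], toy_embeddings["elephant"])

# A higher score means the two points sit closer together in direction.
print(f"dog vs cat:      {dog_cat:.3f}")
print(f"dog vs elephant: {dog_elephant:.3f}")
```

With these toy values, 'dog' scores higher against 'cat' than against 'elephant', which mirrors the example in the text.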

Pinecone is a vector database that can be used to store and search for meaning. To follow this example, you will need an OpenAI key to generate the embeddings, which can be created through the OpenAI website, as well as a Pinecone API key to connect to the database. You will also need to download a dataset; this example used a YouTube transcription dataset. After downloading the dataset, you will need to join the text together, and then create an index. Building the index is the heavy lifting of the process, and can take a while: in this example it took 28 minutes with 400 pieces of text.
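The "join the text together" step can be sketched as follows. This is one plausible reading, not necessarily what the course notebook does: short transcript snippets are merged into larger, overlapping chunks so that each stored piece of text carries enough context. The function name `merge_snippets`, the `window` and `stride` values, and the sample snippets are all assumptions for illustration.

```python
def merge_snippets(snippets, window=4, stride=2):
    """Join consecutive snippets into overlapping chunks.

    Each chunk contains up to `window` snippets, and a new chunk
    starts every `stride` snippets.
    """
    chunks = []
    for start in range(0, len(snippets), stride):
        group = snippets[start:start + window]
        chunks.append({
            "id": f"chunk-{start}",
            "text": " ".join(group),
        })
        if start + window >= len(snippets):
            break  # this chunk already reaches the end of the transcript
    return chunks

# Hypothetical transcript snippets standing in for the downloaded dataset.
snippets = [
    "okay let's learn about",
    "vector databases and Pinecone",
    "which is a hosted vector database",
    "this is really powerful stuff",
    "because most new AI tools",
    "use a vector database in some way",
]

chunks = merge_snippets(snippets, window=4, stride=2)
for chunk in chunks:
    print(chunk["id"], "->", chunk["text"])
```

Each resulting chunk would then be embedded and written to the index along with its id. The overlap between chunks is a design choice that reduces the chance of a relevant sentence being split across two entries.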

Once the index is created, you can then search for content using the query with context feature. This feature retrieves the passages most similar to your question and injects them into the prompt as context, which can lead to more accurate results. For example, if you were to ask “what training methods should I use for transformers when I only have pairs of related sentences?” without context, you may get a generic answer. However, when you ask the same question using the query with context feature, you are likely to get a much more accurate result, grounded in the relevant passages.
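The retrieve-then-inject pattern described above can be sketched in plain Python. This is a simplified illustration under stated assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model, the in-memory list replaces the Pinecone index, and the passages and prompt wording are hypothetical.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, documents, top_k=2):
    """Return the top_k documents most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, contexts):
    """Place the retrieved passages ahead of the question."""
    context_block = "\n\n---\n\n".join(contexts)
    return (
        "Answer the question based on the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Hypothetical passages; in the course these would come from the transcript index.
documents = [
    "With only pairs of related sentences, a ranking loss is one option for training sentence transformers.",
    "Pinecone is a hosted vector database.",
    "Elephants are the largest land animals.",
]

query = "What training methods should I use for transformers when I only have pairs of related sentences?"
prompt = build_prompt(query, retrieve(query, documents, top_k=1))
print(prompt)
```

The resulting prompt is what would be sent to the language model. Because the most relevant passage is included, the model can draw on it rather than relying only on its training data.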

Overall, vector databases can be a very useful tool for AI development because they store and search by meaning rather than by specific keywords. Pinecone is one of the vector databases that can be used: by downloading a dataset, creating an index, and using the query with context feature, you can get more accurate results from your AI tools.

3. Tutorial

00:01 Okay, let's learn about vector databases and Pinecone, which is a hosted vector database. This is really powerful stuff because most of the new AI tools use a vector database in some way.

Pinecone.ipynb
4. Exercises
5. Certificate
