Just run the LLM five times and choose the most common answer

People seriously underestimate the value of just running an LLM multiple times and taking the most common answer.

Learn

Self-Consistency Sampling

Since LLMs are non-deterministic, you can get better performance by just generating multiple responses and picking the most common answer, trading cost for quality.

Experience

Mike Taylor

Built a 50-person growth agency.
Premium subscription required.
Python experience recommended.
1. Scenario
UPSERT OFFICE - LATE MORNING
You've just joined the team at Upsert, and today Sally, the Head of Marketing, has asked you to run the LLM (Large Language Model) five times and choose the most common answer. This technique, called self-consistency sampling, improves the accuracy and quality of AI outputs: by generating multiple responses and selecting the most common one, we increase the odds of obtaining a correct response. It's a simple yet effective way to enhance performance. So grab your computer and let's get started!
Sally Valentine
at Upsert

Hey there! I need your help with something important. We've been using the LLM for our AI outputs, but sometimes we get inconsistent answers. It's causing a lot of confusion and errors in our work. We need to find a way to improve the accuracy and quality of our responses. Can you run the LLM five times and choose the most common answer? Let's see if this technique called self-consistency sampling can help us out.

This course is a work of fiction. Unless otherwise indicated, all the names, characters, businesses, data, places, events and incidents in this course are either the product of the author's imagination or used in a fictitious manner. Any resemblance to actual persons, living or dead, or actual events is purely coincidental.

2. Brief

Self-Consistency Sampling: A Secret Trick for Improving AI Performance

In the world of artificial intelligence (AI), there are numerous techniques and strategies used to enhance the performance of AI models. One such technique that is not widely known or utilized is self-consistency sampling. While it may sound complicated, self-consistency sampling is a powerful tool that can greatly benefit AI production.

Self-consistency sampling involves generating multiple responses to the same question or prompt and then choosing one of them based on some criterion: taking the most common response, summarizing or aggregating the responses, or scoring each response with an evaluation metric and keeping the best.

The concept of self-consistency sampling is not new and is well-known in the academic world. However, its application in AI production is not as prevalent as it should be. By leveraging the non-deterministic nature of language models, self-consistency sampling allows for the generation of multiple responses, increasing the chances of obtaining the correct answer in aggregate.

To better understand self-consistency sampling, consider a canonical example. Instead of generating a single response to a question, this technique generates three. Suppose two of the three responses are correct and one is incorrect: by taking the majority answer, we arrive at the correct result, even though individual responses vary.
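The two-of-three vote above can be sketched in a few lines of Python. The `samples` list stands in for three hypothetical LLM completions of the same prompt; the function and values are illustrative, not from the course notebook:

```python
from collections import Counter

def most_common_answer(samples):
    """Majority vote: return the answer that appears most often."""
    answer, count = Counter(samples).most_common(1)[0]
    return answer

# Two of three hypothetical completions agree; the vote picks the majority.
samples = ["Paris", "Paris", "Lyon"]
print(most_common_answer(samples))  # prints "Paris"
```

Note that `Counter.most_common(1)` breaks ties arbitrarily by insertion order, which is one reason odd sample counts (three, five) are the usual choice.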

Implementing self-consistency sampling is easier with asynchronous calls to the OpenAI API, which let multiple requests run concurrently. By running the samples in parallel rather than sequentially, we keep latency close to that of a single call. This matters when generating three or five samples, since running them one after another would multiply the processing time.
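A minimal sketch of that async pattern, with the actual API call stubbed out so the example is self-contained; in practice `sample_once` would call an async client such as OpenAI's, and `sample_once` and `self_consistency` are hypothetical names:

```python
import asyncio
from collections import Counter

async def sample_once(prompt: str) -> str:
    # Stand-in for a real async LLM call (e.g. via OpenAI's async client);
    # stubbed here so the sketch runs without an API key.
    await asyncio.sleep(0)
    return "Paris"

async def self_consistency(prompt: str, n: int = 5) -> str:
    # Fire all n samples concurrently, then take the majority answer.
    # Total latency is roughly one call, not n sequential calls.
    answers = await asyncio.gather(*(sample_once(prompt) for _ in range(n)))
    return Counter(answers).most_common(1)[0][0]

print(asyncio.run(self_consistency("What is the capital of France?")))
```

`asyncio.gather` is what makes the cost-for-quality trade attractive: five samples cost five times the tokens but take roughly the wall-clock time of one.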

To demonstrate the use of self-consistency sampling, a code snippet utilizing asynchronous calls is provided in the transcript. The code generates multiple responses and then gathers them to find the most common answer. Even in the presence of errors or inconsistencies, the right answer can still be obtained through this technique.

Moreover, self-consistency sampling allows for flexibility in adjusting the number of samples generated. By increasing the number of samples, the likelihood of obtaining a successful outcome also increases. This trade-off between quality and cost can be advantageous, as generating more samples does not significantly impact latency but improves the overall performance of the AI model.
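To see why more samples help, assume (optimistically) that each sample is independently correct with probability p; the chance that a majority of n samples is correct is then a binomial tail sum. The 70% per-sample accuracy below is an illustrative assumption, not a figure from the course:

```python
from math import comb

def majority_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n independent samples
    is correct, given per-sample accuracy p (use odd n to avoid ties)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# A 70%-accurate sampler rises to ~84% with 5-way majority voting.
print(round(majority_accuracy(0.7, 5), 3))  # prints 0.837
```

Real LLM samples are correlated rather than independent, so the actual gain is smaller, but the direction of the trade-off holds: more samples, higher odds of a correct majority.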

Aside from selecting the most common answer, self-consistency sampling can also incorporate evaluation metrics to assess the correctness of the responses. This adds an additional layer of validation to help ensure the accuracy of the generated output.

3. Tutorial

  Okay. Self-consistency sampling. This is something that I use all the time, which I don't actually see a lot of other people using. I think it's a little bit of a secret trick. It's actually well known in the academic world, I think, but in production it has huge benefits.

SelfConsistencySampling.ipynb
4. Exercises
5. Certificate
