Using Large Language Models to improve your Search Engine PyData Global 2023

12-08, 19:30–21:00 (UTC), LLM Track

Everywhere you look, people are talking about Large Language Models (LLMs).
Are you feeling a bit overwhelmed and looking for a simple intro and a guided application of LLMs?

Many internet companies have a search engine.
In this tutorial, we will cover practical use cases of LLMs for improving a search engine, such as:

1) Understanding user intent in a query
2) Checking if a query is relevant to a document
3) Fine-tuning LLMs with a custom corpus
4) Updating the search engine documents with LLM knowledge

This tutorial is meant to be beginner friendly and will focus on practical use cases.
No prior experience with search or advanced machine learning is needed.
A Google Colab notebook and an e-commerce dataset will be provided.

In the last year, generative Large Language Models like ChatGPT and Llama 2 have sparked everyone’s imagination about possible use cases.

At the pace things are moving, it is easy to get lost in how to apply these models.

To motivate the concepts, we will explore how LLMs can be used to improve a search engine.

Almost every company with a visible web presence has a search engine.

The tutorial covers how to improve a search engine with LLMs.
The content is meant to be beginner friendly and focuses on the practical aspects.

Proposed Agenda:

1) Introduction (10 min)

Overview of the dataset we are using
A simple token-based search engine and the gaps we need to close
Lab: explore the dataset and the Google Colab setup
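
To make the "simple token-based search engine" and its gaps concrete, here is a minimal sketch of an inverted index with token-overlap ranking. The `PRODUCTS` catalog and the function names are made up for illustration; this is not the tutorial's dataset or code.

```python
from collections import defaultdict

# Hypothetical mini-catalog; the tutorial's e-commerce dataset is not shown here.
PRODUCTS = {
    1: "red running shoes for men",
    2: "blue denim jacket",
    3: "wireless bluetooth headphones",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many query tokens they contain."""
    scores = defaultdict(int)
    for token in tokenize(query):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


index = build_index(PRODUCTS)
print(search(index, "running shoes"))  # exact tokens match product 1
print(search(index, "sneakers"))       # no shared token, so nothing is returned
```

The second query shows the kind of gap a token matcher leaves: "sneakers" and "running shoes" mean nearly the same thing to a shopper, but they share no tokens.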

2) Fundamentals (20 min)

Brief intro to generative language models
How companies are using LLMs for search
Prompting techniques: zero-shot, few-shot, chain of thought
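
As a rough illustration of the three prompting techniques, the sketch below builds a zero-shot, a few-shot, and a chain-of-thought prompt for the same task. The task (classifying a query's product category), the wording of the prompts, and the example labels are assumptions made for this sketch; no model is called.

```python
def zero_shot_prompt(query):
    """Ask for the answer directly, with no worked examples."""
    return (
        "Classify the product category of this search query.\n"
        f"Query: {query}\n"
        "Category:"
    )


def few_shot_prompt(query, examples):
    """Prepend a handful of labeled (query, category) examples."""
    blocks = ["Classify the product category of each search query."]
    for example_query, category in examples:
        blocks.append(f"Query: {example_query}\nCategory: {category}")
    blocks.append(f"Query: {query}\nCategory:")
    return "\n\n".join(blocks)


def chain_of_thought_prompt(query):
    """Ask the model to reason step by step before answering."""
    return (
        "Classify the product category of this search query.\n"
        f"Query: {query}\n"
        "Think step by step about what the shopper wants, "
        "then give the category on a final line starting with 'Category:'."
    )


# Hypothetical labeled examples, invented for illustration.
EXAMPLES = [
    ("iphone 14 case", "Phone Accessories"),
    ("mens running shoes", "Footwear"),
]

print(few_shot_prompt("usb c charger", EXAMPLES))
```

The only difference between the three is what surrounds the query: nothing, a few solved examples, or an instruction to reason before answering.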

3) Applications of Prompting (25 min)

Extracting the user intent of a query
Generating relevant alternate queries
Checking if a query is relevant to a document
Lab: explore prompting techniques to solve the above use cases
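
One possible shape for the intent-extraction and relevance-check use cases is sketched below. The prompt wording, the JSON keys, and `fake_llm` (a stand-in that returns canned replies so the sketch runs offline) are all assumptions; a real lab would call an actual model in its place.

```python
import json


def intent_prompt(query):
    """Ask the model to return the query's intent as a JSON object."""
    return (
        "Extract the shopper's intent from the search query as a JSON object "
        'with the keys "product" and "color".\n'
        f"Query: {query}\n"
        "JSON:"
    )


def parse_intent(raw):
    """Parse the model's JSON reply; fall back to an empty intent on bad output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return data if isinstance(data, dict) else {}


def relevance_prompt(query, document):
    """Ask the model whether a document answers the query."""
    return (
        "Is the document relevant to the search query? Answer yes or no.\n"
        f"Query: {query}\n"
        f"Document: {document}\n"
        "Answer:"
    )


def parse_relevance(raw):
    """Treat a reply that starts with 'yes' (any case) as relevant."""
    return raw.strip().lower().startswith("yes")


def fake_llm(prompt):
    """Hypothetical stand-in for a real model call; returns canned replies."""
    if "Answer yes or no" in prompt:
        return "Yes"
    return '{"product": "shoes", "color": "red"}'


intent = parse_intent(fake_llm(intent_prompt("red running shoes")))
relevant = parse_relevance(
    fake_llm(relevance_prompt("red running shoes", "red running shoes for men"))
)
print(intent, relevant)
```

Parsing defensively matters here because model output is free text: a reply that is not valid JSON, or not a plain yes/no, should degrade to a safe default rather than crash the search pipeline.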

4) Fine-tuning LLMs (25 min)

Gentle intro to fine-tuning with Parameter-Efficient Fine-Tuning (PEFT)
How to apply LLM prompt output to improve search results
Lab: explore results from the fine-tuned model
Lab: incorporate all the prompt outputs to improve the search engine
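
One way prompt output could feed back into search is query expansion: run the original query, then append new hits from LLM-generated alternate queries. The sketch below assumes the alternates have already been produced (they are hard-coded here), and the catalog and function names are hypothetical; it is not necessarily how the lab combines the outputs.

```python
def token_search(docs, query):
    """Return ids of documents sharing at least one token with the query, best first."""
    query_tokens = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(query_tokens & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in ranked]


def expanded_search(docs, query, alternate_queries):
    """Search the original query first, then append new hits from each alternate."""
    results = []
    for candidate in [query, *alternate_queries]:
        for doc_id in token_search(docs, candidate):
            if doc_id not in results:
                results.append(doc_id)
    return results


# Hypothetical catalog, invented for illustration.
DOCS = {
    1: "red running shoes for men",
    2: "blue denim jacket",
    3: "white leather trainers",
}

# Pretend an LLM produced these alternates for the query "sneakers".
ALTERNATES = ["running shoes", "trainers"]

print(token_search(DOCS, "sneakers"))
print(expanded_search(DOCS, "sneakers", ALTERNATES))
```

Hits from the original query stay on top, so expansion can only add results in this design; a relevance check could then be used to filter the added ones.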

5) Questions (10 min)

Prior Knowledge Expected –

No previous knowledge expected

Nidhin Pattaniyil

Machine Learning Engineer working on Search

Ravi

AI engineer.

Mustafa Zengin

Data Scientist at Walmart
