Retrieval Augmented Generation
mistique allows users to ask questions about the Mistral7B research paper through a Retrieval Augmented Generation (RAG) pipeline. The application uses ChromaDB as a vector database, which is queried to find information relevant to the user's question. To populate the vector database, Amazon Bedrock embeddings are generated from overlapping chunks of text extracted from the source PDF. Cosine similarity is used to compute the relevance between the query embedding (the embedding of the user's question) and the embeddings stored in ChromaDB. This functionality is wrapped in a FastAPI endpoint deployed as an AWS Lambda function. The frontend lets users ask questions about the Mistral7B paper and see the relevant sections of the paper that were used to answer them. The frontend was built using Next.js, Tailwind CSS, and Shadcn, and deployed to Vercel.
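The core retrieval steps, splitting the source text into overlapping chunks and ranking stored embeddings by cosine similarity against the query embedding, can be sketched in plain Python. Function names, chunk sizes, and the in-memory ranking here are illustrative, not the project's actual code; in the deployed app the embeddings come from Amazon Bedrock and the similarity search is handled by ChromaDB:

```python
import math

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so context isn't lost at chunk boundaries."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Relevance score between a query embedding and a stored chunk embedding."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb: list[float], chunk_embs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k stored chunks most similar to the query."""
    scored = sorted(enumerate(chunk_embs),
                    key=lambda pair: cosine_similarity(query_emb, pair[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]
```

The overlap between consecutive chunks means a sentence straddling a chunk boundary still appears intact in at least one chunk, which keeps the retrieved context readable.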
Software Engineer
Solo Project
Sept 2024 - Nov 2024
AWS
Docker
FastAPI
Question answer UI showing RAG sources.
FastAPI endpoints.
Asking a question using the /submit_query endpoint.
Getting a query_id back as the backend processes the request asynchronously.
Passing the query_id to the /get_query endpoint.
Getting the response back from the /get_query endpoint.
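The asynchronous flow shown in the screenshots above, submitting a question, receiving a query_id immediately, then fetching the result via /get_query, can be sketched without the FastAPI and Lambda layers. The in-memory store and function names are illustrative assumptions; the deployed version would persist query state somewhere shared across Lambda invocations:

```python
import uuid

# In-memory stand-in for the query store; the deployed version would need
# persistent storage so separate Lambda invocations can share state.
QUERIES: dict[str, dict] = {}

def submit_query(question: str) -> str:
    """Mirrors POST /submit_query: register the question and return a query_id."""
    query_id = str(uuid.uuid4())
    QUERIES[query_id] = {"question": question, "status": "processing", "answer": None}
    # The RAG pipeline (retrieve relevant chunks, call the LLM) would run
    # asynchronously; a placeholder answer is filled in here for illustration.
    QUERIES[query_id].update(status="done", answer=f"Answer to: {question}")
    return query_id

def get_query(query_id: str) -> dict:
    """Mirrors GET /get_query: return the status and answer for a query_id."""
    return QUERIES.get(query_id, {"status": "not_found"})
```

Returning a query_id right away keeps the submit call fast and lets the frontend poll /get_query until the status flips to done, which suits Lambda's request/response model better than holding one connection open for the whole RAG pipeline.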