
LLMs at Korl: Insights and Experiences

Sumeet Pannu
January 10, 2024 · 6 min read
At Korl, we are very targeted in our use of LLMs. That focus makes getting to the truth easier: the rich narrative an LLM can help produce stays grounded in data. It also gives us a lot of flexibility in choosing which models we use, and lets us fall back from one to another without any user impact (a minimal sketch of this fallback pattern appears at the end of the post). Take a look at the outage reports from the major vendors to see how important it is to be able to load balance between them.

Exploring New Models 🔍

We keep a keen eye on new model releases and are always updating Korl to use the latest ones that are beneficial; if nothing else, they often bring lower prices, which is always welcome (thanks, OpenAI, Google, and all the competition!). We were very excited about Gemini 1.5 Pro and GPT-4o for their performance and cost benefits, respectively, though we haven't yet explored the cost savings associated with Gemini's Context Caching.

In this vein, we were really excited about Claude 3.5 Sonnet. We are already seeing internal productivity gains from switching to it for many coding tasks, and we wanted to see what it could do for some of our LLM-rich features. One of the places where we rely on LLMs heavily is Product Requirements Document (PRD) generation: you give us a description of your feature and, based on other things you've been building, we generate a PRD that is a pretty good context-aware starting point (or rather, almost a finishing point).

The Power of LLMs in PRD Generation 💪

As you can see, the LLM is expected to output a lot of tokens, usually about 1.5 to 3 pages covering sections like Basic Features, User Journeys, and Metrics (a sketch of how such a prompt might be assembled also appears at the end of the post). This particular feature benefits quite a bit from using the 'best' model available. We fully recognize that this is highly dependent on what you feed the LLM and what you expect out of it, so YMMV!

Comparisons and Insights 📊

In our experience, the output quality of PRDs across different LLMs, whether from OpenAI, Gemini, or Claude, has been remarkably similar. This consistency allows us to remain flexible, switching between models based on performance, reliability, and cost considerations without compromising on quality.

Let's Discuss! 💬

I'd love to hear from you: What has your experience been with different LLMs? Which models do you prefer for specific use cases? Your insights could be invaluable as we all strive to optimize our tools and processes.
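As promised above, here is a minimal sketch of the fallback pattern in Python. The wrapper names (call_claude, call_gpt, call_gemini), their ordering, and the simulated outage are all illustrative assumptions, not Korl's actual integration:

```python
from typing import Callable, List

def call_claude(prompt: str) -> str:
    # Stand-in for a real Anthropic SDK call; here it simulates a vendor outage.
    raise TimeoutError("simulated Claude outage")

def call_gpt(prompt: str) -> str:
    # Stand-in for a real OpenAI SDK call.
    return f"[GPT draft for: {prompt}]"

def call_gemini(prompt: str) -> str:
    # Stand-in for a real Google SDK call.
    return f"[Gemini draft for: {prompt}]"

# Preference order: the current 'best' model first, fallbacks after it.
PROVIDERS: List[Callable[[str], str]] = [call_claude, call_gpt, call_gemini]

def generate_with_fallback(prompt: str) -> str:
    """Try each provider in order, moving on whenever one fails."""
    last_error = None
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except Exception as exc:  # outage, rate limit, timeout, ...
            last_error = exc
    raise RuntimeError("All LLM providers failed") from last_error

# The Claude stub fails, so this falls through to the GPT stub.
print(generate_with_fallback("Draft a PRD for an in-app notification center"))
```

The nice part of this shape is that the preference order is just a list, so rebalancing between vendors during an outage is a one-line change.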
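And for the PRD feature, a hedged sketch of how a context-aware prompt might be assembled. Only the section names and target length come from the post; the template wording, function name, and related_work parameter are assumptions:

```python
# The PRD sections and length named in the post; everything else here
# (template text, function and parameter names) is an illustrative assumption.
PRD_SECTIONS = ["Basic Features", "User Journeys", "Metrics"]

def build_prd_prompt(feature_description: str, related_work: list[str]) -> str:
    """Assemble a context-aware PRD prompt from a feature description
    and a list of things the team has recently been building."""
    context = "\n".join(f"- {item}" for item in related_work)
    sections = ", ".join(PRD_SECTIONS)
    return (
        "You are drafting a Product Requirements Document (PRD).\n"
        f"Feature to spec: {feature_description}\n"
        f"Related work from this team, for context:\n{context}\n"
        f"Write roughly 1.5 to 3 pages covering sections such as {sections}."
    )

print(build_prd_prompt(
    "In-app notification center",
    ["Email digest revamp", "Mobile push pipeline"],
))
```

The resulting prompt could then be handed to generate_with_fallback above, keeping the generation vendor-agnostic.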