Fraser Sim - ASO Company of the Year. Apple Search Ads. on LinkedIn: Fancy breakfast next week with fellow app professionals at Gordon Ramsay's… (2024)

Fraser Sim - ASO Company of the Year. Apple Search Ads.

Apple Search Ads Partner.

Fancy breakfast next week with fellow app professionals at Gordon Ramsay's Bread St Kitchen? Thursday 25th from 8am: roll up for a help-yourself-style buffet, hosted by Redbox Mobile, Batch, Branch.

More Relevant Posts

  • Yasir Altaf

    Data Science/AI Consultant | Big Data Solutions | Cellular Radio & IoT

    Having worked on various commercial RAG products over the past year, I've found that costs become prohibitive unless you optimize both the embedding and the retrieval models.

    First, embedding models are fairly cheap: even commercial ones like OpenAI's text-embedding-3-large cost $0.13 per 1M tokens. You don't always need a commercial embedding model, though. Depending on your task you may get away with a TF-IDF vectorizer, or with a BERT (SBERT)-based sentence transformer running locally on your machine; models like E5, BGE, MPNet, MiniLM, RoBERTa and LaBSE provide excellent performance, and you can even host the embeddings on Hugging Face for free. Also avoid using one generic large model for every task: fine-tuned open-source embedding models are available for specific use cases (semantic similarity, search, information retrieval and so on), so it is recommended to choose the smaller (often distilled) model that fits your use case, which will also keep infra costs down. The MTEB leaderboard on Hugging Face is the best place to simplify model selection. (A minimal local-embedding sketch and a small prompt-cache sketch follow this post.)

    Another area that contributes significantly to cost is poor prompting. Completions are billed on the input text as well as the output they generate, so if you have a habit of shoving everything into your prompt, your per-token costs are going to skyrocket. Techniques such as Chain-of-Thought prompting and in-context learning add hundreds or thousands of tokens. This verbosity can be handled by prompt-compression techniques, i.e. an "information entropy" checker, which is basically a sentence-ranking algorithm that assesses how informative the various blocks of text in the prompt are and discards the unnecessary segments. It can also be handled by passing the prompt through a summarization model such as BART, T5 or even GPT-2; Microsoft recently announced the LLMLingua library to do just this.

    Another method is prompt caching: rather than calling an LLM for each query, first check a local cache to see whether a response already exists. Build the cache ahead of time; we use ChatGPT or a local model to generate question-answer pairs, and coupled with HyDE this has done wonders for retrieval performance.

    The majority of the cost, however, comes not from embedding models but from the LLM choice at the retrieval phase. A wrong choice here leads to a significant cost difference: GPT-4 will cost you $30/$60 (input/output) per million tokens and GPT-4 32K will cost $60/$120, while GPT-3.5 Turbo costs only $0.5/$1.5. I always start with GPT-3.5 Turbo, and it has never failed me unless my use case needs complex, multi-hop reasoning (in which case GPT-4 is king). Besides the commercial APIs, you should consider open-source LLMs such as Mistral, Zephyr, Mixtral, Llama 2 and OpenHermes; you'll need reasonably robust infra, but their evaluation scores on information retrieval are not far off the commercial models.
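
    As a rough illustration of the "run a small embedding model locally" point above, here is a minimal sketch using the sentence-transformers library; the model name all-MiniLM-L6-v2 is an assumed choice for the example, and any of the MTEB leaders mentioned would slot in the same way.

```python
# Minimal sketch: local embeddings with sentence-transformers instead of a paid API.
# Assumes `pip install sentence-transformers`; the model choice is illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, distilled, runs fine on CPU

docs = [
    "Embedding models are cheap compared to generation.",
    "Most RAG cost comes from the LLM used at answer time.",
]
query = "Where does most of the cost in a RAG pipeline come from?"

doc_emb = model.encode(docs, normalize_embeddings=True)
query_emb = model.encode(query, normalize_embeddings=True)

# On normalized vectors, cosine similarity is just a dot product.
scores = util.cos_sim(query_emb, doc_emb)
print(docs[scores.argmax().item()])
```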

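    In the same spirit, here is a hedged sketch of the prompt-caching idea: consult a locally stored set of pre-generated question-answer pairs before calling the LLM at all. The similarity threshold and the reuse of the same local embedding model are illustrative assumptions, not the author's exact setup.

```python
# Sketch of a semantic prompt cache: answer near-duplicate questions locally
# before spending tokens on an LLM call. The threshold value is an assumption.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Cache of (question, answer) pairs built ahead of time, e.g. generated offline.
cache = [
    ("What drives most RAG cost?", "Mostly the generation-time LLM, not embeddings."),
    ("How cheap are embedding APIs?", "On the order of cents per million tokens."),
]
cache_emb = model.encode([q for q, _ in cache], normalize_embeddings=True)

def call_llm(query: str) -> str:
    # Placeholder for the actual GPT-3.5 / GPT-4 / open-source model call.
    raise NotImplementedError

def answer(query: str, threshold: float = 0.85) -> str:
    q_emb = model.encode(query, normalize_embeddings=True)
    scores = util.cos_sim(q_emb, cache_emb)[0]
    best = int(scores.argmax())
    if float(scores[best]) >= threshold:
        return cache[best][1]   # cache hit: no LLM call, no token cost
    return call_llm(query)      # cache miss: fall back to the paid model
```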

  • Vant App (AWS Build '23)

    36 followers

    You asked, we listened! Now you can earn points just by spreading the word about Vant! Refer your friends and family to Vant and start earning points today! 💰 Don't miss out – download the Vant app now from the Google Play and Apple App Stores! ✨ #ReferAndEarn #SpreadTheWord #vantcares

  • Cassiano Surek

    Chief Technology Officer @ Beyond | AI Engineering Chair @ Next15

    Cost planning for cloud-based solutions is never easy. New architectures introduced by generative AI, such as RAG, kick that up a notch. Magdalena Kuhn and Joanna Stoffregen help you avoid the shock, or at least prepare for the AI cloud provider bill generated by your #generativeai explorations!

  • Amer Saleh. BSc. MSc.

    Technology and Business Consultant

    Make sure that you are aware of your ROI before investing heavily in AI tools!

  • Brandon Rich

    Assoc. Director of Enterprise Data & Integration Services at University of Notre Dame

    A good explainer on RAG costs and the tradeoff between cost and quality as you go to cheaper models. However, look out for a mistake in the graphic that shows "11 input tokens." The other slides express this point, but any time you are introducing additional data to the context -- relevant document snippets for RAG, conversation history for memory, or pre-prompting for behavior influence -- that equals more tokens, which equals more cost.
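
    To make the "more context means more tokens means more cost" point concrete, here is a small counting sketch; it assumes the tiktoken library and the cl100k_base encoding used by the GPT-3.5/GPT-4 family, and the snippet contents are placeholders.

```python
# Sketch: count how many billable input tokens a RAG prompt actually contains.
# Assumes `pip install tiktoken`; cl100k_base is the GPT-3.5/GPT-4 encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

question = "What is our refund policy?"
retrieved = ["...retrieved document snippet..."] * 5          # RAG context
history = ["user: hi", "assistant: hello, how can I help?"]   # conversation memory
system_prompt = "You are a helpful support assistant."

full_prompt = "\n".join([system_prompt, *history, *retrieved, question])

print("question only:", len(enc.encode(question)), "tokens")
print("full prompt:  ", len(enc.encode(full_prompt)), "tokens")
```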

  • Yohanes Nuwara

    Consultant in AI & Computer Vision | Developer of Open-source Programs for Oil and Gas | M.Sc Business Analytics and Big Data at Politecnico di Milano

    Very useful for estimating the cost of building LLM apps, and how to minimize it.

  • Hugh Frost

    Many organizations struggle with how and when to leverage Gen AI. This post and presentation from Magdalena Kuhn and Joanna Stoffregen provide a clear illustration of the costs of operating a RAG-based LLM. In their example the costs are staggering. They also provide a clear explanation of what is driving the costs, the cost differential between GPT-3.5 and GPT-4, along with the trade-offs.

  • Man and Van

    29 followers

    📣 Exciting news! Check out Man and Van App's blog post on five tips for disassembling a bed when moving house. Learn step-by-step instructions, essential tools, and expert advice for hassle-free bed disassembly and reassembly. Save time, effort, and headaches! Read more: https://lnkd.in/eKc4uXkA Don't forget to like and share. Let's make moving a breeze together! 🌟 #MovingTips #ManAndVan #BedDisassembly #SmoothRelocation #ExpertAdvice

  • Jean Bernard Yung Hing Hin

    Principal Director of Engineering | Canadian Lead at Nuvalence

    When you plan the cost of your infrastructure, it is critical to account for the cost of operating your RAG-based LLM apps (especially if you're planning to deploy them at scale). The main cost drivers are usually:

    1. Vector database cost
    2. Embedding creation/manipulation cost
    3. Inference cost

    Per Magdalena Kuhn's post, using a smaller (and cheaper) LLM significantly lowers the cost per user, thanks to a large reduction in inference cost. You could also consider cheaper alternatives that use open-source LLMs, or look at different strategies for creating and manipulating your embeddings: select a "good enough" (and cheaper) embedding model, or use utility libraries/tools to create and manipulate the embeddings yourself. Finally, pick the vector DB vendor best suited to your needs; each product has its own characteristics and associated cost. A rough per-query estimate along these lines is sketched below.
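
    For a back-of-the-envelope view of these drivers, a rough per-query estimate might look like the sketch below. The per-million-token prices are the figures quoted earlier in this thread; the token counts and query volume are made-up assumptions, and fixed vector-database hosting fees are left out.

```python
# Rough per-query RAG cost estimate (illustrative numbers only).
# Prices per 1M tokens as quoted above; token counts/volumes are assumptions.
EMBED_PRICE = 0.13              # text-embedding-3-large, $ per 1M tokens
PRICES = {                      # $ per 1M input / output tokens
    "gpt-3.5-turbo": (0.5, 1.5),
    "gpt-4": (30.0, 60.0),
}

def cost_per_query(model, prompt_tokens=3000, output_tokens=300, embed_tokens=50):
    in_price, out_price = PRICES[model]
    embed = embed_tokens * EMBED_PRICE / 1e6                   # embed the user query
    infer = (prompt_tokens * in_price + output_tokens * out_price) / 1e6
    return embed + infer

for m in PRICES:
    c = cost_per_query(m)
    print(f"{m}: ${c:.4f} per query, ${c * 100_000:,.0f} per 100k queries")
```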

  • Dispense

    3,789 followers

    Day 5 of 5 Dispense Product Tips 💡: Keep your menu note short and sweet 🌱. Most customers don't read the entire note, or may even dismiss it before reading. If you want to highlight any deals you're running, leave them out of the note and use the dedicated deals section of the app - customers will look there first!
