How to set up LLM analytics for Cohere

Apr 18, 2024

Posted by

Lior Neu-ner

Tracking your Cohere usage, costs, and latency is crucial to understanding how your users interact with your AI- and LLM-powered features. In this tutorial, we show you how to monitor important metrics such as:

Total cost per model
Average cost per user
Average API response time

We'll build a basic Next.js app, implement the Cohere API, and capture these events using PostHog.

1. Create the demo app

To showcase how to track important metrics, we create a simple one-page Next.js app with the following:

A form with a text field and a button for user input.
A label to show Cohere's output.
A dropdown to select different Cohere models.

First, ensure Node.js is installed (version 18.0 or newer). Then run the following commands to create a new Next.js app and install both the Cohere JavaScript and PostHog Web SDKs:

Terminal

npx create-next-app@latest cohere-analytics
cd cohere-analytics
npm install --save cohere-ai
npm install --save posthog-js
cd ./app # use ./src/app instead if you opted into the src/ directory
touch providers.js # we set up PostHog in this file below

When prompted, select No for TypeScript, Yes for the App Router, No for Tailwind CSS, and the defaults for every other option.

Next, we set up PostHog using our project API key and host (you can find these in your project settings). Add the code below to app/providers.js:

app/providers.js

'use client'
import posthog from 'posthog-js'
import { PostHogProvider } from 'posthog-js/react'
import { useEffect } from 'react'

export function PHProvider({ children }) {
  useEffect(() => {
    posthog.init('<ph_project_api_key>', {
      api_host: 'https://us.i.posthog.com',
      person_profiles: 'identified_only'
    })
  }, []);

  return <PostHogProvider client={posthog}>{children}</PostHogProvider>
}

Then we import the PHProvider component into app/layout.js and wrap our app with it:

app/layout.js

import "./globals.css";
import { PHProvider } from './providers'

export default function RootLayout({ children }) {
  return (
    <html lang="en">
      <PHProvider>
        <body>{children}</body>
      </PHProvider>
    </html>
  );
}

Then replace the code in app/page.js with our basic layout and functionality. You can find your Cohere API key in your Cohere dashboard.

app/page.js

'use client'

import { useState } from 'react';
import { usePostHog } from 'posthog-js/react'
import { CohereClient } from "cohere-ai";

const models = [
  {
    name: 'command-r-plus',
    token_input_cost: 0.000003,
    token_output_cost: 0.000015
  },
  {
    name: 'command-r',
    token_input_cost: 0.0000005,
    token_output_cost:  0.0000015
  },
]

export default function Home() {
  const [userInput, setUserInput] = useState('');
  const [response, setResponse] = useState('');
  const [selectedModel, setSelectedModel] = useState(models[0]);
  const posthog = usePostHog()

  const fetchResponse = async () => {
    try {
      const cohere = new CohereClient({
        token: '<your_cohere_api_key>',
      });
      
      setResponse('Generating...');
      const chat = await cohere.chat({
        model: selectedModel.name,
        message: userInput,
      });

      // Use chat.text directly so we don't shadow the `response` state variable
      setResponse(chat.text);
    } catch (error) {
      setResponse(error.message);
    }
  };

  const handleInputChange = (event) => {
    setUserInput(event.target.value);
  };

  const handleModelChange = (event) => {
    setSelectedModel(models.find(m => m.name === event.target.value));
  };

  const handleSubmit = (event) => {
    event.preventDefault();
    fetchResponse();
  };

  return (
    <div style={{ display: 'flex', flexDirection: 'column', alignItems: 'center', justifyContent: 'center', minHeight: '100vh', gap: '20px' }}>
      <form onSubmit={handleSubmit}>
        <input
          type="text"
          value={userInput}
          onChange={handleInputChange}
          placeholder="Type your message"
        />
        <button type="submit">Send</button>
      </form>
      <select value={selectedModel.name} onChange={handleModelChange}>
        {models.map((model, index) => (
          <option key={index} value={model.name}>
            {model.name}
          </option>
        ))}
      </select>     
      <label>API Response:</label>
      <label>{response}</label>
    </div>
  );
}

Our basic app is now set up. Run npm run dev to see it in action.

2. Capture chat completion events

With our app set up, we can begin capturing events with PostHog. To start, we capture a cohere_chat_completion event with properties related to the API request like:

prompt
model
input_tokens
output_tokens
input_cost_in_dollars i.e. input_tokens * token_input_cost
output_cost_in_dollars i.e. output_tokens * token_output_cost
total_cost_in_dollars i.e. input_cost_in_dollars + output_cost_in_dollars
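
The cost properties above are simple multiplications, so they can be pulled into a small helper. The sketch below is one possible way to do it; computeCosts is a hypothetical name (not part of the Cohere or PostHog SDKs), and treating missing token counts as 0 is our assumption, not documented Cohere behavior.

```javascript
// Hypothetical helper (not part of the Cohere or PostHog SDKs).
// Converts billed token counts into dollar costs using per-token prices.
function computeCosts(billedUnits, model) {
  // Assumption: treat missing token counts as 0 rather than producing NaN.
  const inputTokens = billedUnits?.inputTokens ?? 0;
  const outputTokens = billedUnits?.outputTokens ?? 0;

  const inputCost = inputTokens * model.token_input_cost;
  const outputCost = outputTokens * model.token_output_cost;

  return {
    input_tokens: inputTokens,
    output_tokens: outputTokens,
    input_cost_in_dollars: inputCost,
    output_cost_in_dollars: outputCost,
    total_cost_in_dollars: inputCost + outputCost,
  };
}

// Example with made-up token counts and the command-r-plus prices from this tutorial.
const exampleCosts = computeCosts(
  { inputTokens: 1000, outputTokens: 500 },
  { name: 'command-r-plus', token_input_cost: 0.000003, token_output_cost: 0.000015 }
);
// Roughly 0.0105 (0.003 + 0.0075), subject to floating-point rounding.
console.log(exampleCosts.total_cost_in_dollars);
```

The returned object uses the same property names as the event, so it could be spread into the properties passed to posthog.capture() alongside model and prompt.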

Update your fetchResponse() function in page.js to capture this event:

app/page.js

const fetchResponse = async () => {
    try {

      // your existing code...

      setResponse('Generating...');
      const chat = await cohere.chat({
        model: selectedModel.name,
        message: userInput,
      });

      const inputCostInDollars = chat.meta.billedUnits.inputTokens * selectedModel.token_input_cost
      const outputCostInDollars = chat.meta.billedUnits.outputTokens * selectedModel.token_output_cost
      posthog.capture('cohere_chat_completion', {
        model: selectedModel.name,
        prompt: userInput,
        input_tokens: chat.meta.billedUnits.inputTokens,
        output_tokens: chat.meta.billedUnits.outputTokens,
        input_cost_in_dollars: inputCostInDollars,
        output_cost_in_dollars: outputCostInDollars,
        total_cost_in_dollars: inputCostInDollars + outputCostInDollars
      })

      // your existing code...

Refresh your app and submit a few prompts. You should then see your events captured in the PostHog activity tab.

3. Create insights

Now that we're capturing events, we can create insights. Below are three examples of useful metrics:

Total cost per model

To create this insight, go to the Product analytics tab and click + New insight. Then:

Set the event to cohere_chat_completion
Click on Total count to show a dropdown. Click on Property value (sum).
Select the total_cost_in_dollars property.
Click + Add breakdown and select model from the event properties list.

Note: Insights may show 0 if the total cost is less than 0.01.
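
To make the aggregation concrete, here is a rough sketch of what this insight computes: a sum of total_cost_in_dollars grouped by model. PostHog does this for you; the function name and sample events below are made up for illustration.

```javascript
// Illustrative only: PostHog performs this aggregation inside the insight.
// `events` mimics captured cohere_chat_completion events with made-up values.
function totalCostPerModel(events) {
  const totals = {};
  for (const { properties } of events) {
    const { model, total_cost_in_dollars: cost } = properties;
    // "Property value (sum)" broken down by the `model` property.
    totals[model] = (totals[model] ?? 0) + cost;
  }
  return totals;
}

const sampleCostEvents = [
  { properties: { model: 'command-r-plus', total_cost_in_dollars: 0.5 } },
  { properties: { model: 'command-r', total_cost_in_dollars: 0.25 } },
  { properties: { model: 'command-r-plus', total_cost_in_dollars: 0.25 } },
];

// command-r-plus totals 0.75 and command-r totals 0.25
console.log(totalCostPerModel(sampleCostEvents));
```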

Average cost per user

This metric gives you an idea of how your costs will scale as your product grows. Creating this insight is similar to creating the one above, but we use formula mode to divide the total cost by the total number of users:

Set the event to cohere_chat_completion
Click on Total count to show a dropdown. Click on Property value (sum).
Select the total_cost_in_dollars property.
Click + Add graph series (if your visual is set to number, switch it back to trend first).
Change the event name to cohere_chat_completion. Then change the value from Total count to Unique users.
Click Enable formula mode.
In the formula box, enter A/B.
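
As a rough illustration of the A/B formula, the sketch below divides the summed cost (series A) by the number of unique users (series B). It assumes each event carries a distinct_id, as PostHog events do; the function name, the sample data, and the choice to return 0 when there are no users are ours.

```javascript
// Illustrative only: mirrors the A/B formula from the insight.
function averageCostPerUser(events) {
  let totalCost = 0;       // series A: sum of total_cost_in_dollars
  const users = new Set(); // series B: unique users

  for (const event of events) {
    totalCost += event.properties.total_cost_in_dollars;
    users.add(event.distinct_id);
  }

  // Assumption: report 0 instead of dividing by zero when there are no events.
  return users.size === 0 ? 0 : totalCost / users.size;
}

const sampleUserEvents = [
  { distinct_id: 'user_1', properties: { total_cost_in_dollars: 0.5 } },
  { distinct_id: 'user_1', properties: { total_cost_in_dollars: 0.25 } },
  { distinct_id: 'user_2', properties: { total_cost_in_dollars: 0.25 } },
];

// Total cost of 1.0 across 2 unique users gives 0.5
console.log(averageCostPerUser(sampleUserEvents));
```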

Average API response time

Cohere's API response time can be slow, especially for longer outputs, so it's worth monitoring. To track it, we first modify our event capture to include the response time:

app/page.js

const fetchResponse = async () => {
    try {

      // your existing code...

      const startTime = performance.now(); 
      const chat = await cohere.chat({
        model: selectedModel.name,
        message: userInput,
      });
      const endTime = performance.now();
      const responseTime = endTime - startTime;

      const inputCostInDollars = chat.meta.billedUnits.inputTokens * selectedModel.token_input_cost
      const outputCostInDollars = chat.meta.billedUnits.outputTokens * selectedModel.token_output_cost
      posthog.capture('cohere_chat_completion', {
        model: selectedModel.name,
        prompt: userInput,
        input_tokens: chat.meta.billedUnits.inputTokens,
        output_tokens: chat.meta.billedUnits.outputTokens,
        input_cost_in_dollars: inputCostInDollars,
        output_cost_in_dollars: outputCostInDollars,
        total_cost_in_dollars: inputCostInDollars + outputCostInDollars,
        response_time_in_ms: responseTime
      })

      // your existing code...
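
If you time more than one API call, the start/end bookkeeping above could be factored into a small wrapper. This is only a sketch: timeAsync is a hypothetical helper, not something the Cohere or PostHog SDKs provide, and the example uses a timer as a stand-in for cohere.chat().

```javascript
// Hypothetical helper: runs an async function and reports how long it took.
async function timeAsync(fn) {
  const startTime = performance.now();
  const result = await fn(); // errors propagate to the caller's try/catch
  const durationMs = performance.now() - startTime;
  return { result, durationMs };
}

// Stand-in for the API call: resolves after roughly 50 ms.
const fakeApiCall = () =>
  new Promise((resolve) => setTimeout(() => resolve('done'), 50));

timeAsync(fakeApiCall).then(({ result, durationMs }) => {
  console.log(result, `${durationMs.toFixed(1)} ms`);
});
```

Inside fetchResponse(), the same helper could wrap the real request, with durationMs passed as response_time_in_ms.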

Then, after capturing a few events, create a new insight to calculate the average response time:

Set the event to cohere_chat_completion
Click on Total count to show a dropdown. Click on Property value (average).
Select the response_time_in_ms property.

Next steps

We've shown you the basics of creating insights from your product's Cohere usage. Below are more examples of product questions you may want to investigate:

How many of my users are interacting with my LLM features?
Are there generation latency spikes?
Does interacting with LLM features correlate with other metrics e.g. retention, usage, or revenue?