Semantic Evaluation
Validate dynamic outputs against an expected output semantically using our AI-based service, LLM Evaluator.
Usage 🚀
Refer to the Setup Guide for installing dependencies for both Java and Python.
JAVA
There are two clients available with the Java SDK: SyncClient and AsyncClient.
Java Client Code
// for async client
import org.qyrus.ai_sdk.Clients.AsyncClient;
AsyncClient client = new AsyncClient(<API_TOKEN>, null);

// for sync client
import org.qyrus.ai_sdk.Clients.SyncClient;
SyncClient client = new SyncClient(<API_TOKEN>, null);
Using SyncClient
Here's an example of utilizing LLM Evaluator with SyncClient.
Java SyncClient Code
private static void testLLMEvaluator() {
    Dotenv dotenv = Dotenv.load();
    String QYRUS_AI_SDK_API_TOKEN = dotenv.get("QYRUS_AI_SDK_API_TOKEN");
    SyncClient client = new SyncClient(QYRUS_AI_SDK_API_TOKEN, null);

    String context = "application is about generating dynamic text for messages on phone";
    String expected_output = "Winning lottery of 10k$";
    List<String> executed_output = new ArrayList<>();
    executed_output.add("You have won 10000 dollars");
    String guardrails = "No sensitive info";

    long startTime = System.currentTimeMillis();
    int numberOfRequests = 1;
    for (int i = 0; i < numberOfRequests; i++) {
        try {
            // Evaluate the executed outputs against the expected output
            LLMEval.LLMEvalResponse response = client.llmevaluator.evaluate(context, expected_output, executed_output, guardrails);
            // Retrieve the evaluation report from the response
            String report = response.getReport();
            System.out.println("Report: " + report);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    long endTime = System.currentTimeMillis();
    System.out.println("Synchronous Total time for LLM Eval request: " + (endTime - startTime) + " ms");
}
Using AsyncClient
Here's an example of utilizing LLM Evaluator with AsyncClient.
PYTHON
There are two clients available with the Python SDK: SyncQyrusAI and AsyncQyrusAI.
Using SyncQyrusAI
Here's an example of utilizing LLM Evaluator with SyncQyrusAI.
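A minimal sketch of the synchronous flow, mirroring the Java SyncClient example above. The import path, constructor signature, method name (`llm_evaluator.evaluate`), and response attribute (`report`) are assumptions and may differ from the actual SDK:

```python
import os

# Hypothetical import path -- adjust to the actual package from the Setup Guide.
from qyrus_ai_sdk import SyncQyrusAI

client = SyncQyrusAI(os.environ["QYRUS_AI_SDK_API_TOKEN"])

# Same fields as the Java example: context, expected output,
# executed outputs, and guardrails.
response = client.llm_evaluator.evaluate(
    context="application is about generating dynamic text for messages on phone",
    expected_output="Winning lottery of 10k$",
    executed_output=["You have won 10000 dollars"],
    guardrails="No sensitive info",
)
print(response.report)
```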
Using AsyncQyrusAI
Here's an example of utilizing LLM Evaluator with AsyncQyrusAI.
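A minimal async sketch, assuming the async client exposes the same `llm_evaluator.evaluate` call as an awaitable; all names are assumptions:

```python
import asyncio
import os

# Hypothetical import path -- adjust to the actual package from the Setup Guide.
from qyrus_ai_sdk import AsyncQyrusAI

async def main():
    client = AsyncQyrusAI(os.environ["QYRUS_AI_SDK_API_TOKEN"])
    # Same four fields as the synchronous example, awaited here.
    response = await client.llm_evaluator.evaluate(
        context="application is about generating dynamic text for messages on phone",
        expected_output="Winning lottery of 10k$",
        executed_output=["You have won 10000 dollars"],
        guardrails="No sensitive info",
    )
    print(response.report)

asyncio.run(main())
```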
Python-only: RAG and MCP Testing
The Python SDK includes additional LLM Evaluator capabilities for RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol / tool-calling) testing. These helpers are available via the llm_evaluator.evaluator object on both AsyncQyrusAI and SyncQyrusAI.
Initialize LLM Evaluator
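A sketch of obtaining the `llm_evaluator.evaluator` object mentioned above from either client; the import path and constructor are assumptions:

```python
import os

# Hypothetical import path -- adjust to the actual package.
from qyrus_ai_sdk import AsyncQyrusAI, SyncQyrusAI

# The RAG/MCP helpers live on the llm_evaluator.evaluator object of either client.
sync_client = SyncQyrusAI(os.environ["QYRUS_AI_SDK_API_TOKEN"])
evaluator = sync_client.llm_evaluator.evaluator

async_client = AsyncQyrusAI(os.environ["QYRUS_AI_SDK_API_TOKEN"])
async_evaluator = async_client.llm_evaluator.evaluator
```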
Evaluate RAG (Retrieval-Augmented Generation) Systems
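An illustrative RAG test case. The field names, the `evaluate_rag` method, and the `report` attribute are all assumptions about what a RAG evaluation input might look like, shown against an `evaluator` handle obtained from `client.llm_evaluator.evaluator`:

```python
# All field and method names below are assumptions, not the confirmed SDK API.
rag_test_case = {
    "query": "What is the refund window for annual plans?",
    "retrieved_contexts": [
        "Annual plans can be refunded within 30 days of purchase.",
    ],
    "generated_answer": "You can get a refund within 30 days of buying an annual plan.",
    "expected_answer": "Refunds are available for 30 days after purchase.",
}
print(sorted(rag_test_case))

# Hypothetical call:
# result = evaluator.evaluate_rag(**rag_test_case)
# print(result.report)
```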
Evaluate MCP (Model Context Protocol) Tool-Calling Systems
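An illustrative MCP (tool-calling) test case: the user request, the tools the model could call, the calls it actually made, and the calls expected. The structure and the `evaluate_mcp` method name are assumptions:

```python
# All field and method names below are assumptions, not the confirmed SDK API.
mcp_test_case = {
    "user_request": "Book a meeting with Alex tomorrow at 3pm",
    "available_tools": [
        {
            "name": "create_event",
            "description": "Create a calendar event",
            "parameters": {"title": "string", "start_time": "string"},
        }
    ],
    "actual_tool_calls": [
        {
            "name": "create_event",
            "arguments": {"title": "Meeting with Alex", "start_time": "2025-01-15T15:00:00"},
        }
    ],
    "expected_tool_calls": [
        {"name": "create_event"}
    ],
}
print(sorted(mcp_test_case))

# Hypothetical call, using an evaluator from client.llm_evaluator.evaluator:
# result = evaluator.evaluate_mcp(**mcp_test_case)
# print(result.report)
```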
Batch Evaluation
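A sketch of submitting several test cases in one call. The list-of-dicts shape and the `evaluate_batch` method name are assumptions:

```python
# Hypothetical batch input: a list of per-case dicts (field names assumed).
batch_cases = [
    {
        "query": "What is the refund window?",
        "retrieved_contexts": ["Refunds within 30 days."],
        "generated_answer": "30 days.",
        "expected_answer": "Refunds are available for 30 days.",
    },
    {
        "query": "Do you ship internationally?",
        "retrieved_contexts": ["We ship to 40 countries."],
        "generated_answer": "Yes, to 40 countries.",
        "expected_answer": "Yes, international shipping to 40 countries.",
    },
]
print(len(batch_cases))

# Hypothetical call, using an evaluator from client.llm_evaluator.evaluator:
# results = evaluator.evaluate_batch(batch_cases)
# for r in results:
#     print(r.report)
```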
Using JSON Input (Alternative to Pydantic)
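A sketch of supplying the same fields as a plain JSON document instead of Pydantic models; only the standard-library parsing is shown, and the evaluator call remains hypothetical:

```python
import json

# The same test-case fields, expressed as JSON rather than Pydantic models
# (field names are assumptions).
raw = """
{
  "query": "What is the refund window?",
  "retrieved_contexts": ["Refunds within 30 days."],
  "generated_answer": "30 days.",
  "expected_answer": "Refunds are available for 30 days."
}
"""
case = json.loads(raw)
print(case["query"])

# Hypothetical call, using an evaluator from client.llm_evaluator.evaluator:
# result = evaluator.evaluate_rag(**case)
```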
Legacy Judge Evaluation (Backwards Compatibility)
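The legacy judge-based evaluation takes the same four fields as the original REST/Java API (context, expected output, executed outputs, guardrails). The dict below reuses the document's own example values; the `judge` method name is an assumption:

```python
# Same four fields as the Java SyncClient example above.
legacy_case = {
    "context": "application is about generating dynamic text for messages on phone",
    "expected_output": "Winning lottery of 10k$",
    "executed_output": ["You have won 10000 dollars"],
    "guardrails": "No sensitive info",
}
print(sorted(legacy_case))

# Hypothetical call, using an evaluator from client.llm_evaluator.evaluator:
# response = evaluator.judge(**legacy_case)
# print(response.report)
```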
Synchronous Usage
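With `SyncQyrusAI`, the same RAG/MCP helpers are called without `await`. Import path, method name, and response attribute are assumptions:

```python
import os
from qyrus_ai_sdk import SyncQyrusAI  # hypothetical import path

client = SyncQyrusAI(os.environ["QYRUS_AI_SDK_API_TOKEN"])

# Same helper as the async flow, but a plain blocking call.
result = client.llm_evaluator.evaluator.evaluate_rag(  # hypothetical method name
    query="What is the refund window?",
    retrieved_contexts=["Refunds within 30 days."],
    generated_answer="30 days.",
    expected_answer="Refunds are available for 30 days.",
)
print(result.report)  # hypothetical response attribute
```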
Advanced MCP with Schema Validation
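One way to harden MCP cases is to check locally that each tool call's arguments match the tool's declared parameter schema before submitting the case. The checker below is an illustrative local pre-check, not part of the SDK, and the `schema_validation` flag in the trailing comment is an assumption:

```python
def matches_schema(arguments: dict, schema: dict) -> bool:
    """Check required keys and primitive types against a JSON-Schema-like dict."""
    types = {"string": str, "number": (int, float), "boolean": bool,
             "object": dict, "array": list}
    # Every required key must be present.
    for key in schema.get("required", []):
        if key not in arguments:
            return False
    # Every supplied key must have the declared primitive type.
    for key, spec in schema.get("properties", {}).items():
        if key in arguments and not isinstance(arguments[key], types[spec["type"]]):
            return False
    return True

schema = {
    "type": "object",
    "required": ["title", "start_time"],
    "properties": {"title": {"type": "string"}, "start_time": {"type": "string"}},
}
call_args = {"title": "Meeting with Alex", "start_time": "2025-01-15T15:00:00"}
print(matches_schema(call_args, schema))  # True

# If the local check passes, submit the case:
# result = evaluator.evaluate_mcp(**mcp_test_case, schema_validation=True)  # hypothetical flag
```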
Note: The legacy LLM Evaluator (the original judge-based evaluation) is accessible via REST APIs. The new RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol / tool-calling) testing capabilities are available in the Python SDK only at this time and are not yet exposed via the REST API. These RAG and MCP features will be made available via REST APIs soon.