Tracker¶
The prompt_versioner.metrics.tracker module provides functionality for tracking and statistical analysis of metrics.
MetricsTracker¶
Static class for tracking and analyzing prompt version metrics.
Methods¶
compute_stats()¶
Computes a statistical summary of metric values.
Parameters:
- values (List[float]): List of metric values
Returns:
- Dict[str, float]: Dictionary with statistical measures:
  - count: Number of values
  - mean: Arithmetic mean
  - median: Median
  - std_dev: Standard deviation
  - min: Minimum value
  - max: Maximum value
  - sum: Total sum
Example:
from prompt_versioner.metrics.tracker import MetricsTracker
values = [1.5, 2.3, 1.8, 3.1, 2.0, 1.9, 2.5]
stats = MetricsTracker.compute_stats(values)
print(f"Mean: {stats['mean']:.2f}")
print(f"Standard deviation: {stats['std_dev']:.2f}")
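The summary keys listed above can be reproduced with the standard library. The following is a minimal sketch, not the library's implementation: `compute_stats_sketch` is a hypothetical name, and the use of the sample standard deviation and the handling of empty or single-value input are assumptions, since the source does not document them.

```python
import statistics
from typing import Dict, List


def compute_stats_sketch(values: List[float]) -> Dict[str, float]:
    """Summarize values using the same keys as the documented return value."""
    if not values:
        # Assumption: behavior on empty input is not documented.
        return {}
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Assumption: sample standard deviation, which needs two or more values.
        "std_dev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
    }


stats = compute_stats_sketch([1.5, 2.3, 1.8, 3.1, 2.0, 1.9, 2.5])
print(f"Mean: {stats['mean']:.2f}")
print(f"Median: {stats['median']:.2f}")
```

Whether MetricsTracker uses the sample or the population standard deviation should be checked against the module itself before comparing numbers.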
compute_percentiles()¶
@staticmethod
def compute_percentiles(
    values: List[float], percentiles: List[int] = [25, 50, 75, 90, 95, 99]
) -> Dict[int, float]
Computes percentiles of metric values.
Parameters:
- values (List[float]): List of metric values
- percentiles (List[int]): List of percentiles to compute (default: [25, 50, 75, 90, 95, 99])
Returns:
- Dict[int, float]: Dictionary mapping each requested percentile to its value
Example:
values = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
percentiles = MetricsTracker.compute_percentiles(values, [25, 50, 75, 95])
print(f"25th percentile: {percentiles[25]}")
print(f"95th percentile: {percentiles[95]}")
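Percentile values depend on the interpolation rule, which the source does not specify. As a rough sketch, assuming linear interpolation between the two closest ranks of the sorted data (`percentiles_sketch` is a hypothetical name, not part of the module):

```python
from typing import Dict, List, Sequence


def percentiles_sketch(
    values: List[float], percentiles: Sequence[int] = (25, 50, 75, 90, 95, 99)
) -> Dict[int, float]:
    """Map each percentile to a value, interpolating linearly between ranks."""
    if not values:
        # Assumption: behavior on empty input is not documented.
        return {}
    ordered = sorted(values)
    result: Dict[int, float] = {}
    for p in percentiles:
        # Fractional position of the percentile in the sorted list.
        rank = (len(ordered) - 1) * p / 100
        lower = int(rank)
        upper = min(lower + 1, len(ordered) - 1)
        fraction = rank - lower
        result[p] = ordered[lower] + (ordered[upper] - ordered[lower]) * fraction
    return result


values = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
result = percentiles_sketch(values, [25, 50, 75, 95])
print(result[25], result[50], result[75])
```

Other interpolation rules (nearest rank, for example) give different results on small samples, so values from this sketch may not match compute_percentiles() exactly.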
analyze_metrics()¶
Analyzes metrics and returns statistical summaries.
Parameters:
- metrics (Dict[str, List[float]]): Dictionary mapping each metric name to its list of values
Returns:
- List[MetricStats]: List of MetricStats objects
Example:
metrics = {
    "latency_ms": [100, 120, 95, 110, 105],
    "cost_eur": [0.001, 0.0015, 0.0012, 0.0018, 0.0014]
}
stats_list = MetricsTracker.analyze_metrics(metrics)
for stats in stats_list:
    print(f"{stats.name}: mean={stats.mean:.4f}, std={stats.std_dev:.4f}")
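Conceptually, this applies a per-metric summary to every entry of the dictionary. The sketch below illustrates the idea under stated assumptions: `MetricStatsSketch` is a hypothetical stand-in for MetricStats (only the name, mean and std_dev attributes appear in the example above), and skipping empty metrics is an assumption.

```python
import statistics
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MetricStatsSketch:
    # Hypothetical stand-in for MetricStats; the real model may have more fields.
    name: str
    count: int
    mean: float
    std_dev: float


def analyze_metrics_sketch(metrics: Dict[str, List[float]]) -> List[MetricStatsSketch]:
    """Build one summary object per metric, in dictionary order."""
    results: List[MetricStatsSketch] = []
    for name, values in metrics.items():
        if not values:
            # Assumption: metrics without values are skipped.
            continue
        results.append(
            MetricStatsSketch(
                name=name,
                count=len(values),
                mean=float(statistics.mean(values)),
                # Assumption: sample standard deviation, 0.0 for a single value.
                std_dev=statistics.stdev(values) if len(values) > 1 else 0.0,
            )
        )
    return results


summaries = analyze_metrics_sketch(
    {"latency_ms": [100, 120, 95, 110, 105], "empty_metric": []}
)
for item in summaries:
    print(f"{item.name}: mean={item.mean:.1f}")
```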
detect_outliers()¶
@staticmethod
def detect_outliers(
    values: List[float], method: str = "iqr", threshold: float = 1.5
) -> List[int]
Detects outliers in metric values.
Parameters:
- values (List[float]): List of metric values
- method (str): Method to use ('iqr' or 'zscore', default: 'iqr')
- threshold (float): IQR multiplier for 'iqr', or |z-score| cutoff for 'zscore' (default: 1.5)
Returns:
- List[int]: List of indices of outlier values
Detection methods:
- IQR (Interquartile Range): Uses Q1 - threshold*IQR and Q3 + threshold*IQR as limits
- Z-Score: Uses standard deviation and considers values with |z-score| > threshold as outliers
Example:
values = [1.0, 1.1, 1.2, 1.1, 1.0, 5.0, 1.1, 1.0, 1.2] # 5.0 is an outlier
outliers = MetricsTracker.detect_outliers(values, method="iqr")
print(f"Outlier indices: {outliers}") # [5]
# Using z-score
outliers_z = MetricsTracker.detect_outliers(values, method="zscore", threshold=2.0)
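Both detection rules can be sketched with the standard library. This is an illustration of the two methods described above, not the module's code: `detect_outliers_sketch` is a hypothetical name, and the quartile method ("inclusive"), the sample standard deviation, and the handling of very short or constant input are assumptions.

```python
import statistics
from typing import List


def detect_outliers_sketch(
    values: List[float], method: str = "iqr", threshold: float = 1.5
) -> List[int]:
    """Return the indices of values flagged by the chosen rule."""
    if len(values) < 2:
        # Assumption: too few values to estimate spread, so nothing is flagged.
        return []
    if method == "iqr":
        # Assumption: quartiles computed with the "inclusive" method.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - threshold * iqr, q3 + threshold * iqr
        return [i for i, v in enumerate(values) if v < lower or v > upper]
    if method == "zscore":
        mean = statistics.mean(values)
        std_dev = statistics.stdev(values)  # Assumption: sample standard deviation.
        if std_dev == 0:
            return []  # All values identical, so no value deviates.
        return [i for i, v in enumerate(values) if abs((v - mean) / std_dev) > threshold]
    raise ValueError(f"Unknown method: {method}")


values = [1.0, 1.1, 1.2, 1.1, 1.0, 5.0, 1.1, 1.0, 1.2]
print(detect_outliers_sketch(values, method="iqr"))
print(detect_outliers_sketch(values, method="zscore", threshold=2.0))
```

The IQR rule is less sensitive to the outlier itself than the z-score rule, because a single extreme value inflates the standard deviation but barely moves the quartiles.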
calculate_trend()¶
Calculates the trend in metric values over time.
Parameters:
- values (List[float]): List of metric values in chronological order
Returns:
- Dict[str, Any]: Dictionary with trend information:
  - trend: Trend type ('increasing', 'decreasing', 'stable', 'insufficient_data')
  - direction: Direction ('up', 'down', None)
  - slope: Linear regression slope
  - start_value: First value
  - end_value: Last value
  - change: Absolute change
  - pct_change: Percent change
Example:
# Increasing values over time
values = [1.0, 1.2, 1.5, 1.8, 2.0]
trend = MetricsTracker.calculate_trend(values)
print(f"Trend: {trend['trend']}") # 'increasing'
print(f"Change: {trend['change']}") # 1.0
print(f"Percent change: {trend['pct_change']:.1f}%") # 100.0%
Algorithms Used¶
Linear Regression for Trend¶
Trend calculation uses simple linear regression:
- Slope: Its sign gives the direction of the trend and its magnitude gives the intensity
- R²: Not reported separately; goodness of fit is only implicit in the slope
- Stability threshold: |slope| < 0.01 indicates a stable trend
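The slope-based classification can be sketched as an ordinary least-squares fit of the values against their positions 0, 1, 2, and so on. In this sketch `calculate_trend_sketch` is a hypothetical name; the minimum of two values, the None direction for a stable trend, and the zero-start handling of pct_change are assumptions not confirmed by the source.

```python
from typing import Any, Dict, List


def calculate_trend_sketch(
    values: List[float], stable_threshold: float = 0.01
) -> Dict[str, Any]:
    """Classify a chronological series by its least-squares slope."""
    n = len(values)
    if n < 2:
        # Assumption: at least two points are needed to fit a line.
        return {"trend": "insufficient_data", "direction": None}
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    # Least-squares slope: covariance of (index, value) over variance of index.
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    slope = numerator / denominator
    if abs(slope) < stable_threshold:
        trend, direction = "stable", None
    elif slope > 0:
        trend, direction = "increasing", "up"
    else:
        trend, direction = "decreasing", "down"
    start, end = values[0], values[-1]
    change = end - start
    # Assumption: percent change is reported as 0.0 when the start value is 0.
    pct_change = change / start * 100 if start != 0 else 0.0
    return {
        "trend": trend,
        "direction": direction,
        "slope": slope,
        "start_value": start,
        "end_value": end,
        "change": change,
        "pct_change": pct_change,
    }


trend = calculate_trend_sketch([1.0, 1.2, 1.5, 1.8, 2.0])
print(trend["trend"], round(trend["slope"], 2))
```

Because the threshold is an absolute value, it is sensitive to the scale of the metric: a slope of 0.005 is negligible for latency in milliseconds but large for a cost measured in fractions of a cent.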
Outlier Detection¶
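IQR Method: With IQR = Q3 - Q1, a value is an outlier if it is below Q1 - threshold*IQR or above Q3 + threshold*IQR.
Z-Score Method: With z = (value - mean) / std_dev, a value is an outlier if |z| > threshold.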
See Also¶
- Aggregator: Functionality to aggregate metrics across multiple test runs
- Analyzer: Functionality for analyzing and comparing metrics between versions
- Models: Data models for metrics and comparison structures
- Calculator: Utility for single-call metric calculations
- Pricing: Manages model pricing and calculates LLM call costs