MaziyarPanahi

calme-3.2-instruct-78b

Fine-tuned on domain-specific datasets · Qwen2ForCausalLM · bfloat16

Install and run this model locally using llmpm, the open-source LLM package manager.

Install
llmpm install MaziyarPanahi/calme-3.2-instruct-78b
Run
llmpm run MaziyarPanahi/calme-3.2-instruct-78b
Average Score (0–100)
52.1%
Weighted average of normalized scores across all benchmarks. Each benchmark score is normalized to a 0–100 scale; the normalized scores are then averaged.
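
For reference, the headline number can be reproduced from the per-benchmark scores listed below. A minimal Python sketch, assuming the listed scores are already normalized to the 0–100 scale and weighted equally (the weights themselves are not published on this page):

# Sketch: recompute the headline average from the benchmark scores
# listed on this page. Assumes equal weights and that the listed
# scores are already normalized; both are assumptions.
scores = {
    "IFEval": 80.6,
    "BBH": 62.6,
    "MATH Lvl 5": 40.3,
    "GPQA": 20.4,
    "MuSR": 38.5,
    "MMLU-Pro": 70.0,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.1f}%")  # 52.1%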

BENCHMARK SCORES

IFEval: 80.6%

Instruction-Following Evaluation. Tests the model's ability to follow explicit instructions, such as formatting and generation constraints. Scored by strict format accuracy.

BBH: 62.6%

BIG-Bench Hard. A collection of challenging tasks spanning language understanding, mathematical reasoning, and commonsense knowledge. Scored by accuracy on multiple-choice questions.

MATH Lvl 5: 40.3%

Mathematics Aptitude Test of Heuristics, Level 5. High-school competition problems covering advanced algebra, geometry, and precalculus. Scored by exact match.

GPQA: 20.4%

Graduate-Level Google-Proof Q&A. PhD-level multiple-choice questions in chemistry, biology, and physics. Scored by accuracy.

MuSR: 38.5%

Multistep Soft Reasoning. Tests multistep reasoning over long narrative texts, combining language understanding with long-context reasoning. Scored by accuracy.

MMLU-Pro: 70.0%

Massive Multitask Language Understanding – Professional. Expert-reviewed multiple-choice questions across medicine, law, engineering, and mathematics. Scored by accuracy.

MODEL INFO

Architecture
Qwen2ForCausalLM
Precision
bfloat16
Type
Fine-tuned on domain-specific datasets
Weight Type
Original
Parameters
78.0B
Chat Template
Yes
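
Since the listed architecture is Qwen2ForCausalLM in bfloat16 with a chat template, the model can also be loaded directly with Hugging Face transformers. A minimal sketch, assuming the weights are hosted on the Hub under the same repo id and enough GPU memory is available (roughly 156 GB at bfloat16 for 78B parameters):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaziyarPanahi/calme-3.2-instruct-78b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the listed precision
    device_map="auto",           # shard the 78B weights across available GPUs
)

# "Chat Template: Yes" above, so the tokenizer can format chat turns.
messages = [{"role": "user", "content": "Explain beam search in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))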

METADATA

Upload Date
2024-11-19
Submission Date
2024-11-28
License
other
Base Model
Removed
HF Hearts
112
CO₂ Cost (kg)
66.01