6 days ago · Input a message to start chatting with neuralmagic/Mistral-7B-Instruct-v0.3-quantized.w4a16. ... This model can be loaded on the Inference API (serverless).
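For context, a minimal sketch of chatting with that checkpoint through the serverless Inference API via huggingface_hub; the prompt and token limit are illustrative assumptions:

```python
from huggingface_hub import InferenceClient

# Illustrative sketch: assumes the w4a16 checkpoint is served on the
# serverless Inference API, as the snippet suggests.
client = InferenceClient(model="neuralmagic/Mistral-7B-Instruct-v0.3-quantized.w4a16")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Summarize GPTQ in one sentence."}],
    max_tokens=128,  # arbitrary limit for the example
)
print(response.choices[0].message.content)
```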
7 days ago · Model Card for Mistral-7B-Instruct-v0.3 quantized to 4-bit weights. Weight-only quantization of Mistral-7B-Instruct-v0.3 via GPTQ to 4 bits with group_size= ...
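A hedged sketch of that GPTQ flow using the transformers GPTQConfig API (requires optimum and auto-gptq); the snippet truncates the actual group_size value, so 128 below is only an assumed placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Weight-only 4-bit GPTQ; group_size=128 is assumed, not taken from the card.
gptq_config = GPTQConfig(
    bits=4,
    group_size=128,
    dataset="c4",        # calibration dataset for GPTQ
    tokenizer=tokenizer,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",
)
```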
6 days ago · We use a frozen and quantized Mistral 7B-Instruct, and rely on an in-house implementation but adopt the tokenizer from the mistral-common package. Low-Rank ...
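The snippet describes an in-house setup; a generic QLoRA-style equivalent (frozen 4-bit base plus Low-Rank Adaptation adapters) using bitsandbytes and peft might look like this, with the rank and target modules as illustrative assumptions:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Frozen, 4-bit-quantized base model.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# LoRA adapters; r and target_modules are illustrative choices.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only adapter weights train
model.print_trainable_parameters()
```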
3 days ago · So I wanted to try 4-bit and 8-bit quantized models, but they are drastically slow. At first I thought the models were not on the GPU, but that does not seem to be ...
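One way to test the "not on GPU" theory from that post is to load the model in 4-bit and inspect where its weights actually landed; this sketch assumes a bitsandbytes load via transformers:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

print(torch.cuda.is_available())        # is a GPU visible at all?
print(model.hf_device_map)              # per-module placement
print(next(model.parameters()).device)  # e.g. cuda:0 if placement worked
```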
4 days ago · To find out, we ran a couple of tests using the popular LLM runner Llama.cpp to create quantized versions of Mistral 7B and Google's new Gemma2 9B models. We ...
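A minimal sketch of running one of those llama.cpp quantizations from Python via llama-cpp-python; the GGUF filename and Q4_K_M quantization level are assumptions for illustration:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-q4_k_m.gguf",  # hypothetical local file
    n_gpu_layers=-1,  # offload all layers to GPU when built with GPU support
    n_ctx=4096,
)

out = llm("Q: What does Q4_K_M quantization mean? A:", max_tokens=64)
print(out["choices"][0]["text"])
```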
3 days ago · Mistral 7B Instruct v0.2 AWQ parameters and internals: Model Files: 4.2 GB; AWQ Quantization: Yes; Quantization Type: awq; Model Architecture ...
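transformers can load AWQ checkpoints directly when the autoawq package is installed; a hedged sketch, where the repo id is inferred from the snippet's model name and not confirmed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mistral-7B-Instruct-v0.2-AWQ"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("[INST] Hello! [/INST]", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```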
1 day ago · I have quantized it to GGUF and it works quite well: ... But I'm not wrong about my relief that Mistral is not currently headed in that direction with their latest models ...
2 days ago · Unlike Mistral 7B, it's not openly available and operates under a different pricing model, reflecting a collaboration between Mistral AI and Microsoft.
18 hours ago · Details and insights about Dolphin 2.2.1 Mistral 7B LLM by ehartford: benchmarks, internals, and performance insights. Features: 7B LLM, VRAM: 14.4 GB, ...
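The 14.4 GB figure is consistent with simple back-of-the-envelope arithmetic for an unquantized fp16 7B model, at roughly 2 bytes per parameter:

```python
params = 7.2e9       # roughly 7B parameters
bytes_per_param = 2  # fp16/bf16 weights
print(f"{params * bytes_per_param / 1e9:.1f} GB")  # 14.4 GB, matching the listing
```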
6 days ago · Table 9: Quantization time (seconds) of rounding methods at W4G-1 with 200 steps for LLaMA V2 models and Mistral-7B. (From an appendix on quantization cost; cf. Table 8 ...)