Unverified commit a912d239, authored by Xingjun.Wang, committed by GitHub

Rebuild docs for speed benchmark (#1045)

* add qwen2.5 perf report

* update readme

* rebuild docs and fix format issue

* remove fuzzy in speed_benchmark.po

* fix issue

* recover function_call.po

* update

* remove unused code in speed_benchmark.po
Parent: 0f0ecfba
@@ -22,7 +22,7 @@ To learn more about Qwen2.5, feel free to read our documentation \[[EN](https://
- Quantization: the practice of quantizing LLMs with GPTQ, AWQ, as well as the guidance for how to make high-quality quantized GGUF files;
- Training: the instructions for post-training, including SFT and RLHF (TODO) with frameworks like Axolotl, LLaMA-Factory, etc.
- Framework: the usage of Qwen with frameworks for application, e.g., RAG, Agent, etc.
- - Benchmark: the statistics about inference speed and memory footprint (to be updated for Qwen2.5).
+ - Benchmark: the statistics about inference speed and memory footprint (Available for Qwen2.5).
## Introduction
@@ -37,7 +37,7 @@ In the past three months since Qwen2's release, numerous developers have built n
## News
- - 2024.09.19: We released the Qwen2.5 series. This time there are 3 extra model sizes: 3B, 14B, and 32B for more possibilities. Check our [blog](https://qwenlm.github.io/blog/qwen2.5) for more!
+ - 2024.09.19: We released the Qwen2.5 series. This time there are 3 extra model sizes: 3B, 14B, and 32B for more possibilities. Check our [blog](https://qwenlm.github.io/blog/qwen2.5) for more!
- 2024.06.06: We released the Qwen2 series. Check our [blog](https://qwenlm.github.io/blog/qwen2/)!
- 2024.03.28: We released the first MoE model of Qwen: Qwen1.5-MoE-A2.7B! Temporarily, only HF transformers and vLLM support the model. We will soon add the support of llama.cpp, mlx-lm, etc. Check our [blog](https://qwenlm.github.io/blog/qwen-moe/) for more information!
- 2024.02.05: We released the Qwen1.5 series.
@@ -46,7 +46,7 @@ In the past three months since Qwen2's release, numerous developers have built n
Detailed evaluation results are reported in this <a href="https://qwenlm.github.io/blog/qwen2.5/"> 📑 blog</a>.
- For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html) (to be updated for Qwen2.5).
+ For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).
## Quickstart
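The speed-benchmark page linked in the diff above reports GPU memory use and generation throughput per model and backend. As a rough, unofficial sanity check against those tables, the sketch below times greedy generation for a Qwen2.5 chat model with Hugging Face transformers and prints tokens per second plus peak GPU memory; the model name, prompt, and generation length are illustrative assumptions, not the benchmark's actual configuration.

```python
# Minimal throughput sketch, not the official speed-benchmark script.
# Assumed/illustrative: model name, prompt, and max_new_tokens.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # any Qwen2.5 instruct model should work
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Briefly explain what a KV cache is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Time greedy decoding and derive tokens per second from the newly generated tokens.
start = time.perf_counter()
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = output_ids.shape[-1] - input_ids.shape[-1]
print(f"{new_tokens} new tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tokens/s")
if torch.cuda.is_available():
    print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```

Single-request greedy decoding like this will typically underestimate what batched serving stacks (e.g., vLLM) can reach, so treat the number as a lower bound rather than a replacement for the published tables.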