[B! avx] masterq's bookmarks

masterq id:masterq

masterq's bookmarks on avx (7)

GitHub - ggerganov/ggml: Tensor library for machine learning
masterq 2023/06/08
ai

simd

x86

avx

gpt

nogpu
GitHub - ggerganov/llama.cpp: LLM inference in C/C++
masterq 2023/03/13
facebook

ai

c

c++

llm

cuda

simd

avx

opencl

gpu
Intel Publishes Blazing Fast AVX-512 Sorting Library, Numpy Switching To It For 10~17x Faster Sorts - Phoronix
Written by Michael Larabel in Intel on 15 February 2023 at 04:00 PM EST. Intel recently published an open-source C++ header file library for high performance SIMD-based sorting, which initially is focused on providing a lightning fast AVX-512 quicksort implementation. As of today that co
masterq 2023/02/17
intel

sort

simd

avx

lib

performance
GitHub - ggerganov/whisper.cpp: Port of OpenAI's Whisper model in C/C++
Stable: v1.7.4 / Roadmap | F.A.Q. High-performance inference of OpenAI's Whisper automatic speech recognition (ASR) model: plain C/C++ implementation without dependencies; Apple Silicon first-class citizen, optimized via ARM NEON, Accelerate framework, Metal and Core ML; AVX intrinsics support for x86 architectures; VSX intrinsics support for POWER architectures; mixed F16 / F32 precision; Integer qua
masterq 2022/11/22
Does this mean that SIMD, pushed hard enough, can deliver GPGPU-class performance? To do: read through ggml.c and whisper.cpp.

c

c++

speech

recognition

avx

simd

voice

ai

read later
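Why does using "mmap" for file reads and writes in a program improve speed?
In software development, reading and writing files is a frequently used operation, and speeding it up has a large effect on the performance of the software as a whole. Alexandra Fedorova, an associate professor at the University of British Columbia, explains why file operations through mmap can be faster than ordinary system calls. Why mmap is faster than system calls | by Alexandra (Sasha) Fedorova | Medium https://sasha-f.medium.com/why-mmap-is-faster-than-system-calls-24718e75ab37 When a user runs a program on an OS, the program uses two kinds of regions, called "user space" and "kernel space".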
masterq 2020/10/31
So with AVX support, would it get faster even without using mmap?

mmap

linux

simd

avx

intel

x86
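I tested whether double is faster than float on x86 - Qiita
An old tale: once upon a time, x86 had no means of performing floating-point arithmetic, and floating-point operations were achieved by attaching an external floating-point unit. That unit, called the x87, was a truly excellent system... but I will leave that story to Wikipedia. https://ja.wikipedia.org/wiki/Intel_8087 What matters is that the x87 uses 80-bit extended double precision as its internal representation. Thanks to this, on the x87, ~~double really was faster (because float incurs the cost of a cast to double)~~ No, that is wrong. When I looked at the ASM, there was no casting at all. If anything, the more important point is that precision differs because of rounding. Looking into it again, I notice that there are not many sources claiming double is faster. (Sources that treat them as equivalent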
masterq 2019/12/06
double

float

benchmark

sse

x86

intel

doc

japanese

avx
GitHub - tanakamura/instruction-bench: instruction-bench
masterq 2017/01/16
amd64

benchmark

asm

assembler

intel

avx