V2EX_HOT Telegram 3618
V2EX-最热主题
C++如何优化矩阵乘法 gemm
#v2ex

Avafly:

最近在用 C 手写模型推理, 其中 gemm 可以说是核心计算, 于是决定以学习为目的自己尝试优化一下.

用 3 个 for 循环可以实现最基本的矩阵乘法, 在我用 simd, blocking, 并行计算这些方法之后, 速度比 naive 版本的快了很多, 但还是会比 openblas 慢不少. 接下来该怎么做有点没头绪了. 我想知道有没有办法能进一步提升? 谢谢

代码地址: https://github.com/Avafly/optimize-gemm

source



tgoop.com/v2ex_hot/3618
Create:
Last Update:

V2EX-最热主题
C++如何优化矩阵乘法 gemm
#v2ex

Avafly:

最近在用 C 手写模型推理, 其中 gemm 可以说是核心计算, 于是决定以学习为目的自己尝试优化一下.

用 3 个 for 循环可以实现最基本的矩阵乘法, 在我用 simd, blocking, 并行计算这些方法之后, 速度比 naive 版本的快了很多, 但还是会比 openblas 慢不少. 接下来该怎么做有点没头绪了. 我想知道有没有办法能进一步提升? 谢谢

代码地址: https://github.com/Avafly/optimize-gemm

source

BY V2EX 最热主题


Share with your friend now:
tgoop.com/v2ex_hot/3618

View MORE
Open in Telegram


Telegram News

Date: |

Deputy District Judge Peter Hui sentenced computer technician Ng Man-ho on Thursday, a month after the 27-year-old, who ran a Telegram group called SUCK Channel, was found guilty of seven charges of conspiring to incite others to commit illegal acts during the 2019 extradition bill protests and subsequent months. Telegram Android app: Open the chats list, click the menu icon and select “New Channel.” Read now Private channels are only accessible to subscribers and don’t appear in public searches. To join a private channel, you need to receive a link from the owner (administrator). A private channel is an excellent solution for companies and teams. You can also use this type of channel to write down personal notes, reflections, etc. By the way, you can make your private channel public at any moment. The imprisonment came as Telegram said it was "surprised" by claims that privacy commissioner Ada Chung Lai-ling is seeking to block the messaging app due to doxxing content targeting police and politicians.
from us


Telegram V2EX 最热主题
FROM American