DAVID_RANDOM Telegram 334
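I've recently sold off almost all of my spare old graphics cards second-hand, so I had to pull the 7800XT out to serve as the display card for bringing up the ARL build, and free up part of the W7900 for everyday use.

So I tried pushing a single card to its limit with Llama 70B: llama.cpp with iq4_xs model quantization, a q4_0-quantized KV cache, and a 128K context. After a day of running it sits right at the edge, with less than 200 MB of VRAM left.

Inference quality is surprisingly good, and performance is much better than my earlier dual-card q8 setup (single user, ~16 tokens/s).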
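A rough back-of-the-envelope check of why this setup only just fits on one card. This is a sketch under assumptions, not a measurement: the post does not say which Llama 70B was used, so the architecture numbers (80 layers, 8 KV heads, head dimension 128, about 70.6B parameters) are assumed from the Llama 3 family, and the weight estimate uses the nominal 4.25 bits per weight of iq4_xs, which real GGUF files exceed somewhat because some tensors are stored at higher precision. The q4_0 figure of 18 bytes per 32-element block is the ggml block layout.

```python
# Rough VRAM estimate: quantized weights + KV cache at 128K context.
# ASSUMPTIONS (not stated in the post): Llama 3-style 70B architecture.
N_LAYERS = 80          # assumed transformer layer count
N_KV_HEADS = 8         # assumed number of KV heads (grouped-query attention)
HEAD_DIM = 128         # assumed per-head dimension
N_PARAMS = 70.6e9      # assumed parameter count
N_CTX = 128 * 1024     # 128K context, from the post

GIB = 1024 ** 3

# (bytes per block, elements per block) for each cache type.
CACHE_TYPES = {
    "f16": (2, 1),     # 2 bytes per element
    "q4_0": (18, 32),  # 16 bytes of 4-bit values + 2-byte scale per 32 elements
}


def kv_cache_bytes(n_ctx: int, cache_type: str) -> int:
    """Size of the K and V caches for n_ctx tokens."""
    block_bytes, block_elems = CACHE_TYPES[cache_type]
    elems_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM  # K and V
    return n_ctx * elems_per_token * block_bytes // block_elems


def weight_bytes(n_params: float, bits_per_weight: float) -> int:
    """Nominal weight size, ignoring tensors kept at higher precision."""
    return int(n_params * bits_per_weight / 8)


kv_q4 = kv_cache_bytes(N_CTX, "q4_0")
kv_f16 = kv_cache_bytes(N_CTX, "f16")
weights = weight_bytes(N_PARAMS, 4.25)  # iq4_xs is nominally 4.25 bits/weight

print(f"KV cache, f16 : {kv_f16 / GIB:.2f} GiB")
print(f"KV cache, q4_0: {kv_q4 / GIB:.2f} GiB")
print(f"Weights, iq4_xs (nominal): {weights / GIB:.2f} GiB")
print(f"Weights + q4_0 KV: {(weights + kv_q4) / GIB:.2f} GiB")
```

Under these assumptions an f16 KV cache at 128K context would be too large to share a 48 GB card with the weights, while the q4_0 cache leaves a small margin for compute buffers, which is consistent with the "right at the edge" result described above.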



tgoop.com/david_random/334

BY David's random thoughts





