you don’t actually need to fit the whole model in RAM at once: the 70B, for example, “requires” something like 120 GB of VRAM, but i’m running it on my 64 GB M1 MacBook Pro. it just runs a bit slower (still very usable; i reckon about one word every 300 ms)
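
for a rough sense of the numbers, here's a back-of-envelope sketch. it assumes weight size is just parameter count times bytes per parameter, and ignores the KV cache, activations and runtime overhead. the precisions listed are common ones picked for illustration, not necessarily what the setup above uses.

```python
# Back-of-envelope weight sizes: weights only, no KV cache or activations.
PARAMS = 70e9  # the "70B" model mentioned above

# Bytes per parameter at some common precisions (illustrative assumption).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_gb(PARAMS, nbytes):.0f} GB")

# "One word every 300 ms" expressed as a rate.
words_per_second = 1 / 0.300
print(f"~{words_per_second:.1f} words/s")
```

at 16-bit that comes out around 140 GB, the same ballpark as the "requires" figure, and one word every 300 ms is a bit over 3 words per second.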