[Paper] The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

rufus@discuss.tchncs.de · edited 8 months ago

rufus@discuss.tchncs.de · edited 9 months ago

Reading the speculation online, I think there must be a caveat… There is probably a reason why they only trained models up to 3B parameters. The team has Microsoft behind it, so they should have access to enough GPUs. Maybe the training is very expensive computationally.
