Eggroll: Novel general-purpose machine learning algorithm provides 100x speed

(eshyperscale.github.io)

23 points | by felineflock 20 hours ago

2 comments

bdbdbdb 12 hours ago
What does this actually mean for LLMs? Cheaper training?
- MarkusQ 10 hours ago
  Yes. Provided it works as well as they claim.
  Not only cheaper but faster too, since in this case money ≈ hardware-cost × time. They claim that training throughput can come close to that of pure batch inference:
  > EGGROLL's efficiency results in a hundredfold increase in training throughput for billion-parameter models at large population sizes, nearly reaching the throughput of pure batch inference
- free_bip 3 hours ago
  They don't claim the technique competes with gradient descent. It competes with techniques like Proximal Policy Optimization, so it's better suited to things like turning an existing pre-trained model into a reasoning model.
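
For readers wondering what a gradient-free, reward-driven update of this kind could look like, here is a minimal sketch. It assumes an evolution-strategies-style loop with low-rank perturbations, which is one reading of the project and not the authors' implementation. The names `es_step`, `demo`, and `reward_fn`, the toy quadratic reward, and all hyperparameters are hypothetical. Each population member perturbs the weight matrix by a random low-rank matrix, is scored with forward evaluations only, and the weights move toward the perturbations that scored above average.

```python
import numpy as np


def es_step(W, reward_fn, rng, pop_size=64, rank=1, sigma=0.1, lr=0.02):
    """One evolution-strategies update of W using low-rank perturbations.

    Illustrative only: every name and default here is an assumption.
    """
    m, n = W.shape
    # Low-rank factors: member i gets the perturbation A[i] @ B[i].T / sqrt(rank).
    A = rng.standard_normal((pop_size, m, rank))
    B = rng.standard_normal((pop_size, n, rank))
    # Materialized here for clarity; an efficient version would keep the factors.
    E = np.einsum("pmr,pnr->pmn", A, B) / np.sqrt(rank)
    # Score each perturbed copy. No backpropagation is involved.
    rewards = np.array([reward_fn(W + sigma * E[i]) for i in range(pop_size)])
    # Standardize rewards so the step size does not depend on reward scale.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Reward-weighted average of the perturbations.
    update = np.einsum("p,pmn->mn", adv, E) / pop_size
    return W + (lr / sigma) * update


def demo(steps=200, seed=0):
    """Fit a small matrix to a random target using only reward values."""
    rng = np.random.default_rng(seed)
    target = rng.standard_normal((4, 3))

    def reward_fn(W):
        # Toy reward: negative squared distance to the target.
        return -np.sum((W - target) ** 2)

    W = np.zeros((4, 3))
    initial_loss = -reward_fn(W)
    for _ in range(steps):
        W = es_step(W, reward_fn, rng)
    return initial_loss, -reward_fn(W)
```

Calling `demo()` returns the toy loss before and after training. The update only ever asks "how good was this perturbed model?", which is why a method like this lines up against reward-based fine-tuning such as PPO and not against gradient-descent pre-training.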