
LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

This paper was accepted at the Workshop on Memory for LLM-Based Agentic Systems at ICLR.

Language models have consistently grown to compress more world knowledge into their parameters, but the knowledge that can be pretrained into them is upper-bounded by their parameter size. The capacity of Small Language Models (SLMs) in particular is limited, leading to factually incorrect generations. This problem is often mitigated by giving the SLM access to an outside source: the ability to query a larger model, documents, or a database. Under this setting, we study the fundamental question of which…
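As a rough illustration of the setting the abstract describes, here is a minimal Python sketch of an SLM that answers from its own parameters when it is confident and otherwise defers to an outside source (a larger model, a document index, or a database). Every name here, including the confidence interface and the threshold, is a hypothetical assumption for illustration, not the paper's actual method.

    # Minimal sketch of the paper's setting. All names are hypothetical
    # illustrations, not the authors' method or API.
    from typing import Callable

    def answer(
        query: str,
        slm_generate: Callable[[str], tuple[str, float]],  # returns (answer, confidence)
        external_lookup: Callable[[str], str],  # larger model / retrieval / database
        threshold: float = 0.5,  # assumed confidence cutoff, purely illustrative
    ) -> str:
        """Answer from the SLM's parameters when confident,
        otherwise query the outside source."""
        draft, confidence = slm_generate(query)
        if confidence >= threshold:
            return draft  # knowledge stored in the SLM's parameters
        return external_lookup(query)  # knowledge delegated to the outside source

    if __name__ == "__main__":
        # Toy stand-ins for the SLM and the external source.
        slm = lambda q: ("Paris", 0.9) if "France" in q else ("unknown", 0.1)
        db = lambda q: f"[retrieved answer for: {q}]"
        print(answer("What is the capital of France?", slm, db))    # from parameters
        print(answer("Who won the 1903 Tour de France?", slm, db))  # deferred

The question the paper poses, under assumptions like these, is which knowledge the SLM should keep in its parameters at all versus delegate to the outside source.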
