Abstract: Chain-of-thought distillation (CoT-distillation) aims to endow small language models (SLMs) with reasoning ability, improving their performance on specific tasks by allowing them to ...
Abstract: Transformers have demonstrated impressive capabilities across various tasks, yet their performance on compositional problems remains a subject of debate. In this study, we investigate the ...