Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
Source: “Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space,” arXiv:2505.15778.
Introduction
Background
Large Language Models (LLMs) have shown impressive capabilities on complex reasoning tasks through Chain-of-Thought (CoT) prompting, which generates intermediate reasoning steps in natural language. However, standard CoT is constrained to discrete token embeddings, each a fixed point in semantic space, which limits expressive power. Human cognition involves fluid, abstract concepts beyond discrete linguistic tokens, a view supported by neuroscientific evidence of non-verbal conceptual processing. Because CoT samples a single token at each step, this discrete constraint restricts LLMs’ reasoning and leaves alternative reasoning paths unexplored, whereas humans weigh multiple possibilities simultaneously and integrate abstract concepts for more flexible reasoning.
Objective
The paper aims to enable LLMs to reason with soft, abstract concepts in a continuous concept space, transcending discrete language boundaries. Specifically, it proposes Soft Thinking, a training-free method that replaces discrete token selection with probabilistic soft aggregation over the vocabulary, forming concept tokens that encapsulate multiple meanings and explore various reasoning paths implicitly to converge toward correct answers more effectively.
Conclusion
Soft Thinking introduces a novel reasoning paradigm that breaks the bottleneck of discrete token-based reasoning by operating in a continuous concept space. By leveraging concept tokens formed as convex combinations of token embeddings, it enhances both the comprehensiveness of reasoning and convergence efficiency. Empirical results demonstrate consistent improvements in accuracy and token efficiency across diverse benchmarks without any training. The method points to future work on integrating training-based approaches and adapting to out-of-distribution (OOD) inputs, paving the way for more advanced LLM reasoning capabilities.
Literature Review
Chain-of-Thought (CoT) reasoning enhances multi-step reasoning by generating intermediate steps, with approaches including prompt-based methods, supervised fine-tuning, and reinforcement learning. However, efficiency concerns arise with longer chains. Continuous space reasoning has been explored, such as decoding intermediate variables from hidden states, interventions on hidden states, and latent planning tokens. Methods like COCONUT use hidden states as embeddings, but face challenges with decoupled input/output spaces in larger models. Soft Thinking addresses this by using probability distributions as a bridge, enabling training-free alignment in continuous spaces.
Methodology
Soft Thinking replaces discrete token sampling in CoT with concept tokens, which are probability distributions over the vocabulary. At each reasoning step, the model emits a concept token (the full next-token probability distribution) and computes the next input embedding as the probability-weighted sum of all token embeddings. The resulting embeddings live in the convex hull of the vocabulary embeddings, forming a continuous concept space. The Cold Stop mechanism monitors the entropy of each distribution and terminates intermediate reasoning early once confidence stays high over consecutive steps, preventing out-of-distribution collapse. Theoretically, the method approximates a summation over all reasoning paths via linearization. The implementation uses top-k filtering of the distribution for efficiency and integrates with SGLang.
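The core step above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the paper's code: the `top_k`, `tau`, and `patience` values are hypothetical placeholders, and a real system would do this on the model's logits and embedding matrix inside the inference engine.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def soft_thinking_step(logits, embedding_table, top_k=15):
    """One Soft Thinking step: turn the next-token distribution into a
    concept token and feed back a probability-weighted mixture of token
    embeddings, i.e. a point in the convex hull of the embedding rows."""
    probs = softmax(logits)
    # Top-k filtering for efficiency: keep the k most likely tokens and
    # renormalize so the weights still sum to 1 (stay in the convex hull).
    top = np.argsort(probs)[-top_k:]
    w = probs[top] / probs[top].sum()
    concept_embedding = w @ embedding_table[top]        # weighted sum
    entropy = -np.sum(w * np.log(w + 1e-12))            # confidence signal
    return concept_embedding, entropy

def cold_stop(entropies, tau=0.1, patience=3):
    """Cold Stop sketch: end the reasoning phase once entropy has stayed
    below a threshold for `patience` consecutive steps."""
    return len(entropies) >= patience and all(h < tau for h in entropies[-patience:])
```

A sharply peaked distribution yields a concept embedding close to a single token's embedding and near-zero entropy; a flat distribution yields an interpolated embedding that implicitly carries several candidate tokens forward.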
Experiment
Evaluations on math (Math500, AIME 2024, GSM8K, GPQA-Diamond) and coding (HumanEval, MBPP, LiveCodeBench) benchmarks show that Soft Thinking improves pass@1 accuracy by up to 2.48% and reduces token usage by up to 22.4% compared to standard CoT. It outperforms greedy CoT in accuracy while maintaining efficiency. Ablation studies confirm that probability-weighted embeddings outperform uniform averaging, and that Cold Stop is necessary to avoid collapse. Qualitative analysis shows interpretable intermediate outputs with shorter, more concise reasoning. The method generalizes across model architectures and scales without any training.
Reference
[1] Z. Zhang et al., “Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space,” arXiv:2505.15778, May 21, 2025. doi: 10.48550/arXiv.2505.15778.