Yap Wei Herng
Passionate about artificial intelligence, machine learning, and software development. Sharing insights from research and practical experience.
Research Areas
Artificial Intelligence
Machine learning, deep learning, neural networks, and intelligent systems.
Software Engineering
Best practices, architecture patterns, and modern development methodologies.
Research & Development
Exploring cutting-edge technologies and innovative solutions.
Discover Random
Agent-based Automated Claim Matching with Instruction-following LLMs
The motivation stems from the need to overcome limitations in existing LLM-based claim matching methods. Earlier studies, such as Pisarevskaya and Zubiaga (2025), demonstrated strong results with hand-crafted prompts but highlighted the...
Recent Posts
Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting
Adapting language models (LMs) to new tasks via post-training carries the risk of degrading existing capabilities, a phenomenon known as catastrophic forgetting. It has been observed in both supervised fine-tuning (SFT) for instruction following and reinforcement learning (RL) for preference alignment. However, the comparative susceptibility of SFT and RL...
Agent-based Automated Claim Matching with Instruction-following LLMs
Automated fact-checking pipelines rely on claim matching to identify claims that can be verified using the same evidence or fact-check. This task is crucial for scaling fact-checking efforts, as it helps in grouping related claims for efficient verification. Previous work has framed claim matching as a ranking problem...
The 10,000x Explosion: Reproducing DeepSeek’s mHC at Scale
mHC: Manifold-Constrained Hyper-Connections
Deep neural network architectures have evolved significantly since the introduction of ResNets in 2016, with residual connections becoming a cornerstone of modern models like Transformers and large language models (LLMs). Hyper-Connections (HC) extended this paradigm by expanding the residual stream width and diversifying connectivity patterns, yielding performance gains...
ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Large Language Models (LLMs) are increasingly deployed in high-stakes domains such as science, law, and healthcare, where accurate expressions of uncertainty are essential for reliability and trust. However, current LLMs often generate incorrect answers with high confidence, a phenomenon known as overconfidence. This undermines trust and poses challenges...