Model Merging in Pre-training of Large Language Models