Pre-training refers to unsupervised training that's done before a model is fine-tuned. The model still starts out random before it's pre-trained.

Here's where the Othello paper's weights are (randomly) initialized:

https://github.com/likenneth/othello_world/blob/master/mingp...
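For illustration, minGPT (which the Othello world-model repo builds on) draws fresh weights from a zero-mean normal distribution with std 0.02 before any pre-training happens. Here's a minimal stdlib-only sketch of that initialization; the function name and sizes are hypothetical, only the N(0, 0.02²) convention comes from minGPT:

```python
import random
import statistics

def init_weights(n, std=0.02, seed=0):
    # minGPT-style init: every weight is an i.i.d. draw from N(0, std^2),
    # so the model is pure noise until pre-training updates it.
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n)]

# A 768x768 linear layer's worth of freshly initialized weights.
w = init_weights(768 * 768)
mean = statistics.fmean(w)
sd = statistics.pstdev(w)
```

The sample mean should sit near 0 and the sample std near 0.02, confirming the weights carry no structure yet.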

