I wouldn't say NLP as a field was resistant to probabilistic approaches or even neural networks. From roughly 2000 to 2018, almost all the papers were about using probabilistic methods for word sense disambiguation or parsing or sentiment analysis or whatever. What changed was that these tasks turned out not to matter much for the ultimate goal of building language technologies. We thought things like parsing were going to be important because we assumed any system that could understand language would have to do so on the basis of a parse tree. But it turns out a gigantic neural network text generator can do nearly anything we ever wanted from a language technology, without dealing with any of the intermediate tasks that used to get so much attention. It's like the whole field got short-circuited.