You can probably beat the standard if you have a special case to optimize for. For example, if your documents are fixed-size "chunks," you don't need to normalize by length, and if you can extract sets of keywords with NLP, you don't need to normalize by term frequency.
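As a rough illustration of that special case, here's a minimal sketch assuming fixed-size chunks and keyword-set queries, so scoring collapses to an IDF-weighted set overlap (the corpus, names, and smoothing are all made up for the example):

```python
import math
from collections import defaultdict

# Hypothetical setup: every "document" is a fixed-size chunk represented as a
# keyword set, and queries arrive as keyword sets too, so neither length nor
# term-frequency normalization is needed.
chunks = [
    {"neural", "network", "training"},
    {"query", "ranking", "search"},
    {"network", "search", "latency"},
]

df = defaultdict(int)          # document frequency per keyword
for chunk in chunks:
    for kw in chunk:
        df[kw] += 1

def idf(kw):
    # Smoothed inverse document frequency.
    return math.log((len(chunks) + 1) / (df[kw] + 1))

def score(query_keywords, chunk):
    # Pure IDF-weighted overlap: no TF, no length normalization.
    return sum(idf(kw) for kw in query_keywords & chunk)

query = {"network", "search"}
ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
print(ranked[0], score(query, ranked[0]))
```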
Also you can get some cool behavior out of representing a corpus as a competitive network that reverberates, where a query yields an “impulse response”.
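A toy sketch of what that could look like, assuming a binary term-document graph and a hand-rolled competitive (suppress-below-average) normalization; the damping and thresholding choices are my own, not a standard algorithm:

```python
import numpy as np

# Hypothetical toy corpus of short documents.
docs = [
    "neural networks learn distributed representations",
    "search engines rank documents for a query",
    "spreading activation propagates relevance through a network",
    "networks of documents and terms can reverberate",
]

# Binary term-document incidence matrix (terms x docs).
vocab = sorted({w for d in docs for w in d.split()})
t_index = {w: i for i, w in enumerate(vocab)}
M = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        M[t_index[w], j] = 1.0

def impulse_response(query, steps=5, decay=0.7):
    """Inject the query as an impulse on the term layer, then let activation
    reverberate term -> doc -> term, applying a competitive normalization
    (below-average units are suppressed) at each layer."""
    impulse = np.zeros(len(vocab))
    for w in query.split():
        if w in t_index:
            impulse[t_index[w]] = 1.0
    a_terms = impulse.copy()
    a_docs = np.zeros(len(docs))
    for _ in range(steps):
        # terms -> docs
        a_docs = decay * a_docs + M.T @ a_terms
        a_docs = np.maximum(a_docs - a_docs.mean(), 0)   # competition
        if a_docs.max() > 0:
            a_docs /= a_docs.max()
        # docs -> terms, keeping the original query impulse switched on
        a_terms = decay * (M @ a_docs) + impulse
        a_terms = np.maximum(a_terms - a_terms.mean(), 0)
        if a_terms.max() > 0:
            a_terms /= a_terms.max()
    return a_docs  # the corpus's "impulse response" to the query

print(impulse_response("network activation"))
```

Documents that share terms with the query light up first, and related documents get pulled in on later iterations, so the final activation pattern is the ranking.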