Page 8 of the technical paper [1] is especially informative.
The first chart (Cumulative Average NLL for Long Documents) shows a deviation from the trend and an increase in accuracy when working with >=1M tokens. The 1.0 graph is overlaid and supports the experience of 'muddiness'.
The first chart (Cumulative Average NLL for Long Documents) shows a deviation from the trend and an increase in accuracy when working with >=1M tokens. The 1.0 graph is overlaid and supports the experience of 'muddiness'.
[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...