I don't think everyone's using the term 'firehose' the same here. A child comment refers to half a billion tokens for $20.
I did some really basic napkin math with some Rails logs. One request with some extra junk in it was about 400 tokens according to the OpenAI tokenizer[0]. 500M/400 = ~1.25 million log lines.
Paying linearly for logs at $20 per 1.25 million lines (roughly $16 per million log lines) is not reasonable for mid-to-high scale tech environments.
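The napkin math above can be sketched out; the 400-tokens-per-line figure and the $20-per-500M-tokens rate are assumptions taken from my own logs, so treat the outputs as rough estimates:

```python
# Rough cost model for LLM-ingested logs.
# Assumptions (not authoritative): ~400 tokens per Rails request log line,
# and a budget of 500M tokens for $20.
BUDGET_DOLLARS = 20.0
BUDGET_TOKENS = 500_000_000
TOKENS_PER_LOG_LINE = 400  # estimate from one request via the OpenAI tokenizer

log_lines_per_budget = BUDGET_TOKENS // TOKENS_PER_LOG_LINE
cost_per_million_lines = BUDGET_DOLLARS / (log_lines_per_budget / 1_000_000)

print(f"{log_lines_per_budget:,} log lines for ${BUDGET_DOLLARS:.0f}")
print(f"${cost_per_million_lines:.2f} per million log lines")
```

At any real request volume this adds up fast: a service doing ~1,000 requests/second would burn through that 1.25M-line budget in about 20 minutes.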
I think this would be sufficient if a 'firehose of data' is a bunch of news/media/content feeds that needs to be summarized/parsed/guessed at.
[0] https://platform.openai.com/tokenizer