This is pulling the content of the RSS feeds of several news sites into the context window of an LLM and then asking it to summarize news items into articles and fill in the blanks?
I'm asking because that is what it looks like, but AI / LLMs are not specifically mentioned in this blog post; they just say the news is 'generated' under the 'News in your language' heading, which seems to imply that is what they are doing.
I'm a little skeptical of the approach: when you ask an LLM to point to 'sources' for the information it outputs, as far as I know there is no guarantee that those are correct – and it does seem like sometimes they just use pure LLM output, as no sources are cited, or it's attributed to 'common knowledge'.
Just for concrete confirmation that LLM(s) are being used, there's an open issue on the GitHub repository, on hallucinations with made up information, where a Kagi employee specifically mentions "an LLM hallucination problem":
There's also a line at the bottom of the about page at https://kite.kagi.com/about that says "Summaries may contain errors. Please verify important information."
FWIW, as someone who has chosen to pay for Kagi for three years now:
- I agree fake news is a real problem
- I pay for Kagi because I get much more precise results[1]
- They have a public feedback forum and I think every time I have pointed out a problem they have come back with an answer and most of the time also a fix
- When Kagi introduced AI summaries in search they made it opt in, and unlike every other AI summary provider I had seen at that point they have always pointed to the sources. The AI might still hallucinate[2] but if it does I am confident that if I pointed it out to them my bug report would be looked into and I would get a good answer and probably even a fix.
[1]: I hear others say they get more precise Google results, and if so, more power to them. I have used Google enthusiastically since 2005, as the only real option from 2012, as a fallback for DDG from somewhere between 2012 and 2022, and basically only when I am on other people's devices or to prove a point since I started using Kagi in 2022.
[2]: haven't seen much of that, but that might be because of the kind of questions I ask and the fact that I mostly use ordinary search.
It is getting easier and easier to fake stuff, and there are fewer and fewer fully trusted institutions. So sadly I think you are right. It's scary, but we are likely heading towards a future where you need to pay to get verified information, and even that will likely be segmented into different subscriptions depending on what information you want.
Well the thing is that technically information is free, but creating it is definitely not.
So, if ads are not paying for it, and people won't pay for it either, who does?
Fake news exists because of the perverse incentives of the system, where getting as many clicks as possible is what matters. This is very much a result of social networks and view-based remuneration.
I don't think it's that bad if people need to pay for real information...
It seems like a challenging situation: if you pay fact checkers you get accused of censorship by “weaponised free speech”, and if you leave it to the community you get inconsistent results.
The first one sounds like it's an argument made by someone who never wanted the facts to begin with. Correcting misinformation is not stifling free speech.
I'm all for more proper fact checkers, backed by reputable sources.
To take a moment to be a hopeless Stan for one of my all-time favorite companies: I don't think the summary above yours is fair, and I see why they don't center the summary part of it.
Unlike the disastrous Apple feature from earlier this year (which is still available, somehow!), this isn't trying to transform individual articles. Rather, it's focused on capturing broader trends and giving just enough info to decide whether to click into any of the source articles. That seems like a much smaller, more achievable scope than Apple's feature, and as always, open-source helps work like this a ton.
I, for one, like it! I'll try it out. Seems better than my current sources for a quick list of daily links, that's for sure (namely Reddit News, Apple News, Bluesky in general, and a few industry newsletters).
>giving just enough info to decide whether to click into any of the source articles.
If that info is hallucinated, then it's worse than useless. Clickbait still attempts to represent the article; a hallucination isn't guaranteed to do that.
Why not have someone properly vet out interesting and curious news and articles and provide traffic to their site? In this age of sincerity, proper citation is more vital than ever.
Yeah. I really like Kagi. This is a terrible idea.
1. It seems to omit key facts from most stories.
2. No economic value is returned to the sources doing the original reporting. This is not okay.
3. If your summary device makes a mistake, and it will, you are absolutely on the hook for libel.
There seem to be some misunderstandings about what news is and what makes it well-executed. It’s not the average, it’s the deepest and most accurate reporting. If anyone from the Kagi team wants to discuss, I’m a paying member and I know this field really, really well.
Thank you. Also a paying Kagi user because I like the idea that it’s worth it to pay for a good service. Ripping off journalists/newspapers content goes against that.
> It’s not the average, it’s the deepest and most accurate reporting.
Yes! I'm also a paying member but I'm deeply suspicious of this feature.
The website claims "we expose readers to the full spectrum of global perspectives", but not all perspectives are equal. It smacks of "all sides" framing which is just not what news ought to be about.
Yes, that's what it is. Kagi as a brand is LLM-optimist, so you may be fundamentally at odds with them here... If it lessens the issue for you, the sources of each item are cited properly in every example I tried, so maybe you could treat it as a fancy link aggregator
Kagi founder here. I am personally not an LLM-optimist. The thing is that I do not think LLMs will bring us to "Star Trek" level of useful computers (which I see humans eventually getting to) due to LLM's fundamentally broken auto-regressive nature. A different approach will be needed. Slight nuance but an important one.
Kagi as a brand is building tools in service of its users, no particular affinity towards any technologies.
You claimed reading LLM summaries will provide complete understanding. Optimistic would be a charitable description of this claim. And optimism is not limited to the most optimistic.
Another LLM-pragmatist here. I don't see why we should treat LLMs differently than any other tool in the box. Except maybe that it's currently the newest and most shiny, albeit still a bit clunky and overpriced.
Fwiw, I love your approach to AI. It's been very useful to me. Quick Answers especially has been amazingly accurate; I've used it hundreds of times, if not thousands, and I routinely check the links it gives.
I'm about as AI-pessimist as it gets, but Kagi's use of LLMs is the most tasteful and practical I've seen. It's always completely opt-in (e.g. "append a ? to your search query if you want an AI summary", as opposed to Google's "append a swear word to your search query if you don't want one"), it's not pushy, and it's focused on summarizing and aggregating content rather than trying to make it up.
Google thinks the same of me and I don't even edit the URL. I can have a session working just fine one night and come back the next day, open a new tab to search for something, and get captcha'd to hell. I'm fairly sure they just mess with Firefox on purpose. I won't install Brave, Chrome, or Edge out of principle either. Safari works fine, but I don't like it.
Google has gotten amazingly hostile toward power users. I don't even try to use it anymore. It almost feels like they actively hate people that learned how to use their tools
I consider myself a major LLM optimist in many ways, but if I'm receiving a once per day curated news aggregation feed I feel I'd want a human eye. I guess an LLM in theory might have less of the biases found in humans, but you're trading one kind of bias for another.
This isn't really comparable. A newspaper is a single source. The New York Times is a newspaper; CNN (a part of it, anyway) is effectively a newspaper too. Services like Kagi News, whether AI or human-curated, try to do aggregation and meta-analysis of many newspapers.
Yeah, I agree. The entire value/fact dichotomy that the announcement bases itself on is a pretty hot philosophical topic I lean against Kagi on. It's just impossible to summarize any text without imparting some sort of value judgement on it, therefore "biasing" the text
> It's just impossible to summarize any text without imparting some sort of value judgement on it, therefore "biasing" the text
Unfortunately, the above is nearly a cliché at this point. The phrase "value judgment" is insufficient because it occludes some important differences. To name just two that matter: there is a key difference between (1) a moral value judgment and (2) selection & summarization (often intended to improve information density for the intended audience).
For instance, imagine two non-partisan medical newsletters. Even if they have the same moral values (e.g. rooted in the Hippocratic Oath), they might have different assessments of what is more relevant for their audience. One could say both are "biased", but does doing so impart any functional information? I would rather say something like "Newsletter A is run by Editorial Board X with such-and-such a track record and is known for careful, long-form articles" or "Newsletter B is a one-person operation known for a prolific stream of hourly coverage." In this example, saying the newsletters differ in framing and intended audience is useful, but calling each "biased in different ways" is a throwaway comment (having low informational content in the Shannonian sense).
Personally, instead of saying "biased" I tend to ask questions like: (a) Who is their intended audience? (b) What attributes and qualities consistently shine through? (c) How do they make money? (d) Is the publication/source transparent about their approach? (e) What is their track record on accuracy, separating commentary from factual claims, professional integrity, disclosure of conflicts of interest, level of intellectual honesty, epistemic standards, and corrections?
> The entire value/fact dichotomy that the announcement bases itself on
Hmmm. Here I will quote some representative sections from the announcement [1]:
>> News is broken. We all know it, but we’ve somehow accepted it as inevitable. The endless notifications. The clickbait headlines designed to trigger rather than inform, driven by relentless ad monetization. The exhausting cycle of checking multiple apps throughout the day, only to feel more anxious and less informed than when we started. This isn’t what news was supposed to be. We can do better, and create what news should have been all along: pure, essential information that respects your intelligence and time.
>> .. Kagi News operates on a simple principle: understanding the world requires hearing from the world. Every day, our system reads thousands of community curated RSS feeds from publications across different viewpoints and perspectives. We then distill this massive information into one comprehensive daily briefing, while clearly citing sources.
>> .. We strive for diversity and transparency of resources and welcome your contributions to widen perspectives. This multi-source approach helps reveal the full picture beyond any single viewpoint.
>> .. If you’re tired of news that makes you feel worse about the world while teaching you less about it, we invite you to try a different approach with Kagi News, so download it today ...
I don't see any evidence from these selections (nor the announcement as a whole) that their approach states, assumes, or requires a value/fact dichotomy. Additionally, I read various example articles to look for evidence that their information architecture groups information along such a dichotomy.
Lastly, to be transparent, I'll state a claim that I find to be true: for many/most statements, it isn't that difficult or contentious to separate out factual claims from value claims. We don't need to debate the exact percentages or get into the weeds on this unless you think it will be useful.
I will grant this -- which is a different point than the one the commenter above made -- when reading various articles from a particular source, it can take effort and analysis to suss out the source's level of intellectual honesty, ulterior motives, and other questions I mention in my sibling comment.
Hard pass then. I’m a happy Kagi search subscriber, but I certainly don’t want more AI slop in my life.
I use RSS with newsboat and I get mainstream news by visiting individual sites (nytimes.com, etc.) and using the Newshound aggregator. Also, of course, HN with https://hn-ai.org/
You can also convert regular newspapers into RSS feeds! NYTimes and Seattle Times have official RSS feeds, and with some scripting you can also get their article contents.
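For anyone wondering what "some scripting" looks like in practice, here's a minimal sketch in Python. It assumes the feedparser and trafilatura libraries and uses the public NYTimes homepage feed URL as an example; check each site's terms before pulling article bodies like this.

    # Minimal sketch: read a newspaper's official RSS feed and pull article text.
    # Assumes `pip install feedparser trafilatura`. The feed URL is the public
    # NYTimes HomePage feed; swap in whichever feed you actually follow.
    import feedparser
    import trafilatura

    FEED_URL = "https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml"

    feed = feedparser.parse(FEED_URL)
    for entry in feed.entries[:5]:                 # first five headlines only
        html = trafilatura.fetch_url(entry.link)
        body = trafilatura.extract(html) if html else None
        print(entry.title)
        print((body or "[could not extract article body]")[:300], "\n")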
> when you ask an LLM to point to 'sources' for the information it outputs, as far as I know there is no guarantee that those are correct
A lot of times when I ask for a source, I get broken links. I'm not sure if the links existed at one point, or if the LLM is just hallucinating where it thinks a link should exist. CDN libraries, for example. Or sources to specific laws.
I monitor 404 errors on my website. ChatGPT frequently sends traffic to pages that never existed. Sometimes the information they refer to has never existed on my website.
For example: "/glossary/love-parade" - There is no mention of this on my website. "/guides/blue-card-germany" has always been at "/guides/blue-card". I don't know what "/guides/cost-of-beer-distribution" even refers to.
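(For the curious, the monitoring itself is mostly a bit of log parsing. A rough sketch, assuming an nginx combined-format access log; the path and referrer list below are placeholders rather than my actual setup.)

    # Rough sketch: count 404s whose referrer points at an AI chat product.
    # Assumes a combined-format access log at the (placeholder) path below.
    import re
    from collections import Counter

    LOG_PATH = "/var/log/nginx/access.log"         # placeholder path
    AI_REFERRERS = ("chatgpt.com", "chat.openai.com", "perplexity.ai")

    line_re = re.compile(
        r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]*" (?P<status>\d{3}) \S+ "(?P<ref>[^"]*)"'
    )

    hits = Counter()
    with open(LOG_PATH) as log:
        for line in log:
            m = line_re.search(line)
            if m and m.group("status") == "404" and any(
                d in m.group("ref") for d in AI_REFERRERS
            ):
                hits[m.group("path")] += 1

    for path, n in hits.most_common(20):
        print(n, path)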
They'll do pretty much everything you ask of them, so unless the text actually comes from some source (via tool calls, content injected into the context, or some other way), they'll make up a source rather than do nothing, unless prompted otherwise.
On my llm, I have a prompt that condenses down to:
For every line of text output, give me a full MLA annotated source. If you cannot then say your source does not exist or you are generating information based on multiple sources then give me those sources. If you cannot do that, print that you need more information to respond properly.
Every new model I mess with needs a slightly different prompt due to safeguards or source protections. It is interesting when it lists a source that I physically own and their training data is deteriorated.
They could make up a source, but ChatGPT is an actual app with a complicated backend, not a dumb pipe between a text editor and a GPU. Surely they could verify, on the server side, every link they output to the user before including it in the answer. I'm sure Codex will implement it in no time!
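A sketch of what such a server-side check could look like (not how OpenAI actually does it); it assumes the requests library and simply drops links that don't resolve:

    # Sketch: verify that every URL an LLM wants to cite actually resolves
    # before the answer is sent to the user. Not any vendor's real pipeline,
    # just the obvious server-side filter described above.
    import requests

    def link_is_live(url: str, timeout: float = 5.0) -> bool:
        try:
            r = requests.head(url, allow_redirects=True, timeout=timeout)
            if r.status_code == 405:               # some servers reject HEAD
                r = requests.get(url, stream=True, timeout=timeout)
            return r.status_code < 400
        except requests.RequestException:
            return False

    def filter_citations(urls: list[str]) -> list[str]:
        return [u for u in urls if link_is_live(u)]

Of course this only proves a link exists, not that it supports the claim next to it.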
They surely can detect it, but what are they going to do after detecting it? Loop the last job with a different seed and hope that the model doesn't lie through its teeth? They won't be doing it because the model will gladly generate you a fake source on the next retry too.
Maybe they should be trained on the understanding that making up a source is not "doing what you ask of them" when you ask for a source. It's actually the exact opposite of the "doing what you asked, not what you wanted" trope-- it's providing something it thinks you want instead of providing what you asked for (or being honest/erroring out that it can't).
Think for a second about what that means... this is a very easy thing to do IFF we already had a general purpose intelligence.
How do you make an LLM understand that it must only give factual sources? Just some naive RL with positive reward on the correct sources and negative reward on incorrect sources is not enough -- there are obscenely many more hallucinated sources possible, and the set of correct sources is a set of insanely tiny measure.
"Easy". You make the model distinguish between information and references to information. Information may be fabricated (for example, a fictional book is mostly composed of lies) but references are assumed to be factual (a link does point to something and is related to something). Factual information is true only to the degree that it is conveyed exactly, so the model needs to be able to store and reproduce references verbatim.
Of course, "easy" is in quotes because none of this is easy. It's just easier than AGI.
If you need to ask for a source in the first place, chances are very high that the LLM's response is not based on summarizing existing sources but rather exclusively quoting from memory. That usually goes poorly, in my experience.
The loop "create a research plan, load a few promising search results into context, summarize them with the original question in mind" is vastly superior to "freely associate tokens based on the user's question, and only think about sources once they dig deeper".
It actually seems more like an aggregator (like ground.news) to me. And pretty much every single sentence cites the original article(s).
There are nice summaries within an article. I think what they mean is that they generate a meta-article after combining the rest of them. There's nothing novel here.
But the presentation of the meta-article and publishing once a day feel like great features.
I have yeah, to me it looks like what I described in my comment above, it's LLM generated text, is it not?
> And pretty much every single sentence cites the original article(s).
Yeah but again, correct me if I'm wrong, but I don't think asking an LLM to provide a source / citation yields any guarantee that the text it generates alongside it is accurate.
I also see a lot of text without any citations at all, here are three sections (Historical background, Technical details and Scientific significance) that don't cite any sources: https://kite.kagi.com/s/5e6qq2
I can envision the day where an LLM article generator starts consuming LLM generated articles which were sourced from single articles (co-written by an LLM).
I guess I'm trying to understand your comment. Is there a distinction you're making between LLM summaries or LLM generated text, or are you stating that they aren't being transparent about the summaries being generated by LLMs (as opposed to what? human editors?).
Because at some point when I launched the app, it did say summaries might be inaccurate.
Looks like you found an example where it isn't properly citing the summaries. My guess is that they will tighten this up, because I looked mostly at the first and second page and most of those articles seemed to have citations in the summaries.
Like most people, I would want those everywhere to guard against potential hallucinations. No, the citations don't guarantee that there weren't any hallucinations, but if you read something that makes you go "huh" – the citations give you a low-friction opportunity to read more.
But another sibling commenter talked about the phys.org and google both pointing to the same thing. I agree, and this is exactly an issue I have with other aggregators like Ground.news.
They need to build some sort of graph that distills down duplicates. Like I don't need the article to say "30 sources" when 26 of them are just reprints of an AP/Reuters wire story. That shouldn't count as 30 sources.
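A crude sketch of the kind of de-duplication that would help, using sentence embeddings; the model name and the 0.9 similarity threshold are arbitrary assumptions, not anything Kagi or Ground.news actually does:

    # Crude sketch: collapse near-identical wire reprints before counting sources.
    # Assumes `pip install sentence-transformers`; model and threshold are
    # arbitrary choices for illustration.
    from sentence_transformers import SentenceTransformer, util

    def distinct_sources(articles: list[str], threshold: float = 0.9) -> list[int]:
        model = SentenceTransformer("all-MiniLM-L6-v2")
        emb = model.encode(articles, convert_to_tensor=True)
        keep = []
        for i in range(len(articles)):
            if all(util.cos_sim(emb[i], emb[j]).item() < threshold for j in keep):
                keep.append(i)                     # keep one article per cluster
        return keep                                # indices of distinct stories

Then "30 sources" becomes "4 distinct stories, one of them reprinted 26 times", which is far more honest.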
> I guess I'm trying to understand your comment. Is there a distinction you're making between LLM summaries or LLM generated text, or are you stating that they aren't being transparent about the summaries being generated by LLMs (as opposed to what? human editors?).
The main point of my original comment was that I wanted to understand what this is, how it works and whether I can trust the information on there, because it wasn't completely clear to me.
I'm not super up to date with AI stuff, but my working knowledge is that I should never trust the output of an LLM and always verify it myself, so therefore I was wondering if this is just LLM output or if there is some human review process, or a mechanism related to the citation functions that makes it output of a different, more trusted category.
I did catch the message on the loading screen as well now, I do still think it could be a little more clear on the individual articles about it being LLM generated text, apart from that I think I understand somewhat better what it is now.
> No, the citations don't guarantee that there weren't any hallucinations, but if you read something that makes you go "huh" – the citations give you a low-friction opportunity to read more.
Either you mean every time you read something interesting (“huh”) you should check it. But in that case, why bother with reading the AI summary in the first place…
Or you mean that any time you read something that sounds wrong, you should check it. But in that case, everything false in the summaries that happens to sound true to you will be confirmed in your mind without you ever checking it.
...yes? If I go to a website called "_ News" (present company included), I expect to see either news stories aggregated by humans or news stories written and fact checked by humans. That's why newspapers have fact checking departments, but they're being replaced by something with almost none of the utility and its proponents are framing the benefits of the old system as impossible or impractical.
I think you misunderstood my comment. I wasn't challenging the concept of human editors and fact checkers. I was asking a parent for a clarification of what the parent post meant by outlining that they were LLM generated summaries.
Like, I was asking whether they were expecting the curation/summarization to be done by humans at Kagi News.
Publishing once a day to remove the "slot machine dopamine hit" is worth it for that alone. I have forever been looking for a peer/replacement to Google News; I was about to pony up for a Ground News subscription, but I'll probably hold off for a couple more months. Alternatives to Google News have been sorely lacking for over a decade, especially since Google News got its mobile-first redesign, which significantly and permanently weakened the product to meet some product manager's bonus-linked KPI. One more product to wean off the Google mothership. Gmail is gonna be real hard though.
Gmail seems like the easiest piece of the Google puzzle to replace. Different calendar systems have different quirks around repeating events, you sometimes need to try a variety of search engines to find what you're looking for, Docs aren't bug-for-bug equivalent to the Office or iCloud competitors, YouTube has audience, monetization, and hosting scale... Gmail is just "make an email account with a different provider and switch all of your accounts to use the new address." They don't even give you that much storage for free Gmail; it's 15GB, which lots of other email providers can match (especially paid ones). You can import your old emails to your new provider or just store them offline with a variety of email clients.
Is updating all of your accounts (and telling your contacts about the new address) what you consider to be the hard part, or do you actually use any Gmail-specific features? Genuinely curious, as I tend to disregard almost all mail-provider-specific features that any of my mail providers try to get me excited about (Gmail occasionally adds some new trick, but Zoho Mail is especially bad about making me roll my eyes with their new feature notifications).
I am sticking with this reprehensible company for email because their spam detection is awesome and I have found no clear measurements of detection to reasonably compare. I’d love to be proven wrong!
Switched from Gmail to Fastmail about 10 years ago.
2-3 spam emails slip through every week, and sometimes a false positive happens when I sign up for something new. I don't see this as a huge problem, and I doubt Gmail is significantly better.
Gmail significantly improved the email spam situation for everyone by aggressively pushing email security standards like DMARC/DKIM/SPF [1]. This came at the cost of basically no longer being able to self-host your own email server, though.
I agree with the other commenter, I use Fastmail and I get very few spam emails, most of which wouldn't have been detected by gmail either because they're basically legitimate looking emails advertising scams. I have a Gmail account I don't use and it seems like it receives about the same amount of spam, if not more.
I don't understand how this once-per-day thing, very obviously a cost-cutting measure, can be taken seriously as a "feature". Stories evolve throughout the day. If this is truly important to you, just screen shot Google News, then look at the screen shot all day.
I am fine with it using AI but it makes me feel pretty icky that they didn’t mention that this was ai/llm generated at any point in this article. That’s a no-no IMO, and has turned me off this pretty strongly.
They don't explicitly say they generate summaries at any point in the article. In fact, I read it and thought this was just some fancy RSS aggregator. The way they describe the "daily briefing" is extremely ambiguous.
In this situation, humans are more accurate, for now, so it's good information to have.
Same as I would like to know whether a study about how well humans drive relied on self-assessment or on empirical evidence. Humans just aren't that good at that task, so it would be good to know coming in.
Just call it Kagi Vibes instead of Kagi News as news has a higher bar (at least for me)
Kagi team member here who helped author that blog post announcement. That's a fair point, and we've updated the post to now clearly mention that we use AI for the daily briefing. Thank you for your feedback!
I'm firmly on the side of "AI" skepticism, but even I have to admit that this is a very good use of the tech. LLMs generally do a great job at summarizing text, which is essentially what this is. The sources could be statically defined in advance, given that they know where they pull the information from, so I don't think the LLM generates that content.
So if this automates the process of fetching the top news from a static list of news sites and summarizing the content in a specific structure, there's not much that can go wrong there. There's a very small chance that the LLM would hallucinate when asked to summarize a relatively short amount of text.
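In pipeline terms the whole thing could be roughly this; a sketch only, with a made-up feed list and a hypothetical llm_summarize() standing in for whatever model Kagi actually calls:

    # Sketch of the pipeline described above: a fixed feed list, headlines and
    # summaries pulled straight from the feeds, and an LLM that only condenses
    # text it was handed. FEEDS and llm_summarize() are placeholders.
    import feedparser

    FEEDS = ["https://example.com/world.rss", "https://example.org/tech.rss"]

    def daily_briefing() -> str:
        items = []
        for url in FEEDS:
            for e in feedparser.parse(url).entries[:10]:
                items.append({"title": e.title, "link": e.link,
                              "summary": getattr(e, "summary", "")})
        digest_input = "\n".join(
            f"- {i['title']} ({i['link']}): {i['summary']}" for i in items
        )
        # The source list is known before the LLM runs, so citations can be
        # attached deterministically rather than generated.
        return llm_summarize(digest_input)         # hypothetical LLM call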
It's useful for the users, but tragically bad for anyone involved with journalism. Not that they're not used to getting fucked by search engines at this point, be it via AMP, instant answers, or AI overviews.
Not that the userbase of 50k is big enough to matter right now, but still...
What journalism? Most of these sites copy their content from each other or social media, and give it their own spin. Nowadays most of them use AI anyway.
Actual journalism doesn't rely on advertising, and is subscription based. Anyone interested in that is already subscribed to those sources, but that is not the audience this service is aiming for. Some people only want to spend a few minutes a day catching up with major events, and this service can do that for them. They're not the same people who would spend hours on news sites, so these sites are not missing any traffic.
Broadly agreed, I don't consider the CBS (national) news website to be a source of hard hitting journalism; Reuters, however, is. Reuters and the AP are often the source of these news stations.
I continue to subscribe to Reuters because of the quality of journalism and reporting. I have also started using Kagi News. They are not incompatible.
All this is doing is aggregating RSS feeds and linking to the original articles.
So this might result in lower traffic for "anyone involved in journalism" – but the constant doomscrolling is worse for society. So I think we can all agree that the industry needs to veer towards less quantity and more quality.
RSS feeds are meant to be used by actual users, not regurgitated publicly. RSS readers at the very least keep the author info visible, and their users tend to show up in the website's analytics with a special user agent.
I see! One thing I'm wondering: they say they are fetching the content from the RSS feeds of news outlets rather than scraping them. I haven't used RSS in a while, but I recall that most news outlets don't include the full article in their feed, just the headline or a short summary. I'd be worried that articles with misleading headlines (which are not uncommon) might cause this tool to generate incorrect news items. Is that not a concern?
That's a fair concern, and I would prefer it if they scraped the sites instead. They could balance this out by favoring content from sites that do provide the entire article in their feeds, but that could lead to bias problems. Maybe this is why their own summaries are short. We can't know for sure unless they explain how it works.
If the parent commenter is correct, the concern I'd have would be about transparency. Even if it's good at what it does, I don't think we're anywhere close to a place as a society where it shouldn't be explicit when it's being used for something like this.
When you go to Google News, the way they group together stories is AI (pre-LLM technology). Kagi is merely taking it one step further.
I agree with your concern. I see this as a convenient grouping, and if any interests me I can skip reading the LLM summary and just click on the sources they provide (making it similar to Google News).
It cannot be "one step further", because there's a clear break in reality between what Google News provides and Kagi provides. Google News links to an article that exists in our world, 100%, no chance involved. Kagi uses an LLM generate text and thus is entirely up to chance.
> when you ask an LLM to point to 'sources' for the information it outputs,
Services listing sources, like Kagi News, Perplexity and others, don't do that. They start with known links and run LLMs on that content. They don't ask LLMs to come up with links based on the question.
That is what I mean yeah, I’m not saying it’s fabricating sources from training data, that would obviously be impossible for news articles, I’m saying if you give it a list of articles A, B and C including their content in the context and ask ‘what is the foo of bar?’ and it responds ‘the foo of bar is baz, source: article B paragraph 2’, that does not tell you whether the output is actually correct, or contained in the cited source at all, unless you manually verify it.
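Which is why a cheap post-hoc check helps: before trusting "source: article B, paragraph 2", confirm the claim at least overlaps with that paragraph. A sketch using plain word overlap; the 0.3 threshold is arbitrary, and this only catches citations pointing at unrelated text, not subtle distortions:

    # Sketch: a very crude check that a cited paragraph plausibly supports a
    # claim, via word overlap. It will miss subtle distortions; it only flags
    # citations that point at unrelated text. The threshold is an assumption.
    import re

    def _words(text: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def citation_plausible(claim: str, cited_paragraph: str,
                           threshold: float = 0.3) -> bool:
        claim_words = _words(claim)
        if not claim_words:
            return False
        overlap = len(claim_words & _words(cited_paragraph)) / len(claim_words)
        return overlap >= threshold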
This seems like the opposite of "privacy by design"
> Privacy by design: Your reading habits belong to you. We don’t track, profile, or monetize your attention. You remain the customer and not the product.
How would the LLM provider get any information about your reading habits from the app? The LLM is used _before_ the news content is served to you, the reader.
It's also a way around copyright: news sites would be (rightfully) pissed if you publicly posted their articles in full, and would argue that you're stealing their viewership. But if you're essentially doing an automatic mash-up of five stories on the same topic from different sources, all of a sudden you're not doing anything wrong!
As an example from one of their sources, you can only re-publish a certain amount of words from an article in The Guardian (100 commercially, 500 non-commercially) without paying them.
Yes, that is fine! That's how RSS feeds usually work when you follow more "mainstream" news sources. At the very least, you see the name of the author and you actually make a connection to their server that can be measured in the analytics.
But instead, Kagi "helpfully" regurgitates the whole story, visits the article once, delivers it to presumably thousands, and it can't even be bothered to display all of the sources it regurgitates unless you click to expand the dropdown. And even then the headline itself is one additional click away, and they straight up don't even display the name of the journalist in the pop-up, just the headline.
Incredibly shitty behaviour from them. And then they have the balls to start their about page with this:
And yet, after trying it, I have to admit it's more informative and less provocative than any other news source I've seen since at least 2005.
I don't know how they do it, and I'm not sure I care, the result is they've eliminated both clickbait and ragebait, and the news are indeed better off for it!
Soulless, uncreative, not fact-checked (or read by anyone before clicking publish), not contributing anything back to the original journalists, and all of the editorial decisions are made by a non-deterministic AI filter.
Not gonna call it the worst insult to journalism I've ever seen because I've seen factually(.)so which does essentially the same thing but calls it an "AI fact check", but it's not much better.
It's like instead of borrowing a book from the library, there's like a spokesperson at the entrance who you ask a question and then blindly believe whatever they say.
This is exactly how I want my news to be. Nothing worse than a headline about a new vaccine breakthrough, followed by a first paragraph that starts with "it was a cold November morning as I arrived in..."
I guess it's a matter of taste, but I prefer it short and to the point
Yes, they are not the only player here. Quite a few companies are doing this, if you use Perplexity, they also have a news tab with the exact feature set.
> if you use Perplexity, they also have a news tab with the exact feature set
"Exact" is far from accurate. I just did a side-by-side comparison. To name only two obvious differences:
A. At the top level, Perplexity has a "Discover" tab [1] -- not titled "News". That leads to an AAF page with the endless-scroll anti-pattern (see [2] [3] for other examples). Kagi News [4] presents a short list of ~7ish items without images.
B. At the detail-page level, Kagi organizes their content differently (with more detail, including "sources", "highlights", "perspectives", "historical background", and "quick questions"). Perplexity only has content with sources and "discover more". You can verify for yourself.
After using Perplexity, I find its news tab is US-centric, without many options to get regional content from what I can see.
Kagi seems to offer regional news, and the sources appear to be from the respective area too. I do appreciate the public access (for now?) via RSS feeds (ironic but handy).
Thanks for pointing out that this is yet more AI slop. Very disappointing for Kagi to do this. I get my money's worth from searches, but if I was looking for more features I would want them to be not AI-based.
I guess they embed the news of the day and let a model summarize it. You can add metadata to that set, which you should technically be able to query reliably. You don't have to let the model do the summarization of the source, which can be erroneous.
Far more interesting is how they aggregate the data. I thought many sources moved behind paywalls already.
> Kagi is probably the only pro-LLM company praised on HN.
Kagi made search useful again, and their genAI stuff can be easily ignored. Best of both worlds -- it remains useful for people like myself who don't want genAI involved, but there's genAI stuff for people who like that sort of thing.
That said, if their genAI stuff gets to be too hard to ignore, then I'd stop using or praising Kagi.
That this is about news also makes it less problematic for me. I just won't see it at all, since I don't go to Kagi for news in the first place.
I'm not against AI summaries if they are marked as so. Sneakily sliding LLM under the table is a dark pattern no matter how I interpret their intentions.
Even Google calls the overview box AI Overview (not saying it doesn't hurt content hosting sites.)
Disappointing. Non-LLM NLP summarization is actually rather good these days. It works by finding the key sentences in the text and extracting the relevant sections, so there's no possibility of hallucination. No need to go full AI for this feature.
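For reference, the extractive approach is a few lines with an off-the-shelf library. A sketch assuming sumy and its TextRank summarizer (NLTK's punkt data is needed for the tokenizer):

    # Sketch of non-LLM extractive summarization: pick the most central
    # sentences from the article itself, so nothing can be invented (only
    # omitted). Assumes `pip install sumy` plus NLTK punkt data.
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.summarizers.text_rank import TextRankSummarizer

    def extractive_summary(article_text: str, sentences: int = 3) -> str:
        parser = PlaintextParser.from_string(article_text, Tokenizer("english"))
        summarizer = TextRankSummarizer()
        return " ".join(str(s) for s in summarizer(parser.document, sentences))

Every output sentence is a verbatim sentence from the input, which is exactly the "no hallucination" property described above.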
I believe LLM output is fine for giving an overview if it is provided the articles; if you want a detailed overview you should be reading the articles anyway.