Foundations of Large Language Models (arxiv.org)
219 points by pkoird 10 months ago | hide | past | favorite | 20 comments


Me to ChatGPT:

Assume you are a college instructor for a Freshman Computer Science course.

Your job is to take a PDF file from the internet and teach the topics to your students.

You will do this by writing paragraphs or bullet points about any and all key concepts in the PDF necessary to cover the topic in 2 hours of lectures.

The pdf file is at https://arxiv.org/pdf/2501.09223

Build the lecture for me.


It can't read PDFs. If you ask it to, it generates code to read the first X characters of the PDF and does a bad job.
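To see why that approach goes wrong, here is a rough sketch (hypothetical, not ChatGPT's actual output) of the kind of code it tends to generate: read the first N raw bytes of the file and decode them as text. A PDF's body text lives inside (usually compressed) object streams, so this mostly yields header syntax and binary noise rather than the document's contents.

```python
# Hypothetical sketch of the naive "read the first X characters" approach.
# The fake PDF below is an assumption for illustration: just a header plus
# a stream whose payload would normally be zlib-compressed.
import os
import tempfile

fake_pdf = (
    b"%PDF-1.7\n"
    b"1 0 obj\n<< /Length 12 >>\nstream\n"
    b"\x78\x9c\x0b\xc9\xc8\x2c\x56\x00\x00"  # compressed bytes, unreadable as text
    b"\nendstream\nendobj\n"
)

with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
    f.write(fake_pdf)
    path = f.name

# The naive approach: grab the first 200 bytes and force-decode them.
with open(path, "rb") as f:
    first_chars = f.read(200).decode("latin-1", errors="replace")

print(first_chars[:8])  # prints "%PDF-1.7" — file syntax, not document text

os.unlink(path)
```

Real text extraction has to locate the content streams, decompress them, and interpret the PDF text operators, which is why a bytes-level read "does a bad job."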

(Claude is much better at it.)


Yes it can, both via web searches and uploaded files (at least I'm doing it daily).

EDIT: This article says it's only in ChatGPT Enterprise, but it works for me on the free plan: https://help.openai.com/en/articles/10416312-visual-retrieva...


That article is about visuals embedded in PDFs. As a free user you wouldn't be able to ask ChatGPT to analyze a graph inside a PDF, only the text.


The authors are from Northeastern University in Shenyang, China, not the Northeastern University in Boston. I don't understand why two Chinese professors would write an LLM book in English; it's definitely not from experience, probably under pressure to publish.


Prob. not profs; just PhD students who need pubs to graduate.


These things can be on arXiv??


I assumed arXiv was peer-reviewed content only, but it looks like that is not the case.

Submission guidelines: https://info.arxiv.org/help/submit/index.html

Moderation process: https://info.arxiv.org/help/moderation/index.html


On the contrary, arXiv is for preprints, i.e. not (yet) peer-reviewed. Off the top of my head, it was initially used by physicists, who often have huge collaborations and long review times. Then the ML community invaded the space later on. This does not mean a peer-reviewed paper cannot go there, of course.


Most of the time the peer-reviewed version has a copyright restriction, so the arXiv version is the final draft, which may have small differences.


As an academic, I always thought of arXiv as the place where you put your papers first, before they are peer reviewed. Before that we used our webpages, but they kept breaking.



Didn't know I could find it on arXiv; will definitely give it a read.


At 231 pages, this is definitely book territory.


Thankfully the submission is self-aware: the first sentence of the article is literally:

> This is a book about large language models.


The book, too, is self-aware, though you do have to make it to page ii.

> In writing this book, we have gradually realized that it is more like a compilation of "notes" we have taken while learning about large language models. Through this note-taking writing style, we hope to offer readers a flexible learning path. Whether they wish to dive deep into a specific area or gain a comprehensive understanding of large language models, they will find the knowledge and insights they need within these "notes".


Now I wonder if the LLMs described in it are self-aware too, and whether by the time I reach the end of this book, I will become self-aware as well.


Is it just me, or does this book look more like a Word doc than a LaTeX one?


A- Who cares?

B- The LaTeX source of the book is available on the arXiv page.


It's just you



