Hacker News

Isn't the whole idea of Lisp that there is _no_ syntactic complexity? Lisp programs are roughly a serialized AST.
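To make the "serialized AST" point concrete, here is a minimal sketch of an S-expression reader in Python. It is an illustration only, not how any particular Lisp implements its reader: the function names (`tokenize`, `parse`, `read`) are made up for this example, and strings, comments, and quoting are ignored.

```python
def tokenize(src):
    # Pad parentheses with spaces so str.split() separates them from atoms.
    return src.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    # Recursive descent: consumes tokens from the front of the list.
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens and tokens[0] != ")":
            node.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing closing parenthesis")
        tokens.pop(0)  # drop the matching ")"
        return node
    if tok == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return tok  # atoms are kept as plain strings in this sketch

def read(src):
    return parse(tokenize(src))

tree = read("(defun square (x) (* x x))")
print(tree)  # ['defun', 'square', ['x'], ['*', 'x', 'x']]
```

The whole "parser" is about twenty lines because the surface text already spells out the tree; nothing in it is specific to `defun` or any other form.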


LLMs use tokens, with 1D positions and rich, complex, fuzzy meanings, as their native "syntax", so for them Lisp is alien and hard to process.

That's like reading binary for humans. 1s and 0s may be the simplest possible representation of information, but not the one your wet neural network recognizes.


Over two years ago, using GPT-4, I experimented with code generation in a relatively unknown dialect of Lisp for which there are few online materials or discussions. The results were still good. The LLM occasionally mixed that dialect up with Scheme and Common Lisp, but corrected itself when instructed clearly. When given a verbal description of a macro that is available in the dialect, it was able to refactor the code to take advantage of it.


Agreed. Gleam has very few, very general syntactic constructs compared to most procedural languages. There's enough signal in the data for LLMs to answer queries about the language, but when writing it they universally trip over themselves. The signal from other nearby languages is too strong, and the model ends up trying to use early returns, if statements, even loops on occasion.


[deleted]


I'm not disputing that LLMs are bad for Lisp code, I'm just saying I don't think "syntactic complexity" is a correct explanation for why that is.


Yes, the concept of "syntactic complexity" as applied to LLMs can be very different from what we think, and I think it depends on the tokenizer. Perhaps LLMs could be fine-tuned using a grammar for programming languages, with special tokens for that grammar, in order to reduce syntactic complexity. For example, in Lisp a left or right parenthesis could be tokenized in a special way (as a left-lisp-parenthesis or right-lisp-parenthesis token), so that the LLM could learn faster and make fewer syntax errors.
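A rough sketch of what that pre-tokenization step might look like, in Python. Everything here is hypothetical: the token names `<lisp-open>` and `<lisp-close>` and the function `pretokenize` are invented for illustration, and a real setup would also have to add these tokens to the model's vocabulary and handle string literals and comments, which this ignores.

```python
import re

LPAREN = "<lisp-open>"    # hypothetical special token for "("
RPAREN = "<lisp-close>"   # hypothetical special token for ")"

def pretokenize(src):
    # Split into single parentheses and runs of non-space, non-paren text.
    pieces = re.findall(r"[()]|[^\s()]+", src)
    # Swap each parenthesis for its dedicated token; leave atoms untouched.
    return [LPAREN if p == "(" else RPAREN if p == ")" else p for p in pieces]

tokens = pretokenize("(let ((x 1)) (+ x 1))")
print(tokens.count(LPAREN), tokens.count(RPAREN))  # 4 4
```

The point of the dedicated tokens is that a parenthesis would never be merged with neighbouring characters into a larger token, so open and close counts stay directly visible in the token stream.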


I usually use DeepSeek (the free version) for code, and when it writes defun and let forms the output is usually missing one or more closing parentheses. So either this LLM does not handle the way the end of a form is marked very well, or the AST is simply deeper than it usually is in Python.
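For what it's worth, both things are easy to measure on generated output. A minimal Python sketch, assuming the code contains no parentheses inside strings, comments, or character literals (the function name `paren_report` and the sample snippet are made up for this example):

```python
def paren_report(src):
    # Returns (number of unclosed "(", deepest nesting level reached).
    depth = 0
    max_depth = 0
    for ch in src:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("closing parenthesis without a matching opener")
    return depth, max_depth

# Hypothetical model output with one closing parenthesis missing.
generated = "(defun add1 (x) (let ((y 1)) (+ x y))"
missing, deepest = paren_report(generated)
print(missing, deepest)  # 1 4

# Naive repair: append the missing closers at the end.
repaired = generated + ")" * missing
```

Appending closers at the end only restores balance; it does not guarantee the forms nest the way the model intended, so the result still needs to be read or run.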


The idea in Lisp is that there is low complexity in encoding abstract syntax into surface syntax.

There can be considerable complexity in Lisp abstract syntax.


Fair enough.



