Hacker News | kansai's comments

For what it's worth, TFA does mention that the site is:

> "hidden in plain site", the archaeologists say, as it is just 15 minutes hike from a major road


Describing the K of K&R as "Awk creator" is like describing Einstein as a "refrigerator engineer".


It's an awk interview. "Awk creator" is relevant.


Right, the headline gave the appropriate context for the interview in a short sentence. Not everyone knows that Kernighan created awk, rather than being a respected person who happens to have opinions on awk.


AWK's name is actually an initialism of the last names of its three creators: Aho, Weinberger, and Kernighan.


Aho, one of the authors of the dragon books. When I studied CS 40 years ago it was the standard text for compiler construction, a mandatory course.

No idea whether it is still used today. Well, no idea whether there is anything fundamentally new in compilers that such an old book would not cover.


Indeed, still in use. Also presumably the Aho of the Aho-Corasick string matching algorithm.


There are plenty of advanced compiler techniques, but you can't understand them until you understand the fundamentals. There will always be a place for the purple dragon book.


Ah, it's purple now. It was green when I studied, became red before I graduated. When I search for it now the thumbnails look unfamiliar.


He didn't say it was irrelevant.


I agree it was an incredibly awkward interview


Why do you say that? Developing Awk was no less of an accomplishment than C.


This seems like a rather large claim that would benefit from some kind of supporting argument.


c-descended languages include c++, java, and c#

awk-descended languages include perl, tcl, js, python and lua

it's a close race but if i had to pick i'd say the second group is more influential


Most implementations of that second group of languages are written in C, and run on operating systems written in C. It's really no contest that C has had far more of an influence on computing. So much software would not exist, or not in the forms we recognize, without it.


or at least in c++, c#, or java, sure, smalltalk being the main exception

but the layers are somewhat orthogonal. at the verilog level you can't really tell if your hardware is built out of luts, cmos standard cells, hand-laid-out nmos, ttl, or vacuum tubes; at the operating system level you can't really tell if your cpu was designed in vhdl, verilog, chisel, or hand-drawn schematics; at the interpreter level you can't really tell if your operating system was written in c, pl/x, assembly, or lisp; and at the python level you can't really tell if you're running on cpython (which is written in c) or pypy (which is written in python)

i mean there are small clues that make it hard to completely hide, but they're subtle

so, to my way of thinking, saying 'c has had far more of an influence on computing than awk', on the basis that awk was written in c, is similar to saying 'transistors have had far more of an influence on computing than c'. it's sort of true, but it's also sort of false


Python influenced by AWK? Never heard that before, and Wikipedia does not list awk among python's numerous influences, either. https://en.wikipedia.org/wiki/Python_(programming_language)



this says 'abc, shell, awk', but what is guido responding to?


Not sure, but from the context of the conversation, it seems he complained that people keep saying lisp was an inspiration for Python, when in fact he was not even thinking about it at the time Python was created. Then it appears someone asked him what he actually based it on, and he answered abc, shell, awk. But given the lost message in the thread, perhaps you can ask him for clarification, to set the record straight.


i see, thanks!

yeah, i think it's well-known that lambda, map, filter, and reduce (the lispy bits of python) weren't in the earliest versions of python; they were contributed later on by a lisp fan

at a larger remove, of course, lisp was an inspiration for abc (via lcf and setl), shell (via algol), and awk (in a smaller way)


I may be wrong, but I think apply() was there in an early version of python, maybe 1.5.


1.5 was several years later


but it was also several years ago, which is when I first started playing around with python, before starting to do production work with it a few years later.

so in one sense, i.e. compared to the present, 1.5 is an early version.

https://en.m.wikipedia.org/wiki/History_of_Python


all true


kernighan said in the interview

> The main idea in Awk was associative arrays, which were newish at the time, but which now show up in most languages either as library functions (hashmaps in Java or C++) or directly in the language (dictionaries in Perl and Python). Associative arrays are a very powerful construct, and can be used to simulate lots of other data structures.

awk was released in 01979, and the authors published this paper in sp&e that year: https://plan9.io/sources/contrib/steve/other-docs/awk.pdf but you see this report version is dated september 01978. but i don't think the report was widely circulated until the next year, when it was included in 7th edition unix as /usr/doc/awk (sudo apt install groff; groff -ms -Tutf8 v7/usr/doc/awk | less -r). it explains:

> Array elements may be named by non-numeric values, which gives awk a capability rather like the associative memory of Snobol tables. (...) There is an alternate form of the for statement which is suited for accessing the elements of an associative array:

this pdf has evidently been retypeset from the troff sources from the open-source 7th edition release, but without the correct bibliographic database, so the references are missing. a comment in the troff source says:

> ....It supersedes TM-77-1271-5, dated September 8, 1977.

but possibly that reference is inaccurate

python goes beyond merely having dicts 'directly in the language'. python's primary data structure is the dict; among other things, it uses dicts for modules, (most) class instances, associating methods with classes, the locals() user interface to the local-variable namespace, and passing keyword arguments to functions and methods. that is, it uses associative arrays to simulate lots of other data structures, as you are obliged to do in awk, lua, and js. so where did python get dicts?
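a minimal sketch of that claim, assuming cpython-like behavior; the module name `demo`, the class `Point`, and the helper functions are made up for illustration:

```python
import types

# a module's namespace is a dict
mod = types.ModuleType("demo")      # hypothetical module, created just for this sketch
mod.answer = 42
module_namespace = vars(mod)

class Point:                        # hypothetical class
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)

p = Point(3, -4)
instance_attrs = vars(p)            # the instance's attributes live in a dict
method_table = vars(Point)          # read-only view of the class's dict of methods

def describe(**kwargs):
    # keyword arguments arrive as an ordinary dict
    return kwargs

def local_names():
    a, b = 1, 2
    return locals()                 # local variables exposed as a dict
```

classes that define `__slots__` are the usual exception to the instance case, which is why the "(most)" above matters.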

python got dicts (and tuples) from abc, a teaching language which wikipedia claims was started in 01987, 8 years after awk's release, and added conventional arrays (lists) back in. the five data types in abc https://homepages.cwi.nl/~steven/abc/language.html are numbers, strings, compounds (called tuples in ml), lists (really multisets because they're implicitly sorted), and tables (dictionaries), which are awk's 'associative arrays'—and, as in awk, js, lua, and tcl, they're used to provide the functionality of conventional arrays as well

however, lambert meertens credits the use of tables in abc to jack schwartz's setl https://inference-review.com/article/the-origins-of-python rather than to awk. he says of the addition of tables to b (the early name for abc, not to be confused with the b that was an earlier version of c)

> Having coded a few algorithms in SETL, I had experienced its power firsthand—a power that stemmed entirely from its high-level inbuilt data types. Particularly powerful were sets and maps, also known as “associative arrays,” containing data that can be indexed not only by consecutive integers but by arbitrary values. A programmer could introduce a simple database of quotations named whosaid, in which the value ”Descartes” could be stored in the location whosaid[”cogito ergo sum”]. These high-level types made it possible to express algorithms that required many steps in B1 using just a few steps in SETL. In a clear violation of the Fair-Expectation Rule, B1 allowed only integers as array indices. This design decision had been driven by fear: we had been concerned that aiming too high would make our language unimplementable on the small personal computers that were starting to appear on the market. But Dewar, in particular, convinced me that this meant we were designing for the past, not the future. This led us to redesign the system of data types for our beginners’ language. This time we used only the criteria of ease of learning and ease of use to select candidate systems. The winner turned out to be remarkably similar to the data type system of SETL. The set of possible data type systems to choose from was very large, and to make the process more manageable I had written a program to select the competitive (Pareto-optimal) candidate systems. Interestingly, but quite incidentally, that selection program itself was written in SETL. The winning type system became that of B2, and remained unchanged in the final iteration, released in 1985 under the name “ABC.”

'associative arrays', of course, is the term used by awk

this story of adding associative arrays to abc only for b2 is somewhat complicated by the fact that the version of b (b1?) in meertens's 01981 'draft proposal for the b programming language' https://ir.cwi.nl/pub/16732 already includes tables, two years after the release of awk as part of 7th edition; p. 6 (9/91) says,

> Tables are somewhat like dictionaries. A short English-Dutch dictionary (not sufficient to maintain a conversation) might be (...) Table entries, like entries in a dictionary, consist of two parts. The first part is called the key, and the second part is called the associate. All keys must be the same type of value, and similarly for associates. A table may be written thus: {['I']: 1; ['V']: 5; ['X']: 10}.

> If this table has been put in a target roman, then roman['X'] = 10.

note that this is also awk's syntax for indexing an associative array, though it doesn't have a syntax for writing one down.
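for comparison, a small python sketch of the same table (the variable names are mine): python has both a literal form, like the b draft, and awk-style construction by indexed assignment. in awk itself no empty-array declaration is needed, so the `{}` line is a python-only detail.

```python
# dict display: the whole table written down in one expression
roman = {'I': 1, 'V': 5, 'X': 10}

# awk-style: fill the table one element at a time by indexed assignment
roman_by_assignment = {}
roman_by_assignment['I'] = 1
roman_by_assignment['V'] = 5
roman_by_assignment['X'] = 10
```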

(to be continued; hn says, 'that comment included too many facts')


> The set of possible data type systems to choose from was very large, and to make the process more manageable I had written a program to select the competitive (Pareto-optimal) candidate systems. Interestingly, but quite incidentally, that selection program itself was written in SETL.

Wow, now that I'd like to see.

Awesome and informative comments, cheers!


I can't even make a guess as to how such a program could be written.

anyone has an idea?


it sounds a little bogus to me, but i can outline some ways to approach it

if you have a set of features, generating its powerset is fairly simple

    >>> powerset = lambda xs: [x0 + others for others in powerset(xs[1:]) for x0 in [[], [xs[0]]]] if xs else [[]]
    >>> powerset(['int', 'float', 'set', 'array'])
    [[], ['int'], ['float'], ['int', 'float'], ['set'], ['int', 'set'], ['float', 'set'], ['int', 'float', 'set'], ['array'], ['int', 'array'], ['float', 'array'], ['int', 'float', 'array'], ['set', 'array'], ['int', 'set', 'array'], ['float', 'set', 'array'], ['int', 'float', 'set', 'array']]

then you just need some way to calculate various merits of the different designs. one merit is simplicity, which you could maybe try to operationalize as something like this:

    >>> [(d, {'simplicity': 10 - len(d)}) for d in powerset(['int', 'set', 'array'])]
    [([], {'simplicity': 10}), (['int'], {'simplicity': 9}), (['set'], {'simplicity': 9}), (['int', 'set'], {'simplicity': 8}), (['array'], {'simplicity': 9}), (['int', 'array'], {'simplicity': 8}), (['set', 'array'], {'simplicity': 8}), (['int', 'set', 'array'], {'simplicity': 7})]

comparing two merit-maps such as {'simplicity': 8} and {'simplicity': 6} for pareto optimality is simple enough with some thought, especially if we assume the keys are the same:

    >>> some_way_better = lambda a, b: any(a[k] > b[k] for k in a)
    >>> defeats = lambda a, b: some_way_better(a, b) and not some_way_better(b, a)
    >>> defeats({'simplicity': 9}, {'simplicity': 8})
    True
    >>> defeats({'simplicity': 8}, {'simplicity': 9})
    False
    >>> defeats({'simplicity': 9, 'turing-complete': 0}, {'simplicity': 8, 'turing-complete': 1})
    False
    >>> defeats({'simplicity': 9, 'turing-complete': 1}, {'simplicity': 8, 'turing-complete': 1})
    True

then it's easy enough to calculate which of a set of candidate designs can be eliminated because some other design defeats them
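a sketch of that elimination step, reusing the same defeats test written out as ordinary functions. the 'expressiveness' merit and all the scores are invented for illustration; nothing here reflects how meertens's setl program actually worked.

```python
def some_way_better(a, b):
    # a beats b on at least one merit (assumes both maps have the same keys)
    return any(a[k] > b[k] for k in a)

def defeats(a, b):
    # a pareto-dominates b: better somewhere, worse nowhere
    return some_way_better(a, b) and not some_way_better(b, a)

def pareto_front(candidates):
    """keep the (design, merits) pairs that no other candidate defeats."""
    return [(d, m) for d, m in candidates
            if not any(defeats(m2, m) for _, m2 in candidates)]

# hypothetical designs with made-up merit scores
candidates = [
    (['int'], {'simplicity': 9, 'expressiveness': 1}),
    (['int', 'set'], {'simplicity': 8, 'expressiveness': 3}),
    (['int', 'array'], {'simplicity': 8, 'expressiveness': 2}),
    (['int', 'set', 'array'], {'simplicity': 7, 'expressiveness': 3}),
]
survivors = [d for d, _ in pareto_front(candidates)]
```

with these scores, `['int', 'array']` and `['int', 'set', 'array']` are both defeated by `['int', 'set']`, so only `['int']` and `['int', 'set']` survive. this is quadratic in the number of candidates, which is fine for a powerset of a handful of features.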

the bogus part is how you automatically calculate the various merits of a hypothetical collection of language features


Yeah, what was the representation of these "data type systems" and what were the metrics I wonder.


i'm glad they were helpful!


a more recent set of slides on the relation between abc and python is https://www.cwi.nl/documents/195216/Meertens-20191121.pdf which describes again how abc was started in 01975. this helpfully clarifies the timeline: b0 was 01975; b1 was 01978; b2 was 01979; and b∞ = abc was 01985. so specifically the point at which setl inspired the replacement of conventional arrays in b1 with associative arrays in b2 was 01979, which was the year 7th edition unix was released and the aho, weinberger, and kernighan paper was published in sp&e

a question of some interest to me here is what platform they were developing abc on in 01979. clearly it couldn't have been the ibm pc, which wouldn't come out until 01981 (and as far as i know abc on the ibm pc only runs under cygwin or 32-bit microsoft windows), or macos (which came out in 01984) or atari tos, which wouldn't come out until 01985. and so far i haven't seen any mention in the history of abc of other operating systems of the time like cp/m, vm/cms, dg rdos, tenex, or tops-20. the most likely platform would seem to have been unix, on which awk was one of the relatively few programming languages available. perhaps at some point i'll run across an answer to that question in the abc papers

python adopted awk's syntax for putting 10 into roman['x'], which was `put 10 in roman['x']` in abc, but `roman['x'] = 10` in awk and python. abc's syntax is uppercase, presumably case-insensitive, separates words with apostrophes, and departs widely from conventional infix syntax. python's syntax is case-sensitive, mostly lowercase, and conventionally infix, features that have become common through the influence of unix. python's control structures are for, while, and if/elif/else, as in algol and in abc, and indentation-sensitive as in abc, but python uses a conventional ascii syntax rather than abc's scratch-like syntax-directed editor

abc was statically typed with a hindley-milner type system ('the type system is similar to that of lcf', p. 15 (18/91) of the draft proposal), while python is dynamically typed, like smalltalk, lisp, and awk

if meertens got his daring notion of storing everything in associative arrays from awk, he certainly doesn't mention it. instead he mentions setl a lot! the draft proposal doesn't cite awk but it also doesn't cite setl; it cites the algol-68 report, milner's lcf typing paper, a cleaveland and uzgalis paper about grammars, gehani, and three of his own papers, from 01976, 01978, and 01981. unfortunately i can't find any of those earlier meertens papers online

the wikipedia page about setl says

> SETL provides two basic aggregate data types: (unordered) sets, and tuples.[1][2][5] The elements of sets and tuples can be of any arbitrary type, including sets and tuples themselves, except the undefined value om[1] (sometimes capitalized: OM).[6] Maps are provided as sets of pairs (i.e., tuples of length 2) and can have arbitrary domain and range types.[1][5]

but it's citing papers about setl from 01985 there, well after awk had supposedly popularized the notion of associative arrays

however, in meertens's essay on python's history, he cites a 01975 paper on setl! https://www.softwarepreservation.org/projects/SETL/setl/doc/...

> Jacob T. Schwartz. ON PROGRAMMING: An Interim Report on the SETL Project. Part I: Generalities; Part II: The SETL Language and Examples of Its Use. Computer Science Department, Courant Institute of Mathematical Sciences, New York University, revised June 1975.

this discusses how setl represented data in memory starting on p. 57 (57/689). it used hash tables to represent sets, including sets of tuples, rather than the ill-advised balanced-tree approach used by abc. (python, like awk and setl, uses hash tables.) on pp. 62–63 (62–63/689) it explains:

> The hash code of a tuple is taken to be the hash code of its first component, for reasons that will become clear in the next section. The hash code of a set is the exclusive or of the hash codes of all its members. (...)

> — Tuples in Sets —

> Though expressible in terms of the membership test, "with", and "less" operations, functional evaluation plays so important a role in SETL algorithms that we treat it as a primitive.

> SETL makes three types of set-related functional evaluation operators available:

> - f(x)

> - f{x}

> - f[s]

> The most fundamental of these is f{x}, which invokes a search over f for all n-tuples that begin with x (n ≥ 2), and which yields as result the set of all tails of these n-tuples. More precisely, in SETL:

> f{x} = {if #y eq 2 then y(2) else tℓ y, y ∈ f | type y eq tupl and #y ge 2 and hd y eq x}

> The operation f(x) has a similar definition but includes a single valuedness check:

> f(x) = if #f{x} eq 1 then ∋f{x} else Ω

> The operation f[s] is adequately defined in terms of f{x}:

> f[s] = [+: x ∈ s] f{x}

i am fairly confident that the f{x} definition translates into current vernacular python as the set-comprehension {y[1] if len(y) == 2 else y[1:] for y in f if type(y) == tuple and len(y) >= 2 and y[0] == x}. (setl tuples are indexed from 1, so setl's y(2) is python's y[1].)
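here is a sketch of all three operators as python functions over a set of tuples. the function names, the use of `None` for Ω, and the sample data are my own assumptions, not anything from the setl report; note the 0-based `y[1]` standing in for setl's 1-based y(2).

```python
OM = None  # stand-in for setl's undefined value Ω (an assumption of this sketch)

def image_set(f, x):
    """f{x}: the tails of all tuples in f that begin with x."""
    return {y[1] if len(y) == 2 else y[1:]
            for y in f
            if type(y) == tuple and len(y) >= 2 and y[0] == x}

def single_value(f, x):
    """f(x): the one value associated with x, or OM unless there is exactly one."""
    tails = image_set(f, x)
    return next(iter(tails)) if len(tails) == 1 else OM

def image_of_set(f, s):
    """f[s]: the union of f{x} over every x in s."""
    return set().union(*(image_set(f, x) for x in s))

# hypothetical data: a map represented as a set of tuples
whosaid = {
    ('cogito ergo sum', 'Descartes'),
    ('eureka', 'Archimedes'),
    ('eureka', 'anonymous'),     # second value for the same key
    ('a', 'b', 'c'),             # a longer tuple: its tail is ('b', 'c')
}
```

so `single_value(whosaid, 'cogito ergo sum')` gives `'Descartes'`, while `single_value(whosaid, 'eureka')` gives `OM` because f{x} has two members there. this scans the whole set on every lookup; the hash-table representation the report describes is there precisely to avoid that.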

so, it becomes clear that already in 01975 setl treated sets of tuples as maps, which is to say associative arrays, but it didn't use the 'associative array' terminology used by meertens in 01981, or for that matter 'maps'. to look up an element in the map, it didn't use the f[x] notation used by python, awk, and abc; instead it used f(x). further explanation on pp. 64–65 (64–65/689) clarifies that really it is more accurate to think of 'sets of tuples' as trees; each item in the tuple entails following an additional level of pointers to a further hash table

(in a number of other notational details, python and presumably abc follow setl: start: or start:end for array index ranges, + for string concatenation, * for string repetition, boolean operators spelled out as and, or, and not. but indexing maps is far from the only difference)

abc (including b as described in the 01981 report) also seems to lack the f{x} operation and its possibility of associating an arbitrary-sized set of values with each key. this is a nontrivial semantic divergence

so if abc got its idea of tables from setl, but used awk's terminology, notation, and semantics for them (and its own ill-conceived balanced-tree implementation, used by neither), and decided to adopt the table idea in the year when awk was released, probably on the platform that awk was released on, i think it's reasonable to assign some share of the credit for abc's tables to awk? even if not all of it

but if that's so, then why didn't meertens credit aho, weinberger, and kernighan? i don't know. maybe awk's loosey-goosey nature was repugnant to him. maybe weinberger is jewish and meertens is secretly anti-semitic. maybe meertens thought that awk's loosey-goosey nature would be repugnant to the dijkstra-honoring dutch computer science establishment. maybe aho insulted meertens's favorite beer one time when he visited the netherlands. or maybe he thought it would be unfair for aho, weinberger, and kernighan to steal the thunder of schwartz, who did after all precede them in developing a general-purpose language whose memory was based entirely on hash tables. from a certain point of view that would be like crediting carl sagan with the theory of relativity because he explained it on nova


I read this comment in the spirit of "you might not have known that K did much more than write awk, but he's a genius." Which I didn't know, so I appreciate the context.


He’s the closest man to C still alive. Yet it’s not really “his” accomplishment. C was created by Ritchie alone, Kernighan only wrote the book.


I would say Ken Thompson and Steve Johnson are closer.


You can register a legal alias (通称名) that you can use on forms. I know people who have done this with just a single kanji to avoid a lot of the headache associated with having a foreign name or (god forbid) a middle name. I've considered a few times registering my legal alias as 一一 to have a two-stroke full name.


Same in Japan. My name is truncated on nearly every single form I receive from utility companies. Online forms will regularly limit your full name to 9 kana in total.


Worth noting that if you have a US-issued visa card, you cannot use it to pay for IC via Apple pay.


Same with EU-issued visa cards! As I learned when visiting in July.

You can charge up the iphone suica balance using cash in the machines... a bit weird but that's what I did.

After getting back home, the first thing I did was get a mastercard to complement my visa, as it seems EU mastercards still work for charging Suica.

It would be interesting to read about why this happened, it feels like there are interesting payment processing details to learn from it :)


As a test I just successfully charged my Suica in my iPhone (card was issued in 2019) with my EU-issued Visa (issued 2021) through Apple Pay. Is there maybe some other condition that influences whether it works or not?

Edit: this article explains it: https://atadistance.net/2023/03/15/troubleshooting-apple-pay... My Visa is actually listed as one of those which still work ("Some VISA debit cards work for adding money to Suica (DKB, Hyundai Zero, Revolut works depending on the country of the account, no other issuers confirmed).")


Really? Interesting! I tried with at least two different EU (Swedish) visa cards back in July and none of the cards I had worked.

Not in Japan anymore otherwise I would have tried again.

This article, linked to from a sibling comment here, seems to indicate it shouldn't work for you :) https://atadistance.net/2023/07/15/foreign-visa-cards-blocke...

edit: ha. you read the same articles better than I did.


Yes as per the article I'm luckily an outlier. :) But this is a very unfortunate situation especially in combination with the stopped sale of physical cards. I still have an old PASMO which I used for a commuter pass in 2016-2017 so I may be able to use this card when I visit Japan again next year. Well, unless these cards expire. Not sure about that.

Is it known whether this foreign-VISA situation is supposed to resolve and get fixed eventually? The article didn't mention whether the current situation is on purpose or an error. I fear that if this is on purpose it might actually get worse and other foreign cards eventually stop working as well.

>Not in Japan anymore otherwise I would have tried again.

I'm neither. Does this change anything? I could charge from home directly in ¥. But the card is from DKB so expected to work.


Here's an English blog that focuses a lot on Suica and Japanese contactless payments. It goes a bit more in depth on both the Visa issue and the FeliCa/NFC-F shortage: https://atadistance.net/2023/08/01/how-long-will-the-suica-c...


You can however buy a physical card (at a few of the major stations) then transfer it to your phone. Once it's there you still can't reload it directly, but you can use a recharge station (with cash) - just put your phone down where the card would go.


Sadly due to semiconductor shortage, physical Suica cards are very hard to come by atm: https://www.timeout.com/tokyo/news/sale-of-pasmo-and-suica-c...

FWIW I've been using Suica on a Japanese Garmin smart watch and refilling it with Google Pay on an international card without too many problems (and before that with Apple Pay with the same card on iPhone/Apple Watch), although there are probably some gotchas (Apple Pay at POS seems to be pretty hit and miss in Tokyo and I've never figured out why).


Oh wow, that sucks. Glad I was able to get one when I was there in March. Tourists can still use https://www.jreast.co.jp/multi/en/welcomesuica/welcomesuica.... though it seemed like a worse deal than a regular card when I was there.


Outside of Tokyo you can still easily get one of the other regional IC cards [1]. They can now all be used everywhere in Japan including the Tokyo subway, provided you don't start travel in one region and end it in another.

[1] https://www.japan-guide.com/e/e2359_003.html


I have an EU Garmin - is there a way to get Suica?!


The Japanese IC systems all use FeliCa, which not all devices support. I believe Garmin only has it enabled (unsure if this is licensing or hardware) in their East Asian models (Hong Kong's Octopus cards also use FeliCa).

* https://en.wikipedia.org/wiki/FeliCa

* https://en.wikipedia.org/wiki/MIFARE


It was possible before, I had used it until last year!

But earlier this year I checked in with some 240 JPY left and found out that I couldn't charge it during the trip... Luckily it was enough for the 2 stations in the Osaka Metro to save me an awkward moment.


It was an Atari ST: https://youtu.be/6LxPEz9x2fs


> The inventor of the atomic bomb

I'm not sure I understand this characterization of Oppenheimer. There are a few people who could take this title but I don't think Oppenheimer has anywhere near the same claims to it as someone like Leo Szilard or even Enrico Fermi.

If by inventor they mean man who "put into practice", I don't see why the director of a project employing hundreds deserves the title of "inventor".


My guess is that it's the next entry in the BCPL "lineage".

[0] https://en.wikipedia.org/wiki/BCPL

