Anecdotally, what I've seen is that folks who learned typing in the 80s and earlier use two dashes '--' instead of the em-dash (although modern word processors seem to replace this combination with the em-dash). Something else I've noticed is their tendency to use two blank spaces between sentences.
I'm a self-taught typist, with all the quirks that come with it (I can type programming stuff very accurately at 100+ WPM; I can type normal prose at a high WPM as well, but the error rate goes up).
I wonder why he (Andrew Feldman) didn't push back on the incorrect SRAM vs. HBM assumption made in the OP comment; maybe he was so busy that he couldn't even point to the sibling comment? That's a bigger wrong assumption than being off by maybe 30-50% at most on Cerebras's single-server price (it definitely doesn't cost less than $1.5-2M).
I have followed Cerebras for some time. Several comments:
1. Yes, I think that is Feldman. I have seen him chime in on Hacker News before, though I don't remember the handle specifically.
2. Yes, the OP's technical assumption is generally correct. Cerebras loads the model onto the wafer to get the speed; the whole point of their architecture is to minimize the distance between memory and compute. They could do otherwise in a "low cost" mode; they announced something like that in a partnership with Qualcomm that AFAIK has never been implemented. But it would not be a high-speed mode.
3. The OP is also incorrect on the costs. They pick these costs up from a dated customer quotation seen online (which Cerebras has an incentive to jack up), but this is not how anything works commercially, and at that time Cerebras was operating at a much smaller scale. But you wouldn't expect Feldman to tell you what their actual costs are. That would be nuts. My thinking is the number could be off by up to 80% by now, assuming Cerebras has been making progress on its cost curve and the original number had very high margins (which it must have).
Sri Lankan here. I used to compete in the IOI (International Olympiad in Informatics), back in the '90s.
I aced the competition in Sri Lanka, but I failed to win a medal the first time I competed internationally. I solved several problems with recursion, but these failed to complete within the strict time limits allowed.
One of the contestants from another country told me: "You needed dynamic programming to solve problems 1 and 5." I then spent the next year trying to figure out what dynamic programming was. The folks over here weren't familiar with the term. In fact, I was often asked "did you mean dynamic memory allocation?"
After a while, I managed to find a book that talked about dynamic programming, and I was like "Oh, it's recursion + storing results" and "Ah, you can also do this iteratively in certain circumstances"
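For anyone curious, here's a toy sketch of the two flavors (my own example, counting monotone paths in a grid, not anything from the actual competition): top-down is just the recursion plus stored results, and bottom-up fills the same table iteratively.

from functools import cache

# Top-down: plain recursion plus stored results.
@cache
def paths(r, c):
    if r == 0 or c == 0:
        return 1
    return paths(r - 1, c) + paths(r, c - 1)

# Bottom-up: the same recurrence, filled in iteratively.
def paths_iter(r, c):
    table = [[1] * (c + 1) for _ in range(r + 1)]
    for i in range(1, r + 1):
        for j in range(1, c + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1]
    return table[r][c]

print(paths(10, 10), paths_iter(10, 10))  # both print 184756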
Bottom line, armed with this new knowledge, I ended up winning a gold medal at IOI 2001. Fun times.
At university, we had a team for the ACM ICPC. We helped our professor organize a local competition to encourage more people to practice. We participated in that competition, of course.
One question would give you a few numbers no greater than 13, and you had to output the number of ways to put N queens on chessboards of those sizes. Not a difficult problem, but our first implementation took 2 minutes while the time limit was 1 minute. We could have optimized it, but instead we decided to run it for all possible inputs and write a new program with just a switch-case. This was a perfectly legal move. We could see our professor's face while he judged our submission: his huge surprise when the code finished in milliseconds, his look of admiration at us, his curiosity about the solution, and finally his reaction when he checked out our code. Priceless.
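For the curious, here's a rough sketch in Python of the same precompute-then-lookup trick (the bitmask search and the emitted lookup table are just my stand-ins for whatever we actually wrote back then):

def count_queens(n, row=0, cols=0, d1=0, d2=0):
    # Plain bitmask backtracking: count placements of n non-attacking queens.
    if row == n:
        return 1
    total = 0
    free = ~(cols | d1 | d2) & ((1 << n) - 1)
    while free:
        bit = free & -free  # lowest free column
        free ^= bit
        total += count_queens(n, row + 1,
                              cols | bit, (d1 | bit) << 1, (d2 | bit) >> 1)
    return total

# Offline step: run the slow search once for every legal input (n = 13 takes a
# while in pure Python), then emit a trivial lookup "program".
table = {n: count_queens(n) for n in range(1, 14)}
print("ANSWERS =", table)
print("print(ANSWERS[int(input())])")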
Funnily enough, in commercial software engineering your solution wouldn't raise any eyebrows; it's routine and expected. An efficient algorithm, on the other hand, might.
Had a similar problem in a high school comp. sci. class with a maze-following algorithm where you got points off for back-tracking. Turns out your code could pre-fetch the entire maze before making a move, which I think goes against the spirit of the exercise.
I wonder if you can clear up a memory of mine from IOI 2001. The first day results were delayed, and I seem to remember this was because a contestant encoded the search in one of the problems into a compile-time C++ template metaprogram. On the second day, there was then also a compile-time limit for solutions, from what I remember. Do you remember any more details of this?
I think I know the culprit. The story I was told is that the person precomputed some of the results and submitted a huge source file (in tens of megabytes) of hard coded data. This probably led to subsequent amendment of the rules regarding source file sizes.
I'll report back once I find the persons involved.
Wasn't this for one of the questions on the second day? Where you had to submit a series of outputs, but these were trivial to generate once you cracked the problem.
I remember getting an "aha" moment, writing a program, and then submitting (scored 100% too!). Then, I met a guy who also cracked the problem and realized that he just needed to paste the test data into a spreadsheet, do some basic sorting there, and then paste the output into a text file; no coding necessary.
I think you're referring to "double crypt" from the day 2 problem set. For that problem you submitted the result/answer instead of the code, so it shouldn't have triggered a compilation/source-code limit.
Yeah, but they didn't precompute the solution for that specific program data, they precomputed the solution for all possible ones, and then selected the correct one when the program data was provided.
Sure, but the input might be bounded/finite, or the operations needed similarly constrained (e.g. trigonometry operations). Then you can offload lots of the computation to the compilation, sometimes all of it.
No, sorry. I vaguely remember compile time limits, but they were high enough (30 seconds, I think?) that I didn't bother worrying about them (at least, that's my memory).
Yes, it is 5 hours long. At least 4 of those are worth it.
Many people confuse caching with DP. I’ve had that conversation too often. I think it’s down to the memoization examples people choose being too toy and just looking like caching. Caching is global shared state, whereas memoization can be scoped to the current calculation and still be useful.
But they always skip over tabulation, which is where I believe DP distinguishes itself.
The key difference between caching and memoization is not global shared state, it's determinism. That's why the latter mostly applies locally. In that sense, they're only semantically different. Data in a cache can be expected to go stale; it must not if you rely on the structure as a memoized equivalent to calling the function with the given inputs. In practice, you can have global cache whose data is deterministic and you can also have local cache that goes stale. The latter is not memoization.
I think you’re quibbling. Sharing state between independent tasks leads to nondeterminism. That’s the whole problem with shared state. Spooky action at a distance creates nondeterminism and even undefined behavior.
There are absolute claims to make; they just aren't the ones you're making. Determinism is an absolute claim of memoization. "Local rather than shared state" simply is not, no matter how many times you repeat it.
Excluding shared state from memoization implies that you can't rely on the technique in parallel scenarios, which is just not true. I can think of more than a handful of DP problems where you can further optimize with parallelism if the function is modified to rely on shared state memoization. For example, the slew of problems that rely on the tabulation to simply fetch one value in the "previous row" and the value at the "previous column" of the current row. In practice, instead of iterating through rows and waiting to completely process them one at a time, you could simply start processing another row in parallel once you have just enough of the requisite "previous" data. This is absolutely valid DP. Some DP solutions even receive their tabulation. How does that not qualify as global?
If you thought "memoization is never shared", then in practice not only would your perfectly deterministic functions end up solving the same problems over and over each time they're called, you would also deny yourself the tremendous optimization opportunities that a guarantee of determinism gives you.
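To illustrate the parallel case with a toy sketch (edit distance, filling the shared table one anti-diagonal at a time; Python threads won't actually make this faster because of the GIL, but it shows that sharing the memo table across workers is safe when the function is deterministic):

from concurrent.futures import ThreadPoolExecutor

def edit_distance(a, b):
    n, m = len(a), len(b)
    # Shared table; row 0 and column 0 are the base cases.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j

    def fill(cell):
        i, j = cell
        if a[i - 1] == b[j - 1]:
            dp[i][j] = dp[i - 1][j - 1]
        else:
            dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])

    with ThreadPoolExecutor() as pool:
        # Cells on the same anti-diagonal (i + j == d) only read cells from
        # earlier diagonals, so each diagonal can be filled concurrently.
        for d in range(2, n + m + 1):
            cells = [(i, d - i) for i in range(max(1, d - m), min(n, d - 1) + 1)]
            list(pool.map(fill, cells))
    return dp[n][m]

print(edit_distance("kitten", "sitting"))  # 3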
> Many people confuse caching with DP. I’ve had that conversation too often.
This is the beginning of my thesis. Without this no other conversation is productive.
The problem with the DP space in specific and coding in general is that a substantial fraction of 'everyone' hears 'memoization' and immediately substitutes caching into their thought processes, "oh that's just caching, got it," and then believes they are productively contributing to the followup conversation when really all they are doing is confusing everyone else and themselves.
That's my beef, frustration, and frankly, institutional trauma. Equating them in your brain makes everything you say from then on practically useless.
You have a fair point about parallelism, but even here the sharing you illustrate is between parallel sub-problems, the 'sub' is IMO doing all of the work in that sentence. Versus me caching a header because you and I have the same account settings, only there's a bug because it's now showing you my unread message status due to incorrect sharing.
But DP is a form of caching, no? It sacrifices space by caching intermediate results to compute successive results. The only reason it is called DP is because the "inventor" (IIRC Bellman) needed a cool name.
DP is not a form of caching. DP is a way of breaking down a problem into subproblems. Caching (memoization) is one way to use this subproblem structure to reduce the amount of computation that ends up being performed in the end.
DP is a resolution technique that uses progressive (i.e. "dynamic") memoization (i.e. "programming" as in "tabulation", or resulting data stored in and retrieved from a table) as a way to curtail the cost of overlapping (i.e. intermediary and repetitive) subproblems. Memoization is a form of deterministic caching.
No, memoization is not caching. That’s a reductive understanding that causes no end of trouble.
Caching is global shared state. It brings nearly every problem that global shared state brings with it. If you conflate the two you miss all of the advantages of DP and “just use caches”.
Memoization in the scope of DP (some languages misuse the word in their APIs) is acknowledging that a problem is self-recursive and eliminating the duplicate work by either inverting the problem or storing previously calculated values. These are usually call-graph scoped. Nobody else sees the value. For instance, in the case of solving Fibonacci with iteration, you need only remember the previous two values, which you store in registers. There is no table. There is no O(n) storage space.
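Roughly (a sketch, not anyone's production code):

def fib(n):
    # Only the previous two values are ever needed: no table, O(1) extra space.
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev

print(fib(10))  # 55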
Again, tabulation is when the contrast becomes stark. You’re creating an array where the relationship between the values contains information that makes the calculations cheap.
I don’t know what rock you’re living under but Redis is absolutely GSS, and any lookup table accessible between different tasks is shared state. In a language with threads anything the entire app can see is considered global shared state.
I think you misunderstand. DP is not equal to "caching", but in the set of all things that are "caching" DP is contained. I cannot think of any DP algorithm that does not explicitly or implicitly use a memory cache. Can you? Even in the case of DP Fibonacci the two registers of previous values constitute a cache.
I was given a programming interview once where the interviewer gave me a DP problem (which I didn't know at the time), and he failed me because I used a cache instead of memoization. Aside from being inelegant, what are the downsides to caching?
Memoization is caching so that's a strange reason to fail someone. You cache based on the function call and its arguments.
SOME_FUN = dict(base_cases)  # pre-seed the memo table with the base cases

def some_fun(a, b, c):
    if (a, b, c) in SOME_FUN:
        return SOME_FUN[(a, b, c)]
    # ... or whatever the recursive calls would be
    result = some_fun(a - 1, b - 1, c) + some_fun(a, b, c - 1)
    SOME_FUN[(a, b, c)] = result  # store the result keyed by the arguments
    return result
That's it. Some languages (Python, Lisp) make it easy to take the naive version and wrap it (Python decorators are useful here) so that you don't have to write that code above. That's what the @cache decorator in functools does for you. You can memoize the naive version without needing to actually write a specific memoized version.
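For example (a minimal sketch of my own; binomial coefficients via the naive Pascal's-triangle recursion, with functools.cache supplying the memo table):

from functools import cache

@cache
def binom(n, k):
    # Naive recursion; @cache stores each (n, k) result so it's computed once.
    if k == 0 or k == n:
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)

print(binom(40, 20))  # 137846528820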
> In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls to pure functions and returning the cached result when the same inputs occur again. Memoization has also been used in other contexts (and for purposes other than speed gains), such as in simple mutually recursive descent parsing.[1] It is a type of caching, distinct from other forms of caching such as buffering and page replacement. In the context of some logic programming languages, memoization is also known as tabling.[2]
It includes citations so I won't bother repeating them. Memoization is caching, there is no confusion about this in the literature.
If I wrote "a square is a rectangle" would you also waste time shouting that rectangles aren't squares? I didn't write that all rectangles are squares. Nor did I write that all caching is memoization.
Even your quote doesn't support your quibbling. It says that memoization is distinct from other forms of caching. Which does not mean that it is not a form of caching.
Global shared state. High memory overhead. Evictions mean that the beginning of your call can see potentially different values than the end, and most of us write code like this as if it is transactional. We cannot handle a user’s account being closed or upgraded to an admin in the middle of a transaction. So using a cache to keep refetching the user data instead of passing it with the call graph is fraught (and makes unit testing terrible).
As a concrete example, my coworker was trying to walk a DAG with many leaf nodes that were repeated (think dependency tree). He used a cache to do this. It had bugs (2 I knew, but 6 I found). It was one of three large calculations we made at startup, and it should have been the least of these. But it ate memory like candy and added a lot to our start time which was slowing down deployments.
So I replaced it with a DP solution that just looked like normal code, reducing startup memory by ~80 MB and start time by half: walking the tree, making a list of all the nodes in dependency order, and then processing each node once after the list was compiled. And for the microservices and dev tools where we ran this same code, the startup time became negligible.
Part of the problem is that the library we used did not like being called recursively, so you had to dance around reentrancy bugs and that cost time and space. But a linearized run has no recursion.
Edit to add: I realize now that I’ve left the conclusion ambiguous.
His two main sins were one, that the cache lived beyond the useful lifetime, and two, that even with the cache some amount of post-processing was required. By unrolling the tree and running each calculation once I reduced the post processing by about 3/4 and climbing as the tree grew in width and height.
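The shape of the fix, as a rough sketch (the graph, node names, and cost function here are made up; the real code was obviously more involved):

from graphlib import TopologicalSorter

# Hypothetical dependency graph: node -> the nodes it depends on.
deps = {
    "app": {"web", "db"},
    "web": {"lib"},
    "db":  {"lib"},
    "lib": set(),
}

def cost(node, child_costs):
    # Stand-in for whatever per-node calculation the real code did.
    return 1 + sum(child_costs)

# Linearize once, then process each node exactly once, in dependency order.
results = {}
for node in TopologicalSorter(deps).static_order():
    results[node] = cost(node, [results[d] for d in deps[node]])

print(results["app"])  # lib=1, web=2, db=2, app=5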
Memoization is deterministic caching, meaning that the structure is a perfect stand-in for a deterministic function. The "determinism" of the function itself is relative to some agreed upon lifespan, during which if you feed it some input, the corresponding output is guaranteed to always be the same (e.g. until the next reboot, the runtime of the main program, the span of a function call, etc). Within that timespan, the result of computations can be reliably stored and retrieved from the cache. When it's exceeded, the function's corresponding cache also goes stale and should be considered unreliable and to be discarded.
Note that in the case of memoization, the guarantee of determinism (and associated lifespan) is attached to the function, not the caching structure. Unlike some other caching approaches that rely on a replacement policy.
I was part of the hosting team at IOI 1997 in Cape Town, had conversations with the participants similar to the ones you described, and also learned about dynamic programming there. Your description of it as "recursion + storing results, which you can sometimes do iteratively" is the best summary. A good example is the iterative Fibonacci algorithm.
Another fun anecdote from that time is that we had to take special precautions with the scoring program which connected to the participants machine over the serial port. The previous year one participant had hacked the serial port interface to manipulate the results. :-)
Nice! I guess you must know Bruce Merry. Once, while practicing, I kept on hitting my head against a problem. On the spur of the moment, I decided to fire off an email to Bruce (given his track record, I figured "if anyone can figure this out, he can"). I was shocked (and very pleased) to get a reply a few days later outlining his thought process and how he went about solving the problem.
I used to know the ZA team leaders as well (after competing, I was a team leader for several more years).
Yes, I do indeed know Bruce. He won the national Maths and Computer olympiads in my final year of school even though he was 4 grades below me and had just started high school. :-D
The term always made me feel insufficient, as I always assumed it meant much more and that I just didn't understand how.
In the Fibonacci example, it seems more like common sense to avoid recomputing expensive things. Instead we should give derogatory names to the inefficient implementations, not an opaque name to the only sane implementation.
One thing I always wondered is: what makes dynamic programming more "dynamic" than regular programming? Why is recursion "static" but it becomes "dynamic" if you store results?
The word dynamic was chosen by Bellman to capture the time-varying aspect of the problems, and because it sounded impressive. The word programming referred to the use of the method to find an optimal program, in the sense of a military schedule for training or logistics. This usage is the same as that in the phrases linear programming and mathematical programming, a synonym for mathematical optimization.
Also, Harold J. Kushner stated in a speech that, "On the other hand, when I asked [Bellman] the same question, he replied that he was trying to upstage Dantzig's linear programming by adding dynamic."
When I learned dynamic programming, I was told it was in contrast to simple memoization: memoization is a cache of computation, whereas dynamic programming involves a "choice" at each step.
For example, let's look at the Fibonacci sequence, F(n) = F(n-1) + F(n-2): naively you just fill in a (1-dimensional) table with the results [1, 1, 2, 3, 5, 8, 13...] and it feels static because there's nothing to choose or decide.
In contrast for example the longest common subsequence looks like this:
lcs[m, n] = 0 if m = 0 or n = 0
lcs[m, n] = lcs[m-1, n-1] + 1 if X[m] = Y[n]
lcs[m, n] = max(lcs[m-1, n], lcs[m, n-1]) if X[m] != Y[n]
At every step there's a "choice" to be made depending on the situation. Functions like max() are quite typical of dynamic programming. At least this is how I interpret the "dynamic" part.
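A runnable version of that recurrence, for the record (bottom-up tabulation; the max() is where the "choice" happens):

def lcs(x, y):
    m, n = len(x), len(y)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # The "choice": keep whichever sub-solution is longer.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]

print(lcs("ABCBDAB", "BDCABA"))  # 4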
Ditto, it took me years to figure out "dynamic programming". It sounded hard and complicated! When I finally figured out wtf it meant I felt anger instead of relief - what an absolutely asinine name for what is otherwise a simple (but valuable) concept.
Our industry has a bad habit of this, where we name things in such a way that make them harder to explain.
See for example one of my other pet peeves: "lambdas". Sounds fancy and difficult! Sounds like a concept you may read about in a research paper! The reality is that it's really quite simple, and we do ourselves a disservice when we name things like this.
Hydration has good connotations though. Hydration is healthy and good for your skin. If you'd used an adjective like "waterboarded" it would have less marketing appeal.
If you use too much moist code, is there a risk of getting a soggy bottom? Considering how many names seem to be derived from memes these days (all the -strikes and so on), a language that guarantees " safety" as a metaphor for correctness would probably do well.
Oh, then you will really love and want to fuck the name of the "Wave Function Collapse" algorithm! (Maybe Chuck Tingle will write a book about it.) Also "Extreme Learning Machines", "Reservoir Computing", "Liquid State Machines", and "Crab Computing" (I shit you not)! Don't even get me started on "Moveable Feast Machines".
[Welcome to Coffee Talk, with Linda Richman. Today's topic: Wave Function Collapse is neither quantum physics, particles, waves, nor collapsing functions. Discuss amongst yourselves, while I am constrained to solve all these Sudoku puzzles. Oh, I am so verklempt!]
DonHopkins, 6 months ago, on: Generating an infinite world with the Wave Functio...
Ha ha, great comment! You're right, WFC's name just refers to quantum mechanics because it sounds cool, not because the algorithm has anything to do with physics, waves, or collapsing functions (it's more just a constraint-based Sudoku solver). But I perceive it more as a playful buzzword for a useful algorithm, and not as maliciously deceptive like Deepak Chopra, who I was just commenting on recently (Quantum was cool until Deepak Chopra ruined it for everybody):
I do think it's useful combined with other techniques like procedural programming, cellular automata, Voronoi diagrams / jump flooding, noise, etc, to steer the higher and lower level structures. Check out Maxim Gumin's work on Markov Junior!
I think the really nice property of WFC is that it's very naturally and easily artist-driven. It doesn't require programming skills to use, you just supply it with a bunch of concrete before and after examples. (Like "Programming by Example".) That makes it much easier for a wide range of people to use, which is an important feature for democratizing custom world generation. Especially when you can construct or discover the examples in-world, and have it operate kind of like a 2-d tile or 3-d cube auto-complete, that looks at the rest of the stuff you've built or seen, like Canvas does with 1-d code.
DonHopkins, 6 months ago, on: Generating an infinite world with the Wave Functio...
The difference between Deepak Chopra's abuse of Quantum Physics terminology and WFC's is that WFC actually works and is useful for something, and its coiner publishes his results for free as open source software and papers, so he deserves more poetic license than a pretentious new-age shill hawking books and promises of immortality for cash like Deepak.
Here are some notes I wrote and links I found when researching WFC (which is admittedly a catchier name than "Variable State Independent Decaying Sum (VSIDS) branching heuristics in conflict-driven clause-learning (CDCL) Boolean satisfiability (SAT) solvers"):
Here are some notes I wrote and links I found when researching Wave Function Collapse (WFC). -Don Hopkins
Wave Function Collapse
Maxim Gumin
Paul Merrell
https://paulmerrell.org/research/
https://paulmerrell.org/model-synthesis/
Liang et al.:
Jia Hui Liang, Vijay Ganesh, Ed Zulkoski, Atulan Zaman, and Krzysztof Czarnecki. 2015. Understanding VSIDS branching heuristics in conflict-driven clause-learning SAT solvers. In Haifa Verification Conference. Springer, 225–241.
WaveFunctionCollapse is constraint solving in the wild
https://escholarship.org/content/qt1f29235t/qt1f29235t.pdf?t=qwp94i
Constraint Satisfaction Problem (CSP)
Machine Learning (ML)
[...lots more stuff...]
Even more fun stuff about WFC, AgentSheets, and defining cellular automata rules by example:
>There are two main complaints from academic community concerning this work, the first one is about "reinventing and ignoring previous ideas", the second one is about "improper naming and popularizing", as shown in some debates in 2008 and 2015.[33] In particular, it was pointed out in a letter[34] to the editor of IEEE Transactions on Neural Networks that the idea of using a hidden layer connected to the inputs by random untrained weights was already suggested in the original papers on RBF networks in the late 1980s; Guang-Bin Huang replied by pointing out subtle differences.[35] In a 2015 paper,[1] Huang responded to complaints about his invention of the name ELM for already-existing methods, complaining of "very negative and unhelpful comments on ELM in neither academic nor professional manner due to various reasons and intentions" and an "irresponsible anonymous attack which intends to destroy harmony research environment", arguing that his work "provides a unifying learning platform" for various types of neural nets,[1] including hierarchical structured ELM.[28] In 2015, Huang also gave a formal rebuttal to what he considered as "malign and attack."[36] Recent research replaces the random weights with constrained random weights.[6][37]
But at least it's easier to say, rolls off the tongue smoothly, and makes better click bait for awesome blog postings!
I also love how the cool buzzwords "Reservoir Computing" and "Liquid State Machines" sound like such deep stuff.
Yukio-Pegio Gunji, Yuta Nishiyama. Department of Earth and Planetary Sciences, Kobe University, Kobe 657-8501, Japan.
Andrew Adamatzky. Unconventional Computing Centre. University of the West of England, Bristol, United Kingdom.
Abstract
Soldier crabs Mictyris guinotae exhibit pronounced swarming behavior. Swarms of the crabs are tolerant of perturbations. In computer models and laboratory experiments we demonstrate that swarms of soldier crabs can implement logical gates when placed in a geometrically constrained environment.
Not quite sure what this all is, but it is an interesting way to wind up in the a.m. fully reminded that I am not and have never been, nor will be, as smart as I would like to be, but it’s an enjoyable ride. Thanks
I know this is trite and arguably against the guidelines and I'm sorry. But this is simply a crazy thing to comment on an article whose whole point is to explain why it's called that.
With dynamic programming the order of computation isn't necessarily fixed, even though you can sometimes order it to reduce or eliminate recursion. You are just reusing past results when they exist, else computing and storing if they don't.
I came to programming after I graduated, and I never really understood Dynamic Programming until I watched one of Stanford’s algorithm courses (kudos to MOOCs).
But honestly, I’ve never done anything truly useful with it—except that it gave me a new perspective on ordinary things. For example, I found that there may be great wisdom in the Taiji symbol (image here: https://commons.wikimedia.org/wiki/File:Yin_yang.svg). We can divide the current problem and conquer it at the lower levels (like one of the dots in yin and yang). And of course, those lower levels still contain their own divisions and computations—until there are none left.
At the same time, we can also start from the base and build upward (bottom-up), moving toward higher-level computations—where possible divisions await, until eventually one big unified answer emerges.
Pretty useless for its actual usage here, isn’t it? ^^
Dynamic programming is for me one of the things that really scratched an itch and made me start liking programming. The invariant/inductions of composing smaller things into a bigger solution is so elegant.
I do feel dynamic programming influences how I solve problems in my work: how, if you compose the problem correctly, it can never reach a fault state, in a sense. Not sure if I'm able to explain what I mean, but for instance I recently refactored an old application at work that had lots of bugs, was hard to grasp, and had loads of error-checking code / asserts. However, by composing it a bit differently and modeling it properly, I could remove swaths of code / defensive programming, because with the help of the compiler I can now guarantee the base cases hold, etc.
Edit: a somewhat contrived example might be an Order. If you have one big model tracking the lifetime of an order, some fields will have to be null until they have a value, for instance a field sent_at_time. Then you have a function that generates an email to the customer, which uses that field. How do you handle that field possibly being null? Do you say "I know that at this point in time, this field is always set, since the email is generated after sending" and tell the compiler to ignore the possibly-null value? Or do you handle the possible null with some error handling / fallback value that in practice is dead, cluttered code and will just confuse a future developer, who will be left asking "how can this happen?"
With the lessons from dynamic programming in mind, I think both are bad solutions. You want to be able to go from state n to n+1 with safety (order purchased, order sent, email sent). So model it in a way that the email-sending code can only ever get an order in the correct previous state as input. Then if the assumption were to break in the future (maybe some orders are digital and the email is sent without the order having been physically sent), that broken assumption will immediately be obvious in how the states/models are composed, and not at runtime when either the not-null assert fails or the emails start containing the fallback value.
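As a rough Python sketch of the shape I mean (all the names here are hypothetical; the real thing leaned on a static type checker, but the modeling idea is the same):

from dataclasses import dataclass
from datetime import datetime

@dataclass
class PurchasedOrder:
    order_id: int
    purchased_at: datetime

@dataclass
class SentOrder:
    # Can only be constructed once the order has actually been sent,
    # so sent_at is never null by construction.
    order: PurchasedOrder
    sent_at: datetime

def send_confirmation_email(order: SentOrder) -> str:
    # No null check, no fallback: the type guarantees sent_at exists.
    return f"Order {order.order.order_id} was sent at {order.sent_at:%Y-%m-%d}."

purchased = PurchasedOrder(order_id=42, purchased_at=datetime(2024, 1, 2))
sent = SentOrder(order=purchased, sent_at=datetime(2024, 1, 5))
print(send_confirmation_email(sent))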
It was mainly just one example of the thought pattern that I feel translates. In my head it has a relation. In dynamic programming, you compose a solution of smaller, but still valid solutions. When calculating fib(8), you can trust that combining fib(7) and fib(6) is valid, and so on down to the base case. If you apply that concept to other parts of programming, it's much easier to reason about.
Has there been an "arms race" of sorts with programming contests, as knowledge percolates and harder categories of problems need to be chosen? Since these days dynamic programming is basically table stakes for ICPC-type problems.
Yes, absolutely. I did programming competitions back in high school (around 10 years ago), and common folklore was that back in the day knowing dynamic programming could win you a medal, but today it was just basic expected knowledge.
There's a lot of variety in DP. Knowing about DP doesn't help much with solving DP problems. I'm sure you've seen all the fun problems on Codeforces or the LeetCode hards. The one quote I remember from Erik Demaine's lecture on DP is where he comments, "How do we solve this with DP? With difficulty."
I think it's because access to knowledge has become a lot easier, so contestants know more; plus, the competitions themselves have become more organized.
For example, last I checked, the IOI had a syllabus outlining the various algorithms contestants needed to know. During my time, it was more of a wild west.
I had the same experience: none of my mentors knew what "dynamic programming" was, and our country-level competition (which had problems inspired by the IOI) required dynamic programming for 2 out of 5 problems. And of course I failed the first time (2004). Then I learned what it was about and aced it the next time (2005).
Oh that's the year I went. Trip down memory lane. I was woefully unprepared. The mobile phone cell summation problem underlined how much I didn't know, and later figuring it out, how much I had to learn; it cemented my love for the field. I just loved the experience.
As a US-born native English speaker, I always struggled with the phrase. Eventually I just mentally labeled it a misnomer, since the phrase gave no insight into the actual problems it described.
For my first programming competition (the national ICPC prequalifier, in 2002), I asked my teammates (whom I didn't know very well; we had just cobbled together a team) a couple of minutes before the contest what this dynamic programming thing was. One of them said, “oh, it's only caching the output of a recursive function” (technically, that's memoization, but who cares).
I ended up solving a problem using DP a couple of minutes before the deadline, which was enough to get us to 3rd and to ICPC. Fun times.
I had to laugh hard the first time I heard about it. I mean, it's a really obvious technique if you've ever done low-level programming on limited machines, as it's very close to something we used to do all the time: calculate result tables and use lookups which we built into our code.
Trades memory for CPU when it's worth it. Dynamic programming takes the same idea but does it at runtime.
To be fair, DP generally also involves recursion in some way. You could describe it as a lookup table where, to fill in later entries of the table, you need to use the earlier entries.
Technically, "dynamic programming" (god, I hate that term) is focused on memoization/tabulation. It doesn't necessarily require recursion or an iterative approach, although they almost always go hand-in-hand in common leetcode/hackerrank/etc exercises.
Em-dash fan here as well. I've been watering down my use of the em-dash, because I don't want folk to think that my (always self-written) business emails, etc are a chatGPT job.
As an interviewer, I'm seeing a huge increase in the proportion of candidates cheating surreptitiously during video interviews. And it's becoming difficult to detect any wrongdoing unless you're very watchful and look out for academic-sounding responses to questions.
Why would anyone encourage building such a tool, I can't fathom.
Some first/introductory interviews are now "powered" by AI. As in, the interviewee gets an AI bot that evaluates them. I'd not be surprised if this takes over and becomes standard.
For now, this is perhaps a blessing in disguise: it tells you that a company is all aboard the hype train and that leadership is seriously lacking in critical thinking and judgment. That can certainly save you from wasting more time with them.
I really, really hope this does not become a "standard". Ugh.
Don't candidates also get a say? If a company asked me to jump through that hoop I'd have a simple one-word response. "No"
If enough good candidates have that reaction, not using AI screening will become a prestige marker for a company, and one that gives it access to the best candidates.
Have you tried putting yourself in the shoes of the humans trying to find a job in a market that is churning right now, and that was already a dystopian, hellish process of "putting on a tie and using a firm handshake" to apply into the void before AI was injected into it?
This is so stupid. One of the main reasons it's become a dystopian, hellish process is because people cheat; proliferating cheating will make it even worse.
Lying and cheating on a job interview isn't a victimless crime. You're harming the company and all your coworkers when they hire you into a job you're not qualified for; you're harming all the other actually qualified candidates that didn't get hired instead; you're harming yourself, when your salary comes from a company who rely on you to give something you can't give them.
Well, be prepared for it to get MUCH, MUCH worse: two AI agents battling it out, trying to get each other to mess up, while all the humans have no idea what the hell is happening.
It's pretty simple - people need to eat (and fulfill other basic needs, of course), to eat they need jobs, to get jobs they need to pass the interview. The hiring process in a lot of industries is heavily gamed at this point, to the point that not cheating is basically an automatic fail. So, if you want to eat, you cheat.
> The hiring process in a lot of industries is heavily gamed at this point, to the point that not cheating is basically an automatic fail.
This sounds a bit like "the thief thinks everyone steals". Interview preparation is normal and common, but I don't think cheating is. It may depend on the location, of course.
The "heavily gaming" happens before the interview. When you reorder and edit your resume to have the right keywords to get on top of the LLM/intern sorted pile.
I can totally understand thinking this way out of desperation, and being lulled into thinking it’s this simple, but it seems short sighted with hidden complexities. First of all, it’s risky. If you get caught, you don’t eat, and it could follow you and prevent you from even getting in the door elsewhere. Companies are always going to be watching for cheaters, they are always going to have more visibility than you into what interviewees are doing, and they are always going to have more resources. Even if you do cheat and get hired, it quickly becomes obvious that you’re unqualified and can’t do what you claimed, and even if you don’t lose your job, you’re less likely to get promoted. Being lazy and amoral about interviews seems like a trap people set for themselves.
The good news is that a lot of companies are starting to allow AI during the interviews, and suddenly it’s not cheating. But of course that means you need to be good at using AI and interviewing and programming, you won’t be able to cheat and rely on the AI to do your talking for you.
Doing whatever it takes to get a foot in the door may be encouraged, but only up to a point, and I think out-and-out cheating is probably crossing a line... as would murder, arson, etc.
If cheating means asking someone at the company you're interviewing with for a peek at what will be asked, then great. In my book that's using leverage.
Reviewing previously posted interview tests is probably recommended.
Hooking up a copilot to answer interview questions for you in real time is probably less so.
How is it unethical? Say you ask whoever what's being asked and they say you need to sort a string in place and then discuss how a random forest gets trained... You still need to answer those questions AND know enough to answer the follow up questions. If you're no good you'll be still found out. It just means you'll have a head start over someone without those kinds of contacts. So what level of utilizing your professional network crosses a line? Does a recommendation cross a line because I know for a fact that internal recommendations are moved to the head of the queue in most companies.
Presumably the value in knowing "you need to sort a string in place and then discuss how a random forest gets trained" is that it impacts your answers - for instance, by allowing you to look this up before the interview while appearing to the interviewers to be operating under the same conditions as the other candidates, who did not know to. Your performance then appears as a signal of broader knowledge and capability than you possess - you have, as is the entire point here and which I should not need to spell out, gained an advantage over other candidates by virtue of the information which was intentionally leaked.
If the point of the interview were "answer those questions AND know enough to answer the follow-up questions" _once told what to expect and prepped_, they'd be sharing those questions with all candidates. If you feel that saying to the interviewers "by the way, I did know this because [X] told me they'd be here" wouldn't impact outcomes, then great. If you feel you'd need to hide that, then you're aware this involves dishonesty - and if you still struggle to see how that's unethical, let's just make sure we never need to work together.
> let's just make sure we never need to work together
Seeing as how you seem to prefer to let everyone else steal a march on you in interviews in the interest of "fairness", that's not likely to happen anytime soon.
This is correct—I do not engage ethics only when it won’t cost me, nor take convenience into account when determining where my lines are. Perhaps I’m privileged to have that option.
You've probably been out of the getting-hired game, but I had a glimpse of it last year: absolutely terrible.
When I started you'd send a mail to the company directly about a position, you'd go to the office, have a short interview, meet the team and they'll let you know. That's it.
Now it's 2 rounds of HR bs, 3 layers of tech interviews, then meet the CEO/CTO/etc. And then references and then a final "chat". And you still can get ghosted at literally any step, even at the final cozy chat, just because of "vibes".
And throw in companies sending you leetcode even before talking to you and you can see why one would want to get through the bs.
I still stand by my favourite approach for tech jobs: an intro and tech chat (1-2h) about your resume, what you'll be doing, and anything you might have questions about (no challenges or stupid riddles). Then, if everything goes smoothly, you get a 2-week contract and you're on probation. If everything goes well, you get another contract for 3-6 months (up to you to accept or not), and then you get converted to permanent if everything went well for both parties.
I actually like your idea of a probationary hire, but you can see this is just an even longer extended interview, right? If companies were to adopt this model en masse, they would over-hire and then drop most people after the first 2 weeks, and you’d be out looking for another job, having wasted even more time than 5 rounds of interviews, and being unable to interview for multiple jobs at the same time.
Software interviews and hiring have definitely changed over time, and I know it’s harder right now, but I think we’re seeing the past with slightly rose-tinted glasses here. It was never only just one short interview, there were applications and emails and phone screens. In my career, I’ve always had multiple interviews and technical discussions during job applications, even back in the 90s. Getting hired, for me, has always taken several weeks end to end, if not longer.
There are a bunch of reasons interviews are getting harder, and people trying to game the system and trying to cheat are one of them, a big one. Think about it from the company’s perspective: what would you do if the volume of applications you got started far exceeding the number of positions available, and an increasing percentage of the applications you got were people unqualified for the positions but adept at pretending? More face time vetting before hiring seems like the only reasonable answer.
Other reasons why interviews are getting harder are that software jobs are more competitive now and that relative pay has possibly gone up. If interviewing was easier back in the day (and I agree that it was), it's because there wasn't as much competition.
Agreed, I would too. Getting paid is the main difference, and while it would be hard if it happened a lot, it’d at least put food on the table. I just know from time spent contracting and from jobs with periodic and/or seasonal layoffs (films & games in my case) that most people in those jobs still don’t like the unpredictability, even when they choose contracting for its flexibility, and even when turnover is limited to once every year or so.
I am old and thankfully out of the getting hired game. I was cleaning out some files (paper!) recently and ran across correspondence from old job searches. As you said, single visit and decision. I was also struck by the number of letters from companies thanking me for my resume and politely telling me they were passing but would keep me in mind for future openings. It was not uncommon to receive a letter directly from the hiring manager thanking me for coming to an interview.
A two week probation means that nearly all candidates will need to quit their current job to do the probation which seems unlikely to be popular with candidates
I won't use it, but I do see it as somewhat symmetric. If the interviewers are using AI or expecting you to use AI for these tasks once you're on the job, then it doesn't seem completely immoral.
Get ready to start having some fun in your interviews. Start including things like redirection of focus through general statements, unrelated (and false) trivia, and misleading suggestions in your interview questions. Most of the humans you'd like to hire will ignore those or ask you about them.
Many LLMs will be derailed into giving entertainingly wrong answers:
> unless you're very watchful by looking for academic responses to questions
I've noticed that a lot of the supposed hallmarks of "AI slop writing" (em-dashes, certain multisyllabic words, frequent use of metaphor) are also just hallmarks of professional and academic writing. (It's true that some of them are clichés, of course.)
It seems like most efforts to instruct people on how to "fight back against AI writing" effectively instruct them to punish highly literate people as well.
I think it's often still possible to tell human writing that uses some of the same tropes or vocabulary apart from AI writing, but it's very vibes-based. I've yet to see specific guidance or characterizations of AI writing that won't also flag journalists, academics, and many random geeks.
Honestly, why would you care? IF, and this is a big if, you are confident your interview process accurately assesses the abilities of candidates to carry out the role, then why would LLM assistance even matter? Are they not going to be allowed to use LLMs on the job?
This faux-outrage is just showing how broken the whole hiring process is in tech.
Stop giving people puzzles and just talk to them. If you're unable to evaluate if someone's a good fit for a role then you either need to learn more about effective interviewing, learn more about the role, or find someone else who is good at hiring/interviewing.
Indeed, I am sympathetic to the author in this situation because I think open-source is important, but I don't approve of this software and don't want to affiliate with it by even starring it on GitHub.
Not really sure what I can do for the author but say "that sucks, bro".
I think what gives Tao the title is how multidisciplinary he is. He can wander in to a new subfield of mathematics and start making SOTA contributions after not very much time, which is a rare thing.
Doctor here (non-USA). This is standard procedure and has been standard ever since I was a medical student, almost 20 years ago. The "running nurse" is in charge of maintaining a count and making sure that nothing is left inside the patient.
Swabs and towels tend to come in pre-prepared packs, so the nurses have another source of truth. Any mismatch between the preprepared count, running count during the operation, and actual swabs/towels laid out at the end is a serious cause for concern. You will not close up a patient if there is a mismatch.