A third of the country rents. Renters pay the utility bills. Landlords pay for appliance upgrades.
Why would the landlord put any effort into upgrading appliances when the cost of not upgrading them is borne by the renters?
I've never rented at a place where they didn't want to fix broken equipment with the cheapest possible replacement. And no renter would ever consider purchasing a major appliance like this since they'll end up priced out before they recover the cost in utility bills.
They're a nice technology, but our incentives are all wrong for a lot of housing stock.
In some locations you can't rent out places without minimum energy efficiency ratings, which then leads to insulation and heat pumps getting installed.
This is referred to as "Minimum Energy Efficiency Standards" (MEES) and seems to have been pioneered in the UK, then adopted by the Netherlands and France, and then the EU generally.
They are efficient but do not have as high an energy output as a smaller and cheaper gas furnace. Apart from that, the water temperature is lower, so you need much larger radiators. Due to the lower energy output, you also need better insulation or a relatively massive heat pump. And the tech was not around 20 years ago (for reasons unknown to me).
The water temperature you deliver to radiators isn't defined by the capacity of the heat pump, but by how hot the radiators can be for safety/comfort reasons. If the radiators are too hot, people could get burned by touching them, or things like plastic chairs could melt. The piping in the walls and floors also can't handle very high temperatures.
The 60-70C water temperature used in radiators is easily achievable by an air-to-water heat pump. It does not depend on the energy source, whether gas, oil, or electricity.
Condensing gas boilers similarly run more efficiently at lower temps.
If the water returning to the boiler isn't below 54C then there will be no condensing at all, and the advertised 90%+ efficiency won't happen until the return temperature is more like 46C.
That translates roughly to a max winter temperature of 65C leaving the boiler, and lower when less heating is required.
This can be tweaked by the end user and save 10-20% on heating bills.
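To put rough numbers on that, here's a small sketch built only from the figures quoted above (no condensing above a ~54C return, the advertised ~90%+ by around 46C); the 80% non-condensing baseline and the straight-line interpolation in between are my own assumptions for illustration, not a real boiler curve.

    // Illustrative only: efficiency vs. return temperature using the figures
    // quoted above. The 80% non-condensing baseline and the linear ramp are
    // assumptions, not manufacturer data.
    #include <iostream>

    double boilerEfficiency(double returnTempC) {
        const double noCondensingAbove  = 54.0;  // no condensing above this return temp
        const double fullCondensingBelow = 46.0; // ~90%+ efficiency around here
        const double baseEff = 0.80;             // assumed non-condensing efficiency
        const double peakEff = 0.92;             // assumed condensing efficiency

        if (returnTempC >= noCondensingAbove)   return baseEff;
        if (returnTempC <= fullCondensingBelow) return peakEff;
        double t = (noCondensingAbove - returnTempC) /
                   (noCondensingAbove - fullCondensingBelow);
        return baseEff + t * (peakEff - baseEff);   // straight-line guess in between
    }

    int main() {
        const double temps[] = {65.0, 54.0, 50.0, 46.0, 40.0};
        for (double r : temps)
            std::cout << "return " << r << " C -> ~"
                      << boilerEfficiency(r) * 100 << "% efficiency\n";
    }

Dropping the flow temperature (and with it the return temperature) is exactly the user-side tweak being described; the 10-20% figure lines up with moving from the non-condensing end of a curve like this toward the condensing end.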
From context I can't tell if they mean the heated coils in a heat pump head, or somehow connecting to a traditional radiator.
In older homes there isn't necessarily HVAC at all; instead there are actual radiators. I've lived in two like that, and there is just no forced air to the rooms.
I listed a reason that impacts a third of houses. I didn't write an essay because the article lists plenty of others. It was just weird that they never mentioned the misaligned incentives.
Right, and you simply break even there, so there's not much upside in terms of variable costs unless your electricity is somehow cheaper and not at mainstream California prices.
That doesn't square with the fact that new rentals are built with granite countertops and stainless-steel appliances. Tenants do shop around on the basis of amenities.
Sure, but those amenities are highly visible. Lots of units have a stainless dishwasher exterior, but most will still be the landlord-special plastic tub inside. Who is shopping around based on whether or not there’s a heat pump? I would consider myself relatively well-educated on this and still the heat/cooling source is an afterthought.
Honestly, the ductless mini-split system in my new apartment was a big factor for me. But it was the first time I'd seen one over here in the Mid-Atlantic.
The combination with air conditioning and dehumidifying is genuinely compelling for the simplicity. Especially in new construction.
But these things trickle down to renters last. And if the landlord installs it, you bet your ass the rent is going up more than your savings on electricity.
Lose-lose-lose: if it gets installed, the current residents probably get priced out anyway. It eventually trickles down, but we could do so much better.
Could you clarify what you mean about getting serial over the USB port in the context of debug pins?
I've been using Teensy devices for over a decade and have always had it just recognize the device as if it were a USB to serial adapter and I can talk to it as what I'd call "serial over the USB port". But that obviously doesn't involve what I think software people usually mean when they're talking about firmware debug -- which usually entails stepping through execution, right?
I'm used to just printing debug statements with Serial.println(). I learned on the 8051, where the best bet was to toggle different pins when certain code lines were reached, so even Serial.println() was a huge step up.
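For anyone following along, here's roughly what those two techniques look like in an Arduino-style sketch; the pin number and baud rate are arbitrary, and doSomething() is just a stand-in for whatever code you're actually tracing.

    // Two poor-man's debugging techniques: toggling a pin (watch it with a
    // scope or logic analyzer) and printing over serial. Pin/baud are arbitrary.
    const int DEBUG_PIN = 13;   // the on-board LED on many boards

    void doSomething() {
        // stand-in for the code under test
    }

    void setup() {
        pinMode(DEBUG_PIN, OUTPUT);
        Serial.begin(115200);   // serial console, however the board exposes it
    }

    void loop() {
        digitalWrite(DEBUG_PIN, HIGH);   // 8051-era style: mark "we got here"
        doSomething();
        digitalWrite(DEBUG_PIN, LOW);

        Serial.println("finished doSomething()");  // the "huge step up"
        delay(1000);
    }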
It wasn't specifically in the context of debug pins.
On a "normal" arduino, an FTDI chip on the board handles the job of exposing a serial adapter to your computer over USB. The atmel chip on the other side of the FTDI chip runs your code and getting serial out from your firmware is a short codepath which directly uses the UART peripheral.
On a teensy, there is still a secondary chip, but its just a small microcontroller running PJRC code. This microcontroller talks over the debug pins of the main chip, and those pins aren't broken out (at least back when I last used a teensy). Despite covering the debug pins, this chip only handles flashing and offers no other functionality. Since there is no USB serial adapter, for hobbyists trying to use it for running code with an arduino HAL, the HAL has to ship an entire USB driver just for you to get serial over USB. And this itself means you can't use the USB for other purposes.
For advanced users, this makes debugging much harder, and god forbid you need to debug your USB driver.
It's kind of just a bunch of weird tradeoffs which maybe don't matter too much if you are just trying to run Arduino sketches on it, but it was annoying for me when I was trying to develop bare-metal firmware for it in C.
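If it helps anyone picture the difference: on a Teensy running the stock Arduino core, Serial is the USB virtual serial port provided by that USB stack, while Serial1 is a plain hardware UART on dedicated pins, so a sketch like this exercises both paths (baud values arbitrary):

    // "Serial" goes through the USB stack the core ships; "Serial1" is a real
    // hardware UART on the TX/RX pins and doesn't involve USB at all.
    void setup() {
        Serial.begin(115200);    // USB virtual serial (needs the core's USB driver)
        Serial1.begin(115200);   // hardware UART pins
    }

    void loop() {
        Serial.println("over USB");
        Serial1.println("over the UART pins");
        delay(1000);
    }

Bare-metal, you'd have to bring up that whole USB stack yourself before the first println ever reaches your terminal, which is the pain point above.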
Moderation (the intent and success) varies to such a huge extent that it's practically silly to talk about moderation on Mastodon unless you mean moderation on a specific Mastodon server (like mastodon.social). But moderation (the process) is intense, and servers are usually community-run on the change found in a spare couch (i.e., they're volunteers).
I think they do quite well considering the disparate resource levels, but some servers are effectively unmoderated while others are very comfortable; plenty are friendly to racists or other kinds of bigots, and the infrastructure for server-level blocks is ad hoc. Yet it still seems to work better than you'd guess.
Decentralization means whoever runs the server could be great, could just not be good at running a server, could be a religious fundamentalist, a literal cop, a literal communist, a literal nazi, etc. etc. And all have different ideas of what needs moderating. There is no mechanism to enforce that "fediverse wide" other than ad-hoc efforts on top of the system.
Thank you for the clarification; that makes sense.
It is perhaps also worth noting that the Fediverse architecture does nothing to prevent racists or bigots from being found in the "fediverse" (here referring to the collection of all servers using the protocol, not the protocol itself), and... that's pretty much as intended. Truth Social uses Mastodon as its backend; there is nothing the creators / maintainers of Mastodon could, or by design would, do to shut it off. The same architecture that makes it fundamentally impossible for Nazis to shut down a gay-friendly node makes it impossible for other people to shut down a Nazi node; there is merely the ability of each node to shield its users from the other.
That's a feature of the experiment, not a bug, and reasonable people have various opinions on that aspect of it.
I wonder if enough of them exist to even do a study like that.
I have encountered side effects that probably no one has seen before, simply because of the rarity and peculiarity of the behavior. You don't run into a ton of people both using interferon and doing karate, so if bruising more easily happens 10% of the time... would anyone notice?
Personally I would be more worried about persistent inflammation causing inflammatory disorders, of which there are many. If there are like 10,000 individuals with this trait then there just aren't enough to detect. But that seems direct... wouldn't you expect something like this to potentially even destroy viral reservoirs over time?
The fact that this is short term in the treatment made me 1000x more comfortable with the idea in any case.
They would still have too many other possible hosts. Or maybe they'd find a way to attack that very system, similar to how HIV attacks the immune system.
Can't they make it so that anyone from that geographical location is required to prove their identity and log in to view the articles? That seems like it'd be sufficient, and sure, I'd be annoyed at Wikipedia, but if they linked to the law I feel like people would get it.
Of course, now no one needs to visit Wikipedia because Google has already scraped it with AI, so you can just see the maybe-accurate summary. Seems risky; by that logic you should have to log in to use Google, since the AI might have the forbidden information.
Given the size of Google, I'm not sure if/how they're excluded from this, and they may actually ask for real identities of UK users they don't already "know" via other means like Google Wallet, etc.
That's great, but AlphaGo used artificial and constrained training materials. It's a lot easier to optimize things when you can actually define an objective score, and especially when your system is able to generate valid training materials on its own.
Are you simply referring to games having a defined win/loss reward function?
Because I'm pretty sure AlphaGo was groundbreaking also because it was self-taught, by playing itself; there were no training materials. Unless you say the rules of the game itself are the constraint.
But even then, from move to move, there are huge decisions to be made that are NOT easily defined with a win/loss reward function. Especially in the early game, there are many moves that don't obviously have an objective score to optimize against.
You could make the big leap and say that Go is so open-ended that it does model life.
"artificial" maybe I should have said "synthetic"? I mean the computer can teach itself.
"constrained" the game has rules that can be evaluated
and as to the other -- I don't know what to tell you, I don't think anything I said is inconsistent with the below quotes.
It's clearly not just a generic LLM, and it's only possible to generate a billion training examples for it to play against itself because synthetic data is valid. And synthetic data contains training examples no human has ever done, which is why it's not at all surprising it did stuff humans never would try. An LLM would just try patterns that, at best, are published in human-generated Go game histories or synthesized from them. I think this inherently limits the amount of exploration it can do of the game space, and similarly it would be much less likely to generate novel moves.
> As of 2016, AlphaGo's algorithm uses a combination of machine learning and tree search techniques, combined with extensive training, both from human and computer play. It uses Monte Carlo tree search, guided by a "value network" and a "policy network", both implemented using deep neural network technology.[5][4] A limited amount of game-specific feature detection pre-processing (for example, to highlight whether a move matches a nakade pattern) is applied to the input before it is sent to the neural networks.[4] The networks are convolutional neural networks with 12 layers, trained by reinforcement learning.[4]
> The system's neural networks were initially bootstrapped from human gameplay expertise. AlphaGo was initially trained to mimic human play by attempting to match the moves of expert players from recorded historical games, using a database of around 30 million moves.[21] Once it had reached a certain degree of proficiency, it was trained further by being set to play large numbers of games against other instances of itself, using reinforcement learning to improve its play.[5] To avoid "disrespectfully" wasting its opponent's time, the program is specifically programmed to resign if its assessment of win probability falls beneath a certain threshold; for the match against Lee, the resignation threshold was set to 20%.[64]
Of course, not an LLM. I was just referring to AI technology in general, and to the fact that goal functions can be complicated and non-obvious even for a game world with known rules and outcomes.
I was misremembering the order of how things happened.
AlphaGo Zero, another iteration after the famous matches, was trained without human data.
"AlphaGo's team published an article in the journal Nature on 19 October 2017, introducing AlphaGo Zero, a version without human data and stronger than any previous human-champion-defeating version.[52] By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all the old versions in 40 days.[53]"
There are quite a few relatively objective criteria in the real world: real estate holdings, money and material possessions, power to influence people and events, etc.
The complexity of achieving those might result in the "Centaur Era", when humans+computers are superior to either alone, lasting longer than the Centaur chess era, which spanned only 1-2 decades before engines like Stockfish made humans superfluous.
However, in well-defined domains, like medical diagnostics, it seems reasoning models alone are already superior to primary care physicians, according to at least 6 studies.
It makes sense. People said software engineers would be easy to replace with AI, because our work can be run on a computer and easily tested, but the disconnect is that the primary strength of LLMs is that they can draw on huge bodies of information, and that's not the primary skill programmers are paid for. It does help when you're doing trivial CRUD work or writing boilerplate, but every programmer will eventually have to truly reason about code, and LLMs fundamentally cannot do that (not even the "reasoning" models).
Medical diagnosis relies heavily on knowledge, pattern recognition, a bunch of heuristics, educated guesses, luck, etc. These are all things LLMs do very well. They don't need a high degree of accuracy, because humans are already doing this work with a pretty low degree of accuracy. They just have to be a little more accurate.
Being a walking encyclopedia is not what we pay doctors for either. We pay them to account for the half-truths and actual lies that people tell about their health. That's to say nothing of novel presentations that come about because of the genetic lottery. Just as an AI can assist but not replace a software engineer, an AI can assist but not replace a doctor.
Having worked briefly in the medical field in the 1990s, I saw that a sort of "greedy matching" is pursued: once 1-2 well-known symptoms that can be associated with a disease are recognized, the standard interventions are initiated.
A more "proper" approach would be to work with sets of hypotheses and to conduct tests to exclude alternative explanations gradually - which medics call "DD" (differential diagnosis).
Sadly, this is often not done systematically; instead people jump on the first diagnosis and see if the intervention "fixes" things.
So I agree there are huge gains from "low hanging fruits" to be expected in the medical domain.
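As a concrete (toy) picture of that "DD" style, versus the greedy first-match: keep a set of hypotheses that fit the presenting symptoms and let each test strike out alternatives, only committing once the set has narrowed. The disease and test names below are placeholders I made up, obviously not medical content.

    // Toy sketch of differential diagnosis as gradual exclusion. All names
    // here are made-up placeholders.
    #include <iostream>
    #include <set>
    #include <string>

    int main() {
        // Hypotheses compatible with the initial, well-known symptoms.
        std::set<std::string> hypotheses = {"disease A", "disease B", "disease C"};

        // Each test result rules hypotheses out rather than confirming one early.
        auto excludeIf = [&](const std::string& d, bool testRulesItOut) {
            if (testRulesItOut) hypotheses.erase(d);
        };

        excludeIf("disease B", true);   // e.g. a blood test excludes B
        excludeIf("disease C", true);   // e.g. imaging excludes C

        for (const auto& d : hypotheses)
            std::cout << "still on the table: " << d << "\n";   // intervene only now
    }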
I think at this point it's an absurd take that they aren't reasoning. I don't think you can get such high scores on competitive coding and the IMO without reasoning about code (and math).
AlphaZero also doesn't need training data as input--it's generated by game-play. The information fed in is just the game rules. Theoretically it should also be possible in research math. Less so in programming b/c we care about less rigid things like style. But if you rigorously defined the objective, training data should also not be necessary.
> AlphaZero also doesn't need training data as input--it's generated by game-play. The information fed in is just the game rules
This is wrong. It wasn't just fed the rules; it was also fed a harness that tested viable moves and searched for optimal ones using a tree search method.
Without that harness it would not have gained superhuman performance. Such a harness is easy to make for Go but not as easy to make for more complex things. You will find that the harder it is to make an effective harness for a topic, the harder that topic is for AI models to solve: it is relatively easy to make a good harness for very well-defined programming problems like competitive programming, but much, much harder for general-purpose programming.
Are you talking about Monte Carlo tree search? I consider it part of the algorithm in AlphaZero's case. But agreed that RL is a lot harder in a real-life setting than in a board-game setting.
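To make "harness" concrete, here's a toy version of the idea being discussed: the game rules plus a search over viable moves, scored only by the win/loss the rules define. I'm using a trivial Nim-like game and plain exhaustive search rather than Go and Monte Carlo tree search, purely for illustration.

    // Toy "harness": game rules + search over legal moves, scored by win/loss.
    // Nim-like rules: take 1-3 stones, whoever takes the last stone wins.
    #include <algorithm>
    #include <iostream>

    // Best achievable result for the player to move: +1 win, -1 loss.
    int search(int stones) {
        if (stones == 0) return -1;               // no move left: previous player won
        int best = -1;
        for (int take = 1; take <= std::min(3, stones); ++take)
            best = std::max(best, -search(stones - take));  // opponent's loss = our win
        return best;
    }

    int main() {
        for (int stones = 1; stones <= 10; ++stones)
            std::cout << stones << " stones: "
                      << (search(stones) > 0 ? "win" : "loss")
                      << " for the player to move\n";
    }

For Go the legal-move generation and scoring are still easy to write down; the hard part is that the tree is astronomically large, which is where the learned networks and MCTS come in. For general-purpose programming even the scoring function is hard to define, which is the point above.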
If that's your take-away from that paper, it seems you've arrived at the wrong conclusion. It's not that it's "fake", it's that it doesn't give the full picture, and if you only rely on CoT to catch "undesirable" behavior, you'll miss a lot. There is a lot more nuance than you allude to, from the paper itself:
> These results suggest that CoT monitoring is a promising way of noticing undesired behaviors during training and evaluations, but that it is not sufficient to rule them out.
Very few humans are as good as these models at arithmetic. And CoT is not "mostly fake"; that's not a correct interpretation of that research. It can be deceptive, but so can human justifications of actions.
Humans can learn the symbolic rules and then apply them correctly to any problem, bounded only by time, and modulo lapses of concentration. LLMs fundamentally do not work this way, which is a major shortcoming.
They can convincingly mimic human thought, but the illusion falls apart on closer inspection.
Humans are, statistically speaking, static. We just find out more about them, but the humans themselves don't meaningfully change unless you start looking at much longer time scales. The state of the rest of the world is in constant flux and much harder to model.
To be fair, it was more of a "wow, look what the computer did" moment. The AI "art" was always bad. At first it was just bad because it was visually incongruous. Then they improved the finger-counting kernel, and now it's bad because it's a shallow cultural average.
AI producing visual art has only flooded the internet with "slop", the commonly accepted term. It's something that meets the bare criteria, but falls short in producing anything actually enjoyable or worth anyone's time.
It sucks for art almost by definition, because art exists for its own reason and is in some way novel.
However, even artists need supporting materials and tooling that meet bare criteria. Some care what kind of wood their brush is made from, but I'd guess most do not.
I suspect it'll prove useless at the heart of almost every art form, but powerful at the periphery.
Sure, that does make things easier: one of the reasons Go took so long to solve is that one cannot define an objective score for Go beyond the end result being a boolean win or lose.
But IRL? Lots of measures exist, from money to votes to exam scores, and a big part of the problem is Goodhart's law — that the easy-to-define measures aren't sufficiently good at capturing what we care about, so we must not optimise too hard for those scores.
> Sure, that does make things easier: one of the reasons Go took so long to solve is that one cannot define an objective score for Go beyond the end result being a boolean win or lose.
Winning or losing a Go game is a much shorter term objective than making or losing money at a job.
> But IRL? Lots of measures exist
No, none that are shorter-term than winning or losing a Go game. A game of Go is very short, much, much shorter than the time it takes for a human to get fired for incompetence.
I read it as similar to the US touting how its researchers developed something.
Not really a big deal, though I'd have... you know... linked to the actual paper or maybe mentioned the professor's name more prominently.
I think most stuff I read emphasizes institute and researchers more heavily, but I can see why anyone doing public research might want to expand the scope of credit.
You clearly know what's going on, but still wrote that you should "discourage" an LLM from doing things. It's tough to maintain the discipline of calling out the companies rather than talking about the models as if the models had motivations.