I don't think you're missing out, necessarily. I would riot if I couldn't use PyCharm anymore; for big Python projects, nothing beats it right now.
I do use VSCode too, but mostly for quick scripting or non-programming projects, and even then I installed a bunch of extensions to make it more like PyCharm.
I've been using GitMCP.io + GitHub Copilot for this problem specifically (AI assistant + accurate docs). The downside is that you need to add a separate MCP server for each repository, but the difference it makes in agent mode is night and day.
I used it recently for a major refactor and an upgrade to MLflow 3.0. Their documentation is a horrid mess right now, but the MCP server made it a breeze because I could just ask the assistant to browse their codebase. It would have taken me hours longer on my own.
Not sure, I can't run it since I can't install Node.js in my work environment. What is your experience with Context7 like?
As for GitMCP: I think its URL-fetching tool for docs isn't great, but the code-search tool is quite good. Regardless, I remain open to alternatives; I'm not wedded to this yet.
Look, I'm optimistic about time-series foundation models too, but this post is hard to take seriously when the test is so flawed:
- Forward-filling short periods of missing values. Why keep this in when you explicitly mention it isn't normal? Either remove that data entirely or don't impute anything
- Claiming superiority over classical models and then not including any of them in the results table
- And, let's not forget, the cardinal sin of using MAPE as an evaluation metric
Good to see feedback received so positively! Sorry if my message came across as condescending; that was not the intent. I recommend reading this piece on metrics: https://openforecast.org/wp-content/uploads/2024/07/Svetunko.... It's easy to grasp, yet it contains great tips.
We're grateful for the honest feedback (and the awesome resource!); it makes it easier to identify areas for improvement. Also, your point about using multiple metrics (based on use case, audience, etc.) makes a lot of sense. We'll incorporate this in our next experiment.
Short answer: I use multiple metrics and never rely on just one.
Long answer: Is the metric for people with subject-matter knowledge? Then (Weighted) RMSSE, or the MASE alternative for a median forecast. WRMSSE is very nice: it can deal with zeroes, is scale-invariant, and is symmetrical in penalizing under- and over-forecasting.
The above metrics are completely uninterpretable to people outside the forecasting sphere, though. For those cases I tend to just stick with raw errors; if a percentage metric is really necessary, then a weighted MAPE/RMSE: the weighting is still graspable for most people, and it doesn't explode with zeroes.
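Rough sketch of WMAPE and MASE in plain NumPy, in case it helps (naming and signatures are my own, not from any particular library; WRMSSE is the same idea with squared errors and per-series weights):

```python
import numpy as np

def wmape(actual, forecast):
    # Weighted MAPE: total absolute error over total actual volume.
    # Unlike plain MAPE it doesn't blow up on individual zero actuals.
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.abs(actual - forecast).sum() / np.abs(actual).sum()

def mase(actual, forecast, train, m=1):
    # MASE: MAE of the forecast, scaled by the in-sample MAE of a seasonal
    # naive forecast with lag m. Values below 1 beat the naive benchmark.
    actual, forecast, train = (np.asarray(x, float) for x in (actual, forecast, train))
    naive_mae = np.abs(train[m:] - train[:-m]).mean()
    return np.abs(actual - forecast).mean() / naive_mae
```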
I've also been exploring FVA (Forecast Value Added), compared against a second decent forecast. FVA is very intuitive, provided your baseline measures are reliable. Aside from that, I always look at forecast plots. It's tedious, but they often tell you a lot that gets lost in the numbers.
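FVA itself is just the relative improvement over the baseline, using whichever error metric you trust; a hypothetical sketch with MAE:

```python
import numpy as np

def fva(actual, forecast, baseline):
    # Forecast Value Added: % error reduction versus the baseline forecast.
    # Positive means the extra modelling effort actually adds value.
    actual, forecast, baseline = (np.asarray(x, float) for x in (actual, forecast, baseline))
    err = np.abs(actual - forecast).mean()
    base_err = np.abs(actual - baseline).mean()
    return 100 * (base_err - err) / base_err
```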
RMSLE I haven't used much. From what I've read it looks interesting, though more for very specific scenarios (many outliers, high variance, nonlinear data?).
MAPE can also be a problem when rare excursions are what you want to predict and the cost of missing an event is much higher than that of predicting a non-event. A model that just predicts no change would have a very low MAPE because most of the time nothing happens. When the event does happen, however, the error of predicting the status quo ante is much worse than the small baseline errors.
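A toy illustration of that failure mode (numbers made up): a flat series with a single spike, where the "predict no change" model still walks away with a near-zero MAPE.

```python
import numpy as np

actual = np.array([100.0] * 99 + [500.0])  # 99 quiet periods, one rare event
naive = np.full(100, 100.0)                # "predict no change" forecast

mape = np.mean(np.abs(actual - naive) / np.abs(actual)) * 100
print(f"MAPE: {mape:.2f}%")  # ~0.80%, despite missing the one event that matters
```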
Thanks for the reply! I am outside the forecasting sphere.
RMSLE gives proportional error (so, scale-invariant) without MAPE's systematic under-prediction bias. It does require all-positive values, for the logarithm step.
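For reference, a minimal definition (my own sketch, not from a library):

```python
import numpy as np

def rmsle(actual, forecast):
    # Squared error on log-transformed values: it measures relative (ratio)
    # error rather than absolute error, hence the scale-invariance.
    # Needs strictly positive inputs because of the log.
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.sqrt(np.mean((np.log(forecast) - np.log(actual)) ** 2))
```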
Good list. Some of these I knew already, but the typing overloads and keyword-only/positional-only arguments were new to me.
One personal favorite of mine is __all__ for use in __init__.py files. It specifies which names are imported whenever someone uses `from x import *`. It's especially useful when other people working on your codebase have a tendency to always import everything, which is rarely a good idea.
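A minimal sketch (the package, module, and function names are made up for illustration):

```python
# mypackage/__init__.py
from mypackage.core import load_data, clean_data
from mypackage.plotting import quick_plot

# Only these names come through on `from mypackage import *`;
# many IDEs and linters also treat this as the public API.
__all__ = ["load_data", "clean_data", "quick_plot"]
```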
It's never a good idea. I use `__all__` to explicitly list my exports in libraries, so that when someone writes `from mylib import `, the IDE auto-completes only the public classes and functions.
What a great article. I always like how much Anthropic focuses on explainability, something largely ignored by most. The multi-step reasoning section is especially good food for thought.