One interesting property of least squares regression is that the predictions estimate the conditional expectation (mean) of the target variable given the right-hand-side variables. So in the OP example, we're predicting the average price of houses of a given size.
The idea of predicting the mean can be extended to other properties of the conditional distribution of the target variable, such as the median or other quantiles [0]. This has interesting implications, such as the well-known property that the median is more robust to outliers than the mean. In fact, the absolute loss function mentioned in the article can be shown to give a conditional median prediction (using the mid-point in case of non-uniqueness). So in the OP example, if the data set is known to contain outliers, like properties with extremely high or low value for idiosyncratic reasons (e.g. former celebrity homes or contaminated land), then the absolute loss could be a wiser choice than least squares (of course, there are other ways to deal with this as well).
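To make the mean-vs-median point concrete, here's a minimal sketch (hypothetical prices, assuming numpy and scipy are available) showing that the constant prediction minimizing squared loss is the sample mean, while the one minimizing absolute loss is the sample median:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy "house price" sample with one idiosyncratic outlier (made-up numbers).
prices = np.array([200_000, 210_000, 215_000, 220_000, 2_000_000], dtype=float)

# Best constant prediction c under each loss, searched over the data range.
sq_fit = minimize_scalar(lambda c: np.mean((prices - c) ** 2),
                         bounds=(prices.min(), prices.max()), method="bounded").x
abs_fit = minimize_scalar(lambda c: np.mean(np.abs(prices - c)),
                          bounds=(prices.min(), prices.max()), method="bounded").x

print(f"squared loss minimizer:  {sq_fit:>11,.0f}  (mean   = {prices.mean():,.0f})")
print(f"absolute loss minimizer: {abs_fit:>11,.0f}  (median = {np.median(prices):,.0f})")
# The squared-loss fit is dragged toward the outlier; the absolute-loss fit stays with the bulk.
```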
Worth mentioning here, I think, because the OP seems to hold a particular grudge against the absolute loss function. It's not perfect, but it has its virtues and some advantages over least squares. It's a trade-off, like so many things.
> When using least squares, a zero derivative always marks a minimum. But that's not true in general ... To tell the difference between a minimum and a maximum, you'd need to look at the second derivative.
It's interesting to continue the analysis into higher dimensions, where classifying stationary points requires looking at the matrix properties of a specific second-order derivative, the Hessian: https://en.wikipedia.org/wiki/Saddle_point
In general it's super powerful to convert data problems like linear regression into geometric considerations.
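A small illustration of the Hessian point (made-up data, plain numpy): the least-squares objective has a constant positive semidefinite Hessian, so a zero gradient really is a minimum, while a function like x^2 - y^2 has a stationary point whose Hessian has mixed-sign eigenvalues, i.e. a saddle:

```python
import numpy as np

rng = np.random.default_rng(0)

# Least-squares loss L(beta) = ||y - X beta||^2 has the constant Hessian 2 * X^T X.
X = np.column_stack([np.ones(50), rng.normal(size=50)])  # intercept + one feature
H_ls = 2 * X.T @ X
print("least-squares Hessian eigenvalues:", np.linalg.eigvalsh(H_ls))
# All eigenvalues >= 0: the loss surface is a convex bowl, so any stationary point is a minimum.

# Contrast: f(x, y) = x^2 - y^2 is stationary at the origin with Hessian diag(2, -2).
H_saddle = np.diag([2.0, -2.0])
print("saddle-point Hessian eigenvalues:", np.linalg.eigvalsh(H_saddle))
# Mixed signs: the origin is neither a minimum nor a maximum, it's a saddle point.
```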
All that's wrong with the modern world:
https://www.ibm.com/think/topics/linear-regression
> A proven way to scientifically and reliably predict the future
> Business and organizational leaders can make better decisions by using linear regression techniques. Organizations collect masses of data, and linear regression helps them use that data to better manage reality, instead of relying on experience and intuition. You can take large amounts of raw data and transform it into actionable information.
> You can also use linear regression to provide better insights by uncovering patterns and relationships that your business colleagues might have previously seen and thought they already understood.
> For example, performing an analysis of sales and purchase data can help you uncover specific purchasing patterns on particular days or at certain times. Insights gathered from regression analysis can help business leaders anticipate times when their company's products will be in high demand.
While I get your point, it doesn't carry much weight, because you can (and we often read this) claim the exact opposite:
Linear regression, for all its faults, forces you to be very selective about the parameters you believe to be meaningful, and offers trivial tools to validate the fit (e.g. checking that the residuals are evenly scattered, or posterior predictive simulations if you want to be fancy).
ML and beyond, on the other hand, throws you into a whirl of hyperparameters you no longer understand, and traps even clever people in overfitting they don't recognize.
Obligatory xkcd: https://xkcd.com/1838/
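To illustrate the residual-check point above, a rough sketch (made-up data, plain numpy): fit a straight line to data that actually has a quadratic term, and the binned residuals immediately give the misspecification away:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: the true relationship is quadratic, but we fit a straight line.
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + 0.3 * x**2 + rng.normal(scale=1.0, size=200)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Crude residual check: bin by x and look at the mean residual per bin.
# For a well-specified model these hover around zero; here they trace a clear curve.
bins = np.digitize(x, np.linspace(0, 10, 6))
for b in range(1, 6):
    print(f"x bin {b}: mean residual = {residuals[bins == b].mean():+7.2f}")
```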
So a better critique, in my view, would be something JW Tukey wrote in his famous 1962 paper (paraphrasing because I'm lazy):
"Far better an approximate answer to the right question than an exact answer to the wrong question, which can always be made precise."
So our problem is not the tools, it's that we fool ourselves by applying the tools to the wrong problems because they are easier.