They also had nice tutorials on particle filters. I can't find the one I wanted but these are close:
https://cecas.clemson.edu/~ahoover/ece854/refs/Djuric-Partic...
https://eprints.lancs.ac.uk/id/eprint/53537/1/Introduction_t...
https://ieeeoes.org/wp-content/uploads/2021/02/BPF_SPMag_07....
If Q and R are constant (as is usually the case), the gain quickly converges, such that the Kalman filter is just an exponential filter with a prediction step. For many people this is a lot easier to understand, and even matches how it is typically used, where Q and R are manually tuned until it “looks good” and never changed again. Moreover, there is just one gain to manually tune instead of multiple quantities Q and R.
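To see that convergence concretely, here is a toy 1-D sketch in Python (made-up Q and R, not from any real system): the gain settles after a handful of steps, and from then on the update is exactly exponential smoothing with a fixed alpha, plus the prediction step.

# Toy scalar Kalman filter with constant Q and R (values are made up for illustration).
Q, R = 0.01, 0.25   # process and measurement noise variances
P = 1.0             # state variance

for step in range(20):
    P = P + Q                  # predict: uncertainty grows by Q
    K = P / (P + R)            # update: compute the gain
    P = (1 - K) * P            # update: uncertainty shrinks
    print(step, round(K, 4))   # K converges to a constant after a few steps

# With the converged K, the state update x = x + K * (z - x) is just an
# exponential filter with alpha = K.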
> If Q and R are constant (as is usually the case), the gain quickly converges,
You also need the measurements to be equally spaced. Often they are – you might get a regular alternating pattern of measurements and observations – but often they're not, in which case the Kalman filter gives extra weight to a new measurement if it's been a while since the last one (because that gap will have allowed the uncertainty to grow).
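A small hypothetical sketch of that effect (scalar state, made-up numbers): if the process noise is scaled by the time dt since the last measurement, a measurement that arrives after a long gap gets a gain close to 1.

# Scalar example where process noise grows with the gap since the last measurement.
# All numbers are illustrative only.
q_rate, R = 0.05, 0.2            # process noise per unit time, measurement noise variance

def gain_after_gap(P, dt):
    P_pred = P + q_rate * dt     # predicted uncertainty grows with the gap length
    return P_pred / (P_pred + R) # larger P_pred -> gain closer to 1

P = 0.1
print(gain_after_gap(P, dt=0.1))   # short gap: modest weight on the new measurement
print(gain_after_gap(P, dt=10.0))  # long gap: the new measurement dominates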
The Kalman filter also allows you to take into account measurements that are more uncertain in one direction than another. Think of cameras with visual recognition, which tell you a precise angle but only a rough distance estimate. If you have a couple of those and suitable measurement error matrices then the Kalman filter will automatically do a sort of triangulation.
As a bonus, you can also use the covariance matrix of the target as information in its own right. But, as you say, the parameters are often tuned to get a good-looking result rather than to reflect reality, so the target uncertainty isn't always especially meaningful.
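A hypothetical numpy sketch of that (2-D position state, one made-up camera-like measurement that is tight in one direction and vague in the other): the update pulls the estimate almost fully along the precise axis and only slightly along the uncertain one.

import numpy as np

# Broad prior on a 2-D position; H is the identity, i.e. we "measure" position directly.
x = np.array([0.0, 0.0])
P = np.eye(2) * 10.0
H = np.eye(2)

# Camera-like measurement: precise in the first direction, vague in the second (made-up values).
R = np.diag([0.01, 25.0])
z = np.array([1.0, 1.0])

K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
x_post = x + K @ (z - H @ x)                   # ~[1.0, 0.29]: strong correction in x, weak in y
P_post = (np.eye(2) - K @ H) @ P               # posterior stays tight in x, still broad in y

A second camera at a different bearing, whose tight direction is rotated, tightens the remaining direction on its own update — which is the automatic triangulation described above.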
It took years after I learned the Kalman filter as a student until I actually, intuitively understood the update of the covariances. Most learning sources (including the OP) just mechanically go through the computations of the a-posteriori covariance, but don't bother with an intuition other than "this is the result of multiplying two Gaussians", if anything at all.
I wrote down a note for myself where I work this out, if anyone is interested: https://postbits.de/kalman-measurement-update.html
Figured I can save you a click and put the main point here, as few people will be interested in the rest:
The Kalman filter is adding the precision (inverse of covariance) of the measurement and the precision of the predicted state, to obtain the precision of the corrected state. To do so, the respective covariance matrices are first inverted, to obtain precision matrices. To have both in the same space, the measurement precision matrix is projected to the state space using matrix H. The resulting sum is converted back to a covariance matrix, by inverting it.
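For anyone who'd like to see that numerically, here is a small sketch with made-up matrices, checking that the usual covariance update and the add-the-precisions form agree:

import numpy as np

P_pred = np.array([[2.0, 0.3],
                   [0.3, 1.0]])   # predicted state covariance (made up)
H = np.array([[1.0, 0.0]])        # we only observe the first state component
R = np.array([[0.5]])             # measurement noise covariance

# Standard form: P_post = (I - K H) P_pred
K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
P_post_standard = (np.eye(2) - K @ H) @ P_pred

# Information form: add the precisions, with the measurement precision mapped to state space by H.
P_post_info = np.linalg.inv(np.linalg.inv(P_pred) + H.T @ np.linalg.inv(R) @ H)

print(np.allclose(P_post_standard, P_post_info))  # True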
That is super helpful, thanks! I'm used to calling the inverse of the covariance the information matrix.
Yeah, that's the correct term! I think precision is mainly used for 1D. But I like the term, as I feel it has a better intuition.
I've seen the Kalman filter presented from a few different angles, and the one that made the most sense to me was from a Bayesian methods class that speaks only in terms of marginal and conditional Gaussian distributions and discards a lot of the control theory terminology.
This was one of the books we used: https://link.springer.com/chapter/10.1007/978-1-4757-9365-9_...
I succeeded in understanding the Kalman filter only when I found a text that took a similar approach. It was this invaluable article, which presents the Kalman filter from a Bayesian perspective:
Meinhold, Richard J., and Nozer D. Singpurwalla. 1983. "Understanding the Kalman Filter." American Statistician 37 (May): 123–27.
Kalman filter is the "learn python in 24 hours" for HN.
I love not knowing whether the "pdf" in the title (and URL) refers to a probability density function or the portable document format.
...The answer will surprise you!
And monads. But I've heard they're just like burritos, so how hard can it be.
That probably depends on how much you overcook the burrito.
i don't know why the hell people are so obsessed with it. like why aren't there recurring posts about how to solve a separable PDE or how to perform Gram-Schmidt or whatever other ~junior math things.
Kalman filters are useful in data processing and interpretation, I used them heavily in continuous geophysical signal processing four decades past.
My guess is that many computer data engineers encounter them and find their self-taught grasp of linear algebra and undergraduate math challenged by the theory behind K-F's .. they seem to come across as a bit of a leg up over moving averages, Savitzky–Golay, FFT applications, etc.
There are many more people dealing with implementing these things than have had formal undergraduate lectures on them.
My gut feeling is that most are more likely to encounter K-F applications in drone control, dead reckoning positions when underground or with flakey GPS, cleaning real world data, etc. than to find themselves having to solve PDEs ..
I posit the existence of some form of pragmatic Maslow's Hierarchy of Applicable Math.
I do agree though that HN has odd bursts of Kalman filter posts.
> Kalman filters are useful in data processing and interpretation
vaguely - plenty of other imputation approaches that are simpler/better/more accessible.
> K-F applications in drone control, dead reckoning positions when underground or with flakey GPS
these are not things 99% of devs encounter. literally
> dead reckoning positions when underground or with flakey GPS
is the domain of probably like 100-1000 people in the entire world - i know because i actually have brushed up against it and am painfully aware of the lack of resources.
i really do think it's just a programmer l33t meme not unlike monads, category theory, etc - something that most devs think will elevate them to godhood if they can get their heads around it (when in fact it's pretty useless in practice and just taught in school as a prereq for actually useful things).
The assertion was not that these examples are common, rather that they are currently more common for generic app developers than manipulating PDEs.
As for K-filters in data processing and interpretation, that depends very much on the data domain; a good number have biases and co-signals that are more easily removed with an adaptive model of some form.
E.g.: the magnetic heading effect when recording nine-axis, nanotesla-range ground signals. The readings returned over a specific point at a specific time of day are a function of sensor speed and heading. Repeatedly flying over the same point (hypothetically at the same time) from north to south vs. east to west returns different data streams on each of the nine channels.
To get a "true ground reading" both the heading bias and the diurnal flux must be estimated and subtracted.
> plenty of other imputation approaches that are simpler/better/more accessible.
Do tell. What would you use in the above example?
If only we had some way to predict when these bursts would appear. But, I guess it would probably depend on a lot of factors, and it might be hard to guess how they all influence each other…
The Kalman Filter is an instance of the Generalized Distributive Law https://en.wikipedia.org/wiki/Generalized_distributive_law
So is the Fast Fourier transform, Viterbi algorithm, dynamic programming, message passing and a trillion other things.
Can you expand on how the Kalman Filter fits into this category?
See also
Kalman Filter Explained Simply (2024, 89 comments) https://news.ycombinator.com/item?id=39343746
A non-mathematical introduction to Kalman filters for programmers (2023, 97 comments) https://news.ycombinator.com/item?id=36971975
I've found this "book" (series of jupyter notebooks) to be a fantastic course on the Kalman filter from basics to advanced topics. https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Pyt...
In the realm of autonomous vehicles, early sensor fusion systems relied heavily on Kalman filters for perception.
The state of the art has since been supplanted by large deep learning models, primarily end-to-end trained Transformer networks.
Transformers may be familiar from the recent wave of LLMs, but they were actually first successfully used in autonomous vehicles (invented by researchers at Google and put into production at Waymo almost immediately).
As a developer I always found these maths-first approaches to Kalman filters impenetrable (I guess that betrays my lack of knowledge; I dare cast no aspersions on the quality of these explanations!). However, if, like me, you find it helps with the learning curve to implement it first, here's a 1-dimensional version simplified from my blog:
function transpose(a) { return a } // 1x1 matrix, i.e. a single value.
function invert(a) { return 1 / a }

const rawDataArray = [1.02, 0.97, 1.1, 1.05, 0.98] // example noisy measurements; replace with your data

const qExternalNoiseVariance = 0.1
const rMeasurementNoiseVariance = 0.1
const fStateTransition = 1

let pStateError = 1
let xCurrentState = rawDataArray[0]

for (const zMeasurement of rawDataArray) { // "of", not "in": iterate over the values, not the indices
  const xPredicted = fStateTransition * xCurrentState
  const pPredicted = fStateTransition * pStateError * transpose(fStateTransition) + qExternalNoiseVariance
  const kKalmanGain = pPredicted * invert(pPredicted + rMeasurementNoiseVariance)
  pStateError = pPredicted - kKalmanGain * pPredicted
  xCurrentState = xPredicted + kKalmanGain * (zMeasurement - xPredicted) // Output!
}
https://www.splinter.com.au/2023/12/14/the-kalman-filter-for...

It's not your fault, these can get messy very quickly. Infer.NET was started because Tom Minka and other Bayes experts were tired of writing message passing and variational inference by hand, which is both cumbersome and error prone on non-toy problems.
It helps to take a more abstract view where you split the generative process and the inference algorithm. Some frameworks (Infer.NET, ForneyLab.jl) can generate an efficient inference algorithm from the generative model without any user input. See e.g. https://github.com/biaslab/ForneyLab.jl/blob/master/demo/kal...
Thanks for sharing this - saving this paper too (from link in the github page): https://people.ee.ethz.ch/~loeliger/localpapers/FactorGraphs...
I'm not familiar with these techniques at all but seems like they have a ton of useful applications.
Factor graphs are well discussed in David Barber's excellent free BRML book: http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/140324.pdf
Judea Pearl described it as an excellent Bayesian textbook. There's a free solutions book and everything is also implemented.
I find this lecture series give a good introduction on Bayesian theory of filtering: https://www.youtube.com/watch?v=pVyltJnXlAI&list=PLTD_k0sZVY...
For the lesser developer gods here, can someone give an example of a real life business case where (s)he has effectively used this? Explain like you're talking to a guy who has done CRUD most of his life.
Smoothing out messy data from an eye tracker into a smooth gaze path.
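If it helps to see the shape of it, here's a hedged sketch (hypothetical numbers, not any particular tracker): a constant-velocity model over 2-D gaze samples, which suppresses jitter while still following genuine movements.

import numpy as np

dt = 1 / 60.0                         # example sample interval (60 Hz tracker)
F = np.array([[1, 0, dt, 0],          # state: [x, y, vx, vy], constant-velocity model
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1.0]])
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0.0]])        # we only measure position
Q = np.eye(4) * 1e-3                  # made-up process noise
R = np.eye(2) * 5.0                   # made-up measurement noise

def smooth(points):                   # points: iterable of noisy (x, y) samples
    x, P = np.zeros(4), np.eye(4) * 100.0
    out = []
    for z in points:
        x, P = F @ x, F @ P @ F.T + Q                 # predict
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # gain
        x = x + K @ (np.asarray(z) - H @ x)           # update
        P = (np.eye(4) - K @ H) @ P
        out.append(x[:2].copy())                      # smoothed gaze position
    return out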
Thanks, I'll see about Box-PF when trying to get IMU-filtered indoor-localization working (once I hopefully get that far with the UWB firmware [0]). Accounting for clock drift across the "satellites" is going to be "fun", but at least it's both useful in practice and of manageable complexity/scope.
[0]: I'll happily talk about it at 39c3.
PF is a great tool for UWB. Even without IMU data and instead adding uniform diffusion of the particles between updates tracking worked well in a 2D environment. It sounds like you're working on TDOA for UWB?
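Roughly what that looks like (a minimal hypothetical sketch: 2-D position, range measurements to anchors at known positions, and uniform diffusion in place of a motion model):

import numpy as np

rng = np.random.default_rng(0)

N = 2000
particles = rng.uniform(0, 10, size=(N, 2))        # positions in a 10 m x 10 m area (made up)
anchors = np.array([[0.0, 0], [10, 0], [0, 10]])   # known anchor positions
sigma_r = 0.3                                      # assumed range-measurement noise (m)

def pf_update(particles, ranges):
    # "Motion" step: uniform diffusion instead of an IMU-driven motion model.
    particles = particles + rng.uniform(-0.2, 0.2, size=particles.shape)
    # Weight each particle by the likelihood of the measured ranges to the anchors.
    d = np.linalg.norm(particles[:, None, :] - anchors[None, :, :], axis=2)
    w = np.exp(-0.5 * ((d - ranges) ** 2).sum(axis=1) / sigma_r**2)
    w /= w.sum()
    # Resample according to the weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

# After each pf_update(particles, measured_ranges), the estimate is particles.mean(axis=0).
# For TDOA the only change is the likelihood: compare range *differences* instead of ranges.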
it's a pity there are no good software packages for particle filters
I haven't used them yet but I know of pyro, pfjax, and pymc:
https://pyro.ai/examples/smcfilter.html
https://pfjax.readthedocs.io/
https://www.pymc.io/projects/examples/en/latest/samplers/SMC...
The problem with this idea is that deriving all the propagation and measurement functions and the associated Jacobians is 99% of the work. Once that's done you can implement literally any filter from them using Wikipedia.
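To show what that derivation step looks like, here is a hedged sketch for a made-up 2-D example (constant-velocity state, one sensor at the origin measuring range and bearing): the model functions and their Jacobians are the problem-specific part; the filter equations that consume them are boilerplate.

import numpy as np

def propagate(x, dt):
    # Propagation function f(x): constant-velocity motion, state = [x, y, vx, vy].
    return np.array([x[0] + dt * x[2], x[1] + dt * x[3], x[2], x[3]])

def propagate_jacobian(x, dt):
    F = np.eye(4)
    F[0, 2] = F[1, 3] = dt
    return F

def measure(x):
    # Measurement function h(x): range and bearing from the origin.
    return np.array([np.hypot(x[0], x[1]), np.arctan2(x[1], x[0])])

def measurement_jacobian(x):
    r2 = x[0] ** 2 + x[1] ** 2
    r = np.sqrt(r2)
    return np.array([[ x[0] / r,   x[1] / r,  0, 0],
                     [-x[1] / r2,  x[0] / r2, 0, 0]])

# An EKF plugs these four functions into the standard predict/update equations
# from any textbook or Wikipedia.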
Particle filters don't need Jacobians?