I've come to appreciate, over the past 2 years of heavy Prolog use, that all coding should (eventually) be done in Prolog.
It's one of the few languages that is simultaneously a standalone logical formalism and a standalone representation of computation (with caveats and exceptions, I know). So a Prolog program can stand in as a document of all the facts, rules and relations that a person or organization understands or declares to be true. Even if AI writes code for us, we should expect to have it presented and manipulated as a logical formalism.
Now if someone cares to argue that some other language/compiler is better at generating more performant code on certain architectures, then that person can declare their arguments in a logical formalism (Prolog) and we can use Prolog to translate between language representations, compile, optimize, etc.
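To make the "document of facts, rules and relations" idea concrete, here is a rough sketch in Python rather than Prolog. The facts, the `derive_ancestors` helper and the `query` helper are all made up for illustration; a real Prolog system would express the two rules directly and answer queries by backward chaining instead of the naive forward chaining used here.

```python
# Facts are plain tuples: (predicate, subject, object). All names are made up.
FACTS = {
    ("parent", "alice", "bob"),
    ("parent", "bob", "carol"),
}


def derive_ancestors(facts):
    """Apply two rules until nothing new is derived (naive forward chaining):

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for (p, x, y) in known:
            if p != "parent":
                continue
            # Rule 1: every parent is an ancestor.
            new.add(("ancestor", x, y))
            # Rule 2: a parent of an ancestor is also an ancestor.
            for (q, y2, z) in known:
                if q == "ancestor" and y2 == y:
                    new.add(("ancestor", x, z))
        if not new <= known:
            known |= new
            changed = True
    return known


def query(known, predicate, subject):
    """Return every object O such that predicate(subject, O) holds."""
    return sorted(o for (p, s, o) in known if p == predicate and s == subject)


KNOWN = derive_ancestors(FACTS)
print(query(KNOWN, "ancestor", "alice"))  # ['bob', 'carol']
```

The point of the sketch is only that the same small set of declarations serves both as a readable record of what is claimed to be true and as something a machine can run queries against.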
Use Constraint Satisfaction Problem Solvers. It comes up with Common Lisp with ease.
So we are back to the Japanese Fifth Generation plan from the 1980s. :)
This time around we have all sorts of parallel processing capabilities in the form of GPUs. If I recall correctly, the Fifth Generation project envisioned highly parallel machines performing symbolic AI. From a hardware standpoint, those researchers were way ahead of their time.
And they had a self-sustaining video game industry too... if only someone had had the wild thought of implementing perceptrons and tensor arithmetic on the same hardware!
And winter is coming.
Missing some LISP but yeah it's funny how old things are new again (same story with wasm, RISC archs, etc.)
Lots of GOFAI being implemented again: decision trees, goal searching and planning, agent-based strategies... just not symbolic representations, and that might be the key. I figure you might get an interesting contribution out of skimming old AI laboratory publications and seeing whether you could find a way of implementing what you find through a single LLM, multiple LLM agents, methods of training, etc.
Watson did it too, a while back.
Is this why GitHub CodeQL and Copilot assistance is working better for everyone? Basically, CodeQL uses a variant of Prolog (Datalog) to query source code to generate better results.
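For a feel of what a Datalog-style query over source code looks like, here is a small sketch in Python. It is not CodeQL syntax or its API; the `CALLS` facts, the function names in them, and the `reaches` and `callers_reaching` helpers are all hypothetical. It just computes the transitive closure of a call graph to a fixpoint, which is the kind of recursive query Datalog is built for.

```python
# Hypothetical "calls" facts extracted from a code base: (caller, callee).
CALLS = {
    ("handle_request", "parse_body"),
    ("parse_body", "decode"),
    ("decode", "eval"),
    ("health_check", "ping"),
}


def reaches(calls):
    """Transitive closure of the call relation, computed to a fixpoint:

        reaches(A, B) :- calls(A, B).
        reaches(A, C) :- calls(A, B), reaches(B, C).
    """
    closure = set(calls)
    while True:
        # Extend every known path by one direct call at its start.
        step = {(a, c) for (a, b) in calls for (b2, c) in closure if b == b2}
        if step <= closure:
            return closure
        closure |= step


def callers_reaching(calls, sink):
    """Every function that can reach `sink`, directly or indirectly."""
    return sorted(a for (a, b) in reaches(calls) if b == sink)


print(callers_reaching(CALLS, "eval"))
# ['decode', 'handle_request', 'parse_body']
```

Here `health_check` never shows up because no chain of calls leads from it to `eval`.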
I tried an experiment with this using a Prolog interpreter with GPT-4 to try to answer complex logic questions. I found that it was really difficult because the model didn't seem to know Prolog well enough to write a description of any complexity.
It seems like you used an interpreter in the loop, which is likely to help. I'd also be interested to see how o1 would do in a task like this, or whether it even makes sense to use something like Prolog if the models can backtrack during the "thinking" phase.
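A minimal sketch of what "interpreter in the loop" can mean, in Python. Everything here is a stand-in: `fake_model` plays the part of an LLM call, `check_program` plays the part of a real Prolog interpreter (it only accepts simple ground facts), and `generate_with_feedback` is a hypothetical name for the retry loop. The only idea it shows is feeding the checker's errors back into the next generation attempt.

```python
import re

# Accepts only simple ground facts such as "parent(alice, bob)."
FACT = re.compile(r"^[a-z]\w*\([a-z]\w*(,\s*[a-z]\w*)*\)\.$")


def check_program(text):
    """Stand-in for a real interpreter: return a list of error messages."""
    errors = []
    for n, line in enumerate(text.strip().splitlines(), start=1):
        if not FACT.match(line.strip()):
            errors.append(f"line {n}: cannot parse {line.strip()!r}")
    return errors


def fake_model(prompt, feedback):
    """Stand-in for an LLM call: the first attempt is broken, the retry is fixed."""
    if not feedback:
        return "parent(alice, bob)\nparent(bob, carol)."  # missing a period
    return "parent(alice, bob).\nparent(bob, carol)."


def generate_with_feedback(prompt, model, max_rounds=3):
    """Ask the model for a program, check it, and retry with the errors."""
    feedback = []
    for attempt in range(1, max_rounds + 1):
        program = model(prompt, feedback)
        feedback = check_program(program)
        if not feedback:
            return program, attempt
    raise RuntimeError(f"no valid program after {max_rounds} rounds: {feedback}")


PROGRAM, ROUNDS = generate_with_feedback("Alice is Bob's parent...", fake_model)
print(ROUNDS)  # 2
```

In a real setup the checker would also run the generated program against known queries, not just parse it, so that logic errors get fed back as well as syntax errors.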
I also wrote an LLM-to-Prolog interpreter for a hackathon, called "Logical". With a few hours' effort I'm sure it could be improved.
https://github.com/Hendler/logical
I think while LLMs may approach completeness here, it's good to have an interpretable system to audit/verify and reproduce results.
I bet one person could probably build a pretty good synthetic NL->Prolog dataset. ROI for paying that person would be high if you were building a foundation model (i.e., benefits beyond being able to output Prolog).
Patiently waiting for z3-guided generation, but this is a welcome, if obvious, development. Results are a bit surprising and sound too optimistic, though.
Building on this idea, people have grounded LLM-generated reasoning logic with perceptual information from other networks: https://web.stanford.edu/~joycj/projects/left_neurips_2023