Prolog and Natural-Language Analysis (1987) [pdf]

50 points
mcswell 1 day ago

Around 1985, while I was working at the Artificial Intelligence Center of (the now defunct) Boeing Computer Services, I evaluated Fernando Pereira's NLP code, written in Prolog for his dissertation (he is one of the authors of the referenced 1987 book). My recollection is that his parser was very slow and difficult to extend (adding rules to account for other English grammatical structures). Another fellow working at the AIC at the time had written a parser in LISP, and I ended up writing the English grammar for his parser.

That's not to say that LISP was faster than Prolog in general; it was just that this particular program was slow.

Nowadays, of course, nobody writes parsers or grammars by hand like that. Which makes me sad, because it was a lot of fun :).
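For anyone who never did this by hand: the book's approach encodes grammar rules directly as Prolog DCG clauses. Here is a minimal sketch of the same idea in Python, with a toy grammar and lexicon I've made up (not from the book), just to show what "adding rules for other English structures" meant in practice.

```python
# Toy hand-written phrase-structure grammar, in the spirit of Prolog DCGs.
# Rules map a category to alternative sequences of categories; the lexicon
# maps preterminal categories to words. Extending coverage = adding rules.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"Det": {"the", "a"}, "N": {"dog", "cat"}, "V": {"sees", "sleeps"}}

def parse(cat, words, i):
    """Yield every position where category `cat` can end, starting at i."""
    if cat in LEXICON:
        if i < len(words) and words[i] in LEXICON[cat]:
            yield i + 1
        return
    for rule in GRAMMAR[cat]:
        ends = [i]
        for sym in rule:  # thread positions through each symbol of the rule
            ends = [j for e in ends for j in parse(sym, words, e)]
        yield from ends

def accepts(sentence):
    words = sentence.split()
    return len(words) in parse("S", words, 0)

# accepts("the dog sees a cat") -> True
# accepts("dog the sees")       -> False
```

Every new English construction (relative clauses, coordination, agreement, ...) meant more rules, and each rule could interact with all the others, which is exactly the extensibility problem described above.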

mcswell 15 hours ago

I should have added, Pereira was (and is) a lot smarter than I am. He went on to do great things in computational linguistics, whereas I went on to do...smaller things.

JimmyRuska 15 hours ago

Pretty amusing that the old AI revolution was purely logic/reasoning/inference based. People knew that for an AI to be believable, the system needed some level of believable reasoning and logic capability, but nobody wanted to decompose a business problem into disjunctive logic statements, and any additional logic can have implications across the whole universe of other logic, making the system hard to predict and maintain.

LLMs brought this new revolution where it's not immediately obvious you're chatting with a machine, but, just like most humans, they still severely lack the ability to decompose unstructured data into logic statements and prove anything out. It would be amazing if they could write some Datalog or Prolog to approximate a more complex neural-network-based understanding of a problem, since logic-based systems are more explainable.

LunaSea 9 hours ago

One of the reasons why word vectors, sentence embeddings, and LLMs won (for now) is that text, especially text found on the web, does not necessarily follow strict grammatical and lexical rules.

Sentences can be incorrect but still understandable.

If you then include leet speak, acronyms, and short-form writing (SMS / tweets), it quickly becomes unmanageable.

zcw100 9 hours ago

That’s what Stardog is doing

verdverm 18 hours ago

If you like this kind of stuff, CUE(lang) is highly influenced by Prolog and pre-90s NLP. Its creator, Marcel, worked on typed feature structures for efficiently representing grammar rules, to support the way NLP was approached at the time.

The CUE evaluator is a really interesting codebase for anyone interested in algorithms.
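The common thread between Prolog, typed feature structure grammars, and CUE is unification: merging two partial descriptions, failing on conflicting values. A minimal untyped sketch in Python (my own toy code, not CUE's actual evaluator, and without reentrancy or typing):

```python
# Feature-structure unification over nested dicts: dicts merge recursively,
# atomic values must agree exactly, anything else is a clash.

class Clash(Exception):
    pass

def unify(a, b):
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            out[key] = unify(out[key], val) if key in out else val
        return out
    if a == b:  # atomic values unify only with themselves
        return a
    raise Clash(f"{a!r} vs {b!r}")

np = {"cat": "NP", "agr": {"num": "sg"}}
vp = {"cat": "VP", "agr": {"num": "sg", "per": 3}}

# The agreement substructures unify (this is how such grammars enforce
# subject-verb agreement); the clashing "cat" values would raise Clash.
merged = unify(np["agr"], vp["agr"])  # {"num": "sg", "per": 3}
```

In CUE the same operation is what merges configuration values and rejects conflicting ones, which is why the lineage from unification grammars is visible in its evaluator.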

srush 38 minutes ago

This book is great. Really mind-warping on first read. Fernando Pereira has had an incredible influence across NLP over his whole career. Here is an offhand list of his papers to check out.

* Conditional random fields: Probabilistic models for segmenting and labeling sequence data (2001) - Central paper of structured supervised learning in the 2000s era

* Weighted finite-state transducers in speech recognition (2002) - This work and OpenFST are so clean

* Non-projective dependency parsing using spanning tree algorithms (2005) - Influential work connecting graph algorithms to syntax. Less relevant now, but still such a nice paper.

* Distributional clustering of English words (1994) - Proto word embeddings.

* The Unreasonable Effectiveness of Data (2009) - More high-level, but certainly explains the last 15 years