Apr 27, 2024 16:00:00

Cyc: The forgotten AI project

Technology expert I.A. Fisher has written about the AI project 'Cyc,' which was born in the 1980s and rose to prominence for its ability to give sensible answers to human questions, but was forgotten as techniques such as deep learning became popular.

Cyc: history's forgotten AI project - by IA Fisher

https://outsiderart.substack.com/p/cyc-historys-forgotten-ai-project

In 1983, a group of AI researchers met at Stanford University to discuss how to program machines with common sense.

The organizer of the meeting, Professor Doug Lenat of Stanford University, had developed a system called AM (Automated Mathematician) in the 1970s. AM was given basic mathematical knowledge in advance, and when instructed to 'discover an interesting theorem,' it would try to come up with a theorem that humans had never seen before. In practice, most of the theorems it produced were commonplace, but some were surprisingly original.

The biggest drawback of AM was that its capabilities (heuristics) were hard-coded, but Professor Lenat's next system, EURISKO, was able to programmatically evaluate and change the heuristics themselves. When Professor Lenat used EURISKO to enter a tournament for a complex tabletop board game against human players, EURISKO proposed a highly unconventional strategy that at first glance seemed unlikely to work. The strategy worked, however, and Lenat and EURISKO won the tournament.

Nevertheless, EURISKO eventually fell out of favor, and Professor Lenat decided that a machine that could draw on a large body of general knowledge, or what humans call 'common sense,' would have a better chance of acquiring true intelligence than 'smart but simple programs' like AM and EURISKO.

At the time, artificial intelligence was dominated by so-called 'expert systems.' Unlike traditional programs that mechanically followed hard-coded flowcharts, expert systems made inferences and deductions from a set of facts and rules written by experts in fields such as medical diagnosis or organic chemistry. In theory, expert systems could perform rudimentary reasoning and were flexible enough to respond to complex situations.

However, each expert system had its own rules and databases, which resulted in unnecessary duplication and made it impossible to connect different expert systems to each other. Faced with this problem, Professor Lenat came to the conclusion that shared knowledge, or 'common sense,' would be the basis for a new generation of more effective systems.

In 1984, a research consortium called the Microelectronics and Computer Technology Corporation (MCC) was founded by 10 companies in the United States, and Professor Lenat became its chief scientist. With a budget of tens of billions of yen and hundreds of employees, Professor Lenat aimed to create 'the enormous knowledge base necessary for machines to reason like humans.' The project was named 'Cyc' after the word 'encyclopedia.'

Despite its name, Cyc was not meant to be an encyclopedia per se: it covered a basic level of knowledge, including propositions that are so obvious that no one bothers to write them down, such as 'we are only born once' and 'animals can only talk in fairy tales.'

Cyc's engineers originally entered knowledge manually, but eventually they needed to give Cyc an 'inference' mechanism, because not everything it needed to know could be entered by hand.

For example, consider the sentences, 'Mary decided to go to Harvard out of the five schools she was accepted to. She graduated with a degree in chemistry.' If we want Cyc to reason like a human, we need to make it consider many things that are not written in the text, such as the fact that Mary spent about four years at Harvard and took dozens of courses, many of which were likely chemistry courses. MCC expanded Cyc's knowledge by working through tasks like this one.

Cyc is characterized by its ability to reason based on explicit knowledge and rules. For example, if Cyc has the knowledge that 'all trees are plants' and that 'plants will eventually die,' then when asked whether trees will die, it can derive the answer 'trees will eventually die.'
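The tree example can be illustrated with a minimal Python sketch. This is only a toy under stated assumptions: the `IS_A` and `PROPERTIES` tables and the `has_property` function are hypothetical names invented for illustration and do not reflect how Cyc actually stores or queries knowledge.

```python
# Hypothetical toy knowledge base; NOT Cyc's real representation.
IS_A = {"tree": "plant"}                      # "all trees are plants"
PROPERTIES = {"plant": {"eventually dies"}}   # "plants will eventually die"

def has_property(concept, prop):
    """Walk up the is-a chain and check whether any ancestor carries prop."""
    seen = set()
    while concept is not None and concept not in seen:
        seen.add(concept)  # guard against accidental cycles in IS_A
        if prop in PROPERTIES.get(concept, set()):
            return True
        concept = IS_A.get(concept)  # move to the more general concept
    return False

print(has_property("tree", "eventually dies"))  # True
print(has_property("rock", "eventually dies"))  # False: nothing is known about rocks
```

The answer for 'tree' is never stored directly. It is derived by combining two separate pieces of explicit knowledge, which is the kind of reasoning the paragraph above describes.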

Suppose Cyc also had the following five pieces of knowledge: 'Parents love their children,' 'People smile when they are happy,' 'A child's first steps are a great achievement,' 'People are happy when their loved ones achieve something great,' and 'Only adults have children.' If asked whether a photo entitled 'Someone watching his daughter take her first steps' shows a smiling adult, it could logically infer that the answer is yes and back up that answer with an argument built from those five pieces of knowledge.
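The photo example can be sketched as simple forward chaining over if-then rules, recording which rule produced each conclusion so that the answer comes with an argument. This is a hedged illustration only: the propositional encoding, the fact names such as `has_daughter`, and the `forward_chain` function are assumptions made for this sketch, not Cyc's actual rule language or inference engine.

```python
# Each rule: (plain-language knowledge, set of premises, conclusion).
# The fact names are hypothetical labels chosen for this sketch.
RULES = [
    ("Only adults have children",
     {"has_daughter"}, "is_adult"),
    ("Parents love their children",
     {"has_daughter"}, "loves_daughter"),
    ("A child's first steps are a great achievement",
     {"daughter_takes_first_steps"}, "daughter_achieved_something_great"),
    ("People are happy when their loved ones achieve something great",
     {"loves_daughter", "daughter_achieved_something_great"}, "is_happy"),
    ("People smile when they are happy",
     {"is_happy"}, "is_smiling"),
]

def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived."""
    known = set(facts)
    trace = []  # (rule used, fact derived), in order of derivation
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                trace.append((name, conclusion))
                changed = True
    return known, trace

# Facts read off the photo title (an assumed encoding of the caption).
photo_facts = {"has_daughter", "daughter_takes_first_steps"}
known, trace = forward_chain(photo_facts, RULES)

# "Does the photo show a smiling adult?"
answer = {"is_adult", "is_smiling"} <= known
print("Answer:", "Yes" if answer else "No")
for rule, fact in trace:
    print(f"  {fact}  <-  {rule}")
```

The `trace` list plays the role of the logical argument: every derived fact is tied to the piece of knowledge that justified it, so a human can audit the chain step by step.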

MCC was funded for 10 years and taught Cyc a great deal of knowledge. When MCC was dissolved in 1994, Cyc was spun off into a new company called Cycorp. Much of what Cycorp did has not been made public, but research papers and technical reports show that Cyc was used to answer questions from medical researchers, reducing the time it took to get an answer from a month to less than an hour, and that Cycorp partnered with U.S. intelligence agencies to help build a 'terrorism knowledge base' that analysts could query.

In the 2000s, Cycorp released its knowledge data as OpenCyc and offered an extended version called ResearchCyc to researchers. A few outside researchers published papers based on the Cyc system, but the core product was proprietary to Cycorp, and OpenCyc was eventually discontinued in 2012.

While Cyc was slowly but steadily growing its knowledge base, the field of artificial intelligence was changing fundamentally. Most expert systems had long since disappeared by the 2000s, and neural networks, based on algorithms trained on large amounts of data, were making great strides. Neural networks were virtually the antithesis of Cyc's approach of performing logical inference over a painstakingly hand-crafted knowledge base of explicit rules, yet they, and by extension deep learning algorithms, were achieving incredible success on previously difficult problems. In this machine learning-dominated field, Cyc's rule-based approach increasingly looked like an anachronism.

However, as of 2024, 40 years after its founding, Cyc still exists and has grown to a knowledge base of 25 million rules, 1.5 million concepts, and more than 1,000 specialized inference engines. Cycorp employs 50 technical staff and is funded in part by commercial contracts.

Although it is overshadowed by flashy programs like ChatGPT, Cyc may still have a role to play. Professor Lenat, who passed away in 2023, said that a more powerful AI could be created by combining large language models (LLMs), which are knowledgeable and fluent but often inconsistent and inaccurate, with Cyc, which is not good at understanding natural language but whose conclusions are always backed up by a chain of inference that humans can audit.

'Although surviving for 40 years is a remarkable achievement, Cyc did not make a revolutionary impact. The AI community will likely remember Cyc as a cautionary tale of wasting a lot of effort on a misguided approach. But there's no doubt that rule-based systems like Cyc were the precursors of today's AI, and perhaps their time will come again soon,' Fisher said.

Apr 27, 2024 16:00:00 in Software, Posted by log1p_kr