====== Prolog assignment 2: Searching clause space ====== In the first assignment you are going to revise advanced search algorithms. You are asked to save your work in a file called ''bcs.pl'' and to submit it via the //upload system//.((The upload system requires the file to be stored in an archive without any directories.)) Tasks in this assignment do not have to be completed sequentially. If you wish, skip the first part of the assignment and earn all the points in the second half (see [[#Grading|grading]] for more information). ===== Introduction ===== In the second assignment you will create a more advanced [[http://en.wikipedia.org/wiki/Inductive_logic_programming|ILP]] technique, which searches the space of all clauses. Unlike the previously implemented LGG, which was deterministic and Prolog gave you only a single clause as the answer, this time the computer will consider many clauses throughout the search. We will call the algorithm //Blind Clause Search// (BCS). We will work once again with the genealogy graph of the royal family. Let's consider the well-known predicate ''daughter/2'' for a start. The search begins with the most general clause //c//1=''daughter(X,Y).'', which has no body. Notice that this clause correctly predicts //Louise// being the daughter of //Edward//, because the query ''?-daughter(louise,edward).'' succeeds. However it also predicts the reverse: //Edward// being the daughter of //Louise//, which is obviously incorrect. Therefore //c//1 needs to be specialized. There will be two ways of **clause specialization**. First, we will //add literals// to the clause body and create //c2=//''daughter(X,Y):-parent(A,B)''. Notice that variables in the new literal are fresh — not bound to the variables in the head. Binding the head literal with the body literal is done through //variable unification//, which is the second type of clause specialization. Unification creates firstly //c//3=''daughter(X,Y):-parent(Y,B)'' and applying it for the second time yields //c//4=''daughter(X,Y):-parent(Y,X)''. Notice that //c4// is better than //c//1 in the following sense: Both clauses correctly predict //Louise// being the daughter of //Edward//, but only //c//4 does **not** predict //Edward// being the daughter of //Louise//. How shall the computer know such a difference? We will supply the algorithm **two types of examples: the positive and negative** ones. The positive examples are those, for which your clause //must// succeed and the negative ones are those, for which your clause //must not// succeed (formally we say it //entails// or //does not entail// the example). If the clause is overly general (such as //c//1), a negative example fails and if the clause is overly specific (such as ''daughter(X,X).''), a positive example fails. So instead of specifying a single goal state to be reached, we merely constrain the search space and ask the algorithm to find //any// clause, which satisfies all examples. ==== Supplied code ==== Save the database of the British Royal Family (''royal.pl''): female(camilla). female(diana). female(elizabeth). female(louise). female(sophie). male(charles). male(edward). male(george). male(harry). male(james). male(philip). male(william). parent(charles,harry). parent(charles,william). parent(diana,harry). parent(diana,william). parent(edward,james). parent(edward,louise). parent(elizabeth,charles). parent(elizabeth,edward). parent(george,elizabeth). parent(philip,charles). parent(philip,edward). parent(sophie,james). parent(sophie,louise). wife(camilla,charles). wife(diana,charles). wife(elizabeth,philip). wife(sophie,edward). There are two task definitions for your code: **1) The easy one** (''easy.pl'') % Target predicate: target(daughter,2). % Available body predicates: language(parent,2). language(female,1). % Positive examples: pos(daughter(louise, edward)). pos(daughter(louise, sophie)). % Negative examples: neg(daughter(louise, elizabeth)). neg(daughter(diana, charles)). neg(daughter(harry, diana)). neg(daughter(diana, sophie)). **2) The hard one** (''hard.pl'') % Target predicate: target(sibling, 2). % Available body predicates: language(parent, 2). language(\=, 2). % Positive examples: pos(sibling(harry, william)). pos(sibling(william, harry)). pos(sibling(james, louise)). pos(sibling(charles, edward)). % Negative examples: neg(sibling(charles, george)). neg(sibling(george, charles)). neg(sibling(george, diana)). neg(sibling(harry, harry)). neg(sibling(james, george)). neg(sibling(diana, philip)). neg(sibling(elizabeth, camilla)). neg(sibling(louise, george)). neg(sibling(sophie, diana)). ===== I. Building blocks ===== ==== Terminating the search ==== First of all, we will have to determine if the found clause is correct. Define a predicate ''examples_satisfied(Clause)'' which succeeds iff ''Clause'' satisfies all positive examples and satisfies no negative example. Examples are given as ''pos(...)'' and ''neg(...)'' in the source code (see above). The ''Clause'' has the same structure as in the first assignment: ''Clause = cl(HeadLiteral, [BodyLit1,BodyLit2,...])''. To execute the list of //body literals//, you can use the following code: exec_body([]). exec_body([H|T]) :- call(H), exec_body(T). Example usage: ?- exec_body([member(A,[a,b,c]), \+ A = b]). A = a ; A = c. === Clause checks all examples === Make sure your code produces equal results:((Examples come from the ''easy'' task.)) ?- examples_satisfied(cl(daughter(X,Y),[parent(Y,X),female(X)])). true. ?- examples_satisfied(cl(daughter(X,Y),[parent(Y,X),female(Y)])). false. ?- examples_satisfied(cl(daughter(X,Y),[])). false. **Tip:** Start implementing the negative examples first. They are easier. **1 point** === Clause must work even when input arguments are not instantiated === Make sure that ''examples_satisfied'' succeeds iff the queried clause succeeds with input variables not instantiated. Expected results:((Example from the ''hard'' task.)) ?- examples_satisfied(cl(sibling(X,Y),[parent(Z,X),parent(Z,Y),X \= Y])). true. ?- examples_satisfied(cl(sibling(X,Y),[X \= Y,parent(Z,X),parent(Z,Y)])). false. Note that the second clause succeeds for ''?-sibling(harry,william)'', but asking ''?-sibling(X,Y)'' does not yield ''X = harry, Y = william''. This is clearly undesirable. Try yourself by playing with the ''sibling(X,Y) :- X \= Y,parent(Z,X),parent(Z,Y).'' clause. **1 point** ==== Define the search space ==== The search starts with an empty clause (e.g. //c//1), which is //specialized// using two operators: === Unify two variables === Define ''unify_vars(Cl1,Cl2)'' which unifies randomly 2 variables from ''Cl1'': ?- unify_vars(cl(daughter(A,B),[parent(C,D)]), Cl). Cl = cl(daughter(B, B), [parent(C, D)]); Cl = cl(daughter(C, B), [parent(C, D)]); Cl = cl(daughter(D, B), [parent(C, D)]); Cl = cl(daughter(A, C), [parent(C, D)]); Cl = cl(daughter(A, D), [parent(C, D)]); Cl = cl(daughter(A, B), [parent(D, D)]). **Note:** Clauses ''Cl1'' and ''Cl2'' may or may not share variables (they do in the example above). The choice depends on how to plan to use the predicates later in the assignment. The grading of this task will not be affected by your choice. **Tip:** If you decide to separate all variables between ''Cl1'' and ''Cl2'', you may find ''[[http://www.swi-prolog.org/pldoc/man?predicate=copy_term%2F2|copy_term/2]]'' predicate useful. **1 point** === Add a literal === Define ''add_literal(Cl1,Cl2)'' which adds one literal to the body of ''Cl1'': ?- add_literal(cl(daughter(X,Y),[parent(Y,X)]), Cl). Cl = cl(daughter(X, Y), [parent(_G345, _G346), parent(Y, X)]) ; Cl = cl(daughter(X, Y), [female(_G345), parent(Y, X)]). The added literals are defined through the ''language/2'' predicate. Notice that the variables of the new literals are unbound. **1 point** ===== II. Erecting the cathedral ===== Having both the search space and the terminating condition, create the search algorithm! Define a predicate ''bcs(Cl)'' which finds a correct clause. ==== Implementation ==== - Implement any type of search algorithm, which terminates in a finite amount of time given the ''easy'' task.((The upload system can run the code for a limited amount of time, which is 5 minutes. You can still earn the point if the time limit is exceeded, but the implementation is correct.)) **1 point**. - Use an algorithm different from (read: more sophisticated than) //iterative-deepening depth-first-search//./((A //closed list// can be implemented easily using the ''[[http://www.swi-prolog.org/pldoc/man?predicate=%3D@%3D/2|variant]]'' predicate.)) **1 point**. ==== Efficiency ==== - Use any search algorithm that gives you an answer to the ''easy'' task within 5 seconds and to the ''hard'' task within 5 minutes.((The time efficiency will be measured using: Intel(R) Xeon(R) CPU E5530 @ 2.40GHz (4800 [[http://en.wikipedia.org/wiki/BogoMips|bogomips]]); 64-bit linux, kernel 2.6.36 and SWI Prolog without parallel execution)) **1 point**. - Make the algorithm at least half as efficient as the “official implementation” on the ''easy'' task.((Do not worry, the official implementation is not heavily optimized. Honestly, it's hardly optimized at all.)) **1 point** - Make the algorithm at least twice as efficient as the “official implementation” on the ''hard'' task. **1 point** ==== Bonus ==== - Invent and implement a good heuristics. **1 point** - Give a short talk about it during the last tutorial (3 minutes at most). **2 points** ====== Grading ====== You can earn 12 points out of 8 in the second tutorial. The additional 4 points cannot be used for the exam, but they allow you to compensate lost points in the first Prolog assignment and during the test. ====== Submission ====== Submit the file to [[http://cw.felk.cvut.cz/upload/|cw.felk.cvut.cz/upload]] on the **20.5.2012 21.5.2012 23:59 CEST** the latest. The deadline is strict. The deadline for the “heuristics talk” is the last tutorial.