/*Date: 2024-03-18 ************************************************************ Ideas for developments in the next major version: Define an IRRP as an IRR without its middle element. An IGR is an IRRP (LHS), a char (alpha), and an array of IRRP's (the RHS's) In the derivation of an IGR, from its LHS the pointer starts from next to alpha, reaches near the other end then returns to alpha. Then after removing the redundant symbols, it must actually reach the other end first. Write a function to generate the set of RHS's of an IGR from its LHS. To generate the set of IGR's from an initial IGR x, take each RHS of x, apply this function to get the set S of IRRP's. Then the results of F have their LHS's equal to each RHS of x, and it RHS's equal to the corresponding set S. An algorithm for finding all the IGR's might be to start with those to get IRR(2), then repeatedly take each one of these, apply F to it and add all the results that are not already present to the same set and mark the original IGR as done. Repeat until all are done. It would be very interesting if this gave a finite final set. I think this might not work because in general context symbols need to be added before F can be done and their number appears to be unlimited. It might be instructive anyway if the result is not trivial. ************************************************************* Allowing for T1 and T2 to have different lengths appears to done and correct. The dialogues have changed and I've done a lot of tidyng up, i.e. deleting out of date comments and unwanted commented out code. The IGR(2) are now correct. This is the computer program TIE (Turing machine Inference Engine) version 3.0 draft. It employs the strategies described in the latest draft manusript of the Turing Machine analysis paper to be found on www.bluesky-home.co.uk It generates IRR's and IGR's (see references and latest draft paper) from a Turing Machine. This works for non-terminating TM's, and it has not all been checked for TM's that halt. TM's that don't halt are a simpler concept that it makes sense to analyse first, and the analysis of halting TM's could be obviously based on this. This program must be compiled before it can be used. It was written in D and originally had extension .d, but I prefer to keep it on the web as a .txt file and rename it before compilation. To compile it go to dlang.org and download one of the compilers (I use dmd) then do for example the following in a terminal program that takes text commands: $mv tie_v2_0.txt tie.d $dmd tie.d and run it with the command that includes the file name of the TM to be analysed. $./tie TM_file_name.txt An alternative if the user is familiar with the dialogue is to ignore the screen output and redirect the output to any file eg $./tie TM_file_name > out.txt, and then put the answers to the dialogues. The IRR are essentially the non-redundant set of computational short-cuts that can be derived by running the Turing Machine(TM) on a finite length of tape. Every computation of the TM as a sequence of single TM steps can be replaced by an equivalent shorter sequence of applications of the short-cut rules, (though the set of such rules may be and usually is infinite). Any computation with a TM expressed in terms of its IRR is reduced in terms of the number of computation steps needed. By correctly guessing the iterative formulae or method for obtaining the IRR for a particular TM with the aid of the list of IRR generated by this program, and verifying them by an induction argument, it is possible to obtain long- term information on the behaviour of the TM that is not obvious initially, and may shed light on the problem which the TM was designed to solve. Because TM's can be set up to solve any problem that can be solved by an algorithm, the possibilities of application of this method seem limitless. The notation for each irreducible regular rule (IRR), including the original Turing Machine (TM) table, is in the following order: two configuration sets (CS)'s each defined by a quadruple (state,pointer position,length,tape symbols) representing the input and output respectively of the rule, followed by the status of the rule (L left-moving,R right-moving,H halt), and finally the numbers of steps involved in its derivation (number of derivation steps from rules of length n-1, number of TM steps). The lines of the TM machine table must be sorted first by old state increasing, and then by symbol read in alphabetical order a to z. The layout of this file is as follows: (1)Copyright notice and literature references (2)History of different versions what they do including the current one, in reverse order. (3) The program code itself which is divided as follows (3.1) Global variables and library imports (3.2) Class definitions (3.3) Write functions for the classes (3.4) Other functions (3.5) The main function which is where the program execution begins. Copyright (C) 2018 John Nixon This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License can be obtained from . Mathematica Aeterna, (2013) Vol. 3, no. 9, 709 - 738 (Paper 1) a second paper continuing this study has now been published Mathematica Aeterna, (2017) Vol. 7, no. 1, 19 - 56 (Paper 2) Program modifications New in version 3.0: The latest draft of the paper describes the construction and use of IGR's (IRR generating rules). These are rules that describe the generation of new IRR's from old ones using the general procedure called F in the paper. This version of the program generates these IGR's up to a given length. The motivation for doing this is that a small number of IGR's could be combined to generate a large indeed infinite number of IRR's. It is unknown at this stage under what conditions on a TM is the number of IGR's for that TM finite. New in version 2.1: Foreach loops over arrays of class objects do not need ref to avoid unnecessary copying. them to exactly mimic the output from the default write functions for structs to aid in program testing, but which could now be modifed if needed. I did this to understand the difference between structs and classes, and to attempt to write the program more efficiently. I have simplified the code a little and eliminated some unnecessary copying, but probably not much. I was motivated by the elegant way in which I could implement Tarjan's algorithm with a class that was much more awkward with a struct. I have almost always used ref parameters for functions to ensure 2-way passing of information though this is sometimes unnecessary. I think this should be the default with the use of const where possible to aid in undertanding the program. New in version 2.0: The procedure for generating the IRR's has been changed to include generating the origins of the IRR's i.e. the IRR's are now represented as triplets. This gives a more informative result providing the proof of the reachability of the LHS's of the IRR's and the algorithm has been changed as a result and is now much easier to describe. It also runs about twice as fast as version 1.3 which is partly due to avoiding the unnecessary copying of IRR's through the default method D has for copying objects (pointer copies) instead of physical copies obtained with .dup. I had concerns about doing this at first until I realised that it is quite easy to check the program for the absence of "back doors" i.e. other variables that point to the same memory that could be used to change such a variable. There will often be such other variables initially and it must be checked that they go out of scope without modifying the variable of interest. This mostly deals with my earlier comment about .dup. New in version 1.3: the IRR's are printed with the additional "L" or "R" giving the pointer position on the LHS of the IRR. This is just a convenience because the pointer position is included as before, but the 'type' of the rule e.g. LL, LR, RL or RR can now be read off quickly after the second CS e.g. to determine if the rule has type RL or LR which allows it to be extended by an additional symbol and not be reducible. New in version 1.2: (1) The program has been translated from C++ into D (dlang.org). [D is much easier to use than C++, has bounds checking making anything like Valgrind unnecessary as far as I can see, has many more features and has similar syntax. This otherwise excellent language does unfortunately require the programmer to set up and use functions called "dup" that duplicate things that I think should be done automatically in assignments and copying objects into arrays etc. If the programmer forgets to do this correctly a new pointer will be assigned to the object instead of making a new object copy, so if the object is subsequently changed, errors could arise in code that appears naively to be correct.] (2) The symbols are now in the alphabet {a,b,c,...z} and the states are numbered 1,2,3,... allowing for Turing Machines with many more states and symbols to be analysed. This also removes ambiguity in symbolic expressions where exponents are used and is in accord with the notation in Paper 2. (3) A much improved format for reading Turing machines. (4) There was a modification to the way halting is handled: halting is now considered to be a property of a TM instruction rather than a TM state. As such, the TM can either move right (R) left(L) by one square, or halt (H) at each TM step. This avoids needing a distinguished state, but it means that a configuration set (CS) reached as a result of a halt must specify this explicitly e.g. by putting (H) after it. In any case my interest is mainly in describing non-terminating computations so I probably won't make much use of this in future. The original version 1.1 was based on the translation into D of TIEv1.0 (written in C++). It generates the set of all irreducible regular rules (IRR) for any Turing Machine (TM) up to any specified length, written by John Nixon (john.h.nixon1@gmail.com formerly john.h.nixon@spamcop.net). Comments are welcome. In the terminology of paper 1, "rule" is to be read as "regular rule" i.e. RR. Likewise "irreducible_rule" and "reducible_rule" are to be read as IRR and RRR respectively as defined in paper 1. Comments on D: Function parameters: "ref" is sometimes needed to ensure proper functioning of functions using 2-way information flow even if the arguments are ref types, so use it always for 2-way flow. I use "out" for output-only. I don't use "in" because it is confusing because it is unfortunately almost synonymous with "const", not just copying input variables which is what the semantics suggests. It should be for input-only. I use const instead. You cannot declare a variable conditionally without restricting the scope of that variable to the code in the conditional block. Could the scope of that variable be the next outer scope instead? */ import std.stdio; import std.algorithm; import std.string; import std.file; import std.conv; import std.typecons; //Global variables: File of;//output file; int nsy;//# symbols int nst;//# machine states IRR [][] irrs = new IRR[][0];//The set of IRR's for the TM, used by main and apply_rules IGR [][][][] igrs = new IGR[][][][0];//igrs[i][j1][j2] is the set of IGR's that create the set IRR(i + 2) from the set IRR(i + 1) i.e. //the IGR(i + 2), and have parameters j1 = i1 - 1, j2 = i2 for generating new origins and new RHS's respectively (see paper pp 22-25). //The array of IGR's (initially augmented IGR's i.e. IGR and context strings are combined before the contexts are removed) //igrs is split into a 3D array of sub-arrays of IGR's igrs[i][j1][j2]. //************************************************ Store ********************************************************** //The recursive function "new_origins" was probably the most difficult part of the program to write and the most important. //It generates all the new endpoints of the reverse computation from a given CS as a result of adding a single extra symbol //to the finite part of the TM machine tape considered. new_origins uses and can reuse memory space that does not have to be declared //for each call and is conveniently packaged into one structure. The Type name is Store, and the instance is called st. alias Store = Tuple!(bool, "inside", bool, "dircheck", CS[], "L", bool, "cycle",bool, "right_at_end", bool, "left_at_end", bool, "cond_1", int, "j", int, "ind", int, "j1", CS[], "one_step_ori", bool, "right_at_start"); Store st; alias int_pair = Tuple!(int, "j1", int, "j2"); void Store_write(const ref Store st){ write("\n\ninside = ",st.inside); write("\ndircheck = ",st.dircheck); write("\nL = ");array_CS_write(stdout,st.L); write("\ncycle = ",st.cycle); write("\nright_at_end = ",st.right_at_end); write("\nleft_at_end = ",st.left_at_end); write("\ncond_1 = ",st.cond_1); write("\nj = ",st.j); write("\nind = ",st.ind); write("\nj1 = ",st.j1);//extreme pointer position counted from the alpha end of the string st. pointer starts at position 0. write("\none_step_ori = ");array_CS_write(stdout,st.one_step_ori); writeln("\nright_at_start = ",st.right_at_start);} alias NOT = Tuple!(CS, "new_ori", int, "j1", char, "alpha");//new origins triplet alias NOOCO = Tuple!(CS, "cs0", NOT[], "nota");//new origins one cycle output //************************** class CS ************************************************ //This class represents a configuration set(CS) as defined in the paper. //I will use the convention that if p>0, symbols will be added to the left //of the pointer, and if p=0, symbols will be added to the right. //This distinguishes the two cases when l=0 because then p=1 and 0 respectively. class CS{ int s = 0;//state int p = 0;//pointer position int l = 0;//length of tape given char [] t = [];//part of tape of specified length @safe final this(const int s,const int p,const int l,const char[] t){ this.s = s; this.p = p; this.l = l; this.t = t.dup;} @safe final this(){ }//needed for calling 'new' without an argument to create a default class object @safe final CS dup() const {return new CS(s,p,l,t);} //Order relation for sorting CS's: Compare first by pointer positions. If //these are the same compare by machine states, if these are equal //and the pointer is at the left ( <= 1), compare by symbol strings on the "tape", and if the pointer //is at the right ( >= l), compare by the reversed symbol strings on the "tape". This is such that the most //significant end of the tape is the end closest to the pointer. Then truncation at the other end does not alter the sort order. final int opCmp(ref const CS rhs) const{ if (p != rhs.p) return p - rhs.p; else if(s != rhs.s)return s - rhs.s; else if (p <= 1){ if (t != rhs.t) { return t < rhs.t ? -1 : 1;} else return 0;} else if (p >= l){ char [] str1 = t.dup; char [] str2 = rhs.t.dup; if(str1 != str2) { return reverse(str1) < reverse(str2)? -1 : 1;}else return 0;} else {writeln("Error ocurred in opCmp"); return 0; }} //This is needed though it seems superfluous! needed to equality test class objects by their values @trusted override bool opEquals(const Object o) const { auto rhs = cast(CS)o;//not allowed in safe code if(s != rhs.s) return false; else if(p != rhs.p) return false; else if(l != rhs.l) return false; else if(t != rhs.t) return false; else return true;} } //******************************** class IRR ************************************** //relation between two CS's cs1 and cs2 defined by the action of the TM. In the paper //special types of rule were defined: regular rules i.e. rules defined by a simple //recursive definition, and the subset of these that are non-redundant i.e. derived //in more than one step. These are the irreducible regular rules (IRR's). Later the //array of CS's cs0 was added to the definition. These are all the CS's that lead by //one or more TM steps to cs1. class IRR{ CS[] cs0 = []; CS cs1; CS cs2; char lsig = ' ';//pointer position in cs1 i.e. LHS: 'L' for 1 and 'R' for cs1.p > 1, blank if cs1.l=1 char sig = ' ';//L,R,H, or C i.e. left moving, right moving, halting, or cycling resp. //See the paper for details. int n_derivation = 0;//# derivation steps from cs1 to cs2. int n_machine = 0;//# machine steps from cs1 to cs2. @safe final this(ref CS[] cs0,ref CS cs1,ref CS cs2, const char lsig, const char sig, const int n_derivation,const int n_machine){ this.cs0 = cs0.dup; this.cs1 = cs1.dup; this.cs2 = cs2.dup; this.lsig = lsig; this.sig = sig; this.n_derivation = n_derivation; this.n_machine = n_machine;} @safe final this(){ } @safe final IRR dup(){ return new IRR(cs0, cs1, cs2, lsig, sig, n_derivation, n_machine);} //Compare IRR's by their LHS CS's (cs1). final int opCmp(const ref IRR rhs){ if(cs1 != rhs.cs1) return cs1 < rhs.cs1? -1: 1; else return 0;} } //************************************** class IRRP_set ************************************** //This class is for sets of IRR patterns (IRRP's) and the alpha value that generates them from another IRRP. see in my paper (2021 version). //The RHS is unique but there can be many origins. //An IRRP_set is a set of IRR's that have a common set of origins and rhs that have //arbitrary context strings $T_1$ and $T_2$ respectively. These are just place-holders and will not be represented in the class //because the information is coveyed by the class name itself. //This is for convenience when using apply_rules to generate RHS's. //The class IRRP_set is part of the class IGR below. class IRRP_set{ char alpha = '*';//default value to permit sorting CS[] origins = []; CS rhs; final this(const char alpha, CS[] origins, const ref CS rhs){ this.origins = origins.dup; this.rhs = rhs.dup; this.alpha = alpha;} @safe final this(){} final int opCmp(ref const IRRP_set irrps) const{//<,>,== should be exhaustive if(alpha < irrps.alpha) return -1; if(alpha > irrps.alpha) return 1; if(rhs < irrps.rhs) return -1; if(rhs > irrps.rhs) return 1; if(origins[0] < irrps.origins[0]) return -1; if(origins[0] > irrps.origins[0]) return 1; return 0;} final IRRP_set dup() { return new IRRP_set(alpha,origins,rhs);} override bool opEquals(const Object o) const{//needed to compare class objects by their values auto irrps = cast(IRRP_set)o; if(alpha != irrps.alpha) return false; if(origins != irrps.origins) return false; if(rhs != irrps.rhs) return false; return true;} } //****************************************** class RHE ***************************************** //right hand element. An array of these form the rhs of an IGR. class RHE{ IRRP_set irrp_set; int n_TM_step = 0; //n_TM_step is the number of TM steps from the original rhs to the final rhs on the right of the IGR that contains the RHE. int n_Sub_step = 0; //n_Sub_step is the number of substitution steps involved in this. @safe final this(){} final this(ref IRRP_set irrp_set, const int n_TM_step, const int n_Sub_step){ this.irrp_set = irrp_set.dup; this.n_TM_step = n_TM_step; this.n_Sub_step = n_Sub_step;} override bool opEquals(const Object o) const { auto rhs = cast(RHE)o;//not allowed in safe code if(n_TM_step != rhs.n_TM_step) return false; if(n_Sub_step != rhs.n_Sub_step) return false; if(irrp_set != rhs.irrp_set) return false; return true;} final RHE dup(){ return new RHE(irrp_set, n_TM_step, n_Sub_step);} final int opCmp(ref const RHE rhe){ return irrp_set.opCmp(rhe.irrp_set);} } //****************************************** class Context_pair ********************************* //This is for any pair of strings. It is used only as a context for an IGR to be added on the left and right respectively of the implication. // class Context_pair{ char [] s1 = [];//left char [] s2 = [];//right final this(const ref char [] s1, const ref char [] s2){ this.s1 = s1.dup; this.s2 = s2.dup;} override bool opEquals(const Object o) const { auto rhs = cast(Context_pair)o;//not allowed in safe code if(s1 != rhs.s1) return false; if(s2 != rhs.s2) return false; return true;} final this(){} final Context_pair dup(){ return new Context_pair(s1,s2);} } //***************************************** class IGR ************************************** //This class is for sets of IGR's with a common lhs IRRP and with a set of pairs of strings (contexts). Each instance of //this class will correspond to a row of Table 3 or Table 4. see my paper (2021 version). class IGR{ int ns, j1, j2;//(ns) number of symbols in the rhs of the IGR (origin & rhs are same length), // including context symbols, extreme positions (j1 = i1 - 1,j2 = i2) of the TM on the "tape" found when deriving the new origin and the new RHS. //related to the length of the symbol strings. See paper. Context_pair[] context = [];//this corresponds to a set of $T_1$ and $T_2$ pairs in the paper. //The context is the same for each value of alpha. IRRP_set lhs;//This IRRP_set corresponds to a set of IRR's to which when F is applied generates the set //of IRRP_sets (combined with two ints into RHE's) on the rhs, one for each possible value of alpha (the symbol added). RHE[] rhs = []; final this(int ns, int j1, int j2, ref Context_pair[] context, ref IRRP_set lhs, ref RHE[] rhs){ this.ns = ns; this.j1 = j1; this.j2 = j2; this.context = context.dup; this.lhs = lhs.dup; this.rhs = rhs.dup;} final int opCmp(ref const IGR igr) const{ //compare IGR's by their lhs's. This is just based on logic, the set of rhs's follows from the lhs. return lhs.opCmp(igr.lhs);} //This is needed. override bool opEquals(const Object o) const { auto rhs1 = cast(IGR)o;//not allowed in safe code if(ns != rhs1.ns) return false; if(j1 != rhs1.j1) return false; if(j2 != rhs1.j2) return false; if(context != rhs1.context) return false; if(lhs != rhs1.lhs) return false; if(rhs != rhs1.rhs) return false; return true;} final this(){} final IGR dup() { return new IGR(ns, j1, j2, context,lhs,rhs);} } //*********************** Several write functions **************************** //A "write" function is not supplied by default for classes, but is for structs. I wrote //CS_write, array_CS_write, IRR_write, array_IRR_write to mimic the results of the default struct write functions to compare //the program outputs with classes and structs (for CS's and IRR's) and they agreed and other write functions were added later //in the similar way. void CS_write(File of,const ref CS cs){ if(cs !is null){ of.write("CS("); of.write(cs.s); of.write(", "); of.write(cs.p); of.write(", "); of.write(cs.l); of.write(", "); of.write('"'); of.write(cs.t); of.write('"'); of.write(")");} else of.write("null CS");} void Context_pair_write(File of, const ref Context_pair sp){ if(sp !is null){ of.write("("); of.write(sp.s1); of.write(","); of.write(sp.s2); of.write(")");} else of.write("null Context_pair");} void IRR_write(File of,const ref IRR r){ if(r !is null){ of.write("IRR("); of.write("["); bool first=true; foreach(ref x;r.cs0){if(!first)of.write(", "); CS_write(of,x);first = false;} of.write("]"); of.write(", "); CS_write(of,r.cs1); of.write(", "); CS_write(of,r.cs2); of.write(", '"); of.write(r.lsig); of.write("', '"); of.write(r.sig); of.write("', "); of.write(r.n_derivation); of.write(", "); of.write(r.n_machine); of.write(")");} else of.write("null IRR");} void array_CS_write(File of,const ref CS[] array){ if(array !is null){ of.write("["); foreach(ref x;array){ of.write(" ");CS_write(of,x);} of.write(" ]");} else of.write("null CS[]");} void context_write(File of,const ref Context_pair[] array){ if(array !is null){ of.write("context["); foreach(ref x;array){ of.write(" ");Context_pair_write(of,x);} of.write(" ]");} else of.write("null context");} void array_IRR_write(File of,const ref IRR[] array){ if(array !is null){ bool first = true; of.write(array.length); foreach(ref x;array){ of.write(" ");if(!first)of.write("\n");IRR_write(of,x);first = false;}} else of.write("null IRR[]");} void IRRP_set_write(File of, const ref IRRP_set irrp_set){ if(irrp_set !is null){ of.write("IRRP_set("); of.write(irrp_set.alpha); of.write(", "); of.array_CS_write(irrp_set.origins); of.write(", "); of.CS_write(irrp_set.rhs); of.write(")");} else of.write("null IRRP_set");} void RHE_write(File of, const ref RHE rhe){ if(rhe !is null){ of.write("RHE("); of.IRRP_set_write(rhe.irrp_set); of.write(", "); of.write(rhe.n_Sub_step); of.write(", "); of.write(rhe.n_TM_step); of.write(")");} else of.write("null RHE"); } void array_RHE_write(File of, const ref RHE[] array){ if(array !is null){ of.write("[ "); bool first = true; foreach(ref x; array){ if(!first)of.write("\n"); RHE_write(of,x); first = false;} of.write(" ]");} else of.write("null RHE[]");} void array_IRRP_set_write(File of,const ref IRRP_set[] array){ if(array !is null){ bool first = true; of.write("[ "); foreach(ref x;array){ if(!first)of.write("\n");IRRP_set_write(of,x);first = false;} of.write(" ]");} else of.write("null IRRP_set[]");} void IGR_write(File of, const ref IGR igr){ if(igr !is null){ of.write("IGR("); of.write(igr.ns); of.write(", "); of.write(igr.j1); of.write(", "); of.write(igr.j2); of.write(", "); of.context_write(igr.context); of.write(", "); of.IRRP_set_write(igr.lhs); of.write(", "); of.array_RHE_write(igr.rhs); of.write(")");} else of.write("null IGR");} void array_IGR_write(File of,const ref IGR[] array){ if(array !is null){ bool first = true; of.write(array.length); foreach(ref x;array){ of.write(" ");if(!first)of.write("\n");IGR_write(of,x);first = false;} of.writeln();} else of.write("null IGR[]");} void NOOCO_write(File of, const ref NOOCO nooco){ write("\n nooco.cs0 = ");CS_write(stdout,nooco.cs0); foreach(y; nooco.nota){//new origins triplet array write("\n y.new_ori = ");CS_write(stdout,y.new_ori); write("\n y.j1 = ",y.j1); write("\n y.alpha = ",y.alpha);}} //********************************** function can_write ************************************** //This function finds a filename to be written to for output and returns it. It ensures //that no file is overwritten without the consent of the user. "base_name" is usually the input //datafile name and "extension" is the default file extension for output (including the dot) @trusted char[] can_write(const char[] base_name,const char[] extension){ char[] filename = (base_name ~ extension).dup; char[] chars; char[] question; bool write1,fn,overwrite=false; do{ fn = exists(filename); if(fn){ writeln("The file ",filename," already exists."); question="Do you want me to overwrite it?".dup; overwrite=yes(question); if(!overwrite){ writeln("The new path and file name is ",base_name,"",extension); std.stdio.write("Enter the extra characters: "); readln(chars);chars=chomp(chars);//not allowed in safe code filename=base_name ~ chars ~ extension;}} write1=(!fn) || overwrite;} while (!write1); writeln("Output will be to the file ",filename); return filename;} //************************************** function yes ***************************************** //This function asks the user the question by dispaying the text 'question' and waits for //the response y (for yes) or n (for no) @trusted bool yes(const char[] question){ char ans='*'; bool ret; char[] res; while(ans!='y' && ans!='n'){ std.stdio.write(question,"(y/n): "); res=chomp(readln().dup);//not allowed in safe code ans=res[0]; if(res.length==1){ if(ans=='y')ret = true; else if(ans=='n') ret = false;} else{ans='*';}} return ret; } //*********************************** function input_int *************************************** //The purpose of this function is to extract an int from the input stream and handle //the error that occurs when a non-number is entered, and allow the user to enter it again. @trusted void input_int(ref int i){//ref is needed here bool error; char[] line; for(;;){ line = chomp(readln().dup);//not allowed in safe code error = false; try{i = to!int(line);} catch (Exception){error=true;} if(error == false)break; std.stdio.write("An error occurred. Please try again: ");} } //*********************************************** function is_substring_left ************************************************ //This determines whether of not the shorter of the char arrays is identical to the portion of the other string of the same length //taken from the left. bool is_substring_left(ref char[] str1, ref char[] str2){ int l = to!int(min(str1.length, str2.length)); foreach(i; 0 .. l){ if(str1[i] != str2[i])return false;} return true;} //*********************************************** function can_follow ************************************************** //This function determines whether or not one IGR can be followed by another //and if true, two IRR's can be derived in turn from a single IRR. //The arguments are 6 CS's, the origin and rhs of the rhs of the first IGR, then the same for the lhs of the second IGR. //and in addition, cs1 and cs2 at the origin of the lhs of the IGRs. This is needed to check that the type of the first //IGR to ensure it corresponds to extendable IRR's i.e. type LRL or RLR. //There are several conditions that must be satisfied: //1) The direction of adding alpha must be the same //2) The states on the left and right must match (rhs of x and lhs of y) //3) IGR x (the first) must be of extendable type i.e. RLR of LRL //4) The strings on the left and right must match from the end indicated in step 1. bool can_follow(const ref CS cs1l, const ref CS cs1r, const ref CS cs2l, const ref CS cs2r, const ref CS cs1, const ref CS cs2){ // write("\nx = ");IGR_write(stdout,x); // write("\ny = ");IGR_write(stdout,y); char[] strx_l, strx_r, stry_l, stry_r; bool x_lg, y_lg;//x resp. y are left going if(cs1l.s != cs2l.s) { // write("\n1 false"); return false;} if(cs1r.s != cs2r.s) { // write("\n2 false"); return false;} x_lg = cs1.p == 1; y_lg = cs2.p == 1; if(x_lg != y_lg){ // write("\n3 false"); return false;} if(!((x_lg && cs1r.p == 0) || (!x_lg && cs1r.p == cs1r.l + 1))) { //write("\n3 false"); return false;} //string matching condition strx_l = cs1l.t.dup; strx_r = cs1r.t.dup; stry_l = cs2l.t.dup; stry_r = cs2r.t.dup; if(x_lg && is_substring_left(strx_l,stry_l) && is_substring_left(strx_r,stry_r)) { //write("\n4 true"); return true;} reverse(strx_l);reverse(strx_r);reverse(stry_l);reverse(stry_r); if(!x_lg && is_substring_left(strx_l,stry_l) && is_substring_left(strx_r,stry_r)) { //write("\n5 true"); return true;} // write("\n6 return false"); return false;} //***************************************** template function unique_sort_add ****************************************** //add element x to array in such a way as to maintain increasing order, and such that no elements are repeated //in the array. //Algorithm. Bisect the array segment repeatedly so as to locate between which pair of //elements x is to be inserted to maintain inreasing order. When the pair of elements differ by 1 in index //x can be inserted between them provided x is different from each of this pair. //nova (meaning new which cannnot be used because it is a keyword) tells whether x is distinct from all the elements of array or not void unique_sort_add(T)(File of, int ind, ref T[] array, ref T x, out bool nova){ int i; nova = true; if(x !is null){ int l = to!int(array.length);//include case when l > 0 but elements are null if(l == 0) array ~= x;//array is empty if(l > 0){ //deal with the end cases next if(x.opCmp(array[0]) < 0) { if(array is null)array = [x]; else array = [x] ~ array;}//x is smaller than all array elements else if(x.opCmp(array[l - 1]) > 0) array ~= x;//is greater than all array elements else{//l != 0 and x is not to be inserted at an end. //i1 and i2 are the indices which bracket the position where x is to be placed. //and are at the lower and upper end of the range respectively. int i3,i1 = 0; int i2 = l - 1; while(i2 - i1 > 1){ i3 = i1 + (i2 - i1)/2; if(x.opCmp(array[i3]) < 0) i2 = i3; else i1 = i3;} if(array[i1] != x && x != array[i2]){//x should be included array = array[0 .. i1 + 1] ~ x ~ array[i2 .. $];} else nova = false; //x is not different from all the elements of array, so array is not altered. }}}} //******************************** function split_string *************************** //Remove n_symbols from the string of a CS on the left or right opposite the pointer and put them in cont. //The pointer is at or one space beyond the end of the string when called, and the length of the symbols strings is >= 2. void split_string(ref CS cs,const int n_symbols, out char[] cont){ // write("\nsplit_string: cs = ");CS_write(stdout,cs);write(" n_symbols = ",n_symbols); if(cs.t.length != cs.l){write("cs.l != cs.t.length");} int i; CS cs1 = cs.dup; bool right = cs.p - cs.l == 0 || cs.p - cs.l == 1;//pointer is at right cs.l -= n_symbols; if(right){ cs.p -= n_symbols; i = to!int(cs.t.length) - n_symbols; rotate(cs.t,i); cont = cs1.t[0 .. n_symbols];} else cont = cs1.t[$ - n_symbols .. $].dup; cs.t.length -= n_symbols; } //******************************** function add_opposite_pointer ******************* //Add symbol string (not inplemented as a D string!) s at the opposite end of the string of symbols from the pointer. formerly add_at_x //The routine assumes that the pointer is at one end of the known //symbols on the tape (other cases will not be needed). If the length of the string of known symbols is 1 s will be added on the right //unless direction is set to 'L'. @safe void add_opposite_pointer(ref CS cs,const char [] s, const char direction){ //symbols s are added at the right. Pointer is at left. ( i.e. assume cs.p == 1) int len = to!int(s.length); cs.t ~= s; cs.l += len; if(cs.p > 1 || direction == 'L'){//the symbols are added at the left, pointer is at right ( if cs.p == cs.l ) //rotate the string round so that s is now added at the left, and adjust cs.p accordingly. cs.p += len; rotate(cs.t,len);}} //********************************* function add_at_ptr_or_end **************************** //Add symbol sy to cs (1) at or (2) adjacent to the pointer position at end of symbol string. //The routine assumes that the pointer is just off one end by one space (case 1) or at one end (case 2) //of the known symbols on the tape so that the added symbol never clobbers previous known symbols on the tape. //This assumes that the length of the symbol string is > 1 for case (2). If l = 1 and case (1) then //the other parameters cannot tell whether the symbol sy should be added at the left or right. In this case //this is determined by the parameter direction (L or R) respectively. 'R' is the default value. @safe void add_at_ptr_or_end(ref CS cs,const char sy, const char direction){ cs.t ~= sy; if(cs.p <= 1 && cs.l > 1 || cs.p == 0 && cs.l == 1 || (cs.p == 1 && cs.l == 1 && direction == 'L')){ rotate(cs.t,1);//swaps added symbol from right to left end. ++(cs.p);} ++(cs.l); } //******************************** function rotate ********************************* //Rotates a char[] array of length l by r places to the right. @safe void rotate(ref char[] str,const int r){ char[] temp = str.dup; int l=to!int(str.length); temp.length = 2 * l - r; foreach(ref i;0 .. l - r)temp[i + l] = temp[i]; foreach(ref i;0 .. l)str[i] = temp[i + l - r]; } //******************************* function subset ********************************** //Returns whether cs1 is a subset of cs2 (includes equality) @safe bool subset(const CS cs1,const CS cs2){ int ll; if(cs1.s != cs2.s)return 0;//false if states differ ll = cs1.p - cs2.p;//difference between left end of tape represented if(ll < 0)return 0; if((cs2.l - cs2.p) > (cs1.l - cs1.p))return 0; char[] sub; sub.length = cs2.l; foreach(ref i;0 .. cs2.l)sub[i]=cs1.t[i + ll]; return sub == cs2.t;} //******************************* function st_init ************************************ //This initialises the Store object st. void st_init(){ st.inside = false; st.dircheck = false; st.L = []; st.cycle = false; st.right_at_end = false; st.left_at_end = false; st.cond_1 = 0; st.j = 0; st.ind = 0; st.j1 = 0;} //*************** why not initialise one_step_ori and right_at_start? ************** //************************* function new_origins ************************************ //new origins is a recursive algorithm (i.e. it calls itself) to perform the following tasks, but more efficiently. //Initially the CS cs is augmented by adding an arbitrary symbol (called alpha in the papers) at the //same end where the pointer is for the first call to new_origins. (This is the opposite end of the string to where //the pointer is in IRR.cs1 i.e. its LHS.) //All values of alpha are included in the search. //Then new_origins finds all single backward TM steps from there and their //resulting CS's. This is repeated for each branch recursively, until //one of 4 conditions occurs in each branch. Condition 1: the pointer finishes at the opposite end from //where it started in the LHS which is the same end where it was in cs (CS A in the paper). Condition 2: the pointer finishes at the same end //to where it started in the LHS which is the opposite end where it started in cs. //Condition 3: the computation can go no further because backward steps from there are impossible. //Finally, condition 4: a CS is reached that has been entered before. //All the CS's arrived at as a result of branches ending in condition 1 constitute the array ori on exit. //The algorithm is based on a depth-first search of the tree. An array of the cs's obtained on //the way (L) must be maintained in order to check for a loop (condition 4). This has to be updated at each step of //the search. The first time new_origins is called for a CS cs, L is empty. Otherwise L is the current list of all //previous CS's in the backward search for origins, excluding the one that has just been or being found. //The new element of L is added first for each call to new_origins. //To avoid repetition, alpha is not actually added to the string until the pointer reaches the end. //The initial pointer position in the original LHS i.e. right or left end of the string, is passed down through the variable //"right_at_start" meaning the pointer is at the right in IRR.cs1 (its LHS) which is true or false. This is used to distinguish //case 1 and case 2 when the pointer ends up at the opposite end or same end. //n_symbols is the number of symbols involved in this calculation for this branch. This is needed in order to find //the number of symbols involved in apply_F to express it with no redundant symbols. //n_symb is the array of values of n_symbols, one for each branch. n_symb is empty on first calling new_origins. //The variables cycle, right_at_end, left_at_end, cond_1, j, ind, n_symbols and one_step_ori are in a global structure st of type Store //of which there is just one instance to avoid creating new variables each time new_origins calls itself. //ref is needed for all parameters (value and reference types) to guarantee 2-way information flow //and avoid new copies being used for each call. //The values in ori, n_symb, and alpha in nooco_object at the first call do not affect the running of new_origins except that these //values will be prepended to any subsequent values added to these arrays by new_origins. So in a sense they are //"out" only variables, but not quite. They should be ideally declared as "append only" array arguments but this is not possible. // @trusted void new_origins(ref CS cs,ref NOOCO nooco_object){ bool test; st.cycle = canFind(st.L,cs);//not allowed in safe code st.L ~= cs; st.right_at_end = cs.p == cs.l; st.left_at_end = cs.p == 1; st.cond_1 = (st.right_at_start && st.left_at_end) || (!st.right_at_start && st.right_at_end);//condition 1 st.one_step_ori = []; if(!st.cycle && ((!st.right_at_end && !st.left_at_end) || st.cond_1)){//conditions under which current branch is not required to stop i.e. // non-cycling or condition 1. This eliminates cycling and condition 2. //get the array (one_step_ori) of all CS's ncs that go to cs in 1 TM step with the pointer on the tape and all CS's in ori that //do so from a new pointer position just outside the existing tape. The latter includes //the extra symbol in place of the symbol alpha added. foreach(ref k2; 1 .. nst+1){//look for a possible reverse TM step foreach(ref k1; 0 .. nsy){//through the whole TM table st.j = (k2 - 1) * nsy + k1;//index of TM table row if(irrs[0][st.j].cs2.s == cs.s){//if the new TM state is the state of cs (1st condition for reverse step) st.ind = cs.p - irrs[0][st.j].cs2.p;//array index of symbol matched. st.ind + 1 = (cs.p - 1 if TM step to be reversed goes right,cs.p + 1 otherwise) st.inside = st.ind >= 0 && st.ind < cs.l;//pointer position after reverse TM step is within symbol string, otherwise it is at alpha st.dircheck = (irrs[0][st.j].cs2.p == 2 && st.ind == -1) || (irrs[0][st.j].cs2.p == 0 && st.ind == cs.l);//checks that if condition 1 is true, //the direction of the next reverse TM step takes the pointer just outside the current symbol string. test = (st.inside && irrs[0][st.j].cs2.t[0] == cs.t[st.ind]) || (!st.inside && st.dircheck); if(test){//if the new TM symbol is the symbol at ind (2nd condition for reverse step) CS ncs = new CS; ncs.s = irrs[0][st.j].cs1.s;//This would be k2 if TM table is written in order in file. ncs.p = st.ind + 1; ncs.t = cs.t.dup; if(st.inside){ ncs.l = cs.l; ncs.t[st.ind] = irrs[0][st.j].cs1.t[0]; st.one_step_ori ~= ncs;//dup not needed: cs2 goes out of scope next, so it cannot provide a "back door" to one_step_ori } else{//reverse TM step takes pointer outside string, so terminating this branch of the search. //Find st.n_symbols which is the number of symbols involved i.e. the length of the character string involved in finding //the new origin in the current branch. //writeln("test11"); if(st.right_at_start){//The TM is at the right hand end of the symbol string in the LHS st.j1 = 1;//find n_symbols i.e. the number of symbols used in the computation foreach(ref x; st.L) st.j1 = max(st.j1, x.p); st.j1 -= 1;}//because j1 = i1 - 1 is needed. else{//The TM is at the left in the LHS st.j1 = cs.l; foreach(ref x; st.L) st.j1 = min(st.j1, x.p); st.j1 = cs.l - st.j1;}//ex must be the extreme position based on starting at 0 instead of at cs.l and counting leftwards. //find ncs when the new pointer position is just beyond the symbol string, remembering that if the //reverse TM step goes left, then the new symbol is on the left increasing all the symbol positions by 1. ncs.l = cs.l + 1; ncs.t ~= irrs[0][st.j].cs1.t[0]; if(st.ind == -1) {++(ncs.p); rotate(ncs.t,1);}//add irrs[0][j].cs1.t[0] at position 0 or at right end of string otherwise NOT noto; noto.new_ori = ncs.dup; noto.alpha = irrs[0][st.j].cs2.t[0]; noto.j1 = st.j1; nooco_object.nota ~= noto; }}}}}} //This is the recursive bit foreach(ref cs3;st.one_step_ori){ new_origins(cs3, nooco_object);} if(st.L.length) --st.L.length;//remove last item in L if there are any after all other branches of the search have ended. (this CS cannot be an intermediate CS). //It terminates the branch and is appropriate just prior to back-tracking from a branch that ends to follow another reverse TM step.} } //************************* function apply_rules ************************************ //The function "apply_rules" applies the previously found IRR's up to length one less than the current CS being //processed to carry out substitution //steps for the TM computation starting from the LHS named cs. n_steps is # substitution steps, nm = # TM steps. //Do this repeatedly until one of the following //results occurs: (1)pointer moves outside the length of tape with known symbols, //to the right (R) or left(L). //(2) a halt condition occurs (H), (3) an infinitely repeating cycle occurs(C). //The result is a new rule with the stopping condition (LRHC) indicated and //the numbers of derivation (i.e. substitution) and machine steps involved. In the case C there //is no final CS. In the output there must be one, but C is //indicated showing that it is not really final. //The new rule is added to the IRR, because it must be irreducible //as the result of the correct functioning of this computer program. //Each computation will end in state L,R,H or C, meaning the pointer is //at the left,right, the computation halts or does not terminate, respectively. //On output, cs now is the final result of the substitutions. The new rules get //inserted into the array of arrays of IRR's "irrs" in the main program. //j2 is the extreme position of the pointer in this computation if it starts at position 0. @trusted void apply_rules(ref CS cs,out int n_steps,out int nm, out char d, out char lsig, out int j2){ //d is "sig" for output cs: L,R,C, or H, lsig is sig for original cs (L or R). if(cs.p == 1){lsig = 'L'; j2 = 0;} else {lsig = 'R'; j2 = cs.l;}//initialise j2 //write("\n1 apply_rules: j2 = ",j2); CS[] cs_array; cs_array ~= cs.dup;//dup is needed because cs_array is a record of the changing values of cs. n_steps = 0;//# derivation steps already done nm = 0;//# machine steps bool match; d = '0'; char sig; int l; IRR irr2; for(;;){//repeatedly apply appropriate rules to continue computation if(cs.p == 0){d = 'L';break;} if(cs.p > cs.l){d = 'R';break;} match=0; for(int k = cs.l - 2;k >= 0; --k){//find the longest applicable rule foreach(ref irr3;irrs[k]){ if(subset(cs,irr3.cs1)){match=1; irr2 = irr3;//irr3 next goes out of scope so dup not needed provided irr2 is not altered bolow. break;}} if(match==1)break;} //match now indicates if a matching rule has been found. This should be true always if(!match){ throw new Exception("Error: no matching rule found");} //Do substitution. The first position is element 0 as far as replace is concerned if(cs.p == 1){//pointer at left in cs and irr2. foreach(ref i;0 .. irr2.cs2.l){ cs.t[i] = irr2.cs2.t[i];}} else if(cs.p > 1){//pointer at right in cs and irr2. foreach(ref i;0 .. irr2.cs2.l){ cs.t[cs.l - irr2.cs2.l+i] = irr2.cs2.t[i];}} cs.s = irr2.cs2.s; cs.p += irr2.cs2.p - irr2.cs1.p;//increment of pointer position // write("\n2 apply_rules: cs = ");CS_write(stdout,cs); sig = irr2.sig;//keep a copy of this l = irr2.cs2.l;//and this //write("\n apply_rules: cs.p = ",cs.p); //write("\n2 apply_rules: j2 = ",j2); //cs.l is unchanged. cs is now the result of applying rule irr2 to previous cs ++ n_steps; nm += irr2.n_machine; if(count(cs_array,cs) > 0 || irr2.sig == 'C')d = 'C';//not allowed in safe code cs_array ~= cs.dup;//dup needed if(irr2.sig == 'H'){d = 'H';break;} if(d == 'C')break;//if this CS was found before, there is an endless stationary loop so go to next case. } if(n_steps == 1)j2 = l - 1; else{ if(sig == lsig) j2 = cs.l - 1; else j2 = cs.l;} } //********************************* main program ********************************************** void main(string[] args){ //1 Initial statement output and dialogues writeln("This is TIEv3.0_draft copyright (C) 2018 John Nixon.\nThis program comes with ABSOLUTELY NO WARRANTY;\nThis is free software, and you are welcome to redistribute it\nunder certain conditions; for more information see the source code file.\n"); if(args.length!=2){ writeln("Usage: "); writeln("For example from the same directory as the program:"); writeln(" ./ "); writeln("See the initial comments in the program source file for more information."); writeln("The information in the file of the Turing Machine must be presented in the following"); writeln("order with each element separated from the next by any number (including zero) of"); writeln("whitespace characters (tabs spaces newlines etc)."); writeln("to allow great flexibility of presentation:"); writeln("Number of TM states (nst), Number of TM symbols(nsy), then for each combination of these:"); writeln("Old state,symbol read,new state,symbol printed,direction of movement (L or R) or halt (H)."); writeln("The states are 1,2,... nst, and the symbols are a,b,c,..."); writeln("The lines of the TM machine table must be sorted first by old state increasing, and then by"); writeln("symbol read in alphabetical order a to z."); writeln("This last requirement simplifies the algorithm for searching for reverse TM steps"); writeln("For example\n3 \n2 \n1a2bR\n1b1bH\n2a3bR\n2b3aL\n3a1aL\n3b1aR\n"); writeln("For verification, the program's interpretation of the user specified TM is first shown."); return;} bool print_IRRs = yes("In the output file (a) do you want the IRR's printed?"); bool print_IGRs = yes("and (b) do you want the IGR's together with their contexts printed?"); int n_symbols; char alpha; char dir1, dir2;//direction for addition of new symbol(s) for the context strings dir1 will be used, for alpha //dir2 will be used. ( values will be 'L' or 'R' or possibly ' ' if is not needed). string inputfilename=args[1]; File infile = File(inputfilename,"r"); if(!exists(inputfilename)){writeln("I cannot find file ",inputfilename);return;} char[] outputfilename=can_write(inputfilename,"_out.txt"); of.open(outputfilename.idup,"w"); infile.readf(" %s",&nst);infile.readf(" %s",&nsy); int c=nsy*nst; char [] question; int q, n, opt; stdout.write("\nEnter the maximum length of derived computation rules: "); input_int(n); if(print_IGRs){ stdout.write("\nEnter 1 for showing all IRR's, 2 for showing only IRR's of type LL, or 3 for showing only IRR's of type RR: "); input_int(opt);} irrs.length = n; stdout.write("See the documentation in the program source file for the notation.\n"); of.write("See the documentation in the program source tie.d file for the notation.\n"); writeln("The Turing Machine table expressed as IRR's of length 1"); //2 Reading in the input Turing Machine file foreach(ref oc;0 .. c){ IRR r1 = new IRR;//"new" is needed (a class object) to avoid a segmentation fault with r1.dup CS[] cs0; CS cs1 = new CS;//new is needed here infile.readf(" %d",&cs1.s); cs1.l=1; cs1.t.length = 1; infile.readf(" %s",&cs1.t[0]); cs1.p=1; CS cs2 = new CS; infile.readf(" %d",&cs2.s); cs2.l=1; cs2.t.length = 1; infile.readf(" %s",&cs2.t[0]); infile.readf(" %s",&r1.sig); if(r1.sig == 'L')cs2.p = 0;else if(r1.sig == 'R')cs2.p = 2;else cs2.p = 1; r1.cs0 = cs0; r1.cs1 = cs1; r1.cs2 = cs2; r1.n_derivation = 1; r1.n_machine = 1; irrs[0] ~= r1;}//dup not needed because r1 goes out of scope next. //r1 is now out of scope so cannot be used to change irrs array_IRR_write(stdout,irrs[0]); stdout.write("\nThe numbers of irreducible regular (computation) rules (IRR's)"); of.write("\nThe numbers of irreducible regular (computation) rules (IRR's)"); stdout.write("\nIRR's derived from this Turing Machine for different tape lengths are as follows:"); of.write("\nIRR's derived from this Turing Machine for different tape lengths are as follows:\n"); stdout.writeln("\nSee the output file "~outputfilename~" for the list of IRR's obtained (if requested)."); int count; CS csi = new CS; //3 Generate the reversed Turing Machine CS[][CS] rtm;//reversed Turing Machine steps represented as an associative array. This structure is filled in next. //An effect of using the associative array is that each time it is constructed the //key-value pairs are usually in a different order! The storage is using a hash table, whose elements are stored in an unordered way. //This is a very sophisticated data structure designed for rapid retrieval of information not really needed here. //Do this first for the TM steps going left int index; foreach(ref st;1 .. nst+1)foreach(ref sy;0 .. nsy){ CS cs1 = new CS; cs1.s = st; cs1.p = 0; cs1.l = 1; cs1.t.length = 1; cs1.t[0] = to!char(sy+97); rtm[cs1] = [];//declares that there are no CS(1)'s leading to this (state,symbol) combination yet found. foreach(ref st1;1 .. nst+1)foreach(ref sy1;0 .. nsy){//search through TM table to find them all index=(st1 - 1) * nsy + sy1; //IRR_write(of,irrs[0][0]); //array_IRR_write(stdout,irrs[0]); if(irrs[0][index].cs2.p == 0 && irrs[0][index].cs2.s == st && irrs[0][index].cs2.t[0] == to!char(sy + 97)){//checks state and symbol and direction CS cs2 = new CS; cs2.s = irrs[0][index].cs1.s; cs2.p = 1; cs2.l = 1; cs2.t.length = 1; cs2.t[0] = irrs[0][index].cs1.t[0]; rtm[cs1] ~= cs2;//no dup needed because cs2 goes out of scope next. // write("cs1 = ");CS_write(stdout,cs1);write("cs2 = ");CS_write(stdout,cs2);writeln(""); }}} //then for the TM steps going right foreach(ref st;1 .. nst+1)foreach(ref sy;0 .. nsy){ CS cs1 = new CS; cs1.s = st; cs1.p = 2; cs1.l = 1; cs1.t.length = 1; cs1.t[0] = to!char(sy+97); rtm[cs1] = []; foreach(ref st1;1 .. nst+1)foreach(ref sy1;0 .. nsy){ index=(st1-1)*nsy+sy1; if(irrs[0][index].cs2.p==2 && irrs[0][index].cs2.s == st && irrs[0][index].cs2.t[0] == to!char(sy+97)){ CS cs2 = new CS; cs2.s = irrs[0][index].cs1.s; cs2.p = 1; cs2.l = 1; cs2.t.length = 1; cs2.t[0] = irrs[0][index].cs1.t[0]; rtm[cs1] ~= cs2;//no dup needed and it makes no difference! // write("cs1 = ");CS_write(stdout,cs1);write("cs2 = ");CS_write(stdout,cs2);writeln(""); }}} //4 Write out rtm the reversed Turing Machine writeln("The inverse Turing Machine"); foreach(element;rtm.byKeyValue){ CS_write(stdout,element.key);write(", > ");array_CS_write(stdout,element.value);writeln("");} //5 Generate the set IRR(2) the irreducible regular rules of length 2 using "apply rules" //add symbols at the pointer s.t. forward TM step takes the pointer to opposite end of tape (l=2). //alpha_n is the integer representation of the symbol called alpha in the papers: 0 for a ,1 for b etc. to give an irreducible rule //This (section 5) uses the fact that for this special case there is a simpler procedure than the general one below in section 6. //Here irr is the rule started with (just a TM table entry) and irr2 is the TM table entry used to progress the TM to the next step //after the next symbol has been added. foreach(irr;irrs[0]){ foreach(irr2;irrs[0]){ if(irr2.cs1.s != irr.cs2.s)continue;//ignore non-matching rows of the machine table if(irr2.sig == irr.sig)continue;//If pointer moves in same direction, no new IRR can result IRR irr3 = new IRR; CS cs3 = irr.cs2.dup; add_at_ptr_or_end(cs3,irr2.cs1.t[0],'R');//LHS of new element of IRR(2) irr3.cs1 = cs3.dup; CS cs4 = irr.cs1.dup; char d, dir, lsig; int n_steps, nm, j2; if(irr.cs2.p == 0) dir = 'L'; else dir = 'R'; add_at_ptr_or_end(cs4,irr2.cs1.t[0],dir);//origin of new element of IRR(2) irr3.cs0.length = 1; irr3.cs0[0] = cs4; apply_rules(cs3,n_steps,nm, d, lsig, j2); irr3.cs2 = cs3; if (irr3.cs1.p == 1) irr3.lsig = 'L'; else irr3.lsig = 'R'; irr3.sig = d; irr3.n_derivation = n_steps; irr3.n_machine = nm; irrs[1] ~= irr3;}} array_IRR_write(stdout,irrs[1]); igrs.length = n - 1; //6 Generate the set IRR(3) and IRR's of greater length using "new origins" and "apply rules" of.write("\nlength: ",2," # IRR's = ",irrs[1].length); write("\nlength: ",2," # IRR's = ",irrs[1].length); CS[] new_oris = new CS[0];//combined array of origins. char[] d_alpha; int n_derivation,n_machine; bool new_alpha; //carry out analysis for the required number of iterations i.e. foreach(ref i; 2 .. n){//given all irreducible rules (IRR's) of length i, find all extensions of them to length i+1. count = 0;//initialise count of the number of IRR's found of length i. foreach(ref r2; irrs[i - 1]){ if(!((r2.lsig == 'L' && r2.sig == 'R') || (r2.lsig == 'R' && r2.sig == 'L')))continue;//ignore IRR's that are not of type LR or RL IRR[] new_irrs = new IRR[0];//empty array of IRR's to be filled in NOOCO[] nooco_array; foreach(ref cs; r2.cs0){//generate nooco_array an array of nooco_objects containing all the origins for each member of the origin of r2. NOOCO nooco_object; st.L = []; st.right_at_start = cs.p == 1;//i.e. pointer is at left in origin (it is at right in the LHS). st_init();//initialise the Store st used by new_origins. new_origins(cs, nooco_object); nooco_object.cs0 = cs.dup; nooco_array ~= nooco_object;} c = 0;//counter for the number of new values of alpha d_alpha = [];//array of distinct values of alpha in the order they are found. foreach(ref x; nooco_array){ foreach(ref noto; x.nota){ new_alpha = true; for(int j = 0; j < c; ++j){ if(d_alpha[j] == noto.alpha){ new_irrs[j].cs0 ~= noto.new_ori;//adds new origin to the IRR derived using the same value of alpha. new_alpha = false; break;}} if(new_alpha){ IRR new_irr = new IRR; d_alpha ~= noto.alpha; new_irr.cs0 ~= noto.new_ori.dup; new_irr.cs1 = r2.cs1.dup; add_opposite_pointer(new_irr.cs1,[noto.alpha], '*');//which side are symbols added is determined by new_irr.cs1 new_irr.cs2 = r2.cs2.dup; add_at_ptr_or_end(new_irr.cs2, noto.alpha, dir2);//dir is not used because new_irr.cs2 has length >=2. apply_rules(new_irr.cs2,n_derivation,n_machine,new_irr.sig,new_irr.lsig,index);//last arg missing an int (out) if(new_irr.cs1.p == 1)new_irr.lsig = 'L';else new_irr.lsig = 'R'; new_irr.n_derivation = n_derivation + 1; new_irr.n_machine = n_machine + r2.n_machine; ++ count; ++ c; new_irrs ~= new_irr;}}} irrs[i] ~= new_irrs;} write("\nlength: ", i + 1," # IRR's = ",irrs[i].length); of.write("\nlength: ", i + 1," # IRR's = ",irrs[i].length); } if(print_IRRs){ if(opt == 1)of.write("All IRR's are shown\n"); if(opt == 2)of.write("Only IRR's of type LL are shown\n"); if(opt == 3)of.write("Only IRR's of type RR are shown\n"); foreach(ref i; 0 .. n){ if(irrs[i].length != 0){ if(i)of.writeln("IRR's of length ",i + 1,"(",irrs[i].length,")"); else of.writeln("The Turing Machine table expressed as IRR's of length 1"); sort!()(irrs[i]); foreach(ref it;irrs[i]){ if((opt == 1)||(opt == 2 && it.cs1.p == 1 && it.cs2.p == 0)||(opt == 3 && it.cs1.p == it.cs1.l && it.cs2.p == it.cs2.l + 1))IRR_write(of,it);of.writeln();} of.writeln();}}} //7 Generate the set IGR(2) from scratch char[][] d2_alpha; int n_steps, nm, j1, j2, r1 ,r2; char[] cont_l,cont_r; CS cs0,cs1,cs2,cs3,cs4,cs5,cs6; char lsig,sig; int[] d_j2; int_pair[] d_j_pair; bool new_j2; bool new_j_pair; bool next;//jump to next case because TM step takes machine in same direction so result cannot be an IRR or part of an IGR. bool new_result, first_new_result, new_igr, new_ns, nova; char d;//out variable for apply_rules IGR[] igr_array1; IGR[][] igr_array; igrs[0].length = 1;//index 0 is out of bounds because igrs has length 0 igrs[0][0].length = 3; //d2_alpha is here used for the array of distinct alpha values for each distinct value of the index j2 (= i2 in paper) foreach(llc1,ref r; irrs[0]){//for each single TM step d_j2 = [];//array of distinct values of j2 for the same r in the order they are found. d2_alpha = []; igr_array1 = []; igr_array1.length = 3; if(r.cs2.p > 1) {dir1 = 'L'; dir2 = 'R';} if(r.cs2.p == 0) {dir1 = 'R'; dir2 = 'L';} new_igr = false;//This indicates whether igr will actually be needed. If not igr goes out of scope without being used. foreach(ref sy; 0 .. nsy){ alpha = to!char(sy + 97); cs4 = r.cs1.dup;//reset these cs5 = r.cs2.dup;//values //add alpha to r add_at_ptr_or_end(cs4, alpha, dir2); add_at_ptr_or_end(cs5, alpha, dir2); apply_rules(cs5,n_derivation,n_machine,sig,lsig,j2);//"sig" for output CS needed? No but keep sig and lsig new_result = false; //conditions on the LHS and TM table entry (as an IRR) needed for the current value of alpha foreach(llc4,ref irr2;irrs[0]){//search for matching row(s) to find new origins if((irr2.cs2.s == cs4.s) && (alpha == irr2.cs2.t[0]) && ((cs4.p == 1 && irr2.cs2.p == 0) || (cs4.p == 2 && irr2.cs2.p == 2))){ cs6 = irr2.cs1.dup; add_at_ptr_or_end(cs6, r.cs1.t[0], dir1); //j1 is always 0 so this variable is not needed //All the results for this r, and possible reverse steps have been found i.e. alpha, cs6, cs5, j2. //These must now be assembled into the IGR's (one for each value of j2) corresponding to r. r1 = 1; r2 = min(j2 + 1,2);//on each side of IGR) new_j2 = true; for(int k = 0; k < d_j2.length; ++k){ if(j2 == d_j2[k]){//(d_j2[k] is the kth distinct value of j2 new_j2 = false; new_alpha = true;//this is assumed first to be a new value of (new)alpha. for(int j = 0; j < d2_alpha[k].length; ++j){ if(d2_alpha[k][j] == alpha){//compare the jth distinct value of alpha for the kth distinct value of j2 with y.alpha new_alpha = false; split_string(cs6, 2 - r1, cont_l);//# symbols to be removed unique_sort_add(of,0,igr_array1[j2].rhs[j].irrp_set.origins, cs6, nova);//adds new origin to the IRR derived using the same value of alpha. break;}} if(new_alpha){ RHE rhe_r = new RHE; IRRP_set irrp_set_r = new IRRP_set; d2_alpha[k] ~= alpha; irrp_set_r.alpha = alpha; split_string(cs6, 2 - r1, cont_l); irrp_set_r.origins = [cs6]; irrp_set_r.rhs = cs5; split_string(irrp_set_r.rhs, 2 - r2, cont_r); rhe_r.n_TM_step = n_machine; rhe_r.n_Sub_step = n_derivation; rhe_r.irrp_set = irrp_set_r; igr_array1[j2].rhs ~= rhe_r;} break;}} if(new_j2){ d_j2 ~= j2; RHE rhe_r = new RHE; IRRP_set irrp_set_r = new IRRP_set; IRRP_set irrp_set_l = new IRRP_set; d2_alpha ~= [alpha];//array of distinct values of the new alpha in the order they are found. irrp_set_r.alpha = alpha; irrp_set_r.origins = [cs6]; split_string(irrp_set_r.origins[0], 2 - r1, cont_l); irrp_set_r.rhs = cs5; split_string(irrp_set_r.rhs, 2 - r2, cont_r); rhe_r.irrp_set = irrp_set_r; rhe_r.n_TM_step = n_machine; rhe_r.n_Sub_step = n_derivation; cs3 = r.cs1.dup; split_string(cs3, 2 - r1, cont_l);//should happen once only for each new IGR irrp_set_l.alpha = '*';//This removes old alpha from the lhs of the IGR's and results in sorting them by //their left IRRP_sets i.e. by the rhs of the left IRRP_set, then by its origin. irrp_set_l.origins = [cs3]; if(r.sig == 'L') irrp_set_l.origins[0].p = 1; else irrp_set_l.origins[0].p = 0;//For IGR(2) the reverse search has 1 TM step //so the direction of r determines the direction of adding the context. This is given by igr.lhs.p being 1 or not. irrp_set_l.rhs = r.cs2.dup; split_string(irrp_set_l.rhs, 2 - r2, cont_r); IGR igr = new IGR; igr.ns = 2; igr.j1 = 0; igr.j2 = j2; igr.rhs = [rhe_r]; igr.lhs = irrp_set_l; igr.context ~= new Context_pair(cont_l,cont_r); igr_array1[j2] = igr;}}}} foreach(ref j2_1; 0 .. 3){ if(igr_array1[j2_1] !is null){igrs[0][0][j2_1] ~= igr_array1[j2_1]; }}} of.write("\nstarting the calculation of the IGR's of length greater than 2"); write("\nstarting the calculation of the IGR's of length greater than 2"); //8 Generate the sets IGR of ength greater than 2 foreach(ref i; 1 .. n - 1){ write("\ni = ",i); //This is the main loop that when complete generates igrs, the 4D array of IGR's that generate the IRR's up to //length n. In cycle i these arrays of IGR's are added to such that the new IGR's added generate the IRR(i + 2) from //the IRR(i + 1). For convenience I introduce the indices j1 and j2 in the paper so that the indices of igrs, i, j1, and j2 all //start from 0. //The IGR's generating the IRR(2) have already been found, these are in igrs[0][0][0], igrs[0][0][1], and igrs[0][0][2], where //the last index j2 indicates the rightmost position of the pointer relative to alpha in the computation of the RHS for the IRR(2) of type RL. //For the IRR(2) of type LR it is the leftmost position. //The first cycle (i = 1) will generate the IGR's for obtaining the IRR(3) from the IRR(2) and these will be put in igrs[1][j1][j2] // for j1 = 0, 0 <= j2 <= 3. //Cycle i starts with the set of arrays igrs[i - 1][j1][j2] for 0 <= j1 <= i - 2 and 0 <= j2 <= i + 1 //and generates new members for the larger set of arrays igrs[i][j1][j2] for 0 <= j1 <= i - 1 and 0 <= j2 <= i + 2. //The last cycle is for i = n - 2 generating IGR's to be added to the final set of arrays igrs[n - 2][j1][j2] //for 0 <= j1 <= n - 3 and 0 <= j2 <= n which describe the generation of IRR(n) from IRR(n - 1). //For each value of old alpha in the procedure called F, a set results (IRRP_sets) in the temporary array igr_array is produced, //one for each value of new alpha. As a result of the removal of extra symbols, some are reduced in length. //The last loop over the origins ensures that each new IGR generated is from a single origin in the RHS of the original IGR //i.e. it has a single origin on its LHS. The same is true for the IGR(2) because they are generated from a single TM step //that necessarily has only one origin. igrs[i].length = max(1, i); foreach(ref i1; 0 .. max(1, i)){ igrs[i][i1].length = i + 3;} int l1, l2; l1 = max(1,i - 1);l2 = i + 2; foreach(ref i1; 0 .. l1){ foreach(ref i2; 0 .. l2){ foreach(ref igr;igrs[i - 1][i1][i2]){ if(igr.lhs.origins[0].p == 1) dir1 = 'R';else dir1 = 'L'; foreach(lc4,ref cp; igr.context){ foreach(lc5,ref o_rhe;igr.rhs){//o is for old or original. This is a loop over old alpha. cs2 = o_rhe.irrp_set.rhs.dup; add_opposite_pointer(cs2,cp.s2,dir1);//cs2 is now the RHS of an IRRP_set with context added foreach(lc6,ref ori; o_rhe.irrp_set.origins){ d2_alpha.length = 0; cs0 = ori.dup; add_opposite_pointer(cs0,cp.s1,dir1); cs3 = cs0.dup;//are cs0 and cs3 both needed? if((cs3.p == 1 && cs2.p > 0) || (cs3.p > 1 && cs2.p == 0))continue;//check for the type of the IRRP. Only if it is LRL or RLR can F be applied. // $cs3\to\to cs2$ is now the LHS of a new IGR to be found (before chopping out the context strings. st.L = [];//#1: Two initialisations needed for each run of "new_origins" st.right_at_start = cs0.p == 1;//#2: i.e. pointer is at left in origin (it is at right in the LHS). st_init();//initialise the Store st used by new_origins. needed? NOOCO nooco_object; new_origins(cs0, nooco_object); nooco_object.cs0 = cs0.dup; d_j_pair = [];//(j1,j2) igr_array = [];//empty this array igr_array.length = i; foreach(ref l; 0 .. i)igr_array[l].length = i + 3; foreach(count2, ref y; nooco_object.nota){//new_origins_triplet_array cs4 = cs2.dup; if(cs2.p > cs2.l) dir2 = 'R'; if(cs2.p == 0) dir2 = 'L'; add_at_ptr_or_end(cs4, y.alpha, dir2); apply_rules(cs4,n_derivation,n_machine,sig,lsig,j2);//"sig" for output CS needed? No but keep sig and lsig j1 = y.j1; new_j_pair = true; if(j1 == 0)r1 = 1; else r1 = j1 + 2;//calculate the r1 and r2 parameters (min # symbols needed r2 = min(j2 + 1, i + 2);//on each side of IGR) foreach(ref k; 0 .. d_j_pair.length){//d_j_pair is the array of distinct (j1, j2) pairs if(j1 == d_j_pair[k].j1 && j2 == d_j_pair[k].j2){ new_j_pair = false; new_alpha = true;//this is assumed first to be a new value of (new)alpha. foreach(ref j; 0 .. d2_alpha[k].length){ if(d2_alpha[k][j] == y.alpha){ new_alpha = false; split_string(y.new_ori, i + 2 - r1, cont_l);//# symbols to be removed, so r1 symbols remain unique_sort_add(of,0,igr_array[j1][j2].rhs[j].irrp_set.origins, y.new_ori, nova);//adds new origin to the IRR derived using the same value of alpha. break;}} if(new_alpha){ RHE rhe_r = new RHE; IRRP_set irrp_set_r = new IRRP_set; d2_alpha[k] ~= y.alpha; irrp_set_r.alpha = y.alpha; split_string(y.new_ori, i + 2 - r1, cont_l); irrp_set_r.origins = [y.new_ori]; irrp_set_r.rhs = cs4; split_string(irrp_set_r.rhs, i + 2 - r2, cont_r); rhe_r.n_TM_step = n_machine; rhe_r.n_Sub_step = n_derivation; rhe_r.irrp_set = irrp_set_r; igr_array[j1][j2].rhs ~= rhe_r;} break;}} if(new_j_pair){ d_j_pair ~= int_pair(j1,j2); RHE rhe_r = new RHE; IRRP_set irrp_set_r = new IRRP_set; IRRP_set irrp_set_l = new IRRP_set; d2_alpha ~= [y.alpha];//array of distinct values of the new alpha in the order they are found. irrp_set_r.alpha = y.alpha; irrp_set_r.origins = [y.new_ori]; split_string(irrp_set_r.origins[0], i + 2 - r1, cont_l);//r1 symbols remain irrp_set_r.rhs = cs4; split_string(irrp_set_r.rhs, i + 2 - r2, cont_r);//r2 symbols remain rhe_r.irrp_set = irrp_set_r; rhe_r.n_TM_step = n_machine; rhe_r.n_Sub_step = n_derivation; cs1 = cs3.dup; split_string(cs1, i + 2 - r1, cont_l);//should happen once only for each new IGR irrp_set_l.alpha = '*';//This removes old alpha from the lhs of the IGR's and results in sorting them by //their left IRRP_sets i.e. by the rhs of the left IRRP_set, then by its origin. irrp_set_l.origins = [cs1]; irrp_set_l.rhs = cs2.dup; split_string(irrp_set_l.rhs, i + 2 - r2, cont_r); IGR igr1 = new IGR; igr1.ns = i + 2; igr1.j1 = j1; igr1.j2 = j2; igr1.rhs = [rhe_r]; igr1.lhs = irrp_set_l; igr1.context ~= new Context_pair(cont_l,cont_r); igr_array[j1][j2] = igr1; }} //sort each element of igr_array individually into z by IRRP_sets in their rhs's using alpha. //then add z into main arrays igrs maintaining sort order. foreach(ref x; igr_array){ if(x !is null){ foreach(ref y; x){ if(y !is null){ IGR z = new IGR; z.ns = y.ns; z.j1 = y.j1; z.j2 = y.j2; z.lhs = y.lhs; z.context = y.context; foreach(ref q1; y.rhs){ unique_sort_add(of,0,z.rhs, q1, nova);} // unique_sort_add(igrs[i][j1][j2], z); igrs[i][j1][j2] ~= z;}}}}} }//ends loop over o_irrp_set or equivalently old alpha }//ends loop over cp (context pairs) of input IGR }//ends loop over igr of a set of IGR's. }//ends loop over i2 }//ends loop over i1 }//ends loop over i int[] freq;//freq[i] is the frequency of IGR's having total length i + 2 freq.length = n - 1; foreach(i,ref x; igrs){ if(print_IGRs){ of.write("\n\naugmented (context included) IGR's of length: ",i + 2);} ++ freq[i]; foreach(i1,ref y; x){ foreach(i2,ref t; y){ foreach(ref w; t){ if(print_IGRs){of.writeln(); IGR_write(of,w);}}}}} of.write("\n"); of.writeln("\nThe set of distinct IGR's (with the context deleted) needed to obtain\nthe IRR's of length 2, then the extra ones needed to obtain IRR's of each length n"); int[] fr;//number of new IGR's needed for each value of i fr.length = n - 1; IGR[] igrs_nc = new IGR[0];//These are IGR's that are consolidated by deleting the context foreach(i,ref x; igrs){ of.write("\nn = ",i + 2); foreach(i1,ref y; x){ foreach(i2,ref z; y){ foreach(ref w; z){ w.context.length = 0;//deletes context w.ns = 0;//this value no longer has a meaning, being dependent on context. foreach(ref rhe; w.rhs){//these values depend on the context, so have no meaning in its absence rhe.n_TM_step = 0; rhe.n_Sub_step = 0;} unique_sort_add(of,1,igrs_nc, w, nova); if(nova) {++ fr[i];of.write("\n");IGR_write(of,w);}}}}} of.write("\n\nThe number of distinct extra IGR's needed to obtain the IRR's of length 2, 3, ... : "); foreach(ref i; 0 .. n - 1)of.write(fr[i], " "); //Construct the relation "can be followed by" based on the set igrs_nc, the set of IGR's with no context. This is //represented by the 2D bool array cbf. From this, strongly connected parts giving cycles can be identified to get //formulae for IRR's obtained iteratively. bool [][] cbf = new bool[][0]; bool x_lg,y_lg;//x resp. y are left_going bool cbfb; int s; cbf.length = igrs_nc.length; of.write("\n\nThe IGR's listed in sorted order and numbered"); foreach(i,ref x; igrs_nc){ of.write("\n",i+1," ");IGR_write(of,x);} of.write("\n\nThe boolean matrix 'can be followed by' which refers to the numbered IGR's above"); of.write("\nEach row corresponds to the same numbered IGR above."); s = 0; of.write("\n\n "); foreach(i; 0 .. cbf.length){ if(s == 10){s = 0;of.write(" ");} ++ s; of.write((i+1)/10);} s = 0; of.write("\n "); foreach(i;0 .. cbf.length){ if(s == 10){s = 0;of.write(" ");} ++ s; of.write((i+1)%10);} of.write("\n"); foreach(i,ref x1; igrs_nc){ cbf[i].length = igrs_nc.length; foreach(w; x1.rhs){ foreach(k,z; w.irrp_set.origins){ ++ cbf.length; if(i<9)of.write("\n ",i+1);else of.write("\n",i+1); of.write(w.irrp_set.alpha);of.write(k+1," "); s = 0; foreach(j,ref x2; igrs_nc){ if(s == 10){s = 0;of.write(" ");} ++ s; if (!can_follow(z,w.irrp_set.rhs,x2.lhs.origins[0],x2.lhs.rhs, x1.lhs.origins[0], x2.lhs.origins[0])) of.write("0");else of.write("1");}}}} // change can_follow to depend only on the 4 CS's in IGR x and 2 for IGR y. of.write("\n"); //What follows is the same information in the format that can be used by the program Tarjan_v1_0 that extracts the //strongly connected subsets from a directed graph. foreach(i,ref x1; igrs_nc){ foreach(w1; x1.rhs){ foreach(k1,z1; w1.irrp_set.origins){ of.write(i + 1,w1.irrp_set.alpha,k1 + 1," > "); count = 0; foreach(j,ref x2; igrs_nc){ foreach(w2; x2.rhs){ foreach(k2,z2; w2.irrp_set.origins){ if (can_follow(z1, w1.irrp_set.rhs, x2.lhs.origins[0], x2.lhs.rhs, x1.lhs.origins[0], x2.lhs.origins[0])) ++ count;}}} of.write(count," "); foreach(j,ref x2; igrs_nc){ foreach(w2; x2.rhs){ foreach(k2,z2; w2.irrp_set.origins){ if (can_follow(z1, w1.irrp_set.rhs, x2.lhs.origins[0], x2.lhs.rhs, x1.lhs.origins[0], x2.lhs.origins[0])){ of.write(j + 1,w2.irrp_set.alpha,k2 + 1," & ");}}}} of.write("\n"); }}} }//ends program