/*2021-12-26 Not much changed since 2021-12-21. Examining output shows that this needs to be generalised to
allow for the context strings T1 and T2 to have different lengths.
The dialogues have changed and I've done a lot of tidyng up, i.e. deleting out of date
comments and unwanted commented out code.
This is the computer program TIE (Turing machine Inference Engine) version 3.0 draft. It employs
the strategies described in the latest draft manusript of the Turing Machine analysis paper to be found
on www.bluesky-home.co.uk
It generates IRR's and IGR's (see references and latest draft paper) from a Turing Machine. This works for
non-terminating TM's, and it has not all been checked for TM's that halt. TM's that don't halt are a simpler concept
that it makes sense to analyse first, and the analysis of halting TM's could be obviously based on this.
This program must be compiled before it can be used. It was written in D and
originally had extension .d, but I prefer to keep it on the web as a .txt file and rename it before compilation. To compile
it go to dlang.org and download one of the compilers (I use dmd)
then do for example the following in a terminal program that takes text commands:
$mv tie_v2_0.txt tie.d
$dmd tie.d
and run it with the command that includes the file name of the TM to be analysed.
$./tie TM_file_name.txt
An alternative if the user is familiar with the dialogue is to ignore the screen output
and redirect the output to any file eg
$./tie TM_file_name > out.txt, and then put the answers to the dialogues.
The IRR are essentially the non-redundant set of computational short-cuts that
can be derived by running the Turing Machine(TM) on a finite length of tape.
Every computation of the TM as a sequence of single TM steps can be replaced
by an equivalent shorter sequence of applications of the short-cut rules,
(though the set of such rules may be and usually is infinite). Any computation with a TM
expressed in terms of its IRR is reduced in terms of the number of computation steps needed.
By correctly guessing the iterative formulae or method for obtaining the IRR for a particular TM
with the aid of the list of IRR generated by this program,
and verifying them by an induction argument, it is possible to obtain long-
term information on the behaviour of the TM that is not obvious initially, and
may shed light on the problem which the TM was designed to solve. Because TM's
can be set up to solve any problem that can be solved by an algorithm, the
possibilities of application of this method seem limitless.
The notation for each irreducible regular rule (IRR), including the original
Turing Machine (TM) table, is in the following order: two configuration sets (CS)'s
each defined by a quadruple (state,pointer position,length,tape symbols) representing
the input and output respectively of the rule, followed by the status of the rule
(L left-moving,R right-moving,H halt), and finally the numbers of steps involved in its
derivation (number of derivation steps from rules of length n-1, number of TM steps).
The lines of the TM machine table must be sorted first by old state increasing, and
then by symbol read in alphabetical order a to z.
The layout of this file is as follows:
(1)Copyright notice and literature references
(2)History of different versions what they do including the current one, in reverse order.
(3) The program code itself which is divided as follows
(3.1) Global variables and library imports
(3.2) Class definitions
(3.3) Write functions for the classes
(3.4) Other functions
(3.5) The main function which is where the program execution begins.
Copyright (C) 2018 John Nixon
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
A copy of the GNU General Public License can be obtained from
.
Mathematica Aeterna, (2013) Vol. 3, no. 9, 709 - 738 (Paper 1)
a second paper continuing this study has now been published
Mathematica Aeterna, (2017) Vol. 7, no. 1, 19 - 56 (Paper 2)
Program modifications
New in version 3.0:
The latest draft of the paper describes the construction and use of IGR's (IRR generating rules). These are rules that
describe the generation of new IRR's from old ones using the general procedure called F in the paper. This version of the
program generates these IGR's up to a given length. The motivation for doing this is that a small number of IGR's could be
combined to generate a large indeed infinite number of IRR's. It is unknown at this stage under what conditions on a TM
is the number of IGR's for that TM finite.
New in version 2.1:
Foreach loops over arrays of class objects do not need ref to avoid unnecessary copying.
Structs have been replaced by classes. This entailed recoding in connection with
their creation, and creating new write functions which are not defined by default for classes. I designed
them to exactly mimic the output from the default write functions for structs to aid in program testing, but
which could now be modifed if needed.
I did this to understand the difference between structs and classes, and to attempt to write the program
more efficiently. I have simplified the code a little and eliminated some unnecessary copying, but
probably not much. I was motivated by the elegant way in which I could implement Tarjan's algorithm with a class that was much more awkward with a struct.
I have almost always used ref parameters for functions to ensure 2-way passing of information though this is sometimes unnecessary. I think this should be the default with the use of const where possible to
aid in undertanding the program.
New in version 2.0:
The procedure for generating the IRR has been changed to include generating the origins of the IRR i.e. the IRR are now
represented as triplets. This gives a more informative result providing the proof of the
reachability of the LHS's of the IRR and the algorithm has been changed as a result and
is now much easier to describe. It also runs about twice as fast as version 1.3 which is partly due to avoiding the
unnecessary copying of IRR through the default method D has for copying objects (pointer
copies) instead of physical copies obtained with .dup. I had concerns about doing this at
first until I realised that it is quite easy to check the program for the absence of "back doors"
i.e. other variables that point to the same memory that could be used to change such a variable. There
will often be such other variables initially and it must be checked that they go out of scope without modifying
the variable of interest. This mostly deals with my earlier comment about .dup.
New in version 1.3: the IRR are printed with the additional "L" or "R" giving the
pointer position on the LHS of the IRR. This is just a convenience because the pointer
position is included as before, but the 'type' of the rule e.g. LL, LR, RL or RR can now be
read off quickly after the second CS e.g. to determine if the rule has type RL or LR which allows
it to be extended by an additional symbol and not be reducible.
New in version 1.2:
(1) The program has been translated from C++ into D (dlang.org).
[D is much easier to use than C++, has bounds checking making anything like Valgrind
unnecessary as far as I can see, has many more features and has similar syntax.
This otherwise excellent language does unfortunately require the programmer to set up
and use functions called "dup" that duplicate things that I think should be done
automatically in assignments and copying objects into arrays etc. If the programmer
forgets to do this correctly a new pointer will be assigned to the object instead
of making a new object copy, so if the object is subsequently changed, errors could
arise in code that appears naively to be correct.]
(2) The symbols are now in the alphabet {a,b,c,...z} and the states are numbered 1,2,3,...
allowing for Turing Machines with many more states and symbols to be analysed.
This also removes ambiguity in symbolic expressions where exponents are used and is in accord
with the notation in Paper 2.
(3) A much improved format for reading Turing machines.
(4) There was a modification to the way halting is handled: halting is now considered
to be a property of a TM instruction rather than a TM state. As such, the TM can either move right (R)
left(L) by one square, or halt (H) at each TM step. This avoids needing a distinguished state, but it means that
a configuration set (CS) reached as a result of a halt must specify this explicitly e.g. by putting (H) after it.
In any case my interest is mainly in describing non-terminating computations so I probably won't
make much use of this in future.
The original version 1.1 was based on the translation into D of TIEv1.0 (written in C++). It generates the set of all
irreducible regular rules (IRR) for any Turing Machine (TM) up to any specified
length, written by John Nixon (john.h.nixon1@gmail.com formerly john.h.nixon@spamcop.net).
Comments are welcome.
In the terminology of paper 1, "rule" is to be read as
"regular rule" i.e. RR. Likewise "irreducible_rule" and "reducible_rule" are
to be read as IRR and RRR respectively as defined in paper 1.
Comments on D:
Function parameters: "ref" is sometimes needed to ensure proper functioning of functions using 2-way information flow
even if the arguments are ref types, so use it always for 2-way flow.
I use "out" for output-only.
I don't use "in" because it is confusing because it is unfortunately almost synonymous with "const", not just copying
input variables which is what the semantics suggests. It should be for input-only.
I use const instead.
*/
import std.stdio;
import std.algorithm;
import std.string;
import std.file;
import std.conv;
import std.typecons;
//Global variables:
int nsy;//# symbols
int nst;//# machine states
IRR [][] irrs = new IRR[][0];//The set of IRR for the TM, used by main and apply_rules
IGR [][] igrs = new IGR[][0];//igrs[][i] is the set of IGR's that create the set IRR(i+2) from the set IRR(i+1) i.e. the IGR(i+2), so for example
//igrs[][1] is the set of IGR's that generate the IRR(3) i.e. the IGR(3) etc..
//************************************************ Store **********************************************************
//The recursive function "new_origins" was probably the most difficult part of the program to write and the most important.
//It generates all the new endpoints of the reverse computation from a given CS as a result of adding a single extra symbol
//to the finite part of the TM machine tape considered. new_origins uses and can reuse memory space that does not have to be declared
//for each call and is conveniently packaged into one structure. The Type name is Store, and the instance is called st.
alias Store = Tuple!(bool, "inside", bool, "dircheck", CS[], "L", bool, "cycle",bool, "right_at_end", bool, "left_at_end", bool, "cond_1", int, "j", int, "ind", int, "n_symbols", CS[], "one_step_ori", bool, "right_at_start");
Store st;
void Store_write(File of,const ref Store st){
write("\n\ninside = ",st.inside);
write("\ndircheck = ",st.dircheck);
write("\nL = ");array_CS_write(stdout,st.L);
write("\ncycle = ",st.cycle);
write("\nright_at_end = ",st.right_at_end);
write("\nleft_at_end = ",st.left_at_end);
write("\ncond_1 = ",st.cond_1);
write("\nj = ",st.j);
write("\nind = ",st.ind);
write("\nn_symbols = ",st.n_symbols);
write("\none_step_ori = ");array_CS_write(stdout,st.one_step_ori);
writeln("\nright_at_start = ",st.right_at_start);}
alias NOT = Tuple!(CS, "new_ori", int, "n_symb", char, "alpha");//new origins triplet
alias NOOCO = Tuple!(CS, "cs0", NOT[], "nota");//new origins one cycle output
//************************** class CS ************************************************
//This class represents a configuration set(CS) as defined in the paper.
//I will use the convention that if p>0, symbols will be added to the left
//of the pointer, and if p=0, symbols will be added to the right.
//This distinguishes the two cases when l=0 because then p=1 and 0 respectively.
class CS{
int s = 0;//state
int p = 0;//pointer position
int l = 0;//length of tape given
char [] t = [];//part of tape of specified length
@safe final this(const int s,const int p,const int l,const char[] t){
this.s = s;
this.p = p;
this.l = l;
this.t = t.dup;}
@safe final this(){
}//needed for calling 'new' without an argument to create a default class object
@safe final CS dup() const {return new CS(s,p,l,t);}
//Order relation for sorting CS's: Compare first by pointer positions. If
//these are the same compare by machine states, if these are equal
//and the pointer is at the left ( <= 1), compare by symbol strings on the "tape", and if the pointer
//is at the right ( >= l), compare by the reversed symbol strings on the "tape". This is such that the most
//significant end of the tape is the end closest to the pointer. Then truncation at the other end does not alter the sort order.
final int opCmp(ref const CS rhs) const{
if (p != rhs.p) return p - rhs.p;
else if(s != rhs.s)return s - rhs.s;
else if (p <= 1){
if (t != rhs.t) {
return t < rhs.t ? -1 : 1;} else return 0;}
else if (p >= l){
char [] str1 = t.dup;
char [] str2 = rhs.t.dup;
if(str1 != str2) {
return reverse(str1) < reverse(str2)? -1 : 1;}else return 0;}
else {writeln("Error ocurred in opCmp"); return 0; }}
//This is needed though it seems superfluous! needed to equality test class objects by their values
@trusted override bool opEquals(const Object o) const {
auto rhs = cast(CS)o;//not allowed in safe code
if(s != rhs.s) return false;
else if(p != rhs.p) return false;
else if(l != rhs.l) return false;
else if(t != rhs.t) return false;
else return true;}
}
//******************************** class IRR **************************************
//relation between two CS's cs1 and cs2 defined by the action of the TM. In the paper
//special types of rule were defined: regular rules i.e. rules defined by a simple
//recursive definition, and the subset of these that are non-redundant i.e. derived
//in more than one step. These the the irreducible regular rules (IRR). Later the
//array of CS's cs0 was added to the definition. These are all the CS's that lead by
//one or more TM steps to cs1.
class IRR{
CS[] cs0 = [];
CS cs1;
CS cs2;
char lsig = ' ';//pointer position in cs1 i.e. LHS: 'L' for 1 and 'R' for cs1.p > 1, blank if cs1.l=1
char sig = ' ';//L,R,H, or C i.e. left moving, right moving, halting, or cycling resp.
//See the paper for details.
int n_derivation = 0;//# derivation steps from cs1 to cs2.
int n_machine = 0;//# machine steps from cs1 to cs2.
@safe final this(ref CS[] cs0,ref CS cs1,ref CS cs2, const char lsig, const char sig, const int n_derivation,const int n_machine){
this.cs0 = cs0.dup;
this.cs1 = cs1.dup;
this.cs2 = cs2.dup;
this.lsig = lsig;
this.sig = sig;
this.n_derivation = n_derivation;
this.n_machine = n_machine;}
@safe final this(){ }
@safe final IRR dup(){ return new IRR(cs0, cs1, cs2, lsig, sig, n_derivation, n_machine);}
//Compare IRR's by their LHS CS's (cs1).
final int opCmp(const ref IRR rhs){
if(cs1 != rhs.cs1) return cs1 < rhs.cs1? -1: 1; else return 0;}
}
//************************************** class IRRP_set **************************************
//This class is for sets IRR patterns (IRRP) and the alpha value that generates them from another IRRP. see in my paper (2021 version).
//The RHS is unique but there can be many origins.
//An IRRP_set is a set of IRR's that have a common set of origins and rhs that have
//arbitrary context strings $T_1$ and $T_2$ respectively. These are just place-holders and will not be represented in the class
//because the information is coveyed by the class name itself.
//This is for convenience when using apply_rules to generate RHS's.
//The class IRRP_set is part of the class IGR below.
class IRRP_set{
char alpha = '*';//default value to permit sorting
CS[] origins = [];
CS rhs;
final this(const char alpha, CS[] origins, const ref CS rhs){
this.origins = origins.dup;
this.rhs = rhs.dup;
this.alpha = alpha;}
@safe final this(){}
final int opCmp(ref const IRRP_set irrps) const{//<,>,== should be exhaustive
if(alpha < irrps.alpha) return -1;
if(alpha > irrps.alpha) return 1;
if(rhs < irrps.rhs) return -1;
if(rhs > irrps.rhs) return 1;
if(origins[0] < irrps.origins[0]) return -1;
if(origins[0] > irrps.origins[0]) return 1;
return 0;}
final IRRP_set dup() { return new IRRP_set(alpha,origins,rhs);}
override bool opEquals(const Object o) const{//needed to compare class objects by their values
auto irrps = cast(IRRP_set)o;
if(alpha != irrps.alpha) return false;
if(origins != irrps.origins) return false;
if(rhs != irrps.rhs) return false;
return true;}
}
//****************************************** class RHE *****************************************
//right hand element. An array of these form the rhs of an IGR.
class RHE{
IRRP_set irrp_set;
int n_TM_step = 0;
//n_TM_step is the number of TM steps from the original rhs to the final rhs on the right of the IGR that contains the RHE.
int n_Sub_step = 0;
//n_Sub_step is the number of substitution steps involved in this.
@safe final this(){}
final this(ref IRRP_set irrp_set, const int n_TM_step, const int n_Sub_step){
this.irrp_set = irrp_set.dup;
this.n_TM_step = n_TM_step;
this.n_Sub_step = n_Sub_step;}
override bool opEquals(const Object o) const {
auto rhs = cast(RHE)o;//not allowed in safe code
if(n_TM_step != rhs.n_TM_step) return false;
if(n_Sub_step != rhs.n_Sub_step) return false;
if(irrp_set != rhs.irrp_set) return false;
return true;}
final RHE dup(){ return new RHE(irrp_set, n_TM_step, n_Sub_step);}
final int opCmp(ref const RHE rhe){
return irrp_set.opCmp(rhe.irrp_set);}
}
//****************************************** class Context_pair *********************************
//This is for any pair of strings. It is used only as a context for an IGR to be added on the left and right respectively of the implication.
//
class Context_pair{
char [] s1 = [];//left
char [] s2 = [];//right
final this(const ref char [] s1, const ref char [] s2){
this.s1 = s1.dup;
this.s2 = s2.dup;}
override bool opEquals(const Object o) const {
auto rhs = cast(Context_pair)o;//not allowed in safe code
if(s1 != rhs.s1) return false;
if(s2 != rhs.s2) return false;
return true;}
final this(){}
final Context_pair dup(){ return new Context_pair(s1,s2);}
}
//***************************************** class IGR **************************************
//This class is for sets of IGR's with a common lhs IRRP and with a set of pairs of strings (contexts). Each instance of
//this class will correspond to a row of Table 3 or Table 4. see my paper (2021 version).
class IGR{
int ns;//minimum number of symbols on the "tape" needed to represent the IGR i.e. the length of the symbol strings.
Context_pair[] context = [];//this corresponds to a set of $T_1$ and $T_2$ pairs in the paper.
//The context is the same for each value of alpha.
IRRP_set lhs;//This IRRP_set corresponds to a set of IRR's to which when F is applied generates the set
//of IRRP_sets (combined with two ints into RHE's) on the rhs, one for each possible value of alpha (the symbol added).
RHE[] rhs = [];
final this(ref Context_pair[] context, ref IRRP_set lhs, ref RHE[] rhs){
this.context = context.dup;
this.lhs = lhs.dup;
this.rhs = rhs.dup;}
final int opCmp(ref const IGR igr) const{
//compare IGR's by their lhs's. This is just based on logic, the set of rhs's follows from the lhs.
return lhs.opCmp(igr.lhs);}
//This is needed.
override bool opEquals(const Object o) const {
auto rhs1 = cast(IGR)o;//not allowed in safe code
if(ns != rhs1.ns) return false;
if(context != rhs1.context) return false;
if(lhs != rhs1.lhs) return false;
if(rhs != rhs1.rhs) return false;
return true;}
final this(){}
final IGR dup() { return new IGR(context,lhs,rhs);}
}
//*********************** Several write functions ****************************
//A "write" function is not supplied by default for classes, but is for structs. I wrote
//CS_write, array_CS_write, IRR_write, array_IRR_write to mimic the results of the default struct write functions to compare
//the program outputs with classes and structs (for CS and IRR) and they agreed and other write functions were added later
//in the similar way.
void CS_write(File of,const ref CS cs){
if(cs !is null){
of.write("CS(");
of.write(cs.s);
of.write(", ");
of.write(cs.p);
of.write(", ");
of.write(cs.l);
of.write(", ");
of.write('"');
of.write(cs.t);
of.write('"');
of.write(")");}
else of.write("null CS");}
void Context_pair_write(File of, const ref Context_pair sp){
if(sp !is null){
of.write("(");
of.write(sp.s1);
of.write(",");
of.write(sp.s2);
of.write(")");}
else of.write("null Context_pair");}
void IRR_write(File of,const ref IRR r){
if(r !is null){
of.write("IRR(");
of.write("[");
bool first=true;
foreach(ref x;r.cs0){if(!first)of.write(", ");
CS_write(of,x);first = false;}
of.write("]");
of.write(", ");
CS_write(of,r.cs1);
of.write(", ");
CS_write(of,r.cs2);
of.write(", '");
of.write(r.lsig);
of.write("', '");
of.write(r.sig);
of.write("', ");
of.write(r.n_derivation);
of.write(", ");
of.write(r.n_machine);
of.write(")");}
else of.write("null IRR");}
void array_CS_write(File of,const ref CS[] array){
if(array !is null){
of.write("[");
foreach(ref x;array){
of.write(" ");CS_write(of,x);}
of.write(" ]");}
else of.write("null CS[]");}
void context_write(File of,const ref Context_pair[] array){
if(array !is null){
of.write("context[");
foreach(ref x;array){
of.write(" ");Context_pair_write(of,x);}
of.write(" ]");}
else of.write("null context");}
void array_IRR_write(File of,const ref IRR[] array){
if(array !is null){
bool first = true;
of.write(array.length);
foreach(ref x;array){
of.write(" ");if(!first)of.write("\n");IRR_write(of,x);first = false;}}
else of.write("null IRR[]");}
void IRRP_set_write(File of, const ref IRRP_set irrp_set){
if(irrp_set !is null){
of.write("IRRP_set(");
of.write(irrp_set.alpha);
of.write(", ");
of.array_CS_write(irrp_set.origins);
of.write(", ");
of.CS_write(irrp_set.rhs);
of.write(")");
assert(irrp_set.rhs.l == irrp_set.origins[0].l);}
else of.write("null IRRP_set");}
void RHE_write(File of, const ref RHE rhe){
if(rhe !is null){
of.write("RHE(");
of.IRRP_set_write(rhe.irrp_set);
of.write(", ");
of.write(rhe.n_Sub_step);
of.write(", ");
of.write(rhe.n_TM_step);
of.write(")");}
else of.write("null RHE");
}
void array_RHE_write(File of, const ref RHE[] array){
if(array !is null){
of.write("[ ");
bool first = true;
foreach(ref x; array){
if(!first)of.write("\n");
RHE_write(of,x);
first = false;}
of.write(" ]");}
else of.write("null RHE[]");}
void array_IRRP_set_write(File of,const ref IRRP_set[] array){
if(array !is null){
bool first = true;
of.write("[ ");
foreach(ref x;array){
if(!first)of.write("\n");IRRP_set_write(of,x);first = false;}
of.write(" ]");}
else of.write("null IRRP_set[]");}
void IGR_write(File of, const ref IGR igr){
if(igr !is null){
of.write("IGR(");
of.write(igr.ns);
of.write(", ");
of.context_write(igr.context);
of.write(", ");
of.IRRP_set_write(igr.lhs);
of.write(", ");
of.array_RHE_write(igr.rhs);
of.write(")");}
else of.write("null IGR");}
void array_IGR_write(File of,const ref IGR[] array){
if(array !is null){
bool first = true;
of.write(array.length);
foreach(ref x;array){
of.write(" ");if(!first)of.write("\n");IGR_write(of,x);first = false;}
of.writeln();}
else of.write("null IGR[]");}
void NOOCO_write(File of, const ref NOOCO nooco){
write("\n nooco.cs0 = ");CS_write(stdout,nooco.cs0);
foreach(y; nooco.nota){//new origins triplet array
write("\n y.new_ori = ");CS_write(stdout,y.new_ori);
write("\n y.n_symb = ",y.n_symb);
write("\n y.alpha = ",y.alpha);}}
//********************************** function can_write **************************************
//This function finds a filename to be written to for output and returns it. It ensures
//that no file is overwritten without the consent of the user. "base_name" is usually the input
//datafile name and "extension" is the default file extension for output (including the dot)
@trusted char[] can_write(const char[] base_name,const char[] extension){
char[] filename = (base_name ~ extension).dup;
char[] chars;
char[] question;
bool write1,fn,overwrite=false;
do{
fn = exists(filename);
if(fn){
writeln("The file ",filename," already exists.");
question="Do you want me to overwrite it?".dup;
overwrite=yes(question);
if(!overwrite){
writeln("The new path and file name is ",base_name,"",extension);
std.stdio.write("Enter the extra characters: ");
readln(chars);chars=chomp(chars);//not allowed in safe code
filename=base_name ~ chars ~ extension;}}
write1=(!fn) || overwrite;}
while (!write1);
writeln("Output will be to the file ",filename);
return filename;}
//************************************** function yes *****************************************
//This function asks the user the question by dispaying the text 'question' and waits for
//the response y (for yes) or n (for no)
@trusted bool yes(const char[] question){
char ans='*';
bool ret;
char[] res;
while(ans!='y' && ans!='n'){
std.stdio.write(question,"(y/n): ");
res=chomp(readln().dup);//not allowed in safe code
ans=res[0];
if(res.length==1){
if(ans=='y')ret = true;
else if(ans=='n') ret = false;}
else{ans='*';}}
return ret;
}
//*********************************** function input_int ***************************************
//The purpose of this function is to extract an int from the input stream and handle
//the error that occurs when a non-number is entered, and allow the user to enter it again.
@trusted void input_int(ref int i){//ref is needed here
bool error;
char[] line;
for(;;){
line = chomp(readln().dup);//not allowed in safe code
error = false;
try{i = to!int(line);} catch (Exception){error=true;}
if(error == false)break;
std.stdio.write("An error occurred. Please try again: ");}
}
//***************************************** template function unique_sort_add ******************************************
//add element x to array in such a way as to maintain increasing order, and such that no elements are repeated
//in the array.
//Algorithm. Bisect the array segment repeatedly so as to locate between which pair of
//elements x is to be inserted to maintain inreasing order. When the pair of elements differ by 1 in index
//x can be inserted between them provided x is different from each of this pair.
void unique_sort_add(T)(ref T[] array, ref T x){
int i;
int l = to!int(array.length);
if(l == 0) array ~= x;//array is empty
else{
//deal with the end cases next
if(x.opCmp(array[0]) < 0) array = [x] ~ array;//x is smaller than all array elements
else if(x.opCmp(array[l - 1]) > 0) array ~= x;//is greater than all array elements
else{//l != 0 and x is not to be inserted at an end.
//i1 and i2 are the indices which bracket the position where x is to be placed.
//and are at the lower and upper end of the range respectively.
int i3,i1 = 0;
int i2 = l - 1;
while(i2 - i1 > 1){
i3 = i1 + (i2 - i1)/2;
if(x.opCmp(array[i3]) < 0) i2 = i3;
else i1 = i3;}
if(array[i1] != x && x != array[i2]){//x should be included
array = array[0 .. i1 + 1] ~ x ~ array[i2 .. $];}
else{
//x is not different from all the elements of array, so array is not altered.
}}}}
//******************************** function split_string ***************************
//Remove n_symbols from the string of a CS on the left or right opposite the pointer and put them in cont.
//The pointer is at or one space beyond the end of the string when called, and the length of the symbols strings is >= 2.
void split_string(ref CS cs,const int n_symbols, out char[] cont){
if(cs.t.length != cs.l){write("cs.l != cs.t.length");}
int i;
CS cs1 = cs.dup;
bool right = cs.p - cs.l == 0 || cs.p - cs.l == 1;//pointer is at right
cs.l -= n_symbols;
if(right){
cs.p -= n_symbols;
i = to!int(cs.t.length) - n_symbols;
rotate(cs.t,i);
cont = cs1.t[0 .. n_symbols];}
else cont = cs1.t[$ - n_symbols .. $].dup;
cs.t.length -= n_symbols;}
//******************************** function add_opposite_pointer *******************
//Add symbol string (not inplemented as a D string!) s at the opposite end of the string of symbols from the pointer. formerly add_at_x
//The routine assumes that the pointer is at one end of the known
//symbols on the tape (other cases will not be needed).
//This assumes that the length of the original symbol string in cs is > 1, and the length of s is >= 1.
@safe void add_opposite_pointer(ref CS cs,const char [] s){//symbols are added at right. Pointer is at left. ( i.e. cs.p=1)
int len = to!int(s.length);
cs.l += len;
cs.t ~= s;
if(cs.p > 1){//the symbol is added at the left, pointer is at right (cs.p=cs.l)
cs.p += len;
rotate(cs.t,len);}}
//********************************* function add_at_ptr_or_end ****************************
//Add symbol sy to cs (1) at or (2) adjacent to the pointer position at end of symbol string.
//The routine assumes that the pointer is at one end (case 2) or just off one end (case 1) of the known
//symbols on the tape. The added symbol never clobbers previous known symbols on the tape.
//This assumes that the length of the symbol string is > 1 for case (2). If l = 1 and case (1) then
//the other parameters cannot tell whether the symbol sy should be added at the left or right. In this case
//this is determined by the parameter direction (L or R) respectively. 'R' is the default value.
@safe void add_at_ptr_or_end(ref CS cs,const char sy, const char direction){
cs.t ~= sy;
if(cs.p <= 1 && cs.l > 1 || cs.p == 0 && cs.l == 1 || (cs.p == 1 && cs.l == 1 && direction == 'L')){
rotate(cs.t,1);//swaps added symbol from right to left end.
++(cs.p);}
++(cs.l);
}
//******************************** function rotate *********************************
//Rotates a char[] array of length l by r places to the right.
@safe void rotate(ref char[] str,const int r){
char[] temp = str.dup;
int l=to!int(str.length);
temp.length = 2 * l - r;
foreach(ref i;0 .. l - r)temp[i + l] = temp[i];
foreach(ref i;0 .. l)str[i] = temp[i + l - r];
}
//******************************* function subset **********************************
//Returns whether cs1 is a subset of cs2 (includes equality)
@safe bool subset(const CS cs1,const CS cs2){
int ll;
if(cs1.s != cs2.s)return 0;//false if states differ
ll = cs1.p - cs2.p;//difference between left end of tape represented
if(ll < 0)return 0;
if((cs2.l - cs2.p) > (cs1.l - cs1.p))return 0;
char[] sub;
sub.length = cs2.l;
foreach(ref i;0 .. cs2.l)sub[i]=cs1.t[i + ll];
return sub == cs2.t;}
//******************************* function st_init ************************************
//This initialises the Store object st.
void st_init(){
st.inside = false;
st.dircheck = false;
st.L = [];
st.cycle = false;
st.right_at_end = false;
st.left_at_end = false;
st.cond_1 = 0;
st.j = 0;
st.ind = 0;
st.n_symbols = 0;}
//*************** why not initialise one_step_ori and right_at_start? **************
//************************* function new_origins ************************************
//new origins is a recursive algorithm (i.e. it calls itself) to perform the following tasks, but more efficiently.
//Initially the CS cs is augmented by adding an arbitrary symbol (called alpha in the papers) at the
//same end where the pointer is for the first call to new_origins. (This is the opposite end of the string to where
//the pointer is in IRR.cs1 i.e. its LHS.)
//All values of alpha are included in the search.
//Then new_origins finds all single backward TM steps from there and their
//resulting CS's. This is repeated for each branch recursively, until
//one of 4 conditions occurs in each branch. Condition 1: the pointer finishes at the opposite end from
//where it started in the LHS which is the same end where it was in cs. Condition 2: the pointer finishes at the same end
//to where it started in the LHS which is the opposite end where it started in cs.
//Condition 3: the computation can go no further because backward steps from there are impossible.
//Finally, condition 4: a CS is reached that has been entered before.
//All the CS's arrived at as a result of branches ending in condition 1 constitute the array ori on exit.
//The algorithm is based on a depth-first search of the tree. An array of the cs's obtained on
//the way (L) must be maintained in order to check for a loop (condition 4). This has to be updated at each step of
//the search. The first time new_origins is called for a CS cs, L is empty. Otherwise L is the current list of all
//previous CS's in the backward search for origins, excluding the one that has just been or being found.
//The new element of L is added first for each call to new_origins.
//To avoid repetition, alpha is not actually added to the string until the pointer reaches the end.
//The initial pointer position in the original LHS i.e. right or left end of the string, is passed down through the variable
//"right_at_start" meaning the pointer is at the right in IRR.cs1 (its LHS) which is true or false. This is used to distinguish
//case 1 and case 2 when the pointer ends up at the opposite end or same end.
//n_symbols is the number of symbols involved in this calculation for this branch. This is needed in order to find
//the number of symbols involved in apply_F to express it with no redundant symbols.
//n_symb is the array of values of n_symbols, one for each branch. n_symb is empty on first calling new_origins.
//The variables cycle, right_at_end, left_at_end, cond_1, j, ind, n_symbols and one_step_ori are in a global structure st of type Store
//of which there is just one instance to avoid creating new variables each time new_origins calls itself.
//ref is needed for all parameters (value and reference types) to guarantee 2-way information flow
//and avoid new copies being used for each call.
//The values in ori, n_symb, and alpha in nooco_object at the first call do not affect the running of new_origins except that these
//values will be prepended to any subsequent values added to these arrays by new_origins. So in a sense they are
//"out" only variables, but not quite. They should be ideally declared as "append only" array arguments but this is not possible.
//
@trusted void new_origins(ref CS cs,ref NOOCO nooco_object){
bool test;
st.cycle = canFind(st.L,cs);//not allowed in safe code
st.L ~= cs;
st.right_at_end = cs.p == cs.l;
st.left_at_end = cs.p == 1;
st.cond_1 = (st.right_at_start && st.left_at_end) || (!st.right_at_start && st.right_at_end);//condition 1
st.one_step_ori = [];
if(!st.cycle && ((!st.right_at_end && !st.left_at_end) || st.cond_1)){//conditions under which current branch is not required to stop i.e.
// non-cycling or condition 1. This eliminates cycling and condition 2.
//get the array (one_step_ori) of all CS's ncs that go to cs in 1 TM step with the pointer on the tape and all CS's in ori that
//do so from a new pointer position just outside the existing tape. The latter includes
//the extra symbol in place of the symbol alpha added.
foreach(ref k2; 1 .. nst+1){//look for a possible reverse TM step
foreach(ref k1; 0 .. nsy){//through the whole TM table
st.j = (k2 - 1) * nsy + k1;//index of TM table row
if(irrs[0][st.j].cs2.s == cs.s){//if the new TM state is the state of cs (1st condition for reverse step)
st.ind = cs.p - irrs[0][st.j].cs2.p;//array index of symbol matched. st.ind + 1 = (cs.p - 1 if TM step to be reversed goes right,cs.p + 1 otherwise)
st.inside = st.ind >= 0 && st.ind < cs.l;//pointer position after reverse TM step is within symbol string, otherwise it is at alpha
st.dircheck = (irrs[0][st.j].cs2.p == 2 && st.ind == -1) || (irrs[0][st.j].cs2.p == 0 && st.ind == cs.l);//checks that if condition 1 is true,
//the direction of the next reverse TM step takes the pointer just outside the current symbol string.
test = (st.inside && irrs[0][st.j].cs2.t[0] == cs.t[st.ind]) || (!st.inside && st.dircheck);
if(test){//if the new TM symbol is the symbol at ind (2nd condition for reverse step)
CS ncs = new CS;
ncs.s = irrs[0][st.j].cs1.s;//This would be k2 if TM table is written in order in file.
ncs.p = st.ind + 1;
ncs.t = cs.t.dup;
if(st.inside){
ncs.l = cs.l;
ncs.t[st.ind] = irrs[0][st.j].cs1.t[0];
st.one_step_ori ~= ncs;//dup not needed: cs2 goes out of scope next, so it cannot provide a "back door" to one_step_ori
}
else{//reverse TM step takes pointer outside string, so terminating this branch of the search.
//Find st.n_symbols which is the number of symbols involved i.e. the length of the character string involved in finding
//the new origin in the current branch.
//writeln("test11");
if(st.right_at_start){//The TM is at the right hand end of the symbol string in the LHS
st.n_symbols = 1;//find n_symbols i.e. the number of symbols used in the computation
foreach(x; st.L) st.n_symbols = max(st.n_symbols, x.p);
st.n_symbols += 1;}//the extra symbol at the left shifts the symbol positions up by 1
else{//The TM is at the left in the LHS
st.n_symbols = cs.l;
foreach(x; st.L) st.n_symbols = min(st.n_symbols, x.p);
st.n_symbols = cs.l - st.n_symbols + 2;}
//find ncs when the new pointer position is just beyond the symbol string, remembering that if the
//reverse TM step goes left, then the new symbol is on the left increasing all the symbol positions by 1.
ncs.l = cs.l + 1;
ncs.t ~= irrs[0][st.j].cs1.t[0];
if(st.ind == -1) {++(ncs.p); rotate(ncs.t,1);}//add irrs[0][j].cs1.t[0] at position 0 or at right end of string otherwise
NOT noto;
noto.new_ori = ncs.dup;
noto.alpha = irrs[0][st.j].cs2.t[0];
noto.n_symb = st.n_symbols;
nooco_object.nota ~= noto;
}}}}}}
//This is the recursive bit
foreach(ref cs3;st.one_step_ori){
new_origins(cs3, nooco_object);}
if(st.L.length) --st.L.length;//remove last item in L if there are any after all other branches of the search have ended. (this CS cannot be an intermediate CS).
//It terminates the branch and is appropriate just prior to back-tracking from a branch that ends to follow another reverse TM step.
}
//************************* function apply_rules ************************************
//The function "apply_rules" applies the previously found IRR up to length one less than the current CS being
//processed to carry out substitution
//steps for the TM computation starting from the LHS named cs. n_steps is # substitution steps, nm = # TM steps.
//Do this repeatedly until one of the following
//results occurs: (1)pointer moves outside the length of tape with known symbols,
//to the right (R) or left(L).
//(2) a halt condition occurs (H), (3) an infinitely repeating cycle occurs(C).
//The result is a new rule with the stopping condition (LRHC) indicated and
//the numbers of derivation (i.e. substitution) and machine steps involved. In the case C there
//is no final CS. In the output there must be one, but C is
//indicated showing that it is not really final.
//The new rule is added to IRR, because it must be irreducible
//as the result of the correct functioning of this computer program.
//Each computation will end in state L,R,H or C, meaning the pointer is
//at the left,right, the computation halts or does not terminate, respectively.
//On output, cs now is the final result of the substitutions. The new rules get
//inserted into the array of arrays of IRR "irrs" in the main program.
@trusted void apply_rules(ref CS cs,out int n_steps,out int nm, out char d, out char lsig, out int n_symbols){
//d is "sig" for output cs: L,R,C, or H, lsig is sig for original cs (L or R).
if(cs.p == 1)lsig = 'L'; else lsig = 'R';
CS[] cs_array;
cs_array ~= cs.dup;//dup is needed because cs_array is a record of the changing values of cs.
n_steps = 0;//# derivation steps
nm = 0;//# machine steps
n_symbols = 0;//number of symbols involved in the sequence of substitutions
bool match;
int j = cs.l;
d = '0';
IRR irr2;
for(;;){//repeatedly apply appropriate rules to continue computation
if(cs.p == 0){d = 'L';break;}
if(cs.p > cs.l){d = 'R';break;}
match=0;
for(int k = j-2;k >= 0;--k){//find the longest applicable rule
foreach(ref irr3;irrs[k]){
if(subset(cs,irr3.cs1)){match=1;
irr2 = irr3;//irr3 next goes out of scope so dup not needed provided irr2 is not altered bolow.
break;}}
if(match==1)break;}
//match now indicates if a matching rule has been found. This should be true always
if(!match){
throw new Exception("Error: no matching rule found");}
//Do substitution. The first position is element 0 as far as replace is concerned
if(cs.p == 1){//pointer at left in cs and irr2.
foreach(ref i;0 .. irr2.cs2.l){
cs.t[i] = irr2.cs2.t[i];}}
else if(cs.p > 1){//pointer at right in cs and irr2.
foreach(ref i;0 .. irr2.cs2.l){
cs.t[j - irr2.cs2.l+i] = irr2.cs2.t[i];}}
n_symbols = irr2.cs2.l;
cs.s = irr2.cs2.s;
cs.p += irr2.cs2.p - irr2.cs1.p;
cs.l = j;//cs is now the result of applying rule irr2 to previous cs
++n_steps;
if(n_steps>=2)n_symbols = cs.l;
nm += irr2.n_machine;
if(count(cs_array,cs) > 0 || irr2.sig == 'C')d = 'C';//not allowed in safe code
cs_array ~= cs.dup;//dup needed
if(irr2.sig == 'H'){d = 'H';break;}
if(d == 'C')break;//if this CS was found before, there is an endless stationary loop so go to next case.
}}
//********************************* main program **********************************************
void main(string[] args){
writeln("This is TIEv3.0_draft copyright (C) 2018 John Nixon.\nThis program comes with ABSOLUTELY NO WARRANTY;\nThis is free software, and you are welcome
to redistribute it\nunder certain conditions; for more information see the source code file.\n");
if(args.length!=2){
writeln("Usage: ");
writeln("For example from the same directory as the program:");
writeln(" ./ ");
writeln("See the initial comments in the program source file for more information.");
writeln("The information in the file of the Turing Machine must be presented in the following");
writeln("order with each element separated from the next by any number (including zero) of");
writeln("whitespace characters (tabs spaces newlines etc).");
writeln("to allow great flexibility of presentation:");
writeln("Number of TM states (nst), Number of TM symbols(nsy), then for each combination of these:");
writeln("Old state,symbol read,new state,symbol printed,direction of movement (L or R) or halt (H).");
writeln("The states are 1,2,... nst, and the symbols are a,b,c,...");
writeln("The lines of the TM machine table must be sorted first by old state increasing, and then by");
writeln("symbol read in alphabetical order a to z.");
writeln("This last requirement simplifies the algorithm for searching for reverse TM steps");
writeln("For example\n3 \n2 \n1a2bR\n1b1bH\n2a3bR\n2b3aL\n3a1aL\n3b1aR\n");
writeln("For verification, the program's interpretation of the user specified TM is first shown.");
return;}
bool print_IRRs = yes("In the output file (a) do you want the IRR's printed?");
bool print_IGRs = yes("and (b) do you want the IGR's together with their contexts printed?");
int n_symbols;
char alpha;
char dir;//direction for addition of new symbol alpha (will be 'L' or 'R').
string inputfilename=args[1];
File infile = File(inputfilename,"r");
if(!exists(inputfilename)){writeln("I cannot find file ",inputfilename);return;}
char[] outputfilename=can_write(inputfilename,"_out.txt");
File of;//output file;
of.open(outputfilename.idup,"w");
infile.readf(" %s",&nst);infile.readf(" %s",&nsy);
int c=nsy*nst;
char [] question;
int q, n, opt;
stdout.write("\nEnter the maximum length of derived computation rules: ");
input_int(n);
if(print_IGRs){
stdout.write("\nEnter 1 for showing all IRR, 2 for showing only IRR of type LL, or 3 for showing only IRR of type RR: ");
input_int(opt);}
irrs.length = n;
stdout.write("See the documentation in the program source file for the notation.\n");
of.write("See the documentation in the program source tie.d file for the notation.\n");
writeln("The Turing Machine table expressed as IRR of length 1");
foreach(ref oc;0 .. c){
IRR r1 = new IRR;//"new" is needed (a class object) to avoid a segmentation fault with r1.dup
CS[] cs0;
CS cs1 = new CS;//new is needed here
infile.readf(" %d",&cs1.s);
cs1.l=1;
cs1.t.length = 1;
infile.readf(" %s",&cs1.t[0]);
cs1.p=1;
CS cs2 = new CS;
infile.readf(" %d",&cs2.s);
cs2.l=1;
cs2.t.length = 1;
infile.readf(" %s",&cs2.t[0]);
infile.readf(" %s",&r1.sig);
if(r1.sig == 'L')cs2.p = 0;else if(r1.sig == 'R')cs2.p = 2;else cs2.p = 1;
r1.cs0 = cs0;
r1.cs1 = cs1;
r1.cs2 = cs2;
r1.n_derivation = 1;
r1.n_machine = 1;
r1.lsig=' ';
irrs[0] ~= r1;}//dup not needed because r1 goes out of scope next.
//its value can be udated and copied into irrs with .dup.
//r1 is now out of scope so cannot be used to change irrs
array_IRR_write(stdout,irrs[0]);
stdout.write("\nThe numbers of irreducible regular (computation) rules");
of.write("\nThe numbers of irreducible regular (computation) rules");
stdout.write("\n(IRR) derived from this Turing Machine for different tape lengths are as follows:");
of.write("\n(IRR) derived from this Turing Machine for different tape lengths are as follows:\n");
stdout.writeln("\nSee the output file "~outputfilename~" for the list of IRR obtained (if requested).");
int count;
//trusted void new_origins(ref Tuple!(bool inside, bool dircheck, int ext, CS[] L, CS[] ori, int[] n_symb, bool cycle,bool right_at_end, bool left_at_end, bool cond_1, int j, int ind, int n_symbols) st, const ref CS cs,
CS csi = new CS;
CS[][CS] rtm;//reversed Turing Machine steps represented as an associative array. This structure is filled in next.
//An effect of using the associative array is that each time it is constructed the
//key-value pairs are usually in a different order! The storage is using a hash table, whose elements are stored in an unordered way.
//This is a very sophisticated data structure designed for rapid retrieval of information not really needed here.
//Do this first for the TM steps going left
int index;
foreach(ref st;1 .. nst+1)foreach(ref sy;0 .. nsy){
CS cs1 = new CS;
cs1.s = st;
cs1.p = 0;
cs1.l = 1;
cs1.t.length = 1;
cs1.t[0] = to!char(sy+97);
rtm[cs1] = [];//declares that there are no CS(1)'s leading to this (state,symbol) combination yet found.
foreach(ref st1;1 .. nst+1)foreach(ref sy1;0 .. nsy){//search through TM table to find them all
index=(st1 - 1) * nsy + sy1;
//IRR_write(of,irrs[0][0]);
//array_IRR_write(stdout,irrs[0]);
if(irrs[0][index].cs2.p == 0 && irrs[0][index].cs2.s == st && irrs[0][index].cs2.t[0] == to!char(sy + 97)){//checks state and symbol and direction
CS cs2 = new CS;
cs2.s = irrs[0][index].cs1.s;
cs2.p = 1;
cs2.l = 1;
cs2.t.length = 1;
cs2.t[0] = irrs[0][index].cs1.t[0];
rtm[cs1] ~= cs2;//no dup needed because cs2 goes out of scope next.
// write("cs1 = ");CS_write(stdout,cs1);write("cs2 = ");CS_write(stdout,cs2);writeln("");
}}}
//then for the TM steps going right
foreach(ref st;1 .. nst+1)foreach(ref sy;0 .. nsy){
CS cs1 = new CS;
cs1.s = st;
cs1.p = 2;
cs1.l = 1;
cs1.t.length = 1;
cs1.t[0] = to!char(sy+97);
rtm[cs1] = [];
foreach(ref st1;1 .. nst+1)foreach(ref sy1;0 .. nsy){
index=(st1-1)*nsy+sy1;
if(irrs[0][index].cs2.p==2 && irrs[0][index].cs2.s == st && irrs[0][index].cs2.t[0] == to!char(sy+97)){
CS cs2 = new CS;
cs2.s = irrs[0][index].cs1.s;
cs2.p = 1;
cs2.l = 1;
cs2.t.length = 1;
cs2.t[0] = irrs[0][index].cs1.t[0];
rtm[cs1] ~= cs2;//no dup needed and it makes no difference!
// write("cs1 = ");CS_write(stdout,cs1);write("cs2 = ");CS_write(stdout,cs2);writeln("");
}}}
//generate the IRR(2)
//add symbols at the pointer s.t. forward TM step takes the pointer to opposite end of tape (l=2).
//alpha_n is the integer representation of the symbol called alpha in the papers: 0 for a ,1 for b etc. to give an irreducible rule
foreach(element;rtm.byKeyValue){
if(element.value == [])continue;
foreach(alpha_n;0 .. nsy){
dir = ' ';
index=(element.key.s - 1) * nsy + alpha_n;//use ordering of TM to compute index to find forward TM step
if(element.key.p==0 && irrs[0][index].cs2.p == 2) dir = 'L';//add symbol at left
if(element.key.p==2 && irrs[0][index].cs2.p == 0) dir = 'R';//add symbol at right
if(dir == 'L' || dir == 'R'){//this forward TM step does generate a member of IRR(2)
IRR r = new IRR;
foreach(cs0; element.value){
//construct the IRR triplet of length 2 and add it to the IRR
CS cs1 = cs0.dup;//cs0 is not modified by following statements
alpha = to!char(alpha_n + 97);
add_at_ptr_or_end(cs1,alpha, dir);//dir tells which side to add alpha
r.cs0 ~= cs1;}//ok because cs1 then goes out of scope.
//r.c1=key with alpha added on left or right
r.cs1 = element.key.dup;//this protects element.key from changes via cs
add_at_ptr_or_end(r.cs1,alpha,dir);
r.cs2 = r.cs1.dup;
apply_rules(r.cs2,r.n_derivation,r.n_machine,r.sig,r.lsig,n_symbols);
if(r.cs1.p == 1)r.lsig = 'L';else r.lsig = 'R';
irrs[1] ~= r;}}}
sort(irrs[1]);
// of.write("\n irrs[1] = ");array_IRR_write(stdout,irrs[1]);
//From the IRR(2) generate the IGR(2), the smallest IGR's. They are stored in igrs[0].
//These need to be consolidated i.e. origins combined if the rhs's are the same.
igrs.length = n - 1;
//generate the IRR(3) and beyond
write("\nlength: ",2," # IRR = ",irrs[1].length);
// of.writeln("length: ",2," # IRR = ",irrs[1].length);
// CS[] ori = new CS[0];//array of new origins on output.
CS[] new_oris = new CS[0];//combined array of origins.
// char[] alphas = new char[0];
char[] d_alpha;
// int [] n_symb_1;
int n_derivation,n_machine;
bool new_alpha;
//carry out analysis for the required number of iterations i.e.
foreach(ref i; 2 .. n){//given all irreducible rules (IRR) of length i, find all extensions of them to length i+1.
count = 0;//initialise count of the number of IRR found of length i.
foreach(ref r2; irrs[i - 1]){
if(!((r2.lsig == 'L' && r2.sig == 'R') || (r2.lsig == 'R' && r2.sig == 'L')))continue;//ignore IRR that are not of type LR or RL
IRR[] new_irrs = new IRR[0];//empty array of IRR's to be filled in
NOOCO[] nooco_array;
foreach(ref cs; r2.cs0){
NOOCO nooco_object;
// st.L = [cs];
st.L = [];
st.right_at_start = cs.p == 1;//i.e. pointer is at left in origin (it is at right in the LHS).
st_init();//initialise the Store st used by new_origins.
//Store_write(stdout,st);
// write("\nabout to call new_origins: input CS is ");CS_write(stdout,cs);
new_origins(cs, nooco_object);
nooco_object.cs0 = cs.dup;
nooco_array ~= nooco_object;}
// write("\nnooco_array = ");
// foreach(x; nooco_array)NOOCO_write(stdout,x);
// write("\nend of nooco_array\n");
//n_symb_1 is not used here. Only needed for IGR's
c = 0;//counter for the number of new values of alpha
d_alpha = [];//array of distinct values of alpha in the order they are found.
// foreach(q, ref new_ori; new_oris){//reformulate this in terms of no1co! *************************
foreach(x; nooco_array){
foreach(noto; x.nota){
new_alpha = true;
for(int j = 0; j < c; ++j){
if(d_alpha[j] == noto.alpha){
new_irrs[j].cs0 ~= noto.new_ori;//adds new origin to the IRR derived using the same value of alpha.
new_alpha = false; break;}}
if(new_alpha){
IRR new_irr = new IRR;
d_alpha ~= noto.alpha;
new_irr.cs0 ~= noto.new_ori.dup;
new_irr.cs1 = r2.cs1.dup;
add_opposite_pointer(new_irr.cs1,[noto.alpha]);
new_irr.cs2 = r2.cs2.dup;
add_at_ptr_or_end(new_irr.cs2, noto.alpha, dir);
apply_rules(new_irr.cs2,n_derivation,n_machine,new_irr.sig,new_irr.lsig,n_symbols);
if(new_irr.cs1.p == 1)new_irr.lsig = 'L';else new_irr.lsig = 'R';
new_irr.n_derivation = n_derivation + 1;
new_irr.n_machine = n_machine + r2.n_machine;
++ count;
++ c;
new_irrs ~= new_irr;}}}
irrs[i] ~= new_irrs;}
write("\nlength: ", i + 1," # IRR = ",irrs[i].length);
of.write("\nlength: ", i + 1," # IRR = ",irrs[i].length);
}
//Do iteration of this to get any new patterns on the left and their corresponding set of sets after F, to convergence.
//Then do iteration of the sets of LHS states to convergence by adding new LHS states as required.
if(print_IRRs){
if(opt == 1)of.write("All IRR are shown\n");
if(opt == 2)of.write("Only IRR of type LL are shown\n");
if(opt == 3)of.write("Only IRR of type RR are shown\n");
foreach(ref i; 0 .. n){
if(irrs[i].length != 0){
if(i)of.writeln("IRR of length ",i + 1,"(",irrs[i].length,")");
else of.writeln("The Turing Machine table expressed as IRR of length 1");
sort!()(irrs[i]);
foreach(ref it;irrs[i]){
if((opt == 1)||(opt == 2 && it.cs1.p == 1 && it.cs2.p == 0)||(opt == 3 && it.cs1.p == it.cs1.l && it.cs2.p == it.cs2.l + 1))IRR_write(of,it);of.writeln();}
of.writeln();}}}
int [] d_ns;
char[][] d2_alpha;
int n_steps, nm;
char[] cont_l,cont_r;
CS cs0,cs1,cs2,cs3,cs4,cs5;
char lsig,sig;
// int[] d,d_ns,n_symb_2;
bool new_result, first_new_result, new_igr, new_ns;
char direction, d;
IGR[] igr_array;
//starting to generate the IGR(2) from scratch
foreach(r; irrs[0]){//for each single TM step
IGR igr = new IGR;//Start generating an IGR from this r. This will be in igr that is later
//linked in to the array igrs[0]. Otherwise the empty IGR is ignored and a new one is started for the next IRR r.
new_igr = false;//initially assume no new IGR is needed for IRR r because the TM next step goes "wrong" way.
first_new_result = true;//this is only true when new_result is true for the first time for the IRR r.
foreach(sy; 0 .. nsy){
alpha = to!char(sy + 97);
if(r.cs2.p > 1) direction = 'R';
if(r.cs2.p == 0) direction = 'L';
cs4 = r.cs2.dup;
cs5 = r.cs1.dup;
new_result = false;
foreach(r2; irrs[0]){//determine whether the matching rule when applied using alpha gives a CS s.t. computation goes back into
//the known symbols ("right" way) or not. If so "new_result" is true.
if(r2.cs1.s == cs4.s && r2.cs1.t[0] == alpha){//matching row of TM table has been found
if((cs4.p == 0 && r2.cs2.p == 2) || (cs4.p == 2 && r2.cs2.p == 0)){//new igr or new IRRP_set for the igr.rhs is needed.
new_result = true;
add_at_ptr_or_end(cs4, alpha, direction);
add_at_ptr_or_end(cs5, alpha, direction);
apply_rules(cs4, n_steps, nm, d, lsig, n_symbols);}
break;}}
if(new_result){
new_igr = true;
if(first_new_result){
IRRP_set irrps_l = new IRRP_set;
irrps_l.alpha = '*';
irrps_l.origins = [r.cs1];
irrps_l.rhs = r.cs2;
Context_pair cp = new Context_pair;
cp.s1 = "".dup;
cp.s2 = "".dup;
igr.ns = 2;
igr.context = [cp];
igr.lhs = irrps_l;}
IRRP_set irrps_r = new IRRP_set;
irrps_r.alpha = alpha;
irrps_r.origins = [cs5];
irrps_r.rhs = cs4;
RHE rhe = new RHE;
rhe.irrp_set = irrps_r;
rhe.n_TM_step = n_steps;
rhe.n_Sub_step = 1;
unique_sort_add(igr.rhs, rhe);
first_new_result = false;}}
if(new_igr)unique_sort_add(igrs[0],igr);}
//Modify program to use IGR's instead of IRR's in apply_rules. Then IRR's will not be needed as such.
of.write("\nstarting the calculation of the IGR's of length greater than 2");
write("\nstarting the calculation of the IGR's of length greater than 2");
foreach(i; 1 .. n - 1){//The IGR's generating the IRR(2) have already been found, these are in igrs[0].
//This is the main loop that when complete generates igrs, the 2D array of IGR's that generate the IRR's up to
//length n. For cycle i these IGR's are updated such that the IGR's included generate the IRR's
//of length i + 2. The sub-array igrs[i] contains only IGR's of reduced length i + 2 (i.e. excluding the length of the context).
//The first cycle (i = 1) will generate the IGR's for obtaining the IRR(3) and these will be put in igrs[0] and igrs[1].
//The additional ones put in igrs[0] will be those of length 2 with context pair(s) of length 1 having total length 3. At the end
//of the loop (i = n - 2) igrs[0] ... igrs[n - 2] will have been found generating the IRR up to length n.
//Cycle i starts with the set of members of igrs[0] with contexts having length i - 1 having total length i + 1
//and in general members of igrs[j] with contexts of length i - j - 1 have total length i + 1 giving IRR(i + 1), for
//j (now j1) such that 0 <= j <= i - 1.
//All these after F give members of IRR(i + 2) that are likewise partitioned into
//igrs[k] with context of length i - k for 0 <= k <= i.
//Each IGR under F can give a set of IGR's of different lengths and lengths of context string pairs. For each
//old alpha, a set results (IRRP_sets) in the temporary array is produced, one for each value of new alpha. As a result of the
//removal or extra symbols, some are reduced in length. One new IGR is started for each value of ns the minimum number
//of symbols needed to represent it. Therefore an array of counters is needed, one for each of the sub-arrays. It is called
//j2 and j2[k] is the counter for igrs[k] with IGR's of minimal length k + 2. j2[0] is initialised to the
//could just use igrs[k].length?
write("\ni = ",i);
for(int j1 = 0;j1 < i; ++j1){
write("\n j1 = ",j1);
foreach(lc3,igr;igrs[j1]){//loop counter
foreach(lc4,cp; igr.context){
if(cp.s1.length != i - j1 - 1)continue;//if the total length of the IGR is not i + 1, skip this.
foreach(lc5,ref o_rhe;igr.rhs){//o is for old or original. This is a loop over old alpha.
cs2 = o_rhe.irrp_set.rhs.dup;
add_opposite_pointer(cs2,cp.s2);//cs2 is now the RHS of a IRRP_set with context added
foreach(lc6,ref ori; o_rhe.irrp_set.origins){
if((ori.p == 1 && cs2.p > 0) || (ori.p > 1 && cs2.p == 0))continue;//check for the type of the IRRP. Only if it is LRL or RLR can F be applied.
//igr_array[j] is the IGR obtained having length j + 2.
d2_alpha.length = 0;
cs0 = ori.dup;
add_opposite_pointer(cs0,cp.s1);
cs3 = cs0.dup;
// $cs3\to\to cs2$ is now the LHS of a new IGR to be found (before chopping out the context strings.
st.L = [];//#1: Two initialisations needed for each run of "new_origins"
st.right_at_start = cs0.p == 1;//#2: i.e. pointer is at left in origin (it is at right in the LHS).
st_init();//initialise the Store st used by new_origins. needed?
NOOCO nooco_object;
new_origins(cs0, nooco_object);
nooco_object.cs0 = cs0.dup;
d_ns = [];
igr_array = [];
igr_array.length = i + 1;
foreach(count2, ref y; nooco_object.nota){//new_origins_triplet_array
cs4 = cs2.dup;
add_at_ptr_or_end(cs4, y.alpha, dir);
int ns;
apply_rules(cs4,n_derivation,n_machine,sig,lsig,ns);//"sig" for output CS needed? No but keep sig and lsig
ns = max(y.n_symb,ns);
new_ns = true;
for(int k = 0; k < d_ns.length; ++k){
if(ns == d_ns[k]){
new_ns = false;
new_alpha = true;//this is assumed first to be a new value of (new)alpha.
for(int j = 0; j < d2_alpha[k].length; ++j){
if(d2_alpha[k][j] == y.alpha){
new_alpha = false;
split_string(y.new_ori, i + 2 - ns, cont_l);
unique_sort_add(igr_array[ns - 2].rhs[j].irrp_set.origins, y.new_ori);//adds new origin to the IRR derived using the same value of alpha.
break;}}
if(new_alpha){
RHE rhe_r = new RHE;
IRRP_set irrp_set_r = new IRRP_set;
d2_alpha[k] ~= y.alpha;
irrp_set_r.alpha = y.alpha;
split_string(y.new_ori, i + 2 - ns, cont_l);
irrp_set_r.origins = [y.new_ori];
irrp_set_r.rhs = cs4;
split_string(irrp_set_r.rhs, i + 2 - ns, cont_r);
rhe_r.n_TM_step = n_machine;
rhe_r.n_Sub_step = n_derivation;
rhe_r.irrp_set = irrp_set_r;
igr_array[ns - 2].rhs ~= rhe_r;}
break;}}
if(new_ns){
d_ns ~= ns;
RHE rhe_r = new RHE;
IRRP_set irrp_set_r = new IRRP_set;
IRRP_set irrp_set_l = new IRRP_set;
d2_alpha ~= [y.alpha];//array of distinct values of the new alpha in the order they are found.
irrp_set_r.alpha = y.alpha;
irrp_set_r.origins = [y.new_ori];
split_string(irrp_set_r.origins[0], i + 2 - ns, cont_l);
irrp_set_r.rhs = cs4;
split_string(irrp_set_r.rhs, i + 2 - ns, cont_r);
rhe_r.irrp_set = irrp_set_r;
rhe_r.n_TM_step = n_machine;
rhe_r.n_Sub_step = n_derivation;
cs1 = cs3.dup;
split_string(cs1, i + 2 - ns, cont_l);//should happen once only for each new IGR
irrp_set_l.alpha = '*';//This removes old alpha from the lhs of the IGR's and results in sorting them by
//their left IRRP_sets i.e. by the rhs of the left IRRP_set, then by its origin.
irrp_set_l.origins = [cs1];
irrp_set_l.rhs = cs2.dup;
split_string(irrp_set_l.rhs, i + 2 - ns, cont_r);
IGR igr1 = new IGR;
igr1.ns = ns;
igr1.rhs = [rhe_r];
igr1.lhs = irrp_set_l;
igr1.context ~= new Context_pair(cont_l,cont_r);
igr_array[ns - 2] = igr1;
}}
//sort each element of igr_array individually into z by IRRP_sets in their rhs's using alpha.
//then add z into main arrays igrs maintaining sort order.
foreach(lc10,x; igr_array){
if(x !is null){
// write("\n sort within x");
IGR z = new IGR;
z.ns = x.ns;
z.lhs = x.lhs;
z.context = x.context;
foreach(y; x.rhs){//this operation needs to be combined with a merging of sets of origins 2021-11-04 or at
//least grouping or sorting them so that they are in a group of similar IGR's.
unique_sort_add(z.rhs, y);}
unique_sort_add(igrs[z.ns - 2], z);}}
}
}//ends loop over o_irrp_set or equivalently old alpha
}//ends loop over cp (context pairs) of input IGR
}//ends loop over igr of a set of IGR's.
}//ends loop over j1
}//ends loop over i
int[][] freq;//freq[j - 2][k - 2] is the frequency of IGR's having total length j and reduced length k (excluding the length of the context strings).
//and 2 <= j and k <= n.
freq.length = n - 1;
foreach(i; 0 .. n - 1){freq[i].length = i + 1;}
foreach(i,x; igrs){
if(print_IGRs){
of.write("\n\nIGR's of length: ",i + 2);}
foreach(y;x){
if(print_IGRs){of.writeln(); IGR_write(of,y);}
int i1, i2;
i1 = y.ns;
i2 = to!int(y.context[0].s1.length);
++ freq[i1 + i2 - 2][i1 - 2];
}}
of.write("\nNumber of IGR's (together with their context) with given total and reduced lengths\nreduced length 2, 3, ...");
stdout.write("\nNumber of IGR's (together with their context) with given total and reduced lengths\nreduced length 2, 3, ...");
foreach(i; 0 .. n - 1){
of.write("\ntotal length ", i + 2,": ");of.write(freq[i]);
stdout.write("\ntotal length ", i + 2,": ");stdout.write(freq[i]);
}
stdout.write("\n");
IGR[][] igr_patterns;//These are IGR that are consolidated by deleting the context.
foreach(i,x; igrs){
++ igr_patterns.length;
foreach(y; x){
y.context.length = 0;
unique_sort_add(igr_patterns[i], y);}}
of.writeln("\nThe set of distinct igr patterns i.e. IGR's with the context deleted");
foreach(x; igr_patterns){
foreach(y; x){of.writeln(); IGR_write(of,y);}}
}//ends program