Nick's Auto-Evo Algorithm Discussion

Zahyyy · November 7, 2019, 3:48pm

Hey, @NickTheNick ! I have chosen your auto-evo proposal as a topic for my presentation during my AI lesson covering the topic (can you guess?) evolutionary algorithms. I’d like to ask you a few questions for the sake of my presentation, if you don’t mind.

Is the algorithm somewhat capable of picking organelles already? I would like to give the class some visuals of the evolution performed by the algorithm itself. I have no problem making them in the free-build editor myself, but I would like to do so based on the algorithm’s decisions.
What is the current equation for population? I am assuming it’s not a short one.
Any interesting info you think I should mention?

Thank you so much for helping me out and of course, for writing this beautiful piece of auto-evo! If I come up with any more questions, I will make an edit/addendum, but currently this is all I can think of so far. Thank you one more time.

hhyyrylainen · November 7, 2019, 5:53pm

I’m not Nick but I feel like I want to address a few things.

All of the finished parts of the algorithm are explained here: Nick's Auto-Evo Algorithm (Episode 1) - Theory - Thrive Development Forum
Unless Nick has some super secret stuff he’s done since the last episode, that’s all there is to it.
So there is nothing to convert given species list of organelles to the properties of the species that are used in that algorithm. Also no actual mutation or anything like that is included in the algorithm. I’m a bit worried that currently it just does not count as any kind of AI algorithm and might be unsuitable material for any serious AI course (university level course).

Zahyyy · November 7, 2019, 6:21pm

That’s something I haven’t really considered. Perhaps I can present the goal of the project Nick is working on. And maybe if I assign the organelles currently available some values, such as osmoregulation cost, how they affect speed and such, which could be then varying the input of the algorithm in the equation for population, I might be able to mash together a very basic and simplified beta for at least some sort of a showcase. Since the population numbers are what determines the success of the mutation, there might be one of the fancy showcases common to evolutionary algorithms that show each generation and their population numbers. (I would make their visual representation in the free-build editor).

NickTheNick · November 7, 2019, 6:34pm

Hey @Zahyyy I appreciate the support! Unfortunately, as @hhyyrylainen pointed out, the algorithm’s design is only in Episode 1 right now, which means that I’m still just working on an algorithm to solve for the final population using all the inputs (traits) from the species and environment. Episode 2 is going to be all about how changes in the OE (like adding or removing organelles) can change those inputs, and how to “smartly select” those changes (evolutions) for AI species. So to answer your questions:

Currently, no. However, as you pointed out, you could make your own list of organelles/evolutions and pick trait changes that you think could represent those. Basically equal to doing an early draft of Episode 2. The full list of traits that input into the equation are in the Original Post of the thread I post all the updates on.
Unfortunately, VERY long. It’s also reached the point where it’s not just a single equation anymore because there are certain variables in the equation that have conditionals as to what their input/formula is. Like for example, body volume has a certain formula if you are a sphere, but a different formula if you are tear-drop shaped. The overall general equation for population growth though is below. Everything in the algorithm just contributes towards calculating one of the below terms.

Final Population = Initial Population + Births - Old Age Deaths - Starvation Deaths - Predation Deaths
If you read through the thread, there should be a lot of visuals, diagrams, graphs, and charts available representing how the algorithm works and example simulations using the algorithm that you’re free to use. For example after the part on predation (part 7) I did a sub-part on explaining the changes that had been made (part 7.1) and that post contains a lot of diagrams and visuals if you want.

Zahyyy · November 7, 2019, 7:33pm

Thank you so much! I will try to put together a very simplified version of what will come in the episode 2, just so that I can give the class some kind of a demonstration of the algorithm in practice. Also, I will try to put all the pieces of the equation together (as in I will try to write in in a form where every variable is DIRECTLY affected by the mutations), just so I can directly see what has what effect. Also, I love the visual in the 7.1. of all the traits that affect the final population. Thank you for giving me the permission to use it, I sure will! Also, the “bottom row” of the visual is what I will try to write the equation as. Thank you one more time, if I run into any issues, I might have another question or two, but for now, it’s time to get started. Thank you!

Zahyyy · November 9, 2019, 7:23pm

Hello, @NickTheNick, I have started dissecting the equation and I think I am going absolutely crazy. How could you have put this all together and remain sane? On one hand I don’t want to bother you, as I guess you are busy working on the algorithm, on the other hand, I have a lot of questions. So if you don’t mind, here they are.

In part 7.1 you have the table to calculate the Base time per hunt and Success rate. Is the Success rate represented in the graph in the same part by “Completed hunts” or “Chase success”?
Is Available hunts = Prey population/Prey energy yield? Where does the Required hunts step in?
In my first question I mentioned the Base time per hunt. What is the difference between that and Time spent hunting and Time per hunt?
Is Required hunts simply Base metabolism*Population?
The thing I struggled with the most was calculating Time spent hunting. Honestly I don’t even have idea where to start. I cannot figure out anything between Chase success, Competetivness and Time per hunt and Time spent hunting including those mentioned and I feel really stupid, but I’m genuinely lost.

I am sorry one more time. I’ve scribbled over several papers and I’m still not sure how to continue. I admire your dedication and your talent of not going crazy from this project. I will highly appreciate any help. Thank you so much.

Edit:
Does the average distance from predator to pray in the table in 7.1 equal to distance per hunt? (which is (0.66patch length)1-(4pd population densitypy population density*(1-avg clustering))-pd body radius - py body radius)?

Also, my current dissection of the equation looks like this:

Final Population = Initial Population + Births - Old Age Deaths - Starvation Deaths - Predation Deaths

Final population = Initial population + (Initial population * Litter * Reproductive Fraction * Mating frequency) - (Initial population * 1/Lifespan) - (Initial population * (1 - (Calories consumed / Base metabolism * Initial population))) - Predation deaths

The things in bold are the things I have trouble dissecting. The things in italics are the “base elements” of the equation that cannot be dissected anymore. There are some other things I have dissected, such as spatial density, distance per hunt, and so on, but I have trouble “getting to them” in the equation because of these three things in bold. I assume that Mating frequency is Free time - Time spent hunting, but that would be immediately in bold as well. Also, one more question - what exactly is reproductive fraction?

NickTheNick · November 10, 2019, 7:35pm

Thank you! I appreciate the support, and it definitely is a challenging project.

Before I answer your questions I’ll stress that this algorithm will be obscenely long if you turn it into a single mathematical equation. It’s also actually not possible to do so, because as I said there are a lot of conditionals in the algorithm (like “if A, then do B, but if X, then do Y”), so to turn it into a single equation you’d have to assume a lot of things, like a scenario where A and X and several other conditionals are true. A good example of this is the Endurance vs Speed Hunter from part 7. If you know this and still want to do it though, then don’t let me stop you all power to you.

Additionally, I would recommend to read the relevant posts for the topics you have questions on before you ask the questions, because they really explain each concept pretty thoroughly. I can also provide you with the script if you want, but I don’t know how familiar you are with code and reading it.

With those two things in mind, I’ll do my best to answer your questions:

Chase success
Required hunts is the total number of hunts required by the species to feed all their members. Available hunts is how many prey there actually are that they can hunt. Required hunts is thus limited by the available prey to produce available hunts.
Base Time per Hunt is the average time it takes for a predator to catch the prey. This is then modified by competition modifiers which can increase the total value to produce the actual, final, Time per Hunt. Time Spent Hunting is a proportion of the entire month, aka how much of the month is spent hunting instead of reproducing.
No. Base Metabolism * Population / Prey Energy Yield.
Time Spent Hunting is the time out of the month that is spent hunting. You just find the total number of hunts that the species performed, and multiply it by how long each hunt took. Then you divide that by the total time in a month.

Yes.

Mating Frequency is how often the species reproduces. It’s a trait, so it doesn’t change. You then multiply it by the time in the month which is spent reproducing, as well as the other factors you have listed, to get monthly births. The time spent reproducing is as you put it, the free time minus the time spent hunting.

Reproductive fraction is how much of the population can reproduce. In humans it’s 50% because only females can reproduce. In my example I use a species that uses asexual reproduction so I’ve used 100% so far.

Zahyyy · November 16, 2019, 4:08pm

Hello again, @NickTheNick! I have somewhat finished dissecting the algorithm. I have a hunch that the solution I came up with differs a bit from yours and is a bit simplified, but it should more-or-less work. So my next task is assigning mutations with the variables in the equation.

That would be amazing. I am no expert when it comes to programming, but I’m at least somewhat familiar with it, so I think that might be a great help as well. Thank you for all the help, I really appreciate it!

Nunez2196 · October 16, 2020, 3:04pm

I realize this is a Necro but,

What would you say about doing things as pictured in this diagram? For the general loop of the Auto-Evo process for Species generation/punishment.

hhyyrylainen · October 16, 2020, 4:06pm

It mostly looks like a variant of what’s currently there. I prefer the current system as I think it is a bit more realistic to simulate the resulting populations instead of evaluating how good a species is.

If I had energy, I’d make a proper diagram of the current system, but it goes roughly like this:

(player is swimming around, and auto-evo starts in the background in order to allow the simulation to take quite a lot of time)
For each species, generate 5 random mutations
Simulate the resulting populations for each of those mutations, and for each species select the best one (or no mutation)
Simulate a couple random patch migrations for each species, and pick the best one (if it is better than no migration)
Override the results for the player species so that auto-evo doesn’t affect it. This is done because the player currently has no way to know what their editor changes would do. And also the basic algorithm currently in the game doesn’t match well enough what the player needs to survive swimming around. So those two things need to be resolved before the player species can have auto-evo affect it.
After the player enters the editor, apply the external effects (for example if a cell dies on screen it gets a population decrease), migrations, and random mutations
Remove extinct species (if population ends up less than 25 it is rounded down to 0)
Add a copy of the player species to the world. This step is temporary until we nail down the species split logic, but I’d prefer a logic where if a species has enough population and there are multiple best mutations (for example due to the species being in many different patches), the species splits into two so that both halves have a different mutation.

I’m pretty happy with this algorithm, as once we get better and better population simulations, that can just be plugged into the overall framework

In fact once Nick’s or someone else’s new algorithm is ready, it can be plugged in as the population simulation step and everything else will work the same.

Sentiant · October 17, 2020, 8:19am

This actually brings up an interesting point about auto-evo. Presumably here you mean the auto-evo would pick the mutation that results in the most population, but is that really the ‘best one’?

Consider, for example, that insects far outnumber mammals, and are in turn outnumbered by microbes. Smaller creatures can multiply more easily, giving them higher populations. Thus, we can assume that the auto-evo would always prefer being as small as possible.
Of course, irl large creatures exist. That’s because irl evolution doesn’t care about the species as a whole, that’s just an abstraction, it cares about individuals who do or do not pass on their genes. So really wouldn’t a better metric to evaluate be the chance that the creature reproduces before dying?

hhyyrylainen · October 17, 2020, 8:40am

The auto-evo algorithm is a simplification of real life. In real life all individuals of a species have mutations, and based on their relative chance of making offspring those specific mutations spread. Because auto-evo tries to simulate 100 million years of changes by natural selection the shortcut is to simulate 10 steps per mutation to see how much final population the species ends up with (assuming no other species mutate at the same time as that would get exponentially harder to calculate).

The mutations are random, so the species doesn’t always have the option to pick from “just be smaller” and even now the algorithm seems to prefer a bit smallish cells, but they most of the time will gain organelles that help them in their current patch.

How would you calculate that? Even Nick’s already super complicated algorithm works on population number changes.

Sentiant · October 17, 2020, 11:23am

Its probably most efficient to use the existing population algorithm, since we’d need to know the populations anyway, regardless of whether they’re the sole motivator for auto-evo. Just run the algorithm for a number of timesteos until you’re reasonably sure an equilibrium in population must have been reached. Then, within the last (couple) timestep(s), find the fraction of the total cells that died, and the fraction of the total cells that reproduced. Because of the way probabilities work, these values are approximately equal to the probability of their respective events happening.

Let’s say the R = fraction of cells that reproduced, D = fraction of cells that died. Making the small abstraction that cells can only reproduce or die (or do neither) in a single timestep, not both; we can find all the ways a cell can reproduce before it dies:

It reproduces immediately, in the first timestep: probability of this happening = R
On the first step it neither reproduces nor dies, in the second it reproduces: probability = R * (1 - R - D)
In the first and second step it neither reproduces nor dies, in the third it reproduces: probability = R * (1 - R - D)^2

Etc.

As you may have noticed this is a geometric series (Geometric series - Wikipedia) with -1 < r < 1. It goes on into infinity, but it has a finite sum: R / (1 - (1 - R - D)).

This sum equals the chance of a cell reproducing before it dies. Of course, I am banking on the fact that the steps are short enough that not a lot of cells are dying and reproducing in the same timestep.

hhyyrylainen · October 17, 2020, 11:32am

We currently run 10 timesteps, as it turns out it is very hard to make a stable model. Even in the simple prototypes it seems that there becomes an oscillation between different species very easily. So I kinda doubt we’ll end up with a nice algorithm that converges to final population numbers.

If you just derive the numbers like that, won’t it be exactly the same effect than comparing (percentage wise) how much the population increased?

Sentiant · October 17, 2020, 12:52pm

If we can’t get an equilibrium you can still use the last couple timesteps to get an idea of the average of the oscillation.

And yeah, I just realised too: there is no value for R and D where using the probability gives a different outcome auto-evo wise than using the percentage-based population growth (despite them being different numbers). That said the probability is still the proof that percentage-based growth is more accurate than just plain populations.
It also has some other interesting properties. For instance, a probability is always between 0 and 1, so you *= it by ten and give the player an x/10 successfulness score for each species. (Or by 5 to give x/5 stars, or by 100 to give a percentage, you get the point).

hhyyrylainen · October 17, 2020, 5:11pm

I said percentage based population growth, because the starting population of the species is same for all mutation calculations, which means that the highest percentage population gain, is also the highest total population.

So after this chain of deductions, the highest ending population for a mutation check, is actually the best, and the current implementation is fine.

Sentiant · October 20, 2020, 2:00pm

So we’ve established that population is probably an okay metric to use, but I’m still not entirely satisfied. Am I really meant to believe that every multicellular creature ever failed to get ‘being smaller’ as one of its five auto-evo options every, single, time? Is any species whose population is not in the trillions just the result of unfortunate rng?

I don’t think so. I think auto-evo can only properly create species with lower populations if we make a change even more fundamental to the system than a different metric for evaluation. The problem isn’t that it doesn’t know how to pick the ‘best one’ out of its options, the problem is that it is picking the ‘best one’ at all. Let me explain:

Contrary to what catchphrases like ‘survival of the fittest’ might tell you, evolution doesn’t really select for the optimal species. Rather, it selects for the fit enough, anything that doesn’t die before it reproduces. Multicellular creatures, for instance, are doing worse than bacteria, they just aren’t going extinct either.
To further illustrate this, let’s take a look at a scenario that would be common in the auto-evo. A species of microbes develops five different mutations. Auto-evo runs for all five mutations and for the original species, and checks which one has the most population. The variant that does the best is then selected, and the entire species magically gains that mutation!
In reality, this would never happen. Instead, all six variants of the species would exist concurrently. Then, the ones that don’t make it will go extinct, and the other ones will continue to exist. This means that any species that isn’t heading for extinction will continue to exist, no matter how suboptimal they are.

If we tried to make the auto-evo system account for this, it might look something like this:
Step 1: Make 5 different mutations, as before.
Step 2: In the population algorithm, replace a small fraction (say 1% per mutation) of the species with the mutated variant.
Step 3: Run the population algorithm to find out which variants die off (probably because they couldn’t compete with the original), and which ones don’t. The variants that die get yeeted, the other ones become separate species for the next round.

Now, I have the sneaking suspicion that this system will cause a ton of tiny populations to exist. This could be fixed by moving the threshold slightly above zero, to make those tiny populations count as extinct.

hhyyrylainen · October 20, 2020, 7:37pm

With really small populations, this is just going to get rounded out.
While the overall steps look like they can work, I don’t really get this point.

How long do you run the simulation for? There has to be an arbitrary cutoff at some point.

Just for fun I calculated that to run through all of the generations, we’d need to run a cool 36.5 billion steps, which is the reason auto-evo is abstracted.

I already added that as the prototype algorithm ended up almost infinitely carrying around species with just a couple population.

Sentiant · October 21, 2020, 8:48am

I didn’t really put a whole lot of thought into that number. The point here is that it has to be small, since species heading for extinction will need to get there quickly instead of having five million population to use as a buffer.

From reading Nick’s thread on the dev forums I was under the impression that most species would die long before that point. Unsuccessful species in that thread appear to die before ~20 months. That seems like a good cut-off point.

Sentiant · January 13, 2021, 3:34pm

Double post because it’s been 3 months and this is about something different.

I was rereading the Auto-Evo thread on the dev forum and found a little error in the math. I know Nick has been in hibernation for the past year but this post will be here when he (hopefully) comes back.

The Distance per Hunt isn’t calculated correctly. In his equation, Nick assumes that adding more creatures would decrease the average distance to a food source, but this is incorrect. Since each creature has to get his calories individually, it does not actually matter how many other creatures there are. If you look at the picture with the triceratopses you’ll find that the distance from a triceratops to a fern is still the same regardless of how many triceratopses there are, even if there was only one triceratops. Only the size of the patch and the amount of ferns matter. This is likely part of the reason why time per hunt is so low right now.

Let’s look at the equation Nick gives and how to correct it:

Distance per Hunt = (0.66 * Patch Length) * (1 - (4 * Predator Population Density * Prey Population Density))

Nick expresses doubt about this, but he is right about the (0.66 * Patch Length) part. The rest of the equation just looks like guesswork to me. I have to be honest here: I don’t know how to mathematically prove anything about this either, but I managed to brute force this by using a simulation:

Distance = (0.6617 * Patch Length) / (Food Sources^0.35)

I have verified with the simulation that this is roughly accurate for anywhere from 1 to 5000 food sources.