How much weight (file size) does the galaxy have?
That’s impressive and fair enough. I should have clarified I meant like a useful in terms of gameplay galaxy. For example we need to store stuff like species and their populations etc. for all inhabited planets. That eats up a lot of storage space very fast when you start adding a few dozen bytes (at the minimum) per star system.
I wonder if it could be possible to somehow create a seed system which would generate the same species with the same statistics (like population, civilization things, etc.) every time the player checks on an inhabited planet…
I think that this could be handled the same way that Society stages could handle urban centers. In the early game individual villages would be simulated and stored in full depth and detail but as the society grows in the later game, the large cities would become the standard ‘unit’ of detail and there would be many small villages that are not important enough to manage in full detail. As the scope gets larger, only the most important of the smaller features would need to be stored at all. So in industrial stage, a Society might have 1 capital, 10 major cities, 100 minor cities, and 1000’s of small towns and villages. The capital would be simulated/stored in full depth, 10 major cities in less depth, and so on with the individual villages not even needing individual records for each village, but instead there might be statistics saved for an entire region. So, a region might have a count stored for 100 villages, has a population of 100K, etc. The real system we implement will likely need far more statistics and data to store, such as population development level, or wealth level, industrial production value, and other such values depending on which ones need to be stored vs which ones can be calculated based on the others.
At the scale of the entire galaxy, this would repeat. A galactic civilization might have 1 home system, 10s of major systems, 100s of minor systems, and thousands or tens of thousands of outposts. Only the important systems and planets need to be stored persistently, or even generated at all. At the grand scale, the thousands of outposts making up a civilization’s territory would be grouped into large scale statistics. The most expensive to generate features would likely be the species’, but this would only require the home worlds to need to be generated in full depth to generate with an accurate history, and the rest of the systems would would pick from the content generated from those few home worlds.
And with the content pregeneration system that has been discussed before, if a previously unimportant system needs to be quickly generated at higher detail then one could be selected from the pregenerated cache to slot in the appropriate level of detail. For example, if an entire star system with populated planets needs to be generated, then a matching one could be selected from the cache. The star system might have previously been stored with the statistics “1 Sol mass star, 2 Earth Mass high population/highly developed planets, 2 low population/low development moons” And then matching pregenerated systems would be selected from the cache and used.
This is essentially how I have set up the generation code, so that all generated areas will be regenerated exactly the same given the same conditions. This means that if a certain sector of the galaxy is specified to have, say, 200 stars in it then each time it is generated it will be exactly the same, and only one record would have to be saved for that entire quadrant. If it was an important sector then it would be stored more fully, perhaps one of the star systems is important so it would be generated and stored in more detail while the other 199 would remain at lower detail.
Now, it does get more complicated to do this for something like a town or populated planet, which would need to stay the same over small timescales and changes, but over larger timescales can develop in larger ways. And some towns might randomly have huge changes occur such as natural disasters or war. But this can be accounted for separately than the base generation of the unmodified region/planet/system/sector.
Thankfully we are still a long way from having to develop these features to make the game’s development progress. Though we’ll have to contend with making a good system on the civ stages’ elements one day…
How the stars are like can depend on their position in the galaxy, aswell as other variables, like the presence of nebulas.
And we’d again be down to a galaxy being like 100 collections (whatever size they be, like a star cluster or something) that the player manages. So basically we are back to galaxies being 100 star systems to inhabit, but instead we just replace the graphics with a bigger scale more abstract representation where the player no longer can check individual planets.
I think that is the wrong approach because in the space stage one of the goals of the systems I think should be that you can zoom into any planet you want to manage it. That keeps the space stage consistency working from the early space stage into the later space stage.
Oh so essentially every planet should be same detail as the player’s homeworld if they wish to check these planets out?
The difference here, that I may not have made clear, is that the system is not limited to only operating at a specific scale for a specific stage, but is designed to be a completely continuous across the expanding scales of the stages.
Perhaps I could explain this from the other direction:
The reason I am working on the galaxy generator right now is that it is, strangely enough, the prototype for the flora generation and management system. There is a significant overlap here and I decided to work on the galaxy generator first because it is an easier test of the system. The flora generation system will be the only important one in the near term though, as it is the main generator for Macroscopic, Aware, and Awakening stages.
The way I am constructing the flora generation system is like this:
A world is divided up into patches. Each patch has its own set of species of flora and a corresponding population count. This population count is distributed across each patch based on how suitable the conditions are for a species. Like in a patch that has mountainous regions as well as flat plains, a species of grass might concentrate its population in the plains while fading out up the mountains, being not present at its peaks. When a section of the patch is zoomed into, the patch will be subdivided into sub-regions of the patch, and the population of each species will be divided proportionally based on the ‘habitability’ of each sub-region. This will continue recursively as the patch is zoomed into until it reaches the scale at which individual members of the species should be visible distinctly. At that point, the sub-region of the world will stop tracking the species as a group population count along with a population distribution across the sub-region, and will instead take those group values and turn them into a set of individual members of the species. For example, if a species had a population of 16 in a sub-patch then 16 members of the species would be created and distributed across the sub-region according to the population distribution. For a tree, the scale at which the individualization would occur would be larger than the scale needed for individual grass to become individual.
In the Macroscopic-Aware-Awakening stages, the player is going to be observing the world on very small scales, close to the ground. This is also the stage where the player will be interacting with the smallest scales of the world, perhaps interacting with individual grass plants (like a herbivore eating them) or cutting down trees in the Awakening stage. So at these early stages, it will definitely be important to distribute the species all the way down to the individual level. However with flora like grass, this distribution only needs to be done in an area near to the player. At a distance from the player the grass does not need to be rendered individually, and can be rendered instead as part of the terrain texture. This is simply how normal LOD systems work, except that this one is done using accurate statistics of the objects managed.
Changes also should be persistent in the world within acceptable timescales. This may not matter as much in the earlier stages as the Editor Cycles will reset the world, but the later stages will need increasingly more persistence. If a player chops down a couple rare, valuable trees far away from their camp in Awakening, these changes should remain even if the player travels a far enough distance to unload the chunks if the player returns to it.
As the game increases in scale along with the progression of the game, the highest levels of detail will become less relevant to generate and keep track of. In Society stage, the scale might be large enough that individual grass is no longer distinguishable, and thus hardly ever needs to be generated to the individual level. The player would no longer be focused on that scale, having gone on to manage towns and cities. But if the player did zoom in close enough to the ground, they could indeed still generate the individual grass and see it if they wanted. At a large enough scale in Society, when focused on entire cities and counties, the individual trees will no longer be viewed anymore most of the time. But it would still be important for the generator to keep track of and manage the populations of trees in a region, not only because the player might decide to zoom into a random forest, but because the actual number and distribution of the trees would still have an important impact on the game even if not individually kept track of anymore. A forest might be used for lumber by a society, so it would need to keep track of the changes to a forest’s population in a region to correctly calculate.
Once Space Stage is reached, the player will almost certainly not be spending much time on the ground level of a planet looking at individual trees or grass. But if, for example, the player decides to terraform a minor world, then they should indeed be able to zoom all the way to ground level to observe the individual trees if they want to. It seems relatively likely that a player might want to do that for the first few worlds they terraform. But if the player has terraformed hundreds or thousands of worlds, then they are unlikely to ever zoom into the level to resolve individual trees on most of them. The capacity would always be there, but it simply wouldn’t matter anymore if you cut down a single tree on a random world. If you decide to burn down a whole forest on a random world, then perhaps it would affect some statistics of that planet by 0.1% and then you could track that change, but it would be fine to stop tracking that change of a single tree being removed 10 years ago on a minor planet in your empire.
What I am saying here is basically that this same system that will manage the flora can be extended to work throughout the whole game on all scales. The importance of a single tree in Awakening Stage might be similar to importance of a single village in Society Stage, or a typical town’s importance in the Industrial Stage, or a typical city in early Space Stage.
The importance of a single object would never instantaneously become irrelevant, but would instead gradually and continuously become less relevant over time as the game progresses.
You can also think about it this way: The amount of detail, and thus computation time, that should be spent on different features, whether they are trees, towns, planets, systems, or galactic sectors, should be proportional to the amount of focus that the player gives to it. As the game progresses, less time will be spent on the high detail features that were in previous stages because the game has moved on, but if the player chooses to focus on those small features at any time, they can.
Bringing it back to the Universe generation, at the end of the Space Stage, perhaps the player’s society spans 100K Systems, with 200K terraformed planets. The player has probably only ever observed maybe 1K planets directly, and only actually spent enough time to zoom into and explore maybe 100 planets. Even though the other 99.5% of planets that the player owns have never been seen by the player, and are really just tracked by simple statistics, the universe would still be able to feel as expansive as a real size universe would because the player could choose at any time to zoom into any of those 199K other planets they own and see it generated fully, to whatever depth they want. And it wouldn’t just be a completely random planet that is generated, it would be one that is based on the statistics of the overall region and its history. If a region of the society has a certain ratio of sentient citizens species then that should be reflected in the generated planet. If a region was recently damaged by war, then that should also be reflected in the generated planet.
How long would it take for a planet’s details to be generated?
Hello again. I apologize for the long silence, I have not died and I do want to continue with this project, but lately life’s been moving at supersonic speeds. I’ve finished my BSc. thesis, my fiancée and I are buying an apartment, so we’re dealing with mortgage and IKEA furniture and talking to construction workers constantly and we’re planning a wedding and on top of that I have some big projects at work. Hopefully soon things should calm down a bit and I should be able to focus on this project a lot more once again. As a little treat for your patience, here’s an update on how the project has been going:
I’ve decided to polish up the star generator to a level where I could be genuinely proud of it. Currently, the star generator takes three inputs - (initial) mass, age and metallicity. I wish to add speed of rotation around its own axis as another input parameter, as it can greatly affect many parameters and I have decided to go all in. These are the parameters from which I will be calculating everything else.
The values being calculated from these input values are anything from temperature, radius, luminosity, estimated main-sequence lifetime, peak wavelength or the current stage (e.g. main sequence with a specific classification, remnant phases such as white dwarves, black holes, or intermediate stages such as giants and supergiants). These are then used to visualize the results as a nice human-readable picture - I determine the phase, draw a simple circle of a given radius (log scale for obvious reasons), use peak wavelength to determine the color and so on.
I’ve created a dynamically populated Hertzsprung-Russell diagram to help me empirically test if the results I am getting reflect reality. For the same reason I’ve also added a dozen-or-so real life presets and measure the difference between the expected values and returned values. I’ve created a simple system that automatically tunes the function coefficients to better fit the expected data, however, I’ve decided to abandon it for a better approach.
As I said, currently I am just course-correcting a simple equation. The current equation is piece-wise, which has a lot of issues, but I was planning on converting it to a continuous function later, once I would be happy with the values. However, I’ve decided that the best approach would be using a multi-layer perceptron regressor. Basically, I will collect huge amounts of astronomical data and, using advanced technology indistinguishable from magic, I will create a function that nearly perfectly describes any parameter I want in relation to the input parameters.
So I started working on Starscraper - basically a simple scraping script that crawls through astronomical databases and wikipedia and any and all sources I could find to obtain as much data as possible. This, however, was not without a problem. Usually, each source has only a handful of data, often times with units different from other sources and/or internally inconsistent (for example Wiki provides age of stars either in Myrs or Gyrs). Not to mention that it’s often really difficult to parse out data for individual stars in entries that are binary or trinary systems. I’ve also had some parsing issues where on a web page the temperature would be e.g. 9000 K, but it would get parsed as “9”. Lastly, sometimes different sources claim vastly different values, which is making my life a lot harder.
So this is where the project currently stands - I need to collect huge amounts of reliable data on main-sequence stars so that I can fine-tune the equations for main sequence parameters. Fine-tuning the other phases (remnants, giants) will be secondary, as 1) I need a solid basis to build on top of, meaning plausible results for all MS and 2) MS stars are the most interesting from the utility standpoint, as cases where a non-MS star might be habitable are very rare and niche. That’s not to say I will disregard non-MS stars completely - honestly, they are my favorite part of this project. I however understand that the effort to utility ratio is much lower.
So, if by some miracle you’ve got hundreds or thousands of astronomical data about main-sequence stars just laying around, please do let me know, as with those I will be able to release the first polished version of this project in like one caffeine-filled weekend. Until then, I will try to figure something out, so that I don’t have to hack together the dataset by hand. Thank you for your patience. For those curious about the future, here is an approximate roadmap for this project.
Roadmap
Star generation - Work in progress
Solar system generation - Basic prototype
Planet surface generation - worked on by HyperbolicHadron
Climate generation - No work done yet
Should the makers of the data/ data’s sources the script is using be credited in some way after enough data has been gathered?
I think it is of his own making
You mean the scanned documents are of his own making?
the script [filler text]
Because I meant if he should credit the articles used to feed the model. Probably yes, as that would be the most ethical option, no?
Oh I misread. But yeah getting your work credited is nice but it also depends on the nature of the source
I think this sort of source should be relatively easy to credit…
That’s a great question I honestly have not thought of. The database I am pulling my data from is VizieR, they have a section about the rules of usage of VizieR data with specific instruction for crediting and acknowledgements, but I have not read it yet. I definitely will. As a small update, I managed to pull around 2,5k data points with MOST of the info I need, however, I will probably have to ditch the speed of rotation, as we mostly know only vsini (velocity of rotation v times sin(i) - the angle of rotation relative to Earth) and we don’t usually know the value of i. Also another but, the pulled data are not just MS stars but contain a wide range of star types, so I will have to filter it out a bit. And cross-checking star types is such a pain. Maybe I will figure something out, though. I had to cross-check for 2MASS (basically star ID) using their coords already, so maybe I will figure out something similar.
Also
Yes, the script itself is mine. You can tell because it runs like
It does work in what it needs to do at the minimum still most of the time, correct?
Seems like a good plan, feel free to update us when you have more to share.
Recently, I have been working on land biome classification, using a reduced version of Holdridge life zones to begin with.
And here is ocean classification. I currently have it recognize 4 different climate zones, but we are still deciding on how finely we want to classify the oceans.