Datasets
Note: the links and text on this page are placeholders. The final archives have not been generated yet. Give it a couple of months.
Intended Use and Licensing
The datasets are distributed solely for non-commercial academic research. Permitted uses are limited to direct academic research activities conducted without commercial intent.
Use for training, fine-tuning, or otherwise improving large language models or other AI systems is expressly forbidden.
These limitations apply on top of the license specified in the footer of this page.
If you are unsure whether your intended use is permitted, contact information is available on the Home page.
Microstructures
All of these datasets contain the same type of structure. Every structure has the following properties:
- 128x128x128 binary voxel volume
- Non-homogeneous
- Periodic along all axes
- Connected
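The "Periodic" and "Connected" properties can be spot-checked on a decoded sample. The following is a minimal sketch, not the authors' validation code. It assumes that the value 1 marks the phase that must be connected, that connectivity means face-adjacent (6-connected) voxels, and that the volume is stored as a flat sequence indexed as `(x * n + y) * n + z`. The function name `is_connected_periodic` is hypothetical.

```python
from collections import deque


def is_connected_periodic(voxels, n):
    """Return True if all voxels equal to 1 form one 6-connected component
    when every axis wraps around (periodic boundaries).

    Assumptions (not taken from the dataset description):
    - voxels is a flat sequence of 0/1 values of length n**3,
      indexed as (x * n + y) * n + z
    - the value 1 marks the phase that should be connected
    """
    if len(voxels) != n ** 3:
        raise ValueError("expected n**3 voxel values")
    total = sum(voxels)  # number of voxels set to 1
    if total == 0:
        return False
    start = voxels.index(1)
    seen = bytearray(n ** 3)
    seen[start] = 1
    queue = deque([start])
    reached = 1
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    while queue:
        idx = queue.popleft()
        x, rem = divmod(idx, n * n)
        y, z = divmod(rem, n)
        for dx, dy, dz in steps:
            # The modulo makes each axis wrap, so opposite faces are neighbours.
            j = (((x + dx) % n) * n + (y + dy) % n) * n + (z + dz) % n
            if voxels[j] and not seen[j]:
                seen[j] = 1
                reached += 1
                queue.append(j)
    return reached == total
```

For a full sample, `n` would be 128. A pure-Python traversal of roughly two million voxels is slow, so this form is mainly useful for checking a handful of samples or small test volumes.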
Download
- Using OTBV encoding:
  - 1k samples, OTBV encoded - Download (XXX MB .tar archive)
  - 10k samples, OTBV encoded - Download (XXX MB .tar archive)
  - 100k samples, OTBV encoded - Download (XXX MB .tar archive)
  - 1M samples, OTBV encoded - Download (XXX MB .tar archive)
- Automatic decoding:
  - 1k samples, HDF - Download (XXX MB .zip archive)
  - 10k samples, HDF - Download (XXX MB .zip archive)
  - 100k samples, HDF - Download (XXX MB .zip archive)
  - 1M samples, HDF - Download (XXX MB .zip archive)
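Before unpacking one of the larger .tar archives, it may help to check how many files it holds and how much disk space they need. The sketch below uses only Python's standard `tarfile` module and assumes nothing about the internal layout of the archives, which has not been published yet. The function name `summarize_tar` is hypothetical.

```python
import tarfile


def summarize_tar(path):
    """Return (number of regular files, total unpacked size in bytes)
    for a .tar archive, without extracting anything."""
    count = 0
    total_bytes = 0
    # Mode "r" lets tarfile detect compression (if any) automatically.
    with tarfile.open(path, "r") as archive:
        for member in archive:
            if member.isfile():
                count += 1
                total_bytes += member.size
    return count, total_bytes
```

Calling `summarize_tar` on a downloaded archive returns the file count and total size, which can be compared against the advertised sample count before extraction.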
Because of the size of the larger datasets, only the 1k dataset has been manually checked for quality.
All data is synthetic.
The generation algorithm is open source [link to be added after publication].