|
Description
|
The complete data set, “MOTHER_Macaque_Monkey_Preantral_Follicles_00N.zip”:
This is part of a multi-part Zip archive with six parts. Most Zip software packages will automatically unzip all six files if you unzip the first file, “MOTHER_Macaque_Monkey_Preantral_Follicles_001.zip”. Downloading this file takes about 2 hours on a stable high-speed network.
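Because the archive can only be reconstructed when every part is present, it may help to confirm that all six files were downloaded before unzipping. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the parts follow the "_00N.zip" pattern of the first file name given above; the function name `missing_parts` and the download folder are hypothetical. If the downloaded files instead carry the ".zip.00N" extension form mentioned elsewhere in this description, adjust the pattern accordingly.

```python
from pathlib import Path


def missing_parts(download_dir,
                  stem="MOTHER_Macaque_Monkey_Preantral_Follicles",
                  n_parts=6):
    """Return the expected part filenames that are absent from download_dir.

    Assumption: parts are named <stem>_001.zip ... <stem>_006.zip, following
    the first file name quoted in the description.
    """
    expected = [f"{stem}_{i:03d}.zip" for i in range(1, n_parts + 1)]
    present = {p.name for p in Path(download_dir).iterdir()}
    return [name for name in expected if name not in present]
```

An empty list returned for the download folder means all six parts are in place and the first file can be opened.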
Individual images are organized into folders by follicle type and into Train, Test and Validate subfolders, which we used to train our machine learning algorithm. Various image augmentations are also included, such as color inversion and image rotations. Each annotation of a follicle yields a set of 48 images in total (the original plus its augmentations), and the whole set is always kept in the same Train, Test or Validate folder.
The data set also contains an extensive set of images representing non-follicle portions of the ovary. These images can be used as counterexamples to the preantral follicle classification sets. Each image filename identifies the full-size histology image, the follicle type, the location of the annotation in the full-size image, and how the image was augmented. Images were assigned randomly to the Train, Test and Validate folders in a 75:20:5 ratio. If desired, these three folders can be combined and the data repartitioned.
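If the folders are combined and repartitioned, keeping every augmentation of one annotation in the same partition avoids leaking near-duplicate images between training and evaluation data. The sketch below shows one way to do that under stated assumptions: the function `repartition` and its `group_key` argument are hypothetical, and because the exact filename layout is not specified here, the caller must supply a `group_key` that maps a filename to its annotation (for example, by stripping the augmentation part of the name). The split is made at the annotation level, which may differ slightly from how the original partition was produced.

```python
import random
from collections import defaultdict


def repartition(filenames, group_key, ratios=(0.75, 0.20, 0.05), seed=0):
    """Split filenames into Train/Test/Validate, keeping each group together.

    group_key is a caller-supplied (hypothetical) function that maps a
    filename to an identifier shared by all augmentations of one annotation.
    """
    # Collect all files that belong to the same annotation.
    groups = defaultdict(list)
    for name in filenames:
        groups[group_key(name)].append(name)

    # Shuffle annotation identifiers reproducibly, then cut by ratio.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = round(ratios[0] * len(keys))
    n_test = round(ratios[1] * len(keys))
    split_keys = {
        "Train": keys[:n_train],
        "Test": keys[n_train:n_train + n_test],
        "Validate": keys[n_train + n_test:],  # remainder
    }
    return {
        split: [f for k in ks for f in groups[k]]
        for split, ks in split_keys.items()
    }
```

Changing `ratios` or `seed` gives a different partition while still keeping each set of augmentations intact.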
In total, the dataset contains 1.7 million images based on approximately 7,700 annotated follicles. At ~120 GB, this is a large dataset. You need to download the entire set of Zip archive parts, those with “.zip.00N” extensions, where N is a digit. Zip software will reconstruct the complete Zip archive if you open the first file in the series. |