Image processing refactor
Created by: Dreamlone
The alternative title for the future PR on this issue is "Batch-support implementation for image processing"
FEDOT currently has a separate DataTypesEnum.image data type. With this type, all image matrices are loaded into a single array, and batches of them are then fed to the CNN for training. However, most image datasets are too large to keep all matrices in RAM at the same time. Therefore, my suggestion is to create a new DataTypesEnum.image_paths data type, or to modify the existing one, so that the data stores paths to the images instead of the images themselves.
The table would have a structure like the following:
| features | target |
|---|---|
| D:/dataset_name/train/first_features.png | 1 |
| D:/dataset_name/train/second_features.png | 0 |
Or, for a segmentation task for example, where the target is itself an image:
| features | target |
|---|---|
| D:/dataset_name/train/first_features.png | D:/dataset_name/train/first_target.png |
| D:/dataset_name/train/second_features.png | D:/dataset_name/train/second_target.png |
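
To make the two table layouts concrete, here is a minimal sketch of how such a table could be read. It assumes the table is stored as a CSV file with `features` and `target` columns, which is not decided in this issue. The function `read_path_table` and the `IMAGE_SUFFIXES` tuple are hypothetical names used only for illustration and are not part of FEDOT.

```python
import csv
import io
from typing import List, Tuple

# Assumption: targets that end with one of these suffixes are image paths
# (segmentation-style table); anything else is treated as a plain label.
IMAGE_SUFFIXES = ('.png', '.jpg', '.jpeg')


def read_path_table(text: str) -> Tuple[List[str], List[str], bool]:
    """Parse CSV text with 'features' and 'target' columns.

    Returns the feature paths, the raw target strings, and a flag that is
    True when every target looks like an image path.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    features = [row['features'].strip() for row in rows]
    targets = [row['target'].strip() for row in rows]
    targets_are_paths = bool(targets) and all(
        target.lower().endswith(IMAGE_SUFFIXES) for target in targets
    )
    return features, targets, targets_are_paths
```

Only the path strings are held in memory here; no image is opened at this stage, which is the point of the proposed data type.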
Most likely, it will be necessary to implement auxiliary classes (DataLoaders) that load the data from disk and pass it to the neural network in parts, one batch at a time.
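
A rough sketch of what such an auxiliary class might look like is given below. It is an illustration under stated assumptions, not a proposed FEDOT API: the class name `PathBatchLoader`, its parameters, and the pluggable `load_fn` callable are all hypothetical. The image reader is passed in as a function so the sketch does not depend on a particular imaging library.

```python
from typing import Any, Callable, Iterator, List, Sequence, Tuple


class PathBatchLoader:
    """Hypothetical loader: yields batches and reads images only on demand."""

    def __init__(self,
                 feature_paths: Sequence[str],
                 targets: Sequence[Any],
                 load_fn: Callable[[str], Any],
                 batch_size: int = 32):
        if len(feature_paths) != len(targets):
            raise ValueError('feature_paths and targets must have equal length')
        if batch_size < 1:
            raise ValueError('batch_size must be at least 1')
        self.feature_paths = feature_paths
        self.targets = targets
        self.load_fn = load_fn  # assumption: maps a path to an image matrix
        self.batch_size = batch_size

    def __len__(self) -> int:
        # Number of batches, counting a final partial batch
        return (len(self.feature_paths) + self.batch_size - 1) // self.batch_size

    def __iter__(self) -> Iterator[Tuple[List[Any], List[Any]]]:
        for start in range(0, len(self.feature_paths), self.batch_size):
            stop = start + self.batch_size
            # Only the images of the current batch are loaded into memory
            images = [self.load_fn(path) for path in self.feature_paths[start:stop]]
            yield images, list(self.targets[start:stop])
```

With this shape, at most `batch_size` images are resident at once. For a segmentation table, the targets of each batch would also need to be passed through `load_fn`; that step is left out here. Existing frameworks already offer comparable abstractions (for example, PyTorch's `torch.utils.data.DataLoader`), so reusing one of them may be preferable to a custom class.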