I have a 7 GB .tgz archive of thousands of high-res photos that I'd like to work with in Python. I'm able to do the following in the case of a single image, but I'm not sure how to work with such a large amount of data or with the .tgz file format. I have googled, but perhaps I'm not using the best search terms. Explicit code would help me understand.
How do I load the .tgz data in Python? (pickle, numpy, tarfile? pip install tarfile fails.) I want to convert the images to numpy arrays.
How do I make all of the images a set resolution?
How do I convert all of the images to greyscale?
The goal is to manipulate the data for use in a convolutional neural network (CNN).
I'm not sure if handling the archive is really the problem; it's quite obvious that a .tgz file should be handled using tarfile. tarfile is a built-in module in Python, so you do not need to pip install it.
#!/usr/bin/env python

# import the tarfile module
import tarfile

# open the tarfile for reading ("r:gz" = gzip-compressed tar)
itgz = tarfile.open("photos.tgz", "r:gz")
# open a tarfile for saving the edited images
otgz = tarfile.open("photos_edited.tgz", "w:gz")

# handle the images one by one
for img_name in itgz.getnames():
    # extract whatever you want
    itgz.extract(img_name)
    # do your image processing with numpy, PIL, or the tool of your choice
    # then, if you want to save the edited images to a tar file:
    otgz.add(img_name)

itgz.close()
otgz.close()
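To cover the resolution, greyscale, and numpy-array parts of the question as well, here is a minimal sketch. It assumes Pillow is installed (pip install Pillow), that the archive contains ordinary image files (JPEG/PNG), and that the function name, the archive name, and the 128x128 target size are all placeholders you would adapt. It reads each member directly from the archive with extractfile, so nothing is unpacked to disk:

```python
import tarfile

import numpy as np
from PIL import Image


def load_greyscale_batch(tgz_path, size=(128, 128)):
    """Load every image in a .tgz into one (n_images, height, width) float32 array."""
    images = []
    with tarfile.open(tgz_path, "r:gz") as tgz:
        for member in tgz.getmembers():
            if not member.isfile():
                continue  # skip directories stored inside the archive
            fileobj = tgz.extractfile(member)  # file-like object, no disk write
            img = Image.open(fileobj)
            img = img.convert("L")             # "L" = 8-bit greyscale
            img = img.resize(size)             # force a uniform resolution
            # scale pixel values to [0, 1], a common convention for CNN input
            images.append(np.asarray(img, dtype=np.float32) / 255.0)
    return np.stack(images)
```

For a 7 GB archive you may not want the whole batch in memory at once; the same loop can just as well yield one array at a time, or write processed arrays out in chunks, depending on how your CNN framework ingests data.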