Scripts Reference¶

`train_mrcnn_model.py`¶

Train a new mrcnn model starting from pretrained COCO weights

optional arguments:

`-h, --help`	show this help message and exit
`--steps STEPS`	1, 2, or 3. How many of the steps to train: (heads, 4+, entire model)
`--model_dir MODEL_DIR`
	Directory in which to create new training subdirectory for checkpoints and tensorboard logs.
`--data_dir DATA_DIR`
	Directory in which to find `train` and `test` subdirectories containing labeled images. Extract OCR top and base depths from images with pytesseract

`get_ocr_depths.py`¶

Extract OCR top and base depths from images with pytesseract

optional arguments:

`-h, --help`	show this help message and exit
`--root_dir ROOT_DIR`
	A common parent directory of all target `<subdir>` directories.
`--subdir SUBDIR`
	A string contained in the name of all target subdirectories.
`--save_name SAVE_NAME`
	Name of depths csv file(s) to be saved in matching subdirs.
`--force`	Flag to force overwrite of any existing `<save_name>.csv` files.
`--inspect`	Flag to inspect images and print OCR output whenever there is an issue.

As an example, you can test the script by running:

This should save a new file at tests/data/two_image_dataset/auto_depths_test.csv, with contents like:

,top,bottom
S00101409.jpeg,2348.0,2350.0
S00111582.jpeg,7716.0,2220.0

Note that 7716.0 is a misread, and should have been 2218.0. At least with our BGS images, some manual corrections are usually required, but this provides a template for the --depth_csv file required to run process_directory.py.

`process_directory.py`¶

Process directory of raw images with Mask R-CNN and save results as a CoreColumn.

The path given should contain images as jpeg files, and a depth_csv file in the format:

         ,    top,    bottom
<filename1>, <top1>, <bottom1>
...
<filenameN>, <topN>, <bottomN>

NOTE: model Config, class_names, and segmentation layout_params can only be changed manually at the top of script, and default to those configured in defaults.py

positional arguments:

path Path to directory of images (and depth information csv) to process.

optional arguments:

`-h, --help`	show help message and exit
`--model_dir MODEL_DIR`
	Directory to load `mrcnn` model from. Default=``defaults.MODEL_DIR``
`--weights_path WEIGHTS_PATH`
	Path to model weights to load. Default=``defaults.CB_MODEL_PATH``
`--add_tol ADD_TOL`
	Gap tolerance when adding `CoreColumn` objects, default=5.0.
`--add_mode ADD_MODE`
	`CoreColumn.add_mode`. One of {‘fill’, ‘collapse’}.
`--depth_csv DEPTH_CSV`
	Name of filename + (top, bottom) csv to read from `path`, default=``’auto_depths.csv’``
`--save_dir SAVE_DIR`
	Path to save `CoreColumn` to, default=None will save to `path`
`--save_name SAVE_NAME`
	Name to use for `CoreColumn.save`, default=None results in `CoreColumn_<top>_<base>`
`--save_mode SAVE_MODE`
	One of {‘pickle’, ‘numpy’}. Whether to save as single `pkl` file or multiple `npy` files

Assuming you’ve downloaded and unzipped the assets folder in the default location, you can test the script with default parameters by running:

$ cd scripts
$ python process_directory.py ../tests/data/two_image_dataset --depth_csv dummy_depths.csv

This should save the aggregated CoreColumn to tests/data/two_image_dataset/CoreColumn_1.00_5.00.pkl.

`prune_imageData.py`¶

Remove the imageData field from all JSON files in tree below path:

positional arguments:

path Path to parent of all target JSON files.

optional arguments:

-h, --help

show this help message and exit

Scripts Reference¶

train_mrcnn_model.py¶

get_ocr_depths.py¶

process_directory.py¶

prune_imageData.py¶

`train_mrcnn_model.py`¶

`get_ocr_depths.py`¶

`process_directory.py`¶

`prune_imageData.py`¶