Evaluation

Simple callbacks to evaluate the model being trained on a dataset at different points during training.

We’re going to build a class that stores a given dataset and calls `evaluate` on the model being trained whenever needed to obtain the evaluation metrics. It should be flexible enough to take either a number of epochs or a number of batches as the evaluation frequency. This could be solved with separate callbacks for epochs and batches, but a single callback is probably enough.


EvaluateDataset

 EvaluateDataset (dataset, freq_epochs=None, freq_batches=None, append='')

Evaluates a given tf.data.Dataset at different training times.

Parameter	Type	Default	Details
dataset			Dataset to be evaluated.
freq_epochs	NoneType	None	Number of epochs to wait between evaluations. `None` disables evaluation at an epoch interval.
freq_batches	NoneType	None	Number of batches to wait between evaluations. `None` disables evaluation at a batch interval.
append	str	''	Text to append to the metrics’ names as an identifier.
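
The behaviour these parameters describe can be sketched without TensorFlow. The class below is a hypothetical stand-in, not the library's implementation: `EvaluateDatasetSketch`, `FakeModel` and the counter `seen_batches` are names made up for illustration, and the class only mimics the Keras callback hooks (`set_model`, `on_train_batch_end`, `on_epoch_end`) instead of subclassing `tf.keras.callbacks.Callback`. It also assumes that batches are counted across the whole training run and that results are stored in dictionaries keyed by metric name plus `append`.

```python
class EvaluateDatasetSketch:
    """Hypothetical stand-in for EvaluateDataset (not the library's code)."""

    def __init__(self, dataset, freq_epochs=None, freq_batches=None, append=""):
        self.dataset = dataset
        self.freq_epochs = freq_epochs
        self.freq_batches = freq_batches
        self.append = append
        self.model = None
        self.seen_batches = 0      # assumption: counted across all epochs
        self.results_epochs = {}   # metric name -> list of values
        self.results_batches = {}  # metric name -> list of values

    def set_model(self, model):
        # Keras calls this hook before training starts.
        self.model = model

    def _evaluate(self, store):
        # Evaluate the stored dataset and append each metric under a tagged name.
        metrics = self.model.evaluate(self.dataset, return_dict=True, verbose=0)
        for name, value in metrics.items():
            store.setdefault(name + self.append, []).append(value)

    def on_train_batch_end(self, batch, logs=None):
        self.seen_batches += 1
        if self.freq_batches and self.seen_batches % self.freq_batches == 0:
            self._evaluate(self.results_batches)

    def on_epoch_end(self, epoch, logs=None):
        # Keras epochs are 0-indexed, hence the +1.
        if self.freq_epochs and (epoch + 1) % self.freq_epochs == 0:
            self._evaluate(self.results_epochs)


class FakeModel:
    """Stand-in model so the sketch runs without TensorFlow."""

    def __init__(self):
        self.calls = 0

    def evaluate(self, dataset, return_dict=True, verbose=0):
        self.calls += 1
        return {"loss": -0.1 * self.calls}


# Simulate 2 epochs of 10 batches each, evaluating every 5 batches.
fake_model = FakeModel()
cb_sketch = EvaluateDatasetSketch(dataset=None, freq_batches=5, append="_TID2013")
cb_sketch.set_model(fake_model)
for epoch in range(2):
    for batch in range(10):
        cb_sketch.on_train_batch_end(batch)
    cb_sketch.on_epoch_end(epoch)

print(list(cb_sketch.results_batches))                   # ['loss_TID2013']
print(len(cb_sketch.results_batches["loss_TID2013"]))    # 4
```

Handling both frequencies in one class keeps the evaluation logic in a single place; each hook only decides whether its own interval has been reached.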

from iqadatasets.datasets.tid2013 import TID2013
from iqadatasets.datasets.tid2008 import TID2008

tid13 = TID2013("/media/disk/databases/BBDD_video_image/Image_Quality/TID/TID2013")
tid08 = TID2008("/media/disk/databases/BBDD_video_image/Image_Quality/TID/TID2008")

from perceptnet.networks import PerceptNet
from perceptnet.pearson_loss import PearsonCorrelation

model = PerceptNet(kernel_initializer="ones", gdn_kernel_size=1, learnable_undersampling=False)
model.compile(optimizer="adam",
              loss=PearsonCorrelation())

2022-11-07 12:32:23.207120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2373 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 780 Ti, pci bus id: 0000:02:00.0, compute capability: 3.5
2022-11-07 12:32:23.209164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10793 MB memory:  -> device: 1, name: Tesla K40m, pci bus id: 0000:03:00.0, compute capability: 3.5
2022-11-07 12:32:23.210353: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 5435 MB memory:  -> device: 2, name: NVIDIA GeForce GTX TITAN Black, pci bus id: 0000:83:00.0, compute capability: 3.5
2022-11-07 12:32:23.211964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 5435 MB memory:  -> device: 3, name: NVIDIA GeForce GTX TITAN Black, pci bus id: 0000:84:00.0, compute capability: 3.5

cb_eval = EvaluateDataset(tid13.dataset.batch(16).take(4), freq_batches=5, append="_TID2013")
history = model.fit(tid08.dataset.batch(16).take(10), epochs=2, callbacks=[cb_eval])

2022-11-07 12:32:23.454509: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)

Epoch 1/2

2022-11-07 12:32:26.883482: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2022-11-07 12:32:27.433804: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-11-07 12:32:28.139269: W tensorflow/core/common_runtime/bfc_allocator.cc:272] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.02GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-11-07 12:32:28.232115: W tensorflow/core/common_runtime/bfc_allocator.cc:272] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.30GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-11-07 12:32:28.240635: W tensorflow/core/common_runtime/bfc_allocator.cc:272] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.30GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-11-07 12:32:28.440640: W tensorflow/core/common_runtime/bfc_allocator.cc:272] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.14GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-11-07 12:32:28.460049: W tensorflow/core/common_runtime/bfc_allocator.cc:272] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.30GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2022-11-07 12:32:28.680518: W tensorflow/core/common_runtime/bfc_allocator.cc:272] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.40GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

      6/Unknown - 9s 768ms/step - loss: -0.7324
WARNING:tensorflow:Callback method `on_train_batch_end` is slow compared to the batch time (batch time: 0.3184s vs `on_train_batch_end` time: 0.5511s). Check your callbacks.
10/10 [==============================] - 12s 738ms/step - loss: -0.8586
Epoch 2/2
10/10 [==============================] - 6s 668ms/step - loss: -0.8743

cb_eval.results_batches

{'loss_TID2013': [-0.8481918573379517,
  -0.8950705528259277,
  -0.88816237449646,
  -0.9112398624420166]}
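
The four values above are consistent with the training setup: 10 batches per epoch over 2 epochs gives 20 batches, and evaluating every 5 batches yields 4 evaluations. A small check of that arithmetic follows; `expected_evaluations` is a hypothetical helper, and it assumes the callback counts batches across the whole run rather than restarting at each epoch (both conventions give the same count here).

```python
def expected_evaluations(batches_per_epoch, epochs, freq_batches):
    """Number of evaluations if batches are counted across the whole run."""
    total_batches = batches_per_epoch * epochs
    return total_batches // freq_batches


n_evals = expected_evaluations(batches_per_epoch=10, epochs=2, freq_batches=5)
print(n_evals)  # 4
```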