Part I: Saving and Loading of Keras Sequential and Functional Models
Saving the entire model, only the architecture, or only the weights
In TensorFlow and Keras, there are several ways to save and load a deep learning model. With so many options, it can be confusing to know which one to pick for saving a model. Moreover, if you have a custom metric or a custom layer, the complexity increases even more.
The main goal of this article is to explain different approaches for saving and loading a Keras model. If you are a seasoned machine learning practitioner, you most probably know which option to select. You could skim through this article to see if there is any value you are going to get from reading it.
The outline of this article is as follows:
- Overview of saving and loading of Keras models
- Saving and loading only the architecture
- Saving and loading only the weights
- Saving and loading entire models
1. Overview of Saving and Loading of Models
Before going into details, let us see what a Keras model consists of (the following is from the TensorFlow website):
- An architecture, or configuration, which specifies what layers the model contains and how they're connected
- A set of weights values (the "state of the model")
- An optimizer state (defined by compiling the model)
- A set of losses and metrics (defined by compiling the model)
Depending on your requirements, you may want to save only the architecture of the model and share it with someone (a team member or a client). In another case, you may want to save only the weights and resume the training later from where you left off. Finally, you may want to save the entire model (architecture, weights, optimizer state, and training configuration) and share it with others. The good thing is that Keras allows us to do all of these.
The above flowchart explains the different options for saving and loading: (1) the model architecture only, (2) the model weights only, and (3) the entire model (architecture, weights, optimizer state, and training configuration).
The scope of this article is to describe options (1) and (2) in more detail, and option (3) with a simple example. As saving a complex Keras model with custom metrics and custom layers is not simple, we will deal with it in another article (Part II of this series).
The model architecture can be saved in either json or yaml format, whereas the model weights or the entire model can be saved in either tf format or h5 format. In the following sections, more details on these options are provided with code examples.
2. Saving and loading only architecture
The model architecture comprises the Keras layers used in the model creation and how the layers are connected to each other. After defining the model architecture, you can plot it using the plot_model function under tensorflow.keras.utils. Executing the command plot_model(model, show_shapes=True) plots the architecture, where you can see the names of the layers, how the layers are connected to each other, and the input and output shapes of the data.
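As a rough sketch (the article's original code appeared as screenshots that are not reproduced here), the following defines a small Sequential model and plots it; the layer sizes are assumptions for illustration:

```python
import tensorflow as tf
from tensorflow.keras.utils import plot_model

# Assumed architecture: a small dense classifier for illustration only.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Writes model.png with layer names, connections, and input/output shapes.
# Note: plot_model requires the pydot and graphviz packages to be installed.
plot_model(model, show_shapes=True, to_file='model.png')
```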
If your use case requires saving only the model architecture, you can save it in either json format or yaml format as shown below. The first command (model.to_json()) in the flowchart saves the architecture in json format, and the second command (model_from_json()) loads the architecture back into a newly instantiated model. A more detailed example code is provided below. As the approach is the same for json and yaml, the following example is described for json format only.
In this section, we will take a simple classification model, train it, save the architecture and weights, create a new model, load the saved architecture, load the saved weights, and finally compare the performance of the model before saving versus after loading the model architecture.
The following example uses a simple Keras Sequential model with MNIST data to classify a given image of a digit between 0 and 9. After training the model, its performance was evaluated with the (x_test, y_test) data.
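A minimal sketch of this setup might look as follows; the architecture, number of epochs, and hyperparameters are assumptions, not the article's exact code:

```python
import tensorflow as tf

# Load and scale MNIST.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Assumed architecture: a small dense classifier for 28x28 digit images.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)

# Baseline performance to compare against after reloading.
model.evaluate(x_test, y_test, verbose=2)
```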
The architecture and weights of the model were saved to disk as follows. In the following example, the architecture was saved in json format, but the code is very similar for saving the architecture in yaml format. Note that the weights can be saved in two formats (tf and h5) as shown below.
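A sketch of the save step, continuing from the model above; the file names are illustrative assumptions:

```python
import os

# Save the architecture only, as a JSON string.
json_config = model.to_json()
with open('model_config.json', 'w') as f:
    f.write(json_config)

# Save the weights only, in both supported formats.
os.makedirs('weights_tf', exist_ok=True)
model.save_weights('weights_tf/ckpt', save_format='tf')  # TensorFlow format
model.save_weights('weights.h5', save_format='h5')       # HDF5 format
```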
As we have the model configuration, we can create a new model (loaded_model_h5) with freshly initialized weights. The weights of loaded_model_h5 are freshly initialized, which means they are different from those of the original model. Before evaluating the performance of the model, we need to load the weights that were saved earlier. Once you run the following code, you can notice that the performance is identical before saving and after loading the architecture.
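A sketch of the corresponding load step, under the same assumed file names (x_test and y_test as loaded in the earlier sketch):

```python
from tensorflow.keras.models import model_from_json

# Rebuild the architecture from the saved JSON config.
with open('model_config.json') as f:
    loaded_model_h5 = model_from_json(f.read())

# Weights are random at this point; load the saved h5 weights.
loaded_model_h5.load_weights('weights.h5')
loaded_model_h5.compile(optimizer='adam',
                        loss='sparse_categorical_crossentropy',
                        metrics=['accuracy'])

# Should match the original model's test performance.
loaded_model_h5.evaluate(x_test, y_test, verbose=2)
```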
Similarly, the performance of the loaded_model_tf is identical to that of the original model. The complete code used in this section is shared here.
3. Saving and loading only weights
As mentioned earlier, model weights can be saved in two different formats, tf and h5. Moreover, weights can be saved either during model training or before/after training.
When do you need to save only weights?
If you are training a model, you don't need to save the entire model at each and every epoch, because there is no change in the architecture of the model. You only need to save the weights and biases to capture the learnable parameters of the model. Keras provides a couple of options to save the weights and biases either during the training of the model or before/after the model training.
3.1 Saving weights before training
When you instantiate a model API (Sequential/Functional) and provide an architecture (a stack of Keras layers), the model's architecture is created and its weights and biases are randomly initialized. At this point, you could compile the model (model.compile) and save the weights and biases. This is what I am calling saving only weights before training.
Note that so far we haven't trained the model (model.fit). Running model.predict with randomly initialized weights and biases is not useful, as they have not been updated through training.
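A sketch of saving the weights straight after compiling, before any call to model.fit; the build_model helper is a hypothetical convenience, not from the article:

```python
import tensorflow as tf

def build_model():
    """Hypothetical helper: the same assumed architecture as the earlier sketches."""
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = build_model()
# No model.fit() yet: these are the randomly initialized weights.
model.save_weights('pretraining_weights.h5')
```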
When you instantiate a new_model as shown below, its weights are randomly initialized and therefore different from what we saved earlier using model.save_weights. If we don't load the saved weights into the new_model, the predictions will differ because the weights are very different.
The following example code demonstrates the difference in the predictions when you don't load the saved weights. Finally, when we load the saved weights using new_model.load_weights(), the predictions are identical to those of the original model.
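A sketch of that comparison, reusing the hypothetical build_model helper and the saved file from above:

```python
import numpy as np

# build_model and x_test come from the earlier sketches.
new_model = build_model()  # fresh random initialization

sample = x_test[:1]
# Different random weights, so the predictions generally differ.
print(np.allclose(model.predict(sample), new_model.predict(sample)))  # usually False

new_model.load_weights('pretraining_weights.h5')
# Identical weights, so identical predictions.
print(np.allclose(model.predict(sample), new_model.predict(sample)))  # True
```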
3.2 Saving weights after training
This is the approach most commonly used. After instantiating the model, it can be compiled and trained. As mentioned in Section 3.1, when you instantiate a model, the weights get initialized randomly. During training, these weights get updated to optimize the performance of the model. Once you are happy with the model's performance, you can save the weights using model.save_weights. You can gauge the performance of the model by executing model.evaluate with the test data as shown below.
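A sketch of the train, evaluate, save sequence; the file name is an illustrative assumption:

```python
# build_model, x_train, etc. come from the earlier sketches.
model = build_model()
model.fit(x_train, y_train, epochs=5)

# Gauge performance, then persist the learned parameters.
model.evaluate(x_test, y_test, verbose=2)
model.save_weights('trained_weights.h5')
```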
When you want to resume the training, you can instantiate the model architecture, compile the model, and load the weights using new_model.load_weights. You can check the performance of the model before saving and after loading the weights into the new_model. The performance is identical, as shown in the following example. The full code used in this section is shared here.
3.3 Saving weights during training
Keras allows us to save weights during model training through the ModelCheckpoint callback.
But why do we need to save weights during training?
- If your model is small and takes only a few seconds to train, then we don't need to save weights during training. But what if your model is big and training takes hours or days, and an unexpected failure (e.g. a power failure) in the system or an out-of-memory (OOM) problem causes the training process to stop?
- Practically, machine learning models will get new data continuously. We can add the new data to the training data and use the latest checkpoint to retrain the model so that the performance is better
- Maybe you trained a model on one machine and now you have a much bigger machine, but you don't want to start the training from scratch.
What is a checkpoint?
Checkpointing is a technique that provides fault tolerance for computing systems. It basically consists of saving a snapshot of the application's state, so that applications can restart from that point in case of failure. This is particularly important for long-running applications that are executed on failure-prone computing systems. (from Wikipedia)
Keras has the ModelCheckpoint callback, but what does it do?
- ModelCheckpoint captures the weights of the model or the entire model
- It allows us to specify which metric to monitor, such as loss or accuracy on the training or validation dataset. It can save weights automatically when the improvement in the metric satisfies a set condition
- It allows us to load the latest checkpoint to resume training where it was left off
- The ModelCheckpoint callback also allows us to save the best model or the best model weights when you select save_best_only=True
The major use of ModelCheckpoint is to save the model weights or the entire model when an improvement is observed during training. The code below saves the model weights every epoch.
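A sketch of per-epoch checkpointing, using the checkpoint_path from the article's example; the training setup reuses the hypothetical helpers from the earlier sketches:

```python
import os
import tensorflow as tf

checkpoint_path = "training_1/cp-{epoch:04d}.ckpt"
os.makedirs(os.path.dirname(checkpoint_path), exist_ok=True)

# Save the weights (not the full model) at the end of every epoch.
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    save_weights_only=True,
    save_freq='epoch',
    verbose=1)

model = build_model()
model.fit(x_train, y_train, epochs=10,
          validation_data=(x_test, y_test),
          callbacks=[cp_callback])
```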
In the above example, we set checkpoint_path = "training_1/cp-{epoch:04d}.ckpt" so that the checkpoint file names change automatically depending on the save_freq argument. However, if you select save_best_only=True, then you can define a fixed checkpoint path such as checkpoint_path = "training_2/cp.ckpt", and this file is overwritten whenever the current model performs better than the previous best model. The following code shows how to use save_best_only.
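A sketch of save_best_only, using the fixed checkpoint path from the article; here the monitored quantity is assumed to be val_loss:

```python
checkpoint_path = "training_2/cp.ckpt"

# Overwrite the single checkpoint only when val_loss improves.
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    monitor='val_loss',
    save_best_only=True,
    save_weights_only=True,
    verbose=1)

model = build_model()
model.fit(x_train, y_train, epochs=10,
          validation_data=(x_test, y_test),
          callbacks=[cp_callback])
```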
ModelCheckpoint saves three different files: (1) a checkpoint file, (2) an index file, and (3) a checkpoint data file. The three files are as shown below.
- checkpoint: This file records the latest checkpoint. The file in our example contains model_checkpoint_path: "cp-0010.ckpt" and all_model_checkpoint_paths: "cp-0010.ckpt"
- cp-0001.ckpt.data-00000-of-00001: contains the values of all the variables after the first epoch, without the architecture.
- cp-0001.ckpt.index: simply stores the list of variable names and shapes saved after the first epoch. More details here
The arguments of the ModelCheckpoint callback are as shown below (from the TensorFlow website). Try playing with these arguments to learn more.
- filepath: string or PathLike, path to save the model file. filepath can contain named formatting options, which will be filled with the value of epoch and keys in logs (passed in on_epoch_end). For example: if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, then the model checkpoints will be saved with the epoch number and the validation loss in the filename.
- monitor: quantity to monitor.
- verbose: verbosity mode, 0 or 1.
- save_best_only: if save_best_only=True, the latest best model according to the quantity monitored will not be overwritten. If filepath doesn't contain formatting options like {epoch}, then filepath will be overwritten by each new better model.
- mode: one of {auto, min, max}. If save_best_only=True, the decision to overwrite the current save file is made based on either the maximization or the minimization of the monitored quantity. For val_acc, this should be max; for val_loss, this should be min; etc. In auto mode, the direction is automatically inferred from the name of the monitored quantity.
- save_weights_only: if True, then only the model's weights will be saved (model.save_weights(filepath)); else the full model is saved (model.save(filepath)).
- save_freq: 'epoch' or integer. When using 'epoch', the callback saves the model after each epoch. When using an integer, the callback saves the model at the end of this many batches. If the Model is compiled with experimental_steps_per_execution=N, then the saving criteria will be checked every Nth batch. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to 'epoch'.
- **kwargs: Additional arguments for backwards compatibility. Possible key is period.
The full code used in this section is here.
4. Saving and loading entire models
The entire Keras model (architecture + weights + optimizer state + compiler configuration) can be saved to disk in two formats: (1) the TensorFlow SavedModel (tf) format, and (2) the H5 format.
Moreover, the entire Keras model can be saved either during training or before/after training the model. However, here we will discuss simply saving and loading an entire model after training. In Part II of this series, we will learn more about other use cases.
model.save('MyModel', save_format='tf') saves the entire model in tf format. This will create a folder named MyModel which contains:
- assets: a directory that may contain additional data used by the TensorFlow graph, such as vocabulary files, class names, and text files used to initialize vocabulary tables.
- saved_model.pb: the TensorFlow graph, plus the training configuration and optimizer state
- variables: the weights are saved in this directory
model.save('MyModel_h5', save_format='h5') saves the entire model in h5 format. This will create a single file that contains everything required for you to reload it later, whether to restart training or to predict.
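Putting the two save commands together, continuing with the trained model from the earlier sketches:

```python
# Save in the SavedModel (tf) format: creates the MyModel/ directory.
model.save('MyModel', save_format='tf')

# Save in the H5 format: creates a single file.
model.save('MyModel_h5', save_format='h5')
```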
You can delete the model object, as the object returned by tf.keras.models.load_model doesn't depend on the code that was used to create it. After loading, you can check the performance of the loaded_model and compare it with the performance of the original model. The performance before saving and after loading is identical. The full code of this example is provided here.
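A sketch of deleting, reloading, and re-evaluating (x_test and y_test as in the earlier sketches):

```python
import tensorflow as tf

del model  # the reloaded object does not depend on the original code

loaded_model = tf.keras.models.load_model('MyModel')       # tf format
# loaded_model = tf.keras.models.load_model('MyModel_h5')  # h5 format also works

# No compile needed: the optimizer state and compile configuration were saved.
loaded_model.evaluate(x_test, y_test, verbose=2)
```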
The above example is a very simple one, without any custom metrics/losses or custom layers. In Part II of this series, we will see how to deal with saving and loading slightly more complex TensorFlow models.
5. Conclusions
Top five takeaways from this article are:
I. Keras has the ability to save a model's architecture only, a model's weights only, or the entire model (architecture and weights)
II. The architecture can be serialized into json or yaml format. A new model can be created from the model configuration that was serialized earlier. You can use model_from_json or model_from_yaml to create a new model.
III. A Keras model's weights can be saved either during training or before/after training.
IV. The Keras ModelCheckpoint callback can be used to save the best weights of a model, or to save weights every N batches or every epoch
V. A model's weights or the entire model can be saved in two formats: (1) the new SavedModel (also called tf) format, and (2) the HDF5 (also called h5) format. There are some differences between these two formats (we will deal with that in another article), but the general practice is to use the tf format if you are using TensorFlow or Keras.
References
- Save and Serialize guide from TensorFlow website
- https://keras.io/api/callbacks/
- https://keras.io/api/models/model_saving_apis/
Source: https://medium.com/swlh/saving-and-loading-of-keras-sequential-and-functional-models-73ce704561f4