Save File Format
Overview
When saving a (partly) trained model to disk, the resulting model file is in YAML format and looks very similar to the configuration files (see Experiment configuration file format) with a few exceptions:
- Saved model files hold only one experiment (in contrast, config files contain dictionaries of several named experiments).
- Saved models are accompanied by a .data directory holding trained DyNet weights.
- Some components replace the originally specified arguments with updated contents. For instance, the vocabulary is usually stored as an explicit list in saved model files, whereas config files typically refer to an external vocab file.
.data sub-directory
This directory contains a list of DyNet subcollections with names such as Linear.98dc700f or UniLSTMSeqTransducer.519cfb41. Every Serializable class that allocates DyNet parameters using xnmt.param_collection.ParamManager.my_params(self) (see Writing XNMT classes) will have one such subcollection written to disk. The file names correspond to the component’s xnmt_subcol_name, consisting of the component name and a unique identifier. The xnmt_subcol_name is also stored in the saved model’s YAML file to establish the correspondence. Each subcollection is stored using DyNet’s serialization format, which is a readable text file.
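The naming scheme described above can be illustrated with a minimal sketch that lists the entries of a .data directory and splits each name into its component name and identifier. The helper names split_subcol_name and list_subcollections are hypothetical and not part of XNMT; the sketch only assumes that each file name has the form shown in the examples above, with the identifier following the last dot.

```python
from pathlib import Path
from typing import List, Tuple


def split_subcol_name(name: str) -> Tuple[str, str]:
    """Split a name such as 'Linear.98dc700f' into ('Linear', '98dc700f').

    Hypothetical helper: assumes the identifier follows the last dot.
    """
    component, sep, identifier = name.rpartition(".")
    if not sep or not component or not identifier:
        raise ValueError(f"not a subcollection name: {name!r}")
    return component, identifier


def list_subcollections(data_dir: str) -> List[Tuple[str, str]]:
    """Return sorted (component, identifier) pairs for the files in data_dir."""
    return sorted(
        split_subcol_name(path.name)
        for path in Path(data_dir).iterdir()
        if path.is_file()
    )
```

Comparing the resulting pairs against the xnmt_subcol_name entries in the YAML file is one way to check that every component has its weights on disk.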
If several checkpoints are saved, there will be additional .data.1, .data.2, etc. files. Note that xnmt_subcol_name does not change between checkpoints, and only one YAML file is written out. The additional checkpoints are generally ignored when loading a saved model, but they can be substituted manually by renaming them, or processed by the utilities below.
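The manual substitution mentioned above amounts to a pair of renames. The following is a minimal sketch, not an XNMT utility: the function name promote_checkpoint and the .orig backup suffix are hypothetical, and the caller is assumed to pass the full path of the .data entry that belongs to the saved model.

```python
import os


def promote_checkpoint(data_path: str, index: int) -> None:
    """Make checkpoint `<data_path>.<index>` the one that is loaded by default.

    Hypothetical helper. The current weights are kept under
    `<data_path>.orig` (an assumed backup name) instead of being overwritten.
    """
    checkpoint = f"{data_path}.{index}"
    backup = f"{data_path}.orig"
    if not os.path.exists(checkpoint):
        raise FileNotFoundError(checkpoint)
    if os.path.exists(backup):
        raise FileExistsError(backup)
    if os.path.exists(data_path):
        os.rename(data_path, backup)  # keep the previous weights around
    os.rename(checkpoint, data_path)  # the checkpoint now sits at the default path
```

Because xnmt_subcol_name is the same in every checkpoint, the YAML file does not need to be touched after the rename.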
Command-line utilities
- script/code/avg_checkpoints.py: Performs checkpoint averaging by taking the elementwise arithmetic average of parameters from all saved checkpoints.
- script/code/conv_checkpoints_to_model.py: Converts a checkpoint to its own model. This is useful, for example, to enable checkpoint ensembling. Under the hood, this draws new random xnmt_subcol_name identifiers in order to enable loading all checkpoints as separate models into XNMT.
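To make the averaging step concrete, here is a minimal sketch of an elementwise arithmetic average over checkpoints. It is not the code of avg_checkpoints.py and does not read DyNet's file format: as a simplifying assumption, each checkpoint is represented as a dictionary that maps an xnmt_subcol_name to a flat list of parameter values. The function name average_checkpoints is hypothetical.

```python
from typing import Dict, List, Sequence

Checkpoint = Dict[str, List[float]]  # assumed in-memory form: subcol name -> values


def average_checkpoints(checkpoints: Sequence[Checkpoint]) -> Checkpoint:
    """Return the elementwise arithmetic mean of the given checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    names = set(checkpoints[0])
    averaged: Checkpoint = {}
    for ckpt in checkpoints:
        # Subcollection names are stable across checkpoints, so they must match.
        if set(ckpt) != names:
            raise ValueError("checkpoints hold different subcollections")
    for name in sorted(names):
        vectors = [ckpt[name] for ckpt in checkpoints]
        if len({len(v) for v in vectors}) != 1:
            raise ValueError(f"size mismatch in subcollection {name!r}")
        # zip(*vectors) groups the i-th value of every checkpoint together.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged
```

Matching parameters by name works here only because xnmt_subcol_name is identical in every checkpoint of the same model.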