Scribe Design Rationale
#######################

Exceptions are thrown to:
 1) Avoid generating an erroneous or corrupt archive, and
 2) Allow the programmer to fix the error in how they used the scribe system.
Exceptions should not be caught *during* transcribing (only after). A thrown exception means the
transcribe state is indeterminate and transcribing should be aborted (eg, fail to save or load a project file).

Error codes are returned when an object in the archive cannot be read.
This can be caused by loading an archive generated by a future GPlates and finding that an object is missing.
This can happen when the future GPlates no longer writes out that object.
In this case the current GPlates has two choices:
 1) Provide a default value for the missing object and continue on, or
 2) Propagate the error up to its caller, who can in turn either recover with a default value or propagate the error further.
If the error propagates all the way back to the root (ie, there is no recovery) then the entire load is aborted.
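
As a rough sketch of these two choices (this is not the actual Scribe interface): the result-code names below
are the ones listed in this document, whereas 'Transcription', 'load_int', 'Settings' and 'load_settings' are
hypothetical names invented for illustration. Reporting a missing object as TRANSCRIBE_INCOMPATIBLE is also an assumption.

```cpp
#include <map>
#include <string>

// Result codes named in this document; everything else in this sketch is hypothetical.
enum TranscribeResult { TRANSCRIBE_SUCCESS, TRANSCRIBE_INCOMPATIBLE, TRANSCRIBE_UNKNOWN_TYPE };

// Stand-in for a transcription loaded from an archive: object tag -> value.
typedef std::map<std::string, int> Transcription;

// Returns an error code (rather than throwing) when the tagged object is missing.
// Assumption: a missing object is reported as TRANSCRIBE_INCOMPATIBLE.
TranscribeResult load_int(const Transcription &transcription, const std::string &tag, int &value)
{
    Transcription::const_iterator it = transcription.find(tag);
    if (it == transcription.end())
        return TRANSCRIBE_INCOMPATIBLE;
    value = it->second;
    return TRANSCRIBE_SUCCESS;
}

struct Settings { int opacity; int line_width; };

TranscribeResult load_settings(const Transcription &transcription, Settings &settings)
{
    // Choice 1: recover locally with a default value and continue on.
    if (load_int(transcription, "opacity", settings.opacity) != TRANSCRIBE_SUCCESS)
        settings.opacity = 100;

    // Choice 2: propagate the error to the caller, who may recover or propagate further.
    const TranscribeResult result = load_int(transcription, "line_width", settings.line_width);
    if (result != TRANSCRIBE_SUCCESS)
        return result;

    return TRANSCRIBE_SUCCESS;
}
```

In this sketch a missing "opacity" is silently defaulted, while a missing "line_width" makes 'load_settings' fail
and leaves the decision to its caller.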

The similar scenario of loading an archive generated by a future GPlates and finding an unknown object does not happen.
This is because unknown objects are simply ignored (they're never looked up by the current GPlates and hence never found).
And because the transcription obtained from the loaded archive supports random access, these unused objects are simply skipped over.

TODO:
With return codes you don't need to explicitly version the archives when backwards/forwards compatibility is broken.
If an object type that would break compatibility is not transcribed then the transcribed archive remains compatible.
The archives will still have their own version, but that only covers changes to the archive file format (not the transcribed object network).

TODO:
With return codes (TRANSCRIBE_SUCCESS, TRANSCRIBE_INCOMPATIBLE, TRANSCRIBE_UNKNOWN_TYPE)
the TRANSCRIBE_UNKNOWN_TYPE enables better forward compatibility...

// This error type is differentiated from @a TRANSCRIBE_INCOMPATIBLE in order to better support
// forward compatibility. So if a future version of GPlates adds a new derived class then
// we can ignore the new type by looking for @a TRANSCRIBE_UNKNOWN_TYPE.
// For example, a transcribed sequence of smart pointers (via polymorphic base class),
// such as 'std::vector< boost::shared_ptr<Base> >', can ignore elements containing the
// new derived type and keep elements containing known types (rather than failing altogether).

...this is because normally when a future version transcribes a new class member it will have a new object tag.
The old version of GPlates will simply skip that new data member - in fact it won't really even know
that it exists in the transcription (loaded from archive). So we want the same sort of behaviour
when adding a new (derived) object type in a sequence. But since we cannot easily give each sequence
element its own object tag we simply detect unknown types. Most sequences are homogeneous anyway
(all elements are the same type) so it's really only sequences of pointers to polymorphic types where you
can get heterogeneous sequences and this is where TRANSCRIBE_UNKNOWN_TYPE is aimed.
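
The following is a minimal sketch of that behaviour, under stated assumptions: 'load_element' and 'load_sequence'
are hypothetical helpers (the real Scribe does not read class names from a vector of strings), 'Circle' and 'Square'
are made-up classes, and std::shared_ptr is used in place of boost::shared_ptr only to keep the example self-contained.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

enum TranscribeResult { TRANSCRIBE_SUCCESS, TRANSCRIBE_INCOMPATIBLE, TRANSCRIBE_UNKNOWN_TYPE };

struct Base { virtual ~Base() { } virtual std::string name() const = 0; };
struct Circle : Base { std::string name() const { return "Circle"; } };
struct Square : Base { std::string name() const { return "Square"; } };

// Hypothetical factory: creates an object from the class name stored in the archive.
// A class name that this version does not know about yields TRANSCRIBE_UNKNOWN_TYPE.
TranscribeResult load_element(const std::string &class_name, std::shared_ptr<Base> &element)
{
    if (class_name == "Circle")
        element.reset(new Circle());
    else if (class_name == "Square")
        element.reset(new Square());
    else
        return TRANSCRIBE_UNKNOWN_TYPE;
    return TRANSCRIBE_SUCCESS;
}

// Loads a heterogeneous sequence: elements of unknown type are dropped,
// any other error still fails the whole sequence.
TranscribeResult load_sequence(
        const std::vector<std::string> &archived_class_names,
        std::vector< std::shared_ptr<Base> > &sequence)
{
    for (std::size_t i = 0; i < archived_class_names.size(); ++i)
    {
        std::shared_ptr<Base> element;
        const TranscribeResult result = load_element(archived_class_names[i], element);
        if (result == TRANSCRIBE_UNKNOWN_TYPE)
            continue; // Presumably written by a future version: ignore this element only.
        if (result != TRANSCRIBE_SUCCESS)
            return result;
        sequence.push_back(element);
    }
    return TRANSCRIBE_SUCCESS;
}
```

Without a separate TRANSCRIBE_UNKNOWN_TYPE code the loop could not tell a harmless new derived type apart from a
genuinely incompatible element, and would have to fail the whole sequence.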

TODO:
Sequences such as std::vector, std::list, std::set, QList, QStack, QSet, etc, all use the same transcribe protocol
and hence can be swapped without breaking backward/forward compatibility.
However if you save a std::vector and then load it via a QSet, for example, then any duplicate items
in the std::vector will get combined into a single item in QSet (because it does not support duplicates).
This is probably fine in some cases but in others it might result in a failed load - for example,
if a pointer is loaded that references the duplicate item that was dropped.
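
To illustrate the duplicate problem, here is a small sketch that assumes each saved element is identified by its
index in the sequence (standing in for an object id). 'load_as_set' is a hypothetical helper, not part of the scribe system.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Loads a saved std::vector<int> into a std::set<int> and records which saved
// elements (identified by index, standing in for an object id) were actually kept.
std::map<std::size_t, const int *> load_as_set(const std::vector<int> &saved, std::set<int> &loaded)
{
    std::map<std::size_t, const int *> loaded_objects;
    for (std::size_t id = 0; id < saved.size(); ++id)
    {
        const std::pair<std::set<int>::iterator, bool> inserted = loaded.insert(saved[id]);
        if (inserted.second) // False for a duplicate, in which case the element is dropped.
            loaded_objects[id] = &*inserted.first;
    }
    return loaded_objects;
}
```

If the saved vector was {5, 7, 5} then the third element has no loaded counterpart, so a pointer that was saved
as referencing it has nothing to resolve to.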

Void cast registry
------------------

The void cast registry is used to up-cast from a derived class to a base class.
It is used when transcribing a pointer or reference to a base class sub-object of a full derived class object.
Registration happens when the base class sub-object is transcribed via 'transcribe_base()' from within the 'transcribe()' handler of the derived class.
It's called a 'void' cast because sometimes we only have access to object addresses and not their types
(ie, we have the std::type_info but not the actual template type such as 'int').
Previously we used the object id of the base class sub-object when transcribing a pointer.
But now we use the object id of the full derived class object and use the void cast registry to up-cast to the base class pointer.
This ends up being more robust in terms of backward/forward compatibility when changes are made in future versions of GPlates.
For example, suppose we transcribed a 'B *' pointer to a B object in an old version, but then in a new version we changed B to inherit from A
and instead transcribed an 'A *' pointer. Because both the 'B *' pointer in the old version and the 'A *' pointer in the new version
use the same object id (of B) we don't need to change how those pointers are transcribed (from old to new). As an aside, we still
need to make some changes to how the B object itself is transcribed (but at least the pointer transcribe does not need changing).
If we had used the object id of the A sub-object (instead of B) when transcribing the 'A *' pointer in the new version then
in order to make it backwards compatible (ie, able to be loaded into old version) we would have needed to transcribe both the
'A *' pointer (for loading into new versions - with an updated tag version to prevent old versions trying to load it) using the
object id of A and a 'B *' pointer (for loading into old versions - where new versions ignore it since using old tag version) using
the object id of B.
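
The idea of up-casting through 'void *' using only std::type_info can be sketched as below. This is a much
simplified illustration and not the actual registry: the class 'VoidCastRegistry' shown here, its methods
'register_derived_base' and 'up_cast', and the example classes are all assumptions made for this sketch (which also
relies on C++11 std::type_index).

```cpp
#include <map>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical, much simplified registry of direct derived-to-base up-casts.
class VoidCastRegistry
{
public:
    // Would be called when a derived class transcribes its base class sub-object.
    template <class Derived, class Base>
    void register_derived_base()
    {
        d_casts[key(typeid(Derived), typeid(Base))] = &up_cast_impl<Derived, Base>;
    }

    // Up-casts using only an address and type_info (no static types available).
    // Returns a null pointer if the derived/base link was never registered.
    void *up_cast(const std::type_info &derived, const std::type_info &base, void *derived_object) const
    {
        cast_map_type::const_iterator it = d_casts.find(key(derived, base));
        return it == d_casts.end() ? nullptr : it->second(derived_object);
    }

private:
    typedef void *(*cast_function_type)(void *);
    typedef std::pair<std::type_index, std::type_index> key_type;
    typedef std::map<key_type, cast_function_type> cast_map_type;

    static key_type key(const std::type_info &derived, const std::type_info &base)
    {
        return key_type(std::type_index(derived), std::type_index(base));
    }

    // The static_cast adjusts the address, which matters with multiple inheritance.
    template <class Derived, class Base>
    static void *up_cast_impl(void *derived_object)
    {
        return static_cast<Base *>(static_cast<Derived *>(derived_object));
    }

    cast_map_type d_casts;
};

// Example classes: the A sub-object of B is typically not at the start of B.
struct Other { int o; };
struct A { int a; };
struct B : Other, A { int b; };
```

A pointer transcribed as 'A *' can then be resolved from the address of the full B object (looked up via B's
object id) by calling 'registry.up_cast(typeid(B), typeid(A), address_of_b)'.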

If there is more than one inheritance path between a derived class and a base class when requesting a cast then an AmbiguousCast exception is thrown.
For example, the following will throw an exception when requesting a cast from derived class D to indirect base class A:

  A   A
  |   |
  B   C
   \ /
    D
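
One way such an ambiguity could be detected is by counting inheritance paths through the registered links.
The sketch below assumes non-virtual inheritance (as in the diagram, where D contains two A sub-objects) and uses
class names as keys for brevity. 'InheritanceGraph', 'count_paths' and 'can_up_cast' are hypothetical, and the
'AmbiguousCast' type here only mirrors the name of the exception mentioned above.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type mirroring the AmbiguousCast exception named above.
struct AmbiguousCast : std::runtime_error
{
    AmbiguousCast() : std::runtime_error("ambiguous cast") { }
};

// Direct inheritance links: class name -> names of its direct base classes.
typedef std::map<std::string, std::vector<std::string> > InheritanceGraph;

// Counts the distinct inheritance paths from 'derived' up to 'base'.
int count_paths(const InheritanceGraph &graph, const std::string &derived, const std::string &base)
{
    if (derived == base)
        return 1;
    int num_paths = 0;
    InheritanceGraph::const_iterator it = graph.find(derived);
    if (it != graph.end())
    {
        for (std::size_t i = 0; i < it->second.size(); ++i)
            num_paths += count_paths(graph, it->second[i], base);
    }
    return num_paths;
}

// Throws AmbiguousCast when more than one path exists; returns false when there is no path.
bool can_up_cast(const InheritanceGraph &graph, const std::string &derived, const std::string &base)
{
    const int num_paths = count_paths(graph, derived, base);
    if (num_paths > 1)
        throw AmbiguousCast();
    return num_paths == 1;
}
```

For the diagram above, D to B and D to C each have a single path, whereas D to A has two (via B and via C) and so throws.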


Comparison with other serialisation software
============================================

Boost serialization
-------------------

TODO:

This serialization system is very similar to boost::serialization which we were very close
to using, but unfortunately there were just enough issues to tip the balance in favour
of using another serialization library or implementing our own (we chose to implement our own).

We ended up not using the boost::serialization library for the following reasons:

   - The boost::serialization library writes a (serialization) library version to the archive.
     Later library versions can read archives generated by earlier versions, but of course
     not vice versa. So far so good. However GPlates doesn't currently specify a specific boost
     library version for a specific GPlates version - we only require GPlates, for example,
     to use boost version 1.34 or greater. This makes it easier to compile GPlates, especially
     on linux systems like Ubuntu, where each version of Ubuntu comes with specific
     pre-packaged versions of libraries like boost. The problem arises when, for example,
     the *same* version of GPlates compiled with boost 1.34 on one system and boost 1.46 on
     another need to read archives generated by each other. It's the same version of GPlates
     on both systems so you'd expect it to work. But if the GPlates with boost 1.46 generates
     an archive, and then the GPlates with boost 1.34 tries to read it, then it won't be able to.
     This can also happen quite easily when different GPlates developers build MacOS GPlates
     bundles for the same user (perhaps they are different GPlates branches with slightly
     different functionality but yet still have the same serialization code paths).
     In this case the user might switch between using the different GPlates bundles and expect
     project files generated from one to be usable in the other, but this won't be the case.
     By implementing our own serialization framework, the serialization (library) version
     naturally stays in sync with the GPlates version (ie, there's always a one-to-one mapping
     between serialization (library) versions and GPlates versions) so it no longer matters
     what version of boost you compile GPlates with.

   - There are some backwards compatibility bugs in boost::serialization where newer versions
     cannot read older version archives. This link https://svn.boost.org/trac/boost/ticket/4660
     provides good coverage of the backward compatibility issues with binary archives.
     Unfortunately the ticket has been re-opened a number of times over the last two years
     and is apparently still having issues in the relatively recent boost 1.51 version.
     Text archives would not normally be affected by this, but a related link
     https://bugzilla.redhat.com/show_bug.cgi?id=694448 mentions issues with text archives.
     Having said that, it's just as likely that our own serialization framework will have bugs
     in it - but unfortunately this is not the only issue we had with boost::serialization.

   - It's difficult to deserialize objects that reference other objects that were *not*
     serialized to the archive. This, in essence, is a problem of partial serialization.
     boost::serialization works fine if you serialize *everything* that can be referenced
     by the objects you serialize. Boost provides the 'serialize()' and 'load_construct_data()'
     functions that you can specialise for a particular object class (or type) but, being functions
     and not objects, it is difficult to pass in information regarding the non-serialised objects.
     For example...

		class Y; // Not serialized to the archive.
		class X { public: X(const Y &y_) : y(y_) { } private: const Y &y; };

		template <class Archive>
		void serialize(Archive &ar, X &x, const unsigned int version);

		template <class Archive>
		void load_construct_data(Archive &ar, X *x, const unsigned int file_version);

     ...to serialise/deserialise class X when class Y is *not* serialized is not easy because
     there's no easy way to pass the Y object into the above functions. You could
     possibly define your own archive type and get the Y object from there, but that is awkward.
     By implementing our own serialization framework we can remove this restriction.
     In our case we allow the user to optionally specialise the class TranscribeContext (for
     each object type being transcribed) to place whatever context information is desired into
     the specialised class. An instance of TranscribeContext, specialised for the object type,
     can then be pushed onto (and popped off) a context stack, available for each object type, using
     the Scribe methods @a push_transcribe_context and @a pop_transcribe_context.
     And the most recently pushed context can be accessed using @a get_transcribe_context.
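
The context mechanism might look roughly like the following sketch. The three method names come from the text
above, but their signatures, the 'MiniScribe' class standing in for Scribe, the 'load_construct_x' function and
the contents of the specialised TranscribeContext are all assumptions made for illustration.

```cpp
#include <cassert>
#include <stack>

// Primary template: empty unless specialised for an object type.
template <class ObjectType>
class TranscribeContext { };

class Y { public: int value; };
class X { public: explicit X(const Y &y_) : y(y_) { } const Y &y; };

// Specialisation carrying what is needed to construct an X: a Y that is not in the archive.
template <>
class TranscribeContext<X>
{
public:
    explicit TranscribeContext(const Y &y_) : y(y_) { }
    const Y &y;
};

// Hypothetical stand-in for Scribe with one context stack per object type.
class MiniScribe
{
public:
    template <class ObjectType>
    void push_transcribe_context(TranscribeContext<ObjectType> &context)
    {
        context_stack<ObjectType>().push(&context);
    }

    template <class ObjectType>
    void pop_transcribe_context()
    {
        assert(!context_stack<ObjectType>().empty());
        context_stack<ObjectType>().pop();
    }

    // Returns the most recently pushed context, or a null pointer if there is none.
    template <class ObjectType>
    TranscribeContext<ObjectType> *get_transcribe_context()
    {
        std::stack<TranscribeContext<ObjectType> *> &stack = context_stack<ObjectType>();
        return stack.empty() ? nullptr : stack.top();
    }

private:
    // A function-local static keeps the sketch short, but it means the stacks
    // are shared by all MiniScribe instances.
    template <class ObjectType>
    static std::stack<TranscribeContext<ObjectType> *> &context_stack()
    {
        static std::stack<TranscribeContext<ObjectType> *> stack;
        return stack;
    }
};

// Plays the role of load_construct_data(): the Y comes from the context, not the archive.
X *load_construct_x(MiniScribe &scribe)
{
    TranscribeContext<X> *context = scribe.get_transcribe_context<X>();
    return context ? new X(context->y) : nullptr;
}
```

The caller that owns the Y object would push a 'TranscribeContext<X>' before loading and pop it afterwards, so
the code that constructs X never needs Y to be passed through its function signature.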

