DeHL, Delphi 2010 and Serialization

A few months have passed and I have not released a new version of DeHL yet. No, it’s not dead. I’ve just been busy with a delicate new feature: serialization. This post demonstrates the new capabilities of DeHL, along with their advantages and shortcomings.

But first: since the new releases will focus mostly on serialization and related features, I decided to drop Delphi 2009 support. It makes no sense to keep supporting 2009 in future versions, since no essential changes are being made to the existing code. You can still use the 0.7 release in Delphi 2009.

Back to serialization. The following list describes the changes that went into the new version:

  • In order to support serialization, DeHL’s type system was extended to support Serialize and Deserialize methods. Each type class (that describes a type in Delphi) now knows how to serialize values of the type it manages.
  • A new unit named DeHL.Serialization was added. It contains the base definitions of types used by the type system for serialization.
  • TPointerType, TRecordType<T>, TArrayType<T>, etc. were added for simplified type handling. The old method was a mix-up of Delphi 2009 and Delphi 2010 RTTI specifics (which have some essential differences in my case).
  • All classes can now implement the ISerializable interface. TClassType<T> detects whether this interface is implemented by the object and uses it for serialization (no, reference counting is not touched).
  • DeHL.Serialization.Abstract contains the semi-implementation of a “serializer” and its context. It is used by specific serializers.
  • DeHL.Serialization.XML defines TXMLSerializer<T>, which can be used to serialize/deserialize into XML nodes (uses TXMLDocument). It supports its own set of attributes (such as XmlRoot, XmlElement, etc.).
  • DeHL.Serialization.Ini defines TIniSerializer<T>, which you can use to serialize/deserialize types into INI files or the registry (through RTL’s TRegIniFile).
  • Most DeHL types (such as Nullable<T>, TFixedArray<T>, BigInteger, etc.) provide their own serialization and deserialization methods.
  • All Enex collections (except a few that cannot be) can be serialized and deserialized. They implement a custom serialization and deserialization technique through ISerializable.

Enough talk, a mandatory example:

type
  [XmlRoot('Testing', 'http://test.namespace.com')]
  TTest = class
    { Pointer to self }
    [XmlElement('PointerToSelf')]
    FSelf: TObject;

    { A set of format settings }
    FFormatSettings: TFormatSettings;

    { An internal record }
    FInternal: record
      { Force the field to be an attribute of FInternal }
      [XmlAttribute('Value')]
      FOne: Integer;

      { Force this element to have same name but other namespace }
      [XmlElement('Value', 'http://other.namespace.com')]
      FTwo: String;
    end;

    FListOfDoubles: TList<Double>;
  end;
var
  LDocument: IXMLDocument;
  LXMLSerializer: TXMLSerializer<TTest>;
  LOutInst, LInInst: TTest;
begin
  CoInitializeEx(nil, 0);

  { Initialize the test object }
  LOutInst := TTest.Create;
  LOutInst.FSelf := LOutInst;
  GetLocaleFormatSettings(GetThreadLocale(), LOutInst.FFormatSettings);
  LOutInst.FInternal.FOne := 1;
  LOutInst.FInternal.FTwo := '2 - Two';
  LOutInst.FListOfDoubles := TList<Double>.Create();
  LOutInst.FListOfDoubles.Add(0.55);
  LOutInst.FListOfDoubles.Add(0.122);
  LOutInst.FListOfDoubles.Add(122.23);

  { Create the serializer and an XML document }
  LXMLSerializer := TXMLSerializer<TTest>.Create();
  LDocument := TXMLDocument.Create(nil);

  { Set the options }
  LDocument.Active := true;
  LDocument.Options := LDocument.Options + [doNodeAutoIndent];

  { Force fields to elements by default }
  LXMLSerializer.DefaultFieldsToTags := true;

  { Serialize the structure }
  LXMLSerializer.Serialize(LOutInst, LDocument.Node);

  { Deserialize the structure into a new instance }
  LXMLSerializer.Deserialize(LInInst, LDocument.Node);

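  { Save the document and clean up }
  LDocument.SaveToFile('c:\test.xml');
  LXMLSerializer.Free;
  LOutInst.FListOfDoubles.Free;
  LOutInst.Free;
  LInInst.FListOfDoubles.Free;
  LInInst.Free;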
end.

The XML file generated by this code looks like this (INI looks uglier):

<Testing xmlns="http://test.namespace.com" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:DeHL="http://alex.ciobanu.org/DeHL.Serialization.XML" xmlns:NS1="http://other.namespace.com">
  <PointerToSelf DeHL:ref="Testing"/>
  <FFormatSettings>
    <CurrencyString>$</CurrencyString>
    <CurrencyFormat>0</CurrencyFormat>
    <CurrencyDecimals>2</CurrencyDecimals>
    <DateSeparator>/</DateSeparator>
    <TimeSeparator>:</TimeSeparator>
    <ListSeparator>,</ListSeparator>
    <ShortDateFormat>M/d/yyyy</ShortDateFormat>
    <LongDateFormat>dddd, MMMM dd, yyyy</LongDateFormat>
    <TimeAMString>AM</TimeAMString>
    <TimePMString>PM</TimePMString>
    <ShortTimeFormat>h:mm AMPM</ShortTimeFormat>
    <LongTimeFormat>h:mm:ss AMPM</LongTimeFormat>
    <ShortMonthNames>
      <string>Jan</string>
      <string>Feb</string>
      <string>Mar</string>
      <string>Apr</string>
      <string>May</string>
      <string>Jun</string>
      <string>Jul</string>
      <string>Aug</string>
      <string>Sep</string>
      <string>Oct</string>
      <string>Nov</string>
      <string>Dec</string>
    </ShortMonthNames>
    <LongMonthNames>
      <string>January</string>
      <string>February</string>
      <string>March</string>
      <string>April</string>
      <string>May</string>
      <string>June</string>
      <string>July</string>
      <string>August</string>
      <string>September</string>
      <string>October</string>
      <string>November</string>
      <string>December</string>
    </LongMonthNames>
    <ShortDayNames>
      <string>Sun</string>
      <string>Mon</string>
      <string>Tue</string>
      <string>Wed</string>
      <string>Thu</string>
      <string>Fri</string>
      <string>Sat</string>
    </ShortDayNames>
    <LongDayNames>
      <string>Sunday</string>
      <string>Monday</string>
      <string>Tuesday</string>
      <string>Wednesday</string>
      <string>Thursday</string>
      <string>Friday</string>
      <string>Saturday</string>
    </LongDayNames>
    <ThousandSeparator>,</ThousandSeparator>
    <DecimalSeparator>.</DecimalSeparator>
    <TwoDigitYearCenturyWindow>50</TwoDigitYearCenturyWindow>
    <NegCurrFormat>0</NegCurrFormat>
  </FFormatSettings>
  <FInternal Value="1">
    <NS1:Value>2 - Two</NS1:Value>
  </FInternal>
  <FListOfDoubles>
    <Elements>
      <Double>0.55</Double>
      <Double>0.122</Double>
      <Double>122.23</Double>
    </Elements>
  </FListOfDoubles>
</Testing>

When a serializer processes its first value, it builds up a sort of internal “object graph” and gathers all the information about the data being serialized. Subsequent uses of the same serializer instance yield a 10x performance gain, since there is no need to rebuild all that information from scratch. I am still working on more optimizations that could give an even greater speed boost.
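
To illustrate why reusing a serializer instance pays off, here is a minimal sketch of the general idea of caching type information. It is not DeHL’s actual code: TFieldCache and TSample are hypothetical names made up for this example, and it assumes that the expensive step is the RTTI lookup. It uses only the Delphi 2010 Rtti and Generics.Collections units.

```delphi
program MetadataCacheSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, TypInfo, Rtti, Generics.Collections;

type
  { Hypothetical sample class; any class with fields would do }
  TSample = class
    FOne: Integer;
    FTwo: String;
  end;

  { Hypothetical helper, not part of DeHL: remembers the field list of each type }
  TFieldCache = class
  private
    FContext: TRttiContext;
    FFields: TDictionary<PTypeInfo, TArray<TRttiField>>;
  public
    constructor Create;
    destructor Destroy; override;
    function FieldsOf(const AType: PTypeInfo): TArray<TRttiField>;
  end;

constructor TFieldCache.Create;
begin
  inherited Create;
  FContext := TRttiContext.Create;
  FFields := TDictionary<PTypeInfo, TArray<TRttiField>>.Create;
end;

destructor TFieldCache.Destroy;
begin
  FFields.Free;
  FContext.Free;
  inherited;
end;

function TFieldCache.FieldsOf(const AType: PTypeInfo): TArray<TRttiField>;
begin
  { The RTTI walk happens only the first time a type is seen }
  if not FFields.TryGetValue(AType, Result) then
  begin
    Result := FContext.GetType(AType).GetFields;
    FFields.Add(AType, Result);
  end;
end;

var
  LCache: TFieldCache;
  LField: TRttiField;
begin
  LCache := TFieldCache.Create;
  try
    { The first call inspects TSample through RTTI and stores the result }
    LCache.FieldsOf(TypeInfo(TSample));

    { The second call is answered from the dictionary }
    for LField in LCache.FieldsOf(TypeInfo(TSample)) do
      WriteLn(LField.Name, ': ', LField.FieldType.Name);
  finally
    LCache.Free;
  end;
end.
```

The cache lives as long as the TFieldCache instance does, which mirrors the advice above: keep one serializer instance around and reuse it rather than creating a new one for every value.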

P.S. I can’t show the contents of the deserialized object here so you’ll have to take my word for it.

Note. This is just a preview of what is going on in the trunk. No version has been released yet, since I still have to iron out the last problems and write the missing unit tests.