# Generalized Agile Transport System (GATS)

GATS is a data encoding format, together with a set of libraries for reading
and writing data in that format. It is a structured format reminiscent of a
binary JSON, with some extra features:

* All numbers (integers and floating point) are stored in an
  arbitrary-precision format that takes up as little space as it can.
* Unlike textual formats (JSON, XML, etc.), floating point numbers are stored
  exactly: the number you write is the number you read back.
* All stored values are machine independent. You can read and write them on
  any architecture with any endianness, and they will always be the same. No
  more encoding worries.
* There are a number of data types you can use to build data structures:
  * Dictionaries
  * Lists
  * Integers
  * Floats
  * Strings (binary 8-bit strings, perfect for UTF-8)
  * Booleans
  * Nulls
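
To make the exact-float point concrete, here is a minimal Python sketch. It
does not use the GATS wire format: a big-endian IEEE 754 double written with
`struct` stands in for "some exact, machine-independent binary encoding", and
fixed-precision decimal text stands in for a textual protocol that truncates
digits. The helper names `text_roundtrip` and `binary_roundtrip` are
hypothetical and exist only for this illustration.

```python
import struct


def text_roundtrip(x: float, digits: int = 6) -> float:
    """Write a float as fixed-precision decimal text, then parse it back."""
    return float(f"{x:.{digits}f}")


def binary_roundtrip(x: float) -> float:
    """Write a float as 8 big-endian IEEE 754 bytes, then parse it back."""
    return struct.unpack(">d", struct.pack(">d", x))[0]


value = 0.1 + 0.2  # not exactly 0.3 when stored as a double
text_result = text_roundtrip(value)      # low-order bits are lost
binary_result = binary_roundtrip(value)  # bit-for-bit identical
```

A textual format can also round-trip floats if it always writes enough
digits, but that depends on every writer getting the precision right. An exact
binary encoding removes that concern.
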

# License

GATS is distributed under a new-BSD-style license. This is fairly permissive,
but if you require other licensing options, feel free to contact us.

# Uses

GATS was originally intended for use in internet protocols, and is great for
that purpose. Used correctly, it allows you to create efficient yet highly
malleable protocols with very little extra effort.

However, you can certainly use it for other purposes, such as serializing data
for storage and inter-process communication. In fact, with the different
language bindings available, GATS may sometimes be one of the easier ways to
allow, say, a PHP web service to communicate with a custom C++ executable or
Python program.

# Languages/Libraries Supported

At the moment we actively maintain libraries for C++, Java, PHP, Python, and
C#. Contributions for other languages and libraries are welcome. Here is a
little information on each target directory:

* *c++-libbu++* - The original libgats implementation. Works using libbu++
  data types and streams. You need libbu++ and Xagasoft Build to build this
  version.
* *c++-qt* - A version written using Qt data types. This version builds using
  qmake, so if you're using Qt you already have everything you need. It also
  features handy signals and slots to make event-driven networking even easier.
* *java* - A library using native Java interfaces, so everything looks and
  works the way you would expect. There is a Xagasoft Build script to build a
  jar file, but the library is simple enough that a single javac command can
  build it all, or you can import the code into your project directly. This
  Java version has been used on desktops and Android devices.
* *php* - There are two libraries for working with PHP. The first defines a
  set of classes for fine control over the format, which is sometimes
  necessary because PHP's types are a little loose. The second simply uses
  native PHP types like array() as data transport. The second option is
  usually much easier to use, but it doesn't always get the encoding correct
  for all inputs.
* *python* - This works like other serialization mechanisms in Python, such as
  pickle, json, shelve, and marshal. It exposes the functions load, dump,
  loads, and dumps, plus the handy helpers recv and send for working with
  sockets. The Python implementation returns and transmits native Python
  data types, which makes life pretty easy. To use this version, simply copy
  gats.py to your project.
* *cs-dotnet* - This implementation is written in C# and compiles against .NET
  version 4.0 or later (possibly earlier). It takes advantage of standard
  .NET interfaces for container types, so they function just like native
  Dictionaries and Lists. The class layout is similar to other languages,
  specifically Java. This implementation does slightly more buffering than
  some of the others, but it still wouldn't hurt to buffer your more volatile
  streams, like network streams.
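
As a rough illustration of the `loads`/`dumps` calling style mentioned for the
Python library, the sketch below round-trips a message through any module
that offers those two functions. It uses the standard `pickle` module as a
stand-in so that it runs without gats.py. It is an assumption, not something
verified here, that `gats.dumps` and `gats.loads` take the same single
argument as their `pickle` counterparts. The `roundtrip` helper and the
message fields are hypothetical.

```python
import pickle


def roundtrip(codec, obj):
    """Serialize obj with codec.dumps, then parse it back with codec.loads."""
    return codec.loads(codec.dumps(obj))


# A dictionary root object built only from the data types listed earlier.
message = {"cmd": "login", "user": "alice", "retries": 3, "ok": True, "extra": None}

# pickle is only a stand-in here; with gats.py copied into the project, the
# assumed equivalent would be roundtrip(gats, message).
result = roundtrip(pickle, message)
```

Because GATS strings are plain byte sequences, a real GATS round trip may hand
text back as bytes rather than `str`, depending on the implementation.
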

# Basic Operation

The way GATS works is dictated by the format, so it works similarly in every
implementation, although each has slightly different mechanics. When encoding
GATS you always encode each object in its own "GATS packet." A GATS packet has
a very simple header that includes the size of the packet, which makes parsing
fast and efficient.

Each packet can contain a single root object. It can be any type, but for most
protocols a dictionary is a great choice for the root object.

The format is designed to make it very easy to work with various encoding,
packing, and encryption systems. The reader, by default, will skip all leading
zero bytes that come before a valid GATS packet, and will stop processing
precisely at the end of a valid GATS packet.

Skipping leading zeros makes it easy to work in environments where padding may
be required. You can use the simplest of all padding schemes (pad with zeros)
and it will work seamlessly with GATS.

Since the reader always reads exactly the number of bytes it needs, it's very
easy to embed GATS packets in other streams, or read them sequentially as fast
as you can from a socket.
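
The reading behaviour described above can be sketched in Python with a toy
framing scheme. This is not the real GATS header layout: the single non-zero
`MARKER` byte followed by a 4-byte big-endian payload size is an assumption
made for the example, and `write_packet`, `read_exact`, and `read_packet` are
hypothetical helpers. The sketch shows three things: leading zero padding is
skipped, the reader consumes exactly one packet, and whatever follows in the
stream is left untouched.

```python
import io
import struct

MARKER = 0x01  # hypothetical non-zero first header byte, not the real GATS header


def write_packet(stream, payload: bytes, padding: int = 0) -> None:
    """Write optional zero padding, then marker byte, 4-byte size, and payload."""
    stream.write(b"\x00" * padding)
    stream.write(struct.pack(">BI", MARKER, len(payload)))
    stream.write(payload)


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, or raise if the stream ends early."""
    data = b""
    while len(data) < count:
        chunk = stream.read(count - len(data))
        if not chunk:
            raise EOFError("stream ended inside a packet")
        data += chunk
    return data


def read_packet(stream):
    """Skip leading zero bytes, then read one packet; return None at end of stream."""
    while True:
        first = stream.read(1)
        if not first:
            return None
        if first != b"\x00":
            break
    if first[0] != MARKER:
        raise ValueError("unexpected header byte")
    (size,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, size)


buffer = io.BytesIO()
write_packet(buffer, b"first", padding=3)  # zero-padded packet
write_packet(buffer, b"second")            # packet with no padding
buffer.write(b"trailing data")             # unrelated data embedded after the packets
buffer.seek(0)

packets = [read_packet(buffer), read_packet(buffer)]
remainder = buffer.read()  # the reader stopped exactly at the packet boundary
```

The same `read_packet` loop would work on a socket wrapped with
`socket.makefile("rb")`, since it only relies on `read`.
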

## A Note About Strings

All strings in GATS are simply sequences of 8-bit bytes. There is no
overarching encoding dictated by the format. When using GATS it is good
practice to specify how you encode your text data; we recommend Unicode
(UTF-8, for example).

There is a possibility that a future version of GATS will include a separate
Unicode String data type, but for now it's important to remember this.

For this reason, we also recommend restricting the keys in all dictionaries to
7-bit ASCII, which is compatible with both UTF-8 and Latin-1. This isn't
required, of course, but it makes things a bit easier.
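
One way to apply this advice in Python is to convert text to bytes explicitly
before handing it to a serializer. The sketch below is independent of any GATS
library; `encode_text`, `check_key`, and `prepare` are hypothetical helper
names, and UTF-8 for values is the assumed convention.

```python
def encode_text(value: str) -> bytes:
    """Encode a text value as UTF-8 bytes."""
    return value.encode("utf-8")


def check_key(key: str) -> bytes:
    """Return the key as bytes, rejecting anything outside 7-bit ASCII."""
    return key.encode("ascii")  # raises UnicodeEncodeError for non-ASCII keys


def prepare(record: dict) -> dict:
    """Turn a str-to-str dictionary into ASCII byte keys and UTF-8 byte values."""
    return {check_key(k): encode_text(v) for k, v in record.items()}


original = {"name": "Zoë", "city": "København"}
prepared = prepare(original)

# The receiving side reverses the same agreed-upon encodings.
decoded = {k.decode("ascii"): v.decode("utf-8") for k, v in prepared.items()}
```

Keeping keys in ASCII means every peer sees identical key bytes, whichever
encoding it assumes for the values.
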

# Speed vs Size

GATS objects are, on average, smaller than the same data stored in other
binary formats, and can be much smaller than textual formats, because GATS
stores only as many bytes as necessary for integers and floats. This also
means that GATS requires more processing than fixed-field binary formats, but
interestingly not quite as much as text formats like JSON. The processing we
do on floats is actually roughly comparable in many ways to text processing,
although with fewer steps.
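
The size trade-off can be illustrated with a small Python sketch that compares
three ways of storing an integer: the fewest two's-complement bytes, a fixed
8-byte field, and JSON text. This is not the GATS integer encoding (a real
format also has to record the length or type of each value); the helpers
`minimal_int_bytes` and `decode_int` are hypothetical and only show the
"store as few bytes as necessary" idea.

```python
import json
import struct


def minimal_int_bytes(n: int) -> bytes:
    """Encode a signed integer in the fewest big-endian two's-complement bytes."""
    magnitude = n if n >= 0 else ~n
    length = (magnitude.bit_length() + 8) // 8  # leaves room for the sign bit
    return n.to_bytes(length, "big", signed=True)


def decode_int(data: bytes) -> int:
    """Decode bytes produced by minimal_int_bytes."""
    return int.from_bytes(data, "big", signed=True)


values = [0, 5, -1, 300, 70000, 2**40]

# One row per value: (value, minimal size, fixed 8-byte size, JSON text size).
sizes = [
    (n, len(minimal_int_bytes(n)), len(struct.pack(">q", n)), len(json.dumps(n)))
    for n in values
]
```

The variable-width form needs an extra length computation on every write and
a length-aware read, which is the processing cost that fixed-field formats
avoid.
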