Release: JavaScript/Node 4.0
The newly improved TypeScript and ES6 SDK
Features:
- New binary protocol support (under the hood)
- Bulk actions support (under the hood)
- Full TypeScript declaration files
- Promises everywhere! Long live async/await!
- Offline record support
{
// Use IndexedDB to store data client side
offlineEnabled: false,
// Save each update as it comes in from the server
saveUpdatesOffline: false,
indexdb: {
// The db version; incrementing this triggers a db upgrade
dbVersion: 1,
// Automatically updates the IndexedDB version if the objectStore names change
autoVersion: false,
// The key to index records by
primaryKey: 'id',
// The IndexedDB database name
storageDatabaseName: 'deepstream',
// The default store name if not using a '/' to indicate the object store (example person/uuid)
defaultObjectStoreName: 'records',
// The object store names, required in advance due to how IndexedDB works
objectStoreNames: [],
// Things to not save, such as search results
ignorePrefixes: [],
// The amount of time to buffer together actions before making a request
flushTimeout: 50
}
}
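To make the `defaultObjectStoreName` and `ignorePrefixes` comments above a little more concrete, here is a minimal sketch of how a record name might be mapped to an object store. The helper names (`resolveObjectStoreName`, `shouldPersist`) and the `StoreOptions` type are hypothetical, and the exact matching rules are assumptions rather than the SDK's actual implementation.

```typescript
// Hypothetical subset of the indexdb options shown above
interface StoreOptions {
  defaultObjectStoreName: string
  ignorePrefixes: string[]
}

// Assumption: the text before the first '/' names the object store,
// and names without a '/' go to the default store
function resolveObjectStoreName (recordName: string, options: StoreOptions): string {
  const separator = recordName.indexOf('/')
  return separator === -1
    ? options.defaultObjectStoreName
    : recordName.slice(0, separator)
}

// Assumption: ignorePrefixes is matched against the start of the record name
function shouldPersist (recordName: string, options: StoreOptions): boolean {
  return !options.ignorePrefixes.some(prefix => recordName.indexOf(prefix) === 0)
}
```

With `defaultObjectStoreName: 'records'`, a name like `person/uuid` would resolve to the `person` store under these assumptions, while a name without a `/` would land in `records`.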
- Customizable offline storage support
export type offlineStoreWriteResponse = ((error: string | null, recordName: string) => void)
export interface RecordOfflineStore {
get: (recordName: string, callback: ((recordName: string, version: number, data: RecordData) => void)) => void
set: (recordName: string, version: number, data: RecordData, callback: offlineStoreWriteResponse) => void
delete: (recordName: string, callback: offlineStoreWriteResponse) => void
}
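As an illustration of the interface above, here is a minimal sketch of a custom store that keeps everything in memory. `InMemoryOfflineStore` is a hypothetical name, `RecordData` is assumed to be a loose JSON-like type, and reporting a missing record as version `-1` is an assumption made for this example rather than documented SDK behaviour.

```typescript
// Assumption: RecordData is a JSON-like value; the real type lives in the SDK
type RecordData = unknown

type offlineStoreWriteResponse = (error: string | null, recordName: string) => void

interface RecordOfflineStore {
  get: (recordName: string, callback: (recordName: string, version: number, data: RecordData) => void) => void
  set: (recordName: string, version: number, data: RecordData, callback: offlineStoreWriteResponse) => void
  delete: (recordName: string, callback: offlineStoreWriteResponse) => void
}

// Hypothetical store backed by a Map, handy for tests or non-browser environments
class InMemoryOfflineStore implements RecordOfflineStore {
  private records = new Map<string, { version: number, data: RecordData }>()

  public get (recordName: string, callback: (recordName: string, version: number, data: RecordData) => void): void {
    const entry = this.records.get(recordName)
    if (entry === undefined) {
      // Assumption: a missing record is reported as version -1 with null data
      callback(recordName, -1, null)
      return
    }
    callback(recordName, entry.version, entry.data)
  }

  public set (recordName: string, version: number, data: RecordData, callback: offlineStoreWriteResponse): void {
    this.records.set(recordName, { version, data })
    callback(null, recordName)
  }

  public delete (recordName: string, callback: offlineStoreWriteResponse): void {
    this.records.delete(recordName)
    callback(null, recordName)
  }
}
```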
Improvements
- Separation of errors and warnings for clarity. Non-critical failures (such as an ack timeout) can now be treated separately or fully muted.
- Enhanced services to reduce timeout overhead
Backwards compatibility
- Only works with V4 server
- All single response APIs now return promises when no callback is provided. This means most API calls that used to be chained will now break.
const client = deepstream()
try {
await client.login()
const record = client.record.getRecord(name)
await record.whenReady()
const data = await client.record.snapshot(name)
const version = await client.record.head(name)
const exists = await client.record.has(name)
const result = await client.rpc.make(name, data)
const users = await client.presence.getAll()
} catch (e) {
console.log('Error occurred', e)
}
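The "promise when no callback is provided" behaviour can be sketched with a small stand-in function. This is not the SDK's code: `has` here is a hypothetical local function, `knownRecords` is made-up data, and the lookup is synchronous purely to keep the example short.

```typescript
type Callback<T> = (error: string | null, result?: T) => void

// Made-up data standing in for whatever the server would know about
const knownRecords = new Set<string>(['users/a'])

// Overloads: without a callback a promise comes back, with one nothing is returned
function has (name: string): Promise<boolean>
function has (name: string, callback: Callback<boolean>): void
function has (name: string, callback?: Callback<boolean>): Promise<boolean> | void {
  // Hypothetical lookup; a real client would send a message and wait for the response
  const lookup = (done: Callback<boolean>): void => done(null, knownRecords.has(name))

  if (callback) {
    lookup(callback)
    return
  }
  return new Promise<boolean>((resolve, reject) => {
    lookup((error, result) => error ? reject(new Error(error)) : resolve(result === true))
  })
}
```

Because the no-callback form returns a promise rather than the object the call was made on, code that previously chained further calls onto the return value needs to `await` it (or use `.then`) instead.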
- Listening
The listening API has been ever so slightly tweaked in order to simplify removing an active subscription.
Previously, when an active provider was started you would usually need to store it in a higher scope, for example:
const listeners = new Map()
client.record.listen('users/.*', (name, isSubscribed, { accept, reject }) => {
  if (isSubscribed) {
    // Start providing data and remember the interval so it can be stopped later
    const updateInterval = setInterval(updateRecord.bind(this, name), 1000)
    listeners.set(name, updateInterval)
    accept()
  } else {
    // Nobody is subscribed anymore, stop providing
    clearInterval(listeners.get(name))
    listeners.delete(name)
  }
})
Whereas now we can instead do:
client.record.listen('users/.*', (name, { accept, reject, onStop }) => {
  const updateInterval = setInterval(updateRecord.bind(this, name), 1000)
  accept()
  // The cleanup lives in the same closure, so no higher-scope map is needed
  onStop(() => clearInterval(updateInterval))
})
TLDR:
Binary Protocol
The driver behind pretty much all of the V4 refactor was our move from our old text based protocol to binary. It makes building SDKs and new features so much easier. Seriously. LIKE SO MUCH EASIER.
Okay so first things first, the structure of text vs binary messages:
V3 - Text:
TOPIC | ACTION | meta1 | meta2 | ...metaN | payload +
This string had the initial TOPIC and ACTION read by the parser to find out where to route it, and the rest of the data was figured out within the code module that dealt with it. This gave some benefits, like only parsing a full message once it's actually required, but it also meant that the message parsing code was distributed, and adding, for example, a meta field would require lots of refactoring. Tests also had to create text-based messages even when testing internal code paths. Payload serialization also didn't use JSON, but instead used a custom form of serialization to minimize bandwidth: U for undefined, T for true, F for false, O for object, an S prefix for string and an N prefix for number.
So the message objects in the V3 SDKs and server looked like:
{
"topic": "R",
"action": "S",
"data": ["A", "recordName"]
}
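The custom V3 payload serialization described above can be sketched roughly as follows. The function names `toTyped` and `fromTyped` are hypothetical, and encoding objects as JSON after the `O` prefix is an assumption; only the prefixes themselves come from the description above.

```typescript
type Payload = undefined | boolean | string | number | object

// Prefix each value with a single character describing its type
function toTyped (value: Payload): string {
  if (value === undefined) return 'U'
  if (value === true) return 'T'
  if (value === false) return 'F'
  if (typeof value === 'string') return 'S' + value
  if (typeof value === 'number') return 'N' + value.toString()
  // Assumption: objects are JSON-encoded after the O prefix
  return 'O' + JSON.stringify(value)
}

// Reverse of toTyped: read the prefix, then interpret the rest
function fromTyped (typed: string): Payload {
  const prefix = typed.charAt(0)
  const rest = typed.slice(1)
  switch (prefix) {
    case 'U': return undefined
    case 'T': return true
    case 'F': return false
    case 'S': return rest
    case 'N': return parseFloat(rest)
    case 'O': return JSON.parse(rest)
    default: throw new Error(`Unknown type prefix: ${prefix}`)
  }
}
```

Every module that handled a payload needed to know about this scheme, which is part of why parsing logic ended up spread across the codebase.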
V4 - Binary:
The binary protocol is implemented using protobuf. We chose protobuf because of its wide support in other languages, how easy its message formats are to define, and how quickly we managed to get it implemented.
The main message is simply this:
message Message {
TOPIC topic = 2;
bytes message = 3;
}
Individual messages, in turn, use a combination of an action enum and fields.
For example, the event message looks something like this:
message EventMessage {
required EVENT_ACTION action = 1;
string data = 2;
string correlationId = 3;
bool isError = 4;
bool isAck = 5;
string name = 6;
repeated string names = 7;
string subscription = 8;
TOPIC originalTOPIC = 10;
EVENT_ACTION originalAction = 11;
}
Once decoded within the JS SDK, an example message looks like this:
{
"topic": 3,
"action": 2,
"isAck": true,
"name": "event"
}
This makes writing code a lot easier. At the time of writing, the full message API that can be consumed is as follows:
export interface Message {
topic: TOPIC
action: ALL_ACTIONS
name?: string
isError?: boolean
isAck?: boolean
data?: string | Buffer
parsedData?: RecordData | RPCResult | EventData | AuthData
parseError?: false
// listen
subscription?: string
originalTopic?: TOPIC | STATE_REGISTRY_TOPIC
originalAction?: ALL_ACTIONS
names?: Array<string>
reason?: string
// connection
url?: string
protocolVersion?: string
// record
isWriteAck?: boolean
correlationId?: string
path?: string
version?: number
versions?: { [index: string]: number }
// state
checksum?: number
fullState?: Array<string>
serverName?: string
registryTopic?: TOPIC
// cluster
leaderScore?: number
externalUrl?: string
role?: string
// lock
locked?: boolean
}
Using this approach has made adding new features and maintaining current ones significantly easier. And given the combination of TOPICs and ACTIONs, we can pretty much ensure we'll be able to extend it without running out of space any time soon.
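To give a feel for why a structured message is easier to work with than a delimited string, here is a minimal sketch of handling a decoded message. The enums are trimmed down and their numeric values are illustrative assumptions, and `summarize` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical, trimmed-down enums; the real values live in the protocol definition
enum TOPIC { EVENT = 3, RECORD = 4 }
enum EVENT_ACTION { EMIT = 1, SUBSCRIBE = 2 }

// Subset of the Message interface shown above
interface Message {
  topic: TOPIC
  action: number
  name?: string
  isAck?: boolean
  isError?: boolean
}

// Route on typed fields instead of splitting and re-parsing a string
function summarize (message: Message): string {
  const name = message.name ?? 'unknown'
  if (message.isError) {
    return `error on ${name}`
  }
  switch (message.topic) {
    case TOPIC.EVENT:
      if (message.action === EVENT_ACTION.SUBSCRIBE && message.isAck) {
        return `subscribed to event ${name}`
      }
      return `event action ${message.action} for ${name}`
    default:
      return `unhandled topic ${message.topic}`
  }
}
```

Adding a new optional field only touches the interface and the handlers that care about it, rather than every place that splits the message string.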
Cons
It wouldn't be fair to say that this overhaul has no downsides. There have been some sacrifices that we had to make along the way.
1) If you count messages in the billions, those extra bytes add up. Data bandwidth is quite expensive on cloud systems, so lack of compression isn't just a latency issue anymore. Protobuf has a very compact encoding, which beats JSON objects in most cases.
Why yet another proprietary standard?
Because deepstream offers some very specific features, and has a lot more on the way. For example, we currently have a unique concept such as listening. Trying to use a realtime standard (of which there aren't many) would seriously hinder development. That being said, deepstream allows swapping out protocols quite easily as long as there's an interop layer, so feel free to create compatibility protocols to work with your favourite SDKs!
Offline Storage
Offline storage is probably the biggest feature in 4.0, so I’m really happy to say it has been added. But offline storage is one of the hardest things we worked on due to the insane number of states it introduces. So it’s with a bit of regret that I say you should not use it if you want to immediately go into production! What would be extremely helpful is if you have it enabled in development in case you run into issues, and hopefully once all small glitches are resolved I’ll release a 4.1 with it being officially production ready. If you are using a data pattern where you don’t have to do updates via deepstream (only consuming messages for visual realtime updates) then ignore that, production ready it is!
So why use it at all? Because it gives you full record usage without a connection. Pretty slick!
The way offline works is as follows (this is just one path, but the most likely one):
- User opens app first time, data is requested from server and stored on client side.
- User loses connection to app, but from an app perspective functionality remains the same
- User updates multiple things while offline, which marks the record as dirty and updates the value in local storage
- User is back online
- Deepstream requests the version of the record on the server. If it’s the same as the local one, it sends all the modifications as the next update. If it isn’t, it requests the data and resolves the merge conflict.
It is production ready for read-only scenarios because the record is never marked as dirty, which means the server side always wins:
- User opens app first time, data is requested from server and stored on client side.
- User loses connection to app, but from an app perspective functionality remains the same
- User is back online
- Deepstream requests the version of the record on the server. If it’s the same as the local one, it doesn’t do anything more. If it isn’t, it requests the data and assumes it’s the latest (using a remote wins algorithm).
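The two reconnect flows above boil down to a small decision based on the local dirty flag and a version comparison. The sketch below captures only that decision; `LocalRecord`, `SyncAction` and `decideSync` are hypothetical names, and the real client has many more states to deal with.

```typescript
// Hypothetical view of what is kept in local storage for a record
interface LocalRecord {
  version: number
  data: unknown
  dirty: boolean // set when the record was modified while offline
}

type SyncAction =
  | 'nothing'        // versions match and there are no local changes
  | 'send-update'    // versions match, push offline changes as the next update
  | 'accept-remote'  // versions differ, no local changes: remote wins
  | 'merge'          // versions differ and there are local changes: resolve the conflict

// Decide what to do once the server has reported its version of the record
function decideSync (local: LocalRecord, remoteVersion: number): SyncAction {
  if (local.version === remoteVersion) {
    return local.dirty ? 'send-update' : 'nothing'
  }
  return local.dirty ? 'merge' : 'accept-remote'
}
```

In the read-only scenario `dirty` is never set, so only the `nothing` and `accept-remote` outcomes can occur, which is why that path is the simpler one.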
Typescript
We converted the majority of the codebase to TypeScript, for the benefit of future code maintenance as well as making it easier for people to contribute. This also means consumers of the SDK can now directly use the generated declaration files when installing deepstream rather than maintaining separate bindings.
Services
We added a few services to improve the way things work in the client.
Timer Service
We now have a timer service that all timers in the SDK are registered against, rather than using the native Node.js timeouts. This gives us two benefits. First off, it’s just generally much quicker: if you do a CPU profile of native timeouts you’ll notice the time used is noticeable, whereas we instead have a single interval that polls the timeout queue. Secondly, it allows us to easily deal with timing slips. What this means is that in the future, rather than timeouts being fired much later due to the CPU being blocked, the timer registry can allow certain timeouts to be either ignored or reset.
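The idea of a single interval polling a queue of timeouts can be sketched as below. `TimerRegistry` and its method names are hypothetical and this is not the SDK's implementation; the clock is injectable only so the behaviour can be exercised without waiting in real time.

```typescript
interface TimerEntry {
  id: number
  dueAt: number
  callback: () => void
}

// Hypothetical registry: many logical timeouts, one native interval
class TimerRegistry {
  private queue: TimerEntry[] = []
  private nextId = 0
  private interval: ReturnType<typeof setInterval> | null = null

  constructor (private readonly now: () => number = () => Date.now()) {}

  // Register a timeout and return an id that can be used to cancel it
  public add (duration: number, callback: () => void): number {
    const id = this.nextId++
    this.queue.push({ id, dueAt: this.now() + duration, callback })
    return id
  }

  public remove (id: number): void {
    this.queue = this.queue.filter(entry => entry.id !== id)
  }

  // Fire every timeout that is due; this is what the single interval calls
  public poll (): void {
    const now = this.now()
    const due = this.queue.filter(entry => entry.dueAt <= now)
    this.queue = this.queue.filter(entry => entry.dueAt > now)
    due.forEach(entry => entry.callback())
  }

  public start (pollEveryMs: number = 20): void {
    if (this.interval === null) {
      this.interval = setInterval(() => this.poll(), pollEveryMs)
    }
  }

  public stop (): void {
    if (this.interval !== null) {
      clearInterval(this.interval)
      this.interval = null
    }
  }
}
```

Since `poll` sees how late each entry is (`now - dueAt`), a registry like this is also a natural place to decide whether a badly delayed timeout should still fire, be dropped, or be reset.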
Bulk Subscription Service
We now register our subscriptions via a service rather than directly sending a subscription message. This allows us to generate a single subscription message for up to thousands of records, with a single ack rather than thousands.
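A rough sketch of the batching idea: subscriptions requested close together are collected and sent as one message carrying all the names. `BulkSubscriptionBuffer`, the `BulkMessage` shape and the `'SUBSCRIBE'` action string are assumptions for illustration; only the `names` and `correlationId` fields mirror the message API listed earlier.

```typescript
// Hypothetical shape of the single message sent for a batch of subscriptions
interface BulkMessage {
  action: 'SUBSCRIBE'
  names: string[]
  correlationId: string
}

class BulkSubscriptionBuffer {
  private pending = new Set<string>()
  private flushTimer: ReturnType<typeof setTimeout> | null = null
  private correlation = 0

  constructor (
    private readonly send: (message: BulkMessage) => void,
    private readonly flushDelayMs: number = 0
  ) {}

  // Queue a name; the first name in a batch schedules the flush
  public subscribe (name: string): void {
    this.pending.add(name)
    if (this.flushTimer === null) {
      this.flushTimer = setTimeout(() => this.flush(), this.flushDelayMs)
    }
  }

  // Send everything queued so far as one message, expecting a single ack
  public flush (): void {
    if (this.flushTimer !== null) {
      clearTimeout(this.flushTimer)
      this.flushTimer = null
    }
    if (this.pending.size === 0) {
      return
    }
    const names = Array.from(this.pending)
    this.pending.clear()
    this.send({ action: 'SUBSCRIBE', names, correlationId: String(this.correlation++) })
  }
}
```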
Connection Service
We now have a connection service that is driven by a state machine that can be consumed by any class to send messages as well as listen to any connection lifecycle change.
Current API hooks for reconnection logic are:
public onLost (callback: () => void): void
public onReestablished (callback: () => void): void
public onExitLimbo (callback: () => void): void
For those who looked into the SDK internals before, you’ll notice the introduction of a limbo state. What this means is the connection was just lost, but you don’t want API calls to immediately start failing, as a reconnect is likely to happen immediately. As such, feature developers now have the option of buffering those requests until either the connection is reestablished or the buffer timeout is exceeded, at which point all API calls will fail with a not connected error.
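The limbo behaviour can be sketched with a tiny state machine and a feature that buffers requests while in limbo. `ConnectionSketch`, `BufferedRequests`, the state names and the `'NOT_CONNECTED'` error string are all hypothetical; only the three hook names come from the list above, and the exact moments the real SDK fires them are assumed here.

```typescript
type Hook = () => void
type Done = (error: string | null) => void

// Hypothetical, heavily simplified connection state machine
class ConnectionSketch {
  private state: 'OPEN' | 'LIMBO' | 'OFFLINE' = 'OPEN'
  private lost: Hook[] = []
  private reestablished: Hook[] = []
  private exitLimbo: Hook[] = []

  public onLost (callback: Hook): void { this.lost.push(callback) }
  public onReestablished (callback: Hook): void { this.reestablished.push(callback) }
  public onExitLimbo (callback: Hook): void { this.exitLimbo.push(callback) }

  public getState (): 'OPEN' | 'LIMBO' | 'OFFLINE' { return this.state }

  // The transport dropped: enter limbo instead of failing straight away
  public connectionLost (): void {
    this.state = 'LIMBO'
    this.lost.forEach(callback => callback())
  }

  public connectionReestablished (): void {
    this.state = 'OPEN'
    this.reestablished.forEach(callback => callback())
  }

  // Assumption: called when the buffer timeout is exceeded without a reconnect
  public limboTimedOut (): void {
    if (this.state !== 'LIMBO') return
    this.state = 'OFFLINE'
    this.exitLimbo.forEach(callback => callback())
  }
}

// A feature that buffers requests while the connection is in limbo
class BufferedRequests {
  private buffer: Done[] = []

  constructor (private readonly connection: ConnectionSketch) {
    connection.onReestablished(() => this.drain(null))
    connection.onExitLimbo(() => this.drain('NOT_CONNECTED'))
  }

  public request (done: Done): void {
    const state = this.connection.getState()
    if (state === 'LIMBO') {
      this.buffer.push(done)
    } else {
      // Pretend the request succeeds when open and fails when offline
      done(state === 'OPEN' ? null : 'NOT_CONNECTED')
    }
  }

  private drain (error: string | null): void {
    const queued = this.buffer
    this.buffer = []
    queued.forEach(done => done(error))
  }
}
```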