When adding a distributed cache like Redis to a solution, it is important to define the data model and sets in a way that enables us to easily fetched from the cache while maintaining a similar structure to our SQL database to minimize the impact on our client applications. The challenge is to define the approach that can work within the technical specifications of the cache, yet it supports our solution.
Review the SQL Data Model
When building a good model on the cache, we need to understand how the data is retrieved from the database from the client applications. We also need to learn how to leverage the Redis key/value structure to facilitate the same data access. We can practice these principles by looking at a data model from devices that measure telemetry information like temperature and noise.
By looking ate entity above, we can notice that we can query the data/measures from this table by using the date (processed) and deviceId dimensions by using a SQL query like the one shown next:
SELECT [telemetryId]
,[deviceId]
,[temperature]
,[humidity]
,[sound]
,[processed]
,[created]
FROM [dbo].[Telemetry] (nolock)
WHERE processed > dateadd(MM,-1, getdate())
AND deviceId = '0ZA-112'
ORDER BY telemetryId DESC
|
We can look at the output of this query to help us understand what data needs to be stored in the cache. However, the first question that comes to mind is how do we do this using Redis? To answer that, let’s try to understand how the database is structured and then apply those concepts to Redis.
Data Structure on Redis
To build our data structure on Redis, we first start by mapping the database, schema and table names to a Redis Key that we can use to pull the data. For example, our table is in the database device with a table named telemetry. Using that information, we can create a namespace like device.dbo.telemetry. We can use this namespace to construct our Redis key, or if the schema name is not meaningful (i.e. dbo), we can shorten the namespace by removing the schema name part and use this instead: device.telemetry.
Now that we have a good Redis key, the next challenge is the ability to select the record by deviceId. On SQL server, this is done by doing a lookup for the deviceId column as was shown on the previous SQL statement.
The way we can do this on Redis is by appending the deviceId to our existent key. This generates an extended key that includes the deviceId value with this format device.telemetry.deviceId (deviceId is replaced by the value). By setting this key with the unique value, we can now query the cache for the information associated with a specific device.
How do we query by dates?
The next part of the SQL query predicate enables us to select data by using a date. That gets a bit more challenging since a date range does not match the concept of a key, so we cannot just do device.telemetry.deviceId.datetime as this limits the key to a specific date. However, Redis supports the concept of storing a sorted set or list with a specified key and a score number that can be sorted in an ascending way (older dates to most recent dates).
But what do we need a score when we need to fetch the information by date and time? Well, in programming languages, a date time stamp can be represented as a numeric value in milliseconds. This does enable us to convert a DateTime field into a numeric field and use it as a score within Redis by using the following command.
zadd device.telemetry.0ZA-112 1565379958535 "my data here"
|
By storing data with this numeric value/score, we can query the DateTime numeric value by using this command zrangebyscore (or zrevrangebyscore to reverse the order - descending). This command enables us to select data using a date range similar to how we would do in SQL with a between operator on the where clause. The long numbers are just DateTime values that have been converted using the native JavaScipt Date.getTime() API.
zrangebyscore device.telemetry.0ZA-112 1565379958535 1565379958999
|
How do we store the rest of the fields/measures?
Now that we have a way to pull data for a specific deviceId and a date range, how do we associate the data with that information? Well, this is done by serializing the rest of the information into a JSON payload and storing it into the cache as a string value. When we add a record to a sorted list with a score, we can associate data with that score as shown next:
zadd device.telemetry.0ZA-112 1565379958535 '{"telemetryId": 2076,"deviceId": "0ZA-112","temperature": 34.000}’
|
We can now query all the information within a particular DateTime range for a selected deviceId.
Selecting the data from multiple devices and date range:
What if we want to select the telemetry data for all devices within a particular date range? The answer to this requires us to review our approach. When we add the deviceId to the Redis key, we have set a constraint on the sorted list, and we can only query by the deviceId. The command zrangebyscore does not support the use of wildcard character (*). We can, however, change the key and remove the deviceId. This results in a new key with this namespace: device.telemetry.
By removing the deviceId from the key, we can now run the same operations with a date range, and all the devices that processed information within the date range should be selected. If the data is too massive, we can accept the trade-off of memory usage for performance by creating two sorted sets, one with the deviceId and another one without. This really depends on the particular use case. Always keep in mind, this is a cache, and the data should expire within a time constraint.
Conclusion
It looks like we have done a good job of reproducing our SQL data models into Redis structure that can support the same operations that we do on SQL. We have also been able to work within the specifications of the Redis commands to set our data structure in a format that enables us to retrieve the data with similar operations that we support on SQL.
0 comments :
Post a Comment
What do you think?