Can your caching solution give room to the Room Library on Android?

In this article you will read about an example implementation of an offline-first approach that uses a relational database for HTTP request caching.

Introduction to caching theory

Caching is a concept of huge importance in computer science.

Can you imagine travelling 2 hours to a warehouse to do your daily shopping? Unless you live in the wilderness (or a car-centric suburbia) you would probably look for the most frequently used groceries in a nearby store instead. When you have found what you were looking for – nice hit! Only when you miss a product in the local store would you look for it in a backing store.

This simple real-life observation has a direct counterpart in computer science – in this context, the nearby store represents a cache. Instead of serving groceries to the customer, it serves data to the caching client for the sake of faster access. The cached data is a copy of the data stored in the backing store. Each item of data is identified by a tag, which allows the client to associate locally cached data with the data in the backing store. The pair (tag, data) is known as a cache entry.

The caching client searches the cache by tag, hoping to find the entry associated with it there (a cache hit). Only when no data associated with the tag can be found in the cache (a cache miss) does the client look for the data in the backing store.
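The hit-or-miss flow above can be sketched in a few lines of Kotlin; `Cache` and `backingStore` are illustrative names here, not a real API:

```kotlin
// Look in the cache first; fall back to the backing store only on a miss,
// remembering the fetched value for the next lookup.
class Cache<Tag, Data : Any>(private val backingStore: (Tag) -> Data) {
    private val entries = mutableMapOf<Tag, Data>()

    fun find(tag: Tag): Data =
        entries[tag]                      // cache hit
            ?: backingStore(tag).also {   // cache miss: fetch and remember
                entries[tag] = it
            }
}
```

Repeated lookups for the same tag then reach the backing store only once.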

A real-life counterpart would be looking for a specific product – let's say a bottle of water. You would go to the warehouse to look for it only when you did not find it in your local store.

In computer science there are many examples of caching – CPU cache, disk cache and web cache, just to name a few.

Mobile challenges

In the mobile environment, one type of cache is especially interesting. Due to their characteristics, mobile devices often have to work with limited Internet access – whether because of the quality of the mobile connection, power consumption constraints or data transfer costs.

Therefore, limiting network traffic by caching entries once they are fetched from the server seems a reasonable option in a mobile environment.

Many apps have to meet an "offline mode" requirement. That is: they have to continue working even though they cannot communicate with the server. Handling cache misses is a crucial question here.

Not only networking capabilities but also cache size might be limited in the mobile environment. Smartphones usually have smaller storage than their desktop counterparts.

And this is where a relational-database-based caching solution shines!

Caching strategies

But first, let us have a look at a typical HTTP cache.

HTTP response caching

In the case of an HTTP cache, a cache entry consists of a URL as the tag and a resource as the data. The HTTP protocol specifies the Cache-Control header, which is used to control cache lifetime and other policies.

Whenever there is no entry for the requested URL in the cache, no data can be served to the client.

Let’s suppose we are using a REST API which serves TODOs as JSON objects.

They are displayed in a paged list on one screen. The paged TODOs are downloaded from the server:

{baseUrl}/todos?offset={offset}&limit={limit}

The server returns TODOs sorted by creation date (descending).

But displaying a list of TODOs hardly seems enough to be the sole responsibility of a mobile app, does it?! Let's add another screen, where a TODO can be searched by a title containing a given phrase:

{baseUrl}/todos?title={title}

Let's assume that the screen with the paged list of TODOs is displayed before the "Search TODO" screen. Sounds reasonable, doesn't it?

Using standard HTTP caching, after visiting the first screen we would have the following entry in the cache:

URL:

{baseUrl}/todos?offset=0&limit=3

Resource:

[
    {
        "title": "Prepare soup",
        "description": "Chicken soup",
        "createdAt": "2022-09-08T18:25:43.511Z"
    },
    {
        "title": "Prepare salad",
        "description": "Greek salad",
        "createdAt": "2022-09-07T18:25:43.511Z"
    },
    {
        "title": "Walk the dog",
        "description": "",
        "createdAt": "2022-09-06T18:25:43.511Z"
    }
]

Okay, a page of TODOs has been displayed, but the user wants to find only those containing the phrase "prepare". Therefore, they open the "Search TODO" screen.

Even though the previously cached response contains 2 TODOs that match the query:

Find TODOs whose title contains the phrase "prepare"

there is no cached entry that matches the URL:

{baseUrl}/todos?title=prepare

That means there would be nothing to display to the user if they searched for "prepare" in offline mode, as there is no corresponding tag in the cache.

Even if the user had had the corresponding entry in the cache, 2 identical objects would be present in both responses.

URL:

{baseUrl}/todos?title=prepare

Resource:

[
    {
        "title": "Prepare soup",
        "description": "Chicken soup",
        "createdAt": "2022-09-08T18:25:43.511Z"
    },
    {
        "title": "Prepare salad",
        "description": "Greek salad",
        "createdAt": "2022-09-07T18:25:43.511Z"
    },
    {
        "title": "Prepare milkshake",
        "description": "",
        "createdAt": "2022-09-02T18:25:43.511Z"
    }
]

So, as we can see – not only are the offline capabilities of the app limited, but some pieces of data are also duplicated in local storage. As already discussed, these pain points are especially relevant in the mobile environment.
But there is a way to overcome these obstacles while still having a fully functional caching solution.

Relational database-based caching

Instead of storing data in a dictionary-like cache, we can use a relational database.
I strongly recommend reading the following article, where the concept is described in detail. Although it is brilliant and contains dozens of helpful diagrams, the solution recommended here is slightly different, as will be demonstrated later on.

The API of such a solution may even be similar to a simple HTTP cache, extended to handle cache misses in a nicer way:

interface DataSource {
    suspend fun find(key: Url): List<Todo>
}

Instead of returning an empty list of TODOs when there is no corresponding key in the cache, a relational-database-based cache might run a local SQL query to provide a partially valid response. And this is the key difference between the proposed solution and the aforementioned article: we may want to receive partially valid results when it is not possible to reach the remote data source. This is possible because, in contrast to standard HTTP caching, relational-database-based caching is somewhat "intelligent". Not only does it hold the cached data, but it also contains the logic needed to recreate server responses. And that logic is expressed in the form of SQL queries.
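To illustrate the "partially valid response" idea, here is a toy sketch using a plain Kotlin list in place of a real SQL query: even if the exact search query was never sent to the server, already-cached rows can still be filtered locally.

```kotlin
// Toy illustration: filtering locally cached TODOs by a title phrase,
// standing in for a `SELECT ... WHERE title LIKE ...` query.
data class Todo(val title: String, val description: String)

fun findByTitle(cachedTodos: List<Todo>, phrase: String): List<Todo> =
    cachedTodos.filter { it.title.contains(phrase, ignoreCase = true) }
```

With the three TODOs cached earlier, searching for "prepare" would yield two results even though `{baseUrl}/todos?title=prepare` was never requested.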

So the URL somehow needs to be transformed into an SQL query. A nice way of doing this is extracting the query into a separate type, which gives us type safety and separation of concerns – obviously, an HTTP query and an SQL query are totally independent kinds of queries.

interface DataSource {
    suspend fun find(key: Query): List<Todo>
}

And to be semantically correct, let us rename find to execute – after all, we execute queries rather than find them.

interface DataSource {
    suspend fun execute(query: Query): List<Todo>
}

Query is a declarative description of the query we want to run – no matter whether it is an HTTP request or an SQL query under the hood. Simply put, a usage of the Parameter Object pattern.

sealed class Query {
    data class FindByPage(val pageNumber: Int, val pageSize: Int): Query()
    data class FindByTitle(val title: String): Query()
}

These queries are then translated into HTTP requests and SQL queries in the RemoteSource and LocalSource respectively.
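As a self-contained sketch of that translation (the request paths, table and column names are assumptions, not the article's actual code), each Query variant could map to both worlds like this:

```kotlin
// One declarative Query type, two translations: an HTTP request path for
// the RemoteSource and an SQL string for the LocalSource.
sealed class Query {
    data class FindByPage(val pageNumber: Int, val pageSize: Int) : Query()
    data class FindByTitle(val title: String) : Query()
}

fun Query.toHttpPath(): String = when (this) {
    is Query.FindByPage -> "/todos?offset=${pageNumber * pageSize}&limit=$pageSize"
    is Query.FindByTitle -> "/todos?title=$title"
}

fun Query.toSql(): String = when (this) {
    is Query.FindByPage ->
        "SELECT * FROM todo ORDER BY created_at DESC " +
            "LIMIT $pageSize OFFSET ${pageNumber * pageSize}"
    is Query.FindByTitle ->
        // '?' is a bind placeholder for the title phrase
        "SELECT * FROM todo WHERE title LIKE '%' || ? || '%'"
}
```

Because Query is a sealed class, `when` expressions like these are exhaustively checked at compile time.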

Wait! What about controlling cache lifetime and the invalidation of queries?!
Of course, these responsibilities belong to the LocalSource, not the remote one.
Let us extend our DataSource interface so that all the aforementioned use cases are covered.

interface LocalSource: DataSource {
    // 1. Has the following query already been cached?
    suspend fun exists(key: Query): Boolean
    // 2. If yes, then when?
    suspend fun getUpdateTimestamp(key: Query): Instant
    // 3. Recently, but I want to make Pull to Refresh - thus invalidate the query!
    suspend fun invalidate(key: Query)
    // 4. ...or invalidate all cached TODOs :)
    suspend fun invalidateAll()
}
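To show how the two sources might cooperate, here is a sketch of an offline-first repository. All names are assumptions, the interfaces are simplified, and `suspend` is dropped so the sketch stays runnable without coroutines: serve fresh local data, otherwise try the network, and fall back to possibly partial local results when offline.

```kotlin
import java.time.Duration
import java.time.Instant

data class Todo(val title: String)

sealed class Query {
    data class FindByTitle(val title: String) : Query()
}

interface Source {
    fun execute(query: Query): List<Todo>
}

interface Local : Source {
    fun exists(query: Query): Boolean
    fun getUpdateTimestamp(query: Query): Instant
    fun save(query: Query, todos: List<Todo>, at: Instant)
}

class TodoRepository(
    private val local: Local,
    private val remote: Source,
    private val maxAge: Duration,
    private val now: () -> Instant = Instant::now,
) {
    fun execute(query: Query): List<Todo> {
        val fresh = local.exists(query) &&
            Duration.between(local.getUpdateTimestamp(query), now()) < maxAge
        if (fresh) return local.execute(query)       // fresh cache hit
        return try {
            // refresh the cache from the network
            remote.execute(query).also { local.save(query, it, now()) }
        } catch (offline: Exception) {
            local.execute(query)                     // offline: partial results
        }
    }
}
```

The freshness window `maxAge` plays the role Cache-Control plays in plain HTTP caching; the catch branch is where the "partially valid response" is served.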

In the case of a map-like HTTP cache implementation, supporting these queries would be quite easy – conceptually, we might extend each cache entry with an update timestamp and an invalidation flag.

But what about relational-database-based caching? The Resource is just an abstraction – the result of a dynamically run SQL query, not a tangible entity! An entity would be a TODO itself. We might add information about the update timestamp and invalidation status to our business entity, and we get the following table:

TODO

  • id: integer
  • title: text
  • description: text
  • created_at: real
  • update_timestamp: real
  • invalidated: integer

Keep in mind that the TODOs which in the end form the result of the query may in fact come from different HTTP responses (do you remember the example with Find TODOs whose title contains the phrase "prepare"?). So storing update_timestamp and invalidated in the TODO table will not be enough.

Having discussed the resource, what is left of the original map-like structure is the per-query metadata: the query tag, its update timestamp and its invalidation status.

Which storage solution you choose for this structure is up to you. It might be a file, SharedPreferences, or an extra database table. In the last case, we can implement the LocalSource using a relational database consisting of two tables.
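For the per-query metadata, the tag needs a stable textual form. A possible sketch (the tag format here is an assumption) derives it from the Query itself; the tag plays the role the URL played in plain HTTP caching:

```kotlin
// Deriving a stable cache tag for the query-metadata table from a Query.
sealed class Query {
    data class FindByPage(val pageNumber: Int, val pageSize: Int) : Query()
    data class FindByTitle(val title: String) : Query()
}

fun Query.toCacheTag(): String = when (this) {
    is Query.FindByPage -> "page:$pageNumber:$pageSize"
    is Query.FindByTitle -> "title:$title"
}
```

Equal queries always map to equal tags, so the metadata row can be looked up by the tag alone.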

Android devices support SQLite databases, but using raw SQLite is no longer in fashion! Let's gear up with a Jetpack and get ready for flight!

Room Library

In the good old days, using a database on Android involved some error-prone pieces – no compile-time query validation, raw SQL strings, and the necessity of mapping query results to objects manually, just to name a few.

In other environments such challenges are solved by object-relational mapping (ORM) frameworks. Some of them are quite heavy and therefore used primarily in backend environments.

The team behind the Android ecosystem decided to provide us, the developers, with the ORM goodies in the form of the Room library – part of Android Jetpack. Not only does it solve the aforementioned issues, but it also serves as a stable ORM solution.

Defining entities comes with ease:

@Entity(tableName = "todo")
data class Todo(
    @PrimaryKey
    @ColumnInfo(name = "id")
    val id: Int,

    @ColumnInfo(name = "title")
    val title: String,

    @ColumnInfo(name = "description")
    val description: String,

    // note: java.time.Instant columns need a Room @TypeConverter
    @ColumnInfo(name = "created_at")
    val createdAt: Instant,

    @ColumnInfo(name = "update_timestamp")
    val updateTimestamp: Instant,

    @ColumnInfo(name = "invalidated")
    val invalidated: Boolean,
)

How to access the stored data through DAOs, how to handle database migrations and much more can be found in the official documentation.

Conclusions

In this article we found out how knowledge about the logic used to construct server responses, combined with a local database, can be used to implement an efficient caching solution in a mobile environment using the Room library.

You might ask: how big can the cache grow? Should we limit its size? And how?
A discussion of these subjects is worthy of a separate article, but if you are interested in finding out more, look up cache replacement policies and caching algorithms.

Written by
Piotr Chorościn
