Investigating the Browser Caching Concept - ArvanCloud

16 Jul 2019

Browser caching means saving some (or all) of a website’s resources in the user’s browser so that they are not downloaded again on every visit to the website. The browser keeps these resources in its local cache for a specific period of time. Once that period ends, the browser must, depending on the caching policy, send a request to the server before it can use the resource again.

Caching policies are exchanged between the server and the user through HTTP headers. They cover how long a resource is saved and how it is saved, which users can save it (the browser or CDN edge servers), and whether the resource must be downloaded again after it expires.

This article discusses the most important HTTP headers in browser caching and investigates ArvanCloud’s general resource caching policy.

Validating Cached Responses Using ETag Headers

The server uses the ETag HTTP header to exchange validation tokens. Validation tokens make it possible to download a cached resource again only when it has changed.

To better understand this subject, imagine that 60 seconds is the maximum time a resource may be kept in the browser cache. After this period ends, the browser can no longer use that resource and must send a request to receive it again from the server. If the resource has not changed in the meantime, however, the browser receives the same data it already has in its cache, so downloading it again is wasteful.

The validation token, carried in the ETag header, was introduced to resolve this issue. When the server sends a resource to the browser for the first time, it includes an ETag header containing the validation token among the other HTTP response headers.

The validation token is typically a hash of the contents of the file sent. Once the time for saving the resource in cache ends, the browser sends this token back to the server in the If-None-Match HTTP request header. The server compares the token with the current resource and, if nothing has changed, responds with HTTP status code 304 (Not Modified). This response means that the browser can use the resource saved in its cache for another 60 seconds, and there is no need to download it again. Not having to download the resource again saves time and bandwidth and also reduces latency when the user is accessing the data.

The browser does all of this automatically, and web developers only need to make sure that the server supports the ETag header. ArvanCloud edge servers fully support this header.
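The exchange described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular server implements it: the function names `make_etag` and `respond` are hypothetical, the token is assumed to be a truncated SHA-256 hash, and weak validators (the `W/` prefix) are ignored.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a validation token from the content (a truncated SHA-256 hash, by assumption)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET request, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    # If-None-Match may carry several comma-separated tokens.
    candidates = [token.strip() for token in (if_none_match or "").split(",")]
    if etag in candidates:
        # The cached copy is still valid: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body


# First visit: full download. Second visit after expiry: validation only.
first_status, first_headers, first_body = respond(b"body { color: black }")
second_status, _, second_body = respond(b"body { color: black }", first_headers["ETag"])
# The file changed on the server, so the old token no longer matches.
changed_status, _, _ = respond(b"body { color: navy }", first_headers["ETag"])
```

In this sketch the second request costs only a round trip and a set of headers, which is where the bandwidth saving comes from.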

Expires Header

The Expires header specifies a precise date and time at which a resource expires. It is considered an older method of expiring a response; when a response also carries a Cache-Control header with max-age, the max-age value takes precedence.
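As a small illustration, the value of an Expires header is an HTTP date in GMT. The sketch below builds one with Python’s standard library; the helper name `expires_header` and the one-hour lifetime are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now: datetime, lifetime_seconds: int) -> str:
    """Format the expiry moment as an HTTP date; `now` must be timezone-aware UTC."""
    return format_datetime(now + timedelta(seconds=lifetime_seconds), usegmt=True)


# A response generated at noon UTC on 16 Jul 2019 that should expire one hour later.
generated_at = datetime(2019, 7, 16, 12, 0, 0, tzinfo=timezone.utc)
expires_value = expires_header(generated_at, 3600)
print("Expires:", expires_value)  # Expires: Tue, 16 Jul 2019 13:00:00 GMT
```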

Setting Caching Policies Using the Cache-Control Header

The HTTP Cache-Control header can be used to specify who can cache a resource (cacheability), under what conditions (revalidation), and for how long (expiration). This header can be present in requests as well as responses.

The Cache-Control header allows website administrators to specify how content received from the website’s main host web server is managed. These policies are determined by the commands (directives) specified in Cache-Control. The header can include multiple commands separated by commas (,). The most important among them are as follows:

  • no-cache: The user (browser or CDN edge server) may save the resource, but must validate it with the main server before each use. If ETags are also used, this exchange only confirms that the resource has not changed, and an unchanged resource does not need to be downloaded again. This saves bandwidth and time.

  • no-store: This command means that the browser, and all the devices between it and the main server (including CDN edge servers), are not allowed to save the resource in cache, and should reload it from the main server each time it’s needed. Banking information, for example, should not be cached; the user should request it from the main server and download the data completely on each access.

  • public: Using this command for a resource means that any user (browser or CDN edge server) can save this resource.

  • private: If this command is used for a resource, only the browser is able to save that resource, and not the devices between the browser and the main server. For example, the browser is permitted to cache an HTML page containing private user information, but CDN edge servers are not.

  • max-age: This command specifies, in seconds, the maximum time a resource may be kept in cache. After this time ends, the resource expires and the user (browser or CDN edge server) should request it from the server again. For example, the browser can save a resource in cache and use it for 60 seconds if its max-age is set to 60.

If the private command is not used explicitly in the Cache-Control header and only max-age is specified, the resource can be saved by any device, and there is no need to set the public command explicitly.

Common max-age values are as follows:

    • One minute: max-age=60

    • One hour: max-age=3600

    • One day: max-age=86400

    • One week: max-age=604800

    • One month: max-age=2628000

    • One year: max-age=31536000

  • s-maxage: The s stands for shared cache. This command is similar to max-age, but it applies only to shared caches such as CDNs, and the browser ignores it. If this command is set for a resource, the CDN uses its value and ignores max-age and the Expires header.

  • must-revalidate: This command specifies that once a cached resource has become stale (that is, its max-age has ended, so it may have changed or may no longer exist), the user (browser or CDN edge server) must first validate it with the main server, and is not permitted to use the old resource until validation is complete.

  • proxy-revalidate: This command is similar to must-revalidate, with the only difference being that it is specific to proxy servers.

  • no-transform: This command specifies that devices between the main server and the browser, including CDN edge servers, are not allowed to change the resource.

  • stale-while-revalidate: This command specifies a time in seconds during which the user (browser or CDN edge server) can keep using the old resource saved in cache while validating it with the main server.

  • stale-if-error: This command is similar to stale-while-revalidate, with the difference being that the user (browser or CDN edge server) can use the old resource saved in its cache only when the main server returns an error (status code 500, 502, 503, or 504) while validating.

  • immutable: This command tells the user (browser or CDN edge server) that the response body will not change over time, so there is no need to revalidate the resource until it has expired.
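To make the interplay between these commands concrete, the following sketch parses a Cache-Control value and works out how long a given cache may reuse the response without contacting the server. It is a simplified illustration under stated assumptions: the function names `parse_cache_control` and `freshness_lifetime` are hypothetical, quoted arguments containing commas are not handled, and only no-store, no-cache, private, max-age and s-maxage are considered.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {command: argument or None}."""
    commands: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        # Command names are case-insensitive, so normalise them to lowercase.
        commands[name.strip().lower()] = argument.strip().strip('"') or None
    return commands


def freshness_lifetime(commands: Dict[str, Optional[str]], shared_cache: bool) -> int:
    """Seconds the response may be reused without revalidation (0 means never)."""
    if "no-store" in commands or "no-cache" in commands:
        return 0
    if shared_cache:
        if "private" in commands:
            return 0  # CDN edge servers must not keep private responses
        if commands.get("s-maxage") is not None:
            return int(commands["s-maxage"])  # overrides max-age for shared caches
    if commands.get("max-age") is not None:
        return int(commands["max-age"])
    return 0


policy = parse_cache_control("public, max-age=7200, s-maxage=3600")
browser_seconds = freshness_lifetime(policy, shared_cache=False)
cdn_seconds = freshness_lifetime(policy, shared_cache=True)
```

With the header in the usage lines, the browser would keep the resource for two hours while a CDN edge server would keep it for one.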

Setting Correct Caching Policies

The following flowchart, originally produced by Ilya Grigorik, a Google developer, presents a good overview of optimized resource caching. It can be used to determine which command is best suited to a specific resource:

Cache-Control Setting Examples

  • Caching a static resource

Cache-Control: public, max-age=86400

  • Making sure a sensitive resource is never saved

Cache-Control: no-store

  • Saving a resource in browser cache and not in CDN edge servers

Cache-Control: private, max-age=3600

  • Saving a resource in browser cache and CDN edge servers, but on the condition of validating it for each use

Cache-Control: public, no-cache

  • Saving a resource in CDN edge servers and validating it for every use

Cache-Control: public, s-maxage=0

  • Saving a resource by any user (browser or CDN edge server) and requiring validation for each use

Cache-Control: public, no-cache, must-revalidate

  • Caching a resource with different expiration times for the browser and CDN edge servers

Cache-Control: public, max-age=7200, s-maxage=3600
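When headers like these are generated in application code, a small helper can keep the values consistent. The sketch below is one possible way to do it; the helper name `build_cache_control`, its parameters, and the duration constants are assumptions made for this example.

```python
from typing import Iterable, Optional

# Common durations in seconds, matching the max-age values listed earlier.
MINUTE = 60
HOUR = 60 * MINUTE
DAY = 24 * HOUR
WEEK = 7 * DAY


def build_cache_control(flags: Iterable[str] = (),
                        max_age: Optional[int] = None,
                        s_maxage: Optional[int] = None) -> str:
    """Join flag commands and optional lifetimes into a Cache-Control value."""
    parts = list(flags)
    if max_age is not None:
        parts.append(f"max-age={max_age}")
    if s_maxage is not None:
        parts.append(f"s-maxage={s_maxage}")
    return ", ".join(parts)


static_asset = build_cache_control(["public"], max_age=DAY)
sensitive = build_cache_control(["no-store"])
split_lifetimes = build_cache_control(["public"], max_age=2 * HOUR, s_maxage=HOUR)
```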

Cache-Control Configuration

The HTTP Cache-Control header can be set in the web server configuration as well as in application code. Examples of how to set Cache-Control in Apache, Nginx, and PHP code are provided next.

  • Apache

Add the following commands to the .htaccess file so that the server sets a Cache-Control header with the parameters public and max-age=86400 (one day) for the files matched by the command. This requires the mod_headers module to be enabled:

<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
    Header set Cache-Control "max-age=86400, public"
</FilesMatch>

  • Nginx

Adding the following commands to the Nginx configuration file sets a Cache-Control header with the public and no-transform parameters for the files matched by the command.

location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
    add_header Cache-Control "public, no-transform";
}

  • PHP

Cache-Control headers can also be set directly in the website’s code. For example, the following PHP code sets a Cache-Control header with a max-age of one day; it must run before any output is sent:

header('Cache-Control: max-age=86400');
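The same idea carries over to other languages. As a rough sketch, the following Python code uses the standard library’s `http.server` to add a one-day Cache-Control header to static files, mirroring the file extensions used in the Nginx example. The names `cache_control_for` and `CachingHandler` are hypothetical, and because `end_headers` also runs for error responses, a production setup would need to be more selective.

```python
import re
from http.server import SimpleHTTPRequestHandler
from typing import Optional

# Same extensions as the Nginx example; matching is case-insensitive.
STATIC_FILE = re.compile(r"\.(js|css|png|jpg|jpeg|gif|ico)$", re.IGNORECASE)


def cache_control_for(path: str) -> Optional[str]:
    """Return the Cache-Control value for a request path, or None for no header."""
    path = path.split("?", 1)[0]  # ignore any query string
    if STATIC_FILE.search(path):
        return "public, max-age=86400"
    return None


class CachingHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory and adds Cache-Control to static files."""

    def end_headers(self) -> None:
        value = cache_control_for(self.path)
        if value is not None:
            self.send_header("Cache-Control", value)
        super().end_headers()
```

To try it locally, the handler can be passed to `http.server.HTTPServer(("127.0.0.1", 8000), CachingHandler)` and started with `serve_forever()`.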

ArvanCloud Caching Policies

  • The Request Phase

In this phase, the request received from the user (browser) is first compared with the list of files cached in ArvanCloud edge servers. If the request matches one of these cached resources and the resource has not expired, the edge servers send it in response to the user. If the requested resource has expired, however, ArvanCloud edge servers first validate it with the website’s main host server before responding to the user.

Your “Caching Settings” in the ArvanCloud panel determine which files can be cached in ArvanCloud edge servers. For example, when cache is configured for all files, all of a website’s resources are automatically saved in ArvanCloud’s edge servers, and the format review stage is no longer performed when a user request is received.

  • The Response Phase

In this phase, after receiving a request regarding a cacheable resource, ArvanCloud servers first search the cache to find the resource. If they don’t find the resource in the cache, they send a request to the website’s main server to receive it, and then pass the response received from the main server on to the user.

Based on request headers, the response sent by the main server to the edge server may or may not be cacheable. If it is cacheable, edge servers use the cached resource to respond to subsequent requests for it. If it is not, these steps are repeated each time a user requests that resource.
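The general flow of the two phases can be illustrated with a toy edge cache. This is only a sketch of the idea and makes no claim about how ArvanCloud edge servers are actually implemented: the `EdgeCache` class, the `origin` callable and its `(status, etag, body)` return shape, and the fixed 60-second lifetime are all assumptions, and every response is treated as cacheable.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical origin interface: (url, if_none_match) -> (status, etag, body)
Origin = Callable[[str, Optional[str]], Tuple[int, str, bytes]]


@dataclass
class Entry:
    body: bytes
    etag: str
    expires_at: float


class EdgeCache:
    def __init__(self, origin: Origin, ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.origin = origin
        self.ttl = ttl
        self.clock = clock
        self.entries: Dict[str, Entry] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return (body, outcome), where outcome is HIT, REVALIDATED or MISS."""
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now < entry.expires_at:
            return entry.body, "HIT"  # cached and not expired
        # Missing or expired: ask the main server, sending the token if we have one.
        status, etag, body = self.origin(url, entry.etag if entry else None)
        if status == 304 and entry is not None:
            entry.expires_at = now + self.ttl  # unchanged, extend the lifetime
            return entry.body, "REVALIDATED"
        self.entries[url] = Entry(body, etag, now + self.ttl)
        return body, "MISS"


def demo_origin(url: str, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Stand-in main server whose single resource never changes."""
    if if_none_match == '"v1"':
        return 304, '"v1"', b""
    return 200, '"v1"', b"hello"


fake_time = [0.0]
cache = EdgeCache(demo_origin, ttl=60.0, clock=lambda: fake_time[0])
outcomes = [cache.get("/a")[1], cache.get("/a")[1]]
fake_time[0] = 61.0  # let the cached entry expire
outcomes.append(cache.get("/a")[1])
```

In the usage lines, the first request goes to the main server, the second is answered from the edge cache, and the third triggers a validation because the entry has expired.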

Browser Cache Settings in the ArvanCloud Panel

You can specify the time permitted for saving data in browser cache by going to ArvanCloud panel, “Caching settings”, “Advanced Settings,” and activating the “Cache Information in Browser” option.

Note that Browser Caching will save your website’s resources on the user’s browser, and you will not have access to the user’s browser to wipe the cache.
