The User-Agent is a central piece of web architecture and plays an important role in content negotiation. It was created with the express intention of making it possible to address users differently, depending on capabilities or context, when they make requests with different clients or ‘agents’. This article will answer many of the questions you may have about User-Agents, how they work and how they can be used.
What is a User-Agent?
A User-Agent (UA) is an alphanumeric string that identifies the ‘agent’ or program making a request to a web server for an asset such as a document, image or web page. It is a standard part of web architecture and is passed by all web requests in the HTTP headers. The User-Agent string is very useful because it tells you quite specific information about the software and hardware running on the device that is making the request. You can make important decisions on how to handle web traffic based on the User-Agent string, ranging from simple segmentation and redirection, to more complex content adaptation and device targeting decisions.
Although the User-Agent doesn’t identify specific individuals, it does provide developers with an extremely powerful method of analysing and segmenting traffic. This information, gleaned directly from the User-Agent string itself (a process known as User-Agent parsing) typically includes browser, web rendering engine, operating system and device. Deeper information can be returned when the User-Agent string is mapped to an additional set of data about the underlying device. This is the approach taken by device detection solutions.
Anatomy of a User-Agent
Use of the User-Agent string is specified in the HTTP standards, first in RFC 1945 and again in RFC 7231. In fact, the UA string has been part of the HTTP standard since HTTP/1.0, the first version to define request headers, and has been retained in every update since, right up to HTTP/2. These standards make recommendations on what should be in the User-Agent string and describe its purpose:
The “User-Agent” header field contains information about the User-Agent originating the request, which is often used by servers to help identify the scope of reported interoperability problems, to work around or tailor responses to avoid particular User-Agent limitations, and for analytics regarding browser or operating system use. A User-Agent SHOULD send a User-Agent field in each request unless specifically configured not to do so.
How it is constructed is defined to a degree:
User-Agent = product *( RWS ( product / comment ) )
Product tokens are explained in more detail as:
The User-Agent field-value consists of one or more product identifiers, each followed by zero or more comments (Section 3.2 of [RFC7230]), which together identify the User-Agent software and its significant subproducts. By convention, the product identifiers are listed in decreasing order of their significance for identifying the User-Agent software. Each product identifier consists of a name and optional version.
Product tokens are used to allow communicating applications to identify themselves by software name and version. Most fields using product tokens also allow sub-products which form a significant part of the application to be listed, separated by white space.
Each product token includes a product name and its version separated by a “/” sign, with some optional information in parentheses. The tokens are typically listed in order of significance, although this is left entirely up to the software publisher. Tokens can be used to send browser-specific information and to acquire device-specific information from the device’s ROM, such as the model ID, operating system and its version.
Here are two examples of UAs: one sent by a Samsung Galaxy S6, and one sent by a Mac OS X-based computer using the Safari browser.
Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9
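To make the product/comment structure concrete, here is a minimal Python sketch that splits a User-Agent string into its product tokens and attaches each parenthesised comment to the product before it. The function name split_user_agent is our own invention for illustration, and the regular expression is a simplification: it does not handle nested comments, which the grammar permits.

```python
import re

# Matches either a parenthesised comment or a product[/version] token.
# Simplification (assumption): nested comments are not supported.
_PART = re.compile(
    r"\((?P<comment>[^()]*)\)"
    r"|(?P<name>[^\s()/]+)(?:/(?P<version>[^\s()]+))?"
)

def split_user_agent(ua):
    """Return a list of (name, version, comments) tuples, one per product."""
    products = []
    for match in _PART.finditer(ua):
        comment = match.group("comment")
        if comment is not None:
            if products:
                # A comment belongs to the product that precedes it.
                products[-1][2].append(comment)
        else:
            products.append((match.group("name"), match.group("version"), []))
    return products

galaxy_s6 = (
    "Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/52.0.2743.98 Mobile Safari/537.36"
)
for name, version, comments in split_user_agent(galaxy_s6):
    print(name, version, comments)
```

Splitting the string this way only recovers its structure. It says nothing yet about which of the five products is the real browser, which is where the difficulty begins.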
A point to note is that RFC 7231 warns against overly long and detailed User-Agent strings that might increase request latency or make “fingerprinting” possible. Any data unique to an individual device or user, such as device IDs, IMEIs, usernames, phone numbers or even preferred language, should not be included.
In summary, it is not a very standardized format and, as we will see, it has evolved into a fairly chaotic environment that can only be unraveled by sustained and dedicated attention to mapping and interpreting this entropy.
So what can you do with User-Agent parsing?
One of the main use cases is to identify and handle requests from certain types of traffic. By checking device capabilities or browser capabilities, you can decide which content to send down to the requesting device, or even adapt the content on the fly. This is particularly useful when dealing with the wide spectrum of devices in use today, and allows you to get as fine-grained as you like with your content targeting strategy. Outside of web optimization, this has obvious applications in the advertising sector, where the device can be a useful criterion for targeting. Device information is part of the spec for real-time bidding (RTB). Another possible use case is serving language-specific content using the language and locale headers.
The other main use case is around analytics, which can be very deep with a good device description repository. All this insight can be used to improve on content publishing decisions, targeting strategies or conversion optimization. Having a constantly updated method of parsing User-Agent strings also means that you are aware when new devices hit your services and can identify any issues at an early stage.
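As a rough illustration of the "new devices hitting your services" point, the following Python sketch counts requests per User-Agent string and singles out strings that are not in a set of already-known ones. The function name, the sample log and the known set are all hypothetical, and a real analytics pipeline would map each string to device properties instead of comparing raw strings.

```python
from collections import Counter

def summarize_user_agents(user_agents, known):
    """Count requests per User-Agent and single out strings not seen before."""
    counts = Counter(user_agents)
    unseen = {ua: n for ua, n in counts.items() if ua not in known}
    return counts, unseen

# Hypothetical request log: one User-Agent string per request.
log = ["curl/7.64.1", "ExampleBrowser/1.0", "curl/7.64.1"]
counts, unseen = summarize_user_agents(log, known={"curl/7.64.1"})
print(counts.most_common())
print(unseen)
```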
Beyond the browser
Though User-Agents are most associated with the browser, the browser is not the only client that has a User-Agent. Bots and crawlers have User-Agents too, and can be identified accurately by a good device detection solution. Again, this is particularly useful for some business verticals such as the advertising industry, where bots often masquerade as real devices and click fraud is a real issue. Not all device detection solutions have the ability to accurately detect masquerading User-Agents.
Security is the other big area where being aware of the nature of traffic hitting your services is extremely important. There are all sorts of other User-Agents that can and do crawl your site. These range from search engines to link checkers, SEO tools, feed readers and scripts, as well as the nefarious actors at large in the web landscape. Being able to distinguish between these different sources of bot traffic can provide significant savings in IT costs. This goes beyond what you can do with the robots.txt file, which is static. Search engine bots can be handled differently from other bots, human traffic can be prioritized over other traffic, and bad actors can be blocked entirely.
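A minimal Python sketch of this kind of traffic segmentation is shown below. It only recognises clients that declare themselves honestly through a well-known token, so it cannot catch a bot masquerading as a real device. The token table, the category names and the classify_traffic function are illustrative assumptions, not a complete or authoritative list.

```python
# Illustrative, deliberately tiny table of self-declared client tokens.
KNOWN_CLIENT_TOKENS = {
    "googlebot": "search_engine",
    "bingbot": "search_engine",
    "curl/": "script",
    "python-requests/": "script",
}

def classify_traffic(user_agent):
    """Return a coarse traffic category for a self-declared User-Agent."""
    if not user_agent:
        return "unknown"  # the header is optional, so it may be missing
    lowered = user_agent.lower()
    for token, category in KNOWN_CLIENT_TOKENS.items():
        if token in lowered:
            return category
    # No known token found: this is a presumption, not an identification.
    return "presumed_human"

print(classify_traffic(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))
print(classify_traffic("curl/7.64.1"))
```

Once traffic is categorised, each category can be given its own policy, something a static robots.txt file cannot express.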
Is User-Agent parsing a bad approach?
Some people have a bad impression of User-Agent parsing due to its role in what is known as User-Agent sniffing. To understand why using the User-Agent sometimes gets a bad rap, we need to go back to the 1990s and a period referred to as the browser wars. Before we get into the history, it is worth stating upfront that User-Agent parsing is used by many of the top web companies today to cater to different device classes. Some 82% of the Alexa 100 used Adaptive Web Design (AWD, or server-side adaptation) in their websites, so it’s clear that major companies do not share this view.
Back in the 1990s, one of the first capable browsers, NCSA Mosaic, did not support frames. When Netscape (originally known as Mozilla) emerged with support for frames, webmasters began to serve frame-enabled content based on the presence of the token ‘Mozilla’ in the User-Agent. This process of checking for certain tokens in the User-Agent has become known as “UA sniffing” or “browser sniffing”.
As browsers became successively more capable and new ones were released, they began to include the Mozilla token in their User-Agents in the hope that their browsers would proliferate faster, rather than wait for their own unique User-Agents to become known to webmasters. This is exactly what Microsoft did when it entered the browser market with Internet Explorer. This trend continued with other browsers, making the User-Agent a messy and non-standard string. Some browsers such as Opera even allow the user to set the User-Agent string, further obfuscating the meaning of the UA.
The following User-Agent strings from the mid-1990s illustrate the problem:
Netscape Navigator 2: Mozilla/Version [Language] (Platform; Encryption)
Internet Explorer 3: Mozilla/2.0 (compatible; MSIE Version; Operating System)
Although the web has become a much more standardized place since then, many developers still see ‘browser sniffing’ as bad practice due to the unreliability of looking for the presence of one keyword such as iPhone or Mozilla in the User-Agent.
It is important to note that browser sniffing is not the same as device detection. It is primarily based on looking for one particular string in the User-Agent, rather than making a definite identification based on prior knowledge of the User-Agent. And as we’ve seen, you just can’t rely on the presence of one token in the UA to accurately determine what kind of device you are addressing.
Current best practice is about serving the best possible experience to all devices. It is a reaction to a massive increase in the diversity of devices that we see online. It’s unlikely that a one-size-fits-all approach would work well in the days of watches/tablets/smartphones, etc. It would be like trying to use the exact same video stream for a phone, TV and cinema at the same time. Device detection is a pragmatic solution to this problem.
How does User-Agent parsing work in device detection?
From a technical point of view, examining the User-Agent is not difficult. You can get the User-Agent string using navigator.userAgent in JavaScript on the client, or from the HTTP_USER_AGENT variable on the server.
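On the server side, Python's WSGI interface is one place where the header surfaces under the CGI-style name HTTP_USER_AGENT. The sketch below is a minimal WSGI application that echoes the header back. The application name echo_user_agent is our own, and the header is treated as optional since clients are not required to send it.

```python
def echo_user_agent(environ, start_response):
    """Minimal WSGI app that reports the User-Agent it received."""
    # WSGI exposes request headers as CGI-style HTTP_* keys in environ.
    user_agent = environ.get("HTTP_USER_AGENT", "")
    body = ("Your User-Agent: " + (user_agent or "(not sent)")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, echo_user_agent).serve_forever() will serve it on port 8000.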
As we’ve seen, sniffing is easy but unreliable. Many companies use a regex approach to analyze the User-Agent. Again, this relies on pattern or string matching to identify keywords which might identify the underlying device. A typical regex approach would look for the presence of iPhone or Android in the User-Agent, but the accuracy concerns are many. Telling Android tablets and phones apart is an obvious weakness, and the presence of the iPhone token may be just about as useful as the Mozilla token.
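The weakness is easy to reproduce with the two example User-Agents quoted earlier in this article. The Python sketch below applies the kind of single-keyword checks described above. The function names are hypothetical, and the checks are intentionally naive to show how they go wrong.

```python
import re

GALAXY_S6_CHROME = (
    "Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/52.0.2743.98 Mobile Safari/537.36"
)
MAC_SAFARI = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) "
    "AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9"
)

MOBILE_PATTERN = re.compile(r"iPhone|Android")

def naive_is_mobile(ua):
    """Keyword check: says nothing about phone versus tablet."""
    return MOBILE_PATTERN.search(ua) is not None

def naive_is_safari(ua):
    """Keyword check: also matches Chrome, which carries a Safari token."""
    return "Safari" in ua

for label, ua in (("Galaxy S6", GALAXY_S6_CHROME), ("Mac", MAC_SAFARI)):
    print(label, naive_is_mobile(ua), naive_is_safari(ua))
```

Both strings pass the Safari check even though the first one comes from Chrome, and both contain the Mozilla token. The mobile check would return the same answer for any User-Agent containing the Android keyword, whether it came from a phone or a tablet.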
As User-Agent strings do not conform to any standard pattern, this technique is prone to failure and is not future-proof. You would need to constantly update your regex rules as new devices, browsers and OSes are released, and then run tests to see if the solution still works well. At some point, this becomes a costly maintenance job and, over time, there is a real risk that you are mis-detecting or failing to detect much of your traffic.
Accurately parsing User-Agents is one problem. The real difficulty is in staying on top of the constantly shifting sands of the device, browser and OS market, with potentially millions of permutations when things like language and locale or side-loaded browsers are layered on. This is where a good device detection solution really pays off.
There are two prerequisites for device detection.
- That the User-Agent lookup happens extremely quickly and
- That the device identification is highly accurate.
This involves accurately mapping all possible User-Agent strings for a particular device and having an API that can accurately and quickly return the information while being flexible enough to accommodate new variants as they arise. The reason that this is difficult is that there are millions of variants and new User-Agents are being created all the time. Every new device, browser, browser version, OS or app can create a new and previously unseen User-Agent.
In this regard, not all approaches to device detection are created equal. The bad ones will have inaccurate data or return false positives: you may think you are getting a correct result, but an inferior solution may return default values for unknown UAs. Some approaches hog server resources because of their unsophisticated and messy APIs and codebases. This is the reason why major companies rely on established solutions built on proven and patented technology, such as DeviceAtlas.
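The two prerequisites can be sketched in Python as a toy lookup table keyed by the full User-Agent string. This is purely illustrative and is not how any particular device detection product works: the Device type, the KNOWN_DEVICES table and the identify function are hypothetical, and the single entry reuses the Galaxy S6 example from earlier in the article. A dictionary gives constant-time exact-match lookups, and returning None for an unknown string keeps a miss from being mistaken for a real identification.

```python
from typing import NamedTuple, Optional

class Device(NamedTuple):
    model: str
    os_name: str
    browser: str

# Hypothetical table; a real repository would hold millions of entries
# and would also need to cope with variants that do not match exactly.
KNOWN_DEVICES = {
    "Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/52.0.2743.98 Mobile Safari/537.36":
        Device(model="SM-G920V", os_name="Android 6.0.1",
               browser="Chrome 52.0.2743.98"),
}

def identify(user_agent: str) -> Optional[Device]:
    """Exact-match lookup; None signals an unknown User-Agent explicitly."""
    return KNOWN_DEVICES.get(user_agent)

print(identify("ExampleBrowser/1.0"))  # an unknown string yields None
```

Exact matching alone would miss the constant stream of new variants, which is why keeping such a table current is the hard part.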
Article image by Minesweeper, Wikipedia.