With more players supporting the Eddystone Bluetooth beacon technology, it’s important to look at it from the web developer’s perspective. This article reveals how it all works behind the scenes, shows how to optimize a website for Physical Web beacons across different implementations, and discusses the caveats.
URL settings
First let’s dive into what actually happens when you try to configure a Physical Web beacon with your website URL. Before the URL address is saved into the beacon it is shortened by the management app to fit within the limited size of the low energy data packet.
The advertisement part of the packet (31 bytes) carries the data. The Eddystone-URL frame inside it takes up to 20 bytes: a frame type, the TX power, a URL scheme prefix and the encoded URL itself (up to 17 bytes) – see the Eddystone-URL specification for more details.
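To illustrate why shortening is needed, here is a minimal sketch of the URL encoding, assuming the layout described in the Eddystone-URL specification: one byte for the scheme prefix, single-byte codes for common endings such as `.com/`, and at most 17 bytes for the rest. The tables and the limit should be checked against the specification, and the function names are hypothetical.

```python
# Sketch of Eddystone-URL encoding. The prefix/expansion tables and the
# 17-byte limit are assumptions based on the public Eddystone-URL spec.
SCHEMES = ["http://www.", "https://www.", "http://", "https://"]
EXPANSIONS = [".com/", ".org/", ".edu/", ".net/", ".info/", ".biz/", ".gov/",
              ".com", ".org", ".edu", ".net", ".info", ".biz", ".gov"]
MAX_ENCODED_URL_BYTES = 17


def encode_url(url: str) -> bytes:
    """Return the scheme byte followed by the encoded URL (ASCII only)."""
    for code, scheme in enumerate(SCHEMES):
        if url.startswith(scheme):
            rest = url[len(scheme):]
            break
    else:
        raise ValueError("unsupported URL scheme")

    out = bytearray([code])
    i = 0
    while i < len(rest):
        # Try the expansion codes first; longer endings are listed first.
        for exp_code, exp in enumerate(EXPANSIONS):
            if rest.startswith(exp, i):
                out.append(exp_code)
                i += len(exp)
                break
        else:
            out.append(ord(rest[i]))
            i += 1
    return bytes(out)


def fits_in_beacon(url: str) -> bool:
    """True if the URL fits without shortening (hypothetical helper)."""
    # The first byte is the scheme prefix and does not count
    # against the encoded-URL limit.
    return len(encode_url(url)) - 1 <= MAX_ENCODED_URL_BYTES
```

With these assumptions `http://goo.gl/XXX` fits easily, while `http://dev.gorth.cz/pwb/article` needs 24 bytes after the scheme prefix and has to be shortened.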
The beacon broadcasts the shortened URL, not the original one. When a client (e.g. Google Chrome, Opera, the Physical Web app) receives a shortened URL, it uses a resolver service to look up the meta information: the real URL, website name, description and an icon. The browser or application does not connect to your website directly – unless the user decides to follow the link, you will not see any activity from that user.
The Physical Web app uses its own implementation of a URL shortener (GAE: s~url-caster), which internally uses the Google URL shortener (goo.gl), and it is not possible to change it. Other management apps can use their own shorteners.
Opera offers a Python script for Linux that sets the URL directly without using the app and thus gives better control over the broadcast URL. There are many more alternatives if you search for the phrase “physical web” in any app store (e.g. Google Play Store).
Behind the scenes
We know roughly what happens immediately after you set your URL. Let’s look at what the actual communication looks like. All traffic between a device and a shortener/resolver uses HTTPS. At the moment the following steps take place:
1. The beacon management app shortens the URL.
2. A device connects to the resolver to find the real URL, website name, description and icon.
3. The resolver crawls your website and caches the details (including the icon image).
It’s worth noting that these steps are not mandated. A different implementation of the client portion could choose to resolve the URL itself. A different implementation could also opt to get the website name and description directly by fetching the page or a manifest file. Privacy concerns might drive implementations to do things differently too.
1. URL Shortening
After you submit a URL the management app automatically shortens it to ensure the length is within the limits for broadcasting.
The Physical Web app <> Shortener
The Physical Web app makes a POST request to the URL shortener, passing the original URL in the request body. A shortened URL is returned as part of the JSON response. The HTTP request and response look like this:
Request
    POST https://url-caster.appspot.com/shorten-url HTTP/1.1
    Content-Type: application/json; charset=utf-8
    User-Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; Nexus 4 Build/LMY48T)
    Host: url-caster.appspot.com
    Connection: Keep-Alive
    Accept-Encoding: gzip
    Content-Length: %d

    {"longUrl":"http://dev.gorth.cz/pwb/article"}
Response
    Cache-Control: no-cache
    Content-Type: application/json
    Date: Mon, 30 Nov 2015 22:05:29 GMT
    Server: Google Frontend
    Content-Length: %d
    Alternate-Protocol: 443:quic,p=1
    Alt-Svc: quic=":443"; ma=604800; v="30,29,28,27,26,25"

    {
      "kind":"urlshortener#url",
      "id":"http://goo.gl/XXX",
      "longUrl":"http://dev.gorth.cz/pwb/article"
    }
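The exchange above can be reproduced outside the app. The following is a minimal sketch that only builds the request and parses a response body; the endpoint is the one seen in the captured traffic and may no longer be available, and the helper names are hypothetical.

```python
import json
import urllib.request

# Endpoint as seen in the captured traffic above; it may no longer exist.
SHORTEN_ENDPOINT = "https://url-caster.appspot.com/shorten-url"


def build_shorten_request(long_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the shortener."""
    body = json.dumps({"longUrl": long_url}).encode("utf-8")
    return urllib.request.Request(
        SHORTEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def parse_shorten_response(raw: bytes) -> str:
    """Extract the shortened URL ("id") from the JSON response body."""
    return json.loads(raw.decode("utf-8"))["id"]
```

Sending the request would be a matter of passing it to `urllib.request.urlopen()` and feeding the returned bytes to `parse_shorten_response()`.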
2. Resolving website details
After the beacon “detection” app receives a URL from a beacon it tries to retrieve meta information via a resolver. The resolvers might vary (i.e. each app is using its own resolver: Google Chrome, Opera and The Physical Web app).
The Physical Web app <> The Physical Web resolver
The Physical Web app makes a POST request to the URL resolver, passing the shortened URL in the request body along with TxPower (Transmission Power) & RSSI (Received Signal Strength Indicator) – TxPower and RSSI can be used to estimate the approximate distance to the beacon. The full URL and the details are returned as part of the JSON response. The HTTP request and response look like the following:
Request
    POST https://url-caster.appspot.com/resolve-scan HTTP/1.1
    Content-Type: application/json; charset=utf-8
    User-Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; Nexus 4 Build/LMY48T)
    Host: url-caster.appspot.com
    Connection: Keep-Alive
    Accept-Encoding: gzip
    Content-Length: %d

    {"objects":[{"url":"http://goo.gl/XXX","txpower":-22,"rssi":-92}]}
Response
    Cache-Control: no-cache
    Content-Type: application/json
    Date: Mon, 30 Nov 2015 21:55:12 GMT
    Server: Google Frontend
    Content-Length: %d
    Alternate-Protocol: 443:quic,p=1
    Alt-Svc: quic=":443"; ma=604800; v="30,29,28,27,26,25"

    {"metadata": [{"description": "Learn more about The Physical Web beacons and how to use them.", "title": "The Physical Web for web developers", "url": "http://dev.gorth.cz/pwb/article", "rank": 28.183829312644534, "displayUrl": "http://dev.gorth.cz/pwb/article", "id": "http://goo.gl/XXX", "icon": "https://url-caster.appspot.com/favicon?url=http%3A%2F%2Fdev.gorth.cz%2Fpwb%2Ficons%2Ffavicon-96x96.png"}]}
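The "rank" value in the response above looks like a distance estimate derived from the submitted TxPower and RSSI. The sketch below is an assumption, not the resolver's confirmed code: it uses a simple log-distance path-loss model, takes 41 dB as the approximate loss over the first metre at 2.4 GHz, and treats TxPower as the power at 0 m, which is how Eddystone defines it. The function name is hypothetical.

```python
def estimate_distance_m(tx_power: int, rssi: int) -> float:
    """Rough distance in metres from a log-distance path-loss model.

    Assumption: 41 dB approximates the loss over the first metre at
    2.4 GHz, and tx_power is the calibrated power at 0 m.
    """
    path_loss = tx_power - rssi
    return 10 ** ((path_loss - 41) / 20)


# Values from the request above: txpower -22, rssi -92
print(estimate_distance_m(-22, -92))
```

For the values in the request (txpower -22, rssi -92) this gives roughly 28.18, which matches the "rank" returned by the resolver. Treat the result as a coarse ranking signal rather than a measurement, since walls and bodies change the RSSI considerably.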
Opera Labs (32.0.1953.96910) <> Opera Labs resolver
The Opera browser makes a GET request to the URL resolver, passing only the shortened URL as a query string. A full URL and the details are returned as part of the JSON response. The HTTP request and response look like the following:
Request
    GET https://varde.op-test.net/varde?url=http%3A%2F%2Fgoo.gl%2FXXX HTTP/1.1
    User-Agent: Dalvik/2.1.0 (Linux; U; Android 5.1.1; Nexus 4 Build/LMY48T)
    Host: varde.op-test.net
    Connection: Keep-Alive
    Accept-Encoding: gzip
Response
    Cache-Control: public, max-age=30, s-maxage=10
    Content-Type: application/json; charset=utf-8
    Content-Length: %d
    Accept-Ranges: bytes
    Date: Sat, 05 Dec 2015 18:42:18 GMT
    X-Varnish: 1020838031
    Age: 0
    Via: 1.1 varnish
    Connection: keep-alive

    {"cache": null, "desc": "Learn more about The Physical Web beacons and how to use them.", "icon": "/varde?img=http%3A%2F%2Fdev.gorth.cz%2Fpwb%2Flogo.png", "lang": null, "nav": [], "parse_time": 0.023353099822998, "retrieve_time": 0.258214950561523, "title": "The Physical Web for web developers", "url": "http://dev.gorth.cz/pwb/article", "url_label": "dev.gorth.cz", "warn": "snow"}
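The two resolvers return the same kind of information under different field names. As a rough sketch, assuming only the fields visible in the responses above, both can be mapped onto one common shape; the function names and the output keys are made up for illustration.

```python
import json
from urllib.parse import urljoin


def from_physical_web(raw: str) -> list:
    """Map the Physical Web resolver response to a common shape."""
    return [
        {"title": m.get("title"), "description": m.get("description"),
         "url": m.get("url"), "icon": m.get("icon")}
        for m in json.loads(raw).get("metadata", [])
    ]


def from_opera(raw: str, resolver_base: str) -> dict:
    """Map the Opera Labs resolver response to the same shape."""
    m = json.loads(raw)
    icon = m.get("icon")
    return {
        "title": m.get("title"),
        "description": m.get("desc"),
        "url": m.get("url"),
        # Opera returns a resolver-relative icon path, so make it absolute.
        "icon": urljoin(resolver_base, icon) if icon else None,
    }
```

Note that the Physical Web resolver answers with a list (one entry per scanned beacon), while the Opera resolver is queried for one URL at a time.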
3. Crawling the website
A resolver connects to your website to gather meta information for a device. All details are cached by the resolver. In this case a resolver and a shortener are the same.
All implementations use simple GET requests with customized User-Agent strings and no other details. The only exception is the “develop” version of the Physical Web resolver, which seems to forward the approximate distance to the beacon as an additional HTTP header.
The Physical Web resolver <> Website
Request
    GET /pwb/article HTTP/1.1
    Host: dev.gorth.cz
    X-Cloud-Trace-Context: 700f0b407ac625ee6e3dd166740692cb/14200255404988386925
    Connection: close
    Accept-Encoding: gzip,deflate
    User-Agent: Mozilla/5.0 AppEngine-Google; (+http://code.google.com/appengine; appid: s~url-caster)
Request (dev. version)
    GET /pwb/article HTTP/1.1
    Host: dev.gorth.cz
    X-Physicalweb-Distance: 0.398107170553
    X-Cloud-Trace-Context: 407d84cf37fd2c269ec9ee58cf26ffac/12874555662050094581
    Connection: close
    Accept-Encoding: gzip,deflate
    User-Agent: Mozilla/5.0 AppEngine-Google; (+http://code.google.com/appengine; appid: s~url-caster-dev)
Opera Labs resolver <> Website
Request
    GET /pwb/article HTTP/1.1
    Connection: close
    Host: dev.gorth.cz
    User-Agent: Varde/0.1 (oha [at] opera.com)
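Since these crawler requests are the only beacon-related traffic a site sees before a user follows the link, it can be useful to recognise them on the server. The following sketch matches only the User-Agent strings and the distance header captured above; the strings may change at any time, and the function name and labels are hypothetical.

```python
def classify_resolver(headers: dict) -> dict:
    """Guess which resolver sent a request, based on the traces above."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    ua = lowered.get("user-agent", "")

    if "appid: s~url-caster-dev" in ua:
        resolver = "physical-web-dev"
    elif "appid: s~url-caster" in ua:
        resolver = "physical-web"
    elif ua.startswith("Varde/"):
        resolver = "opera-labs"
    else:
        resolver = None

    # Only the "develop" resolver was seen sending this header.
    distance = lowered.get("x-physicalweb-distance")
    return {
        "resolver": resolver,
        "distance": float(distance) if distance is not None else None,
    }
```

The same check could live in any web framework's request hook or in a log-processing script.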
Website optimization
Now that we know what is going on behind the scenes, we can try to optimize a website so that the notification, which is made up of a title, description and icon, looks good once the beacon is detected. We will focus on the resolvers, and how they crawl your website and cache its details.
To test the implementations I prepared four different page templates and some alterations of two of them.
1. HTML with META description (favicon: shortcut icon, .PNG – 96px)
2. Pure HTML only (favicon: icon, .ICO – 48px)
3. Open Graph with META description
4. Open Graph with no META & TITLE tags
5. Schema.org – WebSite
6. 1-5 mixed together (HTML, Schema.org & Open Graph)
The first two links use a .ico favicon (48px), 3-5 use different formats to declare the icon/logo (with a PNG file), and the last one uses a PNG image defined as the favicon (96px). There is an extra favicon in the document root to show when a resolver falls back to the default icon.
All templates, and some others, are available at: dev.gorth.cz/pwb/.
The testing device was a Google Nexus 4.
The Physical Web app
A beacon is found notification
I was quite surprised at how big a difference the favicon format makes. Clearly the PNG image is the winner here. I didn’t test what the minimal meaningful resolution is, but 96x96px seems to work quite nicely.
Physical Web app: nearby beacons view
In all cases the primary source for meta information is standard HTML tags. The only exception is Open Graph tags, but those are used only in the absence of both the META description and TITLE tags, which is an unlikely scenario.
1. No surprise here.
2. In the absence of a META description the app uses the first paragraph as the description.
3. Open Graph tags are ignored, apart from the image.
4. In the total absence of TITLE and META description we can finally see Open Graph.
5. No support for Schema.org; falls back to the document root /favicon.ico.
6. The same as #1.
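The observations listed above can be summarised as a fallback chain. The sketch below is one plausible reading of that behaviour, not the resolver's actual code: standard HTML tags win, Open Graph is used only when they are missing, and the first paragraph is the last resort for the description. The relative order of Open Graph and the first paragraph is a guess, because none of the test pages exercised it, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <title>, META description, Open Graph tags and first <p>."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._capture = None  # tag whose text is currently being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("content"):
            key = (a.get("name") or a.get("property") or "").lower()
            if key in ("description", "og:title", "og:description"):
                self.found.setdefault(key, a["content"])
        elif tag in ("title", "p") and tag not in self.found:
            # Only the first <title> and the first <p> are recorded.
            self._capture = tag
            self.found[tag] = ""

    def handle_data(self, data):
        if self._capture:
            self.found[self._capture] += data

    def handle_endtag(self, tag):
        if tag == self._capture:
            self._capture = None


def pick_metadata(html: str) -> dict:
    """Apply the assumed priority: HTML tags, Open Graph, first paragraph."""
    parser = MetaCollector()
    parser.feed(html)
    f = parser.found
    title = f.get("title", "").strip() or f.get("og:title")
    description = (f.get("description") or f.get("og:description")
                   or f.get("p", "").strip() or None)
    return {"title": title, "description": description}
```

Running `pick_metadata()` against your own page is a quick way to preview which title and description a resolver of this kind would most likely pick up.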
Opera Labs
Opera published a very nice article about URL beacon detection where you can find many details, including the formats their crawler understands, Schema.org, Open Graph or pure HTML.
The official recommendation is to use pure HTML. This sounds very strange considering the common practices. I quickly checked www.opera.com and even they are using Open Graph. It is very unlikely people would remove microdata from their websites only because of the beacons.
A beacon is found notification
Opera nearby view
Opera is trying to support multiple formats but it’s not perfect either. Schema.org tags take precedence over pure HTML, which in turn takes precedence over Open Graph. Again, it’s unlikely webmasters would remove the META and TITLE tags to use the Open Graph option, or remove Schema.org microdata just to follow the official recommendation.
Schema.org in the JSON-LD format is not supported at the moment.
1. Does not support LINK with “shortcut icon”; falls back to the document root /favicon.ico.
2. In the absence of a META description, no other content is used.
3. Open Graph tags are ignored, apart from the image.
4. In the total absence of TITLE and META description we can finally see Open Graph.
5. Schema.org takes priority, but the crawler does not follow the type definition; it simply checks for “title” & “description” regardless of the type. In this case Opera does not display the Schema.org “name” (and falls back to the HTML TITLE) even though it is valid for the WebSite type. If I used “title” instead, the code would not pass the Schema.org validator.
6. Schema.org takes priority over HTML & OG.
Opera beacon error
You might get this error if you misconfigure the beacon, use a wrong URL, or when a resolver is having issues.
Resolution
All the implementations either use the exact opposite of what I would expect as a priority list for extracting data (Schema.org > Open Graph > TITLE & META > best guess) or are somewhere in between. I think there is still some work needed before website owners can be given proper guidelines.
Google Chrome (Android)
Even though the Physical Web support should be available starting with Chrome 49.x, it didn’t work. I tried Android Chrome version (49.x), Chrome Beta (50.x) and Chrome Dev (51.x). I have enabled the hidden flag through chrome://flags and then another one in Settings>>Privacy. With the latter two (Chrome Beta & Chrome Dev) I was at least able to get to a screen where the beacons should be visible, but no beacon appeared there.
To verify that there is no issue with the beacon itself, I asked a colleague to give it a try with Chrome on iPhone. As you can see on the screenshot below there were no issues on iOS.
Further thoughts
Language locale
Because of the caching layer, and because the Accept-Language header is not forwarded, it is not possible to support multiple language versions and thus use locale targeting. Personally I think this would be a very good addition. There is nothing like targeting your customers in their native language.
Anonymity
It is said that the caching layer is to preserve user anonymity. That might be true but it has its caveats. Instead of letting a website owner know about his or her proportion of visits, the whole traffic, for all the websites, is collected only by the resolvers.
There seem to be no session/device/cookie information sent along with the requests, though Chrome requires you to turn Location on for some reason and, at the time of writing this article, I couldn’t get any traffic sample coming from the Chrome resolver. On top of that Google has a control to remotely enable/disable beacon support on individual phones or so it seems.
Remote objects
Even though the whole idea of the Physical Web is to see physical objects around you, it stands and falls with remote services. If the resolver (e.g. Opera URL-Caster), the URL shortener (e.g. goo.gl) or the internet connection is down, the beacons are not usable. The resolver and the shortener are two dependencies on third-party services (and potential bottlenecks), which goes against the core idea of a decentralized internet.
It would be nice if client implementations had a “pure local” mode (or fallback) for privacy and reliability reasons. This is important and agrees with the physical-web.org project goals: “We are using the open Eddystone-URL Bluetooth beacon format to find nearby URLs without requiring any centralized registrar.”
Private networks
Because the resolver is public, it is not possible to use beacons within a private network. This could be solved by adding manifest file support in combination with the mDNS and UPnP protocols (which are already integrated).
To my understanding, one of the main reasons for using a resolver service is to save on data traffic, because crawling a site directly can be quite expensive and slow. With manifest file support this should not be an issue, and the whole Physical Web beacon network would be perfectly scalable.
Measuring performance
For the same reasons mentioned above it is not possible to measure the performance of your beacons, like beacon visibility, traffic and devices at given nodes, click-through rates, and any other traffic and impression data that might be useful to you as a publisher or service provider.
This is all visible only to Google, Opera or any other player who will be successful in leveraging the beacons through the apps. Without accessible data, it might be difficult for some companies to justify this whole physical web beacons idea, as it is a black box for everyone else at the moment.
Interestingly, if client implementations accessed a manifest file directly from your site you would get some level of traffic information, so the proxy approach that Google and Opera are following is effectively preventing sites from seeing this traffic.
External links
- Understanding the different types of BLE Beacons
- Favicons, Touch Icons, Tile Icons, etc.
- Manifest File Format
- The Web Manifest specification
- shortlink – Specification.wiki
- Capture Android Mobile Web Traffic With Fiddler
- Dev.Opera: Release the Beacons!
- Google services on iOS now work natively with beacons
- Google Chrome Privacy Whitepaper – Physical web
- Creating a custom Physical Web Beacon with Piggate
- The Physical Web expands to Chrome for Android