
Posted by nosher 1 year 21 weeks ago

 nosher
mobiForge Newbie
Posts: 5
Joined: 1 year ago

In trying to decide whether to use DeviceAtlas or WURFL (or both), I took a sample of our mobile traffic (which inevitably includes robots and desktop visits) and distilled 1,500 unique user agents. I then ran a comparison test in Java using the DA API and WURFL's API. It may be worth noting that this tested solely for resolutionWidth, which is one of our most important requirements. I got the following results for successful matches, i.e. where resolutionWidth > 0:

DeviceAtlas: Processed in: < 1sec, success rate: 76%

WURFL Strict (no fall-back): Processed in: < 1sec, success rate: 27%

WURFL Loose (does fall-back): Processed in: 96 seconds, success rate: 96%

So, DA compares very well with WURFL in strictly matching incoming UAs, but not that well against WURFL using its "loose" query, i.e. with a fall-back if an exact match is not found.

Given that this is influencing our purchasing decision, could you suggest whether a) there is some (undocumented?) way of getting DA to fall-back to the next-best device, or b) whether you have plans to add some sort of fall-back/fuzzy-match facility?

Cheers.

Posted by atrasatti 1 year ago

 atrasatti
Mobile Genius
Posts: 320
Joined: 3 years ago

Hi,
I think your research is missing something, at least on the DeviceAtlas side. When you pass a user-agent string to our API, we will not return a display width (or height) if it is a robot. We don't know the display width of GoogleBot or Yahoo Slurp; do they really have a display width? Maybe they want to pretend to have one, but this is a different topic that we may discuss elsewhere (in these forums).

With regards to desktop browsers, we do not provide display size either, that is dependent on the monitor, videocard, etc. We think you're better off using Javascript for that.

What you should really do to make your research more accurate is run through those 1500 user-agent strings, filter out non-mobile devices and then verify the presence of displayWidth. Always speaking for DeviceAtlas only, what you should do is pass the user-agent string to our API, see if mobileDevice is set to true, and if set so, check the display width. For all the other cases, DeviceAtlas will tell you if it's a browser, a robot, a proxy and so on.
See this blog post about some properties that might help you: DeviceAtlas 1.3.1, better browser detection and more security.
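The filtering step described above could be sketched in Java roughly as follows. This is only an illustration: the `lookup` function stands in for whatever call returns the properties of a user agent, and the class and method names are made up for the example. Only the property names mobileDevice and displayWidth, and the "1" value used for a true flag, come from this thread.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: tallies mobile UAs and how many of them have a known width.
class MobileWidthReport {

    /** Simple counters for the three outcomes we care about. */
    static final class Counts {
        int mobile;          // UAs flagged as mobile devices
        int mobileWithWidth; // mobile UAs that also report a displayWidth
        int nonMobile;       // robots, desktop browsers, unknown UAs, etc.
    }

    /**
     * @param lookup assumed stand-in for the detection API: maps a UA string
     *               to its known properties (empty map if nothing is known)
     */
    static Counts tally(List<String> userAgents,
                        Function<String, Map<String, String>> lookup) {
        Counts counts = new Counts();
        for (String ua : userAgents) {
            Map<String, String> props = lookup.apply(ua);
            if ("1".equals(props.get("mobileDevice"))) {
                counts.mobile++;
                if (props.containsKey("displayWidth")) {
                    counts.mobileWithWidth++;
                }
            } else {
                counts.nonMobile++;
            }
        }
        return counts;
    }
}
```

The success rate that matters for the comparison is then mobileWithWidth divided by mobile, not by the full list.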

I think that if you make a little change to your test application you will quickly get different figures telling you that 76% were mobile browsers (with a known display width), a certain amount were browsers and another amount were robots, etc. This way you will get what you are really looking for, the percentage of mobile browsers with a known display width, and all the other types of browsers.

If you want to implement our API and assign your own defaults, you're welcome to do it and maybe you will define your own defaults based on the browser type, if it's robot, a browser, or something else.

Andrea Trasatti
DeviceAtlas, mobile device intelligence

Posted by nosher 1 year ago

Fair point :-)

Modifying the test to explicitly exclude non-mobile UAs (via the mobileDevice.equals("1") test) gives a 94% success rate, which is certainly more in the realms of what we're looking for. I guess it's a little misleading of WURFL to give such a high success rate for all UAs (regardless of type) when run in loose mode, as presumably it must be giving "made up" values for device width in a lot of cases (there were 307 non-mobile UAs in my list; I had also made sure to exclude WURFL's "generic" device as being a valid result in that test).

Cheers for the pointer.

Posted by daniel.hunt 1 year ago

 daniel.hunt
Mobile Grandmaster
Posts: 189
Joined: 2 years ago

Quote:
DeviceAtlas: Processed in: < 1sec, success rate: 76%
WURFL Strict (no fall-back): Processed in: < 1sec, success rate: 27%
WURFL Loose (does fall-back): Processed in: 96 seconds, success rate: 96%

Quote:
Modifying the test to explicitly exclude non-mobile UAs (via the mobileDevice.equals("1") test) gives a 94% success rate

This is certainly more like the kind of numbers I like to see :)

Could I ask you to provide just a little more detail in your tests though? I'm interested (now that you've explicitly ignored the non-mobile UAs) to see what the DA -v- WURFL figures are like now. Obviously, removing 307 non-mobile UAs won't account for the 69% difference between the 2 WURFL results, but I can't help but wonder how the 2 compare on your list of mobile-only headers.

Also, you mention that both DA and WURFL resulted in processing times of < 1 sec, can you provide any more granularity in these times? We (of course) do our own performance testing on DA, but 'real-world' (tm) tests run by users such as yourselves can always prove to be interesting reading!

Finally, if you don't mind, I think I'll be a little cheeky and ask you to send the UAs that DA couldn't detect on to us at ;)

Daniel Hunt
dotMobi

Posted by nosher 1 year ago

Here goes: pre-filtering the original 1,500 list to remove UAs that DA detects as non-mobile, leaves me with 1193 unique UAs. I then pass this list into my "test DA" and "test WURFL" methods to get the following results (I ran a few tests to get a range of timings):

DeviceAtlas: Total:1193, Missed: 44, Success: 96%, Finished in: 170-240ms

WURFL Strict: Total:1193, Missed: 787, Success: 34%, Finished in: 100-117ms

WURFL Loose: Total:1193, Missed: 10, Success: 99%, Finished in: 73sec

There has been some jitter in the result (DA is now up to 96%, which is good), presumably something to do with my new pre-filtering run. I can send an updated list of the 44 UAs that DA missed this time, if you want (it's grep of a grep of a log file, so some of them may be spurious entries anyway).
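For anyone checking the arithmetic, the percentages above follow directly from the Total and Missed counts. A minimal sketch (the class and method names are invented for the example; rounding to the nearest whole percent is an assumption about how the figures were produced):

```java
// Hypothetical helper that recomputes the figures quoted in this post.
class SuccessRate {

    /** Percentage of UAs that returned a usable value, rounded to a whole number. */
    static long percent(int total, int missed) {
        return Math.round(100.0 * (total - missed) / total);
    }

    /** Average cost of a single lookup, given the time for the whole batch. */
    static double millisPerLookup(double totalMillis, int total) {
        return totalMillis / total;
    }

    public static void main(String[] args) {
        System.out.println("DeviceAtlas:  " + percent(1193, 44) + "%");
        System.out.println("WURFL Strict: " + percent(1193, 787) + "%");
        System.out.println("WURFL Loose:  " + percent(1193, 10) + "%");
        System.out.println("DA ms/lookup (slowest run): " + millisPerLookup(240, 1193));
        System.out.println("WURFL Loose ms/lookup:      " + millisPerLookup(73_000, 1193));
    }
}
```

This reproduces the 96%, 34% and 99% figures, and puts the per-lookup cost at roughly 0.2 ms for DA's slowest run and about 61 ms for the uncached WURFL Loose run.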

These tests are run on a MacBook Pro/OS X Leopard (10.5) in Eclipse w/ Java 1.6.0_05

Hope that helps!

Posted by daniel.hunt 1 year ago

nosher wrote:
DeviceAtlas: Total:1193, Missed: 44, Success: 96%, Finished in: 170-240ms

WURFL Strict: Total:1193, Missed: 787, Success: 34%, Finished in: 100-117ms

WURFL Loose: Total:1193, Missed: 10, Success: 99%, Finished in: 73sec
...
Hope that helps!

It certainly does help! I love looking at data comparisons like this :)

It looks like the performance that you're getting for your Java API is less than we would have anticipated (see the API Performance info). Of course, you're running it using a development laptop which is no doubt being slowed by normal desktop add ons that you use regularly, but I wouldn't have expected there to be that much of a difference for you. (I calculate you hitting around 25% of our own benchmarks)

I must say though, 73 seconds to identify 1200 UA strings, compared to between 100-240ms for DA/WURFL Strict is amazing ...

Daniel Hunt
dotMobi

Posted by nosher 1 year ago

Quote:
I must say though, 73 seconds to identify 1200 UA strings, compared to between 100-240ms for DA/WURFL Strict is amazing ...

It is a long time, but to be fair both DA and WURFL/Strict are trivial node lookups (JSON and XML, respectively), whereas WURFL loose is doing a sequence of progressive string shortening (well, something like that) until it gets a match, so I'd expect it to be slower. However, it does rule out its use for something where we may be doing several queries per request and don't want to add another 300ms to our server-side processes along the way. I'm also somewhat sceptical of the results WURFL/Loose is returning, given that it claimed an almost 100% success rate for deviceWidth even when I was including a whole heap of bots and desktop UAs...
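To illustrate why a "progressive string shortening" lookup costs more than an exact one, here is a rough Java sketch of the general idea. It is not WURFL's actual matching code: the class, its methods and the minimum-prefix cut-off are all assumptions made for the example, and it implements only the guess described in this post.

```java
import java.util.Optional;
import java.util.TreeMap;

// Illustrative only: exact lookup versus a shorten-until-it-matches lookup.
class PrefixMatcher {
    // Sorted map of known UA string -> device id, so prefix searches are cheap.
    private final TreeMap<String, String> knownAgents = new TreeMap<>();
    private final int minPrefixLength;

    PrefixMatcher(int minPrefixLength) {
        this.minPrefixLength = minPrefixLength;
    }

    void register(String userAgent, String deviceId) {
        knownAgents.put(userAgent, deviceId);
    }

    /** "Strict" style: a single map lookup, no fuzziness. */
    Optional<String> strictMatch(String userAgent) {
        return Optional.ofNullable(knownAgents.get(userAgent));
    }

    /** "Loose" style: drop trailing characters until some known UA shares the prefix. */
    Optional<String> looseMatch(String userAgent) {
        Optional<String> exact = strictMatch(userAgent);
        if (exact.isPresent()) {
            return exact;
        }
        for (int len = userAgent.length(); len >= minPrefixLength; len--) {
            String prefix = userAgent.substring(0, len);
            // The smallest key >= prefix is the first candidate that could start with it.
            String candidate = knownAgents.ceilingKey(prefix);
            if (candidate != null && candidate.startsWith(prefix)) {
                return Optional.of(knownAgents.get(candidate));
            }
        }
        return Optional.empty();
    }
}
```

Each unmatched UA can trigger many substring and search steps before it resolves, which is where the extra time goes. A cut-off such as minPrefixLength matters too: without one, nearly any string starting with "Mozilla" would eventually match some device.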

I had seen your published performance figures and felt that my response was slower than this. However, it's still in the order of <10ms per request so I'm quite happy with that :-)

Posted by daniel.hunt 1 year ago

nosher wrote:
I'm also somewhat sceptical of the results WURFL/Loose is returning, given that it claimed an almost 100% success rate for deviceWidth even when I was including a whole heap of bots and desktop UAs...

I had seen your published performance figures and felt that my response was slower than this. However, it's still in the order of <10ms per request so I'm quite happy with that :-)

That 99% success rate is certainly a little worrying alright. Do you know what WURFL is actually returning for those lookups? I mean, if it's just a case of almost always returning a default screen width of, for example, 100, then it would definitely have an undesired effect on some (if not the majority) of the devices that it doesn't actually recognise. I do understand what you're saying about why it takes so long to query properties using WURFL Loose, but by the time the site responds to the user (after 70 seconds) they'd already have wandered off to another site anyway :)

And I'm glad you're happy :D

Daniel Hunt
dotMobi

Posted by passani 1 year ago

 passani
mobiForge Enthusiast
Posts: 18
Joined: 3 years ago

Hi, sorry for the intrusion. Someone told me about this thread and, while I don't know much about the details of how DA works, I would like to set the record straight about how WURFL works.

First off, I think the original poster is confusing the fall-back with the matching heuristics. Given a UA string similar (but not identical) to the one in WURFL, it's the heuristics that perform the magic of returning a match, not the fall-back. Fall-backs are important at a later stage, when WURFL needs to return a value for a capability.

Secondly, I think that there is an important aspect of how WURFL works that should be considered. Because of heuristics, the first time a UA string is encountered, a fair amount of computation is performed to find a match, *BUT* the second time around a match is found with a simple HashMap lookup. Because of this, I would expect your 1000 matches to be performed way faster than 73 seconds. (BTW, Daniel, when you hit a site with your phone, that's one UA string you send, not 1000 and it definitely won't take 73 secs for the server to respond. What are you talking about?).
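The pattern described here (expensive heuristic match on first sight, plain HashMap lookup afterwards) can be sketched in a few lines of Java. This is a generic memoization wrapper, not the WURFL API's own code; the class name and the `slowMatcher` function are placeholders for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative memoizing wrapper around an expensive UA matcher.
class CachingMatcher {
    private final Function<String, String> slowMatcher; // assumed: UA string -> device id
    private final Map<String, String> cache = new HashMap<>();
    private int slowCalls = 0;

    CachingMatcher(Function<String, String> slowMatcher) {
        this.slowMatcher = slowMatcher;
    }

    /** First call for a UA runs the heuristics; repeat calls are a map lookup. */
    synchronized String match(String userAgent) {
        String deviceId = cache.get(userAgent);
        if (deviceId == null) {
            deviceId = slowMatcher.apply(userAgent);
            slowCalls++;
            cache.put(userAgent, deviceId);
        }
        return deviceId;
    }

    /** How many times the expensive path was taken (useful when benchmarking). */
    synchronized int slowCalls() {
        return slowCalls;
    }
}
```

A benchmark that starts a fresh process, runs each UA once and exits only ever measures the expensive path, so running the same list twice in one process is the fairer test.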

Back to WURFL, if I could go back in time, I would only implement the loose matching and simply call it "UAMatch()". The availability of strict and loose methods has confused more than one developer in the past (for some reason some people tend to think that stricter is necessarily better than looser, go figure).

With regards to the quality of the WURFL data, I can only say that WURFL has over one-hundred registered contributors and a few thousand companies using it. I wouldn't be too surprised if I discovered that a lot of the data is actually good (particularly when it comes to something as basic as screen size and related).

One final note, the new WURFL API is virtually ready:

http://m.wurflpro.com/~wurfl/

this API will do a much better job at matching mobile and web browsers (assuming you have the new WEB patch)

Nosher, if you happen to run the WURFL Loose test twice, please let me know the result. I am curious.

Thank you

Luca Passani

Posted by daniel.hunt 1 year ago

passani wrote:
....
(BTW, Daniel, when you hit a site with your phone, that's one UA string you send, not 1000 and it definitely won't take 73 secs for the server to respond. What are you talking about?).
...
The availability of strict and loose methods has confused more than one developer in the past (for some reason some people tend to think that stricter is necessarily better than looser, go figure).
...
I wouldn't be too surprised if I discovered that a lot of the data is actually good (particularly when it comes to something as basic as screen size and related). ... Nosher, if you happen to run the WURFL Loose test twice, please let me know the result. I am curious.

Ah, Luca, this could prove to be an even more interesting and lively discussion yet! :)

As much as I hate to pick apart posts when replying to people, I think I'll do it here just to clear the air a bit, and ask a few more questions of my own.

  1. I'm well aware that a single request won't take 73 seconds (I've just re-read my previous post. Looks like my brain went asleep while I was typing, sorry for the misunderstanding there), but figures like the ones Nosher has posted highlight the kind of data that normal admins or developers would be interested in. If their site regularly experiences high-volumes of traffic, then the 73 seconds response time (in this case) becomes more important.
  2. This is the first time I've come across anything to do with a strict/loose approach to the retrieval of device data, but as a developer I would be inherently more inclined to go for the strict approach, should they both be presented to me. Depending on the results they produce, I would consider the loose one, but definitely, strict would be my first port of call
    • Something else that should probably be explained is that although it would appear that Nosher is comparing WURFL Strict with DA, there's a little bit of confusion here. DA actually works in a combination of both Strict and Loose mode. There's no actual Loose mode available for DA - we only give values for properties that we know we have a value for.
  3. I'm quite interested to know this myself, by the way. Hopefully Nosher's in a helpful mood

In general, speed tests, such as the ones we've started discussing here, are very subjective. Running the tests on an unloaded machine with incredibly high specs will always produce much lower times than running them on a developer's heavily laden dev machine. For this reason, I always take speed measurements with a grain of salt.
That's not to say it doesn't make for interesting reading though, because it most certainly does.

And probably the best thing about all of this is that it's encouraging a lively debate on dev.mobi :D

Finally Luca - it's interesting that you state that you would like to go back and change WURFL to use Loose by default. As I mentioned above, as a developer, I would go for strict before loose, but having been using DA for a while, our combination of strict and loose has won my heart over and over again.

Daniel Hunt
dotMobi

Posted by passani 1 year ago

> If their site regularly experiences high-volumes of traffic,
> then the 73 seconds response time (in this case) becomes more important.

There is no 73-second response time anywhere. 73 seconds is the time needed to do 1000+ computations in one shot. In a normal installation, those 73 seconds would be evenly spread over the several hours of operation needed before those 1000+ UAs are spotted for the first time.
Also, in the unlikely event this is an issue, nothing prevents a developer from artificially loading those 1000 computations at start-up (i.e. you wait 73 seconds in the beginning and not anymore).
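Loading the computations at start-up could look something like the sketch below. It is a generic warm-up pattern and not part of any WURFL release: the class name, `warmUp` method and `slowMatcher` function are assumptions for illustration, with the list of known UAs coming from wherever you keep it (a log extract, for instance).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative matcher whose cache can be filled once at application start-up.
class WarmableMatcher {
    private final Function<String, String> slowMatcher; // assumed: UA string -> device id
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    WarmableMatcher(Function<String, String> slowMatcher) {
        this.slowMatcher = slowMatcher;
    }

    /** Pay the matching cost up front for every UA we already expect to see. */
    void warmUp(Iterable<String> knownUserAgents) {
        for (String ua : knownUserAgents) {
            match(ua);
        }
    }

    /** Runs the slow matcher only for UAs that are not cached yet. */
    String match(String userAgent) {
        return cache.computeIfAbsent(userAgent, slowMatcher);
    }

    int cachedEntries() {
        return cache.size();
    }
}
```

Calling warmUp once during application start moves the one-off cost out of the request path; UAs that were not in the warm-up list still pay it on their first appearance.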

In hindsight, I would change the UAMatchLoose() to UAMatch() and UAMatchStrict() to UAMatchStrictThatYouShouldntUseUnlessYouHaveSpecialNeeds(), but hey, it was 2002....the industry was still betting that device fragmentation was bound to disappear at the time....

Nosher, would you run the WURFLLoose mode test twice to get an approximate measure of the WURFL API performance?

Luca

Posted by nosher 1 year ago

Hi All, and Luca - thanks for the clarifications.

OK, so it turns out that the comparison with WURFL Loose and DA was a little unfair - my unit test obviously starts up, runs, and then terminates which doesn't give WURFL a chance to do any caching. Running the WURFL section twice in the same test, however, gives a much more favourable result:

WURFL Loose 1st run: Total:1193, Missed: 10, Success: 99%, Finished in: 64482ms
WURFL Loose 2nd run: Total:1193, Missed: 10, Success: 99%, Finished in: 85ms

which is, needless to say, a massive change once the data is cached (I had actually considered implementing a cache for WURFL IDs anyway, not being aware that WURFL did its own. I guess in hindsight it would be a fairly obvious expectation to assume that it would).

Soooo, in terms of comparison I'm sort-of back to square one - the remaining arbiter is now quality of data, for which I'll need to do some (much harder) testing. That said, the thing that still stands out is WURFL's almost 100% success with all UAs, which I'm still curious about, given that it sometimes returns values such as:

  • Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.16) Gecko/20080702 Firefox/2.0.0.16 id: htc_touch_ver_subminimo width: 240
  • Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322) id: google_wireless_transcoder_ver1 width: 128
  • Mozilla/5.0 (iPod; U; CPU iPhone OS 2_0_1 like Mac OS X; en-us) AppleWebKit/525.18.1 (KHTML, like Gecko) Version/3.1.1 Mobile/5B108 Safari/525.20 id: apple_ipod_touch_ver2 width: 300

However, I'll try some more tests including WURFL's "is_wireless_device" to see if I can rule these UAs out (as we handle iphone and desktop content as two completely separate branches).

Posted by atrasatti 1 year ago

I'm happy if nosher runs the test again. It should be noted that WURFL API does the hashmap lookup (in Java) and is faster in the next search for the same user-agent string. The hashmap is limited in size, isn't it? If I have a high traffic site, with, say 100 different unique devices per second, a hashmap with a size limit smaller than 100 might even penalize the performance (I picked 100 randomly, I did not check the size limit of the hashmap cache).

DA does not have a caching system implemented. We decided to leave it to the developer to do, if needed, but we thought the API performance was already good enough in real time.

Andrea Trasatti
DeviceAtlas, mobile device intelligence

Posted by passani 1 year ago

> The hashmap is limited in size, isn't it?

No, it's not. Java HashMaps will grow automatically when more space is needed. It really takes a heavily trafficked mobile site to reach the limits of HashMaps (in fact, I am not aware of any production site in Java that ever did).

Also, HashMaps are the most obvious thing out of the Java box, but nothing prevents a Java programmer from using some different kind of maps (those which discard data not accessed for some time, for example).
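One standard-library way to get a bounded map is a LinkedHashMap in access order with removeEldestEntry overridden. Note this sketch evicts by size (least recently used entry first) rather than by elapsed time, so it is a close cousin of the time-based idea above rather than the same thing. The class name LruCache is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Size-bounded cache: once full, the least recently accessed entry is dropped.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        // The third argument (accessOrder = true) orders entries by last access
        // instead of insertion order, which is what makes this an LRU cache.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; true evicts the eldest entry.
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread-safe, so in a servlet container it would need to be wrapped with Collections.synchronizedMap or guarded by a lock.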

Luca

Posted by passani 1 year ago

Nosher, a couple of comments and a possible solution for you.

Comment 1 is that the WURFL API was built with mobile device detection in mind. So if you throw a random web browser UA at it, the response will be random (typically, you will get the ID of some device with a Mozilla-like UA: Mozilla/5 (... MSIE 6.0....Compatible.....) and so on...)

What many WURFL adopters have done is to filter out robots and web browsers in advance, so this does not need to be an issue.

Comment 2 is that WURFL has been architected to always give you an answer ("I don't know" is not a good answer). The reason for this is performance. Developers shouldn't be forced to handle exceptions in their code (what if this capability does not exist?) since this wouldn't be ideal performance-wise and in terms of simplicity in programming. Experience has shown that this mechanism is sound. Corrections to WURFL can be handled through patch files directly in your domain: http://wurfl.sourceforge.net/patchfile.php

The possible solution I was referring to is the new API which I already mentioned. With the new API a big effort has been made to also correctly recognize web browsers. What you need is the new API and the (still unpublished) new Web Patch.

If you contact me offline I will make sure you get it and you are able to run the test with the latest API. My email address is here: http://www.wurflpro.com/static/my_email.gif

As an aside, I released the latest wurfl.xml yesterday, so you may want to use it too.

Luca

Posted by daniel.hunt 1 year ago

nosher wrote:
Running the WURFL section twice in the same test, however, gives a much more favourable result:

WURFL Loose 1st run: Total:1193, Missed: 10, Success: 99%, Finished in: 64482ms
WURFL Loose 2nd run: Total:1193, Missed: 10, Success: 99%, Finished in: 85ms

....

Soooo, in terms of comparison I'm sort-of back to square one - the remaining arbiter is now quality of data, for which I'll need to do some (much harder) testing. ...
However, I'll try some more tests including WURFL's "is_wireless_device" to see if I can rule these UAs out (as we handle iphone and desktop content as two completely separate branches).

Those are some pretty impressive changes right there. Obviously, caching is having a massive effect on the overall detection of the UAs you have, and forcing a pre-cache using a list of headers (in much the same way as you are doing now) would make for a blisteringly fast overall end-user experience. Well, from the detection side of things at least; after that it's all down to your own code :)

Without wanting to waste too much of your time, I'm sure we'd all love to see the findings from your tests once you're finished. Recognising a device and providing a known value (or presuming a value) is all well and good, but if that data isn't valid then you could be in for a whole other world of hurt (which is where all of this testing comes in I suppose)

I'm looking forward to some more data!

Daniel Hunt
dotMobi

Posted by adrian.hopebailie 1 year ago

 adrian.hopebailie
Mobile Expert
Posts: 51
Joined: 1 year ago

passani wrote:

WURFL has been architected to always give you an answer (I don't know is not a good answer). The reason for this is performance. Developers shouldn't be forced to handle exceptions in their code (what if this capability does not exist?)

I would venture to say that "I don't know" (which DA does by raising an Exception) is actually the best answer, because it is correct and not an assumption.
I am also yet to discover an environment where the capability to handle exceptions does not exist? I stand to be corrected.

Device Atlas is not fuzzy.

Give the API a user-agent and get back an answer. Either the answer is correct or unknown, and how to handle the latter case is left to the developer. If, as a developer, you would prefer the device detection API to take this out of your hands and return a best guess, then maybe the DA approach is not for you, but in my experience few people would go for a fuzzy approach if they actually knew that's what they were going for.

In terms of performance I would say this is the best approach too, maybe I am being biased :)
You can depend on the API to always respond quickly whether it has an answer or not. Deciding what to do next is a simple TRY getProperty(); useIt(); CATCH fallbackAsYouWish.

Raising an exception when there is an error (like trying to get a property that doesn't exist for the given UA) is a best practice approach that developers can easily understand and use accordingly.

I might add that, following user feedback, the next version of the API will also offer a hasProperty(userAgent, propertyName) function that will return a boolean and in the background cache the property that was requested so it is available immediately for the following getProperty() call.

EG: IF hasProperty(x) THEN getProperty(x) ELSE fallBackAsYouWish
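In Java, the two pseudocode patterns above might look like the following. Everything here is a stand-in written for illustration: DeviceLookup, UnknownPropertyException and WidthResolver are hypothetical, and the real DeviceAtlas API's class names, signatures and exception types may well differ. Only the hasProperty/getProperty method names and the displayWidth property come from this thread.

```java
import java.util.Map;

// Hypothetical exception type for "no value known for this UA and property".
class UnknownPropertyException extends RuntimeException {
    private static final long serialVersionUID = 1L;

    UnknownPropertyException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a detection API that never guesses.
class DeviceLookup {
    private final Map<String, Map<String, String>> devices;

    DeviceLookup(Map<String, Map<String, String>> devices) {
        this.devices = devices;
    }

    boolean hasProperty(String userAgent, String name) {
        Map<String, String> props = devices.get(userAgent);
        return props != null && props.containsKey(name);
    }

    String getProperty(String userAgent, String name) {
        if (!hasProperty(userAgent, name)) {
            throw new UnknownPropertyException(name + " is unknown for " + userAgent);
        }
        return devices.get(userAgent).get(name);
    }
}

class WidthResolver {
    /** TRY getProperty(); useIt(); CATCH fallbackAsYouWish. */
    static int widthWithTryCatch(DeviceLookup api, String userAgent, int fallback) {
        try {
            return Integer.parseInt(api.getProperty(userAgent, "displayWidth"));
        } catch (UnknownPropertyException e) {
            return fallback;
        }
    }

    /** IF hasProperty(x) THEN getProperty(x) ELSE fallBackAsYouWish. */
    static int widthWithHasProperty(DeviceLookup api, String userAgent, int fallback) {
        if (api.hasProperty(userAgent, "displayWidth")) {
            return Integer.parseInt(api.getProperty(userAgent, "displayWidth"));
        }
        return fallback;
    }
}
```

Either way the default is chosen by the calling code, so a robot, a desktop browser and an unrecognised phone can each get a different fallback.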

The short answer: DA is very fast and very accurate and lets the developer decide what to do next.

Adrian Hope-Bailie
dotMobi