
Posted by awoywood 3 years 26 weeks ago

awoywood
mobiForge Enthusiast
Posts: 12
Joined: 4 years ago

Hi,
Is there a way to be sure that the detection was done right?

I thought one way would be to compare the "_matched" property with the string to the left of the first "/", but DeviceAtlas sometimes seems to return less than that, even when it matched uniquely:
Nokia5200/2.0 (03.70) Profile/MIDP-2.0 Configuration/CLDC-1.1
_matched=Nokia520
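
The comparison described above could be sketched roughly like this. It is only an illustration of the idea: `device_token` and `covers_token` are hypothetical helper names, not part of the DeviceAtlas API, and the `_matched` value is passed in as a plain string.

```python
def device_token(user_agent: str) -> str:
    """Return everything to the left of the first '/' in the user agent."""
    return user_agent.split("/", 1)[0]


def covers_token(user_agent: str, matched: str) -> bool:
    """True if the `_matched` value covers the whole leading token."""
    return matched.startswith(device_token(user_agent))


ua = "Nokia5200/2.0 (03.70) Profile/MIDP-2.0 Configuration/CLDC-1.1"
print(device_token(ua))              # Nokia5200
print(covers_token(ua, "Nokia520"))  # False, although the match was unique
```

The check fails here even though the device was identified correctly, so on its own it cannot tell a truncated-but-unique match from a wrong one.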

Some misleading cases:

Wrong:
Alcatel-ELLE-N1/1.0 UP.Browser/7.1 (GUI) MMP/2.0
_matched=Alcatel-E
Alcatel-ELLE-N3/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 ObigoInternetBrowser/Q03C
_matched=Alcatel-E

Right: (only the "A" was not matched)
Alcatel-OT-C700A/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 ObigoInternetBrowser/Q03C
_matched=Alcatel-OT-C700

Also, DeviceAtlas seems to have problems with user agents like this one:
Mozilla/4.0 (compatible; MSIE 6.0; Linux; Motorola MOTOROKR Z6; 3348) MOT-MOTOROKR Z6/R60_G_80.33.4ER Profile/MIDP-2.0 Configuration/CLDC-1.1 Opera 8.50 [en]
_matched=Mozilla/4.0 (compatible; MSIE 6.0; Linux;
(it's detecting the browser, not a MOTOROKR Z6)

So, is there a "sure" way to say that a detection was done right?

Thanks!

Posted by awoywood 3 years ago

awoywood
mobiForge Enthusiast
Posts: 12
Joined: 4 years ago

I received no response to this, so I must assume there IS NO way to be sure that DeviceAtlas did the recognition right.

I tried to find out how the detection is done in DA and WURFL, and it seems to me that both try to match a prefix of the UA against what's in the DB. This causes two problems for me (please correct me if I'm wrong): a new phone whose UA starts with a known prefix in the DB gets identified as the known device, and phones that have their unique signature in the middle of the UA are not recognised, as in this one:

Mozilla/4.0 (compatible; MSIE 6.0; Linux; Motorola MOTOROKR Z6; 3348) MOT-MOTOROKR Z6/R60_G_80.33.4ER Profile/MIDP-2.0 Configuration/CLDC-1.1 Opera 8.50 [en]

For this UA, the device explorer matched "Mozilla/4.0 (compatible; MSIE 6.0; Linux;" and reports a 240 x 320 phone without having made a unique recognition!
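
To make the two problems concrete, here is a small sketch of plain longest-prefix matching. This is my guess at the general technique, not the actual DA or WURFL code; the `KNOWN_PREFIXES` table and its device labels are invented for illustration.

```python
# Hypothetical prefix table; the labels on the right are placeholders.
KNOWN_PREFIXES = {
    "Nokia520": "Nokia 5200",
    "Alcatel-E": "a known Alcatel ELLE model",
    "Mozilla/4.0 (compatible; MSIE 6.0; Linux;": "generic browser entry",
}


def longest_prefix_match(user_agent, prefixes=KNOWN_PREFIXES):
    """Return (matched_prefix, device) for the longest known prefix, or (None, None)."""
    best = None
    for prefix in prefixes:
        if user_agent.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    return best, prefixes.get(best)


# Problem 1: a new model shares the prefix of a known one.
print(longest_prefix_match("Alcatel-ELLE-N3/1.0 Profile/MIDP-2.0"))
# Problem 2: the unique part ("MOTOROKR Z6") sits in the middle of the string.
print(longest_prefix_match(
    "Mozilla/4.0 (compatible; MSIE 6.0; Linux; Motorola MOTOROKR Z6; 3348) "
    "MOT-MOTOROKR Z6/R60_G_80.33.4ER Profile/MIDP-2.0 Configuration/CLDC-1.1 Opera 8.50 [en]"
))
```

In both calls a prefix is found, but nothing in the result says whether the rest of the string was ever seen before.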

I'm also testing detectright.com, and they seem to have a much stronger algorithm, but there is no way to know anything about it, because it works through API calls to their server.

Please correct me if I'm wrong. All the tools from mobi are great; I really need to know whether this tool is precise.

Posted by daniel.hunt 3 years ago

daniel.hunt
Mobile Grandmaster
Posts: 230
Joined: 4 years ago

This post seems to have slipped through my elaborate fishing net...

Both DeviceAtlas and WURFL use mobile-browser matching. That is to say, they check whether a browser is a known mobile browser, and if the system doesn't recognise it, it is assumed to be a desktop browser.

There is a second method of detection, based on not-mobile detection, which presumes that a browser is mobile unless the system you use says otherwise.

There are a few benefits and a few pitfalls to both methods, and you should make sure you're aware of them before you decide which to choose.
The first method relies on groups like us to maintain a list of known mobile browsers. This involves actively maintaining a (definitive) list of all mobile devices known to man, and making sure it stays definitive.

The second method behaves in a slightly similar way, but focuses on desktop browsers instead of mobile ones.

Because of the nature of mobile phone distribution (also, I work for dotMobi ;) ), I'd sway towards the former method. Getting up-to-the-second UA information is close to impossible, but when it comes to mobile devices it's not that important if a UA list is 12 hours old.

However, with the latter method, when a new version of IE or a slightly updated Firefox is released with a new UA, your *entire* desktop site becomes completely and utterly unusable for the vast majority of your users, because their unrecognised browsers are treated as mobile (or at least for those who have auto-updates enabled, or who like to be at the cutting edge of browser development ... like most techies I know).
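
A toy comparison of the two methods, just to show how they disagree on a UA that neither list knows about. The marker tuples are made up for the example and are nowhere near a real list.

```python
# Illustrative marker lists only; real lists are far larger.
MOBILE_MARKERS = ("Nokia", "BlackBerry", "MOT-", "Alcatel-")
DESKTOP_MARKERS = ("MSIE 6.0; Windows", "Firefox/2.0")


def is_mobile_by_mobile_list(ua: str) -> bool:
    """First method: mobile only if a known mobile marker is present."""
    return any(marker in ua for marker in MOBILE_MARKERS)


def is_mobile_by_desktop_list(ua: str) -> bool:
    """Second method: mobile unless a known desktop marker is present."""
    return not any(marker in ua for marker in DESKTOP_MARKERS)


new_desktop_ua = "BrandNewDesktopBrowser/1.0 (X11)"     # invented UA
print(is_mobile_by_mobile_list(new_desktop_ua))   # False: served the desktop site
print(is_mobile_by_desktop_list(new_desktop_ua))  # True: wrongly served the mobile site
```

With the first method an unknown phone falls through to the desktop site; with the second an unknown desktop browser falls through to the mobile site, which is the failure described above.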

Anyway, let us know what your final decision is, and if possible, some reasons behind it too. We're always open for frank discussion here, so feel free to ask away!

Daniel

Daniel Hunt
dotMobi

Posted by atrasatti 3 years ago

atrasatti
Mobile Genius
Posts: 325
Joined: 5 years ago

awoywood wrote:
Hi,
Is there a way to be sure that the detection was done right?

The way DeviceAtlas does the recognition is based, today, on the user-agent string alone. The matching is best effort, which means we distinguish and detect devices we know about or that we have seen in the past. Your example about the Alcatel device explains it all: in short, we do not recognise the new ELLE-N3 if we haven't seen it and haven't logged its user-agent string.

We are working on a new API that we think will improve the recognition, but still, you have no way to detect something you have never seen. Yes, you could do some heuristics and say something like "I know the N1, I see the N3 is slightly different, so I think this will be an Alcatel ELLE-N3", but once you have detected the device name you still don't know the screen size. We could identify the UAProf, download it and parse it, and we would get some information such as screen size. But we would not get a lot of the other data points that we cover and that are not in UAProf, and the performance would take a BIG hit just because you have to wait to retrieve the remote file, parse it and so on (and sometimes the URLs are not reachable or just plain slow). We decided not to go that way.

We still think DeviceAtlas does a great job of detecting mobile devices, and we know that it will always be a race to catch up as quickly as possible with new devices. That is why we are also working hard to get agreements in place with vendors and operators to get fresh data from them.

Also, did you know that since version 2.0 you can contribute to making DeviceAtlas better?

Posted by awoywood 3 years ago

awoywood
mobiForge Enthusiast
Posts: 12
Joined: 4 years ago

Thanks Daniel and Andrea for your answers, but I must say you really didn't answer my question :-)

I understand it's a best-effort algorithm (and it works fine most of the time), but it would be great to know for sure whether the data returned is 100% precise or a best guess. This could be a new tag, for instance.
I thought at first I could simply compare the "_matched" tag against what is to the left of the first "/", but in some cases it doesn't work.

Years ago we had to develop our own database and detection algorithm. We maintain it mostly by hand (we hold only a few data points, such as screen size, WML/XHTML support and ringtone/image/video/Java capabilities). We work closely with Chilean operators, and most of the data was gathered from real devices.
Every week we manually check the user agents we don't know and complete the database. This is a very, very painful task. That's why we're looking for ways to outsource it.
But we have to be sure that the data from DeviceAtlas is precise, in other words, better than our data.

I believe this is a common approach that many developers have taken over the years. Many have their own databases, and they also have to know when to use theirs and when to rely on DeviceAtlas info.

By the way, our current algorithm is similar to what Daniel described. First we match the user agent against our own database (we use prefixes and keywords). Then there are some rules to detect whether it's a bot (based on user agent and IP address). After that we apply other rules to detect desktop browsers (based on user agent only). The default case is to assume it's a mobile device. If we don't have the user agent, we insert it in the DB, and every week we update its data.
Finally, we check the ACCEPT header and the data in the DB to determine whether the device should get WML; if so, we use an XHTML-to-WML converter (developed in-house).
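
Roughly, the order of checks looks like the sketch below. This is a simplified illustration, not our production code: `classify`, `should_serve_wml`, the `lookup` callable, the marker tuple and the exact WML rule are all assumptions made for the example.

```python
DESKTOP_MARKERS = ("Windows NT", "Macintosh", "X11")  # illustrative only


def classify(ua, ip, lookup, bot_ips=frozenset(), unknown=None):
    """Return (kind, device), applying the checks in the order described above."""
    device = lookup(ua)                        # 1. own database
    if device is not None:
        return "known", device
    if "bot" in ua.lower() or ip in bot_ips:   # 2. bot rules (UA and IP)
        return "bot", None
    if any(m in ua for m in DESKTOP_MARKERS):  # 3. desktop rules (UA only)
        return "desktop", None
    if unknown is not None:                    # 4. remember the new UA for the weekly review
        unknown.add(ua)
    return "mobile", None                      # default: assume a mobile device


def should_serve_wml(accept, db_supports_xhtml=None):
    """Decide on WML from the DB when it has an answer, else from the Accept header."""
    if db_supports_xhtml is not None:
        return not db_supports_xhtml
    accept = accept.lower()
    accepts_xhtml = ("application/xhtml+xml" in accept
                     or "application/vnd.wap.xhtml+xml" in accept)
    return "text/vnd.wap.wml" in accept and not accepts_xhtml


pending = set()
db = {"Nokia5200/2.0": "Nokia 5200"}           # toy exact-match database
print(classify("FooPhone/1.0", "10.0.0.1", db.get, unknown=pending))
print(should_serve_wml("text/vnd.wap.wml, image/gif"))
```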

Andrea, we will take the TA-DA test to upload data for rare devices we currently have. Have you thought about using a Java app to automatically detect capabilities and upload them? We developed one for us... it has some pros (and cons) against the wap tests.

Posted by atrasatti 3 years ago

atrasatti
Mobile Genius
Posts: 325
Joined: 5 years ago

We do have complete headers in our central database, but when we export to the JSON we compress and remove unneeded parts. You know what you know, but it's not possible, the way we have built the API, to know what we don't know... If that makes sense.

TA-DA is currently our main way to collect new headers and then we have the other sources that are providing some headers, but not all.

We are going to select a few partners and provide a little script or application that you can run on your logs daily, weekly or monthly (depending on the amount of logs) to send the results back to us. That should increase the quantity of headers and reduce the time it takes for us to add new devices. Nevertheless, even if I see a new header like the one in your example, I will still need to collect device data, and that will require time and effort. For that reason we are building TA-DA and providing the web interface, so that our users can help us do it more quickly. If everyone did a single device per month, we would already have thousands of profiles filled with many values.
That's the power of the community and we are providing the tools.

We have been thinking about a J2ME application, that wouldn't cover many of the details of the browser, though. We are thinking about it and we might work on something in the next few months, but I should say it's not in the short-term roadmap.

Posted by awoywood 3 years ago

awoywood
mobiForge Enthusiast
Posts: 12
Joined: 4 years ago

Andrea,
if you have the complete user_agent in your DB, you could tell if the matching was exact (via a _exact_match tag for instance).
You prefer not to do that in order to protect your data?

Posted by atrasatti 3 years ago

atrasatti
Mobile Genius
Posts: 325
Joined: 5 years ago

awoywood wrote:
Andrea,
if you have the complete user_agent in your DB, you could tell if the matching was exact (via a _exact_match tag for instance).
You prefer not to do that in order to protect your data?

It's not to protect our data; it is for performance reasons. We have more than 10000 user-agent strings, but the truth is that in most cases you need only 1 or 2 fragments of these strings to identify a device correctly. For example, look at the BlackBerry 8100 Pearl: we have almost 30 unique strings, but you can recognise the device using just a portion of 4 of them. When we generate the JSON tree, we take all the user-agent strings and all the devices, build a tree, and "cut all the empty branches". This means that, for example, a string like BlackBerry8100/4.2.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/100 will be represented in the tree only by BlackBerry810, because further down the tree we do not have any device or property associated with any string.
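
The effect of cutting the empty branches can be illustrated with a brute-force sketch. This is not how the JSON is actually generated (there is no real tree here), and apart from the first string the sample user agents are invented for the example.

```python
def shortest_distinguishing_prefixes(ua_to_device):
    """For each UA keep the shortest prefix shared only by UAs of the same device."""
    pruned = {}
    for ua, device in ua_to_device.items():
        for length in range(1, len(ua) + 1):
            prefix = ua[:length]
            same_device = all(
                dev == device
                for other, dev in ua_to_device.items()
                if other.startswith(prefix)
            )
            if same_device:
                pruned[prefix] = device
                break
        else:
            pruned[ua] = device  # the UA is itself a prefix of another device's UA
    return pruned


sample = {
    "BlackBerry8100/4.2.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/100": "BlackBerry 8100",
    "BlackBerry8100/4.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/102": "BlackBerry 8100",
    "BlackBerry8130/4.3.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/105": "BlackBerry 8130",
}
print(shortest_distinguishing_prefixes(sample))
# {'BlackBerry810': 'BlackBerry 8100', 'BlackBerry813': 'BlackBerry 8130'}
```

Three full strings collapse into two short entries, and everything after the distinguishing character is dropped from the exported data.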

This provides a HUGE improvement in the size of the JSON (today it is about 750KB) and consequently in the API lookup speed (and that's why we can do something like 20000 recognitions per second in Java and .NET).

Also, would you really consider it a non-exact match if a browser visited your site with a User-Agent such as BlackBerry8100/4.2.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/101 that is not present in the database?

I agree with you that we had a bad match in your case, but I think that our technique serves the large majority of cases, and the issue you had should be fixed by adding the new HTTP request header and a device to DeviceAtlas.

Out of curiosity, how many headers do you have in your database, and how many exact and non-exact matches do you get monthly?

Posted by awoywood 3 years ago

awoywood
mobiForge Enthusiast
Posts: 12
Joined: 4 years ago

Thanks again Andrea for your complete answer.
I understand now why the full, unique user-agent strings aren't included in the file. Of course it's much faster to traverse the compacted file based on unique prefixes.

But, how do you plan to match this kind of uas?
'Mozilla/4.0 (compatible; MSIE 6.0; Symbian OS; Nokia N70/5.0634.2.6.1; 6366) Opera 8.01 [en]'.
That one is a "Nokia N70", not a "Mozilla".

In our case we rely on prefixes too, but also on keywords. This way we can match UAs that have a unique string in the middle, like the one above. Of course, that is not very fast, but we do it after the prefix matching, so it's the worst case only for those UAs.

A bit about us: we operate the WAP portal of a South American mobile operator. We have about 1200 UAs in our DB and get about 30-40 new UAs every day. Most are really small variations of existing UAs. As we work within the operator, we also get the IMEI of the requesting device, and thus we have the TAC/FAC, which uniquely identifies the model.
They also inform us of the new models they start to sell and give us the vendor-reported UAProfiles.
The operator also manually tests each new device. We're convincing them to run a WAP test and a J2ME app so that the data gets into our DB automatically :-)
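
As a rough sketch of that two-stage approach (the `PREFIXES` and `KEYWORDS` tables are tiny made-up samples, and `match` is a hypothetical name, not our real code):

```python
PREFIXES = {"Nokia5200": "Nokia 5200"}
KEYWORDS = {"Nokia N70": "Nokia N70", "MOTOROKR Z6": "Motorola MOTOROKR Z6"}


def match(ua, prefixes=PREFIXES, keywords=KEYWORDS):
    """Try the longest known prefix first; fall back to a keyword anywhere in the UA."""
    candidates = [p for p in prefixes if ua.startswith(p)]
    if candidates:
        return prefixes[max(candidates, key=len)], "prefix"
    for keyword, device in keywords.items():   # slower scan, only when no prefix hit
        if keyword in ua:
            return device, "keyword"
    return None, "none"


ua = ("Mozilla/4.0 (compatible; MSIE 6.0; Symbian OS; "
      "Nokia N70/5.0634.2.6.1; 6366) Opera 8.01 [en]")
print(match(ua))  # ('Nokia N70', 'keyword')
```

Returning how the match was made ("prefix" or "keyword") costs nothing and is the kind of information I would like to get back from the API as well.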
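
For reference, pulling the model identifier out of an IMEI is simple. A 15-digit IMEI starts with eight digits that identify the model (historically a six-digit TAC plus a two-digit FAC) and ends with a Luhn check digit. The sketch below assumes the IMEI is already available as a string; how the operator passes it to us is not shown, and the sample number is a commonly used documentation example, not a real handset of ours.

```python
def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit counted from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def tac_from_imei(imei: str) -> str:
    """Return the first eight digits (TAC, or TAC+FAC on older devices)."""
    digits = "".join(c for c in imei if c.isdigit())
    if len(digits) != 15 or not luhn_valid(digits):
        raise ValueError("not a valid 15-digit IMEI")
    return digits[:8]


print(tac_from_imei("49-015420-323751-8"))  # 49015420
```

The returned value can then be used as a key into a model table, independently of whatever the user agent says.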

Posted by adrian.hopebailie 3 years ago

adrian.hopebailie
Mobile Expert
Posts: 51
Joined: 4 years ago

Hi Alejandro,

As Andrea explained, we trim off all UA data after the last significant character. In your latest post you mention the UA:

'Mozilla/4.0 (compatible; MSIE 6.0; Symbian OS; Nokia N70/5.0634.2.6.1; 6366) Opera 8.01 [en]'

The last significant character in this case is "N" after Nokia. (You can see this by trying the UA in our new data explorer).

What this means, basically, is that we know of no other UA that is mapped to a different device and has the same characters up to that point but is different after the "N".

As has been said, this design is optimal for performance. In a live web server environment this makes sense.

Lately we have been looking at how the API may be used in less time critical environments such as log analysers. In these circumstances it would probably be beneficial to include some extra "debug" data in the JSON so that we can not only get out the best possible data but also get an idea of how accurate it is likely to be.
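
Purely as an illustration of what such "debug" data might look like (none of this exists in the current API; `_exact_match` is the tag name suggested earlier in this thread, and `_unmatched` and the table layout are invented for the example), a lookup could keep the full strings alongside each trimmed prefix:

```python
def lookup_with_debug(ua, table):
    """table maps a trimmed prefix to (device, set of full UAs seen for it)."""
    candidates = [prefix for prefix in table if ua.startswith(prefix)]
    if not candidates:
        return None
    matched = max(candidates, key=len)
    device, known_uas = table[matched]
    return {
        "device": device,
        "_matched": matched,
        "_unmatched": ua[len(matched):],   # the part the tree never looked at
        "_exact_match": ua in known_uas,   # True only for a UA seen verbatim
    }


table = {
    "Nokia520": (
        "Nokia 5200",
        {"Nokia5200/2.0 (03.70) Profile/MIDP-2.0 Configuration/CLDC-1.1"},
    ),
}
result = lookup_with_debug(
    "Nokia5200/2.0 (03.70) Profile/MIDP-2.0 Configuration/CLDC-1.1", table
)
print(result["_matched"], result["_exact_match"])  # Nokia520 True
```

The cost is a larger data file, which is why something like this would suit log analysis better than a live web server.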

If the demand is there, we will certainly dedicate more resources to taking this idea forward. I take it from your comments thus far that the demand is certainly there from you :)

We are currently working on v2 of the API, which we hope to release in January. The new API will use more headers and include a platform-specific client that has extra intelligence built in. We have spent a great deal of time and effort refining the algorithms, and we believe that the next release of the API will offer unbeatable accuracy and speed.

We have already trialled some performance improvements that we believe could make the API up to twice as fast too.

I must emphasize what Andrea has said about using TADA. It is an invaluable resource not only for us, but also for you as a DeviceAtlas user. It is one thing for us to collect lots of headers that will help improve recognition of devices we already know; it is far better for users of new devices to run the tests on TADA so that we can gather some device data to go with those headers.

As a developer you are in a privileged position if you have access to data from the operators. It would be great if you could persuade them to test all of their new phones on TADA too. That would mean that as a DeviceAtlas user you know the data will be there for you before the phone even hits the streets. We already have a number of operators starting to do this.

I hope you have a clearer picture of where we are with our API and where we are heading. We'd really appreciate any feedback you could give us about TADA, both on the existing tests and any tests you would like to see added. Of course all of your feedback is very valuable to us, so please keep it coming.

Adrian Hope-Bailie
dotMobi

Posted by awoywood 3 years ago

awoywood
mobiForge Enthusiast
Posts: 12
Joined: 4 years ago

adrian.hopebailie wrote:
...

Lately we have been looking at how the API may be used in less time critical environments such as log analysers. In these circumstances it would probably be beneficial to include some extra "debug" data in the JSON so that we can not only get out the best possible data but also get an idea of how accurate it is likely to be.

If the demand is there, we will certainly dedicate more resources to taking this idea forward. I take it from your comments thus far that the demand is certainly there from you :)

We are currently working on v2 of the API, which we hope to release in January. The new API will use more headers and include a platform-specific client that has extra intelligence built in. We have spent a great deal of time and effort refining the algorithms, and we believe that the next release of the API will offer unbeatable accuracy and speed.

Hi, is the new v2 API coming out soon? I really hope it comes with more debug headers. As I said before, for us it is essential to know whether the match is exact. In our case, we are much more concerned about accuracy than speed.

Alejandro Woywood
www.amnesiagames.cl
