51Degrees compares the Raspberry Pi against two modestly priced Intel platforms.
51Degrees is regularly evaluated against competitors DeviceAtlas and WURFL from ScientiaMobile for performance, accuracy and memory consumption. This website offers plenty of resources for engineers evaluating device detection solutions, including migration guides and API benchmarks.
This blog post builds on a 2015 performance blog with three new commodity hardware benchmarks on:
- a Raspberry Pi
- a low-end desktop PC
- a high-end multi CPU server.
The Importance of Device Detection
Responsive Web Design (RWD) sends the same web page to every device. Device detection increases the options available to the developer and designer by enabling more complex logic to be placed on the web server. When used appropriately, it creates a fast, device-optimised user experience that improves the business benefits of the website. Many businesses, large and small, use the technique, and many case studies are available on this website.
Here are just a few common scenarios:
- When advertising a product that relates to a specific device, such as Apple EarPods, there is little benefit in displaying the advert to users of non-Apple products.
- When pointer hover is available, a rich multi-layered navigation menu becomes possible, and retailers with a large product catalogue use hover extensively on laptop and desktop devices. However, such a menu would not work on a touch screen device. Serving multiple menus to all devices is expensive and degrades performance. Device detection provides the answer.
- If a device supports SMS text messaging, a mobile number can be captured with a few taps of the touch screen. Such a data capture scenario would mislead users and generate frustration if presented on devices that are not SMS enabled.
To understand more about the benefits of device detection in business terms see our product guide or ask us a question.
Performance Matters
Device detection only provides benefits if it is so fast that there is no noticeable overhead. It also needs to be accurate enough to correctly identify the device, browser and operating system. The number of device combinations (device, operating system and browser version pairings) is therefore an important measure of how comprehensive a device detection solution is. However, the larger the number of device combinations, the larger the data file, and the more data needs to be searched in a finite period of time. This blog post examines the performance characteristics of 51Degrees and provides technical guidance on how to measure device detection solutions for performance and accuracy.
The Method
To recreate the test yourself, follow these steps:
- Download the 51Degrees C distribution from GitHub. Unpack the archive or use Git to clone the repository.
- For our test we will be using Trie. You will need to make the following modification to achieve the best performance on your system. In the source file "src/trie/PerfTrie.c", set THREAD_COUNT equal to the number of virtual CPU cores available on your system, e.g. for a 4-core CPU with 8 virtual cores the ideal THREAD_COUNT is 8.
- Make sure that the makefile in the project root has the -O3 optimisation level enabled. The result is likely to be very different without the -O3 flag.
- Now run the relevant build file (on Windows) or use the terminal to compile with the makefile on UNIX (the makefile can also be used on Windows if you have GCC installed).
- Once compilation is complete you will see 4-5 programs in the root folder of the 51Degrees C distribution; the one we are testing with is PerfTrie. Execute PerfTrie as follows:
- Windows:
PerfTrie.exe TrieDataFile UserAgentsFile
- Linux:
./PerfTrie TrieDataFile UserAgentsFile
Where TrieDataFile is the path to the 51Degrees Trie device data file. A Lite version of this file is supplied with the detector, but this may be a good opportunity to try our Premium or Enterprise data file for free. The Premium and Enterprise data files are updated automatically on a weekly and daily basis respectively. Lite is updated less frequently and contains fewer properties and device combinations.
Where UserAgentsFile is the path to a CSV file containing the User-Agent strings to be matched. A list of 20,000 User-Agents is supplied with the detector. For more meaningful testing, a list containing a million User-Agent strings (million.zip) was previously offered on our website but is no longer available.
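For the THREAD_COUNT step above, the number of virtual (logical) cores can be checked before editing the source. The following is a minimal Python sketch and not part of the 51Degrees distribution. The function name `suggested_thread_count` is hypothetical, and the sketch assumes the operating system reports logical cores through `os.cpu_count()`.

```python
import os


def suggested_thread_count() -> int:
    """Return the number of logical CPU cores, falling back to 1.

    os.cpu_count() reports logical (virtual) cores, so a 4-core CPU
    with two hardware threads per core typically yields 8. It can
    return None when the count cannot be determined.
    """
    return os.cpu_count() or 1


if __name__ == "__main__":
    # A value to consider for THREAD_COUNT in src/trie/PerfTrie.c.
    print(suggested_thread_count())
```

The printed number is only a starting point; the best value for a given machine is worth confirming by running the benchmark with a few different settings.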
When evaluating performance for a specific website, real User-Agents captured from that environment are the best source of test data. 51Degrees considers the character positions of relevant sub strings, so it is important to remove spurious characters such as leading quotation marks or spaces before evaluating the data.
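A clean-up pass of this kind could look like the sketch below. It is an illustration only, assuming one User-Agent per line in the input file; the function names `clean_user_agent` and `clean_file` are hypothetical and not part of the 51Degrees distribution.

```python
def clean_user_agent(line: str) -> str:
    """Strip whitespace and quotation marks from both ends of a line."""
    return line.strip(" \t\r\n\"'")


def clean_file(src_path: str, dst_path: str) -> int:
    """Write cleaned, non-empty User-Agents to dst_path and return the count.

    Assumes one User-Agent per line in src_path.
    """
    written = 0
    with open(src_path, encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for line in src:
            user_agent = clean_user_agent(line)
            if user_agent:
                dst.write(user_agent + "\n")
                written += 1
    return written
```

Note that this also removes quotation marks at the end of a line, which would alter a User-Agent that legitimately ends with one. Logs exported with other delimiters or escaping rules would need a proper CSV parser instead.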
More information about the detection algorithm is available on this website.
Trie Algorithm Performance
Unless noted otherwise below, all of our tests were run with both the Lite and Enterprise data files, using the Trie algorithm for User-Agent detection and a sample data file containing 10 million User-Agents. No User-Agent caching is used.
Results
Platform 1
A Raspberry Pi model 3 - 1.2GHz 64-bit quad-core ARMv8 CPU and 1GB of RAM running Raspbian Jessie Lite. Price: $35
| | Lite Data File |
|---|---|
| Performance (detections per second) | 579,954 |
| Average time for a single detection per core (ms) | 0.007142 |
The Raspberry Pi was tested with 1 million User-Agents and the Lite data file. This was due to memory constraints where the Raspberry Pi's one gigabyte of RAM falls just short of the roughly 1.2 gigabytes needed to run the full tests. As the Trie algorithm is very memory intensive, using persistent storage swap space would drastically reduce performance. This test may be revisited if the Raspberry Pi Foundation ever release a Pi with two gigabytes of RAM or more.
A figure of half a million detections per second is fairly impressive but one could be reasonably certain that a Raspberry Pi could not serve that many web pages per second. Therefore, you could extrapolate that even on a Raspberry Pi, 51Degrees device detection offers minimal overhead.
Platform 2
A 3GHz Intel Core 2 Duo with 4GB of RAM, running Ubuntu 16.04 LTS. Price: $200
| | Lite | Enterprise |
|---|---|---|
| Performance (detections per second) | 1,938,015 | 1,099,428 |
| Average time for a single detection per core (ms) | 0.001032 | 0.001819 |
The Lite data file contains fewer device combinations, so the less popular User-Agents in the sample User-Agent data file are evaluated more quickly but with less accuracy. Measuring accuracy will be covered in a separate blog post.
Platform 3
Dual 2.2GHz Intel Xeon E5-2660 v2 10 Core CPUs with 160GB of RAM, running Windows Server 2012. Price: $3000
| | Lite | Enterprise |
|---|---|---|
| Performance (detections per second) | 15,617,415 | 7,843,143 |
| Average time for a single detection per core (ms) | 0.001280622 | 0.002549998 |
At roughly 0.0025 milliseconds per detection with the Enterprise data file and no caching, performance is suitable for even the most demanding top 10 web properties, load balancers and CDNs in the world.
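The per-core times reported for Platforms 2 and 3 are consistent with dividing the number of physical cores by the measured throughput. That is our reading of the figures rather than a documented definition, and the core counts used (2 for the Core 2 Duo, 20 for the dual 10-core Xeon) are assumptions drawn from the platform descriptions. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def per_core_detection_ms(detections_per_second: float, cores: int) -> float:
    """Average time one core spends on a single detection, in milliseconds.

    If `cores` cores together complete `detections_per_second` detections,
    each core completes detections_per_second / cores of them per second.
    """
    return cores / detections_per_second * 1000.0


if __name__ == "__main__":
    # Platform 2 (assumed 2 cores): Lite, then Enterprise.
    print(f"{per_core_detection_ms(1_938_015, 2):.6f}")
    print(f"{per_core_detection_ms(1_099_428, 2):.6f}")
    # Platform 3 (assumed 20 cores): Lite, then Enterprise.
    print(f"{per_core_detection_ms(15_617_415, 20):.9f}")
    print(f"{per_core_detection_ms(7_843_143, 20):.9f}")
```

The Raspberry Pi figure does not follow this relation exactly: four cores at 579,954 detections per second would give about 0.0069 ms rather than the reported 0.007142 ms, so that value may have been measured differently.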
Best Bang for Your Buck
The Raspberry Pi is a bit of a gimmick in this performance test, but it does represent the cheapest commodity hardware environment. In the real world it would not be suited to high intensity usage. On a more realistic platform, CPU throttling is much less likely to occur.
In practice, the choice of hardware for device detection is likely to have already been made when the application server or other equipment was selected. Increasingly, virtual machines in Platform / Infrastructure as a Service (P/IaaS) environments will be used, and these will demonstrate more variable performance due to factors beyond the device detection library. The tests here are designed to be easily deployed to any environment to facilitate the evaluation of device detection solutions within the environment they will be running in.