You may have noticed that your browser sends a user agent to every website you connect to. So what is a browser user agent, anyway?
A user agent is a line of text that identifies your browser and operating system to web servers. That sounds simple, but user agents have become a real mess over the years.
Whenever your browser connects to a site, it includes a User-Agent field in the HTTP request headers. The contents of this field differ from browser to browser, since each browser has its own distinctive user agent. Basically, a user agent is a way for your browser to say “Hello. I’m Mozilla Firefox on Windows” to the web server.
The web server can then use this information to serve different pages to different browsers and operating systems. For example, a site could send mobile pages to mobile browsers, modern pages to modern browsers, and a “Please upgrade your browser to see this page” message to anyone using IE 6.
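As a rough illustration of that kind of server-side decision, here is a minimal Python sketch. The function name choose_page, the page labels, and the keywords it checks ("MSIE 6." and "Mobi") are assumptions chosen for this example, not rules that any particular server follows.

```python
def choose_page(user_agent: str) -> str:
    """Pick a page variant by looking for keywords in the User-Agent string."""
    if "MSIE 6." in user_agent:
        return "upgrade"  # "Please upgrade your browser to see this page"
    if "Mobi" in user_agent:
        return "mobile"   # many mobile browsers include "Mobile" or "Mobi"
    return "modern"       # everything else gets the regular page


desktop_firefox = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0"
print(choose_page(desktop_firefox))
```

Real sites tend to use longer keyword lists or a dedicated parsing library, but the basic idea is the same: look for a word, then pick a page.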
Here is an example of a Firefox user agent on Windows 7:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0
This user agent tells the server a lot: the operating system is Windows 7 (which reports itself as Windows NT 6.1), it is a 64-bit version of Windows (WOW64), and the browser is Firefox 12.
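To make that breakdown concrete, here is a small Python sketch that pulls the same three facts out of the string. The function name parse_firefox_ua and the dictionary keys are made up for this example, and the regular expressions only handle user agents shaped like the one above.

```python
import re

UA = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0"


def parse_firefox_ua(ua: str) -> dict:
    """Extract the OS token, 64-bit hint, and Firefox version from a user agent."""
    # The first parenthesized group holds semicolon-separated platform details.
    match = re.search(r"\(([^)]*)\)", ua)
    platform = [part.strip() for part in match.group(1).split(";")] if match else []
    version = re.search(r"Firefox/([\d.]+)", ua)
    return {
        "os_token": platform[0] if platform else None,
        # WOW64 and Win64 both indicate a 64-bit edition of Windows.
        "is_64bit_windows": any(p in ("WOW64", "Win64") for p in platform),
        "firefox_version": version.group(1) if version else None,
    }


print(parse_firefox_ua(UA))
```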
User agent string mess
Mosaic was one of the first browsers, and its user agent string was NCSA_Mosaic/2.0. Later, Mozilla came along and was a more advanced browser than Mosaic because it supported frames. Web servers then checked whether the user agent contained Mozilla and sent pages with frames to Mozilla browsers. Other browsers received the old pages without frames.
Eventually, Internet Explorer came along and supported frames as well. However, Internet Explorer never received pages with frames because web servers only sent those to Mozilla browsers. To fix this, Microsoft added the word ‘Mozilla’ to its user agent and threw in other information, such as the word ‘Compatible.’ Web servers saw ‘Mozilla’ and sent modern web pages to Internet Explorer. Other browsers began to do the same thing as they were introduced.
Eventually, servers began looking for the word Gecko, the name of Firefox’s rendering engine, and served Gecko browsers different pages than older browsers. KHTML, originally created for Konqueror on Linux’s KDE desktop, added words like ‘like Gecko’ so that it would get the more modern pages as well. WebKit, which was based on KHTML, kept the KHTML reference along with ‘like Gecko’ as a compatibility line. So as you can see, browser developers kept adding words to their user agents over time.
Web servers do not really care what the exact user agent is; they just check whether it contains a specific word.
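The effect of all that keyword stuffing is easy to see with a short Python sketch. The keyword list and the helper name keywords_in are assumptions for illustration, and the two sample strings are typical of an old Internet Explorer and a WebKit-based Safari rather than taken from this article.

```python
# Words a naive server might look for (an assumed list for this example).
KEYWORDS = ("Mozilla", "Gecko", "KHTML", "MSIE")


def keywords_in(user_agent: str) -> list:
    """Return the keywords that appear anywhere in the user agent string."""
    return [word for word in KEYWORDS if word in user_agent]


# Illustrative sample strings.
IE6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
SAFARI = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/534.55.3 "
          "(KHTML, like Gecko) Version/5.1.3 Safari/534.55.3")

print(keywords_in(IE6))
print(keywords_in(SAFARI))
```

Neither browser is Mozilla, and Safari does not use Gecko, yet both strings pass a simple "does it contain this word" test.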
How user agents are used
Web servers use user agents for a variety of things, such as:
- Gathering statistics on the operating systems and browsers in use. If you ever see browser market share statistics, this is how they are acquired.
- Displaying different content to various operating systems.
- Serving various web pages to various web browsers. This can be used for good or evil.
Web crawling bots use user agents as well. Web servers can even give bots special treatment, for example by letting them skip mandatory registration screens. As a result, you can sometimes bypass registration screens simply by setting your user agent to Googlebot.
Web servers can also give orders to specific bots, or to all bots, using the robots.txt file. They can tell a certain bot to go away, or tell a bot to index only certain pages of a site. In the robots.txt file, bots are identified by their user agent strings.
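Here is a minimal sketch of how robots.txt rules are matched against a bot's user agent, using Python's standard urllib.robotparser module. The bot names BadBot and GoodBot, the example.com URLs, and the rules themselves are invented for this example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: one bot is told to go away entirely,
# and every other bot is kept out of /private/.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(bot: str, path: str) -> bool:
    """Check whether the named bot may fetch the given path."""
    return parser.can_fetch(bot, "https://example.com" + path)


print(allowed("BadBot", "/index.html"))
print(allowed("GoodBot", "/index.html"))
print(allowed("GoodBot", "/private/data.html"))
```

Keep in mind that robots.txt is only a request: a well-behaved bot checks it before crawling, but nothing in the file itself stops a bot that ignores it.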
All of the major browsers today have a way to set a custom user agent. That way, you can see what web servers send to different browsers.
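The same experiment can be done from a script. The sketch below uses Python's standard urllib.request module to build a request with a custom User-Agent header. The helper name build_request, the example.com URL, and the user agent value are placeholders, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_request(url: str, user_agent: str) -> Request:
    """Create an HTTP request that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request("https://example.com/", "ExampleBrowser/1.0 (hypothetical)")
# urllib stores header names with only the first letter capitalized.
print(req.get_header("User-agent"))
```

Passing req to urllib.request.urlopen would actually send it, and the server would then see the custom string in place of urllib's default user agent.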