Monday, February 02, 2009

OS Discussion: What have people talked about by TLD

I've talked about Google Trends before, and the Ubuntu vs. Windows Vista controversy (along with other match-ups) keeps coming up. Today I wanted to take another approach: not so much what people are looking for, but what has been said. I was inspired by an article on Slashdot about the Department of Defense setting up its own SourceForge-like site, which happens to reside at a .mil top-level domain (TLD). So I thought: if .mil sites are heavily regulated and organized, what is the trend in references to various operating systems there? Further, how closely does this match other 'special use' TLDs?

Because I really hate when people don't put the raw data or method for data collection with a study, here is the code I used. Sure, I could have done this by hand, but Linux, for me, is all about making things easier.

note: The reason for the sleep 1s; is because of this

echo -e "Search results for OS's by TLD\n\n"; for j in '+site%3A.edu' '+site%3A.gov' '+site%3A.mil' '+site%3A.org' '+site%3A.net' '+site%3A.com'; do echo -e "\nMatching terms for $j"; for i in 'Microsoft' 'Windows' '"Microsoft Windows"' 'IBM' 'Apple' 'Unix' 'Linux' '"Red Hat"' 'Solaris' 'AIX' 'Novell' '"Sun Microsystems"' 'OSX' 'Fedora' 'Suse' 'FreeBSD' 'NetBSD' 'OpenBSD' 'Ubuntu' '"Windows 3"' '"Windows 95"' '"Windows 98"' '"Windows NT"' '"Windows 2000"' '"Windows XP"' '"Windows Vista"' '"Windows 7"' '"Windows Server"'; do sleep 1s; echo -en "$i\t\t"; lynx "http://www.google.com/search?hl=en&q=$i$j&btnG=Search" -useragent="Mozilla/5.0 Lynx" -dump | grep Results | sed -e 's/^.* of about \([0-9,]*\)\ .*$/\1/' | head -n 1; done; done | tee TLDresults.txt

Should I have formatted it and saved it as a script? Probably, but that wasn't how it was done. :) I love the terminal. Ooh, and run at your own risk. I got multiple computers banned testing this script.
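For anyone curious what the grep/sed tail of that command is doing, here is a minimal sketch of just the extraction step. The sample line is made up to resemble the "Results ... of about N for ..." text the pattern expects (it is not captured output), and the final tr step that strips the commas is an extra that is not part of the original command.

```shell
# Hypothetical sample line, shaped like the text the sed pattern expects.
# In the real run this text came from `lynx -dump` of a search results page.
sample='   Results 1 - 10 of about 7,420,000 for Microsoft site:.edu. (0.21 seconds)'

# Same steps as the one-liner: keep lines mentioning "Results",
# capture the number after "of about", and keep only the first match.
count=$(printf '%s\n' "$sample" \
  | grep Results \
  | sed -e 's/^.* of about \([0-9,]*\) .*$/\1/' \
  | head -n 1)

# Extra step (not in the original command): drop the thousands separators.
plain=$(printf '%s\n' "$count" | tr -d ',')

echo "$count"
echo "$plain"
```

If the page wording changes, grep finds nothing and the count comes back empty, so the pattern is only as good as the assumed line format.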

Anyway, here are the results I got (reformatted):


Company / OS           .edu     .gov   .mil   .org*  .net*   .com*
Microsoft           7420000  1320000  64700   64000  30100  548000
Windows            11000000  1620000  65900  171000  48500  893000
Microsoft Windows    596000    84600   5190    4650   3170   59100
IBM                 6460000  1420000  21500   40400   5950  185000
Apple               1920000   606000  11500   40600  15200  318000
Unix                7350000   775000  10300   25100  12000   68400
Linux               2130000   693000   5450  104000  51600  254000
Red Hat              796000   201000   2620    3960   1660   17000
Solaris              612000    68700   2560   13400   2320   21600
AIX                  328000    68100   2480    4790   1680   15100
Novell               144000    20600   1190    2170   1050   12200
Sun Microsystems     225000    29000   2240    4180    731   16600
OSX                  885000   142000   7090   19300   7440   85500
Fedora               788000    21900    680    6810   3620   13800
Suse                 283000    19200    359    4630   1880    9230
FreeBSD              356000     9770    177   10400   2420    9910
NetBSD                46500     2520    136    3270    321    1890
OpenBSD               28300     2450    102    2080    437    2430
Ubuntu               486000    29100     49   14200   9340   44100
Windows 3             57500     3410    271     108    165    2290
Windows 98            83800    16000   1260    2790   1440   46400
Windows 2000         231000    45700   3690    3660   1880   39100
Windows XP          1450000    51400   3450   12600   7590  144000
Windows NT           390000    39700   4010    4810   1700   22900
Windows Vista        296000     8880    905    5830   5440  117000
Windows 7             15500     1150    134    1440   3550   61700
Windows Server        54800     8130   1040    1890   1970   29900
(* in thousands)

note: eek, formatting didn't come out as expected. Will fix soon.
update: ok, so my html sucks, but the table is easier to look at than before.

Unfortunately, I have run out of time to make any remarks on the trends, but I see some interesting relationships. Will comment further tomorrow.
