"Linux gives you more control over your computer."
Stop it.
Clichés like these are vague and wishy-washy,
and they are founded on anecdotes and hearsay.
They cause endless, unnecessary debates and make a muddle of the facts.
It's easy to opine about one's preferred operating system,
but harder to give objective, concrete examples.
With the caveat that both Windows and Linux are moving targets,
this document describes some specific technical reasons
to prefer using Linux as a desktop operating system.
These reasons are not exhaustive—and not meant to be—but aim to be representative.
This document will not cover servers, phones, or embedded devices.
This document will not cover closed vs. open source development,
but will instead focus on functionality.
There is plenty of discussion
of the advantages and disadvantages of open source elsewhere.
This discussion will only mention Microsoft and other companies
in so far as their actions are directly relevant
to the technical capabilities of Windows and Linux.
(As an aside, Microsoft gets a lot of guff in the open-source world,
but its behavior is typical for a corporation
whose bottom line relies on sales of proprietary software and devices.
It's economics, not malice.)
The discussion is intended to be as accurate as possible,
at the cost of possible dryness due to technical detail.
I am most familiar with the Debian-based family of Linux distributions,
so my remarks will necessarily touch on these more,
but I have tried to include other distributions when possible.
In this document, the term "Linux" is shorthand for the entire distribution,
including bootloader, kernel, shell, window manager, package manager, etc.
Similarly, the term "Windows"
refers to all default components of modern versions of Microsoft Windows NT,
including Windows XP, Windows Vista, Windows 7, and Windows 8.
Many of the same arguments in favor of Linux
also apply to the BSD family of operating systems
(and POSIX-compliant operating systems in general),
but unfortunately I am not familiar enough with any of them
to comment specifically.
Most people use Windows on the desktop because it's the default.
Few are aware of the benefits of switching to another operating system,
and even fewer are willing to put in the effort to do so.
A Windows user interested in trying Linux
will probably have difficulty finding a coherent reason to do so,
since comparisons of operating systems
tend to be vague, uninformed, or opinion-based.
Even people who know and use Linux by choice
may not do a good job of explaining its benefits to their colleagues,
especially without putting down Windows users
or Windows applications in general.
Also, there are many open source alternatives to Linux on the desktop,
including a binary-compatible clone of Windows called ReactOS.
If it were just a matter of being open source,
why bother with the additional effort to learn Linux?
Even if you don't use Linux or Windows,
it's useful to know where Linux has an edge,
since these issues are relevant to all operating systems.
If you are a new Linux user,
this document is intended to inform you
about some of the benefits of Linux you may not be aware of,
and to give you starting points for digging deeper if you are interested.
If you are an experienced Linux user,
this document is a test of the theory that the fastest way to get feedback
is to be publicly wrong about something people care about.
Corrections and additions are welcome.
If you are a Windows user:
This document is not intended to convert you to Linux.
(That would be silly.)
This document does not claim that Windows is inferior in every way,
or even that it is inferior overall.
Instead, this is meant to provide insight
into why some people choose to use Linux as a desktop operating system,
despite its shortcomings,
and possibly to challenge some misconceptions
that people have about Linux and Windows.
Corrections and additions are, of course, welcome.
After all, Windows developers are the ones who know the most about its flaws and strengths.
Finally, definitions of better and worse are necessarily subjective,
despite the title's claim of objectivity.
You may heartily disagree with substantial parts of what follows,
but perhaps it may be useful to you, even so.
Windows LiveCDs, though they do exist,
are hampered by licensing restrictions and technical limitations.
For example, until Windows 8, desktop versions of Windows
could not boot from a USB drive.
(And while running a live USB of Windows 8,
it is still not possible to mount internal hard disks.)
There is also the WinBuilder project,
which comes closest to a fully functional LiveCD
of modern Windows versions,
but installing software and drivers is still sometimes a challenge.
If the Virtual Machine fails don’t worry too much. Just because the Virtual
Machine fails to boot right does not mean your boot media won’t work, I’ve
seen odd results depending on the amount of memory the VM has and what
drivers I load.
The absence of fully functional live versions of Windows
makes it difficult to use for, e.g.,
determining if a bug is due to hardware or software problems,
recovering data from a machine with filesystem corruption or bad disk sectors,
and testing out different versions of an OS
without making a new hard drive partition.
Live versions of Linux are full operating systems,
able to mount and repartition disks,
connect to the internet and run a web browser,
and even retain settings and data on the next boot-up
(for persistent live USB flash drives).
This makes live versions of Linux useful for
recovering files from damaged hard drives,
making bootable backups of an entire drive,
scanning a disk for malware
without loading a potentially compromised operating system,
distinguishing hardware problems from software problems,
and other tasks requiring a temporary operating system.
Some live Linux distributions, such as Puppy Linux,
are lightweight enough that they default to running from a RAM disk,
and consequently have much faster disk I/O
than an OS that must access a spinning hard drive.
(This comes at the cost of disk space being limited by RAM.
There's no reason you can't mount an internal or external drive
to store files, though.)
Very little hardware comes with a desktop version of Linux pre-installed,
so live versions of Linux tend to work very well:
booting from live media is almost always how Linux gets installed.
Similar to live booting,
Linux is often run as a virtual machine,
and consequently it is well-adapted to changes in hardware.
An existing Linux partition on a physical hard drive
can, with some care, be virtualized and run on another machine,
a virtue which Windows does not share.
Windows installations, unlike Linux, cannot easily be moved from one
hardware to another. This is not just due to Microsoft's activation
mechanism but the fact that the installed kernel and drivers depend on the
actual hardware.
The problem lies with Windows, in that its driver settings, particularly
for storage devices, are not portable. Unless you modify the Windows
registry to force start storage drivers for both the physical and virtual
machines, you will mostly likely end up with a 0x0000007B STOP blue
screen error each time which will require a restore or modifying the
registry to fix.
It's even possible to transfer a Linux install to a USB enclosure
and boot it directly on another machine of the same architecture,
although the kernel will lack proprietary drivers (e.g. some wifi cards).
But the bigger issue
is that many Windows developers are so used to working around the problem
that it has become deeply entrenched and may never be fixed.
The 2008 POSIX revision addressed the issue
by allowing realpath() to allocate its own result buffer,
but prior to this the Linux implementation
had to make non-standard modifications to avoid buffer overflows,
and warned about the problem in the realpath(3) man page
of the Linux Programmer's Manual.
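For instance, with a POSIX.1-2008 realpath(),
a program can let the C library size the result buffer itself
instead of trusting a caller-supplied PATH_MAX array.
A minimal sketch:

    #define _XOPEN_SOURCE 700
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const char *path = (argc > 1) ? argv[1] : ".";

        /* POSIX.1-2008 allows the second argument to be NULL,
         * in which case realpath() allocates a buffer of exactly
         * the right size, sidestepping the question of how big a
         * PATH_MAX buffer must be on filesystems with no fixed
         * path length limit. */
        char *resolved = realpath(path, NULL);
        if (resolved == NULL) {
            perror("realpath");
            return EXIT_FAILURE;
        }
        printf("%s\n", resolved);
        free(resolved);
        return EXIT_SUCCESS;
    }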
This illustrates that while the Linux kernel developers
scrupulously avoid breaking external compatibility,
they also intentionally expose false assumptions,
since false assumptions tend to cause hard-to-fix bugs.
This is why Linus Torvalds
chose an unusually high timer interrupt frequency for Linux:
I chose 1000 originally partly as a way to make sure that people that
assumed HZ was 100 would get a swift kick in the pants. That meant making
a _big_ change, not a small subtle one. For example, people tend to react
if "uptime" suddenly says the machine has been up for a hundred days (even
if it's really only been up for ten), but if it is off by just a factor of
two, it might be overlooked.
—Linus Torvalds, Selectable Frequency of the Timer Interrupt (2005)
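The arithmetic behind that swift kick is simple.
A sketch of how a hardcoded HZ of 100
turns ten days of uptime into a hundred
(the constants here are illustrative):

    #include <stdio.h>

    int main(void)
    {
        /* Ten days of uptime, counted by a kernel ticking at 1000 Hz. */
        const long hz = 1000;
        const long jiffies = 10L * 86400 * hz;

        /* A program that hardcodes the old assumption of HZ == 100
         * reports a hundred days instead of ten. */
        printf("assuming HZ=100: %ld days\n", jiffies / 100 / 86400);
        printf("actual:          %ld days\n", jiffies / hz / 86400);
        return 0;
    }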
Linux uses case-sensitive filenames
because Unix used case-sensitive filenames.
Unix was case-sensitive because Multics was case-sensitive.
Multics was case-sensitive because the ASCII standard
included both an uppercase and a lowercase alphabet. [1]
Why did ASCII do this?
It was a close call, and almost didn't happen.
Telegraphy codes used uppercase only,
or at least did not distinguish upper and lowercase.
Even ITA2, an international standard from 1930,
used a 5-bit code with a shift to switch between letters and figures,
but not upper and lowercase. [2]
Similarly, punched cards used uppercase letters only.
Encodings with different bit patterns for uppercase and lowercase
had been proposed as early as 1959, [3]
though they were not widely implemented.
For example, the IBM 7030 "Stretch" supercomputer,
first installed at Los Alamos National Laboratory in 1961,
had an 8-bit encoding that interleaved uppercase and lowercase alphabets.
[4]
However, the 7030's character encoding did not catch on.
Early on, the ASCII committee concluded that 6-bit encodings (64 bit patterns)
were insufficient to include both control characters and special characters
in addition to the required 26 alphabetics and 10 numerics,
so they decided to use a 7-bit code.
However, ASCII was designed to include a useful 6-bit subset,
which could only fit a single alphabet.
The consideration of a 6-bit, 64-character graphic subset was important to
the standards committee. If the ultimate decision was that columns 6 and 7
would be for graphics, then columns 2 through 7 would contain Space, 94
graphics, and Delete. But, even with the code providing 94 graphics, a
major assumption of the standards committee was that data processing
applications would, for the foreseeable future, be satisfied with a
monocase alphabet (that is, a 64- or less graphic subset) as they had in
the past---that 64-character printers would predominate. So it was
important to be able to derive a 64-character, monocase alphabet, graphic
subset from the code by simple, not complex, logic.
—Charles E. Mackenzie, "Coded character sets: history and development"
(1980), p.228
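The code as adopted satisfies this requirement:
the uppercase and lowercase alphabets sit exactly 0x20 apart,
so folding to a monocase subset takes only a single bit of logic.
A small illustration:

    #include <stdio.h>

    int main(void)
    {
        /* In ASCII, clearing one bit maps each lowercase letter
         * onto its uppercase counterpart. */
        for (int c = 'a'; c <= 'z'; c++)
            putchar(c & ~0x20);   /* prints A..Z */
        putchar('\n');
        printf("'A' = 0x%02X, 'a' = 0x%02X, difference = 0x%02X\n",
               'A', 'a', 'a' - 'A');
        return 0;
    }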
In fact, some of the committee members
wanted to reserve the remaining space for control characters.
The conclusion of the preceding paragraph is based on the assumption
that two alphabets, small letters and capital letters, would be included
in the 7-bit code and that decision had not yet been made. If the
decision was ultimately made that columns 6 and 7 would contain
controls, then small letters would not be included in the 7-bit code. *
* If the committee did decide for controls in columns 6 and 7, it is
still likely that they would have wanted an alphabet of small letters to be
provided. Presumably, the small letter alphabet would then have been
provided by a caseshift approach.
—Ibid, p.232
Though the committee first formed in 1961,
it wasn't until late 1963
that they finally agreed to include a lowercase alphabet,
largely because of the influence of the
International Telegraph and Telephone Consultative Committee (CCITT).
At the first meeting of ISO/TC97/SC2 in 1963 October 29-31, a resolution
was passed that the lower-case alphabet should be assigned to
columns 6 and 7.
—Ibid, p. 246
The ISO proposal, though, did not include the lower case alphabet and the
five accent marks that the CCITT considered essential.
—Eric Fisher, "The Evolution of Character Codes, 1874-1968", p.22
Why is it useful for filenames to include upper and lowercase?
It can make filenames more intelligible,
such as distinguishing between
the abbreviation for the United States ("US")
and the first-person plural objective pronoun ("us")
in paths such as /usr/share/X11/locale/en_US.UTF-8/.
It also allows more possibilities for filenames,
and makes filename comparisons simpler and faster,
because names can be compared byte for byte
with no case conversion.
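The difference is easy to see in code:
a case-sensitive comparison is a plain byte-for-byte comparison,
while a case-insensitive one must fold every character
through a case table.
A toy sketch (note that this naive ASCII fold is already wrong
for most alphabets beyond ASCII):

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Case-sensitive: exactly what strcmp() does. */
    static int name_cmp(const char *a, const char *b)
    {
        return strcmp(a, b);
    }

    /* Case-insensitive: every byte is folded before comparing. */
    static int name_casecmp(const char *a, const char *b)
    {
        while (*a && tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
            a++;
            b++;
        }
        return tolower((unsigned char)*a) - tolower((unsigned char)*b);
    }

    int main(void)
    {
        printf("%d\n", name_cmp("MYFILE.TXT", "Myfile.txt"));     /* nonzero */
        printf("%d\n", name_casecmp("MYFILE.TXT", "Myfile.txt")); /* zero */
        return 0;
    }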
Bear in mind that it's MUCH more work for a filesystem to be
case-insensitive than -sensitive. A filesystem is case-sensitive by
default, in the simplest case; it can only be made case-INsensitive through
a lot of extra engineering. In UNIX, all the system has to do is sort on
the ASCII values of the first letters of the filenames. In the Mac OS and
Windows, the filesystem has to be smart enough to create synonyms of
various letters — A for a, and so on — and sort accordingly. That takes a
LOT of code. It's a testament to the completeness of the original Mac OS
that in 1984 this was all handled properly, before Windows even brought
lower-case letters to the PC side.
However, there is also no shortage of opinions
that enforcing filename case-sensitivity
-- and even case-sensitivity in general --
was a bad decision. [5]
There are also passionate views
to the opposite effect. [6]
Laying aside that argument for the moment,
why did Windows filenames end up case-insensitive?
Windows' own NTFS filesystem, as it happens,
supports case-sensitive names and is case-preserving;
it is the Windows API that treats names case-insensitively by default.
This means that it is possible to mount an NTFS partition with Linux
and make a file called "Myfile.txt" in the same directory as "MYFILE.TXT",
but it will not be possible to read or modify both of those files,
at least not with standard Windows software.
This API behavior exists to maintain compatibility with MS-DOS filesystems. [7]
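On Linux, the two names are simply different byte strings,
which is easy to demonstrate.
A sketch (run it in a scratch directory;
on a case-insensitive filesystem
the second O_EXCL create fails with EEXIST instead):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* To a case-sensitive kernel these are two unrelated names. */
        const char *names[] = { "MYFILE.TXT", "Myfile.txt" };

        for (int i = 0; i < 2; i++) {
            int fd = open(names[i], O_CREAT | O_EXCL | O_WRONLY, 0644);
            if (fd < 0) {
                perror(names[i]);
                continue;
            }
            dprintf(fd, "file %d\n", i);
            close(fd);
        }
        return 0;
    }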
MS-DOS was built on Tim Paterson's 86-DOS (released in 1980)
and Marc McDonald's FAT filesystem,
which were designed for compatibility with CP/M. [8][9]
CP/M was created in 1973 by Gary Kildall,
and also used case-insensitive filenames. [12]
Lower case ASCII alphabetics are internally translated to upper
case to be consistent with CP/M file and device name conventions.
The CP/M manual does not state explicitly why it uses these conventions,
but Gary Kildall wrote CP/M on a DEC PDP-10 mainframe
running the TOPS-10 operating system
when he was working at Intel. [10]
Consequently, there are many similarities between CP/M and TOPS-10,
including filename case-insensitivity.
(It should be noted that CP/M has also been compared to RT-11,
a DEC operating system for the PDP-11 minicomputer
that is closely related to TOPS-10, [11]
although the influence may not have been as direct.)
Why did TOPS-10 use case-insensitive names?
Because the DEC SIXBIT encoding used for filenames
was optimized for its architecture.
RAD50 was used in FILES-11 and RT-11 disks. It was used to store 3
characters in a 16 bit word. SIXBIT was used on TOPS-10 36bit systems to
store 6 characters in a word. It also allowed for a fast file name search
since the names were all on word boundaries (full filename compair took 2
compair, and 1 mask operation 6+3 file names).
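In code, the optimization described above looks something like this
(a sketch; the PDP-10 packed and compared such words
in single instructions).
Note that the encoding simply has no lowercase letters,
so case-insensitivity falls out by construction:

    #include <stdint.h>
    #include <stdio.h>

    /* DEC SIXBIT: printable ASCII 0x20..0x5F minus 0x20 yields a
     * 6-bit code, so six characters fill one 36-bit PDP-10 word
     * and a whole filename compares as one or two integers. */
    static uint64_t sixbit_pack(const char *s)
    {
        uint64_t w = 0;
        for (int i = 0; i < 6; i++)
            w = (w << 6) | (uint64_t)((s[i] - 0x20) & 077);
        return w;   /* only the low 36 bits are used */
    }

    int main(void)
    {
        printf("%s\n",
               sixbit_pack("LOGIN ") == sixbit_pack("LOGIN ")
                   ? "equal" : "different");
        printf("%012llo\n", (unsigned long long)sixbit_pack("LOGIN "));
        return 0;
    }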
(CP/M was written for an eight-bit architecture,
which is presumably why it used an 8.3 filename
instead of a 6.3 filename.) [13]
Similarly, RT-11 didn't use ASCII for filenames,
but rather an encoding called RADIX-50,
which helped to save memory. [14]
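RADIX-50 relies on a similar trick.
A sketch of the packing arithmetic,
using the RT-11 variant of the 40-character table
(40^3 = 64000, so three characters fit in a 16-bit word):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* 40 characters, which is 50 in octal, hence the name.
     * (Index 29 is '%' on RT-11; some variants leave it unused.) */
    static const char rad50[] =
        " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

    /* Pack three characters into one 16-bit word.  The caller
     * must supply exactly three characters from the table. */
    static uint16_t rad50_pack(const char *s)
    {
        uint16_t w = 0;
        for (int i = 0; i < 3; i++) {
            const char *p = strchr(rad50, s[i]);
            w = w * 40 + (uint16_t)(p ? p - rad50 : 0);
        }
        return w;
    }

    int main(void)
    {
        /* A 6.3 filename needs only three words on disk:
         * two for the name, one for the extension. */
        printf("%u %u %u\n",
               rad50_pack("SWA"), rad50_pack("P  "), rad50_pack("SYS"));
        return 0;
    }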
Neither of these encodings is used much anymore,
but their case-insensitivity,
a useful optimization on 1970s hardware,
endures to this day.
The lack of agreement on filename case-sensitivity may seem insignificant,
but it has caused persistent difficulties
in cross-platform development. [15][16][17]
Unity does not properly run on a case-sensitive file system (which is
something that Unity users have discovered if they’ve tried to install and
run Unity on a case-sensitive HFS+ file system). This is primarily due to
Unity’s asset database, and how it stores paths to map them to GUID values.
Of course we tried to be smart in the early days, but if you don’t set up a
way to actually verify that what you’re doing works on a case-sensitive
file system, then it will never fail that some well-intentioned programmer
throws a toLower() in somewhere and ruins the party.
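The failure mode they describe is easy to reproduce in miniature:
fold a stored path to lowercase,
and lookups keep working on a case-insensitive filesystem
while quietly breaking on a case-sensitive one.
A contrived sketch:

    #include <ctype.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* The "well-intentioned toLower()": normalize a path before
     * storing it in an asset database. */
    static void to_lower(char *s)
    {
        for (; *s; s++)
            *s = (char)tolower((unsigned char)*s);
    }

    int main(void)
    {
        const char *on_disk = "Grass.png";  /* the real file name */
        close(open(on_disk, O_CREAT | O_WRONLY, 0644));

        char stored[32];
        strcpy(stored, on_disk);
        to_lower(stored);  /* the database now says "grass.png" */

        /* Case-insensitive filesystem: still found.
         * Case-sensitive filesystem: reported missing. */
        printf("%s: %s\n", stored,
               access(stored, F_OK) == 0 ? "found" : "missing");
        return 0;
    }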
Everything in Multics is case sensitive; Multics permits use of the full
upper and lower case ASCII character set.
Multics command names and programming languages use lowercase by
convention, but users are free to use uppercase letters in path names,
identifiers, user names, etc.
Obviously, BCD had no lower-case characters, and Multics did not use BCD
at all, except to output log and crash and tape mount messages from ring
0 to the primitive Selectric operator's console.
Since the Multics file system distinguished between upper and lower case,
external names had to be case sensitive, and without much discussion we
chose to have all variable names be case sensitive.
Mac & Windows users have to have filenames read to them over the phone
by support techs. They have to be able to write little sticky notes to
their mothers about how to open up the mail program, without worrying
about how the filenames are capitalized. Haven't you ever fumed over a
URL with initial-caps in the folder names in the path, having to fiddle
with capitalization until you get a response that's anything but a 404?
Haven't you ever been secretly pleased that e-mail addresses aren't
case-sensitive?
—Brian Tiemann, On Unix File System's Case Sensitivity (2001)
Anecdotally, case sensitivity in programs is known to be error-prone for
both beginners and experienced users. Bob Frankston, a Multics alumnus
and the co-inventor of VisiCalc, once said it was the biggest mistake
that Multics had inflicted on the world.
—Stavros Macrakis (2003)
(The source of this quotation is no longer online.)
One of the most pernicious problems with C-based languages is that
they're case-sensitive. While this decision may have made sense in 1972
when the language was created, one wonders why the sins of Kernighan and
Ritchie have been blindly perpetuated for the last thirty-three years.
[ . . . ]
Unless you have extremely compelling reasons to make something
case-sensitive, case insensitivity is a much more human being friendly
design choice. Designing software that's easier for machines is
questionable at best.
—Jeff Atwood, The Case For Case Insensitivity (2005)
There is no longer any excuse for making humans learn and handle the
quirks of the way computers store upper- and lower-case characters.
Instead, software should handle the quirks of human language.
—Brian Hauer, Case-sensitivity is the past trolling us (2014)
Since it appears to have arisen out of opinion rather than necessity,
it could be said that case-sensitivity is the worst way
that modern technology sucks.
Many of us consider those filesystems which cannot preserve case, but
which accept "input" in random case, to be so utterly broken as to be
undeserving of any attention whatsoever. They create a situation where
the computer effectively considers the users to be too stupid or blind
or whatever to be able to say what we mean accurately.
Why are computer file names and conventions and protocols so messed up?
It's bizarre -- and Microsoft has been one of the worst offenders with
one of the most powerful positions and opportunities to make it a better
filename-naming world.
[ . . . ]
And, Microsoft dares to allow mixed case naming, but does case
insensitive handling of file names... don't even get me started about
some of the bizarre results and buggy behavior I've traced to that. I
only wish I'd had a chargeback code for all of the time I've spent
fixing and debugging systems that all come back to the file naming.
Sigh, again.
The old DOS/Mac people thought case insensitivity was a "helpful"
idea, and that was understandable - but wrong - even back in the 80's.
They are still living with the end result of that horrendously bad
decision decades later. They've _tried_ to fix their bad decisions,
and have never been able to (except, apparently, in iOS where somebody
finally had a glimmer of a clue).
Do not assume case sensitivity. For example, consider the names OSCAR,
Oscar, and oscar to be the same, even though some file systems (such as
a POSIX-compliant file system) may consider them as different. Note that
NTFS supports POSIX semantics for case sensitivity but this is not the
default behavior.
Every operating system has basic functions like reading and writing disk
files. The API defines the exact details of how to make it happen and
what the results are. For example, to “open” a file in preparation for
reading or writing, the application would pass the location of an
11-character file name and the function code 15 to CP/M through the
“Call 5” mechanism. The very same sequence would also open a file in
DOS, while, say, UNIX, did not use function code 15, 11-character file
names, or “Call 5” to open a file.
The FAT file system's restrictions on naming files and directories are
inherited from CP/M. When Paterson was writing 86-DOS one of his primary
objectives was to make programs easy to port from CP/M to his new
operating system. He therefore adopted CP/M's limits on filenames and
extensions so the critical fields of 86-DOS File Control Blocks (FCBs)
would look almost exactly like those of CP/M. The sizes of the FCB
filename and extension fields were also propagated into the structure of
disk directory entries.
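The layout that made porting easy is visible in the FCB itself.
A sketch of its critical fields
(the field names here are mine, not the original labels):

    #include <stdint.h>

    /* The start of a CP/M-style File Control Block.  The name and
     * extension are stored blank-padded and in upper case, with no
     * dot: "MYFILE  TXT".  86-DOS kept these offsets, so CP/M
     * programs were easy to port. */
    struct fcb {
        uint8_t drive;    /* 0 = default, 1 = A:, 2 = B:, ... */
        uint8_t name[8];  /* blank-padded filename */
        uint8_t ext[3];   /* blank-padded extension */
        /* ...allocation and record-pointer fields follow... */
    };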
Gary Kildall developed CP/M on a DEC PDP-10 minicomputer running the
TOPS-10 operating system. Not surprisingly, most CP/M commands and file
naming conventions look and operate like their TOPS-10-counterparts. It
wasn’t pretty, but it did the job.
CP/M and ISIS in operation have some general similarities to interactive
operating systems on minicomputers and mainframes such as the DEC PDP-10
"TOPS-10" OS. Kildall used such systems to develop and run his
cross-assemblers and compilers, which became Intel products; and later
to develop his own products which ran "native" on CP/M systems.
—Herbert R. Johnson, CP/M and Digital Research Inc. (DRI) History
Kildall said that PL/M was ‘‘the base for CP/M,’’ even though the
commands were clearly derived from Digital’s, not IBM’s software. For
example, specifying the drive in use by a letter; giving file names a
period and three-character extension; and using the DIR (Directory)
command, PIP, and DDT were DEC features carried over without change. [100]
[ . . . ]
99. Gary Kildall, ‘‘CP/M: A Family of 8- and 16-Bit Operating Systems,’’
Byte, (June 1981): 216–229. Because of the differences between DEC
minicomputers and the 8080 microprocessor, the actual code of CP/M was
different and wholly original, even if the syntax and vocabulary were
similar.
100. The above argument is based on PDP-10 and CP/M manuals in the
author’s possession, as well as conversations with Kip Crosby, to whom I
am grateful for posting this question over an Internet discussion forum.
—Paul E. Ceruzzi, "A History of Modern Computing", 2nd ed. (MIT Press, 2003), p. 238
Of course, CP/M itself is an almost exact knock off of DECs PDP-11 OS,
RT-11, an operating system that dates back to the early seventies, and
RT-11 shows its roots in TOPS-10, which goes back another year or two.
For some reason, all the historians tracing the source of MS-DOS
mysteriously stop at CP/M, even when command sets and utility syntaxes
are compared side-by-side. Who had a PIP utility first? Why, DEC, not
Digital Research.
The joke in the seventies that "Digital Research" was a typographical
error and the companies real name was "Digital [Equipment Corporation]
Rehashed", for RT-11, TOPS-10 and RSTS/E all predated CP/M by a lot and
yet have the same command syntax.
From a post on the alt.folklore.computers Usenet group:
Maybe we do need Kildall for the next step, but when I saw CP/M
version 1 it appeared closest to a dialect of RT-11, so I've always
figured that RT-11 was the closest ancestor. After that, it began
to drift. If I recall correctly, V1's prompt was the DECcish ".",
but in V2 it became "> ". Therefore, it would appear that MS-DOS
got its start from CP/M V2. It's a pity MS-DOS didn't start from
RT-11, which had multitasking, interrupt driven I/O, and all the
other good stuff that is easy to fit in a well designed 8KB kernel.
It should also be noted that all alphabetic lower case letters in file
and drive names are always translated to upper case when they are
processed by the CCP [Console Command Processor].
[ . . . ]
Further, recall that the CCP always translates lower case characters to
upper case characters internally. Thus, lower case alphabetics are
treated as if they are upper case in command names and file references
As for the 8.3, look at the format of a CP/M directory entry. 16
bytes so they fill a disk block, not RAD50, 8 bytes for name, 3 for
extension, and I forget the rest, but it includes pointers to the
data.
... files were located via the directory, which resided in a fixed
location at the beginning of the hard drive. The directory consisted of
a single array of entries, each with a 6.3 character file name formatted
in DEC’s Radix-50 format. A file’s directory entry indicated the address
of the first block of the file.
RADIX50 is a character coding system used in earlier Digital Equipment
Corporation computers, such as the PDP-10, DECsystem-10 and
DECsystem-20. It was implemented as a way to pack as many characters
into as few bits as possible.
RADIX50 actually contains 40 codes, or 50 in octal. Because this is not
a power of two, the PDP-10 processor had instructions to pack several
RADIX-50 words into a single 36-bit word or extract RADIX-50 words from
a 36-bit word.
One problem is that the file-system NTFS, that is used by most modern
Windows Versions, is (by default) only case-preserving (hello.c and
Hello.C are the same file, when in the same folder). The
OpenFOAM-sources need a fully case-sensitive file-system and can't even
be unpacked properly on a Windows system (see [2]).
Issues of alphabetic case in pathnames are a major source of problems.
In some file systems, the customary case is lowercase, in some
uppercase, in some mixed. Some file systems are case-sensitive (that is,
they treat FOO and foo as different file names) and others are not.
In Linux and other Unix-derived operating systems,
the only characters that may not appear
in the name of a file or directory [21]
are the slash /
(which is used to delimit paths)
and the ASCII null \0
(which is used to terminate strings in C). [22]
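This is easy to verify:
names containing spaces, newlines, or control characters
are all accepted by the kernel.
A small demonstration (run it in a scratch directory):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* A newline, a tab, and a bell character are all legal
         * in a Linux filename; only '/' and '\0' are not. */
        const char *odd = "line one\nline two\t\a.txt";

        int fd = open(odd, O_CREAT | O_WRONLY, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        close(fd);

        /* "foo/bar", by contrast, would be interpreted as a path;
         * there is no way to escape the slash. */
        return 0;
    }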