Binary and text files
Computer files can be divided into two broad categories: binary and text. The distinction is subtle because, at the lowest level, every file is a sequence of bits. For instance, to the circuits that handle information read from or written to a disk, there is no distinction between text data and any other sort. The software that drives those circuits likewise makes no such distinction. Humans, on the other hand, do care about it.
Text files (i.e., plain text files) are files in which the bytes generally correspond one-to-one to ordinary readable characters such as letters and digits. Any simple program for viewing a file therefore displays them in human-readable form. Generally, they contain ASCII characters and some control characters such as tabs, line feeds and carriage returns, without any embedded information such as font information, hyperlinks or inline images. Text files can also contain characters outside ASCII if they use an East Asian encoding such as Shift JIS (SJIS) or Unicode. If a file is written in Unicode, an encoding form such as UTF-8 defines how its characters are stored as bytes; in such encodings a single character may occupy more than one byte.
Although text files are generally human-readable, they can of course be used for data storage by computer programs. This may be done because text files avoid problems which may arise with binary files, such as problems of endianness or the byte-length of integers.
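The endianness problem mentioned above can be illustrated with a minimal Python sketch. The value 1025 and the 4-byte unsigned integer layout are arbitrary choices made for this illustration only; the point is that the same number has two different binary layouts but a single decimal text form.

```python
import struct

value = 1025  # 0x00000401, an arbitrary example value

# The same integer stored as 4 bytes in the two common byte orders
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# The same integer stored as text: one ASCII byte per decimal digit
text = str(value).encode("ascii")

# Reading the little-endian bytes as if they were big-endian
# silently produces a different number.
misread = struct.unpack(">I", little)[0]

print(little, big, text, misread)
```

A reader of the text form needs no knowledge of byte order or integer width, at the cost of parsing the digits back into a number.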
Note that a webpage with formatted text is not plain text, but its HTML source is; whether a file contains plain text may thus depend on the level at which one considers it.
Text files can have the MIME type "text/plain", often with a charset parameter indicating the encoding, as in "text/plain; charset=UTF-8". Common encodings for plain text include Unicode UTF-8, Unicode UTF-16, ISO 8859, and ASCII.
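How the choice of encoding changes the bytes of a text file can be sketched in Python as follows. The sample word and the specific codecs (UTF-8, little-endian UTF-16, ISO 8859-1) are choices made for this illustration, not something prescribed above.

```python
# "caf" followed by U+00E9 (e with acute accent), the only non-ASCII character
text = "caf\u00e9"

utf8 = text.encode("utf-8")          # the accented letter takes 2 bytes
utf16 = text.encode("utf-16-le")     # every character here takes 2 bytes
latin1 = text.encode("iso-8859-1")   # every character takes 1 byte

print(len(utf8), len(utf16), len(latin1))

# Decoding with the wrong encoding does not raise an error in this case;
# it silently yields the wrong characters.
garbled = utf8.decode("iso-8859-1")

# Plain ASCII cannot represent the accented letter at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

This is why a declared encoding matters: the bytes alone do not say which of these interpretations is intended.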
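Transferring text files between Unix, Macintosh, and Microsoft Windows or DOS computers can be problematic, as each platform uses different characters to signify a line break: Unix uses a line feed (LF), classic Mac OS used a carriage return (CR), and DOS and Windows use a carriage return followed by a line feed (CR LF). See newline for a discussion of this confusion.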
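One common way to cope with differing line breaks is to convert them all to a single convention before processing. The sketch below assumes the three conventions LF (Unix), CR (classic Mac OS) and CR LF (DOS and Windows); the function name `normalize_newlines` is hypothetical and chosen for this example.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR LF and lone CR line breaks to LF."""
    # Replace CR LF first, so that it does not become two line breaks
    # when the lone CR bytes are converted afterwards.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mixed = b"dos line\r\nold mac line\runix line\n"
print(normalize_newlines(mixed))
```

The function works on raw bytes and should only be applied to files known to be text in an ASCII-compatible encoding; in a binary file the same byte values may be data rather than line breaks.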
Binary files, in contrast, usually contain bytes that do not correspond to printable characters, and may contain any byte value at all. They are generally used to store data rather than textual material in plain text form. Computer programs are typical examples, as the data and CPU instructions they contain can, in principle, be any binary value. As a result, compiled applications are often simply referred to as binaries, as opposed to source code, which is contained in plain text files. But binary files can also be image files, sound files, compressed files, and so on; in short, any file content whatsoever other than plain text. Usually the specification of a binary file's format indicates how to handle that file.
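Because a binary file may contain any byte value, software cannot tell the two categories apart with certainty and can only guess. The following Python sketch shows one possible heuristic; the function name `looks_binary` and the two tests it applies (presence of a NUL byte, failure to decode as UTF-8) are assumptions made for this illustration, not a standard definition.

```python
def looks_binary(data: bytes) -> bool:
    """Guess whether data is binary; a heuristic, not a definition."""
    # NUL bytes almost never occur in ASCII or UTF-8 text.
    if b"\x00" in data:
        return True
    # Bytes that are not valid UTF-8 are treated as binary as well.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


print(looks_binary(b"plain text\n"))    # text in ASCII
print(looks_binary(b"\x7fELF\x02\x00"))  # start of a hypothetical executable
```

The guess can be wrong in both directions: text stored as UTF-16 or ISO 8859 may be reported as binary, and a binary file that happens to contain only valid UTF-8 bytes will be reported as text.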
Binary files are often encoded into a plain text representation to improve survivability during transit, using encoding schemes such as Base64.
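As a minimal sketch of such an encoding, the Python standard library's `base64` module can turn arbitrary bytes into ASCII text and back. The four sample bytes are arbitrary values chosen for illustration.

```python
import base64

# Arbitrary bytes, including values that are not printable characters
raw = bytes([0x00, 0xFF, 0x10, 0x80])

# Encode to Base64: the result uses only ASCII letters, digits, "+", "/" and "="
encoded = base64.b64encode(raw)

# Decoding restores the original bytes exactly
decoded = base64.b64decode(encoded)

print(encoded, decoded == raw)
```

The price of this robustness is size: Base64 represents every 3 bytes of input as 4 characters of output, so the encoded form is about one third larger.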