Diff

In computing, diff is a file comparison utility that outputs the differences between two text files. The program's output is also called a diff.

Usage

It is invoked from the command line with the names of two files:

diff firstone.txt secondone.txt

Normal output

The result might look like this:

0a1,3
> This is an important notice! It should
> therefore be located at the beginning of
> this document!
7,12d9
< This paragraph contains text that is
< outdated - it will be deprecated and
< deleted in the near future.
< This is an important notice! It should
< therefore be located at the beginning of
< this document!
14,15c11,14
< spell check this dokument. On the other
< hand, I could do with some shoarma.
---
> spell check this document. On the other
> hand, I could do with some shoarma.
> This paragraph contains important new
> additions to this document.

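In this normal diff output, a stands for added, d for deleted and c for changed. The numbers before the letter refer to lines in the first file, and the numbers after it refer to lines in the second file. Lines prefixed with < come from the first file and lines prefixed with > come from the second. By default, lines common to both files are not shown. Lines that have moved show up as added at their new location and as deleted at their old location.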
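The change commands in the listing above (0a1,3, 7,12d9 and 14,15c11,14) follow a regular shape, which the following minimal Python sketch tries to capture. The names COMMAND and parse_command are illustrative and not part of any diff implementation, and the sketch assumes a single number stands for a one-line range.

```python
import re

# Pattern for a normal-format change command such as "0a1,3" or "14,15c11,14".
# (Illustrative only; the name COMMAND is an assumption of this sketch.)
COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_command(line):
    """Return (old_start, old_end, action, new_start, new_end), or None
    if the line is not a change command."""
    match = COMMAND.match(line)
    if match is None:
        return None
    old_start, old_end, action, new_start, new_end = match.groups()
    old_start = int(old_start)
    new_start = int(new_start)
    # A single number is treated as a range covering one line.
    old_end = int(old_end) if old_end else old_start
    new_end = int(new_end) if new_end else new_start
    return old_start, old_end, action, new_start, new_end


for header in ["0a1,3", "7,12d9", "14,15c11,14"]:
    print(header, parse_command(header))
```

Lines of the listing that begin with <, > or --- do not match the pattern, so parse_command returns None for them.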

Unified format

In unified format (or unidiff), each line that occurs only in the first file is preceded by a minus sign, each line that occurs only in the second file is preceded by a plus sign, and common lines are preceded by a space.

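The two header lines beginning with --- and +++ give the names of the files being compared. Each hunk then opens with a line beginning with @@ that gives, for each file, the line at which the hunk starts and the number of lines it covers. This output is often used as input to the patch program.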
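To see the unified format on a small input, one option is Python's standard difflib module, which can produce unified-style output. This is only a sketch: the file names old.txt and new.txt and the helper name unified are made up for the example, and difflib is a separate implementation from the diff command.

```python
import difflib

old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]


def unified(old_lines, new_lines):
    """Return the unified diff of two lists of lines as a list of strings."""
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old.txt", tofile="new.txt"
        )
    )


diff_lines = unified(old, new)
# Each element already ends with a newline, so join without a separator.
print("".join(diff_lines), end="")
```

In the result, the file names appear on the --- and +++ lines, the line ranges of the hunk appear on a line beginning with @@, and every remaining line starts with a space, a minus sign or a plus sign.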

Binary file support

The first editions of the diff program were designed for line-by-line comparison of text files, with the newline character expected to delimit lines. By the 1980s, support for binary files resulted in a shift in the application's design and implementation.

History

The diff program was developed in the early 1970s on the Unix operating system, which was emerging from AT&T Bell Labs in Murray Hill, New Jersey. The final version, first shipped with the 5th Edition of Unix in 1974, was written entirely by Douglas McIlroy. This research was published in a 1976 paper co-written with James W. Hunt, who developed an initial prototype of diff.

McIlroy's work was preceded and influenced by Steve Johnson's comparison program on GECOS and Mike Lesk's proof program. Proof originated on Unix, produced line-by-line changes like diff, and even used angle brackets (">" and "<") to present line insertions and deletions in its output. The heuristics used in these early applications were, however, deemed unreliable. The potential usefulness of a diff tool prompted McIlroy to research and design a more robust tool that could be used for a variety of tasks yet perform well within the processing and space limitations of the PDP-11's hardware. His approach also drew on collaboration with colleagues at Bell Labs, including Alfred Aho, Elliot Pinson, Jeffrey Ullman, and Harold S. Stone.

In the context of Unix, the use of the ed line editor provided diff with the natural ability to create machine-usable "edit scripts". These edit scripts, when saved to a file, can be used by ed, together with the original file, to reconstitute the modified file in its entirety. This greatly reduced the space necessary to maintain multiple versions of a file. McIlroy considered writing a post-processor for diff where a variety of output formats could be designed and implemented, but he found it more frugal and simpler to have diff be responsible for generating the syntax and reverse-order input accepted by the ed command. In 1985, Larry Wall composed a separate utility, patch, that generalized and extended the ability to modify files with diff output.
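The reverse ordering mentioned above can be illustrated with a small Python sketch. It does not use ed syntax: the (start, end, replacement) tuples and the function name apply_edits are assumptions made for this example. The point it tries to show is that applying edits from the bottom of the file upward keeps the line numbers of the remaining edits valid.

```python
def apply_edits(lines, edits):
    """Apply (start, end, replacement) edits to a list of lines.

    Line numbers are 1-based and inclusive and refer to the ORIGINAL file.
    An edit with end == start - 1 is a pure insertion before line `start`.
    (Hypothetical edit representation, not the ed script format.)
    """
    result = list(lines)
    # Work from the last edit to the first, so that an edit never shifts
    # the line numbers that a later-applied (earlier-in-file) edit uses.
    for start, end, replacement in sorted(edits, key=lambda e: e[0], reverse=True):
        result[start - 1:end] = replacement
    return result


original = ["one", "two", "three", "four"]
edits = [
    (2, 2, ["TWO", "two and a half"]),  # change line 2 into two lines
    (4, 4, []),                         # delete line 4
]
print(apply_edits(original, edits))
```

If the same edits were applied top-down, the change to line 2 would add a line and the deletion of "line 4" would then remove the wrong line unless every later line number was adjusted.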

In diff's early years, common uses included comparing changes in the source of programs and technical documents, verifying program debugging output, comparing filesystem listings and analyzing computer assembly code. The output targeted at ed was motivated by the wish to compress a sequence of modifications made to a file. The Source Code Control System (SCCS) emerged in the late 1970s as a direct consequence of this development.

One conceptual predecessor of diff is Project Xanadu, a hypertext project established in 1960 that envisioned a version tracking system necessary for its "transpointing windows" feature. As part of this feature, file differences were subsumed in the expansive term "transclusion", in which a document includes parts of other documents or revisions.

In the digital realm of the humanities, computer comparison systems were understood to have been created for working on literary works published as large volumes.

Variations

Most diff implementations have remained outwardly unchanged since 1975. The modifications that have been made include improvements to the core algorithm, the addition of useful features to the command, and the design of new output formats. The basic algorithm is described in the papers "An O(ND) Difference Algorithm and its Variations" by Eugene W. Myers and "A File Comparison Program" by Webb Miller and Myers. The algorithm was independently discovered, as described in "Algorithms for Approximate String Matching" by Esko Ukkonen.
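As a rough illustration of the underlying idea, the sketch below computes a line-level difference from a longest common subsequence using a textbook dynamic-programming table. It is deliberately not the O(ND) algorithm from the papers cited above, which is considerably more efficient; the function name lcs_diff and the (tag, line) output are assumptions of this example.

```python
def lcs_diff(a, b):
    """Return a list of (tag, line) pairs, where tag is ' ', '-' or '+'.

    Simple quadratic-time, quadratic-space sketch based on the longest
    common subsequence; NOT the Myers O(ND) algorithm.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the length of the LCS of a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the edit script.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            out.append(("-", a[i]))
            i += 1
        else:
            out.append(("+", b[j]))
            j += 1
    # Whatever is left over exists in only one of the two inputs.
    out.extend(("-", line) for line in a[i:])
    out.extend(("+", line) for line in b[j:])
    return out


for tag, line in lcs_diff(["a", "b", "c"], ["a", "c", "d"]):
    print(tag + line)
```

Lines tagged with a space or a minus sign reproduce the first input in order, and lines tagged with a space or a plus sign reproduce the second, which is the property an edit script needs.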

The postprocessors sdiff and diffmk render side-by-side diff listings and apply change marks to printed documents, respectively. Both were developed elsewhere in Bell Labs in or before 1981.

The Berkeley distribution of Unix made a point of adding the context format (-C) and the ability to recurse on filesystem directory structures (-r), adding those features in 2.8 BSD, released in July 1981.

The context format of diff introduced at Berkeley helped with distributing patches for source code that may have been changed minimally.

Diff3 compares one file against two other files. It was originally developed by Paul Jensen to reconcile changes made by two persons editing a common source. It is seldom invoked directly and is largely subsumed by the merge program. However, it is used internally by many revision control systems.

Unified context diffs were originally developed by Wayne Davison in August 1990 (in unidiff which appeared in Volume 14 of comp.sources.misc). Richard Stallman added unified diff support to GNU Project's diff utility one month later, and the feature debuted in GNU diff 1.15, released in January 1991. GNU diff has since generalized the context format to allow arbitrary formatting of diffs. GNU diff is included in the diffutils package with other diff and patch related utilities.

Free software implementations

The GNU Project has an implementation of diff (and diff3) that is available from the GNU diffutils package.

Several tools on various platforms use the GNU diffutils engine and provide a graphical display, and some combine editing and merging capabilities. The following are some of these free tools.

  • Emacs - provided by Ediff mode
  • VimDiff
  • gtkdiff
  • KDiff3
  • Kompare
  • Meld
  • tkdiff
  • WinMerge - comparison tool for Windows
  • xxdiff
  • fldiff

This page about Diff includes information from a Wikipedia article.
