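I have been using Subversion for my personal projects for some time.
More and more I keep hearing about Git and Mercurial, and DVCS in general.
I'd like to give the whole DVCS thing a whirl, but I'm not too familiar with either option.
What are some of the differences between Mercurial and Git?
Note: I'm not trying to find out which one is "best", or even which one I should start with. I'm mainly looking for the key areas where they are similar and where they are different, because I'm interested in how they differ in implementation and philosophy.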
Disclaimer: I use Git, follow Git development on git mailing list, and even contribute a bit to Git (gitweb mainly). I know Mercurial from documentation and some from discussion on #revctrl IRC channel on FreeNode.
Thanks to all the people on the #mercurial IRC channel who provided help about Mercurial for this writeup.
Here it would be nice to have some syntax for table, something like in PHPMarkdown/MultiMarkdown/Maruku extension of Markdown
Repository structure: Mercurial doesn't allow octopus merges (with more than two parents), nor tagging non-commit objects.
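Tags: Mercurial uses a versioned .hgtags file with special rules for per-repository tags, and also supports local tags in .hg/localtags; in Git, tags are refs residing in the refs/tags/ namespace, and by default they are autofollowed on fetching and require explicit pushing.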
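Branches: In Mercurial the basic workflow is based on anonymous heads; Git uses lightweight named branches, and has a special kind of branch (remote-tracking branches) that follows branches in a remote repository.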
Revision naming and ranges: Mercurial provides revision numbers, local to repository, and bases relative revisions (counting from tip, i.e. current branch) and revision ranges on this local numbering; Git provides a way to refer to revision relative to branch tip, and revision ranges are topological (based on graph of revisions)
Mercurial uses rename tracking, while Git uses rename detection to deal with file renames
Network: Mercurial supports SSH and HTTP "smart" protocols, and a static HTTP protocol; modern Git supports SSH, HTTP and GIT "smart" protocols, and an HTTP(S) "dumb" protocol. Both have support for bundle files for off-line transport.
Mercurial uses extensions (plugins) and an established API; Git has scriptability and established formats.
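There are a few things that make Mercurial differ from Git, but there are other things that make them similar. Both projects borrow ideas from each other. For example, the hg bisect command in Mercurial (formerly the bisect extension) was inspired by the git bisect command in Git, while the idea of git bundle was inspired by hg bundle.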
In Git there are four types of objects in its object database: blob objects, which contain the contents of a file; hierarchical tree objects, which store directory structure, including file names and the relevant parts of file permissions (executable permission for files, being a symbolic link); commit objects, which contain authorship info, a pointer to a snapshot of the state of the repository at the revision represented by the commit (via the tree object of the project's top directory), and references to zero or more parent commits; and tag objects, which reference other objects and can be signed using PGP/GPG.
Git uses two ways of storing objects: the loose format, where each object is stored in a separate file (those files are written once and never modified), and the packed format, where many objects are stored delta-compressed in a single file. Atomicity of operations is provided by the fact that a reference to a new object is written (atomically, using the create + rename trick) only after the object itself has been written.
Git repositories require periodic maintenance using git gc (to reduce disk space and improve performance), although nowadays Git does that automatically. (Packing objects this way provides better compression of repositories.)
Mercurial (as far as I understand it) stores the history of a file in a filelog (together, I think, with extra metadata like rename tracking, and some helper information); it uses a flat structure called the manifest to store directory structure, and a structure called the changelog, which stores information about changesets (revisions), including the commit message and zero, one or two parents.
Mercurial uses a transaction journal to provide atomicity of operations, and relies on truncating files to clean up after a failed or interrupted operation. Revlogs are append-only.
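To make the journal-plus-truncate idea concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not Mercurial's actual revlog or journal format: the class name `AppendOnlyLog`, its methods and the file name `toy.log` are all hypothetical, and the "journal" is reduced to remembering the file length at the start of a transaction.

```python
import os
import tempfile


class AppendOnlyLog:
    """Toy append-only file with truncate-based rollback.

    Hypothetical illustration only; this is NOT Mercurial's on-disk format.
    """

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist
        self._journal = None      # file length recorded at transaction start

    def begin(self):
        # Record how long the file was before we started appending.
        self._journal = os.path.getsize(self.path)

    def append(self, record: bytes):
        # Data is only ever added at the end, never rewritten in place.
        with open(self.path, "ab") as f:
            f.write(record)

    def commit(self):
        # The transaction succeeded, so the recorded length is no longer needed.
        self._journal = None

    def rollback(self):
        # Undo a failed transaction by cutting the file back to its old length.
        if self._journal is not None:
            with open(self.path, "r+b") as f:
                f.truncate(self._journal)
            self._journal = None

    def read(self) -> bytes:
        with open(self.path, "rb") as f:
            return f.read()


def demo() -> bytes:
    """Write one complete record, then roll back a simulated interrupted write."""
    with tempfile.TemporaryDirectory() as d:
        log = AppendOnlyLog(os.path.join(d, "toy.log"))
        log.begin()
        log.append(b"rev0\n")
        log.commit()
        log.begin()
        log.append(b"rev1-partial")  # pretend the operation was interrupted here
        log.rollback()
        return log.read()
```

Because data is only ever appended, cleaning up after a failure needs nothing more than the old file length, which is roughly why append-only storage and truncation fit together so well.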
Looking at repository structure in Git versus in Mercurial, one can see that Git is more like an object database (or a content-addressed filesystem), and Mercurial is more like a traditional fixed-field relational database.
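The "content-addressed" part can be illustrated in a few lines of Python. Git names a blob by the SHA-1 of a short header (the word `blob`, a space, the content length in bytes, and a NUL byte) followed by the content itself. The helper name `git_blob_id` is hypothetical, and the sketch covers only blobs in a SHA-1 repository, not trees, commits or tags.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git would give a blob with this content."""
    # Header: object type, a space, the size in bytes, then a NUL separator.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Same content always yields the same name, whatever the file is called.
    print(git_blob_id(b"hello world\n"))
```

The output can be compared against `git hash-object` run on a file with the same content. Since the name depends only on the content, two files with identical contents share one blob object.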
Differences:
In Git the tree objects form a hierarchical structure; in Mercurial the manifest file is a flat structure. In Git a blob object stores one version of the contents of a file; in Mercurial a filelog stores the whole history of a single file (if we do not take into account any complications with renames). This means that there are areas of operation where Git would be faster than Mercurial, all other things being equal (like merges, or showing the history of a project), and areas where Mercurial would be faster than Git (like applying patches, or showing the history of a single file). This issue might not be important for the end user.
Because of the fixed-record structure of Mercurial's changelog structure, commits in Mercurial can have only up to two parents; commits in Git can have more than two parents (so called "octopus merge"). While you can (in theory) replace octopus merge by a series of two-parent merges, this might cause complications when converting between Mercurial and Git repositories.
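As a rough illustration of replacing an octopus merge by a series of two-parent merges, here is a small Python sketch. It works on plain identifiers rather than real repositories; the function name `split_octopus` and the `-stepN` naming of the intermediate merges are hypothetical, and a real converter would also have to decide what content and metadata the intermediate commits carry.

```python
def split_octopus(commit_id, parents):
    """Rewrite one n-parent merge as a chain of two-parent merges.

    Returns a list of (commit_id, (parent_a, parent_b)) pairs in creation
    order; the last entry keeps the original commit_id.
    """
    parents = list(parents)
    if len(parents) <= 2:
        # Nothing to split: already representable with at most two parents.
        return [(commit_id, tuple(parents))]

    chain = []
    current = parents[0]
    # Merge in every parent except the last via a hypothetical helper commit.
    for i, other in enumerate(parents[1:-1], start=1):
        step_id = f"{commit_id}-step{i}"
        chain.append((step_id, (current, other)))
        current = step_id
    # The final merge takes the place of the original octopus commit.
    chain.append((commit_id, (current, parents[-1])))
    return chain


if __name__ == "__main__":
    for entry in split_octopus("M", ["A", "B", "C", "D"]):
        print(entry)
```

The extra intermediate commits are exactly the complication mentioned above: they exist on one side of the conversion and not on the other, so commit identities no longer line up one to one.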
As far as I know Mercurial doesn't have an equivalent of Git's annotated tags (tag objects). A special case of annotated tags are signed tags (with a PGP/GPG signature); the equivalent in Mercurial can be done using GpgExtension, an extension that is distributed along with Mercurial. You can't tag a non-commit object in Mercurial like you can in Git, but that is not very important, I think (some git repositories use a tagged blob to distribute the public PGP key used to verify signed tags).
In Git, references (branches, remote-tracking branches and tags) reside outside the DAG of commits (as they should). References in the refs/heads/ namespace (local branches) point to commits, and are usually updated by "git commit"; they point to the tip (head) of a branch, hence the name. References in the refs/remotes/ namespace (remote-tracking branches) point to commits, follow branches in a remote repository, and are updated by "git fetch" or equivalent. References in the refs/tags/ namespace (tags) usually point to commits (lightweight tags) or tag objects (annotated and signed tags), and are not meant to change.
In Mercurial you can give a persistent name to a revision using a tag; tags are stored similarly to the ignore patterns. This means that globally visible tags are stored in the revision-controlled .hgtags file in your repository. That has two consequences: first, Mercurial has to use special rules for this file to get the current list of all tags and to update the file (e.g. it reads the most recently committed revision of the file, not the currently checked out version); second, you have to commit changes to this file to make a new tag visible to other users/other repositories (as far as I understand it).
Mercurial also supports local tags, stored in .hg/localtags, which are not visible to others (and of course are not transferable).
In Git, tags are fixed (constant) named references to other objects (usually tag objects, which in turn point to commits) stored in the refs/tags/ namespace. By default, when fetching a set of revisions, git automatically fetches tags which point to the revisions being fetched; when pushing, tags have to be pushed explicitly. Nevertheless you can control to some extent which tags are fetched or pushed.
Git treats lightweight tags (pointing directly to commits) and annotated tags (pointing to tag objects, which contain a tag message, optionally include a PGP signature, and in turn point to a commit) slightly differently; for example, by default it considers only annotated tags when describing commits using "git describe".
Git doesn't have a strict equivalent of local tags in Mercurial. Nevertheless git best practices recommend setting up a separate public bare repository, into which you push ready changes, and from which others clone and fetch. This means that tags (and branches) that you don't push are private to your repository. On the other hand you can also use a namespace other than heads, remotes or tags, for example local-tags for local tags.
Personal opinion: In my opinion tags should reside outside the revision graph, as they are external to it (they are pointers into the graph of revisions). Tags should be non-versioned, but transferable. Mercurial's choice of using a mechanism similar to the one for ignoring files means that it either has to treat .hgtags specially (an in-tree file is transferable, but ordinarily it is versioned), or have tags which are local only (.hg/localtags is non-versioned, but untransferable).
In Git a local branch (branch tip, or branch head) is a named reference to a commit, where one can grow new commits. Branch can also mean an active line of development, i.e. all commits reachable from the branch tip. Local branches reside in the refs/heads/ namespace, so e.g. the fully qualified name of the 'master' branch is 'refs/heads/master'.
Current branch in Git (meaning checked out branch, and branch where new commit will go) is the branch which is referenced by the HEAD ref. One can have HEAD pointing directly to a commit, rather than being symbolic reference; this situation of being on an anonymous unnamed branch is called detached HEAD ("git branch" shows that you are on '(no branch)').
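The difference between a symbolic HEAD and a detached HEAD can be sketched in Python. In a Git repository the HEAD file holds either a line of the form `ref: refs/heads/<name>` or a bare commit id; the function name `resolve_head`, the `refs` dictionary and the shortened commit ids below are hypothetical stand-ins for reading a real repository.

```python
def resolve_head(head_content: str, refs: dict):
    """Return (ref_name_or_None, commit_id) for the text stored in HEAD.

    refs maps fully qualified ref names to commit ids (toy stand-in for
    the repository's ref storage).
    """
    head_content = head_content.strip()
    if head_content.startswith("ref: "):
        # Symbolic ref: HEAD follows a branch, new commits advance that branch.
        ref = head_content[len("ref: "):]
        return ref, refs.get(ref)  # commit is None if the branch has no commits yet
    # Detached HEAD: HEAD holds a commit id directly, no branch is current.
    return None, head_content


if __name__ == "__main__":
    refs = {"refs/heads/master": "5bf8097"}
    print(resolve_head("ref: refs/heads/master\n", refs))
    print(resolve_head("1a2b3c4\n", refs))
```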
In Mercurial there are anonymous branches (branch heads), and one can use bookmarks (via the bookmark extension). Such bookmark branches are purely local, and those names were (up to version 1.6) not transferable using Mercurial. You can use rsync or scp to copy the .hg/bookmarks file to a remote repository. You can also use hg id -r to get the revision id of the current tip of a bookmark.
Since 1.6 bookmarks can be pushed/pulled. The BookmarksExtension page has a section on Working With Remote Repositories. There is a difference in that in Mercurial bookmark names are global, while the definition of a 'remote' in Git also describes a mapping of branch names from the names in the remote repository to the names of local remote-tracking branches; for example the refs/heads/*:refs/remotes/origin/* mapping means that one can find the state of the 'master' branch ('refs/heads/master') in the remote repository in the 'origin/master' remote-tracking branch ('refs/remotes/origin/master').
Mercurial also has so-called named branches, where the branch name is embedded in a commit (in a changeset). Such a name is global (transferred on fetch). Those branch names are permanently recorded as part of the changeset's metadata. With modern Mercurial you can close a "named branch" and stop recording the branch name. In this mechanism the tips of branches are calculated on the fly.
Mercurial's "named branches" should in my opinion be called commit labels instead, because it is what they are. There are situations where "named branch" can have multiple tips (multiple childless commits), and can also consist of several disjoint parts of graph of revisions.
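Here is a small Python sketch of what "calculated on the fly" can look like when the branch name is just a label on each changeset. It is a simplification, assuming a head of a named branch is any changeset carrying that label with no child carrying the same label; the function name `branch_heads` and the toy history are hypothetical, and closed branches are ignored.

```python
from collections import defaultdict


def branch_heads(changesets):
    """Compute heads per branch label.

    changesets maps a changeset id to (branch_label, [parent ids]).
    Returns {branch_label: sorted list of head ids}.
    """
    has_child_on_same_branch = set()
    for _cid, (branch, parents) in changesets.items():
        for parent in parents:
            # A parent stops being a head only if a child shares its label.
            if parent in changesets and changesets[parent][0] == branch:
                has_child_on_same_branch.add(parent)

    heads = defaultdict(list)
    for cid, (branch, _parents) in changesets.items():
        if cid not in has_child_on_same_branch:
            heads[branch].append(cid)
    return {branch: sorted(ids) for branch, ids in heads.items()}


if __name__ == "__main__":
    history = {
        "c0": ("default", []),
        "c1": ("stable", ["c0"]),
        "c2": ("stable", ["c1"]),
        "c3": ("stable", ["c1"]),  # second childless changeset labelled 'stable'
        "c4": ("default", ["c0"]),
    }
    print(branch_heads(history))
```

In the toy history the label 'stable' ends up with two heads, which is the "multiple tips" situation described above; nothing is stored per branch, the heads fall out of the labels and the parent links.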
There is no equivalent of those Mercurial "embedded branches" in Git; moreover Git's philosophy is that while one can say that branch includes some commit, it doesn't mean that a commit belongs to some branch.
Note that Mercurial documentation still proposes to use separate clones (separate repositories) at least for long-lived branches (single branch per repository workflow), aka branching by cloning.
Mercurial by default pushes all heads. If you want to push a single branch (single head), you have to specify tip revision of the branch you want to push. You can specify branch tip by its revision number (local to repository), by revision identifier, by bookmark name (local to repository, doesn't get transferred), or by embedded branch name (named branch).
As far as I understand it, if you push a range of revisions that contain commits marked as being on some "named branch" in Mercurial parlance, you will have this "named branch" in the repository you push to. This means that names of such embedded branches ("named branches") are global (with respect to clones of given repository/project).
By default (subject to the push.default configuration variable), "git push" or "git push <remote>" pushes matching branches, i.e. only those local branches that already have an equivalent in the remote repository you push into. (This was the default before Git 2.0; since then the default setting is "simple", which pushes only the current branch.) You can use the --all option to git-push ("git push --all") to push all branches, you can use "git push <remote> <branch>" to push a given single branch, and you can use "git push <remote> HEAD" to push the current branch.
All of the above assumes that Git isn't configured which branches to push via remote.
configuration variables.
Note: here I use Git terminology, where "fetch" means downloading changes from a remote repository without integrating those changes with local work. This is what "git fetch" and "hg pull" do.
If I understand it correctly, by default Mercurial fetches all heads from remote repository, but you can specify branch to fetch via "hg pull --rev
" or "hg pull
" to get single branch. You can specify
In Git by default (for 'origin' remote created by "git clone", and for remotes created using "git remote add") "git fetch
" (or "git fetch
") gets all branches from remote repository (from refs/heads/
namespace), and stores them in refs/remotes/
namespace. This means for example that branch named 'master' (full name: 'refs/heads/master') in remote 'origin' would get stored (saved) as 'origin/master' remote-tracking branch (full name: 'refs/remotes/origin/master').
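The mapping from names in the remote repository to names of remote-tracking branches can be sketched in Python. This is only a toy model of a refspec with at most one `*` wildcard on each side; the function name `apply_refspec` is hypothetical, and the leading `+` (which in Git allows non-fast-forward updates) is simply stripped here.

```python
def apply_refspec(refspec: str, remote_ref: str):
    """Map a ref name on the remote to its local name, or None if no match."""
    # A leading '+' only affects how updates are applied, not the name mapping.
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":", 1)

    if "*" not in src:
        # Exact refspec: matches a single ref name.
        return dst if remote_ref == src else None

    prefix, suffix = src.split("*", 1)
    long_enough = len(remote_ref) >= len(prefix) + len(suffix)
    if not (long_enough and remote_ref.startswith(prefix)
            and remote_ref.endswith(suffix)):
        return None
    # Whatever the wildcard matched is substituted into the destination.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


if __name__ == "__main__":
    spec = "+refs/heads/*:refs/remotes/origin/*"
    print(apply_refspec(spec, "refs/heads/master"))
    print(apply_refspec(spec, "refs/tags/v1.5.0"))
```

Because the destination side carries the remote's name, the same branch name fetched from two different remotes lands under two different remote-tracking names.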
You can fetch single branch in Git by using git fetch
- Git would store requested branch(es) in FETCH_HEAD, which is something similar to Mercurial unnamed heads.
Those are but examples of default cases of powerful refspec Git syntax: with refspecs you can specify and/or configure which branches one want to fetch, and where to store them. For example default "fetch all branches" case is represented by '+refs/heads/*:refs/remotes/origin/*' wildcard refspec, and "fetch single branch" is shorthand for 'refs/heads/
Personal opinion: I personally think that "named branches" (with branch names embedded in changeset metadata) in Mercurial are a misguided design with their global namespace, especially for a distributed version control system. For example, let's take the case where both Alice and Bob have a "named branch" named 'for-joe' in their repositories, branches which have nothing in common. In Joe's repository, however, those two branches would be mistreated as a single branch. So you have to somehow come up with a convention protecting against branch name clashes. This is not a problem with Git, where in Joe's repository the 'for-joe' branch from Alice would be 'alice/for-joe', and the one from Bob would be 'bob/for-joe'. See also the "Separating branch name from branch identity" issue raised on the Mercurial wiki.
Mercurial's "bookmark branches" currently lack in-core distribution mechanism.
Differences:
This area is one of the main differences between Mercurial and Git, as james woodyatt and Steve Losh said in their answers. Mercurial, by default, uses anonymous lightweight codelines, which in its terminology are called "heads". Git uses lightweight named branches, with injective mapping to map names of branches in remote repository to names of remote-tracking branches. Git "forces" you to name branches (well, with exception of single unnamed branch, situation called detached HEAD), but I think this works better with branch-heavy workflows such as topic branch workflow, meaning multiple branches in single repository paradigm.
In Git there are many ways of naming revisions (described e.g. in git rev-parse manpage):
The full SHA1 object name (40-byte hexadecimal string), or a substring of such that is unique within the repository
A symbolic ref name, e.g. 'master' (referring to 'master' branch), or 'v1.5.0' (referring to tag), or 'origin/next' (referring to remote-tracking branch)
A suffix ^
to revision parameter means the first parent of a commit object, ^n
means n-th parent of a merge commit. A suffix ~n
to revision parameter means n-th ancestor of a commit in straight first-parent line. Those suffixes can be combined, to form revision specifier following path from a symbolic reference, e.g. 'pu~3^2~3'
Output of "git describe", i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name, for example 'v1.6.5.1-75-g5bf8097'.
There are also revision specifiers involving reflog, not mentioned here. In Git each object, be it commit, tag, tree or blob has its SHA-1 identifier; there is special syntax like e.g. 'next:Documentation' or 'next:README' to refer to tree (directory) or blob (file contents) at specified revision.
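To show how the `^` and `~n` suffixes walk the revision graph, here is a toy resolver in Python. It is a sketch under stated assumptions, not Git's parser: the function name `resolve`, the `refs` and `parents` dictionaries and the single-letter commit ids are hypothetical, and only a plain ref name followed by `^`, `^n` and `~n` suffixes is handled (no reflog syntax, no `rev:path`).

```python
import re


def resolve(spec: str, refs: dict, parents: dict) -> str:
    """Resolve a spec such as 'master~1^2' on a toy revision graph.

    refs maps ref names to commit ids; parents maps a commit id to the
    ordered list of its parent ids (first parent first).
    """
    name = re.match(r"[^~^]+", spec)
    if not name:
        raise ValueError(f"no ref name in {spec!r}")
    commit = refs[name.group(0)]

    # Apply the suffixes left to right, following a path through the graph.
    for op, digits in re.findall(r"([~^])(\d*)", spec[name.end():]):
        n = int(digits) if digits else 1
        if op == "^":
            if n == 0:
                continue  # rev^0 names the commit itself
            commit = parents[commit][n - 1]  # n-th parent of a (merge) commit
        else:
            for _ in range(n):  # n steps back along first parents
                commit = parents[commit][0]
    return commit


if __name__ == "__main__":
    # E is the tip; D is a merge of B (first parent) and C (second parent).
    parents = {"E": ["D"], "D": ["B", "C"], "C": ["A"], "B": ["A"], "A": []}
    refs = {"master": "E"}
    for spec in ("master", "master^", "master~1^2", "master^^2~1"):
        print(spec, "->", resolve(spec, refs, parents))
```

Note that everything is relative to a ref and the shape of the graph; no repository-local numbering is involved.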
Mercurial also has many ways of naming changesets (described e.g. in hg
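I think you can get a good feeling for how these systems are similar or different by watching these two videos:
Linus Torvalds on Git (http://www.youtube.com/watch?v=4XpnKHJAok8)
Bryan O'Sullivan on Mercurial (http://www.youtube.com/watch?v=JExtkqzEoHY)
Both of them are very similar in design but very different in implementation.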
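I use Mercurial. As far as I understand Git, one major thing git does differently is that it tracks the contents of files instead of the files themselves. Linus says that if you move a function from one file to another, Git will tell you the history of that single function across the move.
They also say that git is slower over HTTP, but it has its own network protocol and server.
Git works better as an SVN thick client than Mercurial. You can pull and push against an SVN server. This functionality is still under development in Mercurial.
Both Mercurial and Git have very nice web hosting solutions available (BitBucket and GitHub), but Google Code supports Mercurial only. By the way, they did a very detailed comparison of Mercurial and Git to decide which one to support (http://code.google.com/p/support/wiki/DVCSAnalysis). It has a lot of good info.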
I just wrote a blog post about Mercurial's branching models and compared them with git's branching model. Maybe you'd find it interesting: http://stevelosh.com/blog/entry/2009/8/30/a-guide-to-branching-in-mercurial/
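I use both quite regularly. The major functional difference is in the way Git and Mercurial name branches within repositories. With Mercurial, branch names are cloned and pulled along with their changesets. When you add changes to a new branch in Mercurial and push to another repository, the branch name is pushed at the same time. So branch names are more or less global in Mercurial, and you have to use the Bookmark extension to have local-only lightweight names (if you want them; Mercurial, by default, uses anonymous lightweight codelines, which in its terminology are called "heads"). In Git, branch names and their injective mapping to remote branches are stored locally and you must manage them explicitly, which means knowing how to do that. This is pretty much where Git gets its reputation for being harder to learn and use than Mercurial.
As others will note here, there are lots and lots of minor differences. The thing with the branches is the big differentiator.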
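Take a look at Git vs. Mercurial: Please Relax, a blog post by Patrick Thomson, where he writes:
Git is MacGyver, Mercurial is James Bond
Note that this blog post is from August 7, 2008, and both SCMs have improved since then.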
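Mercurial is almost fully written in Python. Git's core is written in C (and should be faster than Mercurial's), with tools written in sh, perl and tcl that use standard GNU utilities. Thus it needs to bring all these utilities and interpreters with it to systems that do not contain them (e.g. Windows).
Both support working with SVN, although AFAIK svn support is broken for git on Windows (maybe I am just unlucky/lame, who knows). There are also extensions which allow interoperation between git and Mercurial.
Mercurial has nice Visual Studio integration. Last time I checked, the plugin for Git was working but extremely slow.
Their basic command sets are very similar (init, clone, add, status, commit, push, pull etc.). So the basic workflow will be the same. Also, there are TortoiseSVN-like clients for both.
Extensions for Mercurial can be written in Python (no surprise!), and for git they can be written in any executable form (executable binary, shell script etc.). Some extensions are crazy powerful, like git bisect.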
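If you need good Windows support, you might prefer Mercurial. TortoiseHg (a Windows Explorer plugin) manages to offer a simple-to-use graphical interface to a rather complex tool. As stated above, you will also have a Visual Studio plugin. However, last time I tried, the SVN interface didn't work that well on Windows.
If you don't mind the command line interface, I would recommend Git. Not for technical reasons but for a strategic one. The adoption rate of git is much higher. Just see how many famous open source projects are switching from cvs/svn to Mercurial and how many are switching to Git. See how many code/project hosting providers you can find with git support compared to Mercurial hosting.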
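After reading all over that Mercurial is easier (which I still believe it is; after all, that is the opinion of the internet community), when I started working with Git and Mercurial I felt Git was relatively simpler for me to adapt to when working from the command line (I started off with Mercurial and TortoiseHg), mainly because the git commands were named appropriately, in my view, and are fewer in number. Mercurial has a different name for each command that does a distinct job, while Git commands can be multipurpose depending on the situation (for example, checkout). While Git was harder back then, now the difference is hardly substantial. YMMV. With a good GUI client like TortoiseHg, it really was much easier to work with Mercurial, and I did not have to remember the slightly confusing commands. I'm not going into detail about how every command for the same action varies, but here are two comprehensive lists: one from Mercurial's own site and a second from wikivs.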
| Git | Mercurial |
|---|---|
| git pull | hg pull -u |
| git fetch | hg pull |
| git reset --hard | hg up -C |
| git revert | hg backout |
| git add | hg add (Only equivalent when the file is not tracked.) |
| git add | Not necessary in Mercurial. |
| git add -i | hg record |
| git commit -a | hg commit |
| git commit --amend | hg commit --amend |
| git blame | hg blame or hg annotate |
| git blame -C | (closest equivalent): hg grep --all |
| git bisect | hg bisect |
| git rebase --interactive | hg histedit (Requires the HisteditExtension.) |
| git stash | hg shelve (Requires the ShelveExtension or the AtticExtension.) |
| git merge | hg merge |
| git cherry-pick | hg graft |
| git rebase | hg rebase -d (Requires the RebaseExtension.) |
| git format-patch and git send-email | hg email -r (Requires the PatchbombExtension.) |
| git am | hg mimport -m (Requires the MboxExtension and the MqExtension. Imports patches to mq.) |
| git checkout HEAD | hg update |
| git log -n | hg log --limit n |
| git push | hg push |
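Git saves a record of every version of committed files internally, while Hg saves just the changesets, which can have a smaller footprint. Git makes it easier to change history compared to Hg, but then again it's a hate-it-or-love-it feature. I like Hg for the former and Git for the latter.
What I miss in Hg is the submodule feature of Git. Hg has subrepos, but that's not exactly a Git submodule.
The ecosystem around the two can also influence one's choice: Git has to be more popular (but that's trivial), Git has GitHub while Mercurial has BitBucket, and Mercurial has TortoiseHg, for which I haven't seen an equivalent as good for Git.
Each has its advantages and disadvantages; with either one of them you're not going to lose.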
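Check out Scott Chacon's post from a while back.
I think git has a reputation for being "more complicated", though in my experience it's not more complicated than it needs to be. IMO, the git model is way easier to understand (tags contain commits (and pointers to zero or more parent commits) contain trees contain blobs and other trees... done).
It's not just my experience that git is not more confusing than mercurial. I'd recommend again reading this blog post from Scott Chacon on the matter.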
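I've used Git for a little over a year at my present job, and prior to that I used Mercurial for a little over a year at my previous job. I'm going to provide an evaluation from a user's perspective.
First, both are distributed version control systems. Distributed version control systems require a change in mindset from traditional version control systems, but actually work much better in many ways once one understands them. For this reason, I consider both Git and Mercurial much superior to Subversion, Perforce, etc. The difference between distributed version control systems and traditional version control systems is much larger than the difference between Git and Mercurial.
However, there are also significant differences between Git and Mercurial that make each better suited to its own subset of use cases.
Mercurial is simpler to learn. I got to the point where I rarely had to refer to documentation or notes after a few weeks of using Mercurial; I still have to refer to my notes regularly with Git, even after using it for a year. Git is considerably more complicated.
This is partly because Mercurial is just plain cleaner. You rarely have to branch manually in Mercurial; Mercurial will create an anonymous branch automatically for you if and when you need it. Mercurial nomenclature is more intuitive; you don't have to worry about the difference between "fetch" and "pull" as you do with Git. Mercurial is a bit less buggy. There are file name case sensitivity issues that used to cause problems when pushing projects across platforms with both Git and Mercurial; this was fixed in Mercurial some time ago, while it hadn't been fixed in Git last I checked. You can tell Mercurial about file renames; with Git, if it doesn't detect the rename automatically (a very hit-or-miss proposition in my experience), the rename can't be tracked at all.
The other reason for Git's additional complication, however, is that much of it is needed to support additional features and power. Yes, it's more complicated to handle branching in Git, but on the other hand, once you have the branches, it's not too difficult to do things with those branches that are virtually impossible in Mercurial. Rebasing branches is one of these things: you can move your branch so that its base, instead of being the state of the trunk when you branched, is the state of the trunk now. This greatly simplifies version history when many people are working on the same code base, since each of the pushes to trunk can be made to appear sequential, rather than intertwined. Similarly, it's much easier to collapse multiple commits on your branch into a single commit, which can again help keep the version control history clean: ideally, all the work on a feature can appear as a single commit in trunk, replacing all the minor commits and subbranches that the developer may have made while developing the feature.
Ultimately I think the choice between Mercurial and Git should depend on how large your version control projects are, measured in terms of the number of people working on them simultaneously. If you have one or more teams working on a single monolithic web application, for example, Git's more powerful branch management tools will make it a much better fit for your project. On the other hand, if your team is developing a heterogeneous distributed system, with only one or two developers working on any one component at any one time, using a Mercurial repository for each of the component projects will allow development to proceed more smoothly, with less repository management overhead.
Bottom line: if you have a big team developing a single huge application, use Git; if your individual applications are small, with any scale coming from the number rather than the size of such applications, use Mercurial.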