Skip to content

A Ruby script that can export a Darcs 1 repo to Git without invoking Darcs.

Notifications You must be signed in to change notification settings

freshtonic/undarcs

Repository files navigation

# Copyright (C) 2007 James Sadler <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.


darcs-fast-export is  a Ruby  script that can  natively parse  the Darcs
(version  1) patch  format  and  replay the  patches  into a  directory,
creating a Git commit for each Darcs patch (.gz file in _darcs/patches).

DISCLAIMER:

Unless  you have  a  very  wedged or  extremely  large  Darcs repo  that
performs too slowly  for tailor, darcs-2-git etc to recover  into a real
DVCS, then you are probably better off not using this script.

There      are     better      alternatives      such     as      Tailor
(http://progetti.arstecnica.it/tailor). Tailor and  all other tools that
I am aware of perform repository  conversion by invoking the source DVCS
binaries  directly (the  *correct*  way!), and  therefore  stand a  much
better chance  of actually  working. The only  time you  should consider
darcs-fast-export is when other tools  fail because of running into some
exponential time/space bug Darcs.


STATE OF DEVELOPMENT

I can convert a 2100-ish patch repo  into a Git repo with this script in
about 30  minutes on  a 2.8Ghz  Intel CPU machine.  It seems  to consume
about 500MB of RAM at it's peak.

Unfortunately  the conversion  is  not accurate  yet,  and performing  a
recursive diff  on the original Darcs  working tree and the  Git working
tree shows many differences.

Currently  all Darcs  1  patch  types except  for  'regrem' patches  are
handled. I  have no 'regrem' patches  in my own repo,  so haven't tested
that yet.

'merger' patches are handled by ignoring them, IOW I parse past them and
ignore  the  commands  within  them.  It  is  my  understanding  that  a
subsequent patch will resolve the conflict (at least this is what I have
discovered  by  reverse  engineering  the patches).  I  could  be  wrong
however!

MY UNDERSTANDING OF DARCS'S PATCH FORMAT, AND WHY I DO NOT NEED TO 
UNDERSTAND THE THERY OF PATCHES (er, I think!)

Everyone that knows something about  the internals of Darcss, knows that
it  is based  upon  a Theory  of  Patches, which  is  a special  algebra
for  making statements  about  and manipulating  patches.  It tells  you
interesting things such as whether or not two patches are independent of
each other (commutivity) etc.

It turns out that  I can safely ignore all of  that because Darcs kindly
writes the  inventory (the  identifiers and sequence  of patches  in the
repo) and rewrites the .gz patches in  such a way that if you apply them
in the inventory order starting with a bare tree, it all just works.*

(Yes  you read  that right.  Darcs actually  messes with  your historical
data. But it's OK because I can use that fact to my advantage.)

'merger' and 'regrem'  patches are ust Darcs bookkeeping so  that it can
remember which  patches conflict  with each other,  should you  pull new
patches, unpull old ones etc.

The great thing is we can ignore those when we are applying the patches,
because the conflict resolution comes later in the patch series.

RUNNING THE CODE

To run  the code, invoke darcs-fast-export.rb  with no args and  it will
print the options. They should be self explanatory.

LASTLY...

I need help.  I have come a long  way in a few late  nights and weekends
only to discover that  the resulting tree is not the  same as the source
tree. It could be because: 

* My theory that I can ignore mergers is completely wrong.
* There  is an error  in my hunk application  logic (seems like  all the
content that was  supposed to be applied is actually  there, it just got
inserted a couple of lines away from where it should have been)

Anyone up for a  challenge? I am available to help  anyone and offer any
advice in  order to get more  brains looking at this  stuff. Anyone from
the Darcs community reading this? Feel free to offer advice.

Good luck!

James.


---------

* May not be true!  (I have errors in my resultant Git tree, go figure!)

About

A Ruby script that can export a Darcs 1 repo to Git without invoking Darcs.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages