Saturday, February 12, 2011

Delete file contents from SVN history

I have a local svn repository on my PC. I have been using it for a hobby project and it wasn't meant to be accessible to anyone, so I committed files with passwords in them.

Now I'm thinking of making the repository available to other people, and I don't want that data in it.

Is there a way to crawl the repository and replace all the passwords and account data with a text like "xxxxxxxxxx"?

  • The easiest thing would be to check out the contents of the repository, remove all the sensitive information, import the working directory into a new repository, and make that available to the public. It is very likely that whoever will be using your project will be interested in its current state, not in the change history.

    From Dima
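
A minimal sketch of the approach Dima describes, assuming the working copy has already been exported with `svn export` and cleaned by hand. The function names, the paths, and the commit message are hypothetical; only `svnadmin create` and `svn import` come from Subversion itself.

```shell
# List files under a directory that still contain a given literal string.
# Prints nothing when the tree is clean.
find_leftover_secrets() {
    local dir=$1 secret=$2
    grep -rlF -- "$secret" "$dir" || true
}

# Import an already-cleaned export as revision 1 of a brand-new repository.
# Refuses to continue while the secret is still present in the tree.
# new_repo must be an absolute path, because it is used in a file:// URL.
publish_snapshot() {
    local clean_dir=$1 new_repo=$2 secret=$3
    if [ -n "$(find_leftover_secrets "$clean_dir" "$secret")" ]; then
        echo "secret still present in $clean_dir, aborting" >&2
        return 1
    fi
    svnadmin create "$new_repo" || return 1
    svn import --quiet -m "Initial public import" "$clean_dir" "file://$new_repo"
}
```

Usage would look something like `svn export file:///path/to/repo/trunk /tmp/clean`, then editing the files under `/tmp/clean`, then `publish_snapshot /tmp/clean /path/to/new/repo 'the-password'`. The new repository starts at revision 1, so none of the old history (and none of the old passwords) is carried over.
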
  • If you do an

    svnadmin dump /path/to/repo > mysvn
    

    you'll get a flat file containing the data of every revision in your repository. From there, you should be able to edit the file manually (if your repo is of any significant size, you may need a terminal editor such as pico, nano, or vi).

    Lastly, you would reload this dump into a new repository. This preserves the history of your project.

    svnadmin load /path/to/new/repo < mysvn
    

    This practice would be considered a no-no in any corporate environment where you undergo auditing, etc, but for a hobby project it may just do the trick for you.

    EDIT: I've had to do this before when trying to merge two different repositories, which required adding a new "directory node" to the flat file. I'm not sure whether SVN hashes the files or the changes to detect tampering.

    Iain : It does in fact check the checksums as it reloads the data. Just attempted this approach for a similar problem. :)
    From Matt
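
The dump, inspect, and reload round trip from Matt's answer can be sketched as three small helpers. The function names and file paths are hypothetical, and `grep -a` (treat binary data as text) is assumed to be available, as it is in GNU and BSD grep.

```shell
# Dump every revision of a repository into one flat file.
dump_repo() {
    local repo=$1 dump=$2
    svnadmin dump --quiet "$repo" > "$dump"
}

# Count the lines of a dump file that contain a literal string,
# to see how much editing is ahead before opening the file.
count_secret_lines() {
    local dump=$1 secret=$2
    grep -a -c -F -- "$secret" "$dump" || true
}

# Create a fresh repository and load the (edited) dump into it.
load_repo() {
    local dump=$1 new_repo=$2
    svnadmin create "$new_repo" || return 1
    svnadmin load --quiet "$new_repo" < "$dump"
}
```

For example: `dump_repo /path/to/repo mysvn`, then `count_secret_lines mysvn 'the-password'`, then edit `mysvn`, then `load_repo mysvn /path/to/new/repo`. As Iain notes, the load fails on a checksum mismatch if file contents were changed without updating the recorded checksums.
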
  • Check the Subversion FAQ: How do I completely remove a file from the repository's history?

    Witek : The FAQ has moved to http://subversion.apache.org/faq.html#removal
    Romulo A. Ceccon : Thanks, Witek. I've updated the link.
  • It seems that there was a misunderstanding. I didn't want to delete a file. I want to delete the passwords stored in the repository. I don't want to lose the files, the revisions, the modifications, or the history.

    What I did is what Matt suggested: dump the repository and edit it.

    To do this, I used a hexadecimal editor (khexedit) and replaced the password string with a string of the same length. That way, I didn't have to update the size fields.

    Next, I needed to update the md5 fields with the hash of the new file contents. For this, I wrote a script that used the output of "svnadmin load" to provoke an error and read the old and new md5 from that error. It then replaced the old hash with sed and repeated the process until there were no more errors.

    BCS : could you post code?
    From naw
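
naw's script was never posted, so what follows is only a rough sketch of the procedure described above, not the original code. It assumes GNU sed (for handling a dump that contains binary data), assumes the mismatch message from `svnadmin load` contains `expected:` and `actual:` lines each followed by a checksum, and restricts the strings to letters, digits, `_` and `-` so that no sed escaping is needed. All function names are hypothetical.

```shell
# Replace a string in the dump with another of the same byte length,
# so the length fields in the dump stay valid.
replace_same_length() {
    local dump=$1 old=$2 new=$3 old_len new_len
    case $old$new in
        *[!A-Za-z0-9_-]*) echo "unsupported characters, use a hex editor" >&2
                          return 1 ;;
    esac
    old_len=$(printf '%s' "$old" | wc -c)
    new_len=$(printf '%s' "$new" | wc -c)
    if [ "$((old_len))" -ne "$((new_len))" ]; then
        echo "replacement must have the same length" >&2
        return 1
    fi
    LC_ALL=C sed "s/$old/$new/g" "$dump" > "$dump.tmp" && mv "$dump.tmp" "$dump"
}

# Read an svnadmin load error message on stdin and print
# "expected actual" if it reports a checksum mismatch.
parse_checksum_mismatch() {
    awk '/expected:/ {e = $2} /actual:/ {a = $2}
         END {if (e != "" && a != "") print e, a}'
}

# Load the dump into scratch repositories, patching one stale checksum
# per failed attempt, until a load succeeds. Prints the resulting repo path.
fix_checksums() {
    local dump=$1 scratch=$2 n=0 repo err pair old new
    while :; do
        n=$((n + 1))
        repo="$scratch/try-$n"
        svnadmin create "$repo" || return 1
        if err=$(svnadmin load --quiet "$repo" < "$dump" 2>&1 >/dev/null); then
            echo "$repo"
            return 0
        fi
        rm -rf "$repo"
        pair=$(printf '%s\n' "$err" | parse_checksum_mismatch)
        [ -n "$pair" ] || { printf '%s\n' "$err" >&2; return 1; }
        old=${pair% *}    # checksum recorded in the dump
        new=${pair#* }    # checksum computed from the edited contents
        # Stop rather than loop forever if the stale checksum is not in the file.
        grep -a -q -F -- "$old" "$dump" || return 1
        LC_ALL=C sed "s/$old/$new/g" "$dump" > "$dump.tmp" && mv "$dump.tmp" "$dump"
    done
}
```

A run might look like `replace_same_length mysvn hunter2ok xxxxxxxxx` followed by `fix_checksums mysvn /tmp/scratch`. Each pass fixes only the first mismatch that the load reports, so a dump with many edited revisions needs as many passes as there are stale checksums.
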
