Daniel Morrison

Git Submodules, Part 2

In response to my previous article, Ryan McGeary made some good comments that I think warrant a follow-up.

but isn’t there a small advantage to a piston or braid-like approach where all the files are in your own repository?

Definitely in some cases, but I value keeping my repository small over having everything inside it, especially during development, when I want to stay up to date with my submodules (Rails in particular).

Once a project launches (and/or slows down), storing external dependencies in the repository is more desirable, and my opinion may change. Same goes for gems. I don’t like storing them in the project unless or until I need to.

Also, adding Piston/Braid is one more dependency.

What happens during deployments? Doesn’t the deployment now rely on a possible many remote repositories to succeed? Consider both vendor/rails and a possible many vendor/plugins each with their own submodule.

This is an important thought, and about the only thing that makes me hesitant about submodules, but I still like ’em.

If you go with a typical Capistrano deployment, you’ll have to download each submodule each time you deploy (same as with svn:externals). This is one of the biggest problems I have with svn:externals, because if a single plugin isn’t available (I’m looking at you, acts_as_taggable_on_steroids), it kills the checkout.

The same is true with git submodules, but there is a very important difference: when you hit the network.

When I do a git pull, it doesn’t look at the submodules at all. Even if I changed them. Even when I do a git submodule update, it will only try to pull new files if I explicitly updated the submodule (see previous post).

Contrast that with svn:externals, which will have to hit the remote repository just to see if anything’s changed.
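To see the Git side of this for yourself, here’s a rough sketch using throwaway local repositories. Everything in it is made up for illustration: the lib, app, and checkout directories, the g helper, and the demo identity. The protocol.file.allow setting is only there because recent Git releases refuse local-path submodule clones by default. Depending on your Git version and configuration, a pull may still fetch submodule objects in the background, but by default it leaves the submodule checkout where it was.

```shell
#!/bin/sh
# Sketch only: all repository names and paths here are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Helper (an assumption for this demo): fixed commit identity, and permission
# to use local paths as submodule URLs on recent Git versions.
g() {
  git -c user.name=demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}

# An upstream library with one commit (call it A).
git init -q lib
g -C lib commit -q --allow-empty -m "A"

# The main project pins the library at A.
git init -q app
g -C app submodule --quiet add "$work/lib" vendor/lib
g -C app commit -q -m "Add vendor/lib"

# A second checkout, standing in for a teammate or a deploy server.
g clone -q --recurse-submodules app checkout

# Upstream moves on to B, and the main project explicitly adopts it.
g -C lib commit -q --allow-empty -m "B"
g -C app/vendor/lib pull -q
g -C app commit -q -am "Update vendor/lib to B"

cd checkout
before=$(git -C vendor/lib rev-parse HEAD)
g pull -q                      # brings in the new pointer, not a new checkout
after_pull=$(git -C vendor/lib rev-parse HEAD)
g submodule --quiet update     # only now does vendor/lib move to B
after_update=$(git -C vendor/lib rev-parse HEAD)

echo "before pull:            $before"
echo "after git pull:         $after_pull"
echo "after submodule update: $after_update"
```

The first two SHAs printed should match, and only the third should change, which is the point: the submodule moves when you ask it to, not whenever the main project is pulled.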

If you switch your deployment strategy to set :deploy_via, :remote_cache, your average deploy won’t look for submodule updates at all. You can safely deploy even if a submodule’s remote isn’t available, as its files will already be on the server. When you do need to update a submodule, you can have a cap task do that.

Obviously there are other ideas. Maybe I’ll post on that later too.

I think I also missed an important point. How does Matt’s git submodule update avoid pulling a later version of edge rails than the original revision that you intended? The .gitmodules file only seems to specify the public clone url without any commit revision.

The .gitmodules file only stores each submodule’s path and URL, but the exact commit does get stored with the main project. I created a project on GitHub to demonstrate what happens. Specifically look at this commit, which adds the submodule, to see what’s going on:

<pre>+Subproject commit 60be4b09f51d2560802ebd744893bb6f737ef57c</pre>

Clone the repository and see how it works locally too. Even though it includes all of Rails, the .git directory (and my GitHub usage) is tiny.
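If you don’t want to clone anything, the same thing can be reproduced locally. This is a minimal sketch with hypothetical lib and app repositories standing in for Rails and the main project. It prints .gitmodules, which holds only the path and URL, and then the tree entry for the submodule, which is a commit entry carrying the pinned SHA.

```shell
#!/bin/sh
# Sketch only: lib and app are hypothetical stand-ins, not real projects.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for an external dependency such as Rails.
git init -q lib
git -C lib -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first library commit"

# The main project, with the library added as a submodule.
git init -q app
cd app
# Recent Git versions refuse local-path submodule clones unless the
# file transport is explicitly allowed.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Add vendor/lib as a submodule"

# .gitmodules holds only the path and URL...
cat .gitmodules
# ...while the tree holds a "commit" entry (mode 160000) naming the pinned SHA.
git ls-tree HEAD vendor/lib
```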

7 Comments

  1. Ryan McGeary — April 12, 2008

    Daniel, Thanks for taking the time on this detailed response. Good info, especially on how submodules handle the commit revision. I now have a better understanding of things which will help me make a more informed decision on my git repositories, especially since piston 2.0 isn’t ready yet, and braid currently has self-proclaimed warnings of being riddled with issues.

  2. John-Paul Bader — April 19, 2008

    Hey there, this comment is off topic and concerns your awesome nested set plugin, which I have problems with. I’m using PostgreSQL, and it seems that “prune_from_tree” doesn’t work in Postgres, where line 526 in awesome_nested_set.rb would generate this query:

    UPDATE pages SET lft = (lft - 2) WHERE (lft >= 36) ORDER BY lft

    which is incorrect syntax in Postgres. I was also wondering if the order statement is really necessary at this particular part of the module. I couldn’t figure out where it comes from anyway, because it’s not in line 526. Maybe you have the motivation to fix this for PostgreSQL. Feel free to contact me via mail so I can give you more feedback in case you don’t ever use Postgres.

    Kind regards, John

  3. John-Paul Bader — April 19, 2008

    Sorry, wrong line number: it’s 526 in my version; in the GitHub version it’s 534

    self.class.base_class.update_all("#{left_column_name} = (#{left_column_name} - #{diff})", "#{left_column_name} >= #{right}")

  4. hgh — May 09, 2008

    I LOVE GIT! Thanks for the awesome code!

    Eddie HGH

  5. Gustavo — September 14, 2008

    Daniel
    Great article. Can you tell me how to manage those plugins with an install script? How can I install them after adding them as submodules?

  6. AtothTewTywopy — September 22, 2008

    thats for sure, man

  7. personal growth — August 11, 2010

    I really love it… thanks for the codes!