Git Mirror
- git_mirror
While creating a git mirror is as simple as git clone --mirror, unfortunately this git command does not support git submodules or lfs. The subcommands mirror, push and clone in this file, and associated functions, help in creating and subsequently cloning a mirror of a project with submodules and/or git lfs.
Example
Example usage:
git_mirror mirror https://github.com/visionsystemsinc/vsi_common.git main
- Mirror the repository and recursively create mirrors of all submodules currently in the main branch.
Transfer vsi_common_prep/transfer_{date}.tgz to your destination.
On the destination, create a directory, e.g., vsi_common_prep, and move the archive into it.
Extract the archive (the archive will extract directly into this directory).
Write an info.env file mapping each repository to its mirrored URL:
repos[.]=https://git-server.com/foobar/vsi_common.git
repos[docker/recipes]=https://git-server.com/foobar/recipes.git
git_mirror push ./info.env ./transfer_extracted_dir/
- Push the mirrored repository and all submodules to a new git server as defined by info.env
git_mirror clone ./info.env ./my_project_dir/
- Clone recursively from the new mirror
- next_section
Change the color of the text output to stdout/stderr
- Arguments:
[${@}] - An optional string (or list of strings) to herald the next section
- Output:
A temporary file is created which stores the index into the COLORS array
Change the current text color. A temporary file is created to track the current index into the COLORS array; this index is updated with each call to this function. This temporary file will automatically be deleted
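The index-in-a-temporary-file idea described above can be sketched as follows. This is a minimal illustration, not the script's actual implementation: the palette, the COLOR_INDEX_FILE variable and the next_color function name are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical palette; the real COLORS array may differ.
COLORS=($'\e[31m' $'\e[32m' $'\e[33m')

# Temporary file that remembers the current index between calls,
# even when a call happens inside a subshell.
COLOR_INDEX_FILE="$(mktemp)"
echo 0 > "${COLOR_INDEX_FILE}"
# Delete the temporary file automatically when the shell exits.
trap 'rm -f "${COLOR_INDEX_FILE}"' EXIT

next_color() {
  local index
  index="$(cat "${COLOR_INDEX_FILE}")"
  # Switch the terminal color, then print the optional heading.
  printf '%s' "${COLORS[index]}"
  if [ "$#" -gt 0 ]; then
    printf '%s\n' "$*"
  fi
  # Advance the index and wrap around at the end of the palette.
  echo $(( (index + 1) % ${#COLORS[@]} )) > "${COLOR_INDEX_FILE}"
}
```

Calling next_color "Mirroring submodules" would print the heading in the next color of the palette. Storing the index in a file rather than a variable keeps it consistent across subshells.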
- get_config_submodule_names
Get list of initialized submodules, non-recursive
- Output:
A newline separated list of submodules for the current git repository
Get a list of submodules of the current git repository. This command is non-recursive, i.e., submodules of submodules are not returned. An implementation of this feature is provided for older versions of git (<2.6).
Example
$ get_config_submodule_names # from within ./vsi_common/
submodule.docker/recipes
Note
Unlike git submodule foreach --quiet 'echo "${name}"', this function works for submodules that have been init’d but not updated
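One way to obtain such a list without relying on newer git options is to post-process the key/value lines printed by git config --get-regexp. The sketch below assumes that each initialized submodule has a submodule.<name>.url entry and that names contain no spaces. The function name submodule_names_from_config is hypothetical, and this is not necessarily how the script itself does it.

```shell
#!/usr/bin/env bash
# Reads "key value" lines on stdin, as printed by
#   git config --get-regexp '^submodule\..*\.url$'
# and prints each key without its trailing ".url".
submodule_names_from_config() {
  local line key
  while IFS= read -r line; do
    key="${line%% *}"     # drop the value (everything after the first space)
    echo "${key%.url}"    # drop the ".url" suffix
  done
}
```

Inside a repository it could be used as: git config --get-regexp '^submodule\..*\.url$' | submodule_names_from_config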
- git_mirror_has_lfs
Is git lfs available for the specified implementation of git
- Output:
0 - git lfs is available
1 - git lfs is not available
- Internal Use:
__git_mirror_has_lfs - A variable to save this state
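The pattern of probing once and saving the result in a variable could look like the sketch below. The GIT variable (naming the git executable to probe), the has_lfs function and the __has_lfs_cache variable are assumptions made for illustration; only git lfs version is a real command.

```shell
#!/usr/bin/env bash
# GIT names the git executable to probe (an assumption of this sketch).
: "${GIT:=git}"

has_lfs() {
  # Probe only once; later calls reuse the cached exit status.
  if [ -z "${__has_lfs_cache+set}" ]; then
    if "${GIT}" lfs version > /dev/null 2>&1; then
      __has_lfs_cache=0
    else
      __has_lfs_cache=1
    fi
  fi
  return "${__has_lfs_cache}"
}
```

A caller would then write: if has_lfs; then echo "lfs objects will be mirrored"; fi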
- clone_submodules
Mirror a submodule and all of its submodules (recursively)
This is a helper function to git_mirror_main. Create a mirror of each submodule in a git repository, recursively. Each mirror is stored according to its full relative path in the project repository in a unique temporary directory created in GIT_MIRROR_PREP_DIR. Submodules of a submodule are processed recursively in a depth-first fashion.
WARNING: git submodule foreach runs commands via sh; however, this function starts bash and sources this script for its variables and functions, so the commands really run in bash again.
- Parameters:
GIT_MIRROR_PREP_DIR - The directory in which to mirror the repositories
[base_submodule_path] - The (relative) path from the project repository to a submodule with, potentially, submodules of its own that need to be mirrored/updated. This must also be the CWD. If unset, mirror the submodules of the parent repository.
- Output:
A mirrored submodule located at GIT_MIRROR_PREP_DIR/{temp_dir}/base_submodule_path
Note
This function assumes base_submodule_path is also the PWD
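The sh-to-bash hand-off mentioned in the warning can be illustrated without a repository. In this sketch, sh -c stands in for git submodule foreach (which passes its command string to sh), and the command string immediately re-enters bash and sources a helper file. The helper file, the HELPER_SCRIPT variable and the show_shell function are hypothetical.

```shell
#!/usr/bin/env bash
# Write a tiny helper script that defines a bash-only function (it uses an array).
helper="$(mktemp)"
cat > "${helper}" <<'EOF'
show_shell() {
  local parts=(running in bash "${BASH_VERSION:+yes}")
  echo "${parts[*]}"
}
EOF

# Stand-in for git submodule foreach: the command string is handed to sh,
# which starts bash, which sources the helper before calling the function.
export HELPER_SCRIPT="${helper}"
result="$(sh -c 'bash -c "source \"\${HELPER_SCRIPT}\"; show_shell"')"
echo "${result}"
rm -f "${helper}"
```

With git, the same command string would be passed as the argument of git submodule foreach --quiet.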
- update_submodules
Init/update a submodule and any of its submodules (recursively)
This is a helper function to git_clone_main. Recursively clone a submodule mirrored with git_mirror_main, and fix up the submodules’ remote URLs according to the mapping specified by $1.
WARNING: git submodule foreach runs commands via sh; however, this function starts bash and sources this script for its variables and functions, so the commands really run in bash again.
- Arguments:
$1 - A file specifying the mapping between each repository and its mirror URL; see _git_mirror_load_info
[base_submodule_path] - The (relative) path from the project repository to a submodule with, potentially, submodules of its own that need to be cloned/updated. This must also be the CWD. If unset, update the submodules of the parent repository.
[prefix] - A variable set when calling git submodule foreach (and, by necessity, re-exported by this function) to the submodule path (sm_path) from the .gitmodules file. After git v1.8.3, prefix was replaced with displaypath
[displaypath] - A variable set when calling git submodule foreach (and, by necessity, re-exported by this function) to the relative path from the current working directory to the submodule’s root directory
- Output:
The recursively cloned submodule
Note
This function assumes base_submodule_path is also the PWD
- sync_submodules
Sync the submodules to the mirrored remote (non-recursive)
This is a helper function to update_submodules. Sync the submodules’ remote URLs (non-recursively), a la git submodule sync; however, instead of referring to the .gitmodules file, use the mapping specified by repo_paths and repo_urls.
- Arguments:
$1 - The (relative) path from the project repository to a submodule with, potentially, submodules of its own that need to be sync’ed. This must also be the CWD. If unset, sync the submodules of the parent repository.
- Parameters:
repo_paths - The array of (relative) paths from the root of the project repository to each submodule (recursively); see _git_mirror_load_info
repo_urls - The corresponding URL of each repo_path
Note
This function assumes $1 is also the PWD
- git_mirror_main
Mirror the main repository and all submodules (recursively)
Downloads a mirror of a git repository and all of its submodules. The normal git clone --mirror command does not support submodules at all. This at least clones all the submodules available in the specified branch (git’s init.defaultBranch by default).
The script creates a directory, referred to as a prep directory (or prep_dir), which will contain all of the mirrored repositories plus a single transfer_{date}.tgz archive file containing all of these repositories, lfs objects, etc. Only this tgz file needs to be transferred to your destination.
Subsequent calls to git_mirror_main can use the existing prep directory as a cache, updating faster than the first time.
Subsequent calls also create a second tgz file, transfer_{date1}_transfer_{date2}.tgz (on supported platforms). This is an incremental archive file. Instead of having to bring in an entire archive, only the incremental file is needed (plus the original full archive).
After you have moved the transfer archive to its destination, you can use git_push_main to push these mirrored repositories to a new git server.
- Arguments:
$1 - URL of the main git repository. On subsequent calls to this function, the prep (cache) dir created by this function can be used in lieu of the repository’s URL
[$2] - The git branch from which to identify the submodules. Default: git’s init.defaultBranch
- Parameters:
[GIT_MIRROR_PREP_DIR] - The output directory in which to mirror the repositories; Default: ${PWD}/{repo_name}_prep
- Output:
A prep directory which will contain all of the repositories plus a single transfer_{date}.tgz
GIT_MIRROR_PREP_DIR - The path to the mirrored repositories
GIT_MIRROR_MAIN_REPO - The main repository’s name. Based off of $1; e.g., vsi_common if the URL is https://github.com/VisionSystemsInc/vsi_common.git
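Deriving the repository name and the default prep directory from the URL can be sketched with plain parameter expansion. The function names repo_name_from_url and default_prep_dir are hypothetical, and the actual script may derive the name differently.

```shell
#!/usr/bin/env bash
# Print the repository name for a URL: the last path component without ".git".
repo_name_from_url() {
  local url="${1%/}"              # tolerate a trailing slash
  local name="${url##*[/:]}"      # keep what follows the last "/" or ":"
  echo "${name%.git}"
}

# Print the default prep directory described above: ${PWD}/{repo_name}_prep
default_prep_dir() {
  echo "${PWD}/$(repo_name_from_url "$1")_prep"
}
```

For example, repo_name_from_url https://github.com/VisionSystemsInc/vsi_common.git prints vsi_common.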
Example
Mirror the vsi_common repository and all of its submodules in a directory called ./vsi_common_prep. Then, create an archive file that can be transferred to your destination.
git_mirror_main https://github.com/visionsystemsinc/vsi_common.git main
# produces ./vsi_common_prep/transfer_2020_03_02_14_16_09.tgz
Example
Calling git_mirror_main again will use the vsi_common_prep dir as a cache, and then create an incremental file.
git_mirror_main vsi_common_prep
# produces ./vsi_common_prep/transfer_2020_03_02_14_24_12_transfer_2020_03_02_14_16_09.tgz
Example
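Both of these examples result in identical mirrors on your destination. Either extract the original full archive followed by the incremental archive:
tar zxf transfer_2020_03_02_14_16_09.tgz
tar --incremental -zxf transfer_2020_03_02_14_24_12_transfer_2020_03_02_14_16_09.tgz
or extract the latest full archive on its own:
tar zxf transfer_2020_03_02_14_24_12.tgz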
Note
git_mirror_main does not mirror all submodules that have ever been part of the repo, only those from a specific branch/SHA/tag you specify (git’s init.defaultBranch by default). This is because trying to mirror all submodules from the past could be very lengthy, and is very likely to include URLs that do not exist anymore.
Bugs
If the first argument is a local path to a git repository (as opposed to a URL or a path to an existing prep dir), it must be an absolute path.
- git_mirror_repos
Mirror the main repository and all submodules (recursively)
Downloads a mirror of a git repository and all of its submodules. The normal git clone --mirror command does not support submodules at all. This at least clones all the submodules available in the specified branch (git’s init.defaultBranch by default).
Creates a directory, referred to as a prep directory (or prep_dir), which will contain all of the mirrored repositories. Subsequent calls to git_mirror_repos can use the existing prep directory as a cache, updating faster than the first time.
- Arguments:
$1 - URL of the main git repository. On subsequent calls to this function, the prep (cache) directory created by this function can be used in lieu of the repository’s URL
[$2] - The git branch from which to identify the submodules. Default: git’s init.defaultBranch
- Parameters:
[GIT_MIRROR_PREP_DIR] - The output directory in which to mirror the repositories; Default: ${PWD}/{repo_name}_prep
- Output:
A prep directory which will contain all of the repositories
GIT_MIRROR_PREP_DIR - The path to the mirrored repositories
GIT_MIRROR_MAIN_REPO - The main repository’s name. Based off of $1; e.g., vsi_common if the URL is https://github.com/VisionSystemsInc/vsi_common.git
Note
git_mirror_repos does not mirror all submodules that have ever been part of the repo, only those from a specific branch/SHA/tag you specify (git’s init.defaultBranch by default). This is because trying to mirror all submodules from the past could be very lengthy, and is very likely to include URLs that do not exist anymore.
Bugs
If the first argument is a local path to a git repository (as opposed to a URL or a path to an existing prep dir), it must be an absolute path.
- archive_mirrors
Create an in-place archive of a directory
Create a transfer_{date}.tgz archive file of a directory. This archive is created within the same directory.
Subsequent calls also create a second tgz file, transfer_{date1}_transfer_{date2}.tgz (on supported platforms). This is an incremental archive file. Only the incremental file is needed (plus the original full archive).
- Arguments:
$1 - A directory to archive in-place
- Output:
A transfer_{date}.tgz archive file and, on subsequent calls, an incremental transfer_{date1}_transfer_{date2}.tgz
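A minimal sketch of the full (non-incremental) case follows. It assumes the date stamp format seen in the examples (year_month_day_hour_minute_second). The archive_in_place function name is hypothetical, and building the archive in a temporary file first is a choice of this sketch, not necessarily what the script does. Incremental archives are left out.

```shell
#!/usr/bin/env bash
# $1 - the directory to archive; prints the path of the new archive.
archive_in_place() {
  local dir="$1" stamp tmp archive
  stamp="$(date +%Y_%m_%d_%H_%M_%S)"
  archive="${dir}/transfer_${stamp}.tgz"
  # Build the archive outside of the directory so that tar never reads
  # a file it is still writing; skip archives from earlier runs.
  tmp="$(mktemp)"
  if ! tar -czf "${tmp}" --exclude='transfer_*.tgz' -C "${dir}" .; then
    rm -f "${tmp}"
    return 1
  fi
  mv "${tmp}" "${archive}"
  echo "${archive}"
}
```

Because the members are stored relative to the directory, extracting the archive inside an empty directory on the destination recreates the contents directly in it.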
- _git_mirror_load_info
Load the mapping between each repository and its mirror URL
- Arguments:
$1 - A file specifying the mapping between each repository and its mirror URL
- Output:
repo_paths - The array of (relative) paths from the root of the project repository to each submodule (recursively). (That is, a path as printed by echo . && git submodule foreach -q 'echo "${displaypath}"' assuming the CWD is the root of the project repository)
repo_urls - The corresponding URL of each repo_path
Example
$ cat info.env
repo_paths=(
.
docker/recipes
)
repo_urls=(
https://git-server.com/foobar/vsi_common.git
https://git-server.com/foobar/recipes.git
)
Or alternatively, using associative arrays:
repos[.]=https://git-server.com/foobar/vsi_common.git
repos[docker/recipes]=https://git-server.com/foobar/recipes.git
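Both file layouts can be loaded by sourcing the file after declaring repos as an associative array, then flattening it into the two parallel arrays. This is a sketch under assumptions: it requires bash 4 or newer, the load_info function name is hypothetical, and the real loader may work differently. Note that sourcing executes whatever the file contains, so the file must be trusted.

```shell
#!/usr/bin/env bash
# $1 - the info file; sets the global arrays repo_paths and repo_urls.
load_info() {
  local info_file="$1" path
  # Declared before sourcing so that repos[.]=... is stored under the string key "."
  local -A repos=()
  repo_paths=()
  repo_urls=()
  source "${info_file}"
  # If the file used the associative form, flatten it into parallel arrays.
  # (If it set repo_paths/repo_urls directly, repos is empty and nothing is added.)
  for path in "${!repos[@]}"; do
    repo_paths+=("${path}")
    repo_urls+=("${repos[${path}]}")
  done
}
```

The iteration order of an associative array is unspecified, which is why both arrays are filled in the same loop: index i of repo_paths always matches index i of repo_urls.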
- _git_mirror_get_url
Return the URL for a given submodule
- Arguments:
$1 - A path that can be found in repo_paths
- Parameters:
repo_paths - The array of (relative) paths from the root of the project repository to each submodule (recursively); see _git_mirror_load_info
repo_urls - The corresponding URL of each repo_path
- Output:
The URL corresponding to $1
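The lookup over the two parallel arrays amounts to a linear search. The sketch below uses the example mapping from this page; the get_url function name is hypothetical.

```shell
#!/usr/bin/env bash
repo_paths=(. docker/recipes)
repo_urls=(
  https://git-server.com/foobar/vsi_common.git
  https://git-server.com/foobar/recipes.git
)

# $1 - a path from repo_paths; prints the URL stored at the same index.
get_url() {
  local i
  for i in "${!repo_paths[@]}"; do
    if [ "${repo_paths[i]}" = "$1" ]; then
      echo "${repo_urls[i]}"
      return 0
    fi
  done
  return 1   # no mapping for this path
}
```

Parallel indexed arrays keep the lookup usable on bash versions that lack associative arrays.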
- git_clone_main
Clone recursively from the new mirror
Once the repository has been mirrored to the new git server with git_push_main, it can be cloned. However, because the .gitmodules file will point to different URLs than the mirrors, and changing the .gitmodules file will change the repo, which we don’t want to do, we need to make a shallow clone of the repository, init the submodules, modify the submodules’ URLs, and then finally update the submodules. All of this has to be done recursively for each submodule. This is very tedious, so this script does it all for you.
- Arguments:
$1 - A file specifying the mapping between each repository and its mirror URL; see _git_mirror_load_info
[$2] - The directory in which to clone the repo
Example
git_clone_main info.env ~/
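For a single level of submodules, the sequence described above (clone, init, rewrite the URLs in the local config, update) can be sketched as follows. The run and clone_one_level helpers and the DRY_RUN switch are hypothetical, the sketch assumes each submodule's name equals its path, and it does not recurse. The git commands themselves (clone, submodule init, config, submodule update) are standard.

```shell
#!/usr/bin/env bash
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 - mirror URL of the main repo, $2 - destination directory,
# remaining arguments - "submodule_path=mirror_url" pairs
clone_one_level() {
  local main_url="$1" dest="$2" pair
  shift 2
  run git clone "${main_url}" "${dest}"
  run git -C "${dest}" submodule init
  for pair in "$@"; do
    # Override the URL in .git/config only; .gitmodules stays untouched.
    run git -C "${dest}" config "submodule.${pair%%=*}.url" "${pair#*=}"
  done
  run git -C "${dest}" submodule update
}
```

Running it with DRY_RUN=1 prints the four git commands instead of executing them, which is a convenient way to review the plan before touching the network.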
- git_push_main
Push the mirrored repository and all submodules to a new git server
After transferring the archive file created by git_mirror_main to a prep directory on your destination and extracting the archive into it, this function pushes all the mirrored repositories in the extracted archive to your own mirrors on a new git server.
Note
The mirrors must be initialized on the git server.
Because the URLs for your mirrors will be different from the original repo URLs in .gitmodules, and modifying the URLs will change the git repo, which we do not want to do, you must create a file specifying the mapping between each repository and its mirror URL. The main repo is referred to as . while the rest of the repos are referred to by the relative path with respect to the main repo (e.g. external/vsi_common). These need to be stored in an associative array called repos.
- Arguments:
$1 - A file specifying the mapping between each repository and its mirror URL; see _git_mirror_load_info
$2 - The mirrored repositories (extracted from the archive) created by git_mirror_main
- Parameters:
[GIT_MIRROR_FORCE_PUSH_REFS] - A flag specifying whether to force push refs, should it be necessary. Default: 1 (force-push refs)
Example
In this example, the main repo’s mirror URL is https://git-server.com/foobar/vsi_common.git, and the submodule stored at ./docker/recipes has the URL https://git-server.com/foobar/recipes.git. This file is also used by git_clone_main. (Any valid git URL format can be used.)
$ cat info.env
repos[.]=https://git-server.com/foobar/vsi_common.git
repos[docker/recipes]=https://git-server.com/foobar/recipes.git
$ git_push_main info.env vsi_common_prep
- git_tag_main
Tag the refs in the mirrored repository and all submodules
Before pushing the mirrored repository and all of its submodules to the new git server with git_push_main, tag the refs so that if, during the next transfer, they become dereferenced (due to a force push, which is sometimes necessary), they are not lost. For a ref of the form refs/heads/main, this (annotated) tag takes the (long) form of refs/tags/just_git_mirror/{datetime}/heads/main
- Arguments:
$1 - A file specifying the mapping between each repository and its mirror URL; see _git_mirror_load_info
$2 - The mirrored repositories (extracted from the archive) created by git_mirror_main
- Parameters:
[GIT_MIRROR_TAG_REFS_REGEX] - An extended regular expression specifying which refs to tag. The refs should be matched in their long form, e.g., refs/heads/main or refs/tags/v1.0.0. Default: ^refs/heads/.+
Note
Because this function creates a tag directory called just_git_mirror/ (note the trailing slash), a user can no longer create a tag called just_git_mirror
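The mapping from a ref to its backup tag name, filtered by the regular expression, can be sketched as below. The backup_tag_for_ref function name is hypothetical and the datetime format is passed in by the caller because it is not specified here; the default regular expression is the one documented above.

```shell
#!/usr/bin/env bash
# Default taken from the documentation above.
: "${GIT_MIRROR_TAG_REFS_REGEX:=^refs/heads/.+}"

# $1 - a ref in long form, e.g. refs/heads/main
# $2 - a datetime string (its exact format is an assumption of the caller)
# Prints the long-form tag name, or returns 1 if the ref is not selected.
backup_tag_for_ref() {
  local ref="$1" stamp="$2"
  if [[ ${ref} =~ ${GIT_MIRROR_TAG_REFS_REGEX} ]]; then
    echo "refs/tags/just_git_mirror/${stamp}/${ref#refs/}"
  else
    return 1
  fi
}
```

With the default expression, refs/heads/main is selected while refs/tags/v1.0.0 is skipped; setting GIT_MIRROR_TAG_REFS_REGEX='^refs/(heads|tags)/.+' would select both.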