eric | Feb. 25, 2020, 4:08 p.m.
GitHub rejects files larger than 100 MB in a regular Git repository. So what can you do if you have files bigger than that? Git Large File Storage (Git LFS) is an open-source extension to Git that lets you work with large files. With it, you can store files up to 2 GB in size.
If you have files in a repository that are bigger than 100 MB, you need to use the git-lfs (LFS: Large File Storage) extension on your client. The server must also support git-lfs. GitHub includes git-lfs support.
The upper file size limit for regular Git files on GitHub is 100 MB. The upper file size limit with git-lfs is 2 GB. [1], [2]
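Before installing anything, it can help to check which files in a working tree are actually over the limit. The following is a minimal sketch, not part of Git or Git LFS: the helper name find_large_files is made up for this example, and it assumes a find that understands the M size suffix (GNU and BSD find both do).

```shell
#!/usr/bin/env bash
# List regular files larger than a threshold (default 100 MiB), skipping .git.
# find_large_files is a hypothetical helper name used only in this sketch.
find_large_files() {
    local dir="${1:-.}"
    local limit_mib="${2:-100}"
    # -size +NM matches files whose size, rounded up to whole MiB, exceeds N
    find "$dir" -type f -size +"${limit_mib}"M -not -path '*/.git/*'
}

# Scan the current directory with the 100 MB limit mentioned above
find_large_files . 100
```

Any path it prints is a candidate for tracking with git-lfs. An empty result means nothing in the tree is over the threshold.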
On Debian or Ubuntu, first add the package repository:
$ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
[sudo] password for ... :
Detected operating system as Ubuntu/xenial.
Checking for curl...
Detected curl...
Checking for gpg...
Detected gpg...
Running apt-get update... done.
Installing apt-transport-https... done.
Installing /etc/apt/sources.list.d/github_git-lfs.list...done.
Importing packagecloud gpg key... done.
Running apt-get update... done.
The repository is setup! You can now install packages.
Install git-lfs with apt:
$ sudo apt-get install git-lfs
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed
  git-lfs
0 to upgrade, 1 to newly install, 0 to remove and 0 not to upgrade.
Need to get 2.815 kB of archives.
After this operation, 11,5 MB of additional disk space will be used.
Get:1 https://packagecloud.io/github/git-lfs/ubuntu xenial/main amd64 git-lfs amd64 2.3.4 [2.815 kB]
Fetched 2.815 kB in 2s (978 kB/s)
Selecting previously unselected package git-lfs.
(Reading database ... 228238 files and directories currently installed.)
Preparing to unpack .../git-lfs_2.3.4_amd64.deb ...
Unpacking git-lfs (2.3.4) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up git-lfs (2.3.4) ...
Git LFS initialized.
Install scripts for other package managers:
RPM:
$ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | sudo bash
Python:
$ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.python.sh | bash
gem:
$ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.gem.sh | bash
You can also get the packages from the git-lfs-page on Package Cloud. [3]
Finally, set up Git LFS for your user account, which also confirms that the installation works:
$ git lfs install
Git LFS initialized.
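If you want a bit more detail than the one-line message, you can look at what the client reports about itself and at the filter settings that git lfs install writes into your Git configuration. A rough sketch: lfs_status is a hypothetical helper name, and the three configuration keys listed are the ones I would expect a recent git-lfs to set, so treat an unset key as a hint rather than an error.

```shell
#!/usr/bin/env bash
# Report whether the git-lfs client is available and configured.
# lfs_status is a hypothetical helper name used only in this sketch.
lfs_status() {
    if ! git lfs version >/dev/null 2>&1; then
        echo "git-lfs: not found"
        return 0
    fi
    # Print the client version string
    git lfs version
    # `git lfs install` registers the lfs filter in the Git configuration
    local key
    for key in filter.lfs.clean filter.lfs.smudge filter.lfs.process; do
        echo "$key = $(git config --get "$key" || echo '<unset>')"
    done
}

lfs_status
```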
The first step is to create a repository. The file patterns to store with Git LFS will be recorded in its .gitattributes file:
$ mkdir large-file-repo
$ cd large-file-repo
$ git init
Add a filter that causes all CSV files to be handled through Git LFS:
$ git lfs track "*.csv"
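To see what the track command actually records, it helps to look at the attribute line it appends to .gitattributes and ask Git how it resolves that line for a given path. The sketch below works in a throwaway repository and writes the line by hand instead of calling git lfs track, so it runs even without the extension installed. The helper name show_lfs_attrs is made up for this example.

```shell
#!/usr/bin/env bash
# Show how Git resolves the attributes that `git lfs track "*.csv"` records.
# show_lfs_attrs is a hypothetical helper; it works in a throwaway repository.
show_lfs_attrs() (
    set -e
    tmp="$(mktemp -d)"
    trap 'rm -rf "$tmp"' EXIT
    cd "$tmp"
    git init -q
    # The line below is what `git lfs track "*.csv"` appends to .gitattributes
    printf '%s\n' '*.csv filter=lfs diff=lfs merge=lfs -text' > .gitattributes
    # Ask Git which attributes apply to a (not yet existing) CSV file
    git check-attr filter diff merge text -- test.csv
)

show_lfs_attrs
```

For a path that matches the pattern, this should report filter, diff and merge as lfs, and text as unset. A file that does not match the pattern would show the attributes as unspecified.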
Now you can add and commit files as usual, and push them afterwards:
$ git add .gitattributes
$ git add test.csv
$ git commit -m "add csv files"
Check that Git LFS is managing your CSV file:
$ git lfs ls-files
test.csv
As you can see, git-lfs operations are seamless, which means that you can use git-lfs without changing your existing Git workflow. Note that git clone and git pull operations will be faster as you only download the versions of large files referenced by commits that you check out, not every version of the file.
Note: Because the local repository holds only a reference to each large file, that reference is all you will see until you check out the file. Moreover, when you check out the file, the download may take some time due to its size. If you want complete copies of all files in the local repository, use git lfs pull.
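The reference mentioned here is a small text file, a so-called pointer, whose first line names the Git LFS spec version, followed by the object ID and size of the real content. A quick way to tell whether a file in your working tree is still a pointer is to look at that first line. This is a minimal sketch under that assumption. is_lfs_pointer is a hypothetical helper, not a git-lfs command, and test.csv is simply the file from the example above.

```shell
#!/usr/bin/env bash
# Succeed if the given file is still a Git LFS pointer, i.e. the real content
# has not been downloaded yet. is_lfs_pointer is a hypothetical helper name.
is_lfs_pointer() {
    local file="$1"
    [ -f "$file" ] || return 1
    # Pointer files are small text files whose first line names the LFS spec
    head -n 1 "$file" | grep -qx 'version https://git-lfs\.github\.com/spec/v1'
}

if is_lfs_pointer test.csv; then
    echo "test.csv is a pointer; run 'git lfs pull' to fetch the content"
fi
```

The check only reads the first line, so it stays cheap even when the file turns out to be the full multi-gigabyte content.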
There are several good in-depth guides on how to use git-lfs. The best I have come across is Git LFS by Atlassian. [4]
[1] - GitHub Help: Managing Large Files - https://help.github.com/categories/managing-large-files/
[2] - git-lfs ReadMe - https://github.com/git-lfs/git-lfs
[3] - git-lfs-page on Package Cloud - https://packagecloud.io/github/git-lfs
[4] - Git LFS, by Atlassian - https://www.atlassian.com/git/tutorials/git-lfs