Non-ASCII filename characters in plain diff files appear garbled
Description
In plain diff files downloaded from Code > Plain diff, filenames that contain non-ASCII characters (such as Chinese, Japanese, or Korean) appear garbled or corrupted.
Environment
- Merge requests that change files whose names contain non-ASCII characters
Impacted offerings:
- GitLab.com
- GitLab Dedicated
- GitLab Self-Managed
Impacted versions:
- All versions
Solution
GitLab Self-Managed Solution
- Configure gitlab.rb as follows.
gitaly['configuration'] = {
  git: {
    config: [
      { key: 'core.quotepath', value: 'false' }
    ]
  }
}
- Run
sudo gitlab-ctl reconfigure
GitLab.com & GitLab Dedicated Solution
- Clone the project to your local machine, or fetch the latest changes into an existing clone
- Run
git config core.quotepath false
- Run
git diff origin/<target branch>...origin/<source branch>
Cause
By default, Git commands that output paths escape control characters and bytes with values larger than 0x80, and enclose the escaped path in double quotes. This behavior is controlled by the core.quotepath setting, which defaults to true.
Non-ASCII characters are encoded in UTF-8 as sequences of bytes in this range, so filenames that contain them are written to the plain diff file as octal escape sequences (for example, \346\227\245) and appear garbled.
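The escaping can be reproduced locally. The following is a minimal sketch, assuming Git and a Bash-compatible shell are installed; the throwaway repository and the filename 日本語.txt are illustrative only and do not come from any real project. It stages one file and prints its path with core.quotepath enabled and then disabled.

```shell
#!/usr/bin/env bash
set -eu

# Create a throwaway repository (hypothetical, for illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Stage an empty file whose name contains non-ASCII characters.
: > "日本語.txt"
git add .

# core.quotepath=true (the Git default): non-ASCII bytes are octal-escaped
# and the whole path is wrapped in double quotes.
quoted=$(git -c core.quotepath=true diff --cached --name-only)

# core.quotepath=false: the path is printed as stored.
unquoted=$(git -c core.quotepath=false diff --cached --name-only)

printf 'quotepath=true : %s\n' "$quoted"
printf 'quotepath=false: %s\n' "$unquoted"
```

Passing -c core.quotepath=... on the command line applies the setting to that single command only, so the sketch does not modify any stored Git configuration.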