Commit Graph

12491 Commits

Author SHA1 Message Date
Patrick Steinhardt
e54343a402 fileops: rename to "futils.h" to match function signatures
Our file utils functions all have a "futils" prefix, e.g.
`git_futils_touch`. One would thus naturally guess that their
definitions and implementation would live in files "futils.h" and
"futils.c", respectively, but in fact they live in "fileops.h".

Rename the files to match expectations.
2019-07-20 19:11:20 +02:00
Edward Thomson
1f44079cae Merge pull request #5179 from pks-t/pks/patch-parse-free
patch_parse: fix segfault due to line containing static contents
2019-07-20 18:08:40 +01:00
Patrick Steinhardt
a613832ef3 patch_parse: fix segfault due to line containing static contents
With commit dedf70ad2 (patch_parse: do not depend on parsed buffer's
lifetime, 2019-07-05), all lines of the patch are allocated with
`strdup` to make lifetime of the parsed patch independent of the buffer
that is currently being parsed. In patch b08932824 (patch_parse: ensure
valid patch output with EOFNL, 2019-07-11), we introduced another
code location where we add lines to the parsed patch. But as that one
was implemented via a separate pull request, it wasn't converted to use
`strdup`, as well. As a consequence, we generate a segfault when trying
to deallocate the potentially static buffer that's now in some of the
lines.

Use `git__strdup` to fix the issue.
2019-07-20 18:49:48 +02:00
Edward Thomson
e07dbc92d1 Merge pull request #5173 from pks-t/pks/gitignore-wildmatch-error
ignore: fix determining whether a shorter pattern negates another
2019-07-20 11:26:00 +01:00
Edward Thomson
fd7a384b68 Merge pull request #5159 from pks-t/pks/patch-parse-old-missing-nl
patch_parse: handle missing newline indicator in old file
2019-07-20 11:24:37 +01:00
Edward Thomson
f33ca472d1 Merge pull request #5158 from pks-t/pks/patch-parsed-lifetime
patch_parse: do not depend on parsed buffer's lifetime
2019-07-20 11:06:23 +01:00
Edward Thomson
d78a1b186d Merge pull request #5174 from pks-t/pks/winhttp-hash
sha1: fix compilation of WinHTTP backend
2019-07-20 11:04:53 +01:00
Edward Thomson
964c1c6001 Merge pull request #5176 from pks-t/pks/repo-template-head
repository: do not initialize HEAD if it's provided by templates
2019-07-20 11:02:30 +01:00
Patrick Steinhardt
9d46f16759 repository: do not initialize HEAD if it's provided by templates
When using templates to initialize a git repository, then git-init(1)
will copy over all contents of the template directory. These will be
preferred over the default ones created by git-init(1). While we mostly
do the same, there is the exception of "HEAD". While we do copy over the
template's HEAD file, afterwards we'll immediately re-initialize its
contents with either the default "ref: refs/origin/master" or the init
option's `initial_head` field.

Let's fix the inconsistency with upstream git-init(1) by not overwriting
the template HEAD, but only if the user hasn't set `opts.initial_head`.
If the `initial_head` field has been supplied, we should use that
indifferent from whether the template contained a HEAD file or not. Add
tests to verify we correctly use the template directory's HEAD file and
that `initial_head` overrides the template.
2019-07-19 13:05:34 +02:00
Patrick Steinhardt
f3134a8456 repository: update error handling in init_ext
Update `git_repository_init_ext` to use our typical style of error
handling. The function had multiple statements which didn't `goto out`
immediately but instead deferred it to later calls combined with `if`
statements.
2019-07-19 13:02:59 +02:00
Patrick Steinhardt
869ae5a3a1 repository: avoid swallowing error codes in create_head
The error handling in `git_repository_create_head` completely swallows
all error codes. While probably not too much of a problem, this also
violates our usual coding style.

Refactor the code to use a local `error` variable with the typical `goto
out` statements.
2019-07-19 13:02:57 +02:00
Patrick Steinhardt
0d12b8dd77 tests: repo: refactor setup of templates and repos
All tests in repo::template have a common pattern of first setting up
templates, then settung up the repository that makes use of those
templates via several init options. Refactor this pattern into two
functions `setup_templates` and `setup_repo` that handle most of that
logic to make it easier to spot what a test actually wants to check.

Furthermore, this also refactors how we clean up after the tests.
Previously, it was a combination of manually calling
`cl_fixture_cleanup` and `cl_set_cleanup`, which really is kind of hard
to read. This commit refactors this to instead provide the cleaning
parameters in the setup functions. All cleanups are then performed in
the suite's cleanup function.
2019-07-19 11:05:07 +02:00
Patrick Steinhardt
3b79ceafb0 tests: repo: refactor template path handling
The repo::template test suite makes use of quite a few local variables
that could be consolidated. Do so to make the code easier to read.
2019-07-19 11:03:24 +02:00
Patrick Steinhardt
ee1934807b tests: repo: move template tests into their own suite
There's quite a lot of supporting code for our templates and they are an
obvious standalone feature. Thus, let's extract those tests into their
own suite to also make refactoring of them easier.
2019-07-19 11:02:45 +02:00
Patrick Steinhardt
3424c21057 Merge pull request #5138 from libgit2/ethomson/cvar
configuration: cvar -> configmap
2019-07-19 08:00:13 +02:00
Patrick Steinhardt
a33c0de2ed Merge pull request #5172 from bk2204/cache-efficient-eviction
Evict cache items more efficiently
2019-07-18 19:17:40 +02:00
Patrick Steinhardt
e86d75f3b2 Merge pull request #5175 from pks-t/pks/clar-fix-suite-count
clar: fix suite count
2019-07-18 19:00:42 +02:00
Patrick Steinhardt
92109976fd tests: fix undercounting of suites
With the introduction of data variants for suites, we started
undercounting the number of suites as we didn't account for those that
were executed twice. This was then adjusted to count the number of
initializers instead, but this fails to account for suites without any
initializers at all.

Fix the suite count by counting either the number of initializers or, if
there is no initializer, count it as a single suite, only.
2019-07-18 14:29:43 +02:00
Patrick Steinhardt
29fe79e653 Merge pull request #5163 from csware/gitignore-vs2017
Ignore VS2017 specific files and folders
2019-07-18 14:07:22 +02:00
Edward Thomson
36558513c8 configuration: deprecate git_cvar safely 2019-07-18 13:59:05 +02:00
Patrick Steinhardt
658022c41a configuration: cvar -> configmap
`cvar` is an unhelpful name.  Refactor its usage to `configmap` for more
clarity.
2019-07-18 13:53:41 +02:00
Patrick Steinhardt
343fb83a4d Merge pull request #5156 from pks-t/pks/attr-macros-in-subdir
gitattributes: ignore macros defined in subdirectories
2019-07-18 13:50:47 +02:00
Patrick Steinhardt
270fd807df azure: compile one Windows platform with the WinHTTP SHA1 backend
We currently have no job that compiles libgit2 with the WinHTTP backend
for SHA1. Due to this, a compile error has been introduced and not
noticed for several months. Change the x86 MSVC job to use the HTTPS
backend for SHA1. The x86 job was chosen with no particular reason.
2019-07-18 13:45:04 +02:00
Patrick Steinhardt
7574564efc sha1: win32: fix compilation due to unknown type
In commit bbf034ab9 (hash: move `git_hash_prov` into Win32 backend,
2019-02-22), the `git_hash_prov`'s structure name has been removed in
favour of its typedef'ed name. But as we have no CI that compiles with
the WinHTTPS hashing backend right now, it wasn't noticed that the
implementation that uses this struct wasn't changed correctly.

Fix the struct type to make it compile again.
2019-07-18 13:40:34 +02:00
Patrick Steinhardt
b7c247b3ad cmake: include SHA1 headers into our source files
When selecting the SHA1 backend, we only include the respective C
implementation of the selected backend. But since commit bd48bf3fb
(hash: introduce source files to break include circles, 2019-06-14), we
have introduced separate headers and compilation units for all hashes.
So by not including the headers, we may not honor them to compute
whether a file needs to be recompiled and they also will not be
displayed in IDEs.

Add the header files to fix this problem.
2019-07-18 13:37:02 +02:00
Patrick Steinhardt
6f6340afec ignore: fix determining whether a shorter pattern negates another
When computing whether we need to store a negative pattern, we iterate
through all previously known patterns and check whether the negative
pattern undoes any of the previous ones. In doing so we call `wildmatch`
and check it's return for any negative error values. If there was a
negative return, we will abort and bubble up that error to the caller.

In fact, this check for negative values stems from the time where we
still used `fnmatch` instead of `wildmatch`. For `fnmatch`, negative
values indicate a "real" error, while for `wildmatch` a negative value
may be returned if the matching was prematurely aborted. A premature
abort may for example also happen if the pattern matches a prefix of the
haystack if the pattern is shorter. Returning an error in that case is
the wrong thing to do.

Fix the code to compare for equality with `WM_MATCH`, only. Negative
values returned by `wildmatch` are perfectly fine and thus should be
ignored. Add a test that verifies we do not see the error.
2019-07-18 12:02:54 +02:00
Patrick Steinhardt
368b9795ae Merge pull request #5168 from tiennou/clar/fix-data-suite-count
clar: correctly account for "data" suites when counting
2019-07-18 11:27:21 +02:00
Edward Thomson
51124a5b23 Merge pull request #5170 from bk2204/packbuilder-efficient-realloc
Allocate memory more efficiently when packing objects
2019-07-17 17:33:34 +01:00
brian m. carlson
770b91b114 cache: evict items more efficiently
When our object cache is full, we pick eight items (or the whole cache,
if there are fewer) and evict them. For small cache sizes, this is fine,
but when we're dealing with a large number of objects, we can repeatedly
exhaust the cache and spend a large amount of time in git_oidmap_iterate
trying to find items to evict.

Instead, let's assume that if the cache gets full, we have a large
number of objects that we're handling, and be more aggressive about
evicting items. Let's remove one item for every 2048 items, but not less
than 8. This causes us to scale our evictions in proportion to the size
of the cache and significantly reduces the time we spend in
git_oidmap_iterate.

Before this change, a full pack of all the non-blob objects in the Linux
repository took in excess of 30 minutes and spent 62.3% of total runtime
in odb_read_1 and its children, and 44.3% of the time in
git_oidmap_iterate. With this change, the same operation now takes 14
minutes and 44 seconds, and odb_read_1 accounts for only 35.9% of total
time, whereas git_oidmap_iterate consists of 6.2%.

Note that we do spend a little more time inflating objects and a decent
amount more time in memcmp. However, overall, the time taken is
significantly improved, and time in pack building is now dominated by
git_delta_create_from_index (33.7%), which is what we would expect.
2019-07-17 15:59:54 +00:00
brian m. carlson
c4df926b32 pack-objects: allocate memory more efficiently
The packbuilder code allocates memory in chunks. When it needs to
allocate, it tries to add 1024 to the number of objects and multiply by
3/2. However, it actually multiplies by 1 instead, since it performs an
integral division in the expression "3 / 2" and only then multiplies by
the increased number of objects.

The current behavior causes the code to waste massive amounts of time
copying memory when it reallocates, causing inserting all non-blob
objects in the Linux repository into a new pack to take some
indeterminate time greater than 5 minutes instead of 52 seconds.

Correct this error by first dividing by two, and only then multiplying
by 3. We still check for overflow for the multiplication, which is the
only part that can overflow. This appears to be the only place in the
code base which has this problem.
2019-07-17 13:37:41 +00:00
Etienne Samson
4cd8dfaa30 clar: correctly account for "data" suites when counting
Failing to do that makes clar miss the last of the suites, as all
duplicated "data" would have not been accounted for.
2019-07-16 20:20:55 +02:00
Sven Strickroth
5f22f8d2d3 Ignore VS2017 specific files and folders
Signed-off-by: Sven Strickroth <email@cs-ware.de>
2019-07-12 17:29:27 +02:00
Patrick Steinhardt
f92d495d44 Merge pull request #5131 from pks-t/pks/fileops-mkdir-in-root
fileops: fix creation of directory in filesystem root
2019-07-12 10:48:14 +02:00
Patrick Steinhardt
f83469059b attr_file: ignore macros defined in subdirectories
Right now, we are unconditionally applying all macros found in a
gitatttributes file. But quoting gitattributes(5):

    Custom macro attributes can be defined only in top-level
    gitattributes files ($GIT_DIR/info/attributes, the .gitattributes
    file at the top level of the working tree, or the global or
    system-wide gitattributes files), not in .gitattributes files in
    working tree subdirectories. The built-in macro attribute "binary"
    is equivalent to:

So gitattribute files in subdirectories of the working tree may
explicitly _not_ contain macro definitions, but we do not currently
enforce this limitation.

This patch introduces a new parameter to the gitattributes parser that
tells whether macros are allowed in the current file or not. If set to
`false`, we will still parse macros, but silently ignore them instead of
adding them to the list of defined macros. Update all callers to
correctly determine whether the to-be-parsed file may contain macros or
not. Most importantly, when walking up the directory hierarchy, we will
only set it to `true` once it reaches the root directory of the repo
itself.

Add a test that verifies that we are indeed not applying macros from
subdirectories. Previous to these changes, the test would've failed.
2019-07-12 09:26:22 +02:00
Patrick Steinhardt
97968529f9 attr_file: refactor parse_buffer function
The gitattributes code is one of our oldest and most-untouched codebases
in libgit2, and as such its code style doesn't quite match our current
best practices. Refactor the function `git_attr_file__parse_buffer` to
better match them.
2019-07-12 09:26:22 +02:00
Patrick Steinhardt
dbc7e4b150 attr_file: refactor load_standalone function
The gitattributes code is one of our oldest and most-untouched codebases
in libgit2, and as such its code style doesn't quite match our current
best practices. Refactor the function `git_attr_file__lookup_standalone`
to better match them.
2019-07-12 09:26:22 +02:00
Patrick Steinhardt
be8f9bb188 attrcache: fix memory leak if inserting invalid macro to cache
A macro without any assignments is considered an invalid macro by the
attributes cache and is thus not getting added to the macro map at all.
But as `git_attr_cache__insert_macro` returns success with neither
free'ing nor adopting the macro into its map, this will cause a memory
leak.

Fix this by freeing the macro in the function if it's not going to be
added. This is perfectly fine to do, as callers assume that the
attrcache will have the macro adopted on success anyway.
2019-07-12 09:26:22 +02:00
Patrick Steinhardt
7277bf833d attrcache: fix multiple memory leaks when inserting macros
The function `git_attr_cache__insert_macro` is responsible for adopting
macros in the per-repo macro cache. When adding a macro that replaces an
already existing macro (e.g. because of re-parsing gitattributes files),
then we do not free the previous macro and thus cause a memory leak.

Fix this leak by first checking if the cache already has a macro defined
with the same name. If so, free it before replacing the cache entry with
the new instance.
2019-07-12 09:26:22 +02:00
Patrick Steinhardt
df417a4325 tests: attr: verify that in-memory macros are respected
Add some tests to ensure that the `git_attr_add_macro` function works as
expected.
2019-07-12 09:26:02 +02:00
Patrick Steinhardt
4a7f704fdb tests: attr: implement tests to verify attribute rewriting behaviour
Implement some tests that verify that we are correctly updating
gitattributes when rewriting or unlinking the corresponding files.
2019-07-12 09:01:57 +02:00
Patrick Steinhardt
ed854aa095 tests: attr: extract macro tests into their own suite
As macros are a specific functionality in the gitattributes code, it
makes sense to extract them into their own test suite, too. This makes
finding macro-related tests easier.
2019-07-12 09:01:57 +02:00
Patrick Steinhardt
dacac9e1e9 Merge pull request #5160 from pks-t/pks/win32-fuzzers
win32: fix fuzzers and have CI build them
2019-07-12 08:30:07 +02:00
Patrick Steinhardt
5ae22a6373 fileops: fix creation of directory in filesystem root
In commit 45f24e787 (git_repository_init: stop traversing at
windows root, 2019-04-12), we have fixed `git_futils_mkdir` to
correctly handle the case where we create a directory in
Windows-style filesystem roots like "C:\repo".

The problem here is an off-by-one: previously, to that commit,
we've been checking wether the parent directory's length is equal
to the root directory's length incremented by one. When we call
the function with "/example", then the parent directory's length
("/") is 1, but the root directory offset is 0 as the path is
directly rooted without a drive prefix. This resulted in `1 == 0 +
1`, which was true. With the change, we've stopped incrementing
the root directory length, and thus now compare `1 <= 0`, which
is false.

The previous way of doing it was kind of finicky any non-obvious,
which is also why the error was introduced. So instead of just
re-adding the increment, let's explicitly add a condition that
aborts finding the parent if the current parent path is "/".

Making this change causes Azure Pipelines to fail the testcase
repo::init::nonexistent_paths on Unix-based systems. This is because we
have just fixed creating directories in the filesystem root, which
previously didn't work. As Docker-based tests are running as root user,
we are thus able to create the non-existing path and will now succeed to
create the repository that was expected to actually fail.

Let's split this up into three different tests:

- A test to verify that we do not create repos in a non-existing parent
  directoy if the flag `GIT_REPOSITORY_INIT_MKPATH` is not set.

- A test to verify that we fail if the root directory does not exist. As
  there is a common root directory on Unix-based systems that always
  exist, we can only test for this on Windows-based systems.

- A test to verify that we fail if trying to create a repository in an
  unwriteable parent directory. We can only test this if not running
  tests as root user, as CAP_DAC_OVERRIDE will cause us to ignore
  permissions when creating files.
2019-07-11 22:34:34 +02:00
Patrick Steinhardt
a6ad9e8a4e Merge pull request #5134 from pks-t/pks/config-parser-separation
Config parser separation
2019-07-11 14:03:21 +02:00
Erik Aigner
b08932824e patch_parse: ensure valid patch output with EOFNL 2019-07-11 12:14:27 +02:00
Patrick Steinhardt
3f855fe863 patch_parse: handle missing newline indicator in old file
When either the old or new file contents have no newline at the end of
the file, then git-diff(1) will print out a "\ No newline at end of
file" indicator. While we do correctly handle this in the case where the
new file has this indcator, we fail to parse patches where the old file
is missing a newline at EOF.

Fix this bug by handling and missing newline indicators in the old file.
Add tests to verify that we can parse such files.
2019-07-11 12:11:11 +02:00
Patrick Steinhardt
b30dab8fa7 apply: refactor to use a switch statement 2019-07-11 12:10:48 +02:00
Patrick Steinhardt
001d76e1fe diff: ignore EOFNL for computing patch IDs
The patch ID is supposed to be mostly context-insignificant and
thus only includes added or deleted lines. As such, we shouldn't honor
end-of-file-without-newline markers in diffs.

Ignore such lines to fix how we compute the patch ID for such diffs.
2019-07-11 11:34:40 +02:00
Patrick Steinhardt
dbeadf8a9e config_parse: provide parser init and dispose functions
Right now, all configuration file backends are expected to
directly mess with the configuration parser's internals in order
to set it up. Let's avoid doing that by implementing both a
`git_config_parser_init` and `git_config_parser_dispose` function
to clearly define the interface between configuration backends
and the parser.

Ideally, we would make the `git_config_parser` structure
definition private to its implementation. But as that would
require an additional memory allocation that was not required
before we just live with it being visible to others.
2019-07-11 11:10:04 +02:00
Patrick Steinhardt
3215752653 config_file: refactor error handling in config_write
Error handling in `config_write` is rather convoluted and does
not match our current code style. Refactor it to make it easier
to understand.
2019-07-11 11:10:02 +02:00