Commit Graph

308 Commits

Author SHA1 Message Date
Peter Pettersson
7dcc29fc46 Make enum in src,tests and examples C90 compliant by removing trailing comma. 2021-11-15 16:45:40 +01:00
Edward Thomson
91246ee5e0 path: use new length validation functions 2021-11-09 15:17:18 +00:00
Edward Thomson
63e36c53ca path: validate -> is_valid
Since we're returning a boolean about validation, the name is more
properly "is valid".
2021-11-09 15:17:17 +00:00
Edward Thomson
95117d4744 path: separate git-specific path functions from util
Introduce `git_fs_path`, which operates on generic filesystem paths.
`git_path` will be kept for only git-specific path functionality (for
example, checking for `.git` in a path).
2021-11-09 15:17:17 +00:00
Edward Thomson
f0e693b18a str: introduce git_str for internal, git_buf is external
libgit2 has two distinct requirements that were previously solved by
`git_buf`.  We require:

1. A general purpose string class that provides a number of utility APIs
   for manipulating data (eg, concatenating, truncating, etc).
2. A structure that we can use to return strings to callers that they
   can take ownership of.

By using a single class (`git_buf`) for both of these purposes, we have
confused the API to the point that refactorings are difficult and
reasoning about correctness is also difficult.

Move the utility class `git_buf` to be called `git_str`: this represents
its general purpose, as an internal string buffer class.  The name also
is an homage to Junio Hamano ("gitstr").

The public API remains `git_buf`, and has a much smaller footprint.  It
is generally only used as an "out" param with strict requirements that
follow the documentation.  (Exceptions exist for some legacy APIs to
avoid breaking callers unnecessarily.)

Utility functions exist to convert a user-specified `git_buf` to a
`git_str` so that we can call internal functions, then converting it
back again.
2021-10-17 09:49:01 -04:00
Edward Thomson
3d8749d323 checkout: always provide a path for attribute lookup
Always pass a working-directory relative path to attribute lookups
during checkout.
2021-09-25 14:39:01 +01:00
Edward Thomson
2998a84ab6 Merge pull request #5841 from J0Nes90/features/checkout-dry-run
Checkout dry-run
2021-08-29 21:49:33 -04:00
Edward Thomson
1db5b2199b filter: filter options are now "filter sessions"
Filters use a short-lived structure to keep state during an operation to
allow for caching and avoid unnecessary reallocations.  This was
previously called the "filter options", despite the fact that they
contain no configurable options.  Rename them to a "filter session" in
keeping with an "attribute session", which more accurately describes
their use (and allows us to create "filter options" in the future).
2021-07-22 15:08:50 -04:00
Edward Thomson
ff78aea6de Merge pull request #5860 from libgit2/ethomson/buf_text
buf: remove unnecessary buf_text namespace
2021-05-11 11:09:31 +01:00
Edward Thomson
d525e063ba buf: remove internal git_buf_text namespace
The `git_buf_text` namespace is unnecessary and strange.  Remove it,
just keep the functions prefixed with `git_buf`.
2021-05-11 01:29:22 +01:00
Edward Thomson
31d9c24b33 filter: internal git_buf filter handling function
Introduce `git_filter_list__convert_buf` which behaves like the old
implementation of `git_filter_list__apply_data`, where it might move the
input data buffer over into the output data buffer space for efficiency.

This new implementation will do so in a more predictible way, always
freeing the given input buffer (either moving it to the output buffer or
filtering it into the output buffer first).

Convert internal users to it.
2021-05-06 16:32:14 +01:00
Edward Thomson
99ddfd5c25 checkout: validate path length
Ensure that we are validating working directory paths before we try to
write to them.
2021-04-28 13:03:34 +01:00
Edward Thomson
6b878db5e8 checkout: use target path; don't assume workdir
We're not necessarily checking out into the working directory.  We could
be checking out into an arbitrary location.  Ensure that when we are
writing conflict data that we do it in the checkout target.
2021-04-28 13:03:34 +01:00
Edward Thomson
88323cd072 path: git_path_isvalid -> git_path_validate
If we want to validate more and different types of paths, the name
`git_path_validate` makes that easier and more expressive.  We can add,
for example, `git_path_validate_foo` while the current name makes that
less ergonomic.
2021-04-14 23:02:51 +01:00
Jochen Hunz
958205a33d implement GIT_CHECKOUT_DRY_RUN to allow notifications without touching the working directory 2021-04-14 22:05:47 +02:00
Edward Thomson
266f26ed92 checkout: use GIT_ASSERT 2020-11-27 11:09:19 +00:00
Edward Thomson
74520b914c Merge pull request #5552 from libgit2/pks/small-fixes
Random code cleanups and fixes
2020-06-13 19:38:11 +01:00
Patrick Steinhardt
c6184f0c4b tree-wide: do not compile deprecated functions with hard deprecation
When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor
definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h"
header to be empty. As a result, no function declarations are made
available to callers, but the implementations are still available to
link against. This has the problem that function declarations also
aren't visible to the implementations, meaning that the symbol's
visibility will not be set up correctly. As a result, the resulting
library may not expose those deprecated symbols at all on some platforms
and thus cause linking errors.

Fix the issue by conditionally compiling deprecated functions, only.
While it becomes impossible to link against such a library in case one
uses deprecated functions, distributors of libgit2 aren't expected to
pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually
define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real"
hard deprecation still makes sense in the context of CI to test we don't
use deprecated symbols ourselves and in case a dependant uses libgit2 in
a vendored way and knows it won't ever use any of the deprecated symbols
anyway.
2020-06-09 14:57:06 +02:00
Patrick Steinhardt
a6c9e0b367 tree-wide: mark local functions as static
We've accumulated quite some functions which are never used outside of
their respective code unit, but which are lacking the `static` keyword.
Add it to reduce their linkage scope and allow the compiler to optimize
better.
2020-06-09 14:57:06 +02:00
Patrick Steinhardt
46637b5e24 checkout: remove unused code for deferred removals
With commit 05f690122 (checkout: remove blocking dir when FORCEd,
2015-03-31), the last case was removde that actually queued a deferred
removal. This is now more than five years in the past and nobody
complained, so we can rest quite assured that the deferred removal is
not really needed at all.

Let's remove all related code to simplify the already complicated
checkout logic.
2020-06-08 14:47:49 +02:00
Edward Thomson
0f35efeb57 git_pool_init: handle failure cases
Propagate failures caused by pool initialization errors.
2020-06-01 14:12:17 +02:00
Patrick Steinhardt
3f201f75c6 checkout: fix file being treated as unmodified due to racy index
When trying to determine whether a file changed, we try to avoid heavy
operations by fist taking a look at the index, seeing whether the index
entry is modified already. This doesn't seem to cut it, though, as we
currently have the racy checkout::index::can_disable_pathspec_match test
case: sometimes the files get restored to their original contents,
sometimes they aren't.

The issue is caused by a racy index [1]: in case we modify a file, add
it to the index and then modify it again in-place without changing its
file, then we may end up with a modified file that has the same stat(3P)
info as we've currently got it in its corresponding index entry. The
mitigation for this is to treat files with the same mtime as the index
are treated as racily modified. We already have this logic in place for
the index, but not when doing a checkout.

Fix the issue by only consulting the index entry in case it has an older
mtime as the index. Previously, the following script reliably had at
least 20 failures, while now there is no failure to be observed anymore:

```bash
j=0
for i in $(seq 100)
do
	if ! ./libgit2_clar -scheckout::index::can_disable_pathspec_match >/dev/null
	then
		j=$(($j + 1))
	fi
done
echo "Failures: $j"
```

[1]: https://git-scm.com/docs/racy-git
2020-05-16 14:02:42 +02:00
Segev Finer
d62e44cb82 checkout: Fix removing untracked files by path in subdirectories
The checkout code didn't iterate into a subdir if it didn't match the
pathspec, but since the pathspec might match files in the subdir we
should recurse into it (In contrast to gitignore handling).

Fixes #5089
2020-05-11 00:15:06 +01:00
Edward Thomson
63de21283f checkout: filter pathspecs for _all_ checkout types
We were previously applying the pathspec filter for the baseline
iterator during checkout, as well as the target tree.  This was an
oversight; in fact, we should apply the pathspec filter to _all_
checkout targets, not just trees.

Add a helper function to set the iterator pathspecs from the given
checkout pathspecs, and call it everywhere.
2020-05-10 23:47:20 +01:00
Etienne Samson
cd5e33fbc2 global: DRY includes of assert.h 2019-11-06 11:08:23 +01:00
Patrick Steinhardt
6be5ac2351 checkout: postpone creation of symlinks to the end
On most platforms it's fine to create symlinks to nonexisting files. Not
so on Windows, where the type of a symlink (file or directory) needs to
be set at creation time. So depending on whether the target file exists
or not, we may end up with different symlink types. This creates a
problem when performing checkouts, where we simply iterate over all blobs
that need to be updated without treating symlinks any special. If the
target file of the symlink is going to be checked out after the symlink
itself, then the symlink will be created as directory symlink and not as
file symlink.

Fix the issue by iterating over blobs twice: once to perform postponed
deletions and updates to non-symlink blobs, and once to perform updates
to symlink blobs.
2019-07-20 19:11:20 +02:00
Patrick Steinhardt
658022c41a configuration: cvar -> configmap
`cvar` is an unhelpful name.  Refactor its usage to `configmap` for more
clarity.
2019-07-18 13:53:41 +02:00
Edward Thomson
0b5ba0d744 Rename opt init functions to options_init
In libgit2 nomenclature, when we need to verb a direct object, we name
a function `git_directobject_verb`.  Thus, if we need to init an options
structure named `git_foo_options`, then the name of the function that
does that should be `git_foo_options_init`.

The previous names of `git_foo_init_options` is close - it _sounds_ as
if it's initializing the options of a `foo`, but in fact
`git_foo_options` is its own noun that should be respected.

Deprecate the old names; they'll now call directly to the new ones.
2019-06-14 09:57:00 +01:00
Patrick Steinhardt
351eeff3b2 maps: use uniform lifecycle management functions
Currently, the lifecycle functions for maps (allocation, deallocation, resize)
are not named in a uniform way and do not have a uniform function signature.
Rename the functions to fix that, and stick to libgit2's naming scheme of saying
`git_foo_new`. This results in the following new interface for allocation:

- `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an
  error code if we ran out of memory

- `void git_<t>map_free(git_<t>map *map)` to free a map

- `void git_<t>map_clear(git<t>map *map)` to remove all entries from a map

This commit also fixes all existing callers.
2019-02-15 13:16:48 +01:00
Edward Thomson
f673e232af git_error: use new names in internal APIs and usage
Move to the `git_error` name in the internal API for error-related
functions.
2019-01-22 22:30:35 +00:00
Edward Thomson
168fe39bea object_type: use new enumeration names
Use the new object_type enumeration names within the codebase.
2018-12-01 11:54:57 +00:00
Edward Thomson
5e26391a25 checkout: FORCE doesn't halt on dirty index
If the index is dirty, allow `GIT_CHECKOUT_FORCE` to obliterate unsaved
changes.  This is in keeping with its name and description.
2018-06-29 14:54:28 +01:00
Edward Thomson
b242cdbf1e index: commit the changes to the index properly
Now that the index has a "dirty" state, where it has changes that have
not yet been committed or rolled back, our tests need to be adapted to
actually commit or rollback the changes instead of assuming that the
index can be operated on in its indeterminate state.
2018-06-29 14:54:28 +01:00
Edward Thomson
88b30f51e4 checkout: always set the index in checkout data
Always set the `index` in the `checkout_data`, even in the case that we
are not reloading the index.  Other functionality in checkout examines
the index (for example: determining whether the workdir is modified) and
we need it even in the (uncommon) case that we are not reloading.
2018-06-26 12:38:35 +01:00
Patrick Steinhardt
ecf4f33a4e Convert usage of git_buf_free to new git_buf_dispose 2018-06-10 19:34:37 +02:00
Carlos Martín Nieto
a7168b47ee path: reject .gitmodules as a symlink
Any part of the library which asks the question can pass in the mode to have it
checked against `.gitmodules` being a symlink.

This is particularly relevant for adding entries to the index from the worktree
and for checking out files.
2018-05-23 08:47:08 +02:00
Edward Thomson
c214ba1984 checkout: respect core.filemode when comparing filemodes
Fixes #4504
2018-02-23 17:21:33 -08:00
Edward Thomson
d7fea1e1a7 checkout: take mode into account when comparing index to baseline
When checking out a file, we determine whether the baseline (what we
expect to be in the working directory) actually matches the contents
of the working directory.  This is safe behavior to prevent us from
overwriting changes in the working directory.

We look at the index to optimize this test: if we know that the index
matches the working directory, then we can simply look at the index
data compared to the baseline.

We have historically compared the baseline to the index entry by oid.
However, we must also compare the mode of the two items to ensure that
they are identical.  Otherwise, we will refuse to update the working
directory for a mode change.
2018-02-19 18:42:47 +00:00
David Turner
2a3e06359b Do not attempt to check out submodule as blob when merging a submodule modify/deltete conflict 2017-12-04 17:15:00 -05:00
Edward Thomson
128c5ca930 checkout: do not test file mode on Windows
On Windows, we do not support file mode changes, so do not test
for type changes between the disk and tree being checked out.

We could have false positives since the on-disk file can only have
an (effective) mode of 0100644 since NTFS does not support executable
files.  If the tree being checked out did have an executable file,
we would erroneously decide that the file on disk had been changed.
2017-10-07 12:32:04 +01:00
Edward Thomson
752b7c792d checkout: treat files as modified if mode differs
When performing a forced checkout, treat files as modified when the
workdir or the index is identical except for the mode.  This ensures
that force checkout will update the mode to the target.  (Apply this
check for regular files only, if one of the items was a file and the
other was another type of item then this would be a typechange and
handled independently.)
2017-10-06 23:53:34 +01:00
Patrick Steinhardt
0c7f49dd43 Make sure to always include "common.h" first
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.

This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.

This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
2017-07-03 08:51:48 +00:00
Edward Thomson
83989d70ec checkout: cope with untracked files in directory deletion
When deleting a directory during checkout, do not simply delete the
directory, since there may be untracked files.  Instead, go into
the iterator and examine each file.

In the original code (the code with the faulty assumption), we look to
see if there's an index entry beneath the directory that we want to
remove.   Eg, it looks to see if we have a workdir entry foo and an
index entry foo/bar.txt. If this is not the case, then the working
directory must have precious files in that directory. This part is okay.
The part that's not okay is if there is an index entry foo/bar.txt. It
just blows away the whole damned directory.

That's not cool.

Instead, by simply pushing the directory itself onto the stack and
iterating each entry, we will deal with the files one by one - whether
they're in the index (and can be force removed) or not (and are
precious).

The original code was a bad optimization, assuming that we didn't need
to git_iterator_advance_into if there was any index entry in the folder.
That's wrong - we could have optimized this iff all folder entries are
in the index.

Instead, we need to simply dig into the directory and analyze its
entries.
2017-06-10 19:16:48 +01:00
Patrick Steinhardt
77c8ee74ff checkout: fix double-free of checkout_data's mkdir_map
We currently call `git_strmap_free` on `checkout_data.mkdir_map` in the
`checkout_data_clear` function. The only thing protecting us from a
double-free is that the `git_strmap_free` function is in fact not a
function, but a macro that also sets the map to NULL.

Remove the second call to `git_strmap_free` and explicitly set the map
member to NULL.
2017-03-20 08:59:30 +01:00
Patrick Steinhardt
13c3bc9adf strmap: remove GIT__USE_STRMAP macro 2017-02-17 11:41:06 +01:00
Edward Thomson
cb76eed5ce Merge pull request #4054 from jfultz/jfultz/fix_GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH
Fix handling of GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH flag.
2017-01-14 17:41:49 +00:00
John Fultz
5f959dca0d Fix handling of GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH flag.
git_checkout_tree() sets up its working directory iterator to respect the
pathlist if GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH is present, which is great.
What's not so great is that this iterator is then used side-by-side with
an iterator created by git_checkout_iterator(), which did not set up its
pathlist appropriately (although the iterator mirrors all other iterator
options).

This could cause git_checkout_tree() to delete working tree files which
were not specified in the pathlist when GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH
was used, as the unsynchronized iterators causes git_checkout_tree() to think
that files have been deleted between the two trees.  Oops.

And added a test which fails without this fix (specifically, the final check
for "testrepo/README" to still be present fails).
2016-12-29 20:13:03 -06:00
Edward Thomson
909d549436 giterr_set: consistent error messages
Error messages should be sentence fragments, and therefore:

1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
2016-12-29 12:26:03 +00:00
Patrick Steinhardt
90a934a521 checkout: pass string instead of git_buf to giterr_set 2016-11-14 10:07:55 +01:00
Edward Thomson
955c99c214 checkout: don't try to calculate oid for directories
When trying to determine if we can safely overwrite an existing workdir
item, we may need to calculate the oid for the workdir item to determine
if its identical to the old side (and eligible for removal).

We previously did this regardless of the type of entry in the workdir;
if it was a directory, we would open(2) it and then try to read(2).
The read(2) of a directory fails on many platforms, so we would treat it
as if it were unmodified and continue to perform the checkout.

On FreeBSD, you _can_ read(2) a directory, so this pattern failed.  We
would calculate an oid from the data read and determine that the
directory was modified and would therefore generate a checkout conflict.

This reliance on read(2) is silly (and was most likely accidentally
giving us the behavior we wanted), we should be explicit about the
directory test.
2016-09-14 10:28:24 +01:00