path: root/services/migrations
Commit message | Author | Age | Files | Lines
* Fix "force private" logic (#31012) (#31021)Giteabot2024-05-201-1/+1
| | | | | Backport #31012 by wxiaoguang Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Enable more `revive` linter rules (#30608)silverwind2024-04-222-2/+0
| | | | | | | | | | | Notable additions: - `redefines-builtin-id` forbids variable names that shadow Go builtins - `empty-lines` removes unnecessary empty lines that `gofumpt` does not remove for some reason - `superfluous-else` eliminates more superfluous `else` branches Rules are also sorted alphabetically and I cleaned up various parts of `.golangci.yml`.
* Change the default maxPerPage for gitbucket (#30392)Kazushi (Jam) Marukawa2024-04-111-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | This patch improves the migration from gitbucket to gitea. gitbucket uses its own internal perPage value (= 25) for paging and ignores the per_page argument in the requested URL. This causes gitea to migrate only 25 issues and 25 PRs from a gitbucket repository. This may not happen on old gitbucket versions, but recent gitbucket 4.40 or 4.38.4 has this problem. This patch changes gitea to use gitbucket's internally hardcoded perPage as its maxPerPage number when migrating from gitbucket. There are several perPage values in gitbucket, such as 25 for issues/PRs and 10 for releases. Some of those APIs don't support paging yet. It sounds difficult to implement, but using the minimum number among them worked out very well. So, I use 10 in this patch. Brief descriptions of the problems and this patch are also available in https://github.com/go-gitea/gitea/issues/30316. In addition, I'm not sure what kind of test cases are possible to write here. It's a test for migration, so it requires a test gitbucket server and a gitea server, I guess. Please let me know if it is possible to write such test cases here. Thanks!
* Fix duplicate migrated milestones (#30102)yp053272024-03-261-2/+1
| | | Fix #17567
* remove util.OptionalBool and related functions (#29513)65432024-03-021-4/+4
| | | | | | and migrate affected code _last refactoring bits to replace **util.OptionalBool** with **optional.Option[bool]**_
* Move migration functions to services layer (#29497)Lunny Xiao2024-03-011-1/+1
|
* Include resource state events in Gitlab downloads (#29382)Sebastian Brückner2024-02-262-0/+60
| | | | | | | | | | | Some specific events on Gitlab issues and merge requests are stored separately from comments as "resource state events". With this change, all relevant resource state events are downloaded during issue and merge request migration, and converted to comments. This PR also updates the template used to render comments to add support for migrated comments of these types. ref: https://docs.gitlab.com/ee/api/resource_state_events.html
* Properly migrate target branch change GitLab comment (#29340)Sebastian Brückner2024-02-243-4/+35
| | | | | | | | | | | | | | | | | | GitLab generates "system notes" whenever an event happens within the platform. Unlike Gitea, those events are stored and retrieved as text comments with no semantic details. The only way to tell whether a comment was generated in this manner is the `system` flag on the note type. This PR adds detection for a new specific kind of event: Changing the target branch of a PR. When detected, it is downloaded using Gitea's type for this event, and eventually uploaded into Gitea in the expected format, i.e. with no text content in the comment. This PR also updates the template used to render comments to add support for migrated comments of this type. ref: https://gitlab.com/gitlab-org/gitlab/-/blob/11bd6dc826e0bea2832324a1d7356949a9398884/app/services/system_notes/merge_requests_service.rb#L102
* Use the database object format name but not read from git repository ↵Lunny Xiao2024-02-241-3/+13
| | | | | | | | | | | | | everytime and fix possible migration wrong objectformat when migrating a sha256 repository (#29294) Now we can get the object format name from the git command line or from the database repository table. Assuming the column is right, we don't need to read from the git command line every time. This also fixes a possible bug where the object format is wrong when migrating a sha256 repository from an external source.
* Properly migrate automatic merge GitLab comments (#27873)Sebastian Brückner2024-02-223-24/+93
| | | | | | | | | | | | | | | | | | | | | | | | GitLab generates "system notes" whenever an event happens within the platform. Unlike Gitea, those events are stored and retrieved as text comments with no semantic details. The only way to tell whether a comment was generated in this manner is the `system` flag on the note type. This PR adds detection for two specific kinds of events: Scheduling and un-scheduling of automatic merges on a PR. When detected, they are downloaded using Gitea's type for these events, and eventually uploaded into Gitea in the expected format, i.e. with no text content in the comment. This PR also updates the template used to render comments to add support for migrated comments of these two types. ref: https://gitlab.com/gitlab-org/gitlab/-/blob/11bd6dc826e0bea2832324a1d7356949a9398884/app/services/system_notes/merge_requests_service.rb#L6-L17 --------- Co-authored-by: silverwind <me@silverwind.io> Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Simplify how git repositories are opened (#28937)Lunny Xiao2024-01-272-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## Purpose This is a refactor toward building an abstraction over managing git repositories. Afterwards, it does not matter anymore if they are stored on the local disk or somewhere remote. ## What this PR changes We used `git.OpenRepository` everywhere previously. Now, we should split them into two distinct functions: Firstly, there are temporary repositories which do not change: ```go git.OpenRepository(ctx, diskPath) ``` Gitea managed repositories having a record in the database in the `repository` table are moved into the new package `gitrepo`: ```go gitrepo.OpenRepository(ctx, repo_model.Repo) ``` Why is `repo_model.Repository` the second parameter instead of file path? Because then we can easily adapt our repository storage strategy. The repositories can be stored locally, however, they could just as well be stored on a remote server. ## Further changes in other PRs - A Git Command wrapper on package `gitrepo` could be created. i.e. `NewCommand(ctx, repo_model.Repository, commands...)`. `git.RunOpts{Dir: repo.RepoPath()}`, the directory should be empty before invoking this method and it can be filled in the function only. #28940 - Remove the `RepoPath()`/`WikiPath()` functions to reduce the possibility of mistakes. --------- Co-authored-by: delvh <dev.lh@web.de>
* Only migrate the first 255 chars of a Github issue title (#28902)JakobDev2024-01-241-1/+2
| | | Fixes #28846
* Move more functions to db.Find (#28419)Lunny Xiao2024-01-151-2/+4
| | | | | | | | | Following #28220 This PR moves more functions to use `db.Find`. --------- Co-authored-by: delvh <dev.lh@web.de>
* Use known issue IID to generate new PR index number when migrating from ↵wxiaoguang2023-12-262-11/+45
| | | | | GitLab (#28616) Fix #13884
* Bump google/go-github to v57 (#28514)Yevhen Pavlov2023-12-182-5/+5
|
* Adjust object format interface (#28469)Lunny Xiao2023-12-172-2/+2
| | | | | | | - Remove `ObjectFormatID` - Remove function `ObjectFormatFromID`. - Use `Sha1ObjectFormat` directly but not a pointer because it's an empty struct. - Store `ObjectFormatName` in `repository` struct
* Abstract hash function usage (#28138)Adam Majer2023-12-133-5/+8
| | | | | | Refactor Hash interfaces and centralize hash function. This will allow easier introduction of different hash function later on. This forms the "no-op" part of the SHA256 enablement patch.
* Second part of refactor `db.Find` (#28194)Lunny Xiao2023-12-111-6/+6
| | | Continuation of #27798; moves more functions to `db.Find` and `db.Count`.
* Fix migration panic due to an empty review comment diff (#28334)Nanguan Lin2023-12-051-1/+1
| | | | | | | | | | | | | | | | | Fix #28328 ``` func (p *PullRequestComment) GetDiffHunk() string { if p == nil || p.DiffHunk == nil { return "" } return *p.DiffHunk } ``` This function in the package `go-github` may return an empty diff. When it's empty, the following code will panic because it accesses `ss[1]` https://github.com/go-gitea/gitea/blob/ec1feedbf582b05b6a5e8c59fb2457f25d053ba2/services/migrations/gitea_uploader.go#L861-L867 https://github.com/go-gitea/gitea/blob/ec1feedbf582b05b6a5e8c59fb2457f25d053ba2/modules/git/diff.go#L97-L101
* Use db.Find instead of writing methods for every object (#28084)Lunny Xiao2023-11-241-6/+7
| | | | For those simple objects, it's unnecessary to write the find and count methods again and again.
* Fix DownloadFunc when migrating releases (#27887)Zettat1232023-11-032-6/+9
| | | | | | | | | We should not use `asset.ID` in DownloadFunc because DownloadFunc is a closure. https://github.com/go-gitea/gitea/blob/1bf5527eac6b947010c8faf408f6747de2a2384f/services/migrations/gitea_downloader.go#L284-L295 A similar bug when migrating from GitHub has been fixed in #14703. This PR fixes the bug when migrating from Gitea and GitLab.
* Fix merge base commit for fast-forwarded GitLab PRs (#27825)Sebastian Brückner2023-10-291-1/+9
| | | | | | | | | | | | | | | | | | | | | | | Due to a bug in the GitLab API, the diff_refs field is populated in the response when fetching an individual merge request, but not when fetching a list of them. That field is used to populate the merge base commit SHA. While there is detection for the merge base even when not populated by the downloader, that detection is not flawless. Specifically, when a GitLab merge request has a single commit, and gets merged with the squash strategy, the base branch will be fast-forwarded instead of a separate squash or merge commit being created. The merge base detection attempts to find the last commit on the base branch that is also on the PR branch, but in the fast-forward case that is the PR's only commit. Assuming the head commit is also the merge base results in the import of a PR with 0 commits and no diff. This PR uses the individual merge request endpoint to fetch merge request data with the diff_refs field. With its data, the base merge commit can be properly set, which—by not relying on the detection mentioned above—correctly imports PRs that were "merged" by fast-forwarding the base branch. ref: https://gitlab.com/gitlab-org/gitlab/-/issues/29620
* Use GitLab's squash_commit_sha when available (#27824)Sebastian Brückner2023-10-291-1/+6
| | | | | | | | | | Before this PR, the PR migration code populates Gitea's MergedCommitID field by using GitLab's merge_commit_sha field. However, that field is only populated when the PR was merged using a merge strategy. When a squash strategy is used, squash_commit_sha is populated instead. Given that Gitea does not keep track of merge and squash commits separately, this PR simply populates Gitea's MergedCommitID by using whichever field is present in the GitLab API response.
* Final round of `db.DefaultContext` refactor (#27587)JakobDev2023-10-143-3/+3
| | | Last part of #27065
* Penultimate round of `db.DefaultContext` refactor (#27414)JakobDev2023-10-112-3/+3
| | | | | | | Part of #27065 --------- Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
* Even more `db.DefaultContext` refactor (#27352)JakobDev2023-10-031-1/+1
| | | | | | | | Part of #27065 --------- Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com> Co-authored-by: delvh <dev.lh@web.de>
* More `db.DefaultContext` refactor (#27265)JakobDev2023-09-291-2/+2
| | | | | | | Part of #27065 This PR touches functions used in templates. As templates are not statically typed, errors are harder to find, but I hope I caught them all. I think some tests from other people would not hurt.
* make writing main test easier (#27270)Lunny Xiao2023-09-281-4/+1
| | | | | | | | | This PR removes the `TestOptions.GiteaRoot` option from the second parameter of `unittest.MainTest`. The root directory is now detected from the current working directory. --------- Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Another round of `db.DefaultContext` refactor (#27103)JakobDev2023-09-252-4/+4
| | | | | | | Part of #27065 --------- Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
* Next round of `db.DefaultContext` refactor (#27089)JakobDev2023-09-164-11/+10
| | | Part of #27065
* More refactoring of `db.DefaultContext` (#27083)JakobDev2023-09-151-1/+1
| | | Next step of #27065
* Move some functions to service layer (#26969)Lunny Xiao2023-09-082-7/+7
|
* move repository deletion to service layer (#26948)Lunny Xiao2023-09-081-1/+1
| | | Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Move createrepository from module to service layer (#26927)Lunny Xiao2023-09-061-1/+2
| | | | Repository creation depends on many models, so moving it to service layer is better.
* Use `Set[Type]` instead of `map[Type]bool/struct{}`. (#26804)KN4CK3R2023-08-301-3/+3
|
* Add context parameter to some database functions (#26055)Lunny Xiao2023-07-221-2/+3
| | | | | To avoid deadlock problems, almost all database-related functions should have ctx as the first parameter. This PR refactors some of these functions.
* Replace `interface{}` with `any` (#25686)silverwind2023-07-044-12/+12
| | | | | Result of running `perl -p -i -e 's#interface\{\}#any#g' **/*` and `make fmt`. Basically the same [as golang did](https://github.com/golang/go/commit/2580d0e08d5e9f979b943758d3c49877fb2324cb).
* Sync branches into databases (#22743)Lunny Xiao2023-06-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Related #14180 Related #25233 Related #22639 Close #19786 Related #12763 This PR changes all branch retrieval methods from reading git data to reading the database, to reduce git read operations. - [x] Sync git branch information into the database when pushing git data - [x] Create a new table `Branch`, merge some columns of `DeletedBranch` into the `Branch` table and drop the table `DeletedBranch`. - [x] Read the `Branch` table when visiting the `code` -> `branch` page - [x] Read the `Branch` table when listing branch names in the `code` page dropdown - [x] Read the `Branch` table when listing the git ref compare page - [x] Provide a button in the admin page to manually sync all branches. - [x] Sync branches if the repository is not empty but the database branches are empty when visiting pages with a branch list - [x] Use `commit_time desc` as the default FindBranch order to stay consistent with the previous behavior; deleted branches will always be at the end. --------- Co-authored-by: Jason Song <i@wolfogre.com>
* Fix panic when migrating a repo from GitHub with issues (#25246)Jason Song2023-06-141-1/+1
| | | Fix #25245. Regression of #23946.
* Update github.com/google/go-github to v53 (#25157)Yevhen Pavlov2023-06-092-2/+2
| | | | | | The new `go-github` version [53](https://github.com/google/go-github/releases/tag/v53.0.0) has been released.
* GitLab migration: Sanitize response for reaction list (#25054)65432023-06-022-15/+67
|
* Update github.com/google/go-github to v52 (#24004)65432023-05-312-14/+7
| | | | | | | | | | based on https://github.com/google/go-github/pull/2743 because of https://github.com/go-gitea/gitea/pull/23946#discussion_r1160317554 --------- Co-authored-by: silverwind <me@silverwind.io>
* Rewrite logger system (#24726)wxiaoguang2023-05-217-35/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## ⚠️ Breaking The `log.<mode>.<logger>` style config has been dropped. If you used it, please check the new config manual & app.example.ini to make your instance output logs as expected. Although many legacy options still work, it's encouraged to upgrade to the new options. The SMTP logger is deleted because SMTP is not suitable to collect logs. If you have manually configured Gitea log options, please confirm the logger system works as expected after upgrading. ## Description Close #12082 and maybe more log-related issues, resolve some related FIXMEs in old code (which seemed unfixable before) Just like rewriting queue #24505 : make code maintainable, clear legacy bugs, and add the ability to support more writers (eg: JSON, structured log) There is a new document (with examples): `logging-config.en-us.md` This PR is safer than the queue rewriting, because it's just for logging, it won't break other logic. ## The old problems The logging system is quite old and difficult to maintain: * Unclear concepts: Logger, NamedLogger, MultiChannelledLogger, SubLogger, EventLogger, WriterLogger etc * For some code, it is difficult to know whether it is right: `log.DelNamedLogger("console")` vs `log.DelNamedLogger(log.DEFAULT)` vs `log.DelLogger("console")` * The old system heavily depends on the ini config system, it's difficult to create a new logger for a different purpose, and it's very fragile. * The "color" trick is difficult to use and read, many colors are unnecessary, and in the future structured log could help * It's difficult to add other log formats, eg: JSON format * The log outputer doesn't have full control of its goroutine, it's difficult to make the outputer have advanced behaviors * The logs could be lost in some cases: eg: no Fatal error when using CLI. * Config options are passed by JSON, which is quite fragile.
* INI package makes the KEY in `[log]` section visible in `[log.sub1]` and `[log.sub1.subA]`, this behavior is quite fragile and would cause more unclear problems, and there is no strong requirement to support `log.<mode>.<logger>` syntax. ## The new design See `logger.go` for documents. ## TODO * [x] add some new tests * [x] fix some tests * [x] test some sub-commands (manually ....) --------- Co-authored-by: Jason Song <i@wolfogre.com> Co-authored-by: delvh <dev.lh@web.de> Co-authored-by: Giteabot <teabot@gitea.io>
* Some refactors for issues stats (#24793)Lunny Xiao2023-05-191-1/+1
| | | | | | | | This PR - [x] Move some functions from `issues.go` to `issue_stats.go` and `issue_label.go` - [x] Remove duplicated issue options `UserIssueStatsOption` to keep only one `IssuesOptions`
* Make repo migration cancelable and fix various bugs (#24605)wxiaoguang2023-05-111-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | Replace #12917 Close #24601 Close #12845 --------- Co-authored-by: Yarden Shoham <git@yardenshoham.com> Co-authored-by: silverwind <me@silverwind.io> Co-authored-by: Giteabot <teabot@gitea.io>
* Rewrite queue (#24505)wxiaoguang2023-05-081-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # ⚠️ Breaking Many deprecated queue config options are removed (actually, they should have been removed in 1.18/1.19). If you see the fatal message when starting Gitea: "Please update your app.ini to remove deprecated config options", please follow the error messages to remove these options from your app.ini. Example: ``` 2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]` 2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]` 2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options ``` Many options in `[queue]` are dropped, including: `WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`, `BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed from app.ini. # The problem The old queue package has some legacy problems: * complexity: I doubt many people could tell how it works. * maintainability: Too many channels and mutex/cond are mixed together, too many different structs/interfaces depend on each other. * stability: due to the complexity & maintainability, sometimes there are strange bugs that are difficult to debug, and some code doesn't have tests (indeed some code is difficult to test because a lot of things are mixed together). * general applicability: although it is called "queue", its behavior is not that of a well-known queue. * scalability: it doesn't seem easy to make it work with a cluster without breaking its behaviors. It came from some very old code to "avoid breaking", however, its technical debt is too heavy now. It's a good time to introduce a better "queue" package. # The new queue package It keeps using old config and concept as much as possible.
* It only contains two major kinds of concepts: * The "base queue": channel, levelqueue, redis * They have the same abstraction, the same interface, and they are tested by the same testing code. * The "WorkerPoolQueue", it uses the "base queue" to provide "worker pool" function, calls the "handler" to process the data in the base queue. * The new code doesn't do "PushBack" * Think about a queue with many workers, the "PushBack" can't guarantee the order for re-queued unhandled items, so in new code it just does "normal push" * The new code doesn't do "pause/resume" * The "pause/resume" was designed to handle some handler's failure: eg: document indexer (elasticsearch) is down * If a queue is paused for a long time, either the producers block or the new items are dropped. * The new code doesn't do such "pause/resume" trick, it's not a common queue's behavior and it doesn't help much. * If there are unhandled items, the "push" function just blocks for a few seconds and then re-queues them and retries. * The new code doesn't do "worker booster" * Gitea's queue's handlers are light functions, the cost is only the go-routine, so it doesn't make sense to "boost" them. * The new code only uses "max worker number" to limit the concurrent workers. * The new "Push" never blocks forever * Instead of creating more and more blocking goroutines, returning an error is more friendly to the server and to the end user. There are more details in code comments: eg: the "Flush" problem, the strange "code.index" hanging problem, the "immediate" queue problem. Almost ready for review. TODO: * [x] add some necessary comments during review * [x] add some more tests if necessary * [x] update documents and config options * [x] test max worker / active worker * [x] re-run the CI tasks to see whether any test is flaky * [x] improve the `handleOldLengthConfiguration` to provide more friendly messages * [x] fine tune default config values (eg: length?)
* Improve test logger (#24235)wxiaoguang2023-04-211-39/+28
| | | | | | | | | | | | | Before, there was a `log/buffer.go`, but that design is not general, and it introduces a lot of irrelevant `Content() (string, error) ` and `return "", fmt.Errorf("not supported")` . And the old `log/buffer.go` is difficult to use, developers have to write a lot of `Contains` and `Sleep` code. The new `LogChecker` is designed to be a general approach to help to assert some messages appearing or not appearing in logs.
* Update github.com/google/go-github to v51 (#23946)harryzcy2023-04-082-20/+27
| | | | `github.com/google/go-github` has new major version releases frequently. It is required to update all import paths, in addition to `go.mod`
* Introduce path Clean/Join helper functions (#23495)wxiaoguang2023-03-211-2/+2
| | | | | | | | | | | | | | | Since #23493 has conflicts with latest commits, this PR is my proposal for fixing #23371 Details are in the comments And refactor the `modules/options` module, to make it always use "filepath" to access local files. Benefits: * No need to do `util.CleanPath(strings.ReplaceAll(p, "\\", "/"))), "/")` any more (not only one before) * The function behaviors are clearly defined
* Use CleanPath instead of path.Clean (#23371)Lunny Xiao2023-03-081-2/+2
| | | As title.