path: root/modules/setting/indexer.go
Commit message | Author | Age | Files | Lines
* Fix settings not being loaded at CLI (#26402) (#33048) | Giteabot | 2 days | 1 | -1/+1
    Backport #26402 by cassiozareck
    Closes #25898
    Signed-off-by: cassiozareck <cassiomilczareck@gmail.com>
    Co-authored-by: cassio zareck <121526696+cassiozareck@users.noreply.github.com>
    Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
    Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Improve grep search (#30843) | wxiaoguang | 2024-05-03 | 1 | -7/+5
    Reduce the number of context lines to 1, make the "git grep" search respect the include/exclude patterns, and fix #30785
* Refactor startup deprecation messages (#30305) | wxiaoguang | 2024-04-07 | 1 | -1/+1
    It doesn't change logic, it only does:
    1. Rename the variable and function names
    2. Use a more consistent format when mentioning config section and key
    3. Improve some messages
* Do not allow different storage configurations to point to the same directory (#30169) | wxiaoguang | 2024-03-31 | 1 | -1/+1
    Replace #29171
* Disallow duplicate storage paths (#26484) | Lunny Xiao | 2024-02-09 | 1 | -14/+17
    Replace #26380
* Allow skipping forks and mirrors from being indexed (#23187) | techknowlogick | 2023-05-25 | 1 | -16/+19
    This PR adds two new options to disable repo/code search indexing of both forks and mirrors.
    Related: #22842
* Rewrite queue (#24505) | wxiaoguang | 2023-05-08 | 1 | -9/+0
    # ⚠️ Breaking

    Many deprecated queue config options are removed (actually, they should have been removed in 1.18/1.19).

    If you see the fatal message when starting Gitea: "Please update your app.ini to remove deprecated config options", please follow the error messages to remove these options from your app.ini.

    Example:

    ```
    2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
    2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
    2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
    ```

    Many options in `[queue]` are dropped, including: `WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`, `BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`. They can be removed from app.ini.

    # The problem

    The old queue package has some legacy problems:

    * complexity: I doubt many people could tell how it works.
    * maintainability: Too many channels and mutex/cond are mixed together, and too many different structs/interfaces depend on each other.
    * stability: due to the complexity and maintainability problems, there are sometimes strange bugs that are difficult to debug, and some code doesn't have tests (indeed some code is difficult to test because a lot of things are mixed together).
    * general applicability: although it is called "queue", its behavior is not that of a well-known queue.
    * scalability: it doesn't seem easy to make it work with a cluster without breaking its behaviors.

    It came from some very old code written to "avoid breaking"; however, its technical debt is too heavy now. It's a good time to introduce a better "queue" package.

    # The new queue package

    It keeps using the old config and concepts as much as possible.

    * It only contains two major kinds of concepts:
        * The "base queue": channel, levelqueue, redis
            * They have the same abstraction, the same interface, and they are tested by the same testing code.
        * The "WorkerPoolQueue": it uses the "base queue" to provide the "worker pool" function and calls the "handler" to process the data in the base queue.
    * The new code doesn't do "PushBack"
        * Think about a queue with many workers: "PushBack" can't guarantee the order of re-queued unhandled items, so the new code just does a "normal push".
    * The new code doesn't do "pause/resume"
        * "pause/resume" was designed to handle a handler's failure, e.g. the document indexer (elasticsearch) is down.
        * If a queue is paused for a long time, either the producers block or the new items are dropped.
        * The new code doesn't do such a "pause/resume" trick; it's not a common queue behavior and it doesn't help much.
        * If there are unhandled items, the "push" function just blocks for a few seconds, then re-queues them and retries.
    * The new code doesn't do "worker booster"
        * Gitea's queue handlers are light functions and the only cost is the goroutine, so it doesn't make sense to "boost" them.
        * The new code only uses a "max worker number" to limit the concurrent workers.
    * The new "Push" never blocks forever
        * Instead of creating more and more blocking goroutines, returning an error is more friendly to the server and to the end user.

    There are more details in code comments, e.g. the "Flush" problem, the strange "code.index" hanging problem, the "immediate" queue problem.

    Almost ready for review.

    TODO:

    * [x] add some necessary comments during review
    * [x] add some more tests if necessary
    * [x] update documents and config options
    * [x] test max worker / active worker
    * [x] re-run the CI tasks to see whether any test is flaky
    * [x] improve the `handleOldLengthConfiguration` to provide more friendly messages
    * [x] fine tune default config values (eg: length?)

    ## Code coverage:

    ![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
* Add meilisearch support (#23136) | techknowlogick | 2023-03-28 | 1 | -0/+15
    Fixes #20665
* handle deprecated settings (#22992) | Lunny Xiao | 2023-02-20 | 1 | -6/+7
    Fix #22736
* Refactor the setting to make unit test easier (#22405) | Lunny Xiao | 2023-02-20 | 1 | -7/+7
    Some bugs are caused by a lack of unit tests in fundamental packages. This PR refactors the `setting` package so that creating a unit test is easier than before.
    - Each `LoadFromXXX` function has been split into two functions, `InitProviderFromXXX` and `LoadCommonSettings`. The first only includes the code to create or instantiate an ini file. The second loads the common settings.
    - It also renames all functions in setting from `newXXXService` to `loadXXXSetting` or `loadXXXFrom` to make the function names less confusing.
    - Rename `XORMLog` to `SQLLog` because it's a better name.
    Maybe we should finally move these `loadXXXSetting` into the `XXXInit` function? Any idea?
    ---------
    Co-authored-by: 6543 <6543@obermui.de>
    Co-authored-by: delvh <dev.lh@web.de>
* Implement FSFE REUSE for golang files (#21840) | flynnnnnnnnnn | 2022-11-27 | 1 | -2/+1
    Change all license headers to comply with REUSE specification.
    Fix #16132
    Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
    Co-authored-by: John Olheiser <john.olheiser@gmail.com>
* format with gofumpt (#18184) | 6543 | 2022-01-20 | 1 | -31/+29
    * gofumpt -w -l .
    * gofumpt -w -l -extra .
    * Add linter
    * manual fix
    * change make fmt
* Enable deprecation error for v1.17.0 (#18341) | Gusted | 2022-01-20 | 1 | -23/+11
    Co-authored-by: Andrew Thornton <art27@cantab.net>
* Fix various documentation, user-facing, and source comment typos (#16367) | luzpaz | 2021-07-08 | 1 | -1/+1
    * Fix various doc, user-facing, and source comment typos
    Found via `codespell -q 3 -S ./options/locale,./vendor -L ba,pullrequest,pullrequests,readby`
* Clean-up the settings hierarchy for issue_indexer queue (#16001) | zeripath | 2021-06-16 | 1 | -17/+15
    There are a couple of settings in `[indexer]` relating to the `issue_indexer` queue which override settings in unpredictable ways. This PR adjusts this hierarchy and makes explicit that these settings are deprecated.
    Signed-off-by: Andrew Thornton <art27@cantab.net>
    Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Use filepath.ToSlash and Join in indexer defaults and queues (#15971) | zeripath | 2021-05-25 | 1 | -6/+5
    As revealed by #15964, `filepath.Join` and `path.Join` are used inconsistently for these directories. The best thing to do is to use `filepath.Join` and then `filepath.ToSlash` the result for consistency.
    Signed-off-by: Andrew Thornton <art27@cantab.net>
    Co-authored-by: John Olheiser <john.olheiser@gmail.com>
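The Join-then-ToSlash approach described in this commit might look like the sketch below. The helper name `indexerPath` and the sample directory names are illustrative assumptions, not code taken from the commit; only `filepath.Join` and `filepath.ToSlash` are the real standard-library calls.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// indexerPath is a hypothetical helper: it joins with the OS-specific
// separator (which also cleans the path) and then converts the result
// to forward slashes, so the value looks the same on every platform.
func indexerPath(dataDir, rel string) string {
	return filepath.ToSlash(filepath.Join(dataDir, rel))
}

func main() {
	// The directory names are sample values chosen for this sketch.
	fmt.Println(indexerPath("data", "indexers/issues.bleve"))
}
```

On Windows `filepath.Join` produces backslashes, which `filepath.ToSlash` converts back; on other systems the conversion is a no-op.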
* Change default queue settings to be low go-routines (#15964) | zeripath | 2021-05-24 | 1 | -2/+2
    This PR suggests a change to the default configuration for queues:
    * Use a common DATADIR for the queues
    * Set starting workers to 0 and make boost a single worker
    Signed-off-by: Andrew Thornton <art27@cantab.net>
* Avoid setting the CONN_STR in issue indexer queue unless it is meant to be set (#13069) | zeripath | 2020-10-07 | 1 | -1/+1
    Since the move to common leveldb and common redis, the disk queue code (#12385) will check the connection string before defaulting to the DATADIR. Therefore we should ensure that the connection string is kept empty unless it is actually set.
    Unfortunately the issue indexer was missed in #13025; this PR fixes this omission.
    Fix #13062
    Signed-off-by: Andrew Thornton <art27@cantab.net>
* Support elastic search for code search (#10273) | Lunny Xiao | 2020-08-30 | 1 | -0/+12
    * Support elastic search for code search
    * Finished elastic search implementation and add some tests
    * Enable test on drone and added docs
    * Add new fields to elastic search
    * Fix bug
    * remove unused changes
    * Use indexer alias to keep the gitea indexer version
    * Improve codes
    * Some code improvements
    * The real indexer name changed to xxx.v1
    Co-authored-by: zeripath <art27@cantab.net>
* Add detected file language to code search (#10256) | Lauris BH | 2020-02-20 | 1 | -0/+3
    Move language detection to a separate module to be more reusable
    Add option to disable vendored file exclusion from file search
    Always show all language stats for search
* Issue search support elasticsearch (#9428) | Lunny Xiao | 2020-02-13 | 1 | -11/+19
    * Issue search support elasticsearch
    * Fix lint
    * Add indexer name on app.ini
    * add a warning on SearchIssuesByKeyword
    * improve code
* Refactor code indexer (#9313) | Lunny Xiao | 2019-12-23 | 1 | -0/+2
    * Refactor code indexer
    * fix test
    * fix test
    * refactor code indexer
    * fix import
    * improve code
    * fix typo
    * fix test and make code clean
    * fix lint
* Restore Graceful Restarting & Socket Activation (#7274) | zeripath | 2019-10-15 | 1 | -0/+3
    * Prevent deadlock in indexer initialisation during graceful restart
    * Move from gracehttp to our own service to add graceful ssh
    * Add timeout for start of indexers and make hammer time configurable
    * Fix issue with re-initialization in indexer during tests
    * move the code to detect use of closed to graceful
    * Handle logs gracefully - add a pid suffix just before restart
    * Move to using a cond and a holder for indexers
    * use time.Since
    * Add some comments and attribution
    * update modules.txt
    * Use zero to disable timeout
    * Move RestartProcess to its own file
    * Add cleanup routine
* Restrict repository indexing by glob match (#7767) | guillep2k | 2019-09-11 | 1 | -0/+26
    * Restrict repository indexing by file extension
    * Use REPO_EXTENSIONS_LIST_INCLUDE instead of REPO_EXTENSIONS_LIST_EXCLUDE and have a more flexible extension pattern
    * Corrected to pass lint gosimple
    * Add wildcard support to REPO_INDEXER_EXTENSIONS
    * This reverts commit 72a650c8e42f4abf59d5df7cd5dc27b451494cc6.
    * Add wildcard support to REPO_INDEXER_EXTENSIONS (no make vendor)
    * Simplify isIndexable() for better clarity
    * Add gobwas/glob to vendors
    * manually set appengine new release
    * Implement better REPO_INDEXER_INCLUDE and REPO_INDEXER_EXCLUDE
    * Add unit and integration tests
    * Update app.ini.sample and reword config-cheat-sheet
    * Add doc page and correct app.ini.sample
    * Some polish on the doc
    * Simplify code as suggested by @lafriks
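A rough sketch of include/exclude filtering by glob is shown below. The commit vendors gobwas/glob; to stay self-contained this sketch swaps in the standard library's `path.Match`, whose `*` does not cross `/`, so it only handles bare file names. The functions `parsePatterns`, `matchesAny` and this version of `isIndexable`, as well as the rule "include first, then exclude", are assumptions made for illustration rather than the commit's actual logic.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// parsePatterns splits a comma-separated pattern list, trimming blanks
// and lower-casing each entry (hypothetical helper for this sketch).
func parsePatterns(list string) []string {
	var out []string
	for _, p := range strings.Split(list, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, strings.ToLower(p))
		}
	}
	return out
}

// matchesAny reports whether name matches at least one pattern.
// Malformed patterns are treated as non-matching.
func matchesAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// isIndexable applies the assumed rule: a file must match an include
// pattern (when any are configured) and must not match an exclude pattern.
func isIndexable(include, exclude []string, filename string) bool {
	name := strings.ToLower(filename)
	if len(include) > 0 && !matchesAny(include, name) {
		return false
	}
	return !matchesAny(exclude, name)
}

func main() {
	include := parsePatterns("*.go, *.md")
	exclude := parsePatterns("*_test.go")
	for _, f := range []string{"main.go", "main_test.go", "logo.png"} {
		fmt.Println(f, isIndexable(include, exclude, f))
	}
}
```

With these sample patterns, `main.go` is accepted, `main_test.go` is rejected by the exclude list, and `logo.png` is rejected because it matches no include pattern.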
* Issue indexer queue redis support (#6218) | Lunny Xiao | 2019-04-08 | 1 | -17/+21
    * add redis queue
    * finished indexer redis queue
    * add redis vendor
    * fix vet
    * Update docs/content/doc/advanced/config-cheat-sheet.en-us.md
      Co-Authored-By: lunny <xiaolunwen@gmail.com>
    * switch to go mod
    * Update required changes for new logging func signatures
* Add more tests and docs for issue indexer, add db indexer type for searching from database (#6144) | Lunny Xiao | 2019-02-21 | 1 | -0/+1
    * add more tests and docs for issue indexer, add db indexer type for searching from database
    * fix typo
    * fix typo
    * fix lint
    * improve docs
* Refactor issue indexer (#5363) | Lunny Xiao | 2019-02-19 | 1 | -0/+55