summaryrefslogtreecommitdiffstats
path: root/modules/indexer/code/search.go
Commit message (Collapse)AuthorAgeFilesLines
* Render embedded code preview by permlink in markdown (#30234)wxiaoguang2024-04-021-7/+9
| | | | | The permlink in markdown will be rendered as a code preview block, like GitHub Co-authored-by: silverwind <me@silverwind.io>
* Support repo code search without setting up an indexer (#29998)wxiaoguang2024-03-241-17/+18
| | | | | | | | | | | | | | | | | By using git's ability, end users (especially small instance users) do not need to enable the indexer, they could also benefit from the code searching feature. Fix #29996 ![image](https://github.com/go-gitea/gitea/assets/2114189/11b7e458-88a4-480d-b4d7-72ee59406dd1) ![image](https://github.com/go-gitea/gitea/assets/2114189/0fe777d5-c95c-4288-a818-0427680805b6) --------- Co-authored-by: silverwind <me@silverwind.io>
* Refactor code_indexer to use an SearchOptions struct for PerformSearch (#29724)65432024-03-161-3/+5
| | | | | | | | similar to how it's already done for the issue_indexer --- *Sponsored by Kithara Software GmbH*
* Patch in exact search for meilisearch (#29671)65432024-03-091-2/+3
| | | | | | | | | | | | | | | | | | | meilisearch does not have an search option to contorl fuzzynes per query right now: - https://github.com/meilisearch/meilisearch/issues/1192 - https://github.com/orgs/meilisearch/discussions/377 - https://github.com/meilisearch/meilisearch/discussions/1096 so we have to create a workaround by post-filter the search result in gitea until this is addressed. For future works I added an option in backend only atm, to enable fuzzynes for issue indexer too. And also refactored the code so the fuzzy option is equal in logic to code indexer --- *Sponsored by Kithara Software GmbH*
* Fix wrong line number in code search result (#29260)yp053272024-03-061-19/+31
| | | | | | | | | | | Fix #29136 Before: The result is a table and all line numbers are all in one row. After: Use a separate table column for the line numbers. --------- Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Add option to disable ambiguous unicode characters detection (#28454)wxiaoguang2023-12-171-1/+2
| | | | | | | | * Close #24483 * Close #28123 * Close #23682 * Close #23149 (maybe more)
* Use Go 1.21 and update dependencies (#26878)wxiaoguang2023-09-031-3/+2
| | | | | | To make sure Gitea's next release's lifecycle could have active Golang support. And min/max are builtin now.
* Refactor indexer (#25174)Jason Song2023-06-231-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor `modules/indexer` to make it more maintainable. And it can be easier to support more features. I'm trying to solve some of issue searching, this is a precursor to making functional changes. Current supported engines and the index versions: | engines | issues | code | | - | - | - | | db | Just a wrapper for database queries, doesn't need version | - | | bleve | The version of index is **2** | The version of index is **6** | | elasticsearch | The old index has no version, will be treated as version **0** in this PR | The version of index is **1** | | meilisearch | The old index has no version, will be treated as version **0** in this PR | - | ## Changes ### Split Splited it into mutiple packages ```text indexer ├── internal │   ├── bleve │   ├── db │   ├── elasticsearch │   └── meilisearch ├── code │   ├── bleve │   ├── elasticsearch │   └── internal └── issues ├── bleve ├── db ├── elasticsearch ├── internal └── meilisearch ``` - `indexer/interanal`: Internal shared package for indexer. - `indexer/interanal/[engine]`: Internal shared package for each engine (bleve/db/elasticsearch/meilisearch). - `indexer/code`: Implementations for code indexer. - `indexer/code/internal`: Internal shared package for code indexer. - `indexer/code/[engine]`: Implementation via each engine for code indexer. - `indexer/issues`: Implementations for issues indexer. ### Deduplication - Combine `Init/Ping/Close` for code indexer and issues indexer. - ~Combine `issues.indexerHolder` and `code.wrappedIndexer` to `internal.IndexHolder`.~ Remove it, use dummy indexer instead when the indexer is not ready. - Duplicate two copies of creating ES clients. - Duplicate two copies of `indexerID()`. ### Enhancement - [x] Support index version for elasticsearch issues indexer, the old index without version will be treated as version 0. - [x] Fix spell of `elastic_search/ElasticSearch`, it should be `Elasticsearch`. - [x] Improve versioning of ES index. We don't need `Aliases`: - Gitea does't need aliases for "Zero Downtime" because it never delete old indexes. - The old code of issues indexer uses the orignal name to create issue index, so it's tricky to convert it to an alias. - [x] Support index version for meilisearch issues indexer, the old index without version will be treated as version 0. - [x] Do "ping" only when `Ping` has been called, don't ping periodically and cache the status. - [x] Support the context parameter whenever possible. - [x] Fix outdated example config. - [x] Give up the requeue logic of issues indexer: When indexing fails, call Ping to check if it was caused by the engine being unavailable, and only requeue the task if the engine is unavailable. - It is fragile and tricky, could cause data losing (It did happen when I was doing some tests for this PR). And it works for ES only. - Just always requeue the failed task, if it caused by bad data, it's a bug of Gitea which should be fixed. --------- Co-authored-by: Giteabot <teabot@gitea.io>
* Implement FSFE REUSE for golang files (#21840)flynnnnnnnnnn2022-11-271-2/+1
| | | | | | | | | Change all license headers to comply with REUSE specification. Fix #16132 Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github> Co-authored-by: John Olheiser <john.olheiser@gmail.com>
* Show syntax lexer name in file view/blame (#21814)silverwind2022-11-191-1/+4
| | | | | | | | | | | | | | | | | | | | | | Show which Chroma Lexer is used to highlight the file in the file header. It's useful for development to see what was detected, and I think it's not bad info to have for the user: <img width="233" alt="Screenshot 2022-11-14 at 22 31 16" src="https://user-images.githubusercontent.com/115237/201770854-44933dfc-70a4-487c-8457-1bb3cc43ea62.png"> <img width="226" alt="Screenshot 2022-11-14 at 22 36 06" src="https://user-images.githubusercontent.com/115237/201770856-9260ce6f-6c0f-442c-92b5-201e5b113188.png"> <img width="194" alt="Screenshot 2022-11-14 at 22 36 26" src="https://user-images.githubusercontent.com/115237/201770857-6f56591b-80ea-42cc-8ea5-21b9156c018b.png"> Also, I improved the way this header overflows on small screens: <img width="354" alt="Screenshot 2022-11-14 at 22 44 36" src="https://user-images.githubusercontent.com/115237/201774828-2ddbcde1-da15-403f-bf7a-6248449fa2c5.png"> Co-authored-by: delvh <dev.lh@web.de> Co-authored-by: Lauris BH <lauris@nix.lv> Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com> Co-authored-by: John Olheiser <john.olheiser@gmail.com>
* Automatically pause queue if index service is unavailable (#15066)Lauris BH2022-01-271-2/+3
| | | | | | * Handle keyword search error when issue indexer service is not available * Implement automatic disabling and resume of code indexer queue
* Add .gitattribute assisted language detection to blame, diff and render (#17590)zeripath2021-11-171-1/+1
| | | | | | | Use check attribute code to check the assigned language of a file and send that in to chroma as a hint for the language of the file. Signed-off-by: Andrew Thornton <art27@cantab.net>
* [Feature] add precise search type for Elastic Search (#12869)Jui-Nan Lin2021-01-271-2/+2
| | | | | | | | * feat: add type query parameters for specifying precise search * feat: add select dropdown in search box Co-authored-by: Lauris BH <lauris@nix.lv> Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Server-side syntax highlighting for all code (#12047)mrsdizzie2020-07-011-16/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Server-side syntax hilighting for all code This PR does a few things: * Remove all traces of highlight.js * Use chroma library to provide fast syntax hilighting directly on the server * Provide syntax hilighting for diffs * Re-style both unified and split diffs views * Add custom syntax hilighting styling for both regular and arc-green Fixes #7729 Fixes #10157 Fixes #11825 Fixes #7728 Fixes #3872 Fixes #3682 And perhaps gets closer to #9553 * fix line marker * fix repo search * Fix single line select * properly load settings * npm uninstall highlight.js * review suggestion * code review * forgot to call function * fix test * Apply suggestions from code review suggestions from @silverwind thanks Co-authored-by: silverwind <me@silverwind.io> * code review * copy/paste error * Use const for highlight size limit * Update web_src/less/_repository.less Co-authored-by: Lauris BH <lauris@nix.lv> * update size limit to 1MB and other styling tweaks * fix highlighting for certain diff sections * fix test * add worker back as suggested Co-authored-by: silverwind <me@silverwind.io> Co-authored-by: Lauris BH <lauris@nix.lv>
* Add detected file language to code search (#10256)Lauris BH2020-02-201-6/+15
| | | | | | | Move langauge detection to separate module to be more reusable Add option to disable vendored file exclusion from file search Allways show all language stats for search
* Refactor code indexer (#9313)Lunny Xiao2019-12-231-0/+130
* Refactor code indexer * fix test * fix test * refactor code indexer * fix import * improve code * fix typo * fix test and make code clean * fix lint