- .github
- .vscode
- build
- cmd
- zoekt
- zoekt-archive-index
- zoekt-dynamic-indexserver
- zoekt-git-clone
- zoekt-git-index
- zoekt-index
- zoekt-indexserver
- zoekt-merge-index
- zoekt-mirror-bitbucket-server
- zoekt-mirror-gerrit
- zoekt-mirror-github
- zoekt-mirror-gitiles
- zoekt-mirror-gitlab
- zoekt-repo-index
- zoekt-sourcegraph-indexserver
- json_schemas
- CdsConfig.json
- EdsLoadBalancingPolicyConfig.json
- GrpcLbConfig.json
- LeastRequestLocalityLoadBalancingPolicyConfig.json
- LoadBalancingConfig.json
- LrsLoadBalancingPolicyConfig.json
- MethodConfig.json
- OutlierDetectionLoadBalancingConfig.json
- OverrideHostLoadBalancingPolicyConfig.json
- PickFirstConfig.json
- PriorityLoadBalancingPolicyConfig.json
- RingHashLoadBalancingConfig.json
- RlsLoadBalancingPolicyConfig.json
- RoundRobinConfig.json
- ServiceConfig.json
- WeightedRoundRobinLbConfig.json
- WeightedTargetLoadBalancingPolicyConfig.json
- XdsClusterImplLoadBalancingPolicyConfig.json
- XdsClusterManagerLoadBalancingPolicyConfig.json
- XdsClusterResolverLoadBalancingPolicyConfig.json
- XdsConfig.json
- XdsServer.json
- XdsWrrLocalityLoadBalancingPolicyConfig.json
- update.sh
- protos
- backoff.go
- backoff_test.go
- cleanup.go
- cleanup_test.go
- debug.go
- default_grpc_service_configuration.json
- index.go
- index_mutex.go
- index_test.go
- main.go
- main_test.go
- merge.go
- merge_test.go
- meta.go
- meta_test.go
- meta_unix.go
- meta_windows.go
- owner.go
- owner_test.go
- queue.go
- queue_test.go
- sg.go
- sg_test.go
- zoekt-test
- zoekt-webserver
- flags.go
- ctags
- debugserver
- doc
- gitindex
- grpc
- ignore
- internal
- archive
- e2e
- testdata
- Get_databaseuser.txt
- InternalDoer.txt
- Repository_metadata_Write_rbac.txt
- assets_are_not_configured_for_this_binary.txt
- bufio_buffer.txt
- bufio_flush_writer.txt
- bytes_buffer.txt
- coverage_data_writer.txt
- generate_unit_test.txt
- graphql_type_User.txt
- r_cody_sourcegraph_url.txt
- rank_stats.txt
- sourcegraphserver_docker_image_build.txt
- test_server.txt
- time_compare.txt
- zoekt_searcher.txt
- doc.go
- e2e_rank_test.go
- e2e_test.go
- languages
- mockSearcher
- otlpenv
- profiler
- syntaxutil
- tracer
- json
- query
- shards
- testdata
- trace
- web
- .bazelignore
- .dockerignore
- .gitignore
- .tool-versions
- Dockerfile
- Dockerfile.indexserver
- Dockerfile.webserver
- LICENSE
- README.md
- all.bash
- api.go
- api_proto.go
- api_proto_test.go
- api_test.go
- bits.go
- bits_test.go
- btree.go
- btree_test.go
- contentprovider.go
- contentprovider_test.go
- eval.go
- eval_test.go
- flake.lock
- flake.nix
- gen-proto.sh
- go.mod
- go.sum
- hititer.go
- hititer_test.go
- index_test.go
- indexbuilder.go
- indexdata.go
- indexdata_test.go
- indexfile_other.go
- indexfile_unix.go
- install-ctags-alpine.sh
- limit.go
- limit_test.go
- marshal.go
- marshal_test.go
- matchiter.go
- matchtree.go
- matchtree_test.go
- merge.go
- merge_test.go
- read.go
- read_test.go
- score.go
- score_test.go
- section.go
- shell.nix
- toc.go
- tombstones.go
- tombstones_test.go
- tombstones_unix.go
- tombstones_windows.go
- write.go
Explanation
Explanation of cmd/zoekt-mirror-github/main.go
This Go program, part of the Zoekt project, is a GitHub repository mirror. It fetches repositories belonging to a user or organization and clones them to a local directory.
Key Features:
- Token Authentication: It reads a personal access token (PAT) from a file and uses it to authenticate GitHub API requests.
- GitHub Enterprise Support: The code handles both GitHub.com and GitHub Enterprise instances with appropriate API URL configuration.
- Repository Filtering:
  - It filters repositories based on user/organization name, fork status, regular expression matching (name and exclude patterns), and topics.
  - The `filterRepositories` function filters repos based on provided include and exclude topics, as well as the `noArchived` flag.
- Clones and Configuration: It clones each matching repository, populating metadata like star count, watchers, and fork status within Zoekt’s internal configuration.
- Stale Repo Removal: An optional feature to delete repositories locally that are no longer present on the remote GitHub instance, ensuring the mirror stays synchronized.
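The topic filtering described above can be sketched as follows. This is a minimal illustration, not the program's actual code: the `repo` struct is a hypothetical stand-in for the repository type returned by the GitHub client, and treating an empty include list as "match everything" is an assumption of this sketch.

```go
package main

import "fmt"

// repo is a hypothetical stand-in for the GitHub client's repository type;
// only the fields needed for filtering are modeled here.
type repo struct {
	Name     string
	Topics   []string
	Archived bool
}

// hasIntersection reports whether the two slices share at least one element.
func hasIntersection(a, b []string) bool {
	seen := make(map[string]struct{}, len(a))
	for _, s := range a {
		seen[s] = struct{}{}
	}
	for _, s := range b {
		if _, ok := seen[s]; ok {
			return true
		}
	}
	return false
}

// filterRepositories keeps repos that carry at least one include topic
// (assumption: every repo passes when include is empty), carry no exclude
// topic, and are not archived when noArchived is set.
func filterRepositories(repos []repo, include, exclude []string, noArchived bool) []repo {
	var kept []repo
	for _, r := range repos {
		if noArchived && r.Archived {
			continue
		}
		if len(include) > 0 && !hasIntersection(include, r.Topics) {
			continue
		}
		if hasIntersection(exclude, r.Topics) {
			continue
		}
		kept = append(kept, r)
	}
	return kept
}

func main() {
	repos := []repo{
		{Name: "search", Topics: []string{"go", "search"}},
		{Name: "legacy", Topics: []string{"go"}, Archived: true},
		{Name: "docs", Topics: []string{"docs"}},
		{Name: "experimental", Topics: []string{"go", "wip"}},
	}
	// Include topic "go", exclude topic "wip", skip archived repos.
	for _, r := range filterRepositories(repos, []string{"go"}, []string{"wip"}, true) {
		fmt.Println(r.Name) // prints: search
	}
}
```

Building a set from one slice keeps the intersection check linear in the combined length of both slices, which matters little for topic lists but keeps the helper simple to reason about.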
Code Breakdown:
Flags:
- `dest`: Path to the destination directory for cloned repositories.
- `url`: GitHub Enterprise URL (if applicable).
- `org`: Organization to mirror.
- `user`: User to mirror.
- `token`: File containing the API token.
- `forks`: Include forks in the mirror.
- `delete`: Delete missing repositories from the mirror.
- `name`, `exclude`: Regular expressions for filtering repository names.
- `topic`, `exclude_topic`: Topics for inclusion and exclusion of repositories.
- `no_archived`: Only mirror non-archived repositories.
GitHub Client:
The code initializes a GitHub client, leveraging the `google/go-github` library for API interaction. It handles both GitHub.com and GitHub Enterprise scenarios.
Repository Fetching:
The `getOrgRepos` and `getUserRepos` functions retrieve repositories from the specified organization or user, respectively, using GitHub API calls. They iterate through pages of results until all repositories are fetched.
Cloning:
`cloneRepos` iterates through the filtered repositories and clones them to the designated destination directory. It populates a configuration map with relevant metadata for Zoekt.
Stale Repo Removal:
The `deleteStaleRepos` function uses the `gitindex` package to compare local repositories with the remote repository list and delete any missing ones.
Helper Functions:
- `hasIntersection`: Checks if two string slices have common elements.
- `itoa`: Converts an integer pointer to a string.
- `marshalBool`: Converts a boolean value to a string (“0” or “1”).
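The two conversion helpers exist because the metadata map holds only strings. A minimal sketch of how they might look and be used is below. Returning an empty string for a nil pointer is an assumption of this sketch, and the map keys are hypothetical placeholders rather than the keys the program writes.

```go
package main

import (
	"fmt"
	"strconv"
)

// itoa converts an integer pointer to its decimal string form.
// Assumption: a nil pointer yields an empty string.
func itoa(p *int) string {
	if p == nil {
		return ""
	}
	return strconv.Itoa(*p)
}

// marshalBool encodes a boolean as "1" (true) or "0" (false).
func marshalBool(b bool) string {
	if b {
		return "1"
	}
	return "0"
}

func main() {
	stars := 42
	// Hypothetical keys, chosen only for illustration.
	meta := map[string]string{
		"example.stars": itoa(&stars),
		"example.fork":  marshalBool(false),
	}
	fmt.Println(meta["example.stars"], meta["example.fork"]) // prints: 42 0
}
```

API client libraries often expose optional numeric fields as pointers, so a nil-safe conversion avoids a panic when a field is absent from the response.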
Overall:
This program mirrors GitHub repositories to a local directory so that Zoekt can index and search them. Token authentication, repository filtering, per-repository metadata, and optional stale repo removal keep the local mirror aligned with the remote GitHub instance.
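The page-by-page fetching described under Repository Fetching can be sketched without a network dependency as follows. The `page` type and `listFunc` signature are hypothetical stand-ins for a paginated API call. The convention that a next-page value of zero means "no more pages" mirrors how go-github reports pagination, but the real functions may be structured differently.

```go
package main

import "fmt"

// page is a hypothetical API response: one batch of repository names plus
// the number of the next page (0 when there are no further pages).
type page struct {
	Names    []string
	NextPage int
}

// listFunc is a hypothetical stand-in for a paginated list call.
type listFunc func(pageNum int) (page, error)

// fetchAll requests pages starting at 1 and accumulates their contents
// until the response reports no next page.
func fetchAll(list listFunc) ([]string, error) {
	var all []string
	for pageNum := 1; pageNum != 0; {
		p, err := list(pageNum)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Names...)
		pageNum = p.NextPage
	}
	return all, nil
}

func main() {
	// Fake two-page data set standing in for the remote API.
	data := map[int]page{
		1: {Names: []string{"a", "b"}, NextPage: 2},
		2: {Names: []string{"c"}, NextPage: 0},
	}
	names, err := fetchAll(func(n int) (page, error) { return data[n], nil })
	fmt.Println(names, err) // prints: [a b c] <nil>
}
```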