What is Search Service?

Search Service is a code search engine based on Zoekt (https://github.com/sourcegraph/zoekt/), a fast text search engine intended for use with source code. It lets you quickly find relevant code within your repositories.

Why is Search Service important?

Search Service is crucial for efficient code navigation and understanding. It allows you to:

  • Find code quickly: Search for specific code snippets, functions, or files within your repositories.
  • Improve code comprehension: Locate related code to understand how different parts of your codebase work together.
  • Streamline development: Efficiently locate the code you need to make changes, debug issues, or implement new features.

Managing Search Service with zoekt-indexserver

The zoekt-indexserver command manages a search service that automatically mirrors and indexes repositories. Because it indexes continuously, search results stay up to date.

Configuration

You can configure zoekt-indexserver by creating a JSON file that specifies the repositories you want to mirror and index. For example:

[
  {"GithubUser": "username"},
  {"GithubOrg": "org"},
  {"GitilesURL": "https://gerrit.googlesource.com", "Name": "zoekt"}
]

This configuration mirrors all repositories under github.com/username and github.com/org, as well as the zoekt repository hosted at https://gerrit.googlesource.com.
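
When many users or organizations need to be mirrored, the configuration file can be generated instead of written by hand. The following is a minimal sketch in Python that assumes only the three entry shapes shown above (GithubUser, GithubOrg, and GitilesURL with Name). The helper name build_mirror_config and its parameters are hypothetical and not part of Zoekt.

```python
import json


def build_mirror_config(github_users=(), github_orgs=(), gitiles=()):
    """Return a list of mirror entries in the shape shown in the example.

    `gitiles` is an iterable of (url, name) pairs. Only keys that appear in
    the example configuration are emitted; this helper is illustrative and
    is not part of Zoekt.
    """
    entries = [{"GithubUser": user} for user in github_users]
    entries += [{"GithubOrg": org} for org in github_orgs]
    entries += [{"GitilesURL": url, "Name": name} for url, name in gitiles]
    return entries


if __name__ == "__main__":
    config = build_mirror_config(
        github_users=["username"],
        github_orgs=["org"],
        gitiles=[("https://gerrit.googlesource.com", "zoekt")],
    )
    # Print the JSON so it can be redirected into config.json.
    print(json.dumps(config, indent=2))
```

Redirecting the script's output to config.json produces a file that can be passed to the -mirror_config flag.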

Running zoekt-indexserver

To start the zoekt-indexserver, run the following command:

$GOPATH/bin/zoekt-indexserver -mirror_config config.json
          

This starts the service, which continuously fetches and indexes new data from the configured repositories. It also cleans up log files.

Accessing Search Results

Search Service provides three ways to access search results:

Web Interface

To use the web interface, start zoekt-webserver:

$GOPATH/bin/zoekt-webserver -listen :6070
          

This starts a web server listening on port 6070. Open http://localhost:6070 in a browser to run searches.

JSON API

You can retrieve search results as JSON by sending a GET request to the zoekt-webserver:

curl --get \
     --url "http://localhost:6070/search" \
     --data-urlencode "q=ngram f:READ" \
     --data-urlencode "num=50" \
     --data-urlencode "format=json"

The response is a JSON object. Refer to the web.ApiSearchResult documentation for its structure.
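
The same request can be issued from a script. The sketch below uses only the Python standard library and sends the three query parameters from the curl example (q, num, format). The function names build_search_url and search are hypothetical, and the response is returned as decoded JSON without assuming any particular fields.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url, query, num=50):
    """Build the /search URL with the same parameters as the curl example."""
    params = urllib.parse.urlencode({"q": query, "num": num, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def search(base_url, query, num=50, timeout=10):
    """Send the GET request and return the decoded JSON object.

    The result is not interpreted here; see web.ApiSearchResult for the
    layout of the fields.
    """
    url = build_search_url(base_url, query, num)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example (requires zoekt-webserver listening on localhost:6070):
#   result = search("http://localhost:6070", "ngram f:READ")
#   print(json.dumps(result, indent=2))
```

Building the URL in a separate function keeps the query encoding easy to check without a running server.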

CLI

You can use the zoekt CLI to search for code:

$GOPATH/bin/zoekt 'ngram f:READ'
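
The CLI can also be driven from a script. The sketch below is a hedged example that runs the same command through Python's subprocess module. It assumes the binary lives at $GOPATH/bin/zoekt as in the command above and passes no extra flags. The helper names zoekt_command and run_zoekt are hypothetical, and the output is returned as plain text without being parsed.

```python
import os
import subprocess


def zoekt_command(query, gopath=None):
    """Return the argv list for the CLI invocation shown above."""
    gopath = gopath or os.environ.get("GOPATH", "")
    return [os.path.join(gopath, "bin", "zoekt"), query]


def run_zoekt(query, gopath=None):
    """Run the CLI and return its standard output as text.

    The query is passed as a single argument, so no shell quoting is needed.
    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    completed = subprocess.run(
        zoekt_command(query, gopath),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Example (requires the zoekt binary and an existing index):
#   print(run_zoekt("ngram f:READ"))
```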
          

Integration with Gerrit/Gitiles

Search Service can be integrated with Gerrit/Gitiles, allowing users to search for code directly from the code review and browsing system. This integration respects Gerrit/Gitiles’s access control lists (ACLs), ensuring that users only see the code they are authorized to access.

Symbol Search

Installing Universal ctags is recommended to improve search ranking. It helps Zoekt find symbol definitions and other sections within files, and that information is used to boost the ranking of relevant results.

Acknowledgements

  • Han-Wen Nienhuys for creating Zoekt
  • Alexander Neubeck for suggesting this idea and collaborating with Han-Wen Nienhuys

Top-Level Directory Explanations

cmd/ - This directory contains the command-line programs for Zoekt, one subdirectory per binary.

cmd/zoekt/ - This directory contains the main package for the CLI tool.

cmd/zoekt-indexserver/ - This directory contains the code for the index server.

cmd/zoekt-sourcegraph-indexserver/ - This directory contains the code for the Sourcegraph index server.

doc/ - This directory contains documentation for the project.

grpc/ - This directory contains the code for gRPC support.

json/ - This directory contains the code for handling JSON data.

web/ - This directory contains the code for the web server.

Entrypoints and Where to Start

cmd/zoekt/main.go - The main entrypoint for the Zoekt command-line tool, providing functions for displaying matches, loading shards, and starting various profiling modes.

cmd/zoekt-indexserver/main.go - The main entrypoint for the index server, handling logging, periodic fetching, indexing pending repos, and deleting logs.

cmd/zoekt-sourcegraph-indexserver/main.go - The main entrypoint for the Sourcegraph index server, which periodically reindexes enabled repositories on Sourcegraph.

cmd/zoekt-webserver/main.go - The main entrypoint for the web server, which serves search results and handles various HTTP requests.