Contributing.md

How to contribute

Contributing Guidelines

pingcap/tidb.ai is an open-source project and we welcome contributions from the community. If you are interested in contributing to the project, please read the following guidelines.

Before You Get Started

Software Prerequisites for Development

Before you begin, make sure the following prerequisite software is installed on your local machine:

Setting up your development environment

Setting up the project on your local machine is the first step toward contributing. Clone the project from the GitHub repository, then follow the instructions in the Deployment Guide to set it up and start it locally.

Your First Contribution

All set to participate in the project? You can start by looking at the open issues in this repo.

Components of the Project

The project is divided into several components, and you can contribute to any of the following components:

  • Frontend: The frontend of the project is built using Next.js.
  • Backend: The backend of the project is built using FastAPI.
  • Data Source: The Data Source component is responsible for indexing data from different types of sources. You can add support for more data source types.
  • LLM: The LLM Engine component is responsible for extracting knowledge from docs and generating responses. You can add support for more LLM models.
  • Reranker: The Reranker Engine component is responsible for reranking the results retrieved from the database. You can add support for more Reranker models.
  • Embedding: The Embedding Engine component is responsible for converting text into vectors. You can add support for more Embedding models.
  • RAG & GraphRAG Engine: This component is responsible for extracting knowledge from docs, then chunking, indexing, and storing the data in the database. It also retrieves data from the database and generates the answer for the user.
  • Documentation: The project documentation is written in Markdown files. You can contribute by adding more content to it.
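To make the division of responsibilities above more concrete, here is a minimal, self-contained sketch of how chunking, embedding, retrieval, and reranking could fit together. Every name in it (`chunk_text`, `embed`, `retrieve`, `rerank`) is hypothetical and chosen for illustration only; it is not the project's actual API. It also assumes a toy bag-of-words vector in place of a real embedding model and an in-memory list in place of the database.

```python
import math
from collections import Counter


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split a document into chunks of at most max_words words (toy chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase term counts, standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query (in-memory stand-in for the database)."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def rerank(query: str, candidates: list[str]) -> list[str]:
    """Toy reranker: order candidates by the fraction of query terms they contain."""
    terms = set(query.lower().split())

    def score(candidate: str) -> float:
        words = set(candidate.lower().split())
        return len(terms & words) / len(terms) if terms else 0.0

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    docs = [
        "TiDB is a distributed SQL database",
        "Next.js is a React framework",
        "FastAPI is a Python web framework",
    ]
    question = "distributed SQL database"
    # In a real pipeline an LLM would generate the answer from these chunks.
    print(rerank(question, retrieve(question, docs, top_k=2)))
```

In a contribution, each toy function would be replaced by a call into the corresponding component, so adding a new model usually means touching only one of these stages.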

Maintainers

Please feel free to reach out to the maintainers if you have any questions or need help with the project.

Discussion

If you have any questions or suggestions, please feel free to open a discussion in the Discussions section.

Project Commands

The project has the following commands potentially available:

  • Build
    • make build: Compile the source code and generate the binary.
    • make install: Install the generated binary and its dependencies.
  • Run
    • ./<binary_name>: Run the binary.
    • make start: Start the application with any necessary dependencies.
  • Test
    • make test: Run all tests for the project.
    • make test <test_file>: Run a specific test file.
  • Deploy
    • make deploy: Build and deploy the application to the production environment.
    • make release: Create a new release tag and push the code to the remote repository.
  • Additional
    • make clean: Remove generated files and restore the project to its original state.
    • make help: Display available commands and their descriptions.
    • make lint: Check the codebase for style and syntax errors.
    • make coverage: Generate code coverage reports for the tests.
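Because the commands above are only potentially available, it can help to check which targets actually exist before running them. The following is a minimal sketch that assumes the repository has a Makefile; the helper name `list_make_targets` is hypothetical and not part of the project. It uses a simple regular expression, so it skips special targets such as `.PHONY` and `:=` assignments but does not handle every Makefile construct (pattern rules and included files, for example).

```python
import re

# Matches a rule line such as "build:" or "test: deps".
# The (?!=) lookahead skips variable assignments like "VAR := value".
TARGET_RE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)")


def list_make_targets(makefile_text: str) -> list[str]:
    """Return explicit target names in order of first appearance."""
    targets: list[str] = []
    for line in makefile_text.splitlines():
        match = TARGET_RE.match(line)
        if not match:
            continue
        name = match.group(1)
        # Skip special targets such as .PHONY and avoid duplicates.
        if name.startswith(".") or name in targets:
            continue
        targets.append(name)
    return targets


if __name__ == "__main__":
    # Assumes a Makefile in the current directory; adjust the path as needed.
    with open("Makefile", encoding="utf-8") as handle:
        for target in list_make_targets(handle.read()):
            print(target)
```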

CI/CD Configuration

The project has the following CI/CD configurations:

  • GitHub Actions:
    • Workflows:
      • ci.yml: This workflow handles testing, linting, and code coverage analysis for pull requests.
      • release.yml: This workflow automates the release process, including building the project, tagging releases, and pushing to the remote repository.
      • documentation.yml: This workflow handles building the documentation and deploying it to a static hosting service.
  • Docker Hub:
    • Image Build and Push: The CI/CD pipeline builds and pushes the application image to Docker Hub.
    • Automated Deployment: The pipeline can automatically deploy the image to a production environment.
  • Kubernetes:
    • Automated Deployment: The CI/CD pipeline can automatically deploy the application to a Kubernetes cluster.

Together, these configurations automate building, testing, and deployment, which makes contributing to and maintaining the project more efficient and reliable.