From Fedora Project Wiki

Revision as of 07:54, 8 September 2024 by Hricky (talk | contribs) (Fix the link to Model Choices in the Risks section)

Original Proposal by Amita Sharma

  • Initiative Title: Fedora AI Chatbot Development Using InstructLab
  • Community Initiative Leads: Amita Sharma, David Duncan (davdunc@amazon.com)
  • Executive Sponsor: [To be determined] Matthew Miller?
  • Timeframe: 9-12 months


Project Objective

The primary objective of this initiative is to develop an AI-powered chatbot for the Fedora community using InstructLab. The chatbot will be integrated into the #introductions.im channel, offering quick, accurate responses to user inquiries and automatically tagging discussion posts based on content. Additionally, the chatbot will suggest relevant Fedora documentation and wiki articles. The project aims to use the smallest model that ensures optimal performance, focusing on efficiency and scalability.

Scope of Work

In-Scope

  • Model Customization: Fine-tune a small AI model with Fedora-specific datasets using InstructLab to create a customized chatbot.
  • System Integration: Integrate the chatbot with Fedora’s communication channels (IRC/Matrix), documentation, and wiki databases.
  • Tagging and Content Recommendation: Develop a system for automatic tagging of discussion posts and recommending related content.
  • Deployment and Monitoring: Deploy the chatbot in Fedora’s production environment with robust monitoring and continuous improvement protocols.
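As a rough illustration of the tagging item above, here is a minimal keyword-matching baseline in Python. It is only a sketch, not the proposed design: the tag vocabulary, the keyword sets, and the function name suggest_tags are all hypothetical, and a fine-tuned model would presumably replace the keyword lookup.

```python
import re

# Hypothetical tag vocabulary; a real deployment would use the
# tags actually defined on Fedora's discussion platform.
TAG_KEYWORDS = {
    "installation": {"install", "installer", "anaconda", "iso"},
    "packaging": {"rpm", "dnf", "package", "spec"},
    "networking": {"wifi", "network", "networkmanager", "vpn"},
}


def suggest_tags(post: str, max_tags: int = 3) -> list[str]:
    """Return up to max_tags tags whose keywords appear in the post."""
    # Lowercase the post and split it into alphanumeric words.
    words = set(re.findall(r"[a-z0-9]+", post.lower()))
    # Score each tag by how many of its keywords occur in the post.
    scores = {tag: len(words & kws) for tag, kws in TAG_KEYWORDS.items()}
    # Keep only matching tags, best score first, ties broken by name.
    ranked = sorted(
        (tag for tag, score in scores.items() if score > 0),
        key=lambda tag: (-scores[tag], tag),
    )
    return ranked[:max_tags]
```

A baseline like this could also serve as a comparison point when evaluating whether the model's tags are actually better than simple keyword matching.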

Out-of-Scope

  • Development of a new large-scale AI model.
  • Multilingual support beyond English (unless required later).

Deliverables

  • AI Chatbot: A fully functional, InstructLab-based chatbot deployed in the #introductions.im channel.
  • Tagging System: An automated system for tagging discussion posts and suggesting relevant Fedora documentation.
  • Documentation: Comprehensive technical and user documentation, covering deployment, usage, and maintenance procedures.

Assumptions

  • Access to Fedora’s existing documentation, wiki, and discussion archives will be granted. The plan is to use 80% of the data to train the model and the remaining 20% to test it.
  • The Fedora community will provide ongoing feedback for the continuous improvement of the chatbot. It would be better to open the model so that it can be trained on user feedback through contributed skills and knowledge, as InstructLab does.
  • InstructLab will integrate with Fedora’s infrastructure with minimal modifications.
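The 80/20 split mentioned in the first assumption could be done along these lines. This is a minimal sketch in Python: the function name split_train_test, the fixed seed, and the idea of treating each document or post as one item are assumptions made for illustration, not part of the proposal.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into (train, test)."""
    shuffled = list(items)
    # A seeded generator keeps the split stable between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed means the same items land in the test set each time, so results from different fine-tuning runs stay comparable.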

Constraints

  • The chatbot must be lightweight, using the smallest possible model that ensures adequate performance.
  • The project must adhere to Fedora’s security and compliance standards, particularly in terms of data privacy.

Risks

  • Technical Risk: The selected model may not meet performance expectations, necessitating further optimization or an alternative approach.
  • Operational Risk: Lack of engagement from the Fedora community may limit the feedback necessary for refining the chatbot.
  • Integration Risk: Potential challenges in integrating the chatbot with Fedora’s existing systems, particularly the documentation and discussion platforms.
  • Model Choice: The choice of models may be limited, as InstructLab currently supports only three models. See InstructLab-compatible foundation models.

Stakeholders

  • Fedora Community: Primary users who will interact with the chatbot and benefit from its capabilities.
  • Development Team: Responsible for the technical implementation, including model customization, system integration, and deployment.
  • Fedora Documentation Team: Ensures that the chatbot effectively links to relevant documentation and wiki pages.

Project Milestones

1. Project Initiation: Finalize project goals, assemble the project team, and establish success criteria.

  Estimated Timeline: 1-2 weeks

2. Requirements Gathering: Define user stories, gather Fedora-specific data, and finalize model specifications.

  Estimated Timeline: 2-3 weeks  

3. InstructLab Customization: Fine-tune the model with Fedora-specific content and develop required skills.

  Estimated Timeline: 3-4 weeks 

4. System Integration: Integrate the chatbot with Fedora’s communication channels, documentation, and discussion systems.

  Estimated Timeline: 3-4 weeks  

5. Testing and Iteration: Conduct testing, collect feedback, and make necessary adjustments to improve performance.

  Estimated Timeline: 4-6 weeks  

6. Deployment: Deploy the chatbot and tagging system in Fedora’s production environment.

  Estimated Timeline: 1-2 weeks  

7. Post-Deployment Support: Provide ongoing monitoring, support, and continuous improvements based on user feedback.

  Ongoing  

8. Project Closure: Document lessons learned, transition to maintenance, and officially close the project.

  Estimated Timeline: 1 week
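To see what the per-milestone estimates above add up to, the following Python sketch sums the lower and upper bounds. It assumes the milestones run strictly one after another with no overlap, and it leaves out Post-Deployment Support because that milestone is listed as ongoing; the names MILESTONES and total_weeks are illustrative only.

```python
# (name, minimum weeks, maximum weeks), copied from the milestone list.
MILESTONES = [
    ("Project Initiation", 1, 2),
    ("Requirements Gathering", 2, 3),
    ("InstructLab Customization", 3, 4),
    ("System Integration", 3, 4),
    ("Testing and Iteration", 4, 6),
    ("Deployment", 1, 2),
    ("Project Closure", 1, 1),
]


def total_weeks(milestones):
    """Return (minimum, maximum) total weeks, assuming no overlap."""
    low = sum(lo for _, lo, _ in milestones)
    high = sum(hi for _, _, hi in milestones)
    return low, high


print(total_weeks(MILESTONES))  # prints (15, 22)
```

Under these assumptions the time-boxed milestones total 15 to 22 weeks, not counting the ongoing support phase.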

Success Criteria

  • Performance: The chatbot provides accurate responses in the #introductions.im channel with minimal latency.
  • User Satisfaction: Positive feedback from the Fedora community, with high engagement and effective utilization of the chatbot.
  • System Integration: Successful integration with Fedora’s communication channels, documentation, and wiki databases, enabling seamless tagging and content recommendation.