
Awesome Claude Skills



A curated list of practical Claude Skills for enhancing productivity across Claude.ai, Claude Code, and the Claude API.

Want skills that do more than generate text? Claude can send emails, create issues, post to Slack, and take actions across 1000+ apps. See how →


Quickstart: Connect Claude to 500+ Apps

The connect-apps plugin lets Claude perform real actions - send emails, create issues, post to Slack. It handles auth and connects to 500+ apps using Composio under the hood.

1. Install the Plugin

claude --plugin-dir ./connect-apps-plugin

2. Run Setup

/connect-apps:setup

Paste your API key when asked. (Get a free key at platform.composio.dev)

3. Restart & Try It

exit
claude

Then ask Claude to send a test email to your own address.

If you receive the email, Claude is now connected to 500+ apps.

See all supported apps →



What Are Claude Skills?

Claude Skills are customizable workflows that teach Claude how to perform specific tasks according to your unique requirements. Skills enable Claude to execute tasks in a repeatable, standardized manner across all Claude platforms.

Skills

Document Processing

  • docx - Create, edit, analyze Word docs with tracked changes, comments, formatting.
  • pdf - Extract text, tables, metadata, merge & annotate PDFs.
  • pptx - Read, generate, and adjust slides, layouts, templates.
  • xlsx - Spreadsheet manipulation: formulas, charts, data transformations.
  • Markdown to EPUB Converter - Converts markdown documents and chat summaries into professional EPUB ebook files. By @smerchek

Development & Code Tools

  • artifacts-builder - Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using modern frontend web technologies (React, Tailwind CSS, shadcn/ui).
  • aws-skills - AWS development with CDK best practices, cost optimization MCP servers, and serverless/event-driven architecture patterns.
  • Changelog Generator - Automatically creates user-facing changelogs from git commits by analyzing history and transforming technical commits into customer-friendly release notes.
  • Claude Code Terminal Title - Gives each Claude Code terminal window a dynamic title that describes the work being done, so you don't lose track of which window is doing what.
  • D3.js Visualization - Teaches Claude to produce D3 charts and interactive data visualizations. By @chrisvoncsefalvay
  • FFUF Web Fuzzing - Integrates the ffuf web fuzzer so Claude can run fuzzing tasks and analyze results for vulnerabilities. By @jthack
  • finishing-a-development-branch - Guides completion of development work by presenting clear options and handling chosen workflow.
  • iOS Simulator - Enables Claude to interact with iOS Simulator for testing and debugging iOS applications. By @conorluddy
  • jules - Delegate coding tasks to Google Jules AI agent for async bug fixes, documentation, tests, and feature implementation on GitHub repos. By @sanjay3290
  • LangSmith Fetch - Debug LangChain and LangGraph agents by automatically fetching and analyzing execution traces from LangSmith Studio. First AI observability skill for Claude Code. By @OthmanAdi
  • MCP Builder - Guides creation of high-quality MCP (Model Context Protocol) servers for integrating external APIs and services with LLMs using Python or TypeScript.
  • move-code-quality-skill - Analyzes Move language packages against the official Move Book Code Quality Checklist for Move 2024 Edition compliance and best practices.
  • Playwright Browser Automation - Model-invoked Playwright automation for testing and validating web applications. By @lackeyjb
  • prompt-engineering - Teaches well-known prompt engineering techniques and patterns, including Anthropic best practices and agent persuasion principles.
  • pypict-claude-skill - Design comprehensive test cases using PICT (Pairwise Independent Combinatorial Testing) for requirements or code, generating optimized test suites with pairwise coverage.
  • reddit-fetch - Fetches Reddit content via Gemini CLI when WebFetch is blocked or returns 403 errors.
  • Skill Creator - Provides guidance for creating effective Claude Skills that extend capabilities with specialized knowledge, workflows, and tool integrations.
  • Skill Seekers - Automatically converts any documentation website into a Claude AI skill in minutes. By @yusufkaraaslan
  • software-architecture - Implements design patterns including Clean Architecture, SOLID principles, and comprehensive software design best practices.
  • subagent-driven-development - Dispatches independent subagents for individual tasks with code review checkpoints between iterations for rapid, controlled development.
  • test-driven-development - Use when implementing any feature or bugfix, before writing implementation code.
  • using-git-worktrees - Creates isolated git worktrees with smart directory selection and safety verification.
  • Connect - Connect Claude to any app. Send emails, create issues, post messages, update databases - take real actions across Gmail, Slack, GitHub, Notion, and 1000+ services.
  • Webapp Testing - Tests local web applications using Playwright for verifying frontend functionality, debugging UI behavior, and capturing screenshots.

Data & Analysis

  • CSV Data Summarizer - Automatically analyzes CSV files and generates comprehensive insights with visualizations without requiring user prompts. By @coffeefuelbump
  • deep-research - Execute autonomous multi-step research using Gemini Deep Research Agent for market analysis, competitive landscaping, and literature reviews. By @sanjay3290
  • postgres - Execute safe read-only SQL queries against PostgreSQL databases with multi-connection support and defense-in-depth security. By @sanjay3290
  • root-cause-tracing - Use when errors occur deep in execution and you need to trace back to find the original trigger.

Business & Marketing

  • Brand Guidelines - Applies Anthropic's official brand colors and typography to artifacts for consistent visual identity and professional design standards.
  • Competitive Ads Extractor - Extracts and analyzes competitors' ads from ad libraries to understand messaging and creative approaches that resonate.
  • Domain Name Brainstormer - Generates creative domain name ideas and checks availability across multiple TLDs including .com, .io, .dev, and .ai extensions.
  • Internal Comms - Helps write internal communications including 3P updates, company newsletters, FAQs, status reports, and project updates using company-specific formats.
  • Lead Research Assistant - Identifies and qualifies high-quality leads by analyzing your product, searching for target companies, and providing actionable outreach strategies.

Communication & Writing

  • article-extractor - Extract full article text and metadata from web pages.
  • brainstorming - Transform rough ideas into fully-formed designs through structured questioning and alternative exploration.
  • Content Research Writer - Assists in writing high-quality content by conducting research, adding citations, improving hooks, and providing section-by-section feedback.
  • family-history-research - Provides assistance with planning family history and genealogy research projects.
  • Meeting Insights Analyzer - Analyzes meeting transcripts to uncover behavioral patterns including conflict avoidance, speaking ratios, filler words, and leadership style.
  • NotebookLM Integration - Lets Claude Code chat directly with NotebookLM for source-grounded answers based exclusively on uploaded documents. By @PleasePrompto
  • Twitter Algorithm Optimizer - Analyze and optimize tweets for maximum reach using Twitter's open-source algorithm insights. Rewrite and edit tweets to improve engagement and visibility.

Creative & Media

  • Canvas Design - Creates beautiful visual art in PNG and PDF documents using design philosophy and aesthetic principles for posters, designs, and static pieces.
  • imagen - Generate images using Google Gemini's image generation API for UI mockups, icons, illustrations, and visual assets. By @sanjay3290
  • Image Enhancer - Improves image and screenshot quality by enhancing resolution, sharpness, and clarity for professional presentations and documentation.
  • Slack GIF Creator - Creates animated GIFs optimized for Slack with validators for size constraints and composable animation primitives.
  • Theme Factory - Applies professional font and color themes to artifacts including slides, docs, reports, and HTML landing pages with 10 pre-set themes.
  • Video Downloader - Downloads videos from YouTube and other platforms for offline viewing, editing, or archival with support for various formats and quality options.
  • youtube-transcript - Fetch transcripts from YouTube videos and prepare summaries.

Productivity & Organization

  • File Organizer - Intelligently organizes files and folders by understanding context, finding duplicates, and suggesting better organizational structures.
  • Invoice Organizer - Automatically organizes invoices and receipts for tax preparation by reading files, extracting information, and renaming consistently.
  • kaizen - Applies continuous improvement methodology with multiple analytical approaches, based on Japanese Kaizen philosophy and Lean methodology.
  • n8n-skills - Enables AI assistants to directly understand and operate n8n workflows.
  • Raffle Winner Picker - Randomly selects winners from lists, spreadsheets, or Google Sheets for giveaways and contests with cryptographically secure randomness.
  • Tailored Resume Generator - Analyzes job descriptions and generates tailored resumes that highlight relevant experience, skills, and achievements to maximize interview chances.
  • ship-learn-next - Skill to help iterate on what to build or learn next, based on feedback loops.
  • tapestry - Interlink and summarize related documents into knowledge networks.

Collaboration & Project Management

  • git-pushing - Automate git operations and repository interactions.
  • google-workspace-skills - Suite of Google Workspace integrations: Gmail, Calendar, Chat, Docs, Sheets, Slides, and Drive with cross-platform OAuth. By @sanjay3290
  • outline - Search, read, create, and manage documents in Outline wiki instances (cloud or self-hosted). By @sanjay3290
  • review-implementing - Evaluate code implementation plans and align with specs.
  • test-fixing - Detect failing tests and propose patches or fixes.

Security & Systems

App Automation via Composio

Pre-built workflow skills for 78 SaaS apps via Rube MCP (Composio). Each skill includes tool sequences, parameter guidance, known pitfalls, and quick reference tables — all using real tool slugs discovered from Composio's API.

CRM & Sales

  • Close Automation - Automate Close CRM: leads, contacts, opportunities, activities, and pipelines.
  • HubSpot Automation - Automate HubSpot CRM: contacts, deals, companies, tickets, and email engagement.
  • Pipedrive Automation - Automate Pipedrive: deals, contacts, organizations, activities, and pipelines.
  • Salesforce Automation - Automate Salesforce: objects, records, SOQL queries, and bulk operations.
  • Zoho CRM Automation - Automate Zoho CRM: leads, contacts, deals, accounts, and modules.

Project Management

  • Asana Automation - Automate Asana: tasks, projects, sections, assignments, and workspaces.
  • Basecamp Automation - Automate Basecamp: to-do lists, messages, people, groups, and projects.
  • ClickUp Automation - Automate ClickUp: tasks, lists, spaces, goals, and time tracking.
  • Jira Automation - Automate Jira: issues, projects, boards, sprints, and JQL queries.
  • Linear Automation - Automate Linear: issues, projects, cycles, teams, and workflows.
  • Monday Automation - Automate Monday.com: boards, items, columns, groups, and workspaces.
  • Notion Automation - Automate Notion: pages, databases, blocks, comments, and search.
  • Todoist Automation - Automate Todoist: tasks, projects, sections, labels, and filters.
  • Trello Automation - Automate Trello: boards, cards, lists, members, and checklists.
  • Wrike Automation - Automate Wrike: tasks, folders, projects, comments, and workflows.

Communication

  • Discord Automation - Automate Discord: messages, channels, servers, roles, and reactions.
  • Intercom Automation - Automate Intercom: conversations, contacts, companies, tickets, and articles.
  • Microsoft Teams Automation - Automate Teams: messages, channels, teams, chats, and meetings.
  • Slack Automation - Automate Slack: messages, channels, search, reactions, threads, and scheduling.
  • Telegram Automation - Automate Telegram: messages, chats, media, groups, and bots.
  • WhatsApp Automation - Automate WhatsApp: messages, media, templates, groups, and business profiles.

Email

  • Gmail Automation - Automate Gmail: send/reply, search, labels, drafts, and attachments.
  • Outlook Automation - Automate Outlook: emails, folders, contacts, and calendar integration.
  • Postmark Automation - Automate Postmark: transactional emails, templates, servers, and delivery stats.
  • SendGrid Automation - Automate SendGrid: emails, templates, contacts, lists, and campaign stats.

Code & DevOps

  • Bitbucket Automation - Automate Bitbucket: repos, PRs, branches, issues, and workspaces.
  • CircleCI Automation - Automate CircleCI: pipelines, workflows, jobs, and project configuration.
  • Datadog Automation - Automate Datadog: monitors, dashboards, metrics, incidents, and alerts.
  • GitHub Automation - Automate GitHub: issues, PRs, repos, branches, actions, and code search.
  • GitLab Automation - Automate GitLab: issues, MRs, projects, pipelines, and branches.
  • PagerDuty Automation - Automate PagerDuty: incidents, services, schedules, escalation policies, and on-call.
  • Render Automation - Automate Render: services, deploys, and project management.
  • Sentry Automation - Automate Sentry: issues, events, projects, releases, and alerts.
  • Supabase Automation - Automate Supabase: SQL queries, table schemas, edge functions, and storage.
  • Vercel Automation - Automate Vercel: deployments, projects, domains, environment variables, and logs.

Storage & Files

  • Box Automation - Automate Box: files, folders, search, sharing, collaborations, and sign requests.
  • Dropbox Automation - Automate Dropbox: files, folders, search, sharing, and batch operations.
  • Google Drive Automation - Automate Google Drive: upload, download, search, share, and organize files.
  • OneDrive Automation - Automate OneDrive: files, folders, search, sharing, permissions, and versioning.

Spreadsheets & Databases

  • Airtable Automation - Automate Airtable: records, tables, bases, views, and field management.
  • Coda Automation - Automate Coda: docs, tables, rows, formulas, and automations.
  • Google Sheets Automation - Automate Google Sheets: read/write cells, formatting, formulas, and batch operations.

Calendar & Scheduling

  • Cal.com Automation - Automate Cal.com: event types, bookings, availability, and scheduling.
  • Calendly Automation - Automate Calendly: events, invitees, event types, scheduling links, and availability.
  • Google Calendar Automation - Automate Google Calendar: events, attendees, free/busy, and recurring schedules.
  • Outlook Calendar Automation - Automate Outlook Calendar: events, attendees, reminders, and recurring schedules.

Social Media

  • Instagram Automation - Automate Instagram: posts, stories, comments, media, and business insights.
  • LinkedIn Automation - Automate LinkedIn: posts, profiles, companies, images, and comments.
  • Reddit Automation - Automate Reddit: posts, comments, subreddits, voting, and moderation.
  • TikTok Automation - Automate TikTok: video uploads, queries, and creator management.
  • Twitter Automation - Automate Twitter/X: tweets, search, users, lists, and engagement.
  • YouTube Automation - Automate YouTube: videos, channels, playlists, comments, and subscriptions.

Marketing & Email Marketing

  • ActiveCampaign Automation - Automate ActiveCampaign: contacts, deals, campaigns, lists, and automations.
  • Brevo Automation - Automate Brevo: contacts, email campaigns, transactional emails, and lists.
  • ConvertKit Automation - Automate ConvertKit (Kit): subscribers, tags, sequences, broadcasts, and forms.
  • Klaviyo Automation - Automate Klaviyo: profiles, lists, segments, campaigns, and events.
  • Mailchimp Automation - Automate Mailchimp: audiences, campaigns, templates, segments, and reports.

Support & Helpdesk

  • Freshdesk Automation - Automate Freshdesk: tickets, contacts, agents, groups, and canned responses.
  • Freshservice Automation - Automate Freshservice: tickets, assets, changes, problems, and service catalog.
  • Help Scout Automation - Automate Help Scout: conversations, customers, mailboxes, and tags.
  • Zendesk Automation - Automate Zendesk: tickets, users, organizations, search, and macros.

E-commerce & Payments

  • Shopify Automation - Automate Shopify: products, orders, customers, inventory, and GraphQL queries.
  • Square Automation - Automate Square: payments, customers, catalog, orders, and locations.
  • Stripe Automation - Automate Stripe: charges, customers, products, subscriptions, and refunds.

Design & Collaboration

  • Canva Automation - Automate Canva: designs, templates, assets, folders, and brand kits.
  • Confluence Automation - Automate Confluence: pages, spaces, search, CQL, labels, and versions.
  • DocuSign Automation - Automate DocuSign: envelopes, templates, signing, and document management.
  • Figma Automation - Automate Figma: files, components, comments, projects, and team management.
  • Miro Automation - Automate Miro: boards, sticky notes, shapes, connectors, and items.
  • Webflow Automation - Automate Webflow: CMS collections, items, sites, publishing, and assets.

Analytics & Data

  • Amplitude Automation - Automate Amplitude: events, cohorts, user properties, and analytics queries.
  • Google Analytics Automation - Automate Google Analytics: reports, dimensions, metrics, and property management.
  • Mixpanel Automation - Automate Mixpanel: events, funnels, cohorts, annotations, and JQL queries.
  • PostHog Automation - Automate PostHog: events, persons, feature flags, insights, and annotations.
  • Segment Automation - Automate Segment: sources, destinations, tracking, and warehouse connections.

HR & People

  • BambooHR Automation - Automate BambooHR: employees, time off, reports, and directory management.

Automation Platforms

  • Make Automation - Automate Make (Integromat): scenarios, connections, and execution management.

Zoom & Meetings

  • Zoom Automation - Automate Zoom: meetings, recordings, participants, webinars, and reports.

Getting Started

Using Skills in Claude.ai

  1. Click the skill icon (🧩) in your chat interface.
  2. Add skills from the marketplace or upload custom skills.
  3. Claude automatically activates relevant skills based on your task.

Using Skills in Claude Code

  1. Place the skill in ~/.claude/skills/:

    mkdir -p ~/.claude/skills/
    cp -r skill-name ~/.claude/skills/

  2. Verify skill metadata:

    head ~/.claude/skills/skill-name/SKILL.md
    

  3. Start Claude Code:

    claude
    

  4. The skill loads automatically and activates when relevant.

Using Skills via API

Use the Claude Skills API to programmatically load and manage skills:

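import anthropic

# The client reads ANTHROPIC_API_KEY from the environment when no key is passed.
client = anthropic.Anthropic()

# Skills are attached through the beta Messages API and run in the code
# execution container, so the beta flags and the tool are both needed.
# Use a model that supports skills; the ID below is kept from the original example.
# For built-in skills such as pdf, set "type" to "anthropic" instead of "custom".
response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,  # required by the Messages API
    betas=["code-execution-2025-08-25", "skills-2025-10-02"],
    container={"skills": [{"type": "custom", "skill_id": "skill-id-here", "version": "latest"}]},
    tools=[{"type": "code_execution_20250825", "name": "code_execution"}],
    messages=[{"role": "user", "content": "Your prompt"}],
)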

See the Skills API documentation for details.

Creating Skills

Skill Structure

Each skill is a folder containing a SKILL.md file with YAML frontmatter:

skill-name/
├── SKILL.md          # Required: Skill instructions and metadata
├── scripts/          # Optional: Helper scripts
├── templates/        # Optional: Document templates
└── resources/        # Optional: Reference files

Basic Skill Template

---
name: my-skill-name
description: A clear description of what this skill does and when to use it.
---

# My Skill Name

Detailed description of the skill's purpose and capabilities.

## When to Use This Skill

- Use case 1
- Use case 2
- Use case 3

## Instructions

[Detailed instructions for Claude on how to execute this skill]

## Examples

[Real-world examples showing the skill in action]
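
As a rough illustration, the folder layout and template above could be generated with a small script. This is a minimal sketch: `scaffold_skill` and `SKILL_TEMPLATE` are hypothetical names invented for this example and are not part of any Claude tooling.

```python
from pathlib import Path

# Hypothetical template, trimmed from the Basic Skill Template above.
SKILL_TEMPLATE = """\
---
name: {name}
description: {description}
---

# {title}

## When to Use This Skill

- Use case 1

## Instructions

[Detailed instructions for Claude on how to execute this skill]
"""


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/<name>/ with a SKILL.md and the optional subfolders."""
    skill_dir = root / name
    # scripts/, templates/ and resources/ are optional; they are created
    # here only so the layout matches the structure shown above.
    for sub in ("scripts", "templates", "resources"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    title = name.replace("-", " ").title()
    skill_md = skill_dir / "SKILL.md"
    skill_md.write_text(
        SKILL_TEMPLATE.format(name=name, description=description, title=title),
        encoding="utf-8",
    )
    return skill_md
```

Calling `scaffold_skill(Path("."), "my-skill-name", "What the skill does and when to use it.")` would write `my-skill-name/SKILL.md`, ready to be filled in by hand.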

Skill Best Practices

  • Focus on specific, repeatable tasks
  • Include clear examples and edge cases
  • Write instructions for Claude, not end users
  • Test across Claude.ai, Claude Code, and API
  • Document prerequisites and dependencies
  • Include error handling guidance

Contributing

We welcome contributions! Please read our Contributing Guidelines for details on:

  • How to submit new skills
  • Skill quality standards
  • Pull request process
  • Code of conduct

Quick Contribution Steps

  1. Ensure your skill is based on a real use case
  2. Check for duplicates in existing skills
  3. Follow the skill structure template
  4. Test your skill across platforms
  5. Submit a pull request with clear documentation


License

This repository is licensed under the Apache License 2.0.

Individual skills may have different licenses - please check each skill's folder for specific licensing information.


Note: Claude Skills work across Claude.ai, Claude Code, and the Claude API. Once you create a skill, it's portable across all platforms, making your workflows consistent everywhere you use Claude.


Credit: github.com/ComposioHQ/awesome-claude-skills

Awesome Agent Skills


A curated list of skills, tools, and capabilities for AI coding agents.




What Are Agent Skills?

Think of Agent Skills as "how-to guides" for AI assistants. Instead of the AI needing to know everything upfront, skills let it learn new abilities on the fly, like giving someone a recipe card instead of making them memorize an entire cookbook.

Skills are simple text files (called SKILL.md) that teach an AI how to do specific tasks. When you ask the AI to do something, it finds the right skill, reads the instructions, and gets to work.

How It Works

Skills load in three stages:

  1. Browse - The AI sees a list of available skills (just names and short descriptions)
  2. Load - When a skill is needed, the AI reads the full instructions
  3. Use - The AI follows the instructions and accesses any helper files
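
The three stages above can be sketched in a few lines of Python. This is an illustration of the idea, not how any particular agent implements it: `read_frontmatter`, `browse`, and `load` are hypothetical helpers, and the parser only handles simple top-level `key: value` frontmatter lines.

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` lines between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        # Indented lines (nested keys) are skipped in this simplified parser.
        if sep and not key.startswith(" "):
            meta[key.strip()] = value.strip()
    return meta


def browse(skills_root: Path) -> dict[str, str]:
    """Stage 1: collect only names and descriptions, not full instructions."""
    index = {}
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md.read_text(encoding="utf-8"))
        if "name" in meta:
            index[meta["name"]] = meta.get("description", "")
    return index


def load(skills_root: Path, name: str) -> str:
    """Stage 2: read the full SKILL.md only once the skill is needed."""
    for skill_md in skills_root.glob("*/SKILL.md"):
        text = skill_md.read_text(encoding="utf-8")
        if read_frontmatter(text).get("name") == name:
            return text
    raise KeyError(name)
```

Stage 3 (Use) is left to the agent: once `load` returns the instructions, the agent follows them and opens helper files from the same folder as needed.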

Why This Matters

  • Faster and lighter - The AI only loads what it needs, when it needs it
  • Works everywhere - Create a skill once, use it with any compatible AI tool
  • Easy to share - Skills are just files you can copy, download, or share on GitHub

Skills are instructions, not code. The AI reads them like a human would read a guide, then follows the steps.


Compatible Agents

The following platforms have documented support for Agent Skills:

| Agent | Documentation |
| --- | --- |
| Claude Code | code.claude.com/docs/en/skills |
| Claude.ai | support.claude.com |
| Codex (OpenAI) | developers.openai.com |
| GitHub Copilot | docs.github.com |
| VS Code | code.visualstudio.com |
| Antigravity | antigravity.google |
| Kiro | kiro.dev |
| Gemini CLI | geminicli.com |

Skill List

Official Claude Skills (Document Processing)

Claude provides built-in skills for common document types:

| Skill | Description | Source |
| --- | --- | --- |
| docx | Create, edit, analyze Word documents with tracked changes | anthropics/skills |
| xlsx | Spreadsheet manipulation: formulas, charts, data transformations | anthropics/skills |
| pptx | Read, generate, and adjust slides, layouts, templates | anthropics/skills |
| pdf | Extract text, tables, metadata from PDFs | anthropics/skills |

Official OpenAI Codex Skills

Codex supports skills at different scopes:

| Skill Scope | Location | Suggested Use |
| --- | --- | --- |
| REPO | $CWD/.codex/skills | Skills relevant to a working folder (e.g., microservice or module) |
| REPO | $CWD/../.codex/skills | Skills for shared areas in parent folders |
| REPO | $REPO_ROOT/.codex/skills | Root skills for everyone using the repository |
| USER | $CODEX_HOME/skills (default: ~/.codex/skills) | Personal skills that apply to any repository |
| ADMIN | /etc/codex/skills | SDK scripts, automation, and default admin skills |
| SYSTEM | Bundled with Codex | Built-in skills like skill-creator and plan |
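
To see which directories apply to a given checkout, the locations in the table can be computed with a short script. This is a sketch under stated assumptions: `codex_skill_locations` is a hypothetical helper, the result simply follows the table's row order (the table does not state a precedence), and the SYSTEM scope is omitted because it is bundled with Codex rather than stored in a directory.

```python
import os
from pathlib import Path


def codex_skill_locations(cwd: Path, repo_root: Path) -> list[tuple[str, Path]]:
    """Return (scope, directory) pairs in the order the table lists them."""
    # $CODEX_HOME falls back to ~/.codex, as noted in the USER row.
    codex_home = Path(os.environ.get("CODEX_HOME", Path.home() / ".codex"))
    candidates = [
        ("REPO", cwd / ".codex" / "skills"),
        ("REPO", cwd.parent / ".codex" / "skills"),
        ("REPO", repo_root / ".codex" / "skills"),
        ("USER", codex_home / "skills"),
        ("ADMIN", Path("/etc/codex/skills")),
    ]
    seen = set()
    unique = []
    for scope, path in candidates:
        if path not in seen:  # cwd may already be the repository root
            seen.add(path)
            unique.append((scope, path))
    return unique
```

Filtering the result with `path.is_dir()` would show which of these locations actually exist on a given machine.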

Official HuggingFace Skills

| Skill | Description | Source |
| --- | --- | --- |
| hf_dataset_creator | Prompts, templates, and scripts for creating structured training datasets | huggingface/skills |
| hf_model_evaluation | Instructions plus utilities for orchestrating evaluation jobs, generating reports, and mapping metrics | huggingface/skills |
| hf-llm-trainer | Comprehensive training skill with guidance, helper scripts, cost estimators | huggingface/skills |
| hf-paper-publisher | Tools for publishing and managing research papers on Hugging Face Hub | huggingface/skills |

Community Skills

Community-maintained skills and collections (verify before use):

Skill Collections

| Repository | Description |
| --- | --- |
| anthropics/skills | Official Anthropic collection (document editing, data analysis) |
| openai/skills | Official OpenAI Codex skills catalog |
| huggingface/skills | HuggingFace skills (compatible with Claude, Codex, Gemini) |
| skillcreatorai/Ai-Agent-Skills | SkillCreator.ai collection with CLI installer |
| agentskill.sh | 44k+ skills directory with security scanning and /learn installer |
| karanb192/awesome-claude-skills | 50+ verified skills for Claude Code and Claude.ai |
| shajith003/awesome-claude-skills | Skills for specialized capabilities |
| GuDaStudio/skills | Multi-agent collaboration skills |
| DougTrajano/pydantic-ai-skills | Pydantic AI integration |
| OmidZamani/dspy-skills | Skills for DSPy framework |
| hikanner/agent-skills | Curated Claude Agent Skills collection |
| gradion-ai/freeact-skills | Freeact agent library skills |
| dmgrok/agent_skills_directory | npm-like CLI for skills (brew install dmgrok/tap/skills) - aggregates 177+ skills from 24 providers |
| gotalab/skillport | Skills distribution via CLI or MCP |
| mhattingpete/claude-skills-marketplace | Git, code review, and testing skills |
| kukapay/crypto-skills | Cryptocurrency, web3, and blockchain skills |
| chadboyda/agent-gtm-skills | 18 go-to-market skills: pricing, outbound, SEO, ads, retention, and ops |
| product-on-purpose/pm-skills | 24 product management skills covering discovery, definition, delivery, and optimization |
| sanjay3290/ai-skills | Google Workspace (Gmail, Chat, Calendar, Docs, Drive, Sheets, Slides), AI delegation (Jules, Manus, Deep Research), and database skills |
| RioBot-Grind/agentfund-skill | Crowdfunding for AI agents on Base chain - milestone escrow |

Document Processing

| Skill | Description |
| --- | --- |
| Markdown to EPUB | Converts markdown documents into professional EPUB ebook files |

Development & Code Tools

| Skill | Description |
| --- | --- |
| aws-skills | AWS development with CDK best practices |
| D3.js Visualization | D3 charts and interactive data visualizations |
| Playwright Automation | Browser automation for testing web apps |
| Specrate | Manage specs and changes in a structured workflow |
| iOS Simulator | Interact with iOS Simulator for testing |
| Swift Concurrency Migration | Swift Concurrency Migration guide |
| Obsidian Plugin | Obsidian.md plugin development |
| Stream Coding | Stream Coding methodology |
| SwiftUI Skills | Apple-authored SwiftUI and platform guidance extracted from Xcode |
| Tool Advisor | Analyzes prompts and recommends optimal tools, skills, agents, and orchestration patterns |
| Vibe Testing | Pressure-test spec documents with LLM reasoning before writing code |
| Mantra | AI coding session management - save, restore, and time-travel through Claude Code, Cursor, and Windsurf sessions |

Data & Analysis

| Skill | Description |
| --- | --- |
| CSV Summarizer | Analyze CSV files and generate insights with visualizations |
| Kaggle Skill | Complete Kaggle integration: account setup, competition reports, dataset/model downloads, notebook execution, submissions, and badge collection |

Integration & Automation

| Skill | Description |
| --- | --- |
| Dev Browser | Web browser capability for agents |
| Vectorize MCP Worker | Edge-native MCP server patterns for production RAG |
| Agent Manager | Manage local CLI AI agents via tmux (start/stop/monitor/assign + cron scheduling) |
| HOL Claude Skills | AI agent discovery via Registry Broker - /hol-search, /hol-resolve, /hol-chat |
| Sheets CLI | Google Sheets CLI automation |
| Notification Skill | Send message notifications for agent workflows |
| Spotify Skill | Spotify API integration |
| AgentStore | Open-source plugin marketplace with gasless USDC payments, CLI install, and 3-field publishing API |
| Transloadit Skills | Media processing: video encoding, image manipulation, OCR, and 86+ Robots |
| commune | Agent-native email inbox: permanent @commune.ai address with full send/receive, semantic search, triage, and webhooks |

Collaboration & Project Management

| Skill | Description |
| --- | --- |
| git-pushing | Automate git operations and repository interactions |
| review-implementing | Evaluate code implementation plans |
| test-fixing | Detect failing tests and propose fixes |

Security & Systems

| Skill | Description |
| --- | --- |
| computer-forensics | Digital forensics analysis and investigation |
| safe-encryption-skill | Modern encryption alternative to GPG/PGP with post-quantum support, composable authentication, and agent-to-agent communication |
| Threat Hunting | Hunt for threats using Sigma detection rules |
| Vincent Wallet | Secure EVM wallet for agent transfers, swaps, and transactions |
| Vincent Polymarket | Polymarket prediction market trading for agents |
| Agent OS Governance | Kernel-level governance for AI agents: deterministic policy enforcement, compliance checking, audit logging |

Advanced & Research

| Skill | Description |
| --- | --- |
| Context Engineering | Context engineering techniques |
| Pomodoro System Skill | System Skill Pattern (skills that remember & improve) |
| Mind Cloning | Mind cloning with LLM skills |



Using Skills

Using Skills in Claude.ai

  1. Click the skill icon in your chat interface.
  2. Add skills from the marketplace or upload custom skills.
  3. Claude automatically activates relevant skills based on your task.

Using Skills in Google Antigravity

Antigravity supports two types of skills:

  • Workspace Skills: Project-specific skills located in /.agent/skills/
  • Global Skills: User-wide skills located in ~/.gemini/antigravity/skills

For more details, see the official documentation.

Using Skills in Claude Code

Place the skill in your configuration directory:

mkdir -p ~/.claude/skills/
cp -r skill-name ~/.claude/skills/

Verify skill metadata:

head ~/.claude/skills/skill-name/SKILL.md

The skill loads automatically and activates when relevant.

Using Skills in Codex

Create a skill:

Use the built-in $skill-creator skill in Codex. Describe what you want your skill to do, and Codex will bootstrap it for you.

If you install $create-plan (experimental) with $skill-installer create-plan, Codex will create a plan before writing files.

You can also create a skill manually by creating a folder with a SKILL.md file:

---
name: skill-name
description: Description that helps Codex select the skill
metadata:
  short-description: Optional user-facing description
---

Skill instructions for the Codex agent to follow when using this skill.

Install new skills:

Download skills from GitHub using the $skill-installer skill:

$skill-installer linear

You can also prompt the installer to download skills from other repositories. After installing a skill, restart Codex to pick up new skills.

Using Skills in VS Code

Skills are stored in directories with a SKILL.md file. VS Code supports skills in two locations:

  • .github/skills/ - Recommended location for all new skills
  • .claude/skills/ - Legacy location, also supported

Create a skill:

  1. Create a .github/skills directory in your workspace
  2. Create a subdirectory for your skill (e.g., .github/skills/webapp-testing)
  3. Create a SKILL.md file with the following structure:
---
name: skill-name
description: Description of what the skill does and when to use it
---

# Skill Instructions

Your detailed instructions, guidelines, and examples go here...
  4. Optionally, add scripts, examples, or other resources to your skill's directory

Using Skills in Copilot CLI

Adding skills to your repository:

  1. Create a .github/skills directory (skills in .claude/skills are also supported)
  2. Create a subdirectory for your skill (e.g., .github/skills/webapp-testing)
  3. Create a SKILL.md file with your skill's instructions

SKILL.md structure:

  • name (required): A unique lowercase identifier using hyphens for spaces
  • description (required): What the skill does and when Copilot should use it
  • license (optional): License that applies to this skill
  • Markdown body with instructions, examples, and guidelines

Example SKILL.md:

---
name: github-actions-failure-debugging
description: Guide for debugging failing GitHub Actions workflows.
---

To debug failing GitHub Actions workflows:

1. Use `list_workflow_runs` to look up recent workflow runs
2. Use `summarize_job_log_failures` to get an AI summary of failed jobs
3. Use `get_job_logs` for full detailed failure logs if needed
4. Try to reproduce the failure in your environment
5. Fix the failing build

When performing tasks, Copilot decides when to use skills based on your prompt and the skill's description. The SKILL.md file is injected into the agent's context.
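The required-field rules listed under "SKILL.md structure" can be checked before committing a skill. The sketch below is one possible reading of those rules, not Copilot's own validation: `validate_skill_metadata` and `NAME_PATTERN` are hypothetical names, and allowing digits in the name is an assumption.

```python
import re

# Assumed pattern for "a unique lowercase identifier using hyphens for spaces".
# Allowing digits is an assumption; the source only mentions lowercase and hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def validate_skill_metadata(meta: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    name = meta.get("name", "")
    if not name:
        problems.append("missing required field: name")
    elif not NAME_PATTERN.fullmatch(name):
        problems.append(f"name is not lowercase-hyphenated: {name!r}")
    if not meta.get("description", "").strip():
        problems.append("missing required field: description")
    return problems


if __name__ == "__main__":
    example = {
        "name": "github-actions-failure-debugging",
        "description": "Guide for debugging failing GitHub Actions workflows.",
    }
    print(validate_skill_metadata(example))
```

The optional license field is deliberately left unchecked, since the list above does not constrain its value.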

Using MCP Servers (Claude Desktop)

Edit your configuration file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Example Configuration:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/username/Desktop"
      ]
    }
  }
}
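If the configuration file already lists other servers, hand-editing the JSON risks overwriting them. The sketch below merges one entry into an already-parsed config; `add_mcp_server` is a hypothetical helper, and reading or writing the file itself is left out on purpose.

```python
import json


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `config` with one entry added under "mcpServers"."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    existing = {"mcpServers": {}}  # stand-in for the parsed config file
    result = add_mcp_server(
        existing,
        "filesystem",
        "npx",
        ["-y", "@modelcontextprotocol/server-filesystem", "/Users/username/Desktop"],
    )
    print(json.dumps(result, indent=2))
```

Returning a copy rather than mutating the input makes it easy to compare the old and new configuration before saving anything.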


Creating Skills

Skills are instruction bundles that tell the agent how to perform specific tasks. They are not executable code by default.

Skill Structure

skill-name/
├── SKILL.md          # Required: Instructions and metadata
├── scripts/          # Optional: Helper scripts
├── templates/        # Optional: Document templates
└── resources/        # Optional: Reference files

Basic SKILL.md Template

---
name: my-skill-name
description: A clear description of what this skill does.
---

# My Skill Name

Detailed description of the skill's purpose.

## When to Use This Skill

- Use case 1
- Use case 2

## Instructions

[Detailed instructions for the agent on how to execute this skill]

## Examples

[Real-world examples]
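The directory layout and basic template above can be generated with a short script. This is a sketch under stated assumptions: `scaffold_skill` and `SKILL_TEMPLATE` are hypothetical names, and the demo writes into a temporary directory rather than a real skills folder.

```python
from pathlib import Path
import tempfile

# Minimal frontmatter plus a title; extend with the sections you need.
SKILL_TEMPLATE = """---
name: {name}
description: {description}
---

# {name}
"""


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/<name>/ with SKILL.md and the optional subfolders."""
    skill_dir = root / name
    for sub in ("scripts", "templates", "resources"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.exists():  # never overwrite existing instructions
        skill_file.write_text(
            SKILL_TEMPLATE.format(name=name, description=description),
            encoding="utf-8",
        )
    return skill_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_skill(Path(tmp), "my-skill-name", "A clear description.")
        print(sorted(p.name for p in created.iterdir()))
```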

MCP Server Example (Python)

For skills that need to connect to external data sources, you can create an MCP server:

pip install fastmcp

server.py:

from fastmcp import FastMCP

mcp = FastMCP("My Server")

@mcp.tool()
def hello_world(name: str = "World") -> str:
    """A simple tool that says hello."""
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run()


Community Resources

LangChain Tools

Articles & Research


Frequently Asked Questions

What are Agent Skills?

Agent Skills are instruction files that teach AI assistants how to do specific tasks. Think of them as "how-to guides" that the AI reads and follows. They only load when needed, so the AI stays fast and focused.

How are Agent Skills different from fine-tuning?

Fine-tuning permanently changes how an AI thinks, which makes it expensive and hard to update. Agent Skills are just instruction files: you can update, swap, or share them anytime without touching the AI itself.

What's the difference between Agent Skills and MCP?

They do different things and work great together:

  • Agent Skills = teach the AI how to do something (workflows, best practices)
  • MCP = help the AI access things (APIs, databases, external tools)

Which AI tools support Agent Skills?

Currently supported: Claude (Claude.ai and Claude Code), GitHub Copilot, VS Code, Codex (OpenAI), Antigravity (Google), Gemini CLI, and Kiro. The list is growing as more tools adopt the standard.

Do Agent Skills run code?

No. Skills are just text instructions; the AI reads and follows them like a recipe. If you need to run actual code, you'd use something like MCP servers alongside skills.

How do I create my first Agent Skill?

  1. Create a SKILL.md file with a name and description at the top
  2. Write clear, step-by-step instructions in the file
  3. Put it in your .github/skills/ or .claude/skills/ folder
  4. Test it out!

Full guide: How to create custom skills


Contributing

Contributions are welcome. See CONTRIBUTING.md for full guidelines.

Quick summary:

  • Follow the skill template structure
  • Provide clear, actionable instructions
  • Include working examples where appropriate
  • Document trade-offs and potential issues
  • Keep SKILL.md under 500 lines for optimal performance
  • Verify that skills actually exist before adding them
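The 500-line guideline is easy to check before opening a pull request. A minimal sketch, where `is_under_line_limit` is a hypothetical helper and not part of any contribution tooling:

```python
def is_under_line_limit(text: str, limit: int = 500) -> bool:
    """True when the SKILL.md text has fewer than `limit` lines."""
    return len(text.splitlines()) < limit


if __name__ == "__main__":
    sample = "---\nname: my-skill-name\ndescription: Example.\n---\n"
    print(is_under_line_limit(sample))
```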


License

MIT License - see LICENSE file for details.


References

The principles in these skills are derived from research and production experience at leading AI labs and framework developers.

Credit: github.com/heilcheng/awesome-agent-skills

Linux ate my RAM

Linux is borrowing unused memory for disk caching. This makes it look like you are low on "free" memory, but you are not! Everything is fine!

What is disk caching? It is a way for the operating system to use unused RAM to speed up disk access. When you read a file, Linux will keep it in memory so that if you read it again, it can be accessed much faster.
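On Linux, the difference shows up in /proc/meminfo: MemFree counts only completely unused RAM, while MemAvailable estimates how much memory applications can claim, including reclaimable cache. The sketch below parses those fields. `parse_meminfo` is a hypothetical helper, and it assumes the usual `Key:   value kB` layout.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each /proc/meminfo field to its integer value (kB for memory fields)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Key: <number> [unit]" lines.
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # Linux only
        info = parse_meminfo(meminfo.read_text())
        for name in ("MemTotal", "MemFree", "Cached", "MemAvailable"):
            print(f"{name}: {info.get(name, 0) / 1024:.0f} MiB")
```

A large gap between MemAvailable and MemFree is the disk cache at work, not a sign of memory pressure.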


My FOSS

I finally escaped from the telemetry secret agents :)

Thank you, FOSS!

As a way to give back to the FOSS community, I created and maintain a Fedora Copr repository, coprs/thangckt, with up-to-date builds of various cutting-edge packages for the latest Red Hat-based distros.

The repository ranges from essential daily-use packages, such as a mail client and office tools, to critical scientific applications like Ovito, Zotero, VSCodium, an SSH client, and many others.

To install the packages, simply enable the Copr repository:

sudo dnf copr enable thangckt/thang_foss
sudo dnf install ovito zotero  # or other packages

Or add repo file:

sudo dnf config-manager addrepo --overwrite --from-repofile=https://copr.fedorainfracloud.org/coprs/thangckt/thang_foss/repo/fedora-42/thangckt-thang_foss-fedora-42.repo
sudo dnf install ovito

AI for Crystal Materials - models and benchmarks

Here we have collected papers with the theme of "AI for crystalline materials" that have appeared at top machine learning conferences and journals (ICML, ICLR, NeurIPS, AAAI, NPJ, NC, etc.) in recent years. See https://arxiv.org/abs/2408.08044 or https://dl.acm.org/doi/10.1145/3794853 for details. We will keep this page updated.

Crystalline Material Physicochemical Property Prediction

Method Paper
SchNet Schnet: A continuous-filter convolutional neural network for modeling quantum interactions (NeurIPS2017) [Paper] [Code](https://github.com/atomistic-machine-learning/schnetpack)
CGCNN Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties (Physical Review Letters, 2018) [Paper] [Code](https://github.com/txie-93/cgcnn)
MEGNET Graph networks as a universal machine learning framework for molecules and crystals (Chemistry of Materials, 2019) [Paper] [Code](https://github.com/materialsvirtuallab/megnet)
GATGNN Graph convolutional neural networks with global attention for improved materials property prediction (Physical Chemistry Chemical Physics, 2020) [Paper] [Code](https://github.com/superlouis/GATGNN)
ALIGNN Atomistic line graph neural network for improved materials property predictions (npj Computational Materials, 2021) [Paper] [Code](https://github.com/usnistgov/alignn)
E(3)NN Direct prediction of phonon density of states with Euclidean neural networks (Advanced Science, 2021) [Paper] [Code](https://github.com/zhantaochen/phonondos_e3nn)
ECN Equivariant networks for crystal structures (NeurIPS2022) [Paper] [Code](https://github.com/oumarkaba/equivariant_crystal_networks)
Matformer Periodic Graph Transformers for Crystal Material Property Prediction (NeurIPS2022) [Paper] [Code](https://github.com/YKQ98/Matformer)
PotNet Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction (ICML2023) [Paper] [Code](https://github.com/divelab/AIRS/tree/main/OpenMat/PotNet)
CrysGNN Crysgnn: Distilling pre-trained knowledge to enhance property prediction for crystalline materials (AAAI2023) [Paper] [Code](https://github.com/kdmsit/crysgnn)
ETGNN A general tensor prediction framework based on graph neural networks (The Journal of Physical Chemistry Letters, 2023) [Paper]
DOSTransformer Density of States Prediction of Crystalline Materials via Prompt-guided Multi-Modal Transformer (NeurIPS2023) [Paper] [Code](https://github.com/HeewoongNoh/DOSTransformer)
MOFTransformer A multi-modal pre-training transformer for universal transfer learning in metal-organic frameworks (Nature Machine Intelligence, 2023) [Paper] [Code](https://github.com/hspark1212/MOFTransformer)
- Examining graph neural networks for crystal structures: Limitations and opportunities for capturing periodicity (Science Advances, 2023) [Paper] [Code](https://github.com/shenggong1996/examining-GNN-for-crystal-periodicity/tree/master)
SCANN Towards understanding structure–property relations in materials with interpretable deep learning (npj Computational Materials, 2023) [Paper] [Code](https://github.com/sinhvt3421/scann--material)
CEGANN CEGANN: Crystal Edge Graph Attention Neural Network for multiscale classification of materials environment (npj Computational Materials, 2023) [Paper] [Code](https://github.com/sbanik2/CEGANN)
DTNet Dielectric tensor prediction for inorganic materials using latent information from preferred potential (npj Computational Materials, 2024) [Paper] [Code](https://github.com/pfnet-research/dielectric-pred)
GMTNet A Space Group Symmetry Informed Network for O(3) Equivariant Crystal Tensor Prediction (ICML2024) [Paper] [Code](https://github.com/divelab/AIRS/tree/main/OpenMat/GMTNet)
ComFormer Complete and Efficient Graph Transformers for Crystal Material Property Prediction (ICLR2024) [Paper] [Code](https://github.com/divelab/AIRS/tree/main/OpenMat/ComFormer)
Crystalformer Crystalformer: infinitely connected attention for periodic structure encoding (ICLR2024) [Paper] [Code](https://github.com/omron-sinicx/crystalformer)
Crystalformer Conformal Crystal Graph Transformer with Robust Encoding of Periodic Invariance (AAAI2024) [Paper]
CrysDiff A Diffusion-Based Pre-training Framework for Crystal Property Prediction (AAAI2024) [Paper]
- Structure-aware graph neural network based deep transfer learning framework for enhanced predictive analytics on diverse materials datasets (npj Computational Materials, 2024) [Paper]
Uni-MOF A comprehensive transformer-based approach for high-accuracy gas adsorption predictions in metal-organic frameworks (Nature Communications, 2024) [Paper] [Code](https://github.com/dptech-corp/Uni-MOF)
SODNet Learning Superconductivity from Ordered and Disordered Material Structures (NeurIPS2024) [Paper] [Code](https://github.com/pincher-chen/SODNet)
ChargE3Net Higher-order equivariant neural networks for charge density prediction in materials (npj Computational Materials, 2024) [Paper] [Code](https://github.com/AIforGreatGood/charge3net)
MD-HIT MD-HIT: Machine learning for material property prediction with dataset redundancy control (npj Computational Materials, 2024) [Paper] [Code](https://github.com/usccolumbia/MD-HIT)
VGNN Virtual node graph neural network for full phonon prediction (Nature Computational Science, 2024) [Paper] [Code](https://github.com/RyotaroOKabe/phonon_prediction)
ECSG Predicting thermodynamic stability of inorganic compounds using ensemble machine learning based on electron configuration (Nature Communications, 2025) [Paper] [Code](https://github.com/Haozou-csu/ECSG)
CrystalFramer Rethinking the role of frames for SE(3)-invariant crystal structure modeling (ICLR2025) [Paper] [Code]
ct-UAE Transformer-generated atomic embeddings to enhance prediction accuracy of crystal properties with machine learning (Nature Communications, 2025) [Paper] [Code]
- Cross-scale covariance for material property prediction (npj Computational Materials, 2025) [Paper] [Code]
AdsMT A multi-modal transformer for predicting global minimum adsorption energy (Nature Communications, 2025) [Paper] [Code]
DPF A Denoising Pre-training Framework for Accelerating Novel Material Discovery (AAAI2025) [Paper]
CrysCo Accelerating materials property prediction via a hybrid Transformer Graph framework that leverages four body interactions (npj Computational Materials, 2025) [Paper] [Code]
E2T Advancing extrapolative predictions of material properties through learning to learn using extrapolative episodic training (Communications Materials, 2025) [Paper] [Code]
- Probing out-of-distribution generalization in machine learning for materials (Communications Materials, 2025) [Paper] [Code]
- A machine learning model with minimize feature parameters for multi-type hydrogen evolution catalyst prediction (npj Computational Materials, 2025) [Paper] [Code]
- Automatic identification of slip pathways in ductile inorganic materials by combining the active learning strategy and NEB method (npj Computational Materials, 2025) [Paper]
BETE-NET Accelerating superconductor discovery through tempered deep learning of the electron-phonon spectral function (npj Computational Materials, 2025) [Paper] [Code]
HiBoFL Hierarchy-boosted funnel learning for identifying semiconductors with ultralow lattice thermal conductivity (npj Computational Materials, 2025) [Paper] [Code]
PDDFormer PDDFormer: Pairwise Distance Distribution Graph Transformer for Crystal Material Property Prediction (IJCAI2025) [Paper]
Rep-CodeGen Code-Generated Graph Representations Using Multiple LLM Agents for Material Properties Prediction (ICML2025) [Paper]
CSLLM Accurate prediction of synthesizability and precursors of 3D crystal structures via large language models (Nature Communications, 2025) [Paper] [Code]
LLM-Prop LLM-Prop: predicting the properties of crystalline materials using large language models (npj Computational Materials, 2025) [Paper] [Code]
- Faithful novel machine learning for predicting quantum properties (npj Computational Materials, 2025) [Paper] [Code]
SciToolAgent SciToolAgent: a knowledge-graph-driven scientific agent for multitool integration (Nature Computational Science, 2025) [Paper] [Code](https://github.com/hicai-zju/scitoolagent)
SPFrame Local-Global Associative Frames for Symmetry-Preserving Crystal Structure Modeling (NeurIPS2025) [Paper]
PRDNet Beyond Structure: Invariant Crystal Property Prediction with Pseudo-Particle Ray Diffraction (ICLR2026) [Paper]
CFT A Single Architecture for Representing Invariance Under Any Space Group (ICLR2026) [Paper]
MoMa MoMa: A Simple Modular Learning Framework for Material Property Prediction (ICLR2026) [Paper]
SpatialRead From atom to space: A region-based readout function for spatial properties of materials (ICLR2026) [Paper]
- A crystal graph convolutional neural network framework for predicting stacking fault energy in concentrated alloys (npj Computational Materials, 2026) [Paper]
PE-AG-GMoE Accelerating electron diffraction analysis using graph neural networks and attention mechanisms (npj Computational Materials, 2026) [Paper] [Code]
TSENN Accurate prediction of tensorial spectra using equivariant graph neural network (Nature Communications, 2026) [Paper] [Code]

Crystalline Material Generative Design

Method Paper
G-SchNet Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules (NeurIPS2019) [Paper] [Code](https://github.com/atomistic-machine-learning/G-SchNet)
CubicGAN High-throughput discovery of novel cubic crystal materials using deep generative neural networks (Advanced Science, 2021) [Paper] [Code](https://github.com/MilesZhao/CubicGAN)
CDVAE Crystal Diffusion Variational Autoencoder for Periodic Material Generation (ICLR2022) [Paper] [Code](https://github.com/txie-93/cdvae)
LCOMs Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction (NeurIPS2023 Workshop) [Paper]
DiffCSP Crystal structure prediction by joint equivariant diffusion on lattices and fractional coordinates (NeurIPS2023) [Paper] [Code](https://github.com/jiaor17/DiffCSP)
SyMat Towards symmetry-aware generation of periodic materials (NeurIPS2023) [Paper] [Code](https://github.com/divelab/AIRS/tree/main/OpenMat/SyMat)
EMPNN Equivariant Message Passing Neural Network for Crystal Material Discovery (AAAI2023) [Paper] [Code](https://github.com/aklipf/pegnn)
GemsNet Unified Model for Crystalline Material Generation (IJCAI2023) [Paper] [Code](https://github.com/aklipf/GemsNet/tree/master)
- Optimized Crystallographic Graph Generation for Material Science (IJCAI2023) [Paper] [Code](https://github.com/aklipf/mat-graph)
PGCGM Physics guided deep learning for generative design of crystal materials with symmetry constraints (npj Computational Materials, 2023) [Paper] [Code](https://github.com/MilesZhao/PGCGM)
PCVAE PCVAE: A Physics-informed Neural Network for Determining the Symmetry and Geometry of Crystals (IJCNN2023) [Paper] [Code](https://github.com/zjuKeLiu/PCVAE)
Govindarajan Behavioral Cloning for Crystal Design (ICLR2023 Workshop) [Paper]
CHGFlowNet Hierarchical GFlownet for Crystal Structure Generation (NeurIPS2023 Workshop) [Paper]
LM-CM,LM-AC Language models can generate molecules, materials, and protein binding sites directly in three dimensions as xyz, cif, and pdb files (Arxiv, 2023) [Paper] [Code](https://github.com/danielflamshep/xyztransformer)
SLI2Cry An invertible, invariant crystal representation for inverse design of solid-state materials using generative deep learning (Nature Communications, 2023) [Paper] [Code](https://github.com/xiaohang007/SLICES/tree/main)
GNoME Scaling deep learning for materials discovery (Nature, 2023) [Paper] [Code](https://github.com/google-deepmind/materials_discovery)
ipcsp Optimality guarantees for crystal structure prediction (Nature, 2023) [Paper] [Code](https://github.com/lrcfmd/ipcsp)
DiffCSP-SC Learning Superconductivity from Ordered and Disordered Material Structures (NeurIPS2024) [Paper] [Code](https://github.com/pincher-chen/DiffCSP-SC)
EquiCSP Equivariant Diffusion for Crystal Structure Prediction (ICML2024) [Paper] [Code](https://github.com/EmperorJia/EquiCSP)
GemsDiff Vector Field Oriented Diffusion Model for Crystal Material Generation (AAAI2024) [Paper] [Code](https://github.com/aklipf/gemsdiff)
UniMat Scalable Diffusion for Materials Generation (ICLR2024) [Paper] [Code](https://unified-materials.github.io/unimat/)
DiffCSP++ Space Group Constrained Crystal Generation (ICLR2024) [Paper] [Code](https://github.com/jiaor17/DiffCSP-PP)
FlowMM FlowMM: Generating Materials with Riemannian Flow Matching (ICML2024) [Paper] [Code](https://github.com/facebookresearch/flowmm)
CrystaLLM Crystal structure generation with autoregressive large language modeling (Nature Communications, 2024) [Paper] [Code](https://github.com/lantunes/CrystaLLM)
Con-CDVAE Con-CDVAE: A method for the conditional generation of crystal structures (Computational Materials Today, 2024) [Paper] [Code](https://github.com/cyye001/Con-CDVAE)
Cond-CDVAE Deep learning generative model for crystal structure prediction (npj Computational Materials, 2024) [Paper] [Code](https://github.com/ixsluo/cond-cdvae)
LCMGM A deep generative modeling architecture for designing lattice-constrained perovskite materials (npj Computational Materials, 2024) [Paper] [Code]
CrystalFormer Space Group Informed Transformer for Crystalline Materials Generation (Arxiv, 2024) [Paper] [Code](https://github.com/deepmodeling/CrystalFormer)
Gruver Fine-Tuned Language Models Generate Stable Inorganic Materials as Text (ICLR2024) [Paper] [Code](https://github.com/facebookresearch/crystal-text-llm)
FlowLLM FlowLLM: Flow Matching for Material Generation with Large Language Models as Base Distributions (NeurIPS2024) [Paper] [Code](https://github.com/facebookresearch/flowmm)
Mat2Seq Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation (NeurIPS2024) [Paper] [Code]
FlowDPO 3D Structure Prediction of Atomic Systems with Flow-Based Direct Preference Optimization (NeurIPS2024) [Paper]
GenMS Generative Hierarchical Materials Search (NeurIPS2024) [Paper]
ChemReasoner CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback (ICML2024) [Paper] [Code]
a²c Predicting emergence of crystals from amorphous precursors with deep learning potentials (Nature Computational Science, 2024) [Paper] [Code](https://github.com/jax-md/jax-md/tree/main/jax_md/a2c)
CrystalMath Rapid prediction of molecular crystal structures using simple topological and physical descriptors (Nature Communications, 2024) [Paper] [Code]
ShotgunCSP Shotgun crystal structure prediction using machine-learned formation energies (npj Computational Materials, 2024) [Paper] [Code]
MatterGen A generative model for inorganic materials design (Nature, 2025) [Paper] [Code](https://github.com/microsoft/mattergen)
SymmCD SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models (ICLR2025) [Paper] [Code](https://github.com/sibasmarak/SymmCD)
MatExpert MatExpert: Decomposing Materials Discovery By Mimicking Human Experts (ICLR2025) [Paper] [Code]
- Designing Mechanical Meta-Materials by Learning Equivariant Flows (ICLR2025) [Paper]
MOFFlow MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks (ICLR2025) [Paper] [Code]
TGDMat Periodic Materials Generation using Text-Guided Joint Diffusion Model (ICLR2025) [Paper] [Code]
CrysBFN A Periodic Bayesian Flow for Material Generation (ICLR2025) [Paper] [Code]
OSDAs OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents (ICLR2025) [Paper]
- Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective (ICLR2025) [Paper]
MAGUS Efficient crystal structure prediction based on the symmetry principle (Nature Computational Science, 2025) [Paper] [Code]
Target XXXI A robust crystal structure prediction method to support small molecule drug development with large scale validation and blind study (Nature Communications, 2025) [Paper]
active-csp Accelerating crystal structure search through active learning with neural networks for rapid relaxations (npj Computational Materials, 2025) [Paper] [Code](https://github.com/stefaanhessmann/active-csp)
Chemeleon Exploration of crystal chemical space using text-guided generative artificial intelligence (Nature Communications, 2025) [Paper] [Code](https://github.com/hspark1212/chemeleon/)
MAGECS Inverse design of promising electrocatalysts for CO2 reduction via generative models and bird swarm algorithm (Nature Communications, 2025) [Paper] [Code](https://github.com/szl666/CO2RR-inverse-design)
PGH-VAEs Inverse design of catalytic active sites via interpretable topology-based deep generative models (npj Computational Materials, 2025) [Paper] [Code]
WyFormer Wyckoff Transformer: Generation of Symmetric Crystals (ICML2025) [Paper] [Code]
KLDM Kinetic Langevin Diffusion for Crystalline Materials Generation (ICML2025) [Paper]
WyckoffDiff WyckoffDiff -- A Generative Diffusion Model for Crystal Symmetry (ICML2025) [Paper] [Code]
OMatG Open Materials Generation with Stochastic Interpolants (ICML2025) [Paper] [Code]
ADiT All-atom Diffusion Transformers: Unified generative modelling of molecules and materials (ICML2025) [Paper] [Code]
UniMate UniMate: A Unified Model for Mechanical Metamaterial Generation, Property Prediction, and Condition Confirmation (ICML2025) [Paper] [Code]
- Inverse design of metal-organic frameworks using deep dreaming approaches (Nature Communications, 2025) [Paper] [Code](https://github.com/SarkisovTeam/dreaming4MOFs)
VQCrystal Massive discovery of crystal structures across dimensionalities by leveraging vector quantization (npj Computational Materials, 2025) [Paper] [Code](https://github.com/Fatemoisted/VQCrystal)
- Generative deep learning for predicting ultrahigh lattice thermal conductivity materials (npj Computational Materials, 2025) [Paper]
- Machine learning-assisted Ru-N bond regulation for ammonia synthesis (Nature Communications, 2025) [Paper]
MACS MACS: Multi-Agent Reinforcement Learning for Optimization of Crystal Structures (NeurIPS2025) [Paper] [Code]
SGEquiDiff Space Group Equivariant Crystal Diffusion (NeurIPS2025) [Paper] [Code]
CrysLLMGen LLM Meets Diffusion: A Hybrid Framework for Crystal Material Generation (NeurIPS2025) [Paper] [Code]
MOF-BFN MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks (NeurIPS2025) [Paper] [Code]
MOFFlow-2 Flexible MOF Generation with Torsion-Aware Flow Matching (NeurIPS2025) [Paper] [Code]
CrystalFlow CrystalFlow: a flow-based generative model for crystalline materials (Nature Communications, 2025) [Paper] [Code](https://github.com/ixsluo/CrystalFlow)
SCIGEN Structural constraint integration in a generative model for the discovery of quantum materials (Nature Materials, 2025) [Paper] [Code](https://github.com/RyotaroOKabe/SCIGEN)
InvDesFlow-AL InvDesFlow-AL: active learning-based workflow for inverse design of functional materials (npj Computational Materials, 2025) [Paper] [Code]
CRYSIM Predicting symmetric structures of large crystals with GPU-based Ising machines (Communications Physics, 2025) [Paper] [Code]
CrystalICL CrystalICL: Enabling In-Context Learning for Crystal Generation (EMNLP2025) [Paper]
- Networking autonomous material exploration systems through transfer learning (npj Computational Materials, 2025) [Paper] [Code]
PODGen Materials discovery acceleration by using conditional generative methodology (npj Computational Materials, 2025) [Paper] [Code]
OXtal An All-Atom Diffusion Model for Organic Crystal Structure Prediction (ICLR2026) [Paper]
LLEMA Accelerating Materials Design via LLM-Guided Evolutionary Search (ICLR2026) [Paper]
RG-VFM Riemannian Variational Flow Matching for Material and Protein Design (ICLR2026) [Paper]
PRO-MOF PRO-MOF: Policy Optimization with Universal Atomistic Models for Controllable MOF Generation (ICLR2026) [Paper]
DiffSyn DiffSyn: a generative diffusion approach to materials synthesis planning (Nature Computational Science, 2026) [Paper] [Code]
Matra-Genoa A generative material transformer using Wyckoff representation (npj Computational Materials, 2026) [Paper] [Code]
LEGO-xtal AI-assisted rapid crystal structure generation towards a target local environment (npj Computational Materials, 2026) [Paper] [Code]
LLaMat A family of large language models for materials research with insights into model adaptability in continued pretraining (Nature Machine Intelligence, 2026) [Paper] [Code]

Aiding Characterization

Method Paper
- Insightful classification of crystal structures using deep learning (Nature Communications, 2018) [Paper]
- Advanced steel microstructural classification by deep learning methods (Scientific Reports, 2018) [Paper]
- Neural network for nanoscience scanning electron microscope image recognition (Scientific Reports, 2017) [Paper]
- Deep Learning-Assisted Quantification of Atomic Dopants and Defects in 2D Materials (Advanced Science, 2021) [Paper]
- Classification of crystal structure using a convolutional neural network (IUCrJ, 2017) [Paper]
- Synthesis, optical imaging, and absorption spectroscopy data for 179072 metal oxides (Scientific Data, 2019) [Paper]
- Adaptively driven X-ray diffraction guided by machine learning for autonomous phase identification (npj Computational Materials, 2023) [Paper] [Code](https://github.com/njszym/AdaptiveXRD)
- Automated classification of big X-ray diffraction data using deep learning models (npj Computational Materials, 2023) [Paper] [Code](https://github.com/AGI-init/XRDs)
XRD-AutoAnalyzer Integrated analysis of X-ray diffraction patterns and pair distribution functions for machine-learned phase identification (npj Computational Materials, 2024) [Paper] [Code](https://github.com/njszym/XRD-AutoAnalyzer)
CrystalNet Towards end-to-end structure determination from x-ray diffraction data using deep learning (npj Computational Materials, 2024) [Paper] [Code](https://github.com/gabeguo/deep-crystallography-public)
- Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model (NeurIPS2024) [Paper] [Code](https://github.com/MasterAI-EAM/Material-Knowledge-Graph)
- Structured information extraction from scientific text with large language models (Nature Communications, 2024) [Paper] [Code](https://github.com/LBNLP/NERRE)
- Accelerating materials language processing with large language models (Communications Materials, 2024) [Paper] [Code](https://github.com/jwchoi95/GPT_MLP)
MatDuck Zero-Shot Learning for Materials Science Texts: Leveraging Duck Typing Principles (AAAI2025) [Paper] [Code](https://github.com/xinzcode/MatDuck)
SLM-MATRIX SLM-MATRIX: a multi-agent trajectory reasoning and verification framework for enhancing language models in materials data extraction (npj Computational Materials, 2025) [Paper] [Code]
- Construction of a knowledge graph for framework material enabled by large language models and its application (npj Computational Materials, 2025) [Paper] [Code]
- Unsupervised identification of crystal defects from atomistic potential descriptors (npj Computational Materials, 2025) [Paper]
PAGL Learning to predict rare events: the case of abnormal grain growth (npj Computational Materials, 2025) [Paper]
PXRDnet Ab initio structure solutions from nanocrystalline powder diffraction data via diffusion models (Nature Materials, 2025) [Paper] [Code](https://github.com/gabeguo/cdvae_xrd)
SBC Automated identification of bulk structures, two-dimensional materials, and interfaces using symmetry-based clustering (npj Computational Materials, 2025) [Paper] [Code]
DefiNet Modeling crystal defects using defect informed neural networks (npj Computational Materials, 2025) [Paper] [Code](https://github.com/Shen-Group/DefiNet)
PXRDGen Powder diffraction crystal structure determination using generative models (Nature Communications, 2025) [Paper] [Code](https://codeocean.com/capsule/7727770/tree/v1)
Mat-Instruction Mat-Instructions: A Large-Scale Inorganic Material Instruction Dataset for Large Language Models (IJCAI2025) [Paper]
Daisy Data-driven microstructural optimization of Ag-Bi-I perovskite-inspired materials (npj Computational Materials, 2025) [Paper] [Code]
CrystalShift Probabilistic phase labeling and lattice refinement for autonomous materials research (npj Computational Materials, 2025) [Paper]
- FerroAI: a deep learning model for predicting phase diagrams of ferroelectric materials (npj Computational Materials, 2025) [Paper] [Code]

Accelerating Theoretical Computation

Method Paper
BPNN Generalized neural-network representation of high-dimensional potential-energy surfaces (Physical Review Letters, 2007) [Paper]
- Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons (Physical Review Letters, 2010) [Paper]
NequIP E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials (Nature Communications, 2022) [Paper] [Code](https://github.com/mir-group/nequip)
Cormorant Cormorant: Covariant molecular neural networks (NeurIPS2019) [Paper] [Code](https://github.com/risilab/cormorant)
MACE MACE: Higher order equivariant message passing neural networks for fast and accurate force fields (NeurIPS2022) [Paper] [Code](https://github.com/ACEsuit/mace)
DimeNet Directional Message Passing for Molecular Graphs (ICLR2020) [Paper] [Code](https://github.com/gasteigerjo/dimenet)
M3GNet A universal graph deep learning interatomic potential for the periodic table (Nature Computational Science, 2022) [Paper] [Code](https://github.com/materialsvirtuallab/m3gnet)
- Injecting domain knowledge from empirical interatomic potentials to neural networks for predicting material properties (NeurIPS2022) [Paper] [Code](https://github.com/shuix007/EIP4NNPotentials)
FAENet FAENet: Frame Averaging Equivariant GNN for Materials Modeling (ICML2023) [Paper]
CHGNet CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling (Nature Machine Intelligence, 2023) [Paper] [Code](https://github.com/CederGroupHub/chgnet)
- Forces are not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations (Transactions on Machine Learning Research, 2023) [Paper]
DeepH-E3 General framework for E(3)-equivariant neural network representation of density functional theory Hamiltonian (Nature Communications, 2023) [Paper] [Code]
AdsorbDiff AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion (ICML2024) [Paper] [Code]
- Multimodal language and graph learning of adsorption configuration in catalysis (Nature Machine Intelligence, 2024) [Paper] [Code](https://github.com/hoon-ock/multi-view)
DeepRelax Scalable crystal structure relaxation using an iteration-free deep generative model with uncertainty quantification (Nature Communications, 2024) [Paper] [Code]
- Universal machine learning interatomic potentials are ready for phonons (npj Computational Materials, 2025) [Paper] [Code]
AssembleFlow AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly (ICLR2025) [Paper]
SLEM Learning local equivariant representations for quantum operators (ICLR2025) [Paper]
- Machine learning Hubbard parameters with equivariant neural networks (npj Computational Materials, 2025) [Paper] [Code]
PIWSL Physics-Informed Weakly Supervised Learning For Interatomic Potentials (ICML2025) [Paper] [Code](https://github.com/nec-research/PICPS-ML4Sci)
ELoRA ELoRA: Low-Rank Adaptation for Equivariant GNNs (ICML2025) [Paper] [Code](https://github.com/hyjwpk/ELoRA)
- Learning the Electronic Hamiltonian of Large Atomic Structures (ICML2025) [Paper] [Code]
eSEN Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction (ICML2025) [Paper] [Code](https://github.com/facebookresearch/fairchem)
- The dark side of the forces: assessing non-conservative force models for atomistic machine learning (ICML2025) [Paper]
UMA UMA: A Family of Universal Models for Atoms (NeurIPS2025) [Paper]
TraceGrad TraceGrad: a Framework Learning Expressive SO(3)-equivariant Non-linear Representations for Electronic-Structure Hamiltonian Prediction (ICML2025) Paper(https://drive.google.com/file/d/1KVWrdCY_fIx88O5Me80EQeIu6A0xUWmT/view)]
MLIP Arena MLIP Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials via an Open, Accessible Benchmark Platform (NeurIPS2025) [Paper] [Code]
E2Former E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products (NeurIPS2025) [Paper] [Code]
LiFlow Flow matching for accelerated simulation of atomic transport in crystalline materials (Nature Machine Intelligence, 2025) [Paper] [Code](https://github.com/learningmatter-mit/liflow)
E2GNN Efficient equivariant model for machine learning interatomic potentials (npj Computational Materials, 2025) [Paper] [Code]
AlphaNet AlphaNet: scaling up local-frame-based neural network interatomic potentials (npj Computational Materials, 2025) [Paper] [Code]
- Accurate machine learning interatomic potentials for polyacene molecular crystals: application to single molecule host-guest systems (npj Computational Materials, 2025) [Paper]
- Interpolation and differentiation of alchemical degrees of freedom in machine learning interatomic potentials (Nature Communications, 2025) [Paper] [Code](https://github.com/learningmatter-mit/alchemical-mlip)
- Systematic softening in universal machine learning interatomic potentials (npj Computational Materials, 2025) [Paper]
- Cross-functional transferability in foundation machine learning interatomic potentials (npj Computational Materials, 2025) [Paper] [Code]
- A foundation machine learning potential with polarizable long-range interactions for materials modelling (Nature Communications, 2025) [Paper] [Code](https://github.com/reaxnet/reaxnet)
SurFF SurFF: a foundation model for surface exposure and morphology across intermetallic crystals (Nature Computational Science, 2025) [Paper] [Code](https://github.com/Long1Corn/SurFF)
- Sparsity-promoting Fine-tuning for Equivariant Materials Foundation Model (ICLR2026) [Paper]
NextHAM Advancing Universal Deep Learning for Electronic-Structure Hamiltonian Prediction of Materials (ICLR2026) [Paper]
MatRIS MatRIS: Toward Reliable and Efficient Pretrained Machine Learning Interaction Potentials (ICLR2026) [Paper]
DistMLIP DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials (ICLR2026) [Paper]
Uni-HamGNN A universal spin–orbit-coupled Hamiltonian model for accelerated quantum material discovery (Nature Machine Intelligence, 2026) [Paper] [Code]

Benchmark

Method Paper
MatBench Benchmarking materials property prediction methods: the Matbench test set and Automatminer reference algorithm (npj Computational Materials, 2020) [Paper] [Code](https://github.com/materialsproject/matbench)
M² Hub M²Hub: Unlocking the Potential of Machine Learning for Materials Discovery (NeurIPS2023) [Paper] [Code](https://github.com/yuanqidu/M2Hub)
JARVIS-Leaderboard JARVIS-Leaderboard: a large scale benchmark of materials design methods (npj Computational Materials, 2024) [Paper] [Code](https://github.com/usnistgov/jarvis_leaderboard)
- Structure-based out-of-distribution (OOD) materials property prediction: a benchmark study (npj Computational Materials, 2024) [Paper] [Code](https://github.com/sadmanomee/OOD_Materials_Benchmark)
SimXRD SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystalline Symmetry Classification Benchmark (ICLR2025) [Paper] [Code]
ECD ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials (ICLR2025) [Paper] [Code]
MaCBench Probing the limitations of multimodal language models for chemistry and materials research (Nature Computational Science, 2025) [Paper] [Code]
Matbench Discovery A framework to evaluate machine learning crystal stability predictions (Nature Machine Intelligence, 2025) [Paper] [Code](https://github.com/janosh/matbench-discovery)
MatGL Materials Graph Library (MatGL), an open-source graph deep learning library for materials science and chemistry (npj Computational Materials, 2025) [Paper] [Code](https://github.com/materialsvirtuallab/matgl)
- Mining extreme properties from a large metamaterial database (Nature Communications, 2025) [Paper] [Code]
LeMat-GenBench A Unified Evaluation Framework for Crystal Generative Models (NeurIPS2025 Workshop AI4Mat) [Paper] [Code]
HSG-12M HSG-12M: A Large-Scale Dataset of Spatial Multigraphs from the Energy Spectra of non-Hermitian Crystals (ICLR2026) [Paper]
CatalystBench CatalystBench: A Comprehensive Multi-Task Benchmark for Advancing Language Models in Catalysis Science (ICLR2026) [Paper]
MOFSimBench MOFSimBench: evaluating universal machine learning interatomic potentials in metal-organic framework molecular modeling (npj Computational Materials, 2026) [Paper] [Code]

Common Dataset

Dataset Description URL
Materials Project Materials Project encompasses over 120,000 materials, each accompanied by a comprehensive specification of its crystal structure and important physical properties. Materials Project
JARVIS-DFT JARVIS-DFT encompasses data for approximately 40,000 materials and includes around one million calculated properties. JARVIS-DFT
OQMD OQMD is a repository of thermodynamic and structural properties of inorganic materials, derived from high-throughput DFT calculations. OQMD
Perov-5 Perov-5 is a specialized dataset of perovskite crystal materials, containing 18,928 different perovskite materials. Perov-5
Carbon-24 Carbon-24 is a specialized dataset of carbon materials, containing over 10,000 different carbon structures. Carbon-24
Crystallography Open Database Crystallography Open Database is a crystallography database that specializes in collecting and storing crystal structure information for inorganic compounds, small organic molecules, metal-organic compounds, and minerals. Crystallography Open Database
Raman Open Database Raman Open Database is an open database that specializes in collecting and storing Raman spectroscopy data. Raman Open Database
Inorganic Crystal Structure Database Inorganic Crystal Structure Database is the world's largest database for completely identified inorganic crystal structures. Inorganic Crystal Structure Database
Open Catalyst Project The goal of Open Catalyst Project is to utilize artificial intelligence to simulate and discover new catalysts for renewable energy storage. Open Catalyst Project
Python Materials Genomics Python Materials Genomics is a robust, open-source Python library for materials analysis, offering a range of modules for handling crystal structures, band structures, phase diagrams, and material properties. Python Materials Genomics
Phonon DOS Dataset Phonon DOS Dataset contains approximately 1,500 crystalline materials whose phonon DOS is calculated from DFPT. Phonon DOS Dataset
Carolina Materials Database CMD primarily consists of ternary and quaternary materials generated by some AI methods. Carolina Materials Database
Alexandria Database Alexandria Database includes a large quantity of hypothetical crystal structures generated by ML methods or other algorithmic methodologies. Alexandria Database
Materials Project Trajectory Dataset MPtrj contains 1,580,395 atomic configurations, corresponding energies, 7,944,833 magnetic moments, 49,295,660 forces, and 14,223,555 stress values. Materials Project Trajectory Dataset
Quantum MOF QMOF is a dataset of over 20K metal-organic frameworks and coordination polymers derived from DFT. Quantum MOF
Open Materials 2024 OMat24 contains over 110 million DFT calculations focused on structural and compositional diversity. Open Materials 2024
SuperCon3D SuperCon3D contains 1,578 superconductor materials (includes 83 distinct elements), each with both Tc and crystal structure data. SuperCon3D
Atomly The Atomly database provides an extensive collection of material data generated through high-throughput first-principles calculations. This includes 320,000 inorganic crystal structures, 310,000 bandgap and density of states profiles, 12,000 dielectric constant tensors, and 16,000 mechanical tensors. Atomly
LeMaterial LeMaterial is one of the largest databases. It aggregates and standardizes materials data from foundational sources such as the MP, OQMD, and Alexandria into unified datasets (e.g., LeMat-Bulk and LeMat-Traj) with consistent formats and identifiers to facilitate training and evaluation of ML models for crystalline materials. LeMaterial
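
The MPtrj counts quoted in the table above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes each stress value is one component of a 3×3 tensor stored per configuration, and that forces are counted once per atom. Neither assumption is stated in the table itself.

```python
# Counts quoted in the MPtrj row of the table above.
n_configurations = 1_580_395
n_forces = 49_295_660
n_stress_values = 14_223_555

# Assumption: stress values are tensor components stored per configuration.
# If so, the count should divide evenly by the number of configurations.
stress_per_configuration, remainder = divmod(n_stress_values, n_configurations)

# Assumption: one force entry per atom, so this ratio is the mean
# number of atoms per configuration.
mean_atoms_per_configuration = n_forces / n_configurations

print(stress_per_configuration, remainder)
print(round(mean_atoms_per_configuration, 2))
```

The stress count divides evenly into 9 values per configuration, which is consistent with a full 3×3 tensor under the assumption above.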

Credit by: @github.com/WanyuGroup/AI-for-Crystal-Materials

Materials & Chemistry Datasets

# Awesome Materials & Chemistry Datasets

About

A curated list of the most useful datasets in materials science and chemistry for training machine learning and AI foundation models. This includes experimental, computational, and literature-mined datasets—prioritizing open-access resources and community contributions.

This project aims to:

  • Catalog the best datasets by domain, type, quality, and size
  • Support reproducible research in AI for chemistry and materials
  • Provide a community-driven resource with contributions from researchers and developers
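
As a rough illustration of cataloging by domain and type, the sketch below stores a few rows from the tables in this list as records and filters them. The `DatasetEntry` class, the `select` helper, and the field names are hypothetical and are not part of this repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One table row: the five columns used throughout this list."""
    name: str
    domain: str
    size: str
    kind: str      # the "Type" column
    format: str


# A few rows copied from the tables in this document.
CATALOG = [
    DatasetEntry("OMat24 (Meta)", "Inorganic crystals", "110M DFT entries",
                 "Computational", "JSON/HDF5"),
    DatasetEntry("Crystallography Open Database (COD)", "Crystal structures",
                 "~525k entries", "Experimental", "CIF/SMILES"),
    DatasetEntry("ChemPile", "Chemistry", "75B+ tokens", "LLM Training", "Mixed"),
]


def select(catalog, *, kind=None, domain_contains=None):
    """Return entries matching an exact type and/or a domain substring."""
    result = []
    for entry in catalog:
        if kind is not None and entry.kind != kind:
            continue
        if domain_contains is not None and domain_contains.lower() not in entry.domain.lower():
            continue
        result.append(entry)
    return result


computational = select(CATALOG, kind="Computational")
crystal_related = select(CATALOG, domain_contains="crystal")
print([e.name for e in computational])
print([e.name for e in crystal_related])
```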



Contributing

Want to add a new dataset or improve metadata?

  1. Fork the repository
  2. Edit the appropriate dataset list or add a new entry
  3. Submit a pull request with a brief description and a download link
  4. Alternatively, skip the pull request and submit the dataset as an issue
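
Before opening a pull request, it can help to check that a new entry fills every column used in the tables below and carries a download link. The following is a minimal sketch only: `validate_entry`, the dictionary layout, and the `URL` key are assumptions made for illustration, not tooling provided by this repository.

```python
# Column names taken from the table headers in this list.
REQUIRED_COLUMNS = ("Dataset", "Domain", "Size", "Type", "Format")


def validate_entry(entry: dict) -> list:
    """Return a list of human-readable problems; an empty list means the entry looks complete."""
    problems = []
    for column in REQUIRED_COLUMNS:
        if not str(entry.get(column, "")).strip():
            problems.append(f"missing value for '{column}'")
    # Hypothetical 'URL' key holding the download link requested in step 3.
    url = str(entry.get("URL", ""))
    if not url.startswith(("http://", "https://")):
        problems.append("download link should be an http(s) URL")
    return problems


# Placeholder entry; the name and URL are made up for the example.
candidate = {
    "Dataset": "Example Dataset",
    "Domain": "Small molecules",
    "Size": "10k molecules",
    "Type": "Computational",
    "Format": "",
    "URL": "https://example.org/data",
}
print(validate_entry(candidate))
```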

Datasets

Computational Datasets

Dataset Domain Size Type Format
BOOM: Benchmarks for Out-Of-distribution Molecules Small molecules 10 Out-Of-Distribution Tasks (1M+ entries) Computational CSV
MSR-ACC/TAE25 Small molecules 77k CCSD(T)/CBS atomization energies Computational JSON
OMat24 (Meta) Inorganic crystals 110M DFT entries Computational JSON/HDF5
OMol25 (Meta) Molecular chemistry 100M+ DFT calculations Computational LMDB
OMC25 Molecular crystals >27M structures Computational Zarr
Materials Project (LBL) Inorganic crystals 500k+ compounds Computational JSON/API
Open Catalyst 2020 (OC20) Catalysis (surfaces) 1.2M relaxations Computational JSON/HDF5
AFLOW Inorganic materials 3.5M materials Computational REST API
OQMD Inorganic solids 1M+ compounds Computational SQL/CSV
JARVIS-DFT (NIST) 3D/2D materials 40k+ entries Computational JSON/API
Carolina Materials DB Hypothetical crystals 214k structures Computational JSON
NOMAD Various DFT/MD >19M calculations Computational JSON
MatPES DFT Potential Energy Surfaces ~400,000 structures from 300K MD simulations Computational JSON
Vector-QM24 Small organic and inorganic molecules 836k conformational isomers Computational JSON
AIMNet2 Dataset Non-metallic compounds 20M hybrid DFT calculations Computational JSON
RDB7 Barrier height and enthalpy for small organic reactions 12k CCSD(T)-F12 calculations Computational CSV
RDB19-Rad ΔG of activation and of reaction for organic reactions in 40 common solvents 5.6k DFT + COSMO-RS calculations Computational CSV
QCML Small molecules consisting of up to 8 heavy atoms 14.7B Semi-empirical + 33.5M DFT calculations Computational TFDS
QM9 Small organic molecules 134k molecules with quantum properties Computational SDF/CSV
QM7/QM7b Small molecules 7k molecules with atomization energies Computational SDF/CSV
QMugs Drug-like molecules 665 k mol / 2 M conf Computational HDF5
C2DB 2-D materials ~4 000 entries Computational JSON/API
ANI-1x / 1ccx Small organic mol 5 M (DFT) + 0.5 M (CCSD) Computational HDF5
CoRE MOF 2019 Metal-organic frameworks 14 763 structures Computational CIF/JSON
QMOF Database Metal-organic frameworks 20k+ structures (DFT) Computational CIF/JSON
Catalysis-Hub Surface reactions >100 k energies Computational JSON/API
ODAC23 MOF + CO₂/H₂O adsorption 38 M DFT calcs Computational HDF5
MOFX-DB Gas adsorption in MOFs 3 M isotherm pts Computational CSV/HDF5
LeMat-Bulk Inorganic materials (bulk) 6.7M structures (5.9M materials) Computational HuggingFace Dataset
LeMat-Traj Inorganic materials (trajectories) 113M structures Computational HuggingFace Dataset
NeurIPS Open Polymer Prediction 2025 Polymers ~1,500 test polymers with MD-derived properties Computational CSV
Carbon Data Carbon materials 22.9M atoms, 546 trajectories Computational EXTXYZ
MSR-ACC/TAE25 Small molecules (up to Ar) 76,879 total atomization energies Computational HDF5/CSV
DFT Solvation Energy Dataset Small molecules 651,290 solvation energies in 5 solvents Computational CSV/JSON
MD Simulated Monomer Properties Small molecules 410 molecules with thermodynamic properties Computational CSV/JSON
Multimodal Spectroscopic Dataset Molecular spectroscopy 790k molecules with simulated spectra Computational HDF5/JSON
PubChemQCR Small molecules (relaxation) 3.5M trajectories / 300M conformations Computational HuggingFace Dataset
MP-ALOE Universal MLIPs (89 elements) ~1M r2SCAN DFT calculations Computational JSONL/MACE
Alexandria DB Inorganic (1D–3D) >5 M DFT calcs (PBE) Computational JSON/OPTIMADE/LMDB
Quantum‑Chemical Bonding DB (LOBSTER) Solid‑state bonding analysis 1,520 compounds Computational JSON
MultixcQM9 (OpenQDC) Small molecules (QM9, multi‑XC) 133k molecules Computational Torch/NumPy
SPICE (OpenQDC) Drug‑like molecules 1 M conformers (energies & forces) Computational Torch/ASE
Matbench v0.1 Benchmarks (13 tasks) 10 datasets Benchmark/Comp CSV/HDF5
Matbench Discovery Stability, κ, structures Multiple files Benchmark/Comp CSV/ZIP
Materials Cloud Archives Various DFT/MD workflows 1,000+ datasets Computational HDF5/JSON/CIF
MS25 MLIP benchmark (6 material systems) Multi-system benchmark suite Computational/Benchmark HDF5
RadonPy Polymer Properties Data Polymer ~1070 MD-calculated Properties Computational CSV
SHNITSEL Data Organic Molecules 418,870 Post-HF-calculated Ground- and Excited-states Properties Computational XARRAY
Frustrated Lewis Pairs Database Small Molecules 146 Metal-free FLPs Computational HTML
AQCat25 Catalysis 13.5M frames / 5K materials Computational Parquet/ASE DB
OMol25 Electronic Structures Molecular chemistry 4M+ calculations Computational Raw DFT outputs
Unrestricted CCSD(T) Dataset For Organic Molecule Reactions Organic reactions 3119 configurations Computational
MC-PDFT-OPESf Reaction kinetics Diels-Alder reaction Computational
Quantum Cluster Database Nanoclusters 63,015 clusters Computational CSV/JSON
The Cambridge Cluster Database Mixed Clusters Multiple Files Computational Multiple Types
Battery Electrolyte Solvation/Ionization Organic molecules Thousands of molecules Computational
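
The Size column mixes several notations ("110M", "665 k", "14 763", "76,879"), which makes it awkward to sort or compare entries. The sketch below shows one possible way to pull the first number out of a size string. `parse_size` is a hypothetical helper, and the rule that k, M, and B mean thousand, million, and billion is an assumption about how the column is written.

```python
import re

_MULTIPLIERS = {"k": 10**3, "m": 10**6, "b": 10**9}

# A number with optional space/comma thousands separators or a decimal part,
# optionally followed by a k/M/B suffix that is not the start of a longer word.
_SIZE_PATTERN = re.compile(
    r"(\d{1,3}(?:[ ,]\d{3})+|\d+(?:\.\d+)?)\s*([kKmMbB])?\b"
)


def parse_size(text: str):
    """Return the first count found in a Size cell as an int, or None if there is none."""
    match = _SIZE_PATTERN.search(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", "").replace(" ", ""))
    suffix = (match.group(2) or "").lower()
    return round(number * _MULTIPLIERS.get(suffix, 1))


for cell in ["110M DFT entries", "~4 000 entries", "665 k mol / 2 M conf", "Multiple Files"]:
    print(cell, "->", parse_size(cell))
```

Only the first number in a cell is read, so a cell such as "665 k mol / 2 M conf" is reduced to its molecule count; cells without digits return `None`.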

Experimental Datasets

Dataset Domain Size Type Format
Crystallography Open Database (COD) Crystal structures ~525k entries Experimental CIF/SMILES
NIST ICSD (subset) Inorganic structures ~290k structures Experimental CIF
CSD (Cambridge) Organic crystals ~1.3M structures Experimental CIF
opXRD Crystal structures 92552 (2179 labeled) Experimental JSON
MDR SuperCon Superconductivity legacy superconductor database w/ material composition, structure, properties, and processes Mixed
ChEMBL Bioactive molecules 2.3M+ compounds with bioactivity data Experimental JSON/SDF
MoleculeNet Molecular properties 700k+ compounds across 17 datasets Mixed CSV/SDF
ESOL Aqueous solubility 1,128 compounds with solubility data Experimental CSV
FreeSolv Hydration free energy 643 molecules with experimental data Experimental CSV
Lipophilicity Octanol/water distribution 4,200 compounds with logD values Experimental CSV
PCBA Bioassay screening 400k+ compounds, 128 bioassays Experimental CSV
HIV Antiviral screening 41k compounds with HIV inhibition data Experimental CSV
BACE Beta-secretase inhibitors 1,522 compounds with IC50 data Experimental CSV
BBBP Blood-brain barrier permeability 2,053 compounds with permeability data Experimental CSV
Tox21 Toxicity screening 8k compounds, 12 toxicity targets Experimental CSV
ToxCast High-throughput toxicity 8k compounds, 600+ assays Experimental CSV
SIDER Drug side effects 1,427 drugs with adverse reactions Experimental CSV
ClinTox Clinical trial toxicity 1,491 compounds with FDA approval status Experimental CSV
PDBbind Protein-ligand binding 19k complexes with binding affinities Experimental PDB/SDF
BindingDB Protein-ligand binding 2.8M+ binding data points Experimental CSV/SDF
ProtBENCH Drug-target interactions Protein family-specific datasets Experimental CSV
PDBench Protein sequence design 595 protein structures, 40 architectures Experimental PDB
PDB-Struct Structure-based protein design Comprehensive protein design benchmark Experimental PDB
HTEM-DB Thin-film composition libraries 140 k+ samples Experimental JSON/API
OCx24 Electrocatalyst inks 572 samples (+DFT) Experimental CSV
Polymer Genome Polymers 20 k polymers Experimental + Comp CSV/JSON
CoRE MOF 2024 Metal-organic frameworks 40k+ experimental MOFs Experimental CIF
SAIR Protein-ligand binding 1M+ complexes, 5.2M structures, 2.5TB Experimental 3D/CSV
Anion Solvation DB Anion solvation ~26k properties Mixed CSV
BigSolDB Organic molecule solubility ~54k exp. values Experimental CSV
StarryData2 Experimental properties Figshare dump (2023/2024) Experimental CSV/JSON
CRIPT Polymer Data Polymers (synthesis, properties) Growing community DB Mixed JSON/API
Catechol Benchmark Solvent selection / Reaction yield 1200+ process conditions Experimental CSV
Leeds Solubility Data Solubility 2.3k measurements Experimental CSV
BigSolDB 2.0 Solubility 103k+ values Experimental CSV/XLSX
OpenExp Chemical reactions 274k pairs Experimental Varies
Battery Imaging Library (BIL) Battery imaging 80+ scans, >500B voxels Experimental Various

LLM Training Datasets

Dataset Domain Size Type Format
ChemPile Chemistry 75B+ tokens LLM Training Mixed
SmolInstruct Small molecules 3.3M samples LLM Training JSON
CAMEL Chemistry 20K problem-solution pairs LLM Training JSON
ChemNLP Chemistry Extensive, many combined datasets LLM Training JSON
ChemQA Chemistry Multimodal QA dataset LLM Training JSON
ChemLLMBench Chemistry 8 chemistry tasks benchmark LLM Training JSON
ChemistryQA Chemistry 4,500 questions across 200 topics LLM Training JSON
MaScQA Materials Science 640 QA pairs LLM Training XLSX
SciCode Research Coding in Physics, Math, Material Science, Biology, and Chemistry 338 subproblems LLM Training JSON
ChemData 700K Chemistry (9 core tasks) 730K Q-A instruction pairs LLM Training JSON
MatSci-Instruct (HoneyBee) Materials science ≈55K verified instructions LLM Training JSON
MoleculeQA Molecular properties & safety 62K multiple-choice QA pairs LLM Training JSON
BioInstruct 25K Biomedical / biochemistry 25K GPT-4 generated instructions LLM Training JSON
Lab-Bench Biology 2,400+ questions for biology agents LLM Training JSON
ChemBench 4K Chemistry competency benchmark 4,100 single-choice questions LLM Training JSON
GPQA Diamond Biology, Physics, Chemistry 448 multiple-choice questions LLM Training JSON
MaCBench Chemistry and materials science Vision-language tasks LLM Training JSON
ChemBench Chemistry 2,700+ question-answer pairs LLM Training JSON
MatText Materials property prediction 2M structures LLM Training HuggingFace Dataset
SciAssess Scientific literature analysis Benchmark for LLMs in science LLM Training JSON
ZINC20-ML Drug-like molecules (SMILES) ≈1B molecules LLM Training SMILES
PMC Open Access Subset Biomedical full-text 3.4M+ articles LLM Training XML
MatScholar Task-Schema QA (MatSci-NLP) Materials science (7 NLP tasks) Tens of thousands of examples LLM Training JSON
Mol-Instructions Chemistry molecular, protein, and biochemical instructions LLM Training HuggingFace Dataset
USPTO-LLM Chemical reactions 247K reactions LLM Training JSON/Graph
ChemRxivQuest Chem literature QA 970 QA pairs LLM Training JSON
USPTO-Lowe Patent reactions 1.8 M reactions Literature-mined RXN/SMILES
MolTextNet Small molecules with text 2.5M molecule-text pairs LLM Training HuggingFace Dataset
MolOpt-Instructions Molecule optimization 1.18M instruction-based optimization tasks LLM Training HuggingFace Dataset
TextEdge Crystal properties Crystal text descriptions with properties LLM Training JSON
LAMBench-TrainingSet-v1 Materials structures 19.8M structures for Large Atom Models LLM Training Various
LLM4Mat Materials property prediction 1.9M crystal structures, 45 properties, 3 modalities LLM Training Various
LLM-EO Transition metal complexes / Optimization 1.37M TMC space explored LLM Training GitHub
Flavor Analysis and Recognition Transformer Molecular taste prediction Multi-class taste classification dataset LLM Training SMILES/JSON
SCQA (Solar Cell QA) Solar cells 47K QA pairs LLM Training JSON
ScienceQA K–12 science, multimodal MCQs w/ lectures & explanations 21,208 Qs LLM Training/Eval JSON
SciBench College-level scientific problem solving (math/chem/phys) Open & closed sets LLM Eval PDF/JSON
MegaScience Scientific reasoning (7 disciplines) 1.25M instances (650k reasoning questions from 12k textbooks) LLM Training HuggingFace Dataset
Mat-Instructions Inorganic materials ~30k instructions LLM Training JSON
Open Materials Guide (OMG) Materials synthesis 17K synthesis recipes LLM Training JSON
ChemDFM Chemistry 34B tokens / 2.7M instructions LLM Training HuggingFace
ChemTable Chemistry Tables Large-scale benchmark LLM Training/Benchmark JSON
ChemCoTBench Molecular reasoning Annotated datasets LLM Training/Benchmark HuggingFace Dataset

Literature-mined & Text Datasets

Dataset Domain Size Type Format
PubChem Molecules & data 119M compounds Literature SMILES/SDF
Open Reaction Database (ORD) Synthetic reactions ~1M reactions Experimental/Lit JSON
PatCID (IBM) Chemical image data 81M images / 13M mols Literature PNG/SMILES
MatScholar NLP corpus (materials) 5M+ abstracts Literature JSON/Graph
Matbench (metadata/text tasks) Text/meta ML tasks 13 tasks Literature/Benchmark CSV
OpenQDC Hub QM molecules & reactions 1.5 B geometries Literature/Computational Python API/NPZ
L2M3 - Large Language Model MOF Miner Metal-organic frameworks from >40k articles Literature-mined CSV

🌊 Computational Fluid Dynamics, PDE & Engineering Datasets

Dataset Domain Size Type Format
PDEBench PDE solving / Scientific ML Multiple datasets Benchmark / Simulation HDF5/PyTorch
BLASTNet Fluid mechanics / Reacting flows 17 TB Simulation / CFD HDF5/NPY
Johns Hopkins Turbulence DB (JHTDB) DNS/LES turbulence (9 canonical flows) ≈ 350 TB Simulation Web API / HDF5 cutouts
Airfoil CFD 2k 1,830 airfoils × 25 AoA × 3 Re ~6 GB (250 k cases) Simulation HDF5
PDEArena (collection) 2-D Navier–Stokes, Shallow-Water, 3-D Maxwell ≈ 100 GB (4 datasets) Simulation Torch / HDF5
WeatherBench 2 Global weather reanalysis (ERA5, 1979-2023) ≈ 5 TB Reanalysis NetCDF/Zarr
UT Austin Channel-DNS Suite Incompressible channel flow Reτ 180 – 5200 ≈ 10 TB Simulation Binary / ASCII
Compressible TPC DNS DB Compressible channel flow (25 M, Reτ*) ~2 GB Simulation TXT tables
Curated RANS ↔ DNS Dataset 29 geometries, 4 RANS models w/ DNS/LES labels 1.1 GB Simulation HDF5/CSV
NASA Common Research Model (CRM) Aircraft CRM geom. + wind-tunnel & CFD results Multi-GB Mixed (Exp + Sim) CAD / CSV / Tecplot
Darcy-Flow (FNO) 2-D porous-media pressure fields (∇·k∇u = f) ≈ 1 GB (10 k samples) Simulation HDF5
HiFi-TURB LES/DNS High-fidelity LES/DNS for complex 3D flows Multi-case suite Simulation (DNS/LES) HDF5/NetCDF
NASA High Lift Prediction Workshop (HLPW) High-lift aircraft configurations Multi-GB Mixed (exp + CFD) CAD/CSV/Tecplot
High-Speed TBL DNS DB Compressible turbulent boundary layers DNS database Simulation HDF5
ML Turbulence (Kaggle) RANS Reynolds stress tensor data ~GB scale Benchmark/Simulation CSV/HDF5

Proprietary Datasets (for reference)

Dataset Domain Size Use Case Notes
CAS Registry Chemical substances 250M+ substances Industry standard for molecule indexing
Reaxys (Elsevier) Reactions & properties Millions of reactions Rich curated literature reaction data
Citrine Informatics DB Experimental materials Private Materials ML platform w/ industry data
CSD (Cambridge) Organic crystals 1.3M+ Gold-standard X-ray structures
PoLyInfo Polymers & properties 500k+ data points / Experimental Polymer properties from literature sources

Dataset Resources

  • The Materials Data Facility - Over 100 TB of open materials data. #TODO list some of these in the tables above
  • Foundry-ML search Foundry - 61 structured datasets ready for download through a Python client #TODO list some of these in the tables above

TODO

  • Add all OpenQDC datasets https://www.openqdc.io/datasets
  • Add a dataset on solubilities of gases in polymers: 15 000 experimental measurements of the uptake of 79 gases (0.01–50 wt%) in 102 different polymers, at pressures from 1 × 10⁻³ to 7 × 10² bar and temperatures from 233 to 508 K, covering nearly 500 solvent–polymer systems. Optimized structures of various repeating units are included. It is available here: Data
  • Add Materials Cloud Datasets
  • Classify Atomly. A bit challenging with non-English
  • Look into adding NOMAD for experimental data as well
  • Add A Quantum-Chemical Bonding Database for Solid-State Materials Part 1: https://zenodo.org/records/8091844 Part 2: https://zenodo.org/records/8092187
  • Add QM datasets. http://quantum-machine.org/datasets/
  • Find link for | ChemRxivQuest | Chemistry literature QA | 970 curated QA pairs | LLM Training | JSON | CC BY 4.0 | Open | ChemRxivQuest |
  • Find new link for USPTO-Reactions | USPTO Reactions | Organic reactions | 1.8M reactions | Literature | RXN/SMILES | Open | Open |
  • Find dataset link for | SciCUEval | Multidomain scientific comprehension (bio/chem/phys/matsci) | 10 sub-datasets | LLM Eval | JSON/PDF | Open | Open |
  • Find dataset for | MatSciKB | Materials science KB | 38.5k entries (20k papers, 3.6k Wikipedia, 1.9k textbooks, 10.5k datasets) | Literature | Structured text | Open | Open |


License

This project is licensed under the MIT License. Each dataset listed has its own license, noted in the table. Always check the source's license before using the data in your project.


Acknowledgements

The primary effort of Ben Blaiszik on this project was performed under financial assistance award 70NANB24H049 / MML24-1001 from the National Institute of Standards and Technology (NIST).

Thanks to the open data and research communities, including:

  • Meta AI FAIR
  • The Materials Data Facility / Foundry-ML
  • NIST JARVIS and Materials Project
  • LBL, MIT, CCDC, FIZ Karlsruhe
  • Contributors to Open Catalyst, PubChem, ORD, and AFLOW
  • Developers of open chemistry toolkits (RDKit, Open Babel)


Citation

If this repository was helpful in your work, feel free to cite or star the repo. You can also reference the underlying dataset publications linked above.

Changelog

This changelog is autogenerated, so it may contain errors.

October 2025

Added 18 new datasets focusing on catalysis, reaction kinetics, cluster chemistry, experimental solubility, literature mining, and foundation models to enhance resources for computational chemistry and machine learning applications.

🧮 Computational Datasets (7 datasets)
  • AQCat25: The AQCat25 dataset provides a large and diverse collection of 13.5 million DFT calculation trajectories, encompassing approximately 5K materials and 47K intermediate-catalyst systems. It is designed to complement existing large-scale datasets by providing calculations at higher fidelity and including critical spin-polarized systems, which are essential for accurately modeling many industrially relevant catalysts.
  • OMol25 Electronic Structures Dataset: The OMol25 Electronic Structures dataset includes the raw DFT outputs, electronic densities, wavefunctions, and molecular orbital information for over 4 million high-accuracy quantum chemical calculations. We see this as a transformative opportunity to develop higher quality partial charges, partial spins, and advanced electronic features to unlock the next generation of physics-informed ML models.
  • Unrestricted CCSD(T) Dataset For Organic Molecule Reactions: Dataset of 3119 organic molecule configurations at gold-standard quantum accuracy with automated workflows for unrestricted CCSD(T) calculations. Includes a transferable MLIP trained on UCCSD(T) data, showing significant improvements in force and activation energy accuracy.
  • MC-PDFT-OPESf: This work combines multi-configuration pair-density functional theory (MC-PDFT) as an accurate and efficient multireference electronic structure method with on-the-fly probability enhanced sampling flooding (OPESf) as an enhanced sampling method capable of accelerating reactive transitions. MC-PDFT–OPESf provides reaction rates in agreement with experiments at a fraction of the computational cost required by conventional unbiased ab-initio calculations.
  • Quantum Cluster Database: A database of 63015 low-energy atomically precise nanoclusters for 55 elements across the periodic table, including main group and transition metal elements.
  • Cambridge Cluster Database: A collection of results from global optimizations for a variety of cluster systems, including Lennard-Jones, metal, molecular, and ionic clusters. The database is continuously updated with new results from published papers.
  • Battery Electrolyte Solvation/Ionization: This dataset presents molecular properties critical for battery electrolyte design, specifically solvation energies, ionization potentials, and electron affinities for thousands of organic molecules from QM9, EGP, GDB17, and ZINC.
🧪 Experimental Datasets (5 datasets)
  • Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning, providing the first-ever transient flow dataset for machine learning benchmarking, covering over 1200 process conditions. This dataset focuses on solvent selection, a task that is particularly difficult to model theoretically.
  • BNNLab/Solubility_data: Leeds Solubility Data: Curated solubility data in organic solvents and water and descriptors for solubility prediction.
  • BigSolDB 2.0: A comprehensive dataset of 103,944 experimentally measured solubility values of 1,448 organic compounds in 213 solvents, reported in 1,595 peer-reviewed articles.
  • OpenExp: Features 274,439 pairs of chemical reactions and their corresponding step-by-step instructions of experimental procedures. The dataset is compiled from the USPTO-Applications and ORD databases.
  • Battery Imaging Library (BIL): An open, curated collection of multi-modal and multi-length scale battery imaging datasets, featuring over 80 scans and 500 billion voxels of data from single particles to full cells.
📚 LLM Training Datasets (5 datasets)
  • Mat-Instructions: A large-scale inorganic material instruction dataset with ~30k instruction-response pairs, designed to unlock the potential of LLMs in materials science.
  • Open Materials Guide (OMG): A dataset of 17K high-quality, expert-verified synthesis recipes from open-access literature, which forms the basis for the AlchemyBench benchmark for LLM-guided synthesis prediction.
  • ChemDFM: A pioneering LLM for chemistry trained on 34B tokens from chemical literature and textbooks, and fine-tuned using 2.7M instructions. As a result, it can understand and reason with chemical knowledge in free-form dialogue.
  • ChemTable: A large-scale benchmark of real-world chemical tables curated from the experimental sections of literature. ChemTable supports table recognition and table understanding tasks to advance scientific reasoning in chemistry.
  • ChemCoTBench: A reasoning framework that bridges molecular structure understanding with arithmetic-inspired operations to formalize chemical problem-solving into transparent, step-by-step workflows for tasks like molecular optimization and reaction prediction.
📖 Literature-mined & Text Datasets (1 dataset)
  • L2M3 (Large Language Model MOF Miner): A database of MOF synthesis conditions and properties extracted from over 40,000 research articles using LLMs, enabling analysis of synthesis-structure-property relationships.

August 2025

Enhanced scientific reasoning capabilities and machine learning interatomic potential benchmarking with 6 new high-quality datasets for AI scientists and materials researchers.

🧮 Computational Datasets (3 datasets)
  • OMC25: A collection of over 27 million molecular crystal structures containing 12 elements and up to 300 atoms in the unit cell. The dataset was generated from dispersion-inclusive density functional theory (DFT) relaxation trajectories of over 230,000 randomly generated molecular crystal structures of around 50,000 organic molecules.
  • MS25: Comprehensive benchmark dataset for evaluating machine learning interatomic potentials (MLIPs) across 6 diverse materials systems including MgO surfaces, liquid water, zeolites, catalytic Pt surface reactions, high-entropy alloys, and disordered Zr-oxides. Evaluates 5 MLIP architectures (MACE, NequIP, Allegro, MTP, Torch-ANI) with focus on derived physical observables beyond traditional energy/force metrics. Demonstrates that equivariant MLIPs offer 1.5-2× improvements over nonequivariant models in complex systems, while highlighting the importance of explicit validation of physical properties rather than relying solely on error metrics.
  • Frustrated Lewis Pairs Database: 146 metal-free frustrated Lewis pairs (FLPs).
📚 LLM Training Datasets (4 datasets)
  • MaCBench: A comprehensive benchmark for evaluating how vision-language models handle real-world chemistry and materials science tasks across three core aspects: data extraction, experimental understanding, and results interpretation. Reveals fundamental limitations in spatial reasoning and cross-modal information synthesis in leading models.
  • ChemBench: A cutting-edge framework to evaluate the chemical knowledge and reasoning capabilities of large language models (LLMs). It includes over 2,700 curated question-answer pairs across diverse chemistry topics and uniquely encodes chemical semantics, enabling models to process and reason about molecules and equations.
  • MatText: A comprehensive benchmarking framework spanning multiple representations and model scales, which finds that LLMs consistently fail to capture coordinate information while excelling at category patterns. This geometric blindness persists regardless of model size, dataset scale, or text representation strategy.
  • MegaScience: Large-scale scientific reasoning dataset featuring 1.25 million high-quality instances across 7 scientific disciplines. Includes TextbookReasoning component with 650k reasoning questions extracted from 12,000 university-level textbooks, providing truthful reference answers for training AI scientists. Developed through systematic ablation studies and comprehensive evaluation across 15 benchmarks, demonstrating superior performance and training efficiency compared to existing open-source scientific datasets.

July 2025

Expanded the collection into new scientific domains with 31 new datasets, introducing benchmarks for physics-based machine learning, adding comprehensive quantum mechanics datasets, expanding materials science resources, and enhancing scientific evaluation benchmarks.

🌊 Computational Fluid Dynamics, PDE & Engineering Datasets (15 datasets)
  • PDEBench: A comprehensive benchmark suite for scientific machine learning featuring a wide range of Partial Differential Equations. It provides large, ready-to-use datasets for challenging physics problems, supporting both forward and inverse modeling.
  • BLASTNet: A 17 TB collection of high-fidelity fluid mechanics simulation datasets for ML applications in automotive, propulsion, and energy sectors. It includes code and pre-trained models for tasks like turbulence modeling and spatio-temporal prediction.
  • JHTDB: multi-terabyte DNS/LES portal with isotropic, channel, MHD, boundary-layer and atmospheric datasets.
  • Airfoil CFD 2k: DOE/NREL benchmark: 1,830 shapes × 250 k RANS simulations; HDF5 + AWS mirror.
  • PDEArena: Hugging-Face org offering Navier–Stokes, Shallow-Water & Maxwell tensors; MIT license.
  • WeatherBench 2: ERA5-derived Zarr cubes for data-driven medium-range forecasting; MIT.
  • UT Austin DNS Suite: public HTTP server with Reτ 180–5200 channel data & statistics.
  • Compressible TPC DNS DB: 25 Reynolds–Mach cases, plain-text statistics (Mendeley Data).
  • Curated RANS ↔ DNS: Scientific Data descriptor + Kaggle mirror for ML turbulence closures.
  • NASA CRM: open CAD, grids, wind-tunnel Cp & force/moment datasets for the community benchmark.
  • Darcy Flow (FNO): canonical permeability→pressure dataset used in FNO/PINO papers.
  • HiFi-TURB LES/DNS: EU-funded project providing high-fidelity Large Eddy Simulation and Direct Numerical Simulation datasets for complex 3D turbulent flows, supporting advanced turbulence modeling and AI/ML applications in computational fluid dynamics.
  • NASA High Lift Prediction Workshop (HLPW): Multi-phase workshop datasets featuring high-lift aircraft configurations with comprehensive experimental validation data, CAD geometries, and CFD solutions for aerodynamic modeling and validation.
  • High-Speed TBL DNS DB: Specialized database of Direct Numerical Simulation data for compressible turbulent boundary layers, providing detailed flow field information for high-speed aerodynamic applications and turbulence model development.
  • ML Turbulence (Kaggle): Community-contributed dataset featuring RANS Reynolds stress tensor data with ground truth labels, providing a standardized benchmark for machine learning approaches to turbulence modeling.
🧮 Computational Datasets (7 datasets)
  • PubChemQCR: A massive dataset of molecular relaxation trajectories for ~3.5 million small molecules, containing over 300 million conformations with energy and force labels. It is the largest public dataset of its kind, designed to accelerate the development of machine learning interatomic potentials (MLIPs).
  • MP-ALOE: Nearly 1 million DFT calculations using the accurate r2SCAN meta-generalized gradient approximation, covering 89 elements. Created using active learning and primarily consisting of off-equilibrium structures, MP-ALOE is designed for training universal machine learning interatomic potentials (UMLIPs) with strong performance on thermochemical properties, force prediction, and physical soundness under extreme conditions.
  • Alexandria DB: Massive computational materials database containing over 5 million DFT calculations using PBE functional for 1D-3D inorganic materials. Provides OPTIMADE-compliant API access and LMDB format for high-performance materials screening and property prediction workflows.
  • Quantum-Chemical Bonding DB (LOBSTER): Specialized dataset providing detailed bonding analysis for 1,520 solid-state compounds using LOBSTER methodology. Enables understanding of chemical bonding in crystalline materials through projected crystal orbital Hamilton populations and related descriptors.
  • MultixcQM9 & SPICE (OpenQDC): Enhanced quantum chemistry datasets within the OpenQDC framework. MultixcQM9 provides multi-exchange correlation functional data for 133k small molecules, while SPICE offers 1 million conformers with energies and forces for drug-like molecules, both optimized for machine learning applications.
  • Matbench v0.1 & Discovery: Comprehensive benchmarking suites for materials property prediction featuring 13 standardized tasks across 10 datasets. Matbench Discovery specifically targets stability prediction, thermal conductivity, and structure generation with rigorous evaluation protocols.
  • Materials Cloud Archives: Centralized repository of over 1,000 computational datasets from various DFT and molecular dynamics workflows. Provides standardized access to diverse materials science calculations with comprehensive metadata and version control.
📚 LLM Training Datasets (5 datasets)
  • LLM-EO (Evolutionary Optimization): A framework that integrates LLMs into evolutionary algorithms for optimizing transition metal complexes. This approach leverages the chemical knowledge of LLMs to surpass traditional genetic algorithms, enabling flexible, multi-objective optimization without complex mathematical formulations.
  • Flavor Analysis and Recognition Transformer: A state-of-the-art machine learning model dataset for predicting molecular taste from chemical structures. Built on ChemBERTa transformer architecture, it classifies molecules across four taste categories (sweet, bitter, sour, umami) with >91% accuracy, enabling interpretability through gradient-based visualizations and applications in flavor compound discovery and rational food design.
  • SCQA (Solar Cell QA): Domain-specific question-answering dataset containing 47,268 QA pairs about solar cell properties, auto-generated using ChemDataExtractor. Fine-tuning language models on this dataset achieves F1-scores exceeding general-English QA datasets by 10-20%, demonstrating the value of domain-specific training data for specialized scientific applications.
  • ScienceQA: Comprehensive K-12 science education dataset with 21,208 multimodal multiple-choice questions including lectures and explanations. Supports development of educational AI systems and scientific reasoning capabilities in language models.
  • SciBench: College-level scientific problem-solving benchmark covering mathematics, chemistry, and physics with both open and closed evaluation sets. Enables systematic assessment of LLM performance on advanced scientific reasoning tasks.
🧪 Experimental Datasets (4 datasets)
  • Anion Solvation DB: Comprehensive compilation of 26,000+ solvation properties including 8,241 experimental pKa values across 8 solvents, 5,536 computed gas-phase acidities, and over 12,000 solvation energies for anions and neutral compounds computed using COSMO-RS. Bridges experimental and computational approaches for understanding anion behavior in different solvation environments.
  • BigSolDB: Extensive experimental solubility database containing 54,273 measured solubility values over the temperature range 243.15-403.15 K in various organic solvents and water. Features diverse chemical space coverage with an interactive t-SNE exploration tool and comprehensive statistical analysis for QSPR model development.
  • StarryData2: Large-scale experimental properties dataset from Figshare spanning 2023-2024, providing comprehensive experimental measurements across diverse materials and chemical systems for machine learning model validation and training.
  • CRIPT Polymer Data: Community-driven polymer database featuring synthesis procedures, characterization data, and properties. Enables standardized data sharing and collaborative research in polymer science through structured JSON API access.

June 2025

Added 28 new high-quality datasets spanning polymer science, drug discovery, carbon materials, spectroscopy, MOF databases, foundation model training, and materials knowledge bases:

🧮 Computational Datasets (15 datasets)
  • NeurIPS Open Polymer Prediction 2025: Kaggle competition dataset for predicting 5 key polymer properties (Tg, FFV, Tc, density, Rg) from SMILES structures using MD simulation ground truth. Includes ~1,500 test polymers.
  • Carbon Data: 22.9 million atom dataset with synthetic energy labels from C-GAP-17 potential, featuring 546 carbon trajectories across diverse densities and temperatures. Captures nanotubes, graphitic films, diamond, and amorphous carbon environments.
  • MSR-ACC/TAE25: Microsoft Research's comprehensive dataset of 76,879 total atomization energies computed at CCSD(T)/CBS level using W1-F12 protocol. Exhaustively covers chemical space for elements up to argon with sub-chemical accuracy (±1 kcal/mol).
  • DFT Solvation Energy Dataset: 651,290 computed solvation energies for 130,258 molecules from QM9 dataset across 5 solvents (acetone, ethanol, acetonitrile, DMSO, water). Achieves 0.5 kcal/mol MAE for small molecules with accompanying ML models and web interface.
  • MD Simulated Monomer Properties: GPU-accelerated molecular dynamics dataset of thermodynamic properties for 410 molecules, generated through active learning pipeline. Includes validation against experimental data and automated simulation workflow.
  • Multimodal Spectroscopic Dataset: Comprehensive spectroscopic dataset with simulated 1H-NMR, 13C-NMR, HSQC-NMR, Infrared, and Mass spectra for 790k molecules from patent reactions. Enables multimodal foundation model development for structure elucidation and functional group prediction.
  • QMugs: 665k drug-like molecules with ~2M conformers, featuring quantum mechanical properties at both semi-empirical (GFN2-xTB) and DFT (ωB97X-D/def2-SVP) levels.
  • C2DB (Computational 2D Materials Database): ~4,000 two-dimensional materials with computed structural, electronic, magnetic, and optical properties.
  • ANI-1x / ANI-1ccx: 5 million DFT and 500k CCSD(T) calculations for organic molecules, supporting machine learning potential development.
  • CoRE MOF 2019: 14,763 computation-ready metal-organic frameworks with solvent and charge balancing, suitable for high-throughput screening.
  • QMOF Database: Comprehensive database of quantum-chemical properties for 20,000+ metal-organic frameworks derived from high-throughput periodic density functional theory calculations.
  • Catalysis-Hub Surface Reactions: Over 100,000 adsorption and reaction energies on catalytic surfaces, accessible via a Python/GraphQL API.
  • ODAC23 (Open DAC 2023): 38 million DFT calculations of CO₂/H₂O adsorption on 8,400 MOFs, aimed at direct-air-capture sorbent discovery.
  • MOFX-DB: Over 3 million simulated adsorption data points across 160,000 MOFs and 286 zeolites for various gases.
  • Enhanced QCML dataset entry with more comprehensive description of coverage and properties
🧪 Experimental Datasets (5 datasets)
  • SAIR (Structurally Augmented IC50 Repository): Largest public protein–ligand binding dataset with over 1 million complexes and 5.2 million cofolded 3D structures (2.5TB total). Combines experimental binding affinities from ChEMBL/BindingDB with Boltz-1x predicted structures.
  • CoRE MOF 2024: Updated database of over 40,000 experimentally reported metal-organic frameworks from literature through early 2024. Includes pre-computed material properties for high-throughput material-process screening and carbon-capture applications.
  • HTEM-DB (High-Throughput Experimental Materials Database): More than 140,000 composition–process–property data points from combinatorial sputtering experiments, with optical, electrical, and structural measurements.
  • OCx24 (Open Catalyst Experiments 2024): 572 synthesized catalyst inks evaluated with matched XRF/XRD and DFT adsorption energies, bridging the gap between simulation and laboratory data.
  • Khazana / Polymer Genome: Approximately 20,000 polymers with DFT-calculated properties and experimental dielectric data, supporting machine learning on soft materials.
📚 LLM Training Datasets (5 datasets)
  • MolTextNet: 2.5 million high-quality molecule-text pairs from ChEMBL35, featuring GPT-4o-mini generated descriptions 10x longer than existing datasets. Integrates structural features, computed properties, bioactivity data, and synthetic complexity for multimodal molecular modeling.
  • MolOpt-Instructions: 1.18 million instruction-based molecule optimization tasks for fine-tuning LLMs on drug discovery. Supports interactive human-machine dialogue for molecule optimization through the DrugAssist framework, enabling expert feedback integration and iterative refinement.
  • TextEdge: Benchmark dataset for predicting crystal properties from natural language text descriptions. Demonstrates superior performance of LLM-based approaches over traditional GNN methods, with improvements of 8% on band gap prediction and 65% on unit cell volume prediction.
  • LAMBench-TrainingSet-v1: Massive training dataset for Large Atom Models (LAMs) containing 19.8 million valid structures from the OpenLAM Initiative. Includes 1 million structures on the convex hull for advancing generative modeling and materials science applications.
  • LLM4Mat: Comprehensive benchmark dataset for evaluating LLMs in materials property prediction, containing 1.9M crystal structures from 10 data sources with 45 distinct properties. Features three input modalities (crystal composition, CIF, text description) with 4.7M, 615.5M, and 3.1B tokens respectively.
📖 Literature-mined & Text Datasets (3 datasets)
  • MatSciKB: Comprehensive materials science knowledge base with 38,469 curated entries across 16 categories. Integrates ArXiv papers (20,384), Wikipedia articles (3,620), textbooks (1,930), datasets (10,473), formulas (57), and GPT-generated examples (2,005) with efficient CRUD operations for research applications.
  • ChemRxivQuest: 970 curated question–answer pairs spanning 17 chemistry subfields, designed for retrieval-augmented generation and factuality assessments.
  • USPTO-Lowe Reactions (1976–2016): 1.8 million atom-mapped reactions extracted from US patents, serving as a benchmark for reaction prediction and retrosynthesis models.
📚 Enhanced Literature & Benchmark Resources (2 datasets)
  • Matbench (metadata/text tasks): Extended benchmarking suite providing 13 standardized tasks for text-based and metadata-driven materials property prediction. Enables systematic evaluation of natural language processing approaches in materials science applications.
  • OpenQDC Hub: Comprehensive quantum chemistry database aggregating 1.5 billion molecular geometries and quantum mechanical properties. Provides unified Python API access to diverse quantum chemistry datasets with standardized formats for large-scale machine learning applications.
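
Two of the June 2025 entries above quote totals that can be cross-checked from their own breakdowns. The snippet below is only a sanity check on the numbers as quoted in this changelog; the variable names are illustrative, and the solvation check assumes exactly one computed energy per molecule per solvent.

```python
# Category counts quoted in the MatSciKB entry above.
matscikb_counts = {
    "ArXiv papers": 20_384,
    "Wikipedia articles": 3_620,
    "textbooks": 1_930,
    "datasets": 10_473,
    "formulas": 57,
    "GPT-generated examples": 2_005,
}
matscikb_total = sum(matscikb_counts.values())
print(matscikb_total)  # 38469, matching the stated 38,469 curated entries

# DFT Solvation Energy Dataset: 130,258 molecules in 5 solvents.
# Assumption: one solvation energy per (molecule, solvent) pair.
solvation_total = 130_258 * 5
print(solvation_total)  # 651290, matching the stated 651,290 energies
```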

Earlier Updates

For changes made earlier than the changelog entries, please see the repository commit history.

Credit: github.com/blaiszik/awesome-matchem-datasets

Awesome Materials & Chemistry Datasets

About

A curated list of the most useful datasets in materials science and chemistry for training machine learning and AI foundation models. This includes experimental, computational, and literature-mined datasets—prioritizing open-access resources and community contributions.

This project aims to:

  • Catalog the best datasets by domain, type, quality, and size
  • Support reproducible research in AI for chemistry and materials
  • Provide a community-driven resource with contributions from researchers and developers



Contributing

Want to add a new dataset or improve metadata?

  1. Fork the repository
  2. Edit the appropriate dataset list or add a new entry
  3. Submit a pull request with a brief description and download link
  4. Alternatively, submit the dataset as an issue instead of a pull request
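
As a rough aid for step 2, the sketch below formats a new entry as a single table row. It assumes the dataset tables are Markdown tables with the five columns used in the Datasets section (Dataset, Domain, Size, Type, Format); `DatasetEntry`, `to_row`, and the example URL are hypothetical names for illustration, not part of the repository.

```python
from dataclasses import dataclass

# Column order used by the dataset tables in this list.
COLUMNS = ("Dataset", "Domain", "Size", "Type", "Format")


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical record for one dataset row."""
    name: str
    domain: str
    size: str
    kind: str          # the "Type" column, e.g. "Computational"
    file_format: str   # the "Format" column, e.g. "CSV"
    url: str = ""      # optional download link

    def to_row(self) -> str:
        # Link the dataset name when a URL is given.
        label = f"[{self.name}]({self.url})" if self.url else self.name
        cells = (label, self.domain, self.size, self.kind, self.file_format)
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"


entry = DatasetEntry(
    name="Example Dataset",
    domain="Small molecules",
    size="10k structures",
    kind="Computational",
    file_format="CSV",
    url="https://example.org/data",  # placeholder link
)
print(entry.to_row())
```

Pasting the printed row under the matching table keeps the column order consistent with existing entries.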

Datasets

Computational Datasets

| Dataset | Domain | Size | Type | Format |
| --- | --- | --- | --- | --- |
| BOOM: Benchmarks for Out-Of-distribution Molecules | Small molecules | 10 Out-Of-Distribution Tasks (1M+ entries) | Computational | CSV |
| MSR-ACC/TAE25 | Small molecules | 77k CCSD(T)/CBS atomization energies | Computational | JSON |
| OMat24 (Meta) | Inorganic crystals | 110M DFT entries | Computational | JSON/HDF5 |
| OMol25 (Meta) | Molecular chemistry | 100M+ DFT calculations | Computational | LMDB |
| OMC25 | Molecular crystals | >27M structures | Computational | Zarr |
| Materials Project (LBL) | Inorganic crystals | 500k+ compounds | Computational | JSON/API |
| Open Catalyst 2020 (OC20) | Catalysis (surfaces) | 1.2M relaxations | Computational | JSON/HDF5 |
| AFLOW | Inorganic materials | 3.5M materials | Computational | REST API |
| OQMD | Inorganic solids | 1M+ compounds | Computational | SQL/CSV |
| JARVIS-DFT (NIST) | 3D/2D materials | 40k+ entries | Computational | JSON/API |
| Carolina Materials DB | Hypothetical crystals | 214k structures | Computational | JSON |
| NOMAD | Various DFT/MD | >19M calculations | Computational | JSON |
| MatPES | DFT Potential Energy Surfaces | ~400,000 structures from 300K MD simulations | Computational | JSON |
| Vector-QM24 | Small organic and inorganic molecules | 836k conformational isomers | Computational | JSON |
| AIMNet2 Dataset | Non-metallic compounds | 20M hybrid DFT calculations | Computational | JSON |
| RDB7 | Barrier height and enthalpy for small organic reactions | 12k CCSD(T)-F12 calculations | Computational | CSV |
| RDB19-Rad | ΔG of activation and of reaction for organic reactions in 40 common solvents | 5.6k DFT + COSMO-RS calculations | Computational | CSV |
| QCML | Small molecules consisting of up to 8 heavy atoms | 14.7B Semi-empirical + 33.5M DFT calculations | Computational | TFDS |
| QM9 | Small organic molecules | 134k molecules with quantum properties | Computational | SDF/CSV |
| QM7/QM7b | Small molecules | 7k molecules with atomization energies | Computational | SDF/CSV |
| QMugs | Drug-like molecules | 665k mol / 2M conf | Computational | HDF5 |
| C2DB | 2D materials | ~4,000 entries | Computational | JSON/API |
| ANI-1x / 1ccx | Small organic mol | 5M (DFT) + 0.5M (CCSD(T)) | Computational | HDF5 |
| CoRE MOF 2019 | Metal-organic frameworks | 14,763 structures | Computational | CIF/JSON |
| QMOF Database | Metal-organic frameworks | 20k+ structures (DFT) | Computational | CIF/JSON |
| Catalysis-Hub | Surface reactions | >100k energies | Computational | JSON/API |
| ODAC23 | MOF + CO₂/H₂O adsorption | 38M DFT calcs | Computational | HDF5 |
| MOFX-DB | Gas adsorption in MOFs | 3M isotherm pts | Computational | CSV/HDF5 |
| LeMat-Bulk | Inorganic materials (bulk) | 6.7M structures (5.9M materials) | Computational | HuggingFace Dataset |
| LeMat-Traj | Inorganic materials (trajectories) | 113M structures | Computational | HuggingFace Dataset |
| NeurIPS Open Polymer Prediction 2025 | Polymers | ~1,500 test polymers with MD-derived properties | Computational | CSV |
| Carbon Data | Carbon materials | 22.9M atoms, 546 trajectories | Computational | EXTXYZ |
| MSR-ACC/TAE25 | Small molecules (up to Ar) | 76,879 total atomization energies | Computational | HDF5/CSV |
| DFT Solvation Energy Dataset | Small molecules | 651,290 solvation energies in 5 solvents | Computational | CSV/JSON |
| MD Simulated Monomer Properties | Small molecules | 410 molecules with thermodynamic properties | Computational | CSV/JSON |
| Multimodal Spectroscopic Dataset | Molecular spectroscopy | 790k molecules with simulated spectra | Computational | HDF5/JSON |
| PubChemQCR | Small molecules (relaxation) | 3.5M trajectories / 300M conformations | Computational | HuggingFace Dataset |
| MP-ALOE | Universal MLIPs (89 elements) | ~1M r2SCAN DFT calculations | Computational | JSONL/MACE |
| Alexandria DB | Inorganic (1D–3D) | >5M DFT calcs (PBE) | Computational | JSON/OPTIMADE/LMDB |
| Quantum-Chemical Bonding DB (LOBSTER) | Solid-state bonding analysis | 1,520 compounds | Computational | JSON |
| MultixcQM9 (OpenQDC) | Small molecules (QM9, multi-XC) | 133k molecules | Computational | Torch/NumPy |
| SPICE (OpenQDC) | Drug-like molecules | 1M conformers (energies & forces) | Computational | Torch/ASE |
| Matbench v0.1 | Benchmarks (13 tasks) | 10 datasets | Benchmark/Comp | CSV/HDF5 |
| Matbench Discovery | Stability, κ, structures | Multiple files | Benchmark/Comp | CSV/ZIP |
| Materials Cloud Archives | Various DFT/MD workflows | 1,000+ datasets | Computational | HDF5/JSON/CIF |
| MS25 | MLIP benchmark (6 material systems) | Multi-system benchmark suite | Computational/Benchmark | HDF5 |
| RadonPy Polymer Properties Data | Polymer | ~1070 MD-calculated Properties | Computational | CSV |
| SHNITSEL Data | Organic Molecules | 418,870 Post-HF-calculated Ground- and Excited-states Properties | Computational | XARRAY |
| Frustrated Lewis Pairs Database | Small Molecules | 146 Metal-free FLPs | Computational | HTML |
| AQCat25 | Catalysis | 13.5M frames / 5K materials | Computational | Parquet/ASE DB |
| OMol25 Electronic Structures | Molecular chemistry | 4M+ calculations | Computational | Raw DFT outputs |
| Unrestricted CCSD(T) Dataset For Organic Molecule Reactions | Organic reactions | 3119 configurations | Computational | |
| MC-PDFT-OPESf | Reaction kinetics | Diels-Alder reaction | Computational | |
| Quantum Cluster Database | Nanoclusters | 63,015 clusters | Computational | CSV/JSON |
| The Cambridge Cluster Database | Mixed Clusters | Multiple Files | Computational | Multiple Types |
| Battery Electrolyte Solvation/Ionization | Organic molecules | Thousands of molecules | Computational | |

Experimental Datasets

| Dataset | Domain | Size | Type | Format |
| --- | --- | --- | --- | --- |
| Crystallography Open Database (COD) | Crystal structures | ~525k entries | Experimental | CIF/SMILES |
| NIST ICSD (subset) | Inorganic structures | ~290k structures | Experimental | CIF |
| CSD (Cambridge) | Organic crystals | ~1.3M structures | Experimental | CIF |
| opXRD | Crystal structures | 92552 (2179 labeled) | Experimental | JSON |
| MDR SuperCon | Superconductivity | legacy superconductor database w/ material composition, structure, properties, and processes | Mixed | |
| ChEMBL | Bioactive molecules | 2.3M+ compounds with bioactivity data | Experimental | JSON/SDF |
| MoleculeNet | Molecular properties | 700k+ compounds across 17 datasets | Mixed | CSV/SDF |
| ESOL | Aqueous solubility | 1,128 compounds with solubility data | Experimental | CSV |
| FreeSolv | Hydration free energy | 643 molecules with experimental data | Experimental | CSV |
| Lipophilicity | Octanol/water distribution | 4,200 compounds with logD values | Experimental | CSV |
| PCBA | Bioassay screening | 400k+ compounds, 128 bioassays | Experimental | CSV |
| HIV | Antiviral screening | 41k compounds with HIV inhibition data | Experimental | CSV |
| BACE | Beta-secretase inhibitors | 1,522 compounds with IC50 data | Experimental | CSV |
| BBBP | Blood-brain barrier permeability | 2,053 compounds with permeability data | Experimental | CSV |
| Tox21 | Toxicity screening | 8k compounds, 12 toxicity targets | Experimental | CSV |
| ToxCast | High-throughput toxicity | 8k compounds, 600+ assays | Experimental | CSV |
| SIDER | Drug side effects | 1,427 drugs with adverse reactions | Experimental | CSV |
| ClinTox | Clinical trial toxicity | 1,491 compounds with FDA approval status | Experimental | CSV |
| PDBbind | Protein-ligand binding | 19k complexes with binding affinities | Experimental | PDB/SDF |
| BindingDB | Protein-ligand binding | 2.8M+ binding data points | Experimental | CSV/SDF |
| ProtBENCH | Drug-target interactions | Protein family-specific datasets | Experimental | CSV |
| PDBench | Protein sequence design | 595 protein structures, 40 architectures | Experimental | PDB |
| PDB-Struct | Structure-based protein design | Comprehensive protein design benchmark | Experimental | PDB |
| HTEM-DB | Thin-film composition libraries | 140k+ samples | Experimental | JSON/API |
| OCx24 | Electrocatalyst inks | 572 samples (+DFT) | Experimental | CSV |
| Polymer Genome | Polymers | 20k polymers | Experimental + Comp | CSV/JSON |
| CoRE MOF 2024 | Metal-organic frameworks | 40k+ experimental MOFs | Experimental | CIF |
| SAIR | Protein-ligand binding | 1M+ complexes, 5.2M structures, 2.5TB | Experimental | 3D/CSV |
| Anion Solvation DB | Anion solvation | ~26k properties | Mixed | CSV |
| BigSolDB | Organic molecule solubility | ~54k exp. values | Experimental | CSV |
| StarryData2 | Experimental properties | Figshare dump (2023/2024) | Experimental | CSV/JSON |
| CRIPT Polymer Data | Polymers (synthesis, properties) | Growing community DB | Mixed | JSON/API |
| Catechol Benchmark | Solvent selection / Reaction yield | 1200+ process conditions | Experimental | CSV |
| Leeds Solubility Data | Solubility | 2.3k measurements | Experimental | CSV |
| BigSolDB 2.0 | Solubility | 103k+ values | Experimental | CSV/XLSX |
| OpenExp | Chemical reactions | 274k pairs | Experimental | Varies |
| Battery Imaging Library (BIL) | Battery imaging | 80+ scans, >500B voxels | Experimental | Various |

LLM Training Datasets

| Dataset | Domain | Size | Type | Format |
| --- | --- | --- | --- | --- |
| ChemPile | Chemistry | 75B+ tokens | LLM Training | Mixed |
| SmolInstruct | Small molecules | 3.3M samples | LLM Training | JSON |
| CAMEL | Chemistry | 20K problem-solution pairs | LLM Training | JSON |
| ChemNLP | Chemistry | Extensive, many combined datasets | LLM Training | JSON |
| ChemQA | Chemistry | Multimodal QA dataset | LLM Training | JSON |
| ChemLLMBench | Chemistry | 8 chemistry tasks benchmark | LLM Training | JSON |
| ChemistryQA | Chemistry | 4,500 questions across 200 topics | LLM Training | JSON |
| MaScQA | Materials Science | 640 QA pairs | LLM Training | XLSX |
| SciCode | Research Coding in Physics, Math, Material Science, Biology, and Chemistry | 338 subproblems | LLM Training | JSON |
| ChemData 700K | Chemistry (9 core tasks) | 730K Q-A instruction pairs | LLM Training | JSON |
| MatSci-Instruct (HoneyBee) | Materials science | ≈55K verified instructions | LLM Training | JSON |
| MoleculeQA | Molecular properties & safety | 62K multiple-choice QA pairs | LLM Training | JSON |
| BioInstruct 25K | Biomedical / biochemistry | 25K GPT-4 generated instructions | LLM Training | JSON |
| Lab-Bench | Biology | 2,400+ questions for biology agents | LLM Training | JSON |
| ChemBench 4K | Chemistry competency benchmark | 4,100 single-choice questions | LLM Training | JSON |
| GPQA Diamond | Biology, Physics, Chemistry | 448 multiple-choice questions | LLM Training | JSON |
| MaCBench | Chemistry and materials science | Vision-language tasks | LLM Training | JSON |
| ChemBench | Chemistry | 2,700+ question-answer pairs | LLM Training | JSON |
| MatText | Materials property prediction | 2M structures | LLM Training | HuggingFace Dataset |
| SciAssess | Scientific literature analysis | Benchmark for LLMs in science | LLM Training | JSON |
| ZINC20-ML | Drug-like molecules (SMILES) | ≈1B molecules | LLM Training | SMILES |
| PMC Open Access Subset | Biomedical full-text | 3.4M+ articles | LLM Training | XML |
| MatScholar Task-Schema QA (MatSci-NLP) | Materials science (7 NLP tasks) | Tens of thousands of examples | LLM Training | JSON |
| Mol-Instructions | Chemistry | molecular, protein, and biochemical instructions | LLM Training | HuggingFace Dataset |
| USPTO-LLM | Chemical reactions | 247K reactions | LLM Training | JSON/Graph |
| ChemRxivQuest | Chem literature QA | 970 QA pairs | LLM Training | JSON |
| USPTO-Lowe | Patent reactions | 1.8M reactions | Literature-mined | RXN/SMILES |
| MolTextNet | Small molecules with text | 2.5M molecule-text pairs | LLM Training | HuggingFace Dataset |
| MolOpt-Instructions | Molecule optimization | 1.18M instruction-based optimization tasks | LLM Training | HuggingFace Dataset |
| TextEdge | Crystal properties | Crystal text descriptions with properties | LLM Training | JSON |
| LAMBench-TrainingSet-v1 | Materials structures | 19.8M structures for Large Atom Models | LLM Training | Various |
| LLM4Mat | Materials property prediction | 1.9M crystal structures, 45 properties, 3 modalities | LLM Training | Various |
| LLM-EO | Transition metal complexes / Optimization | 1.37M TMC space explored | LLM Training | GitHub |
| Flavor Analysis and Recognition Transformer | Molecular taste prediction | Multi-class taste classification dataset | LLM Training | SMILES/JSON |
| SCQA (Solar Cell QA) | Solar cells | 47K QA pairs | LLM Training | JSON |
| ScienceQA | K–12 science, multimodal MCQs w/ lectures & explanations | 21,208 Qs | LLM Training/Eval | JSON |
| SciBench | College-level scientific problem solving (math/chem/phys) | Open & closed sets | LLM Eval | PDF/JSON |
| MegaScience | Scientific reasoning (7 disciplines) | 1.25M instances (650k reasoning questions from 12k textbooks) | LLM Training | HuggingFace Dataset |
| Mat-Instructions | Inorganic materials | ~30k instructions | LLM Training | JSON |
| Open Materials Guide (OMG) | Materials synthesis | 17K synthesis recipes | LLM Training | JSON |
| ChemDFM | Chemistry | 34B tokens / 2.7M instructions | LLM Training | HuggingFace |
| ChemTable | Chemistry Tables | Large-scale benchmark | LLM Training/Benchmark | JSON |
| ChemCoTBench | Molecular reasoning | Annotated datasets | LLM Training/Benchmark | HuggingFace Dataset |

Literature-mined & Text Datasets

| Dataset | Domain | Size | Type | Format |
| --- | --- | --- | --- | --- |
| PubChem | Molecules & data | 119M compounds | Literature | SMILES/SDF |
| Open Reaction Database (ORD) | Synthetic reactions | ~1M reactions | Experimental/Lit | JSON |
| PatCID (IBM) | Chemical image data | 81M images / 13M mols | Literature | PNG/SMILES |
| MatScholar | NLP corpus (materials) | 5M+ abstracts | Literature | JSON/Graph |
| Matbench (metadata/text tasks) | Text/meta ML tasks | 13 tasks | Literature/Benchmark | CSV |
| OpenQDC Hub | QM molecules & reactions | 1.5 B geometries | Literature/Computational | Python API/NPZ |
| L2M3 - Large Language Model MOF Miner | Metal-organic frameworks | from >40k articles | Literature-mined | CSV |

🌊 Computational Fluid Dynamics, PDE & Engineering Datasets

| Dataset | Domain | Size | Type | Format |
| --- | --- | --- | --- | --- |
| PDEBench | PDE solving / Scientific ML | Multiple datasets | Benchmark / Simulation | HDF5/PyTorch |
| BLASTNet | Fluid mechanics / Reacting flows | 17 TB | Simulation / CFD | HDF5/NPY |
| Johns Hopkins Turbulence DB (JHTDB) | DNS/LES turbulence (9 canonical flows) | ≈ 350 TB | Simulation | Web API / HDF5 cutouts |
| Airfoil CFD 2k | 1,830 airfoils × 25 AoA × 3 Re | ~6 GB (250 k cases) | Simulation | HDF5 |
| PDEArena (collection) | 2-D Navier–Stokes, Shallow-Water, 3-D Maxwell | ≈ 100 GB (4 datasets) | Simulation | Torch / HDF5 |
| WeatherBench 2 | Global weather reanalysis (ERA5, 1979-2023) | ≈ 5 TB | Reanalysis | NetCDF/Zarr |
| UT Austin Channel-DNS Suite | Incompressible channel flow Reτ 180 – 5200 | ≈ 10 TB | Simulation | Binary / ASCII |
| Compressible TPC DNS DB | Compressible channel flow (25 M, Reτ*) | ~2 GB | Simulation | TXT tables |
| Curated RANS ↔ DNS Dataset | 29 geometries, 4 RANS models w/ DNS/LES labels | 1.1 GB | Simulation | HDF5/CSV |
| NASA Common Research Model (CRM) | Aircraft CRM geom. + wind-tunnel & CFD results | Multi-GB | Mixed (Exp + Sim) | CAD / CSV / Tecplot |
| Darcy-Flow (FNO) | 2-D porous-media pressure fields (∇·k∇u = f) | ≈ 1 GB (10 k samples) | Simulation | HDF5 |
| HiFi-TURB LES/DNS | High-fidelity LES/DNS for complex 3D flows | Multi-case suite | Simulation (DNS/LES) | HDF5/NetCDF |
| NASA High Lift Prediction Workshop (HLPW) | High-lift aircraft configurations | Multi-GB | Mixed (exp + CFD) | CAD/CSV/Tecplot |
| High-Speed TBL DNS DB | Compressible turbulent boundary layers | DNS database | Simulation | HDF5 |
| ML Turbulence (Kaggle) | RANS Reynolds stress tensor data | ~GB scale | Benchmark/Simulation | CSV/HDF5 |

Proprietary Datasets (for reference)

| Dataset | Domain | Size | Use Case Notes |
| --- | --- | --- | --- |
| CAS Registry | Chemical substances | 250M+ substances | Industry standard for molecule indexing |
| Reaxys (Elsevier) | Reactions & properties | Millions of reactions | Rich curated literature reaction data |
| Citrine Informatics DB | Experimental materials | Private | Materials ML platform w/ industry data |
| CSD (Cambridge) | Organic crystals | 1.3M+ | Gold-standard X-ray structures |
| PoLyInfo | Polymers & properties | 500k+ data points / Experimental | Polymer properties from literature sources |

Dataset Resources

  • The Materials Data Facility - Over 100 TB of open materials data. #TODO list some of these in the tables above
  • Foundry-ML search Foundry - 61 structured datasets ready for download through a Python client #TODO list some of these in the tables above

TODO

  • Add all OpenQDC datasets https://www.openqdc.io/datasets
  • Add a dataset on solubilities of gases in polymers: 15,000 experimental measurements of the uptake of 79 gases (0.01–50 wt%) in 102 different polymers, at pressures from 1 × 10⁻³ to 7 × 10² bar and temperatures from 233 to 508 K, covering nearly 500 solvent–polymer systems. Optimized structures of various repeating units are included. Available here: Data
  • Add Materials Cloud Datasets
  • Classify Atomly. A bit challenging with non-English
  • Look into adding NOMAD for experimental data as well
  • Add A Quantum-Chemical Bonding Database for Solid-State Materials Part 1: https://zenodo.org/records/8091844 Part 2: https://zenodo.org/records/8092187
  • Add QM datasets. http://quantum-machine.org/datasets/
  • Find link for | ChemRxivQuest | Chemistry literature QA | 970 curated QA pairs | LLM Training | JSON | CC BY 4.0 | Open | ChemRxivQuest |
  • Find new link for USPTO-Reactions | USPTO Reactions | Organic reactions | 1.8M reactions | Literature | RXN/SMILES | Open | Open |
  • Find dataset link for | SciCUEval | Multidomain scientific comprehension (bio/chem/phys/matsci) | 10 sub-datasets | LLM Eval | JSON/PDF | Open | Open |
  • Find dataset for | MatSciKB | Materials science KB | 38.5k entries (20k papers, 3.6k Wikipedia, 1.9k textbooks, 10.5k datasets) | Literature | Structured text | Open | Open |


License

This project is licensed under the MIT License. Each dataset listed has its own license, noted in the table. Always check the source's license before using the data in your project.


Acknowledgements

The primary effort of Ben Blaiszik on this project was performed under financial assistance award 70NANB24H049 / MML24-1001 from the National Institute of Standards and Technology (NIST).

Thanks to the open data and research communities, including:

  • Meta AI FAIR
  • The Materials Data Facility / Foundry-ML
  • NIST JARVIS and Materials Project
  • LBL, MIT, CCDC, FIZ Karlsruhe
  • Contributors to Open Catalyst, PubChem, ORD, and AFLOW
  • Developers of open chemistry toolkits (RDKit, Open Babel)


Citation

If this repository was helpful in your work, feel free to cite or star the repo. You can also reference the underlying dataset publications linked above.

Changelog

This changelog is autogenerated, so it may contain errors.

October 2025

Added 18 new datasets focusing on catalysis, reaction kinetics, cluster chemistry, experimental solubility, literature mining, and foundation models to enhance resources for computational chemistry and machine learning applications.

🧮 Computational Datasets (7 datasets)
  • AQCat25: The AQCat25 dataset provides a large and diverse collection of 13.5 million DFT calculation trajectories, encompassing approximately 5K materials and 47K intermediate-catalyst systems. It is designed to complement existing large-scale datasets by providing calculations at higher fidelity and including critical spin-polarized systems, which are essential for accurately modeling many industrially relevant catalysts.
  • OMol25 Electronic Structures Dataset: The OMol25 Electronic Structures dataset includes the raw DFT outputs, electronic densities, wavefunctions, and molecular orbital information for over 4 million high-accuracy quantum chemical calculations. We see this as a transformative opportunity to develop higher quality partial charges, partial spins, and advanced electronic features to unlock the next generation of physics-informed ML models.
  • Unrestricted CCSD(T) Dataset For Organic Molecule Reactions: Dataset of 3119 organic molecule configurations at gold-standard quantum accuracy with automated workflows for unrestricted CCSD(T) calculations. Includes a transferable MLIP trained on UCCSD(T) data, showing significant improvements in force and activation energy accuracy.
  • MC-PDFT-OPESf: This work combines multi-configuration pair-density functional theory (MC-PDFT) as an accurate and efficient multireference electronic structure method with on-the-fly probability enhanced sampling flooding (OPESf) as an enhanced sampling method capable of accelerating reactive transitions. MC-PDFT–OPESf provides reaction rates in agreement with experiments at a fraction of the computational cost required by conventional unbiased ab-initio calculations.
  • Quantum Cluster Database: A database of 63015 low-energy atomically precise nanoclusters for 55 elements across the periodic table, including main group and transition metal elements.
  • Cambridge Cluster Database: A collection of results from global optimizations for a variety of cluster systems, including Lennard-Jones, metal, molecular, and ionic clusters. The database is continuously updated with new results from published papers.
  • Battery Electrolyte Solvation/Ionization: This dataset presents molecular properties critical for battery electrolyte design, specifically solvation energies, ionization potentials, and electron affinities for thousands of organic molecules from QM9, EGP, GDB17, and ZINC.
🧪 Experimental Datasets (5 datasets)
  • Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning, providing the first-ever transient flow dataset for machine learning benchmarking, covering over 1200 process conditions. This dataset focuses on solvent selection, a task that is particularly difficult to model theoretically.
  • BNNLab/Solubility_data: Leeds Solubility Data: Curated solubility data in organic solvents and water and descriptors for solubility prediction.
  • BigSolDB 2.0: A comprehensive dataset of 103,944 experimentally measured solubility values of 1,448 organic compounds in 213 solvents, reported in 1,595 peer-reviewed articles.
  • OpenExp: Features 274,439 pairs of chemical reactions and their corresponding step-by-step instructions of experimental procedures. The dataset was compiled from the USPTO-Applications and ORD databases.
  • Battery Imaging Library (BIL): An open, curated collection of multi-modal and multi-length scale battery imaging datasets, featuring over 80 scans and 500 billion voxels of data from single particles to full cells.
📚 LLM Training Datasets (5 datasets)
  • Mat-Instructions: A large-scale inorganic material instruction dataset with ~30k instruction-response pairs, designed to unlock the potential of LLMs in materials science.
  • Open Materials Guide (OMG): A dataset of 17K high-quality, expert-verified synthesis recipes from open-access literature, which forms the basis for the AlchemyBench benchmark for LLM-guided synthesis prediction.
  • ChemDFM: A pioneering LLM for chemistry trained on 34B tokens from chemical literature and textbooks, and fine-tuned using 2.7M instructions. As a result, it can understand and reason with chemical knowledge in free-form dialogue.
  • ChemTable: A large-scale benchmark of real-world chemical tables curated from the experimental sections of literature. ChemTable supports table recognition and table understanding tasks to advance scientific reasoning in chemistry.
  • ChemCoTBench: A reasoning framework that bridges molecular structure understanding with arithmetic-inspired operations to formalize chemical problem-solving into transparent, step-by-step workflows for tasks like molecular optimization and reaction prediction.
📖 Literature-mined & Text Datasets (1 dataset)
  • L2M3 (Large Language Model MOF Miner): A database of MOF synthesis conditions and properties extracted from over 40,000 research articles using LLMs, enabling analysis of synthesis-structure-property relationships.

August 2025

Enhanced scientific reasoning capabilities and machine learning interatomic potential benchmarking with 6 new high-quality datasets for AI scientists and materials researchers.

🧮 Computational Datasets (3 datasets)
  • OMC25: A collection of over 27 million molecular crystal structures containing 12 elements and up to 300 atoms in the unit cell. The dataset was generated from dispersion-inclusive density functional theory (DFT) relaxation trajectories of over 230,000 randomly generated molecular crystal structures of around 50,000 organic molecules.
  • MS25: Comprehensive benchmark dataset for evaluating machine learning interatomic potentials (MLIPs) across 6 diverse materials systems including MgO surfaces, liquid water, zeolites, catalytic Pt surface reactions, high-entropy alloys, and disordered Zr-oxides. Evaluates 5 MLIP architectures (MACE, NequIP, Allegro, MTP, Torch-ANI) with focus on derived physical observables beyond traditional energy/force metrics. Demonstrates that equivariant MLIPs offer 1.5-2× improvements over nonequivariant models in complex systems, while highlighting the importance of explicit validation of physical properties rather than relying solely on error metrics.
  • Frustrated Lewis Pairs Database: A database of 146 metal-free frustrated Lewis pairs (FLPs).
📚 LLM Training Datasets (4 datasets)
  • MaCBench: A comprehensive benchmark for evaluating how vision-language models handle real-world chemistry and materials science tasks across three core aspects: data extraction, experimental understanding, and results interpretation. Reveals fundamental limitations in spatial reasoning and cross-modal information synthesis in leading models.
  • ChemBench: A cutting-edge framework to evaluate the chemical knowledge and reasoning capabilities of large language models (LLMs). It includes over 2,700 curated question-answer pairs across diverse chemistry topics and uniquely encodes chemical semantics, enabling models to process and reason about molecules and equations.
  • MatText: A comprehensive benchmarking framework spanning multiple representations and model scales, which finds that LLMs consistently fail to capture coordinate information while excelling at category patterns. This geometric blindness persists regardless of model size, dataset scale, or text representation strategy.
  • MegaScience: Large-scale scientific reasoning dataset featuring 1.25 million high-quality instances across 7 scientific disciplines. Includes TextbookReasoning component with 650k reasoning questions extracted from 12,000 university-level textbooks, providing truthful reference answers for training AI scientists. Developed through systematic ablation studies and comprehensive evaluation across 15 benchmarks, demonstrating superior performance and training efficiency compared to existing open-source scientific datasets.

July 2025

Expanded the collection into new scientific domains with 31 new datasets, introducing benchmarks for physics-based machine learning, adding comprehensive quantum mechanics datasets, expanding materials science resources, and enhancing scientific evaluation benchmarks.

🌊 Computational Fluid Dynamics, PDE & Engineering Datasets (15 datasets)
  • PDEBench: A comprehensive benchmark suite for scientific machine learning featuring a wide range of Partial Differential Equations. It provides large, ready-to-use datasets for challenging physics problems, supporting both forward and inverse modeling.
  • BLASTNet: A 17 TB collection of high-fidelity fluid mechanics simulation datasets for ML applications in automotive, propulsion, and energy sectors. It includes code and pre-trained models for tasks like turbulence modeling and spatio-temporal prediction.
  • JHTDB: multi-terabyte DNS/LES portal with isotropic, channel, MHD, boundary-layer and atmospheric datasets.
  • Airfoil CFD 2k: DOE/NREL benchmark of 1,830 airfoil shapes with about 250 k RANS simulations in total; HDF5 + AWS mirror.
  • PDEArena: Hugging-Face org offering Navier–Stokes, Shallow-Water & Maxwell tensors; MIT license.
  • WeatherBench 2: ERA5-derived Zarr cubes for data-driven medium-range forecasting; MIT.
  • UT Austin DNS Suite: public HTTP server with Reτ 180–5200 channel data & statistics.
  • Compressible TPC DNS DB: 25 Reynolds–Mach cases, plain-text statistics (Mendeley Data).
  • Curated RANS ↔ DNS: Scientific Data descriptor + Kaggle mirror for ML turbulence closures.
  • NASA CRM: open CAD, grids, wind-tunnel Cp & force/moment datasets for the community benchmark.
  • Darcy Flow (FNO): canonical permeability→pressure dataset used in FNO/PINO papers.
  • HiFi-TURB LES/DNS: EU-funded project providing high-fidelity Large Eddy Simulation and Direct Numerical Simulation datasets for complex 3D turbulent flows, supporting advanced turbulence modeling and AI/ML applications in computational fluid dynamics.
  • NASA High Lift Prediction Workshop (HLPW): Multi-phase workshop datasets featuring high-lift aircraft configurations with comprehensive experimental validation data, CAD geometries, and CFD solutions for aerodynamic modeling and validation.
  • High-Speed TBL DNS DB: Specialized database of Direct Numerical Simulation data for compressible turbulent boundary layers, providing detailed flow field information for high-speed aerodynamic applications and turbulence model development.
  • ML Turbulence (Kaggle): Community-contributed dataset featuring RANS Reynolds stress tensor data with ground truth labels, providing a standardized benchmark for machine learning approaches to turbulence modeling.
🧮 Computational Datasets (7 datasets)
  • PubChemQCR: A massive dataset of molecular relaxation trajectories for ~3.5 million small molecules, containing over 300 million conformations with energy and force labels. It is the largest public dataset of its kind, designed to accelerate the development of machine learning interatomic potentials (MLIPs).
  • MP-ALOE: Nearly 1 million DFT calculations using the accurate r2SCAN meta-generalized gradient approximation, covering 89 elements. Created using active learning and primarily consisting of off-equilibrium structures, MP-ALOE is designed for training universal machine learning interatomic potentials (UMLIPs) with strong performance on thermochemical properties, force prediction, and physical soundness under extreme conditions.
  • Alexandria DB: Massive computational materials database containing over 5 million DFT calculations using PBE functional for 1D-3D inorganic materials. Provides OPTIMADE-compliant API access and LMDB format for high-performance materials screening and property prediction workflows.
  • Quantum-Chemical Bonding DB (LOBSTER): Specialized dataset providing detailed bonding analysis for 1,520 solid-state compounds using LOBSTER methodology. Enables understanding of chemical bonding in crystalline materials through projected crystal orbital Hamilton populations and related descriptors.
  • MultixcQM9 & SPICE (OpenQDC): Enhanced quantum chemistry datasets within the OpenQDC framework. MultixcQM9 provides multi-exchange correlation functional data for 133k small molecules, while SPICE offers 1 million conformers with energies and forces for drug-like molecules, both optimized for machine learning applications.
  • Matbench v0.1 & Discovery: Comprehensive benchmarking suites for materials property prediction featuring 13 standardized tasks across 10 datasets. Matbench Discovery specifically targets stability prediction, thermal conductivity, and structure generation with rigorous evaluation protocols.
  • Materials Cloud Archives: Centralized repository of over 1,000 computational datasets from various DFT and molecular dynamics workflows. Provides standardized access to diverse materials science calculations with comprehensive metadata and version control.
📚 LLM Training Datasets (5 datasets)
  • LLM-EO (Evolutionary Optimization): A framework that integrates LLMs into evolutionary algorithms for optimizing transition metal complexes. This approach leverages the chemical knowledge of LLMs to surpass traditional genetic algorithms, enabling flexible, multi-objective optimization without complex mathematical formulations.
  • Flavor Analysis and Recognition Transformer: A dataset and ChemBERTa-based transformer model for predicting molecular taste from chemical structure. The model classifies molecules into four taste categories (sweet, bitter, sour, umami) with >91% accuracy, supports interpretability through gradient-based visualizations, and targets applications in flavor compound discovery and rational food design.
  • SCQA (Solar Cell QA): Domain-specific question-answering dataset containing 47,268 QA pairs about solar cell properties, auto-generated using ChemDataExtractor. Fine-tuning language models on this dataset achieves F1-scores exceeding general-English QA datasets by 10-20%, demonstrating the value of domain-specific training data for specialized scientific applications.
  • ScienceQA: Comprehensive K-12 science education dataset with 21,208 multimodal multiple-choice questions including lectures and explanations. Supports development of educational AI systems and scientific reasoning capabilities in language models.
  • SciBench: College-level scientific problem-solving benchmark covering mathematics, chemistry, and physics with both open and closed evaluation sets. Enables systematic assessment of LLM performance on advanced scientific reasoning tasks.
🧪 Experimental Datasets (4 datasets)
  • Anion Solvation DB: Comprehensive compilation of 26,000+ solvation properties including 8,241 experimental pKa values across 8 solvents, 5,536 computed gas-phase acidities, and over 12,000 solvation energies for anions and neutral compounds computed using COSMO-RS. Bridges experimental and computational approaches for understanding anion behavior in different solvation environments.
  • BigSolDB: Extensive experimental solubility database containing 54,273 measured solubility values across temperature range 243.15-403.15 K in various organic solvents and water. Features diverse chemical space coverage with interactive t-SNE exploration tool and comprehensive statistical analysis for QSPR model development.
  • StarryData2: Large-scale experimental properties dataset from Figshare spanning 2023-2024, providing comprehensive experimental measurements across diverse materials and chemical systems for machine learning model validation and training.
  • CRIPT Polymer Data: Community-driven polymer database featuring synthesis procedures, characterization data, and properties. Enables standardized data sharing and collaborative research in polymer science through structured JSON API access.

June 2025

Added 28 new high-quality datasets spanning polymer science, drug discovery, carbon materials, spectroscopy, MOF databases, foundation model training, and materials knowledge bases:

🧮 Computational Datasets (15 datasets)
  • NeurIPS Open Polymer Prediction 2025: Kaggle competition dataset for predicting 5 key polymer properties (Tg, FFV, Tc, density, Rg) from SMILES structures using MD simulation ground truth. Includes ~1,500 test polymers.
  • Carbon Data: 22.9 million atom dataset with synthetic energy labels from C-GAP-17 potential, featuring 546 carbon trajectories across diverse densities and temperatures. Captures nanotubes, graphitic films, diamond, and amorphous carbon environments.
  • MSR-ACC/TAE25: Microsoft Research's comprehensive dataset of 76,879 total atomization energies computed at CCSD(T)/CBS level using W1-F12 protocol. Exhaustively covers chemical space for elements up to argon with sub-chemical accuracy (±1 kcal/mol).
  • DFT Solvation Energy Dataset: 651,290 computed solvation energies for 130,258 molecules from QM9 dataset across 5 solvents (acetone, ethanol, acetonitrile, DMSO, water). Achieves 0.5 kcal/mol MAE for small molecules with accompanying ML models and web interface.
  • MD Simulated Monomer Properties: GPU-accelerated molecular dynamics dataset of thermodynamic properties for 410 molecules, generated through active learning pipeline. Includes validation against experimental data and automated simulation workflow.
  • Multimodal Spectroscopic Dataset: Comprehensive spectroscopic dataset with simulated 1H-NMR, 13C-NMR, HSQC-NMR, Infrared, and Mass spectra for 790k molecules from patent reactions. Enables multimodal foundation model development for structure elucidation and functional group prediction.
  • QMugs: 665k drug-like molecules with ~2M conformers, featuring quantum mechanical properties at both semi-empirical (GFN2-xTB) and DFT (ωB97X-D/def2-SVP) levels.
  • C2DB (Computational 2D Materials Database): ~4,000 two-dimensional materials with computed structural, electronic, magnetic, and optical properties.
  • ANI-1x / ANI-1ccx: 5 million DFT and 500k CCSD(T) calculations for organic molecules, supporting machine learning potential development.
  • CoRE MOF 2019: 14,763 computation-ready metal-organic frameworks with solvent and charge balancing, suitable for high-throughput screening.
  • QMOF Database: Comprehensive database of quantum-chemical properties for 20,000+ metal-organic frameworks derived from high-throughput periodic density functional theory calculations.
  • Catalysis-Hub Surface Reactions: Over 100,000 adsorption and reaction energies on catalytic surfaces, accessible via a Python/GraphQL API.
  • ODAC23 (Open DAC 2023): 38 million DFT calculations of CO₂/H₂O adsorption on 8,400 MOFs, aimed at direct-air-capture sorbent discovery.
  • MOFX-DB: Over 3 million simulated adsorption data points across 160,000 MOFs and 286 zeolites for various gases.
  • Enhanced QCML dataset entry with more comprehensive description of coverage and properties
🧪 Experimental Datasets (5 datasets)
  • SAIR (Structurally Augmented IC50 Repository): Largest public protein–ligand binding dataset with over 1 million complexes and 5.2 million cofolded 3D structures (2.5TB total). Combines experimental binding affinities from ChEMBL/BindingDB with Boltz-1x predicted structures.
  • CoRE MOF 2024: Updated database of over 40,000 experimentally reported metal-organic frameworks from literature through early 2024. Includes pre-computed material properties for high-throughput material-process screening and carbon-capture applications.
  • HTEM-DB (High-Throughput Experimental Materials Database): More than 140,000 composition–process–property data points from combinatorial sputtering experiments, with optical, electrical, and structural measurements.
  • OCx24 (Open Catalyst Experiments 2024): 572 synthesized catalyst inks evaluated with matched XRF/XRD and DFT adsorption energies, bridging the gap between simulation and laboratory data.
  • Khazana / Polymer Genome: Approximately 20,000 polymers with DFT-calculated properties and experimental dielectric data, supporting machine learning on soft materials.
📚 LLM Training Datasets (5 datasets)
  • MolTextNet: 2.5 million high-quality molecule-text pairs from ChEMBL35, featuring GPT-4o-mini generated descriptions 10x longer than existing datasets. Integrates structural features, computed properties, bioactivity data, and synthetic complexity for multimodal molecular modeling.
  • MolOpt-Instructions: 1.18 million instruction-based molecule optimization tasks for fine-tuning LLMs on drug discovery. Supports interactive human-machine dialogue for molecule optimization through the DrugAssist framework, enabling expert feedback integration and iterative refinement.
  • TextEdge: Benchmark dataset for predicting crystal properties from natural language text descriptions. Demonstrates superior performance of LLM-based approaches over traditional GNN methods, with improvements of 8% on band gap prediction and 65% on unit cell volume prediction.
  • LAMBench-TrainingSet-v1: Massive training dataset for Large Atom Models (LAMs) containing 19.8 million valid structures from the OpenLAM Initiative. Includes 1 million structures on the convex hull for advancing generative modeling and materials science applications.
  • LLM4Mat: Comprehensive benchmark dataset for evaluating LLMs in materials property prediction, containing 1.9M crystal structures from 10 data sources with 45 distinct properties. Features three input modalities (crystal composition, CIF, text description) with 4.7M, 615.5M, and 3.1B tokens respectively.
📖 Literature-mined & Text Datasets (3 datasets)
  • MatSciKB: Comprehensive materials science knowledge base with 38,469 curated entries across 16 categories. Integrates ArXiv papers (20,384), Wikipedia articles (3,620), textbooks (1,930), datasets (10,473), formulas (57), and GPT-generated examples (2,005) with efficient CRUD operations for research applications.
  • ChemRxivQuest: 970 curated question–answer pairs spanning 17 chemistry subfields, designed for retrieval-augmented generation and factuality assessments.
  • USPTO-Lowe Reactions (1976–2016): 1.8 million atom-mapped reactions extracted from US patents, serving as a benchmark for reaction prediction and retrosynthesis models.
📚 Enhanced Literature & Benchmark Resources (2 datasets)
  • Matbench (metadata/text tasks): Extended benchmarking suite providing 13 standardized tasks for text-based and metadata-driven materials property prediction. Enables systematic evaluation of natural language processing approaches in materials science applications.
  • OpenQDC Hub: Comprehensive quantum chemistry database aggregating 1.5 billion molecular geometries and quantum mechanical properties. Provides unified Python API access to diverse quantum chemistry datasets with standardized formats for large-scale machine learning applications.

Earlier Updates

For changes made earlier than the changelog entries, please see the repository commit history.

Credit by: @github.com/blaiszik/awesome-matchem-datasets

Best-of Machine Learning with Python

🏆  A ranked list of awesome machine learning Python libraries. Updated weekly.

This curated list contains 920 awesome open-source projects with a total of 5M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!


🧙‍♂️  Discover other best-of lists or create your own.
📫  Subscribe to our newsletter for updates and trending projects.


Explanation

  • 🥇🥈🥉  Combined project-quality score
  • ⭐️  Star count from GitHub
  • 🐣  New project (less than 6 months old)
  • 💤  Inactive project (6 months no activity)
  • 💀  Dead project (12 months no activity)
  • 📈📉  Project is trending up or down
  • ➕  Project was recently added
  • ❗️  Warning (e.g. missing/risky license)
  • 👨‍💻  Contributors count from GitHub
  • 🔀  Fork count from GitHub
  • 📋  Issue count from GitHub
  • ⏱️  Last update timestamp on package manager
  • 📥  Download count from package manager
  • 📦  Number of dependent projects
  •   Tensorflow related project
  •   Sklearn related project
  •   PyTorch related project
  •   MxNet related project
  •   Apache Spark related project
  •   Jupyter related project
  •   PaddlePaddle related project
  •   Pandas related project
  •   Jax related project
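
The age and activity markers in the legend can be read as simple date thresholds. The following is a minimal sketch of that reading, not the list's actual scoring code: the function name `project_flags`, the 30-day month approximation, and the choice to let "dead" take precedence over "inactive" are all assumptions made for illustration.

```python
from datetime import date

# Assumption of this sketch: a month is approximated as 30 days.
DAYS_PER_MONTH = 30


def project_flags(created: date, last_activity: date, today: date) -> list:
    """Return legend markers for a project (hypothetical helper).

    Thresholds follow the legend: new (< 6 months old),
    inactive (6 months no activity), dead (12 months no activity).
    """
    flags = []
    age_months = (today - created).days / DAYS_PER_MONTH
    idle_months = (today - last_activity).days / DAYS_PER_MONTH

    if age_months < 6:
        flags.append("new")
    # Assumption: a dead project is reported as dead only, not also inactive.
    if idle_months >= 12:
        flags.append("dead")
    elif idle_months >= 6:
        flags.append("inactive")
    return flags


if __name__ == "__main__":
    today = date(2025, 5, 22)
    print(project_flags(date(2025, 3, 1), date(2025, 5, 1), today))
    print(project_flags(date(2020, 1, 1), date(2024, 9, 1), today))
    print(project_flags(date(2020, 1, 1), date(2024, 1, 1), today))
```

The reference date is passed in explicitly rather than read from the clock, so the same inputs always give the same markers.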


Machine Learning Frameworks

General-purpose machine learning and deep learning frameworks.

Tensorflow (🥇56 · ⭐ 190K) - An Open Source Machine Learning Framework for Everyone. Apache-2 - [GitHub](https://github.com/tensorflow/tensorflow) (👨‍💻 4.9K · 🔀 75K · 📦 520K · 📋 41K - 3% open · ⏱️ 22.05.2025):
git clone https://github.com/tensorflow/tensorflow
- [PyPi](https://pypi.org/project/tensorflow) (📥 22M / month · 📦 8.9K · ⏱️ 12.03.2025):
pip install tensorflow
- [Conda](https://anaconda.org/conda-forge/tensorflow) (📥 5.5M · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow
- [Docker Hub](https://hub.docker.com/r/tensorflow/tensorflow) (📥 80M · ⭐ 2.8K · ⏱️ 22.05.2025):
docker pull tensorflow/tensorflow
PyTorch (🥇55 · ⭐ 90K) - Tensors and Dynamic neural networks in Python with strong GPU.. BSD-3 - [GitHub](https://github.com/pytorch/pytorch) (👨‍💻 5.6K · 🔀 24K · 📥 88K · 📦 780K · 📋 52K - 31% open · ⏱️ 22.05.2025):
git clone https://github.com/pytorch/pytorch
- [PyPi](https://pypi.org/project/torch) (📥 49M / month · 📦 25K · ⏱️ 23.04.2025):
pip install torch
- [Conda](https://anaconda.org/pytorch/pytorch) (📥 27M · ⏱️ 25.03.2025):
conda install -c pytorch pytorch
scikit-learn (🥇53 · ⭐ 62K) - scikit-learn: machine learning in Python. BSD-3 - [GitHub](https://github.com/scikit-learn/scikit-learn) (👨‍💻 3.3K · 🔀 26K · 📥 1.1K · 📦 1.3M · 📋 12K - 17% open · ⏱️ 22.05.2025):
git clone https://github.com/scikit-learn/scikit-learn
- [PyPi](https://pypi.org/project/scikit-learn) (📥 100M / month · 📦 30K · ⏱️ 09.05.2025):
pip install scikit-learn
- [Conda](https://anaconda.org/conda-forge/scikit-learn) (📥 36M · ⏱️ 09.05.2025):
conda install -c conda-forge scikit-learn
Keras (🥇47 · ⭐ 63K) - Deep Learning for humans. Apache-2 - [GitHub](https://github.com/keras-team/keras) (👨‍💻 1.4K · 🔀 20K · 📋 13K - 2% open · ⏱️ 21.05.2025):
git clone https://github.com/keras-team/keras
- [PyPi](https://pypi.org/project/keras) (📥 16M / month · 📦 1.9K · ⏱️ 19.05.2025):
pip install keras
- [Conda](https://anaconda.org/conda-forge/keras) (📥 4.1M · ⏱️ 20.05.2025):
conda install -c conda-forge keras
XGBoost (🥇46 · ⭐ 27K) - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or.. Apache-2 - [GitHub](https://github.com/dmlc/xgboost) (👨‍💻 660 · 🔀 8.8K · 📥 16K · 📦 160K · 📋 5.5K - 8% open · ⏱️ 20.05.2025):
git clone https://github.com/dmlc/xgboost
- [PyPi](https://pypi.org/project/xgboost) (📥 25M / month · 📦 2.5K · ⏱️ 13.05.2025):
pip install xgboost
- [Conda](https://anaconda.org/conda-forge/xgboost) (📥 6.1M · ⏱️ 15.05.2025):
conda install -c conda-forge xgboost
jax (🥇45 · ⭐ 32K) - Composable transformations of Python+NumPy programs: differentiate,.. Apache-2 - [GitHub](https://github.com/jax-ml/jax) (👨‍💻 880 · 🔀 3K · 📦 44K · 📋 6.2K - 23% open · ⏱️ 22.05.2025):
git clone https://github.com/jax-ml/jax
- [PyPi](https://pypi.org/project/jax) (📥 7.5M / month · 📦 2.5K · ⏱️ 21.05.2025):
pip install jax
- [Conda](https://anaconda.org/conda-forge/jaxlib) (📥 2.7M · ⏱️ 17.05.2025):
conda install -c conda-forge jaxlib
PaddlePaddle (🥇45 · ⭐ 23K) - PArallel Distributed Deep LEarning: Machine Learning.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/Paddle) (👨‍💻 1.4K · 🔀 5.7K · 📥 15K · 📦 8.2K · 📋 20K - 9% open · ⏱️ 22.05.2025):
git clone https://github.com/PaddlePaddle/Paddle
- [PyPi](https://pypi.org/project/paddlepaddle) (📥 390K / month · 📦 230 · ⏱️ 26.03.2025):
pip install paddlepaddle
PySpark (🥈44 · ⭐ 41K) - Apache Spark Python API. Apache-2 - [GitHub](https://github.com/apache/spark) (👨‍💻 3.2K · 🔀 29K · ⏱️ 22.05.2025):
git clone https://github.com/apache/spark
- [PyPi](https://pypi.org/project/pyspark) (📥 46M / month · 📦 1.9K · ⏱️ 27.02.2025):
pip install pyspark
- [Conda](https://anaconda.org/conda-forge/pyspark) (📥 3.9M · ⏱️ 22.04.2025):
conda install -c conda-forge pyspark
pytorch-lightning (🥈43 · ⭐ 30K) - Pretrain, finetune ANY AI model of ANY size on.. Apache-2 - [GitHub](https://github.com/Lightning-AI/pytorch-lightning) (👨‍💻 1K · 🔀 3.5K · 📥 13K · 📦 46K · 📋 7.3K - 12% open · ⏱️ 20.05.2025):
git clone https://github.com/Lightning-AI/pytorch-lightning
- [PyPi](https://pypi.org/project/pytorch-lightning) (📥 9.5M / month · 📦 1.7K · ⏱️ 25.04.2025):
pip install pytorch-lightning
- [Conda](https://anaconda.org/conda-forge/pytorch-lightning) (📥 1.5M · ⏱️ 28.04.2025):
conda install -c conda-forge pytorch-lightning
StatsModels (🥈43 · ⭐ 11K) - Statsmodels: statistical modeling and econometrics in Python. BSD-3 - [GitHub](https://github.com/statsmodels/statsmodels) (👨‍💻 460 · 🔀 3.1K · 📥 35 · 📦 170K · 📋 5.7K - 50% open · ⏱️ 06.05.2025):
git clone https://github.com/statsmodels/statsmodels
- [PyPi](https://pypi.org/project/statsmodels) (📥 17M / month · 📦 4.5K · ⏱️ 03.10.2024):
pip install statsmodels
- [Conda](https://anaconda.org/conda-forge/statsmodels) (📥 19M · ⏱️ 22.04.2025):
conda install -c conda-forge statsmodels
LightGBM (🥈42 · ⭐ 17K) - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT,.. MIT - [GitHub](https://github.com/microsoft/LightGBM) (👨‍💻 330 · 🔀 3.9K · 📥 290K · 📦 53K · 📋 3.6K - 12% open · ⏱️ 22.05.2025):
git clone https://github.com/microsoft/LightGBM
- [PyPi](https://pypi.org/project/lightgbm) (📥 11M / month · 📦 1.4K · ⏱️ 15.02.2025):
pip install lightgbm
- [Conda](https://anaconda.org/conda-forge/lightgbm) (📥 3.6M · ⏱️ 22.04.2025):
conda install -c conda-forge lightgbm
Catboost (🥈42 · ⭐ 8.4K) - A fast, scalable, high performance Gradient Boosting on Decision.. Apache-2 - [GitHub](https://github.com/catboost/catboost) (👨‍💻 1.3K · 🔀 1.2K · 📥 390K · 📦 18 · 📋 2.4K - 25% open · ⏱️ 22.05.2025):
git clone https://github.com/catboost/catboost
- [PyPi](https://pypi.org/project/catboost) (📥 2.6M / month · 📦 650 · ⏱️ 13.04.2025):
pip install catboost
- [Conda](https://anaconda.org/conda-forge/catboost) (📥 2M · ⏱️ 19.05.2025):
conda install -c conda-forge catboost
Fastai (🥈41 · ⭐ 27K) - The fastai deep learning library. Apache-2 - [GitHub](https://github.com/fastai/fastai) (👨‍💻 670 · 🔀 7.6K · 📦 23K · 📋 1.8K - 13% open · ⏱️ 19.04.2025):
git clone https://github.com/fastai/fastai
- [PyPi](https://pypi.org/project/fastai) (📥 610K / month · 📦 330 · ⏱️ 18.04.2025):
pip install fastai
PyFlink (🥈40 · ⭐ 25K) - Apache Flink Python API. Apache-2 - [GitHub](https://github.com/apache/flink) (👨‍💻 2K · 🔀 14K · 📦 21 · ⏱️ 22.05.2025):
git clone https://github.com/apache/flink
- [PyPi](https://pypi.org/project/apache-flink) (📥 7.2M / month · 📦 35 · ⏱️ 12.02.2025):
pip install apache-flink
einops (🥈37 · ⭐ 8.9K) - Flexible and powerful tensor operations for readable and reliable code.. MIT - [GitHub](https://github.com/arogozhnikov/einops) (👨‍💻 33 · 🔀 360 · 📦 76K · 📋 200 - 17% open · ⏱️ 25.04.2025):
git clone https://github.com/arogozhnikov/einops
- [PyPi](https://pypi.org/project/einops) (📥 9.6M / month · 📦 2.6K · ⏱️ 09.02.2025):
pip install einops
- [Conda](https://anaconda.org/conda-forge/einops) (📥 380K · ⏱️ 22.04.2025):
conda install -c conda-forge einops
Flax (🥈37 · ⭐ 6.6K) - Flax is a neural network library for JAX that is designed for.. Apache-2 - [GitHub](https://github.com/google/flax) (👨‍💻 260 · 🔀 690 · 📥 61 · 📦 14K · 📋 1.2K - 33% open · ⏱️ 21.05.2025):
git clone https://github.com/google/flax
- [PyPi](https://pypi.org/project/flax) (📥 1.5M / month · 📦 610 · ⏱️ 23.04.2025):
pip install flax
- [Conda](https://anaconda.org/conda-forge/flax) (📥 100K · ⏱️ 22.04.2025):
conda install -c conda-forge flax
Ignite (🥈37 · ⭐ 4.7K) - High-level library to help with training and evaluating neural.. BSD-3 - [GitHub](https://github.com/pytorch/ignite) (👨‍💻 860 · 🔀 650 · 📦 3.8K · 📋 1.4K - 10% open · ⏱️ 07.05.2025):
git clone https://github.com/pytorch/ignite
- [PyPi](https://pypi.org/project/pytorch-ignite) (📥 180K / month · 📦 110 · ⏱️ 22.05.2025):
pip install pytorch-ignite
- [Conda](https://anaconda.org/pytorch/ignite) (📥 230K · ⏱️ 30.03.2025):
conda install -c pytorch ignite
Jina (🥈35 · ⭐ 22K) - Build multimodal AI applications with cloud-native stack. Apache-2 - [GitHub](https://github.com/jina-ai/serve) (👨‍💻 180 · 🔀 2.2K · 📋 1.9K - 0% open · ⏱️ 24.03.2025):
git clone https://github.com/jina-ai/serve
- [PyPi](https://pypi.org/project/jina) (📥 46K / month · 📦 29 · ⏱️ 24.03.2025):
pip install jina
- [Conda](https://anaconda.org/conda-forge/jina-core) (📥 95K · ⏱️ 22.04.2025):
conda install -c conda-forge jina-core
- [Docker Hub](https://hub.docker.com/r/jinaai/jina) (📥 1.8M · ⭐ 8 · ⏱️ 24.03.2025):
docker pull jinaai/jina
Thinc (🥈34 · ⭐ 2.8K) - A refreshing functional take on deep learning, compatible with your favorite.. MIT - [GitHub](https://github.com/explosion/thinc) (👨‍💻 67 · 🔀 280 · 📥 1.3K · 📦 68K · 📋 150 - 12% open · ⏱️ 07.03.2025):
git clone https://github.com/explosion/thinc
- [PyPi](https://pypi.org/project/thinc) (📥 18M / month · 📦 160 · ⏱️ 04.04.2025):
pip install thinc
- [Conda](https://anaconda.org/conda-forge/thinc) (📥 3.5M · ⏱️ 22.04.2025):
conda install -c conda-forge thinc
ivy (🥈33 · ⭐ 14K) - Convert Machine Learning Code Between Frameworks. Apache-2 - [GitHub](https://github.com/ivy-llc/ivy) (👨‍💻 1.5K · 🔀 5.7K · 📋 17K - 5% open · ⏱️ 29.04.2025):
git clone https://github.com/ivy-llc/ivy
- [PyPi](https://pypi.org/project/ivy) (📥 21K / month · 📦 16 · ⏱️ 21.02.2025):
pip install ivy
Vowpal Wabbit (🥈33 · ⭐ 8.6K · 💤) - Vowpal Wabbit is a machine learning system which pushes the.. BSD-3 - [GitHub](https://github.com/VowpalWabbit/vowpal_wabbit) (👨‍💻 340 · 🔀 1.9K · 📦 2 · 📋 1.3K - 10% open · ⏱️ 01.08.2024):
git clone https://github.com/VowpalWabbit/vowpal_wabbit
- [PyPi](https://pypi.org/project/vowpalwabbit) (📥 22K / month · 📦 40 · ⏱️ 08.08.2024):
pip install vowpalwabbit
- [Conda](https://anaconda.org/conda-forge/vowpalwabbit) (📥 360K · ⏱️ 22.04.2025):
conda install -c conda-forge vowpalwabbit
mlpack (🥈33 · ⭐ 5.4K) - mlpack: a fast, header-only C++ machine learning library. BSD-3 - [GitHub](https://github.com/mlpack/mlpack) (👨‍💻 330 · 🔀 1.7K · 📋 1.7K - 1% open · ⏱️ 22.05.2025):
git clone https://github.com/mlpack/mlpack
- [PyPi](https://pypi.org/project/mlpack) (📥 4.3K / month · 📦 6 · ⏱️ 15.05.2025):
pip install mlpack
- [Conda](https://anaconda.org/conda-forge/mlpack) (📥 360K · ⏱️ 22.04.2025):
conda install -c conda-forge mlpack
Ludwig (🥉32 · ⭐ 11K · 💤) - Low-code framework for building custom LLMs, neural networks,.. Apache-2 - [GitHub](https://github.com/ludwig-ai/ludwig) (👨‍💻 160 · 🔀 1.2K · 📦 310 · 📋 1.1K - 4% open · ⏱️ 17.10.2024):
git clone https://github.com/ludwig-ai/ludwig
- [PyPi](https://pypi.org/project/ludwig) (📥 1.4K / month · 📦 6 · ⏱️ 30.07.2024):
pip install ludwig
Sonnet (🥉32 · ⭐ 9.8K) - TensorFlow-based neural network library. Apache-2 - [GitHub](https://github.com/google-deepmind/sonnet) (👨‍💻 61 · 🔀 1.3K · 📦 1.4K · 📋 190 - 16% open · ⏱️ 14.02.2025):
git clone https://github.com/google-deepmind/sonnet
- [PyPi](https://pypi.org/project/dm-sonnet) (📥 19K / month · 📦 19 · ⏱️ 02.01.2024):
pip install dm-sonnet
- [Conda](https://anaconda.org/conda-forge/sonnet) (📥 43K · ⏱️ 22.04.2025):
conda install -c conda-forge sonnet
skorch (🥉32 · ⭐ 6K) - A scikit-learn compatible neural network library that wraps.. BSD-3 - [GitHub](https://github.com/skorch-dev/skorch) (👨‍💻 67 · 🔀 390 · 📦 1.7K · 📋 540 - 12% open · ⏱️ 24.04.2025):
git clone https://github.com/skorch-dev/skorch
- [PyPi](https://pypi.org/project/skorch) (📥 110K / month · 📦 94 · ⏱️ 10.01.2025):
pip install skorch
- [Conda](https://anaconda.org/conda-forge/skorch) (📥 800K · ⏱️ 22.04.2025):
conda install -c conda-forge skorch
tensorflow-upstream (🥉32 · ⭐ 690) - TensorFlow ROCm port. Apache-2 - [GitHub](https://github.com/ROCm/tensorflow-upstream) (👨‍💻 4.9K · 🔀 99 · 📥 29 · 📋 390 - 4% open · ⏱️ 21.05.2025):
git clone https://github.com/ROCm/tensorflow-upstream
- [PyPi](https://pypi.org/project/tensorflow-rocm) (📥 12K / month · 📦 9 · ⏱️ 10.01.2024):
pip install tensorflow-rocm
Haiku (🥉31 · ⭐ 3K) - JAX-based neural network library. Apache-2 - [GitHub](https://github.com/google-deepmind/dm-haiku) (👨‍💻 88 · 🔀 250 · 📦 2.5K · 📋 250 - 29% open · ⏱️ 01.05.2025):
git clone https://github.com/google-deepmind/dm-haiku
- [PyPi](https://pypi.org/project/dm-haiku) (📥 220K / month · 📦 190 · ⏱️ 22.04.2025):
pip install dm-haiku
- [Conda](https://anaconda.org/conda-forge/dm-haiku) (📥 34K · ⏱️ 23.04.2025):
conda install -c conda-forge dm-haiku
Determined (🥉29 · ⭐ 3.1K) - Determined is an open-source machine learning platform.. Apache-2 - [GitHub](https://github.com/determined-ai/determined) (👨‍💻 120 · 🔀 360 · 📥 13K · 📋 450 - 22% open · ⏱️ 20.03.2025):
git clone https://github.com/determined-ai/determined
- [PyPi](https://pypi.org/project/determined) (📥 200K / month · 📦 4 · ⏱️ 19.03.2025):
pip install determined
Geomstats (🥉28 · ⭐ 1.4K) - Computations and statistics on manifolds with geometric structures. MIT - [GitHub](https://github.com/geomstats/geomstats) (👨‍💻 95 · 🔀 250 · 📦 140 · 📋 570 - 36% open · ⏱️ 22.05.2025):
git clone https://github.com/geomstats/geomstats
- [PyPi](https://pypi.org/project/geomstats) (📥 5K / month · 📦 12 · ⏱️ 09.09.2024):
pip install geomstats
- [Conda](https://anaconda.org/conda-forge/geomstats) (📥 6.5K · ⏱️ 22.04.2025):
conda install -c conda-forge geomstats
NuPIC (🥉27 · ⭐ 6.3K) - Numenta Platform for Intelligent Computing is an implementation of.. MIT - [GitHub](https://github.com/numenta/nupic-legacy) (👨‍💻 120 · 🔀 1.6K · 📥 22 · 📦 21 · 📋 1.8K - 25% open · ⏱️ 03.12.2024):
git clone https://github.com/numenta/nupic-legacy
- [PyPi](https://pypi.org/project/nupic) (📥 930 / month · ⏱️ 01.09.2016):
pip install nupic
pyRiemann (🥉27 · ⭐ 680) - Machine learning for multivariate data through the Riemannian.. BSD-3 - [GitHub](https://github.com/pyRiemann/pyRiemann) (👨‍💻 37 · 🔀 170 · 📦 470 · 📋 110 - 2% open · ⏱️ 19.05.2025):
git clone https://github.com/pyRiemann/pyRiemann
- [PyPi](https://pypi.org/project/pyriemann) (📥 51K / month · 📦 28 · ⏱️ 12.02.2025):
pip install pyriemann
- [Conda](https://anaconda.org/conda-forge/pyriemann) (📥 13K · ⏱️ 22.04.2025):
conda install -c conda-forge pyriemann
Neural Network Libraries (🥉26 · ⭐ 2.7K) - Neural Network Libraries. Apache-2 - [GitHub](https://github.com/sony/nnabla) (👨‍💻 76 · 🔀 330 · 📥 1K · 📋 95 - 36% open · ⏱️ 15.11.2024):
git clone https://github.com/sony/nnabla
- [PyPi](https://pypi.org/project/nnabla) (📥 5.1K / month · 📦 44 · ⏱️ 29.05.2024):
pip install nnabla
ktrain (🥉26 · ⭐ 1.3K · 💤) - ktrain is a Python library that makes deep learning and AI.. Apache-2 - [GitHub](https://github.com/amaiya/ktrain) (👨‍💻 17 · 🔀 270 · 📦 570 · 📋 500 - 0% open · ⏱️ 09.07.2024):
git clone https://github.com/amaiya/ktrain
- [PyPi](https://pypi.org/project/ktrain) (📥 5.4K / month · 📦 4 · ⏱️ 19.06.2024):
pip install ktrain
fklearn (🥉24 · ⭐ 1.5K) - fklearn: Functional Machine Learning. Apache-2 - [GitHub](https://github.com/nubank/fklearn) (👨‍💻 56 · 🔀 170 · 📦 16 · 📋 64 - 60% open · ⏱️ 23.04.2025):
git clone https://github.com/nubank/fklearn
- [PyPi](https://pypi.org/project/fklearn) (📥 1.4K / month · ⏱️ 26.02.2025):
pip install fklearn
Towhee (🥉23 · ⭐ 3.4K · 💤) - Towhee is a framework that is dedicated to making neural data.. Apache-2 - [GitHub](https://github.com/towhee-io/towhee) (👨‍💻 38 · 🔀 260 · 📥 2.7K · 📋 670 - 0% open · ⏱️ 18.10.2024):
git clone https://github.com/towhee-io/towhee
- [PyPi](https://pypi.org/project/towhee) (📥 4.5K / month · ⏱️ 04.12.2023):
pip install towhee
Runhouse (🥉22 · ⭐ 1K) - Distribute and run AI workloads magically in Python, like PyTorch for.. Apache-2 - [GitHub](https://github.com/run-house/runhouse) (👨‍💻 16 · 🔀 36 · 📥 69 · 📋 51 - 17% open · ⏱️ 03.04.2025):
git clone https://github.com/run-house/runhouse
- [PyPi](https://pypi.org/project/runhouse) (📥 19K / month · 📦 1 · ⏱️ 10.03.2025):
pip install runhouse
chefboost (🥉20 · ⭐ 480) - A Lightweight Decision Tree Framework supporting regular algorithms:.. MIT - [GitHub](https://github.com/serengil/chefboost) (👨‍💻 7 · 🔀 100 · 📦 71 · ⏱️ 31.03.2025):
git clone https://github.com/serengil/chefboost
- [PyPi](https://pypi.org/project/chefboost) (📥 4.5K / month · ⏱️ 30.10.2024):
pip install chefboost
NeoML (🥉19 · ⭐ 780) - Machine learning framework for both deep learning and traditional.. Apache-2 - [GitHub](https://github.com/neoml-lib/neoml) (👨‍💻 40 · 🔀 130 · 📦 2 · 📋 91 - 40% open · ⏱️ 03.05.2025):
git clone https://github.com/neoml-lib/neoml
- [PyPi](https://pypi.org/project/neoml) (📥 690 / month · ⏱️ 26.12.2023):
pip install neoml
ThunderGBM (🥉18 · ⭐ 700) - ThunderGBM: Fast GBDTs and Random Forests on GPUs. Apache-2 - [GitHub](https://github.com/Xtra-Computing/thundergbm) (👨‍💻 12 · 🔀 87 · 📦 4 · 📋 81 - 48% open · ⏱️ 19.03.2025):
git clone https://github.com/Xtra-Computing/thundergbm
- [PyPi](https://pypi.org/project/thundergbm) (📥 180 / month · ⏱️ 19.09.2022):
pip install thundergbm
Show 24 hidden projects...
- dlib (🥈40 · ⭐ 14K) - A toolkit for making real world machine learning and data analysis.. ❗️BSL-1.0
- MXNet (🥈39 · ⭐ 21K · 💀) - Lightweight, Portable, Flexible Distributed/Mobile Deep.. Apache-2
- Theano (🥈38 · ⭐ 9.9K · 💀) - Theano was a Python library that allows you to define, optimize, and.. BSD-3
- Chainer (🥈34 · ⭐ 5.9K · 💀) - A flexible framework of neural networks for deep learning. MIT
- MindsDB (🥈33 · ⭐ 29K) - AIs query engine - Platform for building AI that can learn and.. ❗️ICU
- tensorpack (🥈33 · ⭐ 6.3K · 💀) - A Neural Net Training Interface on TensorFlow, with.. Apache-2
- Turi Create (🥉32 · ⭐ 11K · 💀) - Turi Create simplifies the development of custom machine.. BSD-3
- TFlearn (🥉31 · ⭐ 9.6K · 💀) - Deep learning library featuring a higher-level API for TensorFlow. MIT
- dyNET (🥉31 · ⭐ 3.4K · 💀) - DyNet: The Dynamic Neural Network Toolkit. Apache-2
- CNTK (🥉29 · ⭐ 18K · 💀) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit. MIT
- Lasagne (🥉28 · ⭐ 3.9K · 💀) - Lightweight library to build and train neural networks in Theano. MIT
- SHOGUN (🥉27 · ⭐ 3K · 💀) - Unified and efficient Machine Learning. BSD-3
- EvaDB (🥉27 · ⭐ 2.7K · 💀) - Database system for AI-powered apps. Apache-2
- xLearn (🥉25 · ⭐ 3.1K · 💀) - High performance, easy-to-use, and scalable machine learning (ML).. Apache-2
- NeuPy (🥉25 · ⭐ 740 · 💀) - NeuPy is a Tensorflow based python library for prototyping and building.. MIT
- neon (🥉23 · ⭐ 3.9K · 💀) - Intel Nervana reference deep learning framework committed to best.. Apache-2
- ThunderSVM (🥉22 · ⭐ 1.6K · 💀) - ThunderSVM: A Fast SVM Library on GPUs and CPUs. Apache-2
- mace (🥉21 · ⭐ 5K · 💀) - MACE is a deep learning inference framework optimized for mobile.. Apache-2
- Neural Tangents (🥉21 · ⭐ 2.3K · 💀) - Fast and Easy Infinite Neural Networks in Python. Apache-2
- Torchbearer (🥉21 · ⭐ 640 · 💀) - torchbearer: A model fitting library for PyTorch. MIT
- Objax (🥉19 · ⭐ 770 · 💀) - Objax is a machine learning framework that provides an Object.. Apache-2
- elegy (🥉19 · ⭐ 480 · 💀) - A High Level API for Deep Learning in JAX. MIT
- StarSpace (🥉16 · ⭐ 4K · 💀) - Learning embeddings for classification, retrieval and ranking. MIT
- nanodl (🥉13 · ⭐ 290 · 💤) - A Jax-based library for building transformers, includes.. MIT


Data Visualization

General-purpose and task-specific data visualization libraries.

Matplotlib (🥇49 · ⭐ 21K) - matplotlib: plotting with Python. ❗Unlicensed - [GitHub](https://github.com/matplotlib/matplotlib) (👨‍💻 1.8K · 🔀 7.9K · 📦 1.8M · 📋 11K - 14% open · ⏱️ 21.05.2025):
git clone https://github.com/matplotlib/matplotlib
- [PyPi](https://pypi.org/project/matplotlib) (📥 87M / month · 📦 60K · ⏱️ 08.05.2025):
pip install matplotlib
- [Conda](https://anaconda.org/conda-forge/matplotlib) (📥 30M · ⏱️ 15.05.2025):
conda install -c conda-forge matplotlib
dash (🥇46 · ⭐ 22K) - Data Apps & Dashboards for Python. No JavaScript Required. MIT - [GitHub](https://github.com/plotly/dash) (👨‍💻 180 · 🔀 2.1K · 📥 88 · 📦 86K · 📋 2K - 27% open · ⏱️ 06.05.2025):
git clone https://github.com/plotly/dash
- [PyPi](https://pypi.org/project/dash) (📥 5.1M / month · 📦 1.6K · ⏱️ 24.04.2025):
pip install dash
- [Conda](https://anaconda.org/conda-forge/dash) (📥 1.8M · ⏱️ 27.04.2025):
conda install -c conda-forge dash
Plotly (🥇46 · ⭐ 17K) - The interactive graphing library for Python. MIT - [GitHub](https://github.com/plotly/plotly.py) (👨‍💻 290 · 🔀 2.6K · 📥 240 · 📦 430K · 📋 3.2K - 20% open · ⏱️ 20.05.2025):
git clone https://github.com/plotly/plotly.py
- [PyPi](https://pypi.org/project/plotly) (📥 24M / month · 📦 8.3K · ⏱️ 20.05.2025):
pip install plotly
- [Conda](https://anaconda.org/conda-forge/plotly) (📥 9.6M · ⏱️ 21.05.2025):
conda install -c conda-forge plotly
- [npm](https://www.npmjs.com/package/plotlywidget) (📥 44K / month · 📦 9 · ⏱️ 12.01.2021):
npm install plotlywidget
Bokeh (🥇45 · ⭐ 20K) - Interactive Data Visualization in the browser, from Python. BSD-3 - [GitHub](https://github.com/bokeh/bokeh) (👨‍💻 710 · 🔀 4.2K · 📦 100K · 📋 8K - 10% open · ⏱️ 22.05.2025):
git clone https://github.com/bokeh/bokeh
- [PyPi](https://pypi.org/project/bokeh) (📥 3.4M / month · 📦 2K · ⏱️ 12.05.2025):
pip install bokeh
- [Conda](https://anaconda.org/conda-forge/bokeh) (📥 17M · ⏱️ 12.05.2025):
conda install -c conda-forge bokeh
Seaborn (🥇43 · ⭐ 13K · 📈) - Statistical data visualization in Python. BSD-3 - [GitHub](https://github.com/mwaskom/seaborn) (👨‍💻 220 · 🔀 1.9K · 📥 470 · 📦 670K · 📋 2.6K - 6% open · ⏱️ 26.01.2025):
git clone https://github.com/mwaskom/seaborn
- [PyPi](https://pypi.org/project/seaborn) (📥 24M / month · 📦 11K · ⏱️ 25.01.2024):
pip install seaborn
- [Conda](https://anaconda.org/conda-forge/seaborn) (📥 13M · ⏱️ 22.04.2025):
conda install -c conda-forge seaborn
Altair (🥇40 · ⭐ 9.8K · 📉) - Declarative visualization library for Python. BSD-3 - [GitHub](https://github.com/vega/altair) (👨‍💻 180 · 🔀 800 · 📥 230 · 📦 230K · 📋 2.1K - 6% open · ⏱️ 22.04.2025):
git clone https://github.com/vega/altair
- [PyPi](https://pypi.org/project/altair) (📥 29M / month · 📦 920 · ⏱️ 23.11.2024):
pip install altair
- [Conda](https://anaconda.org/conda-forge/altair) (📥 2.7M · ⏱️ 22.04.2025):
conda install -c conda-forge altair
FiftyOne (🥈39 · ⭐ 9.5K) - Visualize, create, and debug image and video datasets.. Apache-2 - [GitHub](https://github.com/voxel51/fiftyone) (👨‍💻 150 · 🔀 630 · 📦 980 · 📋 1.7K - 33% open · ⏱️ 21.05.2025):
git clone https://github.com/voxel51/fiftyone
- [PyPi](https://pypi.org/project/fiftyone) (📥 110K / month · 📦 27 · ⏱️ 09.05.2025):
pip install fiftyone
PyVista (🥈39 · ⭐ 3.1K) - 3D plotting and mesh analysis through a streamlined interface for.. MIT - [GitHub](https://github.com/pyvista/pyvista) (👨‍💻 180 · 🔀 560 · 📥 890 · 📦 4.8K · 📋 1.9K - 36% open · ⏱️ 22.05.2025):
git clone https://github.com/pyvista/pyvista
- [PyPi](https://pypi.org/project/pyvista) (📥 470K / month · 📦 710 · ⏱️ 13.05.2025):
pip install pyvista
- [Conda](https://anaconda.org/conda-forge/pyvista) (📥 690K · ⏱️ 13.05.2025):
conda install -c conda-forge pyvista
pandas-profiling (🥈38 · ⭐ 13K) - 1 Line of code data quality profiling & exploratory.. MIT - [GitHub](https://github.com/ydataai/ydata-profiling) (👨‍💻 140 · 🔀 1.7K · 📥 320 · 📦 6.6K · 📋 840 - 30% open · ⏱️ 26.03.2025):
git clone https://github.com/ydataai/ydata-profiling
- [PyPi](https://pypi.org/project/pandas-profiling) (📥 360K / month · 📦 180 · ⏱️ 03.02.2023):
pip install pandas-profiling
- [Conda](https://anaconda.org/conda-forge/pandas-profiling) (📥 510K · ⏱️ 22.04.2025):
conda install -c conda-forge pandas-profiling
HoloViews (🥈38 · ⭐ 2.8K · 📈) - With Holoviews, your data visualizes itself. BSD-3 - [GitHub](https://github.com/holoviz/holoviews) (👨‍💻 150 · 🔀 410 · 📦 16K · 📋 3.4K - 31% open · ⏱️ 21.05.2025):
git clone https://github.com/holoviz/holoviews
- [PyPi](https://pypi.org/project/holoviews) (📥 500K / month · 📦 430 · ⏱️ 31.03.2025):
pip install holoviews
- [Conda](https://anaconda.org/conda-forge/holoviews) (📥 2.1M · ⏱️ 22.04.2025):
conda install -c conda-forge holoviews
- [npm](https://www.npmjs.com/package/@pyviz/jupyterlab_pyviz) (📥 170 / month · 📦 5 · ⏱️ 14.01.2025):
npm install @pyviz/jupyterlab_pyviz
pyecharts (🥈37 · ⭐ 15K) - Python Echarts Plotting Library. MIT - [GitHub](https://github.com/pyecharts/pyecharts) (👨‍💻 45 · 🔀 2.9K · 📥 73 · 📦 5.4K · 📋 1.9K - 0% open · ⏱️ 26.01.2025):
git clone https://github.com/pyecharts/pyecharts
- [PyPi](https://pypi.org/project/pyecharts) (📥 250K / month · 📦 220 · ⏱️ 24.01.2025):
pip install pyecharts
PyQtGraph (🥈37 · ⭐ 4.1K) - Fast data visualization and GUI tools for scientific / engineering.. MIT - [GitHub](https://github.com/pyqtgraph/pyqtgraph) (👨‍💻 300 · 🔀 1.1K · 📦 12K · 📋 1.4K - 32% open · ⏱️ 08.04.2025):
git clone https://github.com/pyqtgraph/pyqtgraph
- [PyPi](https://pypi.org/project/pyqtgraph) (📥 400K / month · 📦 1K · ⏱️ 29.04.2024):
pip install pyqtgraph
- [Conda](https://anaconda.org/conda-forge/pyqtgraph) (📥 710K · ⏱️ 22.04.2025):
conda install -c conda-forge pyqtgraph
plotnine (🥈36 · ⭐ 4.2K) - A Grammar of Graphics for Python. MIT - [GitHub](https://github.com/has2k1/plotnine) (👨‍💻 110 · 🔀 230 · 📦 12K · 📋 720 - 10% open · ⏱️ 22.05.2025):
git clone https://github.com/has2k1/plotnine
- [PyPi](https://pypi.org/project/plotnine) (📥 2.5M / month · 📦 380 · ⏱️ 19.05.2025):
pip install plotnine
- [Conda](https://anaconda.org/conda-forge/plotnine) (📥 470K · ⏱️ 22.04.2025):
conda install -c conda-forge plotnine
Graphviz (🥈36 · ⭐ 1.7K · 💤) - Simple Python interface for Graphviz. MIT - [GitHub](https://github.com/xflr6/graphviz) (👨‍💻 23 · 🔀 210 · 📦 91K · 📋 190 - 6% open · ⏱️ 13.05.2024):
git clone https://github.com/xflr6/graphviz
- [PyPi](https://pypi.org/project/graphviz) (📥 24M / month · 📦 2.9K · ⏱️ 21.03.2024):
pip install graphviz
- [Conda](https://anaconda.org/anaconda/python-graphviz) (📥 54K · ⏱️ 22.04.2025):
conda install -c anaconda python-graphviz
VisPy (🥈35 · ⭐ 3.4K) - High-performance interactive 2D/3D data visualization library. BSD-3 - [GitHub](https://github.com/vispy/vispy) (👨‍💻 200 · 🔀 620 · 📦 2K · 📋 1.5K - 25% open · ⏱️ 20.05.2025):
git clone https://github.com/vispy/vispy
- [PyPi](https://pypi.org/project/vispy) (📥 160K / month · 📦 200 · ⏱️ 19.05.2025):
pip install vispy
- [Conda](https://anaconda.org/conda-forge/vispy) (📥 810K · ⏱️ 19.05.2025):
conda install -c conda-forge vispy
- [npm](https://www.npmjs.com/package/vispy) (📥 21 / month · 📦 3 · ⏱️ 15.03.2020):
npm install vispy
Perspective (🥈34 · ⭐ 9.2K) - A data visualization and analytics component, especially.. Apache-2 - [GitHub](https://github.com/finos/perspective) (👨‍💻 100 · 🔀 1.2K · 📥 11K · 📦 180 · 📋 880 - 12% open · ⏱️ 15.05.2025):
git clone https://github.com/finos/perspective
- [PyPi](https://pypi.org/project/perspective-python) (📥 18K / month · 📦 30 · ⏱️ 01.05.2025):
pip install perspective-python
- [Conda](https://anaconda.org/conda-forge/perspective) (📥 2.1M · ⏱️ 07.05.2025):
conda install -c conda-forge perspective
- [npm](https://www.npmjs.com/package/@finos/perspective-jupyterlab) (📥 360 / month · 📦 6 · ⏱️ 01.05.2025):
npm install @finos/perspective-jupyterlab
cartopy (🥈34 · ⭐ 1.5K) - Cartopy - a cartographic python library with matplotlib support. BSD-3 - [GitHub](https://github.com/SciTools/cartopy) (👨‍💻 140 · 🔀 380 · 📦 7.6K · 📋 1.3K - 24% open · ⏱️ 15.05.2025):
git clone https://github.com/SciTools/cartopy
- [PyPi](https://pypi.org/project/cartopy) (📥 520K / month · 📦 720 · ⏱️ 08.10.2024):
pip install cartopy
- [Conda](https://anaconda.org/conda-forge/cartopy) (📥 4.8M · ⏱️ 22.04.2025):
conda install -c conda-forge cartopy
UMAP (🥈33 · ⭐ 7.8K) - Uniform Manifold Approximation and Projection. BSD-3 - [GitHub](https://github.com/lmcinnes/umap) (👨‍💻 140 · 🔀 830 · 📦 1 · 📋 850 - 59% open · ⏱️ 12.05.2025):
git clone https://github.com/lmcinnes/umap
- [PyPi](https://pypi.org/project/umap-learn) (📥 1.5M / month · 📦 1.1K · ⏱️ 28.10.2024):
pip install umap-learn
- [Conda](https://anaconda.org/conda-forge/umap-learn) (📥 3M · ⏱️ 22.04.2025):
conda install -c conda-forge umap-learn
datashader (🥈33 · ⭐ 3.4K) - Quickly and accurately render even the largest data. BSD-3 - [GitHub](https://github.com/holoviz/datashader) (👨‍💻 62 · 🔀 370 · 📦 6.1K · 📋 600 - 23% open · ⏱️ 08.05.2025):
git clone https://github.com/holoviz/datashader
- [PyPi](https://pypi.org/project/datashader) (📥 180K / month · 📦 250 · ⏱️ 08.05.2025):
pip install datashader
- [Conda](https://anaconda.org/conda-forge/datashader) (📥 1.5M · ⏱️ 08.05.2025):
conda install -c conda-forge datashader
lets-plot (🥈33 · ⭐ 1.7K) - Multiplatform plotting library based on the Grammar of Graphics. MIT - [GitHub](https://github.com/JetBrains/lets-plot) (👨‍💻 21 · 🔀 53 · 📥 3.2K · 📦 180 · 📋 690 - 23% open · ⏱️ 21.05.2025):
git clone https://github.com/JetBrains/lets-plot
- [PyPi](https://pypi.org/project/lets-plot) (📥 98K / month · 📦 15 · ⏱️ 28.03.2025):
pip install lets-plot
wordcloud (🥈32 · ⭐ 10K) - A little word cloud generator in Python. MIT - [GitHub](https://github.com/amueller/word_cloud) (👨‍💻 73 · 🔀 2.3K · 📦 21 · 📋 560 - 24% open · ⏱️ 12.04.2025):
git clone https://github.com/amueller/word_cloud
- [PyPi](https://pypi.org/project/wordcloud) (📥 1.8M / month · 📦 550 · ⏱️ 10.11.2024):
pip install wordcloud
- [Conda](https://anaconda.org/conda-forge/wordcloud) (📥 670K · ⏱️ 22.04.2025):
conda install -c conda-forge wordcloud
hvPlot (🥈32 · ⭐ 1.2K) - A high-level plotting API for pandas, dask, xarray, and networkx built.. BSD-3 - [GitHub](https://github.com/holoviz/hvplot) (👨‍💻 51 · 🔀 110 · 📦 7.1K · 📋 900 - 42% open · ⏱️ 21.05.2025):
git clone https://github.com/holoviz/hvplot
- [PyPi](https://pypi.org/project/hvplot) (📥 210K / month · 📦 240 · ⏱️ 30.04.2025):
pip install hvplot
- [Conda](https://anaconda.org/conda-forge/hvplot) (📥 770K · ⏱️ 01.05.2025):
conda install -c conda-forge hvplot
mpld3 (🥉31 · ⭐ 2.4K · 💤) - An interactive data visualization tool which brings matplotlib.. BSD-3 - [GitHub](https://github.com/mpld3/mpld3) (👨‍💻 53 · 🔀 360 · 📦 7.4K · 📋 370 - 59% open · ⏱️ 30.10.2024):
git clone https://github.com/mpld3/mpld3
- [PyPi](https://pypi.org/project/mpld3) (📥 350K / month · 📦 150 · ⏱️ 23.12.2023):
pip install mpld3
- [Conda](https://anaconda.org/conda-forge/mpld3) (📥 230K · ⏱️ 22.04.2025):
conda install -c conda-forge mpld3
- [npm](https://www.npmjs.com/package/mpld3) (📥 1.2K / month · 📦 9 · ⏱️ 23.12.2023):
npm install mpld3
D-Tale (🥉30 · ⭐ 4.9K) - Visualizer for pandas data structures. ❗️LGPL-2.1 - [GitHub](https://github.com/man-group/dtale) (👨‍💻 30 · 🔀 420 · 📦 1.5K · 📋 610 - 11% open · ⏱️ 20.03.2025):
git clone https://github.com/man-group/dtale
- [PyPi](https://pypi.org/project/dtale) (📥 140K / month · 📦 53 · ⏱️ 20.03.2025):
pip install dtale
- [Conda](https://anaconda.org/conda-forge/dtale) (📥 430K · ⏱️ 22.04.2025):
conda install -c conda-forge dtale
bqplot (🥉30 · ⭐ 3.7K · 💤) - Plotting library for IPython/Jupyter notebooks. Apache-2 - [GitHub](https://github.com/bqplot/bqplot) (👨‍💻 65 · 🔀 470 · 📦 61 · 📋 640 - 42% open · ⏱️ 22.10.2024):
git clone https://github.com/bqplot/bqplot
- [PyPi](https://pypi.org/project/bqplot) (📥 210K / month · 📦 110 · ⏱️ 21.05.2025):
pip install bqplot
- [Conda](https://anaconda.org/conda-forge/bqplot) (📥 1.6M · ⏱️ 22.04.2025):
conda install -c conda-forge bqplot
- [npm](https://www.npmjs.com/package/bqplot) (📥 1.6K / month · 📦 21 · ⏱️ 24.12.2024):
npm install bqplot
HyperTools (🥉27 · ⭐ 1.8K) - A Python toolbox for gaining geometric insights into high-dimensional.. MIT - [GitHub](https://github.com/ContextLab/hypertools) (👨‍💻 23 · 🔀 160 · 📥 71 · 📦 500 · 📋 200 - 34% open · ⏱️ 24.04.2025):
git clone https://github.com/ContextLab/hypertools
- [PyPi](https://pypi.org/project/hypertools) (📥 400 / month · 📦 2 · ⏱️ 12.02.2022):
pip install hypertools
AutoViz (🥉27 · ⭐ 1.8K · 💤) - Automatically Visualize any dataset, any size with a single line.. Apache-2 - [GitHub](https://github.com/AutoViML/AutoViz) (👨‍💻 17 · 🔀 200 · 📦 870 · 📋 98 - 2% open · ⏱️ 10.06.2024):
git clone https://github.com/AutoViML/AutoViz
- [PyPi](https://pypi.org/project/autoviz) (📥 15K / month · 📦 11 · ⏱️ 10.06.2024):
pip install autoviz
- [Conda](https://anaconda.org/conda-forge/autoviz) (📥 86K · ⏱️ 22.04.2025):
conda install -c conda-forge autoviz
openTSNE (🥉27 · ⭐ 1.5K · 💤) - Extensible, parallel implementations of t-SNE. BSD-3 - [GitHub](https://github.com/pavlin-policar/openTSNE) (👨‍💻 13 · 🔀 170 · 📦 1.1K · 📋 140 - 9% open · ⏱️ 24.10.2024):
git clone https://github.com/pavlin-policar/openTSNE
- [PyPi](https://pypi.org/project/opentsne) (📥 39K / month · 📦 47 · ⏱️ 13.08.2024):
pip install opentsne
- [Conda](https://anaconda.org/conda-forge/opentsne) (📥 440K · ⏱️ 22.04.2025):
conda install -c conda-forge opentsne
Plotly-Resampler (🥉27 · ⭐ 1.1K) - Visualize large time series data with plotly.py. MIT - [GitHub](https://github.com/predict-idlab/plotly-resampler) (👨‍💻 14 · 🔀 72 · 📦 2K · 📋 180 - 32% open · ⏱️ 07.04.2025):
git clone https://github.com/predict-idlab/plotly-resampler
- [PyPi](https://pypi.org/project/plotly-resampler) (📥 470K / month · 📦 31 · ⏱️ 07.04.2025):
pip install plotly-resampler
- [Conda](https://anaconda.org/conda-forge/plotly-resampler) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge plotly-resampler
Chartify (🥉25 · ⭐ 3.6K · 💤) - Python library that makes it easy for data scientists to create.. Apache-2 - [GitHub](https://github.com/spotify/chartify) (👨‍💻 27 · 🔀 320 · 📦 83 · 📋 83 - 61% open · ⏱️ 16.10.2024):
git clone https://github.com/spotify/chartify
- [PyPi](https://pypi.org/project/chartify) (📥 1.2K / month · 📦 9 · ⏱️ 16.10.2024):
pip install chartify
- [Conda](https://anaconda.org/conda-forge/chartify) (📥 38K · ⏱️ 22.04.2025):
conda install -c conda-forge chartify
data-validation (🥉25 · ⭐ 770) - Library for exploring and validating machine learning.. Apache-2 - [GitHub](https://github.com/tensorflow/data-validation) (👨‍💻 27 · 🔀 180 · 📥 980 · 📋 190 - 20% open · ⏱️ 12.05.2025):
git clone https://github.com/tensorflow/data-validation
- [PyPi](https://pypi.org/project/tensorflow-data-validation) (📥 140K / month · 📦 31 · ⏱️ 15.10.2024):
pip install tensorflow-data-validation
python-ternary (🥉25 · ⭐ 760 · 💤) - Ternary plotting library for python with matplotlib. MIT - [GitHub](https://github.com/marcharper/python-ternary) (👨‍💻 28 · 🔀 160 · 📥 36 · 📦 220 · 📋 140 - 24% open · ⏱️ 12.06.2024):
git clone https://github.com/marcharper/python-ternary
- [PyPi](https://pypi.org/project/python-ternary) (📥 17K / month · 📦 32 · ⏱️ 17.02.2021):
pip install python-ternary
- [Conda](https://anaconda.org/conda-forge/python-ternary) (📥 100K · ⏱️ 22.04.2025):
conda install -c conda-forge python-ternary
PyWaffle (🥉22 · ⭐ 600 · 💤) - Make Waffle Charts in Python. MIT - [GitHub](https://github.com/gyli/PyWaffle) (👨‍💻 6 · 🔀 110 · 📦 530 · 📋 22 - 27% open · ⏱️ 16.06.2024):
git clone https://github.com/gyli/PyWaffle
- [PyPi](https://pypi.org/project/pywaffle) (📥 14K / month · 📦 6 · ⏱️ 16.06.2024):
pip install pywaffle
- [Conda](https://anaconda.org/conda-forge/pywaffle) (📥 16K · ⏱️ 22.04.2025):
conda install -c conda-forge pywaffle
vega (🥉22 · ⭐ 380) - IPython/Jupyter notebook module for Vega and Vega-Lite. BSD-3 - [GitHub](https://github.com/vega/ipyvega) (👨‍💻 15 · 🔀 65 · 📦 4 · 📋 110 - 14% open · ⏱️ 01.01.2025):
git clone https://github.com/vega/ipyvega
- [PyPi](https://pypi.org/project/vega) (📥 18K / month · 📦 17 · ⏱️ 25.09.2024):
pip install vega
- [Conda](https://anaconda.org/conda-forge/vega) (📥 740K · ⏱️ 22.04.2025):
conda install -c conda-forge vega
Popmon (🥉20 · ⭐ 500) - Monitor the stability of a Pandas or Spark dataframe. MIT - [GitHub](https://github.com/ing-bank/popmon) (👨‍💻 19 · 🔀 36 · 📥 260 · 📦 22 · 📋 57 - 28% open · ⏱️ 24.01.2025):
git clone https://github.com/ing-bank/popmon
- [PyPi](https://pypi.org/project/popmon) (📥 7.2K / month · 📦 4 · ⏱️ 24.01.2025):
pip install popmon
vegafusion (🥉20 · ⭐ 360) - Serverside scaling for Vega and Altair visualizations. BSD-3 - [GitHub](https://github.com/vega/vegafusion) (👨‍💻 6 · 🔀 20 · 📥 13K · 📋 140 - 36% open · ⏱️ 23.02.2025):
git clone https://github.com/vega/vegafusion
- [PyPi](https://pypi.org/project/vegafusion-jupyter) (📥 1.4K / month · 📦 2 · ⏱️ 09.05.2024):
pip install vegafusion-jupyter
- [Conda](https://anaconda.org/conda-forge/vegafusion-python-embed) (📥 440K · ⏱️ 22.04.2025):
conda install -c conda-forge vegafusion-python-embed
- [npm](https://www.npmjs.com/package/vegafusion-jupyter) (📥 62 / month · 📦 3 · ⏱️ 09.05.2024):
npm install vegafusion-jupyter
animatplot (🥉19 · ⭐ 420 · 💤) - A python package for animating plots built on matplotlib. MIT - [GitHub](https://github.com/t-makaro/animatplot) (👨‍💻 6 · 🔀 38 · 📦 76 · 📋 37 - 45% open · ⏱️ 29.08.2024):
git clone https://github.com/t-makaro/animatplot
- [PyPi](https://pypi.org/project/animatplot) (📥 200 / month · 📦 4 · ⏱️ 29.08.2024):
pip install animatplot
- [Conda](https://anaconda.org/conda-forge/animatplot) (📥 17K · ⏱️ 22.04.2025):
conda install -c conda-forge animatplot
ivis (🥉18 · ⭐ 330 · 💤) - Dimensionality reduction in very large datasets using Siamese.. Apache-2 - [GitHub](https://github.com/beringresearch/ivis) (👨‍💻 10 · 🔀 43 · 📦 38 · 📋 60 - 5% open · ⏱️ 29.09.2024):
git clone https://github.com/beringresearch/ivis
- [PyPi](https://pypi.org/project/ivis) (📥 860 / month · 📦 2 · ⏱️ 13.06.2024):
pip install ivis
Show 17 hidden projects...
- missingno (🥉30 · ⭐ 4.1K · 💀) - Missing data visualization module for Python. MIT
- Cufflinks (🥉28 · ⭐ 3.1K · 💀) - Productivity Tools for Plotly + Pandas. MIT
- pythreejs (🥉28 · ⭐ 970 · 💀) - A Jupyter - Three.js bridge. BSD-3
- Facets Overview (🥉27 · ⭐ 7.4K · 💀) - Visualizations for machine learning datasets. Apache-2
- Sweetviz (🥉27 · ⭐ 3K · 💀) - Visualize and compare datasets, target values and associations, with.. MIT
- HiPlot (🥉25 · ⭐ 2.8K · 💀) - HiPlot makes understanding high dimensional data easy. MIT
- PandasGUI (🥉24 · ⭐ 3.2K · 💀) - A GUI for Pandas DataFrames. ❗️MIT-0
- Multicore-TSNE (🥉24 · ⭐ 1.9K · 💀) - Parallel t-SNE implementation with Python and Torch.. BSD-3
- ridgeplot (🥉24 · ⭐ 230) - Beautiful ridgeline plots in Python. MIT
- Pandas-Bokeh (🥉22 · ⭐ 880 · 💀) - Bokeh Plotting Backend for Pandas and GeoPandas. MIT
- pivottablejs (🥉22 · ⭐ 700 · 💀) - Dragndrop Pivot Tables and Charts for Jupyter/IPython.. MIT
- joypy (🥉22 · ⭐ 590 · 💀) - Joyplots in Python with matplotlib & pandas. MIT
- PDPbox (🥉21 · ⭐ 850 · 💀) - python partial dependence plot toolbox. MIT
- pdvega (🥉16 · ⭐ 340 · 💀) - Interactive plotting for Pandas using Vega-Lite. MIT
- data-describe (🥉15 · ⭐ 300 · 💀) - datadescribe: Pythonic EDA Accelerator for Data Science. Apache-2
- nx-altair (🥉15 · ⭐ 220 · 💀) - Draw interactive NetworkX graphs with Altair. MIT
- nptsne (🥉13 · ⭐ 33 · 💀) - nptsne is a numpy compatible python binary package that offers a.. Apache-2


Text Data & NLP

Libraries for processing, cleaning, manipulating, and analyzing text data as well as libraries for NLP tasks such as language detection, fuzzy matching, classification, seq2seq learning, conversational AI, keyword extraction, and translation.

transformers (🥇54 · ⭐ 140K) - Transformers: State-of-the-art Machine Learning for.. Apache-2 - [GitHub](https://github.com/huggingface/transformers) (👨‍💻 3.3K · 🔀 29K · 📦 370K · 📋 18K - 9% open · ⏱️ 22.05.2025):
git clone https://github.com/huggingface/transformers
- [PyPi](https://pypi.org/project/transformers) (📥 62M / month · 📦 8.8K · ⏱️ 21.05.2025):
pip install transformers
- [Conda](https://anaconda.org/conda-forge/transformers) (📥 2.8M · ⏱️ 21.05.2025):
conda install -c conda-forge transformers
nltk (🥇45 · ⭐ 14K) - Suite of libraries and programs for symbolic and statistical natural.. Apache-2 - [GitHub](https://github.com/nltk/nltk) (👨‍💻 470 · 🔀 2.9K · 📦 390K · 📋 1.9K - 14% open · ⏱️ 02.05.2025):
git clone https://github.com/nltk/nltk
- [PyPi](https://pypi.org/project/nltk) (📥 35M / month · 📦 5.6K · ⏱️ 18.08.2024):
pip install nltk
- [Conda](https://anaconda.org/conda-forge/nltk) (📥 3.2M · ⏱️ 22.04.2025):
conda install -c conda-forge nltk
spaCy (🥇44 · ⭐ 32K · 📈) - Industrial-strength Natural Language Processing (NLP) in Python. MIT - [GitHub](https://github.com/explosion/spaCy) (👨‍💻 760 · 🔀 4.5K · 📥 2.7K · 📦 130K · 📋 5.8K - 3% open · ⏱️ 22.05.2025):
git clone https://github.com/explosion/spaCy
- [PyPi](https://pypi.org/project/spacy) (📥 19M / month · 📦 3.2K · ⏱️ 19.05.2025):
pip install spacy
- [Conda](https://anaconda.org/conda-forge/spacy) (📥 5.8M · ⏱️ 20.05.2025):
conda install -c conda-forge spacy
litellm (🥇44 · ⭐ 23K) - Python SDK, Proxy Server (LLM Gateway) to call 100+.. MIT others - [GitHub](https://github.com/BerriAI/litellm) (👨‍💻 550 · 🔀 3K · 📥 670 · 📦 13K · 📋 5.8K - 22% open · ⏱️ 22.05.2025):
git clone https://github.com/BerriAI/litellm
- [PyPi](https://pypi.org/project/litellm) (📥 8.9M / month · 📦 1.2K · ⏱️ 21.05.2025):
pip install litellm
sentence-transformers (🥇43 · ⭐ 17K) - State-of-the-Art Text Embeddings. Apache-2 - [GitHub](https://github.com/UKPLab/sentence-transformers) (👨‍💻 220 · 🔀 2.6K · 📦 110K · 📋 2.4K - 52% open · ⏱️ 14.05.2025):
git clone https://github.com/UKPLab/sentence-transformers
- [PyPi](https://pypi.org/project/sentence-transformers) (📥 9.3M / month · 📦 2.4K · ⏱️ 15.04.2025):
pip install sentence-transformers
- [Conda](https://anaconda.org/conda-forge/sentence-transformers) (📥 720K · ⏱️ 22.04.2025):
conda install -c conda-forge sentence-transformers
gensim (🥇40 · ⭐ 16K) - Topic Modelling for Humans. ❗️LGPL-2.1 - [GitHub](https://github.com/piskvorky/gensim) (👨‍💻 460 · 🔀 4.4K · 📥 6K · 📦 76K · 📋 1.9K - 21% open · ⏱️ 14.02.2025):
git clone https://github.com/piskvorky/gensim
- [PyPi](https://pypi.org/project/gensim) (📥 4.9M / month · 📦 1.4K · ⏱️ 19.07.2024):
pip install gensim
- [Conda](https://anaconda.org/conda-forge/gensim) (📥 1.6M · ⏱️ 22.04.2025):
conda install -c conda-forge gensim
flair (🥇40 · ⭐ 14K) - A very simple framework for state-of-the-art Natural Language Processing.. MIT - [GitHub](https://github.com/flairNLP/flair) (👨‍💻 280 · 🔀 2.1K · 📦 4K · 📋 2.4K - 3% open · ⏱️ 27.04.2025):
git clone https://github.com/flairNLP/flair
- [PyPi](https://pypi.org/project/flair) (📥 100K / month · 📦 150 · ⏱️ 05.02.2025):
pip install flair
- [Conda](https://anaconda.org/conda-forge/python-flair) (📥 43K · ⏱️ 22.04.2025):
conda install -c conda-forge python-flair
Rasa (🥇39 · ⭐ 20K) - Open source machine learning framework to automate text- and voice-.. Apache-2 - [GitHub](https://github.com/RasaHQ/rasa) (👨‍💻 590 · 🔀 4.8K · 📦 5.3K · 📋 6.8K - 2% open · ⏱️ 14.01.2025):
git clone https://github.com/RasaHQ/rasa
- [PyPi](https://pypi.org/project/rasa) (📥 130K / month · 📦 60 · ⏱️ 14.01.2025):
pip install rasa
haystack (🥇38 · ⭐ 21K) - AI orchestration framework to build customizable, production-ready.. Apache-2 - [GitHub](https://github.com/deepset-ai/haystack) (👨‍💻 290 · 🔀 2.2K · 📦 1.2K · 📋 3.9K - 3% open · ⏱️ 22.05.2025):
git clone https://github.com/deepset-ai/haystack
- [PyPi](https://pypi.org/project/haystack) (📥 6.2K / month · 📦 5 · ⏱️ 15.12.2021):
pip install haystack
NeMo (🥇38 · ⭐ 15K) - A scalable generative AI framework built for researchers and.. Apache-2 - [GitHub](https://github.com/NVIDIA/NeMo) (👨‍💻 420 · 🔀 2.8K · 📥 380K · 📦 21 · 📋 2.6K - 5% open · ⏱️ 21.05.2025):
git clone https://github.com/NVIDIA/NeMo
- [PyPi](https://pypi.org/project/nemo-toolkit) (📥 180K / month · 📦 14 · ⏱️ 08.05.2025):
pip install nemo-toolkit
ChatterBot (🥇38 · ⭐ 14K) - ChatterBot is a machine learning, conversational dialog engine for.. BSD-3 - [GitHub](https://github.com/gunthercox/ChatterBot) (👨‍💻 110 · 🔀 4.5K · 📦 6.4K · 📋 1.7K - 8% open · ⏱️ 20.05.2025):
git clone https://github.com/gunthercox/ChatterBot
- [PyPi](https://pypi.org/project/chatterbot) (📥 27K / month · 📦 18 · ⏱️ 05.04.2025):
pip install chatterbot
sentencepiece (🥇38 · ⭐ 11K) - Unsupervised text tokenizer for Neural Network-based text.. Apache-2 - [GitHub](https://github.com/google/sentencepiece) (👨‍💻 92 · 🔀 1.2K · 📥 57K · 📦 110K · 📋 780 - 6% open · ⏱️ 26.02.2025):
git clone https://github.com/google/sentencepiece
- [PyPi](https://pypi.org/project/sentencepiece) (📥 28M / month · 📦 1.7K · ⏱️ 19.02.2024):
pip install sentencepiece
- [Conda](https://anaconda.org/conda-forge/sentencepiece) (📥 1.5M · ⏱️ 22.04.2025):
conda install -c conda-forge sentencepiece
Tokenizers (🥇38 · ⭐ 9.7K) - Fast State-of-the-Art Tokenizers optimized for Research and.. Apache-2 - [GitHub](https://github.com/huggingface/tokenizers) (👨‍💻 110 · 🔀 880 · 📥 74 · 📦 170K · 📋 1.1K - 7% open · ⏱️ 18.03.2025):
git clone https://github.com/huggingface/tokenizers
- [PyPi](https://pypi.org/project/tokenizers) (📥 56M / month · 📦 1.3K · ⏱️ 13.03.2025):
pip install tokenizers
- [Conda](https://anaconda.org/conda-forge/tokenizers) (📥 3M · ⏱️ 22.04.2025):
conda install -c conda-forge tokenizers
TextBlob (🥇38 · ⭐ 9.3K) - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech.. MIT - [GitHub](https://github.com/sloria/TextBlob) (👨‍💻 37 · 🔀 1.2K · 📥 130 · 📦 58K · 📋 280 - 25% open · ⏱️ 19.05.2025):
git clone https://github.com/sloria/TextBlob
- [PyPi](https://pypi.org/project/textblob) (📥 1.4M / month · 📦 400 · ⏱️ 13.01.2025):
pip install textblob
- [Conda](https://anaconda.org/conda-forge/textblob) (📥 290K · ⏱️ 22.04.2025):
conda install -c conda-forge textblob
fairseq (🥈37 · ⭐ 31K · 💤) - Facebook AI Research Sequence-to-Sequence Toolkit written in.. MIT - [GitHub](https://github.com/facebookresearch/fairseq) (👨‍💻 430 · 🔀 6.5K · 📥 410 · 📦 4.3K · 📋 4.4K - 30% open · ⏱️ 18.10.2024):
git clone https://github.com/facebookresearch/fairseq
- [PyPi](https://pypi.org/project/fairseq) (📥 91K / month · 📦 120 · ⏱️ 27.06.2022):
pip install fairseq
- [Conda](https://anaconda.org/conda-forge/fairseq) (📥 140K · ⏱️ 22.04.2025):
conda install -c conda-forge fairseq
spark-nlp (🥈36 · ⭐ 4K) - State of the Art Natural Language Processing. Apache-2 - [GitHub](https://github.com/JohnSnowLabs/spark-nlp) (👨‍💻 110 · 🔀 720 · 📦 600 · 📋 910 - 2% open · ⏱️ 15.05.2025):
git clone https://github.com/JohnSnowLabs/spark-nlp
- [PyPi](https://pypi.org/project/spark-nlp) (📥 4.2M / month · 📦 37 · ⏱️ 14.05.2025):
pip install spark-nlp
qdrant (🥈35 · ⭐ 24K) - Qdrant - High-performance, massive-scale Vector Database and Vector.. Apache-2 - [GitHub](https://github.com/qdrant/qdrant) (👨‍💻 130 · 🔀 1.6K · 📥 400K · 📦 120 · 📋 1.5K - 23% open · ⏱️ 16.05.2025):
git clone https://github.com/qdrant/qdrant
TensorFlow Text (🥈35 · ⭐ 1.3K · 📉) - Making text a first-class citizen in TensorFlow. Apache-2 - [GitHub](https://github.com/tensorflow/text) (👨‍💻 180 · 🔀 350 · 📦 9.6K · 📋 370 - 52% open · ⏱️ 24.03.2025):
git clone https://github.com/tensorflow/text
- [PyPi](https://pypi.org/project/tensorflow-text) (📥 6.5M / month · 📦 230 · ⏱️ 04.04.2025):
pip install tensorflow-text
snowballstemmer (🥈34 · ⭐ 790) - Snowball compiler and stemming algorithms. BSD-3 - [GitHub](https://github.com/snowballstem/snowball) (👨‍💻 36 · 🔀 180 · 📦 10 · 📋 100 - 15% open · ⏱️ 22.05.2025):
git clone https://github.com/snowballstem/snowball
- [PyPi](https://pypi.org/project/snowballstemmer) (📥 22M / month · 📦 550 · ⏱️ 09.05.2025):
pip install snowballstemmer
- [Conda](https://anaconda.org/conda-forge/snowballstemmer) (📥 9.8M · ⏱️ 20.05.2025):
conda install -c conda-forge snowballstemmer
Opik (🥈33 · ⭐ 8.6K) - Debug, evaluate, and monitor your LLM applications, RAG systems, and.. Apache-2 - [GitHub](https://github.com/comet-ml/opik) (👨‍💻 51 · 🔀 570 · 📥 12 · 📦 9 · 📋 310 - 28% open · ⏱️ 22.05.2025):
git clone https://github.com/comet-ml/opik
- [PyPi](https://pypi.org/project/opik) (📥 230K / month · 📦 14 · ⏱️ 22.05.2025):
pip install opik
stanza (🥈33 · ⭐ 7.5K) - Stanford NLP Python library for tokenization, sentence segmentation,.. Apache-2 - [GitHub](https://github.com/stanfordnlp/stanza) (👨‍💻 68 · 🔀 900 · 📦 3.9K · 📋 930 - 10% open · ⏱️ 24.12.2024):
git clone https://github.com/stanfordnlp/stanza
- [PyPi](https://pypi.org/project/stanza) (📥 390K / month · 📦 200 · ⏱️ 24.12.2024):
pip install stanza
- [Conda](https://anaconda.org/stanfordnlp/stanza) (📥 8.7K · ⏱️ 25.03.2025):
conda install -c stanfordnlp stanza
OpenNMT (🥈33 · ⭐ 6.9K · 💤) - Open Source Neural Machine Translation and (Large) Language.. MIT - [GitHub](https://github.com/OpenNMT/OpenNMT-py) (👨‍💻 190 · 🔀 2.3K · 📦 340 · 📋 1.5K - 2% open · ⏱️ 27.06.2024):
git clone https://github.com/OpenNMT/OpenNMT-py
- [PyPi](https://pypi.org/project/OpenNMT-py) (📥 15K / month · 📦 23 · ⏱️ 18.03.2024):
pip install OpenNMT-py
jellyfish (🥈33 · ⭐ 2.1K) - a python library for doing approximate and phonetic matching of strings. MIT - [GitHub](https://github.com/jamesturk/jellyfish) (👨‍💻 35 · 🔀 160 · 📦 15K · 📋 140 - 2% open · ⏱️ 17.05.2025):
git clone https://github.com/jamesturk/jellyfish
- [PyPi](https://pypi.org/project/jellyfish) (📥 7.4M / month · 📦 300 · ⏱️ 31.03.2025):
pip install jellyfish
- [Conda](https://anaconda.org/conda-forge/jellyfish) (📥 1.4M · ⏱️ 22.04.2025):
conda install -c conda-forge jellyfish
ftfy (🥈32 · ⭐ 3.9K · 💤) - Fixes mojibake and other glitches in Unicode text, after the fact. Apache-2 - [GitHub](https://github.com/rspeer/python-ftfy) (👨‍💻 22 · 🔀 120 · 📥 73 · 📦 31K · 📋 150 - 6% open · ⏱️ 30.10.2024):
git clone https://github.com/rspeer/python-ftfy
- [PyPi](https://pypi.org/project/ftfy) (📥 7.1M / month · 📦 570 · ⏱️ 26.10.2024):
pip install ftfy
- [Conda](https://anaconda.org/conda-forge/ftfy) (📥 330K · ⏱️ 22.04.2025):
conda install -c conda-forge ftfy
torchtext (🥈32 · ⭐ 3.5K) - Models, data loaders and abstractions for language processing,.. BSD-3 - [GitHub](https://github.com/pytorch/text) (👨‍💻 160 · 🔀 810 · 📋 850 - 39% open · ⏱️ 24.02.2025):
git clone https://github.com/pytorch/text
- [PyPi](https://pypi.org/project/torchtext) (📥 680K / month · 📦 280 · ⏱️ 24.04.2024):
pip install torchtext
DeepPavlov (🥈31 · ⭐ 6.9K) - An open source library for deep learning end-to-end dialog.. Apache-2 - [GitHub](https://github.com/deeppavlov/DeepPavlov) (👨‍💻 78 · 🔀 1.2K · 📦 430 · 📋 640 - 4% open · ⏱️ 26.11.2024):
git clone https://github.com/deeppavlov/DeepPavlov
- [PyPi](https://pypi.org/project/deeppavlov) (📥 15K / month · 📦 4 · ⏱️ 12.08.2024):
pip install deeppavlov
rubrix (🥈31 · ⭐ 4.5K) - Argilla is a collaboration tool for AI engineers and domain experts.. Apache-2 - [GitHub](https://github.com/argilla-io/argilla) (👨‍💻 110 · 🔀 430 · 📦 3K · 📋 2.2K - 1% open · ⏱️ 16.05.2025):
git clone https://github.com/argilla-io/argilla
- [PyPi](https://pypi.org/project/rubrix) (📥 3.1K / month · ⏱️ 24.10.2022):
pip install rubrix
- [Conda](https://anaconda.org/conda-forge/rubrix) (📥 46K · ⏱️ 22.04.2025):
conda install -c conda-forge rubrix
Dedupe (🥈30 · ⭐ 4.3K) - A python library for accurate and scalable fuzzy matching, record.. MIT - [GitHub](https://github.com/dedupeio/dedupe) (👨‍💻 72 · 🔀 550 · 📦 360 · 📋 820 - 9% open · ⏱️ 01.11.2024):
git clone https://github.com/dedupeio/dedupe
- [PyPi](https://pypi.org/project/dedupe) (📥 66K / month · 📦 19 · ⏱️ 15.08.2024):
pip install dedupe
- [Conda](https://anaconda.org/conda-forge/dedupe) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge dedupe
Sumy (🥈30 · ⭐ 3.6K · 💤) - Module for automatic summarization of text documents and HTML pages. Apache-2 - [GitHub](https://github.com/miso-belica/sumy) (👨‍💻 32 · 🔀 530 · 📦 4.1K · 📋 120 - 18% open · ⏱️ 16.05.2024):
git clone https://github.com/miso-belica/sumy
- [PyPi](https://pypi.org/project/sumy) (📥 73K / month · 📦 31 · ⏱️ 23.10.2022):
pip install sumy
- [Conda](https://anaconda.org/conda-forge/sumy) (📥 12K · ⏱️ 22.04.2025):
conda install -c conda-forge sumy
spacy-transformers (🥈29 · ⭐ 1.4K) - Use pretrained transformers like BERT, XLNet and GPT-2.. MIT spacy - [GitHub](https://github.com/explosion/spacy-transformers) (👨‍💻 23 · 🔀 170 · 📥 170 · 📦 2.3K · ⏱️ 06.02.2025):
git clone https://github.com/explosion/spacy-transformers
- [PyPi](https://pypi.org/project/spacy-transformers) (📥 220K / month · 📦 98 · ⏱️ 06.02.2025):
pip install spacy-transformers
- [Conda](https://anaconda.org/conda-forge/spacy-transformers) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge spacy-transformers
TextDistance (🥈28 · ⭐ 3.5K) - Compute distance between sequences. 30+ algorithms, pure python.. MIT - [GitHub](https://github.com/life4/textdistance) (👨‍💻 18 · 🔀 250 · 📥 1.1K · 📦 8.6K · ⏱️ 18.04.2025):
git clone https://github.com/life4/textdistance
- [PyPi](https://pypi.org/project/textdistance) (📥 1.1M / month · 📦 99 · ⏱️ 16.07.2024):
pip install textdistance
- [Conda](https://anaconda.org/conda-forge/textdistance) (📥 830K · ⏱️ 22.04.2025):
conda install -c conda-forge textdistance
SciSpacy (🥈28 · ⭐ 1.8K) - A full spaCy pipeline and models for scientific/biomedical documents. Apache-2 - [GitHub](https://github.com/allenai/scispacy) (👨‍💻 37 · 🔀 230 · 📦 1.2K · 📋 320 - 10% open · ⏱️ 23.11.2024):
git clone https://github.com/allenai/scispacy
- [PyPi](https://pypi.org/project/scispacy) (📥 38K / month · 📦 34 · ⏱️ 27.10.2024):
pip install scispacy
CLTK (🥈28 · ⭐ 850) - The Classical Language Toolkit. MIT - [GitHub](https://github.com/cltk/cltk) (👨‍💻 120 · 🔀 330 · 📥 130 · 📦 300 · 📋 580 - 6% open · ⏱️ 04.05.2025):
git clone https://github.com/cltk/cltk
- [PyPi](https://pypi.org/project/cltk) (📥 3.3K / month · 📦 17 · ⏱️ 04.05.2025):
pip install cltk
PyTextRank (🥉27 · ⭐ 2.2K · 💤) - Python implementation of TextRank algorithms (textgraphs) for.. MIT - [GitHub](https://github.com/DerwenAI/pytextrank) (👨‍💻 19 · 🔀 340 · 📦 850 · 📋 100 - 12% open · ⏱️ 21.05.2024):
git clone https://github.com/DerwenAI/pytextrank
- [PyPi](https://pypi.org/project/pytextrank) (📥 70K / month · 📦 19 · ⏱️ 21.02.2024):
pip install pytextrank
english-words (🥉26 · ⭐ 11K) - A text file containing 479k English words for all your.. Unlicense - [GitHub](https://github.com/dwyl/english-words) (👨‍💻 34 · 🔀 1.9K · 📦 2 · 📋 160 - 74% open · ⏱️ 06.01.2025):
git clone https://github.com/dwyl/english-words
- [PyPi](https://pypi.org/project/english-words) (📥 57K / month · 📦 14 · ⏱️ 24.05.2023):
pip install english-words
DeepKE (🥉25 · ⭐ 3.9K) - [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and.. MIT - [GitHub](https://github.com/zjunlp/DeepKE) (👨‍💻 32 · 🔀 710 · 📦 24 · 📋 610 - 1% open · ⏱️ 22.04.2025):
git clone https://github.com/zjunlp/DeepKE
- [PyPi](https://pypi.org/project/deepke) (📥 1.1K / month · ⏱️ 21.09.2023):
pip install deepke
scattertext (🥉25 · ⭐ 2.3K) - Beautiful visualizations of how language differs among document.. Apache-2 - [GitHub](https://github.com/JasonKessler/scattertext) (👨‍💻 14 · 🔀 290 · 📦 670 · 📋 100 - 22% open · ⏱️ 29.04.2025):
git clone https://github.com/JasonKessler/scattertext
- [PyPi](https://pypi.org/project/scattertext) (📥 8.9K / month · 📦 5 · ⏱️ 23.09.2024):
pip install scattertext
- [Conda](https://anaconda.org/conda-forge/scattertext) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge scattertext
sense2vec (🥉24 · ⭐ 1.7K) - Contextually-keyed word vectors. MIT - [GitHub](https://github.com/explosion/sense2vec) (👨‍💻 20 · 🔀 240 · 📥 72K · 📦 470 · 📋 120 - 20% open · ⏱️ 23.04.2025):
git clone https://github.com/explosion/sense2vec
- [PyPi](https://pypi.org/project/sense2vec) (📥 2K / month · 📦 13 · ⏱️ 19.04.2021):
pip install sense2vec
- [Conda](https://anaconda.org/conda-forge/sense2vec) (📥 61K · ⏱️ 22.04.2025):
conda install -c conda-forge sense2vec
detoxify (🥉24 · ⭐ 1K) - Trained models & code to predict toxic comments on all 3 Jigsaw Toxic.. Apache-2 - [GitHub](https://github.com/unitaryai/detoxify) (👨‍💻 14 · 🔀 120 · 📥 1.2M · 📦 910 · 📋 67 - 55% open · ⏱️ 07.03.2025):
git clone https://github.com/unitaryai/detoxify
- [PyPi](https://pypi.org/project/detoxify) (📥 66K / month · 📦 30 · ⏱️ 01.02.2024):
pip install detoxify
T5 (🥉23 · ⭐ 6.4K) - Code for the paper Exploring the Limits of Transfer Learning with a.. Apache-2 - [GitHub](https://github.com/google-research/text-to-text-transfer-transformer) (👨‍💻 61 · 🔀 760 · 📋 450 - 23% open · ⏱️ 28.04.2025):
git clone https://github.com/google-research/text-to-text-transfer-transformer
- [PyPi](https://pypi.org/project/t5) (📥 41K / month · 📦 2 · ⏱️ 18.10.2021):
pip install t5
Sockeye (🥉22 · ⭐ 1.2K · 💤) - Sequence-to-sequence framework with a focus on Neural.. Apache-2 - [GitHub](https://github.com/awslabs/sockeye) (👨‍💻 60 · 🔀 320 · 📥 21 · 📋 310 - 3% open · ⏱️ 24.10.2024):
git clone https://github.com/awslabs/sockeye
- [PyPi](https://pypi.org/project/sockeye) (📥 1.3K / month · ⏱️ 03.03.2023):
pip install sockeye
happy-transformer (🥉22 · ⭐ 540) - Happy Transformer makes it easy to fine-tune and.. Apache-2 huggingface - [GitHub](https://github.com/EricFillion/happy-transformer) (👨‍💻 14 · 🔀 68 · 📦 330 · 📋 130 - 16% open · ⏱️ 22.03.2025):
git clone https://github.com/EricFillion/happy-transformer
- [PyPi](https://pypi.org/project/happytransformer) (📥 3.5K / month · 📦 5 · ⏱️ 05.08.2023):
pip install happytransformer
fast-bert (🥉21 · ⭐ 1.9K · 💤) - Super easy library for BERT based NLP models. Apache-2 - [GitHub](https://github.com/utterworks/fast-bert) (👨‍💻 37 · 🔀 340 · 📋 260 - 63% open · ⏱️ 19.08.2024):
git clone https://github.com/utterworks/fast-bert
- [PyPi](https://pypi.org/project/fast-bert) (📥 1.4K / month · ⏱️ 19.08.2024):
pip install fast-bert
finetune (🥉21 · ⭐ 710) - Scikit-learn style model finetuning for NLP. MPL-2.0 - [GitHub](https://github.com/IndicoDataSolutions/finetune) (👨‍💻 24 · 🔀 78 · 📦 15 · 📋 140 - 15% open · ⏱️ 20.05.2025):
git clone https://github.com/IndicoDataSolutions/finetune
- [PyPi](https://pypi.org/project/finetune) (📥 370 / month · 📦 2 · ⏱️ 29.09.2023):
pip install finetune
small-text (🥉21 · ⭐ 620) - Active Learning for Text Classification in Python. MIT - [GitHub](https://github.com/webis-de/small-text) (👨‍💻 9 · 🔀 70 · 📦 34 · 📋 66 - 27% open · ⏱️ 06.04.2025):
git clone https://github.com/webis-de/small-text
- [PyPi](https://pypi.org/project/small-text) (📥 720 / month · ⏱️ 06.04.2025):
pip install small-text
- [Conda](https://anaconda.org/conda-forge/small-text) (📥 15K · ⏱️ 22.04.2025):
conda install -c conda-forge small-text
UForm (🥉20 · ⭐ 1.1K) - Pocket-Sized Multimodal AI for content understanding and.. Apache-2 - [GitHub](https://github.com/unum-cloud/uform) (👨‍💻 19 · 🔀 67 · 📥 610 · 📦 34 · 📋 35 - 37% open · ⏱️ 03.01.2025):
git clone https://github.com/unum-cloud/uform
- [PyPi](https://pypi.org/project/uform) (📥 640 / month · 📦 2 · ⏱️ 03.01.2025):
pip install uform
VizSeq (🥉15 · ⭐ 440) - An Analysis Toolkit for Natural Language Generation (Translation,.. MIT - [GitHub](https://github.com/facebookresearch/vizseq) (👨‍💻 4 · 🔀 61 · 📦 13 · 📋 16 - 43% open · ⏱️ 07.03.2025):
git clone https://github.com/facebookresearch/vizseq
- [PyPi](https://pypi.org/project/vizseq) (📥 130 / month · ⏱️ 07.08.2020):
pip install vizseq
Show 56 hidden projects...
- AllenNLP (🥈37 · ⭐ 12K · 💀) - An open-source NLP research library, built on PyTorch. Apache-2
- fastText (🥈35 · ⭐ 26K · 💀) - Library for fast text representation and classification. MIT
- ParlAI (🥈32 · ⭐ 11K · 💀) - A framework for training and evaluating AI models on a variety of.. MIT
- fuzzywuzzy (🥈32 · ⭐ 9.3K · 💀) - Fuzzy String Matching in Python. ❗️GPL-2.0
- nlpaug (🥈30 · ⭐ 4.6K · 💀) - Data augmentation for NLP. MIT
- Ciphey (🥈28 · ⭐ 19K · 💀) - Automatically decrypt encryptions without knowing the key or cipher,.. MIT
- vaderSentiment (🥈28 · ⭐ 4.7K · 💀) - VADER Sentiment Analysis. VADER (Valence Aware Dictionary.. MIT
- fastNLP (🥈28 · ⭐ 3.1K · 💀) - fastNLP: A Modularized and Extensible NLP Framework. Currently.. Apache-2
- textacy (🥈28 · ⭐ 2.2K · 💀) - NLP, before and after spaCy. ❗Unlicensed
- flashtext (🥉27 · ⭐ 5.6K · 💀) - Extract Keywords from sentence or Replace keywords in sentences. MIT
- underthesea (🥉27 · ⭐ 1.5K) - Underthesea - Vietnamese NLP Toolkit. ❗️GPL-3.0
- pySBD (🥉27 · ⭐ 850 · 💀) - pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence.. MIT
- neuralcoref (🥉26 · ⭐ 2.9K · 💀) - Fast Coreference Resolution in spaCy with Neural Networks. MIT
- langid (🥉26 · ⭐ 2.4K · 💀) - Stand-alone language identification system. BSD-3
- polyglot (🥉26 · ⭐ 2.3K · 💀) - Multilingual text (NLP) processing toolkit. ❗️GPL-3.0
- PyText (🥉25 · ⭐ 6.3K · 💀) - A natural language modeling framework based on PyTorch. BSD-3
- GluonNLP (🥉25 · ⭐ 2.6K · 💀) - Toolkit that enables easy text preprocessing, datasets.. Apache-2
- textgenrnn (🥉24 · ⭐ 4.9K · 💀) - Easily train your own text-generating neural network of any.. MIT
- OpenPrompt (🥉24 · ⭐ 4.6K · 💀) - An Open-Source Framework for Prompt-Learning. Apache-2
- Snips NLU (🥉24 · ⭐ 3.9K · 💀) - Snips Python library to extract meaning from text. Apache-2
- MatchZoo (🥉24 · ⭐ 3.9K · 💀) - Facilitating the design, comparison and sharing of deep.. Apache-2
- promptsource (🥉24 · ⭐ 2.9K · 💀) - Toolkit for creating, sharing and using natural language.. Apache-2
- pytorch-nlp (🥉24 · ⭐ 2.2K · 💀) - Basic Utilities for PyTorch Natural Language Processing.. BSD-3
- FARM (🥉24 · ⭐ 1.8K · 💀) - Fast & easy transfer learning for NLP. Harvesting language.. Apache-2
- whoosh (🥉24 · ⭐ 620 · 💀) - Pure-Python full-text search library. ❗️BSD-1-Clause
- Kashgari (🥉23 · ⭐ 2.4K · 💀) - Kashgari is a production-level NLP Transfer learning.. Apache-2
- YouTokenToMe (🥉23 · ⭐ 970 · 💀) - Unsupervised text tokenizer focused on computational efficiency. MIT
- gpt-2-simple (🥉22 · ⭐ 3.4K · 💀) - Python package to easily retrain OpenAI's GPT-2 text-.. MIT
- NLP Architect (🥉22 · ⭐ 2.9K · 💀) - A model library for exploring state-of-the-art deep.. Apache-2
- Texthero (🥉22 · ⭐ 2.9K · 💀) - Text preprocessing, representation and visualization from zero to.. MIT
- Texar (🥉22 · ⭐ 2.4K · 💀) - Toolkit for Machine Learning, Natural Language Processing, and.. Apache-2
- jiant (🥉22 · ⭐ 1.7K · 💀) - jiant is an nlp toolkit. MIT
- stop-words (🥉22 · ⭐ 160 · 💀) - Get list of common stop words in various languages in Python. BSD-3
- DELTA (🥉21 · ⭐ 1.6K · 💀) - DELTA is a deep learning based natural language and speech.. Apache-2
- anaGo (🥉21 · ⭐ 1.5K · 💀) - Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition,.. MIT
- lightseq (🥉20 · ⭐ 3.3K · 💀) - LightSeq: A High Performance Library for Sequence Processing.. Apache-2
- textpipe (🥉20 · ⭐ 300 · 💀) - Textpipe: clean and extract metadata from text. MIT
- DeepMatcher (🥉19 · ⭐ 5.2K · 💀) - Python package for performing Entity and Text Matching using.. BSD-3
- numerizer (🥉19 · ⭐ 230 · 💤) - A Python module to convert natural language numerics into ints and.. MIT
- pyfasttext (🥉19 · ⭐ 230 · 💀) - Yet another Python binding for fastText. ❗️GPL-3.0
- NeuroNER (🥉18 · ⭐ 1.7K · 💀) - Named-entity recognition using neural networks. Easy-to-use and.. MIT
- nboost (🥉18 · ⭐ 680 · 💀) - NBoost is a scalable, search-api-boosting platform for deploying.. Apache-2
- fastT5 (🥉18 · ⭐ 580 · 💀) - boost inference speed of T5 models by 5x & reduce the model size.. Apache-2
- textaugment (🥉18 · ⭐ 420 · 💀) - TextAugment: Text Augmentation Library. MIT
- Camphr (🥉18 · ⭐ 340 · 💀) - Camphr - NLP library for creating pipeline components. Apache-2 spacy
- skift (🥉17 · ⭐ 240 · 💀) - scikit-learn wrappers for Python fastText. MIT
- OpenNRE (🥉16 · ⭐ 4.4K · 💀) - An Open-Source Package for Neural Relation Extraction (NRE). MIT
- TextBox (🥉16 · ⭐ 1.1K · 💀) - TextBox 2.0 is a text generation library with pre-trained language.. MIT
- BLINK (🥉15 · ⭐ 1.2K · 💀) - Entity Linker solution. MIT
- Translate (🥉15 · ⭐ 830 · 💀) - Translate - a PyTorch Language Library. BSD-3
- NeuralQA (🥉15 · ⭐ 230 · 💀) - NeuralQA: A Usable Library for Question Answering on Large Datasets.. MIT
- Headliner (🥉15 · ⭐ 230 · 💀) - Easy training and deployment of seq2seq models. MIT
- ONNX-T5 (🥉14 · ⭐ 250 · 💀) - Summarization, translation, sentiment-analysis, text-generation.. Apache-2
- TransferNLP (🥉13 · ⭐ 290 · 💀) - NLP library designed for reproducible experimentation.. MIT
- textvec (🥉13 · ⭐ 190 · 💀) - Text vectorization tool to outperform TFIDF for classification.. MIT
- spacy-dbpedia-spotlight (🥉13 · ⭐ 110 · 💀) - A spaCy wrapper for DBpedia Spotlight. MIT spacy


Image Data

Libraries for image & video processing, manipulation, and augmentation as well as libraries for computer vision tasks such as facial recognition, object detection, and classification.

Pillow (🥇48 · ⭐ 13K) - Python Imaging Library (Fork). ❗️PIL - [GitHub](https://github.com/python-pillow/Pillow) (👨‍💻 480 · 🔀 2.3K · 📦 2.3M · 📋 3.3K - 3% open · ⏱️ 08.05.2025):
git clone https://github.com/python-pillow/Pillow
- [PyPi](https://pypi.org/project/Pillow) (📥 150M / month · 📦 14K · ⏱️ 12.04.2025):
pip install Pillow
- [Conda](https://anaconda.org/conda-forge/pillow) (📥 54M · ⏱️ 07.05.2025):
conda install -c conda-forge pillow
torchvision (🥇42 · ⭐ 17K) - Datasets, Transforms and Models specific to Computer Vision. BSD-3 - [GitHub](https://github.com/pytorch/vision) (👨‍💻 640 · 🔀 7K · 📥 41K · 📦 21 · 📋 3.7K - 29% open · ⏱️ 09.05.2025):
git clone https://github.com/pytorch/vision
- [PyPi](https://pypi.org/project/torchvision) (📥 17M / month · 📦 7K · ⏱️ 23.04.2025):
pip install torchvision
- [Conda](https://anaconda.org/conda-forge/torchvision) (📥 2.6M · ⏱️ 22.04.2025):
conda install -c conda-forge torchvision
PyTorch Image Models (🥇41 · ⭐ 34K · 📉) - The largest collection of PyTorch image encoders /.. Apache-2 - [GitHub](https://github.com/huggingface/pytorch-image-models) (👨‍💻 170 · 🔀 4.9K · 📥 7.8M · 📦 57K · 📋 980 - 5% open · ⏱️ 21.05.2025):
git clone https://github.com/huggingface/pytorch-image-models
- [PyPi](https://pypi.org/project/timm) (📥 7.9M / month · 📦 1.1K · ⏱️ 23.02.2025):
pip install timm
- [Conda](https://anaconda.org/conda-forge/timm) (📥 390K · ⏱️ 22.04.2025):
conda install -c conda-forge timm
Albumentations (🥇41 · ⭐ 15K) - Fast and flexible image augmentation library. Paper about.. MIT - [GitHub](https://github.com/albumentations-team/albumentations) (👨‍💻 170 · 🔀 1.7K · 📦 38K · 📋 1.2K - 17% open · ⏱️ 21.05.2025):
git clone https://github.com/albumentations-team/albumentations
- [PyPi](https://pypi.org/project/albumentations) (📥 6.3M / month · 📦 720 · ⏱️ 16.05.2025):
pip install albumentations
- [Conda](https://anaconda.org/conda-forge/albumentations) (📥 290K · ⏱️ 16.05.2025):
conda install -c conda-forge albumentations
MoviePy (🥇41 · ⭐ 13K · 📈) - Video editing with Python. MIT - [GitHub](https://github.com/Zulko/moviepy) (👨‍💻 180 · 🔀 1.8K · 📦 63K · 📋 1.6K - 2% open · ⏱️ 21.05.2025):
git clone https://github.com/Zulko/moviepy
- [PyPi](https://pypi.org/project/moviepy) (📥 3.2M / month · 📦 1.2K · ⏱️ 21.05.2025):
pip install moviepy
- [Conda](https://anaconda.org/conda-forge/moviepy) (📥 300K · ⏱️ 22.04.2025):
conda install -c conda-forge moviepy
deepface (🥇39 · ⭐ 19K) - A Lightweight Face Recognition and Facial Attribute Analysis (Age,.. MIT - [GitHub](https://github.com/serengil/deepface) (👨‍💻 89 · 🔀 2.6K · 📦 7.8K · 📋 1.2K - 0% open · ⏱️ 17.05.2025):
git clone https://github.com/serengil/deepface
- [PyPi](https://pypi.org/project/deepface) (📥 620K / month · 📦 44 · ⏱️ 17.08.2024):
pip install deepface
InsightFace (🥈38 · ⭐ 25K) - State-of-the-art 2D and 3D Face Analysis Project. MIT - [GitHub](https://github.com/deepinsight/insightface) (👨‍💻 66 · 🔀 5.5K · 📥 8.1M · 📦 4.4K · 📋 2.6K - 45% open · ⏱️ 22.05.2025):
git clone https://github.com/deepinsight/insightface
- [PyPi](https://pypi.org/project/insightface) (📥 260K / month · 📦 30 · ⏱️ 17.12.2022):
pip install insightface
imageio (🥈38 · ⭐ 1.6K) - Python library for reading and writing image data. BSD-2 - [GitHub](https://github.com/imageio/imageio) (👨‍💻 120 · 🔀 310 · 📥 1.8K · 📦 180K · 📋 610 - 16% open · ⏱️ 21.02.2025):
git clone https://github.com/imageio/imageio
- [PyPi](https://pypi.org/project/imageio) (📥 26M / month · 📦 2.6K · ⏱️ 20.01.2025):
pip install imageio
- [Conda](https://anaconda.org/conda-forge/imageio) (📥 7.9M · ⏱️ 22.04.2025):
conda install -c conda-forge imageio
Kornia (🥈37 · ⭐ 10K) - Geometric Computer Vision Library for Spatial AI. Apache-2 - [GitHub](https://github.com/kornia/kornia) (👨‍💻 290 · 🔀 1K · 📥 1.9K · 📦 16K · 📋 980 - 30% open · ⏱️ 20.05.2025):
git clone https://github.com/kornia/kornia
- [PyPi](https://pypi.org/project/kornia) (📥 2.7M / month · 📦 340 · ⏱️ 08.05.2025):
pip install kornia
- [Conda](https://anaconda.org/conda-forge/kornia) (📥 230K · ⏱️ 08.05.2025):
conda install -c conda-forge kornia
opencv-python (🥈35 · ⭐ 4.8K) - Automated CI toolchain to produce precompiled opencv-python,.. MIT - [GitHub](https://github.com/opencv/opencv-python) (👨‍💻 54 · 🔀 900 · 📦 590K · 📋 860 - 17% open · ⏱️ 19.05.2025):
git clone https://github.com/opencv/opencv-python
- [PyPi](https://pypi.org/project/opencv-python) (📥 18M / month · 📦 13K · ⏱️ 16.01.2025):
pip install opencv-python
Wand (🥈33 · ⭐ 1.4K) - The ctypes-based simple ImageMagick binding for Python. MIT - [GitHub](https://github.com/emcconville/wand) (👨‍💻 110 · 🔀 200 · 📥 52K · 📦 21K · 📋 430 - 6% open · ⏱️ 01.04.2025):
git clone https://github.com/emcconville/wand
- [PyPi](https://pypi.org/project/wand) (📥 1.1M / month · 📦 260 · ⏱️ 03.11.2023):
pip install wand
- [Conda](https://anaconda.org/conda-forge/wand) (📥 140K · ⏱️ 22.04.2025):
conda install -c conda-forge wand
PaddleSeg (🥈32 · ⭐ 9K) - Easy-to-use image segmentation library with awesome pre-.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleSeg) (👨‍💻 130 · 🔀 1.7K · 📦 1.4K · 📋 2.2K - 0% open · ⏱️ 25.12.2024):
git clone https://github.com/PaddlePaddle/PaddleSeg
- [PyPi](https://pypi.org/project/paddleseg) (📥 1.5K / month · 📦 7 · ⏱️ 30.11.2022):
pip install paddleseg
ImageHash (🥈32 · ⭐ 3.6K) - A Python Perceptual Image Hashing Module. BSD-2 - [GitHub](https://github.com/JohannesBuchner/imagehash) (👨‍💻 28 · 🔀 340 · 📦 17K · 📋 150 - 15% open · ⏱️ 17.04.2025):
git clone https://github.com/JohannesBuchner/imagehash
- [PyPi](https://pypi.org/project/ImageHash) (📥 1.9M / month · 📦 270 · ⏱️ 01.02.2025):
pip install ImageHash
- [Conda](https://anaconda.org/conda-forge/imagehash) (📥 450K · ⏱️ 22.04.2025):
conda install -c conda-forge imagehash
lightly (🥈32 · ⭐ 3.4K) - A python library for self-supervised learning on images. MIT - [GitHub](https://github.com/lightly-ai/lightly) (👨‍💻 65 · 🔀 290 · 📦 450 · 📋 600 - 12% open · ⏱️ 21.05.2025):
git clone https://github.com/lightly-ai/lightly
- [PyPi](https://pypi.org/project/lightly) (📥 62K / month · 📦 20 · ⏱️ 22.04.2025):
pip install lightly
detectron2 (🥈31 · ⭐ 32K · 📉) - Detectron2 is a platform for object detection,.. Apache-2 - [GitHub](https://github.com/facebookresearch/detectron2) (👨‍💻 280 · 🔀 7.5K · 📋 3.6K - 14% open · ⏱️ 24.04.2025):
git clone https://github.com/facebookresearch/detectron2
- [PyPi](https://pypi.org/project/detectron2) (📦 13 · ⏱️ 06.02.2020):
pip install detectron2
- [Conda](https://anaconda.org/conda-forge/detectron2) (📥 690K · ⏱️ 13.05.2025):
conda install -c conda-forge detectron2
vit-pytorch (🥈30 · ⭐ 23K) - Implementation of Vision Transformer, a simple way to achieve.. MIT - [GitHub](https://github.com/lucidrains/vit-pytorch) (👨‍💻 23 · 🔀 3.3K · 📦 650 · 📋 280 - 49% open · ⏱️ 05.03.2025):
git clone https://github.com/lucidrains/vit-pytorch
- [PyPi](https://pypi.org/project/vit-pytorch) (📥 19K / month · 📦 17 · ⏱️ 05.03.2025):
pip install vit-pytorch
sahi (🥈30 · ⭐ 4.6K) - Framework agnostic sliced/tiled inference + interactive ui + error analysis.. MIT - [GitHub](https://github.com/obss/sahi) (👨‍💻 54 · 🔀 640 · 📥 36K · 📦 1.8K · ⏱️ 15.05.2025):
git clone https://github.com/obss/sahi
- [PyPi](https://pypi.org/project/sahi) (📥 140K / month · 📦 35 · ⏱️ 05.05.2025):
pip install sahi
- [Conda](https://anaconda.org/conda-forge/sahi) (📥 100K · ⏱️ 05.05.2025):
conda install -c conda-forge sahi
PaddleDetection (🥈29 · ⭐ 13K) - Object Detection toolkit based on PaddlePaddle. It.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleDetection) (👨‍💻 190 · 🔀 2.9K · 📋 5.6K - 17% open · ⏱️ 16.04.2025):
git clone https://github.com/PaddlePaddle/PaddleDetection
- [PyPi](https://pypi.org/project/paddledet) (📥 890 / month · 📦 2 · ⏱️ 19.09.2022):
pip install paddledet
doctr (🥈29 · ⭐ 4.7K) - docTR (Document Text Recognition) - a seamless, high-.. Apache-2 - [GitHub](https://github.com/mindee/doctr) (👨‍💻 62 · 🔀 500 · 📥 5.4M · 📋 410 - 6% open · ⏱️ 22.05.2025):
git clone https://github.com/mindee/doctr
- [PyPi](https://pypi.org/project/python-doctr) (📥 100K / month · 📦 14 · ⏱️ 30.01.2025):
pip install python-doctr
Face Alignment (🥉28 · ⭐ 7.3K · 💤) - 2D and 3D Face alignment library built using pytorch. BSD-3 - [GitHub](https://github.com/1adrianb/face-alignment) (👨‍💻 26 · 🔀 1.4K · 📦 21 · 📋 320 - 24% open · ⏱️ 30.08.2024):
git clone https://github.com/1adrianb/face-alignment
- [PyPi](https://pypi.org/project/face-alignment) (📥 75K / month · 📦 10 · ⏱️ 17.08.2023):
pip install face-alignment
vidgear (🥉28 · ⭐ 3.5K · 💤) - A High-performance cross-platform Video Processing Python.. Apache-2 - [GitHub](https://github.com/abhiTronix/vidgear) (👨‍💻 14 · 🔀 260 · 📥 2.3K · 📦 720 · 📋 300 - 2% open · ⏱️ 22.06.2024):
git clone https://github.com/abhiTronix/vidgear
- [PyPi](https://pypi.org/project/vidgear) (📥 28K / month · 📦 15 · ⏱️ 22.06.2024):
pip install vidgear
Norfair (🥉28 · ⭐ 2.5K) - Lightweight Python library for adding real-time multi-object tracking.. BSD-3 - [GitHub](https://github.com/tryolabs/norfair) (👨‍💻 31 · 🔀 260 · 📥 350 · 📦 320 · 📋 180 - 16% open · ⏱️ 30.04.2025):
git clone https://github.com/tryolabs/norfair
- [PyPi](https://pypi.org/project/norfair) (📥 29K / month · 📦 9 · ⏱️ 30.04.2025):
pip install norfair
mtcnn (🥉28 · ⭐ 2.4K · 💤) - MTCNN face detection implementation for TensorFlow, as a PIP.. MIT - [GitHub](https://github.com/ipazc/mtcnn) (👨‍💻 16 · 🔀 530 · 📥 52 · 📦 8.8K · 📋 130 - 37% open · ⏱️ 08.10.2024):
git clone https://github.com/ipazc/mtcnn
- [PyPi](https://pypi.org/project/mtcnn) (📥 170K / month · 📦 73 · ⏱️ 08.10.2024):
pip install mtcnn
- [Conda](https://anaconda.org/conda-forge/mtcnn) (📥 15K · ⏱️ 22.04.2025):
conda install -c conda-forge mtcnn
pyvips (🥉28 · ⭐ 700) - python binding for libvips using cffi. MIT - [GitHub](https://github.com/libvips/pyvips) (👨‍💻 16 · 🔀 49 · 📦 1.1K · 📋 460 - 42% open · ⏱️ 17.05.2025):
git clone https://github.com/libvips/pyvips
- [PyPi](https://pypi.org/project/pyvips) (📥 120K / month · 📦 94 · ⏱️ 28.04.2025):
pip install pyvips
- [Conda](https://anaconda.org/conda-forge/pyvips) (📥 220K · ⏱️ 29.04.2025):
conda install -c conda-forge pyvips
facenet-pytorch (🥉27 · ⭐ 4.9K · 💤) - Pretrained Pytorch face detection (MTCNN) and facial.. MIT - [GitHub](https://github.com/timesler/facenet-pytorch) (👨‍💻 18 · 🔀 940 · 📥 1.8M · 📦 3.6K · 📋 190 - 41% open · ⏱️ 02.08.2024):
git clone https://github.com/timesler/facenet-pytorch
- [PyPi](https://pypi.org/project/facenet-pytorch) (📥 130K / month · 📦 51 · ⏱️ 29.04.2024):
pip install facenet-pytorch
mahotas (🥉27 · ⭐ 870) - Computer Vision in Python. MIT - [GitHub](https://github.com/luispedro/mahotas) (👨‍💻 35 · 🔀 150 · 📦 1.5K · 📋 92 - 22% open · ⏱️ 25.02.2025):
git clone https://github.com/luispedro/mahotas
- [PyPi](https://pypi.org/project/mahotas) (📥 24K / month · 📦 63 · ⏱️ 17.07.2024):
pip install mahotas
- [Conda](https://anaconda.org/conda-forge/mahotas) (📥 630K · ⏱️ 22.04.2025):
conda install -c conda-forge mahotas
Image Deduplicator (🥉26 · ⭐ 5.4K) - Finding duplicate images made easy!. Apache-2 - [GitHub](https://github.com/idealo/imagededup) (👨‍💻 17 · 🔀 460 · 📥 4 · 📦 190 · 📋 130 - 28% open · ⏱️ 07.05.2025):
git clone https://github.com/idealo/imagededup
- [PyPi](https://pypi.org/project/imagededup) (📥 24K / month · 📦 26 · ⏱️ 13.05.2025):
pip install imagededup
CellProfiler (🥉26 · ⭐ 980) - An open-source application for biological image analysis. BSD-3 - [GitHub](https://github.com/CellProfiler/CellProfiler) (👨‍💻 150 · 🔀 390 · 📥 8.7K · 📦 28 · 📋 3.3K - 9% open · ⏱️ 20.05.2025):
git clone https://github.com/CellProfiler/CellProfiler
- [PyPi](https://pypi.org/project/cellprofiler) (📥 1K / month · 📦 2 · ⏱️ 16.09.2024):
pip install cellprofiler
MMF (🥉25 · ⭐ 5.6K) - A modular framework for vision & language multimodal research from.. BSD-3 - [GitHub](https://github.com/facebookresearch/mmf) (👨‍💻 120 · 🔀 920 · 📦 22 · 📋 690 - 21% open · ⏱️ 24.04.2025):
git clone https://github.com/facebookresearch/mmf
- [PyPi](https://pypi.org/project/mmf) (📥 340 / month · 📦 1 · ⏱️ 12.06.2020):
pip install mmf
pytorchvideo (🥉25 · ⭐ 3.4K) - A deep learning library for video understanding research. Apache-2 - [GitHub](https://github.com/facebookresearch/pytorchvideo) (👨‍💻 58 · 🔀 410 · 📋 210 - 50% open · ⏱️ 25.01.2025):
git clone https://github.com/facebookresearch/pytorchvideo
- [PyPi](https://pypi.org/project/pytorchvideo) (📥 58K / month · 📦 24 · ⏱️ 20.01.2022):
pip install pytorchvideo
tensorflow-graphics (🥉25 · ⭐ 2.8K) - TensorFlow Graphics: Differentiable Graphics Layers.. Apache-2 - [GitHub](https://github.com/tensorflow/graphics) (👨‍💻 39 · 🔀 370 · 📋 240 - 60% open · ⏱️ 03.02.2025):
git clone https://github.com/tensorflow/graphics
- [PyPi](https://pypi.org/project/tensorflow-graphics) (📥 34K / month · 📦 11 · ⏱️ 03.12.2021):
pip install tensorflow-graphics
segmentation_models (🥉24 · ⭐ 4.8K · 💤) - Segmentation models with pretrained backbones. Keras.. MIT - [GitHub](https://github.com/qubvel/segmentation_models) (👨‍💻 15 · 🔀 1K · 📋 540 - 50% open · ⏱️ 21.08.2024):
git clone https://github.com/qubvel/segmentation_models
- [PyPi](https://pypi.org/project/segmentation_models) (📥 28K / month · 📦 28 · ⏱️ 10.01.2020):
pip install segmentation_models
ffcv (🥉23 · ⭐ 2.9K · 💤) - FFCV: Fast Forward Computer Vision (and other ML workloads!). Apache-2 - [GitHub](https://github.com/libffcv/ffcv) (👨‍💻 31 · 🔀 180 · 📦 70 · 📋 290 - 38% open · ⏱️ 06.05.2024):
git clone https://github.com/libffcv/ffcv
- [PyPi](https://pypi.org/project/ffcv) (📥 680 / month · 📦 1 · ⏱️ 28.01.2022):
pip install ffcv
kubric (🥉22 · ⭐ 2.5K) - A data generation pipeline for creating semi-realistic synthetic.. Apache-2 - [GitHub](https://github.com/google-research/kubric) (👨‍💻 32 · 🔀 240 · 📦 7 · 📋 190 - 33% open · ⏱️ 06.05.2025):
git clone https://github.com/google-research/kubric
- [PyPi](https://pypi.org/project/kubric-nightly) (📥 7.7K / month · ⏱️ 27.12.2023):
pip install kubric-nightly
icevision (🥉22 · ⭐ 860 · 💤) - An Agnostic Computer Vision Framework - Pluggable to any.. Apache-2 - [GitHub](https://github.com/airctic/icevision) (👨‍💻 41 · 🔀 130 · 📋 570 - 10% open · ⏱️ 31.10.2024):
git clone https://github.com/airctic/icevision
- [PyPi](https://pypi.org/project/icevision) (📥 3.3K / month · 📦 6 · ⏱️ 10.02.2022):
pip install icevision
PySlowFast (🥉21 · ⭐ 6.9K) - PySlowFast: video understanding codebase from FAIR for.. Apache-2 - [GitHub](https://github.com/facebookresearch/SlowFast) (👨‍💻 34 · 🔀 1.2K · 📦 23 · 📋 710 - 58% open · ⏱️ 26.11.2024):
git clone https://github.com/facebookresearch/SlowFast
- [PyPi](https://pypi.org/project/pyslowfast) (📥 32 / month · ⏱️ 15.01.2020):
pip install pyslowfast
Image Super-Resolution (🥉21 · ⭐ 4.7K) - Super-scale your images and run experiments with.. Apache-2 - [GitHub](https://github.com/idealo/image-super-resolution) (👨‍💻 11 · 🔀 760 · 📋 220 - 48% open · ⏱️ 18.12.2024):
git clone https://github.com/idealo/image-super-resolution
- [PyPi](https://pypi.org/project/ISR) (📥 6.2K / month · 📦 5 · ⏱️ 08.01.2020):
pip install ISR
- [Docker Hub](https://hub.docker.com/r/idealo/image-super-resolution-gpu) (📥 280 · ⭐ 1 · ⏱️ 01.04.2019):
docker pull idealo/image-super-resolution-gpu
scenic (🥉17 · ⭐ 3.5K) - Scenic: A Jax Library for Computer Vision Research and Beyond. Apache-2 - [GitHub](https://github.com/google-research/scenic) (👨‍💻 94 · 🔀 450 · 📋 270 - 56% open · ⏱️ 05.05.2025):
git clone https://github.com/google-research/scenic
Show 26 hidden projects...
- scikit-image (🥇42 · ⭐ 6.2K) - Image processing in Python. ❗Unlicensed
- MMDetection (🥈37 · ⭐ 31K · 💀) - OpenMMLab Detection Toolbox and Benchmark. Apache-2
- glfw (🥈37 · ⭐ 14K) - A multi-platform library for OpenGL, OpenGL ES, Vulkan, window and input. ❗️Zlib
- Face Recognition (🥈36 · ⭐ 55K · 💀) - The world's simplest facial recognition api for Python.. MIT
- imgaug (🥈36 · ⭐ 15K · 💀) - Image augmentation for machine learning experiments. MIT
- PyTorch3D (🥈32 · ⭐ 9.3K) - PyTorch3D is FAIR's library of reusable components for.. ❗Unlicensed
- imageai (🥈31 · ⭐ 8.8K · 💀) - A python library built to empower developers to build applications.. MIT
- imutils (🥈31 · ⭐ 4.6K · 💀) - A series of convenience functions to make basic image processing.. MIT
- GluonCV (🥉28 · ⭐ 5.9K · 💀) - Gluon CV Toolkit. Apache-2
- layout-parser (🥉28 · ⭐ 5.3K · 💀) - A Unified Toolkit for Deep Learning Based Document Image.. Apache-2
- Augmentor (🥉27 · ⭐ 5.1K · 💀) - Image augmentation library in Python for machine learning. MIT
- chainercv (🥉26 · ⭐ 1.5K · 💀) - ChainerCV: a Library for Deep Learning in Computer Vision. MIT
- Pillow-SIMD (🥉25 · ⭐ 2.2K · 💤) - The friendly PIL fork. ❗️PIL
- Classy Vision (🥉23 · ⭐ 1.6K · 💀) - An end-to-end PyTorch framework for image and video.. MIT
- deep-daze (🥉22 · ⭐ 4.4K · 💀) - Simple command line tool for text to image generation using.. MIT
- vissl (🥉22 · ⭐ 3.3K · 💀) - VISSL is FAIR's library of extensible, modular and scalable.. MIT
- Luminoth (🥉22 · ⭐ 2.4K · 💀) - Deep Learning toolkit for Computer Vision. BSD-3
- detecto (🥉22 · ⭐ 620 · 💀) - Build fully-functioning computer vision models with PyTorch. MIT
- DE⫶TR (🥉21 · ⭐ 14K · 💀) - End-to-End Object Detection with Transformers. Apache-2
- solt (🥉21 · ⭐ 260) - Streaming over lightweight data transformations. MIT
- image-match (🥉20 · ⭐ 3K · 💀) - Quickly search over billions of images. Apache-2
- nude.py (🥉20 · ⭐ 930 · 💀) - Nudity detection with Python. MIT
- pycls (🥉18 · ⭐ 2.2K · 💀) - Codebase for Image Classification Research, written in PyTorch. MIT
- Caer (🥉17 · ⭐ 790 · 💀) - A lightweight Computer Vision library. Scale your models, not boilerplate. MIT
- Torch Points 3D (🥉17 · ⭐ 240 · 💀) - Pytorch framework for doing deep learning on point.. BSD-3
- HugsVision (🥉14 · ⭐ 200 · 💀) - HugsVision is an easy to use huggingface wrapper for state-of-.. MIT huggingface


Graph Data

Libraries for graph processing, clustering, embedding, and machine learning tasks.

networkx (🥇44 · ⭐ 16K) - Network Analysis in Python. BSD-3 - [GitHub](https://github.com/networkx/networkx) (👨‍💻 780 · 🔀 3.3K · 📥 110 · 📦 410K · 📋 3.5K - 10% open · ⏱️ 19.05.2025):
git clone https://github.com/networkx/networkx
- [PyPi](https://pypi.org/project/networkx) (📥 94M / month · 📦 11K · ⏱️ 09.05.2025):
pip install networkx
- [Conda](https://anaconda.org/conda-forge/networkx) (📥 22M · ⏱️ 22.04.2025):
conda install -c conda-forge networkx
PyTorch Geometric (🥇40 · ⭐ 22K) - Graph Neural Network Library for PyTorch. MIT - [GitHub](https://github.com/pyg-team/pytorch_geometric) (👨‍💻 540 · 🔀 3.8K · 📦 9.5K · 📋 3.9K - 30% open · ⏱️ 20.05.2025):
git clone https://github.com/pyg-team/pytorch_geometric
- [PyPi](https://pypi.org/project/torch-geometric) (📥 630K / month · 📦 360 · ⏱️ 26.09.2024):
pip install torch-geometric
- [Conda](https://anaconda.org/conda-forge/pytorch_geometric) (📥 160K · ⏱️ 22.04.2025):
conda install -c conda-forge pytorch_geometric
dgl (🥇36 · ⭐ 14K) - Python package built to ease deep learning on graph, on top of existing DL.. Apache-2 - [GitHub](https://github.com/dmlc/dgl) (👨‍💻 300 · 🔀 3K · 📦 4K · 📋 2.9K - 18% open · ⏱️ 11.02.2025):
git clone https://github.com/dmlc/dgl
- [PyPi](https://pypi.org/project/dgl) (📥 100K / month · 📦 150 · ⏱️ 13.05.2024):
pip install dgl
PyKEEN (🥈31 · ⭐ 1.8K) - A Python library for learning and evaluating knowledge graph embeddings. MIT - [GitHub](https://github.com/pykeen/pykeen) (👨‍💻 43 · 🔀 200 · 📥 240 · 📦 320 · 📋 590 - 19% open · ⏱️ 24.04.2025):
git clone https://github.com/pykeen/pykeen
- [PyPi](https://pypi.org/project/pykeen) (📥 13K / month · 📦 21 · ⏱️ 24.04.2025):
pip install pykeen
pygraphistry (🥈29 · ⭐ 2.3K) - PyGraphistry is a Python library to quickly load, shape,.. BSD-3 - [GitHub](https://github.com/graphistry/pygraphistry) (👨‍💻 46 · 🔀 220 · 📦 150 · 📋 360 - 53% open · ⏱️ 16.05.2025):
git clone https://github.com/graphistry/pygraphistry
- [PyPi](https://pypi.org/project/graphistry) (📥 17K / month · 📦 6 · ⏱️ 16.05.2025):
pip install graphistry
pytorch_geometric_temporal (🥈28 · ⭐ 2.8K) - PyTorch Geometric Temporal: Spatiotemporal Signal.. MIT - [GitHub](https://github.com/benedekrozemberczki/pytorch_geometric_temporal) (👨‍💻 37 · 🔀 390 · 📋 200 - 20% open · ⏱️ 24.03.2025):
git clone https://github.com/benedekrozemberczki/pytorch_geometric_temporal
- [PyPi](https://pypi.org/project/torch-geometric-temporal) (📥 5.9K / month · 📦 7 · ⏱️ 28.03.2025):
pip install torch-geometric-temporal
ogb (🥈28 · ⭐ 2K) - Benchmark datasets, data loaders, and evaluators for graph machine learning. MIT - [GitHub](https://github.com/snap-stanford/ogb) (👨‍💻 32 · 🔀 400 · 📦 2.5K · 📋 310 - 11% open · ⏱️ 06.05.2025):
git clone https://github.com/snap-stanford/ogb
- [PyPi](https://pypi.org/project/ogb) (📥 35K / month · 📦 73 · ⏱️ 07.04.2023):
pip install ogb
- [Conda](https://anaconda.org/conda-forge/ogb) (📥 54K · ⏱️ 22.04.2025):
conda install -c conda-forge ogb
Node2Vec (🥈25 · ⭐ 1.3K · 💤) - Implementation of the node2vec algorithm. MIT - [GitHub](https://github.com/eliorc/node2vec) (👨‍💻 16 · 🔀 250 · 📦 910 · 📋 97 - 5% open · ⏱️ 02.08.2024):
git clone https://github.com/eliorc/node2vec
- [PyPi](https://pypi.org/project/node2vec) (📥 26K / month · 📦 31 · ⏱️ 02.08.2024):
pip install node2vec
- [Conda](https://anaconda.org/conda-forge/node2vec) (📥 35K · ⏱️ 22.04.2025):
conda install -c conda-forge node2vec
torch-cluster (🥈25 · ⭐ 870) - PyTorch Extension Library of Optimized Graph Cluster.. MIT - [GitHub](https://github.com/rusty1s/pytorch_cluster) (👨‍💻 39 · 🔀 150 · 📋 180 - 17% open · ⏱️ 20.04.2025):
git clone https://github.com/rusty1s/pytorch_cluster
- [PyPi](https://pypi.org/project/torch-cluster) (📥 20K / month · 📦 62 · ⏱️ 12.10.2023):
pip install torch-cluster
- [Conda](https://anaconda.org/conda-forge/pytorch_cluster) (📥 360K · ⏱️ 22.04.2025):
conda install -c conda-forge pytorch_cluster
GraphVite (🥉15 · ⭐ 1.3K · 💤) - GraphVite: A General and High-performance Graph Embedding.. Apache-2 - [GitHub](https://github.com/DeepGraphLearning/graphvite) (👨‍💻 1 · 🔀 150 · 📋 110 - 47% open · ⏱️ 14.06.2024):
git clone https://github.com/DeepGraphLearning/graphvite
- [Conda](https://anaconda.org/milagraph/graphvite) (📥 5.1K · ⏱️ 25.03.2025):
conda install -c milagraph graphvite
Show 26 hidden projects...
- igraph (🥇32 · ⭐ 1.4K) - Python interface for igraph. ❗️GPL-2.0
- Spektral (🥈28 · ⭐ 2.4K · 💀) - Graph Neural Networks with Keras and Tensorflow 2. MIT
- StellarGraph (🥈27 · ⭐ 3K · 💀) - StellarGraph - Machine Learning on Graphs. Apache-2
- pygal (🥈27 · ⭐ 2.7K · 💤) - PYthon svg GrAph plotting Library. ❗️LGPL-3.0
- Paddle Graph Learning (🥈26 · ⭐ 1.6K · 💀) - Paddle Graph Learning (PGL) is an efficient and.. Apache-2
- AmpliGraph (🥈25 · ⭐ 2.2K · 💀) - Python library for Representation Learning on Knowledge.. Apache-2
- Karate Club (🥉24 · ⭐ 2.2K · 💤) - Karate Club: An API Oriented Open-source Python Framework.. ❗️GPL-3.0
- PyTorch-BigGraph (🥉23 · ⭐ 3.4K · 💀) - Generate embeddings from large-scale graph-structured.. BSD-3
- graph4nlp (🥉22 · ⭐ 1.7K · 💀) - Graph4nlp is the library for the easy use of Graph.. Apache-2
- graph-nets (🥉21 · ⭐ 5.4K · 💀) - Build Graph Nets in Tensorflow. Apache-2
- jraph (🥉21 · ⭐ 1.4K · 💀) - A Graph Neural Network Library in Jax. Apache-2
- pyRDF2Vec (🥉21 · ⭐ 260 · 💀) - Python Implementation and Extension of RDF2Vec. MIT
- DeepWalk (🥉20 · ⭐ 2.7K · 💀) - DeepWalk - Deep Learning for Graphs. ❗️GPL-3.0
- DIG (🥉20 · ⭐ 2K · 💀) - A library for graph deep learning research. ❗️GPL-3.0
- deepsnap (🥉20 · ⭐ 560 · 💀) - Python library assists deep learning on graphs. MIT
- GraphGym (🥉19 · ⭐ 1.8K · 💀) - Platform for designing and evaluating Graph Neural Networks (GNN). MIT
- DeepGraph (🥉18 · ⭐ 290) - Analyze Data with Pandas-based Networks. Documentation:. BSD-3
- Sematch (🥉17 · ⭐ 440 · 💀) - semantic similarity framework for knowledge graph. Apache-2
- Euler (🥉16 · ⭐ 2.9K · 💀) - A distributed graph deep learning framework. Apache-2
- AutoGL (🥉16 · ⭐ 1.1K · 💀) - An autoML framework & toolkit for machine learning on graphs. Apache-2
- kglib (🥉16 · ⭐ 550 · 💀) - TypeDB-ML is the Machine Learning integrations library for TypeDB. Apache-2
- ptgnn (🥉15 · ⭐ 380 · 💀) - A PyTorch Graph Neural Network Library. MIT
- GraphEmbedding (🥉14 · ⭐ 3.8K · 💀) - Implementation and experiments of graph embedding.. MIT
- GraphSAGE (🥉14 · ⭐ 3.5K · 💀) - Representation learning on large graphs using stochastic.. MIT
- OpenNE (🥉14 · ⭐ 1.7K · 💀) - An Open-Source Package for Network Embedding (NE). MIT
- OpenKE (🥉13 · ⭐ 3.9K · 💀) - An Open-Source Package for Knowledge Embedding (KE). ❗Unlicensed


Audio Data

Libraries for audio analysis, manipulation, transformation, and extraction, as well as speech recognition and music generation tasks.

speechbrain (🥇39 · ⭐ 9.8K) - A PyTorch-based Speech Toolkit. Apache-2 - [GitHub](https://github.com/speechbrain/speechbrain) (👨‍💻 260 · 🔀 1.5K · 📦 3.6K · 📋 1.2K - 12% open · ⏱️ 21.05.2025):
git clone https://github.com/speechbrain/speechbrain
- [PyPi](https://pypi.org/project/speechbrain) (📥 1.1M / month · 📦 79 · ⏱️ 07.04.2025):
pip install speechbrain
espnet (🥇38 · ⭐ 9.1K) - End-to-End Speech Processing Toolkit. Apache-2 - [GitHub](https://github.com/espnet/espnet) (👨‍💻 490 · 🔀 2.2K · 📥 84 · 📦 450 · 📋 2.5K - 14% open · ⏱️ 20.05.2025):
git clone https://github.com/espnet/espnet
- [PyPi](https://pypi.org/project/espnet) (📥 18K / month · 📦 12 · ⏱️ 04.12.2024):
pip install espnet
SpeechRecognition (🥈35 · ⭐ 8.7K) - Speech recognition module for Python, supporting several.. BSD-3 - [GitHub](https://github.com/Uberi/speech_recognition) (👨‍💻 54 · 🔀 2.4K · 📦 21 · 📋 670 - 48% open · ⏱️ 18.05.2025):
git clone https://github.com/Uberi/speech_recognition
- [PyPi](https://pypi.org/project/SpeechRecognition) (📥 1.4M / month · 📦 730 · ⏱️ 12.05.2025):
pip install SpeechRecognition
- [Conda](https://anaconda.org/conda-forge/speechrecognition) (📥 260K · ⏱️ 12.05.2025):
conda install -c conda-forge speechrecognition
librosa (🥈35 · ⭐ 7.6K) - Python library for audio and music analysis. ISC - [GitHub](https://github.com/librosa/librosa) (👨‍💻 130 · 🔀 980 · 📋 1.2K - 5% open · ⏱️ 19.05.2025):
git clone https://github.com/librosa/librosa
- [PyPi](https://pypi.org/project/librosa) (📥 3.6M / month · 📦 1.6K · ⏱️ 11.03.2025):
pip install librosa
- [Conda](https://anaconda.org/conda-forge/librosa) (📥 900K · ⏱️ 22.04.2025):
conda install -c conda-forge librosa
torchaudio (🥈35 · ⭐ 2.7K) - Data manipulation and transformation for audio signal.. BSD-2 - [GitHub](https://github.com/pytorch/audio) (👨‍💻 240 · 🔀 690 · 📋 1K - 28% open · ⏱️ 20.05.2025):
git clone https://github.com/pytorch/audio
- [PyPi](https://pypi.org/project/torchaudio) (📥 13M / month · 📦 1.9K · ⏱️ 23.04.2025):
pip install torchaudio
spleeter (🥈33 · ⭐ 27K) - Deezer source separation library including pretrained models. MIT - [GitHub](https://github.com/deezer/spleeter) (👨‍💻 22 · 🔀 2.9K · 📥 4M · 📦 1K · 📋 820 - 31% open · ⏱️ 02.04.2025):
git clone https://github.com/deezer/spleeter
- [PyPi](https://pypi.org/project/spleeter) (📥 34K / month · 📦 18 · ⏱️ 03.04.2025):
pip install spleeter
- [Conda](https://anaconda.org/conda-forge/spleeter) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge spleeter
Magenta (🥈33 · ⭐ 19K) - Magenta: Music and Art Generation with Machine Intelligence. Apache-2 - [GitHub](https://github.com/magenta/magenta) (👨‍💻 160 · 🔀 3.7K · 📦 580 · 📋 1K - 41% open · ⏱️ 17.01.2025):
git clone https://github.com/magenta/magenta
- [PyPi](https://pypi.org/project/magenta) (📥 8.1K / month · 📦 5 · ⏱️ 01.08.2022):
pip install magenta
python-soundfile (🥈32 · ⭐ 770) - SoundFile is an audio library based on libsndfile, CFFI, and.. BSD-3 - [GitHub](https://github.com/bastibe/python-soundfile) (👨‍💻 38 · 🔀 110 · 📥 21K · 📦 65K · 📋 260 - 46% open · ⏱️ 28.04.2025):
git clone https://github.com/bastibe/python-soundfile
- [PyPi](https://pypi.org/project/soundfile) (📥 6.3M / month · 📦 1.1K · ⏱️ 25.01.2025):
pip install soundfile
- [Conda](https://anaconda.org/anaconda/pysoundfile):
conda install -c anaconda pysoundfile
Porcupine (🥈31 · ⭐ 4.1K) - On-device wake word detection powered by deep learning. Apache-2 - [GitHub](https://github.com/Picovoice/porcupine) (👨‍💻 42 · 🔀 520 · 📦 48 · 📋 570 - 0% open · ⏱️ 06.05.2025):
git clone https://github.com/Picovoice/porcupine
- [PyPi](https://pypi.org/project/pvporcupine) (📥 24K / month · 📦 38 · ⏱️ 05.02.2025):
pip install pvporcupine
audiomentations (🥈31 · ⭐ 2K) - A Python library for audio data augmentation. Useful for making.. MIT - [GitHub](https://github.com/iver56/audiomentations) (👨‍💻 33 · 🔀 200 · 📦 790 · 📋 200 - 26% open · ⏱️ 05.05.2025):
git clone https://github.com/iver56/audiomentations
- [PyPi](https://pypi.org/project/audiomentations) (📥 270K / month · 📦 28 · ⏱️ 05.05.2025):
pip install audiomentations
tinytag (🥉29 · ⭐ 760) - Python library for reading audio file metadata. MIT - [GitHub](https://github.com/tinytag/tinytag) (👨‍💻 27 · 🔀 100 · 📦 1.2K · 📋 120 - 3% open · ⏱️ 05.05.2025):
git clone https://github.com/tinytag/tinytag
- [PyPi](https://pypi.org/project/tinytag) (📥 69K / month · 📦 120 · ⏱️ 23.04.2025):
pip install tinytag
pyAudioAnalysis (🥉28 · ⭐ 6.1K) - Python Audio Analysis Library: Feature Extraction,.. Apache-2 - [GitHub](https://github.com/tyiannak/pyAudioAnalysis) (👨‍💻 28 · 🔀 1.2K · 📦 640 · 📋 320 - 62% open · ⏱️ 28.03.2025):
git clone https://github.com/tyiannak/pyAudioAnalysis
- [PyPi](https://pypi.org/project/pyAudioAnalysis) (📥 17K / month · 📦 12 · ⏱️ 07.02.2022):
pip install pyAudioAnalysis
Madmom (🥉27 · ⭐ 1.4K · 💤) - Python audio and music signal processing library. BSD-3 - [GitHub](https://github.com/CPJKU/madmom) (👨‍💻 24 · 🔀 240 · 📦 500 · 📋 280 - 24% open · ⏱️ 25.08.2024):
git clone https://github.com/CPJKU/madmom
- [PyPi](https://pypi.org/project/madmom) (📥 3.7K / month · 📦 27 · ⏱️ 14.11.2018):
pip install madmom
DDSP (🥉25 · ⭐ 3K · 💤) - DDSP: Differentiable Digital Signal Processing. Apache-2 - [GitHub](https://github.com/magenta/ddsp) (👨‍💻 32 · 🔀 340 · 📦 68 · 📋 170 - 28% open · ⏱️ 23.09.2024):
git clone https://github.com/magenta/ddsp
- [PyPi](https://pypi.org/project/ddsp) (📥 3.6K / month · 📦 1 · ⏱️ 25.05.2022):
pip install ddsp
- [Conda](https://anaconda.org/conda-forge/ddsp) (📥 22K · ⏱️ 22.04.2025):
conda install -c conda-forge ddsp
nnAudio (🥉23 · ⭐ 1.1K) - Audio processing by using pytorch 1D convolution network. MIT - [GitHub](https://github.com/KinWaiCheuk/nnAudio) (👨‍💻 16 · 🔀 93 · 📦 370 · 📋 63 - 28% open · ⏱️ 16.05.2025):
git clone https://github.com/KinWaiCheuk/nnAudio
- [PyPi](https://pypi.org/project/nnAudio) (📥 54K / month · 📦 4 · ⏱️ 13.02.2024):
pip install nnAudio
Julius (🥉23 · ⭐ 440) - Fast PyTorch based DSP for audio and 1D signals. MIT - [GitHub](https://github.com/adefossez/julius) (👨‍💻 3 · 🔀 25 · 📦 2.9K · 📋 12 - 16% open · ⏱️ 17.02.2025):
git clone https://github.com/adefossez/julius
- [PyPi](https://pypi.org/project/julius) (📥 490K / month · 📦 44 · ⏱️ 20.09.2022):
pip install julius
DeepSpeech (🥉22 · ⭐ 26K) - DeepSpeech is an open source embedded (offline, on-device).. MPL-2.0 - [GitHub](https://github.com/mozilla/DeepSpeech) (👨‍💻 140 · 🔀 4K):
git clone https://github.com/mozilla/DeepSpeech
- [PyPi](https://pypi.org/project/deepspeech) (📥 5.7K / month · 📦 24 · ⏱️ 19.12.2020):
pip install deepspeech
- [Conda](https://anaconda.org/conda-forge/deepspeech) (📥 3.8K · ⏱️ 22.04.2025):
conda install -c conda-forge deepspeech
Show 12 hidden projects...
- Coqui TTS (🥇36 · ⭐ 40K · 💀) - A deep learning toolkit for Text-to-Speech, battle-.. MPL-2.0
- Pydub (🥇36 · ⭐ 9.4K · 💀) - Manipulate audio with a simple and easy high level interface. MIT
- audioread (🥉30 · ⭐ 510 · 💀) - cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio.. MIT
- Essentia (🥉29 · ⭐ 3.1K) - C++ library for audio and music analysis, description and.. ❗️AGPL-3.0
- aubio (🥉28 · ⭐ 3.4K) - a library for audio and music analysis. ❗️GPL-3.0
- TTS (🥉26 · ⭐ 9.8K · 💀) - Deep learning for Text to Speech (Discussion forum:.. MPL-2.0
- python_speech_features (🥉25 · ⭐ 2.4K · 💀) - This library provides common speech features for ASR.. MIT
- Dejavu (🥉23 · ⭐ 6.6K · 💀) - Audio fingerprinting and recognition in Python. MIT
- kapre (🥉22 · ⭐ 930 · 💀) - kapre: Keras Audio Preprocessors. MIT
- TimeSide (🥉21 · ⭐ 380 · 💤) - scalable audio processing framework and server written in.. ❗️AGPL-3.0
- Muda (🥉17 · ⭐ 230 · 💀) - A library for augmenting annotated audio data. ISC
- textlesslib (🥉10 · ⭐ 540 · 💀) - Library for Textless Spoken Language Processing. MIT


Geospatial Data

Libraries to load, process, analyze, and write geographic data as well as libraries for spatial analysis, map visualization, and geocoding.

pydeck (🥇43 · ⭐ 13K) - WebGL2 powered visualization framework. MIT - [GitHub](https://github.com/visgl/deck.gl) (👨‍💻 290 · 🔀 2.1K · 📦 9.1K · 📋 3.2K - 13% open · ⏱️ 21.05.2025):
git clone https://github.com/visgl/deck.gl
- [PyPi](https://pypi.org/project/pydeck) (📥 8.9M / month · 📦 160 · ⏱️ 21.03.2025):
pip install pydeck
- [Conda](https://anaconda.org/conda-forge/pydeck) (📥 750K · ⏱️ 22.04.2025):
conda install -c conda-forge pydeck
- [npm](https://www.npmjs.com/package/deck.gl) (📥 560K / month · 📦 350 · ⏱️ 14.05.2025):
npm install deck.gl
Shapely (🥇41 · ⭐ 4.1K) - Manipulation and analysis of geometric objects. BSD-3 - [GitHub](https://github.com/shapely/shapely) (👨‍💻 170 · 🔀 590 · 📥 3.8K · 📦 110K · 📋 1.3K - 18% open · ⏱️ 19.05.2025):
git clone https://github.com/shapely/shapely
- [PyPi](https://pypi.org/project/shapely) (📥 50M / month · 📦 4.2K · ⏱️ 19.05.2025):
pip install shapely
- [Conda](https://anaconda.org/conda-forge/shapely) (📥 12M · ⏱️ 19.05.2025):
conda install -c conda-forge shapely
folium (🥇40 · ⭐ 7.1K) - Python Data. Leaflet.js Maps. MIT - [GitHub](https://github.com/python-visualization/folium) (👨‍💻 170 · 🔀 2.2K · 📦 61K · 📋 1.2K - 6% open · ⏱️ 17.05.2025):
git clone https://github.com/python-visualization/folium
- [PyPi](https://pypi.org/project/folium) (📥 2.2M / month · 📦 980 · ⏱️ 15.05.2025):
pip install folium
- [Conda](https://anaconda.org/conda-forge/folium) (📥 3.8M · ⏱️ 16.05.2025):
conda install -c conda-forge folium
GeoPandas (🥈38 · ⭐ 4.8K) - Python tools for geographic data. BSD-3 - [GitHub](https://github.com/geopandas/geopandas) (👨‍💻 240 · 🔀 950 · 📥 3K · 📦 57K · 📋 1.7K - 25% open · ⏱️ 21.05.2025):
git clone https://github.com/geopandas/geopandas
- [PyPi](https://pypi.org/project/geopandas) (📥 7.9M / month · 📦 2.8K · ⏱️ 02.07.2024):
pip install geopandas
- [Conda](https://anaconda.org/conda-forge/geopandas) (📥 4.7M · ⏱️ 22.04.2025):
conda install -c conda-forge geopandas
Rasterio (🥈38 · ⭐ 2.4K) - Rasterio reads and writes geospatial raster datasets. BSD-3 - [GitHub](https://github.com/rasterio/rasterio) (👨‍💻 170 · 🔀 540 · 📥 1K · 📦 17K · 📋 1.9K - 8% open · ⏱️ 20.05.2025):
git clone https://github.com/rasterio/rasterio
- [PyPi](https://pypi.org/project/rasterio) (📥 2.3M / month · 📦 1.5K · ⏱️ 02.12.2024):
pip install rasterio
- [Conda](https://anaconda.org/conda-forge/rasterio) (📥 4.7M · ⏱️ 22.04.2025):
conda install -c conda-forge rasterio
ArcGIS API (🥈36 · ⭐ 2K) - Documentation and samples for ArcGIS API for Python. Apache-2 - [GitHub](https://github.com/Esri/arcgis-python-api) (👨‍💻 97 · 🔀 1.1K · 📥 15K · 📦 960 · 📋 850 - 9% open · ⏱️ 15.05.2025):
git clone https://github.com/Esri/arcgis-python-api
- [PyPi](https://pypi.org/project/arcgis) (📥 130K / month · 📦 41 · ⏱️ 17.04.2025):
pip install arcgis
- [Docker Hub](https://hub.docker.com/r/esridocker/arcgis-api-python-notebook):
docker pull esridocker/arcgis-api-python-notebook
pyproj (🥈36 · ⭐ 1.1K · 📉) - Python interface to PROJ (cartographic projections and coordinate.. MIT - [GitHub](https://github.com/pyproj4/pyproj) (👨‍💻 71 · 🔀 220 · 📦 44K · 📋 640 - 5% open · ⏱️ 05.05.2025):
git clone https://github.com/pyproj4/pyproj
- [PyPi](https://pypi.org/project/pyproj) (📥 11M / month · 📦 1.9K · ⏱️ 16.02.2025):
pip install pyproj
- [Conda](https://anaconda.org/conda-forge/pyproj) (📥 10M · ⏱️ 22.04.2025):
conda install -c conda-forge pyproj
Fiona (🥈35 · ⭐ 1.2K) - Fiona reads and writes geographic data files. BSD-3 - [GitHub](https://github.com/Toblerity/Fiona) (👨‍💻 78 · 🔀 210 · 📦 26K · 📋 810 - 4% open · ⏱️ 20.02.2025):
git clone https://github.com/Toblerity/Fiona
- [PyPi](https://pypi.org/project/fiona) (📥 4.4M / month · 📦 380 · ⏱️ 16.09.2024):
pip install fiona
- [Conda](https://anaconda.org/conda-forge/fiona) (📥 6.9M · ⏱️ 22.04.2025):
conda install -c conda-forge fiona
ipyleaflet (🥉33 · ⭐ 1.5K) - A Jupyter - Leaflet.js bridge. MIT - [GitHub](https://github.com/jupyter-widgets/ipyleaflet) (👨‍💻 92 · 🔀 360 · 📦 17K · 📋 660 - 45% open · ⏱️ 05.12.2024):
git clone https://github.com/jupyter-widgets/ipyleaflet
- [PyPi](https://pypi.org/project/ipyleaflet) (📥 230K / month · 📦 280 · ⏱️ 22.07.2024):
pip install ipyleaflet
- [Conda](https://anaconda.org/conda-forge/ipyleaflet) (📥 1.4M · ⏱️ 22.04.2025):
conda install -c conda-forge ipyleaflet
- [npm](https://www.npmjs.com/package/jupyter-leaflet) (📥 1.8K / month · 📦 9 · ⏱️ 22.07.2024):
npm install jupyter-leaflet
geojson (🥉31 · ⭐ 960) - Python bindings and utilities for GeoJSON. BSD-3 - [GitHub](https://github.com/jazzband/geojson) (👨‍💻 58 · 🔀 120 · 📦 20K · 📋 100 - 25% open · ⏱️ 21.12.2024):
git clone https://github.com/jazzband/geojson
- [PyPi](https://pypi.org/project/geojson) (📥 2.9M / month · 📦 720 · ⏱️ 21.12.2024):
pip install geojson
- [Conda](https://anaconda.org/conda-forge/geojson) (📥 970K · ⏱️ 22.04.2025):
conda install -c conda-forge geojson
PySAL (🥉30 · ⭐ 1.4K) - PySAL: Python Spatial Analysis Library Meta-Package. BSD-3 - [GitHub](https://github.com/pysal/pysal) (👨‍💻 79 · 🔀 300 · 📦 1.8K · 📋 660 - 3% open · ⏱️ 06.02.2025):
git clone https://github.com/pysal/pysal
- [PyPi](https://pypi.org/project/pysal) (📥 29K / month · 📦 59 · ⏱️ 06.02.2025):
pip install pysal
- [Conda](https://anaconda.org/conda-forge/pysal) (📥 630K · ⏱️ 22.04.2025):
conda install -c conda-forge pysal
EarthPy (🥉29 · ⭐ 520 · 📈) - A package built to support working with spatial data using open.. BSD-3 - [GitHub](https://github.com/earthlab/earthpy) (👨‍💻 44 · 🔀 160 · 📥 36 · 📦 430 · 📋 250 - 16% open · ⏱️ 21.05.2025):
git clone https://github.com/earthlab/earthpy
- [PyPi](https://pypi.org/project/earthpy) (📥 12K / month · 📦 17 · ⏱️ 01.10.2021):
pip install earthpy
- [Conda](https://anaconda.org/conda-forge/earthpy) (📥 92K · ⏱️ 22.04.2025):
conda install -c conda-forge earthpy
GeoViews (🥉28 · ⭐ 610) - Simple, concise geographical visualization in Python. BSD-3 - [GitHub](https://github.com/holoviz/geoviews) (👨‍💻 33 · 🔀 77 · 📦 1.3K · 📋 360 - 31% open · ⏱️ 21.05.2025):
git clone https://github.com/holoviz/geoviews
- [PyPi](https://pypi.org/project/geoviews) (📥 17K / month · 📦 63 · ⏱️ 17.12.2024):
pip install geoviews
- [Conda](https://anaconda.org/conda-forge/geoviews) (📥 300K · ⏱️ 22.04.2025):
conda install -c conda-forge geoviews
Mapbox GL (🥉24 · ⭐ 680) - Use Mapbox GL JS to visualize data in a Python Jupyter notebook. MIT - [GitHub](https://github.com/mapbox/mapboxgl-jupyter) (👨‍💻 23 · 🔀 140 · 📦 240 · 📋 110 - 38% open · ⏱️ 06.02.2025):
git clone https://github.com/mapbox/mapboxgl-jupyter
- [PyPi](https://pypi.org/project/mapboxgl) (📥 10K / month · 📦 12 · ⏱️ 02.06.2019):
pip install mapboxgl
pymap3d (🥉24 · ⭐ 420) - pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef.. BSD-2 - [GitHub](https://github.com/geospace-code/pymap3d) (👨‍💻 19 · 🔀 87 · 📦 510 · 📋 60 - 15% open · ⏱️ 08.01.2025):
git clone https://github.com/geospace-code/pymap3d
- [PyPi](https://pypi.org/project/pymap3d) (📥 350K / month · 📦 44 · ⏱️ 11.02.2024):
pip install pymap3d
- [Conda](https://anaconda.org/conda-forge/pymap3d) (📥 100K · ⏱️ 22.04.2025):
conda install -c conda-forge pymap3d
Show 7 hidden projects...
- Satpy (🥈34 · ⭐ 1.1K) - Python package for earth-observing satellite data processing. ❗️GPL-3.0
- geopy (🥉33 · ⭐ 4.6K · 💀) - Geocoding library for Python. MIT
- Geocoder (🥉33 · ⭐ 1.6K · 💀) - Python Geocoder. MIT
- Sentinelsat (🥉27 · ⭐ 1K · 💀) - Search and download Copernicus Sentinel satellite images. ❗️GPL-3.0
- prettymaps (🥉26 · ⭐ 12K) - Draw pretty maps from OpenStreetMap data! Built with osmnx.. ❗️AGPL-3.0
- gmaps (🥉23 · ⭐ 760 · 💀) - Google maps for Jupyter notebooks. BSD-3
- geoplotlib (🥉22 · ⭐ 1K · 💀) - python toolbox for visualizing geographical data and making maps. MIT


Financial Data

Libraries for algorithmic stock/crypto trading, risk analytics, backtesting, technical analysis, and other tasks on financial data.
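
To make "risk analytics" a little more concrete, here is a minimal standard-library sketch of two common building blocks: period-over-period simple returns and maximum drawdown. The helper names `simple_returns` and `max_drawdown` and the price series are hypothetical and are not taken from any library listed in this section.

```python
def simple_returns(prices):
    """Period-over-period simple returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def max_drawdown(prices):
    """Largest peak-to-trough decline, returned as a non-positive fraction."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)  # highest price seen so far
        worst = min(worst, (price - peak) / peak)  # decline relative to that peak
    return worst


# Made-up closing prices, used purely as sample input.
prices = [100.0, 110.0, 99.0, 120.0, 90.0]
returns = simple_returns(prices)
drawdown = max_drawdown(prices)
print([round(r, 4) for r in returns])
print(f"max drawdown: {drawdown:.2%}")
```

Libraries such as ffn wrap metrics of this kind around pandas objects; the plain-list version above only shows the underlying arithmetic.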

yfinance (🥇43 · ⭐ 18K) - Download market data from Yahoo! Finance's API. Apache-2 - [GitHub](https://github.com/ranaroussi/yfinance) (👨‍💻 130 · 🔀 2.7K · 📦 79K · 📋 1.6K - 9% open · ⏱️ 14.05.2025):
git clone https://github.com/ranaroussi/yfinance
- [PyPi](https://pypi.org/project/yfinance) (📥 3.5M / month · 📦 970 · ⏱️ 12.05.2025):
pip install yfinance
- [Conda](https://anaconda.org/ranaroussi/yfinance) (📥 98K · ⏱️ 25.03.2025):
conda install -c ranaroussi yfinance
Qlib (🥇32 · ⭐ 20K) - Qlib is an AI-oriented quantitative investment platform that aims to.. MIT - [GitHub](https://github.com/microsoft/qlib) (👨‍💻 140 · 🔀 3.2K · 📥 800 · 📦 21 · 📋 970 - 25% open · ⏱️ 22.05.2025):
git clone https://github.com/microsoft/qlib
- [PyPi](https://pypi.org/project/pyqlib) (📥 8.9K / month · 📦 1 · ⏱️ 23.12.2024):
pip install pyqlib
bt (🥈30 · ⭐ 2.5K) - bt - flexible backtesting for Python. MIT - [GitHub](https://github.com/pmorissette/bt) (👨‍💻 34 · 🔀 440 · 📦 1.7K · 📋 350 - 23% open · ⏱️ 08.04.2025):
git clone https://github.com/pmorissette/bt
- [PyPi](https://pypi.org/project/bt) (📥 8.4K / month · 📦 15 · ⏱️ 12.04.2025):
pip install bt
- [Conda](https://anaconda.org/conda-forge/bt) (📥 85K · ⏱️ 22.04.2025):
conda install -c conda-forge bt
TensorTrade (🥉27 · ⭐ 5.2K · 💤) - An open source reinforcement learning framework for.. Apache-2 - [GitHub](https://github.com/tensortrade-org/tensortrade) (👨‍💻 61 · 🔀 1.1K · 📦 70 · 📋 260 - 20% open · ⏱️ 09.06.2024):
git clone https://github.com/tensortrade-org/tensortrade
- [PyPi](https://pypi.org/project/tensortrade) (📥 1.2K / month · 📦 1 · ⏱️ 10.05.2021):
pip install tensortrade
- [Conda](https://anaconda.org/conda-forge/tensortrade) (📥 4.9K · ⏱️ 22.04.2025):
conda install -c conda-forge tensortrade
Alpha Vantage (🥉27 · ⭐ 4.5K) - A python wrapper for Alpha Vantage API for financial data. MIT - [GitHub](https://github.com/RomelTorres/alpha_vantage) (👨‍💻 44 · 🔀 750 · 📋 290 - 0% open · ⏱️ 01.05.2025):
git clone https://github.com/RomelTorres/alpha_vantage
- [PyPi](https://pypi.org/project/alpha_vantage) (📥 64K / month · 📦 35 · ⏱️ 18.07.2024):
pip install alpha_vantage
- [Conda](https://anaconda.org/conda-forge/alpha_vantage) (📥 9.1K · ⏱️ 22.04.2025):
conda install -c conda-forge alpha_vantage
ffn (🥉27 · ⭐ 2.3K) - ffn - a financial function library for Python. MIT - [GitHub](https://github.com/pmorissette/ffn) (👨‍💻 36 · 🔀 320 · 📦 560 · 📋 140 - 17% open · ⏱️ 01.04.2025):
git clone https://github.com/pmorissette/ffn
- [PyPi](https://pypi.org/project/ffn) (📥 24K / month · 📦 22 · ⏱️ 11.02.2025):
pip install ffn
- [Conda](https://anaconda.org/conda-forge/ffn) (📥 20K · ⏱️ 22.04.2025):
conda install -c conda-forge ffn
finmarketpy (🥉25 · ⭐ 3.6K) - Python library for backtesting trading strategies & analyzing.. Apache-2 - [GitHub](https://github.com/cuemacro/finmarketpy) (👨‍💻 19 · 🔀 510 · 📥 57 · 📦 16 · 📋 35 - 88% open · ⏱️ 10.03.2025):
git clone https://github.com/cuemacro/finmarketpy
- [PyPi](https://pypi.org/project/finmarketpy) (📥 370 / month · ⏱️ 10.03.2025):
pip install finmarketpy
stockstats (🥉23 · ⭐ 1.4K · 📉) - Supply a wrapper ``StockDataFrame`` based on the.. BSD-3 - [GitHub](https://github.com/jealous/stockstats) (👨‍💻 10 · 🔀 300 · 📋 130 - 10% open · ⏱️ 18.05.2025):
git clone https://github.com/jealous/stockstats
- [PyPi](https://pypi.org/project/stockstats) (📥 14K / month · 📦 14 · ⏱️ 18.05.2025):
pip install stockstats
tf-quant-finance (🥉21 · ⭐ 4.9K) - High-performance TensorFlow library for quantitative.. Apache-2 - [GitHub](https://github.com/google/tf-quant-finance) (👨‍💻 48 · 🔀 600 · 📋 63 - 55% open · ⏱️ 21.03.2025):
git clone https://github.com/google/tf-quant-finance
- [PyPi](https://pypi.org/project/tf-quant-finance) (📥 460 / month · 📦 3 · ⏱️ 19.08.2022):
pip install tf-quant-finance
Show 16 hidden projects...
- zipline (🥇33 · ⭐ 18K · 💀) - Zipline, a Pythonic Algorithmic Trading Library. Apache-2
- pyfolio (🥇32 · ⭐ 6K · 💀) - Portfolio and risk analytics in Python. Apache-2
- ta (🥈31 · ⭐ 4.6K · 💀) - Technical Analysis Library using Pandas and Numpy. MIT
- arch (🥈30 · ⭐ 1.4K) - ARCH models in Python. ❗Unlicensed
- backtrader (🥈29 · ⭐ 17K · 💀) - Python Backtesting library for trading strategies. ❗️GPL-3.0
- Alphalens (🥈28 · ⭐ 3.7K · 💀) - Performance analysis of predictive (alpha) stock factors. Apache-2
- IB-insync (🥈28 · ⭐ 3K · 💀) - Python sync/async framework for Interactive Brokers API. BSD-2
- empyrical (🥈28 · ⭐ 1.4K · 💀) - Common financial risk and performance metrics. Used by.. Apache-2
- Backtesting.py (🥉27 · ⭐ 6.5K) - Backtest trading strategies in Python. ❗️AGPL-3.0
- Enigma Catalyst (🥉26 · ⭐ 2.5K · 💀) - An Algorithmic Trading Library for Crypto-Assets in.. Apache-2
- PyAlgoTrade (🥉24 · ⭐ 4.5K · 💀) - Python Algorithmic Trading Library. Apache-2
- FinTA (🥉24 · ⭐ 2.2K · 💀) - Common financial technical indicators implemented in Pandas. ❗️LGPL-3.0
- Crypto Signals (🥉22 · ⭐ 5.2K · 💀) - Github.com/CryptoSignal - Trading & Technical Analysis Bot -.. MIT
- FinQuant (🥉22 · ⭐ 1.5K · 💀) - A program for financial portfolio management, analysis and.. MIT
- surpriver (🥉12 · ⭐ 1.8K · 💀) - Find big moving stocks before they move using machine.. ❗️GPL-3.0
- pyrtfolio (🥉9 · ⭐ 150 · 💀) - Python package to generate stock portfolios. ❗️GPL-3.0


Time Series Data

Libraries for forecasting, anomaly detection, feature extraction, and machine learning on time-series and sequential data.
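
For readers new to forecasting, the sketch below shows a seasonal naive baseline, which simply repeats the last observed season. It uses only the standard library and is not the API of any project listed here; the function name `seasonal_naive_forecast` and the sample series are assumptions for illustration.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast by repeating the most recent full season of observations."""
    if season_length <= 0 or len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Cycle through the last season until the requested horizon is covered.
    return [last_season[step % season_length] for step in range(horizon)]


# Made-up series with a period of 3, used purely as sample input.
history = [10, 20, 30, 12, 22, 32]
forecast = seasonal_naive_forecast(history, season_length=3, horizon=5)
print(forecast)
```

A baseline like this is a useful sanity check: a more elaborate model that cannot beat it on held-out data is probably not worth its complexity.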

sktime (🥇40 · ⭐ 8.4K) - A unified framework for machine learning with time series. BSD-3 - [GitHub](https://github.com/sktime/sktime) (👨‍💻 460 · 🔀 1.5K · 📥 110 · 📦 4.5K · 📋 2.9K - 38% open · ⏱️ 20.05.2025):
git clone https://github.com/sktime/sktime
- [PyPi](https://pypi.org/project/sktime) (📥 950K / month · 📦 140 · ⏱️ 12.04.2025):
pip install sktime
- [Conda](https://anaconda.org/conda-forge/sktime-all-extras) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge sktime-all-extras
Prophet (🥇34 · ⭐ 19K) - Tool for producing high quality forecasts for time series data that has.. MIT - [GitHub](https://github.com/facebook/prophet) (👨‍💻 190 · 🔀 4.5K · 📥 3K · 📦 21 · 📋 2.2K - 20% open · ⏱️ 17.05.2025):
git clone https://github.com/facebook/prophet
- [PyPi](https://pypi.org/project/fbprophet) (📥 150K / month · 📦 91 · ⏱️ 05.09.2020):
pip install fbprophet
- [Conda](https://anaconda.org/conda-forge/prophet) (📥 1.4M · ⏱️ 22.04.2025):
conda install -c conda-forge prophet
Darts (🥇33 · ⭐ 8.6K) - A python library for user-friendly forecasting and anomaly detection.. Apache-2 - [GitHub](https://github.com/unit8co/darts) (👨‍💻 130 · 🔀 920 · 📋 1.7K - 14% open · ⏱️ 16.05.2025):
git clone https://github.com/unit8co/darts
- [PyPi](https://pypi.org/project/u8darts) (📥 71K / month · 📦 10 · ⏱️ 18.04.2025):
pip install u8darts
- [Conda](https://anaconda.org/conda-forge/u8darts-all) (📥 79K · ⏱️ 22.04.2025):
conda install -c conda-forge u8darts-all
- [Docker Hub](https://hub.docker.com/r/unit8/darts) (📥 1.6K · ⏱️ 18.04.2025):
docker pull unit8/darts
StatsForecast (🥇33 · ⭐ 4.4K) - Lightning fast forecasting with statistical and econometric.. Apache-2 - [GitHub](https://github.com/Nixtla/statsforecast) (👨‍💻 51 · 🔀 310 · 📦 1.7K · 📋 360 - 30% open · ⏱️ 29.04.2025):
git clone https://github.com/Nixtla/statsforecast
- [PyPi](https://pypi.org/project/statsforecast) (📥 840K / month · 📦 68 · ⏱️ 18.02.2025):
pip install statsforecast
- [Conda](https://anaconda.org/conda-forge/statsforecast) (📥 180K · ⏱️ 22.04.2025):
conda install -c conda-forge statsforecast
tsfresh (🥈32 · ⭐ 8.8K) - Automatic extraction of relevant features from time series. MIT - [GitHub](https://github.com/blue-yonder/tsfresh) (👨‍💻 99 · 🔀 1.2K · 📦 21 · 📋 550 - 12% open · ⏱️ 16.02.2025):
git clone https://github.com/blue-yonder/tsfresh
- [PyPi](https://pypi.org/project/tsfresh) (📥 240K / month · 📦 100 · ⏱️ 16.02.2025):
pip install tsfresh
- [Conda](https://anaconda.org/conda-forge/tsfresh) (📥 1.4M · ⏱️ 22.04.2025):
conda install -c conda-forge tsfresh
pytorch-forecasting (🥈32 · ⭐ 4.3K) - Time series forecasting with PyTorch. MIT - [GitHub](https://github.com/sktime/pytorch-forecasting) (👨‍💻 69 · 🔀 670 · 📦 610 · 📋 850 - 61% open · ⏱️ 18.05.2025):
git clone https://github.com/sktime/pytorch-forecasting
- [PyPi](https://pypi.org/project/pytorch-forecasting) (📥 130K / month · 📦 22 · ⏱️ 06.02.2025):
pip install pytorch-forecasting
- [Conda](https://anaconda.org/conda-forge/pytorch-forecasting) (📥 78K · ⏱️ 22.04.2025):
conda install -c conda-forge pytorch-forecasting
NeuralForecast (🥈32 · ⭐ 3.5K) - Scalable and user friendly neural forecasting algorithms. Apache-2 - [GitHub](https://github.com/Nixtla/neuralforecast) (👨‍💻 50 · 🔀 410 · 📦 380 · 📋 620 - 17% open · ⏱️ 16.05.2025):
git clone https://github.com/Nixtla/neuralforecast
- [PyPi](https://pypi.org/project/neuralforecast) (📥 100K / month · 📦 28 · ⏱️ 13.05.2025):
pip install neuralforecast
- [Conda](https://anaconda.org/conda-forge/neuralforecast) (📥 38K · ⏱️ 22.04.2025):
conda install -c conda-forge neuralforecast
pmdarima (🥈32 · ⭐ 1.6K) - A statistical library designed to fill the void in Python's time series.. MIT - [GitHub](https://github.com/alkaline-ml/pmdarima) (👨‍💻 23 · 🔀 240 · 📦 12K · 📋 340 - 19% open · ⏱️ 07.11.2024):
git clone https://github.com/alkaline-ml/pmdarima
- [PyPi](https://pypi.org/project/pmdarima) (📥 2.4M / month · 📦 150 · ⏱️ 23.10.2023):
pip install pmdarima
- [Conda](https://anaconda.org/conda-forge/pmdarima) (📥 1.3M · ⏱️ 22.04.2025):
conda install -c conda-forge pmdarima
skforecast (🥈32 · ⭐ 1.3K) - Time series forecasting with machine learning models. BSD-3 - [GitHub](https://github.com/skforecast/skforecast) (👨‍💻 20 · 🔀 160 · 📦 470 · 📋 200 - 14% open · ⏱️ 01.05.2025):
git clone https://github.com/skforecast/skforecast
- [PyPi](https://pypi.org/project/skforecast) (📥 77K / month · 📦 18 · ⏱️ 01.05.2025):
pip install skforecast
STUMPY (🥈31 · ⭐ 3.9K) - STUMPY is a powerful and scalable Python library for modern time series.. BSD-3 - [GitHub](https://github.com/TDAmeritrade/stumpy) (👨‍💻 40 · 🔀 330 · 📦 1.3K · 📋 540 - 13% open · ⏱️ 08.04.2025):
git clone https://github.com/TDAmeritrade/stumpy
- [PyPi](https://pypi.org/project/stumpy) (📥 300K / month · 📦 30 · ⏱️ 09.07.2024):
pip install stumpy
- [Conda](https://anaconda.org/conda-forge/stumpy) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge stumpy
tslearn (🥈31 · ⭐ 3K · 💤) - The machine learning toolkit for time series analysis in Python. BSD-2 - [GitHub](https://github.com/tslearn-team/tslearn) (👨‍💻 43 · 🔀 340 · 📦 1.8K · 📋 340 - 41% open · ⏱️ 01.07.2024):
git clone https://github.com/tslearn-team/tslearn
- [PyPi](https://pypi.org/project/tslearn) (📥 340K / month · 📦 79 · ⏱️ 12.12.2023):
pip install tslearn
- [Conda](https://anaconda.org/conda-forge/tslearn) (📥 1.6M · ⏱️ 22.04.2025):
conda install -c conda-forge tslearn
GluonTS (🥈30 · ⭐ 4.9K) - Probabilistic time series modeling in Python. Apache-2 - [GitHub](https://github.com/awslabs/gluonts) (👨‍💻 120 · 🔀 780 · 📋 970 - 34% open · ⏱️ 08.04.2025):
git clone https://github.com/awslabs/gluonts
- [PyPi](https://pypi.org/project/gluonts) (📥 630K / month · 📦 36 · ⏱️ 08.04.2025):
pip install gluonts
- [Conda](https://anaconda.org/anaconda/gluonts) (📥 2K · ⏱️ 22.04.2025):
conda install -c anaconda gluonts
Streamz (🥉29 · ⭐ 1.3K) - Real-time stream processing for python. BSD-3 - [GitHub](https://github.com/python-streamz/streamz) (👨‍💻 49 · 🔀 150 · 📦 550 · 📋 270 - 44% open · ⏱️ 22.11.2024):
git clone https://github.com/python-streamz/streamz
- [PyPi](https://pypi.org/project/streamz) (📥 27K / month · 📦 57 · ⏱️ 27.07.2022):
pip install streamz
- [Conda](https://anaconda.org/conda-forge/streamz) (📥 2.1M · ⏱️ 22.04.2025):
conda install -c conda-forge streamz
pyts (🥉27 · ⭐ 1.8K) - A Python package for time series classification. BSD-3 - [GitHub](https://github.com/johannfaouzi/pyts) (👨‍💻 15 · 🔀 170 · 📦 870 · 📋 87 - 58% open · ⏱️ 27.04.2025):
git clone https://github.com/johannfaouzi/pyts
- [PyPi](https://pypi.org/project/pyts) (📥 120K / month · 📦 45 · ⏱️ 18.06.2023):
pip install pyts
- [Conda](https://anaconda.org/conda-forge/pyts) (📥 33K · ⏱️ 22.04.2025):
conda install -c conda-forge pyts
NeuralProphet (🥉26 · ⭐ 4.1K · 💤) - NeuralProphet: A simple forecasting package. MIT - [GitHub](https://github.com/ourownstory/neural_prophet) (👨‍💻 55 · 🔀 490 · 📋 560 - 11% open · ⏱️ 13.09.2024):
git clone https://github.com/ourownstory/neural_prophet
- [PyPi](https://pypi.org/project/neuralprophet) (📥 87K / month · 📦 8 · ⏱️ 26.06.2024):
pip install neuralprophet
greykite (🥉22 · ⭐ 1.8K) - A flexible, intuitive and fast forecasting library. BSD-2 - [GitHub](https://github.com/linkedin/greykite) (👨‍💻 10 · 🔀 110 · 📥 36 · 📦 44 · 📋 110 - 10% open · ⏱️ 20.02.2025):
git clone https://github.com/linkedin/greykite
- [PyPi](https://pypi.org/project/greykite) (📥 6.9K / month · ⏱️ 20.02.2025):
pip install greykite
TSFEL (🥉21 · ⭐ 1K · 💤) - An intuitive library to extract features from time series. BSD-3 - [GitHub](https://github.com/fraunhoferportugal/tsfel) (👨‍💻 20 · 🔀 140 · 📦 210 · 📋 82 - 12% open · ⏱️ 17.10.2024):
git clone https://github.com/fraunhoferportugal/tsfel
- [PyPi](https://pypi.org/project/tsfel) (📥 6.8K / month · 📦 7 · ⏱️ 12.09.2024):
pip install tsfel
pydlm (🥉21 · ⭐ 480 · 💤) - A python library for Bayesian time series modeling. BSD-3 - [GitHub](https://github.com/wwrechard/pydlm) (👨‍💻 7 · 🔀 98 · 📦 41 · 📋 51 - 70% open · ⏱️ 07.09.2024):
git clone https://github.com/wwrechard/pydlm
- [PyPi](https://pypi.org/project/pydlm) (📥 71K / month · 📦 2 · ⏱️ 13.08.2024):
pip install pydlm
tsflex (🥉20 · ⭐ 420 · 💤) - Flexible time series feature extraction & processing. MIT - [GitHub](https://github.com/predict-idlab/tsflex) (👨‍💻 6 · 🔀 26 · 📦 22 · 📋 56 - 58% open · ⏱️ 06.09.2024):
git clone https://github.com/predict-idlab/tsflex
- [PyPi](https://pypi.org/project/tsflex) (📥 1.9K / month · 📦 2 · ⏱️ 06.09.2024):
pip install tsflex
- [Conda](https://anaconda.org/conda-forge/tsflex) (📥 33K · ⏱️ 22.04.2025):
conda install -c conda-forge tsflex
Auto TS (🥉18 · ⭐ 760 · 💤) - Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost.. Apache-2 - [GitHub](https://github.com/AutoViML/Auto_TS) (👨‍💻 13 · 🔀 120 · 📋 90 - 2% open · ⏱️ 05.05.2024):
git clone https://github.com/AutoViML/Auto_TS
- [PyPi](https://pypi.org/project/auto-ts) (📥 2.2K / month · ⏱️ 05.05.2024):
pip install auto-ts
Show 9 hidden projects...
- PyFlux (🥉25 · ⭐ 2.1K · 💀) - Open source time series library for Python. BSD-3
- ADTK (🥉23 · ⭐ 1.2K · 💀) - A Python toolkit for rule-based/unsupervised anomaly detection in.. MPL-2.0
- luminol (🥉21 · ⭐ 1.2K · 💀) - Anomaly Detection and Correlation library. Apache-2
- seglearn (🥉21 · ⭐ 580 · 💀) - Python module for machine learning time series:. BSD-3
- tick (🥉20 · ⭐ 510 · 💀) - Module for statistical learning, with a particular emphasis on time-.. BSD-3
- matrixprofile-ts (🥉19 · ⭐ 740 · 💀) - A Python library for detecting patterns and anomalies.. Apache-2
- atspy (🥉14 · ⭐ 520 · 💀) - AtsPy: Automated Time Series Models in Python (by @firmai). MIT
- tsaug (🥉14 · ⭐ 350 · 💀) - A Python package for time series augmentation. Apache-2
- tslumen (🥉8 · ⭐ 69 · 💀) - A library for Time Series EDA (exploratory data analysis). Apache-2


Medical Data

Libraries for processing and analyzing medical data such as MRIs, EEGs, genomic data, and other medical imaging formats.

MNE (🥇39 · ⭐ 2.9K) - MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python. BSD-3 - [GitHub](https://github.com/mne-tools/mne-python) (👨‍💻 390 · 🔀 1.3K · 📦 5.9K · 📋 5K - 11% open · ⏱️ 22.05.2025):
git clone https://github.com/mne-tools/mne-python
- [PyPi](https://pypi.org/project/mne) (📥 160K / month · 📦 420 · ⏱️ 18.12.2024):
pip install mne
- [Conda](https://anaconda.org/conda-forge/mne) (📥 530K · ⏱️ 22.04.2025):
conda install -c conda-forge mne
Nilearn (🥇38 · ⭐ 1.3K) - Machine learning for NeuroImaging in Python. BSD-3 - [GitHub](https://github.com/nilearn/nilearn) (👨‍💻 260 · 🔀 600 · 📥 300 · 📦 4.2K · 📋 2.4K - 12% open · ⏱️ 21.05.2025):
git clone https://github.com/nilearn/nilearn
- [PyPi](https://pypi.org/project/nilearn) (📥 260K / month · 📦 310 · ⏱️ 23.12.2024):
pip install nilearn
- [Conda](https://anaconda.org/conda-forge/nilearn) (📥 330K · ⏱️ 22.04.2025):
conda install -c conda-forge nilearn
MONAI (🥈36 · ⭐ 6.4K) - AI Toolkit for Healthcare Imaging. Apache-2 - [GitHub](https://github.com/Project-MONAI/MONAI) (👨‍💻 220 · 🔀 1.2K · 📦 4.2K · 📋 3.3K - 13% open · ⏱️ 16.05.2025):
git clone https://github.com/Project-MONAI/MONAI
- [PyPi](https://pypi.org/project/monai) (📥 350K / month · 📦 140 · ⏱️ 10.12.2024):
pip install monai
- [Conda](https://anaconda.org/conda-forge/monai) (📥 49K · ⏱️ 22.04.2025):
conda install -c conda-forge monai
NIPYPE (🥈35 · ⭐ 780) - Workflows and interfaces for neuroimaging packages. Apache-2 - [GitHub](https://github.com/nipy/nipype) (👨‍💻 260 · 🔀 530 · 📦 6.9K · 📋 1.4K - 30% open · ⏱️ 28.04.2025):
git clone https://github.com/nipy/nipype
- [PyPi](https://pypi.org/project/nipype) (📥 330K / month · 📦 150 · ⏱️ 19.03.2025):
pip install nipype
- [Conda](https://anaconda.org/conda-forge/nipype) (📥 800K · ⏱️ 05.05.2025):
conda install -c conda-forge nipype
NiBabel (🥈34 · ⭐ 700) - Python package to access a cacophony of neuro-imaging file formats. MIT - [GitHub](https://github.com/nipy/nibabel) (👨‍💻 110 · 🔀 260 · 📦 29K · 📋 550 - 23% open · ⏱️ 20.05.2025):
git clone https://github.com/nipy/nibabel
- [PyPi](https://pypi.org/project/nibabel) (📥 880K / month · 📦 1.2K · ⏱️ 23.10.2024):
pip install nibabel
- [Conda](https://anaconda.org/conda-forge/nibabel) (📥 900K · ⏱️ 22.04.2025):
conda install -c conda-forge nibabel
Lifelines (🥈33 · ⭐ 2.4K · 💤) - Survival analysis in Python. MIT - [GitHub](https://github.com/CamDavidsonPilon/lifelines) (👨‍💻 120 · 🔀 560 · 📦 3.9K · 📋 980 - 27% open · ⏱️ 29.10.2024):
git clone https://github.com/CamDavidsonPilon/lifelines
- [PyPi](https://pypi.org/project/lifelines) (📥 2.2M / month · 📦 160 · ⏱️ 29.10.2024):
pip install lifelines
- [Conda](https://anaconda.org/conda-forge/lifelines) (📥 430K · ⏱️ 22.04.2025):
conda install -c conda-forge lifelines
Hail (🥈32 · ⭐ 1K) - Cloud-native genomic dataframes and batch computing. MIT - [GitHub](https://github.com/hail-is/hail) (👨‍💻 97 · 🔀 250 · 📦 170 · 📋 2.5K - 10% open · ⏱️ 22.05.2025):
git clone https://github.com/hail-is/hail
- [PyPi](https://pypi.org/project/hail) (📥 18K / month · 📦 42 · ⏱️ 07.03.2025):
pip install hail
DeepVariant (🥉28 · ⭐ 3.4K) - DeepVariant is an analysis pipeline that uses a deep neural.. BSD-3 - [GitHub](https://github.com/google/deepvariant) (👨‍💻 41 · 🔀 740 · 📥 4.8K · 📦 4 · 📋 910 - 0% open · ⏱️ 16.05.2025):
git clone https://github.com/google/deepvariant
- [Conda](https://anaconda.org/bioconda/deepvariant) (📥 76K · ⏱️ 22.04.2025):
conda install -c bioconda deepvariant
Brainiak (🥉19 · ⭐ 350) - Brain Imaging Analysis Kit. Apache-2 - [GitHub](https://github.com/brainiak/brainiak) (👨‍💻 35 · 🔀 140 · 📋 230 - 38% open · ⏱️ 06.01.2025):
git clone https://github.com/brainiak/brainiak
- [PyPi](https://pypi.org/project/brainiak) (📥 1.7K / month · ⏱️ 07.01.2025):
pip install brainiak
- [Docker Hub](https://hub.docker.com/r/brainiak/brainiak) (📥 1.9K · ⭐ 1 · ⏱️ 07.01.2025):
docker pull brainiak/brainiak
Show 10 hidden projects...
- DIPY (🥈32 · ⭐ 760) - DIPY is the paragon 3D/4D+ medical imaging library in Python... ❗Unlicensed
- NiftyNet (🥉25 · ⭐ 1.4K · 💀) - [unmaintained] An open-source convolutional neural.. Apache-2
- NIPY (🥉24 · ⭐ 390) - Neuroimaging in Python FMRI analysis package. ❗Unlicensed
- MedPy (🥉23 · ⭐ 600 · 💤) - Medical image processing in Python. ❗️GPL-3.0
- DLTK (🥉20 · ⭐ 1.4K · 💀) - Deep Learning Toolkit for Medical Image Analysis. Apache-2
- Glow (🥉20 · ⭐ 280) - An open-source toolkit for large-scale genomic analysis. Apache-2
- MedicalTorch (🥉15 · ⭐ 870 · 💀) - A medical imaging framework for Pytorch. Apache-2
- Medical Detection Toolkit (🥉14 · ⭐ 1.3K · 💀) - The Medical Detection Toolkit contains 2D + 3D.. Apache-2
- DeepNeuro (🥉14 · ⭐ 130 · 💀) - A deep learning python package for neuroimaging data. Made by:. MIT
- MedicalNet (🥉12 · ⭐ 2K · 💀) - Many studies have shown that the performance on deep learning is.. MIT


Tabular Data

Libraries for processing tabular and structured data.

skrub (🥇29 · ⭐ 1.4K) - Machine learning with dataframes. BSD-3 - [GitHub](https://github.com/skrub-data/skrub) (👨‍💻 59 · 🔀 120 · 📦 87 · 📋 440 - 20% open · ⏱️ 22.05.2025):
git clone https://github.com/skrub-data/skrub
- [PyPi](https://pypi.org/project/skrub) (📥 12K / month · 📦 10 · ⏱️ 03.04.2025):
pip install skrub
pytorch_tabular (🥈24 · ⭐ 1.5K) - A standard framework for modelling Deep Learning Models.. MIT - [GitHub](https://github.com/manujosephv/pytorch_tabular) (👨‍💻 27 · 🔀 150 · 📥 54 · 📋 170 - 7% open · ⏱️ 19.04.2025):
git clone https://github.com/manujosephv/pytorch_tabular
- [PyPi](https://pypi.org/project/pytorch_tabular) (📥 19K / month · 📦 9 · ⏱️ 28.11.2024):
pip install pytorch_tabular
miceforest (🥈24 · ⭐ 380 · 💤) - Multiple Imputation with LightGBM in Python. MIT - [GitHub](https://github.com/AnotherSamWilson/miceforest) (👨‍💻 8 · 🔀 30 · 📦 240 · 📋 90 - 11% open · ⏱️ 02.08.2024):
git clone https://github.com/AnotherSamWilson/miceforest
- [PyPi](https://pypi.org/project/miceforest) (📥 78K / month · 📦 9 · ⏱️ 02.08.2024):
pip install miceforest
- [Conda](https://anaconda.org/conda-forge/miceforest) (📥 19K · ⏱️ 22.04.2025):
conda install -c conda-forge miceforest
upgini (🥉22 · ⭐ 330) - Data search & enrichment library for Machine Learning Easily find and add.. BSD-3 - [GitHub](https://github.com/upgini/upgini) (👨‍💻 13 · 🔀 25 · 📦 9 · ⏱️ 20.05.2025):
git clone https://github.com/upgini/upgini
- [PyPi](https://pypi.org/project/upgini) (📥 12K / month · ⏱️ 22.05.2025):
pip install upgini
Show 2 hidden projects...
- carefree-learn (🥉17 · ⭐ 410 · 💀) - Deep Learning PyTorch. MIT
- deltapy (🥉12 · ⭐ 550 · 💀) - DeltaPy - Tabular Data Augmentation (by @firmai). MIT


Optical Character Recognition

Libraries for optical character recognition (OCR) and text extraction from images or videos.

PaddleOCR (🥇42 · ⭐ 49K) - Awesome multilingual OCR toolkits based on PaddlePaddle.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleOCR) (👨‍💻 300 · 🔀 8.2K · 📥 2M · 📦 5.7K · 📋 9.6K - 1% open · ⏱️ 22.05.2025):
git clone https://github.com/PaddlePaddle/PaddleOCR
- [PyPi](https://pypi.org/project/paddleocr) (📥 330K / month · 📦 150 · ⏱️ 20.05.2025):
pip install paddleocr
OCRmyPDF (🥇37 · ⭐ 29K) - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them.. MPL-2.0 - [GitHub](https://github.com/ocrmypdf/OCRmyPDF) (👨‍💻 110 · 🔀 2K · 📥 12K · 📦 1.3K · 📋 1.3K - 10% open · ⏱️ 21.04.2025):
git clone https://github.com/ocrmypdf/OCRmyPDF
- [PyPi](https://pypi.org/project/ocrmypdf) (📥 220K / month · 📦 46 · ⏱️ 24.04.2025):
pip install ocrmypdf
- [Conda](https://anaconda.org/conda-forge/ocrmypdf) (📥 99K · ⏱️ 22.04.2025):
conda install -c conda-forge ocrmypdf
EasyOCR (🥈34 · ⭐ 27K · 💤) - Ready-to-use OCR with 80+ supported languages and all popular.. Apache-2 - [GitHub](https://github.com/JaidedAI/EasyOCR) (👨‍💻 130 · 🔀 3.3K · 📥 21M · 📦 15K · 📋 1.1K - 43% open · ⏱️ 24.09.2024):
git clone https://github.com/JaidedAI/EasyOCR
- [PyPi](https://pypi.org/project/easyocr) (📥 910K / month · 📦 250 · ⏱️ 24.09.2024):
pip install easyocr
Tesseract (🥈31 · ⭐ 6.1K) - Python-tesseract is an optical character recognition (OCR) tool.. Apache-2 - [GitHub](https://github.com/madmaze/pytesseract) (👨‍💻 50 · 🔀 720 · 📋 370 - 3% open · ⏱️ 17.02.2025):
git clone https://github.com/madmaze/pytesseract
- [PyPi](https://pypi.org/project/pytesseract) (📥 3M / month · 📦 970 · ⏱️ 16.08.2024):
pip install pytesseract
- [Conda](https://anaconda.org/conda-forge/pytesseract) (📥 660K · ⏱️ 22.04.2025):
conda install -c conda-forge pytesseract
tesserocr (🥈30 · ⭐ 2.1K) - A Python wrapper for the tesseract-ocr API. MIT - [GitHub](https://github.com/sirfz/tesserocr) (👨‍💻 32 · 🔀 260 · 📥 960 · 📦 1.3K · 📋 280 - 17% open · ⏱️ 08.05.2025):
git clone https://github.com/sirfz/tesserocr
- [PyPi](https://pypi.org/project/tesserocr) (📥 130K / month · 📦 43 · ⏱️ 12.02.2025):
pip install tesserocr
- [Conda](https://anaconda.org/conda-forge/tesserocr) (📥 250K · ⏱️ 22.04.2025):
conda install -c conda-forge tesserocr
MMOCR (🥉27 · ⭐ 4.5K) - OpenMMLab Text Detection, Recognition and Understanding Toolbox. Apache-2 - [GitHub](https://github.com/open-mmlab/mmocr) (👨‍💻 90 · 🔀 750 · 📦 230 · 📋 930 - 20% open · ⏱️ 27.11.2024):
git clone https://github.com/open-mmlab/mmocr
- [PyPi](https://pypi.org/project/mmocr) (📥 4.2K / month · 📦 4 · ⏱️ 05.05.2022):
pip install mmocr
Show 6 hidden projects...
- keras-ocr (🥉26 · ⭐ 1.5K · 💀) - A packaged and flexible version of the CRAFT text detector.. MIT
- calamari (🥉22 · ⭐ 1.1K) - Line based ATR Engine based on OCRopy. ❗️GPL-3.0
- pdftabextract (🥉21 · ⭐ 2.2K · 💀) - A set of tools for extracting tables from PDF files.. Apache-2
- attention-ocr (🥉21 · ⭐ 1.1K · 💀) - A Tensorflow model for text recognition (CNN + seq2seq.. MIT
- doc2text (🥉20 · ⭐ 1.3K · 💀) - Detect text blocks and OCR poorly scanned PDFs in bulk. Python.. MIT
- Mozart (🥉10 · ⭐ 660 · 💀) - An optical music recognition (OMR) system. Converts sheet.. Apache-2


Data Containers & Structures

General-purpose data containers & structures as well as utilities & extensions for pandas.

🔗 best-of-python - Data Containers ( ⭐ 4K) - Collection of data-container, dataframe, and pandas-..


Data Loading & Extraction

Libraries for loading, collecting, and extracting data from a variety of data sources and formats.

🔗 best-of-python - Data Extraction ( ⭐ 4K) - Collection of data-loading and -extraction libraries.


Web Scraping & Crawling

Libraries for web scraping, crawling, downloading, and mining.

🔗 best-of-web-python - Web Scraping ( ⭐ 2.5K · 💤) - Collection of web-scraping and crawling libraries.


Data Pipelines & Streaming

Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.

🔗 best-of-python - Data Pipelines ( ⭐ 4K) - Libraries for data batch- and stream-processing,..

Show 1 hidden projects... - pyclugen (🥇10 · ⭐ 8 · 💤) - Multidimensional cluster generation in Python. MIT


Distributed Machine Learning

Libraries that provide capabilities to distribute and parallelize machine learning tasks across large-scale compute infrastructure.

Ray (🥇47 · ⭐ 37K) - Ray is an AI compute engine. Ray consists of a core distributed runtime.. Apache-2 - [GitHub](https://github.com/ray-project/ray) (👨‍💻 1.2K · 🔀 6.3K · 📥 260 · 📦 24K · 📋 21K - 21% open · ⏱️ 22.05.2025):
git clone https://github.com/ray-project/ray
- [PyPi](https://pypi.org/project/ray) (📥 11M / month · 📦 960 · ⏱️ 07.05.2025):
pip install ray
- [Conda](https://anaconda.org/conda-forge/ray-tune) (📥 770K · ⏱️ 12.05.2025):
conda install -c conda-forge ray-tune
dask (🥇45 · ⭐ 13K) - Parallel computing with task scheduling. BSD-3 - [GitHub](https://github.com/dask/dask) (👨‍💻 620 · 🔀 1.8K · 📦 75K · 📋 5.6K - 20% open · ⏱️ 20.05.2025):
git clone https://github.com/dask/dask
- [PyPi](https://pypi.org/project/dask) (📥 12M / month · 📦 2.9K · ⏱️ 20.05.2025):
pip install dask
- [Conda](https://anaconda.org/conda-forge/dask) (📥 13M · ⏱️ 20.05.2025):
conda install -c conda-forge dask
DeepSpeed (🥇41 · ⭐ 39K) - DeepSpeed is a deep learning optimization library that makes.. Apache-2 - [GitHub](https://github.com/deepspeedai/DeepSpeed) (👨‍💻 380 · 🔀 4.4K · 📦 13K · 📋 3.1K - 34% open · ⏱️ 22.05.2025):
git clone https://github.com/deepspeedai/DeepSpeed
- [PyPi](https://pypi.org/project/deepspeed) (📥 760K / month · 📦 280 · ⏱️ 19.05.2025):
pip install deepspeed
- [Docker Hub](https://hub.docker.com/r/deepspeed/deepspeed) (📥 22K · ⭐ 4 · ⏱️ 02.09.2022):
docker pull deepspeed/deepspeed
dask.distributed (🥇39 · ⭐ 1.6K) - A distributed task scheduler for Dask. BSD-3 - [GitHub](https://github.com/dask/distributed) (👨‍💻 340 · 🔀 730 · 📦 41K · 📋 4K - 38% open · ⏱️ 20.05.2025):
git clone https://github.com/dask/distributed
- [PyPi](https://pypi.org/project/distributed) (📥 3.7M / month · 📦 970 · ⏱️ 20.05.2025):
pip install distributed
- [Conda](https://anaconda.org/conda-forge/distributed) (📥 18M · ⏱️ 20.05.2025):
conda install -c conda-forge distributed
horovod (🥈36 · ⭐ 14K) - Distributed training framework for TensorFlow, Keras, PyTorch, and.. Apache-2 - [GitHub](https://github.com/horovod/horovod) (👨‍💻 170 · 🔀 2.3K · 📦 1.4K · 📋 2.3K - 17% open · ⏱️ 01.02.2025):
git clone https://github.com/horovod/horovod
- [PyPi](https://pypi.org/project/horovod) (📥 100K / month · 📦 34 · ⏱️ 12.06.2023):
pip install horovod
metrics (🥈36 · ⭐ 2.3K) - Machine learning metrics for distributed, scalable PyTorch.. Apache-2 - [GitHub](https://github.com/Lightning-AI/torchmetrics) (👨‍💻 270 · 🔀 430 · 📥 6.5K · 📦 42K · 📋 950 - 7% open · ⏱️ 19.05.2025):
git clone https://github.com/Lightning-AI/torchmetrics
- [PyPi](https://pypi.org/project/torchmetrics):
pip install torchmetrics
- [Conda](https://anaconda.org/conda-forge/torchmetrics) (📥 2M · ⏱️ 22.04.2025):
conda install -c conda-forge torchmetrics
H2O-3 (🥈34 · ⭐ 7.2K) - H2O is an Open Source, Distributed, Fast & Scalable Machine Learning.. Apache-2 - [GitHub](https://github.com/h2oai/h2o-3) (👨‍💻 280 · 🔀 2K · 📦 98 · 📋 9.6K - 30% open · ⏱️ 08.05.2025):
git clone https://github.com/h2oai/h2o-3
- [PyPi](https://pypi.org/project/h2o) (📥 200K / month · 📦 58 · ⏱️ 27.03.2025):
pip install h2o
BigDL (🥈32 · ⭐ 7.9K) - Accelerate local LLM inference and finetuning (LLaMA, Mistral,.. Apache-2 - [GitHub](https://github.com/intel/ipex-llm) (👨‍💻 120 · 🔀 1.4K · 📥 690 · 📋 2.9K - 39% open · ⏱️ 22.05.2025):
git clone https://github.com/intel/ipex-llm
- [PyPi](https://pypi.org/project/bigdl) (📥 13K / month · 📦 2 · ⏱️ 24.03.2024):
pip install bigdl
- [Maven](https://search.maven.org/artifact/com.intel.analytics.bigdl/bigdl-SPARK_2.4) (📦 5 · ⏱️ 20.04.2021):
<dependency>
    <groupId>com.intel.analytics.bigdl</groupId>
    <artifactId>bigdl-SPARK_2.4</artifactId>
    <version>[VERSION]</version>
</dependency>
ColossalAI (🥈31 · ⭐ 41K) - Making large AI models cheaper, faster and more accessible. Apache-2 - [GitHub](https://github.com/hpcaitech/ColossalAI) (👨‍💻 190 · 🔀 4.5K · 📦 510 · 📋 1.8K - 27% open · ⏱️ 18.04.2025):
git clone https://github.com/hpcaitech/ColossalAI
FairScale (🥈31 · ⭐ 3.3K) - PyTorch extensions for high performance and large scale training. BSD-3 - [GitHub](https://github.com/facebookresearch/fairscale) (👨‍💻 76 · 🔀 290 · 📦 8.4K · 📋 390 - 26% open · ⏱️ 26.04.2025):
git clone https://github.com/facebookresearch/fairscale
- [PyPi](https://pypi.org/project/fairscale) (📥 510K / month · 📦 150 · ⏱️ 11.12.2022):
pip install fairscale
- [Conda](https://anaconda.org/conda-forge/fairscale) (📥 450K · ⏱️ 22.04.2025):
conda install -c conda-forge fairscale
Submit it (🥈31 · ⭐ 1.4K · 📈) - Python 3.8+ toolbox for submitting jobs to Slurm. MIT - [GitHub](https://github.com/facebookincubator/submitit) (👨‍💻 26 · 🔀 140 · 📦 4.5K · 📋 130 - 39% open · ⏱️ 21.05.2025):
git clone https://github.com/facebookincubator/submitit
- [PyPi](https://pypi.org/project/submitit) (📥 480K / month · 📦 74 · ⏱️ 21.05.2025):
pip install submitit
- [Conda](https://anaconda.org/conda-forge/submitit) (📥 58K · ⏱️ 22.04.2025):
conda install -c conda-forge submitit
mpi4py (🥈31 · ⭐ 850) - Python bindings for MPI. BSD-3 - [GitHub](https://github.com/mpi4py/mpi4py) (👨‍💻 27 · 🔀 120 · 📥 33K · 📋 210 - 2% open · ⏱️ 10.05.2025):
git clone https://github.com/mpi4py/mpi4py
- [PyPi](https://pypi.org/project/mpi4py) (📥 490K / month · 📦 830 · ⏱️ 13.02.2025):
pip install mpi4py
- [Conda](https://anaconda.org/conda-forge/mpi4py) (📥 3.9M · ⏱️ 09.05.2025):
conda install -c conda-forge mpi4py
SynapseML (🥈30 · ⭐ 5.1K) - Simple and Distributed Machine Learning. MIT - [GitHub](https://github.com/microsoft/SynapseML) (👨‍💻 120 · 🔀 840 · 📋 800 - 49% open · ⏱️ 19.04.2025):
git clone https://github.com/microsoft/SynapseML
- [PyPi](https://pypi.org/project/synapseml) (📥 820K / month · 📦 7 · ⏱️ 17.04.2025):
pip install synapseml
dask-ml (🥈29 · ⭐ 940) - Scalable Machine Learning with Dask. BSD-3 - [GitHub](https://github.com/dask/dask-ml) (👨‍💻 81 · 🔀 260 · 📦 1.3K · 📋 550 - 51% open · ⏱️ 10.05.2025):
git clone https://github.com/dask/dask-ml
- [PyPi](https://pypi.org/project/dask-ml) (📥 100K / month · 📦 100 · ⏱️ 08.02.2025):
pip install dask-ml
- [Conda](https://anaconda.org/conda-forge/dask-ml) (📥 980K · ⏱️ 22.04.2025):
conda install -c conda-forge dask-ml
Hivemind (🥉27 · ⭐ 2.2K) - Decentralized deep learning in PyTorch. Built to train models on.. MIT - [GitHub](https://github.com/learning-at-home/hivemind) (👨‍💻 34 · 🔀 180 · 📦 130 · 📋 190 - 43% open · ⏱️ 06.05.2025):
git clone https://github.com/learning-at-home/hivemind
- [PyPi](https://pypi.org/project/hivemind) (📥 6.4K / month · 📦 12 · ⏱️ 20.04.2025):
pip install hivemind
Apache Singa (🥉25 · ⭐ 3.4K) - a distributed deep learning platform. Apache-2 - [GitHub](https://github.com/apache/singa) (👨‍💻 97 · 🔀 1.3K · 📦 6 · 📋 140 - 35% open · ⏱️ 26.03.2025):
git clone https://github.com/apache/singa
- [Conda](https://anaconda.org/nusdbsystem/singa) (📥 1K · ⏱️ 25.03.2025):
conda install -c nusdbsystem singa
- [Docker Hub](https://hub.docker.com/r/apache/singa) (📥 9.1K · ⭐ 4 · ⏱️ 31.05.2022):
docker pull apache/singa
MMLSpark (🥉23 · ⭐ 5.1K) - Simple and Distributed Machine Learning. MIT - [GitHub](https://github.com/microsoft/SynapseML) (👨‍💻 120 · 🔀 840 · 📋 800 - 49% open · ⏱️ 19.04.2025):
git clone https://github.com/microsoft/SynapseML
- [PyPi](https://pypi.org/project/mmlspark) (⏱️ 18.03.2020):
pip install mmlspark
analytics-zoo (🥉22 · ⭐ 2.6K) - Distributed Tensorflow, Keras and PyTorch on Apache.. Apache-2 - [GitHub](https://github.com/intel/analytics-zoo) (👨‍💻 110 · 🔀 730 · 📋 1.3K - 32% open · ⏱️ 09.01.2025):
git clone https://github.com/intel/analytics-zoo
- [PyPi](https://pypi.org/project/analytics-zoo) (📥 2.1K / month · 📦 1 · ⏱️ 22.08.2022):
pip install analytics-zoo
Show 18 hidden projects... - DEAP (🥈34 · ⭐ 6.1K) - Distributed Evolutionary Algorithms in Python. ❗️LGPL-3.0 - ipyparallel (🥈29 · ⭐ 2.6K) - IPython Parallel: Interactive Parallel Computing in.. ❗Unlicensed - petastorm (🥉28 · ⭐ 1.8K · 💀) - Petastorm library enables single machine or distributed.. Apache-2 - TensorFlowOnSpark (🥉27 · ⭐ 3.9K · 💀) - TensorFlowOnSpark brings TensorFlow programs to.. Apache-2 - Elephas (🥉25 · ⭐ 1.6K · 💀) - Distributed Deep learning with Keras & Spark. MIT keras - Mesh (🥉22 · ⭐ 1.6K · 💀) - Mesh TensorFlow: Model Parallelism Made Easier. Apache-2 - BytePS (🥉21 · ⭐ 3.7K · 💀) - A high performance and generic framework for distributed DNN.. Apache-2 - sk-dist (🥉21 · ⭐ 280 · 💀) - Distributed scikit-learn meta-estimators in PySpark. Apache-2 - somoclu (🥉21 · ⭐ 270 · 💀) - Massively parallel self-organizing maps: accelerate training on.. MIT - mesh-transformer-jax (🥉18 · ⭐ 6.3K · 💀) - Model parallel transformers in JAX and Haiku. Apache-2 - launchpad (🥉18 · ⭐ 320 · 💀) - Launchpad is a library that simplifies writing.. Apache-2 - Fiber (🥉17 · ⭐ 1K · 💀) - Distributed Computing for AI Made Simple. Apache-2 - parallelformers (🥉17 · ⭐ 780 · 💀) - Parallelformers: An Efficient Model Parallelization.. Apache-2 - bluefog (🥉17 · ⭐ 290 · 💀) - Distributed and decentralized training framework for PyTorch.. Apache-2 - TensorFrames (🥉15 · ⭐ 720 · 💀) - Tensorflow wrapper for DataFrames on Apache Spark. Apache-2 - LazyCluster (🥉14 · ⭐ 49 · 💀) - Distributed machine learning made simple. Apache-2 - autodist (🥉12 · ⭐ 130 · 💀) - Simple Distributed Deep Learning on TensorFlow. Apache-2 - moolib (🥉11 · ⭐ 370 · 💀) - A library for distributed ML training with PyTorch. MIT


Hyperparameter Optimization & AutoML

Libraries for hyperparameter optimization, AutoML, and neural architecture search.
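
As a rough illustration of the kind of search loop these libraries automate, here is a minimal random-search sketch in plain Python. It does not use any of the listed packages; `random_search` and `toy_objective` are hypothetical names, and the quadratic objective merely stands in for a real validation loss.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample each hyperparameter uniformly and keep the trial with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Draw one candidate configuration from the (low, high) ranges.
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


def toy_objective(params):
    # Hypothetical loss surface with its minimum at lr=0.1, reg=1.0.
    return (params["lr"] - 0.1) ** 2 + (params["reg"] - 1.0) ** 2


search_space = {"lr": (0.0, 1.0), "reg": (0.0, 2.0)}
best_params, best_loss = random_search(toy_objective, search_space, n_trials=200)
print(best_params, best_loss)
```

The libraries below typically replace the uniform sampling step with smarter strategies (Bayesian optimization, evolutionary search, early stopping of weak trials) while keeping a similar objective-function interface.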

Optuna (🥇43 · ⭐ 12K) - A hyperparameter optimization framework. MIT - [GitHub](https://github.com/optuna/optuna) (👨‍💻 290 · 🔀 1.1K · 📦 27K · 📋 1.7K - 3% open · ⏱️ 16.05.2025):
git clone https://github.com/optuna/optuna
- [PyPi](https://pypi.org/project/optuna) (📥 4.1M / month · 📦 1.2K · ⏱️ 14.04.2025):
pip install optuna
- [Conda](https://anaconda.org/conda-forge/optuna) (📥 2.7M · ⏱️ 22.04.2025):
conda install -c conda-forge optuna
AutoGluon (🥇36 · ⭐ 8.9K) - Fast and Accurate ML in 3 Lines of Code. Apache-2 - [GitHub](https://github.com/autogluon/autogluon) (👨‍💻 140 · 🔀 1K · 📦 1.1K · 📋 1.7K - 24% open · ⏱️ 21.05.2025):
git clone https://github.com/autogluon/autogluon
- [PyPi](https://pypi.org/project/autogluon) (📥 190K / month · 📦 32 · ⏱️ 22.05.2025):
pip install autogluon
- [Conda](https://anaconda.org/conda-forge/autogluon) (📥 35K · ⏱️ 03.05.2025):
conda install -c conda-forge autogluon
- [Docker Hub](https://hub.docker.com/r/autogluon/autogluon) (📥 16K · ⭐ 19 · ⏱️ 07.03.2024):
docker pull autogluon/autogluon
Ax (🥇36 · ⭐ 2.5K) - Adaptive Experimentation Platform. MIT - [GitHub](https://github.com/facebook/Ax) (👨‍💻 180 · 🔀 320 · 📦 970 · 📋 880 - 12% open · ⏱️ 22.05.2025):
git clone https://github.com/facebook/Ax
- [PyPi](https://pypi.org/project/ax-platform) (📥 240K / month · 📦 63 · ⏱️ 08.05.2025):
pip install ax-platform
- [Conda](https://anaconda.org/conda-forge/ax-platform) (📥 39K · ⏱️ 09.05.2025):
conda install -c conda-forge ax-platform
Bayesian Optimization (🥇35 · ⭐ 8.2K) - A Python implementation of global optimization with.. MIT - [GitHub](https://github.com/bayesian-optimization/BayesianOptimization) (👨‍💻 50 · 🔀 1.6K · 📥 180 · 📦 3.7K · 📋 380 - 1% open · ⏱️ 19.05.2025):
git clone https://github.com/bayesian-optimization/BayesianOptimization
- [PyPi](https://pypi.org/project/bayesian-optimization) (📥 320K / month · 📦 170 · ⏱️ 12.05.2025):
pip install bayesian-optimization
Hyperopt (🥇34 · ⭐ 7.4K) - Distributed Asynchronous Hyperparameter Optimization in Python. BSD-3 - [GitHub](https://github.com/hyperopt/hyperopt) (👨‍💻 100 · 🔀 1.1K · 📦 21K · 📋 760 - 18% open · ⏱️ 27.12.2024):
git clone https://github.com/hyperopt/hyperopt
- [PyPi](https://pypi.org/project/hyperopt) (📥 2.6M / month · 📦 450 · ⏱️ 17.11.2021):
pip install hyperopt
- [Conda](https://anaconda.org/conda-forge/hyperopt) (📥 830K · ⏱️ 22.04.2025):
conda install -c conda-forge hyperopt
BoTorch (🥇34 · ⭐ 3.3K) - Bayesian optimization in PyTorch. MIT - [GitHub](https://github.com/pytorch/botorch) (👨‍💻 140 · 🔀 420 · 📦 1.5K · 📋 590 - 13% open · ⏱️ 22.05.2025):
git clone https://github.com/pytorch/botorch
- [PyPi](https://pypi.org/project/botorch) (📥 320K / month · 📦 110 · ⏱️ 06.05.2025):
pip install botorch
- [Conda](https://anaconda.org/conda-forge/botorch) (📥 150K · ⏱️ 07.05.2025):
conda install -c conda-forge botorch
nevergrad (🥈33 · ⭐ 4.1K) - A Python toolbox for performing gradient-free optimization. MIT - [GitHub](https://github.com/facebookresearch/nevergrad) (👨‍💻 58 · 🔀 360 · 📦 910 · 📋 310 - 40% open · ⏱️ 23.04.2025):
git clone https://github.com/facebookresearch/nevergrad
- [PyPi](https://pypi.org/project/nevergrad) (📥 150K / month · 📦 72 · ⏱️ 23.04.2025):
pip install nevergrad
- [Conda](https://anaconda.org/conda-forge/nevergrad) (📥 62K · ⏱️ 22.04.2025):
conda install -c conda-forge nevergrad
AutoKeras (🥈32 · ⭐ 9.2K) - AutoML library for deep learning. Apache-2 - [GitHub](https://github.com/keras-team/autokeras) (👨‍💻 140 · 🔀 1.4K · 📥 20K · 📦 850 · 📋 910 - 16% open · ⏱️ 16.12.2024):
git clone https://github.com/keras-team/autokeras
- [PyPi](https://pypi.org/project/autokeras) (📥 19K / month · 📦 13 · ⏱️ 20.03.2024):
pip install autokeras
featuretools (🥈32 · ⭐ 7.4K) - An open source python library for automated feature engineering. BSD-3 - [GitHub](https://github.com/alteryx/featuretools) (👨‍💻 74 · 🔀 880 · 📦 2.1K · 📋 1K - 15% open · ⏱️ 13.11.2024):
git clone https://github.com/alteryx/featuretools
- [PyPi](https://pypi.org/project/featuretools) (📥 79K / month · 📦 74 · ⏱️ 14.05.2024):
pip install featuretools
- [Conda](https://anaconda.org/conda-forge/featuretools) (📥 250K · ⏱️ 22.04.2025):
conda install -c conda-forge featuretools
Keras Tuner (🥈32 · ⭐ 2.9K · 💤) - A Hyperparameter Tuning Library for Keras. Apache-2 - [GitHub](https://github.com/keras-team/keras-tuner) (👨‍💻 61 · 🔀 400 · 📦 5.9K · 📋 500 - 44% open · ⏱️ 24.06.2024):
git clone https://github.com/keras-team/keras-tuner
- [PyPi](https://pypi.org/project/keras-tuner) (📥 1.6M / month · 📦 120 · ⏱️ 04.03.2024):
pip install keras-tuner
- [Conda](https://anaconda.org/conda-forge/keras-tuner) (📥 57K · ⏱️ 22.04.2025):
conda install -c conda-forge keras-tuner
lazypredict (🥈29 · ⭐ 3.1K) - Lazy Predict helps build a lot of basic models without much code.. MIT - [GitHub](https://github.com/shankarpandala/lazypredict) (👨‍💻 19 · 🔀 360 · 📦 1.4K · 📋 160 - 64% open · ⏱️ 18.05.2025):
git clone https://github.com/shankarpandala/lazypredict
- [PyPi](https://pypi.org/project/lazypredict) (📥 21K / month · 📦 8 · ⏱️ 05.04.2025):
pip install lazypredict
- [Conda](https://anaconda.org/conda-forge/lazypredict) (📥 5K · ⏱️ 22.04.2025):
conda install -c conda-forge lazypredict
mljar-supervised (🥈28 · ⭐ 3.2K) - Python package for AutoML on Tabular Data with Feature.. MIT - [GitHub](https://github.com/mljar/mljar-supervised) (👨‍💻 30 · 🔀 420 · 📦 170 · 📋 670 - 20% open · ⏱️ 14.04.2025):
git clone https://github.com/mljar/mljar-supervised
- [PyPi](https://pypi.org/project/mljar-supervised) (📥 7K / month · 📦 6 · ⏱️ 01.04.2025):
pip install mljar-supervised
- [Conda](https://anaconda.org/conda-forge/mljar-supervised) (📥 42K · ⏱️ 22.04.2025):
conda install -c conda-forge mljar-supervised
FEDOT (🥈25 · ⭐ 670) - Automated modeling and machine learning framework FEDOT. BSD-3 - [GitHub](https://github.com/aimclub/FEDOT) (👨‍💻 38 · 🔀 88 · 📦 62 · 📋 570 - 11% open · ⏱️ 22.05.2025):
git clone https://github.com/aimclub/FEDOT
- [PyPi](https://pypi.org/project/fedot) (📥 1.1K / month · 📦 7 · ⏱️ 10.03.2025):
pip install fedot
Hyperactive (🥉24 · ⭐ 520) - An optimization and data collection toolbox for convenient and fast.. MIT - [GitHub](https://github.com/SimonBlanke/Hyperactive) (👨‍💻 13 · 🔀 48 · 📥 310 · 📦 38 · 📋 82 - 19% open · ⏱️ 18.05.2025):
git clone https://github.com/SimonBlanke/Hyperactive
- [PyPi](https://pypi.org/project/hyperactive) (📥 2.4K / month · 📦 13 · ⏱️ 15.08.2024):
pip install hyperactive
AlphaPy (🥉21 · ⭐ 1.5K) - Python AutoML for Trading Systems and Sports Betting. Apache-2 - [GitHub](https://github.com/ScottfreeLLC/AlphaPy) (👨‍💻 5 · 🔀 250 · 📦 10 · 📋 44 - 34% open · ⏱️ 15.12.2024):
git clone https://github.com/ScottfreeLLC/AlphaPy
- [PyPi](https://pypi.org/project/alphapy) (📥 660 / month · ⏱️ 29.08.2020):
pip install alphapy
featurewiz (🥉21 · ⭐ 650) - Use advanced feature engineering strategies and select best.. Apache-2 - [GitHub](https://github.com/AutoViML/featurewiz) (👨‍💻 18 · 🔀 96 · 📦 84 · 📋 110 - 0% open · ⏱️ 19.02.2025):
git clone https://github.com/AutoViML/featurewiz
- [PyPi](https://pypi.org/project/featurewiz) (📥 7.8K / month · 📦 4 · ⏱️ 19.02.2025):
pip install featurewiz
Auto ViML (🥉21 · ⭐ 540) - Automatically Build Multiple ML Models with a Single Line of Code... Apache-2 - [GitHub](https://github.com/AutoViML/Auto_ViML) (👨‍💻 9 · 🔀 100 · 📦 28 · ⏱️ 30.01.2025):
git clone https://github.com/AutoViML/Auto_ViML
- [PyPi](https://pypi.org/project/autoviml) (📥 3.7K / month · 📦 3 · ⏱️ 30.01.2025):
pip install autoviml
opytimizer (🥉18 · ⭐ 620 · 💤) - Opytimizer is a Python library consisting of meta-heuristic.. Apache-2 - [GitHub](https://github.com/gugarosa/opytimizer) (👨‍💻 4 · 🔀 42 · 📦 21 · ⏱️ 18.08.2024):
git clone https://github.com/gugarosa/opytimizer
- [PyPi](https://pypi.org/project/opytimizer) (📥 450 / month · ⏱️ 18.08.2024):
pip install opytimizer
Show 34 hidden projects... - TPOT (🥈33 · ⭐ 9.9K) - A Python Automated Machine Learning tool that optimizes machine.. ❗️LGPL-3.0 - scikit-optimize (🥈33 · ⭐ 2.8K · 💀) - Sequential model-based optimization with a.. BSD-3 - NNI (🥈31 · ⭐ 14K · 💀) - An open source AutoML toolkit for automating the machine learning lifecycle,.. MIT - auto-sklearn (🥈31 · ⭐ 7.8K · 💀) - Automated Machine Learning with scikit-learn. BSD-3 - SMAC3 (🥈28 · ⭐ 1.2K) - SMAC3: A Versatile Bayesian Optimization Package for.. ❗️BSD-1-Clause - Hyperas (🥈27 · ⭐ 2.2K · 💀) - Keras + Hyperopt: A very simple wrapper for convenient.. MIT - Talos (🥈25 · ⭐ 1.6K · 💀) - Hyperparameter Experiments with TensorFlow and Keras. MIT - GPyOpt (🥈25 · ⭐ 940 · 💀) - Gaussian Process Optimization using GPy. BSD-3 - AdaNet (🥉24 · ⭐ 3.5K · 💀) - Fast and flexible AutoML with learning guarantees. Apache-2 - auto_ml (🥉24 · ⭐ 1.6K · 💀) - [UNMAINTAINED] Automated machine learning for analytics & production. MIT - lightwood (🥉24 · ⭐ 470) - Lightwood is Legos for Machine Learning. ❗️GPL-3.0 - HpBandSter (🥉23 · ⭐ 620 · 💀) - a distributed Hyperband implementation on Steroids. BSD-3 - Neuraxle (🥉22 · ⭐ 610 · 💀) - The world's cleanest AutoML library - Do hyperparameter tuning.. Apache-2 - Orion (🥉22 · ⭐ 290 · 💀) - Asynchronous Distributed Hyperparameter Optimization. BSD-3 - igel (🥉21 · ⭐ 3.1K · 💀) - a delightful machine learning tool that allows you to train, test, and.. MIT - MLBox (🥉21 · ⭐ 1.5K · 💀) - MLBox is a powerful Automated Machine Learning python library. ❗️BSD-1-Clause - Test Tube (🥉21 · ⭐ 740 · 💀) - Python library to easily log experiments and parallelize.. MIT - sklearn-deap (🥉20 · ⭐ 770 · 💀) - Use evolutionary algorithms instead of gridsearch in.. MIT - optunity (🥉20 · ⭐ 420 · 💀) - optimization routines for hyperparameter tuning. BSD-3 - Dragonfly (🥉19 · ⭐ 880 · 💀) - An open source python library for scalable Bayesian optimisation. MIT - Auto Tune Models (🥉19 · ⭐ 530 · 💀) - Auto Tune Models - A multi-tenant, multi-data system for.. MIT - Sherpa (🥉19 · ⭐ 340 · 💀) - Hyperparameter optimization that enables researchers to.. ❗️GPL-3.0 - Xcessiv (🥉18 · ⭐ 1.3K · 💀) - A web-based application for quick, scalable, and automated.. Apache-2 - shap-hypetune (🥉18 · ⭐ 580 · 💀) - A python package for simultaneous Hyperparameters Tuning and.. MIT - Advisor (🥉17 · ⭐ 1.6K · 💀) - Open-source implementation of Google Vizier for hyper parameters.. Apache-2 - HyperparameterHunter (🥉17 · ⭐ 710 · 💀) - Easy hyperparameter optimization and automatic result.. MIT - automl-gs (🥉16 · ⭐ 1.9K · 💀) - Provide an input CSV and a target field to predict, generate a.. MIT - Parfit (🥉15 · ⭐ 200 · 💀) - A package for parallelizing the fit and flexibly scoring of.. MIT - ENAS (🥉13 · ⭐ 2.7K · 💀) - PyTorch implementation of Efficient Neural Architecture Search via.. Apache-2 - Auptimizer (🥉13 · ⭐ 200 · 💀) - An automatic ML model optimization tool. ❗️GPL-3.0 - Hypermax (🥉12 · ⭐ 110 · 💀) - Better, faster hyper-parameter optimization. BSD-3 - model_search (🥉11 · ⭐ 3.3K · 💀) - AutoML algorithms for model architecture search at scale. Apache-2 - Devol (🥉11 · ⭐ 950 · 💀) - Genetic neural architecture search with Keras. MIT - Hypertunity (🥉10 · ⭐ 140 · 💀) - A toolset for black-box hyperparameter optimisation. Apache-2


Reinforcement Learning

Libraries for building and evaluating reinforcement learning & agent-based systems.

FinRL (🥇32 · ⭐ 12K) - FinRL: Financial Reinforcement Learning. MIT - [GitHub](https://github.com/AI4Finance-Foundation/FinRL) (👨‍💻 120 · 🔀 2.7K · 📦 93 · 📋 740 - 34% open · ⏱️ 05.05.2025):
git clone https://github.com/AI4Finance-Foundation/FinRL
- [PyPi](https://pypi.org/project/finrl) (📥 2.8K / month · ⏱️ 08.01.2022):
pip install finrl
ViZDoom (🥇29 · ⭐ 1.8K) - Reinforcement Learning environments based on the 1993 game Doom. MIT - [GitHub](https://github.com/Farama-Foundation/ViZDoom) (👨‍💻 55 · 🔀 400 · 📥 12K · 📦 330 · 📋 470 - 6% open · ⏱️ 12.03.2025):
git clone https://github.com/Farama-Foundation/ViZDoom
- [PyPi](https://pypi.org/project/vizdoom) (📥 6.8K / month · 📦 15 · ⏱️ 20.08.2024):
pip install vizdoom
Acme (🥈28 · ⭐ 3.7K) - A library of reinforcement learning components and agents. Apache-2 - [GitHub](https://github.com/google-deepmind/acme) (👨‍💻 88 · 🔀 470 · 📦 240 · 📋 270 - 23% open · ⏱️ 03.05.2025):
git clone https://github.com/google-deepmind/acme
- [PyPi](https://pypi.org/project/dm-acme) (📥 1.7K / month · 📦 3 · ⏱️ 10.02.2022):
pip install dm-acme
- [Conda](https://anaconda.org/conda-forge/dm-acme) (📥 13K · ⏱️ 22.04.2025):
conda install -c conda-forge dm-acme
TF-Agents (🥈28 · ⭐ 2.9K) - TF-Agents: A reliable, scalable and easy to use TensorFlow.. Apache-2 - [GitHub](https://github.com/tensorflow/agents) (👨‍💻 150 · 🔀 720 · 📋 680 - 30% open · ⏱️ 30.04.2025):
git clone https://github.com/tensorflow/agents
- [PyPi](https://pypi.org/project/tf-agents) (📥 27K / month · 📦 14 · ⏱️ 14.12.2023):
pip install tf-agents
Dopamine (🥈26 · ⭐ 11K) - Dopamine is a research framework for fast prototyping of.. Apache-2 - [GitHub](https://github.com/google/dopamine) (👨‍💻 15 · 🔀 1.4K · 📦 21 · 📋 190 - 54% open · ⏱️ 04.11.2024):
git clone https://github.com/google/dopamine
- [PyPi](https://pypi.org/project/dopamine-rl) (📥 24K / month · 📦 10 · ⏱️ 31.10.2024):
pip install dopamine-rl
TensorForce (🥈26 · ⭐ 3.3K · 💤) - Tensorforce: a TensorFlow library for applied.. Apache-2 - [GitHub](https://github.com/tensorforce/tensorforce) (👨‍💻 85 · 🔀 530 · 📦 460 · 📋 680 - 6% open · ⏱️ 31.07.2024):
git clone https://github.com/tensorforce/tensorforce
- [PyPi](https://pypi.org/project/tensorforce) (📥 520 / month · 📦 4 · ⏱️ 30.08.2021):
pip install tensorforce
RLax (🥈26 · ⭐ 1.3K) - A library of reinforcement learning building blocks in JAX. Apache-2 - [GitHub](https://github.com/google-deepmind/rlax) (👨‍💻 22 · 🔀 91 · 📦 340 · 📋 27 - 29% open · ⏱️ 08.05.2025):
git clone https://github.com/google-deepmind/rlax
- [PyPi](https://pypi.org/project/rlax) (📥 16K / month · 📦 22 · ⏱️ 08.05.2025):
pip install rlax
PARL (🥉24 · ⭐ 3.4K) - A high-performance distributed training framework for Reinforcement.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PARL) (👨‍💻 46 · 🔀 820 · 📦 140 · 📋 540 - 23% open · ⏱️ 24.01.2025):
git clone https://github.com/PaddlePaddle/PARL
- [PyPi](https://pypi.org/project/parl) (📥 790 / month · 📦 1 · ⏱️ 13.05.2022):
pip install parl
PFRL (🥉22 · ⭐ 1.2K · 💤) - PFRL: a PyTorch-based deep reinforcement learning library. MIT - [GitHub](https://github.com/pfnet/pfrl) (👨‍💻 20 · 🔀 150 · 📦 120 · 📋 80 - 41% open · ⏱️ 04.08.2024):
git clone https://github.com/pfnet/pfrl
- [PyPi](https://pypi.org/project/pfrl) (📥 340 / month · 📦 1 · ⏱️ 16.07.2023):
pip install pfrl
ReAgent (🥉21 · ⭐ 3.6K) - A platform for Reasoning systems (Reinforcement Learning,.. BSD-3 - [GitHub](https://github.com/facebookresearch/ReAgent) (👨‍💻 170 · 🔀 510 · 📋 160 - 53% open · ⏱️ 12.03.2025):
git clone https://github.com/facebookresearch/ReAgent
- [PyPi](https://pypi.org/project/reagent) (📥 31 / month · ⏱️ 27.05.2020):
pip install reagent
rliable (🥉14 · ⭐ 830 · 💤) - [NeurIPS21 Outstanding Paper] Library for reliable evaluation on.. Apache-2 - [GitHub](https://github.com/google-research/rliable) (👨‍💻 9 · 🔀 49 · 📦 210 · 📋 20 - 15% open · ⏱️ 12.08.2024):
git clone https://github.com/google-research/rliable
- [PyPi](https://pypi.org/project/rliable):
pip install rliable
Show 12 hidden projects... - OpenAI Gym (🥇42 · ⭐ 36K · 💀) - A toolkit for developing and comparing reinforcement learning.. MIT - baselines (🥈28 · ⭐ 16K · 💀) - OpenAI Baselines: high-quality implementations of reinforcement.. MIT - TensorLayer (🥈27 · ⭐ 7.4K · 💀) - Deep Learning and Reinforcement Learning Library for.. Apache-2 - keras-rl (🥈27 · ⭐ 5.6K · 💀) - Deep Reinforcement Learning for Keras. MIT - garage (🥉25 · ⭐ 2K · 💀) - A toolkit for reproducible reinforcement learning research. MIT - Stable Baselines (🥉24 · ⭐ 4.3K · 💀) - A fork of OpenAI Baselines, implementations of.. MIT - ChainerRL (🥉24 · ⭐ 1.2K · 💀) - ChainerRL is a deep reinforcement learning library built on top of.. MIT - TRFL (🥉22 · ⭐ 3.1K · 💀) - TensorFlow Reinforcement Learning. Apache-2 - Coach (🥉20 · ⭐ 2.3K · 💀) - Reinforcement Learning Coach by Intel AI Lab enables easy.. Apache-2 - SerpentAI (🥉18 · ⭐ 6.9K · 💀) - Game Agent Framework. Helping you create AIs / Bots that learn to.. MIT - DeepMind Lab (🥉17 · ⭐ 7.2K · 💀) - A customisable 3D platform for agent-based AI research. ❗Unlicensed - Maze (🥉12 · ⭐ 280 · 💀) - Maze Applied Reinforcement Learning Framework. ❗️Custom


Recommender Systems

Libraries for building and evaluating recommendation systems.
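
For orientation, the sketch below shows one classic technique that several of these libraries implement at scale: user-based collaborative filtering with cosine similarity. It is plain Python and not taken from any listed project; the `ratings` data and the `cosine` and `recommend` helpers are hypothetical.

```python
import math

# Hypothetical user -> {item: rating} data.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 1, "d": 5},
    "carol": {"b": 4, "c": 5, "d": 2},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (missing items count as 0)."""
    dot = sum(u[item] * v[item] for item in set(u) & set(v))
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(user, ratings, k=2):
    """Rank items the user has not rated by the similarity-weighted ratings of other users."""
    seen = ratings[user]
    scores, weights = {}, {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, their_ratings)
        for item, rating in their_ratings.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(
        ((scores[item] / weights[item], item) for item in scores if weights[item] > 0),
        reverse=True,
    )
    return [item for _, item in ranked[:k]]


print(recommend("alice", ratings))
```

Production libraries add matrix factorization, implicit-feedback handling, and evaluation metrics on top of this basic idea.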

Recommenders (🥇35 · ⭐ 20K) - Best Practices on Recommendation Systems. MIT - [GitHub](https://github.com/recommenders-team/recommenders) (👨‍💻 140 · 🔀 3.2K · 📥 740 · 📦 170 · 📋 880 - 18% open · ⏱️ 08.05.2025):
git clone https://github.com/recommenders-team/recommenders
- [PyPi](https://pypi.org/project/recommenders) (📥 20K / month · 📦 4 · ⏱️ 24.12.2024):
pip install recommenders
torchrec (🥇31 · ⭐ 2.2K) - Pytorch domain library for recommendation systems. BSD-3 - [GitHub](https://github.com/pytorch/torchrec) (👨‍💻 350 · 🔀 510 · 📦 210 · 📋 480 - 70% open · ⏱️ 22.05.2025):
git clone https://github.com/pytorch/torchrec
- [PyPi](https://pypi.org/project/torchrec-nightly-cpu) (📥 1.8K / month · ⏱️ 12.05.2022):
pip install torchrec-nightly-cpu
Cornac (🥈30 · ⭐ 960) - A Comparative Framework for Multimodal Recommender Systems. Apache-2 - [GitHub](https://github.com/PreferredAI/cornac) (👨‍💻 24 · 🔀 150 · 📦 280 · 📋 170 - 17% open · ⏱️ 26.04.2025):
git clone https://github.com/PreferredAI/cornac
- [PyPi](https://pypi.org/project/cornac) (📥 54K / month · 📦 18 · ⏱️ 26.04.2025):
pip install cornac
- [Conda](https://anaconda.org/conda-forge/cornac) (📥 820K · ⏱️ 26.04.2025):
conda install -c conda-forge cornac
scikit-surprise (🥈28 · ⭐ 6.6K · 💤) - A Python scikit for building and analyzing recommender.. BSD-3 - [GitHub](https://github.com/NicolasHug/Surprise) (👨‍💻 46 · 🔀 1K · 📦 21 · 📋 400 - 21% open · ⏱️ 14.06.2024):
git clone https://github.com/NicolasHug/Surprise
- [PyPi](https://pypi.org/project/scikit-surprise) (📥 140K / month · 📦 37 · ⏱️ 19.05.2024):
pip install scikit-surprise
- [Conda](https://anaconda.org/conda-forge/scikit-surprise) (📥 480K · ⏱️ 22.04.2025):
conda install -c conda-forge scikit-surprise
RecBole (🥈28 · ⭐ 3.8K) - A unified, comprehensive and efficient recommendation library. MIT - [GitHub](https://github.com/RUCAIBox/RecBole) (👨‍💻 79 · 🔀 660 · 📋 1K - 30% open · ⏱️ 24.02.2025):
git clone https://github.com/RUCAIBox/RecBole
- [PyPi](https://pypi.org/project/recbole) (📥 99K / month · 📦 2 · ⏱️ 24.02.2025):
pip install recbole
- [Conda](https://anaconda.org/aibox/recbole) (📥 8.6K · ⏱️ 25.03.2025):
conda install -c aibox recbole
TF Recommenders (🥉24 · ⭐ 1.9K) - TensorFlow Recommenders is a library for building.. Apache-2 - [GitHub](https://github.com/tensorflow/recommenders) (👨‍💻 43 · 🔀 290 · 📋 450 - 59% open · ⏱️ 16.01.2025):
git clone https://github.com/tensorflow/recommenders
- [PyPi](https://pypi.org/project/tensorflow-recommenders) (📥 240K / month · 📦 2 · ⏱️ 03.02.2023):
pip install tensorflow-recommenders
Show 11 hidden projects... - implicit (🥈30 · ⭐ 3.7K · 💀) - Fast Python Collaborative Filtering for Implicit Feedback Datasets. MIT - lightfm (🥈29 · ⭐ 4.9K · 💀) - A Python implementation of LightFM, a hybrid recommendation.. Apache-2 - lkpy (🥈29 · ⭐ 290) - Python recommendation toolkit. MIT - TF Ranking (🥉26 · ⭐ 2.8K · 💀) - Learning to Rank in TensorFlow. Apache-2 - tensorrec (🥉21 · ⭐ 1.3K · 💀) - A TensorFlow recommendation algorithm and framework in.. Apache-2 - fastFM (🥉21 · ⭐ 1.1K · 💀) - fastFM: A Library for Factorization Machines. BSD-3 - Spotlight (🥉19 · ⭐ 3K · 💀) - Deep recommender models using PyTorch. MIT - recmetrics (🥉19 · ⭐ 580 · 💀) - A library of metrics for evaluating recommender systems. MIT - Case Recommender (🥉18 · ⭐ 500 · 💀) - Case Recommender: A Flexible and Extensible Python.. MIT - OpenRec (🥉16 · ⭐ 410 · 💀) - OpenRec is an open-source and modular library for neural network-.. Apache-2 - Collie (🥉11 · ⭐ 100 · 💀) - A library for preparing, training, and evaluating scalable deep.. BSD-3


Privacy Machine Learning

Libraries for encrypted and privacy-preserving machine learning using methods like federated learning & differential privacy.
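
To make "differential privacy" concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query, written in plain Python rather than with any of the listed libraries. The names `laplace_noise` and `private_count` and the `ages` data are hypothetical; the sketch assumes a query with sensitivity 1, so the noise scale is `1 / epsilon`.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Release a count with Laplace noise calibrated to sensitivity 1."""
    true_count = sum(1 for value in values if predicate(value))
    return true_count + laplace_noise(1.0 / epsilon, rng)


ages = [23, 35, 41, 52, 67, 29]  # hypothetical records
rng = random.Random(0)
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
print(noisy)
```

A smaller `epsilon` means more noise and stronger privacy. The libraries below handle the harder parts, such as tracking the privacy budget across many queries or training steps.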

PySyft (🥇32 · ⭐ 9.7K) - Perform data science on data that remains in someone else's server. Apache-2 - [GitHub](https://github.com/OpenMined/PySyft) (👨‍💻 520 · 🔀 2K · 📥 2K · 📦 1 · 📋 3.4K - 1% open · ⏱️ 13.04.2025):
git clone https://github.com/OpenMined/PySyft
- [PyPi](https://pypi.org/project/syft) (📥 7.3K / month · 📦 5 · ⏱️ 13.04.2025):
pip install syft
Opacus (🥇32 · ⭐ 1.8K) - Training PyTorch models with differential privacy. Apache-2 - [GitHub](https://github.com/pytorch/opacus) (👨‍💻 85 · 🔀 360 · 📥 140 · 📦 1.1K · 📋 330 - 20% open · ⏱️ 13.05.2025):
git clone https://github.com/pytorch/opacus
- [PyPi](https://pypi.org/project/opacus) (📥 100K / month · 📦 42 · ⏱️ 18.02.2025):
pip install opacus
- [Conda](https://anaconda.org/conda-forge/opacus) (📥 24K · ⏱️ 22.04.2025):
conda install -c conda-forge opacus
TensorFlow Privacy (🥈25 · ⭐ 2K) - Library for training machine learning models with.. Apache-2 - [GitHub](https://github.com/tensorflow/privacy) (👨‍💻 60 · 🔀 450 · 📥 190 · 📋 210 - 55% open · ⏱️ 21.05.2025):
git clone https://github.com/tensorflow/privacy
- [PyPi](https://pypi.org/project/tensorflow-privacy) (📥 20K / month · 📦 21 · ⏱️ 14.02.2024):
pip install tensorflow-privacy
TFEncrypted (🥉24 · ⭐ 1.2K · 💤) - A Framework for Encrypted Machine Learning in.. Apache-2 - [GitHub](https://github.com/tf-encrypted/tf-encrypted) (👨‍💻 29 · 🔀 210 · 📦 68 · 📋 440 - 32% open · ⏱️ 25.09.2024):
git clone https://github.com/tf-encrypted/tf-encrypted
- [PyPi](https://pypi.org/project/tf-encrypted) (📥 710 / month · 📦 9 · ⏱️ 16.11.2022):
pip install tf-encrypted
FATE (🥉23 · ⭐ 5.9K) - An Industrial Grade Federated Learning Framework. Apache-2 - [GitHub](https://github.com/FederatedAI/FATE) (👨‍💻 100 · 🔀 1.6K · 📋 2.1K - 3% open · ⏱️ 19.11.2024):
git clone https://github.com/FederatedAI/FATE
- [PyPi](https://pypi.org/project/ETAF) (⏱️ 06.05.2020):
pip install ETAF
CrypTen (🥉21 · ⭐ 1.6K) - A framework for Privacy Preserving Machine Learning. MIT - [GitHub](https://github.com/facebookresearch/CrypTen) (👨‍💻 39 · 🔀 290 · 📋 280 - 28% open · ⏱️ 23.11.2024):
git clone https://github.com/facebookresearch/CrypTen
- [PyPi](https://pypi.org/project/crypten) (📥 520 / month · 📦 1 · ⏱️ 08.12.2022):
pip install crypten
Show 1 hidden projects... - PipelineDP (🥉20 · ⭐ 280) - PipelineDP is a Python framework for applying differentially.. Apache-2


Workflow & Experiment Tracking

Libraries to organize, track, and visualize machine learning experiments.
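
To make "experiment tracking" concrete, here is a minimal sketch of the core idea: append each run's parameters and metrics to a log, then query it. It is standard-library only and does not mirror the API of any project listed here; `log_run`, `best_run`, and the JSON Lines layout are assumptions made for illustration.

```python
import json
import os
import tempfile
import time


def log_run(path, params, metrics):
    """Append one run (parameters plus final metrics) as a JSON line."""
    record = {"timestamp": time.time(), "params": params, "metrics": metrics}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def best_run(path, metric, higher_is_better=True):
    """Return the logged run with the best value for `metric`."""
    with open(path, encoding="utf-8") as f:
        runs = [json.loads(line) for line in f if line.strip()]
    pick = max if higher_is_better else min
    return pick(runs, key=lambda run: run["metrics"][metric])


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "runs.jsonl")
    log_run(log_path, {"lr": 0.1}, {"accuracy": 0.81})
    log_run(log_path, {"lr": 0.01}, {"accuracy": 0.88})
    best = best_run(log_path, "accuracy")
```

The tools in this section add what a flat file cannot: artifact storage, dashboards, run comparison, and collaboration.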

mlflow (🥇44 · ⭐ 21K) - Open source platform for the machine learning lifecycle. Apache-2 - [GitHub](https://github.com/mlflow/mlflow) (👨‍💻 860 · 🔀 4.5K · 📦 62K · 📋 4.7K - 39% open · ⏱️ 22.05.2025):
git clone https://github.com/mlflow/mlflow
- [PyPi](https://pypi.org/project/mlflow) (📥 17M / month · 📦 1.1K · ⏱️ 21.05.2025):
pip install mlflow
- [Conda](https://anaconda.org/conda-forge/mlflow) (📥 3.2M · ⏱️ 26.04.2025):
conda install -c conda-forge mlflow
wandb client (🥇43 · ⭐ 9.9K) - The AI developer platform. Use Weights & Biases to train and fine-.. MIT - [GitHub](https://github.com/wandb/wandb) (👨‍💻 210 · 🔀 740 · 📥 720 · 📦 79K · 📋 3.6K - 17% open · ⏱️ 21.05.2025):
git clone https://github.com/wandb/wandb
- [PyPi](https://pypi.org/project/wandb) (📥 18M / month · 📦 1.9K · ⏱️ 07.05.2025):
pip install wandb
- [Conda](https://anaconda.org/conda-forge/wandb) (📥 1.1M · ⏱️ 08.05.2025):
conda install -c conda-forge wandb
Tensorboard (🥇43 · ⭐ 6.9K) - TensorFlow's Visualization Toolkit. Apache-2 - [GitHub](https://github.com/tensorflow/tensorboard) (👨‍💻 330 · 🔀 1.7K · 📦 320K · 📋 1.9K - 35% open · ⏱️ 09.05.2025):
git clone https://github.com/tensorflow/tensorboard
- [PyPi](https://pypi.org/project/tensorboard) (📥 27M / month · 📦 2.5K · ⏱️ 12.02.2025):
pip install tensorboard
- [Conda](https://anaconda.org/conda-forge/tensorboard) (📥 5.6M · ⏱️ 22.04.2025):
conda install -c conda-forge tensorboard
DVC (🥇41 · ⭐ 14K) - Data Versioning and ML Experiments. Apache-2 - [GitHub](https://github.com/iterative/dvc) (👨‍💻 310 · 🔀 1.2K · 📦 24K · 📋 4.8K - 5% open · ⏱️ 20.05.2025):
git clone https://github.com/iterative/dvc
- [PyPi](https://pypi.org/project/dvc) (📥 660K / month · 📦 140 · ⏱️ 06.05.2025):
pip install dvc
- [Conda](https://anaconda.org/conda-forge/dvc) (📥 2.8M · ⏱️ 06.05.2025):
conda install -c conda-forge dvc
SageMaker SDK (🥇41 · ⭐ 2.2K) - A library for training and deploying machine learning.. Apache-2 - [GitHub](https://github.com/aws/sagemaker-python-sdk) (👨‍💻 480 · 🔀 1.2K · 📦 6K · 📋 1.6K - 20% open · ⏱️ 22.05.2025):
git clone https://github.com/aws/sagemaker-python-sdk
- [PyPi](https://pypi.org/project/sagemaker) (📥 25M / month · 📦 180 · ⏱️ 19.05.2025):
pip install sagemaker
- [Conda](https://anaconda.org/conda-forge/sagemaker-python-sdk) (📥 1.5M · ⏱️ 21.05.2025):
conda install -c conda-forge sagemaker-python-sdk
Metaflow (🥈36 · ⭐ 8.8K) - Build, Manage and Deploy AI/ML Systems. Apache-2 - [GitHub](https://github.com/Netflix/metaflow) (👨‍💻 100 · 🔀 830 · 📦 910 · 📋 800 - 43% open · ⏱️ 21.05.2025):
git clone https://github.com/Netflix/metaflow
- [PyPi](https://pypi.org/project/metaflow) (📥 250K / month · 📦 52 · ⏱️ 21.05.2025):
pip install metaflow
- [Conda](https://anaconda.org/conda-forge/metaflow) (📥 300K · ⏱️ 22.04.2025):
conda install -c conda-forge metaflow
PyCaret (🥈35 · ⭐ 9.3K) - An open-source, low-code machine learning library in Python. MIT - [GitHub](https://github.com/pycaret/pycaret) (👨‍💻 140 · 🔀 1.8K · 📥 730 · 📦 7.7K · 📋 2.3K - 16% open · ⏱️ 06.03.2025):
git clone https://github.com/pycaret/pycaret
- [PyPi](https://pypi.org/project/pycaret) (📥 340K / month · 📦 31 · ⏱️ 28.04.2024):
pip install pycaret
- [Conda](https://anaconda.org/conda-forge/pycaret) (📥 70K · ⏱️ 22.04.2025):
conda install -c conda-forge pycaret
ClearML (🥈34 · ⭐ 6K) - ClearML - Auto-Magical CI/CD to streamline your AI workload... Apache-2 - [GitHub](https://github.com/clearml/clearml) (👨‍💻 100 · 🔀 680 · 📥 3.2K · 📦 1.8K · 📋 1.1K - 43% open · ⏱️ 22.05.2025):
git clone https://github.com/clearml/clearml
- [PyPi](https://pypi.org/project/clearml) (📥 380K / month · 📦 58 · ⏱️ 22.05.2025):
pip install clearml
- [Docker Hub](https://hub.docker.com/r/allegroai/trains) (📥 31K · ⏱️ 05.10.2020):
docker pull allegroai/trains
snakemake (🥈34 · ⭐ 2.5K) - This is the development home of the workflow management system.. MIT - [GitHub](https://github.com/snakemake/snakemake) (👨‍💻 370 · 🔀 590 · 📦 2.4K · 📋 2K - 60% open · ⏱️ 22.05.2025):
git clone https://github.com/snakemake/snakemake
- [PyPi](https://pypi.org/project/snakemake) (📥 79K / month · 📦 280 · ⏱️ 22.05.2025):
pip install snakemake
- [Conda](https://anaconda.org/bioconda/snakemake) (📥 1.4M · ⏱️ 21.05.2025):
conda install -c bioconda snakemake
tensorboardX (🥈33 · ⭐ 7.9K) - tensorboard for pytorch (and chainer, mxnet, numpy, ...). MIT - [GitHub](https://github.com/lanpa/tensorboardX) (👨‍💻 85 · 🔀 860 · 📥 480 · 📦 58K · 📋 460 - 17% open · ⏱️ 24.04.2025):
git clone https://github.com/lanpa/tensorboardX
- [PyPi](https://pypi.org/project/tensorboardX) (📥 2.7M / month · 📦 620 · ⏱️ 20.08.2023):
pip install tensorboardX
- [Conda](https://anaconda.org/conda-forge/tensorboardx) (📥 1.3M · ⏱️ 22.04.2025):
conda install -c conda-forge tensorboardx
kaggle (🥈33 · ⭐ 6.6K) - Official Kaggle API. Apache-2 - [GitHub](https://github.com/Kaggle/kaggle-api) (👨‍💻 49 · 🔀 1.2K · 📦 21 · 📋 520 - 27% open · ⏱️ 22.05.2025):
git clone https://github.com/Kaggle/kaggle-api
- [PyPi](https://pypi.org/project/kaggle) (📥 340K / month · 📦 240 · ⏱️ 08.05.2025):
pip install kaggle
- [Conda](https://anaconda.org/conda-forge/kaggle) (📥 230K · ⏱️ 22.04.2025):
conda install -c conda-forge kaggle
aim (🥈33 · ⭐ 5.6K) - Aim: An easy-to-use & supercharged open-source experiment tracker. Apache-2 - [GitHub](https://github.com/aimhubio/aim) (👨‍💻 82 · 🔀 340 · 📦 890 · 📋 1.1K - 36% open · ⏱️ 08.05.2025):
git clone https://github.com/aimhubio/aim
- [PyPi](https://pypi.org/project/aim) (📥 130K / month · 📦 41 · ⏱️ 21.05.2025):
pip install aim
- [Conda](https://anaconda.org/conda-forge/aim) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge aim
AzureML SDK (🥈31 · ⭐ 4.2K) - Python notebooks with ML and deep learning examples with Azure.. MIT - [GitHub](https://github.com/Azure/MachineLearningNotebooks) (👨‍💻 65 · 🔀 2.5K · 📥 660 · 📋 1.5K - 26% open · ⏱️ 14.03.2025):
git clone https://github.com/Azure/MachineLearningNotebooks
- [PyPi](https://pypi.org/project/azureml-sdk) (📥 340K / month · 📦 31 · ⏱️ 11.04.2025):
pip install azureml-sdk
sacred (🥈30 · ⭐ 4.3K) - Sacred is a tool to help you configure, organize, log and reproduce.. MIT - [GitHub](https://github.com/IDSIA/sacred) (👨‍💻 110 · 🔀 380 · 📦 3.6K · 📋 560 - 18% open · ⏱️ 26.11.2024):
git clone https://github.com/IDSIA/sacred
- [PyPi](https://pypi.org/project/sacred) (📥 29K / month · 📦 60 · ⏱️ 26.11.2024):
pip install sacred
- [Conda](https://anaconda.org/conda-forge/sacred) (📥 8.8K · ⏱️ 22.04.2025):
conda install -c conda-forge sacred
Neptune.ai (🥈29 · ⭐ 610) - The experiment tracker for foundation model training. Apache-2 - [GitHub](https://github.com/neptune-ai/neptune-client) (👨‍💻 55 · 🔀 65 · 📦 870 · 📋 260 - 12% open · ⏱️ 16.04.2025):
git clone https://github.com/neptune-ai/neptune-client
- [PyPi](https://pypi.org/project/neptune-client) (📥 490K / month · 📦 77 · ⏱️ 15.04.2025):
pip install neptune-client
- [Conda](https://anaconda.org/conda-forge/neptune-client) (📥 350K · ⏱️ 22.04.2025):
conda install -c conda-forge neptune-client
ml-metadata (🥈28 · ⭐ 650) - For recording and retrieving metadata associated with ML.. Apache-2 - [GitHub](https://github.com/google/ml-metadata) (👨‍💻 23 · 🔀 160 · 📥 3K · 📦 700 · 📋 120 - 39% open · ⏱️ 03.04.2025):
git clone https://github.com/google/ml-metadata
- [PyPi](https://pypi.org/project/ml-metadata) (📥 67K / month · 📦 32 · ⏱️ 07.04.2025):
pip install ml-metadata
VisualDL (🥉27 · ⭐ 4.8K) - Deep Learning Visualization Toolkit. Apache-2 - [GitHub](https://github.com/PaddlePaddle/VisualDL) (👨‍💻 36 · 🔀 630 · 📥 510 · 📦 2 · 📋 510 - 30% open · ⏱️ 22.01.2025):
git clone https://github.com/PaddlePaddle/VisualDL
- [PyPi](https://pypi.org/project/visualdl) (📥 130K / month · 📦 82 · ⏱️ 30.10.2024):
pip install visualdl
livelossplot (🥉27 · ⭐ 1.3K) - Live training loss plot in Jupyter Notebook for Keras,.. MIT - [GitHub](https://github.com/stared/livelossplot) (👨‍💻 17 · 🔀 140 · 📦 1.8K · 📋 79 - 7% open · ⏱️ 03.01.2025):
git clone https://github.com/stared/livelossplot
- [PyPi](https://pypi.org/project/livelossplot) (📥 16K / month · 📦 16 · ⏱️ 03.01.2025):
pip install livelossplot
quinn (🥉26 · ⭐ 670) - pyspark methods to enhance developer productivity. Apache-2 - [GitHub](https://github.com/mrpowers-io/quinn) (👨‍💻 31 · 🔀 97 · 📥 57 · 📦 93 · 📋 130 - 27% open · ⏱️ 06.12.2024):
git clone https://github.com/mrpowers-io/quinn
- [PyPi](https://pypi.org/project/quinn) (📥 590K / month · 📦 7 · ⏱️ 13.02.2024):
pip install quinn
Labml (🥉25 · ⭐ 2.2K) - Monitor deep learning model training and hardware usage from your mobile.. MIT - [GitHub](https://github.com/labmlai/labml) (👨‍💻 9 · 🔀 140 · 📦 220 · 📋 50 - 12% open · ⏱️ 10.04.2025):
git clone https://github.com/labmlai/labml
- [PyPi](https://pypi.org/project/labml) (📥 2.5K / month · 📦 14 · ⏱️ 15.09.2024):
pip install labml
TNT (🥉25 · ⭐ 1.7K) - A lightweight library for PyTorch training tools and utilities. BSD-3 - [GitHub](https://github.com/pytorch/tnt) (👨‍💻 150 · 🔀 280 · 📋 150 - 56% open · ⏱️ 12.05.2025):
git clone https://github.com/pytorch/tnt
- [PyPi](https://pypi.org/project/torchnet) (📥 5.9K / month · 📦 24 · ⏱️ 29.07.2018):
pip install torchnet
gokart (🥉25 · ⭐ 320) - Gokart solves reproducibility, task dependencies, constraints of good code,.. MIT - [GitHub](https://github.com/m3dev/gokart) (👨‍💻 47 · 🔀 62 · 📦 85 · 📋 99 - 32% open · ⏱️ 29.04.2025):
git clone https://github.com/m3dev/gokart
- [PyPi](https://pypi.org/project/gokart) (📥 4.7K / month · 📦 8 · ⏱️ 27.02.2025):
pip install gokart
Guild AI (🥉23 · ⭐ 880) - Experiment tracking, ML developer tools. Apache-2 - [GitHub](https://github.com/guildai/guildai) (👨‍💻 30 · 🔀 88 · 📥 31 · 📦 100 · 📋 440 - 50% open · ⏱️ 29.04.2025):
git clone https://github.com/guildai/guildai
- [PyPi](https://pypi.org/project/guildai) (📥 2K / month · ⏱️ 11.05.2022):
pip install guildai
keepsake (🥉17 · ⭐ 1.7K) - Version control for machine learning. Apache-2 - [GitHub](https://github.com/replicate/keepsake) (👨‍💻 18 · 🔀 71 · 📋 190 - 66% open · ⏱️ 03.12.2024):
git clone https://github.com/replicate/keepsake
- [PyPi](https://pypi.org/project/keepsake) (📥 84 / month · 📦 1 · ⏱️ 25.01.2021):
pip install keepsake
CometML (🥉16) - Supercharging Machine Learning. MIT - [GitHub](https://github.com/comet-ml/examples):
git clone https://github.com/comet-ml/examples
- [PyPi](https://pypi.org/project/comet_ml) (📥 470K / month · 📦 94 · ⏱️ 14.05.2025):
pip install comet_ml
- [Conda](https://anaconda.org/anaconda/comet_ml):
conda install -c anaconda comet_ml
Show 15 hidden projects... - Catalyst (🥈28 · ⭐ 3.4K · 💀) - Accelerated deep learning R&D. Apache-2 - knockknock (🥉26 · ⭐ 2.8K · 💀) - Knock Knock: Get notified when your training ends with only two.. MIT - SKLL (🥉24 · ⭐ 560) - SciKit-Learn Laboratory (SKLL) makes it easy to run machine.. ❗Unlicensed - hiddenlayer (🥉22 · ⭐ 1.8K · 💀) - Neural network graphs and training metrics for.. MIT - Studio.ml (🥉22 · ⭐ 380 · 💀) - Studio: Simplify and expedite model building process. Apache-2 - lore (🥉21 · ⭐ 1.6K · 💀) - Lore makes machine learning approachable for Software Engineers and.. MIT - TensorBoard Logger (🥉21 · ⭐ 630 · 💀) - Log TensorBoard events without touching TensorFlow. MIT - TensorWatch (🥉20 · ⭐ 3.4K · 💀) - Debugging, monitoring and visualization for Python Machine.. MIT - MXBoard (🥉20 · ⭐ 320 · 💀) - Logging MXNet data for visualization in TensorBoard. Apache-2 - datmo (🥉18 · ⭐ 340 · 💀) - Open source production model management tool for data scientists. MIT - chitra (🥉17 · ⭐ 230 · 💤) - A multi-functional library for full-stack Deep Learning... Apache-2 - caliban (🥉16 · ⭐ 500 · 💀) - Research workflows made easy, locally and in the Cloud. Apache-2 - steppy (🥉16 · ⭐ 130 · 💀) - Lightweight, Python library for fast and reproducible experimentation. MIT - ModelChimp (🥉13 · ⭐ 130 · 💀) - Experiment tracking for machine and deep learning projects. BSD-2 - traintool (🥉9 · ⭐ 12 · 💀) - Train off-the-shelf machine learning models in one.. Apache-2


Model Serialization & Deployment

Libraries to serialize models to files, convert between a variety of model formats, and optimize models for deployment.
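
For orientation, the sketch below shows the basic serialize-and-restore round trip that these libraries generalize. It stores only plain parameters (no code) in a versioned JSON file. `LinearModel`, `save_model`, `load_model`, and the file layout are hypothetical and unrelated to ONNX or any other format listed here.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class LinearModel:
    """Hypothetical toy model: y = w . x + b."""
    weights: list
    bias: float

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias


def save_model(model, path):
    # Write plain data only, so the file can be read from any language.
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"format_version": 1, "params": asdict(model)}, f)


def load_model(path):
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    if payload["format_version"] != 1:
        raise ValueError("unsupported format version")
    return LinearModel(**payload["params"])


model = LinearModel(weights=[0.5, -1.0], bias=2.0)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    save_model(model, path)
    restored = load_model(path)
```

Keeping an explicit version field is a design choice that lets a loader reject files it does not understand instead of silently misreading them.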

onnx (🥇43 · ⭐ 19K) - Open standard for machine learning interoperability. Apache-2 - [GitHub](https://github.com/onnx/onnx) (👨‍💻 340 · 🔀 3.7K · 📥 24K · 📦 46K · 📋 3K - 10% open · ⏱️ 22.05.2025):
git clone https://github.com/onnx/onnx
- [PyPi](https://pypi.org/project/onnx) (📥 7M / month · 📦 1.3K · ⏱️ 12.05.2025):
pip install onnx
- [Conda](https://anaconda.org/conda-forge/onnx) (📥 1.8M · ⏱️ 16.05.2025):
conda install -c conda-forge onnx
triton (🥇43 · ⭐ 16K) - Development repository for the Triton language and compiler. MIT - [GitHub](https://github.com/triton-lang/triton) (👨‍💻 400 · 🔀 1.9K · 📦 68K · 📋 1.8K - 42% open · ⏱️ 22.05.2025):
git clone https://github.com/triton-lang/triton
- [PyPi](https://pypi.org/project/triton) (📥 24M / month · 📦 400 · ⏱️ 09.04.2025):
pip install triton
huggingface_hub (🥈38 · ⭐ 2.6K) - The official Python client for the Huggingface Hub. Apache-2 - [GitHub](https://github.com/huggingface/huggingface_hub) (👨‍💻 240 · 🔀 710 · 📋 1.2K - 15% open · ⏱️ 22.05.2025):
git clone https://github.com/huggingface/huggingface_hub
- [PyPi](https://pypi.org/project/huggingface_hub) (📥 88M / month · 📦 3K · ⏱️ 19.05.2025):
pip install huggingface_hub
- [Conda](https://anaconda.org/conda-forge/huggingface_hub) (📥 3.2M · ⏱️ 19.05.2025):
conda install -c conda-forge huggingface_hub
Core ML Tools (🥈36 · ⭐ 4.7K) - Core ML tools contain supporting tools for Core ML model.. BSD-3 - [GitHub](https://github.com/apple/coremltools) (👨‍💻 190 · 🔀 680 · 📥 15K · 📦 4.9K · 📋 1.5K - 25% open · ⏱️ 20.05.2025):
git clone https://github.com/apple/coremltools
- [PyPi](https://pypi.org/project/coremltools) (📥 510K / month · 📦 98 · ⏱️ 28.04.2025):
pip install coremltools
- [Conda](https://anaconda.org/conda-forge/coremltools) (📥 97K · ⏱️ 22.04.2025):
conda install -c conda-forge coremltools
BentoML (🥈35 · ⭐ 7.7K) - The easiest way to serve AI apps and models - Build Model Inference.. Apache-2 - [GitHub](https://github.com/bentoml/BentoML) (👨‍💻 260 · 🔀 840 · 📥 400 · 📦 2.7K · 📋 1.1K - 11% open · ⏱️ 22.05.2025):
git clone https://github.com/bentoml/BentoML
- [PyPi](https://pypi.org/project/bentoml) (📥 110K / month · 📦 40 · ⏱️ 20.05.2025):
pip install bentoml
TorchServe (🥈33 · ⭐ 4.4K) - Serve, optimize and scale PyTorch models in production. Apache-2 - [GitHub](https://github.com/pytorch/serve) (👨‍💻 220 · 🔀 880 · 📥 7.7K · 📦 870 · 📋 1.7K - 25% open · ⏱️ 17.03.2025):
git clone https://github.com/pytorch/serve
- [PyPi](https://pypi.org/project/torchserve) (📥 92K / month · 📦 24 · ⏱️ 30.09.2024):
pip install torchserve
- [Conda](https://anaconda.org/pytorch/torchserve) (📥 500K · ⏱️ 25.03.2025):
conda install -c pytorch torchserve
- [Docker Hub](https://hub.docker.com/r/pytorch/torchserve) (📥 1.4M · ⭐ 32 · ⏱️ 30.09.2024):
docker pull pytorch/torchserve
hls4ml (🥈30 · ⭐ 1.5K) - Machine learning on FPGAs using HLS. Apache-2 - [GitHub](https://github.com/fastmachinelearning/hls4ml) (👨‍💻 69 · 🔀 440 · 📦 47 · 📋 480 - 42% open · ⏱️ 05.05.2025):
git clone https://github.com/fastmachinelearning/hls4ml
- [PyPi](https://pypi.org/project/hls4ml) (📥 2K / month · 📦 1 · ⏱️ 17.03.2025):
pip install hls4ml
- [Conda](https://anaconda.org/conda-forge/hls4ml) (📥 11K · ⏱️ 22.04.2025):
conda install -c conda-forge hls4ml
Hummingbird (🥉24 · ⭐ 3.4K · 💤) - Hummingbird compiles trained ML models into tensor computation.. MIT - [GitHub](https://github.com/microsoft/hummingbird) (👨‍💻 40 · 🔀 280 · 📥 850 · 📋 330 - 20% open · ⏱️ 24.10.2024):
git clone https://github.com/microsoft/hummingbird
- [PyPi](https://pypi.org/project/hummingbird-ml) (📥 6K / month · 📦 7 · ⏱️ 25.10.2024):
pip install hummingbird-ml
- [Conda](https://anaconda.org/conda-forge/hummingbird-ml) (📥 59K · ⏱️ 22.04.2025):
conda install -c conda-forge hummingbird-ml
nebullvm (🥉21 · ⭐ 8.4K · 💤) - A collection of libraries to optimise AI model performances. Apache-2 - [GitHub](https://github.com/nebuly-ai/optimate) (👨‍💻 40 · 🔀 630 · 📋 200 - 49% open · ⏱️ 22.07.2024):
git clone https://github.com/nebuly-ai/optimate
- [PyPi](https://pypi.org/project/nebullvm) (📥 950 / month · 📦 2 · ⏱️ 18.06.2023):
pip install nebullvm
tfdeploy (🥉17 · ⭐ 350) - Deploy tensorflow graphs for fast evaluation and export to.. BSD-3 - [GitHub](https://github.com/riga/tfdeploy) (👨‍💻 4 · 🔀 38 · 📋 34 - 32% open · ⏱️ 04.01.2025):
git clone https://github.com/riga/tfdeploy
- [PyPi](https://pypi.org/project/tfdeploy) (📥 160 / month · ⏱️ 30.03.2017):
pip install tfdeploy
Show 10 hidden projects... - mmdnn (🥈25 · ⭐ 5.8K · 💀) - MMdnn is a set of tools to help users inter-operate among different deep.. MIT - m2cgen (🥈25 · ⭐ 2.9K · 💀) - Transform ML models into a native code (Java, C, Python, Go,.. MIT - sklearn-porter (🥉24 · ⭐ 1.3K · 💀) - Transpile trained scikit-learn estimators to C, Java,.. BSD-3 - cortex (🥉22 · ⭐ 8K · 💀) - Production infrastructure for machine learning at scale. Apache-2 - OMLT (🥉21 · ⭐ 320) - Represent trained machine learning models as Pyomo optimization.. ❗Unlicensed - pytorch2keras (🥉19 · ⭐ 860 · 💀) - PyTorch to Keras model convertor. MIT - Larq Compute Engine (🥉19 · ⭐ 250) - Highly optimized inference engine for Binarized.. Apache-2 - modelkit (🥉18 · ⭐ 150 · 💤) - Toolkit for developing and maintaining ML models. MIT - backprop (🥉15 · ⭐ 240 · 💀) - Backprop makes it simple to use, finetune, and deploy state-of-.. Apache-2 - ml-ane-transformers (🥉13 · ⭐ 2.6K · 💀) - Reference implementation of the Transformer.. ❗Unlicensed


Model Interpretability

Libraries to visualize, explain, debug, evaluate, and interpret machine learning models.
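
One model-agnostic technique that several of these libraries implement is permutation importance: shuffle one feature column and measure how much the error grows. The following is a minimal standard-library sketch, not the implementation used by any listed project; `mse`, `permutation_importance`, and the toy `model` are hypothetical.

```python
import random


def mse(model, X, y):
    """Mean squared error of a callable model on rows X with targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, shuffled, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy model that uses feature 0 and ignores feature 1.
def model(row):
    return 3.0 * row[0]


X = [[float(i), float(i % 3)] for i in range(20)]
y = [model(row) for row in X]
scores = permutation_importance(model, X, y)
```

Here `scores[0]` is large and `scores[1]` is zero, because the model never reads the second feature.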

shap (🥇42 · ⭐ 24K) - A game theoretic approach to explain the output of any machine learning model. MIT - [GitHub](https://github.com/shap/shap) (👨‍💻 270 · 🔀 3.4K · 📦 33K · 📋 2.6K - 25% open · ⏱️ 22.05.2025):
git clone https://github.com/shap/shap
- [PyPi](https://pypi.org/project/shap) (📥 7.2M / month · 📦 960 · ⏱️ 17.04.2025):
pip install shap
- [Conda](https://anaconda.org/conda-forge/shap) (📥 6.1M · ⏱️ 22.04.2025):
conda install -c conda-forge shap
arviz (🥇36 · ⭐ 1.7K) - Exploratory analysis of Bayesian models with Python. Apache-2 - [GitHub](https://github.com/arviz-devs/arviz) (👨‍💻 170 · 🔀 430 · 📥 180 · 📦 11K · 📋 890 - 21% open · ⏱️ 28.04.2025):
git clone https://github.com/arviz-devs/arviz
- [PyPi](https://pypi.org/project/arviz) (📥 1.7M / month · 📦 360 · ⏱️ 06.03.2025):
pip install arviz
- [Conda](https://anaconda.org/conda-forge/arviz) (📥 2.4M · ⏱️ 22.04.2025):
conda install -c conda-forge arviz
Netron (🥇35 · ⭐ 30K) - Visualizer for neural network, deep learning and machine learning.. MIT - [GitHub](https://github.com/lutzroeder/netron) (👨‍💻 2 · 🔀 2.9K · 📥 53K · 📦 13 · 📋 1.2K - 1% open · ⏱️ 21.05.2025):
git clone https://github.com/lutzroeder/netron
- [PyPi](https://pypi.org/project/netron) (📥 39K / month · 📦 88 · ⏱️ 16.05.2025):
pip install netron
Captum (🥇35 · ⭐ 5.2K) - Model interpretability and understanding for PyTorch. BSD-3 - [GitHub](https://github.com/pytorch/captum) (👨‍💻 130 · 🔀 510 · 📦 3.3K · 📋 600 - 42% open · ⏱️ 21.05.2025):
git clone https://github.com/pytorch/captum
- [PyPi](https://pypi.org/project/captum) (📥 300K / month · 📦 170 · ⏱️ 27.03.2025):
pip install captum
- [Conda](https://anaconda.org/conda-forge/captum) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge captum
InterpretML (🥇34 · ⭐ 6.5K) - Fit interpretable models. Explain blackbox machine learning. MIT - [GitHub](https://github.com/interpretml/interpret) (👨‍💻 48 · 🔀 740 · 📦 900 · 📋 470 - 22% open · ⏱️ 17.04.2025):
git clone https://github.com/interpretml/interpret
- [PyPi](https://pypi.org/project/interpret) (📥 180K / month · 📦 53 · ⏱️ 26.03.2025):
pip install interpret
shapash (🥈31 · ⭐ 2.9K) - Shapash: User-friendly Explainability and Interpretability to.. Apache-2 - [GitHub](https://github.com/MAIF/shapash) (👨‍💻 41 · 🔀 340 · 📦 190 · 📋 240 - 18% open · ⏱️ 16.05.2025):
git clone https://github.com/MAIF/shapash
- [PyPi](https://pypi.org/project/shapash) (📥 9.9K / month · 📦 4 · ⏱️ 20.03.2025):
pip install shapash
evaluate (🥈31 · ⭐ 2.2K) - Evaluate: A library for easily evaluating machine learning models.. Apache-2 - [GitHub](https://github.com/huggingface/evaluate) (👨‍💻 130 · 🔀 270 · 📦 22K · 📋 370 - 61% open · ⏱️ 10.01.2025):
git clone https://github.com/huggingface/evaluate
- [PyPi](https://pypi.org/project/evaluate) (📥 3M / month · 📦 400 · ⏱️ 11.09.2024):
pip install evaluate
explainerdashboard (🥈30 · ⭐ 2.4K) - Quickly build Explainable AI dashboards that show the inner.. MIT - [GitHub](https://github.com/oegedijk/explainerdashboard) (👨‍💻 21 · 🔀 340 · 📦 630 · 📋 240 - 15% open · ⏱️ 29.12.2024):
git clone https://github.com/oegedijk/explainerdashboard
- [PyPi](https://pypi.org/project/explainerdashboard) (📥 51K / month · 📦 13 · ⏱️ 29.12.2024):
pip install explainerdashboard
- [Conda](https://anaconda.org/conda-forge/explainerdashboard) (📥 66K · ⏱️ 22.04.2025):
conda install -c conda-forge explainerdashboard
fairlearn (🥈30 · ⭐ 2.1K) - A Python package to assess and improve fairness of machine.. MIT - [GitHub](https://github.com/fairlearn/fairlearn) (👨‍💻 100 · 🔀 450 · 📦 3 · 📋 540 - 28% open · ⏱️ 05.05.2025):
git clone https://github.com/fairlearn/fairlearn
- [PyPi](https://pypi.org/project/fairlearn) (📥 130K / month · 📦 63 · ⏱️ 11.12.2024):
pip install fairlearn
- [Conda](https://anaconda.org/conda-forge/fairlearn) (📥 46K · ⏱️ 22.04.2025):
conda install -c conda-forge fairlearn
dtreeviz (🥈28 · ⭐ 3.1K) - A python library for decision tree visualization and model interpretation. MIT - [GitHub](https://github.com/parrt/dtreeviz) (👨‍💻 27 · 🔀 340 · 📦 1.6K · 📋 210 - 34% open · ⏱️ 06.03.2025):
git clone https://github.com/parrt/dtreeviz
- [PyPi](https://pypi.org/project/dtreeviz) (📥 93K / month · 📦 53 · ⏱️ 07.07.2022):
pip install dtreeviz
- [Conda](https://anaconda.org/conda-forge/dtreeviz) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge dtreeviz
DoWhy (🥈27 · ⭐ 7.5K) - DoWhy is a Python library for causal inference that supports explicit.. MIT - [GitHub](https://github.com/py-why/dowhy) (👨‍💻 100 · 🔀 940 · 📥 43 · 📦 610 · 📋 500 - 27% open · ⏱️ 19.05.2025):
git clone https://github.com/py-why/dowhy
- [PyPi](https://pypi.org/project/dowhy) (📥 53K / month · 📦 18 · ⏱️ 24.11.2024):
pip install dowhy
- [Conda](https://anaconda.org/conda-forge/dowhy) (📥 44K · ⏱️ 22.04.2025):
conda install -c conda-forge dowhy
Fairness 360 (🥈27 · ⭐ 2.6K) - A comprehensive set of fairness metrics for datasets and.. Apache-2 - [GitHub](https://github.com/Trusted-AI/AIF360) (👨‍💻 73 · 🔀 850 · 📦 680 · 📋 300 - 65% open · ⏱️ 10.12.2024):
git clone https://github.com/Trusted-AI/AIF360
- [PyPi](https://pypi.org/project/aif360) (📥 23K / month · 📦 32 · ⏱️ 08.04.2024):
pip install aif360
- [Conda](https://anaconda.org/conda-forge/aif360) (📥 23K · ⏱️ 22.04.2025):
conda install -c conda-forge aif360
Model Analysis (🥈26 · ⭐ 1.3K) - Model analysis tools for TensorFlow. Apache-2 - [GitHub](https://github.com/tensorflow/model-analysis) (👨‍💻 59 · 🔀 280 · 📋 97 - 39% open · ⏱️ 28.04.2025):
git clone https://github.com/tensorflow/model-analysis
- [PyPi](https://pypi.org/project/tensorflow-model-analysis) (📥 69K / month · 📦 19 · ⏱️ 05.12.2024):
pip install tensorflow-model-analysis
LIT (🥉25 · ⭐ 3.6K · 📉) - The Learning Interpretability Tool: Interactively analyze ML models.. Apache-2 - [GitHub](https://github.com/PAIR-code/lit) (👨‍💻 38 · 🔀 360 · 📋 210 - 57% open · ⏱️ 20.12.2024):
git clone https://github.com/PAIR-code/lit
- [PyPi](https://pypi.org/project/lit-nlp) (📥 4.9K / month · 📦 3 · ⏱️ 20.12.2024):
pip install lit-nlp
- [Conda](https://anaconda.org/conda-forge/lit-nlp) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge lit-nlp
responsible-ai-widgets (🥉25 · ⭐ 1.5K) - Responsible AI Toolbox is a suite of tools providing.. MIT - [GitHub](https://github.com/microsoft/responsible-ai-toolbox) (👨‍💻 43 · 🔀 400 · 📋 320 - 26% open · ⏱️ 07.02.2025):
git clone https://github.com/microsoft/responsible-ai-toolbox
- [PyPi](https://pypi.org/project/raiwidgets) (📥 8.3K / month · 📦 6 · ⏱️ 08.07.2024):
pip install raiwidgets
imodels (🥉24 · ⭐ 1.5K) - Interpretable ML package for concise, transparent, and accurate.. MIT - [GitHub](https://github.com/csinva/imodels) (👨‍💻 25 · 🔀 120 · 📦 120 · 📋 95 - 38% open · ⏱️ 05.03.2025):
git clone https://github.com/csinva/imodels
- [PyPi](https://pypi.org/project/imodels) (📥 41K / month · 📦 9 · ⏱️ 15.10.2024):
pip install imodels
keract (🥉24 · ⭐ 1.1K) - Layers Outputs and Gradients in Keras. Made easy. MIT - [GitHub](https://github.com/philipperemy/keract) (👨‍💻 17 · 🔀 190 · 📦 250 · 📋 89 - 3% open · ⏱️ 07.04.2025):
git clone https://github.com/philipperemy/keract
- [PyPi](https://pypi.org/project/keract) (📥 3.8K / month · 📦 7 · ⏱️ 07.04.2025):
pip install keract
aequitas (🥉24 · ⭐ 720) - Bias Auditing & Fair ML Toolkit. MIT - [GitHub](https://github.com/dssg/aequitas) (👨‍💻 23 · 🔀 120 · 📦 190 · 📋 99 - 51% open · ⏱️ 25.03.2025):
git clone https://github.com/dssg/aequitas
- [PyPi](https://pypi.org/project/aequitas) (📥 18K / month · 📦 8 · ⏱️ 30.01.2024):
pip install aequitas
ecco (🥉22 · ⭐ 2K · 💤) - Explain, analyze, and visualize NLP language models. Ecco creates.. BSD-3 - [GitHub](https://github.com/jalammar/ecco) (👨‍💻 12 · 🔀 170 · 📥 140 · 📦 33 · 📋 64 - 51% open · ⏱️ 15.08.2024):
git clone https://github.com/jalammar/ecco
- [PyPi](https://pypi.org/project/ecco) (📥 660 / month · 📦 1 · ⏱️ 09.01.2022):
pip install ecco
- [Conda](https://anaconda.org/conda-forge/ecco) (📥 6.8K · ⏱️ 22.04.2025):
conda install -c conda-forge ecco
Explainability 360 (🥉22 · ⭐ 1.7K) - Interpretability and explainability of data and.. Apache-2 - [GitHub](https://github.com/Trusted-AI/AIX360) (👨‍💻 42 · 🔀 310 · 📦 160 · 📋 86 - 62% open · ⏱️ 26.02.2025):
git clone https://github.com/Trusted-AI/AIX360
- [PyPi](https://pypi.org/project/aix360) (📥 910 / month · 📦 1 · ⏱️ 31.07.2023):
pip install aix360
random-forest-importances (🥉22 · ⭐ 610) - Code to compute permutation and drop-column.. MIT - [GitHub](https://github.com/parrt/random-forest-importances) (👨‍💻 16 · 🔀 130 · 📦 180 · 📋 39 - 20% open · ⏱️ 24.03.2025):
git clone https://github.com/parrt/random-forest-importances
- [PyPi](https://pypi.org/project/rfpimp) (📥 14K / month · 📦 5 · ⏱️ 28.01.2021):
pip install rfpimp
DiCE (🥉20 · ⭐ 1.4K) - Generate Diverse Counterfactual Explanations for any machine.. MIT - [GitHub](https://github.com/interpretml/DiCE) (👨‍💻 19 · 🔀 200 · 📋 180 - 48% open · ⏱️ 22.11.2024):
git clone https://github.com/interpretml/DiCE
- [PyPi](https://pypi.org/project/dice-ml) (📥 32K / month · 📦 6 · ⏱️ 27.10.2023):
pip install dice-ml
LOFO (🥉20 · ⭐ 840) - Leave One Feature Out Importance. MIT - [GitHub](https://github.com/aerdem4/lofo-importance) (👨‍💻 6 · 🔀 87 · 📦 40 · 📋 30 - 13% open · ⏱️ 14.02.2025):
git clone https://github.com/aerdem4/lofo-importance
- [PyPi](https://pypi.org/project/lofo-importance) (📥 1.9K / month · 📦 5 · ⏱️ 14.02.2025):
pip install lofo-importance
fairness-indicators (🥉20 · ⭐ 350) - TensorFlow's Fairness Evaluation and Visualization.. Apache-2 - [GitHub](https://github.com/tensorflow/fairness-indicators) (👨‍💻 36 · 🔀 85 · 📋 39 - 74% open · ⏱️ 22.01.2025):
git clone https://github.com/tensorflow/fairness-indicators
- [PyPi](https://pypi.org/project/fairness-indicators) (📥 2.2K / month · ⏱️ 22.01.2025):
pip install fairness-indicators
ExplainX.ai (🥉15 · ⭐ 440 · 💤) - Explainable AI framework for data scientists. Explain & debug any.. MIT - [GitHub](https://github.com/explainX/explainx) (👨‍💻 5 · 🔀 55 · 📥 20 · 📋 39 - 25% open · ⏱️ 21.08.2024):
git clone https://github.com/explainX/explainx
- [PyPi](https://pypi.org/project/explainx) (📥 560 / month · ⏱️ 04.02.2021):
pip install explainx
Show 30 hidden projects... - Lime (🥇33 · ⭐ 12K · 💀) - Lime: Explaining the predictions of any machine learning classifier. BSD-2 - pyLDAvis (🥈30 · ⭐ 1.8K · 💀) - Python library for interactive topic model visualization... BSD-3 - yellowbrick (🥈28 · ⭐ 4.3K · 💀) - Visual analysis and diagnostic tools to facilitate.. Apache-2 - Deep Checks (🥈28 · ⭐ 3.8K) - Deepchecks: Tests for Continuous Validation of ML Models &.. ❗️AGPL-3.0 - eli5 (🥈28 · ⭐ 2.8K · 💀) - A library for debugging/inspecting machine learning classifiers and.. MIT - scikit-plot (🥈28 · ⭐ 2.4K · 💀) - An intuitive library to add plotting functionality to.. MIT - DALEX (🥈28 · ⭐ 1.4K) - moDel Agnostic Language for Exploration and eXplanation. ❗️GPL-3.0 - Alibi (🥈26 · ⭐ 2.5K) - Algorithms for explaining machine learning models. ❗️Intel - iNNvestigate (🥈26 · ⭐ 1.3K · 💀) - A toolbox to iNNvestigate neural networks' predictions!. BSD-2 - Lucid (🥉25 · ⭐ 4.7K · 💀) - A collection of infrastructure and tools for research in.. Apache-2 - checklist (🥉25 · ⭐ 2K · 💀) - Beyond Accuracy: Behavioral Testing of NLP models with CheckList. MIT - keras-vis (🥉24 · ⭐ 3K · 💀) - Neural network visualization toolkit for keras. MIT - CausalNex (🥉24 · ⭐ 2.3K · 💀) - A Python library that helps data scientists to infer.. Apache-2 - What-If Tool (🥉23 · ⭐ 950 · 💀) - Source code/webpage/demos for the What-If Tool. Apache-2 - TreeInterpreter (🥉23 · ⭐ 760 · 💀) - Package for interpreting scikit-learn's decision tree.. BSD-3 - tf-explain (🥉22 · ⭐ 1K · 💀) - Interpretability Methods for tf.keras models with Tensorflow.. MIT - deeplift (🥉21 · ⭐ 860 · 💀) - Public facing deeplift repo. MIT - Quantus (🥉21 · ⭐ 600) - Quantus is an eXplainable AI toolkit for responsible evaluation of.. ❗️GPL-3.0 - tcav (🥉20 · ⭐ 640 · 💀) - Code for the TCAV ML interpretability project. Apache-2 - XAI (🥉19 · ⭐ 1.2K · 💀) - XAI - An eXplainability toolbox for machine learning. MIT - model-card-toolkit (🥉18 · ⭐ 430 · 💀) - A toolkit that streamlines and automates the.. Apache-2 - sklearn-evaluation (🥉17 · ⭐ 460 · 💀) - Machine learning model evaluation made easy: plots,.. MIT - FlashTorch (🥉16 · ⭐ 740 · 💀) - Visualization toolkit for neural networks in PyTorch! Demo --. MIT - Skater (🥉15 · ⭐ 1.1K) - Python Library for Model Interpretation/Explanations. ❗️UPL-1.0 - Anchor (🥉15 · ⭐ 800 · 💀) - Code for High-Precision Model-Agnostic Explanations paper. BSD-2 - effector (🥉15 · ⭐ 110) - Effector - a Python package for global and regional effect methods. MIT - interpret-text (🥉14 · ⭐ 420 · 💀) - A library that incorporates state-of-the-art explainers.. MIT - bias-detector (🥉13 · ⭐ 43 · 💀) - Bias Detector is a python package for detecting bias in machine.. MIT - Attribution Priors (🥉12 · ⭐ 120 · 💀) - Tools for training explainable models using.. MIT - contextual-ai (🥉12 · ⭐ 87 · 💀) - Contextual AI adds explainability to different stages of.. Apache-2


Vector Similarity Search (ANN)

Libraries for Approximate Nearest Neighbor Search and Vector Indexing/Similarity Search.

🔗 ANN Benchmarks ( ⭐ 5.3K) - Benchmarks of approximate nearest neighbor libraries in Python.
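
For context, approximate nearest neighbor libraries trade a small amount of recall for speed relative to exact search. The sketch below is that exact baseline: it scores every stored vector by cosine similarity and keeps the top k. It is standard-library only; `cosine_similarity`, `top_k`, and the toy vectors are hypothetical and do not reflect the API of any project listed here.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, vectors, k=2):
    """Exact search: score every vector, return the k best (key, score) pairs."""
    scored = ((cosine_similarity(query, v), key) for key, v in vectors.items())
    return [(key, score) for score, key in heapq.nlargest(k, scored)]


vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
result = top_k([1.0, 0.0, 0.0], vectors, k=2)
```

This scan is linear in the number of stored vectors, which is the cost that index structures such as graphs, trees, or quantized clusters are designed to avoid.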

Milvus (🥇42 · ⭐ 35K · 📈) - Milvus is a high-performance, cloud-native vector database built.. Apache-2 - [GitHub](https://github.com/milvus-io/milvus) (👨‍💻 310 · 🔀 3.2K · 📥 360K · 📋 14K - 5% open · ⏱️ 22.05.2025):
git clone https://github.com/milvus-io/milvus
- [PyPi](https://pypi.org/project/pymilvus) (📥 1.8M / month · 📦 270 · ⏱️ 19.05.2025):
pip install pymilvus
- [Docker Hub](https://hub.docker.com/r/milvusdb/milvus) (📥 69M · ⭐ 81 · ⏱️ 22.05.2025):
docker pull milvusdb/milvus
Faiss (🥇41 · ⭐ 35K) - A library for efficient similarity search and clustering of dense vectors. MIT - [GitHub](https://github.com/facebookresearch/faiss) (👨‍💻 220 · 🔀 3.8K · 📦 4.8K · 📋 2.7K - 9% open · ⏱️ 21.05.2025):
git clone https://github.com/facebookresearch/faiss
- [PyPi](https://pypi.org/project/faiss-cpu):
pip install faiss-cpu
- [Conda](https://anaconda.org/conda-forge/faiss) (📥 2.6M · ⏱️ 22.04.2025):
conda install -c conda-forge faiss
Annoy (🥈35 · ⭐ 14K · 💤) - Approximate Nearest Neighbors in C++/Python optimized for memory.. Apache-2 - [GitHub](https://github.com/spotify/annoy) (👨‍💻 88 · 🔀 1.2K · 📦 5.1K · 📋 410 - 15% open · ⏱️ 29.07.2024):
git clone https://github.com/spotify/annoy
- [PyPi](https://pypi.org/project/annoy) (📥 700K / month · 📦 200 · ⏱️ 14.06.2023):
pip install annoy
- [Conda](https://anaconda.org/conda-forge/python-annoy) (📥 690K · ⏱️ 22.04.2025):
conda install -c conda-forge python-annoy
hnswlib (🥈32 · ⭐ 4.7K · 💤) - Header-only C++/python library for fast approximate nearest.. Apache-2 - [GitHub](https://github.com/nmslib/hnswlib) (👨‍💻 72 · 🔀 680 · 📦 8.1K · 📋 420 - 60% open · ⏱️ 17.06.2024):
git clone https://github.com/nmslib/hnswlib
- [PyPi](https://pypi.org/project/hnswlib) (📥 470K / month · 📦 130 · ⏱️ 03.12.2023):
pip install hnswlib
- [Conda](https://anaconda.org/conda-forge/hnswlib) (📥 360K · ⏱️ 22.04.2025):
conda install -c conda-forge hnswlib
NMSLIB (🥈31 · ⭐ 3.5K · 💤) - Non-Metric Space Library (NMSLIB): An efficient similarity search.. Apache-2 - [GitHub](https://github.com/nmslib/nmslib) (👨‍💻 49 · 🔀 460 · 📦 1.4K · 📋 440 - 21% open · ⏱️ 21.09.2024):
git clone https://github.com/nmslib/nmslib
- [PyPi](https://pypi.org/project/nmslib) (📥 390K / month · 📦 63 · ⏱️ 03.02.2021):
pip install nmslib
- [Conda](https://anaconda.org/conda-forge/nmslib) (📥 200K · ⏱️ 22.04.2025):
conda install -c conda-forge nmslib
USearch (🥈31 · ⭐ 2.7K) - Fast Open-Source Search & Clustering engine for Vectors & Strings in.. Apache-2 - [GitHub](https://github.com/unum-cloud/usearch) (👨‍💻 70 · 🔀 180 · 📥 69K · 📦 180 · 📋 210 - 42% open · ⏱️ 16.04.2025):
git clone https://github.com/unum-cloud/usearch
- [PyPi](https://pypi.org/project/usearch) (📥 130K / month · 📦 35 · ⏱️ 16.04.2025):
pip install usearch
- [npm](https://www.npmjs.com/package/usearch) (📥 7.1K / month · 📦 15 · ⏱️ 23.01.2025):
npm install usearch
- [Docker Hub](https://hub.docker.com/r/unum/usearch) (📥 200 · ⭐ 1 · ⏱️ 16.04.2025):
docker pull unum/usearch
PyNNDescent (🥉28 · ⭐ 920) - A Python nearest neighbor descent for approximate nearest neighbors. BSD-2 - [GitHub](https://github.com/lmcinnes/pynndescent) (👨‍💻 30 · 🔀 110 · 📦 12K · 📋 140 - 52% open · ⏱️ 10.11.2024):
git clone https://github.com/lmcinnes/pynndescent
- [PyPi](https://pypi.org/project/pynndescent) (📥 1.5M / month · 📦 160 · ⏱️ 17.06.2024):
pip install pynndescent
- [Conda](https://anaconda.org/conda-forge/pynndescent) (📥 2.3M · ⏱️ 22.04.2025):
conda install -c conda-forge pynndescent
NGT (🥉24 · ⭐ 1.3K) - Nearest Neighbor Search with Neighborhood Graph and Tree for High-.. Apache-2 - [GitHub](https://github.com/yahoojapan/NGT) (👨‍💻 19 · 🔀 120 · 📋 150 - 16% open · ⏱️ 30.04.2025):
git clone https://github.com/yahoojapan/NGT
- [PyPi](https://pypi.org/project/ngt) (📥 2.2K / month · 📦 12 · ⏱️ 26.02.2025):
pip install ngt
Show 4 hidden projects...
- NearPy (🥉21 · ⭐ 770 · 💀) - Python framework for fast (approximated) nearest neighbour search in.. MIT
- N2 (🥉21 · ⭐ 580 · 💀) - TOROS N2 - lightweight approximate Nearest Neighbor library which runs.. Apache-2
- Magnitude (🥉20 · ⭐ 1.6K · 💀) - A fast, efficient universal vector embedding utility package. MIT
- PySparNN (🥉11 · ⭐ 920 · 💀) - Approximate Nearest Neighbor Search for Sparse Data in Python!. BSD-3


Probabilistics & Statistics

Libraries providing capabilities for probabilistic programming/reasoning, Bayesian inference, Gaussian processes, or statistics.
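
To make the idea of sampling-based inference concrete, here is a minimal random-walk Metropolis sampler written with the standard library only. It is a sketch under stated assumptions: the target (a standard normal), the function names (`log_density`, `metropolis`), and the step size are all chosen for illustration, and this is not the algorithm used by any particular library listed here.

```python
import math
import random


def log_density(x):
    """Unnormalised log-density of the illustrative target, a standard normal."""
    return -0.5 * x * x


def metropolis(log_p, n_samples, step=1.0, x0=0.0, seed=0):
    """Draw n_samples correlated samples from exp(log_p) by random-walk Metropolis."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p(proposal) - log_p(x)))
        if rng.random() < accept_prob:
            x = proposal
        samples.append(x)
    return samples


samples = metropolis(log_density, n_samples=5000)
mean = sum(samples) / len(samples)
print(f"sample mean: {mean:.3f}")
```

The sample mean should land near 0 for this target. The libraries in this section generally offer more efficient samplers, convergence diagnostics, and model-building tools on top of this basic idea.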

PyMC3 (🥇41 · ⭐ 9K) - Bayesian Modeling and Probabilistic Programming in Python. Apache-2 - [GitHub](https://github.com/pymc-devs/pymc) (👨‍💻 520 · 🔀 2.1K · 📥 2K · 📦 7.2K · 📋 3.5K - 10% open · ⏱️ 21.05.2025):
git clone https://github.com/pymc-devs/pymc
- [PyPi](https://pypi.org/project/pymc3) (📥 270K / month · 📦 190 · ⏱️ 31.05.2024):
pip install pymc3
- [Conda](https://anaconda.org/conda-forge/pymc3) (📥 670K · ⏱️ 22.04.2025):
conda install -c conda-forge pymc3
tensorflow-probability (🥇36 · ⭐ 4.3K) - Probabilistic reasoning and statistical analysis in.. Apache-2 - [GitHub](https://github.com/tensorflow/probability) (👨‍💻 500 · 🔀 1.1K · 📦 4 · 📋 1.5K - 48% open · ⏱️ 14.05.2025):
git clone https://github.com/tensorflow/probability
- [PyPi](https://pypi.org/project/tensorflow-probability) (📥 930K / month · 📦 620 · ⏱️ 08.11.2024):
pip install tensorflow-probability
- [Conda](https://anaconda.org/conda-forge/tensorflow-probability) (📥 180K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow-probability
Pyro (🥇34 · ⭐ 8.8K) - Deep universal probabilistic programming with Python and PyTorch. Apache-2 - [GitHub](https://github.com/pyro-ppl/pyro) (👨‍💻 160 · 🔀 990 · 📋 1.1K - 23% open · ⏱️ 24.04.2025):
git clone https://github.com/pyro-ppl/pyro
- [PyPi](https://pypi.org/project/pyro-ppl) (📥 460K / month · 📦 190 · ⏱️ 02.06.2024):
pip install pyro-ppl
- [Conda](https://anaconda.org/conda-forge/pyro-ppl) (📥 240K · ⏱️ 22.04.2025):
conda install -c conda-forge pyro-ppl
pgmpy (🥇34 · ⭐ 2.9K) - Python Library for Causal and Probabilistic Modeling using Bayesian Networks. MIT - [GitHub](https://github.com/pgmpy/pgmpy) (👨‍💻 150 · 🔀 750 · 📥 610 · 📦 1.6K · 📋 1K - 29% open · ⏱️ 15.05.2025):
git clone https://github.com/pgmpy/pgmpy
- [PyPi](https://pypi.org/project/pgmpy) (📥 180K / month · 📦 72 · ⏱️ 31.03.2025):
pip install pgmpy
GPyTorch (🥈31 · ⭐ 3.7K) - A highly efficient implementation of Gaussian Processes in PyTorch. MIT - [GitHub](https://github.com/cornellius-gp/gpytorch) (👨‍💻 140 · 🔀 560 · 📦 2.9K · 📋 1.4K - 27% open · ⏱️ 07.02.2025):
git clone https://github.com/cornellius-gp/gpytorch
- [PyPi](https://pypi.org/project/gpytorch) (📥 390K / month · 📦 190 · ⏱️ 29.01.2025):
pip install gpytorch
- [Conda](https://anaconda.org/conda-forge/gpytorch) (📥 210K · ⏱️ 22.04.2025):
conda install -c conda-forge gpytorch
patsy (🥈31 · ⭐ 970) - Describing statistical models in Python using symbolic formulas. BSD-2 - [GitHub](https://github.com/pydata/patsy) (👨‍💻 22 · 🔀 100 · 📦 120K · 📋 160 - 46% open · ⏱️ 24.02.2025):
git clone https://github.com/pydata/patsy
- [PyPi](https://pypi.org/project/patsy) (📥 16M / month · 📦 530 · ⏱️ 12.11.2024):
pip install patsy
- [Conda](https://anaconda.org/conda-forge/patsy) (📥 16M · ⏱️ 22.04.2025):
conda install -c conda-forge patsy
hmmlearn (🥈30 · ⭐ 3.2K · 💤) - Hidden Markov Models in Python, with scikit-learn like API. BSD-3 - [GitHub](https://github.com/hmmlearn/hmmlearn) (👨‍💻 49 · 🔀 740 · 📦 3.5K · 📋 450 - 15% open · ⏱️ 31.10.2024):
git clone https://github.com/hmmlearn/hmmlearn
- [PyPi](https://pypi.org/project/hmmlearn) (📥 150K / month · 📦 92 · ⏱️ 31.10.2024):
pip install hmmlearn
- [Conda](https://anaconda.org/conda-forge/hmmlearn) (📥 380K · ⏱️ 22.04.2025):
conda install -c conda-forge hmmlearn
GPflow (🥈29 · ⭐ 1.9K) - Gaussian processes in TensorFlow. Apache-2 - [GitHub](https://github.com/GPflow/GPflow) (👨‍💻 84 · 🔀 440 · 📦 760 · 📋 840 - 18% open · ⏱️ 22.05.2025):
git clone https://github.com/GPflow/GPflow
- [PyPi](https://pypi.org/project/gpflow) (📥 71K / month · 📦 35 · ⏱️ 17.06.2024):
pip install gpflow
- [Conda](https://anaconda.org/conda-forge/gpflow) (📥 45K · ⏱️ 22.04.2025):
conda install -c conda-forge gpflow
emcee (🥈29 · ⭐ 1.5K) - The Python ensemble sampling toolkit for affine-invariant MCMC. MIT - [GitHub](https://github.com/dfm/emcee) (👨‍💻 75 · 🔀 430 · 📦 3K · 📋 300 - 19% open · ⏱️ 16.03.2025):
git clone https://github.com/dfm/emcee
- [PyPi](https://pypi.org/project/emcee) (📥 150K / month · 📦 440 · ⏱️ 19.04.2024):
pip install emcee
- [Conda](https://anaconda.org/conda-forge/emcee) (📥 410K · ⏱️ 22.04.2025):
conda install -c conda-forge emcee
bambi (🥈29 · ⭐ 1.2K) - BAyesian Model-Building Interface (Bambi) in Python. MIT - [GitHub](https://github.com/bambinos/bambi) (👨‍💻 47 · 🔀 130 · 📦 200 · 📋 440 - 20% open · ⏱️ 21.05.2025):
git clone https://github.com/bambinos/bambi
- [PyPi](https://pypi.org/project/bambi) (📥 34K / month · 📦 14 · ⏱️ 21.12.2024):
pip install bambi
- [Conda](https://anaconda.org/conda-forge/bambi) (📥 49K · ⏱️ 22.04.2025):
conda install -c conda-forge bambi
pomegranate (🥉28 · ⭐ 3.4K) - Fast, flexible and easy to use probabilistic modelling in Python. MIT - [GitHub](https://github.com/jmschrei/pomegranate) (👨‍💻 75 · 🔀 590 · 📋 790 - 3% open · ⏱️ 07.02.2025):
git clone https://github.com/jmschrei/pomegranate
- [PyPi](https://pypi.org/project/pomegranate) (📥 22K / month · 📦 67 · ⏱️ 07.02.2025):
pip install pomegranate
- [Conda](https://anaconda.org/conda-forge/pomegranate) (📥 210K · ⏱️ 22.04.2025):
conda install -c conda-forge pomegranate
SALib (🥉28 · ⭐ 920) - Sensitivity Analysis Library in Python. Contains Sobol, Morris, FAST, and.. MIT - [GitHub](https://github.com/SALib/SALib) (👨‍💻 51 · 🔀 240 · 📦 1.5K · 📋 340 - 15% open · ⏱️ 18.04.2025):
git clone https://github.com/SALib/SALib
- [PyPi](https://pypi.org/project/salib) (📥 250K / month · 📦 130 · ⏱️ 19.08.2024):
pip install salib
- [Conda](https://anaconda.org/conda-forge/salib) (📥 220K · ⏱️ 22.04.2025):
conda install -c conda-forge salib
PyStan (🥉28 · ⭐ 350 · 💤) - PyStan, a Python interface to Stan, a platform for statistical.. ISC - [GitHub](https://github.com/stan-dev/pystan) (👨‍💻 14 · 🔀 61 · 📦 10K · 📋 200 - 6% open · ⏱️ 03.07.2024):
git clone https://github.com/stan-dev/pystan
- [PyPi](https://pypi.org/project/pystan) (📥 670K / month · 📦 160 · ⏱️ 03.07.2024):
pip install pystan
- [Conda](https://anaconda.org/conda-forge/pystan) (📥 3M · ⏱️ 22.04.2025):
conda install -c conda-forge pystan
scikit-posthocs (🥉27 · ⭐ 370) - Multiple Pairwise Comparisons (Post Hoc) Tests in Python. MIT - [GitHub](https://github.com/maximtrp/scikit-posthocs) (👨‍💻 16 · 🔀 40 · 📥 66 · 📦 1.1K · 📋 72 - 6% open · ⏱️ 16.04.2025):
git clone https://github.com/maximtrp/scikit-posthocs
- [PyPi](https://pypi.org/project/scikit-posthocs) (📥 94K / month · 📦 73 · ⏱️ 29.03.2025):
pip install scikit-posthocs
- [Conda](https://anaconda.org/conda-forge/scikit-posthocs) (📥 1M · ⏱️ 22.04.2025):
conda install -c conda-forge scikit-posthocs
Orbit (🥉24 · ⭐ 2K · 💤) - A Python package for Bayesian forecasting with object-oriented.. Apache-2 - [GitHub](https://github.com/uber/orbit) (👨‍💻 20 · 🔀 140 · 📦 72 · 📋 400 - 12% open · ⏱️ 10.07.2024):
git clone https://github.com/uber/orbit
- [PyPi](https://pypi.org/project/orbit-ml) (📥 20K / month · 📦 1 · ⏱️ 01.04.2024):
pip install orbit-ml
TorchUncertainty (🥉23 · ⭐ 390) - Open-source framework for uncertainty and deep.. Apache-2 - [GitHub](https://github.com/ENSTA-U2IS-AI/torch-uncertainty) (👨‍💻 13 · 🔀 32 · 📋 55 - 18% open · ⏱️ 21.05.2025):
git clone https://github.com/ENSTA-U2IS-AI/torch-uncertainty
- [PyPi](https://pypi.org/project/torch-uncertainty) (📥 540 / month · 📦 4 · ⏱️ 21.05.2025):
pip install torch-uncertainty
pandas-ta (🥉22 · ⭐ 5.5K) - Technical Analysis Indicators - Pandas TA is an easy to use.. MIT - [GitHub](https://github.com/twopirllc/pandas-ta) (👨‍💻 40 · 🔀 1.1K):
git clone https://github.com/twopirllc/pandas-ta
- [PyPi](https://pypi.org/project/pandas-ta) (📥 200K / month · 📦 140 · ⏱️ 28.07.2021):
pip install pandas-ta
- [Conda](https://anaconda.org/conda-forge/pandas-ta) (📥 27K · ⏱️ 22.04.2025):
conda install -c conda-forge pandas-ta
Baal (🥉22 · ⭐ 900 · 💤) - Bayesian active learning library for research and industrial usecases. Apache-2 - [GitHub](https://github.com/baal-org/baal) (👨‍💻 23 · 🔀 87 · 📦 66 · 📋 110 - 17% open · ⏱️ 27.06.2024):
git clone https://github.com/baal-org/baal
- [PyPi](https://pypi.org/project/baal) (📥 1.2K / month · 📦 2 · ⏱️ 11.06.2024):
pip install baal
- [Conda](https://anaconda.org/conda-forge/baal) (📥 13K · ⏱️ 22.04.2025):
conda install -c conda-forge baal
pyhsmm (🥉21 · ⭐ 560) - Bayesian inference in HSMMs and HMMs. MIT - [GitHub](https://github.com/mattjj/pyhsmm) (👨‍💻 14 · 🔀 170 · 📦 35 · 📋 100 - 39% open · ⏱️ 25.01.2025):
git clone https://github.com/mattjj/pyhsmm
- [PyPi](https://pypi.org/project/pyhsmm) (📥 160 / month · 📦 1 · ⏱️ 10.05.2017):
pip install pyhsmm
Show 5 hidden projects...
- filterpy (🥈32 · ⭐ 3.6K · 💀) - Python Kalman filtering and optimal estimation library. Implements.. MIT
- pingouin (🥈30 · ⭐ 1.8K) - Statistical package in Python based on Pandas. ❗️GPL-3.0
- Edward (🥉27 · ⭐ 4.8K · 💀) - A probabilistic programming language in TensorFlow. Deep.. Apache-2
- Funsor (🥉20 · ⭐ 240 · 💀) - Functional tensors for probabilistic programming. Apache-2
- ZhuSuan (🥉15 · ⭐ 2.2K · 💀) - A probabilistic programming library for Bayesian deep learning,.. MIT


Adversarial Robustness

Libraries for testing the robustness of machine learning models against attacks with adversarial/malicious examples.
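
As a rough illustration of what an adversarial example is, the sketch below applies a single fast gradient sign (FGSM-style) step to a tiny hand-written logistic regression model. Everything here is an assumption made for the example: the weights, the input, the `epsilon` value, and the function names (`predict`, `fgsm`) are hypothetical and are not taken from any library listed in this section.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Perturb x by epsilon in the direction that increases the cross-entropy loss.

    For logistic regression the gradient of the loss with respect to the
    input is (p - label) * weights, so no autodiff is needed here.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, for illustration only.
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1

x_adv = fgsm(weights, bias, x, label, epsilon=0.5)
print("clean prediction:      ", round(predict(weights, bias, x), 3))
print("adversarial prediction:", round(predict(weights, bias, x_adv), 3))
```

With these toy values the perturbed input moves the predicted probability from above 0.5 to below it, even though no feature changes by more than `epsilon`. The libraries in this section implement this kind of attack, and many stronger ones, for real models and frameworks.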

ART (🥇33 · ⭐ 5.3K · 📈) - Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning.. MIT - [GitHub](https://github.com/Trusted-AI/adversarial-robustness-toolbox) (👨‍💻 140 · 🔀 1.2K · 📦 740 · 📋 910 - 2% open · ⏱️ 22.05.2025):
git clone https://github.com/Trusted-AI/adversarial-robustness-toolbox
- [PyPi](https://pypi.org/project/adversarial-robustness-toolbox) (📥 28K / month · 📦 20 · ⏱️ 22.01.2025):
pip install adversarial-robustness-toolbox
- [Conda](https://anaconda.org/conda-forge/adversarial-robustness-toolbox) (📥 72K · ⏱️ 22.04.2025):
conda install -c conda-forge adversarial-robustness-toolbox
TextAttack (🥈28 · ⭐ 3.2K · 💤) - TextAttack is a Python framework for adversarial attacks, data.. MIT - [GitHub](https://github.com/QData/TextAttack) (👨‍💻 66 · 🔀 400 · 📦 410 · 📋 290 - 23% open · ⏱️ 25.07.2024):
git clone https://github.com/QData/TextAttack
- [PyPi](https://pypi.org/project/textattack) (📥 6.6K / month · 📦 11 · ⏱️ 11.03.2024):
pip install textattack
- [Conda](https://anaconda.org/conda-forge/textattack) (📥 10K · ⏱️ 22.04.2025):
conda install -c conda-forge textattack
Show 7 hidden projects...
- CleverHans (🥈30 · ⭐ 6.3K · 💀) - An adversarial example library for constructing attacks,.. MIT
- Foolbox (🥈28 · ⭐ 2.9K · 💀) - A Python toolbox to create adversarial examples that fool neural.. MIT
- advertorch (🥉24 · ⭐ 1.3K · 💀) - A Toolbox for Adversarial Robustness Research. ❗️GPL-3.0
- robustness (🥉20 · ⭐ 940 · 💀) - A library for experimenting with, training and evaluating neural.. MIT
- AdvBox (🥉19 · ⭐ 1.4K · 💀) - Advbox is a toolbox to generate adversarial examples that fool.. Apache-2
- textflint (🥉17 · ⭐ 640 · 💀) - Unified Multilingual Robustness Evaluation Toolkit for.. ❗️GPL-3.0
- Adversary (🥉16 · ⭐ 400 · 💀) - Tool to generate adversarial text examples and test machine.. MIT


GPU & Accelerator Utilities

Libraries that require and make use of CUDA/GPU or other accelerator hardware capabilities to optimize machine learning tasks.

cuDF (🥇35 · ⭐ 8.9K) - cuDF - GPU DataFrame Library. Apache-2 - [GitHub](https://github.com/rapidsai/cudf) (👨‍💻 310 · 🔀 940 · 📦 62 · 📋 7K - 15% open · ⏱️ 21.05.2025):
git clone https://github.com/rapidsai/cudf
- [PyPi](https://pypi.org/project/cudf) (📥 3.3K / month · 📦 22 · ⏱️ 01.06.2020):
pip install cudf
optimum (🥇35 · ⭐ 2.9K) - Accelerate inference and training of Transformers, Diffusers, TIMM.. Apache-2 - [GitHub](https://github.com/huggingface/optimum) (👨‍💻 150 · 🔀 540 · 📦 5.6K · 📋 890 - 40% open · ⏱️ 20.05.2025):
git clone https://github.com/huggingface/optimum
- [PyPi](https://pypi.org/project/optimum) (📥 1.3M / month · 📦 240 · ⏱️ 16.05.2025):
pip install optimum
- [Conda](https://anaconda.org/conda-forge/optimum) (📥 39K · ⏱️ 22.04.2025):
conda install -c conda-forge optimum
Apex (🥈32 · ⭐ 8.7K) - A PyTorch Extension: Tools for easy mixed precision and distributed.. BSD-3 - [GitHub](https://github.com/NVIDIA/apex) (👨‍💻 130 · 🔀 1.4K · 📦 3.2K · 📋 1.3K - 58% open · ⏱️ 15.05.2025):
git clone https://github.com/NVIDIA/apex
- [Conda](https://anaconda.org/conda-forge/nvidia-apex) (📥 500K · ⏱️ 22.04.2025):
conda install -c conda-forge nvidia-apex
cuML (🥈32 · ⭐ 4.7K) - cuML - RAPIDS Machine Learning Library. Apache-2 - [GitHub](https://github.com/rapidsai/cuml) (👨‍💻 180 · 🔀 570 · 📋 2.8K - 35% open · ⏱️ 22.05.2025):
git clone https://github.com/rapidsai/cuml
- [PyPi](https://pypi.org/project/cuml) (📥 4.6K / month · 📦 14 · ⏱️ 01.06.2020):
pip install cuml
PyCUDA (🥈32 · ⭐ 1.9K) - CUDA integration for Python, plus shiny features. MIT - [GitHub](https://github.com/inducer/pycuda) (👨‍💻 82 · 🔀 290 · 📦 3.8K · 📋 280 - 30% open · ⏱️ 06.05.2025):
git clone https://github.com/inducer/pycuda
- [PyPi](https://pypi.org/project/pycuda) (📥 120K / month · 📦 170 · ⏱️ 07.02.2025):
pip install pycuda
- [Conda](https://anaconda.org/conda-forge/pycuda) (📥 980K · ⏱️ 22.04.2025):
conda install -c conda-forge pycuda
gpustat (🥈29 · ⭐ 4.2K) - A simple command-line utility for querying and monitoring GPU status. MIT - [GitHub](https://github.com/wookayin/gpustat) (👨‍💻 17 · 🔀 280 · 📦 7.4K · 📋 130 - 22% open · ⏱️ 13.04.2025):
git clone https://github.com/wookayin/gpustat
- [PyPi](https://pypi.org/project/gpustat) (📥 740K / month · 📦 150 · ⏱️ 22.08.2023):
pip install gpustat
- [Conda](https://anaconda.org/conda-forge/gpustat) (📥 310K · ⏱️ 22.04.2025):
conda install -c conda-forge gpustat
ArrayFire (🥈28 · ⭐ 4.7K) - ArrayFire: a general purpose GPU library. BSD-3 - [GitHub](https://github.com/arrayfire/arrayfire) (👨‍💻 97 · 🔀 540 · 📥 8.4K · 📋 1.8K - 19% open · ⏱️ 04.04.2025):
git clone https://github.com/arrayfire/arrayfire
- [PyPi](https://pypi.org/project/arrayfire) (📥 3.7K / month · 📦 10 · ⏱️ 22.02.2022):
pip install arrayfire
cuGraph (🥈28 · ⭐ 2K) - cuGraph - RAPIDS Graph Analytics Library. Apache-2 - [GitHub](https://github.com/rapidsai/cugraph) (👨‍💻 120 · 🔀 330 · 📋 1.8K - 9% open · ⏱️ 21.05.2025):
git clone https://github.com/rapidsai/cugraph
- [PyPi](https://pypi.org/project/cugraph) (📥 370 / month · 📦 4 · ⏱️ 01.06.2020):
pip install cugraph
- [Conda](https://anaconda.org/conda-forge/libcugraph) (📥 65K · ⏱️ 22.04.2025):
conda install -c conda-forge libcugraph
CuPy (🥉27 · ⭐ 10K) - NumPy & SciPy for GPU. MIT - [GitHub](https://github.com/cupy/cupy) (👨‍💻 340 · 🔀 900):
git clone https://github.com/cupy/cupy
- [PyPi](https://pypi.org/project/cupy) (📥 26K / month · 📦 350 · ⏱️ 04.04.2025):
pip install cupy
- [Conda](https://anaconda.org/conda-forge/cupy) (📥 6.4M · ⏱️ 22.04.2025):
conda install -c conda-forge cupy
- [Docker Hub](https://hub.docker.com/r/cupy/cupy) (📥 83K · ⭐ 13 · ⏱️ 04.04.2025):
docker pull cupy/cupy
DALI (🥉25 · ⭐ 5.4K) - A GPU-accelerated library containing highly optimized building blocks.. Apache-2 - [GitHub](https://github.com/NVIDIA/DALI) (👨‍💻 99 · 🔀 640 · 📋 1.7K - 14% open · ⏱️ 21.05.2025):
git clone https://github.com/NVIDIA/DALI
Vulkan Kompute (🥉23 · ⭐ 2.2K) - General purpose GPU compute framework built on Vulkan to.. Apache-2 - [GitHub](https://github.com/KomputeProject/kompute) (👨‍💻 32 · 🔀 160 · 📥 640 · 📋 230 - 32% open · ⏱️ 19.03.2025):
git clone https://github.com/KomputeProject/kompute
- [PyPi](https://pypi.org/project/kp) (📥 280 / month · ⏱️ 20.01.2024):
pip install kp
Merlin (🥉20 · ⭐ 830 · 💤) - NVIDIA Merlin is an open source library providing end-to-end GPU-.. Apache-2 - [GitHub](https://github.com/NVIDIA-Merlin/Merlin) (👨‍💻 32 · 🔀 120 · 📋 460 - 46% open · ⏱️ 22.07.2024):
git clone https://github.com/NVIDIA-Merlin/Merlin
- [PyPi](https://pypi.org/project/merlin-core) (📥 18K / month · 📦 1 · ⏱️ 29.08.2023):
pip install merlin-core
Show 8 hidden projects...
- GPUtil (🥉25 · ⭐ 1.2K · 💀) - A Python module for getting the GPU status from NVIDIA GPUs using.. MIT
- scikit-cuda (🥉24 · ⭐ 990 · 💀) - Python interface to GPU-powered libraries. BSD-3
- py3nvml (🥉23 · ⭐ 250 · 💀) - Python 3 Bindings for NVML library. Get NVIDIA GPU status inside.. BSD-3
- BlazingSQL (🥉21 · ⭐ 2K · 💀) - BlazingSQL is a lightweight, GPU accelerated, SQL engine for.. Apache-2
- nvidia-ml-py3 (🥉19 · ⭐ 140 · 💤) - Python 3 Bindings for the NVIDIA Management Library. BSD-3
- cuSignal (🥉16 · ⭐ 720 · 💀) - GPU accelerated signal processing. Apache-2
- ipyexperiments (🥉16 · ⭐ 220 · 💀) - Automatic GPU+CPU memory profiling, re-use and memory.. Apache-2
- SpeedTorch (🥉15 · ⭐ 680 · 💀) - Library for faster pinned CPU - GPU transfer in Pytorch. MIT


Tensorflow Utilities

Libraries that extend TensorFlow with additional capabilities.

TensorFlow Datasets (🥇38 · ⭐ 4.4K) - TFDS is a collection of datasets ready to use with.. Apache-2 - [GitHub](https://github.com/tensorflow/datasets) (👨‍💻 520 · 🔀 1.6K · 📦 24K · 📋 1.5K - 47% open · ⏱️ 21.05.2025):
git clone https://github.com/tensorflow/datasets
- [PyPi](https://pypi.org/project/tensorflow-datasets) (📥 1.3M / month · 📦 340 · ⏱️ 12.03.2025):
pip install tensorflow-datasets
- [Conda](https://anaconda.org/conda-forge/tensorflow-datasets) (📥 46K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow-datasets
tensorflow-hub (🥇33 · ⭐ 3.5K) - A library for transfer learning by reusing parts of.. Apache-2 - [GitHub](https://github.com/tensorflow/hub) (👨‍💻 110 · 🔀 1.7K · 📋 710 - 2% open · ⏱️ 17.01.2025):
git clone https://github.com/tensorflow/hub
- [PyPi](https://pypi.org/project/tensorflow-hub) (📥 1.7M / month · 📦 300 · ⏱️ 30.01.2024):
pip install tensorflow-hub
- [Conda](https://anaconda.org/conda-forge/tensorflow-hub) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow-hub
TFX (🥈32 · ⭐ 2.1K) - TFX is an end-to-end platform for deploying production ML pipelines. Apache-2 - [GitHub](https://github.com/tensorflow/tfx) (👨‍💻 190 · 🔀 710 · 📦 1.8K · 📋 1.1K - 22% open · ⏱️ 26.03.2025):
git clone https://github.com/tensorflow/tfx
- [PyPi](https://pypi.org/project/tfx) (📥 34K / month · 📦 17 · ⏱️ 11.12.2024):
pip install tfx
TensorFlow I/O (🥈30 · ⭐ 720) - Dataset, streaming, and file system extensions.. Apache-2 - [GitHub](https://github.com/tensorflow/io) (👨‍💻 120 · 🔀 290 · 📋 660 - 44% open · ⏱️ 10.04.2025):
git clone https://github.com/tensorflow/io
- [PyPi](https://pypi.org/project/tensorflow-io) (📥 710K / month · 📦 61 · ⏱️ 01.07.2024):
pip install tensorflow-io
TF Model Optimization (🥉28 · ⭐ 1.5K) - A toolkit to optimize ML models for deployment for.. Apache-2 - [GitHub](https://github.com/tensorflow/model-optimization) (👨‍💻 87 · 🔀 320 · 📋 400 - 57% open · ⏱️ 10.02.2025):
git clone https://github.com/tensorflow/model-optimization
- [PyPi](https://pypi.org/project/tensorflow-model-optimization) (📥 1.5M / month · 📦 45 · ⏱️ 08.02.2024):
pip install tensorflow-model-optimization
TensorFlow Transform (🥉25 · ⭐ 990) - Input pipeline framework. Apache-2 - [GitHub](https://github.com/tensorflow/transform) (👨‍💻 29 · 🔀 220 · 📋 220 - 17% open · ⏱️ 30.04.2025):
git clone https://github.com/tensorflow/transform
- [PyPi](https://pypi.org/project/tensorflow-transform) (📥 210K / month · 📦 18 · ⏱️ 28.10.2024):
pip install tensorflow-transform
Neural Structured Learning (🥉24 · ⭐ 1K) - Training neural models with structured signals. Apache-2 - [GitHub](https://github.com/tensorflow/neural-structured-learning) (👨‍💻 39 · 🔀 190 · 📦 510 · 📋 69 - 1% open · ⏱️ 29.01.2025):
git clone https://github.com/tensorflow/neural-structured-learning
- [PyPi](https://pypi.org/project/neural-structured-learning) (📥 5.3K / month · 📦 3 · ⏱️ 29.07.2022):
pip install neural-structured-learning
TensorFlow Cloud (🥉21 · ⭐ 380) - The TensorFlow Cloud repository provides APIs that.. Apache-2 - [GitHub](https://github.com/tensorflow/cloud) (👨‍💻 28 · 🔀 90 · 📋 100 - 73% open · ⏱️ 29.01.2025):
git clone https://github.com/tensorflow/cloud
- [PyPi](https://pypi.org/project/tensorflow-cloud) (📥 30K / month · 📦 7 · ⏱️ 17.06.2021):
pip install tensorflow-cloud
TF Compression (🥉20 · ⭐ 890) - Data compression in TensorFlow. Apache-2 - [GitHub](https://github.com/tensorflow/compression) (👨‍💻 24 · 🔀 250 · 📋 100 - 10% open · ⏱️ 29.04.2025):
git clone https://github.com/tensorflow/compression
- [PyPi](https://pypi.org/project/tensorflow-compression) (📥 2.8K / month · 📦 2 · ⏱️ 02.02.2024):
pip install tensorflow-compression
Show 7 hidden projects...
- tensor2tensor (🥈32 · ⭐ 16K · 💀) - Library of deep learning models and datasets designed.. Apache-2
- TF Addons (🥈32 · ⭐ 1.7K · 💀) - Useful extra functionality for TensorFlow 2.x maintained.. Apache-2
- Keras-Preprocessing (🥈29 · ⭐ 1K · 💀) - Utilities for working with image data, text data, and.. MIT
- efficientnet (🥉27 · ⭐ 2.1K · 💀) - Implementation of EfficientNet model. Keras and.. Apache-2
- Saliency (🥉22 · ⭐ 970 · 💀) - Framework-agnostic implementation for state-of-the-art.. Apache-2
- TensorNets (🥉20 · ⭐ 1K · 💀) - High level network definitions with pre-trained weights in.. MIT
- tffm (🥉18 · ⭐ 780 · 💀) - TensorFlow implementation of an arbitrary order Factorization Machine. MIT


Jax Utilities

Libraries that extend JAX with additional capabilities.

equinox (🥇32 · ⭐ 2.4K) - Elegant easy-to-use neural networks + scientific computing in.. Apache-2 - [GitHub](https://github.com/patrick-kidger/equinox) (👨‍💻 68 · 🔀 160 · 📦 1.2K · 📋 560 - 34% open · ⏱️ 16.05.2025):
git clone https://github.com/patrick-kidger/equinox
- [PyPi](https://pypi.org/project/equinox) (📥 300K / month · 📦 250 · ⏱️ 14.05.2025):
pip install equinox
evojax (🥉20 · ⭐ 900 · 💤) - EvoJAX: Hardware-accelerated Neuroevolution. Apache-2 - [GitHub](https://github.com/google/evojax) (👨‍💻 14 · 🔀 100 · 📦 31 · 📋 37 - 54% open · ⏱️ 27.06.2024):
git clone https://github.com/google/evojax
- [PyPi](https://pypi.org/project/evojax) (📥 930 / month · 📦 6 · ⏱️ 18.06.2024):
pip install evojax
- [Conda](https://anaconda.org/conda-forge/evojax) (📥 39K · ⏱️ 22.04.2025):
conda install -c conda-forge evojax
Show 1 hidden projects...
- jaxdf (🥉12 · ⭐ 130 · 💤) - A JAX-based research framework for writing differentiable.. ❗️LGPL-3.0


Sklearn Utilities

Libraries that extend scikit-learn with additional capabilities.

MLxtend (🥇35 · ⭐ 5K) - A library of extension and helper modules for Python's data analysis.. BSD-3 - [GitHub](https://github.com/rasbt/mlxtend) (👨‍💻 110 · 🔀 880 · 📦 20K · 📋 500 - 29% open · ⏱️ 26.01.2025):
git clone https://github.com/rasbt/mlxtend
- [PyPi](https://pypi.org/project/mlxtend) (📥 740K / month · 📦 200 · ⏱️ 26.01.2025):
pip install mlxtend
- [Conda](https://anaconda.org/conda-forge/mlxtend) (📥 360K · ⏱️ 22.04.2025):
conda install -c conda-forge mlxtend
scikit-learn-intelex (🥇35 · ⭐ 1.3K) - Extension for Scikit-learn is a seamless way to speed.. Apache-2 - [GitHub](https://github.com/uxlfoundation/scikit-learn-intelex) (👨‍💻 85 · 🔀 180 · 📦 14K · 📋 250 - 19% open · ⏱️ 21.05.2025):
git clone https://github.com/uxlfoundation/scikit-learn-intelex
- [PyPi](https://pypi.org/project/scikit-learn-intelex) (📥 97K / month · 📦 65 · ⏱️ 22.04.2025):
pip install scikit-learn-intelex
- [Conda](https://anaconda.org/conda-forge/scikit-learn-intelex) (📥 540K · ⏱️ 05.05.2025):
conda install -c conda-forge scikit-learn-intelex
imbalanced-learn (🥈33 · ⭐ 7K) - A Python Package to Tackle the Curse of Imbalanced.. MIT - [GitHub](https://github.com/scikit-learn-contrib/imbalanced-learn) (👨‍💻 87 · 🔀 1.3K · 📋 620 - 7% open · ⏱️ 11.05.2025):
git clone https://github.com/scikit-learn-contrib/imbalanced-learn
- [PyPi](https://pypi.org/project/imbalanced-learn) (📥 15M / month · 📦 480 · ⏱️ 20.12.2024):
pip install imbalanced-learn
- [Conda](https://anaconda.org/conda-forge/imbalanced-learn) (📥 700K · ⏱️ 22.04.2025):
conda install -c conda-forge imbalanced-learn
category_encoders (🥈33 · ⭐ 2.4K) - A library of sklearn compatible categorical variable.. BSD-3 - [GitHub](https://github.com/scikit-learn-contrib/category_encoders) (👨‍💻 71 · 🔀 400 · 📦 3.8K · 📋 300 - 14% open · ⏱️ 24.03.2025):
git clone https://github.com/scikit-learn-contrib/category_encoders
- [PyPi](https://pypi.org/project/category_encoders) (📥 1.8M / month · 📦 310 · ⏱️ 15.03.2025):
pip install category_encoders
- [Conda](https://anaconda.org/conda-forge/category_encoders) (📥 320K · ⏱️ 22.04.2025):
conda install -c conda-forge category_encoders
scikit-lego (🥈27 · ⭐ 1.3K) - Extra blocks for scikit-learn pipelines. MIT - [GitHub](https://github.com/koaning/scikit-lego) (👨‍💻 68 · 🔀 120 · 📦 180 · 📋 340 - 10% open · ⏱️ 30.04.2025):
git clone https://github.com/koaning/scikit-lego
- [PyPi](https://pypi.org/project/scikit-lego) (📥 30K / month · 📦 13 · ⏱️ 30.04.2025):
pip install scikit-lego
- [Conda](https://anaconda.org/conda-forge/scikit-lego) (📥 68K · ⏱️ 22.04.2025):
conda install -c conda-forge scikit-lego
scikit-opt (🥉25 · ⭐ 5.5K · 💤) - Genetic Algorithm, Particle Swarm Optimization, Simulated.. MIT - [GitHub](https://github.com/guofei9987/scikit-opt) (👨‍💻 24 · 🔀 990 · 📦 270 · 📋 180 - 37% open · ⏱️ 23.06.2024):
git clone https://github.com/guofei9987/scikit-opt
- [PyPi](https://pypi.org/project/scikit-opt) (📥 6.5K / month · 📦 15 · ⏱️ 14.01.2022):
pip install scikit-opt
iterative-stratification (🥉21 · ⭐ 860 · 💤) - scikit-learn cross validators for iterative.. BSD-3 - [GitHub](https://github.com/trent-b/iterative-stratification) (👨‍💻 7 · 🔀 75 · 📦 590 · 📋 27 - 7% open · ⏱️ 12.10.2024):
git clone https://github.com/trent-b/iterative-stratification
- [PyPi](https://pypi.org/project/iterative-stratification) (📥 54K / month · 📦 15 · ⏱️ 12.10.2024):
pip install iterative-stratification
dabl (🥉19 · ⭐ 730 · 💤) - Data Analysis Baseline Library. BSD-3 - [GitHub](https://github.com/amueller/dabl) (👨‍💻 24 · 🔀 100 · ⏱️ 07.08.2024):
git clone https://github.com/amueller/dabl
- [PyPi](https://pypi.org/project/dabl) (📥 4.1K / month · 📦 3 · ⏱️ 16.12.2024):
pip install dabl
scikit-tda (🥉18 · ⭐ 550 · 💤) - Topological Data Analysis for Python. MIT - [GitHub](https://github.com/scikit-tda/scikit-tda) (👨‍💻 6 · 🔀 54 · 📦 87 · 📋 22 - 18% open · ⏱️ 19.07.2024):
git clone https://github.com/scikit-tda/scikit-tda
- [PyPi](https://pypi.org/project/scikit-tda) (📥 1.5K / month · ⏱️ 19.07.2024):
pip install scikit-tda
Show 10 hidden projects...
- scikit-survival (🥈32 · ⭐ 1.2K) - Survival analysis built on top of scikit-learn. ❗️GPL-3.0
- fancyimpute (🥈27 · ⭐ 1.3K · 💀) - Multivariate imputation and matrix completion.. Apache-2
- scikit-multilearn (🥈27 · ⭐ 930 · 💀) - A scikit-learn based module for multi-label et. al... BSD-2
- sklearn-crfsuite (🥈27 · ⭐ 430 · 💀) - scikit-learn inspired API for CRFsuite. MIT
- sklearn-contrib-lightning (🥉24 · ⭐ 1.7K · 💀) - Large-scale linear classification, regression and.. BSD-3
- skope-rules (🥉22 · ⭐ 640 · 💀) - machine learning with logical rules in Python. ❗️BSD-1-Clause
- celer (🥉22 · ⭐ 230) - Fast solver for L1-type problems: Lasso, sparse Logistic regression,.. BSD-3
- combo (🥉21 · ⭐ 660 · 💀) - (AAAI 20) A Python Toolbox for Machine Learning Model.. BSD-2
- DESlib (🥉18 · ⭐ 490 · 💀) - A Python library for dynamic classifier and ensemble selection. BSD-3
- skggm (🥉17 · ⭐ 250 · 💀) - Scikit-learn compatible estimation of general graphical models. MIT


Pytorch Utilities

Libraries that extend PyTorch with additional capabilities.

accelerate (🥇43 · ⭐ 8.7K) - A simple way to launch, train, and use PyTorch models on.. Apache-2 - [GitHub](https://github.com/huggingface/accelerate) (👨‍💻 350 · 🔀 1.1K · 📦 95K · 📋 1.8K - 5% open · ⏱️ 22.05.2025):
git clone https://github.com/huggingface/accelerate
- [PyPi](https://pypi.org/project/accelerate) (📥 12M / month · 📦 2.2K · ⏱️ 15.05.2025):
pip install accelerate
- [Conda](https://anaconda.org/conda-forge/accelerate) (📥 400K · ⏱️ 15.05.2025):
conda install -c conda-forge accelerate
tinygrad (🥇35 · ⭐ 29K) - You like pytorch? You like micrograd? You love tinygrad!. MIT - [GitHub](https://github.com/tinygrad/tinygrad) (👨‍💻 400 · 🔀 3.3K · 📦 220 · 📋 920 - 14% open · ⏱️ 22.05.2025):
git clone https://github.com/tinygrad/tinygrad
PML (🥇33 · ⭐ 6.1K) - The easiest way to use deep metric learning in your application. Modular,.. MIT - [GitHub](https://github.com/KevinMusgrave/pytorch-metric-learning) (👨‍💻 46 · 🔀 660 · 📦 2.7K · 📋 530 - 14% open · ⏱️ 11.12.2024):
git clone https://github.com/KevinMusgrave/pytorch-metric-learning
- [PyPi](https://pypi.org/project/pytorch-metric-learning) (📥 820K / month · 📦 55 · ⏱️ 11.12.2024):
pip install pytorch-metric-learning
- [Conda](https://anaconda.org/metric-learning/pytorch-metric-learning) (📥 13K · ⏱️ 25.03.2025):
conda install -c metric-learning pytorch-metric-learning
torchdiffeq (🥇32 · ⭐ 6K) - Differentiable ODE solvers with full GPU support and O(1)-memory.. MIT - [GitHub](https://github.com/rtqichen/torchdiffeq) (👨‍💻 22 · 🔀 930 · 📦 5.2K · 📋 220 - 33% open · ⏱️ 04.04.2025):
git clone https://github.com/rtqichen/torchdiffeq
- [PyPi](https://pypi.org/project/torchdiffeq) (📥 1.2M / month · 📦 120 · ⏱️ 21.11.2024):
pip install torchdiffeq
- [Conda](https://anaconda.org/conda-forge/torchdiffeq) (📥 22K · ⏱️ 22.04.2025):
conda install -c conda-forge torchdiffeq
torchsde (🥈30 · ⭐ 1.6K) - Differentiable SDE solvers with GPU support and efficient.. Apache-2 - [GitHub](https://github.com/google-research/torchsde) (👨‍💻 9 · 🔀 200 · 📦 5.2K · 📋 82 - 35% open · ⏱️ 30.12.2024):
git clone https://github.com/google-research/torchsde
- [PyPi](https://pypi.org/project/torchsde) (📥 2.7M / month · 📦 37 · ⏱️ 26.09.2023):
pip install torchsde
- [Conda](https://anaconda.org/conda-forge/torchsde) (📥 39K · ⏱️ 22.04.2025):
conda install -c conda-forge torchsde
torch-scatter (🥈27 · ⭐ 1.6K) - PyTorch Extension Library of Optimized Scatter Operations. MIT - [GitHub](https://github.com/rusty1s/pytorch_scatter) (👨‍💻 33 · 🔀 190 · 📋 410 - 6% open · ⏱️ 20.04.2025):
git clone https://github.com/rusty1s/pytorch_scatter
- [PyPi](https://pypi.org/project/torch-scatter) (📥 52K / month · 📦 150 · ⏱️ 06.10.2023):
pip install torch-scatter
- [Conda](https://anaconda.org/conda-forge/pytorch_scatter) (📥 850K · ⏱️ 22.04.2025):
conda install -c conda-forge pytorch_scatter
EfficientNets (🥈25 · ⭐ 1.6K · 💤) - Pretrained EfficientNet, EfficientNet-Lite, MixNet,.. Apache-2 - [GitHub](https://github.com/rwightman/gen-efficientnet-pytorch) (👨‍💻 5 · 🔀 210 · 📦 300 · 📋 55 - 7% open · ⏱️ 13.06.2024):
git clone https://github.com/rwightman/gen-efficientnet-pytorch
- [PyPi](https://pypi.org/project/geffnet) (📥 190K / month · 📦 4 · ⏱️ 08.07.2021):
pip install geffnet
PyTorch Sparse (🥈25 · ⭐ 1.1K) - PyTorch Extension Library of Optimized Autograd Sparse.. MIT - [GitHub](https://github.com/rusty1s/pytorch_sparse) (👨‍💻 47 · 🔀 150 · 📋 290 - 10% open · ⏱️ 10.04.2025):
git clone https://github.com/rusty1s/pytorch_sparse
- [PyPi](https://pypi.org/project/torch-sparse) (📥 35K / month · 📦 120 · ⏱️ 06.10.2023):
pip install torch-sparse
- [Conda](https://anaconda.org/conda-forge/pytorch_sparse) (📥 800K · ⏱️ 22.04.2025):
conda install -c conda-forge pytorch_sparse
Pytorch Toolbelt (🥉24 · ⭐ 1.5K) - PyTorch extensions for fast R&D prototyping and Kaggle.. MIT - [GitHub](https://github.com/BloodAxe/pytorch-toolbelt) (👨‍💻 8 · 🔀 120 · 📥 160 · 📋 33 - 12% open · ⏱️ 01.03.2025):
git clone https://github.com/BloodAxe/pytorch-toolbelt
- [PyPi](https://pypi.org/project/pytorch_toolbelt) (📥 14K / month · 📦 12 · ⏱️ 21.11.2024):
pip install pytorch_toolbelt
madgrad (🥉17 · ⭐ 800) - MADGRAD Optimization Method. MIT - [GitHub](https://github.com/facebookresearch/madgrad) (👨‍💻 3 · 🔀 57 · 📦 100 · ⏱️ 27.01.2025):
git clone https://github.com/facebookresearch/madgrad
- [PyPi](https://pypi.org/project/madgrad) (📥 3K / month · 📦 1 · ⏱️ 08.03.2022):
pip install madgrad
pytorchviz (🥉14 · ⭐ 3.4K) - A small package to create visualizations of PyTorch execution graphs. MIT - [GitHub](https://github.com/szagoruyko/pytorchviz) (👨‍💻 6 · 🔀 280 · 📋 72 - 47% open · ⏱️ 30.12.2024):
git clone https://github.com/szagoruyko/pytorchviz
Show 21 hidden projects...
- pretrainedmodels (🥈30 · ⭐ 9.1K · 💀) - Pretrained ConvNets for pytorch: NASNet, ResNeXt,.. BSD-3
- EfficientNet-PyTorch (🥈28 · ⭐ 8.1K · 💀) - A PyTorch implementation of EfficientNet. Apache-2
- lightning-flash (🥈28 · ⭐ 1.7K · 💀) - Your PyTorch AI Factory - Flash enables you to easily.. Apache-2
- pytorch-optimizer (🥈27 · ⭐ 3.1K · 💀) - torch-optimizer -- collection of optimizers for.. Apache-2
- TabNet (🥈26 · ⭐ 2.8K · 💀) - PyTorch implementation of TabNet paper :.. MIT
- pytorch-summary (🥉24 · ⭐ 4K · 💀) - Model summary in PyTorch similar to `model.summary()` in.. MIT
- Torchmeta (🥉24 · ⭐ 2K · 💀) - A collection of extensions and data-loaders for few-shot.. MIT
- Higher (🥉24 · ⭐ 1.6K · 💀) - higher is a pytorch library allowing users to obtain higher.. Apache-2
- micrograd (🥉22 · ⭐ 12K · 💀) - A tiny scalar-valued autograd engine and a neural net library.. MIT
- SRU (🥉22 · ⭐ 2.1K · 💀) - Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755). MIT
- Antialiased CNNs (🥉22 · ⭐ 1.7K · 💀) - pip install antialiased-cnns to improve stability and.. ❗️CC BY-NC-SA 4.0
- AdaBound (🥉21 · ⭐ 2.9K · 💀) - An optimizer that trains as fast as Adam and as good as SGD. Apache-2
- reformer-pytorch (🥉21 · ⭐ 2.2K · 💀) - Reformer, the efficient Transformer, in Pytorch. MIT
- Performer Pytorch (🥉21 · ⭐ 1.1K · 💀) - An implementation of Performer, a linear attention-.. MIT
- Poutyne (🥉21 · ⭐ 570) - A simplified framework and utilities for PyTorch. ❗️LGPL-3.0
- Lambda Networks (🥉19 · ⭐ 1.5K · 💀) - Implementation of LambdaNetworks, a new approach to.. MIT
- Torch-Struct (🥉19 · ⭐ 1.1K · 💀) - Fast, general, and tested differentiable structured.. MIT
- Tensor Sensor (🥉18 · ⭐ 810 · 💀) - The goal of this library is to generate more helpful.. MIT
- Tez (🥉17 · ⭐ 1.2K · 💀) - Tez is a super-simple and lightweight Trainer for PyTorch. It.. Apache-2
- Pywick (🥉17 · ⭐ 400 · 💀) - High-level batteries-included neural network training library for.. MIT
- TorchDrift (🥉15 · ⭐ 320 · 💀) - Drift Detection for your PyTorch Models. Apache-2


Database Clients

Libraries for connecting to, operating, and querying databases.

🔗 best-of-python - DB Clients ( ⭐ 4K) - Collection of database clients for python.


Others

scipy (🥇50 · ⭐ 14K) - Ecosystem of open-source software for mathematics, science, and engineering. BSD-3 - [GitHub](https://github.com/scipy/scipy) (👨‍💻 1.8K · 🔀 5.4K · 📥 490K · 📦 1.4M · 📋 11K - 15% open · ⏱️ 21.05.2025):
git clone https://github.com/scipy/scipy
- [PyPi](https://pypi.org/project/scipy) (📥 150M / month · 📦 55K · ⏱️ 21.05.2025):
pip install scipy
- [Conda](https://anaconda.org/conda-forge/scipy) (📥 63M · ⏱️ 22.04.2025):
conda install -c conda-forge scipy
SymPy (🥇50 · ⭐ 14K) - A computer algebra system written in pure Python. BSD-3 - [GitHub](https://github.com/sympy/sympy) (👨‍💻 1.4K · 🔀 4.7K · 📥 560K · 📦 270K · 📋 14K - 36% open · ⏱️ 21.05.2025):
git clone https://github.com/sympy/sympy
- [PyPi](https://pypi.org/project/sympy) (📥 56M / month · 📦 4.3K · ⏱️ 27.04.2025):
pip install sympy
- [Conda](https://anaconda.org/conda-forge/sympy) (📥 8.9M · ⏱️ 29.04.2025):
conda install -c conda-forge sympy
Streamlit (🥇46 · ⭐ 39K) - A faster way to build and share data apps. Apache-2 - [GitHub](https://github.com/streamlit/streamlit) (👨‍💻 410 · 🔀 3.5K · 📦 940K · 📋 5.3K - 22% open · ⏱️ 21.05.2025):
git clone https://github.com/streamlit/streamlit
- [PyPi](https://pypi.org/project/streamlit) (📥 12M / month · 📦 3.8K · ⏱️ 12.05.2025):
pip install streamlit
Gradio (🥇45 · ⭐ 38K · 📈) - Wrap UIs around any model, share with anyone. Apache-2 - [GitHub](https://github.com/gradio-app/gradio) (👨‍💻 590 · 🔀 2.9K · 📦 76K · 📋 5.7K - 8% open · ⏱️ 22.05.2025):
git clone https://github.com/gradio-app/gradio
- [PyPi](https://pypi.org/project/gradio) (📥 8.5M / month · 📦 1.2K · ⏱️ 19.05.2025):
pip install gradio
carla (🥇37 · ⭐ 12K) - Open-source simulator for autonomous driving research. MIT - [GitHub](https://github.com/carla-simulator/carla) (👨‍💻 180 · 🔀 3.9K · 📦 1.1K · 📋 5.9K - 18% open · ⏱️ 06.05.2025):
git clone https://github.com/carla-simulator/carla
- [PyPi](https://pypi.org/project/carla) (📥 43K / month · 📦 11 · ⏱️ 14.11.2023):
pip install carla
Autograd (🥇37 · ⭐ 7.3K) - Efficiently computes derivatives of NumPy code. MIT - [GitHub](https://github.com/HIPS/autograd) (👨‍💻 61 · 🔀 910 · 📦 13K · 📋 440 - 42% open · ⏱️ 19.05.2025):
git clone https://github.com/HIPS/autograd
- [PyPi](https://pypi.org/project/autograd) (📥 3.4M / month · 📦 310 · ⏱️ 05.05.2025):
pip install autograd
- [Conda](https://anaconda.org/conda-forge/autograd) (📥 540K · ⏱️ 05.05.2025):
conda install -c conda-forge autograd
PennyLane (🥇37 · ⭐ 2.6K) - PennyLane is a cross-platform Python library for quantum.. Apache-2 - [GitHub](https://github.com/PennyLaneAI/pennylane) (👨‍💻 200 · 🔀 660 · 📥 100 · 📦 1.7K · 📋 1.6K - 25% open · ⏱️ 22.05.2025):
git clone https://github.com/PennyLaneAI/PennyLane
- [PyPi](https://pypi.org/project/pennylane) (📥 90K / month · 📦 150 · ⏱️ 02.05.2025):
pip install pennylane
- [Conda](https://anaconda.org/conda-forge/pennylane) (📥 270K · ⏱️ 22.04.2025):
conda install -c conda-forge pennylane
PyOD (🥈36 · ⭐ 9.2K) - A Python Library for Outlier and Anomaly Detection, Integrating Classical.. BSD-2 - [GitHub](https://github.com/yzhao062/pyod) (👨‍💻 65 · 🔀 1.4K · 📦 5.3K · 📋 380 - 60% open · ⏱️ 29.04.2025):
git clone https://github.com/yzhao062/pyod
- [PyPi](https://pypi.org/project/pyod) (📥 640K / month · 📦 130 · ⏱️ 29.04.2025):
pip install pyod
- [Conda](https://anaconda.org/conda-forge/pyod) (📥 150K · ⏱️ 30.04.2025):
conda install -c conda-forge pyod
Datasette (🥈34 · ⭐ 10K) - An open source multi-tool for exploring and publishing data. Apache-2 - [GitHub](https://github.com/simonw/datasette) (👨‍💻 82 · 🔀 740 · 📥 72 · 📦 1.5K · 📋 1.9K - 32% open · ⏱️ 22.04.2025):
git clone https://github.com/simonw/datasette
- [PyPi](https://pypi.org/project/datasette) (📥 190K / month · 📦 460 · ⏱️ 22.04.2025):
pip install datasette
- [Conda](https://anaconda.org/conda-forge/datasette) (📥 62K · ⏱️ 22.04.2025):
conda install -c conda-forge datasette
DeepChem (🥈34 · ⭐ 6K) - Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry,.. MIT - [GitHub](https://github.com/deepchem/deepchem) (👨‍💻 260 · 🔀 1.9K · 📦 610 · 📋 2K - 38% open · ⏱️ 21.05.2025):
git clone https://github.com/deepchem/deepchem
- [PyPi](https://pypi.org/project/deepchem) (📥 39K / month · 📦 17 · ⏱️ 21.05.2025):
pip install deepchem
- [Conda](https://anaconda.org/conda-forge/deepchem) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge deepchem
agate (🥈34 · ⭐ 1.2K) - A Python data analysis library that is optimized for humans instead of.. MIT - [GitHub](https://github.com/wireservice/agate) (👨‍💻 53 · 🔀 150 · 📦 5K · 📋 650 - 0% open · ⏱️ 27.02.2025):
git clone https://github.com/wireservice/agate
- [PyPi](https://pypi.org/project/agate) (📥 18M / month · 📦 54 · ⏱️ 29.01.2025):
pip install agate
- [Conda](https://anaconda.org/conda-forge/agate) (📥 320K · ⏱️ 22.04.2025):
conda install -c conda-forge agate
Pythran (🥈33 · ⭐ 2.1K) - Ahead of Time compiler for numeric kernels. BSD-3 - [GitHub](https://github.com/serge-sans-paille/pythran) (👨‍💻 74 · 🔀 200 · 📦 3.4K · 📋 890 - 14% open · ⏱️ 10.05.2025):
git clone https://github.com/serge-sans-paille/pythran
- [PyPi](https://pypi.org/project/pythran) (📥 310K / month · 📦 21 · ⏱️ 31.10.2024):
pip install pythran
- [Conda](https://anaconda.org/conda-forge/pythran) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge pythran
hdbscan (🥈32 · ⭐ 2.9K) - A high performance implementation of HDBSCAN clustering. BSD-3 - [GitHub](https://github.com/scikit-learn-contrib/hdbscan) (👨‍💻 96 · 🔀 500 · 📦 6.8K · 📋 530 - 67% open · ⏱️ 13.05.2025):
git clone https://github.com/scikit-learn-contrib/hdbscan
- [PyPi](https://pypi.org/project/hdbscan) (📥 740K / month · 📦 350 · ⏱️ 18.11.2024):
pip install hdbscan
- [Conda](https://anaconda.org/conda-forge/hdbscan) (📥 2.5M · ⏱️ 22.04.2025):
conda install -c conda-forge hdbscan
tensorly (🥈32 · ⭐ 1.6K) - TensorLy: Tensor Learning in Python. BSD-2 - [GitHub](https://github.com/tensorly/tensorly) (👨‍💻 74 · 🔀 290 · 📦 1K · 📋 280 - 23% open · ⏱️ 05.05.2025):
git clone https://github.com/tensorly/tensorly
- [PyPi](https://pypi.org/project/tensorly) (📥 76K / month · 📦 99 · ⏱️ 12.11.2024):
pip install tensorly
- [Conda](https://anaconda.org/conda-forge/tensorly) (📥 380K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorly
pyjanitor (🥈32 · ⭐ 1.4K) - Clean APIs for data cleaning. Python implementation of R package.. MIT - [GitHub](https://github.com/pyjanitor-devs/pyjanitor) (👨‍💻 110 · 🔀 170 · 📦 940 · 📋 580 - 19% open · ⏱️ 14.05.2025):
git clone https://github.com/pyjanitor-devs/pyjanitor
- [PyPi](https://pypi.org/project/pyjanitor) (📥 90K / month · 📦 42 · ⏱️ 07.03.2025):
pip install pyjanitor
- [Conda](https://anaconda.org/conda-forge/pyjanitor) (📥 260K · ⏱️ 22.04.2025):
conda install -c conda-forge pyjanitor
PaddleHub (🥈31 · ⭐ 13K · 💤) - Awesome pre-trained models toolkit based on PaddlePaddle... Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleHub) (👨‍💻 69 · 🔀 2.1K · 📥 840 · 📦 1.9K · 📋 1.3K - 44% open · ⏱️ 07.08.2024):
git clone https://github.com/PaddlePaddle/PaddleHub
- [PyPi](https://pypi.org/project/paddlehub) (📥 5.8K / month · 📦 7 · ⏱️ 20.09.2023):
pip install paddlehub
pyopencl (🥈31 · ⭐ 1.1K) - OpenCL integration for Python, plus shiny features. MIT - [GitHub](https://github.com/inducer/pyopencl) (👨‍💻 98 · 🔀 240 · 📦 2.3K · 📋 360 - 21% open · ⏱️ 07.05.2025):
git clone https://github.com/inducer/pyopencl
- [PyPi](https://pypi.org/project/pyopencl) (📥 91K / month · 📦 180 · ⏱️ 22.01.2025):
pip install pyopencl
- [Conda](https://anaconda.org/conda-forge/pyopencl) (📥 1.8M · ⏱️ 22.04.2025):
conda install -c conda-forge pyopencl
datalad (🥈31 · ⭐ 580) - Keep code, data, containers under control with git and git-annex. MIT - [GitHub](https://github.com/datalad/datalad) (👨‍💻 57 · 🔀 110 · 📦 520 · 📋 4K - 13% open · ⏱️ 21.05.2025):
git clone https://github.com/datalad/datalad
- [PyPi](https://pypi.org/project/datalad) (📥 21K / month · 📦 100 · ⏱️ 21.05.2025):
pip install datalad
- [Conda](https://anaconda.org/conda-forge/datalad) (📥 870K · ⏱️ 22.04.2025):
conda install -c conda-forge datalad
River (🥈30 · ⭐ 5.3K) - Online machine learning in Python. BSD-3 - [GitHub](https://github.com/online-ml/river) (👨‍💻 130 · 🔀 570 · 📦 740 · 📋 620 - 19% open · ⏱️ 15.05.2025):
git clone https://github.com/online-ml/river
- [PyPi](https://pypi.org/project/river) (📥 62K / month · 📦 64 · ⏱️ 25.11.2024):
pip install river
- [Conda](https://anaconda.org/conda-forge/river) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge river
anomalib (🥈30 · ⭐ 4.3K) - An anomaly detection library comprising state-of-the-art algorithms.. Apache-2 - [GitHub](https://github.com/open-edge-platform/anomalib) (👨‍💻 86 · 🔀 740 · 📥 26K · 📦 190 · 📋 1K - 15% open · ⏱️ 22.05.2025):
git clone https://github.com/open-edge-platform/anomalib
- [PyPi](https://pypi.org/project/anomalib) (📥 59K / month · 📦 5 · ⏱️ 19.03.2025):
pip install anomalib
dstack (🥈30 · ⭐ 1.8K) - dstack is an open-source alternative to Kubernetes and Slurm, designed.. MPL-2.0 - [GitHub](https://github.com/dstackai/dstack) (👨‍💻 53 · 🔀 180 · 📦 18 · 📋 1.3K - 9% open · ⏱️ 22.05.2025):
git clone https://github.com/dstackai/dstack
- [PyPi](https://pypi.org/project/dstack) (📥 5.9K / month · ⏱️ 21.05.2025):
pip install dstack
causalml (🥈29 · ⭐ 5.4K) - Uplift modeling and causal inference with machine learning.. Apache-2 - [GitHub](https://github.com/uber/causalml) (👨‍💻 66 · 🔀 810 · 📦 290 · 📋 410 - 10% open · ⏱️ 19.05.2025):
git clone https://github.com/uber/causalml
- [PyPi](https://pypi.org/project/causalml) (📥 42K / month · 📦 10 · ⏱️ 15.05.2025):
pip install causalml
adapter-transformers (🥉28 · ⭐ 2.7K) - A Unified Library for Parameter-Efficient and Modular.. Apache-2 huggingface - [GitHub](https://github.com/adapter-hub/adapters) (👨‍💻 15 · 🔀 360 · 📦 230 · 📋 400 - 10% open · ⏱️ 20.05.2025):
git clone https://github.com/adapter-hub/adapters
- [PyPi](https://pypi.org/project/adapter-transformers) (📥 4.6K / month · 📦 12 · ⏱️ 07.07.2024):
pip install adapter-transformers
Prince (🥉28 · ⭐ 1.4K) - Multivariate exploratory data analysis in Python: PCA, CA, MCA, MFA,.. MIT - [GitHub](https://github.com/MaxHalford/prince) (👨‍💻 16 · 🔀 180 · 📦 740 · ⏱️ 09.03.2025):
git clone https://github.com/MaxHalford/prince
- [PyPi](https://pypi.org/project/prince) (📥 150K / month · 📦 20 · ⏱️ 09.03.2025):
pip install prince
- [Conda](https://anaconda.org/conda-forge/prince-factor-analysis) (📥 25K · ⏱️ 22.04.2025):
conda install -c conda-forge prince-factor-analysis
Trax (🥉27 · ⭐ 8.2K) - Trax: Deep Learning with Clear Code and Speed. Apache-2 - [GitHub](https://github.com/google/trax) (👨‍💻 81 · 🔀 820 · 📦 220 · 📋 250 - 49% open · ⏱️ 10.04.2025):
git clone https://github.com/google/trax
- [PyPi](https://pypi.org/project/trax) (📥 2.2K / month · 📦 1 · ⏱️ 26.10.2021):
pip install trax
avalanche (🥉27 · ⭐ 1.9K) - Avalanche: an End-to-End Library for Continual Learning based on.. MIT - [GitHub](https://github.com/ContinualAI/avalanche) (👨‍💻 84 · 🔀 300 · 📥 54 · 📦 140 · 📋 830 - 12% open · ⏱️ 11.03.2025):
git clone https://github.com/ContinualAI/avalanche
- [PyPi](https://pypi.org/project/avalanche-lib) (📥 1.9K / month · 📦 3 · ⏱️ 29.10.2024):
pip install avalanche-lib
TabPy (🥉27 · ⭐ 1.6K) - Execute Python code on the fly and display results in Tableau visualizations. MIT - [GitHub](https://github.com/tableau/TabPy) (👨‍💻 51 · 🔀 600 · 📦 210 · 📋 320 - 6% open · ⏱️ 25.11.2024):
git clone https://github.com/tableau/TabPy
- [PyPi](https://pypi.org/project/tabpy) (📥 6.3K / month · 📦 2 · ⏱️ 25.11.2024):
pip install tabpy
- [Conda](https://anaconda.org/anaconda/tabpy-client) (📥 5.2K · ⏱️ 22.04.2025):
conda install -c anaconda tabpy-client
pycm (🥉27 · ⭐ 1.5K) - Multi-class confusion matrix library in Python. MIT - [GitHub](https://github.com/sepandhaghighi/pycm) (👨‍💻 18 · 🔀 130 · 📦 400 · 📋 210 - 8% open · ⏱️ 04.04.2025):
git clone https://github.com/sepandhaghighi/pycm
- [PyPi](https://pypi.org/project/pycm) (📥 40K / month · 📦 24 · ⏱️ 04.04.2025):
pip install pycm
metric-learn (🥉26 · ⭐ 1.4K · 💤) - Metric learning algorithms in Python. MIT - [GitHub](https://github.com/scikit-learn-contrib/metric-learn) (👨‍💻 23 · 🔀 230 · 📦 480 · 📋 180 - 30% open · ⏱️ 03.08.2024):
git clone https://github.com/scikit-learn-contrib/metric-learn
- [PyPi](https://pypi.org/project/metric-learn) (📥 5.3K / month · 📦 7 · ⏱️ 09.10.2023):
pip install metric-learn
- [Conda](https://anaconda.org/conda-forge/metric-learn) (📥 16K · ⏱️ 22.04.2025):
conda install -c conda-forge metric-learn
Feature Engine (🥉25 · ⭐ 2K · 💤) - Feature engineering package with sklearn like functionality. BSD-3 - [GitHub](https://github.com/solegalli/feature_engine) (👨‍💻 49 · 🔀 320 · ⏱️ 31.08.2024):
git clone https://github.com/solegalli/feature_engine
- [PyPi](https://pypi.org/project/feature_engine) (📥 270K / month · 📦 180 · ⏱️ 22.01.2025):
pip install feature_engine
- [Conda](https://anaconda.org/conda-forge/feature_engine) (📥 73K · ⏱️ 22.04.2025):
conda install -c conda-forge feature_engine
AugLy (🥉24 · ⭐ 5K) - A data augmentations library for audio, image, text, and video. MIT - [GitHub](https://github.com/facebookresearch/AugLy) (👨‍💻 38 · 🔀 300 · 📦 180 · 📋 80 - 30% open · ⏱️ 28.02.2025):
git clone https://github.com/facebookresearch/AugLy
- [PyPi](https://pypi.org/project/augly) (📥 2.6K / month · 📦 4 · ⏱️ 05.12.2023):
pip install augly
BioPandas (🥉23 · ⭐ 730 · 💤) - Working with molecular structures in pandas DataFrames. BSD-3 - [GitHub](https://github.com/BioPandas/biopandas) (👨‍💻 18 · 🔀 120 · 📦 370 · 📋 60 - 36% open · ⏱️ 01.08.2024):
git clone https://github.com/BioPandas/biopandas
- [PyPi](https://pypi.org/project/biopandas) (📥 10K / month · 📦 38 · ⏱️ 01.08.2024):
pip install biopandas
- [Conda](https://anaconda.org/conda-forge/biopandas) (📥 180K · ⏱️ 22.04.2025):
conda install -c conda-forge biopandas
MONAILabel (🥉22 · ⭐ 710) - MONAI Label is an intelligent open source image labeling and.. Apache-2 - [GitHub](https://github.com/Project-MONAI/MONAILabel) (👨‍💻 66 · 🔀 220 · 📥 120K · 📋 550 - 25% open · ⏱️ 05.05.2025):
git clone https://github.com/Project-MONAI/MONAILabel
- [PyPi](https://pypi.org/project/monailabel-weekly) (📥 1.2K / month · ⏱️ 01.10.2023):
pip install monailabel-weekly
pykale (🥉22 · ⭐ 460) - Knowledge-Aware machine LEarning (KALE): accessible machine learning.. MIT - [GitHub](https://github.com/pykale/pykale) (👨‍💻 26 · 🔀 66 · 📦 6 · 📋 130 - 8% open · ⏱️ 18.05.2025):
git clone https://github.com/pykale/pykale
- [PyPi](https://pypi.org/project/pykale) (📥 240 / month · ⏱️ 12.04.2022):
pip install pykale
SUOD (🥉22 · ⭐ 390) - (MLSys 21) An Acceleration System for Large-scale Unsupervised Heterogeneous.. BSD-2 - [GitHub](https://github.com/yzhao062/SUOD) (👨‍💻 3 · 🔀 49 · 📦 550 · 📋 15 - 80% open · ⏱️ 24.03.2025):
git clone https://github.com/yzhao062/SUOD
- [PyPi](https://pypi.org/project/suod) (📥 14K / month · 📦 9 · ⏱️ 24.03.2025):
pip install suod
benchmark_VAE (🥉20 · ⭐ 1.9K · 💤) - Unifying Variational Autoencoder (VAE).. Apache-2 - [GitHub](https://github.com/clementchadebec/benchmark_VAE) (👨‍💻 18 · 🔀 170 · 📦 40 · 📋 71 - 36% open · ⏱️ 17.07.2024):
git clone https://github.com/clementchadebec/benchmark_VAE
- [PyPi](https://pypi.org/project/pythae) (📥 660 / month · ⏱️ 06.09.2023):
pip install pythae
pymdp (🥉20 · ⭐ 530) - A Python implementation of active inference for Markov Decision Processes. MIT - [GitHub](https://github.com/infer-actively/pymdp) (👨‍💻 19 · 🔀 100 · 📦 24 · 📋 56 - 48% open · ⏱️ 06.02.2025):
git clone https://github.com/infer-actively/pymdp
- [PyPi](https://pypi.org/project/inferactively-pymdp) (📥 10K / month · ⏱️ 08.12.2022):
pip install inferactively-pymdp
NeuralCompression (🥉15 · ⭐ 550 · 💤) - A collection of tools for neural compression enthusiasts. MIT - [GitHub](https://github.com/facebookresearch/NeuralCompression) (👨‍💻 10 · 🔀 45 · 📋 73 - 9% open · ⏱️ 20.09.2024):
git clone https://github.com/facebookresearch/NeuralCompression
- [PyPi](https://pypi.org/project/neuralcompression) (📥 160 / month · ⏱️ 03.10.2023):
pip install neuralcompression
Show 28 hidden projects...
- Cython BLIS (🥈31 · ⭐ 230) - Fast matrix-multiplication as a self-contained Python library no.. BSD-3
- cleanlab (🥈30 · ⭐ 11K) - The standard data-centric AI package for data quality and machine.. ❗️AGPL-3.0
- pysc2 (🥈29 · ⭐ 8.1K · 💀) - StarCraft II Learning Environment. Apache-2
- minisom (🥈29 · ⭐ 1.5K) - MiniSom is a minimalistic implementation of the Self Organizing.. ❗️CC-BY-3.0
- kmodes (🥈29 · ⭐ 1.3K · 💀) - Python implementations of the k-modes and k-prototypes clustering.. MIT
- pyclustering (🥈29 · ⭐ 1.2K · 💀) - pyclustering is a Python, C++ data mining library. BSD-3
- alibi-detect (🥉28 · ⭐ 2.4K) - Algorithms for outlier, adversarial and drift detection. ❗️Intel
- modAL (🥉28 · ⭐ 2.3K · 💀) - A modular active learning framework for Python. MIT
- gplearn (🥉27 · ⭐ 1.7K · 💀) - Genetic Programming in Python, with a scikit-learn inspired API. BSD-3
- PySwarms (🥉27 · ⭐ 1.3K · 💀) - A research toolkit for particle swarm optimization in Python. MIT
- metricflow (🥉27 · ⭐ 1.2K) - MetricFlow allows you to define, build, and maintain metrics.. ❗Unlicensed
- findspark (🥉25 · ⭐ 520 · 💀) - Find pyspark to make it importable. BSD-3
- pandas-ai (🥉24 · ⭐ 20K) - Chat with your database or your datalake (SQL, CSV, parquet)... ❗Unlicensed
- Mars (🥉24 · ⭐ 2.7K · 💀) - Mars is a tensor-based unified framework for large-scale data.. Apache-2
- AstroML (🥉23 · ⭐ 1.1K · 💀) - Machine learning, statistics, and data mining for astronomy.. BSD-2
- opyrator (🥉22 · ⭐ 3.1K · 💀) - Turns your machine learning code into microservices with web API,.. MIT
- mlens (🥉22 · ⭐ 860 · 💀) - ML-Ensemble high performance ensemble learning. MIT
- vecstack (🥉22 · ⭐ 690 · 💀) - Python package for stacking (machine learning technique). MIT
- apricot (🥉22 · ⭐ 510 · 💀) - apricot implements submodular optimization for the purpose of.. MIT
- impyute (🥉21 · ⭐ 360 · 💀) - Data imputations library to preprocess datasets with missing data. MIT
- StreamAlert (🥉20 · ⭐ 2.9K · 💀) - StreamAlert is a serverless, realtime data analysis.. Apache-2
- rrcf (🥉20 · ⭐ 510 · 💀) - Implementation of the Robust Random Cut Forest algorithm for anomaly.. MIT
- scikit-rebate (🥉20 · ⭐ 420 · 💀) - A scikit-learn-compatible Python implementation of.. MIT
- baikal (🥉19 · ⭐ 590 · 💀) - A graph-based functional API for building complex scikit-learn.. BSD-3
- pandas-ml (🥉16 · ⭐ 320 · 💀) - pandas, scikit-learn, xgboost and seaborn integration. BSD-3
- KD-Lib (🥉15 · ⭐ 630 · 💀) - A Pytorch Knowledge Distillation library for benchmarking and.. MIT
- traingenerator (🥉13 · ⭐ 1.4K · 💀) - A web app to generate template code for machine learning. MIT
- nylon (🥉13 · ⭐ 83 · 💀) - An intelligent, flexible grammar of machine learning. MIT

Contribution

Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:

  • Open an issue by selecting one of the provided categories from the issue page and filling in the requested information.
  • Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.

If you would like to contribute to, or share suggestions about, the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.

For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.

License

CC0

Source

Best-of Machine Learning with Python

🏆  A ranked list of awesome machine learning Python libraries. Updated weekly.

This curated list contains 920 awesome open-source projects with a total of 5.1M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
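
The abbreviated counts used throughout this list (for example `5.1M` stars or `8.7K`) can be expanded into plain integers when comparing projects programmatically. The following is a minimal sketch and not part of the best-of tooling: the function name `parse_count` is hypothetical, and it assumes that only the `K` and `M` suffixes occur.

```python
import re

# Multipliers for the suffixes seen in this list (assumption: only K and M occur).
_SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_count(text: str) -> int:
    """Convert an abbreviated count such as '5.1M', '8.7K' or '920' to an integer."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KM]?)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    number, suffix = match.groups()
    # round() absorbs floating-point noise from values like 8.7 * 1000.
    return round(float(number) * _SUFFIXES.get(suffix, 1))


if __name__ == "__main__":
    # Sort a few (name, stars) pairs taken from the list by their star count.
    stars = {"accelerate": "8.7K", "tinygrad": "29K", "madgrad": "800"}
    ranked = sorted(stars, key=lambda name: parse_count(stars[name]), reverse=True)
    print(ranked)
```

Because the displayed counts are rounded, the expanded values are approximations of the underlying metrics.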


🧙‍♂️  Discover other best-of lists or create your own.
📫  Subscribe to our newsletter for updates and trending projects.


Explanation

  • 🥇🥈🥉  Combined project-quality score
  • ⭐️  Star count from GitHub
  • 🐣  New project (less than 6 months old)
  • 💤  Inactive project (6 months no activity)
  • 💀  Dead project (12 months no activity)
  • 📈📉  Project is trending up or down
  • ➕  Project was recently added
  • ❗️  Warning (e.g. missing/risky license)
  • 👨‍💻  Contributors count from GitHub
  • 🔀  Fork count from GitHub
  • 📋  Issue count from GitHub
  • ⏱️  Last update timestamp on package manager
  • 📥  Download count from package manager
  • 📦  Number of dependent projects
  •   Tensorflow related project
  •   Sklearn related project
  •   PyTorch related project
  •   MxNet related project
  •   Apache Spark related project
  •   Jupyter related project
  •   PaddlePaddle related project
  •   Pandas related project
  •   Jax related project
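
The activity markers in the legend can be approximated from the last-update timestamps shown in each entry. The sketch below assumes the `DD.MM.YYYY` format used in this list and counts the 6 and 12 month thresholds in whole calendar months. The helper names `parse_timestamp` and `activity_status` are hypothetical, and the generator's exact rule may differ.

```python
from datetime import date, datetime


def parse_timestamp(text: str) -> date:
    """Parse a list timestamp such as '19.07.2024' (day.month.year)."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def activity_status(last_update: date, today: date) -> str:
    """Classify a project as 'active', 'inactive' (6+ months) or 'dead' (12+ months)."""
    # Count fully elapsed calendar months between the two dates.
    months = (today.year - last_update.year) * 12 + (today.month - last_update.month)
    if today.day < last_update.day:
        months -= 1
    if months >= 12:
        return "dead"
    if months >= 6:
        return "inactive"
    return "active"


if __name__ == "__main__":
    # Reference date chosen for illustration only.
    reference = date(2025, 5, 22)
    for stamp in ("22.05.2025", "19.07.2024", "06.10.2023"):
        print(stamp, activity_status(parse_timestamp(stamp), reference))
```

Passing the reference date explicitly, rather than calling `date.today()` inside the function, keeps the classification reproducible for a given snapshot of the list.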


Machine Learning Frameworks

General-purpose machine learning and deep learning frameworks.

Tensorflow (🥇56 · ⭐ 200K) - An Open Source Machine Learning Framework for Everyone. Apache-2 - [GitHub](https://github.com/tensorflow/tensorflow) (👨‍💻 5K · 🔀 75K · 📦 540K · 📋 42K - 4% open · ⏱️ 30.10.2025):
git clone https://github.com/tensorflow/tensorflow
- [PyPi](https://pypi.org/project/tensorflow) (📥 26M / month · 📦 9.6K · ⏱️ 13.08.2025):
pip install tensorflow
- [Conda](https://anaconda.org/conda-forge/tensorflow) (📥 6M · ⏱️ 27.10.2025):
conda install -c conda-forge tensorflow
- [Docker Hub](https://hub.docker.com/r/tensorflow/tensorflow) (📥 81M · ⭐ 2.8K · ⏱️ 30.10.2025):
docker pull tensorflow/tensorflow
PyTorch (🥇56 · ⭐ 94K) - Tensors and Dynamic neural networks in Python with strong GPU.. BSD-3 - [GitHub](https://github.com/pytorch/pytorch) (👨‍💻 6K · 🔀 26K · 📥 110K · 📦 830K · 📋 56K - 30% open · ⏱️ 30.10.2025):
git clone https://github.com/pytorch/pytorch
- [PyPi](https://pypi.org/project/torch) (📥 70M / month · 📦 30K · ⏱️ 15.10.2025):
pip install torch
- [Conda](https://anaconda.org/pytorch/pytorch) (📥 29M · ⏱️ 25.03.2025):
conda install -c pytorch pytorch
scikit-learn (🥇53 · ⭐ 64K) - scikit-learn: machine learning in Python. BSD-3 - [GitHub](https://github.com/scikit-learn/scikit-learn) (👨‍💻 3.4K · 🔀 26K · 📥 1.1K · 📦 1.3M · 📋 12K - 17% open · ⏱️ 30.10.2025):
git clone https://github.com/scikit-learn/scikit-learn
- [PyPi](https://pypi.org/project/scikit-learn) (📥 140M / month · 📦 35K · ⏱️ 09.09.2025):
pip install scikit-learn
- [Conda](https://anaconda.org/conda-forge/scikit-learn) (📥 40M · ⏱️ 09.09.2025):
conda install -c conda-forge scikit-learn
Keras (🥇50 · ⭐ 64K) - Deep Learning for humans. Apache-2 - [GitHub](https://github.com/keras-team/keras) (👨‍💻 1.4K · 🔀 20K · 📦 300K · 📋 13K - 2% open · ⏱️ 30.10.2025):
git clone https://github.com/keras-team/keras
- [PyPi](https://pypi.org/project/keras) (📥 19M / month · 📦 2K · ⏱️ 27.10.2025):
pip install keras
- [Conda](https://anaconda.org/conda-forge/keras) (📥 4.5M · ⏱️ 28.10.2025):
conda install -c conda-forge keras
XGBoost (🥇46 · ⭐ 28K) - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or.. Apache-2 - [GitHub](https://github.com/dmlc/xgboost) (👨‍💻 670 · 🔀 8.8K · 📥 20K · 📦 170K · 📋 5.6K - 8% open · ⏱️ 30.10.2025):
git clone https://github.com/dmlc/xgboost
- [PyPi](https://pypi.org/project/xgboost) (📥 31M / month · 📦 2.9K · ⏱️ 21.10.2025):
pip install xgboost
- [Conda](https://anaconda.org/conda-forge/xgboost) (📥 6.6M · ⏱️ 16.09.2025):
conda install -c conda-forge xgboost
PaddlePaddle (🥇46 · ⭐ 23K) - PArallel Distributed Deep LEarning: Machine Learning.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/Paddle) (👨‍💻 1.5K · 🔀 5.9K · 📥 15K · 📦 8.8K · 📋 20K - 8% open · ⏱️ 30.10.2025):
git clone https://github.com/PaddlePaddle/Paddle
- [PyPi](https://pypi.org/project/paddlepaddle) (📥 1.6M / month · 📦 280 · ⏱️ 30.10.2025):
pip install paddlepaddle
jax (🥇45 · ⭐ 34K) - Composable transformations of Python+NumPy programs: differentiate,.. Apache-2 - [GitHub](https://github.com/jax-ml/jax) (👨‍💻 980 · 🔀 3.2K · 📦 47K · 📋 6.6K - 24% open · ⏱️ 30.10.2025):
git clone https://github.com/jax-ml/jax
- [PyPi](https://pypi.org/project/jax) (📥 12M / month · 📦 3.1K · ⏱️ 15.10.2025):
pip install jax
- [Conda](https://anaconda.org/conda-forge/jaxlib) (📥 3.2M · ⏱️ 06.10.2025):
conda install -c conda-forge jaxlib
pytorch-lightning (🥇45 · ⭐ 30K) - Pretrain, finetune ANY AI model of ANY size on 1 or.. Apache-2 - [GitHub](https://github.com/Lightning-AI/pytorch-lightning) (👨‍💻 1K · 🔀 3.6K · 📥 15K · 📦 48K · 📋 7.4K - 11% open · ⏱️ 29.10.2025):
git clone https://github.com/Lightning-AI/pytorch-lightning
- [PyPi](https://pypi.org/project/pytorch-lightning) (📥 9.8M / month · 📦 1.8K · ⏱️ 05.09.2025):
pip install pytorch-lightning
- [Conda](https://anaconda.org/conda-forge/pytorch-lightning) (📥 1.7M · ⏱️ 05.09.2025):
conda install -c conda-forge pytorch-lightning
StatsModels (🥇45 · ⭐ 11K) - Statsmodels: statistical modeling and econometrics in Python. BSD-3 - [GitHub](https://github.com/statsmodels/statsmodels) (👨‍💻 470 · 🔀 3.3K · 📥 36 · 📦 180K · 📋 5.8K - 50% open · ⏱️ 22.10.2025):
git clone https://github.com/statsmodels/statsmodels
- [PyPi](https://pypi.org/project/statsmodels) (📥 24M / month · 📦 5.6K · ⏱️ 07.07.2025):
pip install statsmodels
- [Conda](https://anaconda.org/conda-forge/statsmodels) (📥 22M · ⏱️ 01.10.2025):
conda install -c conda-forge statsmodels
PySpark (🥈44 · ⭐ 42K) - Apache Spark Python API. Apache-2 - [GitHub](https://github.com/apache/spark) (👨‍💻 3.3K · 🔀 29K · ⏱️ 30.10.2025):
git clone https://github.com/apache/spark
- [PyPi](https://pypi.org/project/pyspark) (📥 47M / month · 📦 2.1K · ⏱️ 30.10.2025):
pip install pyspark
- [Conda](https://anaconda.org/conda-forge/pyspark) (📥 4.2M · ⏱️ 08.09.2025):
conda install -c conda-forge pyspark
LightGBM (🥈42 · ⭐ 18K) - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT,.. MIT - [GitHub](https://github.com/microsoft/LightGBM) (👨‍💻 330 · 🔀 3.9K · 📥 310K · 📦 56K · 📋 3.6K - 12% open · ⏱️ 28.10.2025):
git clone https://github.com/microsoft/LightGBM
- [PyPi](https://pypi.org/project/lightgbm) (📥 11M / month · 📦 1.6K · ⏱️ 15.02.2025):
pip install lightgbm
- [Conda](https://anaconda.org/conda-forge/lightgbm) (📥 4.1M · ⏱️ 20.10.2025):
conda install -c conda-forge lightgbm
Catboost (🥈42 · ⭐ 8.6K) - A fast, scalable, high performance Gradient Boosting on Decision.. Apache-2 - [GitHub](https://github.com/catboost/catboost) (👨‍💻 1.4K · 🔀 1.2K · 📥 460K · 📦 19 · 📋 2.5K - 25% open · ⏱️ 30.10.2025):
git clone https://github.com/catboost/catboost
- [PyPi](https://pypi.org/project/catboost) (📥 5.1M / month · 📦 650 · ⏱️ 13.04.2025):
pip install catboost
- [Conda](https://anaconda.org/conda-forge/catboost) (📥 2.2M · ⏱️ 09.08.2025):
conda install -c conda-forge catboost
Fastai (🥈41 · ⭐ 28K) - The fastai deep learning library. Apache-2 - [GitHub](https://github.com/fastai/fastai) (👨‍💻 680 · 🔀 7.6K · 📦 23K · 📋 1.9K - 14% open · ⏱️ 26.10.2025):
git clone https://github.com/fastai/fastai
- [PyPi](https://pypi.org/project/fastai) (📥 640K / month · 📦 340 · ⏱️ 26.10.2025):
pip install fastai
PyFlink (🥈39 · ⭐ 25K) - Apache Flink Python API. Apache-2 - [GitHub](https://github.com/apache/flink) (👨‍💻 2.1K · 🔀 14K · 📦 21 · ⏱️ 30.10.2025):
git clone https://github.com/apache/flink
- [PyPi](https://pypi.org/project/apache-flink) (📥 450K / month · 📦 38 · ⏱️ 28.10.2025):
pip install apache-flink
Flax (🥈38 · ⭐ 6.9K) - Flax is a neural network library for JAX that is designed for.. Apache-2 - [GitHub](https://github.com/google/flax) (👨‍💻 280 · 🔀 740 · 📥 61 · 📦 15K · 📋 1.3K - 33% open · ⏱️ 27.10.2025):
git clone https://github.com/google/flax
- [PyPi](https://pypi.org/project/flax) (📥 2M / month · 📦 740 · ⏱️ 25.09.2025):
pip install flax
- [Conda](https://anaconda.org/conda-forge/flax) (📥 130K · ⏱️ 27.10.2025):
conda install -c conda-forge flax
Ignite (🥈36 · ⭐ 4.7K) - High-level library to help with training and evaluating neural.. BSD-3 - [GitHub](https://github.com/pytorch/ignite) (👨‍💻 1K · 🔀 660 · 📦 3.9K · 📋 1.4K - 10% open · ⏱️ 16.10.2025):
git clone https://github.com/pytorch/ignite
- [PyPi](https://pypi.org/project/pytorch-ignite) (📥 170K / month · 📦 120 · ⏱️ 30.10.2025):
pip install pytorch-ignite
- [Conda](https://anaconda.org/pytorch/ignite) (📥 250K · ⏱️ 16.10.2025):
conda install -c pytorch ignite
einops (🥈35 · ⭐ 9.2K) - Flexible and powerful tensor operations for readable and reliable code.. MIT - [GitHub](https://github.com/arogozhnikov/einops) (👨‍💻 34 · 🔀 380 · 📦 82K · 📋 200 - 17% open · ⏱️ 12.08.2025):
git clone https://github.com/arogozhnikov/einops
- [PyPi](https://pypi.org/project/einops) (📥 15M / month · 📦 2.6K · ⏱️ 09.02.2025):
pip install einops
- [Conda](https://anaconda.org/conda-forge/einops) (📥 470K · ⏱️ 22.04.2025):
conda install -c conda-forge einops
ivy (🥈34 · ⭐ 14K) - Convert Machine Learning Code Between Frameworks. Apache-2 - [GitHub](https://github.com/ivy-llc/ivy) (👨‍💻 1.5K · 🔀 5.6K · 📋 17K - 5% open · ⏱️ 10.10.2025):
git clone https://github.com/ivy-llc/ivy
- [PyPi](https://pypi.org/project/ivy) (📥 33K / month · 📦 16 · ⏱️ 16.06.2025):
pip install ivy
Jina (🥈33 · ⭐ 22K · 💤) - Build multimodal AI applications with cloud-native stack. Apache-2 - [GitHub](https://github.com/jina-ai/serve) (👨‍💻 180 · 🔀 2.2K · ⏱️ 24.03.2025):
git clone https://github.com/jina-ai/serve
- [PyPi](https://pypi.org/project/jina) (📥 120K / month · 📦 29 · ⏱️ 24.03.2025):
pip install jina
- [Conda](https://anaconda.org/conda-forge/jina-core) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge jina-core
- [Docker Hub](https://hub.docker.com/r/jinaai/jina) (📥 1.8M · ⭐ 9 · ⏱️ 24.03.2025):
docker pull jinaai/jina
mlpack (🥈33 · ⭐ 5.5K) - mlpack: a fast, header-only C++ machine learning library. BSD-3 - [GitHub](https://github.com/mlpack/mlpack) (👨‍💻 340 · 🔀 1.7K · 📋 1.7K - 1% open · ⏱️ 27.10.2025):
git clone https://github.com/mlpack/mlpack
- [PyPi](https://pypi.org/project/mlpack) (📥 4.7K / month · 📦 6 · ⏱️ 22.05.2025):
pip install mlpack
- [Conda](https://anaconda.org/conda-forge/mlpack) (📥 410K · ⏱️ 22.04.2025):
conda install -c conda-forge mlpack
Thinc (🥈33 · ⭐ 2.9K · 💤) - A refreshing functional take on deep learning, compatible with your.. MIT - [GitHub](https://github.com/explosion/thinc) (👨‍💻 67 · 🔀 280 · 📥 2K · 📦 70K · 📋 160 - 14% open · ⏱️ 07.03.2025):
git clone https://github.com/explosion/thinc
- [PyPi](https://pypi.org/project/thinc) (📥 17M / month · 📦 160 · ⏱️ 04.04.2025):
pip install thinc
- [Conda](https://anaconda.org/conda-forge/thinc) (📥 3.9M · ⏱️ 06.07.2025):
conda install -c conda-forge thinc
Ludwig (🥉32 · ⭐ 12K · 💤) - Low-code framework for building custom LLMs, neural networks,.. Apache-2 - [GitHub](https://github.com/ludwig-ai/ludwig) (👨‍💻 160 · 🔀 1.2K · 📦 340 · 📋 1.1K - 4% open · ⏱️ 17.10.2024):
git clone https://github.com/ludwig-ai/ludwig
- [PyPi](https://pypi.org/project/ludwig) (📥 3.8K / month · 📦 6 · ⏱️ 30.07.2024):
pip install ludwig
skorch (🥉32 · ⭐ 6.1K) - A scikit-learn compatible neural network library that wraps.. BSD-3 - [GitHub](https://github.com/skorch-dev/skorch) (👨‍💻 68 · 🔀 400 · 📦 1.7K · 📋 540 - 12% open · ⏱️ 23.10.2025):
git clone https://github.com/skorch-dev/skorch
- [PyPi](https://pypi.org/project/skorch) (📥 150K / month · 📦 110 · ⏱️ 08.08.2025):
pip install skorch
- [Conda](https://anaconda.org/conda-forge/skorch) (📥 810K · ⏱️ 08.08.2025):
conda install -c conda-forge skorch
Sonnet (🥉31 · ⭐ 9.9K) - TensorFlow-based neural network library. Apache-2 - [GitHub](https://github.com/google-deepmind/sonnet) (👨‍💻 61 · 🔀 1.3K · 📦 1.5K · 📋 190 - 16% open · ⏱️ 04.08.2025):
git clone https://github.com/google-deepmind/sonnet
- [PyPi](https://pypi.org/project/dm-sonnet) (📥 35K / month · 📦 19 · ⏱️ 02.01.2024):
pip install dm-sonnet
- [Conda](https://anaconda.org/conda-forge/sonnet) (📥 47K · ⏱️ 22.04.2025):
conda install -c conda-forge sonnet
Haiku (🥉31 · ⭐ 3.1K · 📉) - JAX-based neural network library. Apache-2 - [GitHub](https://github.com/google-deepmind/dm-haiku) (👨‍💻 90 · 🔀 260 · 📦 2.6K · 📋 250 - 29% open · ⏱️ 29.09.2025):
git clone https://github.com/google-deepmind/dm-haiku
- [PyPi](https://pypi.org/project/dm-haiku) (📥 260K / month · 📦 200 · ⏱️ 18.09.2025):
pip install dm-haiku
- [Conda](https://anaconda.org/conda-forge/dm-haiku) (📥 44K · ⏱️ 19.09.2025):
conda install -c conda-forge dm-haiku
tensorflow-upstream (🥉31 · ⭐ 700) - TensorFlow ROCm port. Apache-2 - [GitHub](https://github.com/ROCm/tensorflow-upstream) (👨‍💻 5K · 🔀 100 · 📥 31 · 📋 400 - 3% open · ⏱️ 29.10.2025):
git clone https://github.com/ROCm/tensorflow-upstream
- [PyPi](https://pypi.org/project/tensorflow-rocm) (📥 1.7K / month · 📦 9 · ⏱️ 10.01.2024):
pip install tensorflow-rocm
Geomstats (🥉30 · ⭐ 1.4K) - Computations and statistics on manifolds with geometric structures. MIT - [GitHub](https://github.com/geomstats/geomstats) (👨‍💻 97 · 🔀 260 · 📦 150 · 📋 570 - 36% open · ⏱️ 06.10.2025):
git clone https://github.com/geomstats/geomstats
- [PyPi](https://pypi.org/project/geomstats) (📥 15K / month · 📦 12 · ⏱️ 09.09.2024):
pip install geomstats
- [Conda](https://anaconda.org/conda-forge/geomstats) (📥 8.2K · ⏱️ 22.04.2025):
conda install -c conda-forge geomstats
pyRiemann (🥉28 · ⭐ 700) - Machine learning for multivariate data through the Riemannian.. BSD-3 - [GitHub](https://github.com/pyRiemann/pyRiemann) (👨‍💻 38 · 🔀 170 · 📦 480 · 📋 110 - 2% open · ⏱️ 29.10.2025):
git clone https://github.com/pyRiemann/pyRiemann
- [PyPi](https://pypi.org/project/pyriemann) (📥 75K / month · 📦 31 · ⏱️ 23.07.2025):
pip install pyriemann
- [Conda](https://anaconda.org/conda-forge/pyriemann) (📥 16K · ⏱️ 23.07.2025):
conda install -c conda-forge pyriemann
NuPIC (🥉27 · ⭐ 6.4K · 💤) - Numenta Platform for Intelligent Computing is an implementation of.. MIT - [GitHub](https://github.com/numenta/nupic-legacy) (👨‍💻 120 · 🔀 1.5K · 📥 26 · 📦 21 · 📋 1.8K - 25% open · ⏱️ 03.12.2024):
git clone https://github.com/numenta/nupic-legacy
- [PyPi](https://pypi.org/project/nupic) (📥 510 / month · ⏱️ 01.09.2016):
pip install nupic
Determined (🥉26 · ⭐ 3.2K · 💤) - Determined is an open-source machine learning.. Apache-2 - [GitHub](https://github.com/determined-ai/determined) (👨‍💻 120 · 🔀 360 · 📥 7.8K · 📋 450 - 22% open · ⏱️ 20.03.2025):
git clone https://github.com/determined-ai/determined
- [PyPi](https://pypi.org/project/determined) (📥 33K / month · 📦 4 · ⏱️ 19.03.2025):
pip install determined
Neural Network Libraries (🥉26 · ⭐ 2.8K) - Neural Network Libraries. Apache-2 - [GitHub](https://github.com/sony/nnabla) (👨‍💻 76 · 🔀 340 · 📥 1K · 📋 95 - 36% open · ⏱️ 29.08.2025):
git clone https://github.com/sony/nnabla
- [PyPi](https://pypi.org/project/nnabla) (📥 1.6K / month · 📦 44 · ⏱️ 29.05.2024):
pip install nnabla
deepinv (🥉26 · ⭐ 540) - DeepInverse: a PyTorch library for solving imaging inverse problems.. BSD-3 - [GitHub](https://github.com/deepinv/deepinv) (👨‍💻 53 · 🔀 120 · 📥 24 · 📦 23 · 📋 350 - 33% open · ⏱️ 29.10.2025):
git clone https://github.com/deepinv/deepinv
- [PyPi](https://pypi.org/project/deepinv) (📥 2.4K / month · ⏱️ 08.10.2025):
pip install deepinv
Towhee (🥉23 · ⭐ 3.4K · 💤) - Towhee is a framework that is dedicated to making neural data.. Apache-2 - [GitHub](https://github.com/towhee-io/towhee) (👨‍💻 38 · 🔀 260 · 📥 2.7K · 📋 670 - 0% open · ⏱️ 18.10.2024):
git clone https://github.com/towhee-io/towhee
- [PyPi](https://pypi.org/project/towhee) (📥 1.3K / month · ⏱️ 04.12.2023):
pip install towhee
fklearn (🥉22 · ⭐ 1.5K) - fklearn: Functional Machine Learning. Apache-2 - [GitHub](https://github.com/nubank/fklearn) (👨‍💻 56 · 🔀 170 · 📦 16 · 📋 64 - 60% open · ⏱️ 23.04.2025):
git clone https://github.com/nubank/fklearn
- [PyPi](https://pypi.org/project/fklearn) (📥 750 / month · ⏱️ 26.02.2025):
pip install fklearn
Runhouse (🥉21 · ⭐ 1.1K) - Distribute and run AI workloads magically in Python, like PyTorch.. Apache-2 - [GitHub](https://github.com/run-house/kubetorch) (👨‍💻 16 · 🔀 41 · 📥 79 · ⏱️ 29.10.2025):
git clone https://github.com/run-house/kubetorch
- [PyPi](https://pypi.org/project/runhouse) (📥 4.5K / month · 📦 1 · ⏱️ 10.03.2025):
pip install runhouse
NeoML (🥉19 · ⭐ 790) - Machine learning framework for both deep learning and traditional.. Apache-2 - [GitHub](https://github.com/neoml-lib/neoml) (👨‍💻 41 · 🔀 130 · 📦 2 · 📋 91 - 40% open · ⏱️ 28.10.2025):
git clone https://github.com/neoml-lib/neoml
- [PyPi](https://pypi.org/project/neoml) (📥 190 / month · ⏱️ 26.12.2023):
pip install neoml
chefboost (🥉19 · ⭐ 480) - A Lightweight Decision Tree Framework supporting regular algorithms:.. MIT - [GitHub](https://github.com/serengil/chefboost) (👨‍💻 7 · 🔀 100 · 📦 72 · ⏱️ 09.07.2025):
git clone https://github.com/serengil/chefboost
- [PyPi](https://pypi.org/project/chefboost) (📥 770 / month · ⏱️ 30.10.2024):
pip install chefboost
ThunderGBM (🥉18 · ⭐ 710 · 💤) - ThunderGBM: Fast GBDTs and Random Forests on GPUs. Apache-2 - [GitHub](https://github.com/Xtra-Computing/thundergbm) (👨‍💻 12 · 🔀 88 · 📦 4 · 📋 81 - 48% open · ⏱️ 19.03.2025):
git clone https://github.com/Xtra-Computing/thundergbm
- [PyPi](https://pypi.org/project/thundergbm) (📥 220 / month · ⏱️ 19.09.2022):
pip install thundergbm
Show 26 hidden projects...
- dlib (🥈40 · ⭐ 14K) - A toolkit for making real world machine learning and data analysis.. ❗️BSL-1.0
- MXNet (🥈38 · ⭐ 21K · 💀) - Lightweight, Portable, Flexible Distributed/Mobile Deep.. Apache-2
- Theano (🥈37 · ⭐ 10K · 💀) - Theano was a Python library that allows you to define, optimize, and.. BSD-3
- MindsDB (🥈33 · ⭐ 37K) - Federated query engine for AI - The only MCP Server you'll ever need. ❗️ICU
- Vowpal Wabbit (🥈33 · ⭐ 8.6K · 💀) - Vowpal Wabbit is a machine learning system which pushes the.. BSD-3
- Chainer (🥈33 · ⭐ 5.9K · 💀) - A flexible framework of neural networks for deep learning. MIT
- Turi Create (🥉32 · ⭐ 11K · 💀) - Turi Create simplifies the development of custom machine.. BSD-3
- tensorpack (🥉32 · ⭐ 6.3K · 💀) - A Neural Net Training Interface on TensorFlow, with.. Apache-2
- TFlearn (🥉31 · ⭐ 9.6K · 💀) - Deep learning library featuring a higher-level API for TensorFlow. MIT
- dyNET (🥉31 · ⭐ 3.4K · 💀) - DyNet: The Dynamic Neural Network Toolkit. Apache-2
- CNTK (🥉29 · ⭐ 18K · 💀) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit. MIT
- Lasagne (🥉28 · ⭐ 3.9K · 💀) - Lightweight library to build and train neural networks in Theano. MIT
- SHOGUN (🥉26 · ⭐ 3.1K · 💀) - Unified and efficient Machine Learning. BSD-3
- ktrain (🥉26 · ⭐ 1.3K · 💀) - ktrain is a Python library that makes deep learning and AI.. Apache-2
- NeuPy (🥉25 · ⭐ 740 · 💀) - NeuPy is a Tensorflow based python library for prototyping and building.. MIT
- xLearn (🥉24 · ⭐ 3.1K · 💀) - High performance, easy-to-use, and scalable machine learning (ML).. Apache-2
- EvaDB (🥉24 · ⭐ 2.7K · 💀) - Database system for AI-powered apps. Apache-2
- neon (🥉22 · ⭐ 3.9K · 💀) - Intel Nervana reference deep learning framework committed to best.. Apache-2
- ThunderSVM (🥉22 · ⭐ 1.6K · 💀) - ThunderSVM: A Fast SVM Library on GPUs and CPUs. Apache-2
- Torchbearer (🥉22 · ⭐ 640 · 💀) - torchbearer: A model fitting library for PyTorch. MIT
- mace (🥉21 · ⭐ 5K · 💀) - MACE is a deep learning inference framework optimized for mobile.. Apache-2
- Neural Tangents (🥉21 · ⭐ 2.4K · 💀) - Fast and Easy Infinite Neural Networks in Python. Apache-2
- Objax (🥉20 · ⭐ 770 · 💀) - Objax is a machine learning framework that provides an Object.. Apache-2
- elegy (🥉19 · ⭐ 480 · 💀) - A High Level API for Deep Learning in JAX. MIT
- StarSpace (🥉16 · ⭐ 4K · 💀) - Learning embeddings for classification, retrieval and ranking. MIT
- nanodl (🥉14 · ⭐ 300 · 💀) - A Jax-based library for building transformers, includes.. MIT


Data Visualization

General-purpose and task-specific data visualization libraries.

Matplotlib (🥇49 · ⭐ 22K) - matplotlib: plotting with Python. ❗Unlicensed - [GitHub](https://github.com/matplotlib/matplotlib) (👨‍💻 1.9K · 🔀 8.1K · 📦 1.9M · 📋 11K - 14% open · ⏱️ 30.10.2025):
git clone https://github.com/matplotlib/matplotlib
- [PyPi](https://pypi.org/project/matplotlib) (📥 120M / month · 📦 68K · ⏱️ 09.10.2025):
pip install matplotlib
- [Conda](https://anaconda.org/conda-forge/matplotlib) (📥 33M · ⏱️ 15.10.2025):
conda install -c conda-forge matplotlib
Plotly (🥇47 · ⭐ 18K) - The interactive graphing library for Python. MIT - [GitHub](https://github.com/plotly/plotly.py) (👨‍💻 300 · 🔀 2.7K · 📥 550 · 📦 460K · 📋 3.3K - 21% open · ⏱️ 28.10.2025):
git clone https://github.com/plotly/plotly.py
- [PyPi](https://pypi.org/project/plotly) (📥 37M / month · 📦 9.7K · ⏱️ 02.10.2025):
pip install plotly
- [Conda](https://anaconda.org/conda-forge/plotly) (📥 12M · ⏱️ 03.10.2025):
conda install -c conda-forge plotly
- [npm](https://www.npmjs.com/package/plotlywidget) (📥 2.8K / month · 📦 9 · ⏱️ 12.01.2021):
npm install plotlywidget
dash (🥇45 · ⭐ 24K) - Data Apps & Dashboards for Python. No JavaScript Required. MIT - [GitHub](https://github.com/plotly/dash) (👨‍💻 190 · 🔀 2.2K · 📥 120 · 📦 89K · 📋 2.1K - 27% open · ⏱️ 21.10.2025):
git clone https://github.com/plotly/dash
- [PyPi](https://pypi.org/project/dash) (📥 5.5M / month · 📦 1.9K · ⏱️ 22.10.2025):
pip install dash
- [Conda](https://anaconda.org/conda-forge/dash) (📥 2.1M · ⏱️ 11.08.2025):
conda install -c conda-forge dash
Bokeh (🥇45 · ⭐ 20K) - Interactive Data Visualization in the browser, from Python. BSD-3 - [GitHub](https://github.com/bokeh/bokeh) (👨‍💻 720 · 🔀 4.2K · 📦 100K · 📋 8.1K - 10% open · ⏱️ 28.10.2025):
git clone https://github.com/bokeh/bokeh
- [PyPi](https://pypi.org/project/bokeh) (📥 5M / month · 📦 2.2K · ⏱️ 13.10.2025):
pip install bokeh
- [Conda](https://anaconda.org/conda-forge/bokeh) (📥 18M · ⏱️ 30.08.2025):
conda install -c conda-forge bokeh
Seaborn (🥇42 · ⭐ 14K) - Statistical data visualization in Python. BSD-3 - [GitHub](https://github.com/mwaskom/seaborn) (👨‍💻 220 · 🔀 2K · 📥 510 · 📦 700K · 📋 2.6K - 6% open · ⏱️ 10.07.2025):
git clone https://github.com/mwaskom/seaborn
- [PyPi](https://pypi.org/project/seaborn) (📥 31M / month · 📦 11K · ⏱️ 25.01.2024):
pip install seaborn
- [Conda](https://anaconda.org/conda-forge/seaborn) (📥 15M · ⏱️ 22.04.2025):
conda install -c conda-forge seaborn
Altair (🥇41 · ⭐ 10K) - Declarative visualization library for Python. BSD-3 - [GitHub](https://github.com/vega/altair) (👨‍💻 180 · 🔀 800 · 📥 280 · 📦 240K · 📋 2.1K - 6% open · ⏱️ 27.10.2025):
git clone https://github.com/vega/altair
- [PyPi](https://pypi.org/project/altair) (📥 37M / month · 📦 920 · ⏱️ 23.11.2024):
pip install altair
- [Conda](https://anaconda.org/conda-forge/altair) (📥 3M · ⏱️ 22.04.2025):
conda install -c conda-forge altair
FiftyOne (🥈39 · ⭐ 10K) - Visualize, create, and debug image and video datasets.. Apache-2 - [GitHub](https://github.com/voxel51/fiftyone) (👨‍💻 160 · 🔀 680 · 📦 1K · 📋 1.8K - 35% open · ⏱️ 29.10.2025):
git clone https://github.com/voxel51/fiftyone
- [PyPi](https://pypi.org/project/fiftyone) (📥 170K / month · 📦 36 · ⏱️ 20.10.2025):
pip install fiftyone
Graphviz (🥈39 · ⭐ 1.8K) - Simple Python interface for Graphviz. MIT - [GitHub](https://github.com/xflr6/graphviz) (👨‍💻 24 · 🔀 220 · 📦 95K · 📋 190 - 6% open · ⏱️ 26.10.2025):
git clone https://github.com/xflr6/graphviz
- [PyPi](https://pypi.org/project/graphviz) (📥 26M / month · 📦 3.2K · ⏱️ 15.06.2025):
pip install graphviz
- [Conda](https://anaconda.org/anaconda/python-graphviz) (📥 59K · ⏱️ 22.04.2025):
conda install -c anaconda python-graphviz
PyVista (🥈38 · ⭐ 3.3K) - 3D plotting and mesh analysis through a streamlined interface for.. MIT - [GitHub](https://github.com/pyvista/pyvista) (👨‍💻 190 · 🔀 590 · 📥 960 · 📦 5.2K · 📋 2K - 35% open · ⏱️ 28.10.2025):
git clone https://github.com/pyvista/pyvista
- [PyPi](https://pypi.org/project/pyvista) (📥 1M / month · 📦 820 · ⏱️ 26.08.2025):
pip install pyvista
- [Conda](https://anaconda.org/conda-forge/pyvista) (📥 810K · ⏱️ 10.10.2025):
conda install -c conda-forge pyvista
HoloViews (🥈38 · ⭐ 2.8K) - With Holoviews, your data visualizes itself. BSD-3 - [GitHub](https://github.com/holoviz/holoviews) (👨‍💻 150 · 🔀 410 · 📦 17K · 📋 3.4K - 31% open · ⏱️ 29.10.2025):
git clone https://github.com/holoviz/holoviews
- [PyPi](https://pypi.org/project/holoviews) (📥 820K / month · 📦 490 · ⏱️ 29.10.2025):
pip install holoviews
- [Conda](https://anaconda.org/conda-forge/holoviews) (📥 2.4M · ⏱️ 25.06.2025):
conda install -c conda-forge holoviews
- [npm](https://www.npmjs.com/package/@pyviz/jupyterlab_pyviz) (📥 380 / month · 📦 7 · ⏱️ 20.06.2025):
npm install @pyviz/jupyterlab_pyviz
pyecharts (🥈37 · ⭐ 16K) - Python Echarts Plotting Library. MIT - [GitHub](https://github.com/pyecharts/pyecharts) (👨‍💻 45 · 🔀 2.9K · 📥 75 · 📦 5.5K · 📋 1.9K - 0% open · ⏱️ 10.10.2025):
git clone https://github.com/pyecharts/pyecharts
- [PyPi](https://pypi.org/project/pyecharts) (📥 530K / month · 📦 280 · ⏱️ 10.10.2025):
pip install pyecharts
PyQtGraph (🥈37 · ⭐ 4.2K) - Fast data visualization and GUI tools for scientific / engineering.. MIT - [GitHub](https://github.com/pyqtgraph/pyqtgraph) (👨‍💻 300 · 🔀 1.1K · 📦 13K · 📋 1.4K - 31% open · ⏱️ 02.10.2025):
git clone https://github.com/pyqtgraph/pyqtgraph
- [PyPi](https://pypi.org/project/pyqtgraph) (📥 560K / month · 📦 1K · ⏱️ 29.04.2024):
pip install pyqtgraph
- [Conda](https://anaconda.org/conda-forge/pyqtgraph) (📥 880K · ⏱️ 22.04.2025):
conda install -c conda-forge pyqtgraph
pandas-profiling (🥈35 · ⭐ 13K) - 1 Line of code data quality profiling & exploratory.. MIT - [GitHub](https://github.com/ydataai/ydata-profiling) (👨‍💻 140 · 🔀 1.7K · 📥 490 · 📦 6.9K · 📋 850 - 30% open · ⏱️ 19.09.2025):
git clone https://github.com/ydataai/ydata-profiling
- [PyPi](https://pypi.org/project/pandas-profiling) (📥 330K / month · 📦 180 · ⏱️ 03.02.2023):
pip install pandas-profiling
- [Conda](https://anaconda.org/conda-forge/pandas-profiling) (📥 590K · ⏱️ 22.04.2025):
conda install -c conda-forge pandas-profiling
plotnine (🥈35 · ⭐ 4.4K) - A Grammar of Graphics for Python. MIT - [GitHub](https://github.com/has2k1/plotnine) (👨‍💻 110 · 🔀 240 · 📦 13K · 📋 750 - 10% open · ⏱️ 16.10.2025):
git clone https://github.com/has2k1/plotnine
- [PyPi](https://pypi.org/project/plotnine) (📥 2.2M / month · 📦 400 · ⏱️ 15.07.2025):
pip install plotnine
- [Conda](https://anaconda.org/conda-forge/plotnine) (📥 560K · ⏱️ 15.07.2025):
conda install -c conda-forge plotnine
cartopy (🥈35 · ⭐ 1.5K) - Cartopy - a cartographic python library with matplotlib support. BSD-3 - [GitHub](https://github.com/SciTools/cartopy) (👨‍💻 140 · 🔀 390 · 📦 8.1K · 📋 1.3K - 23% open · ⏱️ 30.10.2025):
git clone https://github.com/SciTools/cartopy
- [PyPi](https://pypi.org/project/cartopy) (📥 810K / month · 📦 970 · ⏱️ 01.08.2025):
pip install cartopy
- [Conda](https://anaconda.org/conda-forge/cartopy) (📥 5.6M · ⏱️ 27.10.2025):
conda install -c conda-forge cartopy
VisPy (🥈34 · ⭐ 3.5K · 📉) - High-performance interactive 2D/3D data visualization library. BSD-3 - [GitHub](https://github.com/vispy/vispy) (👨‍💻 210 · 🔀 620 · 📦 2.1K · 📋 1.5K - 25% open · ⏱️ 13.10.2025):
git clone https://github.com/vispy/vispy
- [PyPi](https://pypi.org/project/vispy) (📥 190K / month · 📦 200 · ⏱️ 19.05.2025):
pip install vispy
- [Conda](https://anaconda.org/conda-forge/vispy) (📥 980K · ⏱️ 30.08.2025):
conda install -c conda-forge vispy
- [npm](https://www.npmjs.com/package/vispy) (📥 12 / month · 📦 3 · ⏱️ 15.03.2020):
npm install vispy
datashader (🥈34 · ⭐ 3.5K) - Quickly and accurately render even the largest data. BSD-3 - [GitHub](https://github.com/holoviz/datashader) (👨‍💻 63 · 🔀 380 · 📦 6.3K · 📋 620 - 24% open · ⏱️ 09.10.2025):
git clone https://github.com/holoviz/datashader
- [PyPi](https://pypi.org/project/datashader) (📥 280K / month · 📦 250 · ⏱️ 05.08.2025):
pip install datashader
- [Conda](https://anaconda.org/conda-forge/datashader) (📥 1.6M · ⏱️ 05.08.2025):
conda install -c conda-forge datashader
lets-plot (🥈34 · ⭐ 1.7K) - Multiplatform plotting library based on the Grammar of Graphics. MIT - [GitHub](https://github.com/JetBrains/lets-plot) (👨‍💻 21 · 🔀 54 · 📥 3.4K · 📦 190 · 📋 740 - 21% open · ⏱️ 30.10.2025):
git clone https://github.com/JetBrains/lets-plot
- [PyPi](https://pypi.org/project/lets-plot) (📥 120K / month · 📦 16 · ⏱️ 12.09.2025):
pip install lets-plot
wordcloud (🥈33 · ⭐ 10K) - A little word cloud generator in Python. MIT - [GitHub](https://github.com/amueller/word_cloud) (👨‍💻 75 · 🔀 2.3K · 📦 21 · 📋 560 - 24% open · ⏱️ 31.08.2025):
git clone https://github.com/amueller/word_cloud
- [PyPi](https://pypi.org/project/wordcloud) (📥 2M / month · 📦 550 · ⏱️ 10.11.2024):
pip install wordcloud
- [Conda](https://anaconda.org/conda-forge/wordcloud) (📥 790K · ⏱️ 03.09.2025):
conda install -c conda-forge wordcloud
Perspective (🥈33 · ⭐ 9.5K) - A data visualization and analytics component, especially.. Apache-2 - [GitHub](https://github.com/perspective-dev/perspective) (👨‍💻 100 · 🔀 1.2K · 📥 12K · 📦 190 · 📋 890 - 12% open · ⏱️ 29.10.2025):
git clone https://github.com/perspective-dev/perspective
- [PyPi](https://pypi.org/project/perspective-python) (📥 17K / month · 📦 31 · ⏱️ 28.10.2025):
pip install perspective-python
- [Conda](https://anaconda.org/conda-forge/perspective) (📥 2.4M · ⏱️ 28.10.2025):
conda install -c conda-forge perspective
- [npm](https://www.npmjs.com/package/@finos/perspective-jupyterlab) (📥 600 / month · 📦 6 · ⏱️ 03.09.2025):
npm install @finos/perspective-jupyterlab
UMAP (🥈33 · ⭐ 8K) - Uniform Manifold Approximation and Projection. BSD-3 - [GitHub](https://github.com/lmcinnes/umap) (👨‍💻 140 · 🔀 850 · 📦 1 · 📋 860 - 59% open · ⏱️ 26.10.2025):
git clone https://github.com/lmcinnes/umap
- [PyPi](https://pypi.org/project/umap-learn) (📥 2.7M / month · 📦 1.3K · ⏱️ 03.07.2025):
pip install umap-learn
- [Conda](https://anaconda.org/conda-forge/umap-learn) (📥 3.2M · ⏱️ 03.07.2025):
conda install -c conda-forge umap-learn
hvPlot (🥈32 · ⭐ 1.3K) - A high-level plotting API for pandas, dask, xarray, and networkx built.. BSD-3 - [GitHub](https://github.com/holoviz/hvplot) (👨‍💻 52 · 🔀 110 · 📦 7.3K · 📋 940 - 41% open · ⏱️ 24.10.2025):
git clone https://github.com/holoviz/hvplot
- [PyPi](https://pypi.org/project/hvplot) (📥 310K / month · 📦 270 · ⏱️ 29.08.2025):
pip install hvplot
- [Conda](https://anaconda.org/conda-forge/hvplot) (📥 860K · ⏱️ 04.09.2025):
conda install -c conda-forge hvplot
mpld3 (🥉31 · ⭐ 2.4K · 📉) - An interactive data visualization tool which brings matplotlib.. BSD-3 - [GitHub](https://github.com/mpld3/mpld3) (👨‍💻 54 · 🔀 360 · 📦 7.6K · 📋 370 - 59% open · ⏱️ 27.07.2025):
git clone https://github.com/mpld3/mpld3
- [PyPi](https://pypi.org/project/mpld3) (📥 440K / month · 📦 160 · ⏱️ 27.07.2025):
pip install mpld3
- [Conda](https://anaconda.org/conda-forge/mpld3) (📥 280K · ⏱️ 28.07.2025):
conda install -c conda-forge mpld3
- [npm](https://www.npmjs.com/package/mpld3) (📥 900 / month · 📦 11 · ⏱️ 27.07.2025):
npm install mpld3
bqplot (🥉30 · ⭐ 3.7K) - Plotting library for IPython/Jupyter notebooks. Apache-2 - [GitHub](https://github.com/bqplot/bqplot) (👨‍💻 66 · 🔀 480 · 📦 62 · 📋 650 - 42% open · ⏱️ 25.08.2025):
git clone https://github.com/bqplot/bqplot
- [PyPi](https://pypi.org/project/bqplot) (📥 230K / month · 📦 110 · ⏱️ 21.05.2025):
pip install bqplot
- [Conda](https://anaconda.org/conda-forge/bqplot) (📥 1.9M · ⏱️ 02.09.2025):
conda install -c conda-forge bqplot
- [npm](https://www.npmjs.com/package/bqplot) (📥 3K / month · 📦 21 · ⏱️ 03.09.2025):
npm install bqplot
D-Tale (🥉29 · ⭐ 5K) - Visualizer for pandas data structures. ❗️LGPL-2.1 - [GitHub](https://github.com/man-group/dtale) (👨‍💻 31 · 🔀 430 · 📦 1.5K · 📋 610 - 10% open · ⏱️ 30.07.2025):
git clone https://github.com/man-group/dtale
- [PyPi](https://pypi.org/project/dtale) (📥 31K / month · 📦 53 · ⏱️ 30.07.2025):
pip install dtale
- [Conda](https://anaconda.org/conda-forge/dtale) (📥 480K · ⏱️ 30.07.2025):
conda install -c conda-forge dtale
openTSNE (🥉29 · ⭐ 1.6K · 📈) - Extensible, parallel implementations of t-SNE. BSD-3 - [GitHub](https://github.com/pavlin-policar/openTSNE) (👨‍💻 14 · 🔀 170 · 📦 1.1K · 📋 150 - 2% open · ⏱️ 27.10.2025):
git clone https://github.com/pavlin-policar/openTSNE
- [PyPi](https://pypi.org/project/opentsne) (📥 58K / month · 📦 69 · ⏱️ 27.10.2025):
pip install opentsne
- [Conda](https://anaconda.org/conda-forge/opentsne) (📥 500K · ⏱️ 27.10.2025):
conda install -c conda-forge opentsne
Plotly-Resampler (🥉27 · ⭐ 1.2K) - Visualize large time series data with plotly.py. MIT - [GitHub](https://github.com/predict-idlab/plotly-resampler) (👨‍💻 14 · 🔀 74 · 📦 2K · 📋 190 - 32% open · ⏱️ 03.09.2025):
git clone https://github.com/predict-idlab/plotly-resampler
- [PyPi](https://pypi.org/project/plotly-resampler) (📥 370K / month · 📦 38 · ⏱️ 29.08.2025):
pip install plotly-resampler
- [Conda](https://anaconda.org/conda-forge/plotly-resampler) (📥 140K · ⏱️ 09.10.2025):
conda install -c conda-forge plotly-resampler
HyperTools (🥉26 · ⭐ 1.9K) - A Python toolbox for gaining geometric insights into high-dimensional.. MIT - [GitHub](https://github.com/ContextLab/hypertools) (👨‍💻 23 · 🔀 160 · 📥 73 · 📦 510 · 📋 200 - 34% open · ⏱️ 10.07.2025):
git clone https://github.com/ContextLab/hypertools
- [PyPi](https://pypi.org/project/hypertools) (📥 1.1K / month · 📦 2 · ⏱️ 09.07.2025):
pip install hypertools
data-validation (🥉25 · ⭐ 780) - Library for exploring and validating machine learning.. Apache-2 - [GitHub](https://github.com/tensorflow/data-validation) (👨‍💻 30 · 🔀 180 · 📥 1K · 📋 190 - 20% open · ⏱️ 23.06.2025):
git clone https://github.com/tensorflow/data-validation
- [PyPi](https://pypi.org/project/tensorflow-data-validation) (📥 150K / month · 📦 32 · ⏱️ 09.06.2025):
pip install tensorflow-data-validation
Chartify (🥉24 · ⭐ 3.6K · 💤) - Python library that makes it easy for data scientists to create.. Apache-2 - [GitHub](https://github.com/spotify/chartify) (👨‍💻 27 · 🔀 340 · 📦 83 · 📋 86 - 62% open · ⏱️ 16.10.2024):
git clone https://github.com/spotify/chartify
- [PyPi](https://pypi.org/project/chartify) (📥 1.2K / month · 📦 9 · ⏱️ 16.10.2024):
pip install chartify
- [Conda](https://anaconda.org/conda-forge/chartify) (📥 40K · ⏱️ 22.04.2025):
conda install -c conda-forge chartify
Popmon (🥉22 · ⭐ 510) - Monitor the stability of a Pandas or Spark dataframe. MIT - [GitHub](https://github.com/ing-bank/popmon) (👨‍💻 19 · 🔀 36 · 📥 280 · 📦 22 · 📋 57 - 28% open · ⏱️ 04.09.2025):
git clone https://github.com/ing-bank/popmon
- [PyPi](https://pypi.org/project/popmon) (📥 3.4K / month · 📦 4 · ⏱️ 04.09.2025):
pip install popmon
vega (🥉22 · ⭐ 390 · 💤) - IPython/Jupyter notebook module for Vega and Vega-Lite. BSD-3 - [GitHub](https://github.com/vega/ipyvega) (👨‍💻 15 · 🔀 65 · 📦 4 · 📋 110 - 14% open · ⏱️ 01.01.2025):
git clone https://github.com/vega/ipyvega
- [PyPi](https://pypi.org/project/vega) (📥 26K / month · 📦 17 · ⏱️ 25.09.2024):
pip install vega
- [Conda](https://anaconda.org/conda-forge/vega) (📥 940K · ⏱️ 04.10.2025):
conda install -c conda-forge vega
vegafusion (🥉21 · ⭐ 390) - Serverside scaling for Vega and Altair visualizations. BSD-3 - [GitHub](https://github.com/vega/vegafusion) (👨‍💻 6 · 🔀 26 · 📥 6.6K · 📋 150 - 36% open · ⏱️ 29.09.2025):
git clone https://github.com/vega/vegafusion
- [PyPi](https://pypi.org/project/vegafusion-jupyter) (📥 770 / month · 📦 2 · ⏱️ 09.05.2024):
pip install vegafusion-jupyter
- [Conda](https://anaconda.org/conda-forge/vegafusion-python-embed) (📥 520K · ⏱️ 27.10.2025):
conda install -c conda-forge vegafusion-python-embed
- [npm](https://www.npmjs.com/package/vegafusion-jupyter) (📥 1.9K / month · 📦 3 · ⏱️ 09.05.2024):
npm install vegafusion-jupyter
Show 22 hidden projects...
- missingno (🥉30 · ⭐ 4.2K · 💀) - Missing data visualization module for Python. MIT
- Facets Overview (🥉28 · ⭐ 7.4K · 💀) - Visualizations for machine learning datasets. Apache-2
- Cufflinks (🥉28 · ⭐ 3.1K · 💀) - Productivity Tools for Plotly + Pandas. MIT
- pythreejs (🥉27 · ⭐ 980 · 💀) - A Jupyter - Three.js bridge. BSD-3
- Sweetviz (🥉26 · ⭐ 3.1K · 💀) - Visualize and compare datasets, target values and associations, with.. MIT
- AutoViz (🥉26 · ⭐ 1.9K · 💀) - Automatically Visualize any dataset, any size with a single line.. Apache-2
- ridgeplot (🥉26 · ⭐ 240) - Beautiful ridgeline plots in Python. MIT
- PandasGUI (🥉24 · ⭐ 3.3K) - A GUI for Pandas DataFrames. ❗️MIT-0
- HiPlot (🥉24 · ⭐ 2.8K · 💀) - HiPlot makes understanding high dimensional data easy. MIT
- python-ternary (🥉24 · ⭐ 770 · 💀) - Ternary plotting library for python with matplotlib. MIT
- Multicore-TSNE (🥉23 · ⭐ 1.9K · 💀) - Parallel t-SNE implementation with Python and Torch.. BSD-3
- Pandas-Bokeh (🥉22 · ⭐ 890 · 💀) - Bokeh Plotting Backend for Pandas and GeoPandas. MIT
- pivottablejs (🥉21 · ⭐ 710 · 💀) - Drag'n'drop Pivot Tables and Charts for Jupyter/IPython.. MIT
- joypy (🥉21 · ⭐ 600 · 💀) - Joyplots in Python with matplotlib & pandas. MIT
- PyWaffle (🥉21 · ⭐ 600 · 💀) - Make Waffle Charts in Python. MIT
- PDPbox (🥉20 · ⭐ 860 · 💀) - python partial dependence plot toolbox. MIT
- animatplot (🥉18 · ⭐ 410 · 💀) - A python package for animating plots built on matplotlib. MIT
- ivis (🥉18 · ⭐ 340 · 💀) - Dimensionality reduction in very large datasets using Siamese.. Apache-2
- pdvega (🥉16 · ⭐ 340 · 💀) - Interactive plotting for Pandas using Vega-Lite. MIT
- nx-altair (🥉16 · ⭐ 230 · 💀) - Draw interactive NetworkX graphs with Altair. MIT
- data-describe (🥉15 · ⭐ 300 · 💀) - datadescribe: Pythonic EDA Accelerator for Data Science. Apache-2
- nptsne (🥉11 · ⭐ 33 · 💀) - nptsne is a numpy compatible python binary package that offers a.. Apache-2


Text Data & NLP

Libraries for processing, cleaning, manipulating, and analyzing text data as well as libraries for NLP tasks such as language detection, fuzzy matching, classification, seq2seq learning, conversational AI, keyword extraction, and translation.
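As a rough illustration of one task named above, fuzzy matching, here is a minimal sketch that uses only Python's standard-library `difflib`. It is a stand-in for the idea, not the API of any library listed in this section; the helper names `similarity` and `best_match` and the candidate strings are made up for the example.

```python
from difflib import SequenceMatcher, get_close_matches


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio in [0, 1]; 1.0 means identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(query, candidates, cutoff=0.6):
    """Return the candidate closest to `query`, or None if none reaches `cutoff`."""
    matches = get_close_matches(query, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical candidate list, used only for this demonstration.
candidates = ["transformers", "tokenizers", "textblob"]

print(best_match("transfomers", candidates))  # misspelled query
print(similarity("spaCy", "spacy"))
```

The dedicated libraries in this list typically offer more metrics (phonetic, token-based, edit-distance variants) and better performance on large inputs; `difflib` is only convenient for small, quick comparisons.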

transformers (🥇54 · ⭐ 150K) - Transformers: the model-definition framework for.. Apache-2 - [GitHub](https://github.com/huggingface/transformers) (👨‍💻 3.6K · 🔀 31K · 📦 400K · 📋 19K - 11% open · ⏱️ 30.10.2025):
git clone https://github.com/huggingface/transformers
- [PyPi](https://pypi.org/project/transformers) (📥 93M / month · 📦 11K · ⏱️ 14.10.2025):
pip install transformers
- [Conda](https://anaconda.org/conda-forge/transformers) (📥 3.3M · ⏱️ 14.10.2025):
conda install -c conda-forge transformers
nltk (🥇47 · ⭐ 14K) - Suite of libraries and programs for symbolic and statistical natural.. Apache-2 - [GitHub](https://github.com/nltk/nltk) (👨‍💻 480 · 🔀 3K · 📦 410K · 📋 1.9K - 14% open · ⏱️ 22.10.2025):
git clone https://github.com/nltk/nltk
- [PyPi](https://pypi.org/project/nltk) (📥 42M / month · 📦 6.3K · ⏱️ 01.10.2025):
pip install nltk
- [Conda](https://anaconda.org/conda-forge/nltk) (📥 3.4M · ⏱️ 01.10.2025):
conda install -c conda-forge nltk
litellm (🥇45 · ⭐ 30K · 📉) - Python SDK, Proxy Server (LLM Gateway) to call 100+.. MIT others - [GitHub](https://github.com/BerriAI/litellm) (👨‍💻 960 · 🔀 4.5K · 📥 800 · 📦 17K · 📋 7.8K - 17% open · ⏱️ 30.10.2025):
git clone https://github.com/BerriAI/litellm
- [PyPi](https://pypi.org/project/litellm) (📥 34M / month · 📦 1.9K · ⏱️ 29.10.2025):
pip install litellm
spaCy (🥇43 · ⭐ 33K · 📈) - Industrial-strength Natural Language Processing (NLP) in Python. MIT - [GitHub](https://github.com/explosion/spaCy) (👨‍💻 780 · 🔀 4.5K · 📥 4.9K · 📦 140K · 📋 5.8K - 3% open · ⏱️ 28.10.2025):
git clone https://github.com/explosion/spaCy
- [PyPi](https://pypi.org/project/spacy) (📥 17M / month · 📦 3.2K · ⏱️ 23.05.2025):
pip install spacy
- [Conda](https://anaconda.org/conda-forge/spacy) (📥 6.5M · ⏱️ 06.07.2025):
conda install -c conda-forge spacy
sentence-transformers (🥇42 · ⭐ 18K) - State-of-the-Art Text Embeddings. Apache-2 - [GitHub](https://github.com/huggingface/sentence-transformers) (👨‍💻 240 · 🔀 2.7K · 📦 120K · 📋 2.5K - 51% open · ⏱️ 22.10.2025):
git clone https://github.com/huggingface/sentence-transformers
- [PyPi](https://pypi.org/project/sentence-transformers) (📥 17M / month · 📦 3.7K · ⏱️ 22.10.2025):
pip install sentence-transformers
- [Conda](https://anaconda.org/conda-forge/sentence-transformers) (📥 1M · ⏱️ 22.10.2025):
conda install -c conda-forge sentence-transformers
gensim (🥇42 · ⭐ 16K) - Topic Modelling for Humans. ❗️LGPL-2.1 - [GitHub](https://github.com/piskvorky/gensim) (👨‍💻 460 · 🔀 4.4K · 📥 6.4K · 📦 78K · 📋 1.9K - 21% open · ⏱️ 16.10.2025):
git clone https://github.com/piskvorky/gensim
- [PyPi](https://pypi.org/project/gensim) (📥 5.2M / month · 📦 1.6K · ⏱️ 18.10.2025):
pip install gensim
- [Conda](https://anaconda.org/conda-forge/gensim) (📥 1.8M · ⏱️ 22.04.2025):
conda install -c conda-forge gensim
sentencepiece (🥇42 · ⭐ 11K) - Unsupervised text tokenizer for Neural Network-based text.. Apache-2 - [GitHub](https://github.com/google/sentencepiece) (👨‍💻 100 · 🔀 1.3K · 📥 110K · 📦 120K · 📋 800 - 3% open · ⏱️ 04.10.2025):
git clone https://github.com/google/sentencepiece
- [PyPi](https://pypi.org/project/sentencepiece) (📥 31M / month · 📦 2.4K · ⏱️ 12.08.2025):
pip install sentencepiece
- [Conda](https://anaconda.org/conda-forge/sentencepiece) (📥 1.7M · ⏱️ 22.09.2025):
conda install -c conda-forge sentencepiece
Tokenizers (🥇40 · ⭐ 10K) - Fast State-of-the-Art Tokenizers optimized for Research and.. Apache-2 - [GitHub](https://github.com/huggingface/tokenizers) (👨‍💻 130 · 🔀 970 · 📥 86 · 📦 180K · 📋 1.1K - 9% open · ⏱️ 16.10.2025):
git clone https://github.com/huggingface/tokenizers
- [PyPi](https://pypi.org/project/tokenizers) (📥 81M / month · 📦 1.7K · ⏱️ 19.09.2025):
pip install tokenizers
- [Conda](https://anaconda.org/conda-forge/tokenizers) (📥 3.6M · ⏱️ 19.09.2025):
conda install -c conda-forge tokenizers
NeMo (🥇38 · ⭐ 16K) - A scalable generative AI framework built for researchers and.. Apache-2 - [GitHub](https://github.com/NVIDIA-NeMo/NeMo) (👨‍💻 460 · 🔀 3.2K · 📥 520K · 📦 21 · 📋 2.8K - 4% open · ⏱️ 29.10.2025):
git clone https://github.com/NVIDIA-NeMo/NeMo
- [PyPi](https://pypi.org/project/nemo-toolkit) (📥 810K / month · 📦 18 · ⏱️ 27.10.2025):
pip install nemo-toolkit
haystack (🥇37 · ⭐ 23K) - AI orchestration framework to build customizable, production-ready.. Apache-2 - [GitHub](https://github.com/deepset-ai/haystack) (👨‍💻 310 · 🔀 2.5K · 📦 1.3K · 📋 4.1K - 2% open · ⏱️ 30.10.2025):
git clone https://github.com/deepset-ai/haystack
- [PyPi](https://pypi.org/project/haystack) (📥 7.4K / month · 📦 5 · ⏱️ 15.12.2021):
pip install haystack
Opik (🥇37 · ⭐ 15K) - Debug, evaluate, and monitor your LLM applications, RAG systems, and.. Apache-2 - [GitHub](https://github.com/comet-ml/opik) (👨‍💻 81 · 🔀 1.1K · 📦 17 · 📋 540 - 29% open · ⏱️ 30.10.2025):
git clone https://github.com/comet-ml/opik
- [PyPi](https://pypi.org/project/opik) (📥 850K / month · 📦 34 · ⏱️ 29.10.2025):
pip install opik
ChatterBot (🥇37 · ⭐ 14K) - ChatterBot is a machine learning, conversational dialog engine for.. BSD-3 - [GitHub](https://github.com/gunthercox/ChatterBot) (👨‍💻 110 · 🔀 4.5K · 📦 6.5K · 📋 1.7K - 6% open · ⏱️ 25.10.2025):
git clone https://github.com/gunthercox/ChatterBot
- [PyPi](https://pypi.org/project/chatterbot) (📥 20K / month · 📦 19 · ⏱️ 16.10.2025):
pip install chatterbot
flair (🥇37 · ⭐ 14K) - A very simple framework for state-of-the-art Natural Language Processing.. MIT - [GitHub](https://github.com/flairNLP/flair) (👨‍💻 280 · 🔀 2.1K · 📦 4.1K · 📋 2.4K - 1% open · ⏱️ 12.06.2025):
git clone https://github.com/flairNLP/flair
- [PyPi](https://pypi.org/project/flair) (📥 180K / month · 📦 160 · ⏱️ 05.02.2025):
pip install flair
- [Conda](https://anaconda.org/conda-forge/python-flair) (📥 49K · ⏱️ 22.04.2025):
conda install -c conda-forge python-flair
TextBlob (🥇37 · ⭐ 9.5K) - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech.. MIT - [GitHub](https://github.com/sloria/TextBlob) (👨‍💻 37 · 🔀 1.2K · 📥 140 · 📦 60K · 📋 280 - 25% open · ⏱️ 18.10.2025):
git clone https://github.com/sloria/TextBlob
- [PyPi](https://pypi.org/project/textblob) (📥 1.5M / month · 📦 400 · ⏱️ 13.01.2025):
pip install textblob
- [Conda](https://anaconda.org/conda-forge/textblob) (📥 340K · ⏱️ 22.04.2025):
conda install -c conda-forge textblob
fairseq (🥈36 · ⭐ 32K) - Facebook AI Research Sequence-to-Sequence Toolkit written in Python. MIT - [GitHub](https://github.com/facebookresearch/fairseq) (👨‍💻 430 · 🔀 6.6K · 📥 440 · 📦 4.4K · 📋 4.4K - 30% open · ⏱️ 30.09.2025):
git clone https://github.com/facebookresearch/fairseq
- [PyPi](https://pypi.org/project/fairseq) (📥 77K / month · 📦 120 · ⏱️ 27.06.2022):
pip install fairseq
- [Conda](https://anaconda.org/conda-forge/fairseq) (📥 170K · ⏱️ 02.10.2025):
conda install -c conda-forge fairseq
stanza (🥈36 · ⭐ 7.6K) - Stanford NLP Python library for tokenization, sentence segmentation,.. Apache-2 - [GitHub](https://github.com/stanfordnlp/stanza) (👨‍💻 72 · 🔀 920 · 📦 4.1K · 📋 950 - 10% open · ⏱️ 05.10.2025):
git clone https://github.com/stanfordnlp/stanza
- [PyPi](https://pypi.org/project/stanza) (📥 770K / month · 📦 240 · ⏱️ 05.10.2025):
pip install stanza
- [Conda](https://anaconda.org/stanfordnlp/stanza) (📥 9K · ⏱️ 25.03.2025):
conda install -c stanfordnlp stanza
qdrant (🥈35 · ⭐ 27K) - Qdrant - High-performance, massive-scale Vector Database and Vector.. Apache-2 - [GitHub](https://github.com/qdrant/qdrant) (👨‍💻 140 · 🔀 1.9K · 📥 500K · 📦 120 · 📋 1.6K - 22% open · ⏱️ 30.09.2025):
git clone https://github.com/qdrant/qdrant
spark-nlp (🥈35 · ⭐ 4.1K) - State of the Art Natural Language Processing. Apache-2 - [GitHub](https://github.com/JohnSnowLabs/spark-nlp) (👨‍💻 120 · 🔀 730 · 📦 620 · 📋 910 - 2% open · ⏱️ 22.10.2025):
git clone https://github.com/JohnSnowLabs/spark-nlp
- [PyPi](https://pypi.org/project/spark-nlp) (📥 1M / month · 📦 39 · ⏱️ 22.10.2025):
pip install spark-nlp
Rasa (🥈34 · ⭐ 21K) - Open source machine learning framework to automate text- and voice-.. Apache-2 - [GitHub](https://github.com/RasaHQ/rasa) (👨‍💻 600 · 🔀 4.9K · 📋 6.8K - 2% open · ⏱️ 26.08.2025):
git clone https://github.com/RasaHQ/rasa
- [PyPi](https://pypi.org/project/rasa) (📥 110K / month · 📦 60 · ⏱️ 14.01.2025):
pip install rasa
TensorFlow Text (🥈34 · ⭐ 1.3K) - Making text a first-class citizen in TensorFlow. Apache-2 - [GitHub](https://github.com/tensorflow/text) (👨‍💻 190 · 🔀 360 · 📦 10K · 📋 370 - 53% open · ⏱️ 18.08.2025):
git clone https://github.com/tensorflow/text
- [PyPi](https://pypi.org/project/tensorflow-text) (📥 6.8M / month · 📦 230 · ⏱️ 04.04.2025):
pip install tensorflow-text
snowballstemmer (🥈34 · ⭐ 810) - Snowball compiler and stemming algorithms. BSD-3 - [GitHub](https://github.com/snowballstem/snowball) (👨‍💻 41 · 🔀 190 · 📦 11 · 📋 120 - 17% open · ⏱️ 28.10.2025):
git clone https://github.com/snowballstem/snowball
- [PyPi](https://pypi.org/project/snowballstemmer) (📥 24M / month · 📦 550 · ⏱️ 09.05.2025):
pip install snowballstemmer
- [Conda](https://anaconda.org/conda-forge/snowballstemmer) (📥 11M · ⏱️ 20.05.2025):
conda install -c conda-forge snowballstemmer
torchtext (🥈32 · ⭐ 3.6K) - Models, data loaders and abstractions for language processing,.. BSD-3 - [GitHub](https://github.com/pytorch/text) (👨‍💻 160 · 🔀 810 · 📋 850 - 38% open · ⏱️ 10.09.2025):
git clone https://github.com/pytorch/text
- [PyPi](https://pypi.org/project/torchtext) (📥 730K / month · 📦 280 · ⏱️ 24.04.2024):
pip install torchtext
jellyfish (🥈32 · ⭐ 2.2K) - a python library for doing approximate and phonetic matching of strings. MIT - [GitHub](https://github.com/jamesturk/jellyfish) (👨‍💻 37 · 🔀 160 · 📦 15K · ⏱️ 11.10.2025):
git clone https://github.com/jamesturk/jellyfish
- [PyPi](https://pypi.org/project/jellyfish) (📥 8.6M / month · 📦 320 · ⏱️ 11.10.2025):
pip install jellyfish
- [Conda](https://anaconda.org/conda-forge/jellyfish) (📥 1.7M · ⏱️ 22.04.2025):
conda install -c conda-forge jellyfish
DeepPavlov (🥈31 · ⭐ 6.9K · 💤) - An open source library for deep learning end-to-end.. Apache-2 - [GitHub](https://github.com/deeppavlov/DeepPavlov) (👨‍💻 78 · 🔀 1.2K · 📦 440 · 📋 640 - 4% open · ⏱️ 26.11.2024):
git clone https://github.com/deeppavlov/DeepPavlov
- [PyPi](https://pypi.org/project/deeppavlov) (📥 11K / month · 📦 4 · ⏱️ 12.08.2024):
pip install deeppavlov
ftfy (🥈31 · ⭐ 4K · 💤) - Fixes mojibake and other glitches in Unicode text, after the fact. Apache-2 - [GitHub](https://github.com/rspeer/python-ftfy) (👨‍💻 22 · 🔀 120 · 📥 100 · 📦 33K · 📋 150 - 7% open · ⏱️ 30.10.2024):
git clone https://github.com/rspeer/python-ftfy
- [PyPi](https://pypi.org/project/ftfy) (📥 11M / month · 📦 570 · ⏱️ 26.10.2024):
pip install ftfy
- [Conda](https://anaconda.org/conda-forge/ftfy) (📥 380K · ⏱️ 22.04.2025):
conda install -c conda-forge ftfy
SciSpacy (🥈31 · ⭐ 1.9K) - A full spaCy pipeline and models for scientific/biomedical documents. Apache-2 - [GitHub](https://github.com/allenai/scispacy) (👨‍💻 38 · 🔀 240 · 📦 1.3K · 📋 330 - 11% open · ⏱️ 01.10.2025):
git clone https://github.com/allenai/scispacy
- [PyPi](https://pypi.org/project/scispacy) (📥 42K / month · 📦 50 · ⏱️ 01.10.2025):
pip install scispacy
CLTK (🥈31 · ⭐ 870 · 📉) - The Classical Language Toolkit. MIT - [GitHub](https://github.com/cltk/cltk) (👨‍💻 120 · 🔀 340 · 📥 160 · 📦 300 · 📋 580 - 0% open · ⏱️ 21.10.2025):
git clone https://github.com/cltk/cltk
- [PyPi](https://pypi.org/project/cltk) (📥 14K / month · 📦 17 · ⏱️ 21.10.2025):
pip install cltk
english-words (🥈29 · ⭐ 12K · 💤) - A text file containing 479k English words for all your.. Unlicense - [GitHub](https://github.com/dwyl/english-words) (👨‍💻 34 · 🔀 2K · 📦 2 · 📋 170 - 75% open · ⏱️ 06.01.2025):
git clone https://github.com/dwyl/english-words
- [PyPi](https://pypi.org/project/english-words) (📥 78K / month · 📦 15 · ⏱️ 14.08.2025):
pip install english-words
rubrix (🥈29 · ⭐ 4.7K) - Argilla is a collaboration tool for AI engineers and domain experts.. Apache-2 - [GitHub](https://github.com/argilla-io/argilla) (👨‍💻 110 · 🔀 460 · 📦 3.1K · 📋 2.2K - 0% open · ⏱️ 05.08.2025):
git clone https://github.com/argilla-io/argilla
- [PyPi](https://pypi.org/project/rubrix) (📥 1.2K / month · ⏱️ 24.10.2022):
pip install rubrix
- [Conda](https://anaconda.org/conda-forge/rubrix) (📥 52K · ⏱️ 22.04.2025):
conda install -c conda-forge rubrix
Dedupe (🥈29 · ⭐ 4.4K · 📈) - A python library for accurate and scalable fuzzy matching, record.. MIT - [GitHub](https://github.com/dedupeio/dedupe) (👨‍💻 72 · 🔀 560 · 📦 370 · 📋 820 - 9% open · ⏱️ 29.07.2025):
git clone https://github.com/dedupeio/dedupe
- [PyPi](https://pypi.org/project/dedupe) (📥 59K / month · 📦 19 · ⏱️ 15.08.2024):
pip install dedupe
- [Conda](https://anaconda.org/conda-forge/dedupe) (📥 130K · ⏱️ 22.04.2025):
conda install -c conda-forge dedupe
TextDistance (🥈28 · ⭐ 3.5K) - Compute distance between sequences. 30+ algorithms, pure python.. MIT - [GitHub](https://github.com/life4/textdistance) (👨‍💻 18 · 🔀 260 · 📥 1.1K · 📦 8.8K · ⏱️ 18.04.2025):
git clone https://github.com/life4/textdistance
- [PyPi](https://pypi.org/project/textdistance) (📥 1.3M / month · 📦 99 · ⏱️ 16.07.2024):
pip install textdistance
- [Conda](https://anaconda.org/conda-forge/textdistance) (📥 970K · ⏱️ 22.04.2025):
conda install -c conda-forge textdistance
spacy-transformers (🥈28 · ⭐ 1.4K) - Use pretrained transformers like BERT, XLNet and GPT-2.. MIT spacy - [GitHub](https://github.com/explosion/spacy-transformers) (👨‍💻 23 · 🔀 170 · 📥 610 · 📦 2.4K · ⏱️ 26.05.2025):
git clone https://github.com/explosion/spacy-transformers
- [PyPi](https://pypi.org/project/spacy-transformers) (📥 270K / month · 📦 110 · ⏱️ 26.05.2025):
pip install spacy-transformers
- [Conda](https://anaconda.org/conda-forge/spacy-transformers) (📥 140K · ⏱️ 22.04.2025):
conda install -c conda-forge spacy-transformers
detoxify (🥉26 · ⭐ 1.1K) - Trained models & code to predict toxic comments on all 3 Jigsaw.. Apache-2 - [GitHub](https://github.com/unitaryai/detoxify) (👨‍💻 14 · 🔀 130 · 📥 1.9M · 📦 980 · 📋 67 - 55% open · ⏱️ 29.07.2025):
git clone https://github.com/unitaryai/detoxify
- [PyPi](https://pypi.org/project/detoxify) (📥 140K / month · 📦 30 · ⏱️ 01.02.2024):
pip install detoxify
scattertext (🥉25 · ⭐ 2.3K) - Beautiful visualizations of how language differs among document.. Apache-2 - [GitHub](https://github.com/JasonKessler/scattertext) (👨‍💻 14 · 🔀 290 · 📦 670 · 📋 100 - 22% open · ⏱️ 29.04.2025):
git clone https://github.com/JasonKessler/scattertext
- [PyPi](https://pypi.org/project/scattertext) (📥 7.5K / month · 📦 5 · ⏱️ 23.09.2024):
pip install scattertext
- [Conda](https://anaconda.org/conda-forge/scattertext) (📥 140K · ⏱️ 22.04.2025):
conda install -c conda-forge scattertext
T5 (🥉24 · ⭐ 6.4K) - Code for the paper Exploring the Limits of Transfer Learning with a.. Apache-2 - [GitHub](https://github.com/google-research/text-to-text-transfer-transformer) (👨‍💻 61 · 🔀 780 · 📋 450 - 23% open · ⏱️ 28.04.2025):
git clone https://github.com/google-research/text-to-text-transfer-transformer
- [PyPi](https://pypi.org/project/t5) (📥 83K / month · 📦 2 · ⏱️ 18.10.2021):
pip install t5
DeepKE (🥉24 · ⭐ 4.2K) - [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and.. MIT - [GitHub](https://github.com/zjunlp/DeepKE) (👨‍💻 34 · 🔀 730 · 📦 25 · ⏱️ 19.07.2025):
git clone https://github.com/zjunlp/deepke
- [PyPi](https://pypi.org/project/deepke) (📥 950 / month · ⏱️ 21.09.2023):
pip install deepke
sense2vec (🥉24 · ⭐ 1.7K) - Contextually-keyed word vectors. MIT - [GitHub](https://github.com/explosion/sense2vec) (👨‍💻 20 · 🔀 240 · 📥 73K · 📦 470 · 📋 120 - 20% open · ⏱️ 23.04.2025):
git clone https://github.com/explosion/sense2vec
- [PyPi](https://pypi.org/project/sense2vec) (📥 3.4K / month · 📦 13 · ⏱️ 19.04.2021):
pip install sense2vec
- [Conda](https://anaconda.org/conda-forge/sense2vec) (📥 67K · ⏱️ 22.04.2025):
conda install -c conda-forge sense2vec
finetune (🥉23 · ⭐ 720) - Scikit-learn style model finetuning for NLP. MPL-2.0 - [GitHub](https://github.com/IndicoDataSolutions/finetune) (👨‍💻 24 · 🔀 79 · 📦 16 · 📋 190 - 39% open · ⏱️ 21.10.2025):
git clone https://github.com/IndicoDataSolutions/finetune
- [PyPi](https://pypi.org/project/finetune) (📥 2.7K / month · 📦 2 · ⏱️ 29.09.2023):
pip install finetune
happy-transformer (🥉23 · ⭐ 540 · 💤) - Happy Transformer makes it easy to fine-tune and.. Apache-2 huggingface - [GitHub](https://github.com/EricFillion/happy-transformer) (👨‍💻 14 · 🔀 69 · 📦 330 · 📋 130 - 16% open · ⏱️ 22.03.2025):
git clone https://github.com/EricFillion/happy-transformer
- [PyPi](https://pypi.org/project/happytransformer) (📥 2.7K / month · 📦 5 · ⏱️ 05.08.2023):
pip install happytransformer
Sockeye (🥉21 · ⭐ 1.2K · 💤) - Sequence-to-sequence framework with a focus on Neural.. Apache-2 - [GitHub](https://github.com/awslabs/sockeye) (👨‍💻 60 · 🔀 320 · 📥 21 · 📋 310 - 3% open · ⏱️ 24.10.2024):
git clone https://github.com/awslabs/sockeye
- [PyPi](https://pypi.org/project/sockeye) (📥 580 / month · ⏱️ 03.03.2023):
pip install sockeye
UForm (🥉21 · ⭐ 1.2K) - Pocket-Sized Multimodal AI for content understanding and.. Apache-2 - [GitHub](https://github.com/unum-cloud/UForm) (👨‍💻 21 · 🔀 76 · 📥 710 · 📦 36 · 📋 39 - 38% open · ⏱️ 03.09.2025):
git clone https://github.com/unum-cloud/uform
- [PyPi](https://pypi.org/project/uform) (📥 490 / month · 📦 2 · ⏱️ 03.09.2025):
pip install uform
small-text (🥉20 · ⭐ 630) - Active Learning for Text Classification in Python. MIT - [GitHub](https://github.com/webis-de/small-text) (👨‍💻 10 · 🔀 76 · 📦 34 · 📋 74 - 28% open · ⏱️ 28.10.2025):
git clone https://github.com/webis-de/small-text
- [PyPi](https://pypi.org/project/small-text) (📥 390 / month · ⏱️ 17.08.2025):
pip install small-text
- [Conda](https://anaconda.org/conda-forge/small-text) (📥 19K · ⏱️ 17.08.2025):
conda install -c conda-forge small-text
textaugment (🥉19 · ⭐ 430) - TextAugment: Text Augmentation Library. MIT - [GitHub](https://github.com/dsfsi/textaugment) (👨‍💻 10 · 🔀 60 · 📥 140 · 📦 180 · 📋 29 - 37% open · ⏱️ 09.09.2025):
git clone https://github.com/dsfsi/textaugment
- [PyPi](https://pypi.org/project/textaugment) (📥 4.2K / month · 📦 4 · ⏱️ 16.11.2023):
pip install textaugment
VizSeq (🥉15 · ⭐ 450) - An Analysis Toolkit for Natural Language Generation (Translation,.. MIT - [GitHub](https://github.com/facebookresearch/vizseq) (👨‍💻 4 · 🔀 61 · 📦 13 · 📋 16 - 43% open · ⏱️ 24.06.2025):
git clone https://github.com/facebookresearch/vizseq
- [PyPi](https://pypi.org/project/vizseq) (📥 120 / month · ⏱️ 07.08.2020):
pip install vizseq
59 hidden projects:
- AllenNLP (🥈36 · ⭐ 12K · 💀) - An open-source NLP research library, built on PyTorch. Apache-2
- fastText (🥈34 · ⭐ 26K · 💀) - Library for fast text representation and classification. MIT
- OpenNMT (🥈33 · ⭐ 7K · 💀) - Open Source Neural Machine Translation and (Large) Language Models.. MIT
- ParlAI (🥈32 · ⭐ 11K · 💀) - A framework for training and evaluating AI models on a variety of.. MIT
- fuzzywuzzy (🥈31 · ⭐ 9.3K · 💀) - Fuzzy String Matching in Python. ❗️GPL-2.0
- Sumy (🥈30 · ⭐ 3.6K · 💀) - Module for automatic summarization of text documents and HTML pages. Apache-2
- underthesea (🥈30 · ⭐ 1.6K) - Underthesea - Vietnamese NLP Toolkit. ❗️GPL-3.0
- nlpaug (🥈29 · ⭐ 4.6K · 💀) - Data augmentation for NLP. MIT
- vaderSentiment (🥈28 · ⭐ 4.9K · 💀) - VADER Sentiment Analysis. VADER (Valence Aware Dictionary.. MIT
- textacy (🥈28 · ⭐ 2.2K · 💀) - NLP, before and after spaCy. ❗Unlicensed
- PyTextRank (🥈28 · ⭐ 2.2K · 💀) - Python implementation of TextRank algorithms (textgraphs) for.. MIT
- Ciphey (🥉27 · ⭐ 20K · 💀) - Automatically decrypt encryptions without knowing the key or cipher,.. MIT
- fastNLP (🥉27 · ⭐ 3.1K · 💀) - fastNLP: A Modularized and Extensible NLP Framework. Currently.. Apache-2
- polyglot (🥉27 · ⭐ 2.3K · 💀) - Multilingual text (NLP) processing toolkit. ❗️GPL-3.0
- flashtext (🥉26 · ⭐ 5.7K · 💀) - Extract Keywords from sentence or Replace keywords in sentences. MIT
- langid (🥉26 · ⭐ 2.4K · 💀) - Stand-alone language identification system. BSD-3
- pySBD (🥉26 · ⭐ 880 · 💀) - pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence.. MIT
- neuralcoref (🥉25 · ⭐ 2.9K · 💀) - Fast Coreference Resolution in spaCy with Neural Networks. MIT
- GluonNLP (🥉25 · ⭐ 2.6K · 💀) - Toolkit that enables easy text preprocessing, datasets.. Apache-2
- pytorch-nlp (🥉25 · ⭐ 2.2K · 💀) - Basic Utilities for PyTorch Natural Language Processing.. BSD-3
- whoosh (🥉25 · ⭐ 640 · 💀) - Pure-Python full-text search library. ❗️BSD-1-Clause
- PyText (🥉24 · ⭐ 6.3K · 💀) - A natural language modeling framework based on PyTorch. BSD-3
- textgenrnn (🥉24 · ⭐ 4.9K · 💀) - Easily train your own text-generating neural network of any.. MIT
- OpenPrompt (🥉24 · ⭐ 4.7K · 💀) - An Open-Source Framework for Prompt-Learning. Apache-2
- Snips NLU (🥉24 · ⭐ 3.9K · 💀) - Snips Python library to extract meaning from text. Apache-2
- MatchZoo (🥉24 · ⭐ 3.9K · 💀) - Facilitating the design, comparison and sharing of deep.. Apache-2
- promptsource (🥉24 · ⭐ 3K · 💀) - Toolkit for creating, sharing and using natural language.. Apache-2
- YouTokenToMe (🥉24 · ⭐ 970 · 💀) - Unsupervised text tokenizer focused on computational efficiency. MIT
- Kashgari (🥉23 · ⭐ 2.4K · 💀) - Kashgari is a production-level NLP Transfer learning.. Apache-2
- FARM (🥉23 · ⭐ 1.8K · 💀) - Fast & easy transfer learning for NLP. Harvesting language.. Apache-2
- gpt-2-simple (🥉22 · ⭐ 3.4K · 💀) - Python package to easily retrain OpenAI's GPT-2 text-.. MIT
- Texar (🥉22 · ⭐ 2.4K · 💀) - Toolkit for Machine Learning, Natural Language Processing, and.. Apache-2
- jiant (🥉22 · ⭐ 1.7K · 💀) - jiant is an nlp toolkit. MIT
- stop-words (🥉22 · ⭐ 160) - Get list of common stop words in various languages in Python. BSD-3
- NLP Architect (🥉21 · ⭐ 2.9K · 💀) - A model library for exploring state-of-the-art deep.. Apache-2
- Texthero (🥉21 · ⭐ 2.9K · 💀) - Text preprocessing, representation and visualization from zero to.. MIT
- anaGo (🥉21 · ⭐ 1.5K · 💀) - Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition,.. MIT
- lightseq (🥉20 · ⭐ 3.3K · 💀) - LightSeq: A High Performance Library for Sequence Processing.. Apache-2
- fast-bert (🥉20 · ⭐ 1.9K · 💀) - Super easy library for BERT based NLP models. Apache-2
- DELTA (🥉20 · ⭐ 1.6K · 💀) - DELTA is a deep learning based natural language and speech.. Apache-2
- textpipe (🥉20 · ⭐ 300 · 💀) - Textpipe: clean and extract metadata from text. MIT
- numerizer (🥉19 · ⭐ 230 · 💀) - A Python module to convert natural language numerics into ints and.. MIT
- pyfasttext (🥉19 · ⭐ 230 · 💀) - Yet another Python binding for fastText. ❗️GPL-3.0
- DeepMatcher (🥉18 · ⭐ 5.2K · 💀) - Python package for performing Entity and Text Matching using.. BSD-3
- nboost (🥉18 · ⭐ 670 · 💀) - NBoost is a scalable, search-api-boosting platform for deploying.. Apache-2
- fastT5 (🥉18 · ⭐ 590 · 💀) - boost inference speed of T5 models by 5x & reduce the model size.. Apache-2
- Camphr (🥉18 · ⭐ 340 · 💀) - Camphr - NLP library for creating pipeline components. Apache-2 spacy
- NeuroNER (🥉17 · ⭐ 1.7K · 💀) - Named-entity recognition using neural networks. Easy-to-use and.. MIT
- OpenNRE (🥉16 · ⭐ 4.4K · 💀) - An Open-Source Package for Neural Relation Extraction (NRE). MIT
- BLINK (🥉15 · ⭐ 1.2K · 💀) - Entity Linker solution. MIT
- TextBox (🥉15 · ⭐ 1.1K · 💀) - TextBox 2.0 is a text generation library with pre-trained language.. MIT
- Translate (🥉15 · ⭐ 830 · 💀) - Translate - a PyTorch Language Library. BSD-3
- skift (🥉15 · ⭐ 240 · 💀) - scikit-learn wrappers for Python fastText. MIT
- ONNX-T5 (🥉14 · ⭐ 260 · 💀) - Summarization, translation, sentiment-analysis, text-generation.. Apache-2
- NeuralQA (🥉14 · ⭐ 230 · 💀) - NeuralQA: A Usable Library for Question Answering on Large Datasets.. MIT
- TransferNLP (🥉13 · ⭐ 290 · 💀) - NLP library designed for reproducible experimentation.. MIT
- Headliner (🥉13 · ⭐ 230 · 💀) - Easy training and deployment of seq2seq models. MIT
- textvec (🥉12 · ⭐ 200 · 💀) - Text vectorization tool to outperform TFIDF for classification.. MIT
- spacy-dbpedia-spotlight (🥉12 · ⭐ 110 · 💀) - A spaCy wrapper for DBpedia Spotlight. MIT spacy
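
The ftfy entry in this category describes fixing "mojibake", text that was decoded with the wrong character encoding. The sketch below reproduces and reverses one common case (UTF-8 bytes read as Latin-1) using only the Python standard library. It does not use or imitate ftfy's API, and the function names `make_mojibake` and `undo_mojibake` are hypothetical; a real tool has to detect which mistake happened, which this example simply assumes is known.

```python
def make_mojibake(text: str) -> str:
    """Simulate the bug: encode as UTF-8, then wrongly decode as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_mojibake(garbled: str) -> str:
    """Reverse that one specific mistake; return the input unchanged if it does not apply."""
    try:
        return garbled.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by the UTF-8-as-Latin-1 mix-up.
        return garbled


original = "café"
garbled = make_mojibake(original)

print(garbled)                 # the accented letter becomes two characters
print(undo_mojibake(garbled))  # the original text is recovered
```

Text that is already correct passes through untouched, because re-encoding it as Latin-1 either fails or yields bytes that are not valid UTF-8.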


Image Data

Libraries for image & video processing, manipulation, and augmentation as well as libraries for computer vision tasks such as facial recognition, object detection, and classification.
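
To make "augmentation" concrete, here is a minimal sketch of two geometric augmentations on a tiny grayscale image stored as a nested list. It is plain Python and an assumption-laden toy: the function names are invented for this example and none of it reflects the API of the libraries listed in this section, which operate on real image arrays and offer far more transforms.

```python
def flip_horizontal(image):
    """Mirror every row left to right."""
    return [row[::-1] for row in image]


def rotate_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    # Reversing the row order and then transposing yields a clockwise rotation.
    return [list(column) for column in zip(*image[::-1])]


def augment(image):
    """Return the original image plus two augmented variants."""
    return [image, flip_horizontal(image), rotate_clockwise(image)]


# A 2x3 "image" of pixel intensities, used only for demonstration.
image = [[1, 2, 3],
         [4, 5, 6]]

for variant in augment(image):
    print(variant)
```

Each call returns a new nested list, so the source image is left unmodified and can be reused for further variants.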

Pillow (🥇49 · ⭐ 13K) - Python Imaging Library (Fork). ❗️PIL - [GitHub](https://github.com/python-pillow/Pillow) (👨‍💻 490 · 🔀 2.3K · 📦 2.4M · 📋 3.4K - 3% open · ⏱️ 27.10.2025):
git clone https://github.com/python-pillow/Pillow
- [PyPi](https://pypi.org/project/Pillow) (📥 220M / month · 📦 20K · ⏱️ 15.10.2025):
pip install Pillow
- [Conda](https://anaconda.org/conda-forge/pillow) (📥 62M · ⏱️ 28.10.2025):
conda install -c conda-forge pillow
PyTorch Image Models (🥇42 · ⭐ 36K) - The largest collection of PyTorch image encoders /.. Apache-2 - [GitHub](https://github.com/huggingface/pytorch-image-models) (👨‍💻 180 · 🔀 5.1K · 📥 8.4M · 📦 62K · 📋 1K - 4% open · ⏱️ 30.10.2025):
git clone https://github.com/huggingface/pytorch-image-models
- [PyPi](https://pypi.org/project/timm) (📥 11M / month · 📦 1.5K · ⏱️ 24.10.2025):
pip install timm
- [Conda](https://anaconda.org/conda-forge/timm) (📥 470K · ⏱️ 24.10.2025):
conda install -c conda-forge timm
torchvision (🥇42 · ⭐ 17K) - Datasets, Transforms and Models specific to Computer Vision. BSD-3 - [GitHub](https://github.com/pytorch/vision) (👨‍💻 660 · 🔀 7.2K · 📥 41K · 📦 21 · 📋 3.8K - 30% open · ⏱️ 27.10.2025):
git clone https://github.com/pytorch/vision
- [PyPi](https://pypi.org/project/torchvision) (📥 26M / month · 📦 8.4K · ⏱️ 15.10.2025):
pip install torchvision
- [Conda](https://anaconda.org/conda-forge/torchvision) (📥 3.1M · ⏱️ 23.10.2025):
conda install -c conda-forge torchvision
MoviePy (🥇42 · ⭐ 14K) - Video editing with Python. MIT - [GitHub](https://github.com/Zulko/moviepy) (👨‍💻 190 · 🔀 1.9K · 📦 67K · 📋 1.7K - 3% open · ⏱️ 25.09.2025):
git clone https://github.com/Zulko/moviepy
- [PyPi](https://pypi.org/project/moviepy) (📥 4.3M / month · 📦 1.2K · ⏱️ 21.05.2025):
pip install moviepy
- [Conda](https://anaconda.org/conda-forge/moviepy) (📥 360K · ⏱️ 22.04.2025):
conda install -c conda-forge moviepy
Kornia (🥇39 · ⭐ 11K) - Geometric Computer Vision Library for Spatial AI. Apache-2 - [GitHub](https://github.com/kornia/kornia) (👨‍💻 300 · 🔀 1.1K · 📥 2.2K · 📦 17K · 📋 1K - 32% open · ⏱️ 30.10.2025):
git clone https://github.com/kornia/kornia
- [PyPi](https://pypi.org/project/kornia) (📥 3M / month · 📦 340 · ⏱️ 08.05.2025):
pip install kornia
- [Conda](https://anaconda.org/conda-forge/kornia) (📥 260K · ⏱️ 08.05.2025):
conda install -c conda-forge kornia
imageio (🥇39 · ⭐ 1.7K) - Python library for reading and writing image data. BSD-2 - [GitHub](https://github.com/imageio/imageio) (👨‍💻 130 · 🔀 330 · 📥 1.9K · 📦 180K · 📋 620 - 16% open · ⏱️ 24.10.2025):
git clone https://github.com/imageio/imageio
- [PyPi](https://pypi.org/project/imageio) (📥 36M / month · 📦 2.6K · ⏱️ 20.01.2025):
pip install imageio
- [Conda](https://anaconda.org/conda-forge/imageio) (📥 8.5M · ⏱️ 22.04.2025):
conda install -c conda-forge imageio
deepface (🥈38 · ⭐ 21K · 📉) - A Lightweight Face Recognition and Facial Attribute Analysis (Age,.. MIT - [GitHub](https://github.com/serengil/deepface) (👨‍💻 96 · 🔀 2.8K · 📦 8.4K · 📋 1.2K - 0% open · ⏱️ 21.10.2025):
git clone https://github.com/serengil/deepface
- [PyPi](https://pypi.org/project/deepface) (📥 280K / month · 📦 78 · ⏱️ 05.08.2025):
pip install deepface
InsightFace (🥈37 · ⭐ 27K) - State-of-the-art 2D and 3D Face Analysis Project. MIT - [GitHub](https://github.com/deepinsight/insightface) (👨‍💻 67 · 🔀 5.7K · 📥 11M · 📦 4.8K · 📋 2.6K - 46% open · ⏱️ 27.09.2025):
git clone https://github.com/deepinsight/insightface
- [PyPi](https://pypi.org/project/insightface) (📥 350K / month · 📦 30 · ⏱️ 17.12.2022):
pip install insightface
Albumentations (🥈36 · ⭐ 15K) - Fast and flexible image augmentation library. Paper about.. MIT - [GitHub](https://github.com/albumentations-team/albumentations) (👨‍💻 170 · 🔀 1.7K · 📋 1.5K - 14% open · ⏱️ 25.06.2025):
git clone https://github.com/albumentations-team/albumentations
- [PyPi](https://pypi.org/project/albumentations) (📥 4.6M / month · 📦 730 · ⏱️ 27.05.2025):
pip install albumentations
- [Conda](https://anaconda.org/conda-forge/albumentations) (📥 340K · ⏱️ 28.05.2025):
conda install -c conda-forge albumentations
opencv-python (🥈36 · ⭐ 5.1K) - Automated CI toolchain to produce precompiled opencv-python,.. MIT - [GitHub](https://github.com/opencv/opencv-python) (👨‍💻 56 · 🔀 950 · 📦 610K · 📋 890 - 19% open · ⏱️ 30.07.2025):
git clone https://github.com/opencv/opencv-python
- [PyPi](https://pypi.org/project/opencv-python) (📥 29M / month · 📦 15K · ⏱️ 07.07.2025):
pip install opencv-python
detectron2 (🥈34 · ⭐ 34K) - Detectron2 is a platform for object detection, segmentation.. Apache-2 - [GitHub](https://github.com/facebookresearch/detectron2) (👨‍💻 280 · 🔀 7.5K · 📦 2.6K · 📋 3.6K - 14% open · ⏱️ 27.10.2025):
git clone https://github.com/facebookresearch/detectron2
- [PyPi](https://pypi.org/project/detectron2) (📦 13 · ⏱️ 06.02.2020):
pip install detectron2
- [Conda](https://anaconda.org/conda-forge/detectron2) (📥 820K · ⏱️ 02.06.2025):
conda install -c conda-forge detectron2
Wand (🥈34 · ⭐ 1.5K) - The ctypes-based simple ImageMagick binding for Python. MIT - [GitHub](https://github.com/emcconville/wand) (👨‍💻 110 · 🔀 200 · 📥 52K · 📦 21K · 📋 440 - 5% open · ⏱️ 06.10.2025):
git clone https://github.com/emcconville/wand
- [PyPi](https://pypi.org/project/wand) (📥 2.2M / month · 📦 260 · ⏱️ 03.11.2023):
pip install wand
- [Conda](https://anaconda.org/conda-forge/wand) (📥 180K · ⏱️ 22.04.2025):
conda install -c conda-forge wand
ImageHash (🥈32 · ⭐ 3.7K) - A Python Perceptual Image Hashing Module. BSD-2 - [GitHub](https://github.com/JohannesBuchner/imagehash) (👨‍💻 29 · 🔀 340 · 📦 18K · 📋 150 - 15% open · ⏱️ 17.04.2025):
git clone https://github.com/JohannesBuchner/imagehash
- [PyPi](https://pypi.org/project/ImageHash) (📥 5.6M / month · 📦 270 · ⏱️ 01.02.2025):
pip install ImageHash
- [Conda](https://anaconda.org/conda-forge/imagehash) (📥 500K · ⏱️ 22.04.2025):
conda install -c conda-forge imagehash
vit-pytorch (🥈31 · ⭐ 24K) - Implementation of Vision Transformer, a simple way to achieve.. MIT - [GitHub](https://github.com/lucidrains/vit-pytorch) (👨‍💻 24 · 🔀 3.4K · 📦 21 · 📋 290 - 49% open · ⏱️ 28.10.2025):
git clone https://github.com/lucidrains/vit-pytorch
- [PyPi](https://pypi.org/project/vit-pytorch) (📥 31K / month · 📦 28 · ⏱️ 27.10.2025):
pip install vit-pytorch
PaddleSeg (🥈31 · ⭐ 9.2K) - Easy-to-use image segmentation library with awesome pre-.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleSeg) (👨‍💻 130 · 🔀 1.7K · 📦 1.5K · 📋 2.2K - 0% open · ⏱️ 10.10.2025):
git clone https://github.com/PaddlePaddle/PaddleSeg
- [PyPi](https://pypi.org/project/paddleseg) (📥 3.8K / month · 📦 7 · ⏱️ 30.11.2022):
pip install paddleseg
sahi (🥈31 · ⭐ 4.9K) - Framework agnostic sliced/tiled inference + interactive ui + error analysis.. MIT - [GitHub](https://github.com/obss/sahi) (👨‍💻 69 · 🔀 700 · 📥 43K · 📦 1.9K · ⏱️ 28.10.2025):
git clone https://github.com/obss/sahi
- [PyPi](https://pypi.org/project/sahi) (📥 140K / month · 📦 43 · ⏱️ 28.09.2025):
pip install sahi
- [Conda](https://anaconda.org/conda-forge/sahi) (📥 120K · ⏱️ 29.09.2025):
conda install -c conda-forge sahi
lightly (🥈31 · ⭐ 3.6K) - A python library for self-supervised learning on images. MIT - [GitHub](https://github.com/lightly-ai/lightly) (👨‍💻 72 · 🔀 310 · 📦 510 · 📋 610 - 12% open · ⏱️ 25.09.2025):
git clone https://github.com/lightly-ai/lightly
- [PyPi](https://pypi.org/project/lightly) (📥 190K / month · 📦 20 · ⏱️ 22.07.2025):
pip install lightly
doctr (🥈29 · ⭐ 5.6K) - docTR (Document Text Recognition) - a seamless, high-.. Apache-2 - [GitHub](https://github.com/mindee/doctr) (👨‍💻 68 · 🔀 580 · 📥 6.5M · 📋 440 - 6% open · ⏱️ 07.09.2025):
git clone https://github.com/mindee/doctr
- [PyPi](https://pypi.org/project/python-doctr) (📥 2M / month · 📦 18 · ⏱️ 09.07.2025):
pip install python-doctr
PaddleDetection (🥉28 · ⭐ 14K) - Object Detection toolkit based on PaddlePaddle. It.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleDetection) (👨‍💻 190 · 🔀 3K · 📋 5.7K - 17% open · ⏱️ 10.10.2025):
git clone https://github.com/PaddlePaddle/PaddleDetection
- [PyPi](https://pypi.org/project/paddledet) (📥 2.2K / month · 📦 2 · ⏱️ 19.09.2022):
pip install paddledet
mtcnn (🥉27 · ⭐ 2.4K · 💤) - MTCNN face detection implementation for TensorFlow, as a PIP.. MIT - [GitHub](https://github.com/ipazc/mtcnn) (👨‍💻 15 · 🔀 530 · 📥 76 · 📦 9.2K · 📋 130 - 37% open · ⏱️ 08.10.2024):
git clone https://github.com/ipazc/mtcnn
- [PyPi](https://pypi.org/project/mtcnn) (📥 210K / month · 📦 73 · ⏱️ 08.10.2024):
pip install mtcnn
- [Conda](https://anaconda.org/conda-forge/mtcnn) (📥 16K · ⏱️ 22.04.2025):
conda install -c conda-forge mtcnn
CellProfiler (🥉27 · ⭐ 1.1K) - An open-source application for biological image analysis. BSD-3 - [GitHub](https://github.com/CellProfiler/CellProfiler) (👨‍💻 150 · 🔀 410 · 📥 24K · 📦 28 · 📋 3.4K - 10% open · ⏱️ 24.09.2025):
git clone https://github.com/CellProfiler/CellProfiler
- [PyPi](https://pypi.org/project/cellprofiler) (📥 1.6K / month · 📦 2 · ⏱️ 16.09.2024):
pip install cellprofiler
mahotas (🥉27 · ⭐ 880) - Computer Vision in Python. MIT - [GitHub](https://github.com/luispedro/mahotas) (👨‍💻 35 · 🔀 150 · 📦 1.6K · 📋 92 - 21% open · ⏱️ 05.08.2025):
git clone https://github.com/luispedro/mahotas
- [PyPi](https://pypi.org/project/mahotas) (📥 42K / month · 📦 63 · ⏱️ 17.07.2024):
pip install mahotas
- [Conda](https://anaconda.org/conda-forge/mahotas) (📥 790K · ⏱️ 21.10.2025):
conda install -c conda-forge mahotas
Image Deduplicator (🥉26 · ⭐ 5.5K) - Finding duplicate images made easy!. Apache-2 - [GitHub](https://github.com/idealo/imagededup) (👨‍💻 19 · 🔀 460 · 📥 29 · 📦 200 · 📋 140 - 25% open · ⏱️ 15.08.2025):
git clone https://github.com/idealo/imagededup
- [PyPi](https://pypi.org/project/imagededup) (📥 69K / month · 📦 29 · ⏱️ 15.08.2025):
pip install imagededup
tensorflow-graphics (🥉26 · ⭐ 2.8K · 💤) - TensorFlow Graphics: Differentiable Graphics Layers.. Apache-2 - [GitHub](https://github.com/tensorflow/graphics) (👨‍💻 39 · 🔀 370 · 📋 240 - 60% open · ⏱️ 03.02.2025):
git clone https://github.com/tensorflow/graphics
- [PyPi](https://pypi.org/project/tensorflow-graphics) (📥 61K / month · 📦 11 · ⏱️ 03.12.2021):
pip install tensorflow-graphics
Norfair (🥉26 · ⭐ 2.5K) - Lightweight Python library for adding real-time multi-object tracking.. BSD-3 - [GitHub](https://github.com/tryolabs/norfair) (👨‍💻 32 · 🔀 260 · 📥 360 · 📦 340 · 📋 180 - 16% open · ⏱️ 30.04.2025):
git clone https://github.com/tryolabs/norfair
- [PyPi](https://pypi.org/project/norfair) (📥 44K / month · 📦 9 · ⏱️ 30.04.2025):
pip install norfair
pyvips (🥉26 · ⭐ 740) - python binding for libvips using cffi. MIT - [GitHub](https://github.com/libvips/pyvips) (👨‍💻 17 · 🔀 53 · 📦 1.2K · 📋 670 - 29% open · ⏱️ 04.09.2025):
git clone https://github.com/libvips/pyvips
- [PyPi](https://pypi.org/project/pyvips) (📥 190K / month · 📦 94 · ⏱️ 28.04.2025):
pip install pyvips
- [Conda](https://anaconda.org/conda-forge/pyvips) (📥 260K · ⏱️ 04.09.2025):
conda install -c conda-forge pyvips
pytorchvideo (🥉25 · ⭐ 3.5K) - A deep learning library for video understanding research. Apache-2 - [GitHub](https://github.com/facebookresearch/pytorchvideo) (👨‍💻 59 · 🔀 420 · 📋 210 - 50% open · ⏱️ 27.10.2025):
git clone https://github.com/facebookresearch/pytorchvideo
- [PyPi](https://pypi.org/project/pytorchvideo) (📥 53K / month · 📦 24 · ⏱️ 20.01.2022):
pip install pytorchvideo
MMF (🥉24 · ⭐ 5.6K) - A modular framework for vision & language multimodal research from.. BSD-3 - [GitHub](https://github.com/facebookresearch/mmf) (👨‍💻 120 · 🔀 920 · 📦 23 · 📋 690 - 21% open · ⏱️ 24.04.2025):
git clone https://github.com/facebookresearch/mmf
- [PyPi](https://pypi.org/project/mmf) (📥 190 / month · 📦 1 · ⏱️ 12.06.2020):
pip install mmf
kubric (🥉22 · ⭐ 2.6K) - A data generation pipeline for creating semi-realistic synthetic.. Apache-2 - [GitHub](https://github.com/google-research/kubric) (👨‍💻 32 · 🔀 250 · 📦 7 · 📋 200 - 35% open · ⏱️ 06.05.2025):
git clone https://github.com/google-research/kubric
- [PyPi](https://pypi.org/project/kubric-nightly) (📥 6.6K / month · ⏱️ 27.12.2023):
pip install kubric-nightly
icevision (🥉22 · ⭐ 870 · 💤) - An Agnostic Computer Vision Framework - Pluggable to any.. Apache-2 - [GitHub](https://github.com/airctic/icevision) (👨‍💻 41 · 🔀 130 · 📋 570 - 10% open · ⏱️ 31.10.2024):
git clone https://github.com/airctic/icevision
- [PyPi](https://pypi.org/project/icevision) (📥 2.3K / month · 📦 6 · ⏱️ 10.02.2022):
pip install icevision
PySlowFast (🥉21 · ⭐ 7.2K) - PySlowFast: video understanding codebase from FAIR for.. Apache-2 - [GitHub](https://github.com/facebookresearch/SlowFast) (👨‍💻 35 · 🔀 1.2K · 📦 23 · 📋 720 - 59% open · ⏱️ 27.10.2025):
git clone https://github.com/facebookresearch/SlowFast
- [PyPi](https://pypi.org/project/pyslowfast) (📥 22 / month · ⏱️ 15.01.2020):
pip install pyslowfast
Image Super-Resolution (🥉21 · ⭐ 4.8K · 💤) - Super-scale your images and run experiments with.. Apache-2 - [GitHub](https://github.com/idealo/image-super-resolution) (👨‍💻 11 · 🔀 760 · 📋 220 - 48% open · ⏱️ 18.12.2024):
git clone https://github.com/idealo/image-super-resolution
- [PyPi](https://pypi.org/project/ISR) (📥 3.9K / month · 📦 5 · ⏱️ 08.01.2020):
pip install ISR
- [Docker Hub](https://hub.docker.com/r/idealo/image-super-resolution-gpu) (📥 290 · ⭐ 1 · ⏱️ 01.04.2019):
docker pull idealo/image-super-resolution-gpu
Caer (🥉21 · ⭐ 800) - A lightweight Computer Vision library. Scale your models, not boilerplate. MIT - [GitHub](https://github.com/jasmcaus/caer) (👨‍💻 8 · 🔀 110 · 📥 48 · ⏱️ 11.08.2025):
git clone https://github.com/jasmcaus/caer
- [PyPi](https://pypi.org/project/caer) (📥 3.8K / month · 📦 3 · ⏱️ 11.08.2025):
pip install caer
scenic (🥉16 · ⭐ 3.7K) - Scenic: A Jax Library for Computer Vision Research and Beyond. Apache-2 - [GitHub](https://github.com/google-research/scenic) (👨‍💻 95 · 🔀 460 · 📋 400 - 70% open · ⏱️ 06.08.2025):
git clone https://github.com/google-research/scenic
Show 30 hidden projects...
- scikit-image (🥇41 · ⭐ 6.4K · 📈) - Image processing in Python. ❗Unlicensed
- glfw (🥈38 · ⭐ 14K) - A multi-platform library for OpenGL, OpenGL ES, Vulkan, window and input. ❗️Zlib
- MMDetection (🥈37 · ⭐ 32K · 💀) - OpenMMLab Detection Toolbox and Benchmark. Apache-2
- imgaug (🥈36 · ⭐ 15K · 💀) - Image augmentation for machine learning experiments. MIT
- Face Recognition (🥈35 · ⭐ 56K · 💀) - The world's simplest facial recognition api for Python.. MIT
- imutils (🥈31 · ⭐ 4.6K · 💀) - A series of convenience functions to make basic image processing.. MIT
- PyTorch3D (🥈30 · ⭐ 9.6K) - PyTorch3D is FAIR's library of reusable components for.. ❗Unlicensed
- imageai (🥈30 · ⭐ 8.8K · 💀) - A python library built to empower developers to build applications.. MIT
- Face Alignment (🥉28 · ⭐ 7.4K · 💀) - 2D and 3D Face alignment library built using PyTorch. BSD-3
- GluonCV (🥉27 · ⭐ 5.9K · 💀) - Gluon CV Toolkit. Apache-2
- Augmentor (🥉27 · ⭐ 5.1K · 💀) - Image augmentation library in Python for machine learning. MIT
- vidgear (🥉27 · ⭐ 3.6K · 💀) - A High-performance cross-platform Video Processing Python.. Apache-2
- chainercv (🥉27 · ⭐ 1.5K · 💀) - ChainerCV: a Library for Deep Learning in Computer Vision. MIT
- facenet-pytorch (🥉26 · ⭐ 5K · 💀) - Pretrained Pytorch face detection (MTCNN) and facial.. MIT
- Pillow-SIMD (🥉25 · ⭐ 2.2K · 💀) - The friendly PIL fork. ❗️PIL
- layout-parser (🥉24 · ⭐ 5.6K · 💀) - A Unified Toolkit for Deep Learning Based Document Image.. Apache-2
- segmentation_models (🥉24 · ⭐ 4.9K · 💀) - Segmentation models with pretrained backbones. Keras.. MIT
- ffcv (🥉23 · ⭐ 3K · 💀) - FFCV: Fast Forward Computer Vision (and other ML workloads!). Apache-2
- Classy Vision (🥉23 · ⭐ 1.6K · 💀) - An end-to-end PyTorch framework for image and video.. MIT
- deep-daze (🥉22 · ⭐ 4.3K · 💀) - Simple command line tool for text to image generation using.. MIT
- vissl (🥉22 · ⭐ 3.3K · 💀) - VISSL is FAIR's library of extensible, modular and scalable.. MIT
- Luminoth (🥉22 · ⭐ 2.4K · 💀) - Deep Learning toolkit for Computer Vision. BSD-3
- detecto (🥉21 · ⭐ 620 · 💀) - Build fully-functioning computer vision models with PyTorch. MIT
- DE⫶TR (🥉20 · ⭐ 15K · 💀) - End-to-End Object Detection with Transformers. Apache-2
- image-match (🥉20 · ⭐ 3K · 💀) - Quickly search over billions of images. Apache-2
- solt (🥉19 · ⭐ 260) - Streaming over lightweight data transformations. MIT
- pycls (🥉18 · ⭐ 2.2K · 💀) - Codebase for Image Classification Research, written in PyTorch. MIT
- Torch Points 3D (🥉17 · ⭐ 260 · 💀) - Pytorch framework for doing deep learning on point.. BSD-3
- nude.py (🥉16 · ⭐ 920 · 💀) - Nudity detection with Python. MIT
- HugsVision (🥉14 · ⭐ 200 · 💀) - HugsVision is an easy to use huggingface wrapper for state-of-.. MIT


Graph Data

Libraries for graph processing, clustering, embedding, and machine learning tasks.

networkx (🥇46 · ⭐ 16K) - Network Analysis in Python. BSD-3 - [GitHub](https://github.com/networkx/networkx) (👨‍💻 790 · 🔀 3.4K · 📥 110 · 📦 430K · 📋 3.5K - 10% open · ⏱️ 29.10.2025):
git clone https://github.com/networkx/networkx
- [PyPi](https://pypi.org/project/networkx) (📥 130M / month · 📦 12K · ⏱️ 29.05.2025):
pip install networkx
- [Conda](https://anaconda.org/conda-forge/networkx) (📥 26M · ⏱️ 04.06.2025):
conda install -c conda-forge networkx
PyTorch Geometric (🥇41 · ⭐ 23K) - Graph Neural Network Library for PyTorch. MIT - [GitHub](https://github.com/pyg-team/pytorch_geometric) (👨‍💻 560 · 🔀 3.9K · 📦 11K · 📋 4K - 30% open · ⏱️ 29.10.2025):
git clone https://github.com/pyg-team/pytorch_geometric
- [PyPi](https://pypi.org/project/torch-geometric) (📥 940K / month · 📦 730 · ⏱️ 15.10.2025):
pip install torch-geometric
- [Conda](https://anaconda.org/conda-forge/pytorch_geometric) (📥 190K · ⏱️ 16.10.2025):
conda install -c conda-forge pytorch_geometric
dgl (🥇36 · ⭐ 14K) - Python package built to ease deep learning on graph, on top of existing DL.. Apache-2 - [GitHub](https://github.com/dmlc/dgl) (👨‍💻 300 · 🔀 3K · 📦 4.1K · 📋 3K - 20% open · ⏱️ 31.07.2025):
git clone https://github.com/dmlc/dgl
- [PyPi](https://pypi.org/project/dgl) (📥 150K / month · 📦 150 · ⏱️ 13.05.2024):
pip install dgl
pygraphistry (🥈29 · ⭐ 2.4K) - PyGraphistry is a Python library to quickly load, shape,.. BSD-3 - [GitHub](https://github.com/graphistry/pygraphistry) (👨‍💻 48 · 🔀 220 · 📋 420 - 51% open · ⏱️ 30.10.2025):
git clone https://github.com/graphistry/pygraphistry
- [PyPi](https://pypi.org/project/graphistry) (📥 8.5K / month · 📦 9 · ⏱️ 21.10.2025):
pip install graphistry
ogb (🥈29 · ⭐ 2K) - Benchmark datasets, data loaders, and evaluators for graph machine learning. MIT - [GitHub](https://github.com/snap-stanford/ogb) (👨‍💻 32 · 🔀 400 · 📦 2.6K · 📋 310 - 11% open · ⏱️ 06.05.2025):
git clone https://github.com/snap-stanford/ogb
- [PyPi](https://pypi.org/project/ogb) (📥 100K / month · 📦 73 · ⏱️ 07.04.2023):
pip install ogb
- [Conda](https://anaconda.org/conda-forge/ogb) (📥 63K · ⏱️ 22.04.2025):
conda install -c conda-forge ogb
PyKEEN (🥈28 · ⭐ 1.9K) - A Python library for learning and evaluating knowledge graph embeddings. MIT - [GitHub](https://github.com/pykeen/pykeen) (👨‍💻 43 · 🔀 210 · 📥 240 · 📦 350 · 📋 590 - 20% open · ⏱️ 18.07.2025):
git clone https://github.com/pykeen/pykeen
- [PyPi](https://pypi.org/project/pykeen) (📥 31K / month · 📦 28 · ⏱️ 24.04.2025):
pip install pykeen
pytorch_geometric_temporal (🥈27 · ⭐ 2.9K) - PyTorch Geometric Temporal: Spatiotemporal Signal.. MIT - [GitHub](https://github.com/benedekrozemberczki/pytorch_geometric_temporal) (👨‍💻 39 · 🔀 400 · 📋 210 - 18% open · ⏱️ 18.09.2025):
git clone https://github.com/benedekrozemberczki/pytorch_geometric_temporal
- [PyPi](https://pypi.org/project/torch-geometric-temporal) (📥 6.7K / month · 📦 12 · ⏱️ 16.07.2025):
pip install torch-geometric-temporal
torch-cluster (🥈24 · ⭐ 900) - PyTorch Extension Library of Optimized Graph Cluster.. MIT - [GitHub](https://github.com/rusty1s/pytorch_cluster) (👨‍💻 40 · 🔀 150 · 📋 190 - 16% open · ⏱️ 12.08.2025):
git clone https://github.com/rusty1s/pytorch_cluster
- [PyPi](https://pypi.org/project/torch-cluster) (📥 34K / month · 📦 62 · ⏱️ 12.10.2023):
pip install torch-cluster
- [Conda](https://anaconda.org/conda-forge/pytorch_cluster) (📥 440K · ⏱️ 22.09.2025):
conda install -c conda-forge pytorch_cluster
Show 28 hidden projects...
- igraph (🥇34 · ⭐ 1.4K) - Python interface for igraph. ❗️GPL-2.0
- Spektral (🥈28 · ⭐ 2.4K · 💀) - Graph Neural Networks with Keras and Tensorflow 2. MIT
- StellarGraph (🥈27 · ⭐ 3K · 💀) - StellarGraph - Machine Learning on Graphs. Apache-2
- pygal (🥈26 · ⭐ 2.7K · 💀) - PYthon svg GrAph plotting Library. ❗️LGPL-3.0
- Paddle Graph Learning (🥈26 · ⭐ 1.6K · 💀) - Paddle Graph Learning (PGL) is an efficient and.. Apache-2
- AmpliGraph (🥈25 · ⭐ 2.2K · 💀) - Python library for Representation Learning on Knowledge.. Apache-2
- Node2Vec (🥈25 · ⭐ 1.3K · 💀) - Implementation of the node2vec algorithm. MIT
- Karate Club (🥈24 · ⭐ 2.3K · 💀) - Karate Club: An API Oriented Open-source Python Framework.. ❗️GPL-3.0
- graph-nets (🥉22 · ⭐ 5.4K · 💀) - Build Graph Nets in Tensorflow. Apache-2
- PyTorch-BigGraph (🥉21 · ⭐ 3.4K · 💀) - Generate embeddings from large-scale graph-structured.. BSD-3
- graph4nlp (🥉21 · ⭐ 1.7K · 💀) - Graph4nlp is the library for the easy use of Graph.. Apache-2
- jraph (🥉21 · ⭐ 1.5K · 💀) - A Graph Neural Network Library in Jax. Apache-2
- DeepWalk (🥉20 · ⭐ 2.7K · 💀) - DeepWalk - Deep Learning for Graphs. ❗️GPL-3.0
- DIG (🥉20 · ⭐ 2K · 💀) - A library for graph deep learning research. ❗️GPL-3.0
- deepsnap (🥉20 · ⭐ 560 · 💀) - Python library assists deep learning on graphs. MIT
- pyRDF2Vec (🥉20 · ⭐ 260 · 💀) - Python Implementation and Extension of RDF2Vec. MIT
- GraphGym (🥉17 · ⭐ 1.8K · 💀) - Platform for designing and evaluating Graph Neural Networks (GNN). MIT
- Sematch (🥉17 · ⭐ 440 · 💀) - semantic similarity framework for knowledge graph. Apache-2
- DeepGraph (🥉17 · ⭐ 320) - Analyze Data with Pandas-based Networks. Documentation:. ❗Unlicensed
- AutoGL (🥉16 · ⭐ 1.1K · 💀) - An autoML framework & toolkit for machine learning on graphs. Apache-2
- kglib (🥉16 · ⭐ 550 · 💀) - TypeDB-ML is the Machine Learning integrations library for TypeDB. Apache-2
- ptgnn (🥉16 · ⭐ 380 · 💀) - A PyTorch Graph Neural Network Library. MIT
- Euler (🥉15 · ⭐ 2.9K · 💀) - A distributed graph deep learning framework. Apache-2
- GraphEmbedding (🥉14 · ⭐ 3.8K · 💀) - Implementation and experiments of graph embedding.. MIT
- GraphSAGE (🥉14 · ⭐ 3.6K · 💀) - Representation learning on large graphs using stochastic.. MIT
- OpenNE (🥉14 · ⭐ 1.7K · 💀) - An Open-Source Package for Network Embedding (NE). MIT
- GraphVite (🥉14 · ⭐ 1.3K · 💀) - GraphVite: A General and High-performance Graph Embedding.. Apache-2
- OpenKE (🥉13 · ⭐ 4K · 💀) - An Open-Source Package for Knowledge Embedding (KE). ❗Unlicensed


Audio Data

Libraries for audio analysis, manipulation, transformation, and extraction, as well as speech recognition and music generation tasks.

speechbrain (🥇38 · ⭐ 11K) - A PyTorch-based Speech Toolkit. Apache-2 - [GitHub](https://github.com/speechbrain/speechbrain) (👨‍💻 260 · 🔀 1.5K · 📦 3.9K · 📋 1.2K - 12% open · ⏱️ 30.10.2025):
git clone https://github.com/speechbrain/speechbrain
- [PyPi](https://pypi.org/project/speechbrain) (📥 1.1M / month · 📦 79 · ⏱️ 07.04.2025):
pip install speechbrain
espnet (🥇38 · ⭐ 9.5K) - End-to-End Speech Processing Toolkit. Apache-2 - [GitHub](https://github.com/espnet/espnet) (👨‍💻 520 · 🔀 2.3K · 📥 84 · 📦 480 · 📋 2.5K - 3% open · ⏱️ 30.10.2025):
git clone https://github.com/espnet/espnet
- [PyPi](https://pypi.org/project/espnet) (📥 24K / month · 📦 19 · ⏱️ 13.09.2025):
pip install espnet
torchaudio (🥇37 · ⭐ 2.8K) - Data manipulation and transformation for audio signal.. BSD-2 - [GitHub](https://github.com/pytorch/audio) (👨‍💻 240 · 🔀 730 · 📋 1.1K - 31% open · ⏱️ 29.10.2025):
git clone https://github.com/pytorch/audio
- [PyPi](https://pypi.org/project/torchaudio) (📥 15M / month · 📦 2.4K · ⏱️ 15.10.2025):
pip install torchaudio
SpeechRecognition (🥈34 · ⭐ 8.9K) - Speech recognition module for Python, supporting several.. BSD-3 - [GitHub](https://github.com/Uberi/speech_recognition) (👨‍💻 56 · 🔀 2.4K · 📦 21 · 📋 670 - 48% open · ⏱️ 28.10.2025):
git clone https://github.com/Uberi/speech_recognition
- [PyPi](https://pypi.org/project/SpeechRecognition) (📥 2.2M / month · 📦 730 · ⏱️ 12.05.2025):
pip install SpeechRecognition
- [Conda](https://anaconda.org/conda-forge/speechrecognition) (📥 360K · ⏱️ 12.05.2025):
conda install -c conda-forge speechrecognition
librosa (🥈34 · ⭐ 8K) - Python library for audio and music analysis. ISC - [GitHub](https://github.com/librosa/librosa) (👨‍💻 130 · 🔀 1K · 📋 1.3K - 5% open · ⏱️ 19.05.2025):
git clone https://github.com/librosa/librosa
- [PyPi](https://pypi.org/project/librosa) (📥 5.6M / month · 📦 1.6K · ⏱️ 11.03.2025):
pip install librosa
- [Conda](https://anaconda.org/conda-forge/librosa) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge librosa
DeepSpeech (🥈33 · ⭐ 27K · 📈) - DeepSpeech is an open source embedded (offline, on-.. MPL-2.0 - [GitHub](https://github.com/mozilla/DeepSpeech) (👨‍💻 160 · 🔀 4.1K · 📥 660K · 📦 540 · 📋 2.1K - 7% open · ⏱️ 19.06.2025):
git clone https://github.com/mozilla/DeepSpeech
- [PyPi](https://pypi.org/project/deepspeech) (📥 5.5K / month · 📦 24 · ⏱️ 19.12.2020):
pip install deepspeech
- [Conda](https://anaconda.org/conda-forge/deepspeech) (📥 4.2K · ⏱️ 22.04.2025):
conda install -c conda-forge deepspeech
audioread (🥈33 · ⭐ 520 · 📈) - cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio.. MIT - [GitHub](https://github.com/beetbox/audioread) (👨‍💻 27 · 🔀 110 · 📦 35K · 📋 98 - 40% open · ⏱️ 26.10.2025):
git clone https://github.com/beetbox/audioread
- [PyPi](https://pypi.org/project/audioread) (📥 4.8M / month · 📦 180 · ⏱️ 26.10.2025):
pip install audioread
- [Conda](https://anaconda.org/conda-forge/audioread) (📥 1.2M · ⏱️ 02.10.2025):
conda install -c conda-forge audioread
spleeter (🥈32 · ⭐ 27K) - Deezer source separation library including pretrained models. MIT - [GitHub](https://github.com/deezer/spleeter) (👨‍💻 22 · 🔀 3K · 📥 4.4M · 📦 1.1K · 📋 830 - 32% open · ⏱️ 02.04.2025):
git clone https://github.com/deezer/spleeter
- [PyPi](https://pypi.org/project/spleeter) (📥 26K / month · 📦 18 · ⏱️ 03.04.2025):
pip install spleeter
- [Conda](https://anaconda.org/conda-forge/spleeter) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge spleeter
audiomentations (🥈32 · ⭐ 2.2K) - A Python library for audio data augmentation. Useful for.. MIT - [GitHub](https://github.com/iver56/audiomentations) (👨‍💻 34 · 🔀 200 · 📦 840 · 📋 210 - 26% open · ⏱️ 26.09.2025):
git clone https://github.com/iver56/audiomentations
- [PyPi](https://pypi.org/project/audiomentations) (📥 110K / month · 📦 38 · ⏱️ 13.09.2025):
pip install audiomentations
Coqui TTS (🥈32 · ⭐ 1.9K) - A deep learning toolkit for Text-to-Speech, battle-.. MPL-2.0 - [GitHub](https://github.com/idiap/coqui-ai-TTS) (👨‍💻 200 · 🔀 240 · 📥 3.8K · 📦 760 · 📋 160 - 14% open · ⏱️ 16.10.2025):
git clone https://github.com/idiap/coqui-ai-TTS
- [PyPi](https://pypi.org/project/coqui-tts) (📥 94K / month · 📦 34 · ⏱️ 25.09.2025):
pip install coqui-tts
Magenta (🥈31 · ⭐ 20K) - Magenta: Music and Art Generation with Machine Intelligence. Apache-2 - [GitHub](https://github.com/magenta/magenta) (👨‍💻 160 · 🔀 3.7K · 📦 600 · 📋 1K - 41% open · ⏱️ 08.07.2025):
git clone https://github.com/magenta/magenta
- [PyPi](https://pypi.org/project/magenta) (📥 4.8K / month · 📦 5 · ⏱️ 01.08.2022):
pip install magenta
Porcupine (🥉29 · ⭐ 4.5K) - On-device wake word detection powered by deep learning. Apache-2 - [GitHub](https://github.com/Picovoice/porcupine) (👨‍💻 43 · 🔀 550 · 📦 51 · 📋 600 - 0% open · ⏱️ 17.10.2025):
git clone https://github.com/Picovoice/Porcupine
- [PyPi](https://pypi.org/project/pvporcupine) (📥 25K / month · 📦 38 · ⏱️ 05.02.2025):
pip install pvporcupine
pyAudioAnalysis (🥉28 · ⭐ 6.2K) - Python Audio Analysis Library: Feature Extraction,.. Apache-2 - [GitHub](https://github.com/tyiannak/pyAudioAnalysis) (👨‍💻 28 · 🔀 1.2K · 📦 670 · 📋 330 - 62% open · ⏱️ 04.08.2025):
git clone https://github.com/tyiannak/pyAudioAnalysis
- [PyPi](https://pypi.org/project/pyAudioAnalysis) (📥 24K / month · 📦 12 · ⏱️ 07.02.2022):
pip install pyAudioAnalysis
python-soundfile (🥉27 · ⭐ 800) - SoundFile is an audio library based on libsndfile, CFFI, and.. BSD-3 - [GitHub](https://github.com/bastibe/python-soundfile) (👨‍💻 38 · 🔀 120 · 📥 21K · 📋 260 - 46% open · ⏱️ 28.04.2025):
git clone https://github.com/bastibe/python-soundfile
- [PyPi](https://pypi.org/project/soundfile) (📥 9.5M / month · 📦 1.1K · ⏱️ 25.01.2025):
pip install soundfile
- [Conda](https://anaconda.org/anaconda/pysoundfile):
conda install -c anaconda pysoundfile
tinytag (🥉27 · ⭐ 780) - Python library for reading audio file metadata. MIT - [GitHub](https://github.com/tinytag/tinytag) (👨‍💻 27 · 🔀 100 · 📦 1.3K · 📋 120 - 4% open · ⏱️ 13.08.2025):
git clone https://github.com/tinytag/tinytag
- [PyPi](https://pypi.org/project/tinytag) (📥 120K / month · 📦 130 · ⏱️ 13.08.2025):
pip install tinytag
kapre (🥉25 · ⭐ 930 · 📈) - kapre: Keras Audio Preprocessors. MIT - [GitHub](https://github.com/keunwoochoi/kapre) (👨‍💻 13 · 🔀 150 · 📥 33 · 📦 2.5K · 📋 99 - 17% open · ⏱️ 26.10.2025):
git clone https://github.com/keunwoochoi/kapre
- [PyPi](https://pypi.org/project/kapre) (📥 3.2K / month · 📦 11 · ⏱️ 26.10.2025):
pip install kapre
nnAudio (🥉22 · ⭐ 1.1K) - Audio processing by using pytorch 1D convolution network. MIT - [GitHub](https://github.com/KinWaiCheuk/nnAudio) (👨‍💻 16 · 🔀 96 · 📦 410 · 📋 65 - 30% open · ⏱️ 16.05.2025):
git clone https://github.com/KinWaiCheuk/nnAudio
- [PyPi](https://pypi.org/project/nnAudio) (📥 59K / month · 📦 4 · ⏱️ 13.02.2024):
pip install nnAudio
Julius (🥉21 · ⭐ 450 · 💤) - Fast PyTorch based DSP for audio and 1D signals. MIT - [GitHub](https://github.com/adefossez/julius) (👨‍💻 3 · 🔀 26 · 📋 12 - 16% open · ⏱️ 17.02.2025):
git clone https://github.com/adefossez/julius
- [PyPi](https://pypi.org/project/julius) (📥 840K / month · 📦 44 · ⏱️ 20.09.2022):
pip install julius
Show 11 hidden projects...
- Pydub (🥈36 · ⭐ 9.6K · 💀) - Manipulate audio with a simple and easy high level interface. MIT
- aubio (🥉27 · ⭐ 3.5K) - a library for audio and music analysis. ❗️GPL-3.0
- Essentia (🥉27 · ⭐ 3.3K) - C++ library for audio and music analysis, description and.. ❗️AGPL-3.0
- Madmom (🥉27 · ⭐ 1.5K · 💀) - Python audio and music signal processing library. BSD-3
- TTS (🥉26 · ⭐ 10K · 💀) - Deep learning for Text to Speech (Discussion forum:.. MPL-2.0
- python_speech_features (🥉26 · ⭐ 2.4K · 💀) - This library provides common speech features for ASR.. MIT
- DDSP (🥉25 · ⭐ 3.1K · 💀) - DDSP: Differentiable Digital Signal Processing. Apache-2
- Dejavu (🥉23 · ⭐ 6.7K · 💀) - Audio fingerprinting and recognition in Python. MIT
- TimeSide (🥉21 · ⭐ 390 · 💤) - scalable audio processing framework and server written in.. ❗️AGPL-3.0
- Muda (🥉18 · ⭐ 240 · 💀) - A library for augmenting annotated audio data. ISC
- textlesslib (🥉10 · ⭐ 550 · 💀) - Library for Textless Spoken Language Processing. MIT


Geospatial Data

Libraries to load, process, analyze, and write geographic data as well as libraries for spatial analysis, map visualization, and geocoding.

pydeck (🥇43 · ⭐ 14K) - WebGL2 powered visualization framework. MIT - [GitHub](https://github.com/visgl/deck.gl) (👨‍💻 310 · 🔀 2.2K · 📦 9.2K · 📋 3.3K - 13% open · ⏱️ 29.10.2025):
git clone https://github.com/visgl/deck.gl
- [PyPi](https://pypi.org/project/pydeck) (📥 16M / month · 📦 160 · ⏱️ 21.03.2025):
pip install pydeck
- [Conda](https://anaconda.org/conda-forge/pydeck) (📥 850K · ⏱️ 22.04.2025):
conda install -c conda-forge pydeck
- [npm](https://www.npmjs.com/package/deck.gl) (📥 750K / month · 📦 360 · ⏱️ 16.10.2025):
npm install deck.gl
folium (🥇40 · ⭐ 7.3K) - Python Data. Leaflet.js Maps. MIT - [GitHub](https://github.com/python-visualization/folium) (👨‍💻 180 · 🔀 2.2K · 📦 65K · 📋 1.2K - 6% open · ⏱️ 06.10.2025):
git clone https://github.com/python-visualization/folium
- [PyPi](https://pypi.org/project/folium) (📥 2.8M / month · 📦 1K · ⏱️ 16.06.2025):
pip install folium
- [Conda](https://anaconda.org/conda-forge/folium) (📥 4.4M · ⏱️ 16.06.2025):
conda install -c conda-forge folium
Shapely (🥇40 · ⭐ 4.3K) - Manipulation and analysis of geometric objects. BSD-3 - [GitHub](https://github.com/shapely/shapely) (👨‍💻 170 · 🔀 600 · 📥 4K · 📦 110K · 📋 1.3K - 18% open · ⏱️ 28.10.2025):
git clone https://github.com/shapely/shapely
- [PyPi](https://pypi.org/project/shapely) (📥 62M / month · 📦 4.7K · ⏱️ 24.09.2025):
pip install shapely
- [Conda](https://anaconda.org/conda-forge/shapely) (📥 14M · ⏱️ 28.10.2025):
conda install -c conda-forge shapely
GeoPandas (🥈39 · ⭐ 4.9K) - Python tools for geographic data. BSD-3 - [GitHub](https://github.com/geopandas/geopandas) (👨‍💻 250 · 🔀 980 · 📥 3.1K · 📦 60K · 📋 1.8K - 24% open · ⏱️ 25.10.2025):
git clone https://github.com/geopandas/geopandas
- [PyPi](https://pypi.org/project/geopandas) (📥 11M / month · 📦 3.8K · ⏱️ 26.06.2025):
pip install geopandas
- [Conda](https://anaconda.org/conda-forge/geopandas) (📥 5.4M · ⏱️ 06.10.2025):
conda install -c conda-forge geopandas
Rasterio (🥈37 · ⭐ 2.4K) - Rasterio reads and writes geospatial raster datasets. BSD-3 - [GitHub](https://github.com/rasterio/rasterio) (👨‍💻 170 · 🔀 540 · 📥 1K · 📦 19K · 📋 1.9K - 8% open · ⏱️ 26.09.2025):
git clone https://github.com/rasterio/rasterio
- [PyPi](https://pypi.org/project/rasterio) (📥 2.8M / month · 📦 1.5K · ⏱️ 02.12.2024):
pip install rasterio
- [Conda](https://anaconda.org/conda-forge/rasterio) (📥 5.3M · ⏱️ 17.09.2025):
conda install -c conda-forge rasterio
pyproj (🥈37 · ⭐ 1.2K) - Python interface to PROJ (cartographic projections and coordinate.. MIT - [GitHub](https://github.com/pyproj4/pyproj) (👨‍💻 74 · 🔀 230 · 📦 47K · 📋 660 - 6% open · ⏱️ 29.10.2025):
git clone https://github.com/pyproj4/pyproj
- [PyPi](https://pypi.org/project/pyproj) (📥 14M / month · 📦 2.3K · ⏱️ 14.08.2025):
pip install pyproj
- [Conda](https://anaconda.org/conda-forge/pyproj) (📥 12M · ⏱️ 15.09.2025):
conda install -c conda-forge pyproj
ArcGIS API (🥈36 · ⭐ 2.1K) - Documentation and samples for ArcGIS API for Python. Apache-2 - [GitHub](https://github.com/Esri/arcgis-python-api) (👨‍💻 99 · 🔀 1.1K · 📥 16K · 📦 1K · 📋 920 - 8% open · ⏱️ 28.10.2025):
git clone https://github.com/Esri/arcgis-python-api
- [PyPi](https://pypi.org/project/arcgis) (📥 150K / month · 📦 44 · ⏱️ 27.10.2025):
pip install arcgis
- [Docker Hub](https://hub.docker.com/r/esridocker/arcgis-api-python-notebook):
docker pull esridocker/arcgis-api-python-notebook
Fiona (🥈34 · ⭐ 1.2K · 💤) - Fiona reads and writes geographic data files. BSD-3 - [GitHub](https://github.com/Toblerity/Fiona) (👨‍💻 78 · 🔀 210 · 📦 27K · 📋 820 - 5% open · ⏱️ 20.02.2025):
git clone https://github.com/Toblerity/Fiona
- [PyPi](https://pypi.org/project/fiona) (📥 5.6M / month · 📦 380 · ⏱️ 16.09.2024):
pip install fiona
- [Conda](https://anaconda.org/conda-forge/fiona) (📥 7.9M · ⏱️ 22.04.2025):
conda install -c conda-forge fiona
ipyleaflet (🥉33 · ⭐ 1.5K) - A Jupyter - Leaflet.js bridge. MIT - [GitHub](https://github.com/jupyter-widgets/ipyleaflet) (👨‍💻 94 · 🔀 360 · 📦 18K · 📋 660 - 44% open · ⏱️ 19.06.2025):
git clone https://github.com/jupyter-widgets/ipyleaflet
- [PyPi](https://pypi.org/project/ipyleaflet) (📥 230K / month · 📦 340 · ⏱️ 13.06.2025):
pip install ipyleaflet
- [Conda](https://anaconda.org/conda-forge/ipyleaflet) (📥 1.8M · ⏱️ 13.06.2025):
conda install -c conda-forge ipyleaflet
- [npm](https://www.npmjs.com/package/jupyter-leaflet) (📥 2.7K / month · 📦 9 · ⏱️ 13.06.2025):
npm install jupyter-leaflet
geojson (🥉31 · ⭐ 970 · 💤) - Python bindings and utilities for GeoJSON. BSD-3 - [GitHub](https://github.com/jazzband/geojson) (👨‍💻 58 · 🔀 120 · 📦 21K · 📋 100 - 26% open · ⏱️ 21.12.2024):
git clone https://github.com/jazzband/geojson
- [PyPi](https://pypi.org/project/geojson) (📥 3.6M / month · 📦 720 · ⏱️ 21.12.2024):
pip install geojson
- [Conda](https://anaconda.org/conda-forge/geojson) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge geojson
PySAL (🥉30 · ⭐ 1.4K) - PySAL: Python Spatial Analysis Library Meta-Package. BSD-3 - [GitHub](https://github.com/pysal/pysal) (👨‍💻 79 · 🔀 310 · 📦 1.8K · 📋 660 - 3% open · ⏱️ 08.09.2025):
git clone https://github.com/pysal/pysal
- [PyPi](https://pypi.org/project/pysal) (📥 42K / month · 📦 65 · ⏱️ 31.07.2025):
pip install pysal
- [Conda](https://anaconda.org/conda-forge/pysal) (📥 730K · ⏱️ 01.08.2025):
conda install -c conda-forge pysal
GeoViews (🥉28 · ⭐ 620) - Simple, concise geographical visualization in Python. BSD-3 - [GitHub](https://github.com/holoviz/geoviews) (👨‍💻 34 · 🔀 79 · 📦 5 · 📋 360 - 31% open · ⏱️ 29.10.2025):
git clone https://github.com/holoviz/geoviews
- [PyPi](https://pypi.org/project/geoviews) (📥 69K / month · 📦 76 · ⏱️ 14.08.2025):
pip install geoviews
- [Conda](https://anaconda.org/conda-forge/geoviews) (📥 340K · ⏱️ 14.08.2025):
conda install -c conda-forge geoviews
EarthPy (🥉28 · ⭐ 530) - A package built to support working with spatial data using open source.. BSD-3 - [GitHub](https://github.com/earthlab/earthpy) (👨‍💻 44 · 🔀 160 · 📥 75 · 📦 440 · 📋 250 - 16% open · ⏱️ 31.07.2025):
git clone https://github.com/earthlab/earthpy
- [PyPi](https://pypi.org/project/earthpy) (📥 14K / month · 📦 17 · ⏱️ 01.10.2021):
pip install earthpy
- [Conda](https://anaconda.org/conda-forge/earthpy) (📥 98K · ⏱️ 22.04.2025):
conda install -c conda-forge earthpy
pymap3d (🥉25 · ⭐ 430) - pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef.. BSD-2 - [GitHub](https://github.com/geospace-code/pymap3d) (👨‍💻 19 · 🔀 87 · 📦 540 · 📋 59 - 8% open · ⏱️ 08.07.2025):
git clone https://github.com/geospace-code/pymap3d
- [PyPi](https://pypi.org/project/pymap3d) (📥 490K / month · 📦 50 · ⏱️ 08.07.2025):
pip install pymap3d
- [Conda](https://anaconda.org/conda-forge/pymap3d) (📥 120K · ⏱️ 08.07.2025):
conda install -c conda-forge pymap3d
Mapbox GL (🥉22 · ⭐ 680 · 💤) - Use Mapbox GL JS to visualize data in a Python Jupyter notebook. MIT - [GitHub](https://github.com/mapbox/mapboxgl-jupyter) (👨‍💻 23 · 🔀 140 · 📋 110 - 38% open · ⏱️ 06.02.2025):
git clone https://github.com/mapbox/mapboxgl-jupyter
- [PyPi](https://pypi.org/project/mapboxgl) (📥 10K / month · 📦 12 · ⏱️ 02.06.2019):
pip install mapboxgl
Show 7 hidden projects...
- Satpy (🥈34 · ⭐ 1.1K) - Python package for earth-observing satellite data processing. ❗️GPL-3.0
- geopy (🥉32 · ⭐ 4.7K · 💀) - Geocoding library for Python. MIT
- Geocoder (🥉32 · ⭐ 1.6K · 💀) - Python Geocoder. MIT
- prettymaps (🥉24 · ⭐ 12K) - Draw pretty maps from OpenStreetMap data! Built with osmnx.. ❗️AGPL-3.0
- Sentinelsat (🥉24 · ⭐ 1K · 💀) - Search and download Copernicus Sentinel satellite images. ❗️GPL-3.0
- gmaps (🥉22 · ⭐ 760 · 💀) - Google maps for Jupyter notebooks. BSD-3
- geoplotlib (🥉21 · ⭐ 1K · 💀) - python toolbox for visualizing geographical data and making maps. MIT


Financial Data

Libraries for algorithmic stock/crypto trading, risk analytics, backtesting, technical analysis, and other tasks on financial data.

yfinance (🥇42 · ⭐ 20K) - Download market data from Yahoo! Finance's API. Apache-2 - [GitHub](https://github.com/ranaroussi/yfinance) (👨‍💻 140 · 🔀 2.8K · 📦 86K · 📋 1.7K - 9% open · ⏱️ 18.09.2025):
git clone https://github.com/ranaroussi/yfinance
- [PyPi](https://pypi.org/project/yfinance) (📥 5.9M / month · 📦 1.2K · ⏱️ 17.09.2025):
pip install yfinance
- [Conda](https://anaconda.org/ranaroussi/yfinance) (📥 99K · ⏱️ 25.03.2025):
conda install -c ranaroussi yfinance
Qlib (🥇32 · ⭐ 33K) - Qlib is an AI-oriented Quant investment platform that aims to use AI tech.. MIT - [GitHub](https://github.com/microsoft/qlib) (👨‍💻 140 · 🔀 5K · 📥 910 · 📦 21 · 📋 1K - 28% open · ⏱️ 17.10.2025):
git clone https://github.com/microsoft/qlib
- [PyPi](https://pypi.org/project/pyqlib) (📥 16K / month · 📦 3 · ⏱️ 15.08.2025):
pip install pyqlib
bt (🥈30 · ⭐ 2.7K) - bt - flexible backtesting for Python. MIT - [GitHub](https://github.com/pmorissette/bt) (👨‍💻 35 · 🔀 450 · 📦 1.7K · 📋 350 - 23% open · ⏱️ 27.10.2025):
git clone https://github.com/pmorissette/bt
- [PyPi](https://pypi.org/project/bt) (📥 11K / month · 📦 15 · ⏱️ 12.04.2025):
pip install bt
- [Conda](https://anaconda.org/conda-forge/bt) (📥 110K · ⏱️ 02.10.2025):
conda install -c conda-forge bt
Alpha Vantage (🥈27 · ⭐ 4.6K) - A python wrapper for Alpha Vantage API for financial data. MIT - [GitHub](https://github.com/RomelTorres/alpha_vantage) (👨‍💻 44 · 🔀 760 · 📋 290 - 0% open · ⏱️ 27.07.2025):
git clone https://github.com/RomelTorres/alpha_vantage
- [PyPi](https://pypi.org/project/alpha_vantage) (📥 140K / month · 📦 35 · ⏱️ 18.07.2024):
pip install alpha_vantage
- [Conda](https://anaconda.org/conda-forge/alpha_vantage) (📥 10K · ⏱️ 22.04.2025):
conda install -c conda-forge alpha_vantage
ffn (🥈27 · ⭐ 2.4K) - ffn - a financial function library for Python. MIT - [GitHub](https://github.com/pmorissette/ffn) (👨‍💻 36 · 🔀 330 · 📦 580 · 📋 140 - 17% open · ⏱️ 27.10.2025):
git clone https://github.com/pmorissette/ffn
- [PyPi](https://pypi.org/project/ffn) (📥 25K / month · 📦 22 · ⏱️ 11.02.2025):
pip install ffn
- [Conda](https://anaconda.org/conda-forge/ffn) (📥 26K · ⏱️ 22.04.2025):
conda install -c conda-forge ffn
stockstats (🥉26 · ⭐ 1.4K) - Supply a wrapper ``StockDataFrame`` based on the.. BSD-3 - [GitHub](https://github.com/jealous/stockstats) (👨‍💻 10 · 🔀 310 · 📦 1.3K · 📋 130 - 10% open · ⏱️ 18.05.2025):
git clone https://github.com/jealous/stockstats
- [PyPi](https://pypi.org/project/stockstats) (📥 51K / month · 📦 14 · ⏱️ 18.05.2025):
pip install stockstats
tf-quant-finance (🥉21 · ⭐ 5K · 💤) - High-performance TensorFlow library for quantitative.. Apache-2 - [GitHub](https://github.com/google/tf-quant-finance) (👨‍💻 48 · 🔀 630 · 📋 65 - 56% open · ⏱️ 21.03.2025):
git clone https://github.com/google/tf-quant-finance
- [PyPi](https://pypi.org/project/tf-quant-finance) (📥 410 / month · 📦 3 · ⏱️ 19.08.2022):
pip install tf-quant-finance
finmarketpy (🥉21 · ⭐ 3.7K · 💤) - Python library for backtesting trading strategies &.. Apache-2 - [GitHub](https://github.com/cuemacro/finmarketpy) (👨‍💻 19 · 🔀 510 · 📥 57 · 📦 16 · 📋 35 - 88% open · ⏱️ 10.03.2025):
git clone https://github.com/cuemacro/finmarketpy
- [PyPi](https://pypi.org/project/finmarketpy) (📥 340 / month · ⏱️ 10.03.2025):
pip install finmarketpy
Show 17 hidden projects...
- arch (🥇33 · ⭐ 1.5K) - ARCH models in Python. ❗Unlicensed
- zipline (🥇32 · ⭐ 19K · 💀) - Zipline, a Pythonic Algorithmic Trading Library. Apache-2
- ta (🥇32 · ⭐ 4.8K · 💀) - Technical Analysis Library using Pandas and Numpy. MIT
- pyfolio (🥈31 · ⭐ 6.1K · 💀) - Portfolio and risk analytics in Python. Apache-2
- backtrader (🥈29 · ⭐ 19K · 💀) - Python Backtesting library for trading strategies. ❗️GPL-3.0
- IB-insync (🥈28 · ⭐ 3.1K · 💀) - Python sync/async framework for Interactive Brokers API. BSD-2
- Alphalens (🥈27 · ⭐ 4K · 💀) - Performance analysis of predictive (alpha) stock factors. Apache-2
- Enigma Catalyst (🥈27 · ⭐ 2.5K · 💀) - An Algorithmic Trading Library for Crypto-Assets in.. Apache-2
- empyrical (🥈27 · ⭐ 1.4K · 💀) - Common financial risk and performance metrics. Used by.. Apache-2
- Backtesting.py (🥉26 · ⭐ 7.4K) - Backtest trading strategies in Python. ❗️AGPL-3.0
- TensorTrade (🥉26 · ⭐ 5.6K · 💀) - An open source reinforcement learning framework for.. Apache-2
- PyAlgoTrade (🥉25 · ⭐ 4.6K · 💀) - Python Algorithmic Trading Library. Apache-2
- FinTA (🥉24 · ⭐ 2.2K · 💀) - Common financial technical indicators implemented in Pandas. ❗️LGPL-3.0
- Crypto Signals (🥉22 · ⭐ 5.4K · 💀) - Github.com/CryptoSignal - Trading & Technical Analysis Bot -.. MIT
- FinQuant (🥉22 · ⭐ 1.6K · 💀) - A program for financial portfolio management, analysis and.. MIT
- surpriver (🥉12 · ⭐ 1.8K · 💀) - Find big moving stocks before they move using machine.. ❗️GPL-3.0
- pyrtfolio (🥉9 · ⭐ 150 · 💀) - Python package to generate stock portfolios. ❗️GPL-3.0
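
As a rough illustration of what the backtesting and risk-analytics libraries in this category automate, the sketch below runs a moving-average crossover on a hard-coded, made-up price series using only the standard library. The function names (`sma`, `backtest_crossover`, `max_drawdown`) and the prices are hypothetical and are not taken from any project listed here; fees and slippage are ignored.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until enough history is available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def backtest_crossover(prices, fast=2, slow=4):
    """Hold the asset on day t when fast SMA > slow SMA at day t-1.

    Returns the equity curve starting at 1.0 (no fees, no slippage).
    """
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    equity = [1.0]
    for t in range(1, len(prices)):
        f, s = fast_ma[t - 1], slow_ma[t - 1]
        in_market = f is not None and s is not None and f > s
        daily_return = prices[t] / prices[t - 1] - 1.0
        growth = (1.0 + daily_return) if in_market else 1.0
        equity.append(equity[-1] * growth)
    return equity


def max_drawdown(equity):
    """Largest peak-to-trough decline of the equity curve, as a fraction."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Made-up closing prices, for illustration only.
prices = [100, 101, 103, 102, 105, 107, 104, 108, 110, 109]
equity = backtest_crossover(prices)
print(f"final equity: {equity[-1]:.4f}, max drawdown: {max_drawdown(equity):.2%}")
```

The signal is read one day before the return it is applied to, which avoids look-ahead bias in this toy setup.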


Time Series Data

Libraries for forecasting, anomaly detection, feature extraction, and machine learning on time-series and sequential data.

sktime (🥇41 · ⭐ 9.3K) - A unified framework for machine learning with time series. BSD-3 - [GitHub](https://github.com/sktime/sktime) (👨‍💻 520 · 🔀 1.7K · 📥 110 · 📦 4.7K · 📋 3.1K - 39% open · ⏱️ 28.10.2025):
git clone https://github.com/sktime/sktime
- [PyPi](https://pypi.org/project/sktime) (📥 1M / month · 📦 160 · ⏱️ 25.09.2025):
pip install sktime
- [Conda](https://anaconda.org/conda-forge/sktime-all-extras) (📥 1.2M · ⏱️ 18.09.2025):
conda install -c conda-forge sktime-all-extras
Prophet (🥇34 · ⭐ 20K) - Tool for producing high quality forecasts for time series data that has.. MIT - [GitHub](https://github.com/facebook/prophet) (👨‍💻 190 · 🔀 4.6K · 📥 3.2K · 📦 21 · 📋 2.2K - 20% open · ⏱️ 21.10.2025):
git clone https://github.com/facebook/prophet
- [PyPi](https://pypi.org/project/prophet):
pip install prophet
- [Conda](https://anaconda.org/conda-forge/prophet) (📥 1.5M · ⏱️ 22.10.2025):
conda install -c conda-forge prophet
StatsForecast (🥇34 · ⭐ 4.6K) - Lightning fast forecasting with statistical and econometric.. Apache-2 - [GitHub](https://github.com/Nixtla/statsforecast) (👨‍💻 56 · 🔀 340 · 📦 2K · 📋 400 - 34% open · ⏱️ 29.10.2025):
git clone https://github.com/Nixtla/statsforecast
- [PyPi](https://pypi.org/project/statsforecast) (📥 990K / month · 📦 91 · ⏱️ 29.10.2025):
pip install statsforecast
- [Conda](https://anaconda.org/conda-forge/statsforecast) (📥 220K · ⏱️ 30.10.2025):
conda install -c conda-forge statsforecast
tslearn (🥈33 · ⭐ 3.1K) - The machine learning toolkit for time series analysis in Python. BSD-2 - [GitHub](https://github.com/tslearn-team/tslearn) (👨‍💻 46 · 🔀 350 · 📦 1.9K · 📋 380 - 38% open · ⏱️ 27.10.2025):
git clone https://github.com/tslearn-team/tslearn
- [PyPi](https://pypi.org/project/tslearn) (📥 400K / month · 📦 110 · ⏱️ 02.07.2025):
pip install tslearn
- [Conda](https://anaconda.org/conda-forge/tslearn) (📥 1.7M · ⏱️ 03.07.2025):
conda install -c conda-forge tslearn
skforecast (🥈33 · ⭐ 1.4K) - Time series forecasting with machine learning models. BSD-3 - [GitHub](https://github.com/skforecast/skforecast) (👨‍💻 23 · 🔀 170 · 📦 490 · 📋 210 - 8% open · ⏱️ 22.09.2025):
git clone https://github.com/skforecast/skforecast
- [PyPi](https://pypi.org/project/skforecast) (📥 96K / month · 📦 18 · ⏱️ 22.09.2025):
pip install skforecast
Darts (🥈32 · ⭐ 9K) - A python library for user-friendly forecasting and anomaly detection on.. Apache-2 - [GitHub](https://github.com/unit8co/darts) (👨‍💻 140 · 🔀 970 · 📋 1.8K - 13% open · ⏱️ 26.10.2025):
git clone https://github.com/unit8co/darts
- [PyPi](https://pypi.org/project/u8darts) (📥 86K / month · 📦 10 · ⏱️ 03.10.2025):
pip install u8darts
- [Conda](https://anaconda.org/conda-forge/u8darts-all) (📥 94K · ⏱️ 05.10.2025):
conda install -c conda-forge u8darts-all
- [Docker Hub](https://hub.docker.com/r/unit8/darts) (📥 2.1K · ⏱️ 03.10.2025):
docker pull unit8/darts
pytorch-forecasting (🥈32 · ⭐ 4.6K) - Time series forecasting with PyTorch. MIT - [GitHub](https://github.com/sktime/pytorch-forecasting) (👨‍💻 79 · 🔀 710 · 📦 670 · 📋 920 - 59% open · ⏱️ 19.10.2025):
git clone https://github.com/sktime/pytorch-forecasting
- [PyPi](https://pypi.org/project/pytorch-forecasting) (📥 270K / month · 📦 27 · ⏱️ 10.10.2025):
pip install pytorch-forecasting
- [Conda](https://anaconda.org/conda-forge/pytorch-forecasting) (📥 87K · ⏱️ 05.07.2025):
conda install -c conda-forge pytorch-forecasting
pmdarima (🥈32 · ⭐ 1.7K · 💤) - A statistical library designed to fill the void in Python's time.. MIT - [GitHub](https://github.com/alkaline-ml/pmdarima) (👨‍💻 23 · 🔀 250 · 📦 13K · 📋 340 - 19% open · ⏱️ 07.11.2024):
git clone https://github.com/alkaline-ml/pmdarima
- [PyPi](https://pypi.org/project/pmdarima) (📥 7.5M / month · 📦 150 · ⏱️ 23.10.2023):
pip install pmdarima
- [Conda](https://anaconda.org/conda-forge/pmdarima) (📥 1.4M · ⏱️ 22.04.2025):
conda install -c conda-forge pmdarima
tsfresh (🥈31 · ⭐ 9K) - Automatic extraction of relevant features from time series. MIT - [GitHub](https://github.com/blue-yonder/tsfresh) (👨‍💻 100 · 🔀 1.3K · 📦 21 · 📋 550 - 12% open · ⏱️ 30.08.2025):
git clone https://github.com/blue-yonder/tsfresh
- [PyPi](https://pypi.org/project/tsfresh) (📥 340K / month · 📦 120 · ⏱️ 30.08.2025):
pip install tsfresh
- [Conda](https://anaconda.org/conda-forge/tsfresh) (📥 1.5M · ⏱️ 31.08.2025):
conda install -c conda-forge tsfresh
STUMPY (🥈30 · ⭐ 4K) - STUMPY is a powerful and scalable Python library for modern time series.. BSD-3 - [GitHub](https://github.com/stumpy-dev/stumpy) (👨‍💻 41 · 🔀 340 · 📦 1.6K · 📋 540 - 13% open · ⏱️ 02.09.2025):
git clone https://github.com/stumpy-dev/stumpy
- [PyPi](https://pypi.org/project/stumpy) (📥 380K / month · 📦 30 · ⏱️ 09.07.2024):
pip install stumpy
- [Conda](https://anaconda.org/conda-forge/stumpy) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge stumpy
NeuralForecast (🥈30 · ⭐ 3.8K) - Scalable and user friendly neural forecasting algorithms. Apache-2 - [GitHub](https://github.com/Nixtla/neuralforecast) (👨‍💻 55 · 🔀 450 · 📋 650 - 17% open · ⏱️ 01.10.2025):
git clone https://github.com/Nixtla/neuralforecast
- [PyPi](https://pypi.org/project/neuralforecast) (📥 160K / month · 📦 30 · ⏱️ 01.10.2025):
pip install neuralforecast
- [Conda](https://anaconda.org/conda-forge/neuralforecast) (📥 47K · ⏱️ 06.10.2025):
conda install -c conda-forge neuralforecast
GluonTS (🥈29 · ⭐ 5K) - Probabilistic time series modeling in Python. Apache-2 - [GitHub](https://github.com/awslabs/gluonts) (👨‍💻 120 · 🔀 790 · 📋 970 - 34% open · ⏱️ 14.08.2025):
git clone https://github.com/awslabs/gluonts
- [PyPi](https://pypi.org/project/gluonts) (📥 1.9M / month · 📦 41 · ⏱️ 27.06.2025):
pip install gluonts
- [Conda](https://anaconda.org/anaconda/gluonts) (📥 3.2K · ⏱️ 22.04.2025):
conda install -c anaconda gluonts
Streamz (🥉28 · ⭐ 1.3K · 💤) - Real-time stream processing for python. BSD-3 - [GitHub](https://github.com/python-streamz/streamz) (👨‍💻 49 · 🔀 150 · 📦 570 · 📋 270 - 44% open · ⏱️ 22.11.2024):
git clone https://github.com/python-streamz/streamz
- [PyPi](https://pypi.org/project/streamz) (📥 26K / month · 📦 57 · ⏱️ 27.07.2022):
pip install streamz
- [Conda](https://anaconda.org/conda-forge/streamz) (📥 2.9M · ⏱️ 22.04.2025):
conda install -c conda-forge streamz
pyts (🥉27 · ⭐ 1.9K) - A Python package for time series classification. BSD-3 - [GitHub](https://github.com/johannfaouzi/pyts) (👨‍💻 15 · 🔀 180 · 📦 900 · 📋 88 - 59% open · ⏱️ 18.06.2025):
git clone https://github.com/johannfaouzi/pyts
- [PyPi](https://pypi.org/project/pyts) (📥 190K / month · 📦 45 · ⏱️ 18.06.2023):
pip install pyts
- [Conda](https://anaconda.org/conda-forge/pyts) (📥 35K · ⏱️ 22.04.2025):
conda install -c conda-forge pyts
TSFEL (🥉26 · ⭐ 1.1K) - An intuitive library to extract features from time series. BSD-3 - [GitHub](https://github.com/fraunhoferportugal/tsfel) (👨‍💻 21 · 🔀 150 · 📦 220 · 📋 87 - 5% open · ⏱️ 20.08.2025):
git clone https://github.com/fraunhoferportugal/tsfel
- [PyPi](https://pypi.org/project/tsfel) (📥 9.4K / month · 📦 14 · ⏱️ 20.08.2025):
pip install tsfel
greykite (🥉22 · ⭐ 1.8K · 💤) - A flexible, intuitive and fast forecasting library. BSD-2 - [GitHub](https://github.com/linkedin/greykite) (👨‍💻 10 · 🔀 110 · 📥 39 · 📦 47 · 📋 110 - 11% open · ⏱️ 20.02.2025):
git clone https://github.com/linkedin/greykite
- [PyPi](https://pypi.org/project/greykite) (📥 11K / month · ⏱️ 20.02.2025):
pip install greykite
Show 13 hidden projects...
- NeuralProphet (🥉26 · ⭐ 4.2K · 💀) - NeuralProphet: A simple forecasting package. MIT
- PyFlux (🥉25 · ⭐ 2.1K · 💀) - Open source time series library for Python. BSD-3
- luminol (🥉22 · ⭐ 1.2K · 💀) - Anomaly Detection and Correlation library. Apache-2
- ADTK (🥉22 · ⭐ 1.2K · 💀) - A Python toolkit for rule-based/unsupervised anomaly detection in.. MPL-2.0
- seglearn (🥉21 · ⭐ 580 · 💀) - Python module for machine learning time series:. BSD-3
- pydlm (🥉21 · ⭐ 480 · 💀) - A python library for Bayesian time series modeling. BSD-3
- tick (🥉20 · ⭐ 520 · 💀) - Module for statistical learning, with a particular emphasis on time-.. BSD-3
- matrixprofile-ts (🥉19 · ⭐ 740 · 💀) - A Python library for detecting patterns and anomalies.. Apache-2
- tsflex (🥉19 · ⭐ 430 · 💀) - Flexible time series feature extraction & processing. MIT
- Auto TS (🥉17 · ⭐ 760 · 💀) - Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost.. Apache-2
- tsaug (🥉15 · ⭐ 360 · 💀) - A Python package for time series augmentation. Apache-2
- atspy (🥉14 · ⭐ 520 · 💀) - AtsPy: Automated Time Series Models in Python (by @firmai). MIT
- tslumen (🥉8 · ⭐ 71 · 💀) - A library for Time Series EDA (exploratory data analysis). Apache-2
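
To make the category's two headline tasks concrete, here is a minimal standard-library sketch of a seasonal-naive forecast and a rolling z-score anomaly check. Both are common baselines rather than the method of any library listed here; the function names, the window size, the threshold, and the data are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def seasonal_naive_forecast(series, season_length, horizon):
    """Repeat the last observed season to forecast `horizon` steps ahead."""
    last_season = series[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up series with period 4 and one injected spike at index 5.
series = [10, 12, 14, 12, 10, 40, 14, 12, 10, 12, 14, 12]
forecast = seasonal_naive_forecast(series, season_length=4, horizon=6)
anomalies = rolling_zscore_anomalies(series)
print("forecast:", forecast)
print("anomaly indices:", anomalies)
```

A baseline like this is mainly useful as a yardstick: a forecasting model that cannot beat the seasonal-naive forecast on held-out data is usually not worth its complexity.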


Medical Data

Libraries for processing and analyzing medical data such as MRIs, EEGs, genomic data, and other medical imaging formats.

Nilearn (🥇38 · ⭐ 1.3K) - Machine learning for NeuroImaging in Python. BSD-3 - [GitHub](https://github.com/nilearn/nilearn) (👨‍💻 260 · 🔀 610 · 📥 410 · 📦 4.4K · 📋 2.4K - 12% open · ⏱️ 30.10.2025):
git clone https://github.com/nilearn/nilearn
- [PyPi](https://pypi.org/project/nilearn) (📥 270K / month · 📦 350 · ⏱️ 03.09.2025):
pip install nilearn
- [Conda](https://anaconda.org/conda-forge/nilearn) (📥 400K · ⏱️ 04.09.2025):
conda install -c conda-forge nilearn
MONAI (🥇37 · ⭐ 7K) - AI Toolkit for Healthcare Imaging. Apache-2 - [GitHub](https://github.com/Project-MONAI/MONAI) (👨‍💻 240 · 🔀 1.3K · 📦 4.5K · 📋 3.3K - 14% open · ⏱️ 10.10.2025):
git clone https://github.com/Project-MONAI/MONAI
- [PyPi](https://pypi.org/project/monai) (📥 320K / month · 📦 200 · ⏱️ 22.09.2025):
pip install monai
- [Conda](https://anaconda.org/conda-forge/monai) (📥 60K · ⏱️ 22.09.2025):
conda install -c conda-forge monai
MNE (🥇37 · ⭐ 3.1K) - MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python. BSD-3 - [GitHub](https://github.com/mne-tools/mne-python) (👨‍💻 410 · 🔀 1.4K · 📋 5.1K - 11% open · ⏱️ 29.10.2025):
git clone https://github.com/mne-tools/mne-python
- [PyPi](https://pypi.org/project/mne) (📥 280K / month · 📦 530 · ⏱️ 14.10.2025):
pip install mne
- [Conda](https://anaconda.org/conda-forge/mne) (📥 620K · ⏱️ 14.10.2025):
conda install -c conda-forge mne
Hail (🥈34 · ⭐ 1K) - Cloud-native genomic dataframes and batch computing. MIT - [GitHub](https://github.com/hail-is/hail) (👨‍💻 100 · 🔀 260 · 📦 170 · 📋 2.6K - 11% open · ⏱️ 29.10.2025):
git clone https://github.com/hail-is/hail
- [PyPi](https://pypi.org/project/hail) (📥 110K / month · 📦 44 · ⏱️ 09.09.2025):
pip install hail
NiBabel (🥈34 · ⭐ 740) - Python package to access a cacophony of neuro-imaging file formats. MIT - [GitHub](https://github.com/nipy/nibabel) (👨‍💻 110 · 🔀 260 · 📦 30K · 📋 550 - 23% open · ⏱️ 21.10.2025):
git clone https://github.com/nipy/nibabel
- [PyPi](https://pypi.org/project/nibabel) (📥 910K / month · 📦 1.2K · ⏱️ 23.10.2024):
pip install nibabel
- [Conda](https://anaconda.org/conda-forge/nibabel) (📥 1M · ⏱️ 22.04.2025):
conda install -c conda-forge nibabel
NIPYPE (🥈33 · ⭐ 790) - Workflows and interfaces for neuroimaging packages. Apache-2 - [GitHub](https://github.com/nipy/nipype) (👨‍💻 260 · 🔀 530 · 📦 7.2K · 📋 1.4K - 30% open · ⏱️ 28.04.2025):
git clone https://github.com/nipy/nipype
- [PyPi](https://pypi.org/project/nipype) (📥 360K / month · 📦 150 · ⏱️ 19.03.2025):
pip install nipype
- [Conda](https://anaconda.org/conda-forge/nipype) (📥 990K · ⏱️ 05.05.2025):
conda install -c conda-forge nipype
Lifelines (🥈32 · ⭐ 2.5K · 💤) - Survival analysis in Python. MIT - [GitHub](https://github.com/CamDavidsonPilon/lifelines) (👨‍💻 120 · 🔀 560 · 📦 4.2K · 📋 980 - 27% open · ⏱️ 29.10.2024):
git clone https://github.com/CamDavidsonPilon/lifelines
- [PyPi](https://pypi.org/project/lifelines) (📥 1.5M / month · 📦 160 · ⏱️ 29.10.2024):
pip install lifelines
- [Conda](https://anaconda.org/conda-forge/lifelines) (📥 500K · ⏱️ 22.04.2025):
conda install -c conda-forge lifelines
DeepVariant (🥉27 · ⭐ 3.5K) - DeepVariant is an analysis pipeline that uses a deep neural.. BSD-3 - [GitHub](https://github.com/google/deepvariant) (👨‍💻 41 · 🔀 760 · 📥 4.9K · 📦 4 · 📋 960 - 0% open · ⏱️ 10.09.2025):
git clone https://github.com/google/deepvariant
- [Conda](https://anaconda.org/bioconda/deepvariant) (📥 79K · ⏱️ 24.05.2025):
conda install -c bioconda deepvariant
Brainiak (🥉19 · ⭐ 360 · 💤) - Brain Imaging Analysis Kit. Apache-2 - [GitHub](https://github.com/brainiak/brainiak) (👨‍💻 35 · 🔀 140 · 📋 230 - 38% open · ⏱️ 06.01.2025):
git clone https://github.com/brainiak/brainiak
- [PyPi](https://pypi.org/project/brainiak) (📥 1.3K / month · ⏱️ 07.01.2025):
pip install brainiak
- [Docker Hub](https://hub.docker.com/r/brainiak/brainiak) (📥 2K · ⭐ 1 · ⏱️ 07.01.2025):
docker pull brainiak/brainiak
Show 10 hidden projects...
- DIPY (🥈31 · ⭐ 790) - DIPY is the paragon 3D/4D+ medical imaging library in Python... ❗Unlicensed
- NiftyNet (🥉24 · ⭐ 1.4K · 💀) - [unmaintained] An open-source convolutional neural.. Apache-2
- NIPY (🥉24 · ⭐ 400 · 💤) - Neuroimaging in Python FMRI analysis package. ❗Unlicensed
- MedPy (🥉23 · ⭐ 610 · 💀) - Medical image processing in Python. ❗️GPL-3.0
- DLTK (🥉20 · ⭐ 1.4K · 💀) - Deep Learning Toolkit for Medical Image Analysis. Apache-2
- Glow (🥉19 · ⭐ 290 · 💤) - An open-source toolkit for large-scale genomic analysis. Apache-2
- MedicalTorch (🥉17 · ⭐ 870 · 💀) - A medical imaging framework for Pytorch. Apache-2
- Medical Detection Toolkit (🥉14 · ⭐ 1.3K · 💀) - The Medical Detection Toolkit contains 2D + 3D.. Apache-2
- DeepNeuro (🥉14 · ⭐ 130 · 💀) - A deep learning python package for neuroimaging data. Made by:. MIT
- MedicalNet (🥉12 · ⭐ 2.1K · 💀) - Many studies have shown that the performance on deep learning is.. MIT
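
Survival analysis, which Lifelines is listed for above, can be illustrated with the Kaplan-Meier estimator. The sketch below is a plain-Python version written for this list, not Lifelines' API; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until event or censoring for each subject.
    observed:  True if the event occurred, False if the subject was censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if events:
            # Multiply by the fraction of at-risk subjects surviving time t.
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Subjects with an event or censoring at t leave the risk set.
        at_risk -= leaving
    return curve


# Made-up follow-up times in months; False marks a censored subject.
durations = [2, 3, 3, 5, 8, 8, 12]
observed = [True, True, False, True, True, False, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored subjects still count as at risk at their own censoring time, which is the usual convention for this estimator.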


Tabular Data

Libraries for processing tabular and structured data.

skrub (🥇30 · ⭐ 1.5K) - Machine learning with dataframes. BSD-3 - [GitHub](https://github.com/skrub-data/skrub) (👨‍💻 82 · 🔀 170 · 📦 100 · 📋 580 - 21% open · ⏱️ 30.10.2025):
git clone https://github.com/skrub-data/skrub
- [PyPi](https://pypi.org/project/skrub) (📥 45K / month · 📦 20 · ⏱️ 25.09.2025):
pip install skrub
pytorch_tabular (🥈23 · ⭐ 1.6K) - A standard framework for modelling Deep Learning Models.. MIT - [GitHub](https://github.com/manujosephv/pytorch_tabular) (👨‍💻 27 · 🔀 160 · 📥 64 · 📋 180 - 5% open · ⏱️ 19.04.2025):
git clone https://github.com/manujosephv/pytorch_tabular
- [PyPi](https://pypi.org/project/pytorch_tabular) (📥 5.5K / month · 📦 9 · ⏱️ 28.11.2024):
pip install pytorch_tabular
upgini (🥈21 · ⭐ 350) - Data search & enrichment library for Machine Learning Easily find and add.. BSD-3 - [GitHub](https://github.com/upgini/upgini) (👨‍💻 14 · 🔀 25 · 📦 9 · ⏱️ 28.10.2025):
git clone https://github.com/upgini/upgini
- [PyPi](https://pypi.org/project/upgini) (📥 5.9K / month · ⏱️ 28.10.2025):
pip install upgini
Show 3 hidden projects...
- miceforest (🥈21 · ⭐ 390) - Multiple Imputation with LightGBM in Python. ❗Unlicensed
- carefree-learn (🥉18 · ⭐ 410 · 💀) - Deep Learning PyTorch. MIT
- deltapy (🥉13 · ⭐ 550 · 💀) - DeltaPy - Tabular Data Augmentation (by @firmai). MIT


Optical Character Recognition

Libraries for optical character recognition (OCR) and text extraction from images or videos.

PaddleOCR (🥇44 · ⭐ 62K) - Turn any PDF or image document into structured data for your.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PaddleOCR) (👨‍💻 320 · 🔀 9.2K · 📥 2M · 📦 6.2K · 📋 10K - 1% open · ⏱️ 30.10.2025):
git clone https://github.com/PaddlePaddle/PaddleOCR
- [PyPi](https://pypi.org/project/paddleocr) (📥 750K / month · 📦 210 · ⏱️ 29.10.2025):
pip install paddleocr
OCRmyPDF (🥇37 · ⭐ 32K) - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them.. MPL-2.0 - [GitHub](https://github.com/ocrmypdf/OCRmyPDF) (👨‍💻 120 · 🔀 2.2K · 📥 15K · 📦 1.4K · 📋 1.3K - 11% open · ⏱️ 25.10.2025):
git clone https://github.com/ocrmypdf/OCRmyPDF
- [PyPi](https://pypi.org/project/ocrmypdf) (📥 400K / month · 📦 58 · ⏱️ 16.10.2025):
pip install ocrmypdf
- [Conda](https://anaconda.org/conda-forge/ocrmypdf) (📥 110K · ⏱️ 22.04.2025):
conda install -c conda-forge ocrmypdf
Tesseract (🥈32 · ⭐ 6.2K · 💤) - Python-tesseract is an optical character recognition (OCR).. Apache-2 - [GitHub](https://github.com/madmaze/pytesseract) (👨‍💻 50 · 🔀 730 · 📋 370 - 3% open · ⏱️ 17.02.2025):
git clone https://github.com/madmaze/pytesseract
- [PyPi](https://pypi.org/project/pytesseract) (📥 5.6M / month · 📦 970 · ⏱️ 16.08.2024):
pip install pytesseract
- [Conda](https://anaconda.org/conda-forge/pytesseract) (📥 690K · ⏱️ 22.04.2025):
conda install -c conda-forge pytesseract
tesserocr (🥈32 · ⭐ 2.1K) - A Python wrapper for the tesseract-ocr API. MIT - [GitHub](https://github.com/sirfz/tesserocr) (👨‍💻 34 · 🔀 260 · 📥 1.2K · 📦 1.3K · 📋 290 - 14% open · ⏱️ 10.10.2025):
git clone https://github.com/sirfz/tesserocr
- [PyPi](https://pypi.org/project/tesserocr) (📥 210K / month · 📦 56 · ⏱️ 10.10.2025):
pip install tesserocr
- [Conda](https://anaconda.org/conda-forge/tesserocr) (📥 290K · ⏱️ 22.04.2025):
conda install -c conda-forge tesserocr
MMOCR (🥉27 · ⭐ 4.7K · 💤) - OpenMMLab Text Detection, Recognition and Understanding Toolbox. Apache-2 - [GitHub](https://github.com/open-mmlab/mmocr) (👨‍💻 90 · 🔀 770 · 📦 240 · 📋 930 - 20% open · ⏱️ 27.11.2024):
git clone https://github.com/open-mmlab/mmocr
- [PyPi](https://pypi.org/project/mmocr) (📥 6K / month · 📦 4 · ⏱️ 05.05.2022):
pip install mmocr
keras-ocr (🥉25 · ⭐ 1.5K) - A packaged and flexible version of the CRAFT text detector and.. MIT - [GitHub](https://github.com/faustomorales/keras-ocr) (👨‍💻 19 · 🔀 340 · 📥 2.1M · 📦 720 · 📋 220 - 46% open · ⏱️ 22.09.2025):
git clone https://github.com/faustomorales/keras-ocr
- [PyPi](https://pypi.org/project/keras-ocr) (📥 18K / month · 📦 8 · ⏱️ 06.11.2023):
pip install keras-ocr
- [Conda](https://anaconda.org/anaconda/keras-ocr) (📥 450 · ⏱️ 22.04.2025):
conda install -c anaconda keras-ocr
Show 6 hidden projects...
- EasyOCR (🥈34 · ⭐ 28K · 💀) - Ready-to-use OCR with 80+ supported languages and all popular.. Apache-2
- calamari (🥉22 · ⭐ 1.2K) - Line based ATR Engine based on OCRopy. ❗️GPL-3.0
- pdftabextract (🥉21 · ⭐ 2.2K · 💀) - A set of tools for extracting tables from PDF files.. Apache-2
- attention-ocr (🥉21 · ⭐ 1.1K · 💀) - A Tensorflow model for text recognition (CNN + seq2seq.. MIT
- doc2text (🥉20 · ⭐ 1.3K · 💀) - Detect text blocks and OCR poorly scanned PDFs in bulk. Python.. MIT
- Mozart (🥉10 · ⭐ 690 · 💀) - An optical music recognition (OMR) system. Converts sheet.. Apache-2


Data Containers & Structures

General-purpose data containers & structures as well as utilities & extensions for pandas.

🔗 best-of-python - Data Containers ( ⭐ 4.2K) - Collection of data-container, dataframe, and pandas-..


Data Loading & Extraction

Libraries for loading, collecting, and extracting data from a variety of data sources and formats.

🔗 best-of-python - Data Extraction ( ⭐ 4.2K) - Collection of data-loading and -extraction libraries.


Web Scraping & Crawling

Libraries for web scraping, crawling, downloading, and mining.

🔗 best-of-web-python - Web Scraping ( ⭐ 2.6K) - Collection of web-scraping and crawling libraries.


Data Pipelines & Streaming

Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.

🔗 best-of-python - Data Pipelines ( ⭐ 4.2K) - Libraries for data batch- and stream-processing,..

Show 1 hidden project...
- pyclugen (🥇10 · ⭐ 10) - Multidimensional cluster generation in Python. MIT


Distributed Machine Learning

Libraries that provide capabilities to distribute and parallelize machine learning tasks across large-scale compute infrastructure.

Ray (🥇48 · ⭐ 40K) - Ray is an AI compute engine. Ray consists of a core distributed runtime.. Apache-2 - [GitHub](https://github.com/ray-project/ray) (👨‍💻 1.4K · 🔀 6.8K · 📥 270 · 📦 27K · 📋 22K - 14% open · ⏱️ 30.10.2025):
git clone https://github.com/ray-project/ray
- [PyPi](https://pypi.org/project/ray) (📥 30M / month · 📦 1.1K · ⏱️ 29.10.2025):
pip install ray
- [Conda](https://anaconda.org/conda-forge/ray-tune) (📥 920K · ⏱️ 22.10.2025):
conda install -c conda-forge ray-tune
dask (🥇45 · ⭐ 14K · 📈) - Parallel computing with task scheduling. BSD-3 - [GitHub](https://github.com/dask/dask) (👨‍💻 630 · 🔀 1.8K · 📦 77K · 📋 5.6K - 21% open · ⏱️ 29.10.2025):
git clone https://github.com/dask/dask
- [PyPi](https://pypi.org/project/dask) (📥 20M / month · 📦 3.2K · ⏱️ 14.10.2025):
pip install dask
- [Conda](https://anaconda.org/conda-forge/dask) (📥 14M · ⏱️ 14.10.2025):
conda install -c conda-forge dask
DeepSpeed (🥇41 · ⭐ 41K) - DeepSpeed is a deep learning optimization library that makes.. Apache-2 - [GitHub](https://github.com/deepspeedai/DeepSpeed) (👨‍💻 420 · 🔀 4.6K · 📦 15K · 📋 3.2K - 34% open · ⏱️ 29.10.2025):
git clone https://github.com/deepspeedai/DeepSpeed
- [PyPi](https://pypi.org/project/deepspeed) (📥 990K / month · 📦 350 · ⏱️ 23.10.2025):
pip install deepspeed
- [Docker Hub](https://hub.docker.com/r/deepspeed/deepspeed) (📥 24K · ⭐ 4 · ⏱️ 02.09.2022):
docker pull deepspeed/deepspeed
dask.distributed (🥇39 · ⭐ 1.7K) - A distributed task scheduler for Dask. BSD-3 - [GitHub](https://github.com/dask/distributed) (👨‍💻 340 · 🔀 740 · 📦 42K · 📋 3.9K - 37% open · ⏱️ 28.10.2025):
git clone https://github.com/dask/distributed
- [PyPi](https://pypi.org/project/distributed) (📥 5.4M / month · 📦 1K · ⏱️ 14.10.2025):
pip install distributed
- [Conda](https://anaconda.org/conda-forge/distributed) (📥 20M · ⏱️ 14.10.2025):
conda install -c conda-forge distributed
horovod (🥈36 · ⭐ 15K) - Distributed training framework for TensorFlow, Keras, PyTorch, and.. Apache-2 - [GitHub](https://github.com/horovod/horovod) (👨‍💻 180 · 🔀 2.3K · 📦 1.4K · 📋 2.3K - 17% open · ⏱️ 28.10.2025):
git clone https://github.com/horovod/horovod
- [PyPi](https://pypi.org/project/horovod) (📥 78K / month · 📦 34 · ⏱️ 12.06.2023):
pip install horovod
metrics (🥈36 · ⭐ 2.3K) - Machine learning metrics for distributed, scalable PyTorch.. Apache-2 - [GitHub](https://github.com/Lightning-AI/torchmetrics) (👨‍💻 280 · 🔀 460 · 📥 6.9K · 📦 45K · 📋 990 - 8% open · ⏱️ 27.10.2025):
git clone https://github.com/Lightning-AI/torchmetrics
- [PyPi](https://pypi.org/project/torchmetrics):
pip install torchmetrics
- [Conda](https://anaconda.org/conda-forge/torchmetrics) (📥 2.2M · ⏱️ 03.09.2025):
conda install -c conda-forge torchmetrics
H2O-3 (🥈34 · ⭐ 7.3K) - H2O is an Open Source, Distributed, Fast & Scalable Machine Learning.. Apache-2 - [GitHub](https://github.com/h2oai/h2o-3) (👨‍💻 280 · 🔀 2K · 📦 99 · 📋 9.6K - 30% open · ⏱️ 21.10.2025):
git clone https://github.com/h2oai/h2o-3
- [PyPi](https://pypi.org/project/h2o) (📥 180K / month · 📦 68 · ⏱️ 08.10.2025):
pip install h2o
ColossalAI (🥈33 · ⭐ 41K) - Making large AI models cheaper, faster and more accessible. Apache-2 - [GitHub](https://github.com/hpcaitech/ColossalAI) (👨‍💻 200 · 🔀 4.5K · 📦 530 · 📋 1.8K - 26% open · ⏱️ 26.09.2025):
git clone https://github.com/hpcaitech/ColossalAI
mpi4py (🥈33 · ⭐ 880) - Python bindings for MPI. BSD-3 - [GitHub](https://github.com/mpi4py/mpi4py) (👨‍💻 28 · 🔀 130 · 📥 39K · 📦 12K · 📋 230 - 2% open · ⏱️ 29.10.2025):
git clone https://github.com/mpi4py/mpi4py
- [PyPi](https://pypi.org/project/mpi4py) (📥 980K / month · 📦 1K · ⏱️ 10.10.2025):
pip install mpi4py
- [Conda](https://anaconda.org/conda-forge/mpi4py) (📥 4.5M · ⏱️ 14.10.2025):
conda install -c conda-forge mpi4py
FairScale (🥈31 · ⭐ 3.4K) - PyTorch extensions for high performance and large scale training. BSD-3 - [GitHub](https://github.com/facebookresearch/fairscale) (👨‍💻 77 · 🔀 290 · 📦 9K · 📋 390 - 26% open · ⏱️ 26.04.2025):
git clone https://github.com/facebookresearch/fairscale
- [PyPi](https://pypi.org/project/fairscale) (📥 530K / month · 📦 150 · ⏱️ 11.12.2022):
pip install fairscale
- [Conda](https://anaconda.org/conda-forge/fairscale) (📥 530K · ⏱️ 22.04.2025):
conda install -c conda-forge fairscale
Submit it (🥈31 · ⭐ 1.5K) - Python 3.8+ toolbox for submitting jobs to Slurm. MIT - [GitHub](https://github.com/facebookincubator/submitit) (👨‍💻 26 · 🔀 140 · 📦 4.7K · 📋 130 - 38% open · ⏱️ 21.05.2025):
git clone https://github.com/facebookincubator/submitit
- [PyPi](https://pypi.org/project/submitit) (📥 840K / month · 📦 74 · ⏱️ 21.05.2025):
pip install submitit
- [Conda](https://anaconda.org/conda-forge/submitit) (📥 65K · ⏱️ 22.04.2025):
conda install -c conda-forge submitit
BigDL (🥈30 · ⭐ 8.4K) - Accelerate local LLM inference and finetuning (LLaMA, Mistral,.. Apache-2 - [GitHub](https://github.com/intel/ipex-llm) (👨‍💻 120 · 🔀 1.4K · 📥 710 · 📋 3K - 40% open · ⏱️ 14.10.2025):
git clone https://github.com/intel/ipex-llm
- [PyPi](https://pypi.org/project/bigdl) (📥 15K / month · 📦 2 · ⏱️ 24.03.2024):
pip install bigdl
- [Maven](https://search.maven.org/artifact/com.intel.analytics.bigdl/bigdl-SPARK_2.4) (📦 5 · ⏱️ 20.04.2021):
<dependency>
    <groupId>com.intel.analytics.bigdl</groupId>
    <artifactId>bigdl-SPARK_2.4</artifactId>
    <version>[VERSION]</version>
</dependency>
SynapseML (🥈30 · ⭐ 5.2K) - Simple and Distributed Machine Learning. MIT - [GitHub](https://github.com/microsoft/SynapseML) (👨‍💻 130 · 🔀 850 · 📋 820 - 49% open · ⏱️ 29.10.2025):
git clone https://github.com/microsoft/SynapseML
- [PyPi](https://pypi.org/project/synapseml) (📥 1.6M / month · 📦 7 · ⏱️ 03.10.2025):
pip install synapseml
petastorm (🥈29 · ⭐ 1.9K) - Petastorm library enables single machine or distributed training.. Apache-2 - [GitHub](https://github.com/uber/petastorm) (👨‍💻 52 · 🔀 280 · 📥 580 · 📦 390 · 📋 330 - 54% open · ⏱️ 15.09.2025):
git clone https://github.com/uber/petastorm
- [PyPi](https://pypi.org/project/petastorm) (📥 270K / month · 📦 15 · ⏱️ 11.08.2025):
pip install petastorm
dask-ml (🥉28 · ⭐ 940) - Scalable Machine Learning with Dask. BSD-3 - [GitHub](https://github.com/dask/dask-ml) (👨‍💻 82 · 🔀 260 · 📦 1.3K · 📋 550 - 51% open · ⏱️ 27.09.2025):
git clone https://github.com/dask/dask-ml
- [PyPi](https://pypi.org/project/dask-ml) (📥 120K / month · 📦 100 · ⏱️ 08.02.2025):
pip install dask-ml
- [Conda](https://anaconda.org/conda-forge/dask-ml) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge dask-ml
Hivemind (🥉26 · ⭐ 2.3K) - Decentralized deep learning in PyTorch. Built to train models on.. MIT - [GitHub](https://github.com/learning-at-home/hivemind) (👨‍💻 34 · 🔀 190 · 📦 130 · 📋 190 - 43% open · ⏱️ 12.10.2025):
git clone https://github.com/learning-at-home/hivemind
- [PyPi](https://pypi.org/project/hivemind) (📥 43K / month · 📦 12 · ⏱️ 20.04.2025):
pip install hivemind
MMLSpark (🥉23 · ⭐ 5.2K) - Simple and Distributed Machine Learning. MIT - [GitHub](https://github.com/microsoft/SynapseML) (👨‍💻 130 · 🔀 850 · 📋 820 - 49% open · ⏱️ 29.10.2025):
git clone https://github.com/microsoft/SynapseML
- [PyPi](https://pypi.org/project/mmlspark) (⏱️ 18.03.2020):
pip install mmlspark
Apache Singa (🥉23 · ⭐ 3.6K · 💤) - a distributed deep learning platform. Apache-2 - [GitHub](https://github.com/apache/singa) (👨‍💻 98 · 🔀 1.3K · 📦 6 · 📋 140 - 35% open · ⏱️ 26.03.2025):
git clone https://github.com/apache/singa
- [Conda](https://anaconda.org/nusdbsystem/singa) (📥 1.2K · ⏱️ 25.03.2025):
conda install -c nusdbsystem singa
- [Docker Hub](https://hub.docker.com/r/apache/singa) (📥 9.9K · ⭐ 3 · ⏱️ 31.05.2022):
docker pull apache/singa
analytics-zoo (🥉22 · ⭐ 2.6K) - Distributed Tensorflow, Keras and PyTorch on Apache.. Apache-2 - [GitHub](https://github.com/intel/analytics-zoo) (👨‍💻 110 · 🔀 730 · 📋 1.3K - 32% open · ⏱️ 09.10.2025):
git clone https://github.com/intel/analytics-zoo
- [PyPi](https://pypi.org/project/analytics-zoo) (📥 600 / month · 📦 1 · ⏱️ 22.08.2022):
pip install analytics-zoo
Show 17 hidden projects...
- DEAP (🥈34 · ⭐ 6.2K) - Distributed Evolutionary Algorithms in Python. ❗️LGPL-3.0
- ipyparallel (🥈29 · ⭐ 2.6K) - IPython Parallel: Interactive Parallel Computing in.. ❗Unlicensed
- TensorFlowOnSpark (🥉25 · ⭐ 3.9K · 💀) - TensorFlowOnSpark brings TensorFlow programs to.. Apache-2
- Elephas (🥉24 · ⭐ 1.6K · 💀) - Distributed Deep learning with Keras & Spark. MIT keras
- Mesh (🥉23 · ⭐ 1.6K · 💀) - Mesh TensorFlow: Model Parallelism Made Easier. Apache-2
- BytePS (🥉21 · ⭐ 3.7K · 💀) - A high performance and generic framework for distributed DNN.. Apache-2
- somoclu (🥉21 · ⭐ 280 · 💀) - Massively parallel self-organizing maps: accelerate training on.. MIT
- TensorFrames (🥉19 · ⭐ 750 · 💀) - [DEPRECATED] Tensorflow wrapper for DataFrames on.. Apache-2
- sk-dist (🥉19 · ⭐ 290 · 💀) - Distributed scikit-learn meta-estimators in PySpark. Apache-2
- mesh-transformer-jax (🥉18 · ⭐ 6.4K · 💀) - Model parallel transformers in JAX and Haiku. Apache-2
- launchpad (🥉18 · ⭐ 330 · 💀) - Launchpad is a library that simplifies writing.. Apache-2
- Fiber (🥉17 · ⭐ 1K · 💀) - Distributed Computing for AI Made Simple. Apache-2
- bluefog (🥉17 · ⭐ 290 · 💀) - Distributed and decentralized training framework for PyTorch.. Apache-2
- parallelformers (🥉16 · ⭐ 790 · 💀) - Parallelformers: An Efficient Model Parallelization.. Apache-2
- LazyCluster (🥉13 · ⭐ 49 · 💀) - Distributed machine learning made simple. Apache-2
- autodist (🥉12 · ⭐ 130 · 💀) - Simple Distributed Deep Learning on TensorFlow. Apache-2
- moolib (🥉11 · ⭐ 370 · 💀) - A library for distributed ML training with PyTorch. MIT
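
A common pattern behind distributed training is synchronous data parallelism: each worker computes a gradient on its own data shard, the gradients are averaged, and every worker applies the same update. The sketch below simulates that loop in a single process with plain Python. The helper names (`local_gradient`, `all_reduce_mean`, `train`), the learning rate, and the data are assumptions for illustration, and `all_reduce_mean` is only a stand-in for the collective communication a real framework would perform.

```python
def local_gradient(w, b, shard):
    """Gradient of mean squared error for y ~ w*x + b on one data shard."""
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def all_reduce_mean(gradients):
    """Average per-worker gradients (stand-in for a real all-reduce)."""
    k = len(gradients)
    return tuple(sum(g[i] for g in gradients) / k for i in range(2))


def train(shards, steps=2000, lr=0.01):
    """Run synchronous data-parallel gradient descent over simulated workers."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Each simulated worker computes a gradient on its own shard.
        grads = [local_gradient(w, b, shard) for shard in shards]
        # All workers then apply the same averaged update.
        grad_w, grad_b = all_reduce_mean(grads)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Made-up data following y = 3x + 1, split evenly across 4 simulated workers.
data = [(x, 3 * x + 1) for x in range(8)]
shards = [data[i::4] for i in range(4)]
w, b = train(shards)
print(f"w={w:.3f}, b={b:.3f}")
```

Because the shards are equal in size, the averaged gradient equals the full-batch gradient, so the result matches single-process training on the whole dataset.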


Hyperparameter Optimization & AutoML

Libraries for hyperparameter optimization, AutoML, and neural architecture search.

Optuna (🥇44 · ⭐ 13K) - A hyperparameter optimization framework. MIT - [GitHub](https://github.com/optuna/optuna) (👨‍💻 320 · 🔀 1.2K · 📦 30K · 📋 1.8K - 3% open · ⏱️ 30.10.2025):
git clone https://github.com/optuna/optuna
- [PyPi](https://pypi.org/project/optuna) (📥 7.6M / month · 📦 1.5K · ⏱️ 18.08.2025):
pip install optuna
- [Conda](https://anaconda.org/conda-forge/optuna) (📥 3.4M · ⏱️ 19.08.2025):
conda install -c conda-forge optuna
Ax (🥇36 · ⭐ 2.6K) - Adaptive Experimentation Platform. MIT - [GitHub](https://github.com/facebook/Ax) (👨‍💻 200 · 🔀 350 · 📦 1.2K · 📋 930 - 12% open · ⏱️ 30.10.2025):
git clone https://github.com/facebook/Ax
- [PyPi](https://pypi.org/project/ax-platform) (📥 240K / month · 📦 71 · ⏱️ 09.09.2025):
pip install ax-platform
- [Conda](https://anaconda.org/conda-forge/ax-platform) (📥 49K · ⏱️ 10.09.2025):
conda install -c conda-forge ax-platform
AutoGluon (🥇35 · ⭐ 9.6K) - Fast and Accurate ML in 3 Lines of Code. Apache-2 - [GitHub](https://github.com/autogluon/autogluon) (👨‍💻 140 · 🔀 1.1K · 📦 1.2K · 📋 1.8K - 24% open · ⏱️ 29.10.2025):
git clone https://github.com/autogluon/autogluon
- [PyPi](https://pypi.org/project/autogluon) (📥 260K / month · 📦 38 · ⏱️ 23.10.2025):
pip install autogluon
- [Conda](https://anaconda.org/conda-forge/autogluon) (📥 45K · ⏱️ 30.07.2025):
conda install -c conda-forge autogluon
- [Docker Hub](https://hub.docker.com/r/autogluon/autogluon) (📥 20K · ⭐ 19 · ⏱️ 16.06.2025):
docker pull autogluon/autogluon
BoTorch (🥇35 · ⭐ 3.4K) - Bayesian optimization in PyTorch. MIT - [GitHub](https://github.com/meta-pytorch/botorch) (👨‍💻 150 · 🔀 450 · 📦 1.8K · 📋 600 - 10% open · ⏱️ 29.10.2025):
git clone https://github.com/meta-pytorch/botorch
- [PyPi](https://pypi.org/project/botorch) (📥 480K / month · 📦 140 · ⏱️ 23.10.2025):
pip install botorch
- [Conda](https://anaconda.org/conda-forge/botorch) (📥 180K · ⏱️ 24.10.2025):
conda install -c conda-forge botorch
Bayesian Optimization (🥇34 · ⭐ 8.4K) - A Python implementation of global optimization with.. MIT - [GitHub](https://github.com/bayesian-optimization/BayesianOptimization) (👨‍💻 52 · 🔀 1.6K · 📥 180 · 📦 4.1K · 📋 390 - 1% open · ⏱️ 09.09.2025):
git clone https://github.com/bayesian-optimization/BayesianOptimization
- [PyPi](https://pypi.org/project/bayesian-optimization) (📥 510K / month · 📦 190 · ⏱️ 24.07.2025):
pip install bayesian-optimization
Hyperopt (🥇34 · ⭐ 7.5K · 💤) - Distributed Asynchronous Hyperparameter Optimization in Python. BSD-3 - [GitHub](https://github.com/hyperopt/hyperopt) (👨‍💻 100 · 🔀 1.1K · 📦 22K · 📋 800 - 17% open · ⏱️ 27.12.2024):
git clone https://github.com/hyperopt/hyperopt
- [PyPi](https://pypi.org/project/hyperopt) (📥 2.7M / month · 📦 450 · ⏱️ 17.11.2021):
pip install hyperopt
- [Conda](https://anaconda.org/conda-forge/hyperopt) (📥 860K · ⏱️ 22.04.2025):
conda install -c conda-forge hyperopt
AutoKeras (🥈32 · ⭐ 9.3K · 💤) - AutoML library for deep learning. Apache-2 - [GitHub](https://github.com/keras-team/autokeras) (👨‍💻 150 · 🔀 1.4K · 📥 21K · 📦 890 · 📋 910 - 16% open · ⏱️ 16.12.2024):
git clone https://github.com/keras-team/autokeras
- [PyPi](https://pypi.org/project/autokeras) (📥 15K / month · 📦 13 · ⏱️ 20.03.2024):
pip install autokeras
featuretools (🥈32 · ⭐ 7.6K · 💤) - An open source python library for automated feature.. BSD-3 - [GitHub](https://github.com/alteryx/featuretools) (👨‍💻 75 · 🔀 900 · 📦 2.1K · 📋 1K - 15% open · ⏱️ 13.11.2024):
git clone https://github.com/alteryx/featuretools
- [PyPi](https://pypi.org/project/featuretools) (📥 100K / month · 📦 74 · ⏱️ 14.05.2024):
pip install featuretools
- [Conda](https://anaconda.org/conda-forge/featuretools) (📥 270K · ⏱️ 22.04.2025):
conda install -c conda-forge featuretools
nevergrad (🥈30 · ⭐ 4.1K) - A Python toolbox for performing gradient-free optimization. MIT - [GitHub](https://github.com/facebookresearch/nevergrad) (👨‍💻 58 · 🔀 360 · 📦 1.2K · 📋 310 - 40% open · ⏱️ 23.04.2025):
git clone https://github.com/facebookresearch/nevergrad
- [PyPi](https://pypi.org/project/nevergrad) (📥 150K / month · 📦 72 · ⏱️ 23.04.2025):
pip install nevergrad
- [Conda](https://anaconda.org/conda-forge/nevergrad) (📥 67K · ⏱️ 22.04.2025):
conda install -c conda-forge nevergrad
lazypredict (🥈28 · ⭐ 3.2K) - Lazy Predict helps build a lot of basic models without much code.. MIT - [GitHub](https://github.com/shankarpandala/lazypredict) (👨‍💻 19 · 🔀 360 · 📦 1.4K · 📋 160 - 64% open · ⏱️ 17.10.2025):
git clone https://github.com/shankarpandala/lazypredict
- [PyPi](https://pypi.org/project/lazypredict) (📥 31K / month · 📦 8 · ⏱️ 05.04.2025):
pip install lazypredict
- [Conda](https://anaconda.org/conda-forge/lazypredict) (📥 6.6K · ⏱️ 22.04.2025):
conda install -c conda-forge lazypredict
mljar-supervised (🥈28 · ⭐ 3.2K) - Python package for AutoML on Tabular Data with Feature.. MIT - [GitHub](https://github.com/mljar/mljar-supervised) (👨‍💻 30 · 🔀 420 · 📦 170 · 📋 680 - 21% open · ⏱️ 07.07.2025):
git clone https://github.com/mljar/mljar-supervised
- [PyPi](https://pypi.org/project/mljar-supervised) (📥 8.7K / month · 📦 6 · ⏱️ 07.07.2025):
pip install mljar-supervised
- [Conda](https://anaconda.org/conda-forge/mljar-supervised) (📥 52K · ⏱️ 08.07.2025):
conda install -c conda-forge mljar-supervised
Hyperactive (🥈24 · ⭐ 530) - An optimization and data collection toolbox for convenient and fast.. MIT - [GitHub](https://github.com/SimonBlanke/Hyperactive) (👨‍💻 15 · 🔀 51 · 📥 340 · 📦 40 · 📋 120 - 28% open · ⏱️ 25.10.2025):
git clone https://github.com/SimonBlanke/Hyperactive
- [PyPi](https://pypi.org/project/hyperactive) (📥 4K / month · 📦 13 · ⏱️ 20.09.2025):
pip install hyperactive
FEDOT (🥉23 · ⭐ 700) - Automated modeling and machine learning framework FEDOT. BSD-3 - [GitHub](https://github.com/aimclub/FEDOT) (👨‍💻 40 · 🔀 89 · 📦 65 · 📋 570 - 11% open · ⏱️ 14.10.2025):
git clone https://github.com/aimclub/FEDOT
- [PyPi](https://pypi.org/project/fedot) (📥 1.8K / month · 📦 7 · ⏱️ 10.03.2025):
pip install fedot
AlphaPy (🥉21 · ⭐ 1.6K) - Python AutoML for Trading Systems and Sports Betting. Apache-2 - [GitHub](https://github.com/ScottfreeLLC/AlphaPy) (👨‍💻 5 · 🔀 250 · 📦 10 · 📋 45 - 35% open · ⏱️ 24.08.2025):
git clone https://github.com/ScottfreeLLC/AlphaPy
- [PyPi](https://pypi.org/project/alphapy) (📥 320 / month · ⏱️ 29.08.2020):
pip install alphapy
Auto ViML (🥉20 · ⭐ 540 · 💤) - Automatically Build Multiple ML Models with a Single Line of.. Apache-2 - [GitHub](https://github.com/AutoViML/Auto_ViML) (👨‍💻 9 · 🔀 100 · 📦 28 · ⏱️ 30.01.2025):
git clone https://github.com/AutoViML/Auto_ViML
- [PyPi](https://pypi.org/project/autoviml) (📥 2.6K / month · 📦 3 · ⏱️ 30.01.2025):
pip install autoviml
featurewiz (🥉18 · ⭐ 670 · 💤) - Use advanced feature engineering strategies and select best.. Apache-2 - [GitHub](https://github.com/AutoViML/featurewiz) (👨‍💻 18 · 🔀 98 · 📋 110 - 0% open · ⏱️ 19.02.2025):
git clone https://github.com/AutoViML/featurewiz
- [PyPi](https://pypi.org/project/featurewiz) (📥 4.6K / month · 📦 4 · ⏱️ 19.02.2025):
pip install featurewiz
Show 36 hidden projects...
- TPOT (🥈32 · ⭐ 10K) - A Python Automated Machine Learning tool that optimizes machine.. ❗️LGPL-3.0
- Keras Tuner (🥈32 · ⭐ 2.9K · 💀) - A Hyperparameter Tuning Library for Keras. Apache-2
- scikit-optimize (🥈32 · ⭐ 2.8K · 💀) - Sequential model-based optimization with a.. BSD-3
- NNI (🥈31 · ⭐ 14K · 💀) - An open source AutoML toolkit for automate machine learning lifecycle,.. MIT
- auto-sklearn (🥈31 · ⭐ 8K · 💀) - Automated Machine Learning with scikit-learn. BSD-3
- Hyperas (🥈27 · ⭐ 2.2K · 💀) - Keras + Hyperopt: A very simple wrapper for convenient.. MIT
- SMAC3 (🥈27 · ⭐ 1.2K · 💤) - SMAC3: A Versatile Bayesian Optimization Package for.. ❗️BSD-1-Clause
- GPyOpt (🥈26 · ⭐ 950 · 💀) - Gaussian Process Optimization using GPy. BSD-3
- AdaNet (🥈24 · ⭐ 3.5K · 💀) - Fast and flexible AutoML with learning guarantees. Apache-2
- auto_ml (🥈24 · ⭐ 1.7K · 💀) - [UNMAINTAINED] Automated machine learning for analytics & production. MIT
- Talos (🥈24 · ⭐ 1.6K · 💀) - Hyperparameter Experiments with TensorFlow and Keras. MIT
- lightwood (🥈24 · ⭐ 490) - Lightwood is Legos for Machine Learning. ❗️GPL-3.0
- Orion (🥈24 · ⭐ 300 · 💀) - Asynchronous Distributed Hyperparameter Optimization. BSD-3
- HpBandSter (🥉22 · ⭐ 620 · 💀) - a distributed Hyperband implementation on Steroids. BSD-3
- MLBox (🥉21 · ⭐ 1.5K · 💀) - MLBox is a powerful Automated Machine Learning python library. ❗️BSD-1-Clause
- Test Tube (🥉21 · ⭐ 740 · 💀) - Python library to easily log experiments and parallelize.. MIT
- Neuraxle (🥉21 · ⭐ 610 · 💀) - The world's cleanest AutoML library - Do hyperparameter tuning.. Apache-2
- optunity (🥉21 · ⭐ 420 · 💀) - optimization routines for hyperparameter tuning. BSD-3
- sklearn-deap (🥉20 · ⭐ 770 · 💀) - Use evolutionary algorithms instead of gridsearch in.. MIT
- opytimizer (🥉20 · ⭐ 630 · 💀) - Opytimizer is a Python library consisting of meta-heuristic.. Apache-2
- igel (🥉19 · ⭐ 3.1K · 💀) - a delightful machine learning tool that allows you to train, test, and.. MIT
- Dragonfly (🥉19 · ⭐ 890 · 💀) - An open source python library for scalable Bayesian optimisation. MIT
- Auto Tune Models (🥉19 · ⭐ 530 · 💀) - Auto Tune Models - A multi-tenant, multi-data system for.. MIT
- Sherpa (🥉19 · ⭐ 340 · 💀) - Hyperparameter optimization that enables researchers to.. ❗️GPL-3.0
- shap-hypetune (🥉18 · ⭐ 580 · 💀) - A python package for simultaneous Hyperparameters Tuning and.. MIT
- Advisor (🥉17 · ⭐ 1.6K · 💀) - Open-source implementation of Google Vizier for hyper parameters.. Apache-2
- Xcessiv (🥉17 · ⭐ 1.3K · 💀) - A web-based application for quick, scalable, and automated.. Apache-2
- automl-gs (🥉16 · ⭐ 1.9K · 💀) - Provide an input CSV and a target field to predict, generate a.. MIT
- HyperparameterHunter (🥉16 · ⭐ 710 · 💀) - Easy hyperparameter optimization and automatic result.. MIT
- Parfit (🥉15 · ⭐ 200 · 💀) - A package for parallelizing the fit and flexibly scoring of.. MIT
- ENAS (🥉13 · ⭐ 2.7K · 💀) - PyTorch implementation of Efficient Neural Architecture Search via.. Apache-2
- Auptimizer (🥉12 · ⭐ 200 · 💀) - An automatic ML model optimization tool. ❗️GPL-3.0
- Hypermax (🥉12 · ⭐ 110 · 💀) - Better, faster hyper-parameter optimization. BSD-3
- model_search (🥉11 · ⭐ 3.3K · 💀) - AutoML algorithms for model architecture search at scale. Apache-2
- Devol (🥉11 · ⭐ 950 · 💀) - Genetic neural architecture search with Keras. MIT
- Hypertunity (🥉9 · ⭐ 140 · 💀) - A toolset for black-box hyperparameter optimisation. Apache-2


Reinforcement Learning

Libraries for building and evaluating reinforcement learning & agent-based systems.

FinRL (🥇30 · ⭐ 13K) - FinRL: Financial Reinforcement Learning. MIT - [GitHub](https://github.com/AI4Finance-Foundation/FinRL) (👨‍💻 130 · 🔀 2.9K · 📦 110 · 📋 760 - 35% open · ⏱️ 03.10.2025):
git clone https://github.com/AI4Finance-Foundation/FinRL
- [PyPi](https://pypi.org/project/finrl) (📥 2.8K / month · ⏱️ 08.01.2022):
pip install finrl
ViZDoom (🥇29 · ⭐ 1.9K) - Reinforcement Learning environments based on the 1993 game Doom. MIT - [GitHub](https://github.com/Farama-Foundation/ViZDoom) (👨‍💻 57 · 🔀 400 · 📥 12K · 📦 340 · 📋 470 - 6% open · ⏱️ 26.10.2025):
git clone https://github.com/Farama-Foundation/ViZDoom
- [PyPi](https://pypi.org/project/vizdoom) (📥 6.9K / month · 📦 20 · ⏱️ 22.10.2025):
pip install vizdoom
Dopamine (🥈27 · ⭐ 11K · 💤) - Dopamine is a research framework for fast prototyping of.. Apache-2 - [GitHub](https://github.com/google/dopamine) (👨‍💻 15 · 🔀 1.4K · 📦 21 · 📋 200 - 55% open · ⏱️ 04.11.2024):
git clone https://github.com/google/dopamine
- [PyPi](https://pypi.org/project/dopamine-rl) (📥 68K / month · 📦 10 · ⏱️ 31.10.2024):
pip install dopamine-rl
Acme (🥈27 · ⭐ 3.8K) - A library of reinforcement learning components and agents. Apache-2 - [GitHub](https://github.com/google-deepmind/acme) (👨‍💻 90 · 🔀 500 · 📦 250 · 📋 270 - 24% open · ⏱️ 26.09.2025):
git clone https://github.com/google-deepmind/acme
- [PyPi](https://pypi.org/project/dm-acme) (📥 6.4K / month · 📦 3 · ⏱️ 10.02.2022):
pip install dm-acme
- [Conda](https://anaconda.org/conda-forge/dm-acme) (📥 14K · ⏱️ 22.04.2025):
conda install -c conda-forge dm-acme
TF-Agents (🥈27 · ⭐ 3K) - TF-Agents: A reliable, scalable and easy to use TensorFlow.. Apache-2 - [GitHub](https://github.com/tensorflow/agents) (👨‍💻 150 · 🔀 740 · 📋 680 - 30% open · ⏱️ 16.06.2025):
git clone https://github.com/tensorflow/agents
- [PyPi](https://pypi.org/project/tf-agents) (📥 45K / month · 📦 14 · ⏱️ 14.12.2023):
pip install tf-agents
RLax (🥉26 · ⭐ 1.4K) - A library of reinforcement learning building blocks in JAX. Apache-2 - [GitHub](https://github.com/google-deepmind/rlax) (👨‍💻 23 · 🔀 95 · 📦 370 · 📋 28 - 32% open · ⏱️ 26.09.2025):
git clone https://github.com/google-deepmind/rlax
- [PyPi](https://pypi.org/project/rlax) (📥 48K / month · 📦 22 · ⏱️ 01.09.2025):
pip install rlax
PARL (🥉24 · ⭐ 3.4K) - A high-performance distributed training framework for Reinforcement.. Apache-2 - [GitHub](https://github.com/PaddlePaddle/PARL) (👨‍💻 46 · 🔀 820 · 📦 140 · 📋 540 - 23% open · ⏱️ 13.09.2025):
git clone https://github.com/PaddlePaddle/PARL
- [PyPi](https://pypi.org/project/parl) (📥 770 / month · 📦 1 · ⏱️ 13.05.2022):
pip install parl
ReAgent (🥉22 · ⭐ 3.7K) - A platform for Reasoning systems (Reinforcement Learning,.. BSD-3 - [GitHub](https://github.com/facebookresearch/ReAgent) (👨‍💻 170 · 🔀 520 · 📋 160 - 53% open · ⏱️ 17.10.2025):
git clone https://github.com/facebookresearch/ReAgent
- [PyPi](https://pypi.org/project/reagent) (📥 50 / month · ⏱️ 27.05.2020):
pip install reagent
Show 15 hidden projects...
- OpenAI Gym (🥇40 · ⭐ 37K · 💀) - A toolkit for developing and comparing reinforcement learning.. MIT
- baselines (🥇29 · ⭐ 17K · 💀) - OpenAI Baselines: high-quality implementations of reinforcement.. MIT
- keras-rl (🥈28 · ⭐ 5.6K · 💀) - Deep Reinforcement Learning for Keras. MIT
- TensorLayer (🥈27 · ⭐ 7.4K · 💀) - Deep Learning and Reinforcement Learning Library for.. Apache-2
- TensorForce (🥈27 · ⭐ 3.3K · 💀) - Tensorforce: a TensorFlow library for applied.. Apache-2
- garage (🥉26 · ⭐ 2K · 💀) - A toolkit for reproducible reinforcement learning research. MIT
- ChainerRL (🥉25 · ⭐ 1.2K · 💀) - ChainerRL is a deep reinforcement learning library built on top of.. MIT
- Stable Baselines (🥉24 · ⭐ 4.3K · 💀) - A fork of OpenAI Baselines, implementations of.. MIT
- PFRL (🥉23 · ⭐ 1.2K · 💀) - PFRL: a PyTorch-based deep reinforcement learning library. MIT
- TRFL (🥉22 · ⭐ 3.1K · 💀) - TensorFlow Reinforcement Learning. Apache-2
- Coach (🥉20 · ⭐ 2.3K · 💀) - Reinforcement Learning Coach by Intel AI Lab enables easy.. Apache-2
- SerpentAI (🥉19 · ⭐ 6.9K · 💀) - Game Agent Framework. Helping you create AIs / Bots that learn to.. MIT
- DeepMind Lab (🥉17 · ⭐ 7.3K · 💀) - A customisable 3D platform for agent-based AI research. ❗Unlicensed
- Maze (🥉12 · ⭐ 280 · 💀) - Maze Applied Reinforcement Learning Framework. ❗️Custom
- rliable (🥉11 · ⭐ 850 · 💀) - [NeurIPS21 Outstanding Paper] Library for reliable evaluation on.. Apache-2


Recommender Systems

Libraries for building and evaluating recommendation systems.

Recommenders (🥇33 · ⭐ 21K) - Best Practices on Recommendation Systems. MIT - [GitHub](https://github.com/recommenders-team/recommenders) (👨‍💻 140 · 🔀 3.2K · 📥 790 · 📦 180 · 📋 890 - 18% open · ⏱️ 13.10.2025):
git clone https://github.com/recommenders-team/recommenders
- [PyPi](https://pypi.org/project/recommenders) (📥 15K / month · 📦 4 · ⏱️ 24.12.2024):
pip install recommenders
torchrec (🥇32 · ⭐ 2.4K) - PyTorch domain library for recommendation systems. BSD-3 - [GitHub](https://github.com/meta-pytorch/torchrec) (👨‍💻 400 · 🔀 560 · 📦 240 · 📋 320 - 49% open · ⏱️ 30.10.2025):
git clone https://github.com/meta-pytorch/torchrec
- [PyPi](https://pypi.org/project/torchrec-nightly-cpu) (📥 160 / month · ⏱️ 12.05.2022):
pip install torchrec-nightly-cpu
Cornac (🥈28 · ⭐ 1K) - A Comparative Framework for Multimodal Recommender Systems. Apache-2 - [GitHub](https://github.com/PreferredAI/cornac) (👨‍💻 24 · 🔀 160 · 📦 300 · 📋 170 - 17% open · ⏱️ 04.10.2025):
git clone https://github.com/PreferredAI/cornac
- [PyPi](https://pypi.org/project/cornac) (📥 44K / month · 📦 18 · ⏱️ 04.10.2025):
pip install cornac
- [Conda](https://anaconda.org/conda-forge/cornac) (📥 920K · ⏱️ 05.10.2025):
conda install -c conda-forge cornac
lkpy (🥈28 · ⭐ 300) - Python recommendation toolkit. MIT - [GitHub](https://github.com/lenskit/lkpy) (👨‍💻 41 · 🔀 72 · 📦 140 · 📋 290 - 33% open · ⏱️ 29.10.2025):
git clone https://github.com/lenskit/lkpy
- [PyPi](https://pypi.org/project/lenskit) (📥 6.6K / month · 📦 13 · ⏱️ 22.10.2025):
pip install lenskit
- [Conda](https://anaconda.org/conda-forge/lenskit) (📥 52K · ⏱️ 23.10.2025):
conda install -c conda-forge lenskit
RecBole (🥉25 · ⭐ 4.1K · 💤) - A unified, comprehensive and efficient recommendation library. MIT - [GitHub](https://github.com/RUCAIBox/RecBole) (👨‍💻 79 · 🔀 690 · 📋 1.1K - 32% open · ⏱️ 24.02.2025):
git clone https://github.com/RUCAIBox/RecBole
- [PyPi](https://pypi.org/project/recbole) (📥 98K / month · 📦 2 · ⏱️ 24.02.2025):
pip install recbole
- [Conda](https://anaconda.org/aibox/recbole) (📥 9.3K · ⏱️ 25.03.2025):
conda install -c aibox recbole
TF Recommenders (🥉25 · ⭐ 2K) - TensorFlow Recommenders is a library for building.. Apache-2 - [GitHub](https://github.com/tensorflow/recommenders) (👨‍💻 45 · 🔀 290 · 📋 450 - 59% open · ⏱️ 27.09.2025):
git clone https://github.com/tensorflow/recommenders
- [PyPi](https://pypi.org/project/tensorflow-recommenders) (📥 220K / month · 📦 2 · ⏱️ 03.02.2023):
pip install tensorflow-recommenders
Show 11 hidden projects...
- implicit (🥈30 · ⭐ 3.7K · 💀) - Fast Python Collaborative Filtering for Implicit Feedback Datasets. MIT
- lightfm (🥈28 · ⭐ 5K · 💀) - A Python implementation of LightFM, a hybrid recommendation.. Apache-2
- scikit-surprise (🥈27 · ⭐ 6.7K · 💀) - A Python scikit for building and analyzing recommender.. BSD-3
- TF Ranking (🥉26 · ⭐ 2.8K · 💀) - Learning to Rank in TensorFlow. Apache-2
- fastFM (🥉22 · ⭐ 1.1K · 💀) - fastFM: A Library for Factorization Machines. BSD-3
- tensorrec (🥉21 · ⭐ 1.3K · 💀) - A TensorFlow recommendation algorithm and framework in.. Apache-2
- Spotlight (🥉18 · ⭐ 3K · 💀) - Deep recommender models using PyTorch. MIT
- recmetrics (🥉18 · ⭐ 580 · 💀) - A library of metrics for evaluating recommender systems. MIT
- Case Recommender (🥉18 · ⭐ 500 · 💀) - Case Recommender: A Flexible and Extensible Python.. MIT
- OpenRec (🥉16 · ⭐ 420 · 💀) - OpenRec is an open-source and modular library for neural network-.. Apache-2
- Collie (🥉10 · ⭐ 100 · 💀) - A library for preparing, training, and evaluating scalable deep.. BSD-3


Privacy Machine Learning

Libraries for encrypted and privacy-preserving machine learning using methods like federated learning & differential privacy.

Opacus (🥇32 · ⭐ 1.9K) - Training PyTorch models with differential privacy. Apache-2 - [GitHub](https://github.com/meta-pytorch/opacus) (👨‍💻 87 · 🔀 370 · 📥 150 · 📦 1.2K · 📋 340 - 19% open · ⏱️ 27.10.2025):
git clone https://github.com/meta-pytorch/opacus
- [PyPi](https://pypi.org/project/opacus) (📥 92K / month · 📦 49 · ⏱️ 27.05.2025):
pip install opacus
- [Conda](https://anaconda.org/conda-forge/opacus) (📥 28K · ⏱️ 09.07.2025):
conda install -c conda-forge opacus
PySyft (🥈31 · ⭐ 9.8K) - Perform data science on data that remains in someone else's server. Apache-2 - [GitHub](https://github.com/OpenMined/PySyft) (👨‍💻 520 · 🔀 2K · 📥 2.1K · 📦 1 · 📋 3.4K - 1% open · ⏱️ 13.04.2025):
git clone https://github.com/OpenMined/PySyft
- [PyPi](https://pypi.org/project/syft) (📥 32K / month · 📦 5 · ⏱️ 13.04.2025):
pip install syft
TensorFlow Privacy (🥈24 · ⭐ 2K) - Library for training machine learning models with.. Apache-2 - [GitHub](https://github.com/tensorflow/privacy) (👨‍💻 60 · 🔀 460 · 📥 190 · 📋 210 - 55% open · ⏱️ 13.06.2025):
git clone https://github.com/tensorflow/privacy
- [PyPi](https://pypi.org/project/tensorflow-privacy) (📥 18K / month · 📦 21 · ⏱️ 14.02.2024):
pip install tensorflow-privacy
FATE (🥉23 · ⭐ 6K · 💤) - An Industrial Grade Federated Learning Framework. Apache-2 - [GitHub](https://github.com/FederatedAI/FATE) (👨‍💻 100 · 🔀 1.6K · 📦 1 · 📋 2.1K - 2% open · ⏱️ 19.11.2024):
git clone https://github.com/FederatedAI/FATE
- [PyPi](https://pypi.org/project/ETAF) (⏱️ 06.05.2020):
pip install ETAF
CrypTen (🥉21 · ⭐ 1.6K · 💤) - A framework for Privacy Preserving Machine Learning. MIT - [GitHub](https://github.com/facebookresearch/CrypTen) (👨‍💻 40 · 🔀 290 · 📋 280 - 28% open · ⏱️ 23.11.2024):
git clone https://github.com/facebookresearch/CrypTen
- [PyPi](https://pypi.org/project/crypten) (📥 600 / month · 📦 1 · ⏱️ 08.12.2022):
pip install crypten
Show 2 hidden projects...
- TFEncrypted (🥈24 · ⭐ 1.2K · 💀) - A Framework for Encrypted Machine Learning in.. Apache-2
- PipelineDP (🥉19 · ⭐ 280) - PipelineDP is a Python framework for applying differentially.. Apache-2


Workflow & Experiment Tracking

Libraries to organize, track, and visualize machine learning experiments.

mlflow (🥇47 · ⭐ 23K) - The open source developer platform to build AI/LLM applications and.. Apache-2 - [GitHub](https://github.com/mlflow/mlflow) (👨‍💻 910 · 🔀 4.9K · 📦 66K · 📋 5.2K - 39% open · ⏱️ 30.10.2025):
git clone https://github.com/mlflow/mlflow
- [PyPi](https://pypi.org/project/mlflow) (📥 26M / month · 📦 1.3K · ⏱️ 22.10.2025):
pip install mlflow
- [Conda](https://anaconda.org/conda-forge/mlflow) (📥 3.7M · ⏱️ 24.10.2025):
conda install -c conda-forge mlflow
wandb client (🥇44 · ⭐ 10K) - The AI developer platform. Use Weights & Biases to train and fine-.. MIT - [GitHub](https://github.com/wandb/wandb) (👨‍💻 220 · 🔀 790 · 📥 1.2K · 📦 84K · 📋 3.7K - 18% open · ⏱️ 30.10.2025):
git clone https://github.com/wandb/wandb
- [PyPi](https://pypi.org/project/wandb) (📥 20M / month · 📦 2.3K · ⏱️ 28.10.2025):
pip install wandb
- [Conda](https://anaconda.org/conda-forge/wandb) (📥 1.2M · ⏱️ 30.10.2025):
conda install -c conda-forge wandb
DVC (🥇42 · ⭐ 15K) - Data Versioning and ML Experiments. Apache-2 - [GitHub](https://github.com/iterative/dvc) (👨‍💻 320 · 🔀 1.2K · 📦 24K · 📋 4.9K - 4% open · ⏱️ 28.10.2025):
git clone https://github.com/iterative/dvc
- [PyPi](https://pypi.org/project/dvc) (📥 1.5M / month · 📦 140 · ⏱️ 02.09.2025):
pip install dvc
- [Conda](https://anaconda.org/conda-forge/dvc) (📥 3.1M · ⏱️ 02.09.2025):
conda install -c conda-forge dvc
Tensorboard (🥇41 · ⭐ 7K) - TensorFlow's Visualization Toolkit. Apache-2 - [GitHub](https://github.com/tensorflow/tensorboard) (👨‍💻 330 · 🔀 1.7K · 📦 330K · 📋 2K - 36% open · ⏱️ 12.08.2025):
git clone https://github.com/tensorflow/tensorboard
- [PyPi](https://pypi.org/project/tensorboard) (📥 31M / month · 📦 2.8K · ⏱️ 17.07.2025):
pip install tensorboard
- [Conda](https://anaconda.org/conda-forge/tensorboard) (📥 6M · ⏱️ 18.07.2025):
conda install -c conda-forge tensorboard
SageMaker SDK (🥇41 · ⭐ 2.2K) - A library for training and deploying machine learning.. Apache-2 - [GitHub](https://github.com/aws/sagemaker-python-sdk) (👨‍💻 500 · 🔀 1.2K · 📦 6.2K · 📋 1.6K - 21% open · ⏱️ 29.10.2025):
git clone https://github.com/aws/sagemaker-python-sdk
- [PyPi](https://pypi.org/project/sagemaker) (📥 27M / month · 📦 210 · ⏱️ 29.10.2025):
pip install sagemaker
- [Conda](https://anaconda.org/conda-forge/sagemaker-python-sdk) (📥 1.8M · ⏱️ 30.10.2025):
conda install -c conda-forge sagemaker-python-sdk
Metaflow (🥈37 · ⭐ 9.6K) - Build, Manage and Deploy AI/ML Systems. Apache-2 - [GitHub](https://github.com/Netflix/metaflow) (👨‍💻 120 · 🔀 930 · 📦 950 · 📋 840 - 43% open · ⏱️ 29.10.2025):
git clone https://github.com/Netflix/metaflow
- [PyPi](https://pypi.org/project/metaflow) (📥 740K / month · 📦 53 · ⏱️ 29.10.2025):
pip install metaflow
- [Conda](https://anaconda.org/conda-forge/metaflow) (📥 340K · ⏱️ 29.10.2025):
conda install -c conda-forge metaflow
tensorboardX (🥈35 · ⭐ 8K) - tensorboard for pytorch (and chainer, mxnet, numpy, ...). MIT - [GitHub](https://github.com/lanpa/tensorboardX) (👨‍💻 85 · 🔀 860 · 📥 500 · 📦 60K · 📋 470 - 18% open · ⏱️ 13.06.2025):
git clone https://github.com/lanpa/tensorboardX
- [PyPi](https://pypi.org/project/tensorboardX) (📥 4.5M / month · 📦 740 · ⏱️ 10.06.2025):
pip install tensorboardX
- [Conda](https://anaconda.org/conda-forge/tensorboardx) (📥 1.3M · ⏱️ 22.04.2025):
conda install -c conda-forge tensorboardx
PyCaret (🥈34 · ⭐ 9.6K · 💤) - An open-source, low-code machine learning library in Python. MIT - [GitHub](https://github.com/pycaret/pycaret) (👨‍💻 140 · 🔀 1.8K · 📥 730 · 📦 7.9K · 📋 2.3K - 16% open · ⏱️ 06.03.2025):
git clone https://github.com/pycaret/pycaret
- [PyPi](https://pypi.org/project/pycaret) (📥 310K / month · 📦 31 · ⏱️ 28.04.2024):
pip install pycaret
- [Conda](https://anaconda.org/conda-forge/pycaret) (📥 78K · ⏱️ 22.04.2025):
conda install -c conda-forge pycaret
ClearML (🥈34 · ⭐ 6.3K) - ClearML - Auto-Magical CI/CD to streamline your AI workload... Apache-2 - [GitHub](https://github.com/clearml/clearml) (👨‍💻 100 · 🔀 710 · 📥 3.5K · 📦 1.9K · 📋 1.2K - 45% open · ⏱️ 27.10.2025):
git clone https://github.com/clearml/clearml
- [PyPi](https://pypi.org/project/clearml) (📥 500K / month · 📦 78 · ⏱️ 22.10.2025):
pip install clearml
- [Docker Hub](https://hub.docker.com/r/allegroai/trains) (📥 31K · ⏱️ 05.10.2020):
docker pull allegroai/trains
snakemake (🥈34 · ⭐ 2.6K) - This is the development home of the workflow management system.. MIT - [GitHub](https://github.com/snakemake/snakemake) (👨‍💻 380 · 🔀 610 · 📦 2.5K · 📋 2.1K - 58% open · ⏱️ 29.10.2025):
git clone https://github.com/snakemake/snakemake
- [PyPi](https://pypi.org/project/snakemake) (📥 130K / month · 📦 320 · ⏱️ 22.10.2025):
pip install snakemake
- [Conda](https://anaconda.org/bioconda/snakemake) (📥 1.5M · ⏱️ 28.10.2025):
conda install -c bioconda snakemake
kaggle (🥈33 · ⭐ 6.9K) - Official Kaggle API. Apache-2 - [GitHub](https://github.com/Kaggle/kaggle-api) (👨‍💻 49 · 🔀 1.2K · 📦 21 · 📋 530 - 27% open · ⏱️ 28.10.2025):
git clone https://github.com/Kaggle/kaggle-api
- [PyPi](https://pypi.org/project/kaggle) (📥 610K / month · 📦 240 · ⏱️ 08.05.2025):
pip install kaggle
- [Conda](https://anaconda.org/conda-forge/kaggle) (📥 250K · ⏱️ 11.08.2025):
conda install -c conda-forge kaggle
aim (🥈32 · ⭐ 5.8K) - Aim: an easy-to-use & supercharged open-source experiment tracker. Apache-2 - [GitHub](https://github.com/aimhubio/aim) (👨‍💻 82 · 🔀 360 · 📦 1.1K · 📋 1.1K - 37% open · ⏱️ 26.06.2025):
git clone https://github.com/aimhubio/aim
- [PyPi](https://pypi.org/project/aim) (📥 120K / month · 📦 56 · ⏱️ 11.06.2025):
pip install aim
- [Conda](https://anaconda.org/conda-forge/aim) (📥 140K · ⏱️ 22.04.2025):
conda install -c conda-forge aim
AzureML SDK (🥈31 · ⭐ 4.3K · 💤) - Python notebooks with ML and deep learning examples with Azure.. MIT - [GitHub](https://github.com/Azure/MachineLearningNotebooks) (👨‍💻 65 · 🔀 2.5K · 📥 680 · 📋 1.5K - 26% open · ⏱️ 14.03.2025):
git clone https://github.com/Azure/MachineLearningNotebooks
- [PyPi](https://pypi.org/project/azureml-sdk) (📥 2.2M / month · 📦 31 · ⏱️ 11.04.2025):
pip install azureml-sdk
VisualDL (🥈29 · ⭐ 4.9K · 💤) - Deep Learning Visualization Toolkit. Apache-2 - [GitHub](https://github.com/PaddlePaddle/VisualDL) (👨‍💻 36 · 🔀 630 · 📥 540 · 📦 3.6K · 📋 510 - 30% open · ⏱️ 22.01.2025):
git clone https://github.com/PaddlePaddle/VisualDL
- [PyPi](https://pypi.org/project/visualdl) (📥 170K / month · 📦 82 · ⏱️ 30.10.2024):
pip install visualdl
sacred (🥈29 · ⭐ 4.3K) - Sacred is a tool to help you configure, organize, log and reproduce.. MIT - [GitHub](https://github.com/IDSIA/sacred) (👨‍💻 110 · 🔀 390 · 📦 3.6K · 📋 560 - 18% open · ⏱️ 22.10.2025):
git clone https://github.com/IDSIA/sacred
- [PyPi](https://pypi.org/project/sacred) (📥 48K / month · 📦 60 · ⏱️ 26.11.2024):
pip install sacred
- [Conda](https://anaconda.org/conda-forge/sacred) (📥 9.9K · ⏱️ 22.04.2025):
conda install -c conda-forge sacred
Neptune.ai (🥈29 · ⭐ 620) - The experiment tracker for foundation model training. Apache-2 - [GitHub](https://github.com/neptune-ai/neptune-client) (👨‍💻 57 · 🔀 66 · 📦 920 · 📋 260 - 12% open · ⏱️ 09.06.2025):
git clone https://github.com/neptune-ai/neptune-client
- [PyPi](https://pypi.org/project/neptune-client) (📥 480K / month · 📦 77 · ⏱️ 15.04.2025):
pip install neptune-client
- [Conda](https://anaconda.org/conda-forge/neptune-client) (📥 390K · ⏱️ 22.04.2025):
conda install -c conda-forge neptune-client
TNT (🥉28 · ⭐ 1.7K) - A lightweight library for PyTorch training tools and utilities. BSD-3 - [GitHub](https://github.com/meta-pytorch/tnt) (👨‍💻 150 · 🔀 290 · 📋 150 - 56% open · ⏱️ 09.10.2025):
git clone https://github.com/meta-pytorch/tnt
- [PyPi](https://pypi.org/project/torchnet) (📥 9.4K / month · 📦 24 · ⏱️ 29.07.2018):
pip install torchnet
livelossplot (🥉25 · ⭐ 1.3K · 💤) - Live training loss plot in Jupyter Notebook for Keras,.. MIT - [GitHub](https://github.com/stared/livelossplot) (👨‍💻 17 · 🔀 140 · 📦 1.9K · 📋 79 - 7% open · ⏱️ 03.01.2025):
git clone https://github.com/stared/livelossplot
- [PyPi](https://pypi.org/project/livelossplot) (📥 19K / month · 📦 16 · ⏱️ 03.01.2025):
pip install livelossplot
ml-metadata (🥉25 · ⭐ 660) - For recording and retrieving metadata associated with ML.. Apache-2 - [GitHub](https://github.com/google/ml-metadata) (👨‍💻 23 · 🔀 170 · 📥 3K · 📦 720 · 📋 130 - 41% open · ⏱️ 03.04.2025):
git clone https://github.com/google/ml-metadata
- [PyPi](https://pypi.org/project/ml-metadata) (📥 50K / month · 📦 32 · ⏱️ 07.04.2025):
pip install ml-metadata
Labml (🥉24 · ⭐ 2.3K) - Monitor deep learning model training and hardware usage from your mobile.. MIT - [GitHub](https://github.com/labmlai/labml) (👨‍💻 9 · 🔀 140 · 📦 240 · 📋 50 - 12% open · ⏱️ 10.04.2025):
git clone https://github.com/labmlai/labml
- [PyPi](https://pypi.org/project/labml) (📥 4.6K / month · 📦 14 · ⏱️ 15.09.2024):
pip install labml
quinn (🥉24 · ⭐ 680 · 💤) - pyspark methods to enhance developer productivity. Apache-2 - [GitHub](https://github.com/mrpowers-io/quinn) (👨‍💻 31 · 🔀 98 · 📥 69 · 📦 94 · 📋 130 - 27% open · ⏱️ 06.12.2024):
git clone https://github.com/mrpowers-io/quinn
- [PyPi](https://pypi.org/project/quinn) (📥 750K / month · 📦 7 · ⏱️ 13.02.2024):
pip install quinn
gokart (🥉24 · ⭐ 330) - Gokart solves reproducibility, task dependencies, constraints of good code,.. MIT - [GitHub](https://github.com/m3dev/gokart) (👨‍💻 48 · 🔀 63 · 📦 85 · 📋 100 - 31% open · ⏱️ 18.06.2025):
git clone https://github.com/m3dev/gokart
- [PyPi](https://pypi.org/project/gokart) (📥 6.7K / month · 📦 8 · ⏱️ 18.06.2025):
pip install gokart
Guild AI (🥉23 · ⭐ 890) - Experiment tracking, ML developer tools. Apache-2 - [GitHub](https://github.com/guildai/guildai) (👨‍💻 30 · 🔀 90 · 📥 32 · 📦 110 · 📋 440 - 50% open · ⏱️ 29.04.2025):
git clone https://github.com/guildai/guildai
- [PyPi](https://pypi.org/project/guildai) (📥 1.7K / month · ⏱️ 11.05.2022):
pip install guildai
TensorWatch (🥉22 · ⭐ 3.5K) - Debugging, monitoring and visualization for Python Machine Learning.. MIT - [GitHub](https://github.com/microsoft/tensorwatch) (👨‍💻 15 · 🔀 360 · 📦 160 · 📋 70 - 75% open · ⏱️ 27.09.2025):
git clone https://github.com/microsoft/tensorwatch
- [PyPi](https://pypi.org/project/tensorwatch) (📥 1.4K / month · 📦 7 · ⏱️ 04.03.2020):
pip install tensorwatch
keepsake (🥉18 · ⭐ 1.7K · 💤) - Version control for machine learning. Apache-2 - [GitHub](https://github.com/replicate/keepsake) (👨‍💻 18 · 🔀 71 · 📋 190 - 65% open · ⏱️ 03.12.2024):
git clone https://github.com/replicate/keepsake
- [PyPi](https://pypi.org/project/keepsake) (📥 880 / month · 📦 1 · ⏱️ 25.01.2021):
pip install keepsake
datmo (🥉17 · ⭐ 340) - Open source production model management tool for data scientists. MIT - [GitHub](https://github.com/datmo/datmo) (👨‍💻 6 · 🔀 30 · 📦 7 · 📋 180 - 17% open · ⏱️ 23.06.2025):
git clone https://github.com/datmo/datmo
- [PyPi](https://pypi.org/project/datmo) (📥 130 / month · ⏱️ 07.12.2018):
pip install datmo
CometML (🥉16) - Supercharging Machine Learning. MIT - [GitHub](https://github.com/comet-ml/examples):
git clone https://github.com/comet-ml/examples
- [PyPi](https://pypi.org/project/comet_ml) (📥 570K / month · 📦 100 · ⏱️ 29.10.2025):
pip install comet_ml
- [Conda](https://anaconda.org/anaconda/comet_ml):
conda install -c anaconda comet_ml
Show 13 hidden projects... - Catalyst (🥉28 · ⭐ 3.4K · 💀) - Accelerated deep learning R&D. Apache-2 - knockknock (🥉25 · ⭐ 2.8K · 💀) - Knock Knock: Get notified when your training ends with only two.. MIT - hiddenlayer (🥉22 · ⭐ 1.9K · 💀) - Neural network graphs and training metrics for.. MIT - SKLL (🥉22 · ⭐ 560 · 💤) - SciKit-Learn Laboratory (SKLL) makes it easy to run machine.. ❗Unlicensed - TensorBoard Logger (🥉21 · ⭐ 630 · 💀) - Log TensorBoard events without touching TensorFlow. MIT - Studio.ml (🥉21 · ⭐ 380 · 💀) - Studio: Simplify and expedite model building process. Apache-2 - lore (🥉20 · ⭐ 1.5K · 💀) - Lore makes machine learning approachable for Software Engineers and.. MIT - chitra (🥉17 · ⭐ 230) - A multi-functional library for full-stack Deep Learning. Simplifies.. Apache-2 - steppy (🥉17 · ⭐ 140 · 💀) - Lightweight, Python library for fast and reproducible experimentation. MIT - MXBoard (🥉16 · ⭐ 320 · 💀) - Logging MXNet data for visualization in TensorBoard. Apache-2 - caliban (🥉15 · ⭐ 500 · 💀) - Research workflows made easy, locally and in the Cloud. Apache-2 - ModelChimp (🥉12 · ⭐ 130 · 💀) - Experiment tracking for machine and deep learning projects. BSD-2 - traintool (🥉8 · ⭐ 12 · 💀) - Train off-the-shelf machine learning models in one.. Apache-2


Model Serialization & Deployment

Libraries to serialize models to files, convert between a variety of model formats, and optimize models for deployment.

triton (🥇45 · ⭐ 17K) - Development repository for the Triton language and compiler. MIT - [GitHub](https://github.com/triton-lang/triton) (👨‍💻 480 · 🔀 2.3K · 📥 1.4K · 📦 74K · 📋 2K - 41% open · ⏱️ 29.10.2025):
git clone https://github.com/triton-lang/triton
- [PyPi](https://pypi.org/project/triton) (📥 41M / month · 📦 540 · ⏱️ 13.10.2025):
pip install triton
onnx (🥇43 · ⭐ 20K) - Open standard for machine learning interoperability. Apache-2 - [GitHub](https://github.com/onnx/onnx) (👨‍💻 360 · 🔀 3.8K · 📥 25K · 📦 49K · 📋 3.1K - 9% open · ⏱️ 29.10.2025):
git clone https://github.com/onnx/onnx
- [PyPi](https://pypi.org/project/onnx) (📥 13M / month · 📦 1.6K · ⏱️ 10.10.2025):
pip install onnx
- [Conda](https://anaconda.org/conda-forge/onnx) (📥 2.1M · ⏱️ 11.10.2025):
conda install -c conda-forge onnx
huggingface_hub (🥈40 · ⭐ 3K) - The official Python client for the Hugging Face Hub. Apache-2 - [GitHub](https://github.com/huggingface/huggingface_hub) (👨‍💻 280 · 🔀 830 · 📋 1.3K - 11% open · ⏱️ 30.10.2025):
git clone https://github.com/huggingface/huggingface_hub
- [PyPi](https://pypi.org/project/huggingface_hub) (📥 120M / month · 📦 4.1K · ⏱️ 28.10.2025):
pip install huggingface_hub
- [Conda](https://anaconda.org/conda-forge/huggingface_hub) (📥 4.2M · ⏱️ 28.10.2025):
conda install -c conda-forge huggingface_hub
BentoML (🥈36 · ⭐ 8.2K) - The easiest way to serve AI apps and models - Build Model Inference.. Apache-2 - [GitHub](https://github.com/bentoml/BentoML) (👨‍💻 260 · 🔀 880 · 📥 95 · 📦 2.8K · 📋 1.1K - 11% open · ⏱️ 29.10.2025):
git clone https://github.com/bentoml/BentoML
- [PyPi](https://pypi.org/project/bentoml) (📥 180K / month · 📦 44 · ⏱️ 29.10.2025):
pip install bentoml
Core ML Tools (🥈35 · ⭐ 5K) - Core ML tools contain supporting tools for Core ML model.. BSD-3 - [GitHub](https://github.com/apple/coremltools) (👨‍💻 200 · 🔀 710 · 📥 15K · 📦 5.1K · 📋 1.6K - 26% open · ⏱️ 22.09.2025):
git clone https://github.com/apple/coremltools
- [PyPi](https://pypi.org/project/coremltools) (📥 1.1M / month · 📦 110 · ⏱️ 28.07.2025):
pip install coremltools
- [Conda](https://anaconda.org/conda-forge/coremltools) (📥 110K · ⏱️ 02.10.2025):
conda install -c conda-forge coremltools
TorchServe (🥈33 · ⭐ 4.4K · 💤) - Serve, optimize and scale PyTorch models in production. Apache-2 - [GitHub](https://github.com/pytorch/serve) (👨‍💻 220 · 🔀 890 · 📥 8K · 📦 900 · 📋 1.7K - 25% open · ⏱️ 17.03.2025):
git clone https://github.com/pytorch/serve
- [PyPi](https://pypi.org/project/torchserve) (📥 97K / month · 📦 26 · ⏱️ 30.09.2024):
pip install torchserve
- [Conda](https://anaconda.org/pytorch/torchserve) (📥 570K · ⏱️ 25.03.2025):
conda install -c pytorch torchserve
- [Docker Hub](https://hub.docker.com/r/pytorch/torchserve) (📥 1.5M · ⭐ 32 · ⏱️ 30.09.2024):
docker pull pytorch/torchserve
hls4ml (🥈28 · ⭐ 1.7K) - Machine learning on FPGAs using HLS. Apache-2 - [GitHub](https://github.com/fastmachinelearning/hls4ml) (👨‍💻 82 · 🔀 440 · 📦 51 · 📋 480 - 41% open · ⏱️ 20.10.2025):
git clone https://github.com/fastmachinelearning/hls4ml
- [PyPi](https://pypi.org/project/hls4ml) (📥 1.7K / month · 📦 1 · ⏱️ 17.03.2025):
pip install hls4ml
- [Conda](https://anaconda.org/conda-forge/hls4ml) (📥 12K · ⏱️ 22.04.2025):
conda install -c conda-forge hls4ml
mmdnn (🥈25 · ⭐ 5.8K) - MMdnn is a set of tools to help users inter-operate among different deep.. MIT - [GitHub](https://github.com/microsoft/MMdnn) (👨‍💻 86 · 🔀 960 · 📥 4K · 📦 160 · 📋 630 - 53% open · ⏱️ 07.08.2025):
git clone https://github.com/Microsoft/MMdnn
- [PyPi](https://pypi.org/project/mmdnn) (📥 320 / month · ⏱️ 24.07.2020):
pip install mmdnn
Hummingbird (🥉24 · ⭐ 3.5K) - Hummingbird compiles trained ML models into tensor computation for.. MIT - [GitHub](https://github.com/microsoft/hummingbird) (👨‍💻 40 · 🔀 290 · 📥 930 · 📋 330 - 21% open · ⏱️ 17.07.2025):
git clone https://github.com/microsoft/hummingbird
- [PyPi](https://pypi.org/project/hummingbird-ml) (📥 7.6K / month · 📦 7 · ⏱️ 25.10.2024):
pip install hummingbird-ml
- [Conda](https://anaconda.org/conda-forge/hummingbird-ml) (📥 64K · ⏱️ 22.04.2025):
conda install -c conda-forge hummingbird-ml
tfdeploy (🥉15 · ⭐ 360 · 💤) - Deploy tensorflow graphs for fast evaluation and export to.. BSD-3 - [GitHub](https://github.com/riga/tfdeploy) (👨‍💻 4 · 🔀 38 · 📋 34 - 32% open · ⏱️ 04.01.2025):
git clone https://github.com/riga/tfdeploy
- [PyPi](https://pypi.org/project/tfdeploy) (📥 100 / month · ⏱️ 30.03.2017):
pip install tfdeploy
Show 10 hidden projects... - m2cgen (🥈25 · ⭐ 2.9K · 💀) - Transform ML models into a native code (Java, C, Python, Go,.. MIT - sklearn-porter (🥉23 · ⭐ 1.3K · 💀) - Transpile trained scikit-learn estimators to C, Java,.. BSD-3 - cortex (🥉22 · ⭐ 8K · 💀) - Production infrastructure for machine learning at scale. Apache-2 - nebullvm (🥉21 · ⭐ 8.4K · 💀) - A collection of libraries to optimise AI model performances. Apache-2 - Larq Compute Engine (🥉20 · ⭐ 250) - Highly optimized inference engine for Binarized.. Apache-2 - pytorch2keras (🥉19 · ⭐ 860 · 💀) - PyTorch to Keras model convertor. MIT - OMLT (🥉19 · ⭐ 340) - Represent trained machine learning models as Pyomo optimization.. ❗Unlicensed - modelkit (🥉17 · ⭐ 150 · 💀) - Toolkit for developing and maintaining ML models. MIT - backprop (🥉14 · ⭐ 240 · 💀) - Backprop makes it simple to use, finetune, and deploy state-of-.. Apache-2 - ml-ane-transformers (🥉13 · ⭐ 2.7K · 💀) - Reference implementation of the Transformer.. ❗Unlicensed
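
As a rough illustration of what serializing a model means, the sketch below round-trips the parameters of a toy linear model through JSON using only the Python standard library. The `predict` helper and the parameter layout are hypothetical and do not reflect the file format of any project listed above.

```python
import json


def predict(params, features):
    """Score one example with a toy linear model (hypothetical layout)."""
    weighted = sum(w * x for w, x in zip(params["weights"], features))
    return weighted + params["bias"]


# Parameters of the toy model, chosen only for illustration.
params = {"weights": [0.5, -1.0], "bias": 2.0}

# Serialize to a string (this could equally be written to a file).
serialized = json.dumps(params)

# Deserialize and confirm the restored model behaves identically.
restored = json.loads(serialized)
same_output = predict(restored, [2.0, 1.0]) == predict(params, [2.0, 1.0])
print(same_output)
```

Real model formats also carry the computation graph, operator versions, and tensor data types, which is why dedicated converters and runtimes exist.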


Model Interpretability

Libraries to visualize, explain, debug, evaluate, and interpret machine learning models.

shap (🥇42 · ⭐ 25K) - A game theoretic approach to explain the output of any machine learning model. MIT - [GitHub](https://github.com/shap/shap) (👨‍💻 280 · 🔀 3.4K · 📦 36K · 📋 2.7K - 23% open · ⏱️ 30.10.2025):
git clone https://github.com/shap/shap
- [PyPi](https://pypi.org/project/shap) (📥 9.5M / month · 📦 1.2K · ⏱️ 14.10.2025):
pip install shap
- [Conda](https://anaconda.org/conda-forge/shap) (📥 7.5M · ⏱️ 17.06.2025):
conda install -c conda-forge shap
arviz (🥇37 · ⭐ 1.7K) - Exploratory analysis of Bayesian models with Python. Apache-2 - [GitHub](https://github.com/arviz-devs/arviz) (👨‍💻 180 · 🔀 460 · 📥 190 · 📦 11K · 📋 900 - 19% open · ⏱️ 22.10.2025):
git clone https://github.com/arviz-devs/arviz
- [PyPi](https://pypi.org/project/arviz) (📥 3.7M / month · 📦 410 · ⏱️ 09.07.2025):
pip install arviz
- [Conda](https://anaconda.org/conda-forge/arviz) (📥 2.5M · ⏱️ 10.07.2025):
conda install -c conda-forge arviz
Netron (🥇36 · ⭐ 32K) - Visualizer for neural network, deep learning and machine learning.. MIT - [GitHub](https://github.com/lutzroeder/netron) (👨‍💻 2 · 🔀 3K · 📥 160K · 📦 13 · 📋 1.2K - 1% open · ⏱️ 29.10.2025):
git clone https://github.com/lutzroeder/netron
- [PyPi](https://pypi.org/project/netron) (📥 43K / month · 📦 92 · ⏱️ 23.10.2025):
pip install netron
evaluate (🥇34 · ⭐ 2.4K) - Evaluate: A library for easily evaluating machine learning models.. Apache-2 - [GitHub](https://github.com/huggingface/evaluate) (👨‍💻 130 · 🔀 290 · 📦 24K · 📋 390 - 62% open · ⏱️ 25.09.2025):
git clone https://github.com/huggingface/evaluate
- [PyPi](https://pypi.org/project/evaluate) (📥 3.6M / month · 📦 660 · ⏱️ 18.09.2025):
pip install evaluate
InterpretML (🥇33 · ⭐ 6.7K) - Fit interpretable models. Explain blackbox machine learning. MIT - [GitHub](https://github.com/interpretml/interpret) (👨‍💻 53 · 🔀 770 · 📦 930 · 📋 490 - 22% open · ⏱️ 24.10.2025):
git clone https://github.com/interpretml/interpret
- [PyPi](https://pypi.org/project/interpret) (📥 230K / month · 📦 58 · ⏱️ 14.10.2025):
pip install interpret
Captum (🥇33 · ⭐ 5.4K) - Model interpretability and understanding for PyTorch. BSD-3 - [GitHub](https://github.com/meta-pytorch/captum) (👨‍💻 140 · 🔀 540 · 📦 3.5K · 📋 610 - 41% open · ⏱️ 23.10.2025):
git clone https://github.com/meta-pytorch/captum
- [PyPi](https://pypi.org/project/captum) (📥 330K / month · 📦 170 · ⏱️ 27.03.2025):
pip install captum
- [Conda](https://anaconda.org/conda-forge/captum) (📥 130K · ⏱️ 22.04.2025):
conda install -c conda-forge captum
DoWhy (🥈30 · ⭐ 7.8K) - DoWhy is a Python library for causal inference that supports explicit.. MIT - [GitHub](https://github.com/py-why/dowhy) (👨‍💻 100 · 🔀 980 · 📥 43 · 📦 660 · 📋 510 - 27% open · ⏱️ 28.10.2025):
git clone https://github.com/py-why/dowhy
- [PyPi](https://pypi.org/project/dowhy) (📥 83K / month · 📦 28 · ⏱️ 12.07.2025):
pip install dowhy
- [Conda](https://anaconda.org/conda-forge/dowhy) (📥 51K · ⏱️ 13.07.2025):
conda install -c conda-forge dowhy
shapash (🥈30 · ⭐ 3K) - Shapash: User-friendly Explainability and Interpretability to.. Apache-2 - [GitHub](https://github.com/MAIF/shapash) (👨‍💻 43 · 🔀 350 · 📦 200 · 📋 240 - 16% open · ⏱️ 03.10.2025):
git clone https://github.com/MAIF/shapash
- [PyPi](https://pypi.org/project/shapash) (📥 7.5K / month · 📦 4 · ⏱️ 24.07.2025):
pip install shapash
explainerdashboard (🥈30 · ⭐ 2.5K) - Quickly build Explainable AI dashboards that show the inner.. MIT - [GitHub](https://github.com/oegedijk/explainerdashboard) (👨‍💻 23 · 🔀 340 · 📦 650 · 📋 240 - 16% open · ⏱️ 01.08.2025):
git clone https://github.com/oegedijk/explainerdashboard
- [PyPi](https://pypi.org/project/explainerdashboard) (📥 42K / month · 📦 15 · ⏱️ 03.06.2025):
pip install explainerdashboard
- [Conda](https://anaconda.org/conda-forge/explainerdashboard) (📥 75K · ⏱️ 04.06.2025):
conda install -c conda-forge explainerdashboard
fairlearn (🥈30 · ⭐ 2.1K) - A Python package to assess and improve fairness of machine.. MIT - [GitHub](https://github.com/fairlearn/fairlearn) (👨‍💻 100 · 🔀 470 · 📦 3 · 📋 520 - 24% open · ⏱️ 27.10.2025):
git clone https://github.com/fairlearn/fairlearn
- [PyPi](https://pypi.org/project/fairlearn) (📥 160K / month · 📦 80 · ⏱️ 19.10.2025):
pip install fairlearn
- [Conda](https://anaconda.org/conda-forge/fairlearn) (📥 55K · ⏱️ 22.04.2025):
conda install -c conda-forge fairlearn
dtreeviz (🥈28 · ⭐ 3.1K · 💤) - A python library for decision tree visualization and model.. MIT - [GitHub](https://github.com/parrt/dtreeviz) (👨‍💻 27 · 🔀 340 · 📦 1.6K · 📋 210 - 35% open · ⏱️ 06.03.2025):
git clone https://github.com/parrt/dtreeviz
- [PyPi](https://pypi.org/project/dtreeviz) (📥 110K / month · 📦 53 · ⏱️ 07.07.2022):
pip install dtreeviz
- [Conda](https://anaconda.org/conda-forge/dtreeviz) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge dtreeviz
Model Analysis (🥈27 · ⭐ 1.3K) - Model analysis tools for TensorFlow. Apache-2 - [GitHub](https://github.com/tensorflow/model-analysis) (👨‍💻 61 · 🔀 280 · 📋 97 - 39% open · ⏱️ 06.08.2025):
git clone https://github.com/tensorflow/model-analysis
- [PyPi](https://pypi.org/project/tensorflow-model-analysis) (📥 200K / month · 📦 20 · ⏱️ 23.06.2025):
pip install tensorflow-model-analysis
Fairness 360 (🥈26 · ⭐ 2.7K) - A comprehensive set of fairness metrics for datasets and.. Apache-2 - [GitHub](https://github.com/Trusted-AI/AIF360) (👨‍💻 73 · 🔀 870 · 📦 740 · 📋 300 - 65% open · ⏱️ 16.10.2025):
git clone https://github.com/Trusted-AI/AIF360
- [PyPi](https://pypi.org/project/aif360) (📥 29K / month · 📦 32 · ⏱️ 08.04.2024):
pip install aif360
- [Conda](https://anaconda.org/conda-forge/aif360) (📥 29K · ⏱️ 22.04.2025):
conda install -c conda-forge aif360
imodels (🥈26 · ⭐ 1.5K) - Interpretable ML package for concise, transparent, and accurate.. MIT - [GitHub](https://github.com/csinva/imodels) (👨‍💻 27 · 🔀 130 · 📦 130 · 📋 98 - 38% open · ⏱️ 26.08.2025):
git clone https://github.com/csinva/imodels
- [PyPi](https://pypi.org/project/imodels) (📥 30K / month · 📦 12 · ⏱️ 26.08.2025):
pip install imodels
LIT (🥉25 · ⭐ 3.6K · 💤) - The Learning Interpretability Tool: Interactively analyze ML models.. Apache-2 - [GitHub](https://github.com/PAIR-code/lit) (👨‍💻 38 · 🔀 360 · 📋 210 - 57% open · ⏱️ 20.12.2024):
git clone https://github.com/PAIR-code/lit
- [PyPi](https://pypi.org/project/lit-nlp) (📥 11K / month · 📦 3 · ⏱️ 20.12.2024):
pip install lit-nlp
- [Conda](https://anaconda.org/conda-forge/lit-nlp) (📥 130K · ⏱️ 22.04.2025):
conda install -c conda-forge lit-nlp
responsible-ai-widgets (🥉25 · ⭐ 1.6K · 💤) - Responsible AI Toolbox is a suite of tools providing.. MIT - [GitHub](https://github.com/microsoft/responsible-ai-toolbox) (👨‍💻 43 · 🔀 430 · 📋 330 - 28% open · ⏱️ 07.02.2025):
git clone https://github.com/microsoft/responsible-ai-toolbox
- [PyPi](https://pypi.org/project/raiwidgets) (📥 10K / month · 📦 6 · ⏱️ 08.07.2024):
pip install raiwidgets
aequitas (🥉25 · ⭐ 730 · 💤) - Bias Auditing & Fair ML Toolkit. MIT - [GitHub](https://github.com/dssg/aequitas) (👨‍💻 23 · 🔀 120 · 📦 200 · 📋 99 - 51% open · ⏱️ 25.03.2025):
git clone https://github.com/dssg/aequitas
- [PyPi](https://pypi.org/project/aequitas) (📥 19K / month · 📦 8 · ⏱️ 30.01.2024):
pip install aequitas
Explainability 360 (🥉24 · ⭐ 1.7K · 💤) - Interpretability and explainability of data and.. Apache-2 - [GitHub](https://github.com/Trusted-AI/AIX360) (👨‍💻 41 · 🔀 310 · 📦 170 · 📋 86 - 62% open · ⏱️ 26.02.2025):
git clone https://github.com/Trusted-AI/AIX360
- [PyPi](https://pypi.org/project/aix360) (📥 1.8K / month · 📦 1 · ⏱️ 31.07.2023):
pip install aix360
keract (🥉24 · ⭐ 1.1K) - Layers Outputs and Gradients in Keras. Made easy. MIT - [GitHub](https://github.com/philipperemy/keract) (👨‍💻 17 · 🔀 190 · 📦 260 · 📋 89 - 3% open · ⏱️ 07.04.2025):
git clone https://github.com/philipperemy/keract
- [PyPi](https://pypi.org/project/keract) (📥 8.8K / month · 📦 7 · ⏱️ 07.04.2025):
pip install keract
DiCE (🥉23 · ⭐ 1.5K) - Generate Diverse Counterfactual Explanations for any machine.. MIT - [GitHub](https://github.com/interpretml/DiCE) (👨‍💻 23 · 🔀 210 · 📋 190 - 49% open · ⏱️ 13.07.2025):
git clone https://github.com/interpretml/DiCE
- [PyPi](https://pypi.org/project/dice-ml) (📥 48K / month · 📦 13 · ⏱️ 13.07.2025):
pip install dice-ml
LOFO (🥉19 · ⭐ 840 · 💤) - Leave One Feature Out Importance. MIT - [GitHub](https://github.com/aerdem4/lofo-importance) (👨‍💻 6 · 🔀 87 · 📦 42 · 📋 30 - 13% open · ⏱️ 14.02.2025):
git clone https://github.com/aerdem4/lofo-importance
- [PyPi](https://pypi.org/project/lofo-importance) (📥 1.5K / month · 📦 5 · ⏱️ 14.02.2025):
pip install lofo-importance
random-forest-importances (🥉19 · ⭐ 620 · 💤) - Code to compute permutation and drop-column.. MIT - [GitHub](https://github.com/parrt/random-forest-importances) (👨‍💻 16 · 🔀 130 · 📋 39 - 20% open · ⏱️ 24.03.2025):
git clone https://github.com/parrt/random-forest-importances
- [PyPi](https://pypi.org/project/rfpimp) (📥 16K / month · 📦 5 · ⏱️ 28.01.2021):
pip install rfpimp
fairness-indicators (🥉18 · ⭐ 360) - TensorFlow's Fairness Evaluation and Visualization.. Apache-2 - [GitHub](https://github.com/tensorflow/fairness-indicators) (👨‍💻 39 · 🔀 86 · 📋 45 - 77% open · ⏱️ 04.08.2025):
git clone https://github.com/tensorflow/fairness-indicators
- [PyPi](https://pypi.org/project/fairness-indicators) (📥 1.1K / month · ⏱️ 25.06.2025):
pip install fairness-indicators
Show 32 hidden projects... - Lime (🥈32 · ⭐ 12K · 💀) - Lime: Explaining the predictions of any machine learning classifier. BSD-2 - pyLDAvis (🥈29 · ⭐ 1.8K · 💀) - Python library for interactive topic model visualization... BSD-3 - yellowbrick (🥈27 · ⭐ 4.4K · 💀) - Visual analysis and diagnostic tools to facilitate.. Apache-2 - Deep Checks (🥈27 · ⭐ 3.9K) - Deepchecks: Tests for Continuous Validation of ML Models &.. ❗️AGPL-3.0 - Alibi (🥈27 · ⭐ 2.6K) - Algorithms for explaining machine learning models. ❗️Intel - scikit-plot (🥈27 · ⭐ 2.4K · 💀) - An intuitive library to add plotting functionality to.. MIT - DALEX (🥈27 · ⭐ 1.4K) - moDel Agnostic Language for Exploration and eXplanation (JMLR 2018;.. ❗️GPL-3.0 - eli5 (🥈26 · ⭐ 2.8K · 💀) - A library for debugging/inspecting machine learning classifiers and.. MIT - iNNvestigate (🥈26 · ⭐ 1.3K · 💀) - A toolbox to iNNvestigate neural networks predictions!. BSD-2 - Lucid (🥉25 · ⭐ 4.7K · 💀) - A collection of infrastructure and tools for research in.. Apache-2 - keras-vis (🥉25 · ⭐ 3K · 💀) - Neural network visualization toolkit for keras. MIT - CausalNex (🥉24 · ⭐ 2.4K · 💀) - A Python library that helps data scientists to infer.. Apache-2 - checklist (🥉24 · ⭐ 2K · 💀) - Beyond Accuracy: Behavioral Testing of NLP models with CheckList. MIT - What-If Tool (🥉23 · ⭐ 980 · 💀) - Source code/webpage/demos for the What-If Tool. Apache-2 - tf-explain (🥉22 · ⭐ 1K · 💀) - Interpretability Methods for tf.keras models with Tensorflow.. MIT - deeplift (🥉22 · ⭐ 870 · 💀) - Public facing deeplift repo. MIT - TreeInterpreter (🥉22 · ⭐ 760 · 💀) - Package for interpreting scikit-learns decision tree.. BSD-3 - Quantus (🥉22 · ⭐ 630) - Quantus is an eXplainable AI toolkit for responsible evaluation of.. ❗️GPL-3.0 - XAI (🥉21 · ⭐ 1.2K · 💀) - XAI - An eXplainability toolbox for machine learning. MIT - tcav (🥉20 · ⭐ 640 · 💀) - Code for the TCAV ML interpretability project. Apache-2 - ecco (🥉19 · ⭐ 2.1K · 💀) - Explain, analyze, and visualize NLP language models. Ecco creates.. BSD-3 - sklearn-evaluation (🥉17 · ⭐ 460 · 💀) - Machine learning model evaluation made easy: plots,.. MIT - model-card-toolkit (🥉17 · ⭐ 440 · 💀) - A toolkit that streamlines and automates the.. Apache-2 - Anchor (🥉16 · ⭐ 810 · 💀) - Code for High-Precision Model-Agnostic Explanations paper. BSD-2 - FlashTorch (🥉15 · ⭐ 740 · 💀) - Visualization toolkit for neural networks in PyTorch! Demo --. MIT - ExplainX.ai (🥉15 · ⭐ 440 · 💀) - Explainable AI framework for data scientists. Explain & debug any.. MIT - effector (🥉15 · ⭐ 120) - Effector - a Python package for global and regional effect methods. MIT - Skater (🥉14 · ⭐ 1.1K) - Python Library for Model Interpretation/Explanations. ❗️UPL-1.0 - interpret-text (🥉14 · ⭐ 430 · 💀) - A library that incorporates state-of-the-art explainers.. MIT - bias-detector (🥉13 · ⭐ 45 · 💀) - Bias Detector is a python package for detecting bias in machine.. MIT - Attribution Priors (🥉12 · ⭐ 120 · 💀) - Tools for training explainable models using.. MIT - contextual-ai (🥉12 · ⭐ 87 · 💀) - Contextual AI adds explainability to different stages of.. Apache-2


Vector Similarity Search (ANN)

Libraries for Approximate Nearest Neighbor Search and Vector Indexing/Similarity Search.

🔗 ANN Benchmarks (⭐ 5.5K) - Benchmarks of approximate nearest neighbor libraries in Python.

Milvus (🥇43 · ⭐ 38K) - Milvus is a high-performance, cloud-native vector database built for.. Apache-2 - [GitHub](https://github.com/milvus-io/milvus) (👨‍💻 330 · 🔀 3.5K · 📥 290K · 📋 15K - 5% open · ⏱️ 30.10.2025):
git clone https://github.com/milvus-io/milvus
- [PyPi](https://pypi.org/project/pymilvus) (📥 3.3M / month · 📦 350 · ⏱️ 19.09.2025):
pip install pymilvus
- [Docker Hub](https://hub.docker.com/r/milvusdb/milvus) (📥 72M · ⭐ 90 · ⏱️ 30.10.2025):
docker pull milvusdb/milvus
Faiss (🥇42 · ⭐ 38K · 📈) - A library for efficient similarity search and clustering of dense vectors. MIT - [GitHub](https://github.com/facebookresearch/faiss) (👨‍💻 260 · 🔀 4K · 📦 5K · 📋 2.7K - 9% open · ⏱️ 30.10.2025):
git clone https://github.com/facebookresearch/faiss
- [PyPi](https://pypi.org/project/faiss-cpu):
pip install faiss-cpu
- [Conda](https://anaconda.org/conda-forge/faiss) (📥 3M · ⏱️ 22.04.2025):
conda install -c conda-forge faiss
Annoy (🥈35 · ⭐ 14K) - Approximate Nearest Neighbors in C++/Python optimized for memory usage.. Apache-2 - [GitHub](https://github.com/spotify/annoy) (👨‍💻 90 · 🔀 1.2K · 📦 5.4K · 📋 420 - 16% open · ⏱️ 29.10.2025):
git clone https://github.com/spotify/annoy
- [PyPi](https://pypi.org/project/annoy) (📥 1M / month · 📦 200 · ⏱️ 14.06.2023):
pip install annoy
- [Conda](https://anaconda.org/conda-forge/python-annoy) (📥 800K · ⏱️ 01.09.2025):
conda install -c conda-forge python-annoy
USearch (🥈33 · ⭐ 3.2K) - Fast Open-Source Search & Clustering engine for Vectors & Arbitrary.. Apache-2 - [GitHub](https://github.com/unum-cloud/USearch) (👨‍💻 81 · 🔀 230 · 📥 110K · 📦 210 · 📋 250 - 32% open · ⏱️ 29.10.2025):
git clone https://github.com/unum-cloud/usearch
- [PyPi](https://pypi.org/project/usearch) (📥 140K / month · 📦 44 · ⏱️ 04.09.2025):
pip install usearch
- [npm](https://www.npmjs.com/package/usearch) (📥 18K / month · 📦 23 · ⏱️ 29.10.2025):
npm install usearch
- [Docker Hub](https://hub.docker.com/r/unum/usearch) (📥 480 · ⭐ 1 · ⏱️ 29.10.2025):
docker pull unum/usearch
NMSLIB (🥈32 · ⭐ 3.5K) - Non-Metric Space Library (NMSLIB): An efficient similarity search.. Apache-2 - [GitHub](https://github.com/nmslib/nmslib) (👨‍💻 49 · 🔀 460 · 📦 1.4K · 📋 440 - 20% open · ⏱️ 22.10.2025):
git clone https://github.com/nmslib/nmslib
- [PyPi](https://pypi.org/project/nmslib) (📥 280K / month · 📦 67 · ⏱️ 23.10.2025):
pip install nmslib
- [Conda](https://anaconda.org/conda-forge/nmslib) (📥 230K · ⏱️ 30.08.2025):
conda install -c conda-forge nmslib
PyNNDescent (🥉28 · ⭐ 950) - A Python nearest neighbor descent for approximate nearest neighbors. BSD-2 - [GitHub](https://github.com/lmcinnes/pynndescent) (👨‍💻 31 · 🔀 110 · 📦 13K · 📋 140 - 53% open · ⏱️ 17.10.2025):
git clone https://github.com/lmcinnes/pynndescent
- [PyPi](https://pypi.org/project/pynndescent) (📥 2.5M / month · 📦 160 · ⏱️ 17.06.2024):
pip install pynndescent
- [Conda](https://anaconda.org/conda-forge/pynndescent) (📥 2.5M · ⏱️ 22.04.2025):
conda install -c conda-forge pynndescent
NGT (🥉22 · ⭐ 1.3K) - Nearest Neighbor Search with Neighborhood Graph and Tree for High-.. Apache-2 - [GitHub](https://github.com/yahoojapan/NGT) (👨‍💻 19 · 🔀 120 · 📋 150 - 18% open · ⏱️ 15.10.2025):
git clone https://github.com/yahoojapan/NGT
- [PyPi](https://pypi.org/project/ngt) (📥 1.8K / month · 📦 12 · ⏱️ 26.02.2025):
pip install ngt
Show 5 hidden projects... - hnswlib (🥈32 · ⭐ 5K · 💀) - Header-only C++/python library for fast approximate nearest.. Apache-2 - NearPy (🥉22 · ⭐ 770 · 💀) - Python framework for fast (approximated) nearest neighbour search in.. MIT - N2 (🥉22 · ⭐ 580 · 💀) - TOROS N2 - lightweight approximate Nearest Neighbor library which runs.. Apache-2 - Magnitude (🥉20 · ⭐ 1.7K · 💀) - A fast, efficient universal vector embedding utility package. MIT - PySparNN (🥉11 · ⭐ 920 · 💀) - Approximate Nearest Neighbor Search for Sparse Data in Python!. BSD-3
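
For orientation, the libraries in this section approximate the exact search sketched below, trading some recall for much lower query latency on large collections. This is a minimal brute-force baseline in plain Python, not the API of any listed project; the names `cosine_similarity` and `exact_top_k` and the toy vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k=2):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    # Sort by similarity only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Toy data, chosen only for illustration.
vectors = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
]
nearest = exact_top_k([1.0, 0.0], vectors, k=2)
print(nearest)
```

Exact search compares the query against every stored vector, which is the per-query cost that ANN indexes are designed to avoid.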


Probabilistics & Statistics

Libraries providing capabilities for probabilistic programming/reasoning, Bayesian inference, Gaussian processes, or statistics.

PyMC3 (🥇40 · ⭐ 9.3K) - Bayesian Modeling and Probabilistic Programming in Python. Apache-2 - [GitHub](https://github.com/pymc-devs/pymc) (👨‍💻 530 · 🔀 2.1K · 📥 140 · 📦 7.7K · 📋 3.6K - 11% open · ⏱️ 28.10.2025):
git clone https://github.com/pymc-devs/pymc
- [PyPi](https://pypi.org/project/pymc3) (📥 330K / month · 📦 190 · ⏱️ 31.05.2024):
pip install pymc3
- [Conda](https://anaconda.org/conda-forge/pymc3) (📥 860K · ⏱️ 22.04.2025):
conda install -c conda-forge pymc3
tensorflow-probability (🥇35 · ⭐ 4.4K) - Probabilistic reasoning and statistical analysis in.. Apache-2 - [GitHub](https://github.com/tensorflow/probability) (👨‍💻 500 · 🔀 1.1K · 📦 4 · 📋 1.5K - 48% open · ⏱️ 22.10.2025):
git clone https://github.com/tensorflow/probability
- [PyPi](https://pypi.org/project/tensorflow-probability) (📥 880K / month · 📦 620 · ⏱️ 08.11.2024):
pip install tensorflow-probability
- [Conda](https://anaconda.org/conda-forge/tensorflow-probability) (📥 200K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow-probability
GPyTorch (🥇34 · ⭐ 3.8K) - A highly efficient implementation of Gaussian Processes in PyTorch. MIT - [GitHub](https://github.com/cornellius-gp/gpytorch) (👨‍💻 140 · 🔀 580 · 📦 3.2K · 📋 1.4K - 28% open · ⏱️ 14.10.2025):
git clone https://github.com/cornellius-gp/gpytorch
- [PyPi](https://pypi.org/project/gpytorch) (📥 500K / month · 📦 250 · ⏱️ 14.10.2025):
pip install gpytorch
- [Conda](https://anaconda.org/conda-forge/gpytorch) (📥 230K · ⏱️ 18.10.2025):
conda install -c conda-forge gpytorch
pgmpy (🥇34 · ⭐ 3.1K) - Python library for causal inference and probabilistic modeling. MIT - [GitHub](https://github.com/pgmpy/pgmpy) (👨‍💻 180 · 🔀 860 · 📥 680 · 📦 1.7K · 📋 1.1K - 27% open · ⏱️ 29.10.2025):
git clone https://github.com/pgmpy/pgmpy
- [PyPi](https://pypi.org/project/pgmpy) (📥 120K / month · 📦 72 · ⏱️ 31.03.2025):
pip install pgmpy
patsy (🥇34 · ⭐ 980) - Describing statistical models in Python using symbolic formulas. BSD-2 - [GitHub](https://github.com/pydata/patsy) (👨‍💻 23 · 🔀 100 · 📦 130K · 📋 160 - 46% open · ⏱️ 20.10.2025):
git clone https://github.com/pydata/patsy
- [PyPi](https://pypi.org/project/patsy) (📥 22M / month · 📦 680 · ⏱️ 20.10.2025):
pip install patsy
- [Conda](https://anaconda.org/conda-forge/patsy) (📥 19M · ⏱️ 20.10.2025):
conda install -c conda-forge patsy
Pyro (🥈32 · ⭐ 8.9K) - Deep universal probabilistic programming with Python and PyTorch. Apache-2 - [GitHub](https://github.com/pyro-ppl/pyro) (👨‍💻 160 · 🔀 1K · 📋 1.1K - 24% open · ⏱️ 09.07.2025):
git clone https://github.com/pyro-ppl/pyro
- [PyPi](https://pypi.org/project/pyro-ppl) (📥 630K / month · 📦 190 · ⏱️ 02.06.2024):
pip install pyro-ppl
- [Conda](https://anaconda.org/conda-forge/pyro-ppl) (📥 280K · ⏱️ 22.04.2025):
conda install -c conda-forge pyro-ppl
SALib (🥈31 · ⭐ 960) - Sensitivity Analysis Library in Python. Contains Sobol, Morris, FAST, and.. MIT - [GitHub](https://github.com/SALib/SALib) (👨‍💻 54 · 🔀 250 · 📦 1.6K · 📋 350 - 18% open · ⏱️ 12.10.2025):
git clone https://github.com/SALib/SALib
- [PyPi](https://pypi.org/project/salib) (📥 250K / month · 📦 190 · ⏱️ 12.10.2025):
pip install salib
- [Conda](https://anaconda.org/conda-forge/salib) (📥 290K · ⏱️ 12.10.2025):
conda install -c conda-forge salib
hmmlearn (🥈30 · ⭐ 3.3K · 💤) - Hidden Markov Models in Python, with scikit-learn like API. BSD-3 - [GitHub](https://github.com/hmmlearn/hmmlearn) (👨‍💻 49 · 🔀 740 · 📦 3.7K · 📋 450 - 16% open · ⏱️ 31.10.2024):
git clone https://github.com/hmmlearn/hmmlearn
- [PyPi](https://pypi.org/project/hmmlearn) (📥 240K / month · 📦 92 · ⏱️ 31.10.2024):
pip install hmmlearn
- [Conda](https://anaconda.org/conda-forge/hmmlearn) (📥 430K · ⏱️ 10.09.2025):
conda install -c conda-forge hmmlearn
emcee (🥈30 · ⭐ 1.5K) - The Python ensemble sampling toolkit for affine-invariant MCMC. MIT - [GitHub](https://github.com/dfm/emcee) (👨‍💻 76 · 🔀 430 · 📦 3.2K · 📋 300 - 19% open · ⏱️ 14.10.2025):
git clone https://github.com/dfm/emcee
- [PyPi](https://pypi.org/project/emcee) (📥 170K / month · 📦 440 · ⏱️ 19.04.2024):
pip install emcee
- [Conda](https://anaconda.org/conda-forge/emcee) (📥 510K · ⏱️ 22.04.2025):
conda install -c conda-forge emcee
GPflow (🥉29 · ⭐ 1.9K) - Gaussian processes in TensorFlow. Apache-2 - [GitHub](https://github.com/GPflow/GPflow) (👨‍💻 84 · 🔀 430 · 📦 790 · 📋 840 - 19% open · ⏱️ 29.05.2025):
git clone https://github.com/GPflow/GPflow
- [PyPi](https://pypi.org/project/gpflow) (📥 32K / month · 📦 43 · ⏱️ 29.05.2025):
pip install gpflow
- [Conda](https://anaconda.org/conda-forge/gpflow) (📥 51K · ⏱️ 22.04.2025):
conda install -c conda-forge gpflow
bambi (🥉29 · ⭐ 1.2K) - BAyesian Model-Building Interface (Bambi) in Python. MIT - [GitHub](https://github.com/bambinos/bambi) (👨‍💻 47 · 🔀 140 · 📦 220 · 📋 460 - 21% open · ⏱️ 24.10.2025):
git clone https://github.com/bambinos/bambi
- [PyPi](https://pypi.org/project/bambi) (📥 48K / month · 📦 19 · ⏱️ 24.10.2025):
pip install bambi
- [Conda](https://anaconda.org/conda-forge/bambi) (📥 56K · ⏱️ 27.10.2025):
conda install -c conda-forge bambi
pomegranate (🥉26 · ⭐ 3.5K · 💤) - Fast, flexible and easy to use probabilistic modelling in Python. MIT - [GitHub](https://github.com/jmschrei/pomegranate) (👨‍💻 75 · 🔀 590 · 📋 800 - 4% open · ⏱️ 07.02.2025):
git clone https://github.com/jmschrei/pomegranate
- [PyPi](https://pypi.org/project/pomegranate) (📥 36K / month · 📦 67 · ⏱️ 07.02.2025):
pip install pomegranate
- [Conda](https://anaconda.org/conda-forge/pomegranate) (📥 230K · ⏱️ 22.04.2025):
conda install -c conda-forge pomegranate
scikit-posthocs (🥉24 · ⭐ 380) - Multiple Pairwise Comparisons (Post Hoc) Tests in Python. MIT - [GitHub](https://github.com/maximtrp/scikit-posthocs) (👨‍💻 18 · 🔀 41 · 📥 67 · 📦 1.2K · 📋 72 - 6% open · ⏱️ 11.09.2025):
git clone https://github.com/maximtrp/scikit-posthocs
- [PyPi](https://pypi.org/project/scikit-posthocs) (📥 120K / month · 📦 73 · ⏱️ 29.03.2025):
pip install scikit-posthocs
- [Conda](https://anaconda.org/conda-forge/scikit-posthocs) (📥 1.1M · ⏱️ 22.04.2025):
conda install -c conda-forge scikit-posthocs
pandas-ta (🥉23 · ⭐ 5.5K) - Technical Analysis Indicators - Pandas TA is an easy to use.. MIT - [GitHub](https://github.com/twopirllc/pandas-ta) (👨‍💻 40 · 🔀 1.1K):
git clone https://github.com/twopirllc/pandas-ta
- [PyPi](https://pypi.org/project/pandas-ta) (📥 290K / month · 📦 190 · ⏱️ 14.09.2025):
pip install pandas-ta
- [Conda](https://anaconda.org/conda-forge/pandas-ta) (📥 39K · ⏱️ 23.09.2025):
conda install -c conda-forge pandas-ta
Baal (🥉22 · ⭐ 910) - Bayesian active learning library for research and industrial usecases. Apache-2 - [GitHub](https://github.com/baal-org/baal) (👨‍💻 24 · 🔀 87 · 📦 67 · 📋 120 - 18% open · ⏱️ 07.10.2025):
git clone https://github.com/baal-org/baal
- [PyPi](https://pypi.org/project/baal) (📥 1.8K / month · 📦 2 · ⏱️ 24.06.2025):
pip install baal
- [Conda](https://anaconda.org/conda-forge/baal) (📥 15K · ⏱️ 22.04.2025):
conda install -c conda-forge baal
Orbit (🥉21 · ⭐ 2K) - A Python package for Bayesian forecasting with object-oriented design.. Apache-2 - [GitHub](https://github.com/uber/orbit) (👨‍💻 21 · 🔀 140 · 📋 410 - 13% open · ⏱️ 05.06.2025):
git clone https://github.com/uber/orbit
- [PyPi](https://pypi.org/project/orbit-ml) (📥 24K / month · 📦 1 · ⏱️ 01.04.2024):
pip install orbit-ml
pyhsmm (🥉21 · ⭐ 570 · 💤) - Bayesian inference in HSMMs and HMMs. MIT - [GitHub](https://github.com/mattjj/pyhsmm) (👨‍💻 14 · 🔀 170 · 📦 35 · 📋 100 - 39% open · ⏱️ 25.01.2025):
git clone https://github.com/mattjj/pyhsmm
- [PyPi](https://pypi.org/project/pyhsmm) (📥 300 / month · 📦 1 · ⏱️ 10.05.2017):
pip install pyhsmm
TorchUncertainty (🥉20 · ⭐ 440 · 📉) - Open-source framework for uncertainty and deep.. Apache-2 - [GitHub](https://github.com/ENSTA-U2IS-AI/torch-uncertainty) (👨‍💻 13 · 🔀 35 · 📋 67 - 23% open · ⏱️ 31.07.2025):
git clone https://github.com/ENSTA-U2IS-AI/torch-uncertainty
- [PyPi](https://pypi.org/project/torch-uncertainty) (📥 920 / month · 📦 4 · ⏱️ 31.07.2025):
pip install torch-uncertainty
Show 6 hidden projects... - filterpy (🥈31 · ⭐ 3.7K · 💀) - Python Kalman filtering and optimal estimation library. Implements.. MIT - pingouin (🥉29 · ⭐ 1.8K) - Statistical package in Python based on Pandas. ❗️GPL-3.0 - Edward (🥉27 · ⭐ 4.8K · 💀) - A probabilistic programming language in TensorFlow. Deep.. Apache-2 - PyStan (🥉27 · ⭐ 360 · 💀) - PyStan, a Python interface to Stan, a platform for statistical.. ISC - Funsor (🥉21 · ⭐ 240 · 💀) - Functional tensors for probabilistic programming. Apache-2 - ZhuSuan (🥉15 · ⭐ 2.2K · 💀) - A probabilistic programming library for Bayesian deep learning,.. MIT


Adversarial Robustness

Libraries for testing the robustness of machine learning models against attacks with adversarial/malicious examples.

ART (🥇34 · ⭐ 5.6K) - Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning.. MIT - [GitHub](https://github.com/Trusted-AI/adversarial-robustness-toolbox) (👨‍💻 140 · 🔀 1.2K · 📦 770 · 📋 910 - 1% open · ⏱️ 17.10.2025):
git clone https://github.com/Trusted-AI/adversarial-robustness-toolbox
- [PyPi](https://pypi.org/project/adversarial-robustness-toolbox) (📥 29K / month · 📦 25 · ⏱️ 07.07.2025):
pip install adversarial-robustness-toolbox
- [Conda](https://anaconda.org/conda-forge/adversarial-robustness-toolbox) (📥 85K · ⏱️ 07.07.2025):
conda install -c conda-forge adversarial-robustness-toolbox
TextAttack (🥈28 · ⭐ 3.3K) - TextAttack is a Python framework for adversarial attacks, data.. MIT - [GitHub](https://github.com/QData/TextAttack) (👨‍💻 67 · 🔀 420 · 📦 430 · 📋 290 - 23% open · ⏱️ 10.07.2025):
git clone https://github.com/QData/TextAttack
- [PyPi](https://pypi.org/project/textattack) (📥 9.1K / month · 📦 11 · ⏱️ 11.03.2024):
pip install textattack
- [Conda](https://anaconda.org/conda-forge/textattack) (📥 11K · ⏱️ 22.04.2025):
conda install -c conda-forge textattack
Show 7 hidden projects... - CleverHans (🥈29 · ⭐ 6.4K · 💀) - An adversarial example library for constructing attacks,.. MIT - Foolbox (🥈28 · ⭐ 2.9K · 💀) - A Python toolbox to create adversarial examples that fool neural.. MIT - advertorch (🥉24 · ⭐ 1.4K · 💀) - A Toolbox for Adversarial Robustness Research. ❗️GPL-3.0 - robustness (🥉20 · ⭐ 950 · 💀) - A library for experimenting with, training and evaluating neural.. MIT - AdvBox (🥉19 · ⭐ 1.4K · 💀) - Advbox is a toolbox to generate adversarial examples that fool.. Apache-2 - textflint (🥉17 · ⭐ 650 · 💀) - Unified Multilingual Robustness Evaluation Toolkit for.. ❗️GPL-3.0 - Adversary (🥉15 · ⭐ 400 · 💀) - Tool to generate adversarial text examples and test machine.. MIT
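To make "adversarial example" concrete, here is a minimal sketch of the fast gradient sign method (FGSM) against a toy logistic-regression model, written in plain Python. It does not use the API of any library listed above; the model weights, the input, and the `epsilon` value are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Shift every feature by epsilon in the direction that increases the loss."""
    # For cross-entropy loss, the gradient w.r.t. the input is (p - y) * w.
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (assumptions, not taken from any library).
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1

x_adv = fgsm(weights, bias, x, label, epsilon=0.5)
p_clean = predict(weights, bias, x)
p_adv = predict(weights, bias, x_adv)
print(f"clean: {p_clean:.3f}, adversarial: {p_adv:.3f}")
```

With these toy values the perturbation pushes the predicted probability of the true class below 0.5, so the predicted label flips. The libraries in this section implement this kind of attack, and many stronger ones, for real models.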


GPU & Accelerator Utilities

Libraries that require and make use of CUDA/GPU or other accelerator hardware capabilities to optimize machine learning tasks.

optimum (🥇37 · ⭐ 3.1K) - Accelerate inference and training of Transformers, Diffusers, TIMM.. Apache-2 - [GitHub](https://github.com/huggingface/optimum) (👨‍💻 150 · 🔀 600 · 📦 6.3K · 📋 860 - 30% open · ⏱️ 30.10.2025):
git clone https://github.com/huggingface/optimum
- [PyPi](https://pypi.org/project/optimum) (📥 3.7M / month · 📦 270 · ⏱️ 09.10.2025):
pip install optimum
- [Conda](https://anaconda.org/conda-forge/optimum) (📥 50K · ⏱️ 09.10.2025):
conda install -c conda-forge optimum
cuDF (🥇35 · ⭐ 9.3K) - cuDF - GPU DataFrame Library. Apache-2 - [GitHub](https://github.com/rapidsai/cudf) (👨‍💻 310 · 🔀 980 · 📦 64 · 📋 7.3K - 15% open · ⏱️ 30.10.2025):
git clone https://github.com/rapidsai/cudf
- [PyPi](https://pypi.org/project/cudf) (📥 2.8K / month · 📦 22 · ⏱️ 01.06.2020):
pip install cudf
PyCUDA (🥈33 · ⭐ 2K) - CUDA integration for Python, plus shiny features. MIT - [GitHub](https://github.com/inducer/pycuda) (👨‍💻 83 · 🔀 300 · 📦 4K · 📋 290 - 29% open · ⏱️ 12.10.2025):
git clone https://github.com/inducer/pycuda
- [PyPi](https://pypi.org/project/pycuda) (📥 66K / month · 📦 200 · ⏱️ 09.09.2025):
pip install pycuda
- [Conda](https://anaconda.org/conda-forge/pycuda) (📥 1.1M · ⏱️ 27.10.2025):
conda install -c conda-forge pycuda
Apex (🥈32 · ⭐ 8.8K) - A PyTorch Extension: Tools for easy mixed precision and distributed.. BSD-3 - [GitHub](https://github.com/NVIDIA/apex) (👨‍💻 140 · 🔀 1.4K · 📦 3.3K · 📋 1.3K - 57% open · ⏱️ 29.10.2025):
git clone https://github.com/NVIDIA/apex
- [Conda](https://anaconda.org/conda-forge/nvidia-apex) (📥 580K · ⏱️ 26.07.2025):
conda install -c conda-forge nvidia-apex
cuML (🥈31 · ⭐ 5K) - cuML - RAPIDS Machine Learning Library. Apache-2 - [GitHub](https://github.com/rapidsai/cuml) (👨‍💻 190 · 🔀 600 · 📋 3K - 31% open · ⏱️ 30.10.2025):
git clone https://github.com/rapidsai/cuml
- [PyPi](https://pypi.org/project/cuml) (📥 2.5K / month · 📦 14 · ⏱️ 01.06.2020):
pip install cuml
gpustat (🥈29 · ⭐ 4.3K) - A simple command-line utility for querying and monitoring GPU status. MIT - [GitHub](https://github.com/wookayin/gpustat) (👨‍💻 17 · 🔀 280 · 📦 7.9K · 📋 130 - 22% open · ⏱️ 13.04.2025):
git clone https://github.com/wookayin/gpustat
- [PyPi](https://pypi.org/project/gpustat) (📥 1.1M / month · 📦 150 · ⏱️ 22.08.2023):
pip install gpustat
- [Conda](https://anaconda.org/conda-forge/gpustat) (📥 310K · ⏱️ 22.04.2025):
conda install -c conda-forge gpustat
ArrayFire (🥈28 · ⭐ 4.8K) - ArrayFire: a general purpose GPU library. BSD-3 - [GitHub](https://github.com/arrayfire/arrayfire) (👨‍💻 97 · 🔀 540 · 📥 9.6K · 📋 1.8K - 19% open · ⏱️ 28.07.2025):
git clone https://github.com/arrayfire/arrayfire
- [PyPi](https://pypi.org/project/arrayfire) (📥 4.5K / month · 📦 13 · ⏱️ 22.02.2022):
pip install arrayfire
cuGraph (🥈28 · ⭐ 2.1K) - cuGraph - RAPIDS Graph Analytics Library. Apache-2 - [GitHub](https://github.com/rapidsai/cugraph) (👨‍💻 120 · 🔀 340 · 📋 1.9K - 6% open · ⏱️ 29.10.2025):
git clone https://github.com/rapidsai/cugraph
- [PyPi](https://pypi.org/project/cugraph) (📥 550 / month · 📦 4 · ⏱️ 01.06.2020):
pip install cugraph
- [Conda](https://anaconda.org/conda-forge/libcugraph) (📥 69K · ⏱️ 22.04.2025):
conda install -c conda-forge libcugraph
CuPy (🥉27 · ⭐ 11K) - NumPy & SciPy for GPU. MIT - [GitHub](https://github.com/cupy/cupy) (👨‍💻 340 · 🔀 950):
git clone https://github.com/cupy/cupy
- [PyPi](https://pypi.org/project/cupy) (📥 39K / month · 📦 400 · ⏱️ 18.08.2025):
pip install cupy
- [Conda](https://anaconda.org/conda-forge/cupy) (📥 7.2M · ⏱️ 14.09.2025):
conda install -c conda-forge cupy
- [Docker Hub](https://hub.docker.com/r/cupy/cupy) (📥 92K · ⭐ 14 · ⏱️ 18.08.2025):
docker pull cupy/cupy
DALI (🥉25 · ⭐ 5.5K) - A GPU-accelerated library containing highly optimized building blocks.. Apache-2 - [GitHub](https://github.com/NVIDIA/DALI) (👨‍💻 99 · 🔀 650 · 📋 1.7K - 15% open · ⏱️ 30.10.2025):
git clone https://github.com/NVIDIA/DALI
Vulkan Kompute (🥉23 · ⭐ 2.4K) - General purpose GPU compute framework built on Vulkan to.. Apache-2 - [GitHub](https://github.com/KomputeProject/kompute) (👨‍💻 35 · 🔀 160 · 📥 700 · 📋 230 - 32% open · ⏱️ 05.10.2025):
git clone https://github.com/KomputeProject/kompute
- [PyPi](https://pypi.org/project/kp) (📥 1.8K / month · ⏱️ 20.01.2024):
pip install kp
Show 9 hidden projects... - GPUtil (🥉25 · ⭐ 1.2K · 💀) - A Python module for getting the GPU status from NVIDIA GPUs using.. MIT - scikit-cuda (🥉25 · ⭐ 990 · 💀) - Python interface to GPU-powered libraries. BSD-3 - py3nvml (🥉22 · ⭐ 250 · 💀) - Python 3 Bindings for NVML library. Get NVIDIA GPU status inside.. BSD-3 - BlazingSQL (🥉20 · ⭐ 2K · 💀) - BlazingSQL is a lightweight, GPU accelerated, SQL engine for.. Apache-2 - Merlin (🥉20 · ⭐ 860 · 💀) - NVIDIA Merlin is an open source library providing end-to-end GPU-.. Apache-2 - nvidia-ml-py3 (🥉18 · ⭐ 140 · 💀) - Python 3 Bindings for the NVIDIA Management Library. BSD-3 - SpeedTorch (🥉15 · ⭐ 680 · 💀) - Library for faster pinned CPU - GPU transfer in Pytorch. MIT - ipyexperiments (🥉15 · ⭐ 220 · 💀) - Automatic GPU+CPU memory profiling, re-use and memory.. Apache-2 - cuSignal (🥉14 · ⭐ 730 · 💀) - GPU accelerated signal processing. ❗Unlicensed


TensorFlow Utilities

Libraries that extend TensorFlow with additional capabilities.

TensorFlow Datasets (🥇39 · ⭐ 4.5K) - TFDS is a collection of datasets ready to use with.. Apache-2 - [GitHub](https://github.com/tensorflow/datasets) (👨‍💻 660 · 🔀 1.6K · 📦 25K · 📋 1.5K - 47% open · ⏱️ 17.10.2025):
git clone https://github.com/tensorflow/datasets
- [PyPi](https://pypi.org/project/tensorflow-datasets) (📥 1.8M / month · 📦 340 · ⏱️ 28.05.2025):
pip install tensorflow-datasets
- [Conda](https://anaconda.org/conda-forge/tensorflow-datasets) (📥 51K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow-datasets
tensorflow-hub (🥈31 · ⭐ 3.5K · 💤) - A library for transfer learning by reusing parts of.. Apache-2 - [GitHub](https://github.com/tensorflow/hub) (👨‍💻 110 · 🔀 1.7K · 📋 710 - 2% open · ⏱️ 17.01.2025):
git clone https://github.com/tensorflow/hub
- [PyPi](https://pypi.org/project/tensorflow-hub) (📥 2M / month · 📦 300 · ⏱️ 30.01.2024):
pip install tensorflow-hub
- [Conda](https://anaconda.org/conda-forge/tensorflow-hub) (📥 130K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorflow-hub
TFX (🥈31 · ⭐ 2.2K · 💤) - TFX is an end-to-end platform for deploying production ML.. Apache-2 - [GitHub](https://github.com/tensorflow/tfx) (👨‍💻 200 · 🔀 710 · 📦 1.8K · 📋 1.2K - 22% open · ⏱️ 26.03.2025):
git clone https://github.com/tensorflow/tfx
- [PyPi](https://pypi.org/project/tfx) (📥 37K / month · 📦 17 · ⏱️ 11.12.2024):
pip install tfx
TF Model Optimization (🥈29 · ⭐ 1.6K) - A toolkit to optimize ML models for deployment for.. Apache-2 - [GitHub](https://github.com/tensorflow/model-optimization) (👨‍💻 87 · 🔀 320 · 📋 400 - 57% open · ⏱️ 07.07.2025):
git clone https://github.com/tensorflow/model-optimization
- [PyPi](https://pypi.org/project/tensorflow-model-optimization) (📥 920K / month · 📦 45 · ⏱️ 08.02.2024):
pip install tensorflow-model-optimization
TensorFlow I/O (🥈29 · ⭐ 730) - Dataset, streaming, and file system extensions.. Apache-2 - [GitHub](https://github.com/tensorflow/io) (👨‍💻 120 · 🔀 290 · 📋 660 - 44% open · ⏱️ 10.04.2025):
git clone https://github.com/tensorflow/io
- [PyPi](https://pypi.org/project/tensorflow-io) (📥 730K / month · 📦 61 · ⏱️ 01.07.2024):
pip install tensorflow-io
TensorFlow Transform (🥉26 · ⭐ 990) - Input pipeline framework. Apache-2 - [GitHub](https://github.com/tensorflow/transform) (👨‍💻 31 · 🔀 220 · 📋 220 - 17% open · ⏱️ 06.08.2025):
git clone https://github.com/tensorflow/transform
- [PyPi](https://pypi.org/project/tensorflow-transform) (📥 250K / month · 📦 19 · ⏱️ 13.06.2025):
pip install tensorflow-transform
Neural Structured Learning (🥉24 · ⭐ 1K · 💤) - Training neural models with structured signals. Apache-2 - [GitHub](https://github.com/tensorflow/neural-structured-learning) (👨‍💻 39 · 🔀 190 · 📦 520 · 📋 69 - 1% open · ⏱️ 29.01.2025):
git clone https://github.com/tensorflow/neural-structured-learning
- [PyPi](https://pypi.org/project/neural-structured-learning) (📥 3.2K / month · 📦 3 · ⏱️ 29.07.2022):
pip install neural-structured-learning
TensorFlow Cloud (🥉21 · ⭐ 380) - The TensorFlow Cloud repository provides APIs that.. Apache-2 - [GitHub](https://github.com/tensorflow/cloud) (👨‍💻 29 · 🔀 92 · 📋 100 - 73% open · ⏱️ 01.10.2025):
git clone https://github.com/tensorflow/cloud
- [PyPi](https://pypi.org/project/tensorflow-cloud) (📥 18K / month · 📦 7 · ⏱️ 17.06.2021):
pip install tensorflow-cloud
TF Compression (🥉20 · ⭐ 900) - Data compression in TensorFlow. Apache-2 - [GitHub](https://github.com/tensorflow/compression) (👨‍💻 24 · 🔀 260 · 📋 100 - 10% open · ⏱️ 19.08.2025):
git clone https://github.com/tensorflow/compression
- [PyPi](https://pypi.org/project/tensorflow-compression) (📥 4.3K / month · 📦 2 · ⏱️ 02.02.2024):
pip install tensorflow-compression
Show 7 hidden projects... - tensor2tensor (🥇33 · ⭐ 17K · 💀) - Library of deep learning models and datasets designed.. Apache-2 - TF Addons (🥈32 · ⭐ 1.7K · 💀) - Useful extra functionality for TensorFlow 2.x maintained.. Apache-2 - Keras-Preprocessing (🥉28 · ⭐ 1K · 💀) - Utilities for working with image data, text data, and.. MIT - efficientnet (🥉26 · ⭐ 2.1K · 💀) - Implementation of EfficientNet model. Keras and.. Apache-2 - Saliency (🥉22 · ⭐ 980 · 💀) - Framework-agnostic implementation for state-of-the-art.. Apache-2 - TensorNets (🥉21 · ⭐ 1K · 💀) - High level network definitions with pre-trained weights in.. MIT - tffm (🥉18 · ⭐ 780 · 💀) - TensorFlow implementation of an arbitrary order Factorization Machine. MIT


JAX Utilities

Libraries that extend JAX with additional capabilities.

equinox (🥇33 · ⭐ 2.6K) - Elegant easy-to-use neural networks + scientific computing in.. Apache-2 - [GitHub](https://github.com/patrick-kidger/equinox) (👨‍💻 81 · 🔀 170 · 📦 1.4K · 📋 610 - 35% open · ⏱️ 29.10.2025):
git clone https://github.com/patrick-kidger/equinox
- [PyPi](https://pypi.org/project/equinox) (📥 500K / month · 📦 350 · ⏱️ 09.10.2025):
pip install equinox
Show 2 hidden projects... - evojax (🥉18 · ⭐ 920 · 💀) - EvoJAX: Hardware-accelerated Neuroevolution. Apache-2 - jaxdf (🥉12 · ⭐ 130 · 💀) - A JAX-based research framework for writing differentiable.. ❗️LGPL-3.0


Sklearn Utilities

Libraries that extend scikit-learn with additional capabilities.

scikit-learn-intelex (🥇35 · ⭐ 1.3K) - Extension for Scikit-learn is a seamless way to speed.. Apache-2 - [GitHub](https://github.com/uxlfoundation/scikit-learn-intelex) (👨‍💻 86 · 🔀 180 · 📦 14K · 📋 250 - 15% open · ⏱️ 28.10.2025):
git clone https://github.com/uxlfoundation/scikit-learn-intelex
- [PyPi](https://pypi.org/project/scikit-learn-intelex) (📥 89K / month · 📦 74 · ⏱️ 22.10.2025):
pip install scikit-learn-intelex
- [Conda](https://anaconda.org/conda-forge/scikit-learn-intelex) (📥 650K · ⏱️ 30.10.2025):
conda install -c conda-forge scikit-learn-intelex
imbalanced-learn (🥇33 · ⭐ 7.1K) - A Python Package to Tackle the Curse of Imbalanced.. MIT - [GitHub](https://github.com/scikit-learn-contrib/imbalanced-learn) (👨‍💻 89 · 🔀 1.3K · 📋 630 - 8% open · ⏱️ 14.08.2025):
git clone https://github.com/scikit-learn-contrib/imbalanced-learn
- [PyPi](https://pypi.org/project/imbalanced-learn) (📥 14M / month · 📦 600 · ⏱️ 14.08.2025):
pip install imbalanced-learn
- [Conda](https://anaconda.org/conda-forge/imbalanced-learn) (📥 750K · ⏱️ 14.08.2025):
conda install -c conda-forge imbalanced-learn
MLxtend (🥇33 · ⭐ 5.1K) - A library of extension and helper modules for Python's data.. BSD-3 - [GitHub](https://github.com/rasbt/mlxtend) (👨‍💻 110 · 🔀 880 · 📦 21K · 📋 500 - 29% open · ⏱️ 19.06.2025):
git clone https://github.com/rasbt/mlxtend
- [PyPi](https://pypi.org/project/mlxtend) (📥 960K / month · 📦 200 · ⏱️ 26.01.2025):
pip install mlxtend
- [Conda](https://anaconda.org/conda-forge/mlxtend) (📥 460K · ⏱️ 22.04.2025):
conda install -c conda-forge mlxtend
category_encoders (🥈31 · ⭐ 2.5K · 💤) - A library of sklearn compatible categorical variable.. BSD-3 - [GitHub](https://github.com/scikit-learn-contrib/category_encoders) (👨‍💻 71 · 🔀 400 · 📦 4.1K · 📋 300 - 13% open · ⏱️ 24.03.2025):
git clone https://github.com/scikit-learn-contrib/category_encoders
- [PyPi](https://pypi.org/project/category_encoders) (📥 2.1M / month · 📦 310 · ⏱️ 15.03.2025):
pip install category_encoders
- [Conda](https://anaconda.org/conda-forge/category_encoders) (📥 370K · ⏱️ 22.04.2025):
conda install -c conda-forge category_encoders
scikit-lego (🥈28 · ⭐ 1.4K) - Extra blocks for scikit-learn pipelines. MIT - [GitHub](https://github.com/koaning/scikit-lego) (👨‍💻 69 · 🔀 120 · 📦 190 · 📋 340 - 9% open · ⏱️ 21.10.2025):
git clone https://github.com/koaning/scikit-lego
- [PyPi](https://pypi.org/project/scikit-lego) (📥 53K / month · 📦 13 · ⏱️ 15.09.2025):
pip install scikit-lego
- [Conda](https://anaconda.org/conda-forge/scikit-lego) (📥 76K · ⏱️ 22.04.2025):
conda install -c conda-forge scikit-lego
scikit-opt (🥉26 · ⭐ 6.2K) - Genetic Algorithm, Particle Swarm Optimization, Simulated.. MIT - [GitHub](https://github.com/guofei9987/scikit-opt) (👨‍💻 24 · 🔀 1.1K · 📦 280 · 📋 180 - 37% open · ⏱️ 31.08.2025):
git clone https://github.com/guofei9987/scikit-opt
- [PyPi](https://pypi.org/project/scikit-opt) (📥 9.1K / month · 📦 15 · ⏱️ 14.01.2022):
pip install scikit-opt
iterative-stratification (🥉21 · ⭐ 880 · 💤) - scikit-learn cross validators for iterative.. BSD-3 - [GitHub](https://github.com/trent-b/iterative-stratification) (👨‍💻 7 · 🔀 75 · 📦 620 · 📋 27 - 7% open · ⏱️ 12.10.2024):
git clone https://github.com/trent-b/iterative-stratification
- [PyPi](https://pypi.org/project/iterative-stratification) (📥 54K / month · 📦 15 · ⏱️ 12.10.2024):
pip install iterative-stratification
scikit-tda (🥉19 · ⭐ 550) - Topological Data Analysis for Python. MIT - [GitHub](https://github.com/scikit-tda/scikit-tda) (👨‍💻 7 · 🔀 54 · 📦 93 · 📋 23 - 17% open · ⏱️ 28.10.2025):
git clone https://github.com/scikit-tda/scikit-tda
- [PyPi](https://pypi.org/project/scikit-tda) (📥 1.8K / month · ⏱️ 19.07.2024):
pip install scikit-tda
Show 11 hidden projects... - scikit-survival (🥈32 · ⭐ 1.2K) - Survival analysis built on top of scikit-learn. ❗️GPL-3.0 - fancyimpute (🥈27 · ⭐ 1.3K · 💀) - Multivariate imputation and matrix completion.. Apache-2 - scikit-multilearn (🥈27 · ⭐ 950 · 💀) - A scikit-learn based module for multi-label et al... BSD-2 - sklearn-crfsuite (🥉25 · ⭐ 430 · 💀) - scikit-learn inspired API for CRFsuite. MIT - skope-rules (🥉22 · ⭐ 650 · 💀) - machine learning with logical rules in Python. ❗️BSD-1-Clause - combo (🥉21 · ⭐ 660 · 💀) - (AAAI 20) A Python Toolbox for Machine Learning Model.. BSD-2 xgboost - celer (🥉21 · ⭐ 230) - Fast solver for L1-type problems: Lasso, sparse Logistic regression,.. BSD-3 - sklearn-contrib-lightning (🥉20 · ⭐ 1.8K · 💀) - Large-scale linear classification, regression and.. BSD-3 - dabl (🥉18 · ⭐ 730 · 💀) - Data Analysis Baseline Library. BSD-3 - DESlib (🥉18 · ⭐ 490 · 💀) - A Python library for dynamic classifier and ensemble selection. BSD-3 - skggm (🥉17 · ⭐ 250) - Scikit-learn compatible estimation of general graphical models. MIT


PyTorch Utilities

Libraries that extend PyTorch with additional capabilities.

accelerate (🥇43 · ⭐ 9.2K) - A simple way to launch, train, and use PyTorch models on.. Apache-2 - [GitHub](https://github.com/huggingface/accelerate) (👨‍💻 370 · 🔀 1.2K · 📦 110K · 📋 1.9K - 5% open · ⏱️ 22.10.2025):
git clone https://github.com/huggingface/accelerate
- [PyPi](https://pypi.org/project/accelerate) (📥 17M / month · 📦 2.8K · ⏱️ 20.10.2025):
pip install accelerate
- [Conda](https://anaconda.org/conda-forge/accelerate) (📥 670K · ⏱️ 24.10.2025):
conda install -c conda-forge accelerate
tinygrad (🥇33 · ⭐ 30K) - You like pytorch? You like micrograd? You love tinygrad!. MIT - [GitHub](https://github.com/tinygrad/tinygrad) (👨‍💻 420 · 🔀 3.6K · 📦 20 · 📋 1K - 12% open · ⏱️ 30.10.2025):
git clone https://github.com/tinygrad/tinygrad
PML (🥇33 · ⭐ 6.2K) - The easiest way to use deep metric learning in your application. Modular,.. MIT - [GitHub](https://github.com/KevinMusgrave/pytorch-metric-learning) (👨‍💻 45 · 🔀 660 · 📦 2.9K · 📋 530 - 14% open · ⏱️ 17.08.2025):
git clone https://github.com/KevinMusgrave/pytorch-metric-learning
- [PyPi](https://pypi.org/project/pytorch-metric-learning) (📥 2.3M / month · 📦 68 · ⏱️ 17.08.2025):
pip install pytorch-metric-learning
- [Conda](https://anaconda.org/metric-learning/pytorch-metric-learning) (📥 13K · ⏱️ 25.03.2025):
conda install -c metric-learning pytorch-metric-learning
torchdiffeq (🥇31 · ⭐ 6.2K) - Differentiable ODE solvers with full GPU support and.. MIT - [GitHub](https://github.com/rtqichen/torchdiffeq) (👨‍💻 23 · 🔀 940 · 📦 5.5K · 📋 230 - 35% open · ⏱️ 04.04.2025):
git clone https://github.com/rtqichen/torchdiffeq
- [PyPi](https://pypi.org/project/torchdiffeq) (📥 1M / month · 📦 120 · ⏱️ 21.11.2024):
pip install torchdiffeq
- [Conda](https://anaconda.org/conda-forge/torchdiffeq) (📥 24K · ⏱️ 22.04.2025):
conda install -c conda-forge torchdiffeq
torchsde (🥈30 · ⭐ 1.7K · 💤) - Differentiable SDE solvers with GPU support and efficient.. Apache-2 - [GitHub](https://github.com/google-research/torchsde) (👨‍💻 9 · 🔀 210 · 📦 5.5K · 📋 84 - 36% open · ⏱️ 30.12.2024):
git clone https://github.com/google-research/torchsde
- [PyPi](https://pypi.org/project/torchsde) (📥 4.6M / month · 📦 37 · ⏱️ 26.09.2023):
pip install torchsde
- [Conda](https://anaconda.org/conda-forge/torchsde) (📥 46K · ⏱️ 22.04.2025):
conda install -c conda-forge torchsde
torch-scatter (🥈26 · ⭐ 1.7K) - PyTorch Extension Library of Optimized Scatter Operations. MIT - [GitHub](https://github.com/rusty1s/pytorch_scatter) (👨‍💻 34 · 🔀 200 · 📋 420 - 6% open · ⏱️ 12.08.2025):
git clone https://github.com/rusty1s/pytorch_scatter
- [PyPi](https://pypi.org/project/torch-scatter) (📥 82K / month · 📦 150 · ⏱️ 06.10.2023):
pip install torch-scatter
- [Conda](https://anaconda.org/conda-forge/pytorch_scatter) (📥 1M · ⏱️ 03.10.2025):
conda install -c conda-forge pytorch_scatter
PyTorch Sparse (🥈25 · ⭐ 1.1K) - PyTorch Extension Library of Optimized Autograd Sparse.. MIT - [GitHub](https://github.com/rusty1s/pytorch_sparse) (👨‍💻 48 · 🔀 160 · 📋 300 - 10% open · ⏱️ 12.08.2025):
git clone https://github.com/rusty1s/pytorch_sparse
- [PyPi](https://pypi.org/project/torch-sparse) (📥 63K / month · 📦 120 · ⏱️ 06.10.2023):
pip install torch-sparse
- [Conda](https://anaconda.org/conda-forge/pytorch_sparse) (📥 940K · ⏱️ 03.10.2025):
conda install -c conda-forge pytorch_sparse
Pytorch Toolbelt (🥉24 · ⭐ 1.6K) - PyTorch extensions for fast R&D prototyping and Kaggle.. MIT - [GitHub](https://github.com/BloodAxe/pytorch-toolbelt) (👨‍💻 9 · 🔀 120 · 📥 180 · 📋 33 - 12% open · ⏱️ 09.10.2025):
git clone https://github.com/BloodAxe/pytorch-toolbelt
- [PyPi](https://pypi.org/project/pytorch_toolbelt) (📥 8.3K / month · 📦 12 · ⏱️ 21.11.2024):
pip install pytorch_toolbelt
madgrad (🥉18 · ⭐ 800 · 💤) - MADGRAD Optimization Method. MIT - [GitHub](https://github.com/facebookresearch/madgrad) (👨‍💻 3 · 🔀 58 · 📦 110 · ⏱️ 27.01.2025):
git clone https://github.com/facebookresearch/madgrad
- [PyPi](https://pypi.org/project/madgrad) (📥 9.7K / month · 📦 1 · ⏱️ 08.03.2022):
pip install madgrad
pytorchviz (🥉14 · ⭐ 3.4K · 💤) - A small package to create visualizations of PyTorch execution.. MIT - [GitHub](https://github.com/szagoruyko/pytorchviz) (👨‍💻 6 · 🔀 280 · 📋 72 - 47% open · ⏱️ 30.12.2024):
git clone https://github.com/szagoruyko/pytorchviz
Show 22 hidden projects... - pretrainedmodels (🥈29 · ⭐ 9.1K · 💀) - Pretrained ConvNets for pytorch: NASNet, ResNeXt,.. BSD-3 - EfficientNet-PyTorch (🥈28 · ⭐ 8.2K · 💀) - A PyTorch implementation of EfficientNet. Apache-2 - lightning-flash (🥈27 · ⭐ 1.7K · 💀) - Your PyTorch AI Factory - Flash enables you to easily.. Apache-2 - pytorch-optimizer (🥈26 · ⭐ 3.1K · 💀) - torch-optimizer -- collection of optimizers for.. Apache-2 - TabNet (🥈26 · ⭐ 2.9K · 💀) - PyTorch implementation of TabNet paper :.. MIT - EfficientNets (🥈25 · ⭐ 1.6K · 💀) - Pretrained EfficientNet, EfficientNet-Lite, MixNet,.. Apache-2 - pytorch-summary (🥉24 · ⭐ 4.1K · 💀) - Model summary in PyTorch similar to `model.summary()`.. MIT - Higher (🥉23 · ⭐ 1.6K · 💀) - higher is a pytorch library allowing users to obtain higher.. Apache-2 - micrograd (🥉22 · ⭐ 14K · 💀) - A tiny scalar-valued autograd engine and a neural net library.. MIT - SRU (🥉22 · ⭐ 2.1K · 💀) - Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755). MIT - Antialiased CNNs (🥉22 · ⭐ 1.7K · 💀) - pip install antialiased-cnns to improve stability and.. ❗️CC BY-NC-SA 4.0 - AdaBound (🥉21 · ⭐ 2.9K · 💀) - An optimizer that trains as fast as Adam and as good as SGD. Apache-2 - reformer-pytorch (🥉21 · ⭐ 2.2K · 💀) - Reformer, the efficient Transformer, in Pytorch. MIT - Torchmeta (🥉21 · ⭐ 2K · 💀) - A collection of extensions and data-loaders for few-shot.. MIT - Poutyne (🥉21 · ⭐ 580) - A simplified framework and utilities for PyTorch. ❗️LGPL-3.0 - Performer Pytorch (🥉19 · ⭐ 1.2K · 💀) - An implementation of Performer, a linear attention-.. MIT - Torch-Struct (🥉19 · ⭐ 1.1K · 💀) - Fast, general, and tested differentiable structured.. MIT - Lambda Networks (🥉17 · ⭐ 1.5K · 💀) - Implementation of LambdaNetworks, a new approach to.. MIT - Pywick (🥉17 · ⭐ 400 · 💀) - High-level batteries-included neural network training library for.. MIT - TorchDrift (🥉15 · ⭐ 320 · 💀) - Drift Detection for your PyTorch Models. Apache-2 - Tez (🥉14 · ⭐ 1.2K · 💀) - Tez is a super-simple and lightweight Trainer for PyTorch. It.. Apache-2 - Tensor Sensor (🥉14 · ⭐ 810 · 💀) - The goal of this library is to generate more helpful.. MIT


Database Clients

Libraries for connecting to, operating, and querying databases.

🔗 best-of-python - DB Clients (⭐ 4.2K) - Collection of database clients for Python.


Others

scipy (🥇51 · ⭐ 14K) - Ecosystem of open-source software for mathematics, science, and engineering. BSD-3 - [GitHub](https://github.com/scipy/scipy) (👨‍💻 1.8K · 🔀 5.5K · 📥 97K · 📦 1.4M · 📋 11K - 15% open · ⏱️ 30.10.2025):
git clone https://github.com/scipy/scipy
- [PyPi](https://pypi.org/project/scipy) (📥 220M / month · 📦 61K · ⏱️ 28.10.2025):
pip install scipy
- [Conda](https://anaconda.org/conda-forge/scipy) (📥 70M · ⏱️ 29.10.2025):
conda install -c conda-forge scipy
SymPy (🥇49 · ⭐ 14K) - A computer algebra system written in pure Python. BSD-3 - [GitHub](https://github.com/sympy/sympy) (👨‍💻 1.4K · 🔀 4.8K · 📥 570K · 📦 290K · 📋 15K - 37% open · ⏱️ 30.10.2025):
git clone https://github.com/sympy/sympy
- [PyPi](https://pypi.org/project/sympy) (📥 73M / month · 📦 4.6K · ⏱️ 27.04.2025):
pip install sympy
- [Conda](https://anaconda.org/conda-forge/sympy) (📥 11M · ⏱️ 29.04.2025):
conda install -c conda-forge sympy
Streamlit (🥇47 · ⭐ 42K) - Streamlit: A faster way to build and share data apps. Apache-2 - [GitHub](https://github.com/streamlit/streamlit) (👨‍💻 570 · 🔀 3.8K · 📦 1M · 📋 5.7K - 23% open · ⏱️ 30.10.2025):
git clone https://github.com/streamlit/streamlit
- [PyPi](https://pypi.org/project/streamlit) (📥 19M / month · 📦 4.6K · ⏱️ 29.10.2025):
pip install streamlit
Gradio (🥇46 · ⭐ 40K) - Wrap UIs around any model, share with anyone. Apache-2 - [GitHub](https://github.com/gradio-app/gradio) (👨‍💻 700 · 🔀 3.1K · 📦 84K · 📋 6.1K - 6% open · ⏱️ 29.10.2025):
git clone https://github.com/gradio-app/gradio
- [PyPi](https://pypi.org/project/gradio) (📥 11M / month · 📦 1.6K · ⏱️ 22.10.2025):
pip install gradio
carla (🥇37 · ⭐ 13K) - Open-source simulator for autonomous driving research. MIT - [GitHub](https://github.com/carla-simulator/carla) (👨‍💻 190 · 🔀 4.2K · 📦 1.1K · 📋 6.2K - 18% open · ⏱️ 30.10.2025):
git clone https://github.com/carla-simulator/carla
- [PyPi](https://pypi.org/project/carla) (📥 18K / month · 📦 16 · ⏱️ 14.09.2025):
pip install carla
Autograd (🥇37 · ⭐ 7.4K) - Efficiently computes derivatives of NumPy code. MIT - [GitHub](https://github.com/HIPS/autograd) (👨‍💻 64 · 🔀 910 · 📦 14K · 📋 440 - 42% open · ⏱️ 27.10.2025):
git clone https://github.com/HIPS/autograd
- [PyPi](https://pypi.org/project/autograd) (📥 3.4M / month · 📦 310 · ⏱️ 05.05.2025):
pip install autograd
- [Conda](https://anaconda.org/conda-forge/autograd) (📥 680K · ⏱️ 05.05.2025):
conda install -c conda-forge autograd
PennyLane (🥇37 · ⭐ 2.9K) - PennyLane is a cross-platform Python library for quantum.. Apache-2 - [GitHub](https://github.com/PennyLaneAI/pennylane) (👨‍💻 210 · 🔀 700 · 📥 100 · 📦 1.9K · 📋 1.7K - 25% open · ⏱️ 30.10.2025):
git clone https://github.com/PennyLaneAI/pennylane
- [PyPi](https://pypi.org/project/pennylane) (📥 200K / month · 📦 89 · ⏱️ 15.10.2025):
pip install pennylane
- [Conda](https://anaconda.org/conda-forge/pennylane) (📥 340K · ⏱️ 22.04.2025):
conda install -c conda-forge pennylane
PyOD (🥈36 · ⭐ 9.6K) - A Python Library for Outlier and Anomaly Detection, Integrating Classical.. BSD-2 - [GitHub](https://github.com/yzhao062/pyod) (👨‍💻 64 · 🔀 1.4K · 📦 5.5K · 📋 390 - 59% open · ⏱️ 29.04.2025):
git clone https://github.com/yzhao062/pyod
- [PyPi](https://pypi.org/project/pyod) (📥 840K / month · 📦 130 · ⏱️ 29.04.2025):
pip install pyod
- [Conda](https://anaconda.org/conda-forge/pyod) (📥 170K · ⏱️ 30.04.2025):
conda install -c conda-forge pyod
Datasette (🥈35 · ⭐ 10K) - An open source multi-tool for exploring and publishing data. Apache-2 - [GitHub](https://github.com/simonw/datasette) (👨‍💻 82 · 🔀 770 · 📥 75 · 📦 1.6K · 📋 1.9K - 32% open · ⏱️ 26.10.2025):
git clone https://github.com/simonw/datasette
- [PyPi](https://pypi.org/project/datasette) (📥 180K / month · 📦 480 · ⏱️ 22.04.2025):
pip install datasette
- [Conda](https://anaconda.org/conda-forge/datasette) (📥 73K · ⏱️ 22.04.2025):
conda install -c conda-forge datasette
DeepChem (🥈34 · ⭐ 6.3K · 📉) - Democratizing Deep-Learning for Drug Discovery, Quantum.. MIT - [GitHub](https://github.com/deepchem/deepchem) (👨‍💻 260 · 🔀 1.9K · 📦 650 · 📋 2.1K - 40% open · ⏱️ 27.10.2025):
git clone https://github.com/deepchem/deepchem
- [PyPi](https://pypi.org/project/deepchem) (📥 54K / month · 📦 24 · ⏱️ 27.10.2025):
pip install deepchem
- [Conda](https://anaconda.org/conda-forge/deepchem) (📥 120K · ⏱️ 22.04.2025):
conda install -c conda-forge deepchem
Pythran (🥈34 · ⭐ 2.1K) - Ahead of Time compiler for numeric kernels. BSD-3 - [GitHub](https://github.com/serge-sans-paille/pythran) (👨‍💻 75 · 🔀 200 · 📦 3.6K · 📋 930 - 15% open · ⏱️ 30.09.2025):
git clone https://github.com/serge-sans-paille/pythran
- [PyPi](https://pypi.org/project/pythran) (📥 500K / month · 📦 28 · ⏱️ 23.05.2025):
pip install pythran
- [Conda](https://anaconda.org/conda-forge/pythran) (📥 1.3M · ⏱️ 07.07.2025):
conda install -c conda-forge pythran
agate (🥈34 · ⭐ 1.2K) - A Python data analysis library that is optimized for humans instead of.. MIT - [GitHub](https://github.com/wireservice/agate) (👨‍💻 55 · 🔀 150 · 📦 5.3K · 📋 650 - 0% open · ⏱️ 27.10.2025):
git clone https://github.com/wireservice/agate
- [PyPi](https://pypi.org/project/agate) (📥 24M / month · 📦 54 · ⏱️ 29.01.2025):
pip install agate
- [Conda](https://anaconda.org/conda-forge/agate) (📥 410K · ⏱️ 22.04.2025):
conda install -c conda-forge agate
River (🥈32 · ⭐ 5.6K) - Online machine learning in Python. BSD-3 - [GitHub](https://github.com/online-ml/river) (👨‍💻 130 · 🔀 590 · 📦 800 · 📋 630 - 19% open · ⏱️ 05.10.2025):
git clone https://github.com/online-ml/river
- [PyPi](https://pypi.org/project/river) (📥 91K / month · 📦 64 · ⏱️ 25.11.2024):
pip install river
- [Conda](https://anaconda.org/conda-forge/river) (📥 130K · ⏱️ 22.04.2025):
conda install -c conda-forge river
hdbscan (🥈32 · ⭐ 3K) - A high performance implementation of HDBSCAN clustering. BSD-3 - [GitHub](https://github.com/scikit-learn-contrib/hdbscan) (👨‍💻 97 · 🔀 500 · 📦 7.6K · 📋 530 - 67% open · ⏱️ 11.10.2025):
git clone https://github.com/scikit-learn-contrib/hdbscan
- [PyPi](https://pypi.org/project/hdbscan) (📥 1.1M / month · 📦 350 · ⏱️ 18.11.2024):
pip install hdbscan
- [Conda](https://anaconda.org/conda-forge/hdbscan) (📥 2.8M · ⏱️ 09.09.2025):
conda install -c conda-forge hdbscan
anomalib (🥈31 · ⭐ 5.1K · 📉) - An anomaly detection library comprising state-of-the-art.. Apache-2 - [GitHub](https://github.com/open-edge-platform/anomalib) (👨‍💻 98 · 🔀 820 · 📥 42K · 📦 200 · 📋 1.2K - 6% open · ⏱️ 27.10.2025):
git clone https://github.com/open-edge-platform/anomalib
- [PyPi](https://pypi.org/project/anomalib) (📥 200K / month · 📦 7 · ⏱️ 09.10.2025):
pip install anomalib
pyjanitor (🥈31 · ⭐ 1.5K) - Clean APIs for data cleaning. Python implementation of R package.. MIT - [GitHub](https://github.com/pyjanitor-devs/pyjanitor) (👨‍💻 110 · 🔀 170 · 📦 980 · 📋 590 - 18% open · ⏱️ 21.10.2025):
git clone https://github.com/pyjanitor-devs/pyjanitor
- [PyPi](https://pypi.org/project/pyjanitor) (📥 280K / month · 📦 42 · ⏱️ 07.03.2025):
pip install pyjanitor
- [Conda](https://anaconda.org/conda-forge/pyjanitor) (📥 300K · ⏱️ 22.04.2025):
conda install -c conda-forge pyjanitor
causalml (🥈30 · ⭐ 5.6K) - Uplift modeling and causal inference with machine learning.. Apache-2 - [GitHub](https://github.com/uber/causalml) (👨‍💻 71 · 🔀 830 · 📦 310 · 📋 420 - 10% open · ⏱️ 26.09.2025):
git clone https://github.com/uber/causalml
- [PyPi](https://pypi.org/project/causalml) (📥 79K / month · 📦 10 · ⏱️ 09.07.2025):
pip install causalml
dstack (🥈30 · ⭐ 1.9K) - dstack is an open-source control plane for running development,.. MPL-2.0 - [GitHub](https://github.com/dstackai/dstack) (👨‍💻 63 · 🔀 200 · 📦 22 · 📋 1.5K - 6% open · ⏱️ 30.10.2025):
git clone https://github.com/dstackai/dstack
- [PyPi](https://pypi.org/project/dstack) (📥 4.2K / month · ⏱️ 30.10.2025):
pip install dstack
tensorly (🥈30 · ⭐ 1.6K) - TensorLy: Tensor Learning in Python. BSD-2 - [GitHub](https://github.com/tensorly/tensorly) (👨‍💻 73 · 🔀 290 · 📦 1.1K · 📋 280 - 22% open · ⏱️ 05.05.2025):
git clone https://github.com/tensorly/tensorly
- [PyPi](https://pypi.org/project/tensorly) (📥 130K / month · 📦 99 · ⏱️ 12.11.2024):
pip install tensorly
- [Conda](https://anaconda.org/conda-forge/tensorly) (📥 380K · ⏱️ 22.04.2025):
conda install -c conda-forge tensorly
metricflow (🥈29 · ⭐ 1.3K) - MetricFlow allows you to define, build, and maintain metrics in.. Apache-2 - [GitHub](https://github.com/dbt-labs/metricflow) (👨‍💻 52 · 🔀 130 · 📦 37 · 📋 370 - 27% open · ⏱️ 29.10.2025):
git clone https://github.com/dbt-labs/metricflow
- [PyPi](https://pypi.org/project/metricflow) (📥 94K / month · 📦 4 · ⏱️ 14.10.2025):
pip install metricflow
pycm (🥈28 · ⭐ 1.5K) - Multi-class confusion matrix library in Python. MIT - [GitHub](https://github.com/sepandhaghighi/pycm) (👨‍💻 18 · 🔀 120 · 📦 420 · 📋 210 - 7% open · ⏱️ 14.10.2025):
git clone https://github.com/sepandhaghighi/pycm
- [PyPi](https://pypi.org/project/pycm) (📥 190K / month · 📦 28 · ⏱️ 15.10.2025):
pip install pycm
Prince (🥈28 · ⭐ 1.4K) - Multivariate exploratory data analysis in Python PCA, CA, MCA, MFA,.. MIT - [GitHub](https://github.com/MaxHalford/prince) (👨‍💻 16 · 🔀 190 · 📦 770 · ⏱️ 04.08.2025):
git clone https://github.com/MaxHalford/prince
- [PyPi](https://pypi.org/project/prince) (📥 230K / month · 📦 23 · ⏱️ 04.08.2025):
pip install prince
- [Conda](https://anaconda.org/conda-forge/prince-factor-analysis) (📥 28K · ⏱️ 22.04.2025):
conda install -c conda-forge prince-factor-analysis
Trax (🥉27 · ⭐ 8.3K) - Trax Deep Learning with Clear Code and Speed. Apache-2 - [GitHub](https://github.com/google/trax) (👨‍💻 82 · 🔀 830 · 📦 230 · 📋 250 - 50% open · ⏱️ 26.09.2025):
git clone https://github.com/google/trax
- [PyPi](https://pypi.org/project/trax) (📥 4.3K / month · 📦 1 · ⏱️ 26.10.2021):
pip install trax
adapter-transformers (🥉27 · ⭐ 2.8K) - A Unified Library for Parameter-Efficient and Modular.. Apache-2 huggingface - [GitHub](https://github.com/adapter-hub/adapters) (👨‍💻 17 · 🔀 360 · 📦 260 · 📋 410 - 10% open · ⏱️ 12.10.2025):
git clone https://github.com/adapter-hub/adapters
- [PyPi](https://pypi.org/project/adapter-transformers) (📥 4.9K / month · 📦 12 · ⏱️ 07.07.2024):
pip install adapter-transformers
AugLy (🥉26 · ⭐ 5.1K) - A data augmentations library for audio, image, text, and video. MIT - [GitHub](https://github.com/facebookresearch/AugLy) (👨‍💻 42 · 🔀 310 · 📦 180 · 📋 80 - 30% open · ⏱️ 27.10.2025):
git clone https://github.com/facebookresearch/AugLy
- [PyPi](https://pypi.org/project/augly) (📥 13K / month · 📦 4 · ⏱️ 05.12.2023):
pip install augly
avalanche (🥉26 · ⭐ 2K · 💤) - Avalanche: an End-to-End Library for Continual Learning based on.. MIT - [GitHub](https://github.com/ContinualAI/avalanche) (👨‍💻 87 · 🔀 310 · 📥 60 · 📦 140 · 📋 840 - 13% open · ⏱️ 11.03.2025):
git clone https://github.com/ContinualAI/avalanche
- [PyPi](https://pypi.org/project/avalanche-lib) (📥 3.2K / month · 📦 3 · ⏱️ 29.10.2024):
pip install avalanche-lib
gplearn (🥉26 · ⭐ 1.8K) - Genetic Programming in Python, with a scikit-learn inspired API. BSD-3 - [GitHub](https://github.com/trevorstephens/gplearn) (👨‍💻 12 · 🔀 300 · 📦 730 · 📋 220 - 11% open · ⏱️ 23.07.2025):
git clone https://github.com/trevorstephens/gplearn
- [PyPi](https://pypi.org/project/gplearn) (📥 20K / month · 📦 19 · ⏱️ 03.05.2022):
pip install gplearn
- [Conda](https://anaconda.org/conda-forge/gplearn) (📥 11K · ⏱️ 22.04.2025):
conda install -c conda-forge gplearn
TabPy (🥉26 · ⭐ 1.6K · 💤) - Execute Python code on the fly and display results in Tableau.. MIT - [GitHub](https://github.com/tableau/TabPy) (👨‍💻 51 · 🔀 600 · 📦 220 · 📋 320 - 6% open · ⏱️ 25.11.2024):
git clone https://github.com/tableau/TabPy
- [PyPi](https://pypi.org/project/tabpy) (📥 7.1K / month · 📦 2 · ⏱️ 25.11.2024):
pip install tabpy
- [Conda](https://anaconda.org/anaconda/tabpy-client) (📥 5.8K · ⏱️ 22.04.2025):
conda install -c anaconda tabpy-client
findspark (🥉25 · ⭐ 520) - Find pyspark to make it importable. BSD-3 - [GitHub](https://github.com/minrk/findspark) (👨‍💻 16 · 🔀 72 · 📦 5.6K · 📋 23 - 47% open · ⏱️ 04.09.2025):
git clone https://github.com/minrk/findspark
- [PyPi](https://pypi.org/project/findspark) (📥 2.6M / month · 📦 100 · ⏱️ 11.02.2022):
pip install findspark
- [Conda](https://anaconda.org/conda-forge/findspark) (📥 1M · ⏱️ 22.04.2025):
conda install -c conda-forge findspark
vecstack (🥉23 · ⭐ 700) - Python package for stacking (machine learning technique). MIT - [GitHub](https://github.com/vecxoz/vecstack) (👨‍💻 1 · 🔀 82 · 📦 570 · ⏱️ 28.09.2025):
git clone https://github.com/vecxoz/vecstack
- [PyPi](https://pypi.org/project/vecstack) (📥 1.8K / month · 📦 5 · ⏱️ 28.09.2025):
pip install vecstack
- [Conda](https://anaconda.org/conda-forge/vecstack) (📥 3K · ⏱️ 22.04.2025):
conda install -c conda-forge vecstack
MONAILabel (🥉22 · ⭐ 760) - MONAI Label is an intelligent open source image labeling and.. Apache-2 - [GitHub](https://github.com/Project-MONAI/MONAILabel) (👨‍💻 69 · 🔀 240 · 📥 130K · 📋 560 - 26% open · ⏱️ 14.08.2025):
git clone https://github.com/Project-MONAI/MONAILabel
- [PyPi](https://pypi.org/project/monailabel-weekly) (📥 200 / month · ⏱️ 01.10.2023):
pip install monailabel-weekly
apricot (🥉22 · ⭐ 520) - apricot implements submodular optimization for the purpose of selecting.. MIT - [GitHub](https://github.com/jmschrei/apricot) (👨‍💻 4 · 🔀 49 · 📥 33 · 📦 200 · 📋 34 - 38% open · ⏱️ 09.06.2025):
git clone https://github.com/jmschrei/apricot
- [PyPi](https://pypi.org/project/apricot-select) (📥 13K / month · 📦 16 · ⏱️ 18.02.2021):
pip install apricot-select
pykale (🥉21 · ⭐ 470) - Knowledge-Aware machine LEarning (KALE): accessible machine learning.. MIT - [GitHub](https://github.com/pykale/pykale) (👨‍💻 28 · 🔀 70 · 📦 6 · 📋 140 - 8% open · ⏱️ 14.10.2025):
git clone https://github.com/pykale/pykale
- [PyPi](https://pypi.org/project/pykale) (📥 72 / month · ⏱️ 12.04.2022):
pip install pykale
SUOD (🥉21 · ⭐ 390 · 💤) - (MLSys 21) An Acceleration System for Large-scale Unsupervised.. BSD-2 - [GitHub](https://github.com/yzhao062/SUOD) (👨‍💻 3 · 🔀 49 · 📦 560 · 📋 15 - 80% open · ⏱️ 24.03.2025):
git clone https://github.com/yzhao062/SUOD
- [PyPi](https://pypi.org/project/suod) (📥 13K / month · 📦 9 · ⏱️ 24.03.2025):
pip install suod
pymdp (🥉16 · ⭐ 570) - A Python implementation of active inference for Markov Decision Processes. MIT - [GitHub](https://github.com/infer-actively/pymdp) (👨‍💻 19 · 🔀 110 · 📋 130 - 39% open · ⏱️ 09.09.2025):
git clone https://github.com/infer-actively/pymdp
- [PyPi](https://pypi.org/project/inferactively-pymdp) (📥 1.1K / month · ⏱️ 08.12.2022):
pip install inferactively-pymdp
Show 31 hidden projects...
- pyopencl (🥈31 · ⭐ 1.1K) - OpenCL integration for Python, plus shiny features. ❗Unlicensed
- pysc2 (🥈30 · ⭐ 8.2K · 💀) - StarCraft II Learning Environment. Apache-2
- modAL (🥈30 · ⭐ 2.3K · 💀) - A modular active learning framework for Python. MIT
- datalad (🥈30 · ⭐ 620 · 📈) - Keep code, data, containers under control with git and git-.. ❗Unlicensed
- cleanlab (🥈29 · ⭐ 11K) - Cleanlabs open-source library is the standard data-centric AI.. ❗️AGPL-3.0
- alibi-detect (🥈29 · ⭐ 2.4K) - Algorithms for outlier, adversarial and drift detection. ❗️Intel
- minisom (🥈28 · ⭐ 1.6K) - MiniSom is a minimalistic implementation of the Self Organizing.. ❗️CC-BY-3.0
- PySwarms (🥈28 · ⭐ 1.4K · 💀) - A research toolkit for particle swarm optimization in Python. MIT
- kmodes (🥈28 · ⭐ 1.3K · 💀) - Python implementations of the k-modes and k-prototypes clustering.. MIT
- pyclustering (🥈28 · ⭐ 1.2K · 💀) - pyclustering is a Python, C++ data mining library. BSD-3
- Cython BLIS (🥈28 · ⭐ 230) - Fast matrix-multiplication as a self-contained Python library no.. BSD-3
- Feature Engine (🥉26 · ⭐ 2.1K · 💀) - Feature engineering package with sklearn like functionality. BSD-3
- metric-learn (🥉26 · ⭐ 1.4K · 💀) - Metric learning algorithms in Python. MIT
- pandas-ai (🥉25 · ⭐ 22K) - Chat with your database or your datalake (SQL, CSV, parquet)... ❗Unlicensed
- Mars (🥉24 · ⭐ 2.7K · 💀) - Mars is a tensor-based unified framework for large-scale data.. Apache-2
- AstroML (🥉24 · ⭐ 1.1K · 💀) - Machine learning, statistics, and data mining for astronomy.. BSD-2
- PaddleHub (🥉22 · ⭐ 13K · 💀) - 400+ AI Models: Rich, high-quality AI models, including.. Apache-2
- opyrator (🥉22 · ⭐ 3.1K · 💀) - Turns your machine learning code into microservices with web API,.. MIT
- mlens (🥉22 · ⭐ 860 · 💀) - ML-Ensemble high performance ensemble learning. MIT
- BioPandas (🥉22 · ⭐ 740 · 💀) - Working with molecular structures in pandas DataFrames. BSD-3
- benchmark_VAE (🥉21 · ⭐ 2K · 💀) - Unifying Variational Autoencoder (VAE).. Apache-2
- impyute (🥉21 · ⭐ 360 · 💀) - Data imputations library to preprocess datasets with missing data. MIT
- StreamAlert (🥉20 · ⭐ 2.9K · 💀) - StreamAlert is a serverless, realtime data analysis.. Apache-2
- rrcf (🥉20 · ⭐ 520 · 💀) - Implementation of the Robust Random Cut Forest algorithm for anomaly.. MIT
- scikit-rebate (🥉20 · ⭐ 420 · 💀) - A scikit-learn-compatible Python implementation of.. MIT
- baikal (🥉18 · ⭐ 590 · 💀) - A graph-based functional API for building complex scikit-learn.. BSD-3
- pandas-ml (🥉16 · ⭐ 320 · 💀) - pandas, scikit-learn, xgboost and seaborn integration. BSD-3
- KD-Lib (🥉15 · ⭐ 650 · 💀) - A Pytorch Knowledge Distillation library for benchmarking and.. MIT
- NeuralCompression (🥉14 · ⭐ 580 · 💀) - A collection of tools for neural compression enthusiasts. MIT
- traingenerator (🥉13 · ⭐ 1.4K · 💀) - A web app to generate template code for machine learning. MIT
- nylon (🥉12 · ⭐ 82 · 💀) - An intelligent, flexible grammar of machine learning. MIT

Contribution

Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:

  • Open an issue by selecting one of the provided categories from the issue page and fill in the requested information.
  • Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.

If you would like to contribute to or share suggestions regarding the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.

For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.

License

CC0

Credit by: @github.com/ml-tooling/best-of-ml-python

Neural Network Models for Chemistry

A collection of Neural Network Models for chemistry:

  • Quantum Chemistry Method
  • Force Field Method
  • Kernel Methods
  • Not based on Graph Models
  • Graph Domain Models
  • Transformer Domain Models
  • Universal models
  • Empirical force field

Quantum Chemistry Method

  • DeePKS, DeePHF
    DeePKS-kit is a program to generate accurate energy functionals for quantum chemistry systems, for both perturbative scheme (DeePHF) and self-consistent scheme (DeePKS).

  • NeuralXC
    Implementation of a machine-learned density functional.

  • MOB-ML
    Machine learning for molecular orbital theory, with analytic gradients available.

  • DM21
    Pushing the Frontiers of Density Functionals by Solving the Fractional Electron Problem.

  • NN-GGA, NN-NRA, NN-meta-GGA, NN-LSDA
    Completing density functional theory by machine-learning hidden messages from molecules.

  • FermiNet
    FermiNet is a neural network for learning highly accurate ground state wavefunctions of atoms and molecules using a variational Monte Carlo approach.

  • PauliNet
    PauliNet builds upon HF or CASSCF orbitals as a physically meaningful baseline and takes a neural network approach to the SJB wavefunction in order to correct this baseline towards a high-accuracy solution.

  • DeepErwin
    DeepErwin is a Python package that implements and optimizes wave function models for numerical solutions to the multi-electron Schrödinger equation.

  • Jax-DFT
    JAX-DFT implements one-dimensional density functional theory (DFT) in JAX. It uses powerful JAX primitives to enable JIT compilation, automatic differentiation, and high-performance computation on GPUs.

  • sns-mp2
    Improving the accuracy of Møller-Plesset perturbation theory with neural networks.

  • DeepH-pack
    Deep neural networks for density functional theory Hamiltonian.

  • DeepH-E3
    General framework for E(3)-equivariant neural network representation of density functional theory Hamiltonian.

  • kdft
    The Kernel Density Functional (KDF) code allows generating ML-based DFT functionals.

  • ML-DFT
    ML-DFT: Machine learning for density functional approximations. This repository contains the implementation of the kernel ridge regression based density functional approximation method described in the paper "Quantum chemical accuracy from density functional approximations via machine learning".

  • D4FT
    This work proposes a deep-learning approach to KS-DFT. In contrast to the conventional SCF loop, it directly minimizes the total energy by reparameterizing the orthogonality constraint as a feed-forward computation. The authors prove that this approach has the same expressivity as the SCF method yet reduces the computational complexity from O(N^4) to O(N^3).

  • SchOrb
    Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions

  • CiderPress
    Tools for training and evaluating CIDER functionals for use in Density Functional Theory calculations.

  • ML-RPA
    This work demonstrates how machine learning can extend the applicability of the RPA to larger system sizes, time scales, and chemical spaces.

  • ΔOF-MLFF
    A Δ-machine learning model for obtaining Kohn–Sham accuracy from orbital-free density functional theory (DFT) calculations.

  • PairNet
    A molecular orbital based machine learning model for predicting accurate CCSD(T) correlation energies. The model, named PairNet, shows excellent transferability on several public data sets using features inspired by pair natural orbitals (PNOs).

  • SPAHM(a,b)
    SPAHM(a,b): encoding the density information from guess Hamiltonian in quantum machine learning representations

  • GradDFT
    GradDFT is a JAX-based library enabling the differentiable design and experimentation of exchange-correlation functionals using machine learning techniques.

  • lapnet
    A JAX implementation of the algorithm and calculations described in Forward Laplacian: A New Computational Framework for Neural Network-based Variational Monte Carlo.

  • M-OFDFT
    M-OFDFT is a deep-learning implementation of orbital-free density functional theory that achieves DFT-level accuracy on molecular systems but with lower cost complexity, and can extrapolate to much larger molecules than those seen during training

  • ACE-Kohn-Sham DM
    We present a parameterized representation for learning the mapping from a molecular configuration to its corresponding density matrix using the Atomic Cluster Expansion (ACE) framework, which preserves the physical symmetries of the mapping, including isometric equivariance and Grassmannianity.

  • ANN for Schrodinger
    Artificial neural networks (NN) are universal function approximators and have shown great ability in computing the ground state energy of the electronic Schrödinger equation, yet NN has not established itself as a practical and accurate approach to solving the vibrational Schrödinger equation for realistic polyatomic molecules to obtain vibrational energies and wave functions for the excited states

  • equivariant_electron_density
    Generate and predict molecular electron densities with Euclidean Neural Networks

  • DeepDFT
    This is the official Implementation of the DeepDFT model for charge density prediction.

  • DFA_recommender
    System-specific density functional recommender

  • EG-XC
    The accuracy of density functional theory hinges on the approximation of nonlocal contributions to the exchange-correlation (XC) functional. To date, machine-learned and human-designed approximations suffer from insufficient accuracy, limited scalability, or dependence on costly reference data. To address these issues, we introduce Equivariant Graph Exchange Correlation (EG-XC), a novel non-local XC functional based on equivariant graph neural network

  • scdp
    Machine learning methods are promising in significantly accelerating charge density prediction, yet existing approaches either lack accuracy or scalability. They propose a recipe that can achieve both. In particular, they identify three key ingredients: (1) representing the charge density with atomic and virtual orbitals (spherical fields centered at atom/virtual coordinates); (2) using expressive and learnable orbital basis sets (basis function for the spherical fields); and (3) using high-capacity equivariant neural network architecture

  • physics-informed-DFT
    We have developed an approach for physics-informed training of flexible empirical density functionals. In this approach, the “physics knowledge” is transferred from PBE, or any other exact-constraints-based functional, using local exchange−correlation energy density regularization, i.e., by adding its local energies into the training set

  • SchrodingerNet
    SchrödingerNet offers a novel approach to solving the full electronic-nuclear Schrödinger equation (SE) by defining a custom loss function designed to equalize local energies throughout the system.

  • qmlearn
    Quantum Machine Learning by learning one-body reduced density matrices in the AO basis.

  • Multi-task-electronic
    This package provides a python realization of the multi-task EGNN (equivariant graph neural network) for molecular electronic structure described in the paper "Multi-task learning for molecular electronic structure approaching coupled-cluster accuracy".

  • aPBE0
    We propose adaptive hybrid functionals, generating optimal exact exchange admixture ratios on the fly using data-efficient quantum machine learning models with negligible overhead. The adaptive Perdew-Burke-Ernzerhof hybrid density functional (aPBE0) improves energetics, electron densities, and HOMO-LUMO gaps in QM9, QM7b, and GMTKN55 benchmark datasets.

  • Skala
    In this work, we present Skala, a modern deep learning-based XC functional that bypasses expensive hand-designed features by learning representations directly from data. Skala achieves chemical accuracy for atomization energies of small molecules while retaining the computational efficiency typical of semi-local DFT.
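
The D4FT entry above describes replacing the SCF loop with direct energy minimization, where the orthogonality constraint is absorbed into a feed-forward map. A toy sketch of that idea follows. It assumes a fixed symmetric matrix `H` in place of a real Kohn-Sham energy, a QR decomposition as the reparameterization, and finite-difference gradients instead of automatic differentiation. These are illustrative choices and the function names are hypothetical; this is not D4FT's code.

```python
import numpy as np

def orbitals_from_params(W):
    """Map an unconstrained (n_basis, n_occ) matrix to orthonormal columns via QR."""
    Q, _ = np.linalg.qr(W)
    return Q

def energy(W, H):
    """Toy energy Tr(C^T H C); H stands in for a real (density-dependent) Hamiltonian."""
    C = orbitals_from_params(W)
    return float(np.trace(C.T @ H @ C))

def numerical_grad(W, H, eps=1e-6):
    """Central finite differences; a real implementation would use autodiff."""
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        step = np.zeros_like(W)
        step[idx] = eps
        g[idx] = (energy(W + step, H) - energy(W - step, H)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
H = (A + A.T) / 2              # random symmetric "Hamiltonian" (assumption)
W = rng.normal(size=(6, 2))    # unconstrained parameters, 2 occupied orbitals

# Plain gradient descent on W; orthonormality never has to be enforced explicitly.
for _ in range(500):
    W -= 0.1 * numerical_grad(W, H)

C = orbitals_from_params(W)
exact = np.linalg.eigvalsh(H)[:2].sum()   # variational lower bound from diagonalization
print("optimized:", energy(W, H), "lower bound:", exact)
```

Because every `W` maps to orthonormal orbitals, any unconstrained optimizer can be used, and the result can never fall below the sum of the lowest eigenvalues.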

Quantum Monte Carlo

  • DeepQMC
    DeepQMC implements variational quantum Monte Carlo for electrons in molecules, using deep neural networks written in PyTorch as trial wave functions.

  • oneqmc
    This package provides an implementation of the Orbformer wave function foundation model.
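
As a rough illustration of the variational Monte Carlo idea behind the packages above, here is a minimal sketch for a 1D harmonic oscillator. It assumes a one-parameter Gaussian trial wave function `psi(x) = exp(-a x^2)` in place of a neural network, and the function names are hypothetical. Samples are drawn from `|psi|^2` with Metropolis moves and the local energy is averaged.

```python
import math
import random

def local_energy(x, a):
    """Local energy H psi / psi for psi = exp(-a x^2), H = -1/2 d^2/dx^2 + 1/2 x^2."""
    return a + x * x * (0.5 - 2.0 * a * a)

def vmc_energy(a, n_steps=20000, step=1.0, seed=0):
    """Average the local energy over Metropolis samples of |psi|^2."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        # Accept with probability |psi(x_new)|^2 / |psi(x)|^2
        if rng.random() < math.exp(-2.0 * a * (x_new * x_new - x * x)):
            x = x_new
        total += local_energy(x, a)
    return total / n_steps

e_exact = vmc_energy(0.5)   # a = 0.5 is the exact ground state
e_off = vmc_energy(0.3)     # a worse trial function gives a higher energy
print(e_exact, e_off)
```

At `a = 0.5` the local energy is constant, so the estimate has zero variance; neural-network wave functions aim to approach this regime for real molecules by minimizing the same kind of energy estimate.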

Green Function

  • DeepGreen
    The many-body Green's function provides access to electronic properties beyond the density functional theory level in ab initio calculations. The authors present proof-of-concept benchmark results for both molecules and simple periodic systems, showing that the method provides accurate estimates of physical observables such as energy and density of states based on the predicted Green's function.

Force Field Method

Kernel Methods

  • wigner_kernel
    They propose a novel density-based method which involves computing "Wigner kernels".

  • sGDML
    Symmetric Gradient Domain Machine Learning, focusing on kernel-based representations for molecular force fields.
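
For readers new to kernel-based potentials, the following is a minimal sketch of plain kernel ridge regression on a toy 1D energy curve. It is a generic textbook setup, not sGDML (which learns in the force domain with symmetrized kernels) and not the Wigner-kernel method. The kernel width, regularization strength, and function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma):
    """Gaussian (RBF) kernel matrix between two sets of 1D points."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-d**2 / (2.0 * sigma**2))

def fit_krr(x_train, y_train, sigma=0.5, lam=1e-6):
    """Solve (K + lam * I) alpha = y for the regression weights."""
    K = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

def predict_krr(x_new, x_train, alpha, sigma=0.5):
    """Prediction is a kernel-weighted sum over training points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

x_train = np.linspace(-1.0, 1.0, 21)
y_train = x_train**2                      # toy harmonic "potential energy"
alpha = fit_krr(x_train, y_train)
e_pred = predict_krr(np.array([0.45]), x_train, alpha)[0]
print(e_pred, 0.45**2)
```

In a real force field, the scalar coordinate would be replaced by a descriptor of the molecular geometry, and forces would be obtained from derivatives of the kernel.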

Not based on Graph Models

  • DeePMD
    A package designed to minimize the effort required to build deep learning based model of interatomic potential energy and force field and to perform molecular dynamics.

  • GAP
    Part of the QUantum mechanics and Interatomic Potentials (QUIP) package; uses Gaussian process regression for invariant potentials.

  • QUIP
    The QUIP package is a collection of software tools to carry out molecular dynamics simulations. It implements a variety of interatomic potentials and tight binding quantum mechanics, and is also able to call external packages, and serve as plugins to other software such as LAMMPS, CP2K and also the python framework ASE.

  • EANN
    Embedded Atomic Neural Network (EANN) is a physically-inspired neural network framework. The EANN package is implemented in PyTorch and is used to train interatomic potentials, dipole moments, transition dipole moments and polarizabilities of various systems.

  • REANN
    Recursively embedded atom neural network (REANN) is a PyTorch-based end-to-end multi-functional Deep Neural Network Package for Molecular, Reactive and Periodic Systems.

  • FIREANN
    Field-induced Recursively embedded atom neural network (FIREANN) is a PyTorch-based end-to-end multi-functional Deep Neural Network Package for Molecular, Reactive and Periodic Systems under the presence of the external field with rigorous rotational equivariance.

  • SCFNN
    A self consistent field neural network (SCFNN) model.

  • Torch-ANI
    TorchANI is a PyTorch implementation of the ANI model.

  • PESPIP
    Mathematica programs for choosing the best basis of permutational invariant polynomials for fitting a potential energy surface

  • RuNNer
    A program package for constructing high-dimensional neural network potentials, including 4G-HDNNPs and 3G-HDNNPs.

  • aenet
    The Atomic Energy NETwork (ænet) package is a collection of tools for the construction and application of atomic interaction potentials based on artificial neural networks.

  • aevmod
    This package provides functionality for computing an atomic environment vector (AEV), as well as its Jacobian and Hessian.

  • TensorMol
    A package of NN model chemistries; contains Behler-Parrinello with electrostatics, Many Body Expansion, Bonds in Molecules NN, Atomwise, Forces, and Inductive Charges.

  • PairNet-OPs/PairFE-Net
    In PairFE-Net, an atomic structure is encoded using pairwise nuclear repulsion forces
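
Several entries in this group (RuNNer, TorchANI, aevmod, TensorMol) build on descriptors of the local atomic environment. The sketch below shows a Behler-Parrinello-style radial symmetry function under simplifying assumptions: a single element type, arbitrary `eta` and cutoff values, and hypothetical function names. It is not the exact AEV definition used by any of these packages.

```python
import math

def cutoff(r, r_c):
    """Cosine cutoff that decays smoothly to zero at r_c."""
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0) if r < r_c else 0.0

def radial_descriptor(center, neighbors, etas=(0.5, 1.0, 2.0), r_s=0.0, r_c=6.0):
    """Sum of Gaussians of neighbor distances, one value per eta."""
    dists = [math.dist(center, n) for n in neighbors]
    return [
        sum(math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c) for r in dists)
        for eta in etas
    ]

# Water-like toy geometry (coordinates are illustrative only)
center = (0.0, 0.0, 0.0)
neighbors = [(0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
g = radial_descriptor(center, neighbors)
print(g)
```

Because the descriptor depends only on interatomic distances and sums over neighbors, it is unchanged when the molecule is translated or rotated or when neighbors are relabeled, which is what lets a plain feed-forward network act on it.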

Graph Domain models

  • Nequip
    NequIP is an open-source code for building E(3)-equivariant interatomic potentials.

  • E3NN
    Euclidean neural networks. The aim of this library is to help the development of E(3)-equivariant neural networks. It contains fundamental mathematical operations such as tensor products and spherical harmonics.

  • XequiNet
    XequiNet is an equivariant graph neural network for predicting the properties of chemical molecules or periodic systems.

  • SchNet
    SchNet is a deep learning architecture that allows for spatially and chemically resolved insights into quantum-mechanical observables of atomistic systems.

  • SchNetPack
    SchNetPack aims to provide accessible atomistic neural networks that can be trained and applied out-of-the-box, while still being extensible to custom atomistic architectures. Contains SchNet, PaiNN, FieldSchNet, and SO3net.

  • G-SchNet
    Implementation of G-SchNet - a generative model for 3d molecular structures.

  • PhysNet
    PhysNet: A Neural Network for Predicting Energies, Forces, Dipole Moments and Partial Charges.

  • DimeNet
    Directional Message Passing Neural Network.

  • GemNet
    Universal Directional Graph Neural Networks for Molecules.

  • DeepMoleNet
    DeepMoleNet is a deep learning package for molecular properties prediction.

  • AirNet
    A new GNN-based deep molecular model by MindSpore.

  • TorchMD-Net
    TorchMD-NET provides graph neural networks and equivariant transformer neural networks potentials for learning molecular potentials.

  • charge_transfer_nnp
    Graph neural network potential with charge transfer, based on the NequIP model.

  • ForceNet
    We demonstrate that force-centric GNN models without any explicit physical constraints are able to predict atomic forces more accurately than state-of-the-art energy centric GNN models, while being faster both in training and inference.

  • DIG
    A library for graph deep learning research.

  • scn
    Spherical Channels for Modeling Atomic Interactions

  • spinconv
    Rotation Invariant Graph Neural Networks using Spin Convolutions.

  • VisNet
    A scalable and accurate geometric deep learning potential for molecular dynamics simulation.

  • alignn
    The Atomistic Line Graph Neural Network (https://www.nature.com/articles/s41524-021-00650-1) introduces a new graph convolution layer that explicitly models both two and three body interactions in atomistic systems.

  • So3krates
    Repository for training, testing and developing machine learned force fields using the So3krates model.

  • spice-model-five-net
    Contains the five equivariant transformer models trained on the SPICE dataset (https://github.com/openmm/spice-dataset/releases/tag/1.1).

  • sake
    Spatial Attention Kinetic Networks with E(n)-Equivariance

  • eqgat
    PyTorch implementation for the manuscript "Representation Learning on Biomolecular Structures using Equivariant Graph Attention".

  • GNN-LF
    Graph Neural Network With Local Frame for Molecular Potential Energy Surface

  • Cormorant
    We propose Cormorant, a rotationally covariant neural network architecture for learning the behavior and properties of complex many-body physical systems.

  • LieConv
    Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data

  • torchmd-net/ET
    Neural network potentials based on graph neural networks and equivariant transformers

  • torchmd-net/TensorNet+0.1S
    On the Inclusion of Charge and Spin States in Cartesian Tensor Neural Network Potentials

  • GemNet
    GemNet: Universal Directional Graph Neural Networks for Molecules

  • equiformer
    Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs

  • VisNet-LSRM
    Inspired by fragmentation-based methods, we propose the Long-Short-Range Message-Passing (LSR-MP) framework as a generalization of the existing equivariant graph neural networks (EGNNs) with the intent to incorporate long-range interactions efficiently and effectively.

  • AP-net
    AP-Net: An atomic-pairwise neural network for smooth and transferable interaction potentials

  • MACE
    MACE provides fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

  • Unimol+
    Uni-Mol+ first generates a raw 3D molecule conformation from inexpensive methods such as RDKit. Then, the raw conformation is iteratively updated to its target DFT equilibrium conformation using neural networks, and the learned conformation will be used to predict the QC properties.

  • ColfNet
    Inspired by differential geometry and physics, we introduce equivariant local complete frames to graph neural networks, such that tensor information at given orders can be projected onto the frames.

  • LeftNet
    A New Perspective on Building Efficient and Expressive 3D Equivariant Graph Neural Networks

  • SO3krates with transformer
    We propose a transformer architecture called SO3krates that combines sparse equivariant representations.

  • LEIGNN
    A lightweight equivariant interaction graph neural network (LEIGNN) that enables accurate and efficient interatomic potential and force predictions in crystals. Rather than relying on higher-order representations, LEIGNN employs a scalar-vector dual representation to encode equivariant features.

  • Multi-fidelity GNNs
    Multi-fidelity GNNs for drug discovery and quantum mechanics

  • GPIP
    GPIP: Geometry-enhanced Pre-training on Interatomic Potentials. The authors propose a geometric structure learning framework that leverages unlabeled configurations to improve the performance of MLIPs. The framework consists of two stages: first, using CMD simulations to generate unlabeled configurations of the target molecular system; second, applying geometry-enhanced self-supervised learning techniques, including masking, denoising, and contrastive learning, to capture structural information.

  • ictp
    Official repository for the paper "Higher Rank Irreducible Cartesian Tensors for Equivariant Message Passing". It is built upon the ALEBREW repository and implements irreducible Cartesian tensors and their products.

  • CHGNet
    A pretrained universal neural network potential for charge-informed atomistic modeling (see publication)

  • GPTFF
    GPTFF(graph-based pre-trained transformer forcefield): A high-accuracy out-of-the-box universal AI force field for arbitrary inorganic materials

  • cace
    The Cartesian Atomic Cluster Expansion (CACE) is a new approach for developing machine learning interatomic potentials. This method utilizes Cartesian coordinates to provide a complete description of atomic environments, maintaining interaction body orders. It integrates low-dimensional embeddings of chemical elements with inter-atomic message passing.
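
Many of the models listed above (NequIP, E3NN, MACE, Equiformer and others) are built around E(3)-equivariance: rotating the input structure must rotate the predicted forces in the same way. A minimal sketch of how such a property can be checked follows, assuming harmonic pair forces as a stand-in for a learned model; the function names and parameters are hypothetical.

```python
import math

def pair_forces(positions, k=1.0, r0=1.0):
    """Forces from a harmonic pair energy sum_{i<j} k/2 * (r_ij - r0)^2."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, pi in enumerate(positions):
        for j, pj in enumerate(positions):
            if i == j:
                continue
            r = math.dist(pi, pj)
            scale = -k * (r - r0) / r   # -dE/dr divided by r
            for d in range(3):
                forces[i][d] += scale * (pi[d] - pj[d])
    return forces

def rotate_z(vec, angle):
    """Rotate a 3D vector about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return [c * x - s * y, s * x + c * y, z]

positions = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (0.3, 0.9, 0.4)]
angle = 0.7

# Equivariance: computing forces then rotating equals rotating then computing forces.
forces_then_rotate = [rotate_z(f, angle) for f in pair_forces(positions)]
rotate_then_forces = pair_forces([rotate_z(p, angle) for p in positions])
max_diff = max(
    abs(a - b)
    for fa, fb in zip(forces_then_rotate, rotate_then_forces)
    for a, b in zip(fa, fb)
)
print(max_diff)
```

The same test can be applied to any force model by swapping `pair_forces` for the model's prediction function; equivariant architectures pass it by construction, while unconstrained models pass it only approximately.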

Transformer Domain

  • SpookyNet
    SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects.

  • trip
    Transformer Interatomic Potential (TrIP): a chemically sound potential based on the SE(3)-Transformer

  • e3x
    E3x is a JAX library for constructing efficient E(3)-equivariant deep learning architectures built on top of Flax. The goal is to provide common neural network building blocks for E(3)-equivariant architectures to make the development of models operating on three-dimensional data (point clouds, polygon meshes, etc.) easier.

  • EScAIP
    EScAIP: Efficiently Scaled Attention Interatomic Potential.

  • graph-free-transformer
    Our findings suggest that Transformers can learn many of the graph-based inductive biases typically built into current ML models for chemistry—while doing so more flexibly.

Universal model

  • egret
    This repository contains the Egret family of neural network potentials, developed by Rowan using the MACE architecture.

  • eSEN
    The resulting model, eSEN, provides state-of-the-art results on a range of physical property prediction tasks.

  • UMA
    UMA: A Family of Universal Models for Atoms, a modified model of eSEN.

  • AIMNET
    This repository contains a reference AIMNet implementation along with some examples and benchmarks.

  • AIMNet2
    A general-purpose neural network potential for organic and element-organic molecules.

  • MACE-OFF23
    This repository contains the MACE-OFF23 pre-trained transferable organic force fields.

  • Orb-models
    Trained on the Open Molecules 2025 (OMol25) dataset—over 100M high-accuracy DFT calculations (ωB97M-V/def2-TZVPD) on diverse molecular systems including metal complexes, biomolecules, and electrolytes.

Empirical force field

  • PAMNet
    PAMNet(Physics-aware Multiplex Graph Neural Network) is an improved version of MXMNet and outperforms state-of-the-art baselines regarding both accuracy and efficiency in diverse tasks including small molecule property prediction, RNA 3D structure prediction, and protein-ligand binding affinity prediction.

  • grappa
    A machine-learned molecular mechanics force field using a deep graph attentional network

  • espaloma
    Extensible Surrogate Potential of Ab initio Learned and Optimized by Message-passing Algorithm.

  • FeNNol
    FeNNol is a library for building, training and running neural network potentials for molecular simulations. It is based on the JAX library and is designed to be fast and flexible.

  • ByteFF
    In this study, we address this issue using a modern data-driven approach, developing ByteFF, an Amber-compatible force field for drug-like molecules. To create ByteFF, we generated an expansive and highly diverse molecular dataset at the B3LYP-D3(BJ)/DZVP level of theory. This dataset includes 2.4 million optimized molecular fragment geometries with analytical Hessian matrices, along with 3.2 million torsion profiles.

  • GB-FFs
    Graph-Based Force Fields: a model that parameterizes force fields using Graph Attention Networks.

  • ARROW-NN
    The simulation conda package contains the InterX ARBALEST molecular dynamics simulation software along with all the necessary database files to run ARROW-NN molecular simulations

  • AMOEBA+NN
    Presents an integrated non-reactive hybrid model, AMOEBA+NN, which employs the AMOEBA potential for the short- and long-range non-bonded interactions and an NNP to capture the remaining local (covalent) contributions.

  • bamboo
    Welcome to the repository of BAMBOO! This repository hosts the source code for creating a machine learning-based force field (MLFF) for molecular dynamics (MD) simulations of lithium battery electrolytes. Whether you're interested in simulating lithium battery electrolytes or other types of liquids, BAMBOO provides a robust and versatile solution.

  • ResFF
    We introduce ResFF, a hybrid machine learning force field that employs deep residual learning to integrate explicit physics-based bonded terms with residual corrections from a lightweight equivariant neural network. Through a three-stage joint optimization, ResFF decomposes molecular energy into dominant bonded contributions and complex noncovalent deviations.
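
Several entries above (ResFF, AMOEBA+NN) describe the same broad pattern: a physics-based energy term plus a learned correction. The following is a minimal sketch of that decomposition, not any project's actual implementation. The harmonic bond form, the `toy_residual` placeholder standing in for a trained network, and all numbers are assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Coord = Tuple[float, float, float]
# Hypothetical bond record: (atom i, atom j, force constant k, equilibrium length r0)
Bond = Tuple[int, int, float, float]


def distance(a: Coord, b: Coord) -> float:
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def bonded_energy(coords: Sequence[Coord], bonds: Sequence[Bond]) -> float:
    """Physics-based part: harmonic bonds, 0.5 * k * (r - r0)^2 summed over bonds."""
    return sum(
        0.5 * k * (distance(coords[i], coords[j]) - r0) ** 2
        for i, j, k, r0 in bonds
    )


def hybrid_energy(
    coords: Sequence[Coord],
    bonds: Sequence[Bond],
    residual_model: Callable[[Sequence[Coord]], float],
) -> float:
    """Total energy = explicit bonded term + learned residual correction."""
    return bonded_energy(coords, bonds) + residual_model(coords)


def toy_residual(coords: Sequence[Coord]) -> float:
    """Placeholder for a trained neural network (assumption, not a real model)."""
    return -0.01 * len(coords)


coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
bonds = [(0, 1, 300.0, 1.0)]
energy = hybrid_energy(coords, bonds, toy_residual)
print(round(energy, 6))
```

Keeping the physics term explicit means the learned part only has to represent what the simple functional form misses, which is the motivation stated in the ResFF description.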

Tools and potentials

  • rascaline
    Rascaline is a library for the efficient computing of representations for atomistic machine learning, also called "descriptors" or "fingerprints". These representations can be used for atomistic machine learning (ML) models including ML potentials, visualization or similarity analysis.

  • AIRS
    AIRS is a collection of open-source software tools, datasets, and benchmarks associated with our paper entitled “Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems”.

  • phast
    PyTorch implementation for PhAST: Physics-Aware, Scalable and Task-specific GNNs for Accelerated Catalyst Design

  • mdgrad
    PyTorch differentiable molecular dynamics

  • NeuralForceField
    The Neural Force Field (NFF) code is an API based on SchNet, DimeNet, PaiNN and DANN. It provides an interface to train and evaluate neural networks for force fields. It can also be used as a property predictor that uses both 3D geometries and 2D graph information.

  • NNPOps
    The goal of this project is to promote the use of neural network potentials (NNPs) by providing highly optimized, open-source implementations of bottleneck operations that appear in popular potentials.

  • jax-md
    JAX MD is a functional and data driven library. Data is stored in arrays or tuples of arrays and functions transform data from one state to another.

  • AQML
    AQML is a mixed Python/Fortran/C++ package intended to simulate quantum chemistry problems through the use of the fundamental building blocks of larger systems.

  • MDsim
    Training and simulating MD with ML force fields

  • AMP
    Amp: A modular approach to machine learning in atomistic simulations (https://github.com/ulissigroup/amptorch)

  • HIPPYNN
    A modular library for atomistic machine learning with PyTorch.

  • flare
    FLARE is an open-source Python package for creating fast and accurate interatomic potentials.

  • nnp-pre-training
    Synthetic pre-training for neural-network interatomic potentials

  • mlp-train
    General machine learning potentials (MLP) training for molecular systems in gas phase and solution

  • NNP-MM
    NNP/MM embeds a Neural Network Potential into a conventional molecular mechanical (MM) model.

  • GAMD
    Data and code for Graph neural network Accelerated Molecular Dynamics.

  • PFP
    Here we report a development of universal NNP called PreFerred Potential (PFP), which is able to handle any combination of 45 elements. Particular emphasis is placed on the datasets, which include a diverse set of virtual structures used to attain the universality.

  • TeaNet
    universal neural network interatomic potential inspired by iterative electronic relaxations.

  • n2p2
    This repository provides ready-to-use software for high-dimensional neural network potentials in computational physics and chemistry.

  • charge3net
    Official implementation of ChargeE3Net, introduced in Higher-Order Equivariant Neural Networks for Charge Density Prediction in Materials.

  • jax-nb
    This is a JAX implementation of Polarizable Charge Equilibrium (PQEq) and DFT-D3 dispersion correction.

  • AlF_dimer
    a global potential for AlF-AlF dimer

  • Schrodinger-ANI
    A neural network potential energy function for use in drug discovery, with chemical element support extended from 41% to 94% of druglike molecules based on ChEMBL.

  • q-AQUA, q-AQUA-pol
    CCSD(T) potential for water, interfaced with TTM3-F

  • gimlet
    Graph Inference on Molecular Topology. A package for modelling, learning, and inference on molecular topological space written in Python and TensorFlow.

Semi-Empirical Quantum Mechanical Method

with SQM feature

  • DeePaTB
    We present Deep Atomic Density-Based Tight-Binding (DeePaTB), a novel machine learning-based semi-empirical quantum mechanical (ML-SQM) framework developed upon our recently proposed atomic density-based tight-binding (aTB) method. Its descriptor, the "eigenvalue of the local density matrix," can be generated by Amesp. This neural network-enhanced semi-empirical quantum mechanical model demonstrates remarkable computational efficiency and transferability across diverse chemical systems.

  • OrbNet; OrbNet Denali
    OrbNet Denali: A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy.

  • OrbNet-Equi
    Informing geometric deep learning with electronic interactions to accelerate quantum chemistry

  • OrbNet-Spin, OrbitAll
    OrbNet-Spin incorporates a spin-polarized treatment into the underlying semiempirical quantum mechanics orbital featurization and adjusts the model architecture accordingly while maintaining the geometrical constraints.

  • EHM-ML
    Machine Learned Hückel Theory: Interfacing Physics and Deep Neural Networks. The Hückel Hamiltonian is an incredibly simple tight-binding model known for its ability to capture qualitative physics phenomena arising from electron interactions in molecules and materials.

  • DFTBML
    DFTBML provides a systematic way to parameterize the Density Functional-based Tight Binding (DFTB) semiempirical quantum chemical method for different chemical systems by learning the underlying Hamiltonian parameters rather than fitting the potential energy surface directly.

  • NN-xTB
    Fast, general, and interpretable quantum accuracy remains a challenge. To address it, we introduce Neural Network Extended Tight-Binding (NN-xTB), a Hamiltonian-preserving scheme that augments the GFN2-xTB operator with small, bounded, environment-dependent shifts to a compact set of physically interpretable parameters predicted by a neural network.

without SQM feature

  • AIQM1, AIQM2
    Artificial intelligence-enhanced quantum chemical method with broad applicability.

  • BpopNN
    Incorporating Electronic Information into Machine Learning Potential Energy Surfaces via Approaching the Ground-State Electronic Energy as a Function of Atom-Based Electronic Populations.

  • Delfta
    The DelFTa application is an easy-to-use, open-source toolbox for predicting quantum-mechanical properties of drug-like molecules. Using either ∆-learning (with a GFN2-xTB baseline) or direct-learning (without a baseline), the application accurately approximates DFT reference values (ωB97X-D/def2-SVP).

  • PYSEQM
    PYSEQM is a Semi-Empirical Quantum Mechanics package implemented in PyTorch.

  • PM6-ML
    MOPAC-ML implements the PM6-ML method, a semiempirical quantum-mechanical computational method that augments PM6 with a machine learning (ML) correction. It acts as a wrapper calling a modified version of MOPAC, to which it provides the ML correction.

  • XpaiNN@xTB
    A model that can handle optimization and frequency prediction.

  • hotpp
    HotPP is an open-source package designed for constructing message passing network interatomic potentials. It facilitates the utilization of arbitrary order Cartesian tensors as messages while maintaining equivariance.

  • LiTEN, LiTEN-FF
    LiTEN, a novel equivariant neural network with Tensorized Quadrangle Attention (TQA). TQA efficiently models three- and four-body interactions with linear complexity by reparameterizing high-order tensor features via vector operations, avoiding costly spherical harmonics.
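
The DelFTa and PM6-ML entries above both rely on learning a correction on top of a cheap baseline method (∆-learning). Here is a minimal sketch of the idea, assuming a single scalar descriptor and a linear fit in place of the neural networks those projects use. All data values are made up for illustration.

```python
from typing import List, Tuple


def fit_linear(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training data: one descriptor per molecule, plus energies
# from a cheap baseline method and from an expensive reference method.
descriptors = [0.0, 1.0, 2.0, 3.0]
baseline = [1.0, 2.5, 4.0, 5.5]
reference = [1.5, 3.5, 5.5, 7.5]

# Delta-learning target: the difference between reference and baseline.
deltas = [ref - base for ref, base in zip(reference, baseline)]
slope, intercept = fit_linear(descriptors, deltas)


def predict(descriptor: float, baseline_energy: float) -> float:
    """Corrected energy = baseline energy + learned correction."""
    return baseline_energy + slope * descriptor + intercept


corrected = predict(4.0, 7.0)
print(corrected)
```

The model only needs to capture the (usually smoother) gap between the two levels of theory, which is why the baseline calculation is still run at prediction time.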

Coarse-Grained Method

  • cgnet
    Coarse graining for molecular dynamics

  • SchNet-CG
    We explore the application of SchNet models to obtain a CG potential for liquid benzene, investigating the effect of model architecture and hyperparameters on the thermodynamic, dynamical, and structural properties of the simulated CG systems, reporting and discussing challenges encountered and future directions envisioned.

  • CG-SchNET
    By combining recent deep learning methods with a large and diverse training set of all-atom protein simulations, we here develop a bottom-up CG force field with chemical transferability, which can be used for extrapolative molecular dynamics on new sequences not used during model parametrization.

  • torchmd-protein-thermodynamics
    This repository contains code, data and a tutorial for reproducing the paper "Machine Learning Coarse-Grained Potentials of Protein Thermodynamics". https://arxiv.org/abs/2212.07492

  • torchmd-exp
    This repository contains a method for training a neural network potential for coarse-grained proteins using unsupervised learning

  • AICG
    Learning coarse-grained force fields for fibrogenesis modeling (https://doi.org/10.1016/j.cpc.2023.108964)
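
The coarse-grained models listed above operate on beads rather than individual atoms, so an all-atom configuration first has to be mapped to bead positions. The sketch below assumes a center-of-mass mapping, which is only one possible choice. The individual projects may use other mappings, and `bead_groups` and the coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def center_of_mass(positions: Sequence[Coord], masses: Sequence[float]) -> Coord:
    """Mass-weighted average position of a group of atoms."""
    total = sum(masses)
    x, y, z = (
        sum(m * p[axis] for m, p in zip(masses, positions)) / total
        for axis in range(3)
    )
    return (x, y, z)


def map_to_beads(
    positions: Sequence[Coord],
    masses: Sequence[float],
    bead_groups: Sequence[Sequence[int]],
) -> List[Coord]:
    """Map atoms to CG beads; each group of atom indices becomes one bead."""
    return [
        center_of_mass([positions[i] for i in group], [masses[i] for i in group])
        for group in bead_groups
    ]


# Four atoms grouped into two beads (illustrative values only).
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 6.0, 0.0)]
masses = [1.0, 1.0, 3.0, 1.0]
bead_groups = [[0, 1], [2, 3]]

beads = map_to_beads(positions, masses, bead_groups)
print(beads)
```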

Enhanced Sampling Method

  • Enhanced Sampling with Machine Learning: A Review
    We highlight successful strategies like dimensionality reduction, reinforcement learning, and flow-based methods. Finally, we discuss open problems at the exciting ML-enhanced MD interface.

  • mlcolvar
    mlcolvar is a Python library that helps design data-driven collective variables (CVs) for enhanced sampling simulations.

QM/MM Model

  • NNP-MM
    NNP/MM embeds a Neural Network Potential into a conventional molecular mechanical (MM) model. We have implemented this using the Custom QM/MM features of NAMD 2.13, which interface NAMD with the TorchANI NNP python library developed by the Roitberg and Isayev groups.

  • DeeP-HP
    Scalable hybrid deep neural networks/polarizable potentials biomolecular simulations including long-range effects

  • PairF-Net
    Here, we further develop the PairF-Net model to intrinsically incorporate energy conservation and couple the model to a molecular mechanical (MM) environment within the OpenMM package

  • embedding
    This work presents a variant of an electrostatic embedding scheme that allows the embedding of arbitrary machine learned potentials trained on molecular systems in vacuo.

  • field_schnet
    FieldSchNet provides a deep neural network for modeling the interaction of molecules and external environments as described.

  • FieldMACE
    An extension of the message-passing atomic cluster expansion (MACE) architecture that integrates the multipole expansion to model long-range interactions more efficiently. By incorporating the multipole expansion, FieldMACE effectively captures environmental and long-range effects in both ground and excited states.

  • ML/MM
    This repository contains data and software regarding the paper submitted to JCIM, entitled "Assessment of embedding schemes in a hybrid machine learning/classical potentials (ML/MM) approach".

  • emle
    An engine for electrostatic ML embedding for multiscale modelling.
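
The hybrid schemes listed above split a system into a region handled by a machine-learned potential and an environment handled by a classical force field, then add a coupling term between the two. The sketch below shows that energy bookkeeping under strong simplifying assumptions: fixed point charges on both sides, a plain Coulomb coupling in reduced units (`k=1.0`), and placeholder region energies. The listed projects use more elaborate couplings, such as electrostatic embedding where the ML model responds to the MM field.

```python
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def distance(a: Coord, b: Coord) -> float:
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def coulomb_coupling(
    ml_positions: Sequence[Coord],
    ml_charges: Sequence[float],
    mm_positions: Sequence[Coord],
    mm_charges: Sequence[float],
    k: float = 1.0,  # Coulomb constant; 1.0 means reduced units (assumption)
) -> float:
    """Pairwise Coulomb energy between ML-region and MM-region point charges."""
    energy = 0.0
    for pos_i, q_i in zip(ml_positions, ml_charges):
        for pos_j, q_j in zip(mm_positions, mm_charges):
            energy += k * q_i * q_j / distance(pos_i, pos_j)
    return energy


def total_energy(e_ml: float, e_mm: float, e_coupling: float) -> float:
    """Hybrid energy: ML region + MM environment + interaction between them."""
    return e_ml + e_mm + e_coupling


# Placeholder region energies; in practice these would come from an NNP
# and an MM engine respectively (hypothetical values).
e_ml = -10.0
e_mm = -3.0
e_coupling = coulomb_coupling(
    ml_positions=[(0.0, 0.0, 0.0)],
    ml_charges=[0.5],
    mm_positions=[(2.0, 0.0, 0.0)],
    mm_charges=[-1.0],
)
e_total = total_energy(e_ml, e_mm, e_coupling)
print(e_total)
```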

Credit by: @github.com/Eipgen/Neural-Network-Models-for-Chemistry